<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/tests/compute/interface-shader-param-legalization.slang, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2021-03-17T19:55:30+00:00</updated>
<entry>
<title>Remove old code paths from render-test (#1760)</title>
<updated>2021-03-17T19:55:30+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2021-03-17T19:55:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=6e5d85efb9fa5f647f7f0c7ef784a9fd09b29023'/>
<id>urn:sha1:6e5d85efb9fa5f647f7f0c7ef784a9fd09b29023</id>
<content type='text'>
* Remove old code paths from render-test

Historically, the `render-test` tool was using three different code paths:

* One based on `gfx` and manual (non-reflection-based) parameter setting, used for OpenGL, D3D11, D3D12, and Vulkan
* One for CPU that used reflection-based parameter setting but shared no code with the first
* One for CUDA that used reflection-based parameter setting and shared some, but not all, code with the CPU path

Recently we've updated `render-test` to include a fourth option:

* Using `gfx` and the "shader object" system it exposes for a unified reflection-based parameter-setting system that works across OpenGL, D3D11, D3D12, Vulkan, CUDA, and CPU

This change removes the first three options and leaves only the single unified path. As a result, a bunch of code in `render-test` is no longer needed, and the codebase no longer relies on things like the `IDescriptorSet`-related APIs in `gfx`.

Several existing tests had to be disabled to make this change possible. Those tests will need to be audited and either re-enabled once we fix issues in the shader object system, or permanently removed if they don't test stuff we intend to support in the long run (e.g., global-scope type parameters, which aren't a clear necessity).

* fixup: CUDA detection logic</content>
</entry>
<entry>
<title>Unify handling of static and dynamic dispatch for interfaces (#1612)</title>
<updated>2020-11-19T09:26:43+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2020-11-19T09:26:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=4459d4428761b0581b221c52eaea595d1b257a9f'/>
<id>urn:sha1:4459d4428761b0581b221c52eaea595d1b257a9f</id>
<content type='text'>
Overview
========

Prior to this change, we had two different code generation strategies for interface/existential types in Slang that didn't always play nicely together:

* The "legacy" static specialization approach could handle plugging in an arbitrary concrete type for an existential type parameter (including types with resources, etc.), but wouldn't work well with things like a `StructuredBuffer&lt;&gt;` of an interface type, and required somewhat counter-intuitive layout rules to work.

* The new dynamic dispatch approach produces simpler, more easily understood layouts by assuming that values of interface type can fit into a fixed number of bytes. The tradeoff there is that it cannot handle types that include resources (only POD types).

The goal of this change is to make it so that the two strategies can co-exist. In particular, in cases where a shader is amenable to both static specialization and dynamic dispatch, the type layouts should agree.

In order to make the type layouts agree, we:

* Declare that *all* values of existential type reserve storage according to the dynamic-dispatch rules (so 16 bytes for the RTTI and witness-table information, plus whatever bytes are needed to store "any value" of a conforming type).

* Then we modify the "legacy" layout rules so that if a value of concrete type can fit in the reserved "any value" space for a given interface, then it is laid out there exactly like the dynamic dispatch rules would do. Otherwise, we fall back to the previous legacy rules (since we don't need to agree with the dynamic-dispatch layout on types that can't be used with dynamic dispatch).
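
The two rules above can be sketched as a small decision function. This is only an illustration in Python, not the compiler's code: the function name, the dictionary keys, and the `is_pod` flag are hypothetical. Only the 16-byte header, the default any-value size of 16, and the fits/does-not-fit rule come from this description.

```python
# Hypothetical sketch of the unified layout rule for a value of interface type.
HEADER_SIZE = 16             # RTTI + witness-table information
DEFAULT_ANY_VALUE_SIZE = 16  # default any-value size (see "Details")


def layout_existential(concrete_size, is_pod, any_value_size=DEFAULT_ANY_VALUE_SIZE):
    """Describe where the payload of a concrete type is placed.

    Space for the header and the any-value area is always reserved,
    whether or not the concrete type ends up using it.
    """
    reserved = HEADER_SIZE + any_value_size
    # Number of bytes by which the concrete type exceeds the any-value area.
    overflow = max(0, concrete_size - any_value_size)
    if is_pod and overflow == 0:
        # Same placement the dynamic-dispatch rules would produce.
        return {"reserved_bytes": reserved, "strategy": "any-value",
                "payload_offset": HEADER_SIZE}
    # The type cannot be used with dynamic dispatch, so fall back to the
    # legacy static-specialization rules for the payload.
    return {"reserved_bytes": reserved, "strategy": "legacy",
            "payload_offset": None}
```

For example, a 4-byte POD type would take the "any-value" branch, while a 32-byte type or a type holding resources would take the "legacy" branch, with 32 bytes reserved in every case.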

Details
=======

* Renamed `ExistentialBox` to `BoundInterfaceType` to better clarify how it relates to `BindExistentialsType`

* Unconditionally apply the `lowerGenerics` pass during emit, since it is now responsible for aspects of the lowering of existential types when specialization is used.

* Made IR type layout take the target into account, so that the layout of resource types can vary by target (e.g., being POD on some targets, and invalid on others)

* Cleaned up some issues around using global shader parameters as the "key" for their layout information in the global-scope layout (only comes up when there are global-scope `uniform` parameters)

* Added a default any-value size (16) instead of making it an error to leave it out. This was the simplest option; we could try to go back to having an error, but we'd need to only issue it if we are sure a type/interface is being used with dynamic dispatch, since static dispatch doesn't have to obey the restrictions.

* Changed lowering of existential types to tuples so that bound interfaces where the concrete type won't fit use a "pseudo-pointer" instead of an "any-value" to hold the payload

* Changed IR type legalization to handle the "pseudo-pointer" case and apply layout information from an interface type over to the payload part when static specialization was used.

* Changed some details of how witness tables were being lowered, so that we didn't have to create "proxy" witness tables for the constraints on associated types (just use the actual requirement entries we generate)

* Changed witness tables so that they know the subtype doing the conforming

* Added logic so that we don't generate pack/unpack logic and witness table wrapper functions for types that are incompatible with any-value/dynamic dispatch for a given interface.

* Changed the core AST-level type layout logic to use the dynamic-dispatch layout in case things fit, and the legacy static specialization case when things don't (while also reserving space for the dynamic-dispatch fields)

* Changed a bunch of test cases for static specialization to properly use the new layout (which introduces new buffers in some cases, and moves data around in others).

Future Work
===========

The experience of trying to reconcile our older way of handling interface-type specialization with our newer model (that supports dynamic dispatch) makes it clear that we really need to make similar changes to our handling of generic type parameters on entry points and at the global scope.

A future change should make it so that a global type parameter is lowered with a type layout similar to a value parameter of interface type, including the RTTI and witness-table pieces, and just leaving out the "any value" piece. A similar translation strategy should apply to entry-point generic parameters (mirroring how we lower generic functions for dynamic dispatch already), and value specialization parameters.

Co-authored-by: Yong He &lt;yonghe@outlook.com&gt;</content>
</entry>
<entry>
<title>Remove support for explicit register/binding syntax on TEST_INPUT (#1132)</title>
<updated>2019-11-21T22:06:19+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2019-11-21T22:06:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=2ea64ff4f2c7c43b72ff24650330fca79a87500f'/>
<id>urn:sha1:2ea64ff4f2c7c43b72ff24650330fca79a87500f</id>
<content type='text'>
The `TEST_INPUT` facility allows textual Slang test cases to provide two kinds of information to the `render-test` tool:

1. Information on what shader inputs exist
2. Information on what values/objects to bind into those shader inputs

Under the first category of information, there exists support for attaching a `dxbinding(...)` annotation to a `TEST_INPUT` which seemingly indicates what HLSL `register` the input uses. There is a similar `glbinding(...)` annotation, used for OpenGL and Vulkan.

It turns out that these annotations were, in practice, completely ignored and had no bearing on how `render-test` allocates or binds graphics API objects. There was some amount of code attempting to validate that explicit registers/bindings were being set appropriately, but the actual values were being ignored.

The visible consequence of the `dxbinding` and `glbinding` annotations being ignored is issue #1036: the order of `TEST_INPUT` lines was *de facto* determining the registers/bindings that were being used by `render-test`.

This change simply removes the placebo features and strips things down to what is implemented in practice: the `TEST_INPUT` lines do not need target-API-specific binding/register numbers, because their order in the file implicitly defines them.

I added logic to the parsing of `TEST_INPUT` lines to make sure I got an error message on any leftover annotations, and went ahead and systematically deleted all of the placebo annotations from our test cases.
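
As a rough illustration of that behavior (not the actual `render-test` code, which is not written in Python), the check can be sketched as follows. The function name, the error wording, and the sample `TEST_INPUT` payloads in the usage note are assumptions made for the example.

```python
import re

# Annotations removed by this change; any leftover use is reported as an error.
REMOVED_ANNOTATIONS = ("dxbinding", "glbinding")
PREFIX = "//TEST_INPUT:"


def parse_test_inputs(source):
    """Return the TEST_INPUT payloads in file order.

    The position of each payload in the returned list stands in for the
    implicit register/binding order described above.
    """
    inputs = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        text = line.strip()
        if not text.startswith(PREFIX):
            continue
        payload = text[len(PREFIX):]
        for name in REMOVED_ANNOTATIONS:
            # Match the annotation name followed by an opening parenthesis.
            if re.search(r"\b" + name + r"\s*\(", payload):
                raise ValueError(
                    f"line {line_number}: '{name}(...)' is no longer "
                    "supported on TEST_INPUT"
                )
        inputs.append(payload)
    return inputs
```

With this sketch, a hypothetical line such as `//TEST_INPUT:ubuffer(data=[0], stride=4):dxbinding(0),out` raises an error, while the same line without the annotation is accepted and keeps its position in the list.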

If we decide to make `TEST_INPUT` lines *not* depend on order of declaration in the future, we can build it up as a new and better considered feature.

The main alternative I considered was to keep the annotations in place, and change `render-test` and the `gfx` abstraction layer to properly respect them, but that path actually creates much more opportunity for breakage (since every single test case would suddenly be specifying its root signature / pipeline layout via a different path using data that has never been tested). The approach in this change has the benefit of giving me high confidence that all the test cases continue to work just as they had before.</content>
</entry>
<entry>
<title>Changes required for application adoption of interface-type parameters (#963)</title>
<updated>2019-05-20T17:40:38+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2019-05-20T17:40:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=71e35b6822b9e2846e129a888774d45a5e0827da'/>
<id>urn:sha1:71e35b6822b9e2846e129a888774d45a5e0827da</id>
<content type='text'>
* A few changes required for application adoption of interface-type parameters

There are a few small changes here that are all related in that they arose from trying to integrate support for specialization via global interface-type shader parameters into a real application.

Allow querying the "pending" layout via reflection API
------------------------------------------------------

The naming here isn't ideal, and could probably use a round of "bikeshedding" to arrive at something better, but the basic idea is that when you have a type like:

```
struct MyStuff
{
    int a;
    IFoo foo;
    int b;
}
```

the fields `a` and `b` get allocated space directly in the "primary" layout for `MyStuff` (at offsets 0 and 4, with `sizeof(MyStuff) == 8`), but the `foo` field can't be allocated space until we know what concrete type will get plugged in there.

If we have a concrete type in mind:

```
struct Bar : IFoo { int bar; }
```

then we can know how much space the `foo` field will take up, but we still can't allocate it space directly in `MyStuff`, because we already decided that `sizeof(MyStuff) == 8`.

Now imagine we place some `MyStuff` values into constant buffers:

```
cbuffer X {
    MyStuff x;
}

cbuffer Y {
    MyStuff y;
    float4 z;
}
```

In each case we know that we want to place the `MyStuff::foo` field at the end of the containing constant buffer so that it doesn't disrupt the layout of the existing fields. But that means that the offset of `MyStuff::foo` relative to the start of the `MyStuff` isn't fixed, because of unrelated fields like `z` that need to get in between.

In our layout code, we handle this by having a notion of a "pending" layout. Once we know how `MyStuff::foo` will be specialized, we can compute both a "primary" and a "pending" layout for `MyStuff`, which basically treats it as if it were two distinct types:

```
struct MyStuff_Primary
{
    int a;
    int b;
}

struct MyStuff_Pending
{
    Bar foo;
}
```

Layout for an aggregate type like the `X` or `Y` constant buffer then proceeds by computing an aggregate primary layout and an aggregate pending layout, and then finally a constant buffer or parameter block "flushes" all or part of the pending data by appending it to the primary data to get the final layout.
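
To make the "flush" step concrete, here is a minimal Python sketch that lays out the `X` and `Y` buffers from above. It is not the compiler's layout code: the helper names are hypothetical, and the alignments (4 bytes for `int` and for `Bar`, 16 bytes for `float4`) are simplifying assumptions; the real rules depend on the target.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout_cbuffer(fields):
    """Lay out a constant buffer and flush pending data at the end.

    fields: list of (name, primary_size, primary_alignment, pending_size).
    Returns (primary_offsets, pending_offsets, total_size).
    """
    primary_offsets = {}
    pending = []
    offset = 0
    for name, size, alignment, pending_size in fields:
        offset = align_up(offset, alignment)
        primary_offsets[name] = offset
        offset += size
        if pending_size:
            pending.append((name, pending_size))
    # "Flush": append the pending data after all of the primary data.
    pending_offsets = {}
    for name, pending_size in pending:
        offset = align_up(offset, 4)  # assumed alignment of the pending part
        pending_offsets[name] = offset
        offset += pending_size
    return primary_offsets, pending_offsets, offset


# MyStuff: primary part is 8 bytes (a, b); pending part is a 4-byte Bar.
MY_STUFF = (8, 4, 4)

X = layout_cbuffer([("x", *MY_STUFF)])
Y = layout_cbuffer([("y", *MY_STUFF), ("z", 16, 16, 0)])
```

Under these assumptions the pending part of `x` lands at offset 8 in `X`, while the pending part of `y` lands at offset 32 in `Y`, after `z`. That difference is why a variable needs a separate set of offsets for its pending part.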

What all this means is that a type like `MyStuff` will have two different layouts (a default one for the primary data and a "pending" one for any specialized interface-type fields), and a variable like `Y::y` will also have two variable layouts that specify offsets (one set of offsets for its primary part, and one set of offsets for its pending part).

In order to handle interface-type fields with these layout rules, an application needs a way to query the "pending" part of a type or variable layout, which luckily gives it back just another type/variable layout. The API change here is minimal, although actually exploiting the new API correctly in application code could prove challenging.

Allow creation of explicitly specialized types
----------------------------------------------

This feature isn't actually implemented all the way through the compiler (I just needed enough to make the API calls go through), but I've added support for specializing a type that has interface-type fields through the reflection API. This maps to an `ExistentialSpecializedType` in the AST, and I'm lowering it to the IR as a `BindExistentialsType`, although that isn't 100% correct for the future.

This feature will require a future PR to actually flesh out the implementation work, but I'll wait until that is the sticking point on the application side before I do that.

Introduce a tiny `Hasher` abstraction
-------------------------------------

While implementing all the boilerplate for a new `Type` subclass (we really need to reduce that work...), I got fed up with how we do hash-code computation and introduced a small utility `Hasher` type that is intended to wrap up the idiom of combining hashes. For now this isn't a major change, but in the future I'd like to expand on the design a bit to clean up some of the warts around how we handle hashing:

* The `Hasher` implementation can and should switch from maintaining a single `HashCode` as its state to something that contains a more complete state (larger than the hash code) and just hashes new bytes into that state as it goes. This should make it possible to implement a `Hasher` for more serious hash functions, whether MD5, CityHash, or whatever we decide is a good default.

* Things that are hashable shouldn't have a `getHashCode()` method, but instead should have something like a `hashInto(Hasher&amp;)` method. This change would have the dual benefits that (1) a composite type can easily hash all the fields that contribute to its identity into the hasher with minimal fuss/boilerplate, and (2) the hashes for composite types will be of higher quality because they can exploit all the bits of the hasher's state to combine the fields, instead of restricting each sub-field to just the bits in a hash code.
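
The direction described in these two bullets can be sketched in Python as below. This is an illustration of the idea only, not the `Hasher` added in this change: the class and method names are hypothetical, and BLAKE2b from the standard `hashlib` module is used purely as a stand-in for a hash function with a state larger than the final hash code.

```python
import hashlib


class Hasher:
    """Accumulates bytes into a state that is larger than the final hash code."""

    def __init__(self):
        # Stand-in hash function; produces an 8-byte (64-bit) result.
        self._state = hashlib.blake2b(digest_size=8)

    def hash_int(self, value):
        self._state.update(value.to_bytes(8, "little", signed=True))

    def hash_str(self, text):
        encoded = text.encode("utf-8")
        # Hash the length first so adjacent strings cannot run together.
        self.hash_int(len(encoded))
        self._state.update(encoded)

    def hash_object(self, obj):
        # Composite types feed their fields into the shared state.
        obj.hash_into(self)

    def get_result(self):
        return int.from_bytes(self._state.digest(), "little")


class NamedType:
    def __init__(self, name):
        self.name = name

    def hash_into(self, hasher):
        hasher.hash_str(self.name)


class SpecializedType:
    def __init__(self, base, args):
        self.base = base
        self.args = args

    def hash_into(self, hasher):
        hasher.hash_object(self.base)
        hasher.hash_int(len(self.args))
        for arg in self.args:
            hasher.hash_object(arg)


def hash_code(obj):
    hasher = Hasher()
    obj.hash_into(hasher)
    return hasher.get_result()
```

A composite such as `SpecializedType(NamedType("MyStuff"), [NamedType("Bar")])` never computes intermediate hash codes for its fields; every field contributes bytes to the one shared state, and only `get_result()` narrows that state down to a hash code.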

We should be able to incrementally improve the quality of our design there over future changes, but for now it probably isn't a critical priority.

Fixes for legalization of existential types
-------------------------------------------

There were some missing cases in the handling of type legalization, such that a global interface-type shader parameter that got specialized to a type that contains *only* resource-type fields would cause a crash in the legalization step.

I added a test for this case, and then made `ir-legalize-types.cpp` account for it (the code to handle it is a bit of a kludge, and shows that the `declareVars()` routine there is getting to a worrying level of complexity).

* fixup: review feedback
</content>
</entry>
</feed>
