<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/tests/compute/func-resource-param.slang, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2024-11-06T11:14:35+00:00</updated>
<entry>
<title>Make various parameters and return types require specialization when targeting WGSL (#5483)</title>
<updated>2024-11-06T11:14:35+00:00</updated>
<author>
<name>Anders Leino</name>
<email>aleino@nvidia.com</email>
</author>
<published>2024-11-06T11:14:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=f8294202ce8d5658f6308eeaed634058db9bbb4b'/>
<id>urn:sha1:f8294202ce8d5658f6308eeaed634058db9bbb4b</id>
<content type='text'>
Structured buffer types translate to array types in the WGSL emitter.
WGSL doesn't allow passing runtime-sized arrays to functions.
Similarly for pointers to texture handles.
Also, structured buffers (runtime-sized arrays) cannot be returned in WGSL.

This closes issue #5228, issue #5278 and issue #5288 by enabling specialized functions
to be generated in these cases, in order to work around these constraints.</content>
</entry>
<entry>
<title>Enable WebGPU tests in CI (#5239)</title>
<updated>2024-10-15T16:11:53+00:00</updated>
<author>
<name>Anders Leino</name>
<email>aleino@nvidia.com</email>
</author>
<published>2024-10-15T16:11:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=9e3b0367cfd63f21a0519b61b6fd13e94dac1c51'/>
<id>urn:sha1:9e3b0367cfd63f21a0519b61b6fd13e94dac1c51</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Convert more tests to use shader objects (#1659)</title>
<updated>2021-01-15T20:10:06+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2021-01-15T20:10:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=2a5d5b32348c33aac7ca62aa9a4c2bb7cff8e08a'/>
<id>urn:sha1:2a5d5b32348c33aac7ca62aa9a4c2bb7cff8e08a</id>
<content type='text'>
This change converts a large number of our existing tests to use the `ShaderObject` support that was added to the `gfx` layer.

In many cases, tests were just updated to pass `-shaderobj` and the result Just Worked.
In other cases, a `name` attribute had to be added to one or more `TEST_INPUT` lines.

For tests that did not work with shader objects "out of the box," I spent a little bit of time trying to get them work, but fell back to letting those tests run in the older mode.
Future changes to the infrastructure will be needed to get those additional tests working in the new path.

Along with the changes to test files, the following implementation changes were made to get additional tests working:

* Because the shader object mode uses explicit register bindings (from reflection), the hacky logic that was offseting `u` registers for D3D12 based on the number of render targets gets disabled (by another hack).

* The "flat" reflection information coming from Slang was not correctly reporting "binding ranges" for things that consumed only uniform data (which would be everything on CUDA/CPU), so it was refactored to properly include binding ranges for anything where the type of the field/variable implied a binding range should be created (even if the `LayoutResourceKind` was `::Uniform`).

* A few fixes were made to the CUDA implementation of `Renderer`, in order to get additional tests up and running. Most of these changes had to do with texture bindings, which hadn't really been tested previously.

In addition, a few changes were made that were attempts at getting more tests working, but didn't actually help. These could be dropped if requested:

* As a quality-of-life feature (not being used) the `object` style of `TEST_INPUT` line is upgraded to support inferring the type to use from the type of the input being set.

* Any `object` shader input lines get ignored in non-shader-object mode.</content>
</entry>
<entry>
<title>Remove support for explicit register/binding syntax on TEST_INPUT (#1132)</title>
<updated>2019-11-21T22:06:19+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2019-11-21T22:06:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=2ea64ff4f2c7c43b72ff24650330fca79a87500f'/>
<id>urn:sha1:2ea64ff4f2c7c43b72ff24650330fca79a87500f</id>
<content type='text'>
The `TEST_INPUT` facility allows textual Slang test cases to provide two kinds of information to the `render-test` tool:

1. Information on what shader inputs exist
2. Information on what values/objects to bind into those shader inputs

Under the first category of information, there exists supporting for attaching a `dxbinding(...)` annotation to a `TEST_INPUT` which seemingly indicates what HLSL `register` the input uses. There is a similar `glbinding(...)` annotation, used for OpenGL and Vulkan.

It turns out that these annotations were, in practice, completely ignored and had no bearing on how `render-test` allocates or bindings graphics API objects. There was some amount of code attempting to validate that explicit registers/bindings were being set appropriately, but the actual values were being ignored.

The visible consequence of the `dxbinding` and `glbinding` annotations being ignored is issue #1036: the order of `TEST_INPUT` lines was *de facto* determining the registers/bindings that were being used by `render-test`.

This change simply removes the placebo features and strips things down to what is implemented in practice: the `TEST_INPUT` lines do not need target-API-specific binding/register numbers, because their order in the file implicitly defines them.

I added logic to the parsing of `TEST_INPUT` lines to make sure I got an error message on any leftover annotations, and went ahead and systematicaly deleted all of the placebo annotations from our test cases.

If we decide to make `TEST_INPUT` lines *not* depend on order of declaration in the future, we can build it up as a new and better considered feature.

The main alternative I considered was to keep the annotations in place, and change `render-test` and the `gfx` abstraction layer to properly respect them, but that path actually creates much more opportunity for breakage (since every single test case would suddenly be specifying its root signature / pipeline layout via a different path using data that has never been tested). The approach in this change has the benefit of giving me high confidence that all the test cases continue to work just as they had before.</content>
</entry>
<entry>
<title>WIP: CPU compute coverage (#1030)</title>
<updated>2019-08-22T19:58:28+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2019-08-22T19:58:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=06a0e3980fd04fa265bd20eb11f2abc18bd6a215'/>
<id>urn:sha1:06a0e3980fd04fa265bd20eb11f2abc18bd6a215</id>
<content type='text'>
* Add support for '=' when defining a name in test.

* Add support for  double intrinsics.

* Add support for asdouble
Add findOrAddInst - used instead of findOrEmitHoistableInst, for nominal instructions.
Support cloning of string literals.
C++ working on more compute tests.

* Constant buffer support in reflection.
Fixed debugging into source for generated C++.
buffer-layout.slang works.

* Added cpu test result.

* Remove some commented out code.
Comment on next fixes.

* Improvements to reflection CPU code.

* C++ working with ByteAddressBuffer.

* Enabled more compute tests for CPU.

* Enabled more compute tests on CPU.
Added support for [] style access to a vector.

* Enabled more CPU compute tests.

* Handling of buffer-type-splitting.slang
Named buffers can be paths to resources

* Fix some warnings, remove some dead code.

* Fix problem with verification of number of operands for asuint/asint as they can have 1 or 3 operands. asdouble takes 2.

* Fix handling in MemoryArena around aligned allocations. That _allocateAlignedFromNewBlock assumed the block allocated has the aligment that was requested and so did not correct the start address.
</content>
</entry>
<entry>
<title>Specialize away resource-type function parameters (#759)</title>
<updated>2018-12-17T22:48:48+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2018-12-17T22:48:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=3a02c590afdd2624b2c729e989ada9393d708f75'/>
<id>urn:sha1:3a02c590afdd2624b2c729e989ada9393d708f75</id>
<content type='text'>
* Specialize away resource-type function parameters

Work on #397.

Introduction
------------

Suppose a user writes a function that takes a resource type as a parameter:

```hlsl
float4 getThing(RWStructuredBuffer&lt;float4&gt; buffer, int index)
{
    return buffer[index];
}
```

This function creates challenges when generating code for GLSL-based targets, because a global shader parameter of type `RWStructuredBuffer`:

```hlsl
RWStructuredBuffer&lt;float4&gt; gBuffer;
```

translates to a global GLSL `buffer` declaration:

```hlsl
buffer _S0
{
    float4 _data[];
} gBuffer;
```

There is no equivalent to that `buffer` declaration that can be used in function parameter position, and it is illegal in GLSL to pass `gBuffer` into a function.

(Aside: yes, we could in principle translate a function parameter like `RWStructuredBuffer&lt;float4&gt; buffer` to `float4 buffer[]`, but that will not in turn generalize to arrays of structured buffers; it is a dead-end strategy)

The solution employed by many shader compilers is to "inline everything" to eliminate the need for parameters of resource types, and then rely on dataflow optimization to eliminate locals of resource types. This strategy can of course lead to an increase in code size, and it also means that call stacks are lost when doing step-through debugging. Another serious issue is that an "early `return`" from a function can turn into the equivalent of a multi-level `break` when inlined, and not all of our targets support multi-level `break`.

The solution implemented in this change works around some, but not all, of the problems with full inlining.
The approach here generates specialized versions of a function like `getThing`, adapted to the actual arguments provided at different call sites.
Thus if we have code like:

```hlsl
RWStructuredBuffer&lt;float4&gt; gA;
RWStructuredBuffer&lt;float4&gt; gB[10];
...
getThing(gA, x);
getThing(gA, y);
getThing(gB[someVal], z);
```

we will generate two specializations of `getThing`: one specialized for the `buffer` parameter being `gA` and the other for `gB`:

```hlsl
float4 getThing_gA(int index) { return gA[index]; }
float4 getThing_gB(int _val, int index) { return gB[_val][index]; }
```

and the call sites will change to match:

```hlsl
getThing_gA(x);
getThing_gA(y);
getThing_gB(someVal, z);
```

Note how in the case where the argument being passed in was obtained by indexing into an array of resources, the callee is specialized to the identity of the global shader parameter (`gB`), and now accepts a new parameter to indicate the array index into it.

While this description motivates the change based on GLSL output, the same basic issue can arise for other targets.
For example, while current HLSL has added the `ConstantBuffer&lt;T&gt;` type, it is not supported on older targets, and it turns out that even dxc does not allow functions to have `ConstantBuffer&lt;T&gt;` parameters.
Longer-term, we will likely need to do even more aggressive specialization both in order to generate SPIR-V output directly, and also to deal with function that have return values or `out` parameters of resource types.

Implementation
--------------

The meat of the change is in `ir-specialize-resources.{h,cpp}`, where we have a pass that looks at all call sites (`IRCall` instructions) in the program, and attempts to replace them with calls to specialized functions, where the specializations are generated on-demand.
The code in this pass is heavily commented, so hopefully it serves to explain itself all right.

After specialization is complete, we may still have functions like the original `getThing` that will produce invalid code when emitted as GLSL, so we need a way to make sure they don't appear in the output.
To date we've had some very ad hoc approaches for ignoring IR constructs that we don't want to affect emitted code, but this change goes ahead and adds a more real dead code elimination (DCE) pass in `ir-dce.{h,cpp}`.
This pass follows a straightforward approach of tagging instructions that are "live" and then propagating liveness through the whole program, before making a single pass to delete anything that isn't live.

When I first added the DCE pass it eliminated *everything* because there were no "roots" for liveness.
I solved this for now by adding a new decoration, `IREntryPointDecoration`, to mark shader entry points in the IR which should always be live (as should anything they depend on).

A secondary problem that arose was that for GLSL ray tracing shaders it is possible for the incoming/outgoing payload or attributes parameters to be unused, but eliminating them as dead would change the signature of a shader an potential break the rules for how ray tracing programs communicate.
I added a very simple `IRDependsOnDecoration` that allows one IR instruction to keep another alive *as if* it used it, without actually using it.

There's also a fixup in the IR dumping logic where I was forgetting to store anything in the mapping from instruction to their names, so that the name of an instruction was getting incremented each time it was referenced.

Testing
-------

There are three different tests added as part of this change:

* The `compute/func-resource-param` test covers the basic `RWStructuredBuffer` case above, which we expect to work fine for D3D11/12, but fail for Vulkan without specialization.

* The `cross-compile/func-resource-param-array` test covers the case where we don't just have one resource, but an array of them. This is not an end-to-end compute test primarily because our `render-test` application doesn't yet handle arrays of resources correctly in its binding logic.

* The `compute/func-cbuffer-param` test covers the case of a function with a `ConstantBuffer&lt;T&gt;` parameter, which requires specialization to become valid for any of our targets.

* fixup: warnings/errors from other compilers

* fixup: typos and cleanup

* fixup: typos
</content>
</entry>
</feed>
