diff options
| author | Tim Foley <tfoleyNV@users.noreply.github.com> | 2018-12-17 14:48:48 -0800 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2018-12-17 14:48:48 -0800 |
| commit | 3a02c590afdd2624b2c729e989ada9393d708f75 (patch) | |
| tree | ad09bfe9c1bcc132e211c64854ef22b1b42b3678 /tests | |
| parent | d2ddc590601778f309c81f7d19d5e7fed34210de (diff) | |
Specialize away resource-type function parameters (#759)
* Specialize away resource-type function parameters
Work on #397.
Introduction
------------
Suppose a user writes a function that takes a resource type as a parameter:
```hlsl
float4 getThing(RWStructuredBuffer<float4> buffer, int index)
{
return buffer[index];
}
```
This function creates challenges when generating code for GLSL-based targets, because a global shader parameter of type `RWStructuredBuffer`:
```hlsl
RWStructuredBuffer<float4> gBuffer;
```
translates to a global GLSL `buffer` declaration:
```hlsl
buffer _S0
{
float4 _data[];
} gBuffer;
```
There is no equivalent to that `buffer` declaration that can be used in function parameter position, and it is illegal in GLSL to pass `gBuffer` into a function.
(Aside: yes, we could in principle translate a function parameter like `RWStructuredBuffer<float4> buffer` to `float4 buffer[]`, but that will not in turn generalize to arrays of structured buffers; it is a dead-end strategy)
The solution employed by many shader compilers is to "inline everything" to eliminate the need for parameters of resource types, and then rely on dataflow optimization to eliminate locals of resource types. This strategy can of course lead to an increase in code size, and it also means that call stacks are lost when doing step-through debugging. Another serious issue is that an "early `return`" from a function can turn into the equivalent of a multi-level `break` when inlined, and not all of our targets support multi-level `break`.
The solution implemented in this change works around some, but not all, of the problems with full inlining.
The approach here generates specialized versions of a function like `getThing`, adapted to the actual arguments provided at different call sites.
Thus if we have code like:
```hlsl
RWStructuredBuffer<float4> gA;
RWStructuredBuffer<float4> gB[10];
...
getThing(gA, x);
getThing(gA, y);
getThing(gB[someVal], z);
```
we will generate two specializations of `getThing`: one specialized for the `buffer` parameter being `gA` and the other for `gB`:
```hlsl
float4 getThing_gA(int index) { return gA[index]; }
float4 getThing_gB(int _val, int index) { return gB[_val][index]; }
```
and the call sites will change to match:
```hlsl
getThing_gA(x);
getThing_gA(y);
getThing_gB(someVal, z);
```
Note how in the case where the argument being passed in was obtained by indexing into an array of resources, the callee is specialized to the identity of the global shader parameter (`gB`), and now accepts a new parameter to indicate the array index into it.
While this description motivates the change based on GLSL output, the same basic issue can arise for other targets.
For example, while current HLSL has added the `ConstantBuffer<T>` type, it is not supported on older targets, and it turns out that even dxc does not allow functions to have `ConstantBuffer<T>` parameters.
Longer-term, we will likely need to do even more aggressive specialization both in order to generate SPIR-V output directly, and also to deal with function that have return values or `out` parameters of resource types.
Implementation
--------------
The meat of the change is in `ir-specialize-resources.{h,cpp}`, where we have a pass that looks at all call sites (`IRCall` instructions) in the program, and attempts to replace them with calls to specialized functions, where the specializations are generated on-demand.
The code in this pass is heavily commented, so hopefully it serves to explain itself all right.
After specialization is complete, we may still have functions like the original `getThing` that will produce invalid code when emitted as GLSL, so we need a way to make sure they don't appear in the output.
To date we've had some very ad hoc approaches for ignoring IR constructs that we don't want to affect emitted code, but this change goes ahead and adds a more real dead code elimination (DCE) pass in `ir-dce.{h,cpp}`.
This pass follows a straightforward approach of tagging instructions that are "live" and then propagating liveness through the whole program, before making a single pass to delete anything that isn't live.
When I first added the DCE pass it eliminated *everything* because there were no "roots" for liveness.
I solved this for now by adding a new decoration, `IREntryPointDecoration`, to mark shader entry points in the IR which should always be live (as should anything they depend on).
A secondary problem that arose was that for GLSL ray tracing shaders it is possible for the incoming/outgoing payload or attributes parameters to be unused, but eliminating them as dead would change the signature of a shader an potential break the rules for how ray tracing programs communicate.
I added a very simple `IRDependsOnDecoration` that allows one IR instruction to keep another alive *as if* it used it, without actually using it.
There's also a fixup in the IR dumping logic where I was forgetting to store anything in the mapping from instruction to their names, so that the name of an instruction was getting incremented each time it was referenced.
Testing
-------
There are three different tests added as part of this change:
* The `compute/func-resource-param` test covers the basic `RWStructuredBuffer` case above, which we expect to work fine for D3D11/12, but fail for Vulkan without specialization.
* The `cross-compile/func-resource-param-array` test covers the case where we don't just have one resource, but an array of them. This is not an end-to-end compute test primarily because our `render-test` application doesn't yet handle arrays of resources correctly in its binding logic.
* The `compute/func-cbuffer-param` test covers the case of a function with a `ConstantBuffer<T>` parameter, which requires specialization to become valid for any of our targets.
* fixup: warnings/errors from other compilers
* fixup: typos and cleanup
* fixup: typos
Diffstat (limited to 'tests')
| -rw-r--r-- | tests/compute/func-cbuffer-param.slang | 40 | ||||
| -rw-r--r-- | tests/compute/func-cbuffer-param.slang.expected.txt | 4 | ||||
| -rw-r--r-- | tests/compute/func-resource-param.slang | 35 | ||||
| -rw-r--r-- | tests/compute/func-resource-param.slang.expected.txt | 4 | ||||
| -rw-r--r-- | tests/cross-compile/func-resource-param-array.slang | 62 | ||||
| -rw-r--r-- | tests/cross-compile/func-resource-param-array.slang.glsl | 91 | ||||
| -rw-r--r-- | tests/ir/string-literal.slang.expected | 1 |
7 files changed, 237 insertions, 0 deletions
diff --git a/tests/compute/func-cbuffer-param.slang b/tests/compute/func-cbuffer-param.slang new file mode 100644 index 000000000..5730272ab --- /dev/null +++ b/tests/compute/func-cbuffer-param.slang @@ -0,0 +1,40 @@ +// func-cbuffer-param.slang + +// Test that passing a `ConstantBuffer<X>` parameter +// into a function works across all target. + +//TEST(compute):COMPARE_COMPUTE_EX:-slang -compute +//TEST(compute, vulkan):COMPARE_COMPUTE_EX:-dx12 -compute -use-dxil +//TEST(compute, vulkan):COMPARE_COMPUTE_EX:-vk -compute + +struct Data +{ + int4 val[4]; +} + +//TEST_INPUT:cbuffer(data=[0 1 2 3 16 17 18 19 32 33 34 35 48 49 50 51]):dxbinding(0),glbinding(0) +ConstantBuffer<Data> a; + +//TEST_INPUT:cbuffer(data=[16 17 18 19 32 33 34 35 48 49 50 51 64 65 66 67]):dxbinding(1),glbinding(1) +ConstantBuffer<Data> b; + +//TEST_INPUT:ubuffer(data=[0 0 0 0], stride=4):dxbinding(0),glbinding(2),out +RWStructuredBuffer<int> outputBuffer; + +int helper(ConstantBuffer<Data> buffer, int index) +{ + return buffer.val[index].x; +} + +int test(int val) +{ + return val + helper(a, val) + helper(b, val); +} + +[numthreads(4, 1, 1)] +void computeMain(uint3 dispatchThreadID : SV_DispatchThreadID) +{ + int inVal = (int) dispatchThreadID.x; + int outVal = test(inVal); + outputBuffer[dispatchThreadID.x] = outVal; +}
\ No newline at end of file diff --git a/tests/compute/func-cbuffer-param.slang.expected.txt b/tests/compute/func-cbuffer-param.slang.expected.txt new file mode 100644 index 000000000..35000ff87 --- /dev/null +++ b/tests/compute/func-cbuffer-param.slang.expected.txt @@ -0,0 +1,4 @@ +10 +31 +52 +73 diff --git a/tests/compute/func-resource-param.slang b/tests/compute/func-resource-param.slang new file mode 100644 index 000000000..19784b108 --- /dev/null +++ b/tests/compute/func-resource-param.slang @@ -0,0 +1,35 @@ +// func-resource-param.slang + +// Test that a function with a resource parameter that +// requires non-trivial legalization can be compiled +// to work on GLSL-based targets. + +//TEST(compute):COMPARE_COMPUTE_EX:-slang -compute +//TEST(compute, vulkan):COMPARE_COMPUTE_EX:-dx12 -compute +//TEST(compute, vulkan):COMPARE_COMPUTE_EX:-vk -compute + +//NO_TEST:SIMPLE:-target glsl -entry computeMain -stage compute -validate-ir -dump-ir + +//TEST_INPUT:ubuffer(data=[0 0 0 0], stride=4):dxbinding(0),glbinding(0),out +RWStructuredBuffer<int> outputBuffer; + +//TEST_INPUT:ubuffer(data=[0 16 32 48], stride=4):dxbinding(1),glbinding(1) +RWStructuredBuffer<int> inputBuffer; + +int helper(RWStructuredBuffer<int> buffer, int index) +{ + return buffer[index]; +} + +int test(int val) +{ + return helper(inputBuffer, val) + val; +} + +[numthreads(4, 1, 1)] +void computeMain(uint3 dispatchThreadID : SV_DispatchThreadID) +{ + int inVal = (int) dispatchThreadID.x; + int outVal = test(inVal); + outputBuffer[dispatchThreadID.x] = outVal; +}
\ No newline at end of file diff --git a/tests/compute/func-resource-param.slang.expected.txt b/tests/compute/func-resource-param.slang.expected.txt new file mode 100644 index 000000000..d4cb1cc00 --- /dev/null +++ b/tests/compute/func-resource-param.slang.expected.txt @@ -0,0 +1,4 @@ +0 +11 +22 +33 diff --git a/tests/cross-compile/func-resource-param-array.slang b/tests/cross-compile/func-resource-param-array.slang new file mode 100644 index 000000000..7062169dc --- /dev/null +++ b/tests/cross-compile/func-resource-param-array.slang @@ -0,0 +1,62 @@ +// func-resource-param-array.slang + +//TEST:CROSS_COMPILE:-target spirv-assembly -entry main -stage compute + +// Test that we gernerate expected code for scenarios involving +// resource-type function parameters, even when working with +// arrays of resources. + +int f(RWStructuredBuffer<int> fx, uint fi) { return fx[fi] ; } + +// TODO: Note that we are declaring the function +// parameter here with an explicitly-sized array +// because Slang currently doesn't support converison +// from a sized to an unsized array type. +// +int g(RWStructuredBuffer<int> gx[3], uint gi, uint gj) { return gx[gi][gj]; } + +RWStructuredBuffer<int> a; +RWStructuredBuffer<int> b[3]; + +// Note: Slang currently genreates an array-of-arrays in the output +// for this declaration, which glslang complains leads to invalid +// SPIR-V. This means that there is yet another legalization step +// that Slang should perform on this declaration. +// +// For now we are fine with generating invalid SPIR-V, because +// we are not going to execute the output of this test case. +// +RWStructuredBuffer<int> c[4][3]; + +void main(uint3 tid : SV_DispatchThreadID) +{ + uint ii = tid.x; + uint jj = tid.y; + uint kk = tid.z; + + // Can we specialize `f`? + // + int tmp = f(a, ii); + + // If we ask for the same specialization again, do + // we avoid code duplication? + // + tmp += f(a, jj); + + // If we pass in a reference to an array element, + // can we still specialize? + // + tmp += f(b[ii], jj); + + // If we have a function that takes an *array* can + // we specialize? + // + tmp += g(b, ii, jj); + + // What if the function takes an array, and we pass + // in an element of an array-of-arrays? + // + tmp += g(c[ii], jj, kk); + + a[ii] = tmp; +} diff --git a/tests/cross-compile/func-resource-param-array.slang.glsl b/tests/cross-compile/func-resource-param-array.slang.glsl new file mode 100644 index 000000000..6224ccd1c --- /dev/null +++ b/tests/cross-compile/func-resource-param-array.slang.glsl @@ -0,0 +1,91 @@ +// func-resource-param-array.slang.glsl +#version 450 + +#define a a_0 +#define b b_0 +#define c c_0 +#define ii ii_0 +#define jj jj_0 +#define kk kk_0 + +#define f_a f_0 +#define f_b f_1 +#define g_b g_0 +#define g_c g_1 + +#define a_block _S1 +#define b_block _S2 +#define c_block _S3 + +#define f_a_i _S4 +#define f_b_t _S5 +#define f_b_i _S6 +#define g_b_i _S7 +#define g_b_j _S8 +#define g_c_t _S9 +#define g_c_i _S10 +#define g_c_j _S11 +#define tmp_f_a_ii _S12 +#define tmp_f_a_jj _S13 +#define tmp_f_b _S14 +#define tmp_g_b _S15 +#define tmp_g_c _S16 + +layout(std430, binding = 0) buffer a_block { + int _data[]; +} a; + +layout(std430, binding = 1) buffer b_block { + int _data[]; +} b[3]; + +layout(std430, binding = 2) buffer c_block { + int _data[]; +} c[4][3]; + +int f_a(uint f_a_i) +{ + return a._data[f_a_i]; +} + +int f_b(uint f_b_t, uint f_b_i) +{ + return b[f_b_t]._data[f_b_i]; +} + +int g_b(uint g_b_i, uint g_b_j) +{ + return b[g_b_i]._data[g_b_j]; +} + +int g_c(uint g_c_t, uint g_c_i, uint g_c_j) +{ + return c[g_c_t][g_c_i]._data[g_c_j]; +} + +layout(local_size_x = 1, local_size_y = 1, local_size_z = 1) in; + +void main() +{ + uint ii = gl_GlobalInvocationID.x; + uint jj = gl_GlobalInvocationID.y; + uint kk = gl_GlobalInvocationID.z; + + int tmp_f_a_ii = f_a(ii); + + int tmp_f_a_jj = f_a(jj); + int tmp_0 = tmp_f_a_ii + tmp_f_a_jj; + + int tmp_f_b = f_b(ii, jj); + int tmp_1 = tmp_0 + tmp_f_b; + + int tmp_g_b = g_b(ii, jj); + int tmp_2 = tmp_1 + tmp_g_b; + + int tmp_g_c = g_c(ii, jj, kk); + int tmp_3 = tmp_2 + tmp_g_c; + + a._data[ii] = tmp_3; + + return; +} diff --git a/tests/ir/string-literal.slang.expected b/tests/ir/string-literal.slang.expected index 7bc66d682..b86eab2c8 100644 --- a/tests/ir/string-literal.slang.expected +++ b/tests/ir/string-literal.slang.expected @@ -1,5 +1,6 @@ result code = 0 standard error = { +[entryPoint] [export("_S04mainp1puV")] [nameHint("main")] func %main : Func(Void, UInt) |
