diff options
| author | Tim Foley <tfoleyNV@users.noreply.github.com> | 2018-12-17 14:48:48 -0800 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2018-12-17 14:48:48 -0800 |
| commit | 3a02c590afdd2624b2c729e989ada9393d708f75 (patch) | |
| tree | ad09bfe9c1bcc132e211c64854ef22b1b42b3678 /source/slang/ir.cpp | |
| parent | d2ddc590601778f309c81f7d19d5e7fed34210de (diff) | |
Specialize away resource-type function parameters (#759)
* Specialize away resource-type function parameters
Work on #397.
Introduction
------------
Suppose a user writes a function that takes a resource type as a parameter:
```hlsl
float4 getThing(RWStructuredBuffer<float4> buffer, int index)
{
return buffer[index];
}
```
This function creates challenges when generating code for GLSL-based targets, because a global shader parameter of type `RWStructuredBuffer`:
```hlsl
RWStructuredBuffer<float4> gBuffer;
```
translates to a global GLSL `buffer` declaration:
```hlsl
buffer _S0
{
float4 _data[];
} gBuffer;
```
There is no equivalent to that `buffer` declaration that can be used in function parameter position, and it is illegal in GLSL to pass `gBuffer` into a function.
(Aside: yes, we could in principle translate a function parameter like `RWStructuredBuffer<float4> buffer` to `float4 buffer[]`, but that will not in turn generalize to arrays of structured buffers; it is a dead-end strategy)
The solution employed by many shader compilers is to "inline everything" to eliminate the need for parameters of resource types, and then rely on dataflow optimization to eliminate locals of resource types. This strategy can of course lead to an increase in code size, and it also means that call stacks are lost when doing step-through debugging. Another serious issue is that an "early `return`" from a function can turn into the equivalent of a multi-level `break` when inlined, and not all of our targets support multi-level `break`.
The solution implemented in this change works around some, but not all, of the problems with full inlining.
The approach here generates specialized versions of a function like `getThing`, adapted to the actual arguments provided at different call sites.
Thus if we have code like:
```hlsl
RWStructuredBuffer<float4> gA;
RWStructuredBuffer<float4> gB[10];
...
getThing(gA, x);
getThing(gA, y);
getThing(gB[someVal], z);
```
we will generate two specializations of `getThing`: one specialized for the `buffer` parameter being `gA` and the other for `gB`:
```hlsl
float4 getThing_gA(int index) { return gA[index]; }
float4 getThing_gB(int _val, int index) { return gB[_val][index]; }
```
and the call sites will change to match:
```hlsl
getThing_gA(x);
getThing_gA(y);
getThing_gB(someVal, z);
```
Note how in the case where the argument being passed in was obtained by indexing into an array of resources, the callee is specialized to the identity of the global shader parameter (`gB`), and now accepts a new parameter to indicate the array index into it.
While this description motivates the change based on GLSL output, the same basic issue can arise for other targets.
For example, while current HLSL has added the `ConstantBuffer<T>` type, it is not supported on older targets, and it turns out that even dxc does not allow functions to have `ConstantBuffer<T>` parameters.
Longer-term, we will likely need to do even more aggressive specialization both in order to generate SPIR-V output directly, and also to deal with function that have return values or `out` parameters of resource types.
Implementation
--------------
The meat of the change is in `ir-specialize-resources.{h,cpp}`, where we have a pass that looks at all call sites (`IRCall` instructions) in the program, and attempts to replace them with calls to specialized functions, where the specializations are generated on-demand.
The code in this pass is heavily commented, so hopefully it serves to explain itself all right.
After specialization is complete, we may still have functions like the original `getThing` that will produce invalid code when emitted as GLSL, so we need a way to make sure they don't appear in the output.
To date we've had some very ad hoc approaches for ignoring IR constructs that we don't want to affect emitted code, but this change goes ahead and adds a more real dead code elimination (DCE) pass in `ir-dce.{h,cpp}`.
This pass follows a straightforward approach of tagging instructions that are "live" and then propagating liveness through the whole program, before making a single pass to delete anything that isn't live.
When I first added the DCE pass it eliminated *everything* because there were no "roots" for liveness.
I solved this for now by adding a new decoration, `IREntryPointDecoration`, to mark shader entry points in the IR which should always be live (as should anything they depend on).
A secondary problem that arose was that for GLSL ray tracing shaders it is possible for the incoming/outgoing payload or attributes parameters to be unused, but eliminating them as dead would change the signature of a shader an potential break the rules for how ray tracing programs communicate.
I added a very simple `IRDependsOnDecoration` that allows one IR instruction to keep another alive *as if* it used it, without actually using it.
There's also a fixup in the IR dumping logic where I was forgetting to store anything in the mapping from instruction to their names, so that the name of an instruction was getting incremented each time it was referenced.
Testing
-------
There are three different tests added as part of this change:
* The `compute/func-resource-param` test covers the basic `RWStructuredBuffer` case above, which we expect to work fine for D3D11/12, but fail for Vulkan without specialization.
* The `cross-compile/func-resource-param-array` test covers the case where we don't just have one resource, but an array of them. This is not an end-to-end compute test primarily because our `render-test` application doesn't yet handle arrays of resources correctly in its binding logic.
* The `compute/func-cbuffer-param` test covers the case of a function with a `ConstantBuffer<T>` parameter, which requires specialization to become valid for any of our targets.
* fixup: warnings/errors from other compilers
* fixup: typos and cleanup
* fixup: typos
Diffstat (limited to 'source/slang/ir.cpp')
| -rw-r--r-- | source/slang/ir.cpp | 79 |
1 files changed, 72 insertions, 7 deletions
diff --git a/source/slang/ir.cpp b/source/slang/ir.cpp index 0d93957c8..60e983711 100644 --- a/source/slang/ir.cpp +++ b/source/slang/ir.cpp @@ -555,6 +555,23 @@ namespace Slang return entryBlock->getFirstParam(); } + IRParam* IRGlobalValueWithParams::getLastParam() + { + auto entryBlock = getFirstBlock(); + if(!entryBlock) return nullptr; + + return entryBlock->getLastParam(); + } + + IRInstList<IRParam> IRGlobalValueWithParams::getParams() + { + auto entryBlock = getFirstBlock(); + if(!entryBlock) return IRInstList<IRParam>(); + + return entryBlock->getParams(); + } + + // IRFunc IRType* IRFunc::getResultType() { return getDataType()->getResultType(); } @@ -2774,15 +2791,10 @@ namespace Slang } } - - static String getName( + static String createName( IRDumpContext* context, IRInst* value) { - String name = 0; - if (context->mapValueToName.TryGetValue(value, name)) - return name; - if(auto nameHintDecoration = value->findDecoration<IRNameHintDecoration>()) { String nameHint = nameHintDecoration->getName(); @@ -2811,6 +2823,19 @@ namespace Slang } } + static String getName( + IRDumpContext* context, + IRInst* value) + { + String name; + if (context->mapValueToName.TryGetValue(value, name)) + return name; + + name = createName(context, value); + context->mapValueToName.Add(value, name); + return name; + } + static void dumpID( IRDumpContext* context, IRInst* inst) @@ -3747,6 +3772,7 @@ namespace Slang case kIROp_GlobalGenericParam: case kIROp_WitnessTable: case kIROp_WitnessTableEntry: + case kIROp_Block: return false; case kIROp_Nop: @@ -3808,6 +3834,19 @@ namespace Slang return nullptr; } + // + // IRType + // + + IRType* unwrapArray(IRType* type) + { + IRType* t = type; + while( auto arrayType = as<IRArrayTypeBase>(t) ) + { + t = arrayType->getElementType(); + } + return t; + } // // Legalization of entry points for GLSL: @@ -4880,6 +4919,7 @@ namespace Slang void legalizeRayTracingEntryPointParameterForGLSL( GLSLLegalizationContext* context, + IRFunc* func, IRParam* pp, VarLayout* paramLayout) { @@ -4902,6 +4942,31 @@ namespace Slang builder->addLayoutDecoration(globalParam, paramLayout); moveValueBefore(globalParam, builder->getFunc()); pp->replaceUsesWith(globalParam); + + // Because linkage between ray-tracing shaders is + // based on the type of incoming/outgoing payload + // and attribute parameters, it would be an error to + // eliminate the global parameter *even if* it is + // not actually used inside the entry point. + // + // We attach a decoration to the entry point that + // makes note of the dependency, so that steps + // like dead code elimination cannot get rid of + // the parameter. + // + // TODO: We could consider using a structure like + // this for *all* of the entry point parameters + // that get moved to the global scope, since SPIR-V + // ends up requiring such information on an `OpEntryPoint`. + // + // As a further alternative, we could decide to + // keep entry point varying input/outtput attached + // to the parameter list through all of the Slang IR + // steps, and only declare it as global variables at + // the last minute when emitting a GLSL `main` or + // SPIR-V for an entry point. + // + builder->addDependsOnDecoration(func, globalParam); } void legalizeEntryPointParameterForGLSL( @@ -5059,7 +5124,7 @@ namespace Slang case Stage::Intersection: case Stage::Miss: case Stage::RayGeneration: - legalizeRayTracingEntryPointParameterForGLSL(context, pp, paramLayout); + legalizeRayTracingEntryPointParameterForGLSL(context, func, pp, paramLayout); return; } |
