|
This change adds support for the `[[vk::location(...)]]` and `[[vk::index(...)]]` attributes, which can be used together to mark up shader outputs for dual-source blending on Vulkan. HLSL/Slang code like the following:
```hlsl
struct Output
{
[[vk::location(0)]]
float4 a : SV_Target0;
[[vk::location(0), vk::index(1)]]
float4 b : SV_Target1;
}
[shader("fragment")]
Output main(...) { ...}
```
can be used to set up dual-source blending on both D3D and Vulkan APIs. The output GLSL for the above will look something like:
```glsl
layout(location = 0) out vec4 a;
layout(location = 0, index = 1) out vec4 b;
void main() { ... }
```
The more or less straightforward parts of this change were:
* Added new `attribute_syntax` declarations to the stdlib, for `[[vk::location(...)]]` and `[[vk::index(...)]]`
* Added new AST node types for the new attribute cases, sharing a base class so that argument checking can be shared
* Added checks for the arguments to the new attributes in `slang-check-modifier.cpp` (eventually this kind of logic shouldn't be needed for new attributes)
* Updated GLSL emit logic so that it treats the `index`/`space` parts of a variable layout as the `location`/`index` for varying parameters.
* Updated GLSL legalization so that when it translates entry-point parameters into globals (and scalarizes structures) it handles both a binding index and space for the parameters.
* Added a cross-compilation test case to verify that the basics of the feature work
The remaining work is all in `slang-parameter-binding.cpp`.
There is some work that isn't technically related to this change (and which could be reverted if it causes problems), around the detection and handling of fragment shader outputs with `SV_Target` semantics. The basic changes (which could be backed out and then merged separately) are:
* Made the special-case `SV_Target` logic only trigger for fragment shaders (that is the only place where `SV_Target` should appear, but we weren't guarding against it)
* Made the logic to reserve a `u<N>` register for `SV_Target<N>` only trigger for D3D Shader Model 5.0 and below (since it is not required for SM 5.1 and up). This could be a breaking change for some users, but that seems unlikely.
* Fixed one test case that relied on the behavior of reserving `u0` for `SV_Target0` even though it was a SM6.0 test.
* Also added more comments to the system-value handling logic.
The more interesting changes come up starting in `processEntryPointVaryingParameterDecl()`. The basic issue is that we have so far only supported implicit layout for varying parameters on GLSL/Vulkan, but the `[[vk::location(...)]]` attribute is a form of explicit layout annotation. Rather than try to kludge something that only works in narrow cases, I instead opted to try to fix things more generally.
In `processEntryPointVaryingParameterDecl()` we now check for the `location` and `index` attributes when we are on "Khronos" targets (Vulkan/OpenGL/GLSL) and immediately add them to the variable layout being constructed if they are found. There is nothing in this logic specific to fragment-shader outputs, so this feature now applies to any varying input/output on Khronos targets.
Allowing explicit layouts creates the potential for mixing implicit and explicit layout. For example, consider:
```hlsl
struct Output
{
float4 color : COLOR;
[[vk::location(0)]] float3 normal : NORMAL;
}
```
What `location` should `color` get? Should this code be an error? There are two cases where this conundrum can come up: when working with `struct` types used for varying parameters, and the entry-point parameter list itself.
For the varying `struct` case we currently make an expedient choice. We handle fields with both implicit or explicit layotu with appropriate logic, but logic that doesn't account for the case of mixing the two. Then at the end of layout for the `struct` we issue an error if there was a mix of implicit and explicit layout (such that our results aren't likely to be valid).
For the entry point varying parameter case, things were already using a `ScopeLayoutBuilder` type (that encapsulates some logic shared between entry-point and global parameters). The entry-point-specific bits were moved out into a `SimpleScopeLayoutBuilder` and it was updated so that rather than assuming all parameters use implicit layout it does a two-phase layout approach similar to what we use for the global scope:
* First all parameters are enumerated to collect explicit bindings and mark certain ranges as "used"
* Next the parameters are enumerated again and those without explicit bindings get allocated space using a "first fit" algorithm
In principle we could extend the two-phase approach to apply to `struct` types as well, but that would be best saved for a future refactoring of some of this parameter binding logic, since I would like to exploit more of the opportunities for sharing code across the uniform/varying and struct/entry-point/global cases.
By moving the point where entry point parameters get their offsets assigned, it was necessary to move around some of the logic that removes varying parameter usage (and other things that shouldn't "leak" out of an entry point) to a different point in the entry point layout process.
While adding these various pieces does not quite enable us to support explicit bindings on entry point parameters (e.g., putting `uniform Texture2D t : register(t0)` in an entry point parameter list) or in `struct` types (e.g., explicit `packoffset` annotations on fields), it starts to provide some of the infrastructure that we'd need in order to support those cases.
|
|
* Change how buffers are emitted
This is a change with a lot of pieces, which can't always be separated out cleanly. I'm going to walk through them in what I hope is a logical order.
The main goal of this change was to allow arrays of structured buffers to translate to Vulkan. Consider two declarations of structured buffers in HLSL/Slang:
```hlsl
StructuredBuffer<X> single;
StructuredBuffer<Y> multiple[10];
```
The current translation logic was handling `single` by translating it into an *unnamed* GLSL `buffer` block like:
```glsl
layout(std430)
buffer _S1
{
X single[];
};
```
That syntax allows an expression like `single[i]` in Slang to be translated simply as `single[i]` in GLSL.
But that naive translating doesn't work for `multiple`, since we need to declare a array of blocks in GLSL, which requires giving the whole thing a name:
```glsl
layout(std430)
buffer _S2
{
Y _data[];
} multiple[10];
```
Now a reference to `multiple[i][j]` in Slang needs to become `multiple[i]._data[j]` in GLSL.
To avoid having way too many special cases around single structured buffers vs. arrays, it makes sense to allows emit things in the latter form, so that we instead lower `single` as:
```glsl
layout(std430)
buffer _S1
{
X _data[];
} single;
```
So that now a reference to `single[i]` becomes `single._data[i]` in GLSL.
Most of that can be handled in the standard library translation of the structured buffer indexing operations.
The only wrinkle there is that there were some *old* special-case instructions in the IR intended to handle buffer load/store operations (these were added back when I was trying to keep the "VM" path working). These aren't really needed to have structured-buffer operations work; they can be handled as ordinary functions as far as the stdlib is concerned. I removed the old instructions.
Along the way, it became clear that a few other cases follow the same pattern. Byte-addressed buffers are an obvious case. We were lowering HLSL/Slang:
```hlsl
ByteAddressBuffer b;
...
uint x = b.Load(0);
```
to GLSL like:
```glsl
layout(std430)
buffer _S1
{
uint b[];
};
...
uint x = b[0];
```
That logic would fail for arrays the same way that the structured buffer case was failing. The fix is the same: use named `buffer` blocks and then introduce an explicit `_data` field:
```glsl
layout(std430)
buffer _S1
{
uint _data[];
} b;
...
uint x = b._data[0];
```
Just like with structured buffers, all of the VK translation for operations on byte-addressed buffers can be implemented directly in teh stdlib, so once the emit logic was changed it was just a matter of adding `._data` to a bunch of VK tranlsations.
It turns out that arrays of constant buffers have more or less the same problem, and furthermore we have some problems with any code that directly uses the modern HLSL `ConstantBuffer<T>` type.
Note: the emit logic around constant buffers sometimes refers to "parameter groups" because that is being used in the compiler as a catch-all term for constant buffers, texture buffers, and parameter blocks.
The existing code was going out of its way to reproduce the way that constant buffer declarations are implicitly referenced in HLSL:
```hlsl
cbuffer C { float f; }
...
float tmp = f; // No reference to `C` here
```
This can be seen in the emit logic with the `isDerefBaseImplicit` function, which is used to take the internal IR representation for a reference to `f` (which is closer to the expression `(*C).f` or `C->f`) and leave off any reference to `C` so that we emit just `f`.
That kind of logic just flat out doesn't work in some important cases. Arrays of constant buffers are a clear one:
```hlsl
ConstantBuffer<X> cbArray[3];
...
X x = cbArray[0];
```
There is no way to translate that to an ordinary `cbuffer` declaration at all. The same problem can be created without arrays, though:
```hlsl
ConstantBuffer<X> singleCB;
...
X x = singleCB;
```
The current strategy for translating constant buffers was translating `singleCB` into a `cbuffer` declaration that reproduced the fields of `X` as its members, which just wouldn't work:
```hlsl
cbuffer singleCB
{
float f; // field of `X`
}
...
X x = singleCB; // ERROR: there is nothing named `singleCB` in this HLSL
```
The new strategy is more consistent. We still generate a `cbuffer` declaration for a single constant buffer, but we always give it a single field of the chosen element type:
```hlsl
cbuffer singleCB
{
X singleCB;
}
...
X x = singleCB; // this works fine!
```
And in the array case we generate code that uses the explicit `ConstantBuffer<T>` type:
```hlsl
ConstantBuffer<X> cbArray[3];
...
X x = cbArray[0];
```
The GLSL output is more complicated because unlike with HLSL there is no implicit conversion from a uniform block to its element type (there is no notion of an element type). The array case thus needs a `_data` field similar to what we do for structured buffers:
```glsl
layout(std140)
uniform _S3
{
X _data;
} cbArray[3];
...
X x = cbArray[0]._data;
```
And then the non-array case needs to have a similar `_data` field for consistency:
```glsl
layout(std140)
uniform _S1
{
X _data;
} singleCB;
...
X x = singleCB._data;
```
This is handled by inserting the necessary reference to `_data` whenever we dereference a constant buffer, either as part of a load instruction (loading from the whole CB as a pointer), or an `IRFieldAddress` instruction which forms a pointer into the CB (e.g., `&(singleCB->f)` becomes `singleCB._data.f`).
The current emit logic handles `ParameterBlock<X>` differently from `ConstantBuffer<X>`, but really only to allow parameter blocks to be explicitly named in the output, while constant buffers were left implicit by default. Thus the only difference was a legacy one (from back when trying to exactly reproduce the HLSL text we got as input was considered an important goal), and the new approach to emitting constant buffers would get rid of it.
I removed the separate logic for emitting `ParameterBlock<X>` and just let the handling for constant buffers deal with it.
Note that any resource types inside of a `ParameterBlock<X>` would have been moved out as part of legalization, so that a parameter block is 100% equivalent to a constant buffer when it comes time to emit code.
Unsurprisingly, changing the way we generate HLSL and GLSL output for all these buffer types meant that any tests that were directly comparing the output of `slangc` against `fxc`, `dxc`, or `glslang` broke.
The basic approach to fixing the breakage in GLSL tests was to update the GLSL baseline to reflect the new output startegy. In some cases I used macros to name the various `_S<digits>` temporaries so that future renaming will hopefully be easier (it would be great if we auto-generated temporary names with a bit more context). There was one GLSL test (`tests/bugs/vk-structured-buffer-binding`) that was using raw GLSL expected output, and this was changed to use a GLSL baseline to generate SPIR-V for comparison.
For HLSL tests we were sometimes running the same input file through `slangc` and `fxc`/`dxc`, and in these cases I macro-ized the various `cbuffer` declarations to generate different declarations depending on the compiler.
I completely dropped the tests coming from the D3D SDK because they aren't providing much coverage, and updating them would change them so far from the original code that the purported benefit (using a body of existing shaders) would be lost.
I also dropped the explicit matrix layout qualifiers in the `matrix-layout` test because the new output strategy breaks those for GLSL (you can't put matrix layout qualifiers on `struct` fields, and now the body of every constant buffer is inside a `struct`). This isn't as big of a loss as it seems, because our handling of those qualifiers wasn't really right to begin with. Slang users should only be setting the matrix layout mode globally (and we should probably switch to error out on the explicit qualifiers for now).
The other thing that got dropped is tests involving `packoffset` modifiers.
Slang already warns that it doesn't support these, and the way they were used in the test cases is actually misleading. For the binding/layout-related tests, the goal was to show that Slang reproduces the same layout as fxc, in which case explicitly enforcing a layout via `packoffset` seems like cheating (are we sure we enforced the layout fxc would have produced?). The real reason was that Slang used to emit explicit `packoffset` on *every* field of a `cbuffer` it would output, because of an `fxc` bug where you couldn't use `register` on textures/samplers declared inside a `cbuffer` unless *every* field in the `cbuffer` used a `register` or `packoffset` modifier. Slang hasn't required that behavior in a while because it now splits textures and samplers, and the one test case where we needed `packoffset` to work around the `fxc` bug in the baseline HLSL has been macro-ified even more to work around the bug.
The amount of churn in the test cases is unfortunate, but it continues to point at the weakness of any testing strategy that checks for exact equivalent between Slang's output and that of other compilers. We need to keep working to replace these tests with better alternatives.
In `check.cpp` there is logic to perform implicit dereferencing, so that if you write `obj.f` where `obj` is a `ConstantBuffer<X>` (or some other "pointer-like" type) and `f` is a field in `X`, then this effectively translates as `(*obj).f`. That is, we dereference the value of type `ConstantBuffer<X>` to get a value of type `X`, and then refer to the field of the `X` value.
There was a problem where the logic to insert that kind of implicit dereference operation was using a reference (`auto& type = ...`) for the type of the expression being dereferenced, and then clobbering it. This would mean that an expression of type `ConstantBuffer<X>` would have its type overwritten to be just `X` and then codegen would break later on.
I'm not sure how we haven't run into that before.
The `array-of-buffers` test case was added to confirm that we now support arrays of constant, structured, and byte-address buffers for both DXIL and SPIR-V output.
Okay, so that was a lot of stuff, but hopefully it is clear how this all works to make the output of the compiler more consistent and explicit, while also supporting the required new functionality.
* fixup: review feedback
|