| Age | Commit message (Collapse) | Author |
|
* Initial support for uniform parameters on entry points
The basic feature this work adds is the ability to define a shader entry point like:
```hlsl
[shader("fragment")]
float4 main(
uniform Texture2D t,
uniform SamplerState s,
float2 uv : UV)
{
return t.Sample(s,uv);
}
```
In this example, the `uniform` keyword is used to mark that the given entry point parameters are *not* varying input/output flowing through the pipeline, but rather uniform shader parameters that should function as if the shader was declared more like:
```hlsl
Texture2D t,
SamplerState s,
[shader("fragment")]
float4 main(
float2 uv : UV)
{
return t.Sample(s,uv);
}
```
Allowing `uniform` parameters on entry points makes it easier to define multiple entry points in one file without accidentally polluting the global scope with shader parameters that only certain entry points care about.
This feature is also more or less a prerequisite for allowing generic type parameters directly on entry point functions, since the main use case for those type parameters is for determining what goes in various `ConstantBuffer`s or `ParameterBlock`s.
There are two main pieces to the implementation.
First, we need to be able to compute appropriate layout information for entry points that include `uniform` parameters.
Second, we need to transform the entry point function to move any `uniform` parameters to be ordinary global-scope shader parameters, to make sure that all other back-end passes don't need to worry about this special case.
The latter piece of the implementation is, relatively speaking, simpler.
The pass in `ir-entry-point-uniforms.{h,cpp}` converts entry point parameters that are determined to be uniform (using the already-computed layout information) into fields of a `struct` type and then declares a global shader parameter based on that `struct` type (and applies already-computed layout information to that parameter).
After that, the remaining IR passes (notably including type legalization) will handle things just as for any other global shader parameter.
The changes to the layout step are more significant, but most of the changes are just cleanups and fixes to enable the feature.
The two major changes that enable entry-point `uniform` parameters are:
* In `collectEntryPointParameters` we now dispatch out to a new `computeEntryPointParameterTypeLayout` function, which decided whether to compute the type layout for a `uniform` parameter, or for a varying parameter (what used to be the default behavior handled by `processEntryPointParameterDecl`).
* The main `generateParameterBindings` routine was extended so that it allocates registers/bindings to the resources required by each entry point (using `completeBindingsForParameter`) after it has allocated registers/binding to all of the global-scope parameters (this addition is mirrored in `specializeProgramLayout`).
The effect of these changes is that the `uniform` parameters of any entry points specified in a compile request will be laid out after the global-scope parameters, in the order the entry points were specified in the compile request.
A bunch of smaller changes were made around parameter layout that are worth enumerating so that the diffs make some sense:
* The `EntryPointLayout` type was changed so that instead of trying to *be* a `StructTypeLayout`, it instead *owns* one, in the same fashion as `ProgramLayout`. This commonality was factored into a base class `ScopeLayout`, and a bunch of edits followed from that change.
* Because `uniform` parameters are moved out of the entry point parameter list early in the IR transformations, the logic in `ir-glsl-legalize.cpp` that tried to look up parameter layout information by index would no longer work if the entry point parameter list had been altered. Instead, that logic now looks for the decorations directly on the parameters.
* The `UsedRange` type in `parameter-binding.cpp` was tracking the existing parameter associated with a range using a `ParameterInfo*` (which accounts for the possibility of multiple `VarDecl`s mapping to the same logical shader parameter), when just using a `VarLayout*` is sufficient for all current use cases. The overhead of allocating a `ParameterInfo` seems like overkill for entry-point parameters, where there can't possibly be multiple declarations of the "same" parameter, so avoiding these overheads was a focus when trying to deduplicate code between the global and entry-point parameter cases.
* A bunch of parameter binding logic that was specific to GLSL input has been deleted completely. There was no way to even execute this code in the compiler today, and there is pretty much zero chance of us needing (or wanting) to deal with GLSL input in the future. This includes custom `UsedRangeSet`s specific to each translation unit, which were only needed for global-scope `in` and `out` varying declarations in GLSL.
* A bunch of functions with `EntryPointParameter` in their names were renamed to use `EntryPointVaryingParameter` to help distinguish that they only apply to the varying case, while entry point `uniform` parameters are handled elsewhere.
* The `completeBindingsForParameter` function was re-worked into something that can be used for both global-scope shader parameters (where we have a `ParameterInfo` and possibly explicit bindings) and entry-point parameters (where we expect to have neither). This helps unify the (fairly subtle) logic for how we allocate and assign bindings for resources, constant buffers, parameter blocks, etc.
* A small change was made so that the entry-point stage is attached directly to top-level parameters of the entry point, and *not* recursively to every field along the way. This could be a breaking change for some applications, but it makes more logical sense (to me); we'll have to check if this affects Falcor. This change produces different output for several of the reflection tests, but the changes are consistent with no longer attaching stage information to sub-fields of varying `struct`-type parameters.
* Because there is a bunch of repeated logic in `parameter-binding.cpp` that has to do with computing a `struct` layout for ordinary/uniform data, I tried to factor that into a single `ScopeLayoutBuilder` type, which handles computing the offsets for any parameters with ordinary data, and then also handles wrapping up the layout in a constant buffer layout if there was any ordinary data at the end.
* A similar convenience routine `maybeAllocateConstantBufferBinding` was added because I noticed multiple places in `parameter-binding.cpp` that were trying to allocate a constant buffer binding for global uniforms, and they were wildly inconsistent (and in most cases used logic that would only work for D3D).
* The main `generateParameterBindings` routine is significantly shortened by using all of these utilities that were introduced. I tried to comment the places that changed to explain the overall flow correctly.
* The `specializeProgramLayout` routine (used to take a `ProgramLayout` from `generateParameterBindings` and specialize it based on knowledge of global generic arguments) had basically been rewritten with more explicit commenting/rationale for what happens in each step. It makes use of the same shared utilities as `generateParameterBindings` and `collectEntryPointParameters`.
In terms of testing:
* I added a test case to specifically test the new behavior, and in particular I made sure to include a mix of both global and entry-point parameters and also to have entry-point parameters of both ordinary and resource/object types.
* I tweaked an existing test for global type parameters to use an entry-point `uniform` parameter instead of a global one, in an effort to migrate it toward being able to use an explicitly generic entry point.
* fixups from merge
|
|
There was a bug in the logic for emitting initial IR, such that it was neglecting to emit "methods" (member functions) unless they were also referenced by a non-member (global) function, or were needed to satisfy an interface requirement. This would only matter for `import`ed modules, since for non-`import`ed code, anything relevant would be referenced by the entry point so that the problem would never surface.
This change fixes the underlying problem by adding a step to the IR lowering pass called `ensureAllDeclsRec` that makes sure that not only global-scope declarations, but also anything nested under a `struct` type gets emitted to the initial IR module.
There are also a few unrelated fixes in this PR, which are things I ran into while making the fix:
* Deleted support for the (long gone) `IRDeclRef` type in our `slang.natvis` file
* Added support for visualizing the value of IR string and integer literals when they appear in the debugger
* Fixed IR dumping logic to not skip emitting `struct` and `interface` instructions. Switching those to inherit from `IRType` accidentally affected how they get printed in IR dumps by default.
* Fixed up the IR linking logic so that it correctly takes `[export]` decorations into account, so that an exported definition will always be taken over any other (unless the latter is more specialized for the target). I initially implemented this in an attempt to fix the original issue, but found it wasn't a fix for the root cause. It is still a better approach than what was implemented previously, so I'm leaving it in place.
|
|
* Initial support for dynamic dispatch using "tagged union" types
Suppose a user declares some generic shader code, like the following:
```hlsl
interface IFrobnicator { ... }
type_param T : IFrobincator;
ParameterBlock<T : IFrobnicator> gFrobnicator;
...
gFrobincator.frobnicate(value);
```
and then they have some concrete implementations of the required interface:
```hlsl
struct A : IFrobnicator { ... }
struct B : IFrobnicator { ... }
```
The current Slang compiler allows them to generate distinct compiled kernels for the case of `T=A` and the case of `T=B`. This means that the decision of which implementation to use must be made at or before the time when a shader gets bound in the application.
This change adds a new ability where the Slang compiler can generate code to handle the case where `T` might be *either* `A` or `B`, and which case it is will be determined dynamically at runtime. This means a single compiled kernel can handle both cases, and the decision about which code path to run can be made any time before the shader executes.
This new option is supported by defining a *tagged union* type. Via the API, the user specifies that `T` should be specialized to `__TaggedUnion(A,B)` (the double underscore indicates that this is an experimental and unsupported feature at present). We refer to the types `A` and `B` here as the "case" types of the tagged union. Conceptually, the compiler synthesizes a type something like:
```hlsl
struct TU { union { A a; B b; } payload; uint tag; }
```
The user can then allocate a constant buffer to hold their tagged union type, and when they pick a concrete type to use (say `B`), they fill in the first `sizeof(B)` bytes of their buffer with data describing a `B` instance, and then set the `tag` field to the appopriate 0-based index of the case type they chose (in this case the `B` case gets the tag value `1`).
Actually implementing tagged unions takes a few main steps:
* Type parsing was extended to special-case `__TaggedUnion` as a contextual keyword. This is really only intended to be used when parsing types from the API or command-line, and Bad Things are likely to happen if a user ever puts it directly in their code. Eventually construction of tagged unions should be an API feature and not part of the language syntax.
* Semantic checking was extended to recognize that a tagged union like `__TaggedUnion(A,B)` shoud support an interface like `IFrobnicator` whenever all of the case types suport it, as long as the interface is "safe" for use with tagged unions (which means it doesn't use a few of the advancd langauge features like associated types).
* The IR was extended with instructions to represent tagged union types and to extract their tag and the payload for the different cases as needed.
* IR generation was extended to synthesize implementations of interface methods for any interface that a tagged union needs to support. Right now the implementation is simplistic and only handles simple method requirements, which it does by emitting a `switch` instruction to pick between the different cases.
* A new IR pass was introduced to "desugar" any tagged union types used in the code. The downstream HLSL and GLSL compilers don't support `union`s, so we have to instead emit a tagged union as a "bag of bits" and implement loading the data for particular cases from it manually.
* Final code emit mostly Just Works after the above steps, but we had to introduce an explicit IR instruction for bit-casting to handle the output of the desugaring pass.
There are a bunch of gaps and caveats in this implementation, but that seems reasonable for something that is an experimental feature. The various `TODO` comments and assertion failures in unimplemented cases are intended, so that this work can be checked in even if it isn't feature-complete.
* fixup: missing files
* fixup: typos
|
|
The IR linking logic was recently rewritten to use the (optional) `IRLinkageDecoration`s instead of assuming `IRGlobalVals` always have a mangled name field, and in that process a bug seems to have crept in where in the case that an instruction that would usually quality as a "global value" does *not* have linkage, we were failing to register the instruction we create in the output module as a replacement for the original instruction.
This problem affects `static` variables inside of functions, leading to them potentially getting emitted multiple times.
|
|
* Refactor several IR passes
This change takes some IR passes that lived together in `ir.cpp` and moves them into their own files to improve clarity.
In most cases these were passes introduced early in the life of the IR, so that it didn't seem like a big deal to have them all in one file, but now that `ir.cpp` has grown unwieldly this seems like an important cleanup to make.
To give a quick rundown of the passes involved:
* The IR "linking" step has been pulled out to `ir-link.{h,cpp}`. This code for this pass is pretty much identical to what was in `ir.cpp`, and no attempt has been made to clean up or refactor it in the current change.
* The GLSL legalization step has been pulled out to `ir-glsl-legalize.{h,cpp}`. This used to be invoked directly from the linking step, but has been made a new top-level pass invoked from `emit.cpp`. Just like with the linking, the code in the new file is just a copy-paste of what was in `ir.cpp`, and no attempt at cleanup has been made. Also note that it might be a good idea to move this pass later in the overall sequence, but this PR doesn't attempt to do that as it could change results.
* The generic specialization step has been pulled out to `ir-specialize.{h,cpp}`. The file name does not explicitly reference *generic* specialization because I anticipate this pass having to perform other kinds of specialization as well. The code in this case amounts to a heavy cleanup/refactoring pass and thus deserves careful scrutiny. The reason for the cleanup is that the generic specialization step used to be part of the "linking" step long ago, and continued to share infrastructure with it long after that stopped making sense. The newly cleaned up pass has much simpler logic that should be easy enough to follow from the comments.
* In order to reduce code dulication, the IR "cloning" part of the `ir-specialize-resources.{h,cpp}` pass was pulled into its own files (`ir-clone.{h,cpp}`) that both the generic specialization step and the resource-based specialization step now share.
The remaining changes then pertain to deleting a bunch of code out of `ir.cpp` and adding the new files to the build.
The only test that needed updating was `vkray/raygen`, where some subtle ordering change in the refactored generic specialization logic has lead to the relative order of the specialized `TraceRay` and `saturate` functions beind reversed.
* fixup: typo in assert
* fixup: typos in comments
|