slang.git - Making it easier to work with shaders

	Commit message (Collapse)	Author	Age
*	Refactor D3D12 renderer root signature creation (#1779)	Tim Foley	2021-04-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change originated as an attempt to re-enable a test case, but it has ended up disabling more tests (for good reasons) than it re-enables. The main change here is a significant overhaul of the way that the D3D12 render path extracts information from the Slang reflection API to produce a root signature. There were also some supporting fixes in the reflection information to make sure it returns what the D3D12 back-end needed. The big picture here is that the D3D12 path now uses the descriptor ranges stored in the reflection data more or less directly. It still needs to use register/space offset information queried via the "old" reflection API, but it only does so at the top level now, for the program and entry points themselves. All other layout information is derived directly from what Slang provides. Smaller changes: * The "flat" reflection API was expanded to include `getBindingRangeDescriptorRangeCount()` which was clearly missing. * The "flat" reflection results for a constant buffer or parameter block that didn't contain any uniform data and was mapped to a plain constant buffer needed to be fixed up. That logic is still way to subtle to be trusted. * Several additional tests were disabled that relied on static specialization, global/entry-point generi type parameters, structured buffers of interfaces or other features we don't officially support with shader objects right now. All of the affected tests were somehow passing by sheer luck and because they often passed in specialization arguments via explicit `TEST_INPUT` lines. * The `inteface-shader-param` test is re-enabled now that we can properly describe its input with the new `set` mode on `TEST_INPUT` * `ShaderCursor::getElement()` can now be used on structure types (in addition to arrays) to support by-index access to fields * The `TEST_INPUT` system was expanded to support both by-name and by-index setting of structure fields for aggregates * The `TEST_INPUT` system was expanded to allow an `out` prefix to mark parts of an expression as outputs on a `set` lines * The `TEST_INPUT` system was expanded so that anything that would be allowed on a `TEST_INPUT` line by itself (like `ubuffer(...)`) can now be used as a sub-expression on a `set` line Co-authored-by: Yong He <yonghe@outlook.com>
*	Convert more tests to use shader objects (#1659)	Tim Foley	2021-01-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change converts a large number of our existing tests to use the `ShaderObject` support that was added to the `gfx` layer. In many cases, tests were just updated to pass `-shaderobj` and the result Just Worked. In other cases, a `name` attribute had to be added to one or more `TEST_INPUT` lines. For tests that did not work with shader objects "out of the box," I spent a little bit of time trying to get them work, but fell back to letting those tests run in the older mode. Future changes to the infrastructure will be needed to get those additional tests working in the new path. Along with the changes to test files, the following implementation changes were made to get additional tests working: * Because the shader object mode uses explicit register bindings (from reflection), the hacky logic that was offseting `u` registers for D3D12 based on the number of render targets gets disabled (by another hack). * The "flat" reflection information coming from Slang was not correctly reporting "binding ranges" for things that consumed only uniform data (which would be everything on CUDA/CPU), so it was refactored to properly include binding ranges for anything where the type of the field/variable implied a binding range should be created (even if the `LayoutResourceKind` was `::Uniform`). * A few fixes were made to the CUDA implementation of `Renderer`, in order to get additional tests up and running. Most of these changes had to do with texture bindings, which hadn't really been tested previously. In addition, a few changes were made that were attempts at getting more tests working, but didn't actually help. These could be dropped if requested: * As a quality-of-life feature (not being used) the `object` style of `TEST_INPUT` line is upgraded to support inferring the type to use from the type of the input being set. * Any `object` shader input lines get ignored in non-shader-object mode.
*	Remove support for explicit register/binding syntax on TEST_INPUT (#1132)	Tim Foley	2019-11-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The `TEST_INPUT` facility allows textual Slang test cases to provide two kinds of information to the `render-test` tool: 1. Information on what shader inputs exist 2. Information on what values/objects to bind into those shader inputs Under the first category of information, there exists supporting for attaching a `dxbinding(...)` annotation to a `TEST_INPUT` which seemingly indicates what HLSL `register` the input uses. There is a similar `glbinding(...)` annotation, used for OpenGL and Vulkan. It turns out that these annotations were, in practice, completely ignored and had no bearing on how `render-test` allocates or bindings graphics API objects. There was some amount of code attempting to validate that explicit registers/bindings were being set appropriately, but the actual values were being ignored. The visible consequence of the `dxbinding` and `glbinding` annotations being ignored is issue #1036: the order of `TEST_INPUT` lines was de facto determining the registers/bindings that were being used by `render-test`. This change simply removes the placebo features and strips things down to what is implemented in practice: the `TEST_INPUT` lines do not need target-API-specific binding/register numbers, because their order in the file implicitly defines them. I added logic to the parsing of `TEST_INPUT` lines to make sure I got an error message on any leftover annotations, and went ahead and systematicaly deleted all of the placebo annotations from our test cases. If we decide to make `TEST_INPUT` lines not depend on order of declaration in the future, we can build it up as a new and better considered feature. The main alternative I considered was to keep the annotations in place, and change `render-test` and the `gfx` abstraction layer to properly respect them, but that path actually creates much more opportunity for breakage (since every single test case would suddenly be specifying its root signature / pipeline layout via a different path using data that has never been tested). The approach in this change has the benefit of giving me high confidence that all the test cases continue to work just as they had before.
*	WIP: CPU compute coverage (#1030)	jsmall-nvidia	2019-08-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add support for '=' when defining a name in test. * Add support for double intrinsics. * Add support for asdouble Add findOrAddInst - used instead of findOrEmitHoistableInst, for nominal instructions. Support cloning of string literals. C++ working on more compute tests. * Constant buffer support in reflection. Fixed debugging into source for generated C++. buffer-layout.slang works. * Added cpu test result. * Remove some commented out code. Comment on next fixes. * Improvements to reflection CPU code. * C++ working with ByteAddressBuffer. * Enabled more compute tests for CPU. * Enabled more compute tests on CPU. Added support for [] style access to a vector. * Enabled more CPU compute tests. * Handling of buffer-type-splitting.slang Named buffers can be paths to resources * Fix some warnings, remove some dead code. * Fix problem with verification of number of operands for asuint/asint as they can have 1 or 3 operands. asdouble takes 2. * Fix handling in MemoryArena around aligned allocations. That _allocateAlignedFromNewBlock assumed the block allocated has the aligment that was requested and so did not correct the start address.
*	Allow entry points to have explicit generic parameters (#826)	Tim Foley	2019-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Allow entry points to have explicit generic parameters Prior to this change, the Slang implementation required users to use global `type_param` declarations in order to specialize a full shader. For example: ```hlsl type_param L : ILight; ParameterBlock<L> gLight; [shader("fragment")] float4 fs(...) { ... gLight.doSomething() ... } ``` With this change we can rewrite code like the above using explicit generics, plus the ability to have `uniform` entry-point parameters: ```hlsl [shader("fragment")] float4 fs<L : ILight>( uniform ParameterBlock<L> light, ...) { ... light.doSomething() ... } ``` Having this support in place should make it possible for us to eliminate global generic type parameters and the complications they cause (both at a conceptual and implementation level). The most central and visible piece of the change is that `EntryPointRequest` now holds a `DeclRef<FuncDecl>` instead of just ` RefPtr<FuncDecl>`, which allows it to refer to a specialization of a generic function. Various places in the code that refer to the `EntryPointRequest::decl` member now use a `getFuncDecl()` or `getFuncDeclRef()` method as appropriate (see `compiler.h`). In order to fill in the new data, the `findAndValidateEntryPoint` function has been greaterly overhauled. The changes to its operation include: * The by-name lookup step for the entry point function has been adapted to accept either a function or a generic function. * The generic argument strings provided by API or command line are no longer parsed all the way to `Type`s, but instead just to `Expr`s in the first pass. * There are now two cases for checking the global generic arguments against their matching parameters. The first case is the new one, where we plug the generic argument `Expr`s into the explicit generic parameters of an entry point (that case re-uses existing semantic checking logic). The second case is the pre-existing code for dealing with global generic type arguments. The `lower-to-ir.cpp` logic for hadling entry points then had to be extended. Making it deal with a full `DeclRef` instead of just a `Decl` was the easy part (just call `emitDeclRef` instead of `ensureDecl`). The more interesting bits were: * We need to carefully add the `IREntryPointDecoration` to the nested function and not the generic in the case where we have a generic entry point. There is a handy `getResolvedInstForDecorations` that can extract the return value for an IR generic so that we can decorate the right hting. * We need to make sure that in the case where we emit a `specialize` instruction (which normally wouldn't get a linkage decoration), we attach an `[export(...)]` decoration to it with the mangled name of the decl-ref, so that it can be found during the linking step. The IR linking step is then slightly more complicated because the mangled entry point name could either refer directly to an `IRFunc` or to a `specialize` instruction for a generic entry point. The logic was refactored to first clone the entry point symbol without concern for which case it is (the old code was specific to functions), and then if the result is a `specialize` instruction, we attempt to run generic specialization on-demand. That on-demand specialization is a bit of a kludge, but it deals with the fact that all the downstream passing only expect to see an `IRFunc`. A future cleanup might try to split out that specialization step into its own pass, which ends up being a limited form of the specialization pass. Since I was already having to touch a lot of the code around IR linking, I went ahead and refactored the signature of the operations. I eliminated the need for the caller to create, pass in, and then destroy an `IRSpecializationState` (really an IR linking state), and replaced it with a structure local to the pass (that data structure was a remnant of an older approach in the compiler), and then also renamed the main operation to `linkIR` to reflect what it is doing in our conceptual flow. Smaller changes made along the way include: * Refactored `visitGenericAppExpr` to create a subroutine `checkGenericAppWithCheckedArgs` so that it can be used by the entry-point validation logic described above). * Refactored the declarations around the IR passes in `emitEntryPoint()` (`emit.cpp`), to show that things are more self-contained than they used to be (e.g., that the `TypeLegalizationContext` is now only needed by one pass). * Refactored the generic specialization code so that there is a stand-along free function that can perform specialization on a `specialize` instruction without all the other context being required. This is only to support the limited specialization that needs to be done as part of linking. * Updated the `global-type-param.slang` test to actually test entry-point generic parameters. In a later pass we can/should rework all the tests/examples for global type parameters over to use explicit entry-point generic parameters (at which point we should rename the tests as well). For now I am leaving thigns with just one test case, with the expectation that bugs will be found and ironed out as we expand to more tests. * fixup * Fixup: don't leave entry-point decorations on stuff we don't want to keep The IR `[entryPoint]` decoration is effectively a "keep this alive" decoration, which means that attaching it to something we don't intend to keep around can lead to Bad Things. The approach to generic entry points was attaching `[entryPoint]` to the underlying `IRFunc` because that seemed to make sense, but that meant that the `specialize` instruction at global scope scould instantiate that generic and then keep it alive, even if the resulting function wouldn't be valid according to the language rules. As a quick fix, I'm attaching `[entryPoint]` to the `specialize` instruction instead in such cases, and then re-attaching it to the result of explicit specialization during linking. * Port most of remaining test and rename global type parameters This change ports as many as possible of the existing tests for global type parameters over to use entry-point generic parameters instead. For the most part this is a mechanical change. A few test cases remain using global generic parameters, as does the `model-viewer` example application. The reason for this is that the shaders have either or both the following features: * A vertex and fragment shader that can/shold agree on their parameters * A type declaration (e.g., a `struct`) that is dependent on one of the generic type parameters In these cases, it would really only make sense to switch to explicit parameters once we support shader entry points nested inside of a `struct` type, so that we can use an outer generic `struct` as a mechanism to scope the entry points and other type-dependent declrations. Since global-scope type parameters need to persist for at least a bit longer, I went ahead and renamed all the use sites over to use `type_param` for consistency.
*	Initial support for uniform parameters on entry points (#815)	Tim Foley	2019-01-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Initial support for uniform parameters on entry points The basic feature this work adds is the ability to define a shader entry point like: ```hlsl [shader("fragment")] float4 main( uniform Texture2D t, uniform SamplerState s, float2 uv : UV) { return t.Sample(s,uv); } ``` In this example, the `uniform` keyword is used to mark that the given entry point parameters are not varying input/output flowing through the pipeline, but rather uniform shader parameters that should function as if the shader was declared more like: ```hlsl Texture2D t, SamplerState s, [shader("fragment")] float4 main( float2 uv : UV) { return t.Sample(s,uv); } ``` Allowing `uniform` parameters on entry points makes it easier to define multiple entry points in one file without accidentally polluting the global scope with shader parameters that only certain entry points care about. This feature is also more or less a prerequisite for allowing generic type parameters directly on entry point functions, since the main use case for those type parameters is for determining what goes in various `ConstantBuffer`s or `ParameterBlock`s. There are two main pieces to the implementation. First, we need to be able to compute appropriate layout information for entry points that include `uniform` parameters. Second, we need to transform the entry point function to move any `uniform` parameters to be ordinary global-scope shader parameters, to make sure that all other back-end passes don't need to worry about this special case. The latter piece of the implementation is, relatively speaking, simpler. The pass in `ir-entry-point-uniforms.{h,cpp}` converts entry point parameters that are determined to be uniform (using the already-computed layout information) into fields of a `struct` type and then declares a global shader parameter based on that `struct` type (and applies already-computed layout information to that parameter). After that, the remaining IR passes (notably including type legalization) will handle things just as for any other global shader parameter. The changes to the layout step are more significant, but most of the changes are just cleanups and fixes to enable the feature. The two major changes that enable entry-point `uniform` parameters are: * In `collectEntryPointParameters` we now dispatch out to a new `computeEntryPointParameterTypeLayout` function, which decided whether to compute the type layout for a `uniform` parameter, or for a varying parameter (what used to be the default behavior handled by `processEntryPointParameterDecl`). * The main `generateParameterBindings` routine was extended so that it allocates registers/bindings to the resources required by each entry point (using `completeBindingsForParameter`) after it has allocated registers/binding to all of the global-scope parameters (this addition is mirrored in `specializeProgramLayout`). The effect of these changes is that the `uniform` parameters of any entry points specified in a compile request will be laid out after the global-scope parameters, in the order the entry points were specified in the compile request. A bunch of smaller changes were made around parameter layout that are worth enumerating so that the diffs make some sense: * The `EntryPointLayout` type was changed so that instead of trying to be a `StructTypeLayout`, it instead owns one, in the same fashion as `ProgramLayout`. This commonality was factored into a base class `ScopeLayout`, and a bunch of edits followed from that change. * Because `uniform` parameters are moved out of the entry point parameter list early in the IR transformations, the logic in `ir-glsl-legalize.cpp` that tried to look up parameter layout information by index would no longer work if the entry point parameter list had been altered. Instead, that logic now looks for the decorations directly on the parameters. * The `UsedRange` type in `parameter-binding.cpp` was tracking the existing parameter associated with a range using a `ParameterInfo` (which accounts for the possibility of multiple `VarDecl`s mapping to the same logical shader parameter), when just using a `VarLayout` is sufficient for all current use cases. The overhead of allocating a `ParameterInfo` seems like overkill for entry-point parameters, where there can't possibly be multiple declarations of the "same" parameter, so avoiding these overheads was a focus when trying to deduplicate code between the global and entry-point parameter cases. * A bunch of parameter binding logic that was specific to GLSL input has been deleted completely. There was no way to even execute this code in the compiler today, and there is pretty much zero chance of us needing (or wanting) to deal with GLSL input in the future. This includes custom `UsedRangeSet`s specific to each translation unit, which were only needed for global-scope `in` and `out` varying declarations in GLSL. * A bunch of functions with `EntryPointParameter` in their names were renamed to use `EntryPointVaryingParameter` to help distinguish that they only apply to the varying case, while entry point `uniform` parameters are handled elsewhere. * The `completeBindingsForParameter` function was re-worked into something that can be used for both global-scope shader parameters (where we have a `ParameterInfo` and possibly explicit bindings) and entry-point parameters (where we expect to have neither). This helps unify the (fairly subtle) logic for how we allocate and assign bindings for resources, constant buffers, parameter blocks, etc. * A small change was made so that the entry-point stage is attached directly to top-level parameters of the entry point, and not recursively to every field along the way. This could be a breaking change for some applications, but it makes more logical sense (to me); we'll have to check if this affects Falcor. This change produces different output for several of the reflection tests, but the changes are consistent with no longer attaching stage information to sub-fields of varying `struct`-type parameters. * Because there is a bunch of repeated logic in `parameter-binding.cpp` that has to do with computing a `struct` layout for ordinary/uniform data, I tried to factor that into a single `ScopeLayoutBuilder` type, which handles computing the offsets for any parameters with ordinary data, and then also handles wrapping up the layout in a constant buffer layout if there was any ordinary data at the end. * A similar convenience routine `maybeAllocateConstantBufferBinding` was added because I noticed multiple places in `parameter-binding.cpp` that were trying to allocate a constant buffer binding for global uniforms, and they were wildly inconsistent (and in most cases used logic that would only work for D3D). * The main `generateParameterBindings` routine is significantly shortened by using all of these utilities that were introduced. I tried to comment the places that changed to explain the overall flow correctly. * The `specializeProgramLayout` routine (used to take a `ProgramLayout` from `generateParameterBindings` and specialize it based on knowledge of global generic arguments) had basically been rewritten with more explicit commenting/rationale for what happens in each step. It makes use of the same shared utilities as `generateParameterBindings` and `collectEntryPointParameters`. In terms of testing: * I added a test case to specifically test the new behavior, and in particular I made sure to include a mix of both global and entry-point parameters and also to have entry-point parameters of both ordinary and resource/object types. * I tweaked an existing test for global type parameters to use an entry-point `uniform` parameter instead of a global one, in an effort to migrate it toward being able to use an explicitly generic entry point. * fixups from merge
*	Remove non-IR codegen paths (#398)	Tim Foley	2018-02-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The basic change is simple: remove support for all code generation paths other than the IR. There is a lot of vestigial code left, but the main logic in `ast-legalize.` is gone. Doing this breaks a lot* of tests, for various reasons: - We can no longer guarantee exactly matching DXBC or SPIR-V output after things pass through out IR - Many builtins don't have matching versions defined for GLSL output via IR (even when they had versions defined via the earlier approach that worked with the AST) - A lot of code creates intermediate values of opaque types in the IR, which turn into opaque-type temporaries that aren't allowed (this breaks many GLSL tests, but also some HLSL) I implemented some small fixes for issues that I could get working in the time I had, but most of the above are larger than made sense to fix in this commit. For now I'm disabling the tests that cause problems, but we will need to make a concerted effort to get things working on this new substrate if we are going to make good on our goals.
*	Allow arbitrary type string as type argument in spAddEntryPointEx.	Yong He	2018-01-19
\|
*	Add support for global generic parameters (#285)	Yong He	2017-11-17
	* Add support for global generic parameters (In-progress work) This commit include: 1. Update Slang API to allow specification of generic type arguments in an `EntryPointRequest` 2. Add parsing of `__generic_param` construct, which becomes a GlobalGenericParamDecl, contains members of `GenericTypeConstraintDecl`. 3. Semantics checking will check whether the provided type arguments conform to the interfaces as defined by the generic parameter, and store SubtypeWitness values in the EntryPointRequest, which will be used by `specializeIRForEntryPoint` when generating final IR. 4. Add a new type of substitution - `GlobalGenericParamSubstitution` for subsittuting references to `__generic_param` decls or to its member `GenericTypeConsraintDecl` with the actual type argument or witness tables. 5. Update `IRSpecContext` to apply `GlobalGenericParamSubstitution` when specializing the IR for an EntryPointRequest. 6. Update `render-test` to take additional `type` inputs, which specifies the type arguments to substitute into the global `__generic_param` types. This commit does not include ProgramLayout specialization. * IR: pass through `[unroll]` attribute (#284) The initial lowering was adding an `IRLoopControlDecoration` to the instruction at the head of a loop, but this was getting dropped when the IR gets cloned for a particular entry point. The fix was simply to add a case for loop-control decorations to `cloneDecoration`. * fix warnings * IR: support `CompileTimeForStmt` (#286) This statement type is a bit of a hack, to support loops that must be unrolled. The AST-to-AST pass handles them by cloning the AST for the loop body N times, and it was easy enough to do the same thing for the IR: emit the instructions for the body N times. The only thing that requires a bit of care is that now we might see the same variable declarations multiple times, so we need to play it safe and overwrite existing entries in our map from declarations to their IR values. Of course a better answer long-term would be to do the actual unrolling in the IR. This is especially true because we might some day want to support compile-time/must-unroll loops in functions, where the loop counter comes in as a parameter (but must still be compile-time-constant at every call site). * Add support for global generic parameters (In-progress work) This commit include: 1. Update Slang API to allow specification of generic type arguments in an `EntryPointRequest` 2. Add parsing of `__generic_param` construct, which becomes a GlobalGenericParamDecl, contains members of `GenericTypeConstraintDecl`. 3. Semantics checking will check whether the provided type arguments conform to the interfaces as defined by the generic parameter, and store SubtypeWitness values in the EntryPointRequest, which will be used by `specializeIRForEntryPoint` when generating final IR. 4. Add a new type of substitution - `GlobalGenericParamSubstitution` for subsittuting references to `__generic_param` decls or to its member `GenericTypeConsraintDecl` with the actual type argument or witness tables. 5. Update `IRSpecContext` to apply `GlobalGenericParamSubstitution` when specializing the IR for an EntryPointRequest. 6. Update `render-test` to take additional `type` inputs, which specifies the type arguments to substitute into the global `__generic_param` types. progress on parameter binding * Add a more contrived test case for specializing parameter bindings * update render-test to align buffers to 256 bytes (to get rid of D3D complains on minimal buffer size). * adding one more test case for parameter binding specialization. * Cleanup according to @tfoleyNV 's suggestions. * fix a bug introduced in the cleanup