| Commit message (Collapse) | Author | Age |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* spirv: add support for ops added by multiple extensions
Some spirv ops are added by multiple extensions and capabilities. This
commit adds support to avoid emitting unnecessary extensions and
capabilities if one of the options is already required by some other op.
* spirv: allow OpRaytracingAccelerationStructure to use multiple extensions
This Op is provided by both SPV_KHR_ray_tracing and SPV_KHR_ray_query
and the respective capabilities. Use one if already available and
otherwise fall back to SPV_KHR_ray_tracing.
* tests/vkray: add negative checks for RayTracingKHR and RayQueryKHR
- Add new rayquery-compute.slang to test that only RayQueryKHR is needed
in compute shaders.
- Add checks for RayTracingKHR and RayQueryKHR capabilities and
extensions in raygen.slang
---------
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following external directories are updated.
It is to use a new SPIRV keyword, "OpExtInstWithForwardRefs".
Related to #4304
external/spirv-header:
> commit 2acb319af38d43be3ea76bfabf3998e5281d8d12
> Author: Kévin Petit kevin.petit@arm.com
> Date: Wed Jun 12 16:41:14 2024 +0100
> SPV_ARM_cooperative_matrix_layouts (#433)
external/spirv-tools:
> commit ce46482db7ab3ea9c52fce832d27ca40b14f8e87
> Author: Nathan Gauër brioche@google.com
> Date: Thu Jun 6 12:17:51 2024 +0200
> Add KHR suffix to OpExtInstWithForwardRef opcode. (#5704)
> The KHR suffix was missing from the published SPIR-V extension.
> This is now fixed, but requires some patches in SPIRV-Tools.
external/spirv-tools-generated:
This is generated from spirv-tools
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Legalization of non-struct when expects struct.
`__forceVarIntoStructTemporarily()` solves the issue of passing "non-struct type's" into a parameter that only accepts "struct type's".
The intrinsic solves the issue through checking the parameter of the intrinsic:
If the parameter is a "struct type"
* Return a reference to the parameter
else
* a "struct type" Temporary variable is made and the "non struct type" parameter is copied to a member of this struct. This struct is then returned by `__forceVarIntoStructTemporarily()`. Optionally if the use location of this call is a argument which can have side effects (out, inout, ref, etc.) the temporary struct variable is copied into the original "non struct type" parameter.
Testing code has "addComplexity" functions to avoid optimizations through forcing side effects so we can predict the code output.
* Address review comments
- ForceInline ray functions
- fix testing
- adjust how we replace operands in senarios to avoid unexpected side effects of replacing operands without any explicit checks
* Adjust nv test slightly and remove .glsl file
* Remove implicit LOD sampling & test additions
- Implicit LOD sampling is not allowed in a raygen. Implicit LOD sampling requires depth (from a fragment shader) to sample. Raygen does not have the depth, so this function was replaced.
- Changed other tests for correctness/clarity
* Test if Falcor breaks through use of ForceInline
* Add back force inline
may need to look at how Falcor wrote its slang shaders. This will be done if ForceInline causes issues since ForceInline should not affect code gen in an impactable way.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
resolves #3587 for GLSL & SPIR-V targets #3631 (#3810)
* [early push of code since memory qualifiers may be made into a seperate branch & pr and I rather make it simple to split the implementation if required]
all type & functions impl. for GLSL image type
added all memory qualifiers & tests for direct read/write [GLSL syntax] (DID NOT test or implement parameter qualifiers, that is next commit)
* this inlcudes emit-glsl & emit-spirv for qualifier decorations
* this also includes error handling
* this includes parsing
* full implementation other than Rect; all errors and basic tests are done & working
what is left:
1. need to now add Rect type support (additional TextureImpl flag)
2. tests
3. testing infrastructure to support variety of types
* testing framework now works with images of all types and imageBuffers -- next steps are actual tests
* push code for mostly working image atomics; missing int64/uint64 tests and slightly broken feature
likley due to missing code from master which I pushed for regular atomics
* fix all remaining shader image atomic issues and tests to work with float & i64/u64 fully
will now clean up code and squash the commits (since they are quite all over the place)
* refactor code to work & look correct, fix all regressions
Turned off tests for texture format R64 due to the shader use limitation of currently being only for storage buffers on most hardware (test fail cause, this is not allowed)
Changed raygen.slang & nv-ray-tracing-motion-blur.slang since both cross-compiled with glslang, which does not respect layout(rgba8) for RWBuffer's, in this scenario making the type into a SPIR-V rgba32f, which is incorrect and a known problem, this causes different code to be outputted from Slang & HLSL+GLSL->Slang paths
Clean up all code and better explain the "why" for the gimageDim definition we use various strings of Slang code, the gist is:
1. Parameters are structured as per IMAGE_PARAM keyword in spec, and we respect this in order to match specification (to allow easy code iteration)
2. sample parameters are required for functions
3. types are inconsistently named
fixed regression of breaking l-value lowering when r-value should be lowered (lower-to-ir)
fix compiler warnings
remove unneeded lambdas
`expr->type.isLeftValue = isMutableGLSLBufferBlockVarExpr(baseExpr) && (expr->type.hasReadOnlyOnTarget == false);` is an adjustment made such that a buffer block is mutable only if the block is mutable and the base expression is mutable (to handle case of readonly buffer block, immutable)
* remove rectangle parameter
* use proper const syntax and struct naming
* adjust syntax
* adjust modifier capabilitites: HLSL+GLSL --> GLSL. Notice most specifically, if the parent is a global struct we can put a memory qualifier, this does not include, struct inside a struct, with a member variable with a memory qualifier (since then you could use the struct in invalid ways). Added test for struct inside struct with member variable with memory qualifier.
adjust syntax and remove code which will rot
* adjust formatting for consistency
* addressing review feedback
addressing review feedback:
change testing code to handle int and float/half correctly in all cases
adjust testing code syntax as requested
change vkdevice code to fit a different form as requested
* adjust code as per requested for review:
1. adjusted testing code logic to handle non 0-1 values appropriately, notice int8_t will likley be the range and set order of {[0,127],[-1,-128]}, this is intentional
2. syntax adjustments for correctness
* trying to fix falcor regressions
* add back removed code for regression testing
* test removing changes which may break falcor
* Revert "test removing changes which may break falcor"
This reverts commit 240da97f06c23e98a26ac23cf1d385995c67b251.
* disable R64 support in attempt to fix falcor tests
* Revert "disable R64 support in attempt to fix falcor tests"
This reverts commit 317cb632eb2f47e980fc4aeafe418f8060f4c473.
* disable major device changes (still trying to figure out falcor fails -- locally working different than CI)
* test removing d3d changes
* remove all format changes
* add back removed code for regression testing
* try something to get code to work with falcor
* address review
* Add way to handle constref/ref/encapsulated texture objects with memory qualifiers as a parameter.
Fixed an issue (and improved codegen) for when we have a store(dst,load(src)) pattern, where dst is supposed to be equal to src for when resolving globalParam's (no need for work-arounds anymore)
* move recent-fix/change to textureType loading into a proper optimization pass which now runs after SPIR-V legalization to catch odd SPIR-V emitting after legalizing types for SPIR-V
* Revert most recent optimization pass change, add work around getting a unmangled global parameter address through a intrinsic op instead of spir-v intrinsic (works same as `__imagePointer()`)
* remove unneeded changes
* remove unneeded `__constref` in glsl.meta
* move memory qualifier checks to visitInvoke of check-expr.cpp
move GetLegalizedSPIRVGlobalParamAddr resolving to spirv-legalization pass
move error for "if using non texture type with memory qualifer in param" earlier such that we error with this first. No point in telling user "you are not putting correct memory qualifiers" when memory qualifiers should not have been used.
* add memory qualifier folding modifier 'MemoryQualifierCollectionModifier' to reduce searching and processing (later will be adapted to whole system) as suggested/asked.
The utility is a method to track memory qualifiers without doing a expensive linked-list traversal (image's have 4 modifiers normally).
* properly pass multiple qualifiers from checkModifier down to the `modifier`s list
* addressing review comments:
* change implementation to properly handle restrict modifier
* add comments about implementation for clarity
|
| |
|
|
|
|
|
|
|
|
|
| |
* Add check for invalid use of modifiers.
* Fixes.
* Add test.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
| |
* Use a dedicated inst opcode to retrieve ray payload locations.
* [Direct SPIRV]: ray tracing pipeline intrinsics.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Use "capability" system to select VKRT extension
Slang currently supports translation of ray tracing shader code to Vulkan GLSL code that uses the `GL_NV_ray_tracing` extension. A multi-vendor equivalent of that extension has been released as `GL_EXT_ray_tracing` and we want Slang to support that extension as well.
At the simplest, making the change from one extension to the other is just a matter of changing a few strings, since it does not appear that anything of significance was changed at the GLSL level (or even in SPIR-V). Where this gets trickier is when we have users who want us to support *both* extensions, and to be able to switch between them.
The solution we've implemented here more or less amounts to:
* If you don't tell the compiler which extension to use, it will default to `GL_EXT_ray_tracing` (the newer multi-vendor one).
* If you explicitly want the older extension, you can opt into it using the `-profile` option or via a new API for explicitly adding capabilities to your target.
Making that work required a few different kinds of changes:
* The options parsing and public API needed ways to add optional capabilities to a target.
* During GLSL code emit, we can check the capabilities that were added to the target to see if the `GL_NV_ray_tracing` extension was explicitly enabled and, if not, default to using the `GL_EXT_ray_tracing` names for things. This step is needed because some of the modifiers/attributes involved in the extension have to be handled explicitly in the code generator rather than implicitly as part of mapping intrinsic functions.
* We add two different translations to the relevant operatiosn in the stdlib, one marked with each of the extensions. If profile/capability-based overload resolution can be relied on to pick the right one, this should Just Work.
* Next, a bunch of work had to go into making capability-based overloading Just Work for the purposes of this change. There's been a nearly complete reworking of the implementation of `CapabilitySet` here to make it more suitable for our needs.
* The tests that were using ray tracing translation for Vulkan needed to be updated. For some of them I updated their baselines to use `GL_EXT_ray_tracing` so that they can test the new path. For others, I updated the command line for the test case so that it explicitly opts into using `GL_NV_ray_tracing`. The result is that we have some coverage of each extension. I would have liked to have each test run in both modes, but our pass-through glslang support doesn't support `-D` options, so I couldn't take that step easily.
This change does *not* add support for `GL_EXT_ray_query`, the extension that supports "DXR 1.1" style queries under Vulkan. Adding support for that extension should hopefully be a smaller step because it doesn't have the same multiple-extensions issue.
This change does *not* address a lot of possible avenues for improvement or cleanup around the capability system. It focuses only on those changes that are necessary to make the ray tracing feature work and leaves the rest for future work.
* fixup: infinite loop
* Comment-only change to retrigger TC build
|
|
|
* Move to newer glslang
* Support cross-compilation of ray tracing shaders to Vulkan
This change allows HLSL shaders authored for DirectX Raytracing (DXR) to be cross-compiled to run with the experimental `GL_NVX_raytracing` extension (aka "VKRay").
* The GLSL extension spec is marked as experimental, so that any shaders written using this support should be ready for breaking changes when the spec is finalized.
* "Callable shaders" are not exposed throug the GLSL extension, so this feature of DXR will not be cross-compiled.
* The experimental Vulkan raytracing extension does not have an equivalent to DXR's "local root signature" concept. This does not visibly impact shader translation (because the local/global root signature mapping is handled outside of the HLSL code), but in practice it means that applications which rely on local root signatures on their DXR path will not be able to use the translation in this change as-is; more work will be needed.
The simplest part of the implementation was to go into the Slang standard library and start adding GLSL translations for the various DXR operations.
In some cases, like mapping `IgnoreHit()` to `ignoreIntersectionNVX()` this is almost trivial.
The various functions to query system-provided values (e.g., `RayTMin()`) were also easy, with the only gotcha being that they map to variables rather than function calls in GLSL, and our handling of `__target_intrinsic` assumes that a bare identifier represents a replacement function name, and not a full expression, so we have to wrap these definitions in parentheses.
The tricky operations are then `TraceRay<P>()` and `ReportHit<A>()`, because these two are generics/templates in HLSL.
GLSL doesn't support generics, even for "standard library" functions, so the raytracing extension implements a slightly complex workaround: the matching operations `traceNVX()` and `reportIntersectionNVX()` pass the payload/attributes argument data via a global variable.
That is, shader code for the GLSL extensions writes to the global variable and then calls the intrinsic function.
The linkage between the call site and the global is established by a modifier keyword (`rayPayloadNVX` and `hitAttributeNVX`, respectively) and in the case of ray payload also uses `location` number to identify which payload global to use (since a single shader can trace rays with multiple payload types).
Our translation strategy in Slang tries to leverage standard language mechanisms instead of special-case logic.
For example, to translate the `ReportHit<A>()` function, we provide both a default declaration that will work for HLSL (where the operation is built-in with the signature given), and a *definition* marked with the `__specialized_for_target(glsl)` modifier.
The GLSL definition declares a function `static` variable that will fill the role of the required global, and then does what the GLSL spec requires: assigns to the global, and then calls the `reportIntersectionNVX` builtin (which we declare as a separate builtin).
Our ordinary lowering process will turn that `static` variable into an ordinary global in the IR, and the `[__vulkanHitAttributes]` attribute on the variable will be emitted as `hitAttributeNVX` in the output.
There is no additional cross-compilation logic in Slang specific to `ReportHit<A>()` - the target-specific definition in the standard library Just Works.
The case for `TraceRay<P>()` is a bit more complicated, simply because the GLSL `traceNVX()` function needs to be passed the `location` for the payload global.
We implement the payload global as a function-`static` variable, with the knowledge that every unique specialization of `TraceRay<P>()` will generate a unique global variable of type `P` to implement our function-`static` variable.
We then add a slightly magical builtin function `__rayPayloadLocation()` that can map such a variable to its generated `location`; the logic for this is implemented in `emit.cpp` and described below.
We also changed the `RayDesc` and `BuiltinTriangleIntersectionAttributes` types from "magic" intrinsic types over to ordinary types (because the GLSL output needs to declare them as ordinary `struct` types).
This ends up removing some cases in the AST and IR type representations.
By itself this change would break HLSL emit, because in that case the types really are intrinsic.
We added a `__target_intrinsic` modifier to these types to make them intrinsic for HLSL, and then updated the downstream passes to handle the notion of target-intrinsic types.
The logic for binding/layout of entry point inputs and outputs was updated so that raytracing stages don't follow the default logic for varying input/output parameters.
This is because the input/output parameters of a raytracing entry point aren't really "varying" in the same sense as those in the rasterization pipeline.
In particular, the SPIR-V model for raytracing input and output treats "ray payload" and "hit attributes" parameters as being in a distinct storage class from `in` or `out` parameters.
We also detect cases where a ray tracing stage declares inputs/outputs that it shouldn't have. This logic could conceivably be extended to other stages (e.g., to give an error on a compute shader with user-defined varying input/output).
The type layout logic added cases for handling raytracing payload and hit-attribute data, but this is currently just a stub implementation that follows the same logic as for varying `in` and `out` parameters (it cannot give meaningful byte sizes/offsets right now).
To my knowledge the GLSL spec doesn't currently specify anything about layout, and I haven't read the DXR spec language carefully enough to know what it says about layout.
A future change should update the layout logic to allow for byte-based layout of ray payloads, etc. so that we can query this information via reflection.
The GLSL legalization logic in `ir.cpp` was updated to factor out the per-entry-point-parameter code into its own function, and then that function was updated to special-case the input/output of a ray-tracing shader.
While for rasterization stages we typically want to take the user-declared input/output and "scalarize" it for use in GLSL (in part to deal with language limitations, and in part to tease system values apart from user-defined input/output), the GLSL spec for raytracing requires payload and hit attribute parameters to be declared as single variables. There is also the issue that even for an `in out` parameter, a ray payload parameter should only turn into a single global, whereas the handling for varying `in out` parameters generates both an `in` and an `out` global for the GLSL case.
Other than the handling of entry point parameters, the GLSL legalization pass doesn't need to do anything special for ray tracing shaders.
The trickiest change in the `emit.cpp` logic is that we now generate `location`s for ray payload arguments (the outgoing from a `TraceRay()` call) on demand during code generation.
This is a bit hacky, and it would be nice to handle it as a separate pass on the IR rather than clutter up the emit logic, but this approach was expedient.
Basically, any of the global variables that got generated from the `static` declarations in the standard library implementation of `TraceRay()` will trigger the logic to assign them a `location`.
The logic for emitting intrinsic operations added a few new `$`-based escape sequences. The `$XP` case handles emitting the location of a generated ray payload variable; this is how we emit the matching location at the site where we call `traceNVX`. The `$XT` case emits the appropriate translation for `RayTCurrent()` in HLSL, because it maps to something different depending on the target stage.
All of the test cases here consist of a pair of an HLSL/Slang shader written to the DXR spec, plus a matching GLSL shader for a baseline.
The GLSL shaders are carefully designed so that when fed into glslang they will produce the same SPIR-V as our cross-compilation process.
This kind of testing is quite fragile, but it seems to be the best we can do until our testing framework code supports *both* DXR and VKRay.
A bunch of the core changes ended up being blocked on issues in the rest of the compiler, so some additional features go implemented or fixed along the way:
The first big wall this work ran into was that the `__specialized_for_target` modifier hasn't actually been working correctly for a while.
It turns out that for the one function that is using it, `saturate()`, we have been outputting the workaround GLSL function in *all* cases (including for HLSL output) rather than only on GLSL targets.
The problem here is that for a generic function with a `__specialized_for_target` modifier or a `__target_intrinsic` modifier, the IR-level decoration will end up attached to the `IRFunc` instruction nested in the `IRGeneric`, but the logic for comparing IR declarations to see which is more specialized (via `getTargetSpecializationLevel()`) was looking only at decorations on the top-level value (the generic).
The quick (hacky) fix here is to make `getTargetSpecializationLevel()` try to look at the return value of a generic rather than the generic itself, so that it can see the decorations that indicate target-specific functions.
A more refined fix would be to attach target-specificity decorations to the outer-most generic (to simplify the "linking" logic).
The only reason not to fold that into the current fix is that the `__target_intrinsic` modifier currently serves double-duty as a marker of target specialization *and* information to drive emit logic. The latter (the emit-related stuff) currently needs to live on the `IRFunc`, and moving it to the generic could easily break a lot of code.
This needs more work in a follow-on fix, but for now target specialization should again be working.
The other big gotcha that the simple "just use the standard library" strategy ran into was that function-`static` variables weren't actually implemented yet, and in particular function-`static` variables inside of generic functions required some careful coding.
The logic in `lower-to-ir.cpp` has this `emitOuterGenerics()` function that is supposed to take a declaration that might be nested inside of zero or more levels of AST generics, and emit corresponding IR generics for all those levels.
This is needed because two different AST functions nested inside a single generic `struct` declaration should turn into distinct `IRFunc`s nested in distinct `IRGeneric`s.
The tricky bit to making that all work is that the same AST-level generic type parameter will then map to *different* IR-level instructions (the parameters of distinct `IRGeneric`s) when lowering each function.
The existing logic handled this in an idiomatic way by making "sub-builders" and "sub-contexts."
This change refactors some of the repeated logic into a `NestedContext` type to help simplify the pattern, and applies it consistently throughout the `lower-to-ir.cpp` file.
Besides that cleanup, the major change is `lowerFunctionStaticVarDecl` which, unsurprisingly, handles lower of function-`static` variables to IR globals.
The careful handling of nested contexts here is needed because if we are in the middle of lowering a generic function, then a `static` variable should turn into its *own* `IRGeneric` wrapping an `IRGlobalVar`. The body of the function should refer to the global variable by specializing the global variable's `IRGeneric` to the parameters of the *functions* `IRGeneric`. This tricky detail is handled by `defaultSpecializeOuterGenerics`.
An additional subtlety not actually required for this raytracing work (and thus not properly tested right now) is handling function-`static` variables with initializers.
These can't just be lowered to globals with initializers, because HLSL follows the C rule that function-`static` variables are initialized when the declaration statement is first executed (and this could be visible in the presence of side-effects).
The lowering strategy here translates any `static` variable with an initializer into *two* globals: one for the actual storage, plus a second `bool` variable to track whether it has been initialized yet.
There are some opportunities to optimize this case, especially for `static const` data, but that will need to wait for future changes.
We've slowly been shifting away from the model where a user thinks of a "profile" as including both a stage and a feature level.
Instead, the user should think about selecting a profile that only describes a feature level (e.g., `sm_6_1`, `glsl_450`, etc.), and then separately specifying a stage (`vertex`, `raygeneration, etc.) for each entry point.
The challenge here is that the command-line processing still only had a single `-profile` switch, and no way to specify the stage.
Adding the `-stage` option was relatively easy, but making it work with the existing validation logic for command-line arguments was tricky, because of the complex model that `slangc` supports for compiling multiple entry points in a single pass.
* In `slang.h` add new reflection parameter categories for ray payloads and hit attributes, as part of entry point input/output signatures.
* A previous change already updated our copy of glslang to one that supports the `GL_NVX_raytracing` extension, so in `slang-glslang.cpp` we just needed to map Slang's `enum` values for the raytracing stage names to their equivalents in the glslang code.
* Moved the logic for looking up a stage by name (`findStageByName()`) out of `check.cpp` and into `compiler.cpp`, with a declaration in `profile.h`
* Added a `$z` suffix to the GLSL translation of `Texture*.SampleLevel()`, to handle cases where the texture element type is not a 4-component vector. Note that this fix should actually be applied to *all* these texture-sampling operations, but I didn't want to add a bunch of changes that are (clearly) not being tested right now.
* The layout logic for entry points was updated to correctly skip producing a `TypeLayout` for an entry point result of type `void`, which meant that the related emit logic now needs to guard against a null value for the result layout.
* In `ir.cpp`, dump decorations on every instruction instead of just selected ones, so that our IR dump output is more complete.
* Added a command-line `-line-directive-mode` option so that we can easily turn off `#line` directives in the output when debugging. Not all cases where plumbed through because the `none` case is realistically the most important.
* Parser was fixed to properly initialize parent links for "scope" declarations used for statements, so that we can walk backwards from a function-scope variable (including a `static`) and see the outer function/generics/etc.
* Added GLSL 460 profile, since it is required for ray tracing. Also updated the logic for computing the "effective" profile to use to recognize that GLSL raytracing stages require GLSL 460.
* Added some conventional ray-tracing shader suffixes to the handling in `slang-test`. This code isn't actually used, but was relevant when I started by copy-pasting some existing VKRay shaders as the starting point for my testing.
* Fixup: typos
|