slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2025-09-05	Add warnings for overflows of integer types (#8281)	jarcherNV
	The code int x4 = 0xFFFFFFFFFFFFFFFF previously did not produce a warning due to the value being too large for the type. This patch now checks for this and similar issues during parsing.
2025-09-02	render-test: Change D3D12 default to sm_6_5 (#8320)	James Helferty (NVIDIA)
	Changes default for render-test to sm_6_5. Since sm_6_5 is the new default, remove the -use-dxil option, add -use-dxcb option Remove -use-dxil option from all test cases. Add -use-dxcb to two tests that needed it. Fixes #7611
2025-08-09	Fix atomics error diagnostics (#8117)	venkataram-nv
	Fixes #8116 --------- Co-authored-by: Jay Kwak <82421531+jkwak-work@users.noreply.github.com>
2024-12-21	Fixed stage and result field names in json reflection (#5927)	Stan

2024-12-03	Conform non-suffixed integer literals (#5717)	Darren Wihandi
	* Make non-suffixed integer literal type resolution conform to C * Update integer literal tests * Clean up integer literal implementation a bit * Update docs on integer literals * Clean up docs update * Clean up docs update * Add comment on INT64_MIN edge case * Fixed failing test, fixed formatting and cleaned up code --------- Co-authored-by: Yong He <yonghe@outlook.com>
2024-10-17	Cleanup atomic intrinsics. (#5324)	Yong He
	* Cleanup atomic intrinsics. * Fix. * Fix glsl. * Remove hacky intrinsic expansion logic for glsl image atomics. * Fix all tests. * Fix. * Add `InterlockedAddF16Emulated`. * Fix glsl intrinsic. * Fix.
2024-09-30	Add COM API for querying metadata. (#5168)	Yong He
	* Add COM API for querying metadata. * Fix tests. * fix test.
2024-07-24	Add generic descriptor indexing intrinsic (#4389)	dubiousconst282
	* Add ResourceArray intrinsic type * Move aliased parameter generation to GLSL legalization * Add DynamicResourceEntry type for proxying layout of GenericResourceArray * Reimplement as DynamicResource * Add reflection test * Don't reuse alias cache between different parameters * Add dynamic cast extensions for buffer types * Minor format fix * Fix VarDecl diagnostics after finding non-appliable initializer candidates --------- Co-authored-by: Yong He <yonghe@outlook.com>
2024-06-12	Capability System: Implicit capability upgrade warning/error (#4241)	ArielG-NV
	* capability upgrade warning/error adjusted implementation + tests to support a warning/error if capabilities are implicitly upgraded and test accordingly. * add glsl profile caps * add GLSL and HLSL capabilities to the associated capability * syntax error in capdef * only error if user explicitly enables capabilities 1. changed testing infrastructure to not set a `profile` explicitly, 2. Added tests to be sure this works as intended with user API and with slangc command line * Change capability atom definitions and how Slang manages them to fix errors 1. most `glsl_spirv` version atoms have been removed from `.capdef`, instead we will translate `spirv` version atoms into `glsl_spirv` since there is no point in writing the same code twice in `.capdef` files to define `spirv` versions. 2. add spirv version, and hlsl sm version (and equivlent) capability dependencies 3. removed some stage requirments which were set on objects, keep the wrapper capabilities. I am keeping the wrapper capabilities since I am unaware on if there are stage limitations (spec says code in practice does not work). * check internal version instead of version profile (_spirv_1_5 vs. spirv_1_5) * remove unused OpCapability. adjust SPIRV version'ing again for glsl_spirv * apply workaround for glslang bug with rayquery usage * ensure capabilities targetted by a profile and added together by a user are valid * remove additions to `spirv_1_` wrapper spirv_* -> glsl_spirv fix * fix bug where incompatable profiles would cause invalid target caps * try to avoid joining invalid capabilities * fix the warning/error & printing * run through tests to fix capability system and test mistakes many mistakes were mesh shaders doing `-profile glsl_450+spirv_1_4`. This is not allowed for a few reasons 1. the test tooling does not handle arguments the same as `slangc` 2. glsl_450 core profile does not support mesh shaders, nor does spirv_1_4. sm_6_5 does work in this senario * set some sm_4_1 intrinsics to sm_4_0 * replace `GLSL_` defs with `glsl_` * swap the unsupported render-test syntax for working syntax * set d3d11/d3d12 profile defaults this is required since sm version changes compiled code & behavior * adjusted nvapi capabilities with atomics + d3d11 set to use sm_5_0 as per default * cleanup * address review * incorrect styling * change `bitscanForward` to work as intended on 32 bit targets --------- Co-authored-by: Yong He <yonghe@outlook.com>
2024-05-29	Improve compile time performance. (#3857)	Yong He
	* Handle type check cache update on extensions more gracefully. * Correctness fix. * Cache implcit cast overload resolution results. * Fix. * More optimizations. * Cache implicit default ctor resolution. * Disable redundancy removal. * Fix. * Fix test. * Fix. * Correctness fix. * Fix. * Fix, * Fix test. * Small tweak.
2024-04-23	Switch to direct-to-spirv backend as default. (#4002)	Yong He
	* Switch to direct-to-spirv backend as default. * Fix slang-test. * Fix. * Fix.
2024-04-19	Initial pass to add capability declarations to stdlib intrinsics. (#3912)	ArielG-NV

2024-04-16	Force Inline all the InterlockedAdd functions in stdlib (#3965)	sriramm-nv
	This change forcibly inlines the InterlockedAdd functions when using byteAddress buffer. The IR generated when using nonUniformResourceInst on RWByteAddressBuffer: buffer[NonUniformResourceIndex(uint(0))].InterlockedAdd(0, 1); follows the sequence of a call into an index lookup that is wrapped by a nonuniformResourceIndex: %ld = nonUniformResourceIndex(0) Call RWStructBufferInterlockedAdd(%ld, 0, 1) This prevents NonUniformResource decoration of the buffer because it is wrapped by the function call to InterlockedAdd, that further expands to: %gep = getElement(%buffer, 0) SpirvAsmInst(..., rwStructuredBufferGEP(%gep, 0), ...) By Force-Inlining the atomic functions, the buffer / resource is made visible to the nonUniformResourceIndex inst, allowing the decoration. Identified while debugging tests/spirv/coherent-2.slang
2023-08-21	Compile append and consume structured buffers to glsl. (#3142)	Yong He
	* Compile append and consume structured buffers to glsl. * Fix. * Update CI config. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-04-04	Preliminary support for realtime clock (#2772)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Initial support for realtime clock. * Add realtime-clock render feature where seems appropriate. * Fixes to make NVAPI compile properly. Change realtime-clock.slang check to use maths that can't overflow.
2023-03-23	Fix optimization pass not converging. (#2725)	Yong He
	* Fix optimization pass not converging. * Fix. * Fix tests. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-02-24	More control flow simplifications. (#2673)	Yong He
	* More control flow and Phi param simplifications. * Fix. * Fix gcc error. * Fix. * More IR cleanup. * Fix bug in phi param dce + ifelse simplify. * Propagate and DCE side-effect-free functions. * Enhance CFG simplifcation to remove loops with no side effects. * Fix. * Fixes. * Fix tests. Add [__AlwaysFoldIntoUseSite] for rayPayloadLocation. * More cleanup. * Fixes. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2022-08-17	Warning on lossy implicit casts. (#2367)	Yong He
	* Warning on bool to float conversion. * Fix test cases. * Improve. * LanguageServer: don't show constant value for non constant variables. * Fix tests. * Fix warnings in tests. Co-authored-by: Yong He <yhe@nvidia.com>
2021-10-08	Hotfix/reenable vk test (#1969)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Reenable erroneously disabled test.
2021-10-07	Disable test crashing CI (#1965)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Disable test that appears to be crashing.
2021-07-21	Work to mitigate SPIR-V bloat (#1914)	Theresa Foley
	* Work to mitigate SPIR-V bloat SPIR-V is not an especially compact format, but some patterns in how Slang generates code and then runs it through `spirv-opt` lead to many redundant field-by-field copy operations being emitted. This change attempts to address some of the resulting bloat from the Slang side of things. Note: experimentation shows that the bloat is less pronounced when running either no SPIR-V optimizations or full SPIR-V optimizations, so it is also likely that the bloat should be addressed by changing which `spirv-opt` passes the Slang compiler runs in default (`-O1`) builds. Such changes should come as a distinct pull request. This change primarily does two things: First, the code generation strategy for passing arguments to `out` and `inout` parameters has been changed. In the past, the compiler would always copy the argument value into a temporary, then pass the address of the temporary, and then write back the value after the call. The new code generation strategy attempts to identify when an argument value already has a simple address in memory and passes that address directly when possible. This eliminates many copy operations that occur before/after calls to functions with `out`/`inout` parameters. Second, we introduce an IR optimization pass that detects call sites where the entire contents of a buffer (usually a constant buffer) is being passed to a callee function, such that many bytes are loaded and then passed even if only very few are used in the callee. The pass moves the load operations from the caller to a specialized version of the the callee where possible (e.g., when the constant buffer in question is a global shader parameter). Doing this eliminates another major category of copies. Notes: * The IR lowering logic is complicated by the fact that several kinds of l-values (values that are usable as the desitnation of assignment, or for `out`/`inout` arguments) are not actually addressable. An easy example is a non-contiguous swizzle like `v.xwz` on a `float4`, where the value occupies 12 bytes, but not 12 consecutive bytes with a single address. There are many more corner cases like that and the IR lowering pass carries a lot of complexity to deal with them. A more systematic overhaul is due some time soon. * The IR representation of `out` and `inout` parameters deserves some careful scrutiny when making these kinds of changes. The official semantics of `inout` in HLSL has been "copy in copy out" (and `out` is just "copy out") which is observably different from any solution that passes in the address of an l-value directly. By making this change we are saying that Slang's semantics are not precisely those of legacy HLSL, and that our semantics for `inout` parameters are closer to those of `inout` in Swift or of a mutable borrow in Rust. In the Swift case the implementation can freely pass the underlying storage of an l-value or the address of a temporary, and valid programs may not observe the different. It is thus illegal to observe the value in a storage local while a mutation to that location is "in flight." All of this is way more detailed and technical than 99% of Slang users will ever care about, but importantly it gives us semantic cover to eliminate these copies in the IR, and also to emit output C++ code that implements `out` and `inout` as by-reference parameter passing. * There was an exsting generic pass for specializing functions based on call sites that uses a "template method" style of pattern to customize its behavior. That pass needed to be generalized to handle this use case because it had previously operated on the assumption that the "desire" to specialize a callee function must be driven by the parameter declarations of that function, and not on the argument values passed in. The code has been slightly refactored to allow the policy for specialization to consider both parameters and arguments. * Unsurprisingly, a bunch of the GLSL (and thus SPIR-V) generated has changed with this work, so several baseline `.slang.glsl` files needed to be updated. * This change is incomplete in that it does not address broader cases of buffer loads, including both partial loads from constant buffers (just loading one field, but a field that uses a "large" structure type), and loads from multi-element buffers (a lot from a structured buffer where the element type is "large"). The main question in each of those cases is how to define how "large" a structure needs to be before we decide to try and sink loads into callee functions like this. In the worst case, sinking loads in this way may actually create more memory traffic (because the same values get loaded in multiple callee functions). * fixup: run premake * fixup: typo
2021-01-15	Convert more tests to use shader objects (#1659)	Tim Foley
	This change converts a large number of our existing tests to use the `ShaderObject` support that was added to the `gfx` layer. In many cases, tests were just updated to pass `-shaderobj` and the result Just Worked. In other cases, a `name` attribute had to be added to one or more `TEST_INPUT` lines. For tests that did not work with shader objects "out of the box," I spent a little bit of time trying to get them work, but fell back to letting those tests run in the older mode. Future changes to the infrastructure will be needed to get those additional tests working in the new path. Along with the changes to test files, the following implementation changes were made to get additional tests working: * Because the shader object mode uses explicit register bindings (from reflection), the hacky logic that was offseting `u` registers for D3D12 based on the number of render targets gets disabled (by another hack). * The "flat" reflection information coming from Slang was not correctly reporting "binding ranges" for things that consumed only uniform data (which would be everything on CUDA/CPU), so it was refactored to properly include binding ranges for anything where the type of the field/variable implied a binding range should be created (even if the `LayoutResourceKind` was `::Uniform`). * A few fixes were made to the CUDA implementation of `Renderer`, in order to get additional tests up and running. Most of these changes had to do with texture bindings, which hadn't really been tested previously. In addition, a few changes were made that were attempts at getting more tests working, but didn't actually help. These could be dropped if requested: * As a quality-of-life feature (not being used) the `object` style of `TEST_INPUT` line is upgraded to support inferring the type to use from the type of the input being set. * Any `object` shader input lines get ignored in non-shader-object mode.
2020-10-06	InterlockedExchangeU64 support on RWByteAddressBuffer (#1572)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Added [__requiresNVAPI] to functions that need nvapi support. * Added support for InterlockedExchangeU64 Added exchange-int64-byte-address-buffer test Fixed typo in cas-int64-byte-address-buffer test * Improve comment around NVAPI usage in hlsl.meta.slang
2020-09-21	Enable all dynamic dispatch tests on CUDA. (#1552)	Yong He
	* Enable all dynamic dispatch tests on CUDA. * Fix expected cross-compile test results.
2020-08-26	Added more Atomic support for int64 types on RWByteAddressBuffer (#1515)	jsmall-nvidia
	* Support for more 64 bit atomics on ByteAddressBuffer. * min max 64bit test. * Disable CUDA version of min max 64 bit test - as produces the wrong output. * Update target-compatibility.md with added 64 bit atomics. Co-authored-by: Yong He <yonghe@outlook.com>
2020-08-24	RWByteAddressBuffer::InterlockedCompareExchangeU64 (#1513)	jsmall-nvidia
	* First pass at incorporating nvapi into test harness. * D3d12 Atomic Float Add via NVAPI working * Dx12 atomic float appears to work. * Atomic float add on Dx12. * Added atomic64 feature addition to vk. Fix correct output for atomic-float-byte-address.slang * Disable atomic float failing tests. * Upgraded VK headers. * Detect atomic float availability on VK. * Try to get test working for in64 atomic. * Made HLSL prelude controlled via the render-test requirements. * Added -enable-nvapi to premake. * Fix D3D12Renderer when NVAPI is not available. * Small improvements to VKRenderer. * Improve atomic documentation in target-compatibility.md. * Fixed NVAPI working on D3D12. * Test for specific NVAPI features. * Remove requiredFeatures from Renderer::Desc as was ignored. Tried to document more around nvapiExtnSlot. * Readded requiredFeatures to Renderer::Desc * Improve comments in the tests. * Rename Fp32 -> F32 Added cas-int64-byte-address-buffer.slang test Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
2020-08-24	NVAPI improvements (#1512)	jsmall-nvidia
	* First pass at incorporating nvapi into test harness. * D3d12 Atomic Float Add via NVAPI working * Dx12 atomic float appears to work. * Atomic float add on Dx12. * Added atomic64 feature addition to vk. Fix correct output for atomic-float-byte-address.slang * Disable atomic float failing tests. * Upgraded VK headers. * Detect atomic float availability on VK. * Try to get test working for in64 atomic. * Made HLSL prelude controlled via the render-test requirements. * Added -enable-nvapi to premake. * Fix D3D12Renderer when NVAPI is not available. * Small improvements to VKRenderer. * Improve atomic documentation in target-compatibility.md. * Fixed NVAPI working on D3D12. * Test for specific NVAPI features. * Remove requiredFeatures from Renderer::Desc as was ignored. Tried to document more around nvapiExtnSlot. * Readded requiredFeatures to Renderer::Desc * Improve comments in the tests.
2020-08-21	Vulkan update/NVAPI support (#1511)	jsmall-nvidia
	* First pass at incorporating nvapi into test harness. * D3d12 Atomic Float Add via NVAPI working * Dx12 atomic float appears to work. * Atomic float add on Dx12. * Added atomic64 feature addition to vk. Fix correct output for atomic-float-byte-address.slang * Disable atomic float failing tests. * Upgraded VK headers. * Detect atomic float availability on VK. * Try to get test working for in64 atomic. * Made HLSL prelude controlled via the render-test requirements. * Added -enable-nvapi to premake. * Fix D3D12Renderer when NVAPI is not available. * Small improvements to VKRenderer. * Improve atomic documentation in target-compatibility.md.
2020-08-19	Int64 atomic add RWByteAddressBuffer support (#1504)	jsmall-nvidia
	* Fix premake5.lua so it uses the new path needed for OpenCLDebugInfo100.h * Keep including the includes directory. * Added the spirv-tools-generated files. * We don't need to include the spirv/unified1 path because the files needed are actually in the spirv-tools-generated folder. * Put the build_info.h glslang generated files in external/glslang-generated. Alter premake5.lua to pick up that header. * First pass at documenting how to build glslang and spirv-tools. * Improved glsl/spir-v tools README.md * Added revision.h * Change how gResources is calculated. Update about revision.h * Update docs a little. * Split out spirv-tools into a separate project for building glslang. This was not necessary on linux, but is necessary on windows, because there is a file disassemble.cpp in spirv-tools and in glslang, and this leads to VS choosing only one. With the separate library, the problem is resolved. * Fix direct-spirv-emit output. * Update to latest version of spirv headers and spirv-tools. * Upgrade submodule version of glslang in external. * Add fPIC to build options of slang-spirv-tools * WIP adding support for InterlockedAddFp32 * Upgrade slang-binaries to have new glslang. * Fix issues with Windows slang-glslang binaries, via update of slang-binaries used. * WIP - atomicAdd. This solution can't work as we can't do (float) in glsl. WIP on atomic float ops. * Added checking for multiple decls that takes into account __target_intrinsic and __specialized_for_target. First pass impl of atomic add on float for glsl. * Split __atomicAdd so extensions are applied appropriately. * Made Dxc/Fxc support includes. Use HLSL prelude to pass the path to nvapi Added -nv-api-path * Refactor around IncludeHandler and impl of IncludeSystem * slang-include-handler -> slang-include-system Have IncludeHandler/Impl defined in slang-preprocessor * Small comment improvements. * Document atomic float add addition in target-compatibility.md. * CUDA float atomic support on RWByteAddressBuffer. * Add atomic-float-byte-address-buffer-cross.slang * Removed inappropriate-once.slang - the test is no longer valid when a file is loaded and has a unique identity by default. A test could be made, but would require an API call to create the file (so no unique id). Improved handling of loadFile - uses uniqueId if has one. * Work around for testing target overlaps - to avoid exceptions on adding targets. Simplify PathInfo setup. Modify single-target-intrinsic.slang - it no longer failed because there were no longer multiple definitions for the same target. * Int64 atomic add RwByteAddressBuffer support. * Fix typo in stdlib for int atomic ByteAddressBuffer. * Small fixes to int64 atomic test. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
2020-08-18	Support for float atomics on RWByteAddressBuffer (#1502)	jsmall-nvidia
	* Fix premake5.lua so it uses the new path needed for OpenCLDebugInfo100.h * Keep including the includes directory. * Added the spirv-tools-generated files. * We don't need to include the spirv/unified1 path because the files needed are actually in the spirv-tools-generated folder. * Put the build_info.h glslang generated files in external/glslang-generated. Alter premake5.lua to pick up that header. * First pass at documenting how to build glslang and spirv-tools. * Improved glsl/spir-v tools README.md * Added revision.h * Change how gResources is calculated. Update about revision.h * Update docs a little. * Split out spirv-tools into a separate project for building glslang. This was not necessary on linux, but is necessary on windows, because there is a file disassemble.cpp in spirv-tools and in glslang, and this leads to VS choosing only one. With the separate library, the problem is resolved. * Fix direct-spirv-emit output. * Update to latest version of spirv headers and spirv-tools. * Upgrade submodule version of glslang in external. * Add fPIC to build options of slang-spirv-tools * WIP adding support for InterlockedAddFp32 * Upgrade slang-binaries to have new glslang. * Fix issues with Windows slang-glslang binaries, via update of slang-binaries used. * WIP - atomicAdd. This solution can't work as we can't do (float) in glsl. WIP on atomic float ops. * Added checking for multiple decls that takes into account __target_intrinsic and __specialized_for_target. First pass impl of atomic add on float for glsl. * Split __atomicAdd so extensions are applied appropriately. * Made Dxc/Fxc support includes. Use HLSL prelude to pass the path to nvapi Added -nv-api-path * Refactor around IncludeHandler and impl of IncludeSystem * slang-include-handler -> slang-include-system Have IncludeHandler/Impl defined in slang-preprocessor * Small comment improvements. * Document atomic float add addition in target-compatibility.md. * CUDA float atomic support on RWByteAddressBuffer. * Add atomic-float-byte-address-buffer-cross.slang * Removed inappropriate-once.slang - the test is no longer valid when a file is loaded and has a unique identity by default. A test could be made, but would require an API call to create the file (so no unique id). Improved handling of loadFile - uses uniqueId if has one. * Work around for testing target overlaps - to avoid exceptions on adding targets. Simplify PathInfo setup. Modify single-target-intrinsic.slang - it no longer failed because there were no longer multiple definitions for the same target. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>