slang.git - Making it easier to work with shaders

Age	Commit message	Author
2025-06-19	Add support for on-demand AST deserialization (#7482)	Theresa Foley
	Note that this change does not actually enable on-demand deserialization of ASTs, because doing so is incompatible with the current compiler architecture where we have both an `ASTBuilder` and a `SharedASTBuilder`, and there are important invariants about how all AST nodes related to the core module must be created before those of any module using the core module. Instead, this change simply adds the infrastructure for on-demand deserialization, and ensures that those code paths get used at runtime, but actually "demands" all of the nodes in a given serialized AST immediately as part of the deserialization process. Important notes about the implementation approach: * PR #7242 ensured that all of the code accessing the direct member declarations of a `ContainerDecl` went through a small(-ish) set of accessor methods. This change takes advantage of that work by further abstracting the storage of the direct member declarations out in a type, `ContainerDeclDirectMemberDecls`, which makes it easy to add custom serialization logic for just that type. * The `ContainerDeclDirectMemberDecls` type also stores two pointers (one a `RefPtr` and the other a plain pointer) that are only used in the case where the members of a given `ContainerDecl` are being accessed through on-demand deserialization. This can be queried using the `isUsingOnDemandDeserialization()` method but any code accessing a `ContainerDecl` through the intended public API should never need to care about that detail. * Many of the accessor methods that were added in PR #7242 now branch on whether `isUsingOnDemandDeserialization()` is set. The normal code path is unchanged, and the implementation logic for the on-demand-deserialization case is largely held in `slang-serialize-ast.cpp`, to keep it close to the definitions of the serialized data structures themselves. 
* A few types in the `slang-ast-.h` headers have had `FIDDLE()` annotations added to them, so that they can be used to synthesize some of the serialization logic that was previously hand-written. The `_registerBuiltinDeclsRec()` function (which is used to scan the built-in module ASTs for the various "magic" declarations that the `SharedASTBuilder` needs to know about) was factored a bit to support the way that registration needs to behave differently in the case of loading a serialized module (if we kept using the existing recursive search, then it would force every declaration in the core module to be loaded right away). The new `_collectBuiltinDeclsThatNeedRegistrationRec()` function mirrors the overall traversal pattern to produce a flat list that gets included in the serialized AST module. Note in particular that we no longer call `registerBuiltinDecls()` from within `_readBuiltinModule()`. * The interface of the `Module` type was slightly expanded so that there is a more complete API for accessing the declarations exported from the module. Previously they could only be queried by their mangled name, but the new API also allows the entire list to be iterated over. The `ensureLookupAcceleratorBuilt()` method factors out the logic for building those data structures for a module. Note that in the case where on-demand deserialization is being used for a module, the `findExportedDeclByMangledName()` query will use serialized data directly, rather than build the lookup accelerators as C++ data structures (this is required if we are to avoid immediately deserializing all of the (exported) declarations in the core module as soon as it is loaded).
* A few methods related to loading serialized modules (e.g., `loadSerializedModule()`) have been updated so that along with a pointer to the serialized `ModuleChunk` (which, for those who aren't aware, is a pointer directly into the serialized bytes of the module file), they receive an `ISlangBlob` that refers to the entire blob holding the serialized data (which the `ModuleChunk` is part of). Passing this pointer down allows code running under these methods to retain a reference-counted pointer to the blob to stop the memory of the serialized module from being released until deserialization has been completed. * The data types defined in `slang-fossil.h` have been overhauled significantly: * The most important change that is relevant to this work is the introduction of the `Fossilized<T>` template, which is used to statically map a "live" C++ type `T` to its binary fossilized representation. The `slang-fossil.h` file provides infrastructure allowing `Fossilized<T>` to be specialized for user-defined types, and also provides the necessary mappings for the core types like strings, arrays, and dictionaries. * A key point is that in C++ code, one can take a value of some type `Foo`, serialize it using a `Fossil::SerialWriter`, get a pointer to that serialized data, and then directly cast it to a `Fossilized<Foo>` and navigate the serialized data directly (without deserializing it back into a `Foo`). For that process to work, any specialization of `Fossilized<T>` must be sure to match the layout that will be produced by the `serialize()` implementation for `T`, when writing to a `Fossil::SerialWriter`. Another key change in the public interface of `slang-fossil.h` is that dynamically-typed traversal of the data used to be handled just with `FossilizedValRef`, but now uses a few different types. 
The `Fossil::ValRef<T>` and `Fossil::AnyValRef` types are used to capture the use cases that want reference-like behavior (basically a `Fossil::ValRef<T>` can be thought of as sort of like a `T&`), while `Fossil::ValPtr<T>` and `Fossil::AnyValPtr` are used for cases that want pointer-like behavior (akin to `T*`). Then there are related changes in `slang-serialize-fossil.`: The implementation of `Fossil::SerialReader` has been changed to use `Fossil::AnyValPtr` in most places where it formerly used `FossilizedValRef`. Using pointers (that can be null) instead of a weird kind of pseudo-reference (that could still be null) to traverse things was making the code harder to follow than it ought to be, in terms of understanding the levels of indirection in various places. * Some of the state that was previously in `Fossil::SerialReader` has been split into `Fossil::ReadContext`. This type allows multiple `Fossil::SerialReader`s to be created to read from the same serialized blob(s), while maintaining a persistent mapping from fossilized data pointers to live object pointers. The `ReadContext` also maintains the work list of deferred deserialization actions waiting to be performed, and only flushes that list when the last currently-open `SerialReader` is about to go out of scope. * In order to support the split of `Fossil::SerialReader` described above (and also to clean up something that didn't quite feel right in the original serialization design) the base serialization framework in `slang-serialize.h` has been tweaked so that a `Serializer` now wraps two pointers instead of just one. The first pointer continues to be an implementation of `ISerializerImpl`, which handles the actual reading/writing of data, while the other pointer is an explicit "context" pointer for operations that need additional user-defined context.
* Similar to the changes made to the accessors for direct member declarations in a `ContainerDecl`, the `Module::findExportedDeclByMangledName()` method was updated to conditionally execute a different code path in the case of a module that has been loaded from serialized data. * Some improvements have been made to the fiddle tool: * Most importantly, the error-handling logic around Lua script execution has been cleaned up to better match correct Lua idiom. Native functions exposed to the Lua scripts have been changed to just use `lua_call` instead of `lua_pcall`, so rather than attempt to intercept Lua errors they will just automatically propagate them. * All Lua-related errors are caught at the top level, and reported in a way that uses the source location of the fiddle template that was being evaluated when the error was raised. In most cases, a Lua error should be accompanied by a stack trace of the Lua evaluation state. The file paths and line numbers given should be accurate, but aren't directly double-clickable in the Visual Studio output panel, because they use a different format (a good future change might be to process the Lua stack trace and rewrite it into a format that is better for our needs). * Fixed a subtle bug where having "raw" content (parts of the template that should neither be evaluated nor emitted into the output) that consisted of only whitespace could result in a template being translated to invalid Lua code. * The bulk of the change is, unsurprisingly, in `slang-serialize-ast.cpp`. * This file has been refactored enough to look like a complete rewrite. A lot of work has been put into comments that describe the overall approach being taken, so hopefully it can be understood even by somebody who wasn't familiar with the previous code. Some of these are just plain cleanups, rather than being directly related to on-demand serialization.
* Where possible, the code for reading and writing types that needed custom serialization has been moved so that the read/write functions are next to one another, making it easier to visually confirm that the serialized representations match on the read and write sides. * Where possible, the serialization logic for all types (not just the AST nodes, as was the case before) is being generated via fiddle. * Rather than just defining `serialize()` overloads for each of the relevant types, the code now defines `Fossilized<...>` specializations for these types as well, to enable statically-typed in-memory traversal of the serialized data. Note, however, that for the most part the `Fossilized<...>` representation types are not being used by the code (really only the `ASTModuleInfo` and `ContainerDeclDirectMemberDeclsInfo` types are traversed directly). This can be considered more as work to prove out the design of the `Fossilized<...>` template approach, and it may or may not end up being relevant in the future. * The trivial bit of work to enable on-demand deserialization is in `ASTSerialReadContext::handleContainerDeclDirectMemberDecls()` where, rather than recursively reading the contained declarations, the method effectively just grabs the current cursor of the `Fossil::SerialReader` (which is pointed into the fossilized data) and stashes it into the `ContainerDeclDirectMemberDecls`, along with a `RefPtr` to the `ASTSerialReadContext` itself. Those stashed pointers are what enable the accessors on `ContainerDeclDirectMemberDecls` to look up information on-demand. * The more interesting bits of the approach mostly come at the end of the file, where the accessor operations for on-demand deserialization are implemented. Once all the relevant work has been done to write the data structures, and produce `Fossilized<...>` types with the right layout, the work itself may seem almost trivial: a little bit of array iteration, and a little bit of binary-search lookup.
* As a reminder, all of this infrastructure for on-demand deserialization is now in place and able to be invoked by the rest of the compiler, but declarations are currently all being loaded eagerly. The `SLANG_DISABLE_ON_DEMAND_AST_DESERIALIZATION` macro is being used to enable a small bit of extra logic in `ASTSerialReadContext::_cleanUpASTNode` so that the "cleanup" on a just-deserialized `ContainerDecl` includes eagerly querying its list of direct member declarations, which will cause them to be recursively deserialized.
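The "stash a cursor plus a context pointer, then decode on first access" idea from the message above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Slang's implementation: `Decl`, `SerializedMembers`, `ReadContext`, and `DirectMemberDecls` are hypothetical stand-ins for the real AST and fossil types, and a vector of names stands in for the fossilized bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these are Slang's real types.
struct Decl {
    std::string name;
};

// Stand-in for the serialized (fossilized) member list.
struct SerializedMembers {
    std::vector<std::string> names;
};

// Shared decoding state; counts how many declarations were materialized.
struct ReadContext {
    std::size_t decodedCount = 0;

    std::unique_ptr<Decl> decode(const SerializedMembers& data, std::size_t index) {
        ++decodedCount;
        return std::make_unique<Decl>(Decl{data.names[index]});
    }
};

class DirectMemberDecls {
public:
    // Eager path: members are added as live objects.
    void add(std::unique_ptr<Decl> decl) { decls_.push_back(std::move(decl)); }

    // On-demand path: stash a context pointer and a pointer into the serialized data.
    void initOnDemand(std::shared_ptr<ReadContext> context, const SerializedMembers* data) {
        context_ = std::move(context);
        serialized_ = data;
        decls_.clear();
        decls_.resize(data->names.size());  // slots stay null until first access
    }

    bool isUsingOnDemandDeserialization() const { return serialized_ != nullptr; }

    std::size_t count() const { return decls_.size(); }

    // Accessor branches on the mode; callers never need to know which one is active.
    Decl* get(std::size_t index) {
        assert(index < decls_.size());
        if (isUsingOnDemandDeserialization() && !decls_[index])
            decls_[index] = context_->decode(*serialized_, index);
        return decls_[index].get();
    }

private:
    std::vector<std::unique_ptr<Decl>> decls_;
    std::shared_ptr<ReadContext> context_;
    const SerializedMembers* serialized_ = nullptr;
};
```

In this sketch, the eager behavior the commit describes as the current default would correspond to calling `get()` on every index right after `initOnDemand()`.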
2025-06-18	Fix retry logic for unit test (#7471)	Jay Kwak
	* Fix the ignored unit-tests on retry * Retrigger CI * Add more error messages * Don't use test-server for retry of unit-test to see error messages * Clean up cl.yml Remove 'has-gpu' because it is unused after debug became full-gpu-test. Renamed files to make the meaning more clear: - Renamed expected-failure.txt to expected-failure-via-glsl.txt - Renamed expected-failure-github-runner.txt to expected-failure-no-gpu.txt * Rename cpu-hello-world.slang to avoid name conflict to example We have an example whose executable name is cpu-hello-world.exe. It gets built when you run `cmake --build`, but it gets overwritten by slang-test when it tests `tests/cpu-program/cpu-hello-world.slang`. This PR renames to avoid the name conflict. * Remove debug code --------- Co-authored-by: Yong He <yonghe@outlook.com>
2025-06-18	Fix additional VVL violations (#7377)	Gangzheng Tong
	* fix: add sampleCount and mipMaps to st2DMS_f32v4 Fix VUID-VkImageCreateInfo-samples-02257: The Vulkan spec states: If an OpTypeImage has an MS operand 1, its bound image must not have been created with VkImageCreateInfo::samples as VK_SAMPLE_COUNT_1_BIT
	* Fix VUID-VkShaderModuleCreateInfo-pCode-08740 Rename VK_KHR_COMPUTE_SHADER_DERIVATIVES_EXTENSION_NAME to VK_NV_COMPUTE_SHADER_DERIVATIVES_EXTENSION_NAME
	* Fix VUID-vkCmdDispatch-None-06479 Use correct format for combined depth texture.
	* Fix VUID-vkCmdDispatch-format-07753 by setting format Parse filtering mode for sampler because the RGBA8* formats do not support linear filtering
	* Create MS texture type for sample count > 1
	* Use different texture formats for depth compare and gather ops
	* Use clearTexture to initialize the data for MS textures
2025-06-16	Disable periodic diagnostic update on language-server on CI (#7445)	Jay Kwak
	The "textDocument/publishDiagnostics" notification in the Language Server Protocol (LSP) is one that the server sends to a client, such as VSCode or Visual Studio, without the client having to ask for it. Its purpose is to provide a list of errors, warnings, or other informational "squiggles" for a specific file. Because the notification is an asynchronous push notification, it is received as an unexpected RPC message during the slang-test CI tests. When a notification is unexpectedly sent to slang-test, the communication goes out of sync and the rest of the language-server-based tests fail intermittently. To address the problem, this PR adds a new command-line argument that makes the notification deterministic: it is sent only in one of three cases: didOpen, didChange, and didClose. Because these events are the only ways to cause a new notification, we can still expect to get the same diagnostic messages without missing any of them. The slang-test CI tests will use this new option.
2025-06-13	Support SM6.9 with GFX (#7387)	Jay Kwak

2025-06-13	Sort test list to be deterministic (#7432)	Jay Kwak
	On Linux/macOS, the test file lists were not sorted, so their order was non-deterministic, which caused intermittent CI failures. This commit sorts the list to make CI more reliable.
2025-06-12	Fix API changes from separate debugging support (#7397)	jarcherNV
	Recent separate debugging support added two new functions which broke backwards compatibility. This change restores the old API and moves the new functions to an IComponentType2 interface which can be used if separate debug files are needed.
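Why moving the new functions to a second interface restores compatibility can be sketched as below. This is a hedged illustration only: `IComponent`, `IComponent2`, and their methods are hypothetical names rather than the Slang API, and `dynamic_cast` stands in for a COM-style `queryInterface` call.

```cpp
#include <cassert>
#include <string>

// Hypothetical interfaces; names and methods are illustrative, not the Slang API.
struct IComponent {
    virtual ~IComponent() = default;
    // Existing method: its vtable slot must stay where old clients expect it.
    virtual std::string getCode() = 0;
};

// New functionality lives in a separate interface instead of being added to IComponent.
struct IComponent2 {
    virtual ~IComponent2() = default;
    virtual std::string getDebugCode() = 0;
};

// A newer implementation offers both interfaces.
class Program : public IComponent, public IComponent2 {
public:
    std::string getCode() override { return "stripped"; }
    std::string getDebugCode() override { return "full-debug"; }
};

// An older implementation only knows the original interface and keeps working.
class LegacyProgram : public IComponent {
public:
    std::string getCode() override { return "legacy"; }
};

// Callers opt in to the new interface only when they need it.
std::string getDebugCodeOrEmpty(IComponent* component) {
    assert(component != nullptr);
    if (auto* v2 = dynamic_cast<IComponent2*>(component))
        return v2->getDebugCode();
    return {};
}
```

Code written against the original interface compiles and links unchanged, while code that needs separate debug output asks for the second interface explicitly.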
2025-06-06	Add command line option for separate debug info (#7178)	jarcherNV
	* Add command line option for separate debug info Add command line arg -separate-debug-info which, if provided, produces both a .spv and a .dbg.spv file. The .dbg.spv file contains full debug info and the .spv file has all debug info stripped out. Also add a DebugBuildIdentifier instruction to store a unique hash in both the output files, so they can be more easily matched together. A matching API is provided to allow using the Slang API to retrieve a base and debug SPIRV as well as the debug build identifier string.
2025-06-06	Update slang-rhi (#7303)	Simon Kallweit
	* update slang-rhi * adapt to new slang-rhi API * enable slang-rhi agility sdk * fix handling empty list * disable failing slang-rhi tests * format code * fix slang-rhi-tests ci step * skip running slang-rhi-tests --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-06-05	Clean up a dead code forgot to delete (#7358)	Jay Kwak

2025-06-04	Break down record replay to individual tests to avoid timeout (#7340)	Jay Kwak
	* Break down RecordReplay to individual tests to avoid timeout
	  In Debug build, the RecordReplay unit-test was timing out. It was running six tests all in one unit-test, but this commit breaks it down into individual tests so that each unit test can finish within the timeout limit. This issue was seen only in Debug builds, but it went unnoticed because even when the test failed with test-server, it still passed on its retry, since the timeout applies only when using test-server.
	* Reduce the retry from 2 times to 1 time
	* Remove RecordReplay from expected failure
2025-06-02	Add a new slang-test option `-enable-debug-layers` (#7300)	Jay Kwak
	* Add a new slang-test option `-enable-debug-layers` A variable `disableDebugLayer` is renamed to `enableDebugLayers`, and a corresponding command-line argument is added, `-enable-debug-layers`. The previous option `-disable-debug-layer` is still available, but it prints a deprecation warning message. The reason why it is added is to make the option available to both Debug and Release. On Debug build, it will be enabled by default, and it will be disabled on Release build. We should be able to not only disable it, but also enable it on Release build. Ideally this option should be enabled all the time, but currently there are too many VUID error messages printed and we are enabling only for Debug build for now. Note that the CI/CD will run with the option disabled until we resolve all of VUID errors.
2025-06-01	Fix test-server debug issues with gfx-unit-test-tool (#7119) (#7279)	sricker-nvidia
	Previously when running slang-test with "-use-test-server" to run slang-unit-test-tool and gfx-unit-test-tool tests, these would fail with a message like, "error: Unable to launch tool". These issues appear to have been resolved, however debug runs of gfx-unit-test-tool using "-use-test-server" were still showing errors of the following nature:
	````
	error: rpc failed
	error: result code = -858993460
	standard error = {
	}
	standard output = {
	}
	ignored test: 'gfx-unit-test-tool/uint16BufferTestVulkan.internal'
	````
	These errors all appeared to be the result of Vulkan VUID print outs and were occurring for nearly every Vulkan test. Existing comments in slang-test-main.cpp indicated that VUID print outs get misinterpreted as the result from a test due to limitations in the Slang RPC implementation. Slang-test then correctly disables use of VK debug layers when the spawn type is UseTestServer. However, this argument is only passed to the test server when running standard tests (see ExecuteToolTestArgs vs ExecuteUnitTestArgs). This change hard codes `unitTestContext.enableDebugLayers = false;` in test-server-main.cpp when running unit tests, as otherwise this will currently result in all Vulkan tests being ignored. Additional tweaks were made to slang-test-main.cpp to restore the spawn type for unit tests and to prevent bogus rpc error result codes.
2025-05-30	Change SLANG_OVERRIDE_xxx_PATH and fix header file path (#7207)	Lujin Wang
	* Fix lua header file path Add two missed files in #7167
	* Leave lua/ in the path to avoid name conflict
	* Remove xxx from path of SLANG_OVERRIDE_xxx_PATH Change SLANG_OVERRIDE_xxx_PATH from path-to-parent-folder/xxx to path-to-parent-folder and add "xxx/" back to "#include", which helps to avoid the potential name conflict of external tools.
	* format code
	---------
	Co-authored-by: Yong He <yonghe@outlook.com>
	Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-05-21	Generalize serialization system used for AST (#7126)	Theresa Foley
	This change takes the new approach to serialization that was used for the AST and generalizes it in a few ways: * The new approach is no longer tangled up with the RIFF format. The serialization system supports multiple different implementations of the underlying format. The existing RIFF format is now supported as one back-end, but support for others will follow in subsequent changes. * The new approach is no longer deeply specialized to AST serialization. The old code had things like serialization for `List`s and `Dictionary`s, but it was embedded inside the `AST{Encoding\|Decoding}Context`, and thus couldn't be leveraged for other serialization tasks. This change factors out a completely AST-independent `Serializer` implementation, with an `ASTSerializer` layered on top of it to provide the additional context needed. * There is less duplication of code between reading and writing of serialized data. The old code had both the `ASTEncodingContext` and `ASTDecodingContext`, with serialization logic for most types being implemented in both, but with the constraint that those implementations needed to be kept in sync to avoid serialization-related runtime failures. A key property of the revamped approach is that a single `serialize()` method for a type implements both the reading and writing directions of serialization.
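The "single `serialize()` for both directions" property described above can be illustrated with a small sketch. The `Serializer` class here is a hypothetical, heavily simplified stand-in (a vector of integers replaces the real back-end formats) and does not reflect the actual interface in `slang-serialize.h`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch; this is not Slang's actual Serializer type.
enum class SerializerMode { Read, Write };

class Serializer {
public:
    Serializer(SerializerMode mode, std::vector<std::int32_t>& storage)
        : mode_(mode), storage_(storage) {}

    // Write mode appends `value` to storage; Read mode overwrites `value` from storage.
    void handle(std::int32_t& value) {
        if (mode_ == SerializerMode::Write) {
            storage_.push_back(value);
        } else {
            assert(cursor_ < storage_.size());
            value = storage_[cursor_++];
        }
    }

private:
    SerializerMode mode_;
    std::vector<std::int32_t>& storage_;
    std::size_t cursor_ = 0;
};

struct Extent {
    std::int32_t width = 0;
    std::int32_t height = 0;
};

// One function describes the layout for both directions, so the read and
// write sides cannot drift out of sync.
void serialize(Serializer& serializer, Extent& extent) {
    serializer.handle(extent.width);
    serializer.handle(extent.height);
}
```

Writing and then reading back with the same `serialize()` function round-trips the value; the field order is stated exactly once.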
2025-05-20	Fix retry logic and skip high intermittent test failure (#7175)	Gangzheng Tong
	* skip recordReplay; fix retrying logic for unit test * Allow the CI to run with manual dispatch * increase failed test limit to 100 * reduce the serve count to 2
2025-05-20	Update build to allow setting external lua path (#7167)	lujinwangnv
	* Update build to allow setting external lua path Update the build to allow setting user-specific path for the external module lua. * T * Fix an include path
2025-05-17	fix the break to make sure only valid data will be accessed (#7148)	Gangzheng Tong

2025-05-16	Enable Windows full debug testsuite in CI (#7085)	Gangzheng Tong
	* Unify Debug Layer Control Logic and Add Disable Option for Debug Builds
	  This PR refactors and unifies the debug layer control logic in slang-test. A new `-disable-debug-layers` option is introduced, allowing debug builds to skip enabling the validation (debug) layer. This is currently needed to ensure stability in the debug test suite. Previously, different toggles such as ENABLE_VALIDATION_LAYER, ENABLE_DEBUG_LAYER, and debugLayerEnabled were used inconsistently across different components of slang-test. This PR standardizes the logic by using a single variable, debugLayerEnabled, to control the enabling/disabling of the debug layer internally. Notes: By default, the debug/validation layer is enabled in debug builds and is not supported in release builds of slang-test. Fixes: #7132
	* Disable spirv-opt for the DebugFunctionDefinition issue
	* Run debug build only in GCP machines
	* Fix VUID-vkCmdPipelineBarrier-pBufferMemoryBarriers-02818 dstAccessMask can't include VK_ACCESS_TRANSFER_READ_BIT when stage mask has VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_KHR
	* Set failed retry limit to 32
	---------
	Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
	Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-05-16	Fix broken -emit-spirv-via-glsl test option (#7091)	sricker-nvidia
	Fixes issue #6898
	The -emit-spirv-via-glsl slang-test option has been broken for some amount of time. Tests that were using it were operating as if using -emit-spirv-directly, leading to many duplicated tests. After fixing the test option, a number of errors appeared as a result. This change fixes the broken test option and the resulting test errors. Some of the test errors revealed legitimate issues, such as:
	- The GLSL bitCount intrinsic only supports 32-bit integers and requires emulation for other bit widths.
	- Emitting GLSL 8-bit and 16-bit integer types did not emit the proper extension requirements.
	- Emitting GLSL and casting for 16-bit integers was missing a closing parenthesis.
	- Missing profile for GL_EXT_shader_explicit_arithmetic_types.
	- Missing toType cases for UInt8/Int8 for the kIROp_BitCast case in tryEmitInstExprImpl.
2025-05-14	Make Command Line Reference readthedocs compatible (#7048)	aidanfnv
	This change modifies the code that generates the Command Line Reference doc to output H2 headings in place of H1 headings, and H3 in place of existing H2, so that readthedocs will not treat the additional H1 headings as titles. This change also regenerates the Command Line Reference doc, as the current copy in the repo appears to be quite out-of-date. The existing copy is also encoded as UTF-16LE, whereas the other docs are all UTF-8. The regenerated doc is also UTF-8, and all I did to generate that was run slangc.exe -help-style markdown -h > docs\command-line-slangc-reference.md 2>&1 after building slangc on Windows. This change also adds GitHub actions workflows to check the contents of the doc, fail if a regenerated version needs to be checked in, and provide an option to regenerate it with a bot, all in a similar manner to User Guide TOC regeneration. The doc writer was producing different results from my local build until I changed how the writer sorts the shader stages. In the action, the order of pixel and fragment was reversed, despite the only difference from my local build being the OS. --------- Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-05-12	Make CUDA version capabilities reach NVRTC (#7074)	Theresa Foley
	Fixes #7049 The root cause of the problem in #7049 is simply that newer NVRTC versions produce a warning when asked to generate code for older CUDA SM versions, and the default that Slang was requesting compilation for was old enough to trigger that warning, and thus trip up the test case (which only looks at the first diagnostic produced by the downstream compiler). Superficially, the fix was easy: change the test case in question (`tests/diagnostics/local-line.slang`) to request `-capability cuda_sm_8_0`, the minimum version supported by current NVRTC. Unfortunately, the simple fix required some other fixes in order to actually work. The capability system includes capability names of the form `cuda_sm__`, but specifying such a capability had no impact on the CUDA SM version passed in when invoking NVRTC. Instead, only the CUDA SM versions requested in the implementation of intrinsics in the core module were affecting the version number passed down. This change adds logic to `slang-compiler.cpp` to take explicitly requested capabilities into account when inferring the CUDA SM version to be passed downstream. A more complete fix would also add similar logic for all the other targets. Unfortunately... yet again... that fix wasn't enough to make things work as expected. Now I had the problem that requesting `-capability cuda_sm_8_0` was actually causing the NVRTC invocation to request CUDA SM version 9.0! The underlying problem there was that the `slang-capabilities.capdef` file has defined certain capability names in a way that implies atomic capabilities much higher than one would expect. E.g., the `cuda_sm_8_0` alias was including HLSL `sm_5_0`, but then `sm_5_0` in turn included `_cuda_sm_9_0`. The fix, for now, is to change the definitions in `slang-capabilities.capdef` to not have the counter-intuitive definitions for `cuda_sm__`. With this set of fixes, the test failure in the original bug report no longer occurs.
The work that went into this change suggests several larger-scope fixes that would be good to pursue: * Ideally the capability definitions would have some sort of validation checking to make sure that counter-intuitive results like `cuda_sm_8_0` requesting CUDA SM 9.0 do not occur. * The translation of capabilities over to version numbers for a downstream compiler should be expanded to cover other targets, and not just CUDA. It might be better/simpler to just pass the capabilities themselves to the downstream compiler, since it is possible that a downstream compiler could have more fine-grained enable/disable options than a simple version number. * The entire approach to computing version numbers required for downstream compilation should be cleaned up so that we don't have this duplication between the capabilities that represent those versions and separate syntactic constructs that are used to "request" those versions as part of code generation. * We are very much at the point where we should consider dropping the current behavior where a profile name or capability like `sm_5_0`, that is specific to a single target or a subset of targets, also implies a set of comparable capabilities for other targets.
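One way to picture "taking explicitly requested capabilities into account when inferring the CUDA SM version" is the sketch below. It rests on assumptions that the commit message does not confirm: that the inferred version is the maximum of what the intrinsics require and what was explicitly requested, and that capability names can be parsed textually. `parseCudaSMCapability` and `inferCudaSMVersion` are hypothetical helpers, not functions from `slang-compiler.cpp`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// (major, minor) pair; std::pair compares lexicographically, which matches version order.
using SMVersion = std::pair<int, int>;

// Hypothetical helper: parses names shaped exactly like "cuda_sm_<major>_<minor>".
std::optional<SMVersion> parseCudaSMCapability(const std::string& name) {
    int majorVersion = 0;
    int minorVersion = 0;
    char trailing = 0;
    // A result of exactly 2 means both numbers matched and nothing followed them.
    if (std::sscanf(name.c_str(), "cuda_sm_%d_%d%c", &majorVersion, &minorVersion, &trailing) == 2)
        return SMVersion{majorVersion, minorVersion};
    return std::nullopt;
}

// Assumed policy: the downstream version is the highest of the version required
// by intrinsics and any explicitly requested CUDA SM capability.
SMVersion inferCudaSMVersion(
    SMVersion requiredByIntrinsics, const std::vector<std::string>& requestedCapabilities) {
    assert(requiredByIntrinsics.first >= 0 && requiredByIntrinsics.second >= 0);
    SMVersion result = requiredByIntrinsics;
    for (const auto& name : requestedCapabilities) {
        if (auto version = parseCudaSMCapability(name))
            result = std::max(result, *version);
    }
    return result;
}
```

Under this policy, a request for `cuda_sm_8_0` raises a lower intrinsic-driven version to 8.0, and names that are not CUDA SM capabilities (such as `sm_5_0`) leave the result unchanged.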
2025-05-12	Cleanups related to RIFF support (#7041)	Theresa Foley

2025-05-11	Add a new option "-capability" to slang-test and render-test (#7054)	Jay Kwak

2025-05-10	Update build to allow setting more external paths (#7044)	lujinwangnv
	* Update build to allow setting more external paths Update the build to allow setting user-specific paths for the external modules: glm, imgui, slang-rhi, and tinyobjloader.
2025-05-06	Retry when a few unit tests failed. (#6912)	Jay Kwak
	This PR allows the failed unit-tests to be retried at the end in a single-threaded manner. The purpose of the retry is to increase the stability of CI.
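The retry scheme described above might look roughly like the following sketch. The `RunTest` callback and `runWithSerialRetry` function are hypothetical and are not taken from slang-test; the first pass is written as a plain loop where a real harness would run tests in parallel.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical test-runner signature: returns true when the named test passes.
using RunTest = std::function<bool(const std::string&)>;

// Runs every test once, then retries the failures one at a time.
// Returns the names of the tests that still fail after the retries.
std::vector<std::string> runWithSerialRetry(
    const std::vector<std::string>& tests, const RunTest& runTest, int maxRetries) {
    assert(runTest && maxRetries >= 0);

    // First pass: a real harness would distribute these across worker threads.
    std::vector<std::string> failed;
    for (const auto& name : tests) {
        if (!runTest(name))
            failed.push_back(name);
    }

    // Retry passes: strictly sequential, so failures caused by contention
    // between concurrently running tests get a clean second attempt.
    for (int attempt = 0; attempt < maxRetries && !failed.empty(); ++attempt) {
        std::vector<std::string> stillFailing;
        for (const auto& name : failed) {
            if (!runTest(name))
                stillFailing.push_back(name);
        }
        failed = std::move(stillFailing);
    }
    return failed;
}
```

A test that fails once and then passes disappears from the returned list, while a consistently failing test is still reported.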
2025-05-06	Update C++ standard to C++20 (#6980)	Ellie Hermaszewska
	* Correct incorrect enum usage on metal * Update C++ standard to C++20 Closes https://github.com/shader-slang/slang/issues/6945 * use bit_cast
2025-05-05	Correct incorrect enum usage on metal (#6994)	Ellie Hermaszewska

2025-05-02	Fix seg-fault in cudaCodeGenBug test (#6985)	Jay Kwak
	`cudaCodeGenBug` is expected to fail on Linux, because the variable `code` is nullptr. When the next test tried to dereference it, it caused a seg-fault.
2025-05-02	Fix intermittent failure of slang-unit-test-tool/ReplayRecord (#6981)	Jay Kwak
	* Fix intermittent failure of slang-unit-test-tool/ReplayRecord Three problems are addressed: 1. the graphics driver sometimes returns nullptr from GetShaderIdentifier 2. `findRecordFileName()` may not find any records at all. 3. the return value from cleanupRecordFiles() overwrote the error value in `res` and it returned SLANG_OK even when there were errors. * Fix compiler warnings on Windows
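The third problem listed above (a cleanup result overwriting an earlier error) is a common pattern; a minimal sketch of the fix follows. `Result`, `kOk`, and `runWithCleanup` are hypothetical names that only mimic the style of Slang's result codes.

```cpp
#include <cassert>
#include <functional>

// Hypothetical result codes in the style of SLANG_OK / failure values (negative means failure).
using Result = int;
constexpr Result kOk = 0;

inline bool failed(Result result) { return result < 0; }

// Runs `body`, then always runs `cleanup`. The first failure is the one reported:
// a successful cleanup must not mask an earlier error.
Result runWithCleanup(
    const std::function<Result()>& body, const std::function<Result()>& cleanup) {
    assert(body && cleanup);
    Result res = body();
    Result cleanupRes = cleanup();
    if (!failed(res))
        res = cleanupRes;  // only adopt the cleanup result when the body succeeded
    return res;
}
```

The buggy shape would be `res = cleanup();` unconditionally, which returns success whenever cleanup succeeds, regardless of what happened before it.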
2025-04-28	Add Slang Byte Code generation and interpreter. (#6896)	Yong He
	* Add Slang Byte Code generation and interpreter. * Fix compile issues. * format code * More compile fix. * Fix clang issue. * Fix more clang issues. * Another clang fix. * Fix clang issues. * Fix another clang issue. * Fix wasm build. * Update building.md * Fix test-server. * Fix compile error. * Fix bug. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-04-26	Added getCanonicalGenericConstraints2 (sorts constraints and allows more generic expressions) (#6787)	Ronan
2025-04-25	Update spirv-tools for SDK v2025.2 (#6893)	Gangzheng Tong
	* Update spirv-tools for SDK v2025.2 Fixes: #6850 * bump spirv version to 1.4 for op linkage * skip-spirv-validation for coop mat * add skip-spirv-validation option to slang session desc * use SPV_ENV_UNIVERSAL_1_6 for spirv-tool env target Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> --------- Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-04-24	update slang-rhi (#6587)	Simon Kallweit
	* update slang-rhi submodule * slang-rhi API changes * disable agility sdk * fix texture creation * update formats in tests * Extent3D rename * use 1 mip level for 1D textures for Metal * fix texture upload * update to latest slang-rhi * update slang-rhi * format code * update slang-rhi * do not run texture-intrinsics test on metal * update slang-rhi * deal with failing tests * fix more tests * update slang-rhi --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Simon Kallweit <simon.kallweit@gmail.com>
2025-04-23	Fixed various queryInterface implementations (#6863)	AlexisPollonni
	* Fix: Improper implementation in RendererBase::queryInterface If an arbitrary uuid was passed to RendererBase::queryInterface, it would return SLANG_OK while the outObject was null. This is improper and unexpected from an IUnknown implementation. Additionally, the function did not call addRef() when returning an IDevice interface. * Fix: DebugTransientResourceHeap::queryInterface returns wrong interface When querying for the transient heap with the debug layer enabled, queryInterface would set the outObject to the inner API-specific heap (e.g. vk::TransientResourceImpl) and NOT the debug heap. As a side effect, debug wrappers would not be used when creating a command buffer. The debug version would not be returned, and this snowballed into an access violation when trying to bind a compute pipeline state. After this fix, debug wrappers for transient heaps, command buffers, encoders, etc. will be used correctly. * fix weird whitespace change
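The contract described in the first fix can be sketched as follows. This is a simplified illustration, not the gfx implementation: `Guid`, `IUnknownLike`, `IDeviceLike`, `Device`, and the result constants are all hypothetical names. The sketch shows the two properties the commit message calls out: an unrecognized id yields a failure code with a null out-pointer, and a successful query adds a reference.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified stand-ins for COM-style types.
struct Guid { uint8_t bytes[16]; };
inline bool operator==(const Guid& a, const Guid& b)
{
    return std::memcmp(a.bytes, b.bytes, sizeof(a.bytes)) == 0;
}

using Result = int32_t;
constexpr Result kOk = 0;
constexpr Result kNoInterface = -1;

constexpr Guid kUnknownGuid = {{0}};
constexpr Guid kDeviceGuid = {{1}};

struct IUnknownLike
{
    virtual Result queryInterface(const Guid& guid, void** outObject) = 0;
    virtual uint32_t addRef() = 0;
    virtual uint32_t release() = 0;
    virtual ~IUnknownLike() = default;
};

struct IDeviceLike : IUnknownLike {};

class Device : public IDeviceLike
{
public:
    Result queryInterface(const Guid& guid, void** outObject) override
    {
        if (!outObject)
            return kNoInterface;
        if (guid == kUnknownGuid || guid == kDeviceGuid)
        {
            *outObject = static_cast<IDeviceLike*>(this);
            addRef(); // the caller now owns one reference
            return kOk;
        }
        // Unknown id: never report success with a null object.
        *outObject = nullptr;
        return kNoInterface;
    }

    uint32_t addRef() override { return ++m_refCount; }
    // No deletion at zero in this sketch, so instances can live on the stack.
    uint32_t release() override { return --m_refCount; }
    uint32_t refCount() const { return m_refCount; }

private:
    uint32_t m_refCount = 1;
};
```

By the same reasoning, a debug wrapper would hand out a pointer to itself from `queryInterface` rather than to the object it wraps, which is what the second fix restores.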
2025-04-22	A new approach to AST serialization (#6854)	Theresa Foley
	* A new approach to AST serialization This change completely overhauls the way that AST nodes are being serialized, and the offline source-code generation steps that enable that serialization. In practice, this ends up being a complete overhaul of the way that modules are being serialized (not just the AST part), although things like the serialization format for the Slang IR and for source locations are not affected. The rest of this commit message is broken down into sections, in an attempt to help guide anybody looking at the code in how to make sense of all the changes. The Old C++ Extractor --------------------- AST serialization used to be driven by information scraped using the `slang-cpp-extractor` tool, which did an ad hoc parse of the C++ declarations of the AST node types and then generated a set of "X macros" that could be used for macro-based code generation within the rest of the compiler. While the existing approach was functional, it wasn't easy to understand or maintain, and it has been getting in the way of forward progress on other features we'd like to work on in the language and compiler. This change removes the `slang-cpp-extractor` tool entirely. Marking Up the AST Declarations ------------------------------- The most notable change that contributors to the compiler may notice is the large number of invocations of a macro `FIDDLE()` on the declarations of the AST node types. The basic idea is that only declarations (namespaces, types, fields) that are preceded by `FIDDLE()` are visible to the code generator tool. So if somebody is working with the AST and wondering why a new node type isn't working, or why a field they added isn't being serialized correctly, it is probably because they need to add `FIDDLE()` in front of it. Generating the Boilerplate Code ------------------------------- The file `slang-ast-boilerplate.cpp` provides a good example of how the information extracted from the marked-up AST declarations gets used. 
In that file, the `FIDDLE TEMPLATE` construct is used to generate type information for each of the AST node types. Similar logic is used in `slang-ast-forward-declarations.h` to generate the declaration of the `ASTNodeType` enumeration, and forward-declare all the AST node classes. For many parts of the code, simply including that file replaces the need for the old `slang-generated-.h` files. Replacing Visitors and Related Logic ------------------------------------ The old visitor types for the AST used the macros that were generated by `slang-cpp-extractor`, so something new was needed to replace them. The same goes for the `SLANG_AST_NODE_VIRTUAL_CALL` macros. The core of the solution implemented here is in `slang-ast-dispatch.h`. Given a "dispatchable" AST node type (say, `Expr`), a call like: ``` ASTNodeDispatcher<Expr,R>(expr, [&](auto e) { return doSomething(e); }) ``` is an expression of type `R`, which does the equivalent of something like: ``` switch(expr->getTag()) { case ASTNodeType::VarExpr: return doSomething(static_cast<VarExpr*>(expr)); // ... } ``` The `SLANG_AST_NODE_VIRTUAL_CALL` macro is now implemented in terms of `ASTNodeDispatcher`. The implementation of the visitor types is more involved. The code in this change retains some of the macro names from the original version, just to try and make the parallels clearer. The visitor types are all implemented on top of the `ASTNodeDispatcher` approach, and use `FIDDLE TEMPLATE` to generate all the boilerplate `visit()` method declarations. Refactoring of `Linkage` Module Loading --------------------------------------- Needing to revisit all the places where modules get deserialized made it clear that there is a lot of complexity and apparent duplication in the core routines on the `Linkage` that get used for loading modules. 
This change tries to clean up some of that logic, but it is worth noting that there are two legacy features that get in the way of making things as clean as they should be: * The `LoadedModuleDictionary` type that gets passed around a lot exists entirely to handle the corner case where somebody uses the Slang API to perform a compilation with multiple `TranslationUnitRequest`s in the same `FrontEndCompileRequest`, and one of the translation units `import`s the module defined by another of the translation units. * There are a lot of special-case behaviors and routines entirely there to support the `ModuleLibrary` feature, although that feature should be considered deprecated (or at least subject to getting entirely re-designed down the line). The basic idea of the cleanup is that all of the (non-deprecated) ways to load a module from a serialized binary, or compile one from source, should now bottleneck through `loadModuleImpl`, which then bifurcates into `loadSourceModuleImpl` for the compilation case and `loadBinaryModuleImpl` for the deserialization case. High-Level Serialization Approach --------------------------------- The old serialization logic used the [RIFF](https://en.wikipedia.org/wiki/Resource_Interchange_File_Format) format to encode the high-level structure of things, and this change retains that usage (and actually doubles down on the RIFF usage). The old serialization system relied on the idea that for any given type `Foo` that wants to support serialization, there should be something like a `SerialFooData` type in C++, that can represent the state of a `Foo`, and then the actual serialization applied to that `SerialFooData`. 
This means that in most cases there are four pieces of code written: * During serialization: * Copying the data of a `Foo` in memory over to a `SerialFooData` in memory * Writing the state of a `SerialFooData` into the serialized data stream * During deserialization: * Reading the state of a `SerialFooData` from a serialized data stream * Copying the data of the `SerialFooData` in memory over to a `Foo` The new logic gets rid of the intermediate `SerialFooData`. In the serialization direction, we take a `Foo` and write it to the `RIFFContainer` directly, or using some other utilities layered on top of it. In the deserialization direction, we have additional flexibility. Given a `RIFFContainer::Chunk` that represents a serialized `Foo`, we often navigate through the in-memory representation of the RIFF data to get to the parts of the serialized value that we actually want/need, without needing to deserialize the entire `Foo`. To support this kind of operation, this change introduces a few helper types like `ContainerChunkRef` and `ModuleChunkRef`, that are little more than typed wrappers around a `RIFFContainer::Chunk`. The Module "Container" Part --------------------------- A serialized `Module` is encoded as a RIFF chunk, using logic in `slang-serialize-container.cpp` - both before and after this change. This change reorganizes a lot of the code in that file, to account for the way that eliminating the intermediate `SerialContainerData` type streamlines the overall task of writing out the parts of the module. In the deserialization logic... there isn't really much to do in `slang-serialize-container.cpp`. Most of the logic in `slang.cpp` and `slang-module-library.cpp` that pertains to deserializing modules uses the `ModuleChunkRef`-based approach, and simply extracts the pieces of the serialized module that it needs. The Actual Serialization of the AST ----------------------------------- The actual AST serialization logic is in `slang-serialize-ast.cpp`. 
The basic approach in both the writing and reading directions is: * Use the `FIDDLE TEMPLATE` system to generate a set of functions, one for each AST node type, that recursively invoke the read/write logic on each field of that node (after recursively invoking the case for its direct superclass) * Use the `ASTNodeDispatcher` system to dispatch out to those functions when reading or writing anything derived from `NodeBase` * For now, handle all types not derived from `NodeBase` by hand. There's a lot of room for improvement around that last item: it should be just as easy to generate the serialization and deserialization logic for other types that don't inherit from `NodeBase`, but the current change tries to err on the side of making the logic as explicit and simplistic as possible, rather than trying to get too clever too soon. The actual serialization format used for the AST is almost comically simplistic: the code uses hierarchical RIFF chunks to emulate a JSON-like structure. This is a very wasteful representation (e.g., a `bool` or a null pointer each take up 8 bytes), but the goal for now is to start with the simplest thing that could possibly work, and only add more cleverness once we are sure it won't get in the way of important future improvements (like lazy/on-demand deserialization of IR and AST, to improve compiler startup times). The files `slang-serialize.{h,cpp}` have been co-opted to define a new pair of types `Encoder` and `Decoder` that are used for a more-or-less stream-oriented way of reading or writing RIFF chunks for the JSON-like structure. Almost everything related to the actual AST serialization could do with a cleanup pass, and some time spent on picking good/better names for everything. Smaller Stuff ------------- * Cleaned up a lot of code that was using bare `ASTNodeType` or the extractor's `ReflectClassInfo` type to consistently use `SyntaxClass`. 
* Fixed an apparent bug in how the destination-driven code generator was handling `TryExpr`s * Fixed an apparent bug in how the GLSL legalization pass was handling translation of certain `SV_` semantics. * format code * fixup: template errors caught by non-VS compilers * format code * fixup: more template errors * fixup: more stuff VS didn't catch * fixup: it's amazing VS doesn't catch these... * fixup: yet more template stuff VS ignores * fixup: more VS template nonsense * fixup: unreachable return macro usage * fixup: more unreachable returns * fixup: unused parameter * fixup: strict aliasing * fixup: allow missing entry point list chunk * fixup: wasm build script * fixup: AST changes since this PR was created --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com>
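The tag-based dispatch described in the "Replacing Visitors and Related Logic" section of the commit message above can be sketched in miniature. This is an illustration of the general technique only: the node types, `NodeTag`, `dispatchExpr`, and `describe` are hypothetical and much simpler than whatever `slang-ast-dispatch.h` actually contains, where the cases are generated rather than written by hand.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical miniature AST with a tag stored in the base class.
enum class NodeTag { VarExpr, LiteralExpr };

struct Expr
{
    NodeTag tag;
    explicit Expr(NodeTag t) : tag(t) {}
};

struct VarExpr : Expr
{
    std::string name;
    explicit VarExpr(std::string n) : Expr(NodeTag::VarExpr), name(std::move(n)) {}
};

struct LiteralExpr : Expr
{
    int value;
    explicit LiteralExpr(int v) : Expr(NodeTag::LiteralExpr), value(v) {}
};

// Switch on the tag, downcast to the concrete type, and pass the typed
// pointer to a generic callable. R is the result type of the whole call.
template<typename R, typename F>
R dispatchExpr(Expr* expr, F&& f)
{
    switch (expr->tag)
    {
    case NodeTag::VarExpr:
        return f(static_cast<VarExpr*>(expr));
    case NodeTag::LiteralExpr:
        return f(static_cast<LiteralExpr*>(expr));
    }
    return R(); // not reached for valid tags
}

// Overloads chosen by the static type that the dispatcher recovers.
inline std::string describe(VarExpr* e) { return "var " + e->name; }
inline std::string describe(LiteralExpr* e) { return "literal " + std::to_string(e->value); }

inline std::string describeAny(Expr* expr)
{
    // The generic lambda is instantiated once per concrete node type.
    return dispatchExpr<std::string>(expr, [](auto* e) { return describe(e); });
}
```

Because the lambda is generic, overload resolution inside it happens per concrete type, which is what lets one call site replace a hand-written visitor with one method per node class.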
2025-04-18	Check the available VK extensions before using CoopVec APIs in GFX (#6849)	Jay Kwak
	* Check the available VK extensions before using CoopVec APIs in GFX * Remove a redundant request for cooperative vector extension for vk
2025-04-17	Add Yet Another Source Code Generator (#6844)	Theresa Foley
	* Add Yet Another Source Code Generator This change introduces an offline source code generation tool, provisionally called `fiddle`. More information about the design of the tool can be found in `tools/slang-fiddle/README.md`. Yes... this is yet another code generator in a project that already has too many. Yes, this could easily be a very obvious instance of [XKCD 927](https://xkcd.com/927/). This change is part of a larger effort to change how the AST types are being serialized, and the way code generation for them is implemented. Right now, the source code for the new tool is being checked in and the relevant build step is enabled, just to make sure everything is working as intended, but please note that this change does not introduce any code in the repository that actually makes use of the new generator. All of the AST-related reflection information that feeds the current serialization system is still being generated using `slang-cpp-extractor`. The design of the new tool is primarily motivated by the new approach to serialization that I'm implementing, and once that new approach lands we should be able to deprecate the `slang-cpp-extractor`. In addition, the new tool should in principle be able to handle many of the kinds of code generation tasks that are currently being implemented with other tools like `slang-generate` (used for the core and glsl libraries). This tool should also be well suited to the task of generating more of the code related to the IR instructions. * format code * Build fixes caught by CI * Fix another warning coming from CI * Another CI-caught fix * Change bare throws over to more proper abort exceptions * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-04-17	Fix compiler warning with clang 18.1.8 on windows (#6843)	Jay Kwak
	* Fix compiler warning with clang 18.1.8 on windows
2025-04-12	Add slang-test check for D3D11 double support (#6761)	aidanfnv
	Fixes #6171 This commit adds logic for reporting double support to the d3d11 backend, for running tests on GPUs that do not support D3D11_FEATURE_DOUBLES, and adds checks for that support to tests that require the feature.
2025-04-11	Fix benchmark/compile.py nested f-string and deprecated constant issues (#6773)	Gangzheng Tong
	This commit resolves three issues in the benchmark script: 1. Fixed nested f-string syntax error that was causing CI failures (fixes #6772) 2. Fixed key access error handling for timing dictionary The nested f-string was causing a syntax error in CI, and the proper fix is to create the key variable separately before using it. Also improved error handling for cases where compilation fails and timing data might be missing. Fixes: #6772
2025-04-07	Return non-escaped strings from user-defined attributes (#6735)	aidanfnv
	Fixes #6624 This commit changes the behavior of getArgumentValueString() to return the string's value, instead of returning the string's token, as that token also contains the surrounding quotation marks. This commit also modifies the relevant unit test accordingly, to not check for the surrounding quotations.
2025-04-07	Support for Payload Access Qualifiers (#3448) (#6595)	Harsh Aggarwal (NVIDIA)
	* Add support for Ray Payload Access Qualifiers (PAQs) (#3448) - Added [raypayload] attribute for struct declarations - Implemented field validation requiring read/write access qualifiers - Added diagnostic error for missing qualifiers - Enabled PAQs in DXC compiler and HLSL emission - Added new test demonstrating PAQ syntax - Implemented proper handling of ray payload attributes in IR generation * format code * Cleanup: Remove unused vars * Add check to enablePAQ only for profile >= lib_6_7 * Review Fix - Add PAQ support for DX Raytracing add enablePAQ flag to DownstreamCompileOptions, improve PAQ handling update raypayload-attribute-paq.slang to ensure hlsl and dxil are validated * Add diagnostic test for missing paq for lib_6_7 Compile using `-disable-payload-qualifiers` aka lib_6_6 profile raypayload-attribute-no-struct.slang and raypayload-attribute.slang --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
2025-04-04	fix(d3d11): correct parameter in VSSetConstantBuffers1 from uavCount … (#6690)	Harsh Aggarwal (NVIDIA)
	* fix(d3d11): correct parameter in VSSetConstantBuffers1 from uavCount to cbvCount (fixes #6531) Root cause - Incorrect parameter passing in slang-rhi 1. slang-rhi #281 - Add the correct cbvCount for setting Constant Buffer 2. Prevent render tests from overwriting reference images * Add missing tests/render/multiple-stage-io-locations.slang.3.expected.png * Add more expected images from texture2d-gather * Add new option: skipReferenceImageGeneration For Github CI we set this to true - So we don't overwrite the expected images --------- Co-authored-by: Jay Kwak <82421531+jkwak-work@users.noreply.github.com>
2025-04-03	Implement parameter block to slang-gfx for Metal backend (#6577)	kaizhangNV
	* implement parameterblock for metal Metal uses an argument buffer to pass a parameter buffer to the pipeline. In this change, we implement a simple way to copy the data to the argument buffer. Under the argument buffer tier 2 rules, all the fields in a parameter block are flattened to ordinary data, therefore: - We keep `m_data` in ShaderObjectImpl as a CPU buffer to track the data that has been set. - Resource types are represented as a device pointer or resource id in the argument buffer, so we just write their address or id at the corresponding offset in the CPU buffer every time 'setResource' or 'setSampler' is called. - When binding the pipeline, we simply copy the CPU argument buffer to the GPU argument buffer. - The only special case is a nested parameter block. Because a nested parameter block is represented as a device pointer to another argument buffer, we recursively call `_ensureArgumentBufferUpToDate` to get the sub-object's argument buffer, and write the GPU address of that 'sub'-argument buffer into the root argument buffer at the correct offset. * Inform command encoder to hazard track the bindless resources Since all the resources within an argument buffer are bindless, Metal won't automatically hazard track them. We have to call 'useResources' to inform Metal to hazard track those resources; otherwise we would have to wait on a fence after each command submission. * nullptr check * address comment
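The bookkeeping described in the parameter block commit above can be modeled without any Metal calls. The sketch below is a rough CPU-only analogy: the class `ParameterBlockSketch`, its methods, and the fake GPU addresses are hypothetical and only mirror the steps in the message (write resource ids at offsets, recurse into nested blocks, then copy the CPU buffer wholesale).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical model of a parameter block backed by a CPU-side copy of
// its argument buffer. Nothing here talks to a real GPU.
class ParameterBlockSketch
{
public:
    ParameterBlockSketch(std::size_t sizeInBytes, uint64_t fakeGpuAddress)
        : m_data(sizeInBytes, 0), m_gpuAddress(fakeGpuAddress)
    {
    }

    // Resources are stored as a 64-bit address or id at the field's offset.
    void setResource(std::size_t offset, uint64_t addressOrId) { write(offset, addressOrId); }

    // Nested blocks are recorded now and patched in when the buffer is refreshed.
    void setSubObject(std::size_t offset, ParameterBlockSketch* child)
    {
        m_children.push_back({offset, child});
    }

    // Refresh children first, write their addresses into this block,
    // then copy the CPU buffer to the (simulated) GPU buffer.
    uint64_t ensureArgumentBufferUpToDate()
    {
        for (const Child& c : m_children)
            write(c.offset, c.block->ensureArgumentBufferUpToDate());
        m_gpuCopy = m_data; // stands in for the CPU-to-GPU copy
        return m_gpuAddress;
    }

    // Read back a 64-bit value from the simulated GPU buffer.
    uint64_t readGpu(std::size_t offset) const
    {
        assert(offset + sizeof(uint64_t) <= m_gpuCopy.size());
        uint64_t v = 0;
        std::memcpy(&v, m_gpuCopy.data() + offset, sizeof(v));
        return v;
    }

private:
    struct Child
    {
        std::size_t offset;
        ParameterBlockSketch* block;
    };

    void write(std::size_t offset, uint64_t v)
    {
        assert(offset + sizeof(v) <= m_data.size());
        std::memcpy(m_data.data() + offset, &v, sizeof(v));
    }

    std::vector<uint8_t> m_data;    // CPU-side argument buffer contents
    uint64_t m_gpuAddress;          // pretend device address of this block
    std::vector<uint8_t> m_gpuCopy; // pretend GPU-side buffer
    std::vector<Child> m_children;  // nested parameter blocks
};
```

In a real backend the final copy would be an upload to a GPU buffer, and every resource written into the buffer would also need to be reported to the encoder for hazard tracking, as the second bullet of the commit explains.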
2025-03-25	Improve embed tool to search all include directories as determined by CMake (#6675)	Sai Praveen Bangaru
	* Improve embed tool to search all include directories as determined by CMake Hopefully this puts an end to prelude generation issues. * Update CMakeLists.txt * Update CMakeLists.txt * Use Slang's string representation instead of malloc-ing chars
2025-03-24	Don't load cached builtin module in slang-bootstrap. (#6667)	Yong He
	* Don't load cached builtin module in slang-bootstrap. * Fixes. * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-03-16	Add help screen to slang-test (#6611)	Gangzheng Tong
	This commit adds a help screen to slang-test to improve usability, and updates README.md with clearer instructions. The help screen is displayed when: - User explicitly requests help with -h or --help flags - An unknown option is provided - A required argument for an option is missing The help screen provides comprehensive documentation of all available options, organized into sections: - Basic options (e.g. bindir, test-dir) - Output modes (e.g. appveyor, travis, teamcity) - Test prefix usage explanation Fixes: #6560 Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
2025-03-13	test for link type layout caching (#6567)	Ellie Hermaszewska
	* format code * test for link type layout caching --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-03-13	test that link time extern struct layouts are visible for nested types (#6568)	Ellie Hermaszewska
	closes https://github.com/shader-slang/slang/issues/6556