slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2025-05-14	Make Command Line Reference readthedocs compatible (#7048)	aidanfnv
	This change modifies the code that generates the Command Line Reference doc to output H2 headings in place of H1 headings, and H3 in place of existing H2, so that readthedocs will not treat the additional H1 headings as titles. This change also regenerates the Command Line Reference doc, as the current copy in the repo appears to be quite out-of-date. The existing copy is also encoded as UTF-16LE, whereas the other docs are all UTF-8. The regenerated doc is also UTF-8, and all I did to generate that was run slangc.exe -help-style markdown -h > docs\command-line-slangc-reference.md 2>&1 after building slangc on Windows. This change also adds GitHub actions workflows to check the contents of the doc, fail if a regenerated version needs to be checked in, and provide an option to regenerate it with a bot, all in a similar manner to User Guide TOC regeneration. The doc writer was producing different results from my local build until I changed how the writer sorts the shader stages. In the action, the order of pixel and fragment was reversed, despite the only difference from my local build being the OS. --------- Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-05-14	Error out on invalid vector sizes (#7076)	Darren Wihandi
	* Error out on invalid vector sizes * Remove unnecessary include * Fix incorrect assert * Add test
2025-05-13	Fix invalid memory dereference in lower-to-ir (#7080)	Jay Kwak
	A reference-counting pointer type released a heap memory object when it return from the function and we are trying to dereference it later. We should increment the ref-count by one by assigning it to the context before returning.
2025-05-13	Support Array Sizes using Generic arguments to be initialized via {} (#6720)	Sruthik P
	* Add support for Array Sizes using Generic arguments to be initialized via {} Fixes one subissue of #6138 This change adds support for initializing Arrays with Generic size arguments via {} and adds a test to verify it. The change checks for an array whose size parameter is a GenericParamIntVal and since the size of such an array will be known at link time, is not considered as a case of the size not being known statically. * Add support for Array Sizes using Generic arguments to be initialized via {} Fixes one subissue of #6138. Fixes the issue #6958. This change adds support for initializing Arrays with Generic size arguments via {} and adds a test to verify it. Support is added by means of adding a new AST Expr node that lowers down to the IR MakeArrayFromElement and the emission of a diagnostic is replaced with the creation of this new AST Expr node. * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
2025-05-13	Add half-precision matrix type aliases in GLSL (#7066)	James Helferty (NVIDIA)
	Fixes #6708 This commit adds type aliases for half-precision matrices, including f16mat3x2, f16mat3x3, f16mat3x4, f16mat4x2, f16mat4x3, and f16mat4x4. Convenience aliases for square matrices (f16mat2, f16mat3, f16mat4) are also added. This commit introduces a new test file that validates the usage of half-precision types in a compute shader context.
2025-05-12	Make CUDA version capabilities reach NVRTC (#7074)	Theresa Foley
	Fixes #7049 The root cause of the problem in #7049 is simply that newer NVRTC versions produce a warning when asked to generate code for older CUDA SM versions, and the default that Slang was requesting compilation for was old enough to trigger that warning, and thus trip up the test case (which only looks at the first diagnostic produced by the downstream compiler). Superficially, the fix was easy: change the test case in question (`tests/diagnostics/local-line.slang`) to request `-capability cuda_sm_8_0`, the minimum version supported by current NVRTC. Unfortunately, the simple fix required some other fixes in order to actually work. The capability system includes capability names of the form `cuda_sm__`, but specifying such a capability had no impact on the CUDA SM version passed in when invoking NVRTC. Instead, only the CUDA SM versions requested in the implementation of intrinsics in the core module were affecting the version number passed down. This change adds logic to `slang-compiler.cpp` to take explicitly requested capabilities into account when inferring the CUDA SM version to be passed downstream. A more complete fix would also add similar logic for all the other targets. Unfortunately... yet again... that fix wasn't enough to make things work as expect. Now I had the problem that requesting `-capability cuda_sm_8_0` was actually causing the NVRTC invocation to request CUDA SM version 9.0! The underlying problem there was that the `slang-capabilities.capdef` file has defined certain capability names in a way that implies atomic capabilities much higher than one would expect. E.g., the `cuda_sm_8_0` alias was including HLSL `sm_5_0`, but then `sm_5_0` in turn included `_cuda_sm_9_0`. The fix, for now, is to change the definitions in `slang-capabilities.capdef` to not have the counter-intuitive definitions for `cuda_sm__`. With this set of fixes, the test failure in the original bug report no longer occurs. The work that went into this change suggests several larger-scope fixes that would be good to pursue: * Ideally the capability definitions would have some sort of validation checking to make sure that counter-intuitive results like `cuda_sm_8_0` requesting CUDA SM 9.0 do not occur. * The translation of capabilities over to version numbers for a downstream compiler should be expanded to cover other targets, and not just CUDA. It might be better/simpler to just pass the capabilities themselves to the downstream compiler, since it is possible that a downstream compiler could have more fine-grained enable/disable options than a simple version number. * The entire approach to computing version numbers required for downstream compilation should be cleaned up so that we don't have this duplication between the capabilities that represent those versions and separate syntactic constructs that are used to "request" those versions as part of code generation. * We are very much at the point where we should consider dropping the current behavior where a profile name or capability like `sm_5_0`, that is specific to a single target or a subset of targets, also implies a set of comparable capabilities for other targets.
2025-05-12	Ensure to emit 32-bit type for bitfield operations for SPIRV (#7046)	kaizhangNV
	Close #7014
2025-05-12	Cleanups related to RIFF support (#7041)	Theresa Foley

2025-05-12	add missing namespace std:: for nullptr_t (error while building with clang) ↵	Pierre EVEN
	(#7059) Co-authored-by: Julius Ikkala <julius.ikkala@gmail.com>
2025-05-12	cluster acceleration structure optix 6431 (#7028)	Harsh Aggarwal (NVIDIA)
	* Add cluster geometry intrinsics for ray tracing - Added GetClusterID() method to HitObject class - Added CandidateClusterID() and CommittedClusterID() methods to RayQuery class - Added SPV_NV_cluster_acceleration_structure extension support - Added GL_NV_cluster_acceleration_structure extension support - Added test files for RayQuery and HitObject cluster methods Fixes #6431 * OpRayQueryGetIntersectionClusterIdNV - unrecognized spirv Disabling spirv backend for SPV_NV_cluster_acceleration_structure hlsl.meta.slang(18674): error 29100: unrecognized spirv opcode: OpRayQueryGetIntersectionClusterIdNV result:$$int = OpRayQueryGetIntersectionClusterIdNV &this $iCandidateOrCommitted; ^~~~~~ hlsl.meta.slang(18670): error 30019: expected an expression of type 'int', got 'void' return spirv_asm ^~~~~~~~~ ninja: build stopped: subcommand failed. * 6431 - Fix spirv opcode * Remove tests * Add relevant tests * Review - Simplify tests
2025-05-10	Fix local constants in switch cases (#7053)	Julius Ikkala
	* Fix using local constants in switch cases * Add test * format code * Always lower switch cases with exprVal * Fix formatting --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com>
2025-05-10	Add debug information for slang inling (#6621)	Mukund Keshava

2025-05-09	Fix SPIRV unsigned to signed widening casts (#7051)	Darren Wihandi
	* Fix unsigned to signed casts for SPIRV * Add test * Fix ICE crash
2025-05-09	Fix various intptr_t issues by defining its width in `getIntTypeInfo` (#6786)	Julius Ikkala
	* Define a bit size for the intptr types * Fix intptr_t sign * Extend intptr test to check for previously broken operations * Fix intptr vector test on CUDA * Handle intptr size in getAnyValueSize * Fix formatting * Try with __ARM_ARCH_ISA_64 * On macs, int64_t != intptr_t Yikes * Move define to prelude header * Also check apple in host-prelude * Fix define location
2025-05-09	Fixed name mangling of generic extensions (#6671)	Ronan
	* Fixed name mangling of generic extensions * Added tests for generic extensions linking. - Disabled bugs/gh-6331.slang since it now triggers an assertion error revealed by the new version of the mangler. * Re-enabled test gh-6331 (fixed by a5efbb1b775afb2f6b29b37d39947c41744bb005) --------- Co-authored-by: Yong He <yonghe@outlook.com>
2025-05-09	Destroy unused witness table after hoisting (#7009)	Gangzheng Tong
	* Destroy unused witness table after hoisting
2025-05-09	Restore Properties section name in stdlib reference (#7042)	aidanfnv

2025-05-08	Fix broken links in Slang docs (#7039)	aidanfnv
	This change fixes some broken links/anchors in the Slang docs that I found after running a link checker tool on the readthedocs site. Some of these were broken on GitHub Pages as well, such as anchors with apostrophes or docs that were moved. The doc writer change adds an explicit .html extension to Parameter links, so that Sphinx/RTD does not treat it as an anchor.
2025-05-07	Add new Valve vfx sections for mesh shaders. (#7030)	Dan Ginsburg

2025-05-06	bitcast require the input has same width with result type (#7018)	kaizhangNV
	bitcast requires the input has same width with result type, this PR ensures that we always lower the bitcast IR instruction satisfies this requirement. Close #7017.
2025-05-07	Output default parameter values on language server hover (#6902)	Ellie Hermaszewska
	* Output default parameter values on language server hover Closes https://github.com/shader-slang/slang/issues/5782 * Add parens around binay expressions in ast print * More cases in ast print test * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-05-06	Parse char literals as integers (#6989)	Julius Ikkala
	* Parse char literals as integers * Fix formatting * Parse escaped chars correctly --------- Co-authored-by: Yong He <yonghe@outlook.com>
2025-05-06	Add interger pack/unpack intrinsic to glsl (#6997)	kaizhangNV
	* Add interger pack/unpack intrinsic to glsl Add unpack8, unpack16 and pack32 intrinsics to glsl. * add unit test * fix lowering logic for 'bitcast' * format
2025-05-06	Update C++ standard to C++20 (#6980)	Ellie Hermaszewska
	* Correct incorrect enum usage on metal * Update C++ standard to C++20 Closes https://github.com/shader-slang/slang/issues/6945 * use bit_cast
2025-05-05	Fix the intermittent failures of tests/autodiff/auto-differential-type (#7006)	Jay Kwak
	The use of `_wfopen_s()` was incorrect in a way the the result value had to be checked. However, even with a proper handling of the failed case, the repro-rate was same. The issue was reproduced at 5%~20% rate with the following commands, build/Release/bin/slang-test.exe tests/autodiff/auto-differential-type -api dx11 -use-test-server -server-count 8 Co-authored-by: Yong He <yonghe@outlook.com>
2025-05-05	Add a new capability hlsl_2018 that avoid using select/and/or (#7003)	Jay Kwak
	Co-authored-by: Yong He <yonghe@outlook.com>
2025-05-05	Add countbits 16-bit and 8-bit support (#6433) (#6897)	sricker-nvidia
	Change adds 16-bit and 8-bit support for countbits intrinsic. In cases where a backend's native counbits lacks support, support is emulated. New tests are added for 16-bit and 8-bit support. Additional testing added for 32-bit and minor updates made to 64-bit countbits.
2025-05-03	Add IREnumType to distinguish enums from ints and each other (#6973)	Julius Ikkala
	* Add IREnumType to distinguish enums from ints and each other * Add issue example as test * format code * Add expected test output * Fix peephole optimization hanging No idea why this PR triggered this, but there seems to have been a clear bug here anyway, so may just as well fix it now. * Move enum lowering later * Add linkage decoration to enum type * Use filecheck-buffer instead of expected.txt * Fix comment * Make enum casts actually use IR enum casts They were all BuiltinCasts by accident * Lower enum type before VM * Deal with rate-qualified types in enum cast * Allow any value marshalling for enum types * Handle new enum instructions in a couple more switches * Fix formatting --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-05-02	Fix build on GCC 15 (#6971)	Julius Ikkala
	* Fix build on GCC 15 * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-05-01	Add fwidth_coarse and fwidth_fine functions (#6941)	pdeayton-nv
	Fixes #6940. Add new Slang fwidth_coarse and fwidth_fine functions, similar to GLSL's fwidthCoarse and fwidthFine. Move the implementation of the GLSL functions from glsl.meta.slang to hlsl.meta.slang. Update the existing spirv/fwidth.slang test with the new functions, and add a new hlsl-intrinsic/fragment-derivative.slang test to test HLSL, SPIR-V, and GLSL targets for the new functions.
2025-05-01	Modify markdown writing to produce readthedocs pages (#6904)	aidanfnv
	This commit makes multple changes to the slang doc markdown writer to make the stdlib reference pages usable with readthedocs. - Adds tables of contents at the bottom of every page with children. TOCs contain category landing pages where applicable and contain uncategorized child pages - Changes links on the pages to use relative paths instead of absolute paths - Changes anchor tags to use HTML syntax instead of markdown syntax The overall TOC on readthedocs will look the same as the TOC seen on the existing docs website, and I have verfified that links and anchors will still work. The tables of contents generated for use on readthedocs would be visible on the existing docs website. This change wraps the TOCs in a comment block, so that it will be hidden from view. The readthedocs build script can filter out the comments to unhide the TOC inside and still properly render it there with it remaining hidden when viewed elsewhere. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-05-01	Smoke test WASM as a part of CI (#6969)	Jay Kwak
	* Simplify build of slang-wasm * Add smoke-test for slang-wasm in ci * Avoid git-clone playground
2025-04-30	Add `IOpaqueDescriptor::descriptorAccess`. (#6967)	Yong He
	* Add `IOpaqueHandle::descriptorAccess`. * Update doc. * fix.
2025-04-30	Initial support for immutable lambda expressions. (#6914)	Yong He
	* Initial support for immutable lambda expressions. * More diagnostics, and langauge server fix. * Language server fix. * Fix bug identified in review. * Add expected result. * Update expected result.
2025-04-30	Fix compiler warning with clang 18.1.8 on windows (#6963)	Jay Kwak

2025-04-30	cuda: Improve entry handling for SV_DispatchThreadID (#6925)	Mukund Keshava
	* cuda: Improve entry handling for SV_DispatchThreadID Fixes #6780 This commit improves CUDA entry point handling by extracting appropriate components from DispatchThreadID based on parameter type. It now properly handles uint scalar (x component only) and uint2 vector (x,y components) instead of always using the full uint3 value. Add a new test case to check for this. * format code * fix CI failure * Handle review comments --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
2025-04-30	Add subscript operator support in cuda (#6830)	Mukund Keshava
	* cuda: Add support for subscript operator This CL adds support for the subscript operator for Read Only textures in cuda. Also adds a test for this. Fixes #6781 * format code * fix review comments * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
2025-04-29	Fixes related to AST serialization change (#6953)	Theresa Foley
	* Fixes related to AST serialization change There are two fixes included here. The smaller fix is in `slang-ast-decl.h`, where the `CallableDecl::primaryDecl` and `::nextDecl` fields need to be serialized to make sure that we can properly deserialize a module that contains any function redeclarations. The larger fix is that the `Encoder` and `Decoder` types used to serialize out the AST nodes in a JSON-like hierarchy were being very strict about matching of integer types, which causes problems in any case where serialization code might use a type that is 32-bit on some targets and 64-bit on others, if serialized modules are ever created on one of those targets and consumed on the other (which happens for the core module for some of our targets). The fix that this change implements is to make the serialization logic there more forgiving, and thus more robust: * In the writing/encoding direction, the logic now looks at the actual value being encoded to decide whether to write it as a 32- or 64-bit value. * In the reading/decoding direction, the logic handles the presence of any of the FOURCCs that are used to encode integers, for whatever type is being read. As a small bit of safety, there is a dynamic check made for cases where a value would be read with a different sign than it was written with. The actual logic in `slang-serialize-ast.cpp` is largely unchanged (it continues to use pointer-sized integers in certain cases), but it should not cause problems because it bottlenecks through the `Encoder`/`Decoder` methods that were changed. The only fix made in the AST serialization itself is to account for all of the FOURCCs that can represent integers when peeking at the input to decide whether a `DeclBase` is represented as an indirection to a `Decl`, or is serialized inline (as a `DeclGroup`). * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-04-29	Support use of `this` with Mesh Shader Outputs (`IRMeshOutputRef`) (#6920)	16-Bit-Dog
	* Support `this` use of `IRMeshOutputRef` * Fix SPIR-V val error There was a type mismatch causing a spir-v failiure: `store(var, var)` instead of `store(var, load(var))`
2025-04-28	Add Slang Byte Code generation and interpreter. (#6896)	Yong He
	* Add Slang Byte Code generation and interpreter. * Fix compile issues. * format code * More compile fix. * Fix clang issue. * Fix more clang issues. * Another clang fix. * Fix clang issues. * Fix another clang issue. * Fix wasm build. * Update building.md * Fix test-server. * Fix compile error. * Fix bug. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-04-27	Emit SPIRV NonReadable decoration for WTexture* (#6922)	Darren Wihandi

2025-04-26	Added getCanonicalGenericConstraints2 (sorts constraints and allows more ↵	Ronan
	generic expressions) (#6787)
2025-04-25	Update spirv-tools to for SDK v2025.2 (#6893)	Gangzheng Tong
	* Update spirv-tools to for SDK v2025.2 Fixes: #6850 * bump spirv version to 1.4 for op linkage * skip-spirv-validation for coop mat * add skip-spirv-validation option to slang session desc * use SPV_ENV_UNIVERSAL_1_6 for spirv-tool env target Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> --------- Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-04-24	Implemented #pragma warning (#6748)	Ronan
	* Implemented #pragma warning Based on https://learn.microsoft.com/en-us/cpp/preprocessor/warning?view=msvc-170 * Make #pragma warning work with #includes. - SourceLoc are not sorted by inclusion order. - Construct a mapping from SourceLoc to "absolute locations" that are sorted by inclusion order (roughly represents a location in a raw file with all #include resolved). - The absolute location can be used in the pragma warning timeline * Added preprocessor #pragma warning tests. - Fixed #pragma warning (push / pop) SourceLoc - Fixed unused directiveLoc in #pragma warning parsing * #pragma warning: Added some comments and fixed some typos * Cleaned #pragma warning preprocessor implementation. --------- Co-authored-by: Yong He <yonghe@outlook.com>
2025-04-24	Handle case where either side of a conditional branch is empty (#6890)	Sai Praveen Bangaru
	Co-authored-by: Yong He <yonghe@outlook.com>
2025-04-25	Fix attempt to construct an abstract AST node (#6905)	Theresa Foley
	Work on #6892 The function `tryInferLoopMaxIterations` in `slang-check-stmt.cpp` was doing: auto litExpr = m_astBuilder->create<LiteralExpr>(); but the declaration of `LiteralExpr` in `slang-ast-expr.h` had been marked as abstract, during the switch to using the `slang-fiddle` tool to generate code: FIDDLE(abstract) class LiteralExpr : public Expr { ... } In this case, the intention of the AST design is that `LiteralExpr` should be kept abstract, and only the concrete subclasses should ever be instantiated. Because of some historical design choices, the `ASTNodeType` enumeration includes both the concrete and abstract AST classes, so the code ended up constructing a `LiteralExpr` that had the tag `ASTNodeType::LiteralExpr`. Then attempts to use the `ASTNodeDispatcher` code on such a node caused crashes, because the dispatcher code only included `case` statements for the non-abstract `ASTNodeType`s. The quick fix here was to change the `tryInferLoopMaxIterations` function to instead do: auto litExpr = m_astBuilder->create<IntegerLiteralExpr>(); A test case was added to help catch any future regressions on this specific issue. A more long-term fix should involve introducing code that statically and/or dynamically prohibits the creation of instances of AST node classes that have been marked abstract.
2025-04-24	Fix #6544: Properly format nested type names in extensions (#6769)	Harsh Aggarwal (NVIDIA)
	* Fix #6544: Properly format nested type names in extensions Modify DeclRefBase::toText to properly handle types defined in extensions by qualifying them with their parent type name. This ensures getFullName() returns the full name like 'FullPrecisionOptimizer<half>.State' instead of just '.State'. Also handle other nested types in structs/classes similarly. * Update extension reflection handling - with generics args and namespaces - stopping namespace inclusion for extension members - Update to use getTargetType() to handle the generic arguments - update test cases * Simplify code to remove using parentDecl
2025-04-22	A new approach to AST serialization (#6854)	Theresa Foley
	* A new approach to AST serialization This change completely overhauls the way that AST nodes are being serialized, and the offline source-code generation steps that enable that serialization. In practice, this ends up being a complete overhaul of the way that modules are being serialized (not just the AST part), although things like the serialization format for the Slang IR and for source locations are not affected. The rest of this commit message is broken down in to sections, in an attempt to help guide anybody looking at the code in how to make sense of all the changes. The Old C++ Extractor --------------------- AST serialization used to be driven by information scraped using the `slang-cpp-extractor` tool, which did an ad hoc parse of the C++ declarations of the AST node types and then generated a set of "X macros" that could be for macro-based code generation within the rest of the compiler. While the existing approach was functional, it wasn't easy to understand or maintain, and it has been getting in the way of forward progress on other features we'd like to work on in the language and compiler. This change removes the `slang-cpp-extractor` tool entirely. Marking Up the AST Declarations ------------------------------- The most notable change that contributors to the compiler may notice is the large number of invocations of a macro `FIDDLE()` on the declarations of the AST node types. The basic idea is that only declarations (namespaces, types, fields) that are preceded by `FIDDLE()` are visible to the code generator tool. So if somebody is working with the AST and wondering why a new node type isn't working, or why a field they added isn't being serialized correctly, it is probably because they need to add `FIDDLE()` in front of it. Generating the Boilerplate Code ------------------------------- The file `slang-ast-boilerplate.cpp` provides a good example of how the information extracted from the marked-up AST declarations gets used. In that file, the `FIDDLE TEMPLATE` construct is used to generate type information for each of the AST node types. Similar logic is used in `slang-ast-forward-declarations.h` to generate the declaration of the `ASTNodeType` enumeration, and forward-declare all the AST node classes. For many parts of the code, simply including that file replaces the need for the old `slang-generated-.h` files. Replacing Visitors and Related Logic ------------------------------------ The old visitor types for the AST used the macros that were generated by `slang-cpp-extractor`, so something new was needed to replace them. The same goes for the `SLANG_AST_NODE_VIRTUAL_CALL` macros. The core of the solution implemented here is in `slang-ast-dispatch.h`. Given a "dispatchable" AST node type (say, `Expr`), a call like: ``` ASTNodeDispatcher<Expr,R>(expr, [&](auto e) { return doSomething(e); }) ``` is an expression of type `R`, which does the equivalent of something like: ``` switch(expr->getTag()) { case ASTNodeType::VarExpr: return doSomething(static_cast<VarExpr>(expr)); // ... } ``` The `SLANG_AST_NODE_VIRTUAL_CALL` macro is now implemented in terms of `ASTNodeDispatcher`. The implementation of the visitor types is more involved. The code in this change retains some of the macro names from the original version, just to try and make the parallels more clear. The visitor types are all implemented on top of the `ASTNodeDispatcher` approach, and use `FIDDLE TEMPLATE` to generate all the boilerplate `visit()` method declarations. Refactoring of `Linkage` Module Loading --------------------------------------- Needing to revisit all the places where modules get deserialized made it clear that there is a lot of complexity and apparent duplication in the core routines on the `Linkage` that get used for loading modules. This change tries to clean up some of that logic, but it is worth noting that there are two legacy features that get in the way of making things as clean as they should be: The `LoadedModuleDictionary` type that gets passed around a lot exists entirely to handle the corner case where somebody uses the Slang API to perform a compilation with multiple `TranslationUnitRequest`s in the same `FrontEndCompileRequest`, and one of the translation units `import`s the module defined by another of the translation units. * There are a lot of special-case behaviors and routines entirely there to support the `ModuleLibrary` feature, although that feature should be considered deprecated (or at least subject to getting entirely re-designed down the line). The basic idea of the cleanup is that all of the (non-deprecated) ways load a module from a serialized binary, or compile one from source should now bottleneck through `loadModuleImpl`, which then bifurcates into `loadSourceModuleImpl` for the compilation case and `loadBinaryModuleImpl` for the deserialization case. High-Level Serialization Approach --------------------------------- The old serialization logic used the [RIFF](https://en.wikipedia.org/wiki/Resource_Interchange_File_Format) format to encode the high-level structure of things, and this change retains that usage (and actually doubles down on the RIFF usage). The old serialization system relied on the idea that for any given type `Foo` that wants to support serialization, there should be something like a `SerialFooData` type in C++, that can represent the state of a `Foo`, and then the actual serialization applied to that `SerialFooData`. This means that in most cases there are four pieces of code written: * During serialization: * Copying the data of a `Foo` in memory over to a `SerialFooData` in memory * Writing the state of a `SerialFooData` into the serialized data stream * During deserialization: * Reading the state of a `SerialFooData` from a serialized data stream * Copying the data of the `SerialFooData` in memory over to a `Foo` The new logic gets rid of the intermediate `SerialFooData`. In the serialization direction, we take a `Foo` and write it to the `RIFFContainer` directly, or using some other utilities layered on top of it. In the deserialization direction, we have additional flexibility. Given a `RIFFContainer::Chunk` that represents a serialized `Foo`, we often navigate through the in-memory representation of the RIFF data to get to the parts of the serialized value that we actually want/need, without needing to deserialize the entire `Foo`. To support this kind of operation, this change introduces a few helper types like `ContainerChunkRef` an `ModuleChunkRef`, that are little more than typed wrappers around a `RIFFContainer::Chunk`. The Module "Container" Part --------------------------- A serialized `Module` is encoded as a RIFF chunk, using logic in `slang-serialize-container.cpp` - both before and after this change. This change reorganizes a lot of the code in that file, to account for the way that eliminating the intermediate `SerialContainerData` type streamlines the overall task of writing out the parts of the module. In the deserialization logic... there isn't really much to do in `slang-serialize-container.cpp`. Most of the logic in `slang.cpp` and `slang-module-library.cpp` that pertains to deserializing modules uses the `ModuleChunkRef`-based approach, and simply extracts the pieces of the serialized module that it needs. The Actual Serialization of the AST ----------------------------------- The actual AST serialization logic is in `slang-serialize-ast.cpp`. The basic approach in both the writing and reading directions is: * Use the `FIDDLE TEMPLATE` system to generate a set of functions, one for each AST node type, that recursively invoke the read/write logic on each field of that node (after recursively invoking the case for its direct superclass) * Use the `ASTNodeDispatcher` system to dispatch out to those functions whene reading or writing anything derived from `NodeBase` * For now, handle all types not derived from `NodeBase` by hand. There's a lot of room for improvement around that last item: it should be just as easy to generate the serialization and deserialization logic for other types that don't inherit from `NodeBase`, but the current change tries to err on the side of making the logic as explicit and simplistic as possible, rather than trying to get too clever too soon. The actual serialization format used for the AST is almost comically simplistic: the code uses hierarchical RIFF chunks to emulate a JSON-like structure. This is a very wasteful representation (e.g., a `bool` or a null pointer each take up 8 bytes), but the goal for now is to start with the simplest thing that could possibly work, and only add more cleverness once we are sure it won't get in the way of important future improvements (like lazy/on-demand deserialization or IR and AST, to improve compiler startup times). The files `slang-serialize.{h,cpp}` have been co-opted to define a new pair of types `Encoder` and `Decoder` that are used for a more-or-less stream-oriented way or reading or writing RIFF chunks for the JSON-like structure. Almost everything related to the actual AST serialization could do with a cleanup pass, and some time spent on picking good/better names for everything. Smaller Stuff ------------- * Cleaned up a lot of code that was using bare `ASTNodeType` or the extractor's `ReflectClassInfo` type to consistently use `SyntaxClass`. * Fixed an apparent bug in how the destination-driven code genarator was handling `TryExpr`s * Fixed an apparent bug in how the GLSL legalization pass was handling translation of certain `SV_` semantics. format code * fixup: template errors caught by non-VS compilers * format code * fixup: more template errors * fixup: more stuff VS didn't catch * fixup: it's amazing VS doesn't catch these... * fixup: yet more template stuff VS ignores * fixup: more VS template nonsense * fixup: unreachable return macro usage * fixup: more unreacable returns * fixup: unused parameter * fixup: strict aliasing * fixup: allow missing entry point list chunk * fixup: wasm build script * fixup: AST changes since this PR was created --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com>
2025-04-22	Implement shader subgroup rotate intrinsics (#6878)	Darren Wihandi
	* Initial implementation for SPIRV, GLSL and Metal * test add bool test * Fix and improve subgroup rotate tests * Add proper GLSL extensions and proper Metal type checking * Clean up tests and add diagnostics test for subgroup type for Metal * Update wave-intrinsics docs
2025-04-22	Add a new SM profile 6.9 (#6879)	Jay Kwak