slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2025-07-01	Add arguments for controlling floating point denormal mode (#7461)	aidanfnv
	* Implement -fp-denorm-mode slangc arg * Split fp-denorm-mode into 3 args for fp16/32/64 * Remove redundant option categories * Use emitInst for multiple of the same OpExecutionMode * Fix formatting * Remove -denorm any * Re-add option categories * emitinst for ftz * Use enums for type text * Remove extra categories again * Add tests for denorm mode * Move denorm mode to post linking * format code (#8) Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> * regenerate command line reference (#9) Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> * Clean up tests * Fix option text * format code (#10) Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> * Add tests for "any" mode * Return "any" enum if option not set * Simplify emission logic * Add support for generic entrypoints * Move denorm modes to end of CompilerOptionName enum * format code (#11) Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> * Move new enum members to before CountOf * Add not checks to tests, fix generic test, add functionality tests * Rename denorm to fpDenormal * Clean up functional test * Rename denorm test dir * Fix formatting, regenerate cmdline ref * Fold simple tests into functional tests, add more dxil checks * Remove no-op DX tests, make tests more consistent * Disable VK functionality tests that will fail on the CI configs * Fix formatting * Add comments to disabled tests explaining why --------- Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-07-01	Remove some cruft/complexity from IR serialization (#7483)	Theresa Foley
	* Remove some cruft/complexity from IR serialization This is a very simple cleanup to unnecessary code paths and remove some flexibility that isn't actually needed, to hopefully simplify the task of more completely overhauling the approach to IR serialization in a later change. The concrete feature that gets removed here is a debug-only feature (which thus shouldn't be affecting any users of Slang) that was added long ago in the life of the compiler as we were working to truly separate the front- and back-ends. At the time there was a lot of code in the compiler back-end that still made use of AST-level data structures, and thus got in the way of our goal to support separate compilation and linking (such that final code generation can only depend on the IR, and not the AST). The option was used to cause the Slang IR to be serialized out and then read back in as part of compilation, to try and enforce that only the wanted constructs could pass through that bottleneck. The idea was only ever half implemented, however, because it made use of a secondary implementation path in IR serialization that supported serializing the "raw" source locations (which are heavily dependent on AST-level information, even down to the number of bytes in source files). This change removes the feature entirely, since it is no longer useful for its intended purpose, and its presence causes there to be entire second code path for source locations in IR serialization that would need to have test coverage if we wanted to be sure it kept working. In addition, our pre-existing infrastructure for module serialization had various options that have either stopped being useful, or were not really useful at the time they were introduced. For example: there are no places in the code today where we attempt to serialize out a module without including both the serialized AST and IR. If that was a feature that we ever supported, the relevant code got removed at some preceding point without breaking any of our tests or (seemingly) upsetting users. Similarly, the options being passed into writing of a serialized module included both a flag to control whether source locations should be serialized and a pointer to the `SourceManager` to use in that case... but it was only ever meaningful to set both, or neither. The option has been changed to just be the `SourceManager` pointer, and the name has been updated to reflect its very narrow intended use case. * format code * fixup * regenerate command line reference --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com>
2025-06-10	Allow checking capabilities in specific stages (#7375)	jarcherNV
	This allows checking capabilities in any stage, needed specifically for the hlsl_2018 capability which is defined for sm_5_1 and above. Stage specific capabilities such as cs_5_1 would not find this in any stage other than compute, so we need to restrict the check to only desired stages.
2025-06-09	Mediate access to ContainerDecl members (#7242)	Theresa Foley
	Most of what this change does is straightforward: take all the places in the code that used to operate directly on `ContainerDecl::members` and related fields, and instead have them call into a smaller set of accessor methods defined on `ContainerDecl`. The primary motivation for making this change is that in order to implement on-demand loading of members from serialized AST modules, we need a way to identify and intercept the "demand" for those members. On-demand loading benefits from having all accesses to the members of a `ContainerDecl` be as narrow as possible. If a part of the code only need a member at a specific index, it should say so. If it only needs access to members with a specific name, or a given subclass of `Decl`, then it should say so. A secondary motivation for this change is that there have recently been several changes that added complexity and special cases by introducing code that operated on (and mutated) the member list of a container decl in ways that the existing code had never done before. Any code that mutates the member list of a `ContainerDecl` needs to be sure to not disrupt the invariants that the lookup acceleration structures currently rely on. One of the recent changes added a declaration-to-index map to the set of acceleration structures (with different validation/invalidation behavior than the others...) while other recent changes would remove or insert declarations in ways that could change the indices of other declarations in the same container. It is not clear if any of these pieces of code were aware of the others, and the invariants that might be expected or broken along the way. This change bottlenecks the vast majority of accesses to the members of a `ContainerDecl` through the following operations: * Getting a `List` of all of the direct member declarations of a container * Get the number of direct member declarations, and accessing them by index. * Looking up the list of direct member declarations with a given name. * Adding a new direct member declaration to the end of the list. Some other operations are layered on top of those (e.g., getting a list of all the direct member declarations of a given C++ class). These layered operations are still centralized on the `ContainerDecl`, with the intention that we can change them to be non-layered implementations if we ever need to for performance (e.g., by building a lookup structure for finding member declarations by their type). The exceptional cases of access/mutation on the direct members of a `ContainerDecl` have also been encapsulated, but rather than expose what would risk appearing like general-purpose accessors (e.g., `removeDecl(d)`, `setDecl(index)`, etc.), these operations have been explicitly named after the specific use case that they serve in the codebase today, to discourage others from using them for more kinds of operations we'd rather not support. These operations have also been given parameter signatures that match their use cases, to make it so that even somebody determined to abuse them would have to invent suitable arguments out of thin air. In the case of the declaration-to-index mapping, this change eliminates that acceleration structure, in favor or slightly more complicated (and possibly inefficient, yes) code at the use site. Over time, it would be good to closely scrutinize each of the use cases that requires more complicated interaction with the members of a `ContainerDecl`, to see whether any of them can be reframed in terms of the more basic operations, or if there is some clean abstraction we can introduce to make operations that mutate the member list feel like... hacky.
2025-06-06	Add command line option for separate debug info (#7178)	jarcherNV
	* Add command line option for separate debug info Add command line arg -separate-debug-info which, if provided, produces both a .spv and a .dbg.spv file. The .dbg.spv file contains full debug info and the .spv file has all debug info stripped out. Also add a DebugBuildIdentifier instruction to store a unique hash in both the output files, so they can be more easily matched together. A matching API is provided to allow using the Slang API to retrieve a base and debug SPIRV as well as the debug build identifier string.
2025-05-29	Language version + tuple syntax. (#7230)	Yong He
	* Language version + tuple syntax. * Fix compile error. * regenerate documentation Table of Contents * Fix. * regenerate command line reference * Fix. * Fix. * Fix more test failures. * revert empty line change, * Retrigger CI * #version->#lang * Update source/core/slang-type-text-util.cpp Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> * Remove comments. * Fix parsing logic. * Fix parser. * Fix parser. * update test comment * Update options. * regenerate documentation Table of Contents * regenerate command line reference --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
2025-05-12	Make CUDA version capabilities reach NVRTC (#7074)	Theresa Foley
	Fixes #7049 The root cause of the problem in #7049 is simply that newer NVRTC versions produce a warning when asked to generate code for older CUDA SM versions, and the default that Slang was requesting compilation for was old enough to trigger that warning, and thus trip up the test case (which only looks at the first diagnostic produced by the downstream compiler). Superficially, the fix was easy: change the test case in question (`tests/diagnostics/local-line.slang`) to request `-capability cuda_sm_8_0`, the minimum version supported by current NVRTC. Unfortunately, the simple fix required some other fixes in order to actually work. The capability system includes capability names of the form `cuda_sm__`, but specifying such a capability had no impact on the CUDA SM version passed in when invoking NVRTC. Instead, only the CUDA SM versions requested in the implementation of intrinsics in the core module were affecting the version number passed down. This change adds logic to `slang-compiler.cpp` to take explicitly requested capabilities into account when inferring the CUDA SM version to be passed downstream. A more complete fix would also add similar logic for all the other targets. Unfortunately... yet again... that fix wasn't enough to make things work as expect. Now I had the problem that requesting `-capability cuda_sm_8_0` was actually causing the NVRTC invocation to request CUDA SM version 9.0! The underlying problem there was that the `slang-capabilities.capdef` file has defined certain capability names in a way that implies atomic capabilities much higher than one would expect. E.g., the `cuda_sm_8_0` alias was including HLSL `sm_5_0`, but then `sm_5_0` in turn included `_cuda_sm_9_0`. The fix, for now, is to change the definitions in `slang-capabilities.capdef` to not have the counter-intuitive definitions for `cuda_sm__`. With this set of fixes, the test failure in the original bug report no longer occurs. The work that went into this change suggests several larger-scope fixes that would be good to pursue: * Ideally the capability definitions would have some sort of validation checking to make sure that counter-intuitive results like `cuda_sm_8_0` requesting CUDA SM 9.0 do not occur. * The translation of capabilities over to version numbers for a downstream compiler should be expanded to cover other targets, and not just CUDA. It might be better/simpler to just pass the capabilities themselves to the downstream compiler, since it is possible that a downstream compiler could have more fine-grained enable/disable options than a simple version number. * The entire approach to computing version numbers required for downstream compilation should be cleaned up so that we don't have this duplication between the capabilities that represent those versions and separate syntactic constructs that are used to "request" those versions as part of code generation. * We are very much at the point where we should consider dropping the current behavior where a profile name or capability like `sm_5_0`, that is specific to a single target or a subset of targets, also implies a set of comparable capabilities for other targets.
2025-04-28	Add Slang Byte Code generation and interpreter. (#6896)	Yong He
	* Add Slang Byte Code generation and interpreter. * Fix compile issues. * format code * More compile fix. * Fix clang issue. * Fix more clang issues. * Another clang fix. * Fix clang issues. * Fix another clang issue. * Fix wasm build. * Update building.md * Fix test-server. * Fix compile error. * Fix bug. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-04-22	A new approach to AST serialization (#6854)	Theresa Foley
	* A new approach to AST serialization This change completely overhauls the way that AST nodes are being serialized, and the offline source-code generation steps that enable that serialization. In practice, this ends up being a complete overhaul of the way that modules are being serialized (not just the AST part), although things like the serialization format for the Slang IR and for source locations are not affected. The rest of this commit message is broken down in to sections, in an attempt to help guide anybody looking at the code in how to make sense of all the changes. The Old C++ Extractor --------------------- AST serialization used to be driven by information scraped using the `slang-cpp-extractor` tool, which did an ad hoc parse of the C++ declarations of the AST node types and then generated a set of "X macros" that could be for macro-based code generation within the rest of the compiler. While the existing approach was functional, it wasn't easy to understand or maintain, and it has been getting in the way of forward progress on other features we'd like to work on in the language and compiler. This change removes the `slang-cpp-extractor` tool entirely. Marking Up the AST Declarations ------------------------------- The most notable change that contributors to the compiler may notice is the large number of invocations of a macro `FIDDLE()` on the declarations of the AST node types. The basic idea is that only declarations (namespaces, types, fields) that are preceded by `FIDDLE()` are visible to the code generator tool. So if somebody is working with the AST and wondering why a new node type isn't working, or why a field they added isn't being serialized correctly, it is probably because they need to add `FIDDLE()` in front of it. Generating the Boilerplate Code ------------------------------- The file `slang-ast-boilerplate.cpp` provides a good example of how the information extracted from the marked-up AST declarations gets used. In that file, the `FIDDLE TEMPLATE` construct is used to generate type information for each of the AST node types. Similar logic is used in `slang-ast-forward-declarations.h` to generate the declaration of the `ASTNodeType` enumeration, and forward-declare all the AST node classes. For many parts of the code, simply including that file replaces the need for the old `slang-generated-.h` files. Replacing Visitors and Related Logic ------------------------------------ The old visitor types for the AST used the macros that were generated by `slang-cpp-extractor`, so something new was needed to replace them. The same goes for the `SLANG_AST_NODE_VIRTUAL_CALL` macros. The core of the solution implemented here is in `slang-ast-dispatch.h`. Given a "dispatchable" AST node type (say, `Expr`), a call like: ``` ASTNodeDispatcher<Expr,R>(expr, [&](auto e) { return doSomething(e); }) ``` is an expression of type `R`, which does the equivalent of something like: ``` switch(expr->getTag()) { case ASTNodeType::VarExpr: return doSomething(static_cast<VarExpr>(expr)); // ... } ``` The `SLANG_AST_NODE_VIRTUAL_CALL` macro is now implemented in terms of `ASTNodeDispatcher`. The implementation of the visitor types is more involved. The code in this change retains some of the macro names from the original version, just to try and make the parallels more clear. The visitor types are all implemented on top of the `ASTNodeDispatcher` approach, and use `FIDDLE TEMPLATE` to generate all the boilerplate `visit()` method declarations. Refactoring of `Linkage` Module Loading --------------------------------------- Needing to revisit all the places where modules get deserialized made it clear that there is a lot of complexity and apparent duplication in the core routines on the `Linkage` that get used for loading modules. This change tries to clean up some of that logic, but it is worth noting that there are two legacy features that get in the way of making things as clean as they should be: The `LoadedModuleDictionary` type that gets passed around a lot exists entirely to handle the corner case where somebody uses the Slang API to perform a compilation with multiple `TranslationUnitRequest`s in the same `FrontEndCompileRequest`, and one of the translation units `import`s the module defined by another of the translation units. * There are a lot of special-case behaviors and routines entirely there to support the `ModuleLibrary` feature, although that feature should be considered deprecated (or at least subject to getting entirely re-designed down the line). The basic idea of the cleanup is that all of the (non-deprecated) ways load a module from a serialized binary, or compile one from source should now bottleneck through `loadModuleImpl`, which then bifurcates into `loadSourceModuleImpl` for the compilation case and `loadBinaryModuleImpl` for the deserialization case. High-Level Serialization Approach --------------------------------- The old serialization logic used the [RIFF](https://en.wikipedia.org/wiki/Resource_Interchange_File_Format) format to encode the high-level structure of things, and this change retains that usage (and actually doubles down on the RIFF usage). The old serialization system relied on the idea that for any given type `Foo` that wants to support serialization, there should be something like a `SerialFooData` type in C++, that can represent the state of a `Foo`, and then the actual serialization applied to that `SerialFooData`. This means that in most cases there are four pieces of code written: * During serialization: * Copying the data of a `Foo` in memory over to a `SerialFooData` in memory * Writing the state of a `SerialFooData` into the serialized data stream * During deserialization: * Reading the state of a `SerialFooData` from a serialized data stream * Copying the data of the `SerialFooData` in memory over to a `Foo` The new logic gets rid of the intermediate `SerialFooData`. In the serialization direction, we take a `Foo` and write it to the `RIFFContainer` directly, or using some other utilities layered on top of it. In the deserialization direction, we have additional flexibility. Given a `RIFFContainer::Chunk` that represents a serialized `Foo`, we often navigate through the in-memory representation of the RIFF data to get to the parts of the serialized value that we actually want/need, without needing to deserialize the entire `Foo`. To support this kind of operation, this change introduces a few helper types like `ContainerChunkRef` an `ModuleChunkRef`, that are little more than typed wrappers around a `RIFFContainer::Chunk`. The Module "Container" Part --------------------------- A serialized `Module` is encoded as a RIFF chunk, using logic in `slang-serialize-container.cpp` - both before and after this change. This change reorganizes a lot of the code in that file, to account for the way that eliminating the intermediate `SerialContainerData` type streamlines the overall task of writing out the parts of the module. In the deserialization logic... there isn't really much to do in `slang-serialize-container.cpp`. Most of the logic in `slang.cpp` and `slang-module-library.cpp` that pertains to deserializing modules uses the `ModuleChunkRef`-based approach, and simply extracts the pieces of the serialized module that it needs. The Actual Serialization of the AST ----------------------------------- The actual AST serialization logic is in `slang-serialize-ast.cpp`. The basic approach in both the writing and reading directions is: * Use the `FIDDLE TEMPLATE` system to generate a set of functions, one for each AST node type, that recursively invoke the read/write logic on each field of that node (after recursively invoking the case for its direct superclass) * Use the `ASTNodeDispatcher` system to dispatch out to those functions whene reading or writing anything derived from `NodeBase` * For now, handle all types not derived from `NodeBase` by hand. There's a lot of room for improvement around that last item: it should be just as easy to generate the serialization and deserialization logic for other types that don't inherit from `NodeBase`, but the current change tries to err on the side of making the logic as explicit and simplistic as possible, rather than trying to get too clever too soon. The actual serialization format used for the AST is almost comically simplistic: the code uses hierarchical RIFF chunks to emulate a JSON-like structure. This is a very wasteful representation (e.g., a `bool` or a null pointer each take up 8 bytes), but the goal for now is to start with the simplest thing that could possibly work, and only add more cleverness once we are sure it won't get in the way of important future improvements (like lazy/on-demand deserialization or IR and AST, to improve compiler startup times). The files `slang-serialize.{h,cpp}` have been co-opted to define a new pair of types `Encoder` and `Decoder` that are used for a more-or-less stream-oriented way or reading or writing RIFF chunks for the JSON-like structure. Almost everything related to the actual AST serialization could do with a cleanup pass, and some time spent on picking good/better names for everything. Smaller Stuff ------------- * Cleaned up a lot of code that was using bare `ASTNodeType` or the extractor's `ReflectClassInfo` type to consistently use `SyntaxClass`. * Fixed an apparent bug in how the destination-driven code genarator was handling `TryExpr`s * Fixed an apparent bug in how the GLSL legalization pass was handling translation of certain `SV_` semantics. format code * fixup: template errors caught by non-VS compilers * format code * fixup: more template errors * fixup: more stuff VS didn't catch * fixup: it's amazing VS doesn't catch these... * fixup: yet more template stuff VS ignores * fixup: more VS template nonsense * fixup: unreachable return macro usage * fixup: more unreacable returns * fixup: unused parameter * fixup: strict aliasing * fixup: allow missing entry point list chunk * fixup: wasm build script * fixup: AST changes since this PR was created --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com>
2025-04-22	Add a new SM profile 6.9 (#6879)	Jay Kwak

2025-04-16	Remove support for ad hoc Slang IR compression (#6834)	Theresa Foley
	* Remove support for ad hoc Slang IR compression This change is part of a larger effort to clean up the approach to serialization in the Slang compiler. The overall goal is to simplify and streamline all of the serialization-related logic, so that we are left with code that is less "clever," and easier to understand for contributors to the codebase. Removing support for compression of serialized Slang IR has benefits that include: * Reduction in code complexity: consider things like the subtle way that the `FOURCC`s for compressed chunks were being computed from the uncompressed versions, and the mental overhead that goes into understanding that, for anybody who would dare to touch this code. * Reduction in testing burden: there have been, de facto, two very different code paths for serialization of the Slang IR, and it is not clear that the existing test corpus for Slang has sufficient coverage for both options. By having only a single code path, every test that performs any amount of IR serialization helps with test coverage of that one path. * Opportunity to explore alternatives. This is perhaps a reiteration of the first point, but once the code is stripped down to the simplest thing that could possibly work (I am not claiming it has reached that point yet), it becomes easier for contributors to understand, and it becomes more tractable for somebody to come along with an improved approach that performs better (in either compression ratio or performance) while still being maintainable. In my own local setup, I found that removing support for Slang IR compression led to the `slang-core-module-generated.h` file increasing in size from 46.1MB to 47.4MB. This increase in the `.h` file size for the core library binary only resulted in a release build of `slang.dll` increasing from 20.0MB to 20.2MB. Removing the ad hoc compression support has almost no impact on the size of actual binary Slang modules so long as the additional LZ4 compression step is being applied to them. * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-04-07	Support for Payload Access Qualifiers (#3448) (#6595)	Harsh Aggarwal (NVIDIA)
	* Add support for Ray Payload Access Qualifiers (PAQs) (#3448) - Added [raypayload] attribute for struct declarations - Implemented field validation requiring read/write access qualifiers - Added diagnostic error for missing qualifiers - Enabled PAQs in DXC compiler and HLSL emission - Added new test demonstrating PAQ syntax - Implemented proper handling of ray payload attributes in IR generation * format code * Cleanup: Remove unused vars * Add check to enablePAQ only for profile >= lib_6_7 * Review Fix - Add PAQ support for DX Raytracing add enablePAQ flag to DownstreamCompileOpitons, improve PAQ handling update raypayload-attribute-paq.slang to ensure hlsl and dxil is validated * Add diagnostic test for missing paq for lib_6_7 Compile using `-disable-payload-qualifiers` aka lib_6_6 profile raypayload-attribute-no-struct.slang and raypayload-attribute.slang --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
2025-03-05	Support SPIR-V deferred linking option (#6500)	cheneym2
	The new option "SkipDownstreamLinking" will defer final downstream IR linking to the user application. This option only has an effect if there are modules that were precompiled to the target IR using precompileForTarget(). Until now, the default behavior for SPIR-V was to use deferred linking, and the default behavior for DXIL was to use immediate/internal linking in Slang. This change only affects the SPIR-V behavior such that both deferred and non-deferred linking is supported based on the new option. To support the non-deferred option, Slang will internally call into SPIRV-Tools-link to reconstitute a complete SPIR-V shader program when necessary (due to modules having been precompiled to target IR). Otherwise, if SkipDownstreamLinking is enabled, the shader returned by e.g. getTargetCode() or getEntryPointCode() may have import linkage to the SPIR-V embedded in the constituent modules. Closes #4994 Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-02-27	Map `SV_InstanceID` to `gl_InstanceIndex-gl_BaseInstance` (#6468)	Yong He
	* Map `SV_InstanceID` to `gl_InstanceIndex-gl_BaseInstance` * Fix ci.
2025-02-02	Add support for WGSL subgroup operations (#6213)	Darren Wihandi
	* initial work * more work * more work on glsl intrinsics * add subgroup broadcast for glsl * wip add wgsl extension tracking * enable tests, enable extensions and added some todos * format and warning fixes * fix wgsl extension tracker --------- Co-authored-by: Yong He <yonghe@outlook.com>
2025-01-30	Support cooperative vector (#6223)	Jay Kwak
	* Support cooperative vector without Vulkan-header update Adding a Slang support for cooperative vector. But this commit doesn't have Vulkan-header update.
2025-01-17	Make -depfile work for binary modules output too (#6126)	Julius Ikkala

2024-11-13	Make `getDependencyFilePath` return most unique identity. (#5553)	Yong He

2024-11-05	Move switch statement bodies to their own lines (#5493)	Ellie Hermaszewska
	* Move switch statement bodies to their own lines * format --------- Co-authored-by: Yong He <yonghe@outlook.com>
2024-10-29	format	Ellie Hermaszewska
	* format * Minor test fixes * enable checking cpp format in ci
2024-10-28	Replace the word stdlib or standard-library with core-module for source code ↵	Jay Kwak
	(#5415) This commit changes the word "stdlib" or "standard library" to "core module" in the source code.
2024-10-15	Move C interface from slang.h to slang-deprecated.h (#5301)	Ellie Hermaszewska
	* Squash redundant move warnings * Move C interface from slang.h to slang-deprecated.h spGetBuildTagString remains, because it's useful to have before the global session exists. This C API is used quite pervasively in the C++ helpers (for example slang::UserAttribute. It's not trivial to move these to slang-deprecated.h as they're entangled with some enums which are themselves used elsewhere in the compiler. The fact that these helpers use the C API can be viewed as an implementation detail for now, and this usage moved to slang-deprecated in due course. Closes https://github.com/shader-slang/slang/issues/4758 * Squash warnings for our usage of our deprecated API --------- Co-authored-by: Yong He <yonghe@outlook.com>
2024-10-07	Add WGSL support for slang-test (#5174)	Anders Leino
	* Use the assembly description as target when disassembling I believe this is a bugfix. It seems to have worked before because up until the WGSL case, the disassembler has been the same executable as the one producing the binary to be disassembled. * Add Tint as a downstream compiler This closes issue #5104. * Add downstream compiler for Tint. * Tint is wrapped in a shared library, 'slang-tint' available from [1]. * The header file for slang-tint.dll is added in external/slang-tint-headers. * Add some boilerplate for WGSL targets. * Add an entry point test for WGSL. [1] https://github.com/shader-slang/dawn/releases/tag/slang-tint-0 * Add WGSL_SPIRV as supported target for Glslang * Add WebGPU support to slang-test This helps to address issue #5051. * Disable lots of crashing compute tests for 'wgpu' This closes issue #5051. --------- Co-authored-by: Yong He <yonghe@outlook.com>
2024-09-23	Implemented Combined-texture for WGSL (#5130)	Jay Kwak
	* Implemented Combined-texture for WGSL * Remove unnecessary comment * Limit to std430 layout * Fix compiler warning for unused variable --------- Co-authored-by: Yong He <yonghe@outlook.com>
2024-09-18	Report AD checkpoint contexts (#5058)	venkataram-nv
	* Transferring source locations when creating phi instructions * Tracking for simple variables * Deriving source locations for loop counters * Printing checkpoint structure breakdown * More readable output format * Special behavior for loop counters * Writing report to file * Add slangc option to enable checkpoint reports * Display types of checkpointed fields * Message in case there are no checkpointing contexts * Catch source locations for function calls * Source cleanup * Fix compilation warnings * Remove stray dump() * Provide the report through diagnostic notes * Add missing path for sourceLoc during unzip pass * Add tests for reporting intermediates * Include more transfer cases for source locations * Fix ordering in address elimination * Fill in more holes with source location transfer * Remove debugging line * Reverting changes to diagnostic sink * Simplify address elimination using source location RAII contexts * Eliminating manual source loc transfers in forward transcription * Fix local var adaptation to use RAII location setter * Simplify primal hoisting logic for source location transfer * Simplify unzipping with RAII location scopes * Simplify transpose logic * Cleaning up for rev.cpp * Reverting spacing changes * Fix mistake with source loc RAII instantiation * Fix formatting issues
2024-09-09	Initial WGSL support (#5006)	Anders Leino
	* Add WGSL as a target This is required for #4807. * C-like emitter: Allow the function header emission to be overloaded WGSL-style function headers are pretty different from normal C-style headers: Normal C-style headers: ReturnType Func(...) void VoidFunc(...) WGSL-style headers: fn Func(...) -> ReturnType fn VoidFunc(...) This change allows the header style to be overloaded, in order to accomodate WGSL-style headers as required to resolve issue #4807, but retains normal C-style headers as the default implementation. [1] https://www.w3.org/TR/WGSL/#function-declaration-sec * C-like emitter: Allow emission of switch case selectors to be overloaded The C-like emitter will emit code like this: switch(a.x) { case 0: case 1: { ... } break; ... } This is not allowed in WGSL. Instead, selectors for cases that share a body must [1] be separated by commas, like this: switch(a.x) { case 0, 1: { ... } break; ... } To prepare for addressing issue #4807, this patch makes the emission of switch case selectors overloadable. [1] https://www.w3.org/TR/WGSL/#syntax-case_selectors * C-like emitter: Support WGSL-style declarations This patch helps to address issue 4807. C-like languages declare variables like this: i32 a; WGSL declares variables like this: var a : i32 The patch introduces overloads so that the forthcoming WGSL emitter can output WGSL-style declarations, which helps to resolve #4807. * C-like emitter: Support overloading of declarators Unlike C-like languages, WGSL does not support the following types at the syntax level, via declarators: - arrays - pointers - references For this reason, this patch introduces support for overloading the declarator emitter, in order to help address issue #4807. C-like languages: int a[3]; // Array-ness of type is mixed into the "declarator" WGSL: var a : array<int, 3>; // Array-ness of type is part of the... type_specifier! * C-like emitter: Allow struct declaration separator to be overridden C-like languages use ';' as a separator, and languages like e.g. WGSL use ','. This change prepares for addressing issue #4807. * C-like emitter: Allow overriding of whether pointer-like syntax is necessary Things like e.g. structured buffers map to "ptr-to-array" in WGSL, but ptr-typed expressions don't always need C-style pointer-like syntax. Therefore, make it overrideable whether or not such syntax is emitted in various cases in order to address #4807. * C-like emitter: Emit parenthesis to avoid warning about & and + precedence This helps with #4807 because WGSL compilers (e.g. Tint) treat absence of parenthesis as an error. * C-like emitter: Add hook for emitting struct field attributes WGSL requires @align attributes to specify explicit field alignment in certain cases. Thus, this patch prepares for addressing #4807. * C-like emitter: Add hook for emitting global param types Declarations of structured buffers map to global array declarations in WGSL. However, in all other cases such as when structured buffers are used in operands, their types map to ptr-to-array. This patch makes it possible for the WGSL back-end to say that structured buffers generally map to "ptr-to-array" types, but still have a special case of just "array" when declaring the global shader parameter. Thus, this patch helps with addressing #4807. * IR lowering: Use std140 for WGSL uniform buffers This patch just cuts out some logic that prevented std140 to be chosen for WGSL uniform buffers. Note that WGSL buffers in the uniform address space is not quite std140, but for now it's close enough to avoid compile issues. Later on, a custom layout should be created for WGSL uniform buffers. When that's done, this change will be revisited, but for now it helps to resolve #4807. * Don't emit line directives in WGSL by default WGSL does not support line directives [1]. The plan currently seems to be to instead support source-map [2]. This is part of addressing issue #4807. [1] https://github.com/gpuweb/gpuweb/issues/606 [2] https://github.com/mozilla/source-map * WGSL IR legalization: Map SV's The implementation closely follows the cooresponding one for Metal. Supported: - DispatchThreadID - GroupID - GroupThreadID - GroupThreadID Unsupported: - GSInstanceID This is not complete, but it helps to address #4807. * WGSL emitter: Add support for basic language constructs A lot of the basics are added in order to generate correct WGSL code for basic Slang language constructs. This addresses issue #4807. This adds support for at least the following: - statments - if statements - ternary operator - while statement - for statements - variable declarations - switch statements - Note: Slang may emit non-constant case expressions, see issue 4834 - literals - integer literals - u?int[16\|32\|64]_t - float and half literals - bool literals - vector literals and splatting (e.g 1.xxx) - function definitions - assignments - +=, =, /= - array assignments - vector assignments/updates - swizzles of other vectors - from matrix rows ('m[i]' notation) - from matrix cols (using swizzle notation, e.g 'm._11_12_13') - matrix assignments/updates - to rows ('m[i]' notation) - to cols (using swizzle notation, e.g 'm._11_12_13') - declarations - arrays [1] https://www.w3.org/TR/WGSL/#syntax-switch_body Add some WGSL capabilities This patch registers some WGSL capabilities required to pass many of the initial compute shader compile tests. Many capabilities still remain to be added -- this is just an initial set to help resolve issue #4807. - asint - min and max - cos and sin - all and any * WGSL and C-like emitters: Add hack to bitcast case expression In WGSL, the switch condition and case types must match. https://www.w3.org/TR/WGSL/#switch-statement Slang currently allows these types to mismatch, as pointed out in #4921. Issue #4921 should eventually be addressed in the front-end by a patch like [1]. However, at the moment that would break Falcor tests. Thus, this patch temporarily works around the issue in the WGSL emitter only in order to help resolve #4807. In the future, the Falcor tests should be fixed, this patch should be dropped and [1] should be merged instead. [1] a32156ef52f43b8503b2c77f2f1d51220ab9bdea
2024-09-05	Initial -embed-spirv support (#4974)	cheneym2
	* Initial -embed-spirv support Add support for SPIR-V precompilation using the framework established for DXIL. Work on #4883 * SLANG_UNUSED * Add linkage attributes to exported spirv functions * Combine DXIL and SPIRV paths * Whitespace fix * Merge remaining precompiled spirv/dxil paths * Change inst accessors to return codegentarget * Add unit test for precompiled spirv --------- Co-authored-by: Yong He <yonghe@outlook.com>
2024-09-05	Support entrypoints defined in a namespace. (#5011)	Yong He
	* Support entrypoints defined in a namespace. * Fix test.
2024-08-29	Support mixture of precompiled and non-precompiled modules (#4860)	cheneym2
	* Support mixture of precompiled and non-precompiled modules This changes the implementation of precompile DXIL modules to accept combinations of modules with precompiled DXIL, ones without, and ones with a mixture of precompiled DXIL and Slang IR. During precompilation, module IR is analyzed to find public functions which appear to be capable of being compiled as HLSL, and those functions are given a HLSLExport decoration, ensuring they are emitted as HLSL and preserved in the precompiled DXIL blob. The IR for those functions is then tagged with a new decoration AvailableInDXIL, which marks that their implementation is present in the embedded DXIL blob. The DXIL blob is attached to the IR as before, inside a EmbeddedDXIL BlobLit instruction. The logic that determines whether or not functions should be precompiled to DXIL is a placeholder at this point, returning true always. A subsequent change will add selection criteria. During module linking, the full module IR is available, as well as the optional EmbeddedDXIL blob. The IR for functions implemented by the blob are tagged with AvailableInDXIL in the module IR. After linking the IR for all modules to program level IR, the IR for the functions marked AvailableInDXIL are deleted from the linked IR, prior to emitting HLSL and compiling linking the result. This change also changes the point of time when the module IR is checked for EmbeddedDXIL blobs. Instead of happening at load time as before, it happens during immediately before final linking, meaning that the blob does not need to be independently stored with the module separate from the IR as was done previously. Work on #4792 * Clean up debug prints * Call isSimpleHLSLDataType stub * Address feedback on precompiled dxil support Allow for IR filtering both before and after linking. Only mark AvailableInDXIL those functions which pass both filtering stages. Functions are corrlated using mangled function names. Rather than delete functions entirely when linking with libraries that include precompiled DXIL, instead convert the IR function definitions to declarations by gutting them, removing child blocks. * Use artifact metadata and name list instead of linkedir hack * Use String instead of UnownedStringSlice * Update tests * Renaming * Minor edits * Don't fully remove functions post-link * Unexport before collecting metadata
2024-08-28	Allow capabilities to be used with `[shader("...")]` (#4928)	ArielG-NV
	* Allow capabilities to be used with `[shader("...")]` Fixes: #4917 Changes: 1. Allow using capabilities instead of `Stage`s with `EntryPointAttribute`. 2. When resolving capabilities for an entrypoint+profile (per entrypoint) in `resolveStageOfProfileWithEntryPoint` add our `EntryPointAttribute` and resolved capability 3. Added tests and some capabilities related clean-up * fix a warning made by a mistake in syntax * change fineStageByName to assume it is passed a stage without a '_' * test with and without prefix '_' * cleanup some profiles and reprisentation to work better with 'Stage' and 'Profile' This use case is why we need to clean all profile-usage into `CapabilityName`s directly. * change how we compare * only change profiles * let all capabilities be resolved by 'shader' profile for now * fix warning checks I accidently broke * meshshading_internal to _meshshading --------- Co-authored-by: Yong He <yonghe@outlook.com>
2024-08-13	GitHub action benchmark (#4804)	venkataram-nv
	Adds a new Github CI action for benchmarking the slangc compiler on the MDL shaders. For now, the results are only dumped to the output of the CI, which can be later viewed through raw logs. The next step is to use github-action-benchmark to push these results into a page which will show the benchmark results over time as commits are pushed.
2024-08-05	Initial support for precompiled DXIL in slang-modules (#4755)	cheneym2
	* Add embedded precompiled binary IR ops Add IR operations to embed precompiled DXIL or SPIR-V blobs into IR. Adds a BlobLit literal that is mostly identical to StringLit except for its inability to be displayed, e.g. in dumped IR. In the future, the blob might be dumped as hexadecimal, but for now it is summarized as "<binary blob>". * EmbeddedDXIL and SPIR-V options The options, '-embed-dxil' and '-embed-spirv' in slangc, will cause a target dxil or spirv to be compiled and stored in the translation unit IR when written to a slang-module. Subsequent changes actually implement the options. * Per-translation unit DXIL precompilation When -embed-dxil is specified, perform a precompilation to DXIL of each TU, linked only with stdlib. Embed the resulting DXIL for the TU in a IR op. Being part of IR, the precompiled DXIL can be serialized to disk in a slang-module. Upon loading slang-modules, the new IR op will be searched for and the precompiled DXIL blob is saved with the loaded Module. During linking, if all the Modules have precompiled blobs they will be sent to the downstream compile commands as libraries instead of source, skipping the downstream compilation, using DXC only for linking. Fixes Issue #4580 * Remove placeholder embedded SPIRV support Code was added only to sketch out how other precompiled bins will be supported. * Remove the rest of the SPIRV placeholder support * Fix warnings, test error on non-windows * Remove lib_6_6 hack, add dxil_lib capability * Allocate blob value from irmodule memarena * Add null check after memarena allocation * Restore the request->e2erequest code path for generatewholeprogram * Update capability handling, move EmbedDXIL enum to end to preserve abi * Remove lib_6_6 hack * Move ICompileRequest functions to end
2024-07-18	Adjust how `slang` and `slangc` uses a `profile` to manage the stage of an ↵	ArielG-NV
	entry-point (#4670) * Fixes #4656 Changes: 1. Setting a profile via slangc no-longer sets an entry-point target-stage, this is to allow slangc to follow how the SLANG-API works (else `main` is assumed to be the default entry-point) 2. If the stage specified by a profile is not equal to the stage specified by a entry-point, we throw a capability error. 3. Resolving the stage of an entry point was changed to function (mostly) equally for when 0 entry-points are specified versus to when there are 1 or more. 4. changed capabilitySet Iterator so it is invalid if backing data is nullptr (although this should never happen, it would stop crashes in the worst case). * remove the breaking change since it likely is going to be a lot more than just a simple change due to the implicit `main` and stage through `profile` code. * print out profile name with errors * use target's profile for printing * change logic to print warning in a different method (account for more cases) * set unknown stages
2024-06-27	Add API for querying dependency files on IModule (#4493)	skallweitNV
	* Add API for querying dependency files on IModule * return nullptr
2024-06-12	Delete glsl_vulkan and glsl_vulkan_one_desc targets. (#4361)	Yong He

2024-06-12	Capability System: Implicit capability upgrade warning/error (#4241)	ArielG-NV
	* capability upgrade warning/error adjusted implementation + tests to support a warning/error if capabilities are implicitly upgraded and test accordingly. * add glsl profile caps * add GLSL and HLSL capabilities to the associated capability * syntax error in capdef * only error if user explicitly enables capabilities 1. changed testing infrastructure to not set a `profile` explicitly, 2. Added tests to be sure this works as intended with user API and with slangc command line * Change capability atom definitions and how Slang manages them to fix errors 1. most `glsl_spirv` version atoms have been removed from `.capdef`, instead we will translate `spirv` version atoms into `glsl_spirv` since there is no point in writing the same code twice in `.capdef` files to define `spirv` versions. 2. add spirv version, and hlsl sm version (and equivlent) capability dependencies 3. removed some stage requirments which were set on objects, keep the wrapper capabilities. I am keeping the wrapper capabilities since I am unaware on if there are stage limitations (spec says code in practice does not work). * check internal version instead of version profile (_spirv_1_5 vs. spirv_1_5) * remove unused OpCapability. adjust SPIRV version'ing again for glsl_spirv * apply workaround for glslang bug with rayquery usage * ensure capabilities targetted by a profile and added together by a user are valid * remove additions to `spirv_1_` wrapper spirv_* -> glsl_spirv fix * fix bug where incompatable profiles would cause invalid target caps * try to avoid joining invalid capabilities * fix the warning/error & printing * run through tests to fix capability system and test mistakes many mistakes were mesh shaders doing `-profile glsl_450+spirv_1_4`. This is not allowed for a few reasons 1. the test tooling does not handle arguments the same as `slangc` 2. glsl_450 core profile does not support mesh shaders, nor does spirv_1_4. sm_6_5 does work in this senario * set some sm_4_1 intrinsics to sm_4_0 * replace `GLSL_` defs with `glsl_` * swap the unsupported render-test syntax for working syntax * set d3d11/d3d12 profile defaults this is required since sm version changes compiled code & behavior * adjusted nvapi capabilities with atomics + d3d11 set to use sm_5_0 as per default * cleanup * address review * incorrect styling * change `bitscanForward` to work as intended on 32 bit targets --------- Co-authored-by: Yong He <yonghe@outlook.com>
2024-06-11	SPIRV backend: add support for tessellation stages, (#4336)	Yong He

2024-06-08	SPIRV `Block` decoration fixes. (#4303)	Yong He
	* SPIRV `Block` decoration fixes. - SPIRV does not allow duplicate `Block` decorations. So we shouldn't be generating them. - Also fixes duplication of OpName. - SPIRV and HLSL do not allow ConstantBuffer with trailing unsized arrays. Added a check in the front-end against such code. * Convert failing cross-compile tests to filecheck. --------- Co-authored-by: Jay Kwak <82421531+jkwak-work@users.noreply.github.com>
2024-06-05	Avoid duplicating entry points in library (#4279)	kaizhangNV

2024-05-22	Fix all Clang-14 warnings (#4203)	ArielG-NV
	* fix all Clang-14 warnings * remove a clang-14 warning fix because it is a MSVC warning...
2024-05-16	Capabilities System, CapabilitySet Logic Overhaul (#4145)	ArielG-NV
	* Capabilities System, Backing Logic Overhaul Fixes #4015 Problems to address: 1. Currently the capabilities system spends anywhere from 25-50% of compile time on the CapabilityVisitor. Most of this time is spent on join logic: 1. Finding abstract atoms 2. Comparing list1<->list2. This should and can be made significantly faster. 2. Error system does not produce errors with auxiliary information. This will require a partial redesign to provide more useful semantic information for debugging. What was addressed: 1. Array backed `CapabilityConjunctionSet` was replaced in-favor for a `UIntSet` backed `CapabilityTargetSets`. The design is described below. Design: * `CapabilityTargetSets` is a `Dictionary<targetAtom, CapabilityTargetSet>`. This is not an array for 2 reasons: 1. Easy to figure out which target is missing between two `CapabilityTargetSets` 2. To statically allocate an array requires the preprocessor to manually annotate which Capability is a target and link that Capability to an index. This means a dictionary is required for lookup regardless of implementation. * `CapabilityTargetSet` is an intermediate representation of all capabilities for a singular `target` atom (`glsl`, `hlsl`, `metal`, ...). This structure contains a dictionary to all stage specific capability sets for fast lookup of stage capabilities supported by a `CapabilitySet` for a `target` atom. This reduces number of sets searched. * `CapabilityStageSet` is an intermediate representation of all capabilities for a singular `stage` atom (`vertex`, `fragment`, ...). This structure holds all disjoint capability sets for a `stage`. A disjoint set is rare, but may exist in some scenarios (as an example): `{glsl, EXT_GL_FOO}{glsl, _GLSL_130, _GLSL_150}`. This reduces the number of sets searched. * `UIntSet` is the main reason for the redesign for better performance and memory usage. All set operations only require a few operations, making all set logic trivial and with minimal cost to run. All algorithms were modified to focus around `UIntSet` operations. 2. Errors * Semantic information are now better linked to the calling function to provide a connection of function<->function_body for when saving semantic information for errors. * Missing targets now print errors much like other error code by finding code which could be a cause of incompatibility. What is missing: 1. Add non naive support for non-stage specific capabilities such as `{hlsl, _sm_5_0}`. Currently non stage specific targets emulate the behavior through assigning such capabilities to every stage: `{hlsl, _sm_5_0, vertex} {hlsl, _sm_5_0, fragment}...`. Removal of this behavior would remove redundant shader stage sets being made at construction time (~80% of new implementation runtime). This is an addition, not an overhaul. 2. Optionally: `UIntSet` should be modified to support SIMD operations for significantly faster operations. This is not required immediately since `UIntSet` is already not a performance constraint. Notes: * UIntSet had implementation bugs which were fixed in this PR. * The old capabilities system had bugs which were fixed in this PR when transforming to the new implementation. * fix .natvis debug view * Small optimizations I found while working on the addition the AST building pass looks like so now: 1% = ~capabilitySet 2% = capabilitySet() 1.5% capabilitySet::unionWith() 0.8% capabilitySet::join() 1.5% auxillary info for debugging ~0.5-1% extra visitor overhead ~5% total for the visitor ~6.5% for total runtime costs * fix caps which were wrong but worked * push minor syntax fix (still looking for why other tests fail) * perf & bug fixes 1. did not properly remake isBetterForTarget for this->empty case with that as Invalid. This is best case in this senario. 2. Remade seralizer for stdlib generation. Faster (more direct) & cleaner code. NOTE: did not address review comments * fix glsl.meta caps error * fixing findBest logic again & UIntSet wrapper findBest was not checking for 'more specialized' targets & was element counter was flawed * faster getElements algorithm + natvis for UIntSet + wrong warning * type incompatability of bitscanForward implementations * try to fix warnings again * remove ptr for clang intrinsic * add missing header * ifdef to allow clang compile * compiler hackery to fix up platform/type independent operations * bracket * fix MSVC error * missing template * change types out again * changes to fix compiling * adjustment to parameter for Clang/GCC * added iterator to delay processing all atomSets of a CapabilitySet * add a few missing consts's * ensure we never have more than 1 disjointSet Added a wrapper + assert + union functionality to all possible disjoint sets. This was done in favor of a removal of the LinkedList for 2 reasons: 1. We still need 0-1 set functionality. 2. Might as well keep the code, just disallow the problematic functionality. * address review comments non linked-list refactor review comments addressed; add doc comments + remove redundant code * comments + remove isValid for bool operator * push removal of linkedlist for capabilities * add missing break * address review comments minor adjustments of syntax * push a fix to the `CapabilitySet({shader, missing target})` code * quality + error 1. add iterator to UIntSet 2. do not specialize target_switch if profile is derived from case (GLSL_150 is not compatable with GLSL_400) * fix target_switch erroring + temporarily remove UIntSet::Interator temporarily remove UIntSet::Interator. It will be added after, testing code on CI first so I can multi-task fixing the UIntSet Iterator * fix the UIntSet iterator * Revert "fix the UIntSet iterator" temporarily to pull from master * add metal error as per texture.slang (took a while I realize this was why things were breaking, likely should adjust errors to reflect this) * Rework UIntSet to have a template for output type This is done so it is reasonable to debug the iterator output and not just dealing with messy int's Fix problems with the iterators implemented + invalid capabilities handling * removed incorrect `__target_switch` capability barycentric was being used with anticipation of `profile glsl450`, this does not expand into `GL_EXT_fragment_shader_barycentric`, this instead caused an error which is hidden during cross-compile. * remove some uses of getElements * remove undeclared_stage for now * remove redundant code associated with `undeclared_stage` * remove unused variable * address review specifically to note removed static in a thread dangerous scope. Now using a `const static` for read only (thread safe) which precompile steps generate * move GLSL_150 capdef change to sm_4_1 (more accurate) * address most review comments did not address: https://github.com/shader-slang/slang/pull/4145#discussion_r1602256776 * revert incorrect code review suggestion * push changes for all code review suggestions
2024-05-03	Add host shared library target. (#4098)	Yong He
	* Add host shared library target. * Attempt fix. * Fix warnings. * try fix. * Fix test. * Fix.
2024-04-30	Avoid classifying methods with `[numthreads]` as entry points for ↵	Sai Praveen Bangaru
	CUDA-related targets (#4063)
2024-04-19	Add metal downstream compiler + metallib target. (#3990)	Yong He
	* Add metal downstream compiler + metallib target. * Add more comments. * Add missing override.
2024-04-17	Add skeleton for metal backend. (#3971)	Yong He

2024-04-11	Fix the issues when compiling slang to library (#3936)	kaizhangNV

2024-02-23	Add slangc interface to compile and use ir modules. (#3615)	Yong He
	* Add slangc interface to compile and use ir modules. * Fix glsl scalar layout settings not copied to target. * Fix. * Cleanups.
2024-02-20	Refactor compiler option representations. (#3598)	Yong He
	* Refactor compiler option representation. * Fix binary compatibility. * Add a test for specifying compiler options at link time. * Fix binary compatibility. * Fix binary compatibility. * Fix backward compatibility on matrix layout. * Fix. * Fix. * Fix. * Fix gfx. * Fix gfx. * Fix dynamic dispatch. * Polish.
2024-02-15	Support loading serialized modules. (#3588)	Yong He
	* Support loading serialized modules. * Fix. * Fix vs solution files * Fix glsl module loading. * C++ fix. * Fix. * Try fix c++ error. * Try fix. * Fix. * Fix.
2024-02-02	Capability type checking. (#3530)	Yong He
	* Capability type checking. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>