slang.git - Making it easier to work with shaders

	Commit message (Collapse)	Author	Age
*	meow	yum	2025-10-31
\|
*	non-exported public labels get namespaced now	yum	2025-10-28
\|
*	Clean up Slang IR representation of undefined values (#8708)	Theresa Foley	2025-10-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prior to this change, the Slang IR used a single opcode (`kIROp_Undefined`) to encode all cases of undefined values. The particular motivation for this change was a need to distinguish those undefined values that represent a load from an uninitialized memory location versus other sorts of undefined values. If transforming a variable into SSA form results in `undefined` values in cases where the a `load` was executed without a prior `store`, that represents an error on the programmer's part, and should be diagnosed. However, other cases of undefined values can arise during program transformation and optimization, and should not typically result in diagnostics being emitted. While it was not the original motivation for this change, it is also worth noting that the LLVM project has transitioned from initially using only a single `undef` instruction to having a more nuanced model, and the same factors that motivated their shift also apply to the Slang IR. Counter-intuitively, the semantics of undefined values actually need to be carefully defined. Concretely, this change splits the pre-existing `undefined` opcode into two sub-cases: - `kIROp_LoadFromUninitializedMemory`, to represent the case of loading from a memory location (such as a local variable) that has not been initialized. - `kIROp_Poison`, corresponding to the LLVM `poison` value. Our poison instruction is intended to have semantics comparable to LLVM's equivalent. Conceptually, any operation that is invoked with a poison value as input will (with a few exceptions) produce a poison value as output. One can think of the behavior of `poison` as similar to how not-a-number values propagate in floating-point computations: by default they "infect" the result of any computation they are involved in. This semantic choice helps to ensure that many optimizations end up being correct in the presence of undefined values, even if they did not specifically account for them. The `kIROp_LoadFromUninitializedMemory` case is comparable to the combination of `freeze` and `undef` in LLVM. An LLVM `undef` value has semantics that allow each use of that value to be replaced with a different arbitrary value; these semantics cause many optimizations to only be correct in the absence of undefined values. An LLVM `freeze` instruction can take an undefined value as input, and produces a single value that is still arbitrary, but must be consistent across all uses. The latter semantics are what we want, since a given `load` from an uninitialized memory location will yield an arbitrary-but-fixed value. Note that we intentionally do not have a direct analogue to LLVM's `undef` instruction, because of the way that `undef` causes so many complications when trying to write optimizations. We also do not add a `kIROp_Freeze` instruction in this change, but that is simply because we currently have no need for it. Existing code that was creating `IRUndefined` values has been updated to create either `IRPoison` or `IRLoadFromUninitializedMemory` values, as appropriate to the use case. Code that was checking for the `kIROp_Undefined` opcode has been updated to either check for both of the new opcodes (in the case of `switch` statements), or to use `as<IRUndefined>` to perform a dynamic cast to the common base type of the two new instructions. Note that this change does not alter the way that instructions representing undefined values are typically emitted as ordinary instructions in the block that produces an undefined value. While emitting `IRLoadFromUninitializedMemory` as an ordinary instruction is exactly what we want, the `IRPoison` case would actually be better represented in Slang IR as a "hoistable" instruction, so that there would only be a singular `poison` value of each type. Changing `IRPoison` to be hoistable would be a good follow-up change, but might run into more challenges depending on what assumptions (if any) the codebase is making about where undefined values get emitted. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Update debug var when in-param proxy var is being updated. (#8671)	Yong He	2025-10-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Closes #8664. The problem is that when there is an `in` parameter, Slang will create a local variable to proxy the parameter, copy the value of the parameter into the proxy variable, and replace all uses of the parameter in the function body to use the proxy variable instead. This way all writes to the parameter become writes to the proxy variable. However, when there is debug info enabled, we are also going to create a "debugVariable" corresponding to the parameter, but this debugVariable isn't updated when the proxy variable is updated. The fix is to map the proxy var instead of the original param to the debug var during the `insertDebugValueStore` pass, so that any changes to the proxy var will result in additional stores being inserted to the debug var. Allowing function body to modify an `in` parameter is a bad legacy behavior we inherited from HLSL that we should really be moving away from. I would like us to completely treat an `in` parameter as immutable by default in the next language version (Slang 2026), and make it an error if the user tries to do so. This will allow us to generate much cleaner code and in many cases would help with performance.
*	`ExprLoweringVisitorBase::getDefaultVal(Type*)` use ↵	Ronan	2025-10-08
\| \| \| \| \| \| \|	`MakeVector/MatrixFromScalar` (#8512) - Allows using `Vector/Matrix` type with yet unresolved dimensions - Simpler implementation and in-line with default `Array` - Added `test/bugs/gh-8512.slang`
*	Use symbol alias instead of wrapper synthesis to implement link-time types. ↵	Yong He	2025-10-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(#8603) This change achieves link-time type resolution with a different mechanism. For `extern struct Foo : IFoo = FooImpl;`, instead of synthesizing a wrapper type `Foo` that has a `FooImpl inner` field and dispatches all interface method calls to `inner.method()`, this PR completely removes this synthesis step, and instead just lower such `extern`/`export` types as `IRSymbolAlias` instructions that is just a reference to the type being wrapped. Then we extend the linker logic to clone the referenced symbol instead of the SymbolAlias insts itself during linking. By doing so, we greatly simply the logic need to support link-time types, and achieves higher robustness by not having to deal with many AST synthesis scenarios. Closes #8554. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Rename some symbols related to pointers types (#8592)	Theresa Foley	2025-10-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Note that while this change touched a large numer of files, there are no changes to functionality being made here. The only things being done are renaming various symbols and, in a few cases, updating or adding comments for consistency with the new names. The core of the naming changes are: * Most things named to refer to `OutType` (e.g., `IROutType`, `IRBuilder::getOutType()`, etc.) have been consistently renamed to refer to `OutParamType`, to emphasize that the relevant AST/IR node types are only intended for use to represent `out` parameters. * The same change as described above for `OutType` is also made for `RefType`, which becomes `RefParamType` in most cases. One mess that this exposes is the way that the `ExplicitRef<T>` type in the core module currently lowers to `IRRefParamType`. This change sticks to the rule of not making functional changes, so that mess is left as-is for now. * Names referring to `InOutType` have been changed to instead refer to `BorrowInOutType`. The intention with this naming change is to emphasize that the Slang rules for `inout` are semantically those of a borrow (or at least our interpretation of what a borrow means). * Names referring to `ConstRefType` have been changed to instead refer to `BorrowInType`. This change starts work on clarifying that the existing `__constref` modifier was never intended to be a read-only analogue of `__ref`, and instead is the input-only analogue of `inout`. * The `ParameterDirection` enum type has been changed to `ParamPassingMode`, to reflect the fact that the concept of "direction" fails to capture what is actually being encoded, particularly once we have modes beyond simple `in`/`out`/`inout`. While this change does not alter behavior in any case (the user-exposed Slang language is unchanged), it is intended to set up subsequence changes that will work to make the handling of these types in the compiler more nuanced and correct. Breaking this part of the change out separately is primarily motivated by a desire to minimize the effort for reviewers. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Rewriting the lower-buffer-element-type pass to avoid unnecessary ↵	Yong He	2025-09-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	packing/unpacking. (#8526) Part of the effort to improve the performance of generated SPIRV code. The existing lower-buffer-element-type pass works by loading the entire buffer element content from memory, and translate it to logical type stored in a local variable at the earliest reference of a buffer handle. This means that is can generate inefficient code that reads more than necessary. Consider this example: ``` struct BigStruct { bool values[1024]; } ConstantBuffer<BigStruct> cb; void test(BigStruct v) { if (v.values[0]) { printf("ok"); } } [numthreads(1,1,1)] void computeMain() { test(cb); } ``` In IR, the `computeMain` function before lower-buffer-element-type pass is something like following: ``` func test: %v = param : BigStruct %barr = fieldExtract(%v, "values") %element = elementExtract(%barr, 0) ... // uses %element func computeMain: %v = load(cb) call %test %v ``` The existing lower-buffer-element-type pass will rewrite the bool array in `BigStruct` into `int` array so it is legal in SPIRV. However, it does so by inserting the translation on the first `load` of the constant buffer: ``` struct BigStruct_std430 { int values[1024]; } var cb : ConstantBuffer<BigStruct_std430>; func computeMain: %tmpVar : var<BigStruct> call %unpackStorage(%tmpVar, cb) %v : BigStruct = load %tmpVar call %test %v ``` This means that the entire array will be loaded and translated to int, before calling `test`, which only uses one element. It turns out that the downstream compiler isn't always able to optimize out this inefficient translation/copy. This PR completely rewrites the way buffer-element-type lowering is handled to avoid producing this inefficient code. It works in two parts: first we turn on the `transformParamsToConstRef` pass for SPIRV target as well, so we will translate the `test` function to take the `v` parameter as `constref`. The second part is a redesigned buffer-element-type pass that defers the storage-type to logical-type translation until a value is actually used by a `load` instruction. In this example, after `transformParamsToConstRef`, the IR is: ``` func test: %v = param : ConstRef<BigStruct> %barr = fieldAddr(%v, "values") %elementPtr = elementAddr(%barr, 0) %element = load(%elementPtr) ... // uses %element func computeMain: call %test %cb ``` The new `buffer-element-type-lowering` pass will take this IR, and insert translation at latest possible time across the entire call graph, and translate the IR into: ``` func test: %v = param : ConstRef<BigStruct_std430> %barr = fieldAddr(%v, "values") %elementPtr : ptr<int> = elementAddr(%barr, 0) %element_int = load(%elementPtr) %element = cast(%element_int) : %bool ... // uses %element func computeMain: call %test %cb ``` In this new IR, there is no longer a load and conversion of the entire array. See new comment in `slang-ir-lower-buffer-element-type.cpp` for more details of how the pass works. This PR also address many other issues surfaced by turning on `transformParamsToConstRef` pass on SPIRV backend. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Lookup refactor (#8467)	kaizhangNV	2025-09-23
\| \| \| \| \| \| \| \| \| \| \|	Close #8201. This PR unify the lowering logic for LookupDeclRef of an interface requirement. We will always lower this AST node to a LookupWitness IR. The key of this IR is the special witnessTableType `ThisTypeWitness`, this witness Table is simply a wrapper for an interface type. Our current specialization pass doesn't handle this kind of LookupWitness IR at all, so we will also add the specialization of this_type IR as well.
*	Split overloaded uses of RefType in front-end (#8427)	Theresa Foley	2025-09-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Overview ======== This change is the start of an attempt to address how the Slang compiler codebase has ended up conflating two similar, but semantically distinct, concepts: * The long-standing notion of `ref` parameters (only allowed for use in the builtin modules), which are encoded using a wrapper `Type` in the AST as part of the representation of the parameters of a `FuncType`. * A recently-introduced notion of explicit reference types that mirror the built-in `Ptr` type, with a relationship comparable to that between pointer and reference types in C++. The change splits the `Ref<T>` type in the core module into two distinct types, with one for each of the two use cases. Similarly, the `RefType` class in the compiler's AST is split into two distinct classes, to represent the two cases. Background ========== The `Ref<T>` type in the core module (hidden and not intended for users to ever see or use) was originally introduced to encode the `ref` parameter-passing mode, comparable to the hidden `Out<T>` and `InOut<T>` types used to encode `out` and `inout` parameter-passing modes. The `Ref<T>` type in the core module was encoded as a instance of the `RefType` class in the Slang AST (similar to how `Out<T>` mapped to an `OutType`). These AST classes were only intended to be used by the compiler front-end as part of its encoding of function types. The `FuncType` class needed a way to distinguish an `inout int` parameter from a plain (implicitly `in`) `int` parameter, so these wrapper like `RefType` and `OutType` were introduced to encode both the parameter type (`T`) and the parameter-passing mode in a form that could be passed around as a `Type`. Notably, the `Ref<T>` type (and `Out<T>`, etc.) were not intended to be type names that ever get uttered in Slang code (not even in the builtin modules), and the vast majority of the compiler code was not supposed to ever encounter them. They were an implementation detail of `FuncType`, and nothing else. (In hindsight it may have been a mistake to use a nominal type declared in the core module to implement these wrappers; it might have been a good idea to use an entirely separate class of `Type` for this case...) Recent changes to the builtin modules introduced functions that wanted to return a reference (so that the parameter-passing-mode modifiers like `ref` could not trivially be used), and as part of those changes the appealingly-named `Ref<T>` type in the core module was re-used for this new case. Builtin operations were declared with an explicit `Ref<T>` return type, and parts of the compiler front-end that had previously been blissfully unaware of the AST's `RefType` (and `InOutType`, etc.) had to start accounting for the possibility that an explicit `Ref<T>` would show up. Related changes also introduced a comparable conflation of the (unfortunately-named) `constref` parameter-passing modifier and builtin operations that wanted to return an explicit reference that is read-only. Both use cases were mapped to the core-module `ConstRef<T>` type, which appeared in the AST as an instance of the `ConstRefType` class. The overlapping use of `ConstRef<T>`` is actually significantly more troublesome than the `Ref<T>` case because, despite what its name implies, `constref` was not really supposed to be the read-only analogue of `ref`, but rather it is closer to the "immutable value borrow" analogue to `inout`'s "mutable value borrow." The semantics of a "value borrow" vs. a "memory reference" in Slang have not been very carefully codified, and the conflation around `ConstRef<T>` has contributed to things becoming increasingly muddy in the compiler back-end. Main Changes ============ Core Module ----------- The `Ref<T>` type has been replaced with two distinct types, with one for each use case: * `RefParam<T>` is intended for use when encoding a `ref` parameter in a function type * `ExplicitRef<T>` is intended for use when an operation in a builtin module wants to return a reference The other types used to represent parameter-passing modes (e.g., `InOut<T>`) were renamed to better indicate that their role in defining parameter types (e.g., `InOutParam<T>`). The `ExplicitRef<T>` type was given additional generic parameters for the allowed access and the address space, akin to what `Ptr<T>` now supports. The pointer dereference operator (prefix ``) in the core module should now properly propagate the access and address space of the pointer over to the reference that gets returned. The two distinct use cases of `ConstRef<T>` were not split in the way as `Ref<T>`, instead the case for the `constref` parameter-passing mode uses `ConstParamRef<T>`, while cases that previously used `ConstRef<T>` to represent a read-only explicit reference instead now use `ExplicitRef<T, Access.Read>`. Prior to this change there were two subscripts declared on pointers: one in the `Ptr` type itself, and another in an `extension` for pointers with `Access.ReadWrite`. The comments on the code seemed to indicate that the catch-all subscript used to only have a `get` accessor, while the `ref` was only available on read-write pointers, but it seems that subsequent changes converted the default subscript to support `ref`. This change eliminates the subscript added via `extension`, since it is redundant. AST and Front-End ================= Similar to the changes in the core module, the AST `RefType` class was split into: `RefParamType` for the case of encoding `ref` parameters * `ExplicitRefType` for the case where the user meant an explicit reference type All the other classes that represent wrappers for encoding parameter-passing modes (e.g., `OutType`) were similarly renamed (e.g., `OutParamType`). The `ConstRefType` class was simply renamed to `ConstRefParamType`, because any use cases of `ConstRefType` that intended an explicit reference type will now use `ExplicitRefType` with `Acccess.Read`. For convenience, this change includes type aliases to map the old names for these types over to the new ones (e.g., `using OutType = OutParamType`) so that the change doesn't need to affect quite so many lines of code. The `RefType` and `ConstRefType` names are intentionally left undefined, since it woudl be unsafe to assume that existing use sites should default to either of the two possible interpretations. All use cases of `RefType` and `ConstRefType` (and their former shared base class `RefTypeBase`) were audited and updated to refer to either `RefParamType`/`ConstRefParamType` or `ExplicitRefType`, as appropriate (based on whether the context of the code indicated it was working with parameter-passing mode wrapper types, or explicit reference types). In many (many) cases comments were added to the code that was updated (and some unrelated code that needed to be audited along the way) to note cases where there appears to be something fishy going on in the compiler and/or there are obvious opportunities for next-step improvement. The `QualType` constructor used to infer l-value-ness when passed a `RefType` or `ConstRefType`; that code was introduced to support explicit reference types. The code was updated to consult the access argument of an `ExplicitRefType` to try and determine the right l-value-ness to use. There is some ambiguity about what should be done in the case where the value of the generic argument representing the access cannot be statically determined; a better solution may be needed. Many other cases in the front-end that were working with `RefType` and `ConstRefType` for explicit references also need to figure out l-value-ness, and these were changed to rely on the logic already added to `QualType` so that it wouldn't have to be duplicated. It isn't clear if this structure is the best way to tackle the problem, but it seems to at least be an upgrade over the more strictly ad-hoc logic that was in place before. Future Work =========== IR-Level Work ------------- The most obvious next step to take is that the split that was made in the compiler front-end needs to be properly plumbed through all of the back-end. There appears to be a lot of code in the back end of the compiler that has made the same conflation of `ref` parameters and explicit reference types that the front-end did. In practice, any uses of `ExplicitRef<T>` in the front-end should desugar into plain pointer-based code in the IR. Clean Up Parameter-Passing Modes -------------------------------- The code that handles different parameter-passing modes (`ParameterDirection`s) and their wrapper types is somewhat scattered and messy (as found while auditing use cases of `RefType`). A cleanup pass is warranted to ensure that most code only needs to think about `ParameterDirection`s. There should ideally be only a single operation in the front-end that handles determining the `ParameterDirection` of a parameter based on its modifiers. Similarly, there should be one operation to wrap a value type based on a parameter direction, and one operation to derive a `ParameterDirection` from the wrapper type. Ideally, the accessors for `FuncType` should not provide unrestricted access to the potentially-wrapped parameter types, and should instead return some kind of `ParamInfo` struct that encodes both a `ParameterDirection` and the unwrapped `Type` of the parameter. Clean Up `QualType` ------------------- A significant piece of future work that appears required is to drastically clean up and improve the way that `QualType`s are represente and handled in the front-end. There are currently various distinct `bool` flags in `QualType` (some with very unclear meaning) and differnet parts of the codebase consult/modify only subsets of them; a clear enumeration of the "value categories" (to use the C++ terminology) that Slang supports could be quite helpful. Naively, a `QualType` should at least encode the basic information that a `Ptr` type encodes: * A value type * Allowed access (read-only, read-write, etc.) * Address space The main additional thing that a `QualType` needs is a way to distinguish cases where an expression evaluates to: * A reference to a memory location, where all the information from a `Ptr` is relevant * A simple value, such that the access and address space are irrelevant * A reference to an abstract storage location (a `property`, `subscript`, or an implicit conversion that needs to support being an l-value), in which case address space is irrelevant and the "allowed access" basically amounts to a listing of the accessors the storage location supports Eliminate Explicit Reference Types ---------------------------------- Finally, twe should eventually eliminate the `ExplicitRef<T>` type from the core module (and all of the supporting code from the front-end), since the feature is not a good fit for the Slang language. We should find some other way to decorate operations in the builtin module that need to returns a reference rather than a value (note how `ref` accessors already avoided exposing explicit reference types, by design). --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Fix DebugCompilationUnit to reference main shader file instead of header ↵	Lujin Wang	2025-09-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	files (#7957) This PR implements the requested fix for issue #7923 where DebugCompilationUnit incorrectly referenced header files instead of the main shader file. ## Summary - Modified IRDebugSource to include isIncludedFile flag as third operand - Updated emitDebugSource function to accept and pass the included file flag - Updated call sites to use source->isIncludedFile() from SourceFile class - Modified SPIR-V emission to only create DebugCompilationUnit for non-included files ## Test Results The fix has been verified with the provided reproducer code. The SPIR-V output now correctly shows DebugCompilationUnit referencing the main shader file instead of header files. Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Lujin Wang <lujinwangnv@users.noreply.github.com> Co-authored-by: Claude Code <claude@anthropic.com> Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Fix crash when compiling specialized generic entrypoint containing a static ↵	Yong He	2025-09-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	const decl. (#8392) Closes #8184. We fixed three issues with this regression test: 1. After generating IR for a `SpecializeComponentType`, we should also strip the frontend decorations from the IR so there is no HighLevelDeclDecoration that will go into the backend. 2. When lowering a static const inside a generic function, we should not give the static const a linkage, because it won't such constant will not appear in global scope. Trying to give it a linkage decoration will lead to the parent generic (for the function) to have two duplicate Export/Import decorations with different mangle names, and confuses the linker. 3. Make sure internal exceptions does not leak through `IComponentType::getEntryPointCode`/`getTargetCode`.
*	[CBP] Pointer frontend changes + groupshared pointer support (#7848)	ArielG-NV	2025-08-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Resolves #7628 Resolves: #8197 Primary Goals: 1. Add `Access` to pointer 2. AddressSpace::GroupShared support for pointers (SPIR-V) 3. Add `__getAddress()` to replace `&` * `&` is not updated to `require(cpu)` since slangpy uses `&`. This means we must: (1) merge PR; (2) replace `&` with `__getAddress()`; (3) add `require(cpu)` to `&` Changes: * Added to `Ptr` the `Access` generic argument & logic (for `Access::Read`). * Moved the generic argument `AddressSpace` from `Ptr` to the end of the type. * Added pointer casting support between any `Ptr` as long as the `AddressSpace` is the same * Disallow globallycoherent T* and coherent T* * Disallow const T, T const, and const T* * Fixed .natvis display of `ConstantValue` `ValOperandNode` * Support generic resolution of type-casted integers * Added `VariablePointer` emitting for spirv + other minor logic needed for groupshared pointers Breaking Changes: * Anyone using the `AddressSpace` of `Ptr` will now have to account for the `Access` argument * we disallow various syntax paired with `Ptr` and `T*` --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Fix issue of double lowering issue a differentiable function (#8182)	kaizhangNV	2025-08-18
\| \| \| \| \|	Close #8054. For detailed root cause is at: https://github.com/shader-slang/slang/issues/8054#issuecomment-3189579508
*	Error if super-type capabilities are a super-set of sub-type (#7452)	ArielG-NV	2025-08-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes: #7410 Changes: 1. super-type capabilities must be a super-set of sub-type capabilities (and support the same shader stages/targets) * InheritanceDecl visits super-type to inherit it's capabilities; validate InheritanceDecl capabilities against sub-type * visit all container decl's with a default case * clean up functionDeclBase visitor * Simplify `diagnoseUndeclaredCapability` by moving logic into capability checking (more correct) 3. added changed behavior to documentation 4. fixed some incorrect capabilities 5. we do not* diagnose capability errors on interface requirement-to-implementation if both lack explicit capability requirements. This change is to work around a slangpy regression (test case for the failing situation is in `tests\language-feature\capability\capability-interface-extension-1.slang`), Note: maybe for slang-2026 we don't do this? 6. requirement & implementation must support the same shader stage/target. This was changed because otherwise we can have cases where `X` inherits from `Y`, but `Y` is only expected to be used in `glsl` whilst `X` is expected to be used in `hlsl \| glsl` 7. removed `tests/language-feature/capability/capabilitySimplification3.slang` because it tests nothing special (redundant) Note: not using rebase due to separate branches depending on this PR --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Fix debug info generation for let variables in SPIR-V output (#7743)	Copilot	2025-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Initial plan * Fix debug info for let variables Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Fix parameter count for emitDebugVar function call Fixed regression where let variable debug info generation was missing the optional argIndex parameter in emitDebugVar call. Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Add location validity check for debug info generation Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Don't insert debug value for nondebuggable types. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com>
*	Prelink ForceInlined functions during lowering. (#7812)	Yong He	2025-07-17
\| \| \| \| \| \| \| \| \|	* Prelink ForceInlined functions during lowering. * Fixes and cleanups. * Fix warning. * Fix crash.
*	Fix duplicate mangled names for interface requirements (#7764)	Yong He	2025-07-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Fix duplicate mangled names for interface requirements Remove linkage decorations from interface method requirement values to prevent duplicate mangled names and allow DCE to clean up unused functions. Interface requirements only need the type information, not the linkage, so the generated IRFunc with export decorations was causing conflicts. Fixes #7761 Co-authored-by: Yong He <csyonghe@users.noreply.github.com> * Move linkage decoration removal into default case of switch Move the removeLinkageDecorations call into the default case of the switch statement so it only applies to general requirements and not special cases like associated types. Co-authored-by: Yong He <csyonghe@users.noreply.github.com> * Remove conditional check for linkage decoration removal Apply removeLinkageDecorations unconditionally to interface requirement values to ensure all requirement values have their linkage decorations removed. Co-authored-by: Yong He <csyonghe@users.noreply.github.com> * Change removeLinkageDecorations parameter type from IRGlobalValueWithCode* to IRInst* This fixes build errors by allowing the function to accept the broader IRInst* type, making it compatible with the unconditional call in the interface requirement processing. Co-authored-by: Yong He <csyonghe@users.noreply.github.com> * format code (#7787) Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Yong He <csyonghe@users.noreply.github.com> Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Replace [KnownBuiltin] string-based comparisons with enum-based system (#7714)	Copilot	2025-07-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Initial plan * Implement enum-based KnownBuiltin system to replace string comparisons Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Add test for enum-based KnownBuiltin system and verify functionality Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Implement enum-based KnownBuiltin system with direct integer values Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Fix IntVal access and update tests for new enum-based KnownBuiltin system Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Replace hardcoded KnownBuiltin integers with preprocessor enum syntax - Updated all KnownBuiltin attributes to use $( (int)KnownBuiltinDeclName::EnumValue) syntax - Added space between parentheses to avoid preprocessor bug: $( (int) instead of $((int) - Updated both core.meta.slang and hlsl.meta.slang files - Eliminates preprocessor-time integer conversion, baking enum values directly into meta files - Maintains same functionality while using type-safe enum references Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Fix IDifferentiablePtr KnownBuiltin mapping regression Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Remove unused IDifferentiablePtrType enum case from KnownBuiltinDeclName Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Clean up temporary AST dump files from testing Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Replace hardcoded integer with descriptive constant in KnownBuiltin test Replace the hardcoded [KnownBuiltin(0)] with a descriptive named constant GEOMETRY_STREAM_APPEND_BUILTIN to improve code readability and maintainability. The test now clearly indicates which builtin enum value is being tested. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Gangzheng Tong <gtong-nv@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Gangzheng Tong <gtong-nv@users.noreply.github.com> Co-authored-by: Gangzheng Tong <tonggangzheng@gmail.com>
*	extend fiddle to allow custom lua splices in more places (#7559)	Ellie Hermaszewska	2025-07-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add fkYAML submodule * Generate slang-ir-inst-defs.h from slang-ir-inst-defs.yaml * generate ir-inst-defs.h * neaten things * neaten inst def parser * add rapidyaml submodule * remove fkyaml * remove fkyaml submodule * remove use of ir-inst-defs.h * format and warnings * fix wasm build * tidy * remove rapidyaml * Extend fiddle to allow custom splices in more places * Use lua to describe ir insts * fix * neaten * neaten * neaten * spelling * neaten * comment comment out assert * merge
*	Remove redundant [payload] attribute (Fix #7528) (#7555)	Harsh Aggarwal (NVIDIA)	2025-06-30
\| \| \| \| \| \| \| \| \| \| \| \|	Removes the PayloadAttribute class and related infrastructure that was made redundant by PR #6595, which added ray payload access qualifiers (PAQs) per the DXR spec. The new [raypayload] attribute with access qualifiers provides the same functionality. Changes: - Remove PayloadAttribute class from slang-ast-modifier.h - Remove [payload] attribute syntax from core.meta.slang - Remove PayloadDecoration IR instruction and related processing - Remove HLSL emission of [payload] attribute - Remove IR lowering support for old PayloadAttribute The new [raypayload] attribute with PAQ support remains unchanged.
*	Minimal optional constraints (#7422)	Julius Ikkala	2025-06-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Parse optional witness syntax * Allow failing optional constraint * Make `is` work with optional constraint * Allow using optional constraint in checked if statements * Fix tests * Make it work with structs * Fix MSVC build error * Disallow using `as` with optional constraints * Update test to match split is/as errors * Add tests * Fix uninitialized variables in tests * Add tests of incorrect uses & fix related bugs * Mention optional constraints in docs * format code * Fix type unification with NoneWitness * Fix formatting --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Nathan V. Morrical <natemorrical@gmail.com>
*	Fix for missing signedness cast in SwizzleIR (#7448)	Jerran Schmidt	2025-06-16
\| \| \| \| \| \| \| \| \|	* Cast if there is a signedness mismatch on the swizzle * Move isSignedType to slang-util and add test --------- Co-authored-by: Yong He <yonghe@outlook.com>
*	Allow interface methods to have default implementations. (#7439)	Yong He	2025-06-13
\|
*	Add DebugLine before IfElse (#7368)	Lujin Wang	2025-06-11
\| \| \| \| \| \| \|	Missing DebugLine in some basic blocks that include OpBranchConditional causes invalid line number '0' presented in the line table of '.debug_line' section. Emiting Debugline before IfElse fixes the issue. Modified maybeEmitDebugLine() to handle the case without Stmt.
*	Mediate access to ContainerDecl members (#7242)	Theresa Foley	2025-06-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most of what this change does is straightforward: take all the places in the code that used to operate directly on `ContainerDecl::members` and related fields, and instead have them call into a smaller set of accessor methods defined on `ContainerDecl`. The primary motivation for making this change is that in order to implement on-demand loading of members from serialized AST modules, we need a way to identify and intercept the "demand" for those members. On-demand loading benefits from having all accesses to the members of a `ContainerDecl` be as narrow as possible. If a part of the code only need a member at a specific index, it should say so. If it only needs access to members with a specific name, or a given subclass of `Decl`, then it should say so. A secondary motivation for this change is that there have recently been several changes that added complexity and special cases by introducing code that operated on (and mutated) the member list of a container decl in ways that the existing code had never done before. Any code that mutates the member list of a `ContainerDecl` needs to be sure to not disrupt the invariants that the lookup acceleration structures currently rely on. One of the recent changes added a declaration-to-index map to the set of acceleration structures (with different validation/invalidation behavior than the others...) while other recent changes would remove or insert declarations in ways that could change the indices of other declarations in the same container. It is not clear if any of these pieces of code were aware of the others, and the invariants that might be expected or broken along the way. This change bottlenecks the vast majority of accesses to the members of a `ContainerDecl` through the following operations: * Getting a `List` of all of the direct member declarations of a container * Get the number of direct member declarations, and accessing them by index. * Looking up the list of direct member declarations with a given name. * Adding a new direct member declaration to the end of the list. Some other operations are layered on top of those (e.g., getting a list of all the direct member declarations of a given C++ class). These layered operations are still centralized on the `ContainerDecl`, with the intention that we can change them to be non-layered implementations if we ever need to for performance (e.g., by building a lookup structure for finding member declarations by their type). The exceptional cases of access/mutation on the direct members of a `ContainerDecl` have also been encapsulated, but rather than expose what would risk appearing like general-purpose accessors (e.g., `removeDecl(d)`, `setDecl(index)`, etc.), these operations have been explicitly named after the specific use case that they serve in the codebase today, to discourage others from using them for more kinds of operations we'd rather not support. These operations have also been given parameter signatures that match their use cases, to make it so that even somebody determined to abuse them would have to invent suitable arguments out of thin air. In the case of the declaration-to-index mapping, this change eliminates that acceleration structure, in favor or slightly more complicated (and possibly inefficient, yes) code at the use site. Over time, it would be good to closely scrutinize each of the use cases that requires more complicated interaction with the members of a `ContainerDecl`, to see whether any of them can be reframed in terms of the more basic operations, or if there is some clean abstraction we can introduce to make operations that mutate the member list feel like... hacky.
*	Implement MapElement for CoopMat (#7159)	Jay Kwak	2025-05-29
\| \| \| \| \| \| \| \| \|	With this PR, MapElement works for the following signatures: - CoopMat<...>::MapElement(functype(...)); - CoopMat<...>::MapElement(capturing-lambda); - CoopMat<...>::MapElement(not-capturing-lambda); - Tuple<CoopMat<...>,...>::MapElement(functype(...)); - Tuple<CoopMat<...>,...>::MapElement(capturing-lambda); - Tuple<CoopMat<...>,...>::MapElement(not-capturing-lambda);
*	Language version + tuple syntax. (#7230)	Yong He	2025-05-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Language version + tuple syntax. * Fix compile error. * regenerate documentation Table of Contents * Fix. * regenerate command line reference * Fix. * Fix. * Fix more test failures. * revert empty line change, * Retrigger CI * #version->#lang * Update source/core/slang-type-text-util.cpp Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> * Remove comments. * Fix parsing logic. * Fix parser. * Fix parser. * update test comment * Update options. * regenerate documentation Table of Contents * regenerate command line reference --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
*	Implement throw & catch statements (#6916)	Julius Ikkala	2025-05-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Implement throw statement It already existed in the IR, so only parsing, checking and lowering was missing. * Initial catch implementation Likely very broken. * Error out when catch() isn't last in scope * Prevent accessing variables from scope preceding catch As those may actually not be available at that point. * Add IError and use it in Result type lowering * Add diagnostic tests * Allow caught throws in non-throw functions * Fix catch propagating between functions & SPIR-V merge issue * Add test for non-trivial error types * Fix MSVC build * Fix invalid value type from Result lowering * Also lower error handling in templates * Lower result types only after specialization * Attempt to disambiguate error enums by witness table * Revert matching by witness, types should be distinct too * Don't assert valueField when getting Result's error value It may not exist if the function returns void, but getting the error value is still legitimate. * Update tests for new error numbers & get rid of expected.txt * Change catch lowering to resemble breaking a loop ... To make SPIR-V happy. * Fix dead catch blocks and invalid cached dominator tree * More SPIR-V adjustment * Lower catch as two nested loops * Add defer interaction test and revert broken defer changes * Fix enum type when throwing literals * Cleanup and bikeshedding * Document error handling mechanism * Fix table of contents * Use boolean tag in Result<T, E> * Use anyValue storage for Result<T,E> * Remove IError * Fix formatting * Eradicate success values from docs and tests * Use parseModernParamDecl for catch parameter * Implement do-catch syntax * Implement catch-all * Fix formatting * Fix marshalling native calls that throw --------- Co-authored-by: Yong He <yonghe@outlook.com>
*	Make sizeof(T) & alignof(T) of generic types work as compile-time constants ↵	Julius Ikkala	2025-05-22
\| \| \| \| \| \| \| \| \| \| \|	(#7213) * Make sizeof(generic) work as compile-time constant * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Implement spec const for generic parameter (#7121)	kaizhangNV	2025-05-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Close #6840. This PR add supports to use specialize constant in generic parameter, and that parameter can also be used as array size, e.g. following code should work: ``` struct MyStruct<let N: int> { float buffer[N]; } MyStruct<SpecConstVar> s; ``` - Loose the restriction from Link-Time to SpecializationConstant when extract generic argument - Tweak the logic of how we decide whether a inst is hoistable. Besides checking existing hoistable flag of each IRInst, when we detect a IRInst's type is SpecConstRateType, we will treat that inst hoistable. Because IRInst in global scope can be deduplicated, and every SpecConstRateType inst should be in the global scope or IRGeneric scope (which will be at global scope after specialization). - Remove the SpecConstIntVal to IRInst map in IR lowering logic, because we already have way to deduplicate the global scope IR.
*	support specialization constant sized array (#6871)	kaizhangNV	2025-05-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Close #6859 Goal of this PR We want to support an array whose size can be specialization constant for shared/global variable e.g. layout (constant_id = 0) const uint BLOCK_SIZE = 64; shared float buf_a[(BLOCK_SIZE + 5) * 4]; Overview of the solution: During IndexExpr check, we will loose the restriction to allow SpecConst passing, but the size parameter will not be a constant value because it cannot be folded into a constant, so we will make it follow the same logic as generic parameter value, and the size will be represented by FuncCallIntVal/PolynomialIntVal/DeclRefIntVal. During IR lowering, we will detect whether there is spec constant in the IntVal, and wrap the IRInst with a SpecConstRateType, and propagate the type though the lowering logic, such that the IntVal representing the array size will have SpecConstRateType. During spirv emit stage, if we detect that a IRInst has SpecConstRateType, we will emit it as SpecConstantOp. We have to implement new logic to emit OpSpecConstantOp, the existing emit logic doesn't support emitting OpSpecConstantOp, especially this op can embed arithmetic operation at global scope, where we can only emit arithmetic instruct at local. But there are only few instructs we need to support. Overview of the solution: This PR doesn't support generic, and we will create a separate PR to extend that, tracked in #6840.
*	Fix invalid memory dereference in lower-to-ir (#7080)	Jay Kwak	2025-05-13
\| \| \| \|	A reference-counting pointer type released a heap memory object when it return from the function and we are trying to dereference it later. We should increment the ref-count by one by assigning it to the context before returning.
*	Support Array Sizes using Generic arguments to be initialized via {} (#6720)	Sruthik P	2025-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add support for Array Sizes using Generic arguments to be initialized via {} Fixes one subissue of #6138 This change adds support for initializing Arrays with Generic size arguments via {} and adds a test to verify it. The change checks for an array whose size parameter is a GenericParamIntVal and since the size of such an array will be known at link time, is not considered as a case of the size not being known statically. * Add support for Array Sizes using Generic arguments to be initialized via {} Fixes one subissue of #6138. Fixes the issue #6958. This change adds support for initializing Arrays with Generic size arguments via {} and adds a test to verify it. Support is added by means of adding a new AST Expr node that lowers down to the IR MakeArrayFromElement and the emission of a diagnostic is replaced with the creation of this new AST Expr node. * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
*	Fix local constants in switch cases (#7053)	Julius Ikkala	2025-05-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Fix using local constants in switch cases * Add test * format code * Always lower switch cases with exprVal * Fix formatting --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com>
*	Add debug information for slang inling (#6621)	Mukund Keshava	2025-05-10
\|
*	Fixed name mangling of generic extensions (#6671)	Ronan	2025-05-09
\| \| \| \| \| \| \| \| \| \| \| \| \|	* Fixed name mangling of generic extensions * Added tests for generic extensions linking. - Disabled bugs/gh-6331.slang since it now triggers an assertion error revealed by the new version of the mangler. * Re-enabled test gh-6331 (fixed by a5efbb1b775afb2f6b29b37d39947c41744bb005) --------- Co-authored-by: Yong He <yonghe@outlook.com>
*	Add IREnumType to distinguish enums from ints and each other (#6973)	Julius Ikkala	2025-05-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add IREnumType to distinguish enums from ints and each other * Add issue example as test * format code * Add expected test output * Fix peephole optimization hanging No idea why this PR triggered this, but there seems to have been a clear bug here anyway, so may just as well fix it now. * Move enum lowering later * Add linkage decoration to enum type * Use filecheck-buffer instead of expected.txt * Fix comment * Make enum casts actually use IR enum casts They were all BuiltinCasts by accident * Lower enum type before VM * Deal with rate-qualified types in enum cast * Allow any value marshalling for enum types * Handle new enum instructions in a couple more switches * Fix formatting --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Initial support for immutable lambda expressions. (#6914)	Yong He	2025-04-30
\| \| \| \| \| \| \| \| \| \| \| \| \|	* Initial support for immutable lambda expressions. * More diagnostics, and langauge server fix. * Language server fix. * Fix bug identified in review. * Add expected result. * Update expected result.
*	A new approach to AST serialization (#6854)	Theresa Foley	2025-04-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* A new approach to AST serialization This change completely overhauls the way that AST nodes are being serialized, and the offline source-code generation steps that enable that serialization. In practice, this ends up being a complete overhaul of the way that modules are being serialized (not just the AST part), although things like the serialization format for the Slang IR and for source locations are not affected. The rest of this commit message is broken down in to sections, in an attempt to help guide anybody looking at the code in how to make sense of all the changes. The Old C++ Extractor --------------------- AST serialization used to be driven by information scraped using the `slang-cpp-extractor` tool, which did an ad hoc parse of the C++ declarations of the AST node types and then generated a set of "X macros" that could be for macro-based code generation within the rest of the compiler. While the existing approach was functional, it wasn't easy to understand or maintain, and it has been getting in the way of forward progress on other features we'd like to work on in the language and compiler. This change removes the `slang-cpp-extractor` tool entirely. Marking Up the AST Declarations ------------------------------- The most notable change that contributors to the compiler may notice is the large number of invocations of a macro `FIDDLE()` on the declarations of the AST node types. The basic idea is that only declarations (namespaces, types, fields) that are preceded by `FIDDLE()` are visible to the code generator tool. So if somebody is working with the AST and wondering why a new node type isn't working, or why a field they added isn't being serialized correctly, it is probably because they need to add `FIDDLE()` in front of it. Generating the Boilerplate Code ------------------------------- The file `slang-ast-boilerplate.cpp` provides a good example of how the information extracted from the marked-up AST declarations gets used. In that file, the `FIDDLE TEMPLATE` construct is used to generate type information for each of the AST node types. Similar logic is used in `slang-ast-forward-declarations.h` to generate the declaration of the `ASTNodeType` enumeration, and forward-declare all the AST node classes. For many parts of the code, simply including that file replaces the need for the old `slang-generated-.h` files. Replacing Visitors and Related Logic ------------------------------------ The old visitor types for the AST used the macros that were generated by `slang-cpp-extractor`, so something new was needed to replace them. The same goes for the `SLANG_AST_NODE_VIRTUAL_CALL` macros. The core of the solution implemented here is in `slang-ast-dispatch.h`. Given a "dispatchable" AST node type (say, `Expr`), a call like: ``` ASTNodeDispatcher<Expr,R>(expr, [&](auto e) { return doSomething(e); }) ``` is an expression of type `R`, which does the equivalent of something like: ``` switch(expr->getTag()) { case ASTNodeType::VarExpr: return doSomething(static_cast<VarExpr>(expr)); // ... } ``` The `SLANG_AST_NODE_VIRTUAL_CALL` macro is now implemented in terms of `ASTNodeDispatcher`. The implementation of the visitor types is more involved. The code in this change retains some of the macro names from the original version, just to try and make the parallels more clear. The visitor types are all implemented on top of the `ASTNodeDispatcher` approach, and use `FIDDLE TEMPLATE` to generate all the boilerplate `visit()` method declarations. Refactoring of `Linkage` Module Loading --------------------------------------- Needing to revisit all the places where modules get deserialized made it clear that there is a lot of complexity and apparent duplication in the core routines on the `Linkage` that get used for loading modules. This change tries to clean up some of that logic, but it is worth noting that there are two legacy features that get in the way of making things as clean as they should be: The `LoadedModuleDictionary` type that gets passed around a lot exists entirely to handle the corner case where somebody uses the Slang API to perform a compilation with multiple `TranslationUnitRequest`s in the same `FrontEndCompileRequest`, and one of the translation units `import`s the module defined by another of the translation units. * There are a lot of special-case behaviors and routines entirely there to support the `ModuleLibrary` feature, although that feature should be considered deprecated (or at least subject to getting entirely re-designed down the line). The basic idea of the cleanup is that all of the (non-deprecated) ways load a module from a serialized binary, or compile one from source should now bottleneck through `loadModuleImpl`, which then bifurcates into `loadSourceModuleImpl` for the compilation case and `loadBinaryModuleImpl` for the deserialization case. High-Level Serialization Approach --------------------------------- The old serialization logic used the [RIFF](https://en.wikipedia.org/wiki/Resource_Interchange_File_Format) format to encode the high-level structure of things, and this change retains that usage (and actually doubles down on the RIFF usage). The old serialization system relied on the idea that for any given type `Foo` that wants to support serialization, there should be something like a `SerialFooData` type in C++, that can represent the state of a `Foo`, and then the actual serialization applied to that `SerialFooData`. This means that in most cases there are four pieces of code written: * During serialization: * Copying the data of a `Foo` in memory over to a `SerialFooData` in memory * Writing the state of a `SerialFooData` into the serialized data stream * During deserialization: * Reading the state of a `SerialFooData` from a serialized data stream * Copying the data of the `SerialFooData` in memory over to a `Foo` The new logic gets rid of the intermediate `SerialFooData`. In the serialization direction, we take a `Foo` and write it to the `RIFFContainer` directly, or using some other utilities layered on top of it. In the deserialization direction, we have additional flexibility. Given a `RIFFContainer::Chunk` that represents a serialized `Foo`, we often navigate through the in-memory representation of the RIFF data to get to the parts of the serialized value that we actually want/need, without needing to deserialize the entire `Foo`. To support this kind of operation, this change introduces a few helper types like `ContainerChunkRef` an `ModuleChunkRef`, that are little more than typed wrappers around a `RIFFContainer::Chunk`. The Module "Container" Part --------------------------- A serialized `Module` is encoded as a RIFF chunk, using logic in `slang-serialize-container.cpp` - both before and after this change. This change reorganizes a lot of the code in that file, to account for the way that eliminating the intermediate `SerialContainerData` type streamlines the overall task of writing out the parts of the module. In the deserialization logic... there isn't really much to do in `slang-serialize-container.cpp`. Most of the logic in `slang.cpp` and `slang-module-library.cpp` that pertains to deserializing modules uses the `ModuleChunkRef`-based approach, and simply extracts the pieces of the serialized module that it needs. The Actual Serialization of the AST ----------------------------------- The actual AST serialization logic is in `slang-serialize-ast.cpp`. The basic approach in both the writing and reading directions is: * Use the `FIDDLE TEMPLATE` system to generate a set of functions, one for each AST node type, that recursively invoke the read/write logic on each field of that node (after recursively invoking the case for its direct superclass) * Use the `ASTNodeDispatcher` system to dispatch out to those functions whene reading or writing anything derived from `NodeBase` * For now, handle all types not derived from `NodeBase` by hand. There's a lot of room for improvement around that last item: it should be just as easy to generate the serialization and deserialization logic for other types that don't inherit from `NodeBase`, but the current change tries to err on the side of making the logic as explicit and simplistic as possible, rather than trying to get too clever too soon. The actual serialization format used for the AST is almost comically simplistic: the code uses hierarchical RIFF chunks to emulate a JSON-like structure. This is a very wasteful representation (e.g., a `bool` or a null pointer each take up 8 bytes), but the goal for now is to start with the simplest thing that could possibly work, and only add more cleverness once we are sure it won't get in the way of important future improvements (like lazy/on-demand deserialization or IR and AST, to improve compiler startup times). The files `slang-serialize.{h,cpp}` have been co-opted to define a new pair of types `Encoder` and `Decoder` that are used for a more-or-less stream-oriented way or reading or writing RIFF chunks for the JSON-like structure. Almost everything related to the actual AST serialization could do with a cleanup pass, and some time spent on picking good/better names for everything. Smaller Stuff ------------- * Cleaned up a lot of code that was using bare `ASTNodeType` or the extractor's `ReflectClassInfo` type to consistently use `SyntaxClass`. * Fixed an apparent bug in how the destination-driven code genarator was handling `TryExpr`s * Fixed an apparent bug in how the GLSL legalization pass was handling translation of certain `SV_` semantics. format code * fixup: template errors caught by non-VS compilers * format code * fixup: more template errors * fixup: more stuff VS didn't catch * fixup: it's amazing VS doesn't catch these... * fixup: yet more template stuff VS ignores * fixup: more VS template nonsense * fixup: unreachable return macro usage * fixup: more unreacable returns * fixup: unused parameter * fixup: strict aliasing * fixup: allow missing entry point list chunk * fixup: wasm build script * fixup: AST changes since this PR was created --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com>
*	Add `vk::offset` to specify member offsets for push constants (#6797)	Darren Wihandi	2025-04-21
\| \| \| \| \| \| \| \| \| \| \| \| \|	* Add struct member offset qualifier for SPIRV * Implement for GLSL target and add tests * clean up * fix formatting * fix typo * renamed GLSLStructOffset to VkStructOffset and added emit-spirv-via-glsl test case
*	Eliminate back-reference in ChildStmt (#6835)	Theresa Foley	2025-04-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Eliminate back-reference in ChildStmt This change is part of a larger effort to improve the code for AST serialization in the Slang compiler. Tree structures are understandably easier to serialize than DAGs, and DAGs are easier than fully generaal graphs. The Slang AST nodes form a tree structure... except when they don't. Among the exceptions to nice tree-structured ASTs are: 1. References to `Decl`s are encoded as pointers to the AST `Decl` nodes themselves. This can result in cycles in the graph, and requires care in serialization. 2. Nodes that inherit from `Val` represent, well, values instead of actual pieces of syntax, and as such they are deduplicated so that identical values will (hopefully) be identical pointers. This results in a DAG structure for `Val`s, but at least it's not a general graph (except for cycles that go through a `Decl`). 3. There are some minor cases of DAG-structured sharing that the parser can introduce to deal with cases when a traditional-style declaration includes multiple declarators. E.g., given: ``` static int a, b; ``` The resulting `DeclGroup` will include distinct `Decl`s for `a` and `b`, which will share the `static` modifier through a `SharedModifiers` node, and the `int` type specifier through a `SharedTypeExpr` node. This duplication can be ignored, for the purposes of serialization, since duplicating those parts of the AST has no major down-sides. 4. There is the case of `ChildStmt`, used for things like `break` and `continue`, which stores a direct `Stmt` to the enclosing parent statement being targetted. Storing the target is useful so that IR lowering doesn't need to repeat the work that the semantic checking logic did to associate each child statement with its parent. The parent link inside of `ChildStmt` creates a cycle in the AST `Stmt` hierarchy, since the outer statement contains the inner, and the inner statement stores a pointer to the outer. This change eliminates the last of these sources of complication for AST serialization, by changing the `ChildStmt` type to stored an integer ID for the enclosing statement that it matches to, and having each `BreakableStmt` (used to represent the outer `switch`, or loop, or whatever) generate its own unique ID as part of semantic checking. Note: if necessary, it is reasonable for the outer statement to have its unique ID generated as part of parsing, rather than semantic checking. format code * Change unique ID to be a proper Decl The fix here is to make the "unique ID" representation be a full `Decl`-derived AST node, so that it is both allowed to break the tree-structuring rules cleanly, and it is also trivially guaranteed to be unique across all loaded ASTs. * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Support for Payload Access Qualifiers (#3448) (#6595)	Harsh Aggarwal (NVIDIA)	2025-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add support for Ray Payload Access Qualifiers (PAQs) (#3448) - Added [raypayload] attribute for struct declarations - Implemented field validation requiring read/write access qualifiers - Added diagnostic error for missing qualifiers - Enabled PAQs in DXC compiler and HLSL emission - Added new test demonstrating PAQ syntax - Implemented proper handling of ray payload attributes in IR generation * format code * Cleanup: Remove unused vars * Add check to enablePAQ only for profile >= lib_6_7 * Review Fix - Add PAQ support for DX Raytracing add enablePAQ flag to DownstreamCompileOpitons, improve PAQ handling update raypayload-attribute-paq.slang to ensure hlsl and dxil is validated * Add diagnostic test for missing paq for lib_6_7 Compile using `-disable-payload-qualifiers` aka lib_6_6 profile raypayload-attribute-no-struct.slang and raypayload-attribute.slang --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
*	Add defer statement (#6619)	Julius Ikkala	2025-04-06
\|
*	Make IRWitnessTable HOISTABLE (#6417)	Jay Kwak	2025-04-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	# Make `IRWitnessTable` Hoistable ## Intention of the PR This commit makes `IRWitnessTable` Hoistable so that we can avoid duplicated `IRWitnessTable`. ## Problems This commit tries to address the following issues arise after turning `IRWitnessTable` into Hoistable: 1. A Hoistable instance is immutable. 2. When tries to create a duplicated child, you will get a previously created instance of `IRWitnessTable`, instead of a new one. 3. We don't actually want to hoist `IRWitnessTable`. 4. There can be only one instance of Hoistable and it cannot appear as childs multiple times. 5. Different import/export mangled names were used for the same Witness-table when its type is "enum" interface. ## Implementation ### Solution for "1. A Hoistable instance is immutable." `IRWitnessTable::setConcreteType()` is removed, because when an `IRInst` is Hoistable, it is treated as immutable. Any `IRInst::setXXX()` methods don't work anymore. There were two places calling `setConcreteType()` and their logic had to change little bit. `DeclLoweringVisitor::visitInheritanceDecl()` in `source/slang/slang-lower-to-ir.cpp` was calling `setConcreteType()`. It had a little strange logic around `lowerType()`. The `IRWitnessTable` was added with `context->setGlobalValue()` first and its `concreteType` was changed later. This commit works around in a way that it sets the parent of `IRWitnessTable` temporarily and reset it with the correct `IRWitnessTable`. Without this logic, it went into an infinite recursion. `AutoDiffPass::fillDifferentialTypeImplementation()` in `source/slang/slang-ir-autodiff.cpp` was calling `setConcreteType()`. It was changing the concreteType of `innerResult.diffWitness`. This commit creates a new `IRWitnessTable` and copies its `IRWitnessTableEntry`. ### Solution for "2. When tries to create a duplicated child, you will get a previously created instance of IRWitnessTable, instead of a new one" After a call to `IRBuilder::createWitnessTable()`, this commit checks if the returned `IRWitnessTable` is a brand new or not. If it is not a new one, we have to avoid adding the decorations and children. This commit decides when to add decorations and children based on whether `IRWitnessTable` has any of decorations or children already. It doesn't seem like a proper way to check. But when I tried, it was difficult to find a bottleneck point where the decorations and children are added to `IRWitnessTable` first time. Note that we are not trying to find when `IRWitnessTable` is created for the first time; we need to find if the decorations and children were added once. It might be fine to have duplicated `IRWitnessTableEntry` in most of the cases, but I noticed that it fails an assertion check when `shouldDeepCloneWitnessTable()` returns false in `cloneWitnessTableImpl()`. ### Solution for "3. We don't actually want to hoist IRWitnessTable." The reason why this commit makes `IRWitnessTable` is to prevent the duplicated instances of `IRInst`. But we don't really want to "Hoist" them. When an `IRWitnessTable` gets Hoisted out, it causes unexpected problems and the specialization process fails due to the missing `IRWitnessTable` in the input. This commit prevent from hoisting `IRWitnessTable` in `_replaceInstUsesWith()`. The way this is implemented feel little hack but we discussed on Slack and decided to go with this. One of the proper approaches could be to add a new flag in `IROpFlags` and have a new one like `kIROpFlag_Deduplicate`, which is different from just `kIROpFlag_Hoistable`. ### Solution for "4. There can be only one instance of Hoistable and it cannot appear as childs multiple times." When `IRWitnessTable` is Hoistable, there can be only a unique set of instances. And we cannot have an instance as a duplicated childs. It is because `IRInst` has only one set of `IRInst* next` and `IRInst* prev`. Before this commit, an instance of `IRGeneral` could have duplicated instances of `IRWitnessTable`. As an example, `IInteger` interface inherits two other interfaces, `IArithmetic` and `ILogical`. And they both inherits from `IComparable`. ``` interface IInteger : IArithmetic, ILogical {} interface IArithmetic : IComparable {} interface ILogical : IComparable ``` When we specialize it in `specializeGenericImpl()`, an `IRBlock` gets the following list of children: - IRWitnessTable for IComparable, - IRWitnessTable for IArithmetic, - IRWitnessTable for IComparable, - IRWitnessTable for ILogical, For the cloning during the specialize, "IRWitnessTable for `IComparable`" must be cloned before the cloning of "IRWitnessTable for `IArithmetic`". Because "IRWitnessTable for `IArithmetic`" refers "IRWitnessTable for `IComparable`" as its `IRWitnessTableEntry`. The order they appear in the `IRBlock` as children decides which instances will be cloned first. And "IRWitnessTable for `IComparable`" must appear before "IRWitnessTable for `IArithmetic`". Note that "IRWitnessTable for `IComparable`" appears twice, The first one was added for "IRWitnessTable for `IArithmetic`". And the second one is added for "IRWitnessTable for `ILogical`". With this commit "IRWitnessTable for `IComparable`" can appear as a child only once in `IRBlock`. So it causes an error if it gets the following list: - IRWitnessTable for IArithmetic, - IRWitnessTable for IComparable, - IRWitnessTable for ILogical, In order to resolve the problem, "IRWitnessTable for `IComparable`" must appear before both "IRWitnessTable for `IArithmetic`" and "IRWitnessTable for `ILogical`" as following: - IRWitnessTable for IComparable, - IRWitnessTable for IArithmetic, - IRWitnessTable for ILogical, To address the problem, the instances of `IRWitnessTable` is always added to the end of the children list. If it is already added to the list, we don't move. This works out because the AST tree is built based on the dependencies. ### Solution for "5. Different import/export mangled names were used for the same Witness-table when its type is "enum" interface." This issue was found while testing with Falcor tests where it uses Conformance-type feature of Slang. We are using different import and export mangled names for a same Witness-table when the witness-table is for "Enum" interface. The way we simplify the implementation of "Enum" causes a problem when it comes to generate export/import for the witness-table. And the exact repro step is still unclear. There were two suggested solutions for the problem and this PR adopted the first option for now. Maybe we want to improve it with the second option later. option 1, when we produce mangled names for those witness-table, we can use a mangled name with the underlying "int" type instead of the name of the enum type. In this way, all witness-tables for enum types whose underlying type is same will get the same mangled name. It will allow us to deduplicate the witness-table during the linking. option 2, we can preserve type info for enum type when generating IR. We can still erase all other uses of the type info of enum types for now. But when we generate the witness-table, instead of filling the conforming type operand to IntType, we fill it as EnumType(IntType) where EnumType is a new global IROp code to represent all enum types (like InterfaceType/StructType). This way the operands for the two witness-tables will be different. "option 1" is more quick and dirty and "option 2" is more proper way to address it. I should go with "option 1" and improve it with "option 2" approach later.
*	Emit errors for missing returns on unsupported targets (#6633)	Darren Wihandi	2025-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* initial wip * more WIP * preserve old lower behavior * remove unnecessary includes * add test * add no target case in test * fix broken test --------- Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
*	Fix lowering of associated types in generic interfaces (#6600)	Sai Praveen Bangaru	2025-03-15
\| \| \| \| \| \| \| \| \| \| \|	* Fix lowering of associated types in generic interfaces. * Update diff-assoctype-generic-interface.slang * Fix-up lowering of differentiable witnesses for implicit ops * Update slang-ir-autodiff-transcriber-base.cpp * Fix issue with differentiating type-packs
*	Add mesh shader output topology checks (#6592)	Darren Wihandi	2025-03-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* initial wip * more wip * add test * add unexpected for invalid target * fixups and improve error message * fixups and improve error message * remove incorrect comment --------- Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
*	Fix lowering of `extern` types with defaults. (#6512)	Yong He	2025-03-06
\| \| \| \| \| \| \|	* Fix lowering of `extern` types with defaults. * Fix test. * Fix test.
*	Update SPIRV-Tools and fix new validation errors. (#6511)	Yong He	2025-03-06
\| \| \| \| \| \| \|	* Update SPIRV-Tools and fix new validation errors. * Implement pointers for glsl target. * Reworked packStorage/unpackStorage code gen to operate on pointers rather than values.