slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2025-10-03	Rename some symbols related to pointers types (#8592)	Theresa Foley
	Note that while this change touched a large numer of files, there are no changes to functionality being made here. The only things being done are renaming various symbols and, in a few cases, updating or adding comments for consistency with the new names. The core of the naming changes are: * Most things named to refer to `OutType` (e.g., `IROutType`, `IRBuilder::getOutType()`, etc.) have been consistently renamed to refer to `OutParamType`, to emphasize that the relevant AST/IR node types are only intended for use to represent `out` parameters. * The same change as described above for `OutType` is also made for `RefType`, which becomes `RefParamType` in most cases. One mess that this exposes is the way that the `ExplicitRef<T>` type in the core module currently lowers to `IRRefParamType`. This change sticks to the rule of not making functional changes, so that mess is left as-is for now. * Names referring to `InOutType` have been changed to instead refer to `BorrowInOutType`. The intention with this naming change is to emphasize that the Slang rules for `inout` are semantically those of a borrow (or at least our interpretation of what a borrow means). * Names referring to `ConstRefType` have been changed to instead refer to `BorrowInType`. This change starts work on clarifying that the existing `__constref` modifier was never intended to be a read-only analogue of `__ref`, and instead is the input-only analogue of `inout`. * The `ParameterDirection` enum type has been changed to `ParamPassingMode`, to reflect the fact that the concept of "direction" fails to capture what is actually being encoded, particularly once we have modes beyond simple `in`/`out`/`inout`. While this change does not alter behavior in any case (the user-exposed Slang language is unchanged), it is intended to set up subsequence changes that will work to make the handling of these types in the compiler more nuanced and correct. Breaking this part of the change out separately is primarily motivated by a desire to minimize the effort for reviewers. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-10-02	Relax the inst definition order rule (#8588)	kaizhangNV
	Close #8572. The root cause of the issue is that in `_replaceInstUsesWith` call, if the use of the inst is a generic parameter, and the inst is the data type of that generic parameter, we could end up of moving the data type before the generic parameter. This will break the layout of generic parameters, where all the generic parameters should be laid consecutively at the beginning of the first block of the generic. Therefore, we don't make that relocation for such case.
2025-09-03	Diagnose on structured buffers containing resources (#8222)	Ellie Hermaszewska
	closes https://github.com/shader-slang/slang/issues/3313
2025-07-18	Lower int/uint/bool matrices to arrays for SPIRV (#7687)	venkataram-nv
	* Add tests for expected behaviour * Allow matrix types in logical or/and * Legalize int/bool matrix types and construction with makeMatrix * Legalize uint matrices and operations * Limit testing to only SPIRV * Better tests for int and bool * Add test for uint * Remove GLSL tests * Remove old test for diagnosing int matrices * Emit SPIRV directly in tests * format code * Address PR comments * Improve testing * Address PR comments * format code * Add tests for matrix intrinsic operations * Move matrix lowering to dedicated legalization pass * Fix compiler warning * Remove signal again * Reorder matrix and vector legalization * Fix formatting * Add shift and comparison tests --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-07-01	extend fiddle to allow custom lua splices in more places (#7559)	Ellie Hermaszewska
	* Add fkYAML submodule * Generate slang-ir-inst-defs.h from slang-ir-inst-defs.yaml * generate ir-inst-defs.h * neaten things * neaten inst def parser * add rapidyaml submodule * remove fkyaml * remove fkyaml submodule * remove use of ir-inst-defs.h * format and warnings * fix wasm build * tidy * remove rapidyaml * Extend fiddle to allow custom splices in more places * Use lua to describe ir insts * fix * neaten * neaten * neaten * spelling * neaten * comment comment out assert * merge
2025-06-12	Fix intermittent debug failures with Debug build (#7369)	Jay Kwak
	This PR replaces enable/disable style C function calls with C++ RAII style code. In debug build, when an assertion failed in between enable and disable functions, an exception is thrown and the disable function is not called. RAII style code is safer for an exception
2025-05-14	Error out on invalid vector sizes (#7076)	Darren Wihandi
	* Error out on invalid vector sizes * Remove unnecessary include * Fix incorrect assert * Add test
2025-05-10	Add debug information for slang inling (#6621)	Mukund Keshava

2025-01-22	Add validation for destination of atomic operations (#6093)	Anders Leino
	* Reimplement the GLSL atomic* functions in terms of __intrinsic_op Many of these functions map directly to atomic IR instructions. The functions taking atomic_uint are left as they are. This helps to address #5989, since the destination pointer type validation can then be written only for the atomic IR instructions. * Add validation for atomic operations Diagnose an error if the destination of the atomic operation is not appropriate, where appropriate means it's either: - 'groupshared' - from a device buffer This closes #5989. * Add tests for GLSL atomics destination validation Attempting to use the GLSL atomic functions on destinations that are neither groupshared nor from a device buffer should fail with the following error: error 41403: cannot perform atomic operation because destination is neither groupshared nor from a device buffer. * Validate atomic operations after address space specialization Address space specialization for SPIR-V is not done as part of `linkAndOptimizeIR`, as it is for e.g. Metal, so opt out and add a separate call for SPIR-V. * Allow unchecked in/inout parameters for non-SPIRV targets * Clean up callees left without uses during address space specialziation * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com>
2024-11-05	Move switch statement bodies to their own lines (#5493)	Ellie Hermaszewska
	* Move switch statement bodies to their own lines * format --------- Co-authored-by: Yong He <yonghe@outlook.com>
2024-10-29	format	Ellie Hermaszewska
	* format * Minor test fixes * enable checking cpp format in ci
2024-04-12	Fix IR lowering bug of do-while loops. (#3941)	Yong He

2024-02-20	Refactor compiler option representations. (#3598)	Yong He
	* Refactor compiler option representation. * Fix binary compatibility. * Add a test for specifying compiler options at link time. * Fix binary compatibility. * Fix binary compatibility. * Fix backward compatibility on matrix layout. * Fix. * Fix. * Fix. * Fix gfx. * Fix gfx. * Fix dynamic dispatch. * Polish.
2024-01-24	IRSPIRVAsmOperandInst instructions may not have IRBlock as the immediate ↵	Pankaj Mistry
	parent. Previously special case was added to handle IRDecoration similarly. Replace this with a common method getBlock that traverses the parent chain till it gets to the Block (#3486) Fixes bug #3432
2023-09-21	Various slangpy fixes. (#3227)	Yong He
	* Make dynamic cast transparent through `IRAttributedType`. * Add [CUDAXxx] variant of attributes. * Support marshaling of vector types. * Wrap cuda kernels in `extern "C"` block. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-09-21	fix warnings (#3224)	Ellie Hermaszewska
	* Remove unused variable * Remove unused variable * Remove unused if bindings
2023-08-24	Misc. SPIRV Fixes, Part 2. (#3147)	Yong He
	* Misc. SPIRV Fixes, Part 2. * Fix up. * Fix. * Add system smenatic values. * 16 bit int and floats, matrix/vector reshape, bool ops. * Fix. * Fix. * Allow push constant entry point params. * entrypoint params. * swizzleSet and swizzledStore. * packoffset. * string hash. * Fix. * Matrix arithmetics. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-05-30	Fix type checking & loop value hoisting (#2907)	Yong He
	* Fix type checking crash in language server. * Fix loop var hoisting logic. Fixes #2903. * fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-04-25	Dictionary using lowerCamel (#2835)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * WIP lowerCamel Dictionary. * WIP more lowerCamel fixes for Dictionary. * Add/Remove/Clear * GetValue/Contains * Fix tabs in dictionary. Count -> getCount * Fix fields with caps. * Key -> key Value -> value Use m_ for members where appropriate. Use lowerCamel in linked list. * Some small fixes/improvements to Dictionary. * Kick CI.
2023-03-16	Fix Phi simplification bug. (#2710)	Yong He
	* Fix Phi simplification bug. * Fix up. * Fix. * Fix. * Fix. * Fix. * Fix. * Fix test. * Fix test. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-03-14	Support `fwd_diff(bwd_diff(f))`. (#2697)	Yong He
	* Support `fwd_diff(bwd_diff(f))`. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-02-16	Overhaul global inst deduplication and cpp/cuda backend. (#2654)	Yong He
	* Overhaul global inst deduplication and cpp/cuda backend. * Update IR documentation. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-01-30	Overhauled reverse-mode control flow handling (#2608)	Sai Praveen Bangaru
	* Added switch-case support; fixed non-diff parameter transposition * Made region propagation much more robust. Partial loop unzip implementation * WIP: Added most loop handling code, and a test. Still untested * Added CFG Normalization pass + CFG Reversal Pass + Loop Unzipping + most loop transcription * Add single-iter-loop test. * proj files * removed comments * Update reverse-loop.slang * Removed out-of-date code * Disabled IR validation during constructSSA phase of normalizeCFG. constructSSA now reuses sharedBuilder * Moved normalizeCFG() call to prepareFuncForBackwardDiff()
2023-01-11	Make backward differentiation work with generics. (#2586)	Yong He
	* Make backward differentiation work with generics. * Fix. * Another fix. * More fix. Co-authored-by: Yong He <yhe@nvidia.com>
2022-04-11	Refactor: eliminate BackEndCompileRequest (#2178)	Theresa Foley
	An earlier refactoring pass over the compiler codebase split the type that had been called `CompileRequest` into three distinct pieces: * `FrontEndCompileRequest` which was supposed to own state and options related to running the compiler front end and producing IR + reflection (e.g., what translation units and source files/strings are included). * `BackEndCompileRequest` which was supposed to own state and options related to running the compiler back end to translate the IR for a `ComponentType` (program) into output code. (Note that the `BackEndCompileRequest` was conceived of as orthogonal to the `TargetRequest`s, which store per-target and target-specific options.) * `EndToEndCompileRequest` which was an umbrella object that owns separate front-end and back-end requests, plus any state that is only relevant when doing a true end-to-end compile (such as the kinds of compiles initiated with `slangc`). As originally conceived, the only state that this type was supposed to own was stuff related to "pass-through" compilation, as well as state related to writing of generated code to output files. That refactoring work was very useful at the time, because it allowed us to "scrub" the back end compilation steps to remove all dependencies on front-end and AST state (this was important for our goals of enabling linking and codegen from serialized Slang IR). At this point, however, it is clear that the hierarchy that was built up serves very little purpose: * The `BackEndCompileRequest` type is only used in two places: * As part of an `EndToEndCompileRequest`, where the settings on the `BackEndCompileRequest` can be configured, but only through the `EndToEndCompileRequest` * As part of on-demand code generation through the `IComponentType` APIs. In this case, the settings stored on the `BackEndCompileRequest` are not accessible to the application at all, and will always use their default values, so that instantiating a "request" object doesn't really make any sense. * The `FrontEndCompileRequest` type has a similar situation: * Front-end compilation as part of an `EndToEndCompileRequest` supports user configuration of `FrontEndCompileRequest` settings, but only through the `EndToEndCompileRequest` * Front-end compilation triggered by an `import` or a `loadModule()` call does not support user configuration of settings at all. It will always derive all relevant settings from thsoe on the session ("linkage"). In addition, subsequent changes have been made to the compiler that show a bit of a "code smell" and/or forward-looking worries for this decomposition: * In some cases we've had to add the same setting to multiple types in the breakdown (front-end, back-end, end-to-end, linkage, target, etc.) which makes it harder for us to validate that all the possible mixtures of state work correctly. * Related to the above, in some cases we have manual logic that copies state from one of the objects in the breakdown to another, in order to ensure that the user's intention is actually followed. * As a forward-looking concern, it seems that developers have sometimes added new configuration options and state to places that don't really make sense according to the rationale of the original decomposition (e.g., we probably don't want to have a lot of state that is only available via end-to-end requests, given that the API structure is meant to push users away from end-to-end compiles). As a result of all of the above, I've been planning a large refactor with the following big-picture goals: * Eliminate `BackEndCompileRequest` * Move all relevant state/options from the back-end request to the end-to-end request, since that is the only place they could be set anyway. * Introduce a transient "context" type to be used for the duration of code generation that serves the main functions that back-end requests really served in the codebase * Make `EndToEndCompileRequest` be a subclass of `FrontEndCompileRequest` * Consider addding a transient "context" type for front-end compiles that can be used in `import`-like cases rather than needing a full front-end request object. If this works, then eliminate `FrontEndCompileRequest` and be back to world with just a single `CompileRequest` type * Move all compiler configuration options to a distinct type (named something like `CompilerConfig` or `CompilerOptions` or whatever) which stores setting as key-value pairs, and has a notion of "inheritance" such that one configuration can extend or build on top of another. Make all the relevant types use this catch-all structure instead of redundantly storing flags in many places. This change deals with the first of those bullets: removeal of `BackEndCompileRequest`. The addition of the `CodeGenContext` type is perhaps an unncessary additional step, but making that change helps clean up a bunch of the code related to per-target code generation, so I think it is the right choice. Co-authored-by: Yong He <yonghe@outlook.com>
2022-02-25	Improved SCCP, inlining and resource specialization passes, legalize ↵	Yong He
	`ImageSubscript` for GLSL (#2146)
2021-12-17	Cleanup refactoring work around the IR builder (#2061)	Theresa Foley
	* Cleanup refactoring work around the IR builder We have some long-term goals for the IR that require a more centralized and disciplined set of rules for how IR instructions get created/emitted. I had been working on trying to set things up so that all IR instruction creation goes through a single bottleneck point, but the non-trivial work in that branch was getting drowned out by the sheer volume of cleanup and refactoring changes. This change tries to pull together several of the more important cleanups. The big pieces are: * `IRBuilder` and `SharedIRBuilder` now protect their data members and rely on users to initialize them more directly via constructor of an `init()` method. This change affects a bunch of sites where `IRBuilder`s were created. I changed use sites to use the constructors whenever possible, and to use `init()` in cases where we had longer-lived builders that needed to be initialized multiple times. * The insertion location for the `IRBuilder` now uses an encapsulated type called `IRInsertLoc`. This new type can replace what used to be just two `IRInst` fields in the builder, and also covers some new functionality (if we ever want to take advantage of it). Very little client code cares about this change, but it is still a nice cleanup in terms of making things more explicit. The creation of an `IRModule` has been moded out of `IRBuilder`, because in practice we `IRBuilder` always wants to be associated with a pre-existing `IRModule` at creation time (via its `SharedIRBuilder`). There is now an `IRModule::create()` operation instead. This required changing the sequencing at many `IRModule` creation sites, since most had been contriving to make an `IRBuilder` first. There were also several cleanups because code had been carelessly using non-reference-counted pointers for `IRModule`s in ways that broke now that `IRModule::create()` always returns a `RefPtr`. * The core operations to actually allocate memory for IR instructions were moved into `IRModule` (since they interact with the memory pool that the module owns). These were called `createEmptyInst()` but have been renamed into `_allocateInst()`. In principle these seem like they should only be needed to be called by the `IRBuilder`, but in practice they are also needed by the IR deserialization logic. * A few core operations for emitting IR instructions that were associted with `IRBuilder` were moved to actually be methods on `IRBuilder`. First is `_findOrEmitConstant` which is the primary bottleneck for creating simple scalar constant values. Another is `_createInst` (formerly part of the templated `createInstImpl` along with `createInstWithSizeImpl`) which is the main bottleneck for allocation and initialization of any instruction other than a constant (well, the `IRModuleInst` is the other exception...). Finally, there is also `_maybeSetSourceLoc()`, which is obvious to scope inside the `IRBuilder` once it is protecting the source-location info. Notes: * The `minSizeInBytes` parameter to `_createInst()` might not actually be needed at all. At this point any `IRInst` subtypes that need data allocated for things other than their operands already get created manually via `_allocateInst` or `_findOrEmitConstant`, so I think we could remove that part. I will handle that in a subsequent cleanup if it turns out to be the case. * There is one IR pass (`slang-ir-string-hash.cpp`) that is using manual `_allocateInst()` instead of going through an `IRBuilder`. It could be easily cleaned up to not do so (and I will probably make that change down the line), but for now I wanted to avoid doing anything that wasn't close to pure refactoring if I could. * At this point in our design an `IRBuilder` is a very lightweight thing - it basically just owns the insertion location plus a source location to write into instructions. A lot of our code currently treats `IRBuilder`s like they are expensive and/or need to be re-used (which leads to them being used in more mutable/stateful ways). It is quite likely that as we clean up other aspects of the implementation of IR creation/emission we can make `IRBuilder` use feel more lightweight in ways that can streamline and simplify code. * The next step for this work is to identify the different paths that eventually lead to `_createInst()` being called, and unify them at a single bottleneck operation that can own the decisions around when to create an instruction vs. when to re-use an existing one (rather than those decisions being baked into the various `IRBuilder` subroutines that create instructions of the various subtypes). * fixup: gcc/clang C++ spec details
2020-11-04	Improve insertion location for "hoistable" instructions (#1593)	Tim Foley
	The Slang IR builder has a notion of "hoistable" instructions, which are basically those instructions that represent a pure side-effect-free operation on their operands, and which can and should be deduplicated. Most types are "hoistable" instructions. In order to make deduplication of hoistable instructions work, we need to emit them at the right location. Consider if we had: ```hlsl void myFunc<T>(...) { if(someCondition) { vector<T, 4> a = ...; ... } else { vector<T, 4> b = ...; } } ``` The IR instruction that represents `vector<T,4>` can't be inserted at the global scope, because then the parameter `T` would not be visible to it. That instruction also shouldn't be inserted into the same block that declares `a`, because then the instruction itself wouldn't be visible at the point where `b` is declared. The IR builder already has logic to pick the right parent instruction. In the example given, the IR instruction for `vector<T,4>` should be inserted into the body of the IR generic, but outside of the IR function that represents `myFunc`. The problem this change fixes is that while the logic was picking the parent for a hoistable instruction correctly, it wasn't putting much care into pick the insertion location. The existing strategy amounted to: * If the IR builder was set with an insertion location inside the chosen parent, then use that insertion location * Otherwise, insert at the end of the chosen parent Neither of those options is perfect. Either could lead to an instruction being inserted after one of its uses, and the second option could even lead to a type being inserted after the `return` instruction in a function/generic, which violates another structural invariant of our IR (that every block must end with a terminator, and terminators must only appear at the end of blocks). This change updates the rules as follows: * If the type of the instruction being created, or any of its operands are in the chosen parent, then insert immediately after whichever of those instructions is last in that parent. * Otherwise, insert before the first non-decoration, non-parameter child of the chosen parent The combined effect of these two rules is now that we insert any hoistable instruction as early as we can in its parent, without violating the structural validity rules. (One small exception to these rules is that if the parent is the module then we don't worry about ordering and just insert at the end, since order-of-declaration isn't significant at module scope in our IR) All of our existing tests work with this new behavior, although there could conceivably be future cases that lead to complicated breakage. For example, if a pass looks at the first "ordinary" instruction in a block and saves it to use as an insertion point for parameter, and then proceeds to manipulate code in the block before going back and inserting parameters at the chosen location, there is a chance that a hoistable instruction might have been inserted before the chosen insertion point, leading to a parameter being inserted after an ordinary instruction. In general, though, code that works like that would already be playing a dangerous game in that it is manipulating instructions in a block while assuming the first instruction will remain fixed. This change is currently just a refactor, but the underlying issue surfaced as a bug when I made other changes in a feature branch.
2019-05-31	Use slang- prefix on slang compiler and core source (#973)	jsmall-nvidia
	* Prefixing source files in source/slang with slang- * Prefix source in source/slang with slang- prefix. * Rename core source files with slang- prefix. * Update project files. * Fix problems from automatic merge.