summaryrefslogtreecommitdiffstats
path: root/source/slang/slang-ir.h
Commit message (Collapse)AuthorAge
* Addition of `Load`/`Store` coherent operations (#8395)16-Bit-Dog2025-10-10
| | | | | | | | | | | | | | | | | | | | | | | | Fixes: https://github.com/shader-slang/slang/issues/7634 Duplicate of PR https://github.com/shader-slang/slang/pull/8052 Primary Changes: * Added `storeCoherent` and `loadCoherent` for coherent load/store via pointers. This is backed by `IRMemoryScopeAttr` which is an `IRAttr` attached to `IRLoad` and `IRStore` * Logic in `source\slang\slang-emit-spirv.cpp` for load/store emitting has been reworked to be less messy and more maintainable * Add to `hlsl.meta.slang` coop vector and coop matrix coherent load/store operations Secondary Changes: * Added a missing load/store test for coop matrix: `tests\cooperative-matrix\load-store-pointer.slang` --------- Co-authored-by: ArielG-NV <aglasroth@nvidia.com> Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Nathan V. Morrical <natemorrical@gmail.com>
* Specialize interfaces in DebugFunction (#8617)Julius Ikkala2025-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | E.g. in [generic-extension-2.slang](https://github.com/shader-slang/slang/blob/master/tests/language-feature/extensions/generic-extension-2.slang), incorrect DebugFunctions are generated for `getFirstOuter`: ``` let %33 : Void = DebugFunction("getFirstOuter", 18 : UInt, 3 : UInt, %26, Func(Int, 0 : Int)) ``` This happens because specialization passes are leaving a `%IFoo` in the function type, instead of replacing with a concrete type: ``` let %34 : Void = DebugFunction("getFirstOuter", 18 : UInt, 3 : UInt, %26, Func(Int, %IFoo)) ``` and later, `cleanUpInterfaceTypes()` just replaces all interfaces with the literal zero. So now we have a parameter type which isn't actually a type at all, but an IntLit instead. I'm not sure if the approach I picked is good, though. Some other options that crossed my mind were: * Make `fixUpFuncType` also update related DebugFunctions - But is there a reason why DebugFunctions separately carry a function type in the first place? * Make `cleanUpInterfaceTypes` less aggressive or at least replace types with a type instead of a value - But this will still make the debug info incorrect :(
* Fix legalization crash when processing metal parameter blocks. (#8591)Yong He2025-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Closes #7606. When Slang compile for a bindful target, we will run the resource type legalization pass to hoist resource typed struct fields outside of the struct type and define them as global parameters and passing them around via dedicated function parameters. When we compile for a bindless target, we don't run this pass. However, Metal is a hybrid bindful and bindless target. We need to run type legalization for the constant buffer, but skip type legalization for parameter block. The previous attempt to support this behavior is to hack the type legalization pass to return `LegalVal::simple` when it sees a `ParameterBlock<T>`. However, whenever the code is accessing `parameterBlock.someNestedField`, the type of the nested field may get a `LegalType::tuple`, and now we will run into inconsistent scenarios where we have a `LegalVal::simple` on the operand val, and but the legalization logic is expecting that val to be a `LegalType::tuple`. This breaks a lot of assumptions and invariants in the type legalization pass, resulting unstable/fragile behavior. To systematically solve this problem, this change generalizes the existing legalize buffer element type pass to translate `ParameterBlock<Texture2D>` (and similar cases) to `ParameterBlock<Texture2D.Handle>`. So that such parameter block will always be legalized to `LegalType:::simple` during type legalization, and we will never run into any inconsistent cases. This allowed us to get rid of the hacky logic in the type legalization pass to try to workaround the inconsistencies.
* Rewriting the lower-buffer-element-type pass to avoid unnecessary ↵Yong He2025-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | packing/unpacking. (#8526) Part of the effort to improve the performance of generated SPIRV code. The existing lower-buffer-element-type pass works by loading the entire buffer element content from memory, and translate it to logical type stored in a local variable at the earliest reference of a buffer handle. This means that is can generate inefficient code that reads more than necessary. Consider this example: ``` struct BigStruct { bool values[1024]; } ConstantBuffer<BigStruct> cb; void test(BigStruct v) { if (v.values[0]) { printf("ok"); } } [numthreads(1,1,1)] void computeMain() { test(cb); } ``` In IR, the `computeMain` function before lower-buffer-element-type pass is something like following: ``` func test: %v = param : BigStruct %barr = fieldExtract(%v, "values") %element = elementExtract(%barr, 0) ... // uses %element func computeMain: %v = load(cb) call %test %v ``` The existing lower-buffer-element-type pass will rewrite the bool array in `BigStruct` into `int` array so it is legal in SPIRV. However, it does so by inserting the translation on the first `load` of the constant buffer: ``` struct BigStruct_std430 { int values[1024]; } var cb : ConstantBuffer<BigStruct_std430>; func computeMain: %tmpVar : var<BigStruct> call %unpackStorage(%tmpVar, cb) %v : BigStruct = load %tmpVar call %test %v ``` This means that the entire array will be loaded and translated to int, before calling `test`, which only uses one element. It turns out that the downstream compiler isn't always able to optimize out this inefficient translation/copy. This PR completely rewrites the way buffer-element-type lowering is handled to avoid producing this inefficient code. It works in two parts: first we turn on the `transformParamsToConstRef` pass for SPIRV target as well, so we will translate the `test` function to take the `v` parameter as `constref`. The second part is a redesigned buffer-element-type pass that defers the storage-type to logical-type translation until a value is actually used by a `load` instruction. In this example, after `transformParamsToConstRef`, the IR is: ``` func test: %v = param : ConstRef<BigStruct> %barr = fieldAddr(%v, "values") %elementPtr = elementAddr(%barr, 0) %element = load(%elementPtr) ... // uses %element func computeMain: call %test %cb ``` The new `buffer-element-type-lowering` pass will take this IR, and insert translation at latest possible time across the entire call graph, and translate the IR into: ``` func test: %v = param : ConstRef<BigStruct_std430> %barr = fieldAddr(%v, "values") %elementPtr : ptr<int> = elementAddr(%barr, 0) %element_int = load(%elementPtr) %element = cast(%element_int) : %bool ... // uses %element func computeMain: call %test %cb ``` In this new IR, there is no longer a load and conversion of the entire array. See new comment in `slang-ir-lower-buffer-element-type.cpp` for more details of how the pass works. This PR also address many other issues surfaced by turning on `transformParamsToConstRef` pass on SPIRV backend. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Diagnose on structured buffers containing resources (#8222)Ellie Hermaszewska2025-09-03
| | | closes https://github.com/shader-slang/slang/issues/3313
* [CBP] Pointer frontend changes + groupshared pointer support (#7848)ArielG-NV2025-08-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Resolves #7628 Resolves: #8197 Primary Goals: 1. Add `Access` to pointer 2. AddressSpace::GroupShared support for pointers (SPIR-V) 3. Add `__getAddress()` to replace `&` * `&` is not updated to `require(cpu)` since slangpy uses `&`. This means we must: (1) merge PR; (2) replace `&` with `__getAddress()`; (3) add `require(cpu)` to `&` Changes: * Added to `Ptr` the `Access` generic argument & logic (for `Access::Read`). * Moved the generic argument `AddressSpace` from `Ptr` to the end of the type. * Added pointer casting support between any `Ptr` as long as the `AddressSpace` is the same * Disallow globallycoherent T* and coherent T* * Disallow const T*, T const*, and const T* * Fixed .natvis display of `ConstantValue` `ValOperandNode` * Support generic resolution of type-casted integers * Added `VariablePointer` emitting for spirv + other minor logic needed for groupshared pointers Breaking Changes: * Anyone using the `AddressSpace` of `Ptr` will now have to account for the `Access` argument * we disallow various syntax paired with `Ptr` and `T*` --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Introduce CDataLayout & -fvk-use-c-layout (#8136)Julius Ikkala2025-08-21
| | | | | | | | | | | | | | | | Closes #8112. ~~The issue asks for a "C layout", but in this PR I use the term "CPU layout" because this naming was pre-existing in the codebase as `kCPULayoutRulesImpl_`. The primary purpose of this layout is to match CPU-side struct definitions with the shader side. I'm open to better naming suggestions, though.~~ Edit: switched back to using `CDataLayout` & `-fvk-use-c-layout`, as the CPU target depends on the object layout rules of existing CPU layout rules, but they're incompatible with actual shaders. So a new `kCLayoutRulesImpl_` was needed anyway. --------- Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
* Add CI to check ir module versioning (#7821)Ellie Hermaszewska2025-07-22
|
* Add utility to trace creation of problematic IRInsts to assist LLM in ↵Copilot2025-07-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | debugging (#7820) * Initial plan * Add SLANG_DEBUG_IR_BREAK environment variable support Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Apply code formatting to SLANG_DEBUG_IR_BREAK implementation Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Improve stack trace debugging with -rdynamic flag and backtrace_symbols Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Address PR feedback: use PlatformUtil::getEnvironmentVariable, remove -rdynamic flag, and delete fallback branch Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Address PR feedback: simplify env var parsing, move backtrace to PlatformUtil, use #if for SLANG_LINUX_FAMILY Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Address PR feedback: remove unneeded include, make backtrace() more generic by removing uid parameter Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Fix and clone source tracking. * Add python script to dump traces. * Update instructions. * Batch calls to addr2line * Cleanup claude instructions. * update claude action. * Remove duplicated build instructions from claude.yml workflow Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * fix build error. * Fix build errors --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com> Co-authored-by: Gangzheng Tong <tonggangzheng@gmail.com>
* slang: Add support for generating getters for IR struct defs. (#7725)Sruthik P2025-07-17
| | | | | | | | | | | | This change expands the IR struct definition generation logic in slang-ir.h.lua to code generate the getters for the operands of an IR. To facilitate the above, the schema for the IR definitions in slang-ir-insts.lua is updated to allow for explicit specification of the operands of an IR, with Fiddle code generating the getters for them. slang-ir.h is updated to remove the hardcoded getters of the IRs since they are now generated by Fiddle. Fixes part of #7185
* Perf improvements to IR serialization (#7751)Ellie Hermaszewska2025-07-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * option to use riff as serialization backend * option to use riff as serialization backend * perf * shuffle code * perf improvements to deserialization * formatting * remove bit_cast * correct IR verification * neaten serialized format * fix peek module info * formatting * remove temporary profiling code * cleanup * fix wasm build * more explicit sizes * deserialize via fossil on 32 bit wasm * Make serialized modules Int size agnostic * reorder stable names to allow range based check for 64 bit constants * format * review comments * fix build * fix * c++17 compat slang-common.h
* Stable names and backwards compat for serialized IR modules (#7644)Ellie Hermaszewska2025-07-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * stable names * tests, options and ci for stable names * Add back compat design document * fix warnings * formatting * comment * neaten * regenerate command line reference * consolidate ci scripts * faster ci * remove libreadline * Move new function to end of interface --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Use fossil for IR serialization (#7619)Ellie Hermaszewska2025-07-08
| | | | | | | | | | | | | | | | | | | | | | | * bottleneck ir module reading and writing * compute/simple working * more complex tests working * neaten * factor out SourceLoc serialization * document changes * Appease clang * Correct name serialization * remove unnecessary code * neaten * neaten
* extend fiddle to allow custom lua splices in more places (#7559)Ellie Hermaszewska2025-07-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add fkYAML submodule * Generate slang-ir-inst-defs.h from slang-ir-inst-defs.yaml * generate ir-inst-defs.h * neaten things * neaten inst def parser * add rapidyaml submodule * remove fkyaml * remove fkyaml submodule * remove use of ir-inst-defs.h * format and warnings * fix wasm build * tidy * remove rapidyaml * Extend fiddle to allow custom splices in more places * Use lua to describe ir insts * fix * neaten * neaten * neaten * spelling * neaten * comment comment out assert * merge
* remove unnecessary use of std::bit_cast (#7384)Ellie Hermaszewska2025-06-10
|
* Support tensor addressing (#7060)Jay Kwak2025-05-15
| | | | | | | | | | | This commit implements two new types and related Load/Store functions in CoopMat. tensor_addrressing.TensorLayout tensor_addressing.TensorView CoopMat.Load(..., TensorLayout) CoopMat.Load(..., TensorLayout, TensorView) CoopMat.Store(..., TensorLayout) CoopMat.Store(..., TensorLayout, TensorView) CoopMat.Load(..., TensorLayout, TensorView)
* support specialization constant sized array (#6871)kaizhangNV2025-05-14
| | | | | | | | | | | | | | | | | | | | | Close #6859 Goal of this PR We want to support an array whose size can be specialization constant for shared/global variable e.g. layout (constant_id = 0) const uint BLOCK_SIZE = 64; shared float buf_a[(BLOCK_SIZE + 5) * 4]; Overview of the solution: During IndexExpr check, we will loose the restriction to allow SpecConst passing, but the size parameter will not be a constant value because it cannot be folded into a constant, so we will make it follow the same logic as generic parameter value, and the size will be represented by FuncCallIntVal/PolynomialIntVal/DeclRefIntVal. During IR lowering, we will detect whether there is spec constant in the IntVal, and wrap the IRInst with a SpecConstRateType, and propagate the type though the lowering logic, such that the IntVal representing the array size will have SpecConstRateType. During spirv emit stage, if we detect that a IRInst has SpecConstRateType, we will emit it as SpecConstantOp. We have to implement new logic to emit OpSpecConstantOp, the existing emit logic doesn't support emitting OpSpecConstantOp, especially this op can embed arithmetic operation at global scope, where we can only emit arithmetic instruct at local. But there are only few instructs we need to support. Overview of the solution: This PR doesn't support generic, and we will create a separate PR to extend that, tracked in #6840.
* Fix SPIRV unsigned to signed widening casts (#7051)Darren Wihandi2025-05-09
| | | | | | | * Fix unsigned to signed casts for SPIRV * Add test * Fix ICE crash
* Update C++ standard to C++20 (#6980)Ellie Hermaszewska2025-05-06
| | | | | | | | | * Correct incorrect enum usage on metal * Update C++ standard to C++20 Closes https://github.com/shader-slang/slang/issues/6945 * use bit_cast
* Add IREnumType to distinguish enums from ints and each other (#6973)Julius Ikkala2025-05-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add IREnumType to distinguish enums from ints and each other * Add issue example as test * format code * Add expected test output * Fix peephole optimization hanging No idea why this PR triggered this, but there seems to have been a clear bug here anyway, so may just as well fix it now. * Move enum lowering later * Add linkage decoration to enum type * Use filecheck-buffer instead of expected.txt * Fix comment * Make enum casts actually use IR enum casts They were all BuiltinCasts by accident * Lower enum type before VM * Deal with rate-qualified types in enum cast * Allow any value marshalling for enum types * Handle new enum instructions in a couple more switches * Fix formatting --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Add cooperative matrix 1 support (#6565)Darren Wihandi2025-04-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * initial wip for spirv * working tiled example * clean up store and load * minor fixes * fix loadAny name * add initial tests, including broken/unimplemented intrinsics * fix subscript * run tests at 16x16, remove not supported arithmetic tests * minor fixups on implementation * rename CoopMatMatrixUse * Update tests to pass validation layers locally * Add mat-mul-add test and minor fixes * Add more tests * Remove dead code * Add coopMatLoad function and tests, enforce constexpr for matrix layout * Use getVectorOrCoopMatrixElementType in place of getVectorElementType
* Make IRWitnessTable HOISTABLE (#6417)Jay Kwak2025-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | # Make `IRWitnessTable` Hoistable ## Intention of the PR This commit makes `IRWitnessTable` Hoistable so that we can avoid duplicated `IRWitnessTable`. ## Problems This commit tries to address the following issues arise after turning `IRWitnessTable` into Hoistable: 1. A Hoistable instance is immutable. 2. When tries to create a duplicated child, you will get a previously created instance of `IRWitnessTable`, instead of a new one. 3. We don't actually want to hoist `IRWitnessTable`. 4. There can be only one instance of Hoistable and it cannot appear as childs multiple times. 5. Different import/export mangled names were used for the same Witness-table when its type is "enum" interface. ## Implementation ### Solution for "1. A Hoistable instance is immutable." `IRWitnessTable::setConcreteType()` is removed, because when an `IRInst` is Hoistable, it is treated as immutable. Any `IRInst::setXXX()` methods don't work anymore. There were two places calling `setConcreteType()` and their logic had to change little bit. `DeclLoweringVisitor::visitInheritanceDecl()` in `source/slang/slang-lower-to-ir.cpp` was calling `setConcreteType()`. It had a little strange logic around `lowerType()`. The `IRWitnessTable` was added with `context->setGlobalValue()` first and its `concreteType` was changed later. This commit works around in a way that it sets the parent of `IRWitnessTable` temporarily and reset it with the correct `IRWitnessTable`. Without this logic, it went into an infinite recursion. `AutoDiffPass::fillDifferentialTypeImplementation()` in `source/slang/slang-ir-autodiff.cpp` was calling `setConcreteType()`. It was changing the concreteType of `innerResult.diffWitness`. This commit creates a new `IRWitnessTable` and copies its `IRWitnessTableEntry`. ### Solution for "2. When tries to create a duplicated child, you will get a previously created instance of IRWitnessTable, instead of a new one" After a call to `IRBuilder::createWitnessTable()`, this commit checks if the returned `IRWitnessTable` is a brand new or not. If it is not a new one, we have to avoid adding the decorations and children. This commit decides when to add decorations and children based on whether `IRWitnessTable` has any of decorations or children already. It doesn't seem like a proper way to check. But when I tried, it was difficult to find a bottleneck point where the decorations and children are added to `IRWitnessTable` first time. Note that we are not trying to find when `IRWitnessTable` is created for the first time; we need to find if the decorations and children were added once. It might be fine to have duplicated `IRWitnessTableEntry` in most of the cases, but I noticed that it fails an assertion check when `shouldDeepCloneWitnessTable()` returns false in `cloneWitnessTableImpl()`. ### Solution for "3. We don't actually want to hoist IRWitnessTable." The reason why this commit makes `IRWitnessTable` is to prevent the duplicated instances of `IRInst`. But we don't really want to "Hoist" them. When an `IRWitnessTable` gets Hoisted out, it causes unexpected problems and the specialization process fails due to the missing `IRWitnessTable` in the input. This commit prevent from hoisting `IRWitnessTable` in `_replaceInstUsesWith()`. The way this is implemented feel little hack but we discussed on Slack and decided to go with this. One of the proper approaches could be to add a new flag in `IROpFlags` and have a new one like `kIROpFlag_Deduplicate`, which is different from just `kIROpFlag_Hoistable`. ### Solution for "4. There can be only one instance of Hoistable and it cannot appear as childs multiple times." When `IRWitnessTable` is Hoistable, there can be only a unique set of instances. And we cannot have an instance as a duplicated childs. It is because `IRInst` has only one set of `IRInst* next` and `IRInst* prev`. Before this commit, an instance of `IRGeneral` could have duplicated instances of `IRWitnessTable`. As an example, `IInteger` interface inherits two other interfaces, `IArithmetic` and `ILogical`. And they both inherits from `IComparable`. ``` interface IInteger : IArithmetic, ILogical {} interface IArithmetic : IComparable {} interface ILogical : IComparable ``` When we specialize it in `specializeGenericImpl()`, an `IRBlock` gets the following list of children: - IRWitnessTable for IComparable, - IRWitnessTable for IArithmetic, - IRWitnessTable for IComparable, - IRWitnessTable for ILogical, For the cloning during the specialize, "IRWitnessTable for `IComparable`" must be cloned before the cloning of "IRWitnessTable for `IArithmetic`". Because "IRWitnessTable for `IArithmetic`" refers "IRWitnessTable for `IComparable`" as its `IRWitnessTableEntry`. The order they appear in the `IRBlock` as children decides which instances will be cloned first. And "IRWitnessTable for `IComparable`" must appear before "IRWitnessTable for `IArithmetic`". Note that "IRWitnessTable for `IComparable`" appears twice, The first one was added for "IRWitnessTable for `IArithmetic`". And the second one is added for "IRWitnessTable for `ILogical`". With this commit "IRWitnessTable for `IComparable`" can appear as a child only once in `IRBlock`. So it causes an error if it gets the following list: - IRWitnessTable for IArithmetic, - IRWitnessTable for IComparable, - IRWitnessTable for ILogical, In order to resolve the problem, "IRWitnessTable for `IComparable`" must appear before both "IRWitnessTable for `IArithmetic`" and "IRWitnessTable for `ILogical`" as following: - IRWitnessTable for IComparable, - IRWitnessTable for IArithmetic, - IRWitnessTable for ILogical, To address the problem, the instances of `IRWitnessTable` is always added to the end of the children list. If it is already added to the list, we don't move. This works out because the AST tree is built based on the dependencies. ### Solution for "5. Different import/export mangled names were used for the same Witness-table when its type is "enum" interface." This issue was found while testing with Falcor tests where it uses Conformance-type feature of Slang. We are using different import and export mangled names for a same Witness-table when the witness-table is for "Enum" interface. The way we simplify the implementation of "Enum" causes a problem when it comes to generate export/import for the witness-table. And the exact repro step is still unclear. There were two suggested solutions for the problem and this PR adopted the first option for now. Maybe we want to improve it with the second option later. option 1, when we produce mangled names for those witness-table, we can use a mangled name with the underlying "int" type instead of the name of the enum type. In this way, all witness-tables for enum types whose underlying type is same will get the same mangled name. It will allow us to deduplicate the witness-table during the linking. option 2, we can preserve type info for enum type when generating IR. We can still erase all other uses of the type info of enum types for now. But when we generate the witness-table, instead of filling the conforming type operand to IntType, we fill it as EnumType(IntType) where EnumType is a new global IROp code to represent all enum types (like InterfaceType/StructType). This way the operands for the two witness-tables will be different. "option 1" is more quick and dirty and "option 2" is more proper way to address it. I should go with "option 1" and improve it with "option 2" approach later.
* Add -dump-module command to slangc (#6638)cheneym22025-03-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add -dump-module command to slangc The new -dump-module command to slangc will load and disassemble a slang module, similar to what would be seen by the -dump-ir command, except that -dump-ir tells slangc to print IR as it performs some compilation command. That is, -dump-ir requires some larger compilation task. -dump-module on the otherhand requires no additional goal and will simply load a module and print its IR to stdout independently from other compilation steps. Its intended purpose is to inspect .slang-module files on disk. It can also be used on .slang files which will be parsed and lowered if slang does not find an associated ".slang-module" version of the module on disk. The compilation API is extended with a new IModule::disassemble() method which retrieves the string representation of the dumped IR. Closes #6599 * format code * Use FileStream not FILE * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
* Update SPIRV-Tools and fix new validation errors. (#6511)Yong He2025-03-06
| | | | | | | * Update SPIRV-Tools and fix new validation errors. * Implement pointers for glsl target. * Reworked packStorage/unpackStorage code gen to operate on pointers rather than values.
* Improve performance when compiling small shaders. (#6396)Yong He2025-02-23
| | | | | | | Improve performance when compiling small shaders. Avoid copying witness table entries that are not getting used during linking. Avoid copying auto-diff related decorations and derivative functions during linking, if the user modules doesn't use autodiff. Cache operator overload resolution results on global session, so each new Session doesn't need to repetitively run through overload resolution from scratch.
* Support cooperative vector (#6223)Jay Kwak2025-01-30
| | | | | | | * Support cooperative vector without Vulkan-header update Adding a Slang support for cooperative vector. But this commit doesn't have Vulkan-header update.
* Initial implementation of SP#015 `DescriptorHandle<T>`. (#6028)Yong He2025-01-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Initial implementation of `ResourcePtr<T>`. * Update docs * Fix build error. * Add more discussion. * Update documentation. * Update TOC. * Fix. * Fix. * Add test case for custom `getResourceFromBindlessHandle`. * Add namehint to generated descriptor heap param. * Fix. * Fix. * format code * Rename to `DescriptorHandle`, and add `T.Handle` alias. * Fix compiler error. * Fix. * Fix build. * Renames. * Fix documentation. * Documentation fix. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* add missing IR_LEAF_ISA for MetalMeshType (#5975)Darren Wihandi2024-12-31
| | | Co-authored-by: Yong He <yonghe@outlook.com>
* Add datalayout for constant buffers. (#5608)Yong He2024-11-21
| | | | | | | | | | | | | | | | | | | * Add datalayout for constant buffers. * Fixes. * Fix test. * Fix glsl codegen. * Update spirv-specific doc. * Fix test. * Fix binding in the presense of specialization constants. * address comments. * Add a test for constant buffer layout.
* Move switch statement bodies to their own lines (#5493)Ellie Hermaszewska2024-11-05
| | | | | | | | | * Move switch statement bodies to their own lines * format --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Write only texture types. (#5454)Yong He2024-10-30
| | | | | | | | | | | | | | | | * Add support for write-only textures. * Fix capabilities. * Fix implementation. * Fix. * format code --------- Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* formatEllie Hermaszewska2024-10-29
| | | | | | | * format * Minor test fixes * enable checking cpp format in ci
* Replace the word stdlib or standard-library with core-module for source code ↵Jay Kwak2024-10-28
| | | | | (#5415) This commit changes the word "stdlib" or "standard library" to "core module" in the source code.
* Initial `Atomic<T>` type implementation. (#5125)Yong He2024-09-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Initial Atomic<T> type implementation. * Update design doc. * Fix. * Add test. * Fixes and add tests. * Fix WGSL. * Fix glsl. * Fix metal. * experiemnt with github metal. * experiment github metal 2 * github metal experiment 3 * experiment with github metal 4. * experiment with metal 5. * experiment 7. * metal experiment 8. * Fix metal tests. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Support `IDifferentiablePtrType` (#5031)Sai Praveen Bangaru2024-09-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * initial diff-ref-type interface * Initial support for `IDifferentiablePtrType` * Fix unused vars * More tests + fix switch case fallthrough. * Update slang-ir-autodiff.cpp * Update diff-ptr-type-loop.slang * Add optimization to allow more complex pair types * Update slang-ir-autodiff-primal-hoist.cpp * Update diff-ptr-type-loop.slang * Update slang-ir-autodiff-primal-hoist.cpp * More fixes to address reviews * Update slang-check-expr.cpp * Optimizations + rename `differentiableRefInterfaceType` -> `differentiablePtrInterfaceType` * Move pair logic to ir-builder, unify the type dictionaries. --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Support mixture of precompiled and non-precompiled modules (#4860)cheneym22024-08-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Support mixture of precompiled and non-precompiled modules This changes the implementation of precompile DXIL modules to accept combinations of modules with precompiled DXIL, ones without, and ones with a mixture of precompiled DXIL and Slang IR. During precompilation, module IR is analyzed to find public functions which appear to be capable of being compiled as HLSL, and those functions are given a HLSLExport decoration, ensuring they are emitted as HLSL and preserved in the precompiled DXIL blob. The IR for those functions is then tagged with a new decoration AvailableInDXIL, which marks that their implementation is present in the embedded DXIL blob. The DXIL blob is attached to the IR as before, inside a EmbeddedDXIL BlobLit instruction. The logic that determines whether or not functions should be precompiled to DXIL is a placeholder at this point, returning true always. A subsequent change will add selection criteria. During module linking, the full module IR is available, as well as the optional EmbeddedDXIL blob. The IR for functions implemented by the blob are tagged with AvailableInDXIL in the module IR. After linking the IR for all modules to program level IR, the IR for the functions marked AvailableInDXIL are deleted from the linked IR, prior to emitting HLSL and compiling linking the result. This change also changes the point of time when the module IR is checked for EmbeddedDXIL blobs. Instead of happening at load time as before, it happens during immediately before final linking, meaning that the blob does not need to be independently stored with the module separate from the IR as was done previously. Work on #4792 * Clean up debug prints * Call isSimpleHLSLDataType stub * Address feedback on precompiled dxil support Allow for IR filtering both before and after linking. Only mark AvailableInDXIL those functions which pass both filtering stages. Functions are corrlated using mangled function names. Rather than delete functions entirely when linking with libraries that include precompiled DXIL, instead convert the IR function definitions to declarations by gutting them, removing child blocks. * Use artifact metadata and name list instead of linkedir hack * Use String instead of UnownedStringSlice * Update tests * Renaming * Minor edits * Don't fully remove functions post-link * Unexport before collecting metadata
* Metal: Mesh Shaders (#4280)Dynamitos2024-08-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Metal: mesh shading skeleton * Metal: fixing mesh payload * Metal: improving mesh shader indices output * Metal: Implementing conditional mesh output set * Metal: Trying to not break other backends * Metal: trying to fix mesh output set * Metal: Fixing MeshOutputSet usages * Metal: Fixing vertex and primitive semantics * Metal: Fixing code style * Metal: Fixed hlsl indices set * Fixed HLSL mesh output set disappearing and GLSL mesh output crashing * Metal: Adjusting task test matching --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Implement `-fvk-use-dx-layout` (#4912)ArielG-NV2024-08-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Implement `-fvk-use-dx-layout` Fixes: #4126 Changes: * Added fvk-use-dx-layout * Modified `HLSLConstantBufferLayoutRulesImpl` for correctness (ex: Array is always 16 byte aligned) * Added kFXCShaderResourceLayoutRulesFamilyImpl and kFXCConstantBufferLayoutRulesFamilyImpl to handle fvk-use-dx-layout * Added `ConstantBufferLayoutRules` to manage constant buffer rules * Added `alignCompositeElementOfNonAggregate`/`alignCompositeElementOfAggregate` to handle forced alignment of composites for ConstantBuffers * `StructuredBuffer` rules are mostly equal to `scalar` layout, not much was needed to be changed to support this behavior. * seperate legacy constant buffer and how Slang does constant-buffer normally * undo an addition * remove accidental test * Address review and fix Address review and remove GLSL support since GLSL requires a seperate legalization (need to linearlize structs like with `legalizeMetalIR` to assign explicit offsets) * comments * remove aggregate and non-aggregate logic We don't need this distinction for the logic --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Make variadic generics work with interfaces and forward autodiff. (#4905)Yong He2024-08-23
|
* Tuple swizzling, concat, comparison and `countof`. (#4856)Yong He2024-08-19
| | | | | | | | | | | * Tuple swizzling and element access. * Update proposal status. * Cleanup. * Fix merrge error. * Address review.
* Variadic Generics Part 2: IR lowering and specialization. (#4849)Yong He2024-08-18
| | | | | | | | | * Variadic Generics Part 2: IR lowering and specialization. * Update design doc status. * Update design doc. * Resolve review comments.
* Initial support for precompiled DXIL in slang-modules (#4755)cheneym22024-08-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add embedded precompiled binary IR ops Add IR operations to embed precompiled DXIL or SPIR-V blobs into IR. Adds a BlobLit literal that is mostly identical to StringLit except for its inability to be displayed, e.g. in dumped IR. In the future, the blob might be dumped as hexadecimal, but for now it is summarized as "<binary blob>". * EmbeddedDXIL and SPIR-V options The options, '-embed-dxil' and '-embed-spirv' in slangc, will cause a target dxil or spirv to be compiled and stored in the translation unit IR when written to a slang-module. Subsequent changes actually implement the options. * Per-translation unit DXIL precompilation When -embed-dxil is specified, perform a precompilation to DXIL of each TU, linked only with stdlib. Embed the resulting DXIL for the TU in a IR op. Being part of IR, the precompiled DXIL can be serialized to disk in a slang-module. Upon loading slang-modules, the new IR op will be searched for and the precompiled DXIL blob is saved with the loaded Module. During linking, if all the Modules have precompiled blobs they will be sent to the downstream compile commands as libraries instead of source, skipping the downstream compilation, using DXC only for linking. Fixes Issue #4580 * Remove placeholder embedded SPIRV support Code was added only to sketch out how other precompiled bins will be supported. * Remove the rest of the SPIRV placeholder support * Fix warnings, test error on non-windows * Remove lib_6_6 hack, add dxil_lib capability * Allocate blob value from irmodule memarena * Add null check after memarena allocation * Restore the request->e2erequest code path for generatewholeprogram * Update capability handling, move EmbedDXIL enum to end to preserve abi * Remove lib_6_6 hack * Move ICompileRequest functions to end
* Overhaul IR lowering of pointer types. (#4710)Yong He2024-07-25
| | | | | | | | | | | | | | | * Overhaul IR lowering of pointer types. * Propagate address space in IRBuilder. * Fixup. * Fix. * Fix. * Change how Ptr type is printed to text. * Fix.
* Add generic descriptor indexing intrinsic (#4389)dubiousconst2822024-07-24
| | | | | | | | | | | | | | | | | | | | | | | * Add ResourceArray intrinsic type * Move aliased parameter generation to GLSL legalization * Add DynamicResourceEntry type for proxying layout of GenericResourceArray * Reimplement as DynamicResource * Add reflection test * Don't reuse alias cache between different parameters * Add dynamic cast extensions for buffer types * Minor format fix * Fix VarDecl diagnostics after finding non-appliable initializer candidates --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Specialize address space during spirv legalization. (#4600)Yong He2024-07-10
| | | | | | | | | | | * Specialize address space during spirv legalization. * Fix. * Fix building doc. * Fix cmake. * Update assert.
* Metal Task Shader payload (#4238)Dynamitos2024-06-02
|
* Prevent hoisting of non-hoistable instructions in non-function values with ↵Ellie Hermaszewska2024-06-01
| | | | code (#4250)
* Add options to speedup compilation. (#4240)Yong He2024-05-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | * Add options to speedup compilation. * Fix. * Plumb options to DCE pass. * Revert debug change. * Fix regressions. * More optimizations. * more cleanup and fixes. * remove comment. * Fixes. * Another fix. * Fix errors. * Fix errors. * Add comments.
* Adds functionality to dump IR to stdout (#4065)Sriram Murali2024-05-01
| | | | | Adds a member dump() to IRInst that can writes the immediate value or IR inst value to stdout to help with debugging
* Metal: rewrite global variables as explicit context. (#3981)Yong He2024-04-18
| | | | | * Metal: rewrite global variables as explicit context. * Small tweaks.