summaryrefslogtreecommitdiffstats
path: root/source/slang/slang-ir-util.cpp
Commit message (Collapse)AuthorAge
* meowyum2025-10-31
|
* non-exported public labels get namespaced nowyum2025-10-28
|
* Immutable access qualifier for pointers and use `__ldg` on cuda. (#8710)Yong He2025-10-16
| | | | | | | | | | | | | | | | | | | | | | | | This PR implements `Access.Immutable` to allow pointers to immutable data. The new type `ImmutablePtr<T>` is defined as an alias of `Ptr<T, Address.Immutable>`. By forming a immutable pointer, the programmer is conveying to the compiler that the data at the pointer address will never change during the execution of the current program. Therefore loads from immutable pointers can be deduplicated by the compiler, and will translate to `__ldg` when generating code for CUDA. The SPIRV backend is not changed in this PR, since the current SPIRV spec makes it very difficult to specify loads from immutable address without generating tons of wrappers and boilerplate type declarations. We would like to see the spec evolved a bit to around its support of `NonWritable` physical storage pointers or immutable loads before we attempt to express such immutability in SPIRV. For now we simply emit ordinary pointers and loads when generating spirv. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Clean up Slang IR representation of undefined values (#8708)Theresa Foley2025-10-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to this change, the Slang IR used a single opcode (`kIROp_Undefined`) to encode all cases of undefined values. The particular motivation for this change was a need to distinguish those undefined values that represent a load from an uninitialized memory location versus other sorts of undefined values. If transforming a variable into SSA form results in `undefined` values in cases where the a `load` was executed without a prior `store`, that represents an error on the programmer's part, and should be diagnosed. However, other cases of undefined values can arise during program transformation and optimization, and should not typically result in diagnostics being emitted. While it was not the original motivation for this change, it is also worth noting that the LLVM project has transitioned from initially using only a single `undef` instruction to having a more nuanced model, and the same factors that motivated their shift also apply to the Slang IR. Counter-intuitively, the semantics of undefined values actually need to be carefully defined. Concretely, this change splits the pre-existing `undefined` opcode into two sub-cases: - `kIROp_LoadFromUninitializedMemory`, to represent the case of loading from a memory location (such as a local variable) that has not been initialized. - `kIROp_Poison`, corresponding to the LLVM `poison` value. Our poison instruction is intended to have semantics comparable to LLVM's equivalent. Conceptually, any operation that is invoked with a poison value as input will (with a few exceptions) produce a poison value as output. One can think of the behavior of `poison` as similar to how not-a-number values propagate in floating-point computations: by default they "infect" the result of any computation they are involved in. This semantic choice helps to ensure that many optimizations end up being correct in the presence of undefined values, even if they did not specifically account for them. The `kIROp_LoadFromUninitializedMemory` case is comparable to the combination of `freeze` and `undef` in LLVM. An LLVM `undef` value has semantics that allow *each* use of that value to be replaced with a *different* arbitrary value; these semantics cause many optimizations to only be correct in the absence of undefined values. An LLVM `freeze` instruction can take an undefined value as input, and produces a single value that is still arbitrary, but must be consistent across all uses. The latter semantics are what we want, since a given `load` from an uninitialized memory location will yield an arbitrary-but-fixed value. Note that we intentionally do not have a direct analogue to LLVM's `undef` instruction, because of the way that `undef` causes so many complications when trying to write optimizations. We also do not add a `kIROp_Freeze` instruction in this change, but that is simply because we currently have no need for it. Existing code that was creating `IRUndefined` values has been updated to create either `IRPoison` or `IRLoadFromUninitializedMemory` values, as appropriate to the use case. Code that was checking for the `kIROp_Undefined` opcode has been updated to either check for both of the new opcodes (in the case of `switch` statements), or to use `as<IRUndefined>` to perform a dynamic cast to the common base type of the two new instructions. Note that this change does not alter the way that instructions representing undefined values are typically emitted as ordinary instructions in the block that produces an undefined value. While emitting `IRLoadFromUninitializedMemory` as an ordinary instruction is exactly what we want, the `IRPoison` case would actually be better represented in Slang IR as a "hoistable" instruction, so that there would only be a singular `poison` value of each type. Changing `IRPoison` to be hoistable would be a good follow-up change, but might run into more challenges depending on what assumptions (if any) the codebase is making about where undefined values get emitted. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Fix a bug that causes a struct field to be initialized twice. (#8619)Yong He2025-10-07
| | | | | | | | | | | | | | | | | | | | | | We insert field initialization logic at the beginning of every ctor in `synthesizeCtorBody`, but then immediately inserts another round of initialization again for explicit ctors in `maybeInsertDefaultInitExpr`, both called from `SemanticsDeclBodyVisitor::visitAggTypeDecl` right next to each other. The fix is to remove `maybeInsertDefaultInitExpr`. This change also enhances the address aliasing analysis, so that for the following case: ``` this->member1 = 0; this->member2 = 0; this->member1 = param; ``` We can still remove the first assignment to `this->member1` despite seeing `this->member2=0`, since it is easy to know that `this->member2` cannot alias with `this->member1`. Closes #8600.
* Rename some symbols related to pointers types (#8592)Theresa Foley2025-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note that while this change touched a large numer of files, there are no changes to functionality being made here. The only things being done are renaming various symbols and, in a few cases, updating or adding comments for consistency with the new names. The core of the naming changes are: * Most things named to refer to `OutType` (e.g., `IROutType`, `IRBuilder::getOutType()`, etc.) have been consistently renamed to refer to `OutParamType`, to emphasize that the relevant AST/IR node types are only intended for use to represent `out` parameters. * The same change as described above for `OutType` is also made for `RefType`, which becomes `RefParamType` in most cases. One mess that this exposes is the way that the `ExplicitRef<T>` type in the core module currently lowers to `IRRefParamType`. This change sticks to the rule of not making functional changes, so that mess is left as-is for now. * Names referring to `InOutType` have been changed to instead refer to `BorrowInOutType`. The intention with this naming change is to emphasize that the Slang rules for `inout` are semantically those of a borrow (or at least our interpretation of what a borrow means). * Names referring to `ConstRefType` have been changed to instead refer to `BorrowInType`. This change starts work on clarifying that the existing `__constref` modifier was never intended to be a read-only analogue of `__ref`, and instead is the input-only analogue of `inout`. * The `ParameterDirection` enum type has been changed to `ParamPassingMode`, to reflect the fact that the concept of "direction" fails to capture what is actually being encoded, particularly once we have modes beyond simple `in`/`out`/`inout`. While this change does not alter behavior in any case (the user-exposed Slang language is unchanged), it is intended to set up subsequence changes that will work to make the handling of these types in the compiler more nuanced and correct. Breaking this part of the change out separately is primarily motivated by a desire to minimize the effort for reviewers. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Relax the inst definition order rule (#8588)kaizhangNV2025-10-02
| | | | | | | | | | | | Close #8572. The root cause of the issue is that in `_replaceInstUsesWith` call, if the use of the inst is a generic parameter, and the inst is the data type of that generic parameter, we could end up of moving the data type before the generic parameter. This will break the layout of generic parameters, where all the generic parameters should be laid consecutively at the beginning of the first block of the generic. Therefore, we don't make that relocation for such case.
* Enhance buffer load specialization pass to specialize past field extracts. ↵Yong He2025-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (#8547) This allows us to specialize functions whose argument is a sub element of a constant buffer, instead of being only applicable to entire buffer element. Closes #8421. This change also implements a proper heuristic to determine when to specialize the calls and defer the buffer loads. This PR addresses a pathological case exposed in `slangpy\slangpy\benchmarks\test_benchmark_tensor.py`, which used to take 27ms to finish, and now takes 1.25ms. For example, given: ``` struct Bottom { float bigArray[1024]; [mutating] void setVal(int index, float value) { bigArray[index] = value; } } struct Root { Bottom top[2]; [mutating] void setTopVal(int x, int y, float value) { top[x].setVal(y, value); } } RWStructuredBuffer<Root> sb; [shader("compute")] [numthreads(1, 1, 1)] void compute_main(uint3 tid: SV_DispatchThreadID) { sb[0].setTopVal(1, 2, 100.0f); } ``` We are now able to specialize the call to `setTopVal` into: ``` void compute_main(uint3 tid: SV_DispatchThreadID) { setTopVal_specialized(0, 1, 2, 100.0f); } void setTopVal_specialized(int sbIdx, int x, int y, float value) { Bottom_setVal_specialized(sbIdx, x, y, value); } void Bottom_setVal_specialized(int sbIdx, int x, int y, float value) { sb[sbIdx].top[x].bigArray[y] = value; } ``` And get rid of all unnecessary loads. Achieving this requires a combination of function call specialization and buffer-load-defer pass. The buffer-load-defer pass has been completely rewritten to be more correct and avoid introducing redundant loads. This PR also adds tests to make sure pointers, bindless handles, and loads from structured buffer or constant buffers works as expected.
* Rewriting the lower-buffer-element-type pass to avoid unnecessary ↵Yong He2025-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | packing/unpacking. (#8526) Part of the effort to improve the performance of generated SPIRV code. The existing lower-buffer-element-type pass works by loading the entire buffer element content from memory, and translate it to logical type stored in a local variable at the earliest reference of a buffer handle. This means that is can generate inefficient code that reads more than necessary. Consider this example: ``` struct BigStruct { bool values[1024]; } ConstantBuffer<BigStruct> cb; void test(BigStruct v) { if (v.values[0]) { printf("ok"); } } [numthreads(1,1,1)] void computeMain() { test(cb); } ``` In IR, the `computeMain` function before lower-buffer-element-type pass is something like following: ``` func test: %v = param : BigStruct %barr = fieldExtract(%v, "values") %element = elementExtract(%barr, 0) ... // uses %element func computeMain: %v = load(cb) call %test %v ``` The existing lower-buffer-element-type pass will rewrite the bool array in `BigStruct` into `int` array so it is legal in SPIRV. However, it does so by inserting the translation on the first `load` of the constant buffer: ``` struct BigStruct_std430 { int values[1024]; } var cb : ConstantBuffer<BigStruct_std430>; func computeMain: %tmpVar : var<BigStruct> call %unpackStorage(%tmpVar, cb) %v : BigStruct = load %tmpVar call %test %v ``` This means that the entire array will be loaded and translated to int, before calling `test`, which only uses one element. It turns out that the downstream compiler isn't always able to optimize out this inefficient translation/copy. This PR completely rewrites the way buffer-element-type lowering is handled to avoid producing this inefficient code. It works in two parts: first we turn on the `transformParamsToConstRef` pass for SPIRV target as well, so we will translate the `test` function to take the `v` parameter as `constref`. The second part is a redesigned buffer-element-type pass that defers the storage-type to logical-type translation until a value is actually used by a `load` instruction. In this example, after `transformParamsToConstRef`, the IR is: ``` func test: %v = param : ConstRef<BigStruct> %barr = fieldAddr(%v, "values") %elementPtr = elementAddr(%barr, 0) %element = load(%elementPtr) ... // uses %element func computeMain: call %test %cb ``` The new `buffer-element-type-lowering` pass will take this IR, and insert translation at latest possible time across the entire call graph, and translate the IR into: ``` func test: %v = param : ConstRef<BigStruct_std430> %barr = fieldAddr(%v, "values") %elementPtr : ptr<int> = elementAddr(%barr, 0) %element_int = load(%elementPtr) %element = cast(%element_int) : %bool ... // uses %element func computeMain: call %test %cb ``` In this new IR, there is no longer a load and conversion of the entire array. See new comment in `slang-ir-lower-buffer-element-type.cpp` for more details of how the pass works. This PR also address many other issues surfaced by turning on `transformParamsToConstRef` pass on SPIRV backend. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Fix for Generic Function Redefinition Error (#7891)Gangzheng Tong2025-07-25
| | | | | | | | | * emit literal values in getTypeNameHint for bool, str etc. * add test for specializing generics with bool literals * fix build error * add specializing with Enum type test
* Lower int/uint/bool matrices to arrays for SPIRV (#7687)venkataram-nv2025-07-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add tests for expected behaviour * Allow matrix types in logical or/and * Legalize int/bool matrix types and construction with makeMatrix * Legalize uint matrices and operations * Limit testing to only SPIRV * Better tests for int and bool * Add test for uint * Remove GLSL tests * Remove old test for diagnosing int matrices * Emit SPIRV directly in tests * format code * Address PR comments * Improve testing * Address PR comments * format code * Add tests for matrix intrinsic operations * Move matrix lowering to dedicated legalization pass * Fix compiler warning * Remove signal again * Reorder matrix and vector legalization * Fix formatting * Add shift and comparison tests --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Fix duplicate mangled names for interface requirements (#7764)Yong He2025-07-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Fix duplicate mangled names for interface requirements Remove linkage decorations from interface method requirement values to prevent duplicate mangled names and allow DCE to clean up unused functions. Interface requirements only need the type information, not the linkage, so the generated IRFunc with export decorations was causing conflicts. Fixes #7761 Co-authored-by: Yong He <csyonghe@users.noreply.github.com> * Move linkage decoration removal into default case of switch Move the removeLinkageDecorations call into the default case of the switch statement so it only applies to general requirements and not special cases like associated types. Co-authored-by: Yong He <csyonghe@users.noreply.github.com> * Remove conditional check for linkage decoration removal Apply removeLinkageDecorations unconditionally to interface requirement values to ensure all requirement values have their linkage decorations removed. Co-authored-by: Yong He <csyonghe@users.noreply.github.com> * Change removeLinkageDecorations parameter type from IRGlobalValueWithCode* to IRInst* This fixes build errors by allowing the function to accept the broader IRInst* type, making it compatible with the unconditional call in the interface requirement processing. Co-authored-by: Yong He <csyonghe@users.noreply.github.com> * format code (#7787) Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Yong He <csyonghe@users.noreply.github.com> Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Replace [KnownBuiltin] string-based comparisons with enum-based system (#7714)Copilot2025-07-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Initial plan * Implement enum-based KnownBuiltin system to replace string comparisons Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Add test for enum-based KnownBuiltin system and verify functionality Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Implement enum-based KnownBuiltin system with direct integer values Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Fix IntVal access and update tests for new enum-based KnownBuiltin system Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Replace hardcoded KnownBuiltin integers with preprocessor enum syntax - Updated all KnownBuiltin attributes to use $( (int)KnownBuiltinDeclName::EnumValue) syntax - Added space between parentheses to avoid preprocessor bug: $( (int) instead of $((int) - Updated both core.meta.slang and hlsl.meta.slang files - Eliminates preprocessor-time integer conversion, baking enum values directly into meta files - Maintains same functionality while using type-safe enum references Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Fix IDifferentiablePtr KnownBuiltin mapping regression Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Remove unused IDifferentiablePtrType enum case from KnownBuiltinDeclName Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Clean up temporary AST dump files from testing Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Replace hardcoded integer with descriptive constant in KnownBuiltin test Replace the hardcoded [KnownBuiltin(0)] with a descriptive named constant GEOMETRY_STREAM_APPEND_BUILTIN to improve code readability and maintainability. The test now clearly indicates which builtin enum value is being tested. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Gangzheng Tong <gtong-nv@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Gangzheng Tong <gtong-nv@users.noreply.github.com> Co-authored-by: Gangzheng Tong <tonggangzheng@gmail.com>
* Fix for emitting ArrayStride decoration for arrays of opaque types (#7568)Jerran Schmidt2025-07-02
| | | | | | | | | * WIP opaque type decoration fix * Clearer intent * Formatting * Added test for fix
* extend fiddle to allow custom lua splices in more places (#7559)Ellie Hermaszewska2025-07-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add fkYAML submodule * Generate slang-ir-inst-defs.h from slang-ir-inst-defs.yaml * generate ir-inst-defs.h * neaten things * neaten inst def parser * add rapidyaml submodule * remove fkyaml * remove fkyaml submodule * remove use of ir-inst-defs.h * format and warnings * fix wasm build * tidy * remove rapidyaml * Extend fiddle to allow custom splices in more places * Use lua to describe ir insts * fix * neaten * neaten * neaten * spelling * neaten * comment comment out assert * merge
* Fix for OpUConvert producing invalid opcode when to/from signs differ (#7398)Jerran Schmidt2025-06-26
| | | | | | | | | | | | | | | | | * Fix for OpUConvert outputting scalar type for mixed sign vector type conversions * Fix compiler warning * Added regression test for UConvert vector fix * Formatting * getUnsignedType helper * Formatting * Fix for addtional int types * Helper function to convert signed type to unsigned type
* Fix for missing signedness cast in SwizzleIR (#7448)Jerran Schmidt2025-06-16
| | | | | | | | | * Cast if there is a signedness mismatch on the swizzle * Move isSignedType to slang-util and add test --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Fix SPIRV `OpSpecConstantOp` emit (#7158)Darren Wihandi2025-05-29
| | | | | | | | | | | | | * Fix SPIRV specialization constant with floating-point operations * Improve test * WIP * Restrict `OpSpecConstantOp` allowed operations based on SPIRV specifications * Fix typo on floating type check * Emit error on float to int spec cosnt int val casts
* Implement spec const for generic parameter (#7121)kaizhangNV2025-05-15
| | | | | | | | | | | | | | | | | | Close #6840. This PR add supports to use specialize constant in generic parameter, and that parameter can also be used as array size, e.g. following code should work: ``` struct MyStruct<let N: int> { float buffer[N]; } MyStruct<SpecConstVar> s; ``` - Loose the restriction from Link-Time to SpecializationConstant when extract generic argument - Tweak the logic of how we decide whether a inst is hoistable. Besides checking existing hoistable flag of each IRInst, when we detect a IRInst's type is SpecConstRateType, we will treat that inst hoistable. Because IRInst in global scope can be deduplicated, and every SpecConstRateType inst should be in the global scope or IRGeneric scope (which will be at global scope after specialization). - Remove the SpecConstIntVal to IRInst map in IR lowering logic, because we already have way to deduplicate the global scope IR.
* support specialization constant sized array (#6871)kaizhangNV2025-05-14
| | | | | | | | | | | | | | | | | | | | | Close #6859 Goal of this PR We want to support an array whose size can be specialization constant for shared/global variable e.g. layout (constant_id = 0) const uint BLOCK_SIZE = 64; shared float buf_a[(BLOCK_SIZE + 5) * 4]; Overview of the solution: During IndexExpr check, we will loose the restriction to allow SpecConst passing, but the size parameter will not be a constant value because it cannot be folded into a constant, so we will make it follow the same logic as generic parameter value, and the size will be represented by FuncCallIntVal/PolynomialIntVal/DeclRefIntVal. During IR lowering, we will detect whether there is spec constant in the IntVal, and wrap the IRInst with a SpecConstRateType, and propagate the type though the lowering logic, such that the IntVal representing the array size will have SpecConstRateType. During spirv emit stage, if we detect that a IRInst has SpecConstRateType, we will emit it as SpecConstantOp. We have to implement new logic to emit OpSpecConstantOp, the existing emit logic doesn't support emitting OpSpecConstantOp, especially this op can embed arithmetic operation at global scope, where we can only emit arithmetic instruct at local. But there are only few instructs we need to support. Overview of the solution: This PR doesn't support generic, and we will create a separate PR to extend that, tracked in #6840.
* Add IREnumType to distinguish enums from ints and each other (#6973)Julius Ikkala2025-05-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add IREnumType to distinguish enums from ints and each other * Add issue example as test * format code * Add expected test output * Fix peephole optimization hanging No idea why this PR triggered this, but there seems to have been a clear bug here anyway, so may just as well fix it now. * Move enum lowering later * Add linkage decoration to enum type * Use filecheck-buffer instead of expected.txt * Fix comment * Make enum casts actually use IR enum casts They were all BuiltinCasts by accident * Lower enum type before VM * Deal with rate-qualified types in enum cast * Allow any value marshalling for enum types * Handle new enum instructions in a couple more switches * Fix formatting --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Add Slang Byte Code generation and interpreter. (#6896)Yong He2025-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add Slang Byte Code generation and interpreter. * Fix compile issues. * format code * More compile fix. * Fix clang issue. * Fix more clang issues. * Another clang fix. * Fix clang issues. * Fix another clang issue. * Fix wasm build. * Update building.md * Fix test-server. * Fix compile error. * Fix bug. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Add cooperative matrix 1 support (#6565)Darren Wihandi2025-04-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * initial wip for spirv * working tiled example * clean up store and load * minor fixes * fix loadAny name * add initial tests, including broken/unimplemented intrinsics * fix subscript * run tests at 16x16, remove not supported arithmetic tests * minor fixups on implementation * rename CoopMatMatrixUse * Update tests to pass validation layers locally * Add mat-mul-add test and minor fixes * Add more tests * Remove dead code * Add coopMatLoad function and tests, enforce constexpr for matrix layout * Use getVectorOrCoopMatrixElementType in place of getVectorElementType
* Fix `IRVar` hoisting when its already in the right block. (#6626)Sai Praveen Bangaru2025-03-18
| | | Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
* Update SPIRV-Tools and fix new validation errors. (#6511)Yong He2025-03-06
| | | | | | | * Update SPIRV-Tools and fix new validation errors. * Implement pointers for glsl target. * Reworked packStorage/unpackStorage code gen to operate on pointers rather than values.
* Add Slang-specific intrinsics for integer pack/unpack (#6459)Darren Wihandi2025-02-28
| | | | | | | | | | | | | | | | | | | | | * update hlsl meta * update test * use slang syntax in meta file * improve meta file * fix pack clamp u8 * remove builtin packed types, use typealias instead * fix wgsl pack clamp * fix formatting --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Fix a bug with hoisting 'IRVar' insts that are used outside the loop (#6446)Sai Praveen Bangaru2025-02-25
| | | | | | | | | | | * Fix a bug with hoisting 'IRVar' insts that are used outside the loop - We introduce a 'CheckpointObject' inst and use that to split loop state insts into two pieces (one for within-loop uses and one for outside-loop uses. - This allows the two kinds of uses to be handled separately by the hoisting mechanism - CheckpointObject is then lowered to a no-op after hoisting is complete. * Update slang-ir-autodiff-primal-hoist.cpp * Update slang-ir-autodiff-primal-hoist.cpp
* Fix `UseGraph::replace` (#6395)Sai Praveen Bangaru2025-02-25
| | | | | | | | | | | | | | | | | | | * Fix `UseGraph::isTrivial()` test. * Fix. * Fix. * Refactor `UseGraph` and `UseChain` * Update slang-ir-autodiff-primal-hoist.cpp * Update all auto-diff locations that handle pointers to treat user pointers as regular values * Update test to use direct-SPIRV only --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Improve performance when compiling small shaders. (#6396)Yong He2025-02-23
| | | | | | | Improve performance when compiling small shaders. Avoid copying witness table entries that are not getting used during linking. Avoid copying auto-diff related decorations and derivative functions during linking, if the user modules doesn't use autodiff. Cache operator overload resolution results on global session, so each new Session doesn't need to repetitively run through overload resolution from scratch.
* Fix DCE for calls to functions that have associations (#6272)Sai Praveen Bangaru2025-02-05
| | | | | | | * Fix DCE for calls to functions that have associations * Update slang-ir-util.cpp * Update slang-ir-util.cpp
* Support cooperative vector (#6223)Jay Kwak2025-01-30
| | | | | | | * Support cooperative vector without Vulkan-header update Adding a Slang support for cooperative vector. But this commit doesn't have Vulkan-header update.
* Fix def-use issue from multi-level break elimination (#6134)Sai Praveen Bangaru2025-01-20
|
* Implement specialization constant support in numthreads / local_size (#5963)Julius Ikkala2025-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Allow using specialization constants in numthreads attribute * Add support for GLSL local_size_x_id syntax * Fix overeager specialization constant parsing * Add diagnostics for specialization constant numthreads * Remove unused variable * Fix local_size_x_id not finding existing specialization constant * Allow materializeGetWorkGroupSize to reference specialization constants * Use SpvOpExecutionModeId for modes that require it * Cleanup specialization constant numthreads code * Add tests for specialization constant work group sizes * Fix implicit Slang::Int -> int32_t cast * Fix querying thread group size in reflection API --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Check whether array element is fully specialized (#6000)kaizhangNV2025-01-07
| | | | | | | | | | | | | | | | | | | | | * Check whether array element is fully specialized close #5776 When we start specialize a "specialize" IR, we should make sure all the elements are fully specialized, but we miss checking the elements of an array. This change will check the it. * add test * add all wrapper types into the check * add utility function to check if the type is wrapper type --------- Co-authored-by: zhangkai <zhangkai@zhangkais-MacBook-Pro.local> Co-authored-by: Yong He <yonghe@outlook.com>
* Add packed 8bit builtin types (#5939)Darren Wihandi2024-12-26
| | | | | * Add packed bytes builtin type * fix test
* Fix IntVal unification logic to insert type casts + buffer element lowering ↵Yong He2024-11-06
| | | | | | | regression. (#5508) * Fix IntVal unification logic to insert type casts. * Fix regression.
* Move switch statement bodies to their own lines (#5493)Ellie Hermaszewska2024-11-05
| | | | | | | | | * Move switch statement bodies to their own lines * format --------- Co-authored-by: Yong He <yonghe@outlook.com>
* formatEllie Hermaszewska2024-10-29
| | | | | | | * format * Minor test fixes * enable checking cpp format in ci
* Assorted auto-diff enhancements for increased performance & more streamlined ↵Sai Praveen Bangaru2024-10-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | auto-diff results (#5394) * Various AD enhancements * Fix issue with pt-loop test * Update pt-loop.slang * More fixes for perf. Final minimal context test now passes. * Fix issue with loop-elimination pass not running after dce * Try fix wgpu test by removing select operator * Disable wgpu * Delete out.wgsl * Remove comments * Update slang-ir-util.cpp * Fix header relative paths for slang-embed * Disbale wgpu for a few other tests * Better way of determining which params to ignore for side-effects * Update slang-ir-dce.cpp * Fix issue with circular reference from previous AD pass being left behind for the next AD pass * Update slang-ir-dce.cpp
* Correct control flow in getParentBreakBlockSet (#5024)Ellie Hermaszewska2024-09-06
| | | | Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com>
* Support mixture of precompiled and non-precompiled modules (#4860)cheneym22024-08-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Support mixture of precompiled and non-precompiled modules This changes the implementation of precompile DXIL modules to accept combinations of modules with precompiled DXIL, ones without, and ones with a mixture of precompiled DXIL and Slang IR. During precompilation, module IR is analyzed to find public functions which appear to be capable of being compiled as HLSL, and those functions are given a HLSLExport decoration, ensuring they are emitted as HLSL and preserved in the precompiled DXIL blob. The IR for those functions is then tagged with a new decoration AvailableInDXIL, which marks that their implementation is present in the embedded DXIL blob. The DXIL blob is attached to the IR as before, inside a EmbeddedDXIL BlobLit instruction. The logic that determines whether or not functions should be precompiled to DXIL is a placeholder at this point, returning true always. A subsequent change will add selection criteria. During module linking, the full module IR is available, as well as the optional EmbeddedDXIL blob. The IR for functions implemented by the blob are tagged with AvailableInDXIL in the module IR. After linking the IR for all modules to program level IR, the IR for the functions marked AvailableInDXIL are deleted from the linked IR, prior to emitting HLSL and compiling linking the result. This change also changes the point of time when the module IR is checked for EmbeddedDXIL blobs. Instead of happening at load time as before, it happens during immediately before final linking, meaning that the blob does not need to be independently stored with the module separate from the IR as was done previously. Work on #4792 * Clean up debug prints * Call isSimpleHLSLDataType stub * Address feedback on precompiled dxil support Allow for IR filtering both before and after linking. Only mark AvailableInDXIL those functions which pass both filtering stages. Functions are corrlated using mangled function names. Rather than delete functions entirely when linking with libraries that include precompiled DXIL, instead convert the IR function definitions to declarations by gutting them, removing child blocks. * Use artifact metadata and name list instead of linkedir hack * Use String instead of UnownedStringSlice * Update tests * Renaming * Minor edits * Don't fully remove functions post-link * Unexport before collecting metadata
* Make variadic generics work with interfaces and forward autodiff. (#4905)Yong He2024-08-23
|
* Fix SPIRV emit for small-integer texture types. (#4753)Yong He2024-07-30
| | | | | * Fix SPIRV emit for small-integer texture types. * Disable -emit-spirv-via-glsl test.
* Fix the issue in emitFloatCast (#4559)kaizhangNV2024-07-08
| | | | | | | | | | | | | | | | | | | | | | | * Fix the issue in emitFloatCast In emitFloatCast function, we only considered the input type is float scalar or float vector, so if the input type is a float matrix type, it will crash. We should also handle the float matrix type. Also, we add some diagnose info to point out the source location where there is error happened, so in the future it's easier to tell us what happens. * Add a unit test * Disable the test for metal Metal doesn't support 'double'. " metal 32023.35: /tmp/unknown-YgHAsJ.metal(15): error : 'double' is not supported in Metal matrix<double,int(3),int(4)> b_0 = matrix<double,int(3),int(4)> (a_0); "
* Expand upon existing `ImageSubscript` support (Metal, GLSL, SPIRV) (#4408)ArielG-NV2024-06-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add additional `ImageSubscript` features: 1. Added ImageSubscript support for Metal & a test case * Merge GLSL/SPIRV/Metal `ImageSubscript` legalization pass 2. Added multisample support to glsl/spirv/metal for when using ImageSubscript * Added in this PR since the overhaul of the code merges together GLSL/SPIRV/Metal implementation 3. Fixed minor metal texture `Load`/`Read` bugs * [HLSL methods of access do not support subscript accessor for texture cube array](https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/texturecubearray) * removed swizzling of uint/int/float * other odd bugs which were causing compile errors note: Compute tests do not work due to what seems to be the GFX backend (causes crash without error report). The tests are disabled. * disable LOD with texture 1d seems that LOD for 1d textures need to be a compile time constant as per an error metal throws * syntax error in hlsl.meta * static_assert alone with intrinsic_asm error provides cleaner errors Note: `static_assert` seems to be unstable and not be fully respected (still require `intrinsic_asm` to avoid a stdlib compile error) * change comment to `// lod is not supported for 1D texture * add `static_assert` in related code gen paths * address review * address review * add asserts as per review comment, NOTE: unclear if these should be release 'asserts' as well
* Support SPIR-V DebugTypePointer (#4228)Jay Kwak2024-05-30
|
* Add options to speedup compilation. (#4240)Yong He2024-05-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | * Add options to speedup compilation. * Fix. * Plumb options to DCE pass. * Revert debug change. * Fix regressions. * More optimizations. * more cleanup and fixes. * remove comment. * Fixes. * Another fix. * Fix errors. * Fix errors. * Add comments.
* Add host shared library target. (#4098)Yong He2024-05-03
| | | | | | | | | | | | | * Add host shared library target. * Attempt fix. * Fix warnings. * try fix. * Fix test. * Fix.
* Fix compile failures when using debug symbol. (#4069)Yong He2024-05-01
| | | | | | | | | * Fix compile failures when using debug symbol. * Various fixes. * Fix intrinsic. * Fix test.
* SPIRV: Fix performance issue when handling large arrays. (#4064)Yong He2024-05-01
| | | | | | | * SPIRV: Fix performance issue when handling large arrays. * Add test for packing. * Fix clang.