summaryrefslogtreecommitdiffstats
path: root/source/slang/slang-ir-specialize.cpp
Commit message (Collapse)AuthorAge
* Clean up Slang IR representation of undefined values (#8708)Theresa Foley2025-10-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to this change, the Slang IR used a single opcode (`kIROp_Undefined`) to encode all cases of undefined values. The particular motivation for this change was a need to distinguish those undefined values that represent a load from an uninitialized memory location versus other sorts of undefined values. If transforming a variable into SSA form results in `undefined` values in cases where the a `load` was executed without a prior `store`, that represents an error on the programmer's part, and should be diagnosed. However, other cases of undefined values can arise during program transformation and optimization, and should not typically result in diagnostics being emitted. While it was not the original motivation for this change, it is also worth noting that the LLVM project has transitioned from initially using only a single `undef` instruction to having a more nuanced model, and the same factors that motivated their shift also apply to the Slang IR. Counter-intuitively, the semantics of undefined values actually need to be carefully defined. Concretely, this change splits the pre-existing `undefined` opcode into two sub-cases: - `kIROp_LoadFromUninitializedMemory`, to represent the case of loading from a memory location (such as a local variable) that has not been initialized. - `kIROp_Poison`, corresponding to the LLVM `poison` value. Our poison instruction is intended to have semantics comparable to LLVM's equivalent. Conceptually, any operation that is invoked with a poison value as input will (with a few exceptions) produce a poison value as output. One can think of the behavior of `poison` as similar to how not-a-number values propagate in floating-point computations: by default they "infect" the result of any computation they are involved in. This semantic choice helps to ensure that many optimizations end up being correct in the presence of undefined values, even if they did not specifically account for them. The `kIROp_LoadFromUninitializedMemory` case is comparable to the combination of `freeze` and `undef` in LLVM. An LLVM `undef` value has semantics that allow *each* use of that value to be replaced with a *different* arbitrary value; these semantics cause many optimizations to only be correct in the absence of undefined values. An LLVM `freeze` instruction can take an undefined value as input, and produces a single value that is still arbitrary, but must be consistent across all uses. The latter semantics are what we want, since a given `load` from an uninitialized memory location will yield an arbitrary-but-fixed value. Note that we intentionally do not have a direct analogue to LLVM's `undef` instruction, because of the way that `undef` causes so many complications when trying to write optimizations. We also do not add a `kIROp_Freeze` instruction in this change, but that is simply because we currently have no need for it. Existing code that was creating `IRUndefined` values has been updated to create either `IRPoison` or `IRLoadFromUninitializedMemory` values, as appropriate to the use case. Code that was checking for the `kIROp_Undefined` opcode has been updated to either check for both of the new opcodes (in the case of `switch` statements), or to use `as<IRUndefined>` to perform a dynamic cast to the common base type of the two new instructions. Note that this change does not alter the way that instructions representing undefined values are typically emitted as ordinary instructions in the block that produces an undefined value. While emitting `IRLoadFromUninitializedMemory` as an ordinary instruction is exactly what we want, the `IRPoison` case would actually be better represented in Slang IR as a "hoistable" instruction, so that there would only be a singular `poison` value of each type. Changing `IRPoison` to be hoistable would be a good follow-up change, but might run into more challenges depending on what assumptions (if any) the codebase is making about where undefined values get emitted. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Specialize interfaces in DebugFunction (#8617)Julius Ikkala2025-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | E.g. in [generic-extension-2.slang](https://github.com/shader-slang/slang/blob/master/tests/language-feature/extensions/generic-extension-2.slang), incorrect DebugFunctions are generated for `getFirstOuter`: ``` let %33 : Void = DebugFunction("getFirstOuter", 18 : UInt, 3 : UInt, %26, Func(Int, 0 : Int)) ``` This happens because specialization passes are leaving a `%IFoo` in the function type, instead of replacing with a concrete type: ``` let %34 : Void = DebugFunction("getFirstOuter", 18 : UInt, 3 : UInt, %26, Func(Int, %IFoo)) ``` and later, `cleanUpInterfaceTypes()` just replaces all interfaces with the literal zero. So now we have a parameter type which isn't actually a type at all, but an IntLit instead. I'm not sure if the approach I picked is good, though. Some other options that crossed my mind were: * Make `fixUpFuncType` also update related DebugFunctions - But is there a reason why DebugFunctions separately carry a function type in the first place? * Make `cleanUpInterfaceTypes` less aggressive or at least replace types with a type instead of a value - But this will still make the debug info incorrect :(
* Lookup refactor (#8467)kaizhangNV2025-09-23
| | | | | | | | | | | Close #8201. This PR unify the lowering logic for LookupDeclRef of an interface requirement. We will always lower this AST node to a LookupWitness IR. The key of this IR is the special witnessTableType `ThisTypeWitness`, this witness Table is simply a wrapper for an interface type. Our current specialization pass doesn't handle this kind of LookupWitness IR at all, so we will also add the specialization of this_type IR as well.
* extend fiddle to allow custom lua splices in more places (#7559)Ellie Hermaszewska2025-07-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add fkYAML submodule * Generate slang-ir-inst-defs.h from slang-ir-inst-defs.yaml * generate ir-inst-defs.h * neaten things * neaten inst def parser * add rapidyaml submodule * remove fkyaml * remove fkyaml submodule * remove use of ir-inst-defs.h * format and warnings * fix wasm build * tidy * remove rapidyaml * Extend fiddle to allow custom splices in more places * Use lua to describe ir insts * fix * neaten * neaten * neaten * spelling * neaten * comment comment out assert * merge
* Implement spec const for generic parameter (#7121)kaizhangNV2025-05-15
| | | | | | | | | | | | | | | | | | Close #6840. This PR add supports to use specialize constant in generic parameter, and that parameter can also be used as array size, e.g. following code should work: ``` struct MyStruct<let N: int> { float buffer[N]; } MyStruct<SpecConstVar> s; ``` - Loose the restriction from Link-Time to SpecializationConstant when extract generic argument - Tweak the logic of how we decide whether a inst is hoistable. Besides checking existing hoistable flag of each IRInst, when we detect a IRInst's type is SpecConstRateType, we will treat that inst hoistable. Because IRInst in global scope can be deduplicated, and every SpecConstRateType inst should be in the global scope or IRGeneric scope (which will be at global scope after specialization). - Remove the SpecConstIntVal to IRInst map in IR lowering logic, because we already have way to deduplicate the global scope IR.
* Fix regression in partial specialization of existential arguments (#6818)kaizhangNV2025-04-17
| | | | | | | | | | | | | | | | | | | | | | Close #6589. In PR #6487, we support partial specialization. However there is a corner case we didn't handle correctly. For the IR like this: %val: specialize(...) = some inst; %arg1: specialize(...) = makeExistential(%val, ...); %arg2: %SomeConcreteType: load(...); call func(%arg1, %arg2); when we specialize the call func instruct, we will also specialize the function parameters. On our existing logic, when we find an argument is a makeExistential, we will always extract the existential value, and use its type as the new parameter. But in this case, %arg1 is not fully specialized yet, so it's type will still be a specialize. In this case, we will change the function's first parameter from an existential type to a specialize. This will result in that we lose the chance to specialize the first argument in the next iteration, because the first parameter of this function is not an existential type any more. The reason behind this is that we should always keep specializing the arguments and parameters at the same time. So this PR just does a check before specializing the parameters that if the argument cannot be fully specialized, we won't specialize the parameter this early. Instead, we will wait for the next iteration until the argument can be specialized.
* Allow partial specialization of existential arguments. (#6487)Yong He2025-02-28
| | | | | | | | | | | | | | | | | | | | | * Allow partial specialization of existential arguments. * Fix. * Add test case for improved diagnostics. * Fix compile error. * Fix tests. * Fix. * Fix test. * Fix compile issue. * Fix typo. * Address comment.
* Fix issue with specialization using arithmetic expressions (#6084)Sai Praveen Bangaru2025-01-14
|
* [Auto-diff] Overhaul auto-diff type tracking + Overhaul dynamic dispatch for ↵Sai Praveen Bangaru2025-01-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | differentiable functions (#5866) * Overhauled the auto-diff system for dynamic dispatch * More fixes * remove intermediate dumps * Update slang-ast-type.h * More fixes + add a workaround for existential no-diff * Update reverse-control-flow-3.slang * remove dumps * remove more dumps * Delete working-reverse-control-flow-3.hlsl * Cleanup comments + unused variables * More comment cleanup * Add support for lowering `DiffPairType(TypePack)` & `MakePair(MakeValuePack, MakeValuePack)` * Fix array of issues in Falcor tests. * Update slang-ir-autodiff-pairs.cpp * More fixes for Falcor image tests * Small fixups. --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Check whether array element is fully specialized (#6000)kaizhangNV2025-01-07
| | | | | | | | | | | | | | | | | | | | | * Check whether array element is fully specialized close #5776 When we start specialize a "specialize" IR, we should make sure all the elements are fully specialized, but we miss checking the elements of an array. This change will check the it. * add test * add all wrapper types into the check * add utility function to check if the type is wrapper type --------- Co-authored-by: zhangkai <zhangkai@zhangkais-MacBook-Pro.local> Co-authored-by: Yong He <yonghe@outlook.com>
* Fix two specialization bugs (#5540)Anders Leino2024-11-12
| | | | | | | | | | | | | | | | | * Fix two specialization bugs The first bug was introduced in b2ca2d5a4efeae807d3c3f48f60235e47413b559 and ran some code at scope exit that dereferenced a nullptr context. The second bug was introduced in bea1394ad35680940a0b69b9c67efc43764cc194 and would cause the wrong mangled name to be used during specialization. This closes #5516. * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Move switch statement bodies to their own lines (#5493)Ellie Hermaszewska2024-11-05
| | | | | | | | | * Move switch statement bodies to their own lines * format --------- Co-authored-by: Yong He <yonghe@outlook.com>
* formatEllie Hermaszewska2024-10-29
| | | | | | | * format * Minor test fixes * enable checking cpp format in ci
* Specialize existential return types when possible. (#5044)Yong He2024-09-10
| | | | | | | | | | | | | * Fix inccorect dropping of declref during Unification of DeclaredSubtypeWitness. * Add extension test. * Specialize existential return types when possible. * Fix. * Fix. * Fix falcor issue.
* Make variadic generics work with interfaces and forward autodiff. (#4905)Yong He2024-08-23
|
* Tuple swizzling, concat, comparison and `countof`. (#4856)Yong He2024-08-19
| | | | | | | | | | | * Tuple swizzling and element access. * Update proposal status. * Cleanup. * Fix merrge error. * Address review.
* Variadic Generics Part 2: IR lowering and specialization. (#4849)Yong He2024-08-18
| | | | | | | | | * Variadic Generics Part 2: IR lowering and specialization. * Update design doc status. * Update design doc. * Resolve review comments.
* FIx issue with specializing witness tables (#4839)Sai Praveen Bangaru2024-08-13
|
* Overhaul IR lowering of pointer types. (#4710)Yong He2024-07-25
| | | | | | | | | | | | | | | * Overhaul IR lowering of pointer types. * Propagate address space in IRBuilder. * Fixup. * Fix. * Fix. * Change how Ptr type is printed to text. * Fix.
* Add option to preserve shader parameter declaration in output SPIRV. (#4344)Yong He2024-06-12
| | | | | * Add option to preserve shader parameter declarations in output. * Add test.
* Add options to speedup compilation. (#4240)Yong He2024-05-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | * Add options to speedup compilation. * Fix. * Plumb options to DCE pass. * Revert debug change. * Fix regressions. * More optimizations. * more cleanup and fixes. * remove comment. * Fixes. * Another fix. * Fix errors. * Fix errors. * Add comments.
* Add `-minimum-slang-optimization` to favor compile time. (#4186)Yong He2024-05-17
|
* Support arithmetics on generic arguments (#3968)Jay Kwak2024-04-19
| | | | | Resovles an issue #3935 Slang had to fold the generic arguments after specialization.
* Fix crash when specializing generic entry points. (#3760)Yong He2024-03-13
|
* Refactor compiler option representations. (#3598)Yong He2024-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | * Refactor compiler option representation. * Fix binary compatibility. * Add a test for specifying compiler options at link time. * Fix binary compatibility. * Fix binary compatibility. * Fix backward compatibility on matrix layout. * Fix. * Fix. * Fix. * Fix gfx. * Fix gfx. * Fix dynamic dispatch. * Polish.
* Fix type checking around generic array types. (#3568)Yong He2024-02-11
|
* Support pointers in SPIRV. (#3561)Yong He2024-02-08
| | | | | | | | | | | * Support pointers in SPIRV. * Fix test. * Enhance test. * Fix test. * Cleanup.
* Add per-buffer data layout control. (#3551)Yong He2024-02-05
| | | | | | | | | | | | | * Add per-buffer data layout control. Fixes #3534. * Fixes. * Robustness. * Update test. * Fix.
* Add ConstBufferPointer::subscript. (#3415)Yong He2023-12-15
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* SPIRV compiler performance fixes. (#3258)Yong He2023-10-04
| | | | | | | | | | | | | | | * SPIRV compiler performance fixes. * Cleanup. * update project files * Cleanup debug code. * Make redundancy removal non-recursive. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Use ankerl/unordered_dense as a hashmap implementation (#3036)Ellie Hermaszewska2023-08-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Correct namespace for getClockFrequency * missing const * Add missing assignment operator * Remove unused variables * Return correct modified variable * Use stable hash code for file system identity * terse static_assert * Structured binding for map iteration * Make (==) and getHashCode const on many structs * Add ConstIterator for LinkedList * Replace uses of ItemProxy::getValue with Dictionary::at * Extract list of loads from gradientsMap before updating it * Const correctness in type layout * Add unordered_dense hashmap submodule * Use wyhash or getHashCode in slang-hash.h * refactor slang-hash.h * Use ankerl/unordered_dense as a hashmap implementation Notable changes: - The subscript operator returns a reference directly to the value, rather than a lazy ItemProxy (pair of dict pointer and key) slang-profile time (95% over 10 runs): - Before: 6.3913906 (±0.0746) - After: 5.9276123 (±0.0964) * 64 bit hash for strings So they have the same hash as char buffers with the same contents * Narrowing warnings for gcc to match msvc * revert back to c++17 * Correct c++ version for msvc * Use path to unordered_dense which keeps tests happy * Do not assign to and read from map in same expression * Remove redundant map operations in primal-hoist * Split out stable hash functions into slang-stable-hash.h * 64 bit hash by default * regenerate vs projects * Correct return type from HashSetBase::getCount() * correct width for call to Dictionary::reserve * Use stable hash for obfuscated module ids * Signed int for reserve * clearer variable naming * Parameterize Dictionary on hash and equality functors * Allow heterogenous lookup for Dictionary * missing const * Use set over operator[] in some places * Remove unused function * s/at/getValue
* SPIR-V WIP (#3064)Ellie Hermaszewska2023-08-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add type layout for structured buffer * Default to generating spirv directly * vk test for compute simple * Add spirv-dis as a downstream compiler * Emit Array types in SPIR-V * makevector for spirv * Dump whole spirv module on validation failure * register array types todo, use emitTypeInst * Neater formatting for unhandled inst printing * break out emitCompositeConstruct * Correct array type generation * neaten * Allow getElement for vector * Remove unused * Allow predicating target intrinsics on types * Consider functions with intrinsics to have definitions We need to specialize these if they are predicated on types * Correct array type generation * makeArray for spir-v * replace getElement with getElementPtr for spirv * Correct translation of field access for spirv * Push layouts to types for spirv * Spirv intrinsics * operator now makes a pointer * Add structured buffer of struct test * Preserve type layout in spirv structured buffer legalization * neaten * makeVectorFromScalar for SPIRV * placeholder for layouts on param groups * More type safe spirv op construction * Know that constants and types only go in one section * Remove emitTypeInst * Add todo for spirv sampling * Add links to spirv documentation on emit functions * OpTypeImage support for SPIR-V * Add simpler texture test for spirv * s/spirv_direct/spirv/g * Allow several string literals in target_intrinsic * Handle global params without a var layour for SPIR-V For example groupshared vars * uint spirv asm type * Add todo for isDefinition It is currently too broad * Some atomic op spirv intrinsics * Strip ConstantBuffer wrappers for spirv * Add todo for matrix annotations * Do not associate decorations insts with spirv counterparts * Correct entry point parameter generation * Spelling * Assert that fieldAddress is returning a pointer * Add error for existential type layout getting to spir-v emit * Add IRTupleTypeLayout Unused so far * Allow getElementPtr to work with vectors * Correct target name in test * Hide default spirv direct behind a premake option --default-spirv-direct=true * Do not insert space at start of intrinsic def * Correct asm rendering in tests * remove redundant option * Emit directly from direct test * Add source language options for spirv-dis * Add comments to spirv dis * Add dead debug print for before spirv module * Correct asm rendering in tests * s/spirv_direct/spirv/g * Only specialize intrinsic functions with predicates * regenerate vs projects * squash warnings * squash warnings * remove duplication * Silence warnings from msvc * squash warnings * Overload for zero sized array * More msvc warnings * warnings * Add spirv-tools to path for tests * Do not be specific about dxc version for diag test * Normalize line endings from spirv-dis * Correct filecheck matches * Temporarily disable two spirv tests Failing on CI, undebuggable hang :/ * Do not emit storage class more than once for spirv snippet * Do not pass spir-v to spirv-dis by stdin * Do not get spirv-dis output via stream, use file * normalize file endings in spirv-dis output
* Optimize specialization, and remove unnecessary calls to `simplifyIR`. (#2999)Yong He2023-07-19
| | | | | | | | | | | | | | | | | | | | | | | | | * Remove unneccessary calls to `simplifyIR`. * fix. * Delete obsolete hoistConst pass. * Fix. * Small improvements. * Fix. * Fix enum lowering. * fix * tweaks. * tweaks. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Simplify Lookup and improve compiler performance. (#2996)Yong He2023-07-18
| | | | | | | | | | | | | | | | | | | | | | | | | * Simplify lookup. * Various bug fixes. * Report type dictionary size in perf benchmark. * Remove type duplication. * increase initial dict size. * Bug fix. * Fix bugs. * Fixup. * Revert type legalization looping. * Fix specialization pass. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Pool inst worklists and hashsets to avoid rehashing. (#2982)Yong He2023-07-12
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Use scratchData on `IRInst` to replace HashSets. (#2978)Yong He2023-07-12
| | | | | | | | | | | | | | | * Use scratchData on `IRInst` to replace HashSets. * Update test results. * Initialize scratchData. * Update autodiff documentation. * Use enum instead of bool. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Add perf benchmark utility. (#2977)Yong He2023-07-11
| | | | | | | | | | | | | * Add perf benchmark utility. * Update documentation. * Fix. * Fix doc. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Add API for querying total compile time. (#2898)Yong He2023-05-23
| | | | | | | | | | | | | * Add API for querying total compile time. * Optimize. * Remove redundant simplifyIR calls. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Fix most of the disabled warnings on gcc/clang (#2839)Ellie Hermaszewska2023-04-26
|
* Fix specialization dictionaries cleanup pass (#2844)Sai Praveen Bangaru2023-04-26
|
* Dictionary using lowerCamel (#2835)jsmall-nvidia2023-04-25
| | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP lowerCamel Dictionary. * WIP more lowerCamel fixes for Dictionary. * Add/Remove/Clear * GetValue/Contains * Fix tabs in dictionary. Count -> getCount * Fix fields with caps. * Key -> key Value -> value Use m_ for members where appropriate. Use lowerCamel in linked list. * Some small fixes/improvements to Dictionary. * Kick CI.
* Combine lookupWitness lowering with specialization. (#2794)Yong He2023-04-12
|
* Apply IR simplifcation immediately after specialization to avoid duplicates. ↵Yong He2023-03-27
| | | | | | | | | | | | | | (#2739) * Apply IR simplifcation immediately after specialization to avoid duplicates. * Update source/slang/slang-ir-specialize.cpp Co-authored-by: Ellie Hermaszewska <github@sub.monoid.al> --------- Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Ellie Hermaszewska <github@sub.monoid.al>
* Type legalization and autodiff bug fixes. (#2722)Yong He2023-03-22
| | | | | | | | | | | | | | | | | | | | | * Bug fixes. * Fix. * Only perform autodiff for functions whose derivative is actually used. * Fix loop optimize bug. * Fix high order diff. * Fix trivial diff func generation. * Fixes. * Cleanup. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Remove `SharedIRBuilder`. (#2657)Yong He2023-02-16
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Overhaul global inst deduplication and cpp/cuda backend. (#2654)Yong He2023-02-16
| | | | | | | | | * Overhaul global inst deduplication and cpp/cuda backend. * Update IR documentation. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Overhaul `transposeParameterBlock` to support `inout` params. (#2621)Yong He2023-02-03
| | | | | | | | | | | | | | | | | | | | | | | * Overhaul `transposeParameterBlock` to support `inout` params. * Small bug fixes. * Bug fix on differentiable intrinsic specialization. * Fixes. * Run autodiff tests on CPU. * Clean up. * More bug fixes., * Add test coverage on inout param. * Fix language server hinting for transcribed mutable params. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* First custom backward-derivative test case working. (#2598)Yong He2023-01-17
|
* Nested bwd-diff func call context save/restore. (#2584)Yong He2023-01-10
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Split bwd_diff op into separate ops for primal and propagate func. (#2582)Yong He2023-01-06
| | | | | | | | | | | * Split bwd_diff op into separate ops for primal and propagate func. * Fix. * Download swiftshader with github actions instead of curl on linux. * Fix github action. Co-authored-by: Yong He <yhe@nvidia.com>