summaryrefslogtreecommitdiffstats
path: root/source/slang/slang-ir-peephole.cpp
Commit message (Collapse)AuthorAge
* Clean up Slang IR representation of undefined values (#8708)Theresa Foley2025-10-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to this change, the Slang IR used a single opcode (`kIROp_Undefined`) to encode all cases of undefined values. The particular motivation for this change was a need to distinguish those undefined values that represent a load from an uninitialized memory location versus other sorts of undefined values. If transforming a variable into SSA form results in `undefined` values in cases where the a `load` was executed without a prior `store`, that represents an error on the programmer's part, and should be diagnosed. However, other cases of undefined values can arise during program transformation and optimization, and should not typically result in diagnostics being emitted. While it was not the original motivation for this change, it is also worth noting that the LLVM project has transitioned from initially using only a single `undef` instruction to having a more nuanced model, and the same factors that motivated their shift also apply to the Slang IR. Counter-intuitively, the semantics of undefined values actually need to be carefully defined. Concretely, this change splits the pre-existing `undefined` opcode into two sub-cases: - `kIROp_LoadFromUninitializedMemory`, to represent the case of loading from a memory location (such as a local variable) that has not been initialized. - `kIROp_Poison`, corresponding to the LLVM `poison` value. Our poison instruction is intended to have semantics comparable to LLVM's equivalent. Conceptually, any operation that is invoked with a poison value as input will (with a few exceptions) produce a poison value as output. One can think of the behavior of `poison` as similar to how not-a-number values propagate in floating-point computations: by default they "infect" the result of any computation they are involved in. This semantic choice helps to ensure that many optimizations end up being correct in the presence of undefined values, even if they did not specifically account for them. The `kIROp_LoadFromUninitializedMemory` case is comparable to the combination of `freeze` and `undef` in LLVM. An LLVM `undef` value has semantics that allow *each* use of that value to be replaced with a *different* arbitrary value; these semantics cause many optimizations to only be correct in the absence of undefined values. An LLVM `freeze` instruction can take an undefined value as input, and produces a single value that is still arbitrary, but must be consistent across all uses. The latter semantics are what we want, since a given `load` from an uninitialized memory location will yield an arbitrary-but-fixed value. Note that we intentionally do not have a direct analogue to LLVM's `undef` instruction, because of the way that `undef` causes so many complications when trying to write optimizations. We also do not add a `kIROp_Freeze` instruction in this change, but that is simply because we currently have no need for it. Existing code that was creating `IRUndefined` values has been updated to create either `IRPoison` or `IRLoadFromUninitializedMemory` values, as appropriate to the use case. Code that was checking for the `kIROp_Undefined` opcode has been updated to either check for both of the new opcodes (in the case of `switch` statements), or to use `as<IRUndefined>` to perform a dynamic cast to the common base type of the two new instructions. Note that this change does not alter the way that instructions representing undefined values are typically emitted as ordinary instructions in the block that produces an undefined value. While emitting `IRLoadFromUninitializedMemory` as an ordinary instruction is exactly what we want, the `IRPoison` case would actually be better represented in Slang IR as a "hoistable" instruction, so that there would only be a singular `poison` value of each type. Changing `IRPoison` to be hoistable would be a good follow-up change, but might run into more challenges depending on what assumptions (if any) the codebase is making about where undefined values get emitted. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* extend fiddle to allow custom lua splices in more places (#7559)Ellie Hermaszewska2025-07-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add fkYAML submodule * Generate slang-ir-inst-defs.h from slang-ir-inst-defs.yaml * generate ir-inst-defs.h * neaten things * neaten inst def parser * add rapidyaml submodule * remove fkyaml * remove fkyaml submodule * remove use of ir-inst-defs.h * format and warnings * fix wasm build * tidy * remove rapidyaml * Extend fiddle to allow custom splices in more places * Use lua to describe ir insts * fix * neaten * neaten * neaten * spelling * neaten * comment comment out assert * merge
* Fix tuple AST & IR layout size queries (#7502)Julius Ikkala2025-06-26
| | | | | * Fix tuple AST & IR layout size queries * Don't peephole sizeof if size is still indeterminate
* Make interface types non c-style in Slang2026. (#7260)Yong He2025-06-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Make interface types non c-style. * Make Optional<T> work with autodiff and existential types. * Fix. * patch behind slang 2026. * Fix warnings. * cleanup. * Fix tests. * Fix. * Fix com interface lowering. * Add comment to test. * regenerate command line reference * Add test for passing `none` to autodiff function. * Fix recording of `getDynamicObjectRTTIBytes`. * Fix nested Optional types. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* void field rework (#6739)kaizhangNV2025-04-09
| | | | | * void field rework * move void cleanup pass earlier
* Simplify implicit cast ctors for vector & matrix. (#6408)Yong He2025-02-20
| | | | | | | | | | | * Simplify implicit cast ctors for vector & matrix. * Fix formatting. * Fix tests. * Fix Falcor test. * Mark __builtin_cast as internal.
* Support cooperative vector (#6223)Jay Kwak2025-01-30
| | | | | | | * Support cooperative vector without Vulkan-header update Adding a Slang support for cooperative vector. But this commit doesn't have Vulkan-header update.
* Refactor _Texture to constrain on texel types. (#6115)Yong He2025-01-17
| | | | | | | | | * Refactor _Texture to constrain on texel types. * Fix tests. * Fix. * Disable glsl texture test because rhi can't run it correctly.
* Move switch statement bodies to their own lines (#5493)Ellie Hermaszewska2024-11-05
| | | | | | | | | * Move switch statement bodies to their own lines * format --------- Co-authored-by: Yong He <yonghe@outlook.com>
* formatEllie Hermaszewska2024-10-29
| | | | | | | * format * Minor test fixes * enable checking cpp format in ci
* Make variadic generics work with interfaces and forward autodiff. (#4905)Yong He2024-08-23
|
* Tuple swizzling, concat, comparison and `countof`. (#4856)Yong He2024-08-19
| | | | | | | | | | | * Tuple swizzling and element access. * Update proposal status. * Cleanup. * Fix merrge error. * Address review.
* Variadic Generics Part 2: IR lowering and specialization. (#4849)Yong He2024-08-18
| | | | | | | | | * Variadic Generics Part 2: IR lowering and specialization. * Update design doc status. * Update design doc. * Resolve review comments.
* Allow passing sized array to unsized array parameter. (#4744)Yong He2024-07-26
|
* Add options to speedup compilation. (#4240)Yong He2024-05-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | * Add options to speedup compilation. * Fix. * Plumb options to DCE pass. * Revert debug change. * Fix regressions. * More optimizations. * more cleanup and fixes. * remove comment. * Fixes. * Another fix. * Fix errors. * Fix errors. * Add comments.
* Add LoadAligned and StoreAligned methods to ByteAddressBuffers (#4066)Sriram Murali2024-05-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes #4062 This change enables wide load/stores for byte-address-buffer backed resources, when the data is accessed at an offset that is aligned. **Goals** - Improve performance by issuing wider instructions instead of sequence of scalar instructions, for load and stores of byte-address buffers. - Reduce code-size and readability of the generated shaders. - Help naive users as well as ninja programmers, generate optimal code. **Non Goals** - Help with Structured buffers, or other resources. - Target compilation time improvements. **Key changes** Adds 2 new overloads for Load and Store operations on ByteAddress Buffers. 1. Load / Store with an extra alignment parameter ``` resource.Load<T>(offset, alignment); resource.Store<T>(offset, value, alignment); ``` 2. LoadAligned / StoreAligned with no extra parameter, with the same signature as orignial Load / Store. ``` resource.LoadAligned<T>(offset); resource.StoreAligned<T>(offset, value); ``` - This overload will implicitly identify the alignment value, from the base type T of the elementary unit of the resource. **Supported resources** 1. Vectors This can be upto 4 elements, i.e. float -- float4. 2. Arrays This does not have a limit on number of elements, but on a conservative estimate, we can limit to few hundreds. 3. Structures This is used to group a resource of a single type. ``` struct { float4 x; } ``` **Code updates** - Modified byte-address-ir legalize to handle struct, array and vector kinds of load or store access - Added custom hlsl stdlib functions to implement all the overloads for Load, Store etc. - Added C-like emitter, SPIR-V emitter for handling ByteAddressBuffers. - Added a new core stdlib function intrinsic to wrap around alignOf<T>(). - Added a new peephole optimization entry to identify the equivalent IntLiteral value from the alignOf<T>() inst. - Added tests to check explicit, and implicit aligned Load and Store operations.
* Fix peephole optimization of `TypeEquals`. (#3865)Yong He2024-04-01
| | | Closes #3861.
* Extend `as` and `is` operator to work on generic types. (#3672)Yong He2024-03-04
|
* Fix various crashes when generating debug info. (#3650)Yong He2024-02-29
| | | | | | | | | | | | | | | | | * Fix crash when generating debug info for geometry shaders. * Fix. * Fix source language field in DebugCompilationUnit. * Fix. * Emit DebugEntryPoint inst. * Add trivial test. * Cleanup. * More cleanup.
* Allow default values for `extern` symbols. (#3632)Yong He2024-02-26
| | | | | | | * Allow default values for `extern` symbols. * Fix. * Fix test.
* Refactor compiler option representations. (#3598)Yong He2024-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | * Refactor compiler option representation. * Fix binary compatibility. * Add a test for specifying compiler options at link time. * Fix binary compatibility. * Fix binary compatibility. * Fix backward compatibility on matrix layout. * Fix. * Fix. * Fix. * Fix gfx. * Fix gfx. * Fix dynamic dispatch. * Polish.
* Support pointers in SPIRV. (#3561)Yong He2024-02-08
| | | | | | | | | | | * Support pointers in SPIRV. * Fix test. * Enhance test. * Fix test. * Cleanup.
* Atomics+Wave ops intrinsics fixes. (#3542)Yong He2024-02-02
| | | | | | | | | | | | | | | * Fix atomics intrinsics, increase kMaxDescriptorSets. * Add SPIRVASM to known non-differentiable insts. * Support fp16 wave ops when targeting glsl. * Fixes. * Fix vk validation errors. * Fix. * Add to allowed failures.
* Add ConstBufferPointer::subscript. (#3415)Yong He2023-12-15
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Fix Phi simplification bug (#3325)Ellie Hermaszewska2023-11-13
| | | Fixes https://github.com/shader-slang/slang/issues/3323
* Cleanup builtin arithmetic interfaces. (#3317)Yong He2023-11-10
| | | | | | | | | | | | | | | | | | | | | * wip: clean up IArithmetic * wip. * Cleanup builtin arithmetic interfaces. * Fix. * Fixes. * Fix. * Fix. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* SPIRV compiler performance fixes. (#3258)Yong He2023-10-04
| | | | | | | | | | | | | | | * SPIRV compiler performance fixes. * Cleanup. * update project files * Cleanup debug code. * Make redundancy removal non-recursive. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Various SPIRV fixes. (#3231)Yong He2023-09-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Various SPIRV fixes. - Geometry shader support (WIP). - Fix texture get dimension and load. - Fold global GetElement(MakeArray/MakeVector) insts. - Call spvopt to inline all functions. - Translate OpImageSubscript. - Emit struct member names and global variable names. - Fix lowering of OpBitNot -> OpNot, instead of OpBitReverse. * Fix test. * Fix geometry shader. * Fix geometry shader emit. * Add atomic Image access test. * Fix tests. * don't fail if spirv-opt fails. * Update comments. * Fix test. * Cleanups. * indentation --------- Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
* Various slangpy fixes. (#3227)Yong He2023-09-21
| | | | | | | | | | | | | * Make dynamic cast transparent through `IRAttributedType`. * Add [CUDAXxx] variant of attributes. * Support marshaling of vector types. * Wrap cuda kernels in `extern "C"` block. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* SPIR-V image operations (#3163)Ellie Hermaszewska2023-09-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add __truncate and __sampledType for spirv_asm Allows some texture tests to start passing * add __isVector Currently unused * Add 1-vector legalization pass (WIP) * Add capabilities for image types * neaten instruction dumping * add 1-vector test * Add a couple of cases to vec1 legalization * Remove texture tests from expected failures * comment * regenerate vs projects * Remove redundant define form synchapi emulation * refactoring image methods * All sample functions refactored * Remove incorrect glsl intrinsics Partially addresses https://github.com/shader-slang/slang/issues/3174 * __subscript image ops via writing funcs * Extract texture struct writing from core.meta.slang * Abstract out cuda intrinsic * Remvoe erroneous call to opDecorateIndex * spirv asm IR utils * Correct position of loads for SPIR-V asm inst operands * Raise constructors to global scope during spir-v legalization * Correct snippet output * Implement most texture sampling ops for SPIR-V * Legalize 1-vectors for glsl too * Make SPIR-V inst operands non-hoistable * Better 1-vector legalization * Put textures in ptrs for spirv * insert missing break * Add vec1 legalization test * Add some missing pieces to slang-ir-insts * Greatly neaten vec1 legalization * a * Neaten vec1 legalization * Add image read and write intrinsics for spir-v * Squash warnings * regenerate vs projects * Drop redundant guards * Drop 5 tests from expected failure list * Inst numbering changes to cross compile tests * vec1 legalization tests only on vk * Correct location of asm op emit * Inline constant in spirv-asm * Correct signedness for lane in wave intrinsics * Extract element from float1 for cuda * squash warnings * Neaten spirv-emit * dedupe more capabilities * warnings * neaten assert * comments * comments
* Add `target_switch` and `intrinsic_asm` statement. (#3154)Yong He2023-08-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add `target_switch` and `__intrinsic_asm` statement. * Cleanup. * WaveGetActiveMask, WaveGetActiveMask, WaveCountBits. * WaveIsFirstLane. * More wave intrinsics. * wave intrinsics. * merge fix. * Fix. * Fix. * Update test. * update test. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Misc. SPIRV Fixes, Part 2. (#3147)Yong He2023-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Misc. SPIRV Fixes, Part 2. * Fix up. * Fix. * Add system smenatic values. * 16 bit int and floats, matrix/vector reshape, bool ops. * Fix. * Fix. * Allow push constant entry point params. * entrypoint params. * swizzleSet and swizzledStore. * packoffset. * string hash. * Fix. * Matrix arithmetics. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Fix scalar swizzle causes invalid glsl output. (#3028)Yong He2023-07-26
| | | | | | | | | | | | | * Fix -fvk-u-shift not applying to RWStructuredBuffer on glsl output. * Add `transpose` to `ObjectToWorld4x3`. * Fix scalar swizzle causes invalid glsl output. Fixes #3026. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Optimize specialization, and remove unnecessary calls to `simplifyIR`. (#2999)Yong He2023-07-19
| | | | | | | | | | | | | | | | | | | | | | | | | * Remove unneccessary calls to `simplifyIR`. * fix. * Delete obsolete hoistConst pass. * Fix. * Small improvements. * Fix. * Fix enum lowering. * fix * tweaks. * tweaks. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Fix intellisense and autodiff crashes. (#2879)Yong He2023-05-10
| | | | | | | | | | | * Fix intellisense crash. * Fix a bug in updateElement simplification. * cleanup. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Generate faster derivative for div by const operations. (#2877)Yong He2023-05-10
| | | | | | | | | * Generate faster derivative for div by const operations. * Increase `kMaxIterationsToAttempt` to 256. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Fix handling of `[PreferRecompute]`. (#2855)Sai Praveen Bangaru2023-04-28
| | | | Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Yong He <yonghe@outlook.com>
* Fix most of the disabled warnings on gcc/clang (#2839)Ellie Hermaszewska2023-04-26
|
* Dictionary using lowerCamel (#2835)jsmall-nvidia2023-04-25
| | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP lowerCamel Dictionary. * WIP more lowerCamel fixes for Dictionary. * Add/Remove/Clear * GetValue/Contains * Fix tabs in dictionary. Count -> getCount * Fix fields with caps. * Key -> key Value -> value Use m_ for members where appropriate. Use lowerCamel in linked list. * Some small fixes/improvements to Dictionary. * Kick CI.
* Combine lookupWitness lowering with specialization. (#2794)Yong He2023-04-12
|
* Update checkpoint policy to make obvious recompute decisions. (#2753)Yong He2023-03-29
| | | | | | | | | | | | | | | * Update checkpoint policy to make obvious recompute decisions. Also adds an optimization to fold updateElement chains on the same array or struct into a single makeArray or makeStruct. * Bug fixes around array types with different int typed count. * change test. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Fix Phi simplification bug. (#2710)Yong He2023-03-16
| | | | | | | | | | | | | | | | | | | | | | | * Fix Phi simplification bug. * Fix up. * Fix. * Fix. * Fix. * Fix. * Fix. * Fix test. * Fix test. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Properly implement differential witness of intermediate context type. (#2699)Yong He2023-03-15
| | | | | | | | | * Properly implement differential witness of intermediate context type. * Modify test to include a loop. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Support `fwd_diff(bwd_diff(f))`. (#2697)Yong He2023-03-14
| | | | | | | | | * Support `fwd_diff(bwd_diff(f))`. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Remove `SharedIRBuilder`. (#2657)Yong He2023-02-16
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Various auto-diff bug fixes. (#2646)Yong He2023-02-13
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Add Loop Unrolling Pass. (#2644)Yong He2023-02-13
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Arithmetic simplifications and more IR clean up logic. (#2632)Yong He2023-02-07
|
* Unify UpdateField and UpdateElement with access chain. (#2611)Yong He2023-01-25
| | | | | | | * Unify UpdateField and UpdateElement with access chain. * Fix warnings. Co-authored-by: Yong He <yhe@nvidia.com>
* Reimplement address elimination. (#2605)Yong He2023-01-24
| | | | | | | | | * Reimplement address elimination pass. * Fix error. * Update test references. Co-authored-by: Yong He <yhe@nvidia.com>