path: root/source/slang/slang-ir-sccp.cpp
Commit message | Author | Age
* Fix IEEE 754 NaN comparisons in constant folding (#7721) | Jay Kwak | 2025-07-11
  * Fix IEEE 754 NaN comparisons in constant folding
    Added proper NaN handling in SCCP optimization pass to follow IEEE 754 standard:
    - NaN != any value returns true
    - All other NaN comparisons return false
    - Added double precision NaN detection support
    - Fixed type detection to check operands instead of result type
  * Avoid differentiating NaN and non-NaN cases
  * format code (#76)
  ---------
  Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
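The NaN rules listed in that commit can be sketched as a small folding helper. This is only an illustration of the IEEE 754 behaviour described above, not the code in `slang-ir-sccp.cpp`; the `CmpOp` enum and `foldFloatCompare` are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical comparison opcodes for illustration; not Slang's actual IR opcodes.
enum class CmpOp
{
    Eql,
    Neq,
    Less,
    LessEq,
    Greater,
    GreaterEq
};

// Folds a comparison of two constant doubles following IEEE 754:
// if either operand is NaN, only `!=` yields true; every other
// comparison yields false.
inline bool foldFloatCompare(CmpOp op, double a, double b)
{
    if (std::isnan(a) || std::isnan(b))
        return op == CmpOp::Neq;

    switch (op)
    {
    case CmpOp::Eql:
        return a == b;
    case CmpOp::Neq:
        return a != b;
    case CmpOp::Less:
        return a < b;
    case CmpOp::LessEq:
        return a <= b;
    case CmpOp::Greater:
        return a > b;
    case CmpOp::GreaterEq:
        return a >= b;
    }
    assert(false && "unhandled comparison op");
    return false;
}
```

The explicit `std::isnan` check makes the rule visible and independent of how operands are classified elsewhere, which is the kind of mistake the commit mentions (checking the result type, which is `bool`, instead of the operand types).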
* Move switch statement bodies to their own lines (#5493) | Ellie Hermaszewska | 2024-11-05
  * Move switch statement bodies to their own lines
  * format
  ---------
  Co-authored-by: Yong He <yonghe@outlook.com>
* format | Ellie Hermaszewska | 2024-10-29
  * format
  * Minor test fixes
  * enable checking cpp format in ci
* Add constant folding for % operator. (#4359) | Yong He | 2024-06-12
* Fix the sign-extending issue in right shift (#3820) | kaizhangNV | 2024-03-26
  Fix issue (#3637).
  In constant folding of a right shift operation, Slang always used a signed integer as the operand, regardless of whether the input source code was signed or unsigned. This could cause a sign-extension issue if the input was an unsigned integer with its highest bit set to 1.
  Fix the issue by checking the original type of the input and using the unsigned type if the input is unsigned.
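The sign-extension problem described in that commit can be reproduced with a short sketch. It assumes a folder that stores 32-bit integer constants in a 64-bit slot; `foldRightShift32` is a hypothetical helper for illustration, not a function from the Slang code base.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: folds `value >> shift` for a 32-bit integer constant
// held in a 64-bit slot. `isUnsigned` reflects the original operand type.
inline int64_t foldRightShift32(int64_t value, unsigned shift, bool isUnsigned)
{
    assert(shift < 32 && "shift amount must be smaller than the bit width");
    if (isUnsigned)
    {
        // Logical shift: vacated high bits are filled with zeros.
        return static_cast<int64_t>(static_cast<uint32_t>(value) >> shift);
    }
    // Arithmetic shift: the sign bit is replicated into the vacated bits
    // (guaranteed since C++20, and the behaviour of mainstream compilers before).
    return static_cast<int64_t>(static_cast<int32_t>(value) >> shift);
}
```

With the bit pattern `0x80000000` and a shift of 4, the unsigned path gives `0x08000000`, while treating the same bits as signed gives `-134217728`, which is the incorrect result the commit describes for unsigned inputs.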
* SPIRV compiler performance fixes. (#3258) | Yong He | 2023-10-04
  * SPIRV compiler performance fixes.
  * Cleanup.
  * update project files
  * Cleanup debug code.
  * Make redundancy removal non-recursive.
  ---------
  Co-authored-by: Yong He <yhe@nvidia.com>
* Various slangpy fixes. (#3227) | Yong He | 2023-09-21
  * Make dynamic cast transparent through `IRAttributedType`.
  * Add [CUDAXxx] variant of attributes.
  * Support marshaling of vector types.
  * Wrap cuda kernels in `extern "C"` block.
  ---------
  Co-authored-by: Yong He <yhe@nvidia.com>
* Add `target_switch` and `intrinsic_asm` statement. (#3154) | Yong He | 2023-08-28
  * Add `target_switch` and `__intrinsic_asm` statement.
  * Cleanup.
  * WaveGetActiveMask, WaveGetActiveMask, WaveCountBits.
  * WaveIsFirstLane.
  * More wave intrinsics.
  * wave intrinsics.
  * merge fix.
  * Fix.
  * Fix.
  * Update test.
  * update test.
  * Fix.
  ---------
  Co-authored-by: Yong He <yhe@nvidia.com>
* Optimize specialization, and remove unnecessary calls to `simplifyIR`. (#2999) | Yong He | 2023-07-19
  * Remove unnecessary calls to `simplifyIR`.
  * fix.
  * Delete obsolete hoistConst pass.
  * Fix.
  * Small improvements.
  * Fix.
  * Fix enum lowering.
  * fix
  * tweaks.
  * tweaks.
  ---------
  Co-authored-by: Yong He <yhe@nvidia.com>
* Fix div-by-zero error during sccp. (#2911) | Yong He | 2023-05-31
  * Diagnose on div-by-zero during sccp.
  * fix
  ---------
  Co-authored-by: Yong He <yhe@nvidia.com>
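A constant folder has to avoid evaluating a division or remainder whose divisor is zero, since doing so in the compiler itself is undefined behaviour in C++. The sketch below shows one way to guard the fold and leave the diagnostic to the caller. It is an assumption about the general technique, not the logic in the SCCP pass; `foldIntDivOrRem` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: folds `lhs / rhs` or `lhs % rhs` on 64-bit constants.
// Returns std::nullopt when the operation cannot be folded safely, so the
// caller can emit a diagnostic or leave the instruction unfolded.
inline std::optional<int64_t> foldIntDivOrRem(bool isRemainder, int64_t lhs, int64_t rhs)
{
    // Division or remainder by zero: never evaluate it in the compiler.
    if (rhs == 0)
        return std::nullopt;

    // INT64_MIN / -1 overflows, and both `/` and `%` are undefined for it in C++.
    if (lhs == INT64_MIN && rhs == -1)
    {
        if (isRemainder)
            return int64_t(0); // the mathematical remainder is 0
        return std::nullopt;
    }

    if (isRemainder)
        return lhs % rhs;
    return lhs / rhs;
}
```

Returning an empty optional keeps the folding helper free of diagnostic plumbing; whether the real pass reports an error or simply skips the fold is not stated in the log above.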
* Fix most of the disabled warnings on gcc/clang (#2839) | Ellie Hermaszewska | 2023-04-26
* Dictionary using lowerCamel (#2835) | jsmall-nvidia | 2023-04-25
  * #include an absolute path didn't work - because paths were taken to always be relative.
  * WIP lowerCamel Dictionary.
  * WIP more lowerCamel fixes for Dictionary.
  * Add/Remove/Clear
  * GetValue/Contains
  * Fix tabs in dictionary. Count -> getCount
  * Fix fields with caps.
  * Key -> key, Value -> value. Use m_ for members where appropriate. Use lowerCamel in linked list.
  * Some small fixes/improvements to Dictionary.
  * Kick CI.
* Fix optimization pass not converging. (#2725) | Yong He | 2023-03-23
  * Fix optimization pass not converging.
  * Fix.
  * Fix tests.
  ---------
  Co-authored-by: Yong He <yhe@nvidia.com>
* More control flow simplifications. (#2673) | Yong He | 2023-02-24
  * More control flow and Phi param simplifications.
  * Fix.
  * Fix gcc error.
  * Fix.
  * More IR cleanup.
  * Fix bug in phi param dce + ifelse simplify.
  * Propagate and DCE side-effect-free functions.
  * Enhance CFG simplification to remove loops with no side effects.
  * Fix.
  * Fixes.
  * Fix tests. Add [__AlwaysFoldIntoUseSite] for rayPayloadLocation.
  * More cleanup.
  * Fixes.
  * Fix.
  ---------
  Co-authored-by: Yong He <yhe@nvidia.com>
* Remove `SharedIRBuilder`. (#2657) | Yong He | 2023-02-16
  Co-authored-by: Yong He <yhe@nvidia.com>
* Add Loop Unrolling Pass. (#2644) | Yong He | 2023-02-13
  Co-authored-by: Yong He <yhe@nvidia.com>
* Separate primal computations from unzipped function into an explicit function. (#2569) | Yong He | 2022-12-19
  Co-authored-by: Yong He <yhe@nvidia.com>
* Lower-to-ir no longer produce `Construct` inst. (#2553) | Yong He | 2022-12-07
  Co-authored-by: Yong He <yhe@nvidia.com>
* Language feature: pointer sized int types. (#2401) | Yong He | 2022-09-15
  * Language feature: pointer sized int types.
  * Fix.
  * small change to test.
  * Fix stdlib.
  * Fix.
  * Fix.
  * Add typedef for `size_t` in stdlib.
  * Fix test.
  * Add `intptr_t::size` constant.
  Co-authored-by: Yong He <yhe@nvidia.com>
* Fix type truncation during SCCP. (#2163) | Yong He | 2022-03-18
* Fix folding of no-arg constructs in SCCP pass (#2148) | Theresa Foley | 2022-03-01
* Improved SCCP, inlining and resource specialization passes, legalize `ImageSubscript` for GLSL (#2146) | Yong He | 2022-02-25
* Cleanup refactoring work around the IR builder (#2061) | Theresa Foley | 2021-12-17
  * Cleanup refactoring work around the IR builder

    We have some long-term goals for the IR that require a more centralized and disciplined set of rules for how IR instructions get created/emitted. I had been working on trying to set things up so that all IR instruction creation goes through a single bottleneck point, but the non-trivial work in that branch was getting drowned out by the sheer volume of cleanup and refactoring changes.

    This change tries to pull together several of the more important cleanups. The big pieces are:

    * `IRBuilder` and `SharedIRBuilder` now protect their data members and rely on users to initialize them more directly via constructors or an `init()` method. This change affects a *bunch* of sites where `IRBuilder`s were created. I changed use sites to use the constructors whenever possible, and to use `init()` in cases where we had longer-lived builders that needed to be initialized multiple times.
    * The insertion location for the `IRBuilder` now uses an encapsulated type called `IRInsertLoc`. This new type can replace what used to be just two `IRInst*` fields in the builder, and also covers some new functionality (if we ever want to take advantage of it). Very little client code cares about this change, but it is still a nice cleanup in terms of making things more explicit.
    * The creation of an `IRModule` has been moved *out* of `IRBuilder`, because in practice an `IRBuilder` always wants to be associated with a pre-existing `IRModule` at creation time (via its `SharedIRBuilder`). There is now an `IRModule::create()` operation instead. This required changing the sequencing at many `IRModule` creation sites, since most had been contriving to make an `IRBuilder` first. There were also several cleanups because code had been carelessly using non-reference-counted pointers for `IRModule`s in ways that broke now that `IRModule::create()` always returns a `RefPtr`.
    * The core operations to actually allocate memory for IR instructions were moved into `IRModule` (since they interact with the memory pool that the module owns). These *were* called `createEmptyInst()` but have been renamed to `_allocateInst()`. In principle these seem like they should only need to be called by the `IRBuilder`, but in practice they are also needed by the IR deserialization logic.
    * A few core operations for emitting IR instructions that were associated with `IRBuilder` were moved to actually be methods on `IRBuilder`. First is `_findOrEmitConstant`, which is the primary bottleneck for creating simple scalar constant values. Another is `_createInst` (formerly part of the templated `createInstImpl` along with `createInstWithSizeImpl`), which is the main bottleneck for allocation and initialization of any instruction other than a constant (well, the `IRModuleInst` is the other exception...). Finally, there is also `_maybeSetSourceLoc()`, which is obvious to scope inside the `IRBuilder` once it is protecting the source-location info.

    Notes:

    * The `minSizeInBytes` parameter to `_createInst()` might not actually be needed at all. At this point any `IRInst` subtypes that need data allocated for things other than their operands already get created manually via `_allocateInst` or `_findOrEmitConstant`, so I *think* we could remove that part. I will handle that in a subsequent cleanup if it turns out to be the case.
    * There is one IR pass (`slang-ir-string-hash.cpp`) that is using manual `_allocateInst()` instead of going through an `IRBuilder`. It could be easily cleaned up to not do so (and I will probably make that change down the line), but for now I wanted to avoid doing anything that wasn't close to pure refactoring if I could.
    * At this point in our design an `IRBuilder` is a very lightweight thing - it basically just owns the insertion location plus a source location to write into instructions. A lot of our code currently treats `IRBuilder`s like they are expensive and/or need to be re-used (which leads to them being used in more mutable/stateful ways). It is quite likely that as we clean up other aspects of the implementation of IR creation/emission we can make `IRBuilder` use feel more lightweight in ways that can streamline and simplify code.
    * The next step for this work is to identify the different paths that eventually lead to `_createInst()` being called, and unify them at a single bottleneck operation that can own the decisions around when to create an instruction vs. when to re-use an existing one (rather than those decisions being baked into the various `IRBuilder` subroutines that create instructions of the various subtypes).
  * fixup: gcc/clang C++ spec details
* Add an accessor for IRInst opcode (#1707) | Tim Foley | 2021-02-16
  * Add an accessor for IRInst opcode

    The main change is renaming `IRInst::op` to `IRInst::m_op` and adding an accessor `IRInst::getOp()` to read it. The rest of the changes just update use sites to `getOp` (or to `m_op` in the limited cases where we write to it).

    This work is in anticipation of a future change that might need to store an extra bit in the same field as the opcode. It seemed better to do this massive refactoring as a separate PR.
  * fixup
* Use slang- prefix on slang compiler and core source (#973) | jsmall-nvidia | 2019-05-31
  * Prefixing source files in source/slang with slang-
  * Prefix source in source/slang with slang- prefix.
  * Rename core source files with slang- prefix.
  * Update project files.
  * Fix problems from automatic merge.