summaryrefslogtreecommitdiffstats
path: root/source/slang/slang-ir-clone.cpp
Commit message (Collapse)AuthorAge
* Add utility to trace creation of problematic IRInsts to assist LLM in ↵Copilot2025-07-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | debugging (#7820) * Initial plan * Add SLANG_DEBUG_IR_BREAK environment variable support Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Apply code formatting to SLANG_DEBUG_IR_BREAK implementation Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Improve stack trace debugging with -rdynamic flag and backtrace_symbols Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Address PR feedback: use PlatformUtil::getEnvironmentVariable, remove -rdynamic flag, and delete fallback branch Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Address PR feedback: simplify env var parsing, move backtrace to PlatformUtil, use #if for SLANG_LINUX_FAMILY Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Address PR feedback: remove unneeded include, make backtrace() more generic by removing uid parameter Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Fix and clone source tracking. * Add python script to dump traces. * Update instructions. * Batch calls to addr2line * Cleanup claude instructions. * update claude action. * Remove duplicated build instructions from claude.yml workflow Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * fix build error. * Fix build errors --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com> Co-authored-by: Gangzheng Tong <tonggangzheng@gmail.com>
* Fix SPIRV `OpSpecConstantOp` emit (#7158)Darren Wihandi2025-05-29
| | | | | | | | | | | | | * Fix SPIRV specialization constant with floating-point operations * Improve test * WIP * Restrict `OpSpecConstantOp` allowed operations based on SPIRV specifications * Fix typo on floating type check * Emit error on float to int spec cosnt int val casts
* Implement spec const for generic parameter (#7121)kaizhangNV2025-05-15
| | | | | | | | | | | | | | | | | | Close #6840. This PR add supports to use specialize constant in generic parameter, and that parameter can also be used as array size, e.g. following code should work: ``` struct MyStruct<let N: int> { float buffer[N]; } MyStruct<SpecConstVar> s; ``` - Loose the restriction from Link-Time to SpecializationConstant when extract generic argument - Tweak the logic of how we decide whether a inst is hoistable. Besides checking existing hoistable flag of each IRInst, when we detect a IRInst's type is SpecConstRateType, we will treat that inst hoistable. Because IRInst in global scope can be deduplicated, and every SpecConstRateType inst should be in the global scope or IRGeneric scope (which will be at global scope after specialization). - Remove the SpecConstIntVal to IRInst map in IR lowering logic, because we already have way to deduplicate the global scope IR.
* Fixed generic interface specialization crashes (#6601): (#6688)Ronan2025-04-03
| | | | | | | | | | | | | | | | | | | | * Fixed generic interface specialization crashes: - Add an export decoration to specialized generic interfaces. * Fixed generic interface specialization crashes: - Add an export decoration to specialized generic interfaces. - Use getTypeNameHint(...) instead of a manual mangler. * In cloneInstDecorationsAndChildren: specialize all linkage decorations, not just the exports. - If a linkage decoration is already present, it is not specialized and replaced by the specialized one. - If a specialization uses the TypeNameHint, sanitize it to be used as an identifier. - Use the identifier name sanitizer from slang-mangle. * Added tests/generics/generic-interface-linkage.slang - See #6601 and #6688
* Fix nullptr in generic specialization (#6066)Julius Ikkala2025-01-17
| | | | | | | | | | | | | * Fix nullptr in generic specialization * Fix formatting * Revert "Fix nullptr in generic specialization" and add emitPtrLit instead * Add type parameter to getPtrValue() --------- Co-authored-by: Yong He <yonghe@outlook.com>
* formatEllie Hermaszewska2024-10-29
| | | | | | | * format * Minor test fixes * enable checking cpp format in ci
* Report AD checkpoint contexts (#5058)venkataram-nv2024-09-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Transferring source locations when creating phi instructions * Tracking for simple variables * Deriving source locations for loop counters * Printing checkpoint structure breakdown * More readable output format * Special behavior for loop counters * Writing report to file * Add slangc option to enable checkpoint reports * Display types of checkpointed fields * Message in case there are no checkpointing contexts * Catch source locations for function calls * Source cleanup * Fix compilation warnings * Remove stray dump() * Provide the report through diagnostic notes * Add missing path for sourceLoc during unzip pass * Add tests for reporting intermediates * Include more transfer cases for source locations * Fix ordering in address elimination * Fill in more holes with source location transfer * Remove debugging line * Reverting changes to diagnostic sink * Simplify address elimination using source location RAII contexts * Eliminating manual source loc transfers in forward transcription * Fix local var adaptation to use RAII location setter * Simplify primal hoisting logic for source location transfer * Simplify unzipping with RAII location scopes * Simplify transpose logic * Cleaning up for rev.cpp * Reverting spacing changes * Fix mistake with source loc RAII instantiation * Fix formatting issues
* Support dependent generic constraints. (#4870)Yong He2024-08-20
| | | | | | | | | | | | | * Support dependent generic constraints. * Fix warning. * Update comment. * Fix. * Add a test case to verify fix of #3804. * Address review.
* Initial support for differentiating existential types (#3111)Sai Praveen Bangaru2023-08-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Merge * WIP: Complete auto-diff logic for existential types * Revert "Add compiler option for generating representative hash" This reverts commit 13b09ef4621e73844c96d64d9c111a8ed0d45aae. * More fixes for fwd-mode AD on existential types * Add anyValueSize inference pass * Fix checking of `Differential.Differential==Differential` * In-progress: infer any-value-size for existential types * Existentials now work in forward-mode * Overhaul handling of existential AD types. Fwd-mode works, reverse-mode requires front-end changes * Reverse-mode now works on existentials * Cleanup * Remove diff rules for create existential object for now * Revert treat-as-differentiable changes * Fixes * More fixes * Cleanup * more cleanup * signed/unsigned * Revert "Cleanup" This reverts commit e4f7d71f07bb207736f90708961eeecd09a1b652. * Cleanup (again) * Remove public/export/keep-alive on null differential after AD pass * Minor fix * Update dictionary accessors * Keep export decoration * More fixes + Support for `kIROp_PackAnyValue` * Merge upstream * Update expected-failure.txt
* Dictionary using lowerCamel (#2835)jsmall-nvidia2023-04-25
| | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP lowerCamel Dictionary. * WIP more lowerCamel fixes for Dictionary. * Add/Remove/Clear * GetValue/Contains * Fix tabs in dictionary. Count -> getCount * Fix fields with caps. * Key -> key Value -> value Use m_ for members where appropriate. Use lowerCamel in linked list. * Some small fixes/improvements to Dictionary. * Kick CI.
* Type legalization and autodiff bug fixes. (#2722)Yong He2023-03-22
| | | | | | | | | | | | | | | | | | | | | * Bug fixes. * Fix. * Only perform autodiff for functions whose derivative is actually used. * Fix loop optimize bug. * Fix high order diff. * Fix trivial diff func generation. * Fixes. * Cleanup. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Remove `SharedIRBuilder`. (#2657)Yong He2023-02-16
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Overhaul global inst deduplication and cpp/cuda backend. (#2654)Yong He2023-02-16
| | | | | | | | | * Overhaul global inst deduplication and cpp/cuda backend. * Update IR documentation. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Overhaul `transposeParameterBlock` to support `inout` params. (#2621)Yong He2023-02-03
| | | | | | | | | | | | | | | | | | | | | | | * Overhaul `transposeParameterBlock` to support `inout` params. * Small bug fixes. * Bug fix on differentiable intrinsic specialization. * Fixes. * Run autodiff tests on CPU. * Clean up. * More bug fixes., * Add test coverage on inout param. * Fix language server hinting for transcribed mutable params. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Separate primal computations from unzipped function into an explicit ↵Yong He2022-12-19
| | | | | function. (#2569) Co-authored-by: Yong He <yhe@nvidia.com>
* Specialize and SSA in a loop + better diagnostics on dynamic dispatch ↵Yong He2022-09-06
| | | | | | | | | failure (#2396) * Report diagnostic when dynamic dispatch failed instead of crashing. * Specialize and SSA in a loop. Explicit specialization only interface. Co-authored-by: Yong He <yhe@nvidia.com>
* Cleanup refactoring work around the IR builder (#2061)Theresa Foley2021-12-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Cleanup refactoring work around the IR builder We have some long-term goals for the IR that require a more centralized and disciplined set of rules for how IR instructions get created/emitted. I had been working on trying to set things up so that all IR instruction creation goes through a single bottleneck point, but the non-trivial work in that branch was getting drowned out by the sheer volume of cleanup and refactoring changes. This change tries to pull together several of the more important cleanups. The big pieces are: * `IRBuilder` and `SharedIRBuilder` now protect their data members and rely on users to initialize them more directly via constructor of an `init()` method. This change affects a *bunch* of sites where `IRBuilder`s were created. I changed use sites to use the constructors whenever possible, and to use `init()` in cases where we had longer-lived builders that needed to be initialized multiple times. * The insertion location for the `IRBuilder` now uses an encapsulated type called `IRInsertLoc`. This new type can replace what used to be just two `IRInst*` fields in the builder, and also covers some new functionality (if we ever want to take advantage of it). Very little client code cares about this change, but it is still a nice cleanup in terms of making things more explicit. * The creation of an `IRModule` has been moded *out* of `IRBuilder`, because in practice we `IRBuilder` always wants to be associated with a pre-existing `IRModule` at creation time (via its `SharedIRBuilder`). There is now an `IRModule::create()` operation instead. This required changing the sequencing at many `IRModule` creation sites, since most had been contriving to make an `IRBuilder` first. There were also several cleanups because code had been carelessly using non-reference-counted pointers for `IRModule`s in ways that broke now that `IRModule::create()` always returns a `RefPtr`. * The core operations to actually allocate memory for IR instructions were moved into `IRModule` (since they interact with the memory pool that the module owns). These *were* called `createEmptyInst()` but have been renamed into `_allocateInst()`. In principle these seem like they should only be needed to be called by the `IRBuilder`, but in practice they are also needed by the IR deserialization logic. * A few core operations for emitting IR instructions that were associted with `IRBuilder` were moved to actually be methods on `IRBuilder`. First is `_findOrEmitConstant` which is the primary bottleneck for creating simple scalar constant values. Another is `_createInst` (formerly part of the templated `createInstImpl` along with `createInstWithSizeImpl`) which is the main bottleneck for allocation and initialization of any instruction other than a constant (well, the `IRModuleInst` is the other exception...). Finally, there is also `_maybeSetSourceLoc()`, which is obvious to scope inside the `IRBuilder` once it is protecting the source-location info. Notes: * The `minSizeInBytes` parameter to `_createInst()` might not actually be needed at all. At this point any `IRInst` subtypes that need data allocated for things other than their operands already get created manually via `_allocateInst` or `_findOrEmitConstant`, so I *think* we could remove that part. I will handle that in a subsequent cleanup if it turns out to be the case. * There is one IR pass (`slang-ir-string-hash.cpp`) that is using manual `_allocateInst()` instead of going through an `IRBuilder`. It could be easily cleaned up to not do so (and I will probably make that change down the line), but for now I wanted to avoid doing anything that wasn't close to pure refactoring if I could. * At this point in our design an `IRBuilder` is a very lightweight thing - it basically just owns the insertion location plus a source location to write into instructions. A lot of our code currently treats `IRBuilder`s like they are expensive and/or need to be re-used (which leads to them being used in more mutable/stateful ways). It is quite likely that as we clean up other aspects of the implementation of IR creation/emission we can make `IRBuilder` use feel more lightweight in ways that can streamline and simplify code. * The next step for this work is to identify the different paths that eventually lead to `_createInst()` being called, and unify them at a single bottleneck operation that can own the decisions around when to create an instruction vs. when to re-use an existing one (rather than those decisions being baked into the various `IRBuilder` subroutines that create instructions of the various subtypes). * fixup: gcc/clang C++ spec details
* Fix shader-toy example. (#2047)Yong He2021-12-06
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Add an accessor for IRInst opcode (#1707)Tim Foley2021-02-16
| | | | | | | | | * Add an accessor for IRInst opcode This main changing is renaming `IRInst::op` over to `IRInst::m_op` and then adds an accessor `IRInst::getOp()` to read it. The rest of the changes are just changing use sites to `getOp` (or to `m_op` in the limited cases where we write to it). This work is in anticipation of a future change that might need to store an extra bit in the same field as the opcode. It seemed better to do this massive refactoring as a separate PR. * fixup
* Improvements around hashing (#1355)jsmall-nvidia2020-05-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Fields from upper to lower case in slang-ast-decl.h * Lower camel field names in slang-ast-stmt.h * Fix fields in slang-ast-expr.h * slang-ast-type.h make fields lowerCamel. * slang-ast-base.h members functions lowerCamel. * Method names in slang-ast-type.h to lowerCamel. * GetCanonicalType -> getCanonicalType * Substitute -> substitute * Equals -> equals ToString -> toString * ParentDecl -> parentDecl Members -> members * * Make hash code types explicit * Use HashCode as return type of GetHashCode * Added conversion from double to int64_t * Split Stable from other hash functions * toHash32/64 to convert a HashCode to the other styles. GetHashCode32/64 -> getHashCode32/64 GetStableHashCode32/64 -> getStableHashCode32/64 * Other Get/Stable/HashCode32/64 fixes * GetHashCode -> getHashCode * Equals -> equals * CreateCanonicalType -> createCanonicalType * Catches of polymorphic types should be through references otherwise slicing can occur. * Fixes for newer verison of gcc. Fix hashing problem on gcc for Dictionary. * Another fix for GetHashPos * Fix signed issue around GetHashPos
* Add a basc inlining facility for use in the stdlib (#1271)Tim Foley2020-03-11
| | | | | | | | | | | | | | | | | | | The main feature visible to the stdlib here is the `[__unsafeForceInlineEarly]` attribute, which can be attached to a function definition and forces calls to that function to be inlined immediately after initial IR lowering. The pass is implemented in `slang-ir-inline.{h,cpp}` and currently only handles the completely trivial case of a function with no control flow that ends with a single `return`. The lack of support for any other cases motivates the `__unsafe` prefix on the attribute. In order to test that the pass works, I modified the "comma operator" in the standard library to be defined directly (rather than relying on special-case handling in IR lowering), and then added a test that uses that operator to make sure it generates code as expected. The compute version of the test confirms that we generate semantically correct code for the operator, while the SPIR-V cross-compilation test confirms that our output matches GLSL where the comma operator has been inlined, rather than turned into a subroutine. Notes for the future: * With this change it should be possible (in principle) to redefine all the compound operators in the stdlib to instead be ordinary functions with the new attribute, removing the need for `slang-compound-intrinsics.h`. * Once the compound intrinsics are defined in the stdlib, it should be easier/possible to start making built-in operators like `+` be ordinary functions from the standpoint of the IR * The attribute and pass here could be extended to include an alternative inlining attribute that happens later in compilation (after linking) but otherwise works the same. This could in theory be used for functions where we don't want to inline the definition into generated IR, but still want to inline things berfore generating final HlSL/GLSL/whatever. * The inlining pass itself could be generalized to work for less trivial functions pretty easily; for the most part it would just mean "splitting" the block with the call site and then inserting clones of the blocks in the callee. Any `return` instructions in the clone would become unconditional branches (with arguments) to the block after the call (which would get a parameter to represent the returned value). * The "hard" part for such an inlining pass would be handling cases where the control flow that results from inlining can't be handled by our later restructuring passes. The long-term fix there is to implement something like the "relooper" algorithm to restructure control flow as required for specific targets.
* Use slang- prefix on slang compiler and core source (#973)jsmall-nvidia2019-05-31
* Prefixing source files in source/slang with slang- * Prefix source in source/slang with slang- prefix. * Rename core source files with slang- prefix. * Update project files. * Fix problems from automatic merge.