summaryrefslogtreecommitdiff
path: root/source/slang/slang-ir-specialize.cpp
AgeCommit message (Collapse)Author
2024-09-10Specialize existential return types when possible. (#5044)Yong He
* Fix inccorect dropping of declref during Unification of DeclaredSubtypeWitness. * Add extension test. * Specialize existential return types when possible. * Fix. * Fix. * Fix falcor issue.
2024-08-23Make variadic generics work with interfaces and forward autodiff. (#4905)Yong He
2024-08-19Tuple swizzling, concat, comparison and `countof`. (#4856)Yong He
* Tuple swizzling and element access. * Update proposal status. * Cleanup. * Fix merrge error. * Address review.
2024-08-18Variadic Generics Part 2: IR lowering and specialization. (#4849)Yong He
* Variadic Generics Part 2: IR lowering and specialization. * Update design doc status. * Update design doc. * Resolve review comments.
2024-08-13FIx issue with specializing witness tables (#4839)Sai Praveen Bangaru
2024-07-25Overhaul IR lowering of pointer types. (#4710)Yong He
* Overhaul IR lowering of pointer types. * Propagate address space in IRBuilder. * Fixup. * Fix. * Fix. * Change how Ptr type is printed to text. * Fix.
2024-06-12Add option to preserve shader parameter declaration in output SPIRV. (#4344)Yong He
* Add option to preserve shader parameter declarations in output. * Add test.
2024-05-29Add options to speedup compilation. (#4240)Yong He
* Add options to speedup compilation. * Fix. * Plumb options to DCE pass. * Revert debug change. * Fix regressions. * More optimizations. * more cleanup and fixes. * remove comment. * Fixes. * Another fix. * Fix errors. * Fix errors. * Add comments.
2024-05-17Add `-minimum-slang-optimization` to favor compile time. (#4186)Yong He
2024-04-19Support arithmetics on generic arguments (#3968)Jay Kwak
Resovles an issue #3935 Slang had to fold the generic arguments after specialization.
2024-03-13Fix crash when specializing generic entry points. (#3760)Yong He
2024-02-20Refactor compiler option representations. (#3598)Yong He
* Refactor compiler option representation. * Fix binary compatibility. * Add a test for specifying compiler options at link time. * Fix binary compatibility. * Fix binary compatibility. * Fix backward compatibility on matrix layout. * Fix. * Fix. * Fix. * Fix gfx. * Fix gfx. * Fix dynamic dispatch. * Polish.
2024-02-11Fix type checking around generic array types. (#3568)Yong He
2024-02-08Support pointers in SPIRV. (#3561)Yong He
* Support pointers in SPIRV. * Fix test. * Enhance test. * Fix test. * Cleanup.
2024-02-05Add per-buffer data layout control. (#3551)Yong He
* Add per-buffer data layout control. Fixes #3534. * Fixes. * Robustness. * Update test. * Fix.
2023-12-15Add ConstBufferPointer::subscript. (#3415)Yong He
Co-authored-by: Yong He <yhe@nvidia.com>
2023-10-04SPIRV compiler performance fixes. (#3258)Yong He
* SPIRV compiler performance fixes. * Cleanup. * update project files * Cleanup debug code. * Make redundancy removal non-recursive. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-08-16Use ankerl/unordered_dense as a hashmap implementation (#3036)Ellie Hermaszewska
* Correct namespace for getClockFrequency * missing const * Add missing assignment operator * Remove unused variables * Return correct modified variable * Use stable hash code for file system identity * terse static_assert * Structured binding for map iteration * Make (==) and getHashCode const on many structs * Add ConstIterator for LinkedList * Replace uses of ItemProxy::getValue with Dictionary::at * Extract list of loads from gradientsMap before updating it * Const correctness in type layout * Add unordered_dense hashmap submodule * Use wyhash or getHashCode in slang-hash.h * refactor slang-hash.h * Use ankerl/unordered_dense as a hashmap implementation Notable changes: - The subscript operator returns a reference directly to the value, rather than a lazy ItemProxy (pair of dict pointer and key) slang-profile time (95% over 10 runs): - Before: 6.3913906 (±0.0746) - After: 5.9276123 (±0.0964) * 64 bit hash for strings So they have the same hash as char buffers with the same contents * Narrowing warnings for gcc to match msvc * revert back to c++17 * Correct c++ version for msvc * Use path to unordered_dense which keeps tests happy * Do not assign to and read from map in same expression * Remove redundant map operations in primal-hoist * Split out stable hash functions into slang-stable-hash.h * 64 bit hash by default * regenerate vs projects * Correct return type from HashSetBase::getCount() * correct width for call to Dictionary::reserve * Use stable hash for obfuscated module ids * Signed int for reserve * clearer variable naming * Parameterize Dictionary on hash and equality functors * Allow heterogenous lookup for Dictionary * missing const * Use set over operator[] in some places * Remove unused function * s/at/getValue
2023-08-15SPIR-V WIP (#3064)Ellie Hermaszewska
* Add type layout for structured buffer * Default to generating spirv directly * vk test for compute simple * Add spirv-dis as a downstream compiler * Emit Array types in SPIR-V * makevector for spirv * Dump whole spirv module on validation failure * register array types todo, use emitTypeInst * Neater formatting for unhandled inst printing * break out emitCompositeConstruct * Correct array type generation * neaten * Allow getElement for vector * Remove unused * Allow predicating target intrinsics on types * Consider functions with intrinsics to have definitions We need to specialize these if they are predicated on types * Correct array type generation * makeArray for spir-v * replace getElement with getElementPtr for spirv * Correct translation of field access for spirv * Push layouts to types for spirv * Spirv intrinsics * operator now makes a pointer * Add structured buffer of struct test * Preserve type layout in spirv structured buffer legalization * neaten * makeVectorFromScalar for SPIRV * placeholder for layouts on param groups * More type safe spirv op construction * Know that constants and types only go in one section * Remove emitTypeInst * Add todo for spirv sampling * Add links to spirv documentation on emit functions * OpTypeImage support for SPIR-V * Add simpler texture test for spirv * s/spirv_direct/spirv/g * Allow several string literals in target_intrinsic * Handle global params without a var layour for SPIR-V For example groupshared vars * uint spirv asm type * Add todo for isDefinition It is currently too broad * Some atomic op spirv intrinsics * Strip ConstantBuffer wrappers for spirv * Add todo for matrix annotations * Do not associate decorations insts with spirv counterparts * Correct entry point parameter generation * Spelling * Assert that fieldAddress is returning a pointer * Add error for existential type layout getting to spir-v emit * Add IRTupleTypeLayout Unused so far * Allow getElementPtr to work with vectors * Correct target name in test * Hide default spirv direct behind a premake option --default-spirv-direct=true * Do not insert space at start of intrinsic def * Correct asm rendering in tests * remove redundant option * Emit directly from direct test * Add source language options for spirv-dis * Add comments to spirv dis * Add dead debug print for before spirv module * Correct asm rendering in tests * s/spirv_direct/spirv/g * Only specialize intrinsic functions with predicates * regenerate vs projects * squash warnings * squash warnings * remove duplication * Silence warnings from msvc * squash warnings * Overload for zero sized array * More msvc warnings * warnings * Add spirv-tools to path for tests * Do not be specific about dxc version for diag test * Normalize line endings from spirv-dis * Correct filecheck matches * Temporarily disable two spirv tests Failing on CI, undebuggable hang :/ * Do not emit storage class more than once for spirv snippet * Do not pass spir-v to spirv-dis by stdin * Do not get spirv-dis output via stream, use file * normalize file endings in spirv-dis output
2023-07-19Optimize specialization, and remove unnecessary calls to `simplifyIR`. (#2999)Yong He
* Remove unneccessary calls to `simplifyIR`. * fix. * Delete obsolete hoistConst pass. * Fix. * Small improvements. * Fix. * Fix enum lowering. * fix * tweaks. * tweaks. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-18Simplify Lookup and improve compiler performance. (#2996)Yong He
* Simplify lookup. * Various bug fixes. * Report type dictionary size in perf benchmark. * Remove type duplication. * increase initial dict size. * Bug fix. * Fix bugs. * Fixup. * Revert type legalization looping. * Fix specialization pass. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-12Pool inst worklists and hashsets to avoid rehashing. (#2982)Yong He
Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-12Use scratchData on `IRInst` to replace HashSets. (#2978)Yong He
* Use scratchData on `IRInst` to replace HashSets. * Update test results. * Initialize scratchData. * Update autodiff documentation. * Use enum instead of bool. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-11Add perf benchmark utility. (#2977)Yong He
* Add perf benchmark utility. * Update documentation. * Fix. * Fix doc. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-05-23Add API for querying total compile time. (#2898)Yong He
* Add API for querying total compile time. * Optimize. * Remove redundant simplifyIR calls. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-04-26Fix most of the disabled warnings on gcc/clang (#2839)Ellie Hermaszewska
2023-04-26Fix specialization dictionaries cleanup pass (#2844)Sai Praveen Bangaru
2023-04-25Dictionary using lowerCamel (#2835)jsmall-nvidia
* #include an absolute path didn't work - because paths were taken to always be relative. * WIP lowerCamel Dictionary. * WIP more lowerCamel fixes for Dictionary. * Add/Remove/Clear * GetValue/Contains * Fix tabs in dictionary. Count -> getCount * Fix fields with caps. * Key -> key Value -> value Use m_ for members where appropriate. Use lowerCamel in linked list. * Some small fixes/improvements to Dictionary. * Kick CI.
2023-04-12Combine lookupWitness lowering with specialization. (#2794)Yong He
2023-03-27Apply IR simplifcation immediately after specialization to avoid duplicates. ↵Yong He
(#2739) * Apply IR simplifcation immediately after specialization to avoid duplicates. * Update source/slang/slang-ir-specialize.cpp Co-authored-by: Ellie Hermaszewska <github@sub.monoid.al> --------- Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Ellie Hermaszewska <github@sub.monoid.al>
2023-03-22Type legalization and autodiff bug fixes. (#2722)Yong He
* Bug fixes. * Fix. * Only perform autodiff for functions whose derivative is actually used. * Fix loop optimize bug. * Fix high order diff. * Fix trivial diff func generation. * Fixes. * Cleanup. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-02-16Remove `SharedIRBuilder`. (#2657)Yong He
Co-authored-by: Yong He <yhe@nvidia.com>
2023-02-16Overhaul global inst deduplication and cpp/cuda backend. (#2654)Yong He
* Overhaul global inst deduplication and cpp/cuda backend. * Update IR documentation. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-02-03Overhaul `transposeParameterBlock` to support `inout` params. (#2621)Yong He
* Overhaul `transposeParameterBlock` to support `inout` params. * Small bug fixes. * Bug fix on differentiable intrinsic specialization. * Fixes. * Run autodiff tests on CPU. * Clean up. * More bug fixes., * Add test coverage on inout param. * Fix language server hinting for transcribed mutable params. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-01-17First custom backward-derivative test case working. (#2598)Yong He
2023-01-10Nested bwd-diff func call context save/restore. (#2584)Yong He
Co-authored-by: Yong He <yhe@nvidia.com>
2023-01-06Split bwd_diff op into separate ops for primal and propagate func. (#2582)Yong He
* Split bwd_diff op into separate ops for primal and propagate func. * Fix. * Download swiftshader with github actions instead of curl on linux. * Fix github action. Co-authored-by: Yong He <yhe@nvidia.com>
2022-12-07Rename IR opcodes to unify style. (#2556)Yong He
Co-authored-by: Yong He <yhe@nvidia.com>
2022-11-30Support `no_diff` on existential typed params. (#2540)Yong He
Co-authored-by: Yong He <yhe@nvidia.com>
2022-10-27More renaming in jvp pass. (#2475)Yong He
Co-authored-by: Yong He <yhe@nvidia.com>
2022-10-26Adding a differentiable standard library (#2465)Sai Praveen Bangaru
2022-09-06 Specialize and SSA in a loop + better diagnostics on dynamic dispatch ↵Yong He
failure (#2396) * Report diagnostic when dynamic dispatch failed instead of crashing. * Specialize and SSA in a loop. Explicit specialization only interface. Co-authored-by: Yong He <yhe@nvidia.com>
2022-09-01Deduplicate consts and IRSpecialize in IR, propagate type info for `IntVal`. ↵Yong He
(#2388)
2022-06-25Specialize generic/existential calls within generic functions. (#2294)Yong He
* Expose internals of dce and use it to implement call graph walk. * Specialize calls in generic functions. * Fix clang error. Co-authored-by: Yong He <yhe@nvidia.com>
2022-06-23Preserve specialization cache in IR for specialization pass. (#2293)Yong He
* Perserve specialization cache in IR for specialization pass. * Fix compile error. * Fix. * Fix. * Fix test case. * Fix. Co-authored-by: Yong He <yhe@nvidia.com>
2022-06-01Clean up void returns. (#2260)Yong He
* Clean up `IRReturnVoid`. * Update gitignore. Co-authored-by: Yong He <yhe@nvidia.com>
2021-12-17Cleanup refactoring work around the IR builder (#2061)Theresa Foley
* Cleanup refactoring work around the IR builder We have some long-term goals for the IR that require a more centralized and disciplined set of rules for how IR instructions get created/emitted. I had been working on trying to set things up so that all IR instruction creation goes through a single bottleneck point, but the non-trivial work in that branch was getting drowned out by the sheer volume of cleanup and refactoring changes. This change tries to pull together several of the more important cleanups. The big pieces are: * `IRBuilder` and `SharedIRBuilder` now protect their data members and rely on users to initialize them more directly via constructor of an `init()` method. This change affects a *bunch* of sites where `IRBuilder`s were created. I changed use sites to use the constructors whenever possible, and to use `init()` in cases where we had longer-lived builders that needed to be initialized multiple times. * The insertion location for the `IRBuilder` now uses an encapsulated type called `IRInsertLoc`. This new type can replace what used to be just two `IRInst*` fields in the builder, and also covers some new functionality (if we ever want to take advantage of it). Very little client code cares about this change, but it is still a nice cleanup in terms of making things more explicit. * The creation of an `IRModule` has been moded *out* of `IRBuilder`, because in practice we `IRBuilder` always wants to be associated with a pre-existing `IRModule` at creation time (via its `SharedIRBuilder`). There is now an `IRModule::create()` operation instead. This required changing the sequencing at many `IRModule` creation sites, since most had been contriving to make an `IRBuilder` first. There were also several cleanups because code had been carelessly using non-reference-counted pointers for `IRModule`s in ways that broke now that `IRModule::create()` always returns a `RefPtr`. * The core operations to actually allocate memory for IR instructions were moved into `IRModule` (since they interact with the memory pool that the module owns). These *were* called `createEmptyInst()` but have been renamed into `_allocateInst()`. In principle these seem like they should only be needed to be called by the `IRBuilder`, but in practice they are also needed by the IR deserialization logic. * A few core operations for emitting IR instructions that were associted with `IRBuilder` were moved to actually be methods on `IRBuilder`. First is `_findOrEmitConstant` which is the primary bottleneck for creating simple scalar constant values. Another is `_createInst` (formerly part of the templated `createInstImpl` along with `createInstWithSizeImpl`) which is the main bottleneck for allocation and initialization of any instruction other than a constant (well, the `IRModuleInst` is the other exception...). Finally, there is also `_maybeSetSourceLoc()`, which is obvious to scope inside the `IRBuilder` once it is protecting the source-location info. Notes: * The `minSizeInBytes` parameter to `_createInst()` might not actually be needed at all. At this point any `IRInst` subtypes that need data allocated for things other than their operands already get created manually via `_allocateInst` or `_findOrEmitConstant`, so I *think* we could remove that part. I will handle that in a subsequent cleanup if it turns out to be the case. * There is one IR pass (`slang-ir-string-hash.cpp`) that is using manual `_allocateInst()` instead of going through an `IRBuilder`. It could be easily cleaned up to not do so (and I will probably make that change down the line), but for now I wanted to avoid doing anything that wasn't close to pure refactoring if I could. * At this point in our design an `IRBuilder` is a very lightweight thing - it basically just owns the insertion location plus a source location to write into instructions. A lot of our code currently treats `IRBuilder`s like they are expensive and/or need to be re-used (which leads to them being used in more mutable/stateful ways). It is quite likely that as we clean up other aspects of the implementation of IR creation/emission we can make `IRBuilder` use feel more lightweight in ways that can streamline and simplify code. * The next step for this work is to identify the different paths that eventually lead to `_createInst()` being called, and unify them at a single bottleneck operation that can own the decisions around when to create an instruction vs. when to re-use an existing one (rather than those decisions being baked into the various `IRBuilder` subroutines that create instructions of the various subtypes). * fixup: gcc/clang C++ spec details
2021-10-21Passing associated type arguments to existential parameters + packing for ↵Yong He
`bool`. (#1987) * Passing associated type arguments to existential parameters + packing for `bool`. * fix typo Co-authored-by: Yong He <yhe@nvidia.com>
2021-08-26Add API to control interface specialization. (#1925)Yong He
2021-02-16Add an accessor for IRInst opcode (#1707)Tim Foley
* Add an accessor for IRInst opcode This main changing is renaming `IRInst::op` over to `IRInst::m_op` and then adds an accessor `IRInst::getOp()` to read it. The rest of the changes are just changing use sites to `getOp` (or to `m_op` in the limited cases where we write to it). This work is in anticipation of a future change that might need to store an extra bit in the same field as the opcode. It seemed better to do this massive refactoring as a separate PR. * fixup