| Commit message (Collapse) | Author | Age |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add fkYAML submodule
* Generate slang-ir-inst-defs.h from slang-ir-inst-defs.yaml
* generate ir-inst-defs.h
* neaten things
* neaten inst def parser
* add rapidyaml submodule
* remove fkyaml
* remove fkyaml submodule
* remove use of ir-inst-defs.h
* format and warnings
* fix wasm build
* tidy
* remove rapidyaml
* Extend fiddle to allow custom splices in more places
* Use lua to describe ir insts
* fix
* neaten
* neaten
* neaten
* spelling
* neaten
* comment comment out assert
* merge
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Parse optional witness syntax
* Allow failing optional constraint
* Make `is` work with optional constraint
* Allow using optional constraint in checked if statements
* Fix tests
* Make it work with structs
* Fix MSVC build error
* Disallow using `as` with optional constraints
* Update test to match split is/as errors
* Add tests
* Fix uninitialized variables in tests
* Add tests of incorrect uses & fix related bugs
* Mention optional constraints in docs
* format code
* Fix type unification with NoneWitness
* Fix formatting
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Co-authored-by: Nathan V. Morrical <natemorrical@gmail.com>
|
| |
|
|
|
|
|
|
|
| |
* Move switch statement bodies to their own lines
* format
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
| |
|
|
|
|
|
| |
* format
* Minor test fixes
* enable checking cpp format in ci
|
| |
|
|
|
|
|
| |
* Remove use of `G0` and `__target_intrinsic` in stdlib.
* Fix.
* Fix calling intrinsic in global scope.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Refactor compiler option representation.
* Fix binary compatibility.
* Add a test for specifying compiler options at link time.
* Fix binary compatibility.
* Fix binary compatibility.
* Fix backward compatibility on matrix layout.
* Fix.
* Fix.
* Fix.
* Fix gfx.
* Fix gfx.
* Fix dynamic dispatch.
* Polish.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Unify Texture types in stdlib into 1 generic type.
* Fixes.
* Fix.
* Fixes.
* Fix reflection.
* Fix binding reflection.
* Add gather intrinsics.
* Fix gather intrinsics.
* Fix texture type toText.
* Fix intrinsic.
* fix cuda intrinsic.
* Fix project files.
* cleanup.
* Fix.
* Fix.
* Fix sampler feedback test.
* Fix getDimension intrinsics.
* Fix spirv sample image intrinsics.
* Fix test.
* Fix GLSL intrinsic.
* Cleanup.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Correct namespace for getClockFrequency
* missing const
* Add missing assignment operator
* Remove unused variables
* Return correct modified variable
* Use stable hash code for file system identity
* terse static_assert
* Structured binding for map iteration
* Make (==) and getHashCode const on many structs
* Add ConstIterator for LinkedList
* Replace uses of ItemProxy::getValue with Dictionary::at
* Extract list of loads from gradientsMap before updating it
* Const correctness in type layout
* Add unordered_dense hashmap submodule
* Use wyhash or getHashCode in slang-hash.h
* refactor slang-hash.h
* Use ankerl/unordered_dense as a hashmap implementation
Notable changes:
- The subscript operator returns a reference directly to the value,
rather than a lazy ItemProxy (pair of dict pointer and key)
slang-profile time (95% over 10 runs):
- Before: 6.3913906 (±0.0746)
- After: 5.9276123 (±0.0964)
* 64 bit hash for strings
So they have the same hash as char buffers with the same contents
* Narrowing warnings for gcc to match msvc
* revert back to c++17
* Correct c++ version for msvc
* Use path to unordered_dense which keeps tests happy
* Do not assign to and read from map in same expression
* Remove redundant map operations in primal-hoist
* Split out stable hash functions into slang-stable-hash.h
* 64 bit hash by default
* regenerate vs projects
* Correct return type from HashSetBase::getCount()
* correct width for call to Dictionary::reserve
* Use stable hash for obfuscated module ids
* Signed int for reserve
* clearer variable naming
* Parameterize Dictionary on hash and equality functors
* Allow heterogenous lookup for Dictionary
* missing const
* Use set over operator[] in some places
* Remove unused function
* s/at/getValue
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Simplify lookup.
* Various bug fixes.
* Report type dictionary size in perf benchmark.
* Remove type duplication.
* increase initial dict size.
* Bug fix.
* Fix bugs.
* Fixup.
* Revert type legalization looping.
* Fix specialization pass.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP lowerCamel Dictionary.
* WIP more lowerCamel fixes for Dictionary.
* Add/Remove/Clear
* GetValue/Contains
* Fix tabs in dictionary.
Count -> getCount
* Fix fields with caps.
* Key -> key
Value -> value
Use m_ for members where appropriate.
Use lowerCamel in linked list.
* Some small fixes/improvements to Dictionary.
* Kick CI.
|
| | |
|
| |
|
|
|
|
|
|
|
| |
* Emit simpler vector element access code
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
| |
* Fix generic lowering.
* Fix generic lowering regression due to IR deduplication.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
| |
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
| |
* Overhaul global inst deduplication and cpp/cuda backend.
* Update IR documentation.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
| |
* Make backward differentiation work with generics.
* Fix.
* Another fix.
* More fix.
Co-authored-by: Yong He <yhe@nvidia.com>
|
| | |
|
| |
|
|
|
|
|
|
|
| |
failure (#2396)
* Report diagnostic when dynamic dispatch failed instead of crashing.
* Specialize and SSA in a loop. Explicit specialization only interface.
Co-authored-by: Yong He <yhe@nvidia.com>
|
| | |
|
| |
|
|
|
|
|
|
|
| |
* Expose internals of dce and use it to implement call graph walk.
* Specialize calls in generic functions.
* Fix clang error.
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Perserve specialization cache in IR for specialization pass.
* Fix compile error.
* Fix.
* Fix.
* Fix test case.
* Fix.
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
| |
* Clean up `IRReturnVoid`.
* Update gitignore.
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
| |
* New language feature: basic error handling.
* Fix.
* Fix `tryCall` encoding according to code review.
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
| |
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Cleanup refactoring work around the IR builder
We have some long-term goals for the IR that require a more centralized and disciplined set of rules for how IR instructions get created/emitted. I had been working on trying to set things up so that all IR instruction creation goes through a single bottleneck point, but the non-trivial work in that branch was getting drowned out by the sheer volume of cleanup and refactoring changes. This change tries to pull together several of the more important cleanups.
The big pieces are:
* `IRBuilder` and `SharedIRBuilder` now protect their data members and rely on users to initialize them more directly via constructor of an `init()` method. This change affects a *bunch* of sites where `IRBuilder`s were created. I changed use sites to use the constructors whenever possible, and to use `init()` in cases where we had longer-lived builders that needed to be initialized multiple times.
* The insertion location for the `IRBuilder` now uses an encapsulated type called `IRInsertLoc`. This new type can replace what used to be just two `IRInst*` fields in the builder, and also covers some new functionality (if we ever want to take advantage of it). Very little client code cares about this change, but it is still a nice cleanup in terms of making things more explicit.
* The creation of an `IRModule` has been moded *out* of `IRBuilder`, because in practice we `IRBuilder` always wants to be associated with a pre-existing `IRModule` at creation time (via its `SharedIRBuilder`). There is now an `IRModule::create()` operation instead. This required changing the sequencing at many `IRModule` creation sites, since most had been contriving to make an `IRBuilder` first. There were also several cleanups because code had been carelessly using non-reference-counted pointers for `IRModule`s in ways that broke now that `IRModule::create()` always returns a `RefPtr`.
* The core operations to actually allocate memory for IR instructions were moved into `IRModule` (since they interact with the memory pool that the module owns). These *were* called `createEmptyInst()` but have been renamed into `_allocateInst()`. In principle these seem like they should only be needed to be called by the `IRBuilder`, but in practice they are also needed by the IR deserialization logic.
* A few core operations for emitting IR instructions that were associted with `IRBuilder` were moved to actually be methods on `IRBuilder`. First is `_findOrEmitConstant` which is the primary bottleneck for creating simple scalar constant values. Another is `_createInst` (formerly part of the templated `createInstImpl` along with `createInstWithSizeImpl`) which is the main bottleneck for allocation and initialization of any instruction other than a constant (well, the `IRModuleInst` is the other exception...). Finally, there is also `_maybeSetSourceLoc()`, which is obvious to scope inside the `IRBuilder` once it is protecting the source-location info.
Notes:
* The `minSizeInBytes` parameter to `_createInst()` might not actually be needed at all. At this point any `IRInst` subtypes that need data allocated for things other than their operands already get created manually via `_allocateInst` or `_findOrEmitConstant`, so I *think* we could remove that part. I will handle that in a subsequent cleanup if it turns out to be the case.
* There is one IR pass (`slang-ir-string-hash.cpp`) that is using manual `_allocateInst()` instead of going through an `IRBuilder`. It could be easily cleaned up to not do so (and I will probably make that change down the line), but for now I wanted to avoid doing anything that wasn't close to pure refactoring if I could.
* At this point in our design an `IRBuilder` is a very lightweight thing - it basically just owns the insertion location plus a source location to write into instructions. A lot of our code currently treats `IRBuilder`s like they are expensive and/or need to be re-used (which leads to them being used in more mutable/stateful ways). It is quite likely that as we clean up other aspects of the implementation of IR creation/emission we can make `IRBuilder` use feel more lightweight in ways that can streamline and simplify code.
* The next step for this work is to identify the different paths that eventually lead to `_createInst()` being called, and unify them at a single bottleneck operation that can own the decisions around when to create an instruction vs. when to re-use an existing one (rather than those decisions being baked into the various `IRBuilder` subroutines that create instructions of the various subtypes).
* fixup: gcc/clang C++ spec details
|
| | |
|
| |
|
|
|
|
|
|
|
| |
* Add an accessor for IRInst opcode
This main changing is renaming `IRInst::op` over to `IRInst::m_op` and then adds an accessor `IRInst::getOp()` to read it. The rest of the changes are just changing use sites to `getOp` (or to `m_op` in the limited cases where we write to it).
This work is in anticipation of a future change that might need to store an extra bit in the same field as the opcode. It seemed better to do this massive refactoring as a separate PR.
* fixup
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The basic feature here is the ability to use the `&` operator to produce the conjunction/intersection of two interfaces. That is, you can have interfaces:
interface IFirst { int getFirst(); }
interface ISecond { int getSecoond(); }
and if you need a generic function where the type parameter `T` must conform to *both* of these interfaces, you express that by constraining the parameter to the intersection of the interfaces:
void someFunction<T : IFirst & ISecond>(T value) { ... }
Without this feature, the main alternative an application would have is to define an intermediate interface, like:
interface IBoth : IFirst, ISecond {}
Forcing users to deal with an intermediate interface creates more work for type authors (they need to remember to inherit from the right combined interface(s)), or for `extension` authors (when you add `ISecond` to a type that used to just support `IFirst`, you had better also add `IBoth`). In the worst case, a family of N related "leaf" interfaces would give rise to an exponential number of intermediate interfaces to represnt the possible combinations.
A conjunction like `IFirst & ISecond` is officially its own type, and can be used to declare a type alias:
typealias IBoth = IFirst & ISecond;
This change only includes the first pass of work on this feature, so there are several caveats to be aware of:
* Using a conjunction as part of an inheritance clause is not yet supported (e.g., `struct X : IFirst & ISecond`). This is true even if the conjunction was introduced by an intermediate `typealias`
* The `&` syntax introduced here is only parsed in places where only a type (not an expression) is possible. This means you cannot do things like cast to a conjunction with `(IFirst & ISecond)(someValue)`.
* This work *should* apply to conjunctions of more than two interfaces (like `IA & IB & IC`) but that has not yet been tested
* In the long run it may be sensible to allow conjunctions that use concrete types, but we really ought to have the semantic checking logic rule that out for now.
* During testing, I encountered compiler crashes when trying to use this feature together with `property` declarations. Further investigation and debugging is called for.
* The handling of conjunction types is currently incomplete, in that there are many equivalences the compiler does not yet understand. For example, it is clear that `IA & IB` is equivalent to `IB & IA`, but the compiler currently does not understand this and will treat them as different types. A deeper implementation approach is called for.
* Conjunctions are currently only supported for generic type parameter constraints, when performing full specialization. Use of conjunctions for existential-type value parameters or with dynamic dispatch is not yet supported.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Overview
========
Prior to this change, we had two different code generation strategies for interface/existential types in Slang, that didn't always play nicely together:
* The "legacy" static specialization approach could handle plugging in an arbitrary concrete type for an existential type parameter (including types with resources, etc.), but wouldn't work well with things like a `StructuredBuffer<>` of an interface type, and requires somewhat counter-intuitive layout rules to make work.
* The new dynamic dispatch approach produces simpler, more easily understood layouts by assuming that values of interface type can fit into a fixed number of bytes. The tradeoff there is that it cannot handle types that include resources (only POD types).
The goal of this change is to make it so that the two strategies can co-exist. In particular, in cases where a shader is amenable to both static specialization and dynamic dispatch, the type layouts should agree.
In order to make the type layouts agree, we:
* Declare that *all* values of existential type reserve storage according to the dynamic-dispatch rules (so 16 bytes for the RTTI and witness-table information, plus whatever bytes are needed to story "any value" of a conforming type).
* Then we modify the "legacy" layout rules so that if a value of concrete type can fit in the reserved "any value" space for a given interface, then it is laid out there exactly like the dynamic dispatch rules would do. Otherwise, we fall back to the previous legacy rules (since we don't need to agree with the dynamic-dispatch layout on types that can't be used with dynamic dispatch).
Details
=======
* Renamed `ExistentialBox` to `BoundInterfaceType` to better clarify how it relates to `BindExistentialsType`
* Unconditionally apply the `lowerGenerics` pass during emit, since it is now responsible for aspects of the lowering of existential types when specialization is used.
* Made IR type layout take the target into account, so that the layout of resource types can vary by target (e.g., being POD on some targets, and invalid on others)
* Cleaned up some issues around using global shader parameters as the "key" for their layout information in the global-scope layout (only comes up when there are global-scope `uniform` parameters)
* Made there be a default any-value size (16) instead of making it be an error to leave out. This was the simplest option; we could try to go back to having an error, but we'd need to only issue it if we are sure a type/interface is being used with dynamic dispatch, since static dispatch doesn't have to obey the restrictions.
* Changed lowering of existential types to tuples so that bound interfaces where the concrete type won't fit use a "pseudo-pointer" instead of an "any-value" to hold the payload
* Changed IR type legalization to handle the "pseudo-pointer" case and apply layout information from an interface type over to the payload part when static specialization was used.
* Changed some details of how witness tables were being lowered, so that we didn't have to create "proxy" witness tables for the constraints on associated types (just use the actual requirement entries we generate)
* Changed witness tables so that they know the subtype doing the conforming
* Added logic so that we don't generate pack/unpack logic and witness table wrapper functions for types that are incompatible with any-value/dynamic dispatch for a given interface.
* Changed the core AST-level type layout logic to use the dynamic-dispatch layout in case things fit, and the legacy static specialization case when things don't (while also reserving space for the dynamic-dispatch fields)
* Changed a bunch of test cases for static specialization to properly use the new layout (which introduces new buffers in some cases, and moves data around in others).
Future Work
===========
The experience of trying to reconcile our older way of handling interface-type specialization with our newer model (that supports dynamic dispatch) makes it clear that we really need to make similar changes to our handling of generic type parameters on entry points and at the global scope.
A future change should make it so that a global type parameter is lowered with a type layout similar to a value parameter of interface type, including the RTTI and witness-table pieces, and just leaving out the "any value" piece. A similar translation strategy should apply to entry-point generic parameters (mirroring how we lower generic functions for dynamic dispatch already), and value specialization parameters.
Co-authored-by: Yong He <yonghe@outlook.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Use integer RTTI/witness handles in existential tuples.
* Fix clang error.
* Fix IR serialization to use 16bits for opcode.
* Undo accidental comment change.
* Use variable length encoding for opcode.
* Fix compile error.
* Fixing issues
* Fix code review issues.
|
| |
|
|
|
|
|
|
|
|
|
| |
* Allow existential types in `StructuredBuffer` element type.
* Handle StructuredBuffer.Load/.Consume methods
* Clean up unnecessary changes
* Code cleanup
* Update test comment
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Enable lower-generics pass universally.
* Exclude builtin interfaces and functions from lower-generics pass.
* Update stdlib.
* Fixup.
* Fixes handling of nested intrinsic generic functions.
* Fixes.
* Fixes.
|
| |
|
|
|
| |
* Support initializing an existential value from a generic value.
* Remove trailing spaces and clean up debugging code.
|
| |
|
| |
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
| |
|
|
|
| |
* AnyValue based dynamic code gen
* Fix aarch64 build error
|
| |
|
|
|
|
|
|
|
| |
code (#1444)
* Refactor lower-generics pass into separate subpasses.
* IR pass to generate witness table wrappers.
* Support associatedtype local variables and return values in dynamic dispatch code.
|
| |
|
|
|
|
|
|
|
| |
* Refactor lower-generics pass into separate subpasses.
* IR pass to generate witness table wrappers.
* Re-generate vs project files.
* Fix x86 build error.
|
| |
|