summaryrefslogtreecommitdiffstats
path: root/source/slang/slang-ast-builder.cpp
Commit message (Collapse)AuthorAge
* Rename some symbols related to pointers types (#8592)Theresa Foley2025-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note that while this change touched a large numer of files, there are no changes to functionality being made here. The only things being done are renaming various symbols and, in a few cases, updating or adding comments for consistency with the new names. The core of the naming changes are: * Most things named to refer to `OutType` (e.g., `IROutType`, `IRBuilder::getOutType()`, etc.) have been consistently renamed to refer to `OutParamType`, to emphasize that the relevant AST/IR node types are only intended for use to represent `out` parameters. * The same change as described above for `OutType` is also made for `RefType`, which becomes `RefParamType` in most cases. One mess that this exposes is the way that the `ExplicitRef<T>` type in the core module currently lowers to `IRRefParamType`. This change sticks to the rule of not making functional changes, so that mess is left as-is for now. * Names referring to `InOutType` have been changed to instead refer to `BorrowInOutType`. The intention with this naming change is to emphasize that the Slang rules for `inout` are semantically those of a borrow (or at least our interpretation of what a borrow means). * Names referring to `ConstRefType` have been changed to instead refer to `BorrowInType`. This change starts work on clarifying that the existing `__constref` modifier was never intended to be a read-only analogue of `__ref`, and instead is the input-only analogue of `inout`. * The `ParameterDirection` enum type has been changed to `ParamPassingMode`, to reflect the fact that the concept of "direction" fails to capture what is actually being encoded, particularly once we have modes beyond simple `in`/`out`/`inout`. While this change does not alter behavior in any case (the user-exposed Slang language is unchanged), it is intended to set up subsequence changes that will work to make the handling of these types in the compiler more nuanced and correct. Breaking this part of the change out separately is primarily motivated by a desire to minimize the effort for reviewers. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* fix a crash when using type equality constaint (#8515)kaizhangNV2025-09-23
| | | | | | | | | | | | Close #8193. When constructing `TransitiveTypeWitness` node, we should check if there is operand that represents two equal times. Currently, we only check whether the operand is `TypeEqualityWitness`, which is not good enough, because a `DeclaredSubtypeWitness` could also be representing two same types, in that case, we should also const fold this kind of witness. Fails to do so, we could finally ends up with a generating a lookup witness IR on a generic parameter that is not supposed to be looked up.
* Split overloaded uses of RefType in front-end (#8427)Theresa Foley2025-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Overview ======== This change is the start of an attempt to address how the Slang compiler codebase has ended up conflating two similar, but semantically distinct, concepts: * The long-standing notion of `ref` parameters (only allowed for use in the builtin modules), which are encoded using a wrapper `Type` in the AST as part of the representation of the parameters of a `FuncType`. * A recently-introduced notion of explicit reference types that mirror the built-in `Ptr` type, with a relationship comparable to that between pointer and reference types in C++. The change splits the `Ref<T>` type in the core module into two distinct types, with one for each of the two use cases. Similarly, the `RefType` class in the compiler's AST is split into two distinct classes, to represent the two cases. Background ========== The `Ref<T>` type in the core module (hidden and not intended for users to ever see or use) was originally introduced to encode the `ref` parameter-passing mode, comparable to the hidden `Out<T>` and `InOut<T>` types used to encode `out` and `inout` parameter-passing modes. The `Ref<T>` type in the core module was encoded as a instance of the `RefType` class in the Slang AST (similar to how `Out<T>` mapped to an `OutType`). These AST classes were *only* intended to be used by the compiler front-end as part of its encoding of function types. The `FuncType` class needed a way to distinguish an `inout int` parameter from a plain (implicitly `in`) `int` parameter, so these wrapper like `RefType` and `OutType` were introduced to encode both the parameter type (`T`) and the parameter-passing mode in a form that could be passed around as a `Type`. Notably, the `Ref<T>` type (and `Out<T>`, etc.) were *not* intended to be type names that ever get uttered in Slang code (not even in the builtin modules), and the vast majority of the compiler code was not supposed to ever encounter them. They were an implementation detail of `FuncType`, and nothing else. (In hindsight it may have been a mistake to use a nominal type declared in the core module to implement these wrappers; it might have been a good idea to use an entirely separate class of `Type` for this case...) Recent changes to the builtin modules introduced functions that wanted to *return* a reference (so that the parameter-passing-mode modifiers like `ref` could not trivially be used), and as part of those changes the appealingly-named `Ref<T>` type in the core module was re-used for this new case. Builtin operations were declared with an explicit `Ref<T>` return type, and parts of the compiler front-end that had previously been blissfully unaware of the AST's `RefType` (and `InOutType`, etc.) had to start accounting for the possibility that an explicit `Ref<T>` would show up. Related changes also introduced a comparable conflation of the (unfortunately-named) `constref` parameter-passing modifier and builtin operations that wanted to return an explicit reference that is read-only. Both use cases were mapped to the core-module `ConstRef<T>` type, which appeared in the AST as an instance of the `ConstRefType` class. The overlapping use of `ConstRef<T>`` is actually significantly more troublesome than the `Ref<T>` case because, despite what its name implies, `constref` was not really supposed to be the read-only analogue of `ref`, but rather it is closer to the "immutable value borrow" analogue to `inout`'s "mutable value borrow." The semantics of a "value borrow" vs. a "memory reference" in Slang have not been very carefully codified, and the conflation around `ConstRef<T>` has contributed to things becoming increasingly muddy in the compiler back-end. Main Changes ============ Core Module ----------- The `Ref<T>` type has been replaced with two distinct types, with one for each use case: * `RefParam<T>` is intended for use when encoding a `ref` parameter in a function type * `ExplicitRef<T>` is intended for use when an operation in a builtin module wants to return a reference The other types used to represent parameter-passing modes (e.g., `InOut<T>`) were renamed to better indicate that their role in defining parameter types (e.g., `InOutParam<T>`). The `ExplicitRef<T>` type was given additional generic parameters for the allowed access and the address space, akin to what `Ptr<T>` now supports. The pointer dereference operator (prefix `*`) in the core module should now properly propagate the access and address space of the pointer over to the reference that gets returned. The two distinct use cases of `ConstRef<T>` were not split in the way as `Ref<T>`, instead the case for the `constref` parameter-passing mode uses `ConstParamRef<T>`, while cases that previously used `ConstRef<T>` to represent a read-only explicit reference instead now use `ExplicitRef<T, Access.Read>`. Prior to this change there were two subscripts declared on pointers: one in the `Ptr` type itself, and another in an `extension` for pointers with `Access.ReadWrite`. The comments on the code seemed to indicate that the catch-all subscript used to only have a `get` accessor, while the `ref` was only available on read-write pointers, but it seems that subsequent changes converted the default subscript to support `ref`. This change eliminates the subscript added via `extension`, since it is redundant. AST and Front-End ================= Similar to the changes in the core module, the AST `RefType` class was split into: * `RefParamType` for the case of encoding `ref` parameters * `ExplicitRefType` for the case where the user meant an explicit reference type All the other classes that represent wrappers for encoding parameter-passing modes (e.g., `OutType`) were similarly renamed (e.g., `OutParamType`). The `ConstRefType` class was simply renamed to `ConstRefParamType`, because any use cases of `ConstRefType` that intended an explicit reference type will now use `ExplicitRefType` with `Acccess.Read`. For convenience, this change includes type aliases to map the old names for these types over to the new ones (e.g., `using OutType = OutParamType`) so that the change doesn't need to affect quite so many lines of code. The `RefType` and `ConstRefType` names are intentionally left undefined, since it woudl be unsafe to assume that existing use sites should default to either of the two possible interpretations. All use cases of `RefType` and `ConstRefType` (and their former shared base class `RefTypeBase`) were audited and updated to refer to either `RefParamType`/`ConstRefParamType` or `ExplicitRefType`, as appropriate (based on whether the context of the code indicated it was working with parameter-passing mode wrapper types, or explicit reference types). In many (many) cases comments were added to the code that was updated (and some unrelated code that needed to be audited along the way) to note cases where there appears to be something fishy going on in the compiler and/or there are obvious opportunities for next-step improvement. The `QualType` constructor used to infer l-value-ness when passed a `RefType` or `ConstRefType`; that code was introduced to support explicit reference types. The code was updated to consult the access argument of an `ExplicitRefType` to try and determine the right l-value-ness to use. There is some ambiguity about what should be done in the case where the value of the generic argument representing the access cannot be statically determined; a better solution may be needed. Many other cases in the front-end that were working with `RefType` and `ConstRefType` for explicit references also need to figure out l-value-ness, and these were changed to rely on the logic already added to `QualType` so that it wouldn't have to be duplicated. It isn't clear if this structure is the best way to tackle the problem, but it seems to at least be an upgrade over the more strictly ad-hoc logic that was in place before. Future Work =========== IR-Level Work ------------- The most obvious next step to take is that the split that was made in the compiler front-end needs to be properly plumbed through all of the back-end. There appears to be a lot of code in the back end of the compiler that has made the same conflation of `ref` parameters and explicit reference types that the front-end did. In practice, any uses of `ExplicitRef<T>` in the front-end should desugar into plain pointer-based code in the IR. Clean Up Parameter-Passing Modes -------------------------------- The code that handles different parameter-passing modes (`ParameterDirection`s) and their wrapper types is somewhat scattered and messy (as found while auditing use cases of `RefType`). A cleanup pass is warranted to ensure that most code only needs to think about `ParameterDirection`s. There should ideally be only a single operation in the front-end that handles determining the `ParameterDirection` of a parameter based on its modifiers. Similarly, there should be one operation to wrap a value type based on a parameter direction, and one operation to derive a `ParameterDirection` from the wrapper type. Ideally, the accessors for `FuncType` should not provide unrestricted access to the potentially-wrapped parameter types, and should instead return some kind of `ParamInfo` struct that encodes both a `ParameterDirection` and the unwrapped `Type` of the parameter. Clean Up `QualType` ------------------- A significant piece of future work that appears required is to drastically clean up and improve the way that `QualType`s are represente and handled in the front-end. There are currently various distinct `bool` flags in `QualType` (some with very unclear meaning) and differnet parts of the codebase consult/modify only subsets of them; a clear enumeration of the "value categories" (to use the C++ terminology) that Slang supports could be quite helpful. Naively, a `QualType` should at least encode the basic information that a `Ptr` type encodes: * A value type * Allowed access (read-only, read-write, etc.) * Address space The main additional thing that a `QualType` needs is a way to distinguish cases where an expression evaluates to: * A reference to a memory location, where all the information from a `Ptr` is relevant * A simple value, such that the access and address space are irrelevant * A reference to an abstract storage location (a `property`, `subscript`, or an implicit conversion that needs to support being an l-value), in which case address space is irrelevant and the "allowed access" basically amounts to a listing of the accessors the storage location supports Eliminate Explicit Reference Types ---------------------------------- Finally, twe should eventually eliminate the `ExplicitRef<T>` type from the core module (and all of the supporting code from the front-end), since the feature is not a good fit for the Slang language. We should find some other way to decorate operations in the builtin module that need to returns a reference rather than a value (note how `ref` accessors already avoided exposing explicit reference types, by design). --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Added __magic_enum (#8436)Ronan2025-09-17
| | | | | | | | | | | | | | | | | | | | | Fixes #8406 (and #8410). `AddressSpace`, `MemoryScope` and `AccessQualifier` are no longer `BaseType`. I added a new `__magic_enum` (very similar to `__magic_type`) syntax to be able to easily create values or these enums from the compiler. (I don't know if it was the right way to do it, but it works and the changes are small enough?). I had a weird bug: `tests/language-feature/capability/address-of.slang` was failing in `IRBuilder::_findOrEmitConstant(IRConstant& keyInst)`. When needing a new `u64(0)`, it did not find it in the `ConstantMap` first, but then failed to add it right after because it already existed in the map! But this was triggered by `IRPtrType* IRBuilder::getPtrType(IROp op, IRType* valueType, AccessQualifier accessQualifier, AddressSpace addressSpace)`, which is a strange coincidence... but I could not find the issue in what I did. I ended up bumping unordered_dense, and it solved the issue (so there was a bug in there).
* [CBP] Pointer frontend changes + groupshared pointer support (#7848)ArielG-NV2025-08-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Resolves #7628 Resolves: #8197 Primary Goals: 1. Add `Access` to pointer 2. AddressSpace::GroupShared support for pointers (SPIR-V) 3. Add `__getAddress()` to replace `&` * `&` is not updated to `require(cpu)` since slangpy uses `&`. This means we must: (1) merge PR; (2) replace `&` with `__getAddress()`; (3) add `require(cpu)` to `&` Changes: * Added to `Ptr` the `Access` generic argument & logic (for `Access::Read`). * Moved the generic argument `AddressSpace` from `Ptr` to the end of the type. * Added pointer casting support between any `Ptr` as long as the `AddressSpace` is the same * Disallow globallycoherent T* and coherent T* * Disallow const T*, T const*, and const T* * Fixed .natvis display of `ConstantValue` `ValOperandNode` * Support generic resolution of type-casted integers * Added `VariablePointer` emitting for spirv + other minor logic needed for groupshared pointers Breaking Changes: * Anyone using the `AddressSpace` of `Ptr` will now have to account for the `Access` argument * we disallow various syntax paired with `Ptr` and `T*` --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Introduce CDataLayout & -fvk-use-c-layout (#8136)Julius Ikkala2025-08-21
| | | | | | | | | | | | | | | | Closes #8112. ~~The issue asks for a "C layout", but in this PR I use the term "CPU layout" because this naming was pre-existing in the codebase as `kCPULayoutRulesImpl_`. The primary purpose of this layout is to match CPU-side struct definitions with the shader side. I'm open to better naming suggestions, though.~~ Edit: switched back to using `CDataLayout` & `-fvk-use-c-layout`, as the CPU target depends on the object layout rules of existing CPU layout rules, but they're incompatible with actual shaders. So a new `kCLayoutRulesImpl_` was needed anyway. --------- Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
* A new approach to AST node deduplication (#8072)Theresa Foley2025-08-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * A new approach to AST node deduplication Background ---------- The Slang compiler currently relies on the idea that AST nodes derived from `Val` are always deduplicated based on their opcode and operands. Deduplication requires caching, and we thus have to determine the right scope at which to allocate and cache different `Val`s. We need to ensure that `Val`s that refer to declarations/nodes in a specific compilation session/linkage are not cached in a scope that will outlive that session/linkage (or else the cache will contain garbage pointers). Conversely, we also need to ensure that any `Val`s that are referred to by something like a builtin module's AST will remain alive at least as long as that module/AST, and that every compilation session/linkage that refers to that module will find that `Val` cached if they try to create an equivalent one. The existing approach to deduplication has some subtleties: * A builtin module like the core module needs to be fully loaded (using the `ASTBuilder` for the global session) before any other compilation session might construct `Val`s referring to nodes in the AST of that module. * The various declarations/types/values cached on the `SharedASTBuilder` need to be created before any user-defined compilation occurs, to ensure that all of the relevant `Val`s are created on the global session's AST builder (see `Session::finalizeSharedASTBuilder`). * Related to all of the above: the deduplication cache on a non-top-level `ASTBuilder` is initialized by copying the cache from the top-level (global session) `ASTBuilder` on creation. Any subsequent allocations on the global-session `ASTBuilder` will not be visible to the linkage-level `ASTBuilder`. This led to weird things like the *options parsing* logic for `-load-core-module` reaching in and performing a manual copy of the cache from the global session to a linkage to try to restore the invariants. * Notably, the deduplication logic doesn't actually care if two different linkages create distinct `Val`s that represent the same thing, so long as nothing in the AST of a builtin module will refer to that thing. E.g., if the builtin modules never mention `Ptr<String>`, then two different linkages that both refer to that type will likely construct distinct `Val`s to represent it. Thus the deduplication is not as complete as some might assume. The subtle ordering considerations when creating `Val`s related to builtin modules has proved to be a sticking point for implementing on-demand deserialization of the builtin modules (in order to improve startup times). Overview -------- This change implements a new approach that hopefully makes it easier to ensure correctness, even in the case where AST nodes for builtin modules and `Val`s that refer to those nodes sometimes get created after other compilation has been performed. The key points of the new approach are: * `ASTBuilder`s are now explicitly (rather than just implicitly) linked into a hierarchy. Currently the compiler codebase will only create a two-level hierarchy: the `ASTBuilder` for a global session is the parent to all of the `ASTBuilder`s created for `Linkage`s. The expectation is that the code can and will generalize to more levels of nesting. * When a request is made to create a `Val` (or find it in a cache), there is a single `ASTBuilder` that is determined to be responsible for owning that `Val` for its lifetime. The logic involved is comparable to the way that the `IRBuilder` decides where to insert a "hoistable" (deduplicated) instruction, with the simplification that we are dealing with a tree instead of a CFG. Details ------- * `Session::finalizeASTBuilder()` is gone, because it is no longer needed * The logic in the `ASTBuilder` constructor and in `slang-options.cpp` that was manually copying the `m_cachedNodes` from the global-session `ASTBuilder` over to a linkage's `ASTBuilder` is removed, since it is no longer needed. * Made every AST node (`NodeBase`) carry a pointer to the `ASTBuilder` that created it. This is wasteful, but makes it easier to be sure the implementation will work. * Introduced a class `RootASTBuilder`, derived from `ASTBuilder` to represent the root of a given hierarchy of AST builders. * Every non-root `ASTBuilder` is now constructed with a pointer to its parent builder. * Changed it so that instead of allocating a `SharedASTBuilder` and then passing it in to create one or more `ASTBuilder`s, the shared AST builder state is more of an implementation detail of the `ASTBuilder` type, and is automatically allocated behind the scenes as part of creating an `ASTBuilder`. * The inline (defined in header) `ASTBuilder::_getOrCreateImpl()` now just does a first-pass check for an existing cached `Val` and, if it doesn't find one, delegates to `ASTBuilder::_getOrCreateImplSlowPath()`, which encapsulates the logic for the cache-miss case (even if that logic is just two lines). * The `ASTBuilder::_findAppropriateASTBuilderForVal()` method inspects a `ValNodeDesc` to determine the "deepest" of the AST builders among its operands (falling back to the root AST builder if there are no relevant operands). * The `ASTBuilder::_getOrCreateValDirectly()` is intended for use when the correct AST builder to use for caching/allocation has already been identified. * Moved the caching of generic arguments for "default substitutions" out of `ASTBuilder` and onto `GenericDecl` itself. Note that the naming of the old field (`m_cachedGenericDefaultArgs`) was unclear about the fact that this is related to "default substitutions" (where each generic parameter is fed an argument that refers to the parameter itself), and has *nothing* to do with any default argument values that might be set on the generic parameters. * Changed the global session (`Session`) so that instead of storing pointers to both the root/builtin `ASTBuilder` *and* the corresponding `SharedASTBuilder`, it just retains a pointer to the root `ASTBuilder`. This is consistent with the move toward making the `SharedASTBuilder` just an implementation detail. Possible Future Work -------------------- * In theory this change should unblock on-demand AST deserialization, so it would be good to get back to those changes once this new approach lands. * Many AST node subclasses end up storing their own `ASTBuilder` pointers, which are likely now redundant with the one in `NodeBase`. These should be eliminated where possible. * A lot of code for AST-related manipulations has been changed over time to require an `ASTBuilder` to be passed in, but at this point such a builder should always be derive-able so long as the operation has at least one non-null AST node available to it. * Most of the accessors that are currently on `SharedASTBuilder` can/should be migrated to just be on `ASTBuilder`, where they can use the data from the `SharedASTBuilder` as part of their implementation. Ideally most code should only nee to interface with `ASTBuilder`s directly, and will never need to talk to the `SharedASTBuilder`. * This change cleaned up some of the ownership hierarchy, and made it so that the global session only retains a pointer to the root AST builder (and not the shared state as well). A logical follow-on for that would be to make the `Linkage` more properly own (and thus allocate) its own AST builder (and other related objects), and have the global session only store a pointer to the root linkage (and not point to the various sub-objects directly). * The `Linkage` and `SourceManager` types have their own form of parent/child hierarchy (restricted to two levels), and could be generalized in a way that is similar to what this change does for `ASTBuilder`. Support for a hierarchy of `Linkage`s could be a powerful tool for end users, if we expose it in the right way. * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Add check for the variable requirement (#6677)Gangzheng Tong2025-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add check for the variable requirement This change adds the capability check for the variables requirement. With this check, the shader ``` [require(cpp_cuda_glsl_hlsl_metal_spirv)] Buffer<float> InputTyped; [require(cpp_cuda_glsl_hlsl_metal_spirv)] RWBuffer<float> OutputTyped; ``` will issue error if targeting to WSGL e.g. `.\build\Debug\bin\slangc .\tests\wgsl_no_buffer.slang -o wgsl_no_buffer.txt -target wgsl -entry Main -stage compute` .\tests\wgsl_no_buffer.slang(2): error 36108: 'InputTyped' has dependencies that are not compatible on the required target 'wgsl'. Buffer<float> InputTyped; ^~~~~~~~~~ .\tests\wgsl_no_buffer.slang(4): error 36108: 'OutputTyped' has dependencies that are not compatible on the required target 'wgsl'. RWBuffer<float> OutputTyped; ^~~~~~~~~~~ Fixes #6304 * Add var capability tests * Do capability checks for global var only * Add inferredCapabilityRequirements to var capability check * Add requirement to the intrinsic types Buffer/RWBuffer * format code * Update capabliity test * use DefaultDataLayout as default data layout * Use visitMemberExpr to check the capabilities * Update the cap tests to match the error messages * update test to use the ScalarDataLayout for hlsl target * Update tests check condition to use error number only * Add default push_constant data layout type --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* support specialization constant sized array (#6871)kaizhangNV2025-05-14
| | | | | | | | | | | | | | | | | | | | | Close #6859 Goal of this PR We want to support an array whose size can be specialization constant for shared/global variable e.g. layout (constant_id = 0) const uint BLOCK_SIZE = 64; shared float buf_a[(BLOCK_SIZE + 5) * 4]; Overview of the solution: During IndexExpr check, we will loose the restriction to allow SpecConst passing, but the size parameter will not be a constant value because it cannot be folded into a constant, so we will make it follow the same logic as generic parameter value, and the size will be represented by FuncCallIntVal/PolynomialIntVal/DeclRefIntVal. During IR lowering, we will detect whether there is spec constant in the IntVal, and wrap the IRInst with a SpecConstRateType, and propagate the type though the lowering logic, such that the IntVal representing the array size will have SpecConstRateType. During spirv emit stage, if we detect that a IRInst has SpecConstRateType, we will emit it as SpecConstantOp. We have to implement new logic to emit OpSpecConstantOp, the existing emit logic doesn't support emitting OpSpecConstantOp, especially this op can embed arithmetic operation at global scope, where we can only emit arithmetic instruct at local. But there are only few instructs we need to support. Overview of the solution: This PR doesn't support generic, and we will create a separate PR to extend that, tracked in #6840.
* A new approach to AST serialization (#6854)Theresa Foley2025-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * A new approach to AST serialization This change completely overhauls the way that AST nodes are being serialized, and the offline source-code generation steps that enable that serialization. In practice, this ends up being a complete overhaul of the way that *modules* are being serialized (not just the AST part), although things like the serialization format for the Slang IR and for source locations are not affected. The rest of this commit message is broken down in to sections, in an attempt to help guide anybody looking at the code in how to make sense of all the changes. The Old C++ Extractor --------------------- AST serialization used to be driven by information scraped using the `slang-cpp-extractor` tool, which did an ad hoc parse of the C++ declarations of the AST node types and then generated a set of "X macros" that could be for macro-based code generation within the rest of the compiler. While the existing approach was functional, it wasn't easy to understand or maintain, and it has been getting in the way of forward progress on other features we'd like to work on in the language and compiler. This change removes the `slang-cpp-extractor` tool entirely. Marking Up the AST Declarations ------------------------------- The most notable change that contributors to the compiler may notice is the large number of invocations of a macro `FIDDLE()` on the declarations of the AST node types. The basic idea is that only declarations (namespaces, types, fields) that are preceded by `FIDDLE()` are visible to the code generator tool. So if somebody is working with the AST and wondering why a new node type isn't working, or why a field they added isn't being serialized correctly, it is probably because they need to add `FIDDLE()` in front of it. Generating the Boilerplate Code ------------------------------- The file `slang-ast-boilerplate.cpp` provides a good example of how the information extracted from the marked-up AST declarations gets used. In that file, the `FIDDLE TEMPLATE` construct is used to generate type information for each of the AST node types. Similar logic is used in `slang-ast-forward-declarations.h` to generate the declaration of the `ASTNodeType` enumeration, and forward-declare all the AST node classes. For many parts of the code, simply including that file replaces the need for the old `slang-generated-*.h` files. Replacing Visitors and Related Logic ------------------------------------ The old visitor types for the AST used the macros that were generated by `slang-cpp-extractor`, so something new was needed to replace them. The same goes for the `SLANG_AST_NODE_VIRTUAL_CALL` macros. The core of the solution implemented here is in `slang-ast-dispatch.h`. Given a "dispatchable" AST node type (say, `Expr`), a call like: ``` ASTNodeDispatcher<Expr,R>(expr, [&](auto e) { return doSomething(e); }) ``` is an expression of type `R`, which does the equivalent of something like: ``` switch(expr->getTag()) { case ASTNodeType::VarExpr: return doSomething(static_cast<VarExpr*>(expr)); // ... } ``` The `SLANG_AST_NODE_VIRTUAL_CALL` macro is now implemented in terms of `ASTNodeDispatcher`. The implementation of the visitor types is more involved. The code in this change retains some of the macro names from the original version, just to try and make the parallels more clear. The visitor types are all implemented on top of the `ASTNodeDispatcher` approach, and use `FIDDLE TEMPLATE` to generate all the boilerplate `visit*()` method declarations. Refactoring of `Linkage` Module Loading --------------------------------------- Needing to revisit all the places where modules get deserialized made it clear that there is a lot of complexity and apparent duplication in the core routines on the `Linkage` that get used for loading modules. This change tries to clean up some of that logic, but it is worth noting that there are two legacy features that get in the way of making things as clean as they should be: * The `LoadedModuleDictionary` type that gets passed around a lot exists entirely to handle the corner case where somebody uses the Slang API to perform a compilation with multiple `TranslationUnitRequest`s in the same `FrontEndCompileRequest`, and one of the translation units `import`s the module defined by another of the translation units. * There are a lot of special-case behaviors and routines entirely there to support the `ModuleLibrary` feature, although that feature should be considered deprecated (or at least subject to getting entirely re-designed down the line). The basic idea of the cleanup is that all of the (non-deprecated) ways load a module from a serialized binary, or compile one from source should now bottleneck through `loadModuleImpl`, which then bifurcates into `loadSourceModuleImpl` for the compilation case and `loadBinaryModuleImpl` for the deserialization case. High-Level Serialization Approach --------------------------------- The old serialization logic used the [RIFF](https://en.wikipedia.org/wiki/Resource_Interchange_File_Format) format to encode the high-level structure of things, and this change retains that usage (and actually doubles down on the RIFF usage). The old serialization system relied on the idea that for any given type `Foo` that wants to support serialization, there should be something like a `SerialFooData` type in C++, that can represent the state of a `Foo`, and then the actual serialization applied to that `SerialFooData`. This means that in most cases there are four pieces of code written: * During serialization: * Copying the data of a `Foo` in memory over to a `SerialFooData` in memory * Writing the state of a `SerialFooData` into the serialized data stream * During deserialization: * Reading the state of a `SerialFooData` from a serialized data stream * Copying the data of the `SerialFooData` in memory over to a `Foo` The new logic gets rid of the intermediate `SerialFooData`. In the serialization direction, we take a `Foo` and write it to the `RIFFContainer` directly, or using some other utilities layered on top of it. In the deserialization direction, we have additional flexibility. Given a `RIFFContainer::Chunk*` that represents a serialized `Foo`, we often navigate through the in-memory representation of the RIFF data to get to the parts of the serialized value that we actually want/need, without needing to deserialize the entire `Foo`. To support this kind of operation, this change introduces a few helper types like `ContainerChunkRef` an `ModuleChunkRef`, that are little more than typed wrappers around a `RIFFContainer::Chunk*`. The Module "Container" Part --------------------------- A serialized `Module` is encoded as a RIFF chunk, using logic in `slang-serialize-container.cpp` - both before and after this change. This change reorganizes a lot of the code in that file, to account for the way that eliminating the intermediate `SerialContainerData` type streamlines the overall task of writing out the parts of the module. In the deserialization logic... there isn't really much to do in `slang-serialize-container.cpp`. Most of the logic in `slang.cpp` and `slang-module-library.cpp` that pertains to deserializing modules uses the `ModuleChunkRef`-based approach, and simply extracts the pieces of the serialized module that it needs. The Actual Serialization of the AST ----------------------------------- The actual AST serialization logic is in `slang-serialize-ast.cpp`. The basic approach in both the writing and reading directions is: * Use the `FIDDLE TEMPLATE` system to generate a set of functions, one for each AST node type, that recursively invoke the read/write logic on each field of that node (after recursively invoking the case for its direct superclass) * Use the `ASTNodeDispatcher` system to dispatch out to those functions whene reading or writing anything derived from `NodeBase` * For now, handle all types *not* derived from `NodeBase` by hand. There's a lot of room for improvement around that last item: it should be just as easy to generate the serialization and deserialization logic for other types that don't inherit from `NodeBase`, but the current change tries to err on the side of making the logic as explicit and simplistic as possible, rather than trying to get too clever too soon. The actual serialization *format* used for the AST is almost comically simplistic: the code uses hierarchical RIFF chunks to emulate a JSON-like structure. This is a very wasteful representation (e.g., a `bool` or a null pointer each take up *8 bytes*), but the goal for now is to start with the simplest thing that could possibly work, and only add more cleverness once we are sure it won't get in the way of important future improvements (like lazy/on-demand deserialization or IR and AST, to improve compiler startup times). The files `slang-serialize.{h,cpp}` have been co-opted to define a new pair of types `Encoder` and `Decoder` that are used for a more-or-less stream-oriented way or reading or writing RIFF chunks for the JSON-like structure. Almost everything related to the actual AST serialization could do with a cleanup pass, and some time spent on picking good/better names for everything. Smaller Stuff ------------- * Cleaned up a lot of code that was using bare `ASTNodeType` or the extractor's `ReflectClassInfo` type to consistently use `SyntaxClass`. * Fixed an apparent bug in how the destination-driven code genarator was handling `TryExpr`s * Fixed an apparent bug in how the GLSL legalization pass was handling translation of certain `SV_*` semantics. * format code * fixup: template errors caught by non-VS compilers * format code * fixup: more template errors * fixup: more stuff VS didn't catch * fixup: it's amazing VS doesn't catch these... * fixup: yet more template stuff VS ignores * fixup: more VS template nonsense * fixup: unreachable return macro usage * fixup: more unreacable returns * fixup: unused parameter * fixup: strict aliasing * fixup: allow missing entry point list chunk * fixup: wasm build script * fixup: AST changes since this PR was created --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com>
* Simplify implicit cast ctors for vector & matrix. (#6408)Yong He2025-02-20
| | | | | | | | | | | * Simplify implicit cast ctors for vector & matrix. * Fix formatting. * Fix tests. * Fix Falcor test. * Mark __builtin_cast as internal.
* Add datalayout for constant buffers. (#5608)Yong He2024-11-21
| | | | | | | | | | | | | | | | | | | * Add datalayout for constant buffers. * Fixes. * Fix test. * Fix glsl codegen. * Update spirv-specific doc. * Fix test. * Fix binding in the presense of specialization constants. * address comments. * Add a test for constant buffer layout.
* formatEllie Hermaszewska2024-10-29
| | | | | | | * format * Minor test fixes * enable checking cpp format in ci
* Replace the word stdlib or standard-library with core-module for source code ↵Jay Kwak2024-10-28
| | | | | (#5415) This commit changes the word "stdlib" or "standard library" to "core module" in the source code.
* Support `IDifferentiablePtrType` (#5031)Sai Praveen Bangaru2024-09-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * initial diff-ref-type interface * Initial support for `IDifferentiablePtrType` * Fix unused vars * More tests + fix switch case fallthrough. * Update slang-ir-autodiff.cpp * Update diff-ptr-type-loop.slang * Add optimization to allow more complex pair types * Update slang-ir-autodiff-primal-hoist.cpp * Update diff-ptr-type-loop.slang * Update slang-ir-autodiff-primal-hoist.cpp * More fixes to address reviews * Update slang-check-expr.cpp * Optimizations + rename `differentiableRefInterfaceType` -> `differentiablePtrInterfaceType` * Move pair logic to ir-builder, unify the type dictionaries. --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Make tuple types work in autodiff. (#4923)Yong He2024-08-28
|
* Tuple swizzling, concat, comparison and `countof`. (#4856)Yong He2024-08-19
| | | | | | | | | | | * Tuple swizzling and element access. * Update proposal status. * Cleanup. * Fix merrge error. * Address review.
* Variadic Generics Part 2: IR lowering and specialization. (#4849)Yong He2024-08-18
| | | | | | | | | * Variadic Generics Part 2: IR lowering and specialization. * Update design doc status. * Update design doc. * Resolve review comments.
* Variadic Generics Part 1: parsing and type checking. (#4833)Yong He2024-08-14
|
* Overhaul IR lowering of pointer types. (#4710)Yong He2024-07-25
| | | | | | | | | | | | | | | * Overhaul IR lowering of pointer types. * Propagate address space in IRBuilder. * Fixup. * Fix. * Fix. * Change how Ptr type is printed to text. * Fix.
* Add slangc flag to `-zero-initialize` all variables (#3987)ArielG-NV2024-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Default (zero'd) values with `-zero-initialize` flag Adds `-zero-initialize` flag to set values to a __default() expression if they are missing a initExpr. * address review and ensure __default calls ctor + zero's fields. 1. We must keep zero-initialize in SemanticsDeclHeaderVisitor. This is done because else a ctor will be initialized before we can set struct fields to `__default`. 2. IRDefaultCtorDecoration was added to track default ctor's with parent struct. 3. ParentAggTypeModifier was added to track ChildOfStruct->IRType for sharing data such as with functions. This is required to ensure we associate a lowered function with a lowered struct type * Removed decoration to track defaultCtor in favor of field. This was done since decorations are checked for IR objects, storing auxillary info does not work here as a result if usable object. * address some review comments Since `IDefaultInitializable` is taking a considerabley larger amount of time than anticipated I am pushing some of the other fixes requested. I did not remove the "IRStruct storing a default Ctor" hack yet. mostly renamed/adjusted tests to work as intended added test to ensure we don't synthisize a junk `= 0` when not in `zero initialize` mode removed member in favor of sharedContext+dictionary. * a working but incorrect impl * default init without any IR hacks (fully working aside from generic/containored-types) * Finish zero init code 1. IDefaultInitializer interface was added. If conforming, your type may be zero-initialized. To Conform a `__init()` is required 2. `[OnlyAutoInitIfForced]` was added. This attribute states that a default initializer should only be implicitly called if forced by the compiler (`zero-initialize` for example). This allows types which implicitly/explicitly conform to IDefaultInitialize to have optional auto-init behavior (which is Slang's default for user structs) to be disabled. * note about `[OnlyAutoInitIfForced]`. This is required for std-lib to not automatically resolve init-expressions for std-lib, but it has the added benifit of allowing user made structs/classes to control the default behavior of initializing * fix ErrType assumption * testing why dx12 fails local but passes CI * push vector changes to generic test * push syntax adjustment, still figuring out what is wrong with cuda. * remove debug changes & adjust style * fix field-init expressions with structs initializers don't init a static in a ctor. This would be illegal code and wrong code (init list in lower-to-ir) * minor adjustments temporarily while the rest of the issue is discussed * fix * implement IDefaultInitializable * remove a unneeded whitespace change * fix type checking error should be checking if a valid type is `Type`, not `BasicExpressionType` * needs to be DeclRefType, not Type * fix langguage server error * change findinheritance for correctness + cleanup * remove return false verified the issue was `findInheritance` * push attempt at language server fix * still trying to fix inheritance * added extension support, remove redundant code Did not address all review comments yet, want to see if CI also passes my changes * undo a change which caused CI to fail * change logic + DefaultConstructExpr setup code to use defaultConstructExpr when possible to construct a default without overhead of invoke/related also changed code so parent's defaultInitializable propegates to derived member * 1. fix error in `isSubtype` 2. add flag to isSubtype `subtypeInheritanceIsNotFullyResolved` was added since we may not be done the lookup stage but still require `isSubtype` checking to verify usage of inheritance while working with inheritance. In This case we will just skip `ensureLookup` and "caching" (since we don't have a cache invalidation system, nor need) * fix bug in logic + add test to better catch the bug * address comment + isSubTypeOption + wrapper type test, * fix wrong code adjustment I checked on the CI and realized I caused a failure, mistake was made not negating some code * syntax, class naming capital * remove stdlib default initialize changes, replace with `__default()` for init * remove redundant code + fix defaultConstruct emitting previously defaultConstruct emitting was crashing due to having generics unresolved. By not resolving the default construct immediately, everything works. * remove a coment * add test to ensure static variables dont `init` inside a struct's `__init` * fix Ptr members breaking struct use * address review and add -zero-initialize test `-zero-initialize` test was added to be sure debug pointers are not broken with default init values --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Fix type checking around generic array types. (#3568)Yong He2024-02-11
|
* Type layouts for structured buffers with counters (#3269)Ellie Hermaszewska2023-10-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * More tests for append structured buffer * Append and Consume structured buffer tests for DX12 * neaten * test wobble * Add counter layout information to append/consume structured buffers * add getRWStructuredBufferType * Correct definition of get size for append/consume structured buffers * tweak append structured buffer test * Allow initializing counter buffer in render test * vulkan test for consume structured buffer * Handle null counterVarLayout in getExplicitCounterBindingRangeOffset * remove dead code * Implement atomic counter increment/decrement for spirv * explicit spirv test * Add missing check on result * Hold on to counter resources --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Support `constref` parameters passing. (#3249)Yong He2023-09-28
| | | | | | | | | | | | | | | * Support `constref` parameters passing. * Fix. * Fix. * Add test and diagnostic on mix use of __constref and no_diff. * check for [constref] on differentiable member method. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Fix for threading issues around global session & epoch ids. (#3232)jsmall-nvidia2023-09-25
| | | | | * Fix for threading issues around global session & epoch ids. * Make m_epochId atomic for thread visibility.
* Use ankerl/unordered_dense as a hashmap implementation (#3036)Ellie Hermaszewska2023-08-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Correct namespace for getClockFrequency * missing const * Add missing assignment operator * Remove unused variables * Return correct modified variable * Use stable hash code for file system identity * terse static_assert * Structured binding for map iteration * Make (==) and getHashCode const on many structs * Add ConstIterator for LinkedList * Replace uses of ItemProxy::getValue with Dictionary::at * Extract list of loads from gradientsMap before updating it * Const correctness in type layout * Add unordered_dense hashmap submodule * Use wyhash or getHashCode in slang-hash.h * refactor slang-hash.h * Use ankerl/unordered_dense as a hashmap implementation Notable changes: - The subscript operator returns a reference directly to the value, rather than a lazy ItemProxy (pair of dict pointer and key) slang-profile time (95% over 10 runs): - Before: 6.3913906 (±0.0746) - After: 5.9276123 (±0.0964) * 64 bit hash for strings So they have the same hash as char buffers with the same contents * Narrowing warnings for gcc to match msvc * revert back to c++17 * Correct c++ version for msvc * Use path to unordered_dense which keeps tests happy * Do not assign to and read from map in same expression * Remove redundant map operations in primal-hoist * Split out stable hash functions into slang-stable-hash.h * 64 bit hash by default * regenerate vs projects * Correct return type from HashSetBase::getCount() * correct width for call to Dictionary::reserve * Use stable hash for obfuscated module ids * Signed int for reserve * clearer variable naming * Parameterize Dictionary on hash and equality functors * Allow heterogenous lookup for Dictionary * missing const * Use set over operator[] in some places * Remove unused function * s/at/getValue
* Support per field matrix layout (#3101)Yong He2023-08-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | * Support per field matrix layout * Fix warnings. * Fix. * Fix tests. * Fix spiv gen. * Fix. * More test fixes. * Fix. * Run only GPU tests on self-hosted servers. * Remove -use-glsl-matrix-layout-modifier. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Clean up and improve Val deduplication performance. (#3069)Yong He2023-08-09
| | | | | | | | | | | | | | | * Clean up and improve Val deuplication performance. * Fix. * Fix. * Fix. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Redesign `DeclRef` and systematic `Val` deduplication (#3049)Yong He2023-08-04
| | | | | | | | | | | | | | | | | | | | | | | * Redesign DeclRef + Deduplicate Val. * Update project files * Fix warning. * Fix. * Fix. * Remove `Val::_equalsImplOverride`. * Rmove `Val::_getHashCodeOverride`. * Remove `semanticVisitor` param from `resolve`. * Cleanups. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Fix push constant on global variables. (#3034)Yong He2023-07-27
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Simplify Lookup and improve compiler performance. (#2996)Yong He2023-07-18
| | | | | | | | | | | | | | | | | | | | | | | | | * Simplify lookup. * Various bug fixes. * Report type dictionary size in perf benchmark. * Remove type duplication. * increase initial dict size. * Bug fix. * Fix bugs. * Fixup. * Revert type legalization looping. * Fix specialization pass. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Create and cache flattened inheritance lists (#2740)Theresa Foley2023-07-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Create and cache flattened inheritance lists The basic change here is to have a cached lookup that can map a `Type`, or a `DeclRef` that might refer to a type or `extension`, to a list of the *facets* that comprise it. The notion of a *facet* here is similar to what the C++ standard calls "sub-objects". A declared type like a `struct` has: * a facet for its own direct members * one facet for each of its (transitive) base `struct` types * one facet for each `interface` it conforms to * one facet for each `extension` that applies to that type The set of facets for a type is de-duplicated (so that "diamond" inheritance patterns don't cause issues) and deterministically ordered, using a variation of the C3 linearization algorithm. The creation of a linearized list of facets should help the compiler implementation in two key places: * Testing if a type implements an interface (or inherits from a base type) should now only take time linear in the number of (transitive) bases of that type. We can simply scan the linearized facet list to see if it contains a facet corresponding to the given base. * Looking up the members of a type (or a value of a given type) should be greatly simplified, since all of the members can be found in a single linear scan of the facet list. In addition, those facets will be ordered so that facets for "more derived" types will precede those for "less derived" types, so that shadowing in the case of overrides should be easier to implement. This change only implements the first of these two improvements, since there is already a *lot* of churn involved. Notes and caveats: * The handling of conjunction types (e.g., `IFoo & IBar`) complicates the implementation, both because the simple approach to subtype testing alluded to above is no longer complete, and also because we need to be more careful about what forms of subtype witnesses we construct, so that we can maintain the currently-required invariant that two witnesses are only equal if they have matching structure. * We don't implement the full/"proper" C3 algorithm here because it has some failure cases that we'd still like to support. In particular if we have both `IX : IA, IB` and `IY : IB, IA`, the C3 algorithm says it is illegal to have `IZ : IX, IY` because the two bases it inherits from disagree on the relative ordering of `IA` and `IB` in their own linearizations. Handling such cases may make our implementation less efficient, and it will also require testing of those corner caes. * When it comes time to revamp the implementation of lookup, we will need to deal with the fact that a single linear list (seemingly) cannot give us sufficient information to decide which of two members of the same name should shadow the other, or if there is an ambiguity. Or rather, it *can* give us that information if we are willing to accept some very user-unfriendly behavior and simply say that declarations earlier in the linearization always shadow later declarations, even if the facets involved are not related by an inheritance relationship of any kind. * In order to remove one kind of vicious circularity from the approach, the linearization that we are computing for `extension` declarations will not be sufficient for lookups in the body of such an `extension`. A future change may need to have support for creating and caching two distinct linearizations for each `extension`: one that is to be used when that `extension` is pulled into the linearization for a type that it applies to, and another for when lookup will be performed in the context of the `extension` itself. * This change does *not* include the simple expedient of adding a direct cache for subtype tests to the `SharedSemanticsContext`, although adding such a cache would be a simple matter. * This change introduces more deduplication for subtype witnesses, which should enable more deduplication for other `Val`s (including `Type`s), but it does not introduce any assumptions that equal `Val`s or `Type`s must have identical pointer representations. * Eventually we may find that, similar to the situation with `Type`s, we will want to have a split between surface-level and canonicalized versions of other `Val`s, including subtype witnesses. * Fix clang error. * remove debugging code. --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Use scratchData on `IRInst` to replace HashSets. (#2978)Yong He2023-07-12
| | | | | | | | | | | | | | | * Use scratchData on `IRInst` to replace HashSets. * Update test results. * Initialize scratchData. * Update autodiff documentation. * Use enum instead of bool. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Make DeclRefBase a Val, and DeclRef<T> a helper class. (#2967)Yong He2023-07-07
| | | | | | | | | | | | | | | | | | | | | * Make DeclRefBase a Val, and DeclRef<T> a helper class. * Fixes. * Workaround gcc parser issue. * Revert NodeOperand change. * Fix. * Fix clang incomplete class complains. * Fix code review. * Small cleanups and improvements. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Bottleneck DeclRef creation through ASTBuilder. (#2689)Yong He2023-07-05
| | | | | | | | | | | | | | | | | * Bottleneck DeclRef creation through ASTBuilder. * Fix clang error. * Fix. * Fix. * More fix. * Rebase on top of tree. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* MVP for higher order functions (#2849)Ellie Hermaszewska2023-05-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * MVP for higher order functions * Add shader subgroup partitioned glsl intrinsics * Implement parsing and checking for tuple types Currently there is no way to do anything useful with them from the source language however * neaten * Correct precedence of function type parsing * neaten * higher order function tests * function types of any arity * Inference for higher order functions * Add second test for unsynchronized params * regenerate vs projects * dx11 -> dx12 for saturated cooperations tests * Disable saturated cooperation tests on vulkan They fail on release builds in CI, not essential for the higher order function work however * remove saturated-cooperation tests * Remove unnecessary assert and clarify control flow in AddDeclRefOverloadCandidates * Add Tuple type name mangling * Use functype keyword to introduce function types * Add more inference tests for hof --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Dictionary using lowerCamel (#2835)jsmall-nvidia2023-04-25
| | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP lowerCamel Dictionary. * WIP more lowerCamel fixes for Dictionary. * Add/Remove/Clear * GetValue/Contains * Fix tabs in dictionary. Count -> getCount * Fix fields with caps. * Key -> key Value -> value Use m_ for members where appropriate. Use lowerCamel in linked list. * Some small fixes/improvements to Dictionary. * Kick CI.
* Make ArrayExpressionType a DeclRefType and define its autodiff extension in ↵Yong He2023-01-30
| | | | | | | | | | | | | | | | | stdlib. (#2615) * Allow array parameters in forward diff. * Use type canonicalization instead of coersion. * Reimplement array type. * Fix. * Update test case. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Allow `no_diff` on `this` parameter. (#2543)Yong He2022-12-01
|
* Allow `no_diff` modifier on parameters (#2538)Yong He2022-11-29
|
* Complete removal of DifferentialBottom type. (#2537)Yong He2022-11-29
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Mesh shader support (#2464)Ellie Hermaszewska2022-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add gdb generated files to .gitignore * Switch to c++17 TODO: Ellie update coding style doc * WIP mesh shaders * Add MeshOutputType and mesh output decorations * Lift array type layout creation out of _createTypeLayout in preparation for sharing it elsewhere * Initial pass at GLSL legalization for mesh shaders * Create output types for builtin mesh outputs This should be rendered as an out paramter block * Handle writes to member fields in mesh shader output * Per primitive output from mesh shaders * Add mesh shader tests * Redeclare mesh output builtins * Remove unused instruction * Emit explicit mesh output max max size * Add unimplemented warning for array members in mesh output * Implement mesh output splitting for GLSL in terms of getSubscriptVal * Allow HLSL syntax for mesh output modifiers * Improve error messages for mesh output * Add test for HLSL style mesh output syntax * Emit explicit mesh output indices max size * HLSL generation support for mesh shaders * Better errors for mesh shader misuse * Neaten comments * Regenerate vs2019 project files * Fix build on vs2019 * Retreat on c++17 Will make the change in a separate PR * slang-glslang binary dep 11.10.0 -> 11.12.0-32 * Fixes for msvc compiler * Update msvc project
* Higher order differentiation. (#2487)Yong He2022-11-04
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Make `DifferentialPair` able to nest. (#2477)Yong He2022-11-01
|
* Modified the new type system to support generic differentiable types … (#2413)Sai Praveen Bangaru2022-10-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Modified the new type system to support generic differentiable types and added support for differentiating overloaded functions. * Changed a few asserts to release asserts to avoid unreferenced variable errors * Fixed a naming issue with TypeWitnessBreadcumb::Flavor::Decl * Added logic to avoid tracking differentiable types if the module does not use auto-diff or define differentiable types. * Moved the auto-diff passes to after the specialization step, added a more complex generics test * Added a generics stress test and fixed AST-side logic. IR side needs some more work * Added differential getter and setter logic, fixed multiple issues with DifferentiableTypeDictionary, added support for loops and conditions * Changed differential getters to use pointer types, added getter type checking * Fixed some bugs related to diff type registration and differential getters * Removed some superfluous code * Removed some more unused code. * Fixed an issue with witness substitution * Minor fix Co-authored-by: Yong He <yonghe@outlook.com>
* Deduplicate AST type nodes and cache lookup operations. (#2397)Yong He2022-09-13
| | | | | | | | | | | * wip: dedup AST type nodes and cache lookup. * Fix. * Remove profiling. * Fixes. Co-authored-by: Yong He <yhe@nvidia.com>
* Add `none` literal that is convertible to `Optional`. (#2356)Yong He2022-08-10
| | | | | | | | | | | * Add `none` literal that is convertible to `Optional`. * Fix cpu code gen. * Include vk and cpu test for is-as operator test. * Inline comparison operators. Co-authored-by: Yong He <yhe@nvidia.com>
* `is` and `as` operator and `Optional<T>`. (#2355)Yong He2022-08-10
| | | | | | | * `is` and `as` operator and `Optional<T>`. * Fix. Co-authored-by: Yong He <yhe@nvidia.com>
* Added a new differential type system and various improvements (#2343)Sai Praveen Bangaru2022-08-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Merge slang-ir-diff-jvp.cpp * Added support and tests for other float vector types * Added swizzle test and code to handle it (tests failing currently) * Fixed one test, the other is still pending * Fixed instruction cloning logic to avoid modifying original function * Fixed an issue with custom 'pow_jvp' and added support for vector contructor * Minor update to comments * Fixed support for division * Fixed an issue with uninitialized diagnostic sink * Moved derivative processing to after mandatory inlining. Skip instructions that don't have side-effects and aren't used by anything. * WIP: Handling unconditional control flow and multi-block functions * Support for unconditional multi-block functions * Added a dead code elimination step to the derivative pass * Changed name of 'hasNoSideEffects()' * Refactored variable names * Added initial IR defs for new type system * Added necessary logic for semantic checking * Overhauled type system to use builtin pair types and conform to the IDifferentiable interface * Automatically replace IRDifferentiablePairType to a custom IRStructType * Added generics handling by expanding the conformance context functionality and allowing for type parameters * Minor fix: early return in processPairTypes() * Minor fixes to differentiable resolution on generic types * Added new instructions for differential pairs. Basic tests work now. Looking into generic types. * Adjusted most tests to the new type system. OutType and InOutType are still not properly working. * Updated __jvp to produce both primal and differential output * Moved autodiff related declarations to diff.meta.slang * Refactored variable names * Added initial IR defs for new type system * Added necessary logic for semantic checking * Overhauled type system to use builtin pair types and conform to the IDifferentiable interface * Automatically replace IRDifferentiablePairType to a custom IRStructType * Added generics handling by expanding the conformance context functionality and allowing for type parameters * Minor fix: early return in processPairTypes() * Minor fixes to differentiable resolution on generic types * Added new instructions for differential pairs. Basic tests work now. Looking into generic types. * Adjusted most tests to the new type system. OutType and InOutType are still not properly working. * Updated __jvp to produce both primal and differential output * Moved autodiff related declarations to diff.meta.slang * Removed external changes * Cleanup the transcription logic: each case returns a pair of insts for the primal and differential computation.
* Clean up void returns. (#2260)Yong He2022-06-01
| | | | | | | * Clean up `IRReturnVoid`. * Update gitignore. Co-authored-by: Yong He <yhe@nvidia.com>