path: root/source
Commit message, Author, Age
* Pool inst worklists and hashsets to avoid rehashing. (#2982) Yong He 2023-07-12
    Co-authored-by: Yong He <yhe@nvidia.com>
* Create and cache flattened inheritance lists (#2740) Theresa Foley 2023-07-12
    * Create and cache flattened inheritance lists

    The basic change here is to have a cached lookup that can map a `Type`, or a `DeclRef` that might refer to a type or `extension`, to a list of the *facets* that comprise it.

    The notion of a *facet* here is similar to what the C++ standard calls "sub-objects". A declared type like a `struct` has:

    * a facet for its own direct members
    * one facet for each of its (transitive) base `struct` types
    * one facet for each `interface` it conforms to
    * one facet for each `extension` that applies to that type

    The set of facets for a type is de-duplicated (so that "diamond" inheritance patterns don't cause issues) and deterministically ordered, using a variation of the C3 linearization algorithm.

    The creation of a linearized list of facets should help the compiler implementation in two key places:

    * Testing if a type implements an interface (or inherits from a base type) should now only take time linear in the number of (transitive) bases of that type. We can simply scan the linearized facet list to see if it contains a facet corresponding to the given base.
    * Looking up the members of a type (or a value of a given type) should be greatly simplified, since all of the members can be found in a single linear scan of the facet list. In addition, those facets will be ordered so that facets for "more derived" types will precede those for "less derived" types, so that shadowing in the case of overrides should be easier to implement.

    This change only implements the first of these two improvements, since there is already a *lot* of churn involved.

    Notes and caveats:

    * The handling of conjunction types (e.g., `IFoo & IBar`) complicates the implementation, both because the simple approach to subtype testing alluded to above is no longer complete, and also because we need to be more careful about what forms of subtype witnesses we construct, so that we can maintain the currently-required invariant that two witnesses are only equal if they have matching structure.
    * We don't implement the full/"proper" C3 algorithm here because it has some failure cases that we'd still like to support. In particular if we have both `IX : IA, IB` and `IY : IB, IA`, the C3 algorithm says it is illegal to have `IZ : IX, IY` because the two bases it inherits from disagree on the relative ordering of `IA` and `IB` in their own linearizations. Handling such cases may make our implementation less efficient, and it will also require testing of those corner cases.
    * When it comes time to revamp the implementation of lookup, we will need to deal with the fact that a single linear list (seemingly) cannot give us sufficient information to decide which of two members of the same name should shadow the other, or if there is an ambiguity. Or rather, it *can* give us that information if we are willing to accept some very user-unfriendly behavior and simply say that declarations earlier in the linearization always shadow later declarations, even if the facets involved are not related by an inheritance relationship of any kind.
    * In order to remove one kind of vicious circularity from the approach, the linearization that we are computing for `extension` declarations will not be sufficient for lookups in the body of such an `extension`. A future change may need to have support for creating and caching two distinct linearizations for each `extension`: one that is to be used when that `extension` is pulled into the linearization for a type that it applies to, and another for when lookup will be performed in the context of the `extension` itself.
    * This change does *not* include the simple expedient of adding a direct cache for subtype tests to the `SharedSemanticsContext`, although adding such a cache would be a simple matter.
    * This change introduces more deduplication for subtype witnesses, which should enable more deduplication for other `Val`s (including `Type`s), but it does not introduce any assumptions that equal `Val`s or `Type`s must have identical pointer representations.
    * Eventually we may find that, similar to the situation with `Type`s, we will want to have a split between surface-level and canonicalized versions of other `Val`s, including subtype witnesses.

    * Fix clang error.
    * remove debugging code.

    ---------

    Co-authored-by: Yong He <yonghe@outlook.com>
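The facet linearization described in the commit message above can be illustrated with a small sketch. This is not the Slang compiler's code: the `Bases` mapping, `linearize`, and `is_subtype` are hypothetical names, and the fallback used when strict C3 would reject a hierarchy is an assumption made for illustration, since the message does not say how the compiler relaxes the rule. Only inheritance is modeled; `extension` facets and conjunction types are left out.

```python
from typing import Dict, List

# Hypothetical hierarchy encoding: type name -> direct bases, in declaration order.
Bases = Dict[str, List[str]]


def linearize(name: str, bases: Bases, cache: Dict[str, List[str]]) -> List[str]:
    """Return the facets of `name`, most-derived first, with no duplicates.

    Assumes the hierarchy is acyclic.
    """
    if name in cache:
        return cache[name]
    direct = bases.get(name, [])
    # Merge each direct base's linearization plus the list of direct bases itself.
    seqs = [list(linearize(b, bases, cache)) for b in direct] + [list(direct)]
    result = [name]
    while any(seqs):
        seqs = [s for s in seqs if s]
        # C3 rule: take the first head that appears in no sequence's tail.
        pick = next(
            (s[0] for s in seqs if not any(s[0] in t[1:] for t in seqs)),
            None,
        )
        if pick is None:
            # Strict C3 rejects the hierarchy here. This sketch instead falls
            # back to the first head so that an order is still produced.
            pick = seqs[0][0]
        result.append(pick)
        seqs = [[x for x in s if x != pick] for s in seqs]
    cache[name] = result
    return result


def is_subtype(name: str, base: str, bases: Bases, cache: Dict[str, List[str]]) -> bool:
    """Subtype test as a linear scan of the cached facet list."""
    return base in linearize(name, bases, cache)


DIAMOND: Bases = {"D": ["B", "C"], "B": ["A"], "C": ["A"], "A": []}
CONFLICT: Bases = {"IZ": ["IX", "IY"], "IX": ["IA", "IB"], "IY": ["IB", "IA"]}

print(linearize("D", DIAMOND, {}))
print(linearize("IZ", CONFLICT, {}))
```

In the diamond case the shared base `A` appears once, after both `B` and `C`. In the `IX`/`IY` case from the notes, the two bases disagree on the order of `IA` and `IB`, so the sketch's fallback picks `IA` first; a different tie-breaking rule would be equally valid for illustration purposes.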
* Use scratchData on `IRInst` to replace HashSets. (#2978) Yong He 2023-07-12
    * Use scratchData on `IRInst` to replace HashSets.
    * Update test results.
    * Initialize scratchData.
    * Update autodiff documentation.
    * Use enum instead of bool.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Extend `no_diff` to support subscript operations on resources and array variables… (#2981) Sai Praveen Bangaru 2023-07-12
    * Extend `no_diff` to support subscript operations on resources and array variables
    * Update autodiff.slang.expected
* Fix native string emit for CUDA/Cpp backend. (#2980) Yong He 2023-07-12
    Co-authored-by: Yong He <yhe@nvidia.com>
* Add perf benchmark utility. (#2977) Yong He 2023-07-11
    * Add perf benchmark utility.
    * Update documentation.
    * Fix.
    * Fix doc.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Add support for texture footprint queries (#2970) Theresa Foley 2023-07-10
* Fix hit object emit for HLSL + FuncType specialization bug fix. (#2976) Yong He 2023-07-10
    * Fix hit object emit for HLSL.
    * Fix a bug involving specialization of function type.
    * Add a test case.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Add glsl intrinsic for SampleCmpLevelZero with offset and correct existing intrinsic (#2975) Ellie Hermaszewska 2023-07-10
    * Correct glsl intrinsic for SampleCmpLevelZero without offset
    * Add glsl intrinsic for SampleCmpLevelZero with offset
    * Add test for samplecmplevelzero glsl translation
    ---------
    Co-authored-by: Yong He <yonghe@outlook.com>
* Do not fail when emitting GLSL using unorm/snorm textures (#2973) Ellie Hermaszewska 2023-07-10
    * Do not fail when emitting GLSL using unorm/snorm textures
      Ignored in glslang https://github.com/KhronosGroup/glslang/blob/main/glslang/HLSL/hlslGrammar.cpp\#L1476
    * Add test for unorm modifier on glsl
* Make DeclRefBase a Val, and DeclRef<T> a helper class. (#2967) Yong He 2023-07-07
    * Make DeclRefBase a Val, and DeclRef<T> a helper class.
    * Fixes.
    * Workaround gcc parser issue.
    * Revert NodeOperand change.
    * Fix.
    * Fix clang incomplete class complaints.
    * Fix code review.
    * Small cleanups and improvements.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Do not use member function of incomplete SemanticsVisitor (#2968) Ellie Hermaszewska 2023-07-07
    Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
* Fix erroneous error claiming variable is being used before its declaration (#2958) Ellie Hermaszewska 2023-07-06
    * Simplify type of diagnoseImpl
    * Show source line for Note diagnostics, opting out of this where appropriate
    * Make declared after use diagnostic clearer
    * Fix erroneous error claiming variable is being used before its declaration
      Closes https://github.com/shader-slang/slang/issues/2936
    * Fix build on msvc
    ---------
    Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
* Add support for length for scalar floating point types. (#2965) jsmall-nvidia 2023-07-06
* Work around for NonUniformResourceIndex with non integral types. (#2963) jsmall-nvidia 2023-07-06
    * Work around for NonUniformResourceIndex with non integral types.
    * Make the non integral NonUniformResourceIndex inline early.
    * Add a deprecation warning.
* Bottleneck DeclRef creation through ASTBuilder. (#2689) Yong He 2023-07-05
    * Bottleneck DeclRef creation through ASTBuilder.
    * Fix clang error.
    * Fix.
    * Fix.
    * More fix.
    * Rebase on top of tree.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Squash some warnings (#2956) Ellie Hermaszewska 2023-07-05
    * restrict -Wno-assume to clang (gcc does not have this warning)
    * Add move where possible
      Annoyingly this warns for c++17, but will not be necessary with c++20
    * Do not partially initialize struct
    * Remove unused variable
    * Silence unused var warning
      It is actually still referenced from an uninstantiated (for now) template
    * Use unused var
    ---------
    Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
* Disable l-value coercion for ref types (#2960) jsmall-nvidia 2023-07-05
    * Make lvalue coercion not work for ref, to stop problem with atomics (for GLSL output).
    * Improve some comments.
* Initial sizeof/alignof implementation. (#2954) jsmall-nvidia 2023-07-05
    * Initial sizeof implementation.
    * Small macro improvement.
    * Fix some typos.
    * Refactor NaturalSize. Add more sizeof tests.
    * Use _makeParseExpr to add sizeof support.
    * Add size-of.slang diagnostic result.
    * Fix typo in folding with macro change.
    * Add a sizeof test of This.
    * Some more NaturalSize coverage.
    * Simple alignof support.
    * Testing for alignof.
    * Added 8 bit enum to check enums values are correctly sized.
    * Add alignof to completion.
    * Lower sizeof/alignof to IR. sizeof/alignof IR pass. Tests for simple generic scenarios.
    * Make append handle invalid properly. Improve comments.
    ---------
    Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
* Refactor "meta" decls for stdlib texture types (#2932) Theresa Foley 2023-07-03
    We use some ad-hoc "template engine" code generation / metaprogramming to generate many of the declarations in the Slang standard library. In many cases the level of meta-ness is (relatively) manageable, but one of the biggest tangles in the whole thing is the generation of the texture-related types.

    We basically have a single set of nested `for` loops that generate all types of the form:

        (RW|RasterizerOrdered|/**/)(Texture|Sampler)(1D|2D|...)Array?MS?

    Inside that loop we then have tons of conditional logic to determine:

    * Which points in the cross-product space should be skipped, rather than emitted as a type.
    * Which methods to emit, or not.
    * The type signature(s) of those methods.
    * The translation of those methods for each target (via `__target_intrinsic`)

    The code ends up being long, complicated, and very hard to maintain or extend.

    This change takes a first small step to try to help us get the complexity more under control. The basic approach is that the data that defines each point in the cross-product space is aggregated into a `TextureTypeInfo` structure in the meta-level code, and then the logic for emitting the declarations related to a given texture type is expressed as a member function of that type.

    The intention is that this design will more easily allow the meta-level code to be factored into distinct subroutines, and enable us to clean up and re-use recurring bits of text that need to appear in the output. It is possible (though I am not yet predicting it) that we will end up wanting to utilize a bit of an inheritance hierarchy on `TextureTypeInfo` to allow us to more cleanly factor out code that is specific to certain cases (e.g., there is only a small amount of sharing between `RW`/`RasterizerOrdered` and read-only texture types).

    It is intentional that this step introduces no significant changes to the logic that used to be inside the loop (and is now inside of a method). Instead, the goal is to minimize the scale of the diffs that reviewers might be expected to deal with in follow-on changes.

    Co-authored-by: Yong He <yonghe@outlook.com>
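The refactoring approach described above, where each point in the cross-product space becomes a record with its own emit method, might look roughly like the following. This is a minimal sketch in Python rather than the actual meta-level code: only the name `TextureTypeInfo` comes from the commit message, while the fields, the skip rules in `is_valid`, the method selection, and `all_texture_types` are assumptions for illustration. The `Sampler` base is omitted, and names are spelled with `MS` before `Array`, as in HLSL's `Texture2DMSArray`.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterator, List


@dataclass(frozen=True)
class TextureTypeInfo:
    access: str           # "", "RW", or "RasterizerOrdered"
    shape: str            # "1D", "2D", "3D", or "Cube"
    is_array: bool
    is_multisample: bool

    def is_valid(self) -> bool:
        """Illustrative skip rules only; the real stdlib rules may differ."""
        if self.shape == "3D" and self.is_array:
            return False
        if self.is_multisample and self.shape != "2D":
            return False
        if self.shape == "Cube" and self.access != "":
            return False
        return True

    def name(self) -> str:
        ms = "MS" if self.is_multisample else ""
        array = "Array" if self.is_array else ""
        return f"{self.access}Texture{self.shape}{ms}{array}"

    def method_names(self) -> List[str]:
        """Pick the methods to emit for this type (hypothetical selection)."""
        methods = ["GetDimensions", "Load"]
        if self.access == "" and not self.is_multisample:
            methods.append("Sample")
        return methods

    def emit_decl(self) -> str:
        """Render the declaration text for this one point in the space."""
        body = "\n".join(f"    // {m}(...)" for m in self.method_names())
        return f"struct {self.name()}\n{{\n{body}\n}};"


def all_texture_types() -> Iterator[TextureTypeInfo]:
    """Enumerate the cross product, skipping points that are not emitted."""
    accesses = ["", "RW", "RasterizerOrdered"]
    shapes = ["1D", "2D", "3D", "Cube"]
    flags = [False, True]
    for access, shape, is_array, is_ms in product(accesses, shapes, flags, flags):
        info = TextureTypeInfo(access, shape, is_array, is_ms)
        if info.is_valid():
            yield info


for info in all_texture_types():
    print(info.emit_decl())
```

Keeping the skip rules and method selection as methods on the record means the enumerating loop stays trivial, which is the property the commit message is after: later changes can split `emit_decl` into smaller helpers without touching the loop.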
* Non-Recursive CFG DFS. (#2953) Yong He 2023-06-30
    Co-authored-by: Yong He <yhe@nvidia.com>
* Fix for operator assignment issue (#2951) jsmall-nvidia 2023-06-30
    * WIP handling LValue coercion via LValueImplicitCast
    * Need to have the ptr type for the cast.
    * Casting conversion working on C++.
    * Make the LValue casts record if in or in/out as we can produce better code if we know the difference.
    * WIP LValueCast pass
    * Fix tests so we don't fail because downstream compilers detect use of uninitialized variable.
    * Do conversions through tmp for l-value scenarios that can't work other ways.
    * Fix a typo.
    * Change diagnostic implicit-cast-lvalue for a type that still exhibits the issue.
    * Add matrix test.
    * Added a bit more clarity around LValue casting choices.
    * Small comment improvements. Improvements based on comments on PR.
    * Use findOuterGeneric.
* Issue diagnostic for incorrect parameter types & directionality when defining custom derivatives (#2947) Sai Praveen Bangaru 2023-06-29
    * Issue diagnostic for incorrect directionality when defining custom derivative
    * Better diagnostics on invalid custom derivatives
    * Avoid duplicating `getParameterDirection()`
    ---------
    Co-authored-by: Yong He <yonghe@outlook.com>
* Apply SCCP on global scope before unrolling loops. (#2952) Yong He 2023-06-29
    * Apply SCCP on global scope before unrolling loops.
    * Fix.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Small fixes to GLSL-legalize and func-property prop. (#2950) Yong He 2023-06-29
    Co-authored-by: Yong He <yhe@nvidia.com>
* Warn on semicolon after `if`. (#2948) Yong He 2023-06-29
    * Warn on semicolon after `if`.
    * add test result
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Fix parameter block loads in GLSL emit. (#2946) Yong He 2023-06-28
    * Fix parameter block loads in GLSL emit.
    * Revert `[NoSideEffect]` declarations in DXR1.1 API.
    * fix.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Add support for vk::image_format attribute (#2945) jsmall-nvidia 2023-06-28
* Support for infinite literal of form 34.2432#INF (#2944) jsmall-nvidia 2023-06-27
* Pointer layout support (#2930) jsmall-nvidia 2023-06-27
    * WIP looking at reflection with pointers.
    * Added GetPointerLayout.
    * Initial test via reflection with layout of ptr type.
    * WIP handles ptrs to types that have layout that hasn't been completed.
    * Move tests to ptr.
    * WIP try to take into account lowering correctly between AggTypeDecl and Type, but doesn't quite work.
    * WIP a different path to handling recursive lowering problem with Ptr.
    * Fix issues with reflection output.
    * Small tidy.
    * Fix for infinite recursion issue.
    * Lower IRPointerTypeLayout
    * Working with generics. Has a hack to work around Layout around Ptr in IR. The reflection around the generic - the name isn't much use, it should probably have the generic parameters, but that would require getName to do something more sophisticated.
    * Fix issue around calling finishOuterGenerics too early.
    * Remove feature/ptr test.
    * Fix type legalization being an infinite loop with Ptr self referencing.
    * Disable the pointer self reference test because it produces an infinite loop on emit.
    * Fixed comment based on review.
    * Fix for issue with emit and pointers causing infinite recursion.
* Fix DCE on mutable calls in a loop. (#2943) Yong He 2023-06-26
    * Fix DCE on mutable calls in a loop.
    * More accurate in-loop test.
    * code review fixes.
    * Fix.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Handling SV_ClipDistance system semantic on GLSL/VK (#2942) jsmall-nvidia 2023-06-26
    * Small fixes and improvements around reflection tool.
    * Make PrettyWriter printing a class.
    * WIP support for gl_ClipDistance
    * Working but doesn't have layout.
    * Check out param works with gl_ClipDistance.
    * Test clip distance works with out parameters.
    * Enable file check.
    * Add a test that splits clip distance writing.
    ---------
    Co-authored-by: Yong He <yonghe@outlook.com>
* Multiple cast issue fix (#2940) jsmall-nvidia 2023-06-26
    * Small fixes and improvements around reflection tool.
    * Make PrettyWriter printing a class.
    * WIP parens casting issue.
    * Fix issue with multiple casts.
    * Match previous location point for casting, with 'fast' path.
    * Removed logic to output the found decl, as not needed to construct ExplicitCastExpr.
* Allow multiple attributes to not require separating comma (#2939) jsmall-nvidia 2023-06-22
    * #include an absolute path didn't work - because paths were taken to always be relative.
    * Small fixes and improvements around reflection tool.
    * Make PrettyWriter printing a class.
    * Allow attributes without comma separation.
* [branch] and [flatten] support (#2928) jsmall-nvidia 2023-06-22
    * #include an absolute path didn't work - because paths were taken to always be relative.
    * Small fixes and improvements around reflection tool.
    * Make PrettyWriter printing a class.
    * Add HLSL output support for [flatten] and [branch]
    * Handle [branch] on switch.
* Avoid materializing multiple swizzle gradients (#2923) Sai Praveen Bangaru 2023-06-21
* Fix for generic with scope issue (#2925) jsmall-nvidia 2023-06-20
    * Small fixes and improvements around reflection tool.
    * Make PrettyWriter printing a class.
    * Sundry improvements around StringBlob.
    * Fix for generic scope issue.
    * Fix expected output.
    * Add scope-generic test.
    ---------
    Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
* Fixes for Shader Execution Reordering on VK (#2929) Theresa Foley 2023-06-13
    * Fixes for Shader Execution Reordering on VK

    There are some mismatches in the way that hit objects are handled between the current NVAPI/HLSL and proposed GLSL extensions for shader execution reordering. These mismatches create complications for generating valid GLSL/SPIR-V code from input Slang.

    Many of the problems that apply to `HitObject` also apply to the existing `RayQuery<>` type used for "inline" ray tracing. In the case of `RayQuery<>` we have that for *both* HLSL and GLSL/SPIR-V:

    * A `RayQuery` (or `rayQueryEXT`) is an opaque handle to underlying mutable storage
    * The storage that backs a `RayQuery` is allocated as part of the "default constructor" for a local variable declared with type `RayQuery`.
    * The `RayQuery` API provides numerous operations that mutate the storage referred to by the opaque handle.

    The key difference between HLSL and GLSL/SPIR-V for the case of a `RayQuery` amounts to:

    * In HLSL, local variables of type `RayQuery` can be assigned to, and assignment has by-reference semantics. It is possible to create multiple aliased handles to the same underlying storage.
    * In GLSL/SPIR-V, local variables of type `rayQueryEXT` cannot be assigned to, returned from functions, etc. It is impossible to create multiple aliased handles to the same underlying storage.

    The case for `HitObject`s is significantly *more* messy, because:

    * In NVAPI/HLSL a `HitObject` is effectively a "value type" in that it only exposes constructors, and there is no way to mutate the state of a `HitObject` other than by assignment to a variable of that type. It makes no semantic difference whether a `HitObject` directly stores the value(s), or if it is a handle, since there is no way to introduce aliasing of mutable state. Assignment of `HitObject`s semantically creates a copy.
    * In GLSL/SPIR-V, a `hitObjectNV` is, like a `rayQueryEXT`, a handle to underlying mutable state. These handles cannot be assigned, returned from functions, etc. There is no way to make a copy of a hit object.

    This change includes several changes to how *both* `RayQuery<>` and `HitObject` are implemented, with the intention of getting more cases to work correctly when compiling for GLSL/SPIR-V, and to set up a more clear mental model for the semantics we want to give to these types in Slang, and how those semantics can/should map to our targets.

    An overview of important changes:

    * Marked a few operations on `RayQuery` as `[mutating]` that realistically should have already been that way.
    * Marked the `HitObject` type as being non-copyable (an attribute we do not currently enforce), and marked the various GLSL operations that construct a hit object as having an `out` parameter of the `HitObject` type (even if they are nominally specified in GLSL as not writing to the corresponding parameter).
    * Added a distinct IR opcode (`allocateOpaqueHandle`) to represent the implicit allocation that happens when declaring a variable of type `HitObject` or `RayQuery`, and made the "implicit constructor" for those types map to the new op. This operation took a lot of tweaking to get emitting in a reasonable way, and I'm still not 100% sure that all of the emission-related logic for it is strictly required (or correct).
    * Added new IR instructions for `HitObject` and `RayQuery` types, and made the stdlib types map to those IR instructions.
    * Treat `HitObject` and `RayQuery` as resource types for the purpose of our existing pass that specializes calls to functions that have outputs of resource type
    * Added a new test case that includes a function that returns a `HitObject` as its result.
    * Many test cases saw slight changes in their output (especially around the relative ordering of declarations of `HitObject`s and `RayQuery`s with other instructions)

    * Remove debugging logic
* Small improvements around StringBlob (#2924) jsmall-nvidia 2023-06-09
    * #include an absolute path didn't work - because paths were taken to always be relative.
    * Small fixes and improvements around reflection tool.
    * Make PrettyWriter printing a class.
    * Sundry improvements around StringBlob.
* Improvements around StringBlob (#2921) jsmall-nvidia 2023-06-08
    * #include an absolute path didn't work - because paths were taken to always be relative.
    * Small fixes and improvements around reflection tool.
    * Make PrettyWriter printing a class.
    * Improvements around handling StringBlob and storing stdlib source in ISlangBlob.
    * Fix some issues with comments around StringBlob.
    * Default initialize StringBlob fields.
* AD: Fix out-of-scope indexing rules for insts in loop header blocks during the primal-inst availability pass (#2918) Sai Praveen Bangaru 2023-06-07
    * add test case
    * Fix out-of-scope indexing rules for loop header blocks
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
    Co-authored-by: Yong He <yonghe@outlook.com>
* Fix generic param inference through TypeCastIntVal. (#2916) Yong He 2023-06-02
* Be lenient on same-size unsigned->signed conversion. (#2913) Yong He 2023-06-01
    * Be lenient on same-size unsigned->signed conversion.
    * Fix tests.
    * Use 250.
    * wip
    * Fix.
    * Fix tests.
    * Fix.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Fix def-use legalization in CFG normalization. (#2909) Yong He 2023-05-31
    Co-authored-by: Yong He <yhe@nvidia.com>
* Preserve type cast during AST constant folding. (#2912) Yong He 2023-05-31
    * Preserve type cast during AST constant folding. Fixes #2891.
    * Fix.
    * Fix truncating.
    * fix test.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Fix div-by-zero error during sccp. (#2911) Yong He 2023-05-31
    * Diagnose on div-by-zero during sccp.
    * fix
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Fix type checking & loop value hoisting (#2907) Yong He 2023-05-30
    * Fix type checking crash in language server.
    * Fix loop var hoisting logic. Fixes #2903.
    * fix.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Fix derivative signature bug in checkDerivativeAttribute. (#2905) Yong He 2023-05-30
* Disallow duplicate enumerator names in the same enum (#2904) Ellie Hermaszewska 2023-05-30
    Fixes https://github.com/shader-slang/slang/issues/2895
* Fix bug in legalizeFuncType that leads to invalid IR. (#2902) Yong He 2023-05-26
    * Fix bug in legalizeFuncType that leads to invalid IR.
    * Diagnose functions that never return when differentiating them.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>