| author | ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> | 2024-05-16 00:04:12 -0400 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2024-05-16 00:04:12 -0400 |
| commit | 1b89f78cd1762aa08402bd656e807b66833b11d0 (patch) | |
| tree | 2be71c9d97af8d28d440981d0c5adc726d9eac56 /tests/language-feature/capability | |
| parent | 3b0de8b6ea484091146f61e663c63beeac5b4798 (diff) | |
Capabilities System, CapabilitySet Logic Overhaul (#4145)
* Capabilities System, Backing Logic Overhaul
Fixes #4015
Problems to address:
1. Currently the capabilities system spends anywhere from 25-50% of compile time in the CapabilityVisitor. Most of this time is spent on join logic: (a) finding abstract atoms and (b) comparing list1<->list2. This can and should be made significantly faster.
2. Error system does not produce errors with auxiliary information. This will require a partial redesign to provide more useful semantic information for debugging.
What was addressed:
1. The array-backed `CapabilityConjunctionSet` was replaced with a `UIntSet`-backed `CapabilityTargetSets`. The design is described below.
Design:
* `CapabilityTargetSets` is a `Dictionary<targetAtom, CapabilityTargetSet>`. It is a dictionary rather than an array for two reasons: (1) it is easy to figure out which target is missing between two `CapabilityTargetSets`; (2) statically allocating an array would require the preprocessor to annotate which Capability is a target and link that Capability to an index, so a dictionary would be required for lookup regardless of implementation.
* `CapabilityTargetSet` is an intermediate representation of all capabilities for a single `target` atom (`glsl`, `hlsl`, `metal`, ...). This structure contains a dictionary of all stage-specific capability sets, for fast lookup of the stage capabilities that a `CapabilitySet` supports for a `target` atom. This reduces the number of sets searched.
* `CapabilityStageSet` is an intermediate representation of all capabilities for a single `stage` atom (`vertex`, `fragment`, ...). This structure holds all disjoint capability sets for a `stage`. A disjoint set is rare but can occur, for example: `{glsl, EXT_GL_FOO}{glsl, _GLSL_130, _GLSL_150}`. This reduces the number of sets searched.
* `UIntSet` is the main reason for the redesign, giving better performance and memory usage. Each set operation needs only a few steps, which makes all set logic simple and cheap to run. All algorithms were reworked around `UIntSet` operations.
2. Errors
* Semantic information is now better linked to the calling function, providing a function<->function_body connection when saving semantic information for errors.
* Missing targets now produce errors much like other error codes, by finding code that could be the cause of the incompatibility.
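To make the nesting described under "Design" concrete, here is a minimal C++ sketch of the target -> stage -> atom-set layout. It is not the code from this PR: `std::bitset` stands in for `UIntSet`, `std::map` stands in for `Dictionary`, and the `Atom` enum, `makeSet`, `add`, `find`, `missingTargets`, and `covers` are hypothetical names chosen for illustration. For simplicity each stage holds one atom set rather than several disjoint sets.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <map>
#include <vector>

// Hypothetical atoms, for illustration only.
enum Atom { glsl, hlsl, metal, vertex, fragment, GLSL_130, sm_5_0, kAtomCount };

// Stand-in for UIntSet: one bit per atom.
using AtomSet = std::bitset<kAtomCount>;

inline AtomSet makeSet(std::initializer_list<Atom> atoms) {
    AtomSet s;
    for (Atom a : atoms) s.set(a);
    return s;
}

// All atoms required for one stage of one target.
struct StageSet { AtomSet atoms; };

// All stage sets of one target, keyed by stage atom.
struct TargetSet { std::map<Atom, StageSet> stages; };

// Target atom -> per-target data.
struct TargetSets {
    std::map<Atom, TargetSet> targets;

    // Record `extra` atoms for (target, stage); the target and stage atoms
    // themselves are always part of the stored set.
    void add(Atom target, Atom stage, const AtomSet& extra) {
        AtomSet& atoms = targets[target].stages[stage].atoms;
        atoms |= extra;
        atoms.set(target);
        atoms.set(stage);
    }

    // Two dictionary lookups instead of a scan over every conjunction.
    const StageSet* find(Atom target, Atom stage) const {
        auto t = targets.find(target);
        if (t == targets.end()) return nullptr;
        auto s = t->second.stages.find(stage);
        return s == t->second.stages.end() ? nullptr : &s->second;
    }
};

// Targets present in `required` but absent from `available`.
inline std::vector<Atom> missingTargets(const TargetSets& available,
                                        const TargetSets& required) {
    std::vector<Atom> missing;
    for (const auto& entry : required.targets)
        if (available.targets.count(entry.first) == 0)
            missing.push_back(entry.first);
    return missing;
}

// Superset test: true when `have` contains every atom in `need`.
inline bool covers(const AtomSet& have, const AtomSet& need) {
    return (have & need) == need;
}
```

With this shape, finding a missing target is a key comparison between two dictionaries, and a superset check is a single AND plus a compare on the bitset.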
What is missing:
1. Non-naive support for capabilities that are not stage-specific, such as `{hlsl, _sm_5_0}`. Currently such capabilities are emulated by assigning them to every stage: `{hlsl, _sm_5_0, vertex} {hlsl, _sm_5_0, fragment}...`. Removing this behavior would eliminate the redundant shader stage sets created at construction time (~80% of the new implementation's runtime). This is an addition, not an overhaul.
2. Optionally: `UIntSet` could be modified to use SIMD for significantly faster set operations. This is not required immediately since `UIntSet` is already not a performance constraint.
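The stage emulation in item 1 can be pictured with a short C++ sketch. This is an illustration under assumptions, not the PR's code: `CapSet` and `expandToStages` are hypothetical names, and atoms are plain strings for readability.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// A capability set as a plain set of atom names (illustrative only).
using CapSet = std::set<std::string>;

// Copy `base` once per stage and add that stage's atom to the copy.
// The number of sets built grows with the number of stages, which is the
// redundant construction work the item above describes.
inline std::vector<CapSet> expandToStages(const CapSet& base,
                                          const std::vector<std::string>& stages) {
    std::vector<CapSet> out;
    out.reserve(stages.size());
    for (const std::string& stage : stages) {
        CapSet withStage = base;
        withStage.insert(stage);
        out.push_back(withStage);
    }
    return out;
}
```

Calling `expandToStages({"hlsl", "_sm_5_0"}, {"vertex", "fragment"})` yields two sets, one per stage, even though the original requirement never mentioned a stage.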
Notes:
* UIntSet had implementation bugs which were fixed in this PR.
* The old capabilities system had bugs which were fixed in this PR when transforming to the new implementation.
* fix .natvis debug view
* Small optimizations I found while working on the addition
The AST building pass now looks like this:
1% = ~capabilitySet
2% = capabilitySet()
1.5% capabilitySet::unionWith()
0.8% capabilitySet::join()
1.5% auxiliary info for debugging
~0.5-1% extra visitor overhead
~5% total for the visitor
~6.5% for total runtime costs
* fix caps which were wrong but worked
* push minor syntax fix (still looking for why other tests fail)
* perf & bug fixes
1. Did not properly remake isBetterForTarget for the this->empty case with that as Invalid. This is the best case in this scenario.
2. Remade the serializer for stdlib generation. Faster (more direct) & cleaner code.
NOTE: did not address review comments
* fix glsl.meta caps error
* fixing findBest logic again & UIntSet wrapper
findBest was not checking for 'more specialized' targets & the element counter was flawed
* faster getElements algorithm + natvis for UIntSet + wrong warning
* type incompatibility of bitscanForward implementations
* try to fix warnings again
* remove ptr for clang intrinsic
* add missing header
* ifdef to allow clang compile
* compiler hackery to fix up platform/type independent operations
* bracket
* fix MSVC error
* missing template
* change types out again
* changes to fix compiling
* adjustment to parameter for Clang/GCC
* added iterator to delay processing all atomSets of a CapabilitySet
* add a few missing `const`s
* ensure we never have more than 1 disjointSet
Added a wrapper + assert + union functionality to all possible disjoint sets. This was done in favor of a removal of the LinkedList for 2 reasons:
1. We still need 0-1 set functionality.
2. Might as well keep the code, just disallow the problematic functionality.
* address review comments
non linked-list refactor review comments addressed; add doc comments + remove redundant code
* comments + remove isValid for bool operator
* push removal of linkedlist for capabilities
* add missing break
* address review comments
minor adjustments of syntax
* push a fix to the `CapabilitySet({shader, missing target})` code
* quality + error
1. add iterator to UIntSet
2. do not specialize target_switch if profile is derived from case (GLSL_150 is not compatible with GLSL_400)
* fix target_switch erroring + temporarily remove UIntSet::Iterator
temporarily remove UIntSet::Iterator. It will be added back afterwards; testing code on CI first so I can multi-task fixing the UIntSet Iterator
* fix the UIntSet iterator
* Revert "fix the UIntSet iterator" temporarily to pull from master
* add metal error as per texture.slang
(took a while to realize this was why things were breaking; the errors should likely be adjusted to reflect this)
* Rework UIntSet to have a template for output type
This is done so it is reasonable to debug the iterator output rather than dealing with messy ints
Fix problems with the iterators implemented + invalid capabilities handling
* removed incorrect `__target_switch` capability
barycentric was being used in anticipation of `profile glsl450`, but this does not expand into `GL_EXT_fragment_shader_barycentric`; instead it caused an error that is hidden during cross-compilation.
* remove some uses of getElements
* remove undeclared_stage for now
* remove redundant code associated with `undeclared_stage`
* remove unused variable
* address review
Of particular note: removed a static in a thread-dangerous scope. Now using a `const static` for read-only (thread-safe) data which precompile steps generate
* move GLSL_150 capdef change to sm_4_1 (more accurate)
* address most review comments
did not address: https://github.com/shader-slang/slang/pull/4145#discussion_r1602256776
* revert incorrect code review suggestion
* push changes for all code review suggestions
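Several of the commits above touch `getElements` and `bitscanForward`. As background, the general technique of listing the members of a word-backed bit set by repeatedly finding and clearing the lowest set bit can be sketched as follows. This is a portable illustration, not the `UIntSet` code from this PR: `lowestSetBit` and `elementsOf` are hypothetical names, and the loop in `lowestSetBit` is a stand-in for a compiler intrinsic such as MSVC's `_BitScanForward64` or GCC/Clang's `__builtin_ctzll` (which intrinsics the PR actually uses is an assumption).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Index of the lowest set bit of `w`. Precondition: w != 0.
// Portable stand-in for a bit-scan-forward intrinsic.
inline unsigned lowestSetBit(std::uint64_t w) {
    assert(w != 0);
    unsigned index = 0;
    while ((w & 1u) == 0) {
        w >>= 1;
        ++index;
    }
    return index;
}

// Collect the indices of all set bits across a sequence of 64-bit words.
inline std::vector<unsigned> elementsOf(const std::vector<std::uint64_t>& words) {
    std::vector<unsigned> out;
    for (std::size_t wi = 0; wi < words.size(); ++wi) {
        std::uint64_t w = words[wi];
        while (w != 0) {
            out.push_back(static_cast<unsigned>(wi * 64) + lowestSetBit(w));
            w &= w - 1; // clear the lowest set bit
        }
    }
    return out;
}
```

The inner loop runs once per set bit rather than once per possible bit, which is why a bit-scan based enumeration tends to be cheaper than testing every index.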
Diffstat (limited to 'tests/language-feature/capability')
5 files changed, 67 insertions, 12 deletions
diff --git a/tests/language-feature/capability/capability1.slang b/tests/language-feature/capability/capability1.slang
index b340f5025..76bbd534f 100644
--- a/tests/language-feature/capability/capability1.slang
+++ b/tests/language-feature/capability/capability1.slang
@@ -15,7 +15,7 @@ void caller()
 }
 
 [require(spirv, shaderclock)]
-// CHECK: ([[# @LINE+1]]): error 36104:
+// CHECK: error 36104:
 void main1()
 {
     caller(); // Error, shaderclock does not imply spvShaderNonUniform.
@@ -25,6 +25,5 @@ void main1()
 [require(spirv, shaderclock)]
 void main2()
 {
-    // CHECK-NOT: error
     leafFunc1(); // OK, shaderclock implies spvShaderClockKHR.
 }
diff --git a/tests/language-feature/capability/capability2.slang b/tests/language-feature/capability/capability2.slang
index b66da4563..6125c21d3 100644
--- a/tests/language-feature/capability/capability2.slang
+++ b/tests/language-feature/capability/capability2.slang
@@ -39,19 +39,16 @@ struct Impl1 : IFoo
     }
 }
 
-// CHECK-NOT: error 361
-
 struct Impl2 : IFoo
 {
-    // CHECK: ([[# @LINE+1]]): error 36104: {{.*}}spvGroupNonUniformArithmetic
+    // CHECK: error 36104: {{.*}}spvGroupNonUniformArithmetic
     void method1()
     {
         useRayQueryKHR(); // OK.
         useNonUniformArithmetic(); // error.
     }
 
-    // CHECK-NOT: error 361
-    // CHECK: ([[# @LINE+1]]): error 36104: {{.*}}spvGroupNonUniformArithmetic
+    // CHECK: error 36104: {{.*}}spvGroupNonUniformArithmetic
     void method2()
     {
         useAtomicFloat16();
diff --git a/tests/language-feature/capability/capability3.slang b/tests/language-feature/capability/capability3.slang
index 96c07a51f..67099a1da 100644
--- a/tests/language-feature/capability/capability3.slang
+++ b/tests/language-feature/capability/capability3.slang
@@ -9,7 +9,7 @@ module test;
 
 RWStructuredBuffer<int> sideEffect;
 
-// CHECK: error 36104
+// CHECK: error {{(36107|36108)}}
 [require(glsl, _sm_4_0)]
 public void use1()
 {
@@ -28,7 +28,7 @@ void use2Sub()
         sideEffect[1] = 1;
     }
 }
-// CHECK: error 36108
+// CHECK: error {{(36107|36108)}}
 [require(spirv, spirv_1_0)]
 public void use2()
 {
diff --git a/tests/language-feature/capability/capability4.slang b/tests/language-feature/capability/capability4.slang
index f6ee30339..4a2a6f3c9 100644
--- a/tests/language-feature/capability/capability4.slang
+++ b/tests/language-feature/capability/capability4.slang
@@ -1,23 +1,23 @@
 //TEST:SIMPLE(filecheck=CHECK): -target spirv -emit-spirv-directly -entry main -stage compute
 //TEST:SIMPLE(filecheck=CHECK_IGNORE_CAPS): -target spirv -emit-spirv-directly -entry main -stage compute -ignore-capabilities
-// CHECK_IGNORE_CAPS-NOT: error 36108
+// CHECK_IGNORE_CAPS-NOT: error 36104
 
 // Check that a non-static member method implictly requires capabilities
 // defined in ThisType.
+//CHECK: error 36108: {{.*}} 'glsl'.
+//CHECK: note: see using of 'Type'
 [require(hlsl)]
 struct Type
 {
     int member;
 
     [require(glsl)]
     [mutating]
-    // CHECK: ([[# @LINE+1]]): error 36108:
     void f()
     {
     }
 
     [require(glsl)]
-    // CHECK-NOT: ([[# @LINE+1]]): error 36108:
     static void f1()
     {
     }
diff --git a/tests/language-feature/capability/specializeTargetSwitch.slang b/tests/language-feature/capability/specializeTargetSwitch.slang
new file mode 100644
index 000000000..251adfaf8
--- /dev/null
+++ b/tests/language-feature/capability/specializeTargetSwitch.slang
@@ -0,0 +1,59 @@
+//TEST:SIMPLE(filecheck=CHECK_HLSL): -target hlsl -entry main -stage compute -capability _sm_5_1
+//TEST:SIMPLE(filecheck=CHECK_GLSL1): -target glsl -entry main -stage compute -capability _GLSL_420
+//TEST:SIMPLE(filecheck=CHECK_GLSL2): -target glsl -entry main -stage compute -capability _GLSL_330
+//TEST:SIMPLE(filecheck=CHECK_METAL): -target cpp -entry main -stage compute -capability image_loadstore
+//TEST:SIMPLE(filecheck=CHECK_WILL_ERROR1): -target glsl -entry main -stage compute -capability image_loadstore -DWILL_ERROR1
+//TEST:SIMPLE(filecheck=CHECK_WILL_ERROR2): -target glsl -entry main -stage compute -capability _GLSL_130 -DWILL_ERROR2
+//TEST:SIMPLE(filecheck=CHECK_GLSL3): -target glsl -entry main -stage compute -capability _GLSL_130
+
+RWTexture1D<int> tex;
+
+//CHECK_HLSL: {{.*}}21{{.*}};
+//CHECK_GLSL1: {{.*}}13{{.*}}
+//CHECK_GLSL2: {{.*}}11{{.*}}
+//CHECK_METAL: {{.*}}30{{.*}}
+//CHECK_WILL_ERROR1: error 36109
+//CHECK_WILL_ERROR2: error 41011
+//CHECK_GLSL3: {{.*}}30{{.*}}
+
+int specialize()
+{
+    __target_switch
+    {
+    case spirv_1_0:
+        return 1;
+    case spirv_1_1:
+        return 2;
+    case spirv_1_2:
+        return 3;
+
+    case _GLSL_150:
+        return 10;
+    case _GLSL_330:
+        return 11;
+    case _GLSL_400:
+        return 12;
+    case _GLSL_410:
+        return 13;
+#ifdef WILL_ERROR1
+    case image_loadstore:
+        return 14;
+#endif
+    case _sm_5_0:
+        return 20;
+    case _sm_5_1:
+        return 21;
+    case _sm_6_0:
+        return 21;
+#ifndef WILL_ERROR2
+    default:
+        return 30;
+#endif
+    }
+}
+
+[numthreads(1,1,1)]
+void main()
+{
+    tex[0] = specialize();
+}
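The expected values in `specializeTargetSwitch.slang` can be read as "pick the most specialized case the requested profile satisfies, otherwise fall back to `default`". The C++ sketch below is a toy model of that reading, not the compiler's algorithm. Its assumptions: profiles are encoded as a family name plus a number (for example `_GLSL_410` as `{"glsl", 410}`), and the rule is "greatest case version not above the requested version". Those assumptions happen to reproduce the CHECK values for the non-error configurations; the real selection works on capability sets. `SwitchCase`, `exampleCases`, and `pickCase` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One `case` label, reduced to a profile family, a numeric version,
// and the value the case returns.
struct SwitchCase {
    std::string family;
    int version;
    int value;
};

// The cases of the test's __target_switch (without the WILL_ERROR1 case).
inline std::vector<SwitchCase> exampleCases() {
    return {
        {"spirv", 10, 1},  {"spirv", 11, 2},  {"spirv", 12, 3},
        {"glsl", 150, 10}, {"glsl", 330, 11}, {"glsl", 400, 12}, {"glsl", 410, 13},
        {"hlsl", 50, 20},  {"hlsl", 51, 21},  {"hlsl", 60, 21},
    };
}

// Return the case of the same family with the greatest version that does not
// exceed `version`, or nullptr when no case applies (caller uses `default`).
inline const SwitchCase* pickCase(const std::vector<SwitchCase>& cases,
                                  const std::string& family, int version) {
    const SwitchCase* best = nullptr;
    for (const SwitchCase& c : cases) {
        if (c.family != family || c.version > version) continue;
        if (best == nullptr || c.version > best->version) best = &c;
    }
    return best;
}
```

Under this model `_GLSL_420` selects the `_GLSL_410` case (13), `_GLSL_330` selects 11, and `_GLSL_130` matches no case, so the `default` value 30 is used when a `default` exists.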
