author: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> 2024-05-16 00:04:12 -0400
committer: GitHub <noreply@github.com> 2024-05-16 00:04:12 -0400
commit: 1b89f78cd1762aa08402bd656e807b66833b11d0
tree: 2be71c9d97af8d28d440981d0c5adc726d9eac56 /source
parent: 3b0de8b6ea484091146f61e663c63beeac5b4798
Capabilities System, CapabilitySet Logic Overhaul (#4145)
* Capabilities System, Backing Logic Overhaul

  Fixes #4015

  Problems to address:
  1. Currently the capabilities system spends anywhere from 25-50% of compile time on the CapabilityVisitor. Most of this time is spent on join logic:
     1. Finding abstract atoms
     2. Comparing list1<->list2

     This should and can be made significantly faster.
  2. The error system does not produce errors with auxiliary information. This will require a partial redesign to provide more useful semantic information for debugging.

  What was addressed:
  1. The array-backed `CapabilityConjunctionSet` was replaced in favor of a `UIntSet`-backed `CapabilityTargetSets`. The design is described below.

     Design:
     * `CapabilityTargetSets` is a `Dictionary<targetAtom, CapabilityTargetSet>`. This is not an array for 2 reasons:
       1. It is easy to figure out which target is missing between two `CapabilityTargetSets`.
       2. Statically allocating an array requires the preprocessor to manually annotate which Capability is a target and link that Capability to an index. This means a dictionary is required for lookup regardless of implementation.
     * `CapabilityTargetSet` is an intermediate representation of all capabilities for a singular `target` atom (`glsl`, `hlsl`, `metal`, ...). This structure contains a dictionary of all stage-specific capability sets for fast lookup of the stage capabilities supported by a `CapabilitySet` for a `target` atom. This reduces the number of sets searched.
     * `CapabilityStageSet` is an intermediate representation of all capabilities for a singular `stage` atom (`vertex`, `fragment`, ...). This structure holds all disjoint capability sets for a `stage`. A disjoint set is rare, but may exist in some scenarios (as an example): `{glsl, EXT_GL_FOO}{glsl, _GLSL_130, _GLSL_150}`. This reduces the number of sets searched.
     * `UIntSet` is the main reason for the redesign, for better performance and memory usage. All set operations only require a few operations, making all set logic trivial and with minimal cost to run.

     All algorithms were modified to focus around `UIntSet` operations.
  2. Errors
     * Semantic information is now better linked to the calling function, providing a function<->function_body connection when saving semantic information for errors.
     * Missing targets now print errors much like other error codes, by finding code which could be a cause of incompatibility.

  What is missing:
  1. Add non-naive support for non-stage-specific capabilities such as `{hlsl, _sm_5_0}`. Currently non-stage-specific targets emulate the behavior by assigning such capabilities to every stage: `{hlsl, _sm_5_0, vertex} {hlsl, _sm_5_0, fragment}...`. Removal of this behavior would remove redundant shader stage sets being made at construction time (~80% of new implementation runtime). This is an addition, not an overhaul.
  2. Optionally: `UIntSet` should be modified to support SIMD operations for significantly faster operations. This is not required immediately since `UIntSet` is already not a performance constraint.

  Notes:
  * UIntSet had implementation bugs which were fixed in this PR.
  * The old capabilities system had bugs which were fixed in this PR when transforming to the new implementation.

* fix .natvis debug view

* Small optimizations I found while working on the addition

  The AST building pass looks like so now:
  * 1% = ~capabilitySet
  * 2% = capabilitySet()
  * 1.5% capabilitySet::unionWith()
  * 0.8% capabilitySet::join()
  * 1.5% auxiliary info for debugging
  * ~0.5-1% extra visitor overhead
  * ~5% total for the visitor
  * ~6.5% for total runtime costs

* fix caps which were wrong but worked

* push minor syntax fix (still looking for why other tests fail)

* perf & bug fixes

  1. Did not properly remake isBetterForTarget for the this->empty case with that as Invalid. This is the best case in this scenario.
  2. Remade the serializer for stdlib generation. Faster (more direct) & cleaner code.

  NOTE: did not address review comments

* fix glsl.meta caps error

* fixing findBest logic again & UIntSet wrapper

  findBest was not checking for 'more specialized' targets & the element counter was flawed

* faster getElements algorithm + natvis for UIntSet + wrong warning

* type incompatibility of bitscanForward implementations

* try to fix warnings again

* remove ptr for clang intrinsic

* add missing header

* ifdef to allow clang compile

* compiler hackery to fix up platform/type independent operations

* bracket

* fix MSVC error

* missing template

* change types out again

* changes to fix compiling

* adjustment to parameter for Clang/GCC

* added iterator to delay processing all atomSets of a CapabilitySet

* add a few missing consts

* ensure we never have more than 1 disjointSet

  Added a wrapper + assert + union functionality to all possible disjoint sets. This was done in favor of a removal of the LinkedList for 2 reasons:
  1. We still need 0-1 set functionality.
  2. Might as well keep the code, just disallow the problematic functionality.

* address review comments

  non linked-list refactor review comments addressed; add doc comments + remove redundant code

* comments + remove isValid for bool operator

* push removal of linkedlist for capabilities

* add missing break

* address review comments

  minor adjustments of syntax

* push a fix to the `CapabilitySet({shader, missing target})` code

* quality + error

  1. add iterator to UIntSet
  2. do not specialize target_switch if profile is derived from case (GLSL_150 is not compatible with GLSL_400)

* fix target_switch erroring + temporarily remove UIntSet::Iterator

  Temporarily remove UIntSet::Iterator. It will be added after, testing code on CI first so I can multi-task fixing the UIntSet Iterator

* fix the UIntSet iterator

* Revert "fix the UIntSet iterator" temporarily to pull from master

* add metal error as per texture.slang (took a while to realize this was why things were breaking, likely should adjust errors to reflect this)

* Rework UIntSet to have a template for output type

  This is done so it is reasonable to debug the iterator output and not just deal with messy ints.

  Fix problems with the iterators implemented + invalid capabilities handling

* removed incorrect `__target_switch` capability

  barycentric was being used in anticipation of `profile glsl450`; this does not expand into `GL_EXT_fragment_shader_barycentric` and instead caused an error which is hidden during cross-compile.

* remove some uses of getElements

* remove undeclared_stage for now

* remove redundant code associated with `undeclared_stage`

* remove unused variable

* address review

  Specifically to note: removed a static in a thread-dangerous scope. Now using a `const static` for read-only (thread-safe) data which precompile steps generate

* move GLSL_150 capdef change to sm_4_1 (more accurate)

* address most review comments

  did not address: https://github.com/shader-slang/slang/pull/4145#discussion_r1602256776

* revert incorrect code review suggestion

* push changes for all code review suggestions
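
The target -> stage -> bit-set layout described in the Design notes above can be pictured with a small stand-alone sketch. This is not the Slang implementation: `AtomBits`, `StageSets`, `TargetSets`, `addAtom`, `containsAll`, and `missingTargets` are hypothetical names, string keys stand in for the real target and stage atoms, and `std::map` stands in for Slang's `Dictionary`. It only illustrates why a dictionary makes a missing target easy to spot and why a subset check costs one AND per 64 atoms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a bit-per-atom set:
// bit i of word w represents atom id w * 64 + i.
using AtomBits = std::vector<uint64_t>;

// Mark one atom id as present, growing the word array when needed.
inline void addAtom(AtomBits& bits, std::size_t atom)
{
    const std::size_t word = atom >> 6; // atom / 64
    if (word >= bits.size())
        bits.resize(word + 1, 0);
    bits[word] |= uint64_t(1) << (atom & 63); // atom % 64
}

// True if every atom in `required` is also present in `available`.
inline bool containsAll(const AtomBits& available, const AtomBits& required)
{
    for (std::size_t i = 0; i < required.size(); ++i)
    {
        const uint64_t have = (i < available.size()) ? available[i] : 0;
        if ((required[i] & ~have) != 0)
            return false;
    }
    return true;
}

// Assumed layout: target name -> stage name -> atoms for that pair.
using StageSets = std::map<std::string, AtomBits>;
using TargetSets = std::map<std::string, StageSets>;

// Targets that appear in `required` but have no entry in `available`.
inline std::vector<std::string> missingTargets(const TargetSets& available,
                                               const TargetSets& required)
{
    std::vector<std::string> missing;
    for (const auto& entry : required)
    {
        if (available.find(entry.first) == available.end())
            missing.push_back(entry.first);
    }
    return missing;
}
```

With this shape, a per-target lookup narrows the search to one stage dictionary before any bit comparison happens, which is the "reduces the number of sets searched" point made in the Design notes.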
Diffstat (limited to 'source')
-rw-r--r-- source/core/slang-linked-list.h | 2
-rw-r--r-- source/core/slang-uint-set.cpp | 62
-rw-r--r-- source/core/slang-uint-set.h | 209
-rw-r--r-- source/slang/glsl.meta.slang | 10
-rw-r--r-- source/slang/hlsl.meta.slang | 72
-rw-r--r-- source/slang/slang-ast-dump.cpp | 7
-rw-r--r-- source/slang/slang-capabilities.capdef | 161
-rw-r--r-- source/slang/slang-capability.cpp | 1624
-rw-r--r-- source/slang/slang-capability.h | 307
-rw-r--r-- source/slang/slang-check-decl.cpp | 244
-rw-r--r-- source/slang/slang-check-impl.h | 5
-rw-r--r-- source/slang/slang-check-modifier.cpp | 3
-rw-r--r-- source/slang/slang-check-shader.cpp | 24
-rw-r--r-- source/slang/slang-check-stmt.cpp | 2
-rw-r--r-- source/slang/slang-compiler.cpp | 6
-rw-r--r-- source/slang/slang-diagnostic-defs.h | 3
-rw-r--r-- source/slang/slang-ir-link.cpp | 8
-rw-r--r-- source/slang/slang-ir-specialize-target-switch.cpp | 30
-rw-r--r-- source/slang/slang-ir-specialize-target-switch.h | 3
-rw-r--r-- source/slang/slang-ir.cpp | 12
-rw-r--r-- source/slang/slang-serialize-ast-type-info.h | 153
-rw-r--r-- source/slang/slang.cpp | 4
-rw-r--r-- source/slang/slang.natvis | 110
23 files changed, 1624 insertions, 1437 deletions
diff --git a/source/core/slang-linked-list.h b/source/core/slang-linked-list.h
index 93b5e435c..840ef8cd6 100644
--- a/source/core/slang-linked-list.h
+++ b/source/core/slang-linked-list.h
@@ -323,7 +323,7 @@ public:
}
return rs;
}
- int getCount() { return count; }
+ int getCount() const { return count; }
};
} // namespace Slang
#endif
diff --git a/source/core/slang-uint-set.cpp b/source/core/slang-uint-set.cpp
index e973cbc3a..b6871c192 100644
--- a/source/core/slang-uint-set.cpp
+++ b/source/core/slang-uint-set.cpp
@@ -3,18 +3,6 @@
namespace Slang
{
-static bool _areAllZero(const UIntSet::Element* elems, Index count)
-{
- for (Index i = 0; count; ++i)
- {
- if (elems[i])
- {
- return false;
- }
- }
- return true;
-}
-
UIntSet& UIntSet::operator=(UIntSet&& other)
{
m_buffer = _Move(other.m_buffer);
@@ -49,14 +37,8 @@ void UIntSet::setAll()
void UIntSet::resize(UInt size)
{
- const Index oldCount = m_buffer.getCount();
const Index newCount = Index((size + kElementMask) >> kElementShift);
- m_buffer.setCount(newCount);
-
- if (newCount > oldCount)
- {
- ::memset(m_buffer.getBuffer() + oldCount, 0, (newCount - oldCount) * sizeof(Element));
- }
+ resizeBackingBufferDirectly(newCount);
}
void UIntSet::clear()
@@ -66,17 +48,7 @@ void UIntSet::clear()
bool UIntSet::isEmpty() const
{
- const Element*const src = m_buffer.getBuffer();
- const Index count = m_buffer.getCount();
-
- for (Index i = 0; i < count; ++i)
- {
- if (src[i])
- {
- return false;
- }
- }
- return true;
+ return _areAllZero(m_buffer.getBuffer(), m_buffer.getCount());
}
void UIntSet::clearAndDeallocate()
@@ -106,7 +78,7 @@ bool UIntSet::operator==(const UIntSet& set) const
const Index minCount = Math::Min(aCount, bCount);
- return ::memcmp(aElems, bElems, minCount) == 0 &&
+ return ::memcmp(aElems, bElems, minCount*sizeof(Element)) == 0 &&
_areAllZero(aElems + minCount, aCount - minCount) &&
_areAllZero(bElems + minCount, bCount - minCount);
}
@@ -123,6 +95,15 @@ void UIntSet::intersectWith(const UIntSet& set)
}
}
+void UIntSet::subtractWith(const UIntSet& set)
+{
+ const Index minCount = Math::Min(this->m_buffer.getCount(), set.m_buffer.getCount());
+ for (Index i = 0; i < minCount; i++)
+ {
+ this->m_buffer[i] = this->m_buffer[i] & (~set.m_buffer[i]);
+ }
+}
+
/* static */void UIntSet::calcUnion(UIntSet& outRs, const UIntSet& set1, const UIntSet& set2)
{
outRs.m_buffer.setCount(Math::Max(set1.m_buffer.getCount(), set2.m_buffer.getCount()));
@@ -162,5 +143,24 @@ void UIntSet::intersectWith(const UIntSet& set)
return false;
}
+Index UIntSet::countElements() const
+{
+ // TODO: This can be made faster using SIMD intrinsics to count set bits.
+ uint64_t tmp;
+ constexpr Index loopSize = ((sizeof(Element) / sizeof(tmp)) != 0) ? sizeof(Element) / sizeof(tmp) : 1;
+ Index count = 0;
+ for (auto index = 0; index < this->m_buffer.getCount(); index++)
+ {
+ for (auto i = 0; i < loopSize; i++)
+ {
+ tmp = m_buffer[index] >> (sizeof(tmp) * i);
+ tmp = tmp - ((tmp >> 1) & 0x5555555555555555);
+ tmp = (tmp & 0x3333333333333333) + ((tmp >> 2) & 0x3333333333333333);
+ count += ((tmp + (tmp >> 4) & 0xF0F0F0F0F0F0F0F) * 0x101010101010101) >> 56;
+ }
+ }
+ return count;
+}
+
}
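
For readers following the `countElements` hunk above: the three masking steps are the classic branch-free (SWAR) population count. The stand-alone sketch below shows the same technique on a single 64-bit word, with the intermediate sums labelled. `popcount64` and `countBits` are hypothetical names used only for illustration, and a plain `std::vector` stands in for the backing buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count the set bits of one 64-bit word without intrinsics.
inline int popcount64(uint64_t x)
{
    // Each 2-bit field now holds the number of bits set in that field.
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    // Each 4-bit field now holds the sum of its two 2-bit counts.
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    // Each byte now holds the sum of its two 4-bit counts.
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    // The multiply adds all eight byte counts into the top byte.
    return int((x * 0x0101010101010101ULL) >> 56);
}

// Total number of values stored in a bit-per-value buffer.
inline std::size_t countBits(const std::vector<uint64_t>& words)
{
    std::size_t total = 0;
    for (uint64_t w : words)
        total += std::size_t(popcount64(w));
    return total;
}
```

The cost is a fixed handful of arithmetic operations per 64 values, independent of how many bits are set.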
diff --git a/source/core/slang-uint-set.h b/source/core/slang-uint-set.h
index 0f2165bab..22ca457b0 100644
--- a/source/core/slang-uint-set.h
+++ b/source/core/slang-uint-set.h
@@ -6,31 +6,83 @@
#include "slang-common.h"
#include "slang-hash.h"
+#if defined(_MSC_VER)
+#include <intrin.h>
+#endif
#include <memory.h>
namespace Slang
{
+template<typename T>
+constexpr static Index computeElementShift()
+{
+ Index currentShift = 0;
+ Index currentShiftValue = 1;
+
+ while (currentShiftValue != sizeof(T) * 8)
+ {
+ currentShift++;
+ currentShiftValue *= 2;
+ }
+
+ return currentShift;
+}
+
+static inline Index bitscanForward(uint64_t in)
+{
+#if defined(_MSC_VER)
+
+#ifdef _WIN64
+ uint64_t out = 0;
+ _BitScanForward64((unsigned long*)&out, in);
+ return Index(out);
+#else
+ constexpr uint32_t bitsInType = sizeof(uint32_t) * 8;
+ uint32_t out;
+ // check for 0s in 0bit->31bit. If all 0's, check for 0s in 32bit->63bit
+ _BitScanForward((unsigned long*)&out, *(((uint32_t*)&in) + 1));
+ if (out != bitsInType)
+ return Index(out);
+ _BitScanForward((unsigned long*)&out, *(((uint32_t*)&in)));
+ return Index(out + bitsInType);
+#endif// #ifdef _WIN64
+
+#else
+ return Index(__builtin_ctzll(in));
+#endif// #if defined(_MSC_VER)
+}
+
/* Hold a set of UInt values. Implementation works by storing as a bit per value */
+/// UIntSet is essentially a Element[], where each Element is `b` bits big.
+/// Each index has `b` number of integers. If the bit is 1, we have an element there.
+/// Value of each element is equal to the binary offset from Element[0], bit 0.
class UIntSet
{
public:
typedef UIntSet ThisType;
- typedef uint32_t Element; ///< Type that holds the bits to say if value is present
+ typedef uint64_t Element; ///< Type that holds the bits to say if value is present
+ constexpr static Index kElementSize = sizeof(Element) * 8; ///< The number of bits in an element. This also determines how many values a element can hold.
+ constexpr static Index kElementMask = kElementSize - 1; ///< Mask to get shift from an index
+ constexpr static Index kElementShift = computeElementShift<Element>(); ///< How many bits to shift to get Element index from an index. 5 for 2^5=32 elements in a uint32_t. 6 for 2^6=64 in a uint64_t.
+
UIntSet() {}
UIntSet(const UIntSet& other) { m_buffer = other.m_buffer; }
UIntSet(UIntSet && other) { *this = (_Move(other)); }
UIntSet(UInt maxVal) { resizeAndClear(maxVal); }
+ UIntSet(List<UIntSet::Element> buffer) { m_buffer = buffer; }
UIntSet& operator=(UIntSet&& other);
UIntSet& operator=(const UIntSet& other);
HashCode getHashCode() const;
- /// Return the count of all bits directly represented
+ /// Return the count of all bits directly represented
Int getCount() const { return Int(m_buffer.getCount()) * kElementSize; }
+ List<Element>& getBuffer() { return m_buffer; }
+
/// Resize such that val can be stored and clear contents
void resizeAndClear(UInt val);
/// Set all of the values up to count, as set
@@ -38,6 +90,7 @@ public:
/// Resize (but maintain contents) up to bit size.
/// NOTE! That since storage is in Element blocks, it may mean some values after size are set (up to the Element boundary)
void resize(UInt size);
+ void resizeBackingBufferDirectly(Index size);
/// Clear all of the contents (by clearing the bits)
void clear();
@@ -47,6 +100,8 @@ public:
/// Add a value
inline void add(UInt val);
+ inline void add(const UIntSet& val);
+
/// Remove a value
inline void remove(UInt val);
/// Returns true if the value is present
@@ -59,10 +114,12 @@ public:
/// !=
bool operator!=(const UIntSet& set) const { return !(*this == set); }
- /// Store the union between this and set in this
+ /// Store the union between this and set
void unionWith(const UIntSet& set);
- /// Store the intersection between this and set in this
+ /// Store the intersection between this and set
void intersectWith(const UIntSet& set);
+ /// Store the subtraction between this and set
+ void subtractWith(const UIntSet& set);
///
bool isEmpty() const;
@@ -70,6 +127,10 @@ public:
/// Swap this with rhs
void swapWith(ThisType& rhs) { m_buffer.swapWith(rhs.m_buffer); }
+ template<typename T>
+ List<T> getElements() const;
+ Index countElements() const;
+
/// Store the union of set1 and set2 in outRs
static void calcUnion(UIntSet& outRs, const UIntSet& set1, const UIntSet& set2);
/// Store the intersection of set1 and set2 in outRs
@@ -80,16 +141,98 @@ public:
/// Returns true if set1 and set2 have a same value set (ie there is an intersection)
static bool hasIntersection(const UIntSet& set1, const UIntSet& set2);
-private:
- enum
+ struct Iterator
{
- kElementShift = 5, ///< How many bits to shift to get Element index from an index
- kElementSize = sizeof(Element) * 8, ///< The number of bits in an element
- kElementMask = kElementSize - 1, ///< Mask to get shift from an index
+ friend class UIntSet;
+ private:
+ const List<Element>* context;
+ Index block = 0;
+ Element processedElement = 0;
+ uint64_t LSB = 0;
+
+ void clearLSB()
+ {
+ LSB = bitscanForward(processedElement);
+ processedElement &= processedElement - 1;
+ }
+ public:
+ Iterator(const List<Element>* inContext)
+ {
+ context = inContext;
+ }
+
+ Element operator*()
+ {
+ return Element(LSB + (kElementSize * block));
+ }
+
+ Iterator& operator++()
+ {
+ while (processedElement == 0)
+ {
+ block++;
+ if (block >= context->getCount())
+ {
+ return *this;
+ }
+ processedElement = (*context)[block];
+ }
+ clearLSB();
+ return *this;
+ }
+ Iterator& operator++(int)
+ {
+ return ++(*this);
+ }
+ bool operator==(const Iterator& other) const
+ {
+ return other.block == this->block
+ && other.processedElement == this->processedElement;
+ }
+ bool operator!=(const Iterator& other) const
+ {
+ return !(other == *this);
+ }
};
+ Iterator begin() const
+ {
+ Iterator tmp(&m_buffer);
+ if (m_buffer.getCount() == 0)
+ return tmp;
+
+ tmp.processedElement = m_buffer[0];
+ if (tmp.processedElement == 0)
+ tmp++;
+
+ tmp.clearLSB();
- // Make sure they are correct for the Element type
- SLANG_COMPILE_TIME_ASSERT((1 << kElementShift) == kElementSize);
+ return tmp;
+ }
+ Iterator end() const
+ {
+ Iterator tmp(&m_buffer);
+ tmp.block = m_buffer.getCount();
+ tmp.processedElement = 0;
+ return tmp;
+ }
+
+ bool areAllZero()
+ {
+ return _areAllZero(m_buffer.getBuffer(), m_buffer.getCount());
+ }
+
+protected:
+ static bool _areAllZero(const UIntSet::Element* elems, Index count)
+ {
+ for (Index i = 0; i < count; ++i)
+ {
+ if (elems[i])
+ {
+ return false;
+ }
+ }
+ return true;
+ }
List<Element> m_buffer;
};
@@ -132,6 +275,18 @@ inline bool UIntSet::contains(const UIntSet& set) const
}
// --------------------------------------------------------------------------
+
+inline void UIntSet::resizeBackingBufferDirectly(Index newCount)
+{
+ const Index oldCount = m_buffer.getCount();
+ m_buffer.setCount(newCount);
+
+ if (newCount > oldCount)
+ {
+ ::memset(m_buffer.getBuffer() + oldCount, 0, (newCount - oldCount) * sizeof(Element));
+ }
+}
+
inline void UIntSet::add(UInt val)
{
const Index idx = Index(val >> kElementShift);
@@ -142,6 +297,38 @@ inline void UIntSet::add(UInt val)
m_buffer[idx] |= Element(1) << (val & kElementMask);
}
+inline void UIntSet::add(const UIntSet& other)
+{
+ auto otherCount = other.m_buffer.getCount();
+ if (this->m_buffer.getCount() < otherCount)
+ resizeBackingBufferDirectly(otherCount);
+
+ for (auto i = 0; i < otherCount; i++)
+ m_buffer[i] |= other.m_buffer[i];
}
+template<typename T>
+List<T> UIntSet::getElements() const
+{
+ auto count = m_buffer.getCount();
+ if (count == 0)
+ return {};
+
+ // Specific path for uint64_t. If using SIMD we should not use this path due to larger data types.
+
+ List<T> elements;
+ elements.reserve(count);
+ for (Index block = 0; block < count; block++)
+ {
+ Element n = m_buffer[block];
+ while (n != 0)
+ {
+ elements.add(T(bitscanForward((uint64_t)n) + (kElementSize * block)));
+ n &= n - 1;
+ }
+ }
+ return elements;
+}
+
+}
#endif
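
The `getElements` and `Iterator` hunks above both walk the set by repeatedly taking the lowest set bit of a word and then clearing it with `n &= n - 1`. A minimal stand-alone sketch of that loop follows. `lowestSetBit` and `elementsOf` are hypothetical names, and the non-GCC/Clang branch is a simple fallback written for this sketch rather than the MSVC intrinsic path used in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Index of the lowest set bit. Precondition: x != 0.
inline int lowestSetBit(uint64_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_ctzll(x);
#else
    // Portable fallback for this sketch: shift until bit 0 is set.
    int n = 0;
    while ((x & 1) == 0)
    {
        x >>= 1;
        ++n;
    }
    return n;
#endif
}

// List every value stored in a bit-per-value buffer, in ascending order.
inline std::vector<uint64_t> elementsOf(const std::vector<uint64_t>& words)
{
    std::vector<uint64_t> out;
    for (std::size_t block = 0; block < words.size(); ++block)
    {
        uint64_t w = words[block];
        while (w != 0)
        {
            // Value = bit position within the word + 64 * word index.
            out.push_back(uint64_t(lowestSetBit(w)) + 64 * block);
            w &= w - 1; // clear the lowest set bit
        }
    }
    return out;
}
```

The inner loop runs once per stored value rather than once per possible bit, so sparse words are cheap to enumerate.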
diff --git a/source/slang/glsl.meta.slang b/source/slang/glsl.meta.slang
index bacc8958e..881fabb52 100644
--- a/source/slang/glsl.meta.slang
+++ b/source/slang/glsl.meta.slang
@@ -328,7 +328,7 @@ public vector<T,N> atan(vector<T,N> y, vector<T,N> x)
__generic<T : __BuiltinFloatingPointType>
[__readNone]
[ForceInline]
-[require(cpp_cuda_glsl_hlsl_spirv, GLSL_130)]
+[require(cpp_cuda_glsl_hlsl_spirv, sm_2_0_GLSL_140)]
public T inversesqrt(T x)
{
return rsqrt(x);
@@ -337,7 +337,7 @@ public T inversesqrt(T x)
__generic<T : __BuiltinFloatingPointType, let N:int>
[__readNone]
[ForceInline]
-[require(cpp_cuda_glsl_hlsl_spirv, GLSL_130)]
+[require(cpp_cuda_glsl_hlsl_spirv, sm_2_0_GLSL_140)]
public vector<T, N> inversesqrt(vector<T, N> x)
{
return rsqrt(x);
@@ -350,7 +350,7 @@ public vector<T, N> inversesqrt(vector<T, N> x)
__generic<T : __BuiltinFloatingPointType>
[__readNone]
[ForceInline]
-[require(cpp_cuda_glsl_hlsl_metal_spirv, GLSL_130)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
public T roundEven(T x)
{
return rint(x);
@@ -359,7 +359,7 @@ public T roundEven(T x)
__generic<T : __BuiltinFloatingPointType, let N:int>
[__readNone]
[ForceInline]
-[require(cpp_cuda_glsl_hlsl_metal_spirv, GLSL_130)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
public vector<T,N> roundEven(vector<T,N> x)
{
return rint(x);
@@ -8425,7 +8425,7 @@ public vec4 noise4(vector<float, N> x)
// TODO: if called after a return, error.
[ForceInline]
-[require(glsl_hlsl_spirv, shader_stages_compute_tesscontrol_tesseval)]
+[require(glsl_hlsl_spirv, glsl_barrier)]
public void barrier()
{
__target_switch
diff --git a/source/slang/hlsl.meta.slang b/source/slang/hlsl.meta.slang
index 176c7c0e4..26e691ddb 100644
--- a/source/slang/hlsl.meta.slang
+++ b/source/slang/hlsl.meta.slang
@@ -1818,7 +1818,7 @@ Array<T,4> __makeArray<T>(T v0, T v1, T v2, T v3);
// Gather for scalar textures.
__generic<TElement, T, Shape: __ITextureShape, let isArray:int, let sampleCount:int, let access:int, let isShadow:int, let format:int>
[ForceInline]
-[require(glsl_metal_spirv, GLSL_400)]
+[require(glsl_metal_spirv, texture_gather)]
vector<TElement,4> __texture_gather(__TextureImpl<T, Shape, isArray, 0, sampleCount, access, isShadow, 0, format> texture, SamplerState s, vector<float, Shape.dimensions+isArray> location, int component)
{
__target_switch
@@ -1867,7 +1867,7 @@ vector<TElement,4> __texture_gather(__TextureImpl<T, Shape, isArray, 0, sampleCo
}
__generic<TElement, T, Shape: __ITextureShape, let isArray:int, let sampleCount:int, let access:int, let isShadow:int, let format:int>
[ForceInline]
-[require(glsl_spirv, GLSL_400)]
+[require(glsl_spirv, texture_gather)]
vector<TElement,4> __texture_gather(__TextureImpl<T, Shape, isArray, 0, sampleCount, access, isShadow, 1, format> sampler, vector<float, Shape.dimensions+isArray> location, int component)
{
__target_switch
@@ -1882,7 +1882,7 @@ vector<TElement,4> __texture_gather(__TextureImpl<T, Shape, isArray, 0, sampleCo
}
__generic<TElement, T, Shape: __ITextureShape, let isArray:int, let sampleCount:int, let access:int, let isShadow:int, let format:int>
[ForceInline]
-[require(glsl_metal_spirv, GLSL_400)]
+[require(glsl_metal_spirv, texture_gather)]
vector<TElement,4> __texture_gather_offset(__TextureImpl<T, Shape, isArray, 0, sampleCount, access, isShadow, 0, format> texture, SamplerState s, constexpr vector<float, Shape.dimensions+isArray> location, constexpr vector<int, Shape.planeDimensions> offset, int component)
{
__target_switch
@@ -1917,7 +1917,7 @@ vector<TElement,4> __texture_gather_offset(__TextureImpl<T, Shape, isArray, 0, s
}
__generic<TElement, T, Shape: __ITextureShape, let isArray:int, let sampleCount:int, let access:int, let isShadow:int, let format:int>
[ForceInline]
-[require(glsl_spirv, GLSL_400)]
+[require(glsl_spirv, texture_gather)]
vector<TElement,4> __texture_gather_offset(__TextureImpl<T, Shape, isArray, 0, sampleCount, access, isShadow, 1, format> sampler, vector<float, Shape.dimensions+isArray> location, constexpr vector<int, Shape.planeDimensions> offset, int component)
{
__target_switch
@@ -1932,7 +1932,7 @@ vector<TElement,4> __texture_gather_offset(__TextureImpl<T, Shape, isArray, 0, s
}
__generic<TElement, T, Shape: __ITextureShape, let isArray:int, let sampleCount:int, let access:int, let isShadow:int, let format:int>
[ForceInline]
-[require(glsl_spirv, GLSL_400)]
+[require(glsl_spirv, texture_gather)]
vector<TElement,4> __texture_gather_offsets(__TextureImpl<T, Shape, isArray, 0, sampleCount, access, isShadow, 0, format> texture, SamplerState s, vector<float, Shape.dimensions+isArray> location,
constexpr vector<int, Shape.planeDimensions> offset1,
constexpr vector<int, Shape.planeDimensions> offset2,
@@ -1955,8 +1955,9 @@ vector<TElement,4> __texture_gather_offsets(__TextureImpl<T, Shape, isArray, 0,
}
__generic<TElement, T, Shape: __ITextureShape, let isArray:int, let sampleCount:int, let access:int, let isShadow:int, let format:int>
[ForceInline]
-[require(glsl_spirv, GLSL_400)]
+[require(glsl_spirv, texture_gather)]
vector<TElement,4> __texture_gather_offsets(__TextureImpl<T, Shape, isArray, 0, sampleCount, access, isShadow, 1, format> sampler, vector<float, Shape.dimensions+isArray> location,
+
constexpr vector<int, Shape.planeDimensions> offset1,
constexpr vector<int, Shape.planeDimensions> offset2,
constexpr vector<int, Shape.planeDimensions> offset3,
@@ -1977,7 +1978,7 @@ vector<TElement,4> __texture_gather_offsets(__TextureImpl<T, Shape, isArray, 0,
}
__generic<TElement, T, Shape: __ITextureShape, let isArray:int, let sampleCount:int, let access:int, let isShadow:int, let format:int>
[ForceInline]
-[require(glsl_metal_spirv, GLSL_400)]
+[require(glsl_metal_spirv, texture_gather)]
vector<TElement,4> __texture_gatherCmp(__TextureImpl<T, Shape, isArray, 0, sampleCount, access, isShadow, 0, format> texture, SamplerComparisonState s, vector<float, Shape.dimensions+isArray> location, TElement compareValue)
{
__target_switch
@@ -2025,7 +2026,7 @@ vector<TElement,4> __texture_gatherCmp(__TextureImpl<T, Shape, isArray, 0, sampl
}
__generic<TElement, T, Shape: __ITextureShape, let isArray:int, let sampleCount:int, let access:int, let isShadow:int, let format:int>
[ForceInline]
-[require(glsl_spirv, GLSL_400)]
+[require(glsl_spirv, texture_gather)]
vector<TElement,4> __texture_gatherCmp(__TextureImpl<T, Shape, isArray, 0, sampleCount, access, isShadow, 1, format> sampler, vector<float, Shape.dimensions+isArray> location, TElement compareValue)
{
__target_switch
@@ -2040,7 +2041,7 @@ vector<TElement,4> __texture_gatherCmp(__TextureImpl<T, Shape, isArray, 0, sampl
}
__generic<TElement, T, Shape: __ITextureShape, let isArray:int, let sampleCount:int, let access:int, let isShadow:int, let format:int>
[ForceInline]
-[require(glsl_metal_spirv, GLSL_400)]
+[require(glsl_metal_spirv, texture_gather)]
vector<TElement,4> __texture_gatherCmp_offset(__TextureImpl<T, Shape, isArray, 0, sampleCount, access, isShadow, 0, format> texture, SamplerComparisonState s, vector<float, Shape.dimensions+isArray> location, TElement compareValue, constexpr vector<int, Shape.planeDimensions> offset)
{
__target_switch
@@ -2075,7 +2076,7 @@ vector<TElement,4> __texture_gatherCmp_offset(__TextureImpl<T, Shape, isArray, 0
}
__generic<TElement, T, Shape: __ITextureShape, let isArray:int, let sampleCount:int, let access:int, let isShadow:int, let format:int>
[ForceInline]
-[require(glsl_spirv, GLSL_400)]
+[require(glsl_spirv, texture_gather)]
vector<TElement,4> __texture_gatherCmp_offset(__TextureImpl<T, Shape, isArray, 0, sampleCount, access, isShadow, 1, format> sampler, vector<float, Shape.dimensions+isArray> location, TElement compareValue, constexpr vector<int, Shape.planeDimensions> offset)
{
__target_switch
@@ -2090,7 +2091,7 @@ vector<TElement,4> __texture_gatherCmp_offset(__TextureImpl<T, Shape, isArray, 0
}
__generic<TElement, T, Shape: __ITextureShape, let isArray:int, let sampleCount:int, let access:int, let isShadow:int, let format:int>
[ForceInline]
-[require(glsl_spirv, GLSL_400)]
+[require(glsl_spirv, texture_gather)]
vector<TElement,4> __texture_gatherCmp_offsets(__TextureImpl<T, Shape, isArray, 0, sampleCount, access, isShadow, 0, format> texture, SamplerComparisonState s, vector<float, Shape.dimensions+isArray> location, TElement compareValue,
vector<int, Shape.planeDimensions> offset1,
vector<int, Shape.planeDimensions> offset2,
@@ -2112,7 +2113,7 @@ vector<TElement,4> __texture_gatherCmp_offsets(__TextureImpl<T, Shape, isArray,
}
__generic<TElement, T, Shape: __ITextureShape, let isArray:int, let sampleCount:int, let access:int, let isShadow:int, let format:int>
[ForceInline]
-[require(glsl_spirv, GLSL_400)]
+[require(glsl_spirv, texture_gather)]
vector<TElement,4> __texture_gatherCmp_offsets(__TextureImpl<T, Shape, isArray, 0, sampleCount, access, isShadow, 1, format> sampler, vector<float, Shape.dimensions+isArray> location, TElement compareValue,
vector<int, Shape.planeDimensions> offset1,
vector<int, Shape.planeDimensions> offset2,
@@ -2509,7 +2510,7 @@ extension __TextureImpl<T,Shape,isArray,1,sampleCount,0,isShadow,isCombined,form
[__readNone]
[ForceInline]
- [require(cpp_glsl_hlsl_spirv, texture_sm_4_1)]
+ [require(cpp_glsl_hlsl_spirv, texture_sm_4_1_samplerless)]
T Load(vector<int, Shape.dimensions + isArray + 1> locationAndSampleIndex)
{
return Load(__vectorReshape<Shape.dimensions + isArray>(locationAndSampleIndex), locationAndSampleIndex[Shape.dimensions + isArray]);
@@ -5079,7 +5080,7 @@ matrix<T, N, M> acos(matrix<T, N, M> x)
__generic<T : __BuiltinFloatingPointType>
[__readNone]
[ForceInline]
-[require(cpp_cuda_glsl_hlsl_metal_spirv, GLSL_130)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
T acosh(T x)
{
__target_switch
@@ -5099,7 +5100,7 @@ T acosh(T x)
__generic<T : __BuiltinFloatingPointType, let N:int>
[__readNone]
[ForceInline]
-[require(cpp_cuda_glsl_hlsl_metal_spirv, GLSL_130)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
vector<T,N> acosh(vector<T,N> x)
{
__target_switch
@@ -5535,7 +5536,7 @@ matrix<T, N, M> asin(matrix<T, N, M> x)
__generic<T : __BuiltinFloatingPointType>
[__readNone]
[ForceInline]
-[require(cpp_cuda_glsl_hlsl_metal_spirv, GLSL_130)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
T asinh(T x)
{
__target_switch
@@ -5555,7 +5556,7 @@ T asinh(T x)
__generic<T : __BuiltinFloatingPointType, let N:int>
[__readNone]
[ForceInline]
-[require(cpp_cuda_glsl_hlsl_metal_spirv, GLSL_130)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
vector<T,N> asinh(vector<T,N> x)
{
__target_switch
@@ -6114,7 +6115,7 @@ matrix<T,N,M> atan2(matrix<T,N,M> y, matrix<T,N,M> x)
__generic<T : __BuiltinFloatingPointType>
[__readNone]
[ForceInline]
-[require(cpp_cuda_glsl_hlsl_metal_spirv, GLSL_130)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
T atanh(T x)
{
__target_switch
@@ -6485,7 +6486,7 @@ matrix<T, N, M> cos(matrix<T, N, M> x)
// Hyperbolic cosine
__generic<T : __BuiltinFloatingPointType>
[__readNone]
-[require(cpp_cuda_glsl_hlsl_metal_spirv)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
T cosh(T x)
{
__target_switch
@@ -6503,7 +6504,7 @@ T cosh(T x)
__generic<T : __BuiltinFloatingPointType, let N : int>
[__readNone]
-[require(cpp_cuda_glsl_hlsl_metal_spirv)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
vector<T,N> cosh(vector<T,N> x)
{
__target_switch
@@ -6521,7 +6522,7 @@ vector<T,N> cosh(vector<T,N> x)
__generic<T : __BuiltinFloatingPointType, let N : int, let M : int>
[__readNone]
-[require(cpp_cuda_glsl_hlsl_metal_spirv)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
matrix<T, N, M> cosh(matrix<T, N, M> x)
{
__target_switch
@@ -6536,7 +6537,7 @@ matrix<T, N, M> cosh(matrix<T, N, M> x)
__generic<T : __BuiltinFloatingPointType>
[__readNone]
-[require(cpp_cuda_glsl_hlsl_metal_spirv)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
T cospi(T x)
{
__target_switch
@@ -6549,7 +6550,7 @@ T cospi(T x)
__generic<T : __BuiltinFloatingPointType, let N: int>
[__readNone]
-[require(cpp_cuda_glsl_hlsl_metal_spirv)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
vector<T,N> cospi(vector<T,N> x)
{
__target_switch
@@ -6939,7 +6940,7 @@ T distance(T x, T y)
__generic<T : __BuiltinFloatingPointType>
[__readNone]
-[require(cpp_cuda_glsl_hlsl_metal_spirv)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
T fdim(T x, T y)
{
__target_switch
@@ -6952,7 +6953,7 @@ T fdim(T x, T y)
__generic<T : __BuiltinFloatingPointType, let N : int>
[__readNone]
-[require(cpp_cuda_glsl_hlsl_metal_spirv)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
vector<T,N> fdim(vector<T,N> x, vector<T,N> y)
{
__target_switch
@@ -8081,6 +8082,7 @@ matrix<T, N, M> fwidth(matrix<T, N, M> x)
__generic<T : __BuiltinType>
[__readNone]
__glsl_version(450)
+__glsl_extension(GL_EXT_fragment_shader_barycentric)
[require(glsl_hlsl_spirv, getattributeatvertex)]
T GetAttributeAtVertex(T attribute, uint vertexIndex)
{
@@ -8088,7 +8090,7 @@ T GetAttributeAtVertex(T attribute, uint vertexIndex)
{
case hlsl:
__intrinsic_asm "GetAttributeAtVertex";
- case _GL_EXT_fragment_shader_barycentric:
+ case glsl:
__intrinsic_asm "$0[$1]";
case spirv:
return spirv_asm {
@@ -8114,6 +8116,7 @@ T GetAttributeAtVertex(T attribute, uint vertexIndex)
__generic<T : __BuiltinType, let N : int>
[__readNone]
__glsl_version(450)
+__glsl_extension(GL_EXT_fragment_shader_barycentric)
[require(glsl_hlsl_spirv, getattributeatvertex)]
vector<T,N> GetAttributeAtVertex(vector<T,N> attribute, uint vertexIndex)
{
@@ -8121,7 +8124,7 @@ vector<T,N> GetAttributeAtVertex(vector<T,N> attribute, uint vertexIndex)
{
case hlsl:
__intrinsic_asm "GetAttributeAtVertex";
- case _GL_EXT_fragment_shader_barycentric:
+ case glsl:
__intrinsic_asm "$0[$1]";
case spirv:
return spirv_asm {
@@ -8147,6 +8150,7 @@ vector<T,N> GetAttributeAtVertex(vector<T,N> attribute, uint vertexIndex)
__generic<T : __BuiltinType, let N : int, let M : int>
[__readNone]
__glsl_version(450)
+__glsl_extension(GL_EXT_fragment_shader_barycentric)
[require(glsl_hlsl_spirv, getattributeatvertex)]
matrix<T,N,M> GetAttributeAtVertex(matrix<T,N,M> attribute, uint vertexIndex)
{
@@ -8154,7 +8158,7 @@ matrix<T,N,M> GetAttributeAtVertex(matrix<T,N,M> attribute, uint vertexIndex)
{
case hlsl:
__intrinsic_asm "GetAttributeAtVertex";
- case _GL_EXT_fragment_shader_barycentric:
+ case glsl:
__intrinsic_asm "$0[$1]";
case spirv:
return spirv_asm {
@@ -11194,7 +11198,7 @@ vector<uint, N> reversebits(vector<uint, N> value)
__generic<T : __BuiltinFloatingPointType>
[__readNone]
[ForceInline]
-[require(cpp_cuda_glsl_hlsl_metal_spirv, GLSL_130)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
T rint(T x)
{
__target_switch
@@ -11225,7 +11229,7 @@ T rint(T x)
__generic<T : __BuiltinFloatingPointType, let N:int>
[__readNone]
[ForceInline]
-[require(cpp_cuda_glsl_hlsl_metal_spirv, GLSL_130)]
+[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_2_0_GLSL_140)]
vector<T,N> rint(vector<T,N> x)
{
__target_switch
@@ -12091,7 +12095,7 @@ WaveMask __WaveGetActiveMask();
__glsl_extension(GL_KHR_shader_subgroup_ballot)
__spirv_version(1.3)
-[require(cuda_glsl_hlsl_spirv, subgroup_ballot)]
+[require(cuda_glsl_hlsl_spirv, subgroup_ballot_activemask)]
WaveMask WaveGetActiveMask()
{
__target_switch
@@ -12200,7 +12204,7 @@ WaveMask WaveMaskBallot(WaveMask mask, bool condition)
}
}
-[require(cuda_glsl_hlsl_spirv, subgroup_ballot)]
+[require(cuda_glsl_hlsl_spirv, subgroup_basic_ballot)]
uint WaveMaskCountBits(WaveMask mask, bool value)
{
__target_switch
@@ -13793,7 +13797,7 @@ uint4 WaveActiveBallot(bool condition)
}
}
-[require(cuda_glsl_hlsl_spirv, subgroup_ballot)]
+[require(cuda_glsl_hlsl_spirv, subgroup_basic_ballot)]
uint WaveActiveCountBits(bool value)
{
__target_switch
@@ -13873,7 +13877,7 @@ bool WaveIsFirstLane()
// It's useful to have a wave uint4 version of countbits, because some wave functions return uint4.
// This implementation tries to limit the amount of work required by the actual lane count.
__spirv_version(1.3)
-[require(cpp_cuda_glsl_hlsl_spirv, subgroup_ballot)]
+[require(cpp_cuda_glsl_hlsl_spirv, subgroup_basic_ballot)]
uint _WaveCountBits(uint4 value)
{
__target_switch
diff --git a/source/slang/slang-ast-dump.cpp b/source/slang/slang-ast-dump.cpp
index 0986d7284..8b2494310 100644
--- a/source/slang/slang-ast-dump.cpp
+++ b/source/slang/slang-ast-dump.cpp
@@ -715,20 +715,21 @@ struct ASTDumpContext
{
m_writer->emit("capability_set(");
bool isFirstSet = true;
- for (auto& set : capSet.getExpandedAtoms())
+ for (auto& set : capSet.getAtomSets())
{
if (!isFirstSet)
{
m_writer->emit(" | ");
}
bool isFirst = true;
- for (auto atom : set.getExpandedAtoms())
+ for (auto atom : set)
{
+ CapabilityName formattedAtom = (CapabilityName)atom;
if (!isFirst)
{
m_writer->emit("+");
}
- dump(capabilityNameToString((CapabilityName)atom));
+ dump(capabilityNameToString((CapabilityName)formattedAtom));
isFirst = false;
}
isFirstSet = false;
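Note on the hunk above: the dump loop now iterates each atom set directly and casts every element to `CapabilityName`, which fits a bitset-style container whose elements are plain integer indices. The following is a minimal sketch of that idea, not Slang's actual `UIntSet` or `CapabilityAtomSet`. `AtomBitSet`, `isSupersetOf`, `atoms`, and `formatSets` are hypothetical names, and the closing `)` in the output is an assumption, since the hunk does not show it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a UIntSet-style container: one bit per atom index.
class AtomBitSet
{
public:
    explicit AtomBitSet(std::size_t atomCount)
        : m_words((atomCount + 63) / 64, 0)
    {
    }

    // Caller must pass an index below the atomCount given at construction.
    void add(std::size_t atom)
    {
        m_words[atom / 64] |= (std::uint64_t(1) << (atom % 64));
    }

    bool contains(std::size_t atom) const
    {
        return ((m_words[atom / 64] >> (atom % 64)) & 1u) != 0;
    }

    // True when every atom in `other` is also in this set.
    // Both sets are assumed to have been built with the same atomCount.
    bool isSupersetOf(const AtomBitSet& other) const
    {
        assert(m_words.size() == other.m_words.size());
        for (std::size_t i = 0; i < m_words.size(); ++i)
        {
            if ((other.m_words[i] & ~m_words[i]) != 0)
                return false;
        }
        return true;
    }

    // Collect the contained atom indices in ascending order.
    std::vector<std::size_t> atoms() const
    {
        std::vector<std::size_t> result;
        for (std::size_t i = 0; i < m_words.size() * 64; ++i)
        {
            if (contains(i))
                result.push_back(i);
        }
        return result;
    }

private:
    std::vector<std::uint64_t> m_words;
};

// Follows the dump layout: atoms joined by "+", sets joined by " | ".
std::string formatSets(const std::vector<AtomBitSet>& sets,
                       const std::vector<std::string>& names)
{
    std::string out = "capability_set(";
    bool isFirstSet = true;
    for (const auto& set : sets)
    {
        if (!isFirstSet)
            out += " | ";
        bool isFirst = true;
        for (std::size_t atom : set.atoms())
        {
            if (!isFirst)
                out += "+";
            out += names[atom];
            isFirst = false;
        }
        isFirstSet = false;
    }
    out += ")";
    return out;
}
```

With this layout a superset test costs one AND-NOT per 64 atoms, rather than a tandem walk over two sorted lists.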
diff --git a/source/slang/slang-capabilities.capdef b/source/slang/slang-capabilities.capdef
index 5c672d398..5a0df9f9b 100644
--- a/source/slang/slang-capabilities.capdef
+++ b/source/slang/slang-capabilities.capdef
@@ -46,13 +46,13 @@ def c : target + textualTarget;
def cpp : target + textualTarget;
def cuda : target + textualTarget;
def metal : target + textualTarget;
+def spirv_1_0 : target;
// We have multiple capabilities for the various SPIR-V versions,
// arranged so that they inherit from one another to represent which versions
// provide a super-set of the features of earlier ones (e.g., SPIR-V 1.4 is
// expressed as inheriting from SPIR-V 1.3).
//
-def spirv_1_0 : target;
def spirv_1_1 : spirv_1_0;
def spirv_1_2 : spirv_1_1;
def spirv_1_3 : spirv_1_2;
@@ -73,6 +73,8 @@ alias cpp_cuda_glsl_hlsl = cpp | cuda | glsl | hlsl;
alias cpp_cuda_glsl_hlsl_spirv = cpp | cuda | glsl | hlsl | spirv_1_0;
alias cpp_cuda_glsl_hlsl_metal_spirv = cpp | cuda | glsl | hlsl | metal | spirv_1_0;
alias cpp_cuda_hlsl = cpp | cuda | hlsl;
+alias cpp_cuda_hlsl_spirv = cpp | cuda | hlsl | spirv_1_0;
+alias cpp_cuda_hlsl_metal_spirv = cpp | cuda | hlsl | metal | spirv_1_0;
alias cpp_glsl = cpp | glsl;
alias cpp_glsl_hlsl_spirv = cpp | glsl | hlsl | spirv_1_0;
alias cpp_glsl_hlsl_metal_spirv = cpp | glsl | hlsl | metal | spirv_1_0;
@@ -99,9 +101,50 @@ def glsl_spirv_1_4 : glsl_spirv_1_3;
def glsl_spirv_1_5 : glsl_spirv_1_4;
def glsl_spirv_1_6 : glsl_spirv_1_5;
+def _GLSL_130 : glsl;
+def _GLSL_140 : _GLSL_130;
+def _GLSL_150 : _GLSL_140;
+def _GLSL_330 : _GLSL_150;
+def _GLSL_400 : _GLSL_330;
+def _GLSL_410 : _GLSL_400;
+def _GLSL_420 : _GLSL_410;
+def _GLSL_430 : _GLSL_420;
+def _GLSL_440 : _GLSL_430;
+def _GLSL_450 : _GLSL_440;
+def _GLSL_460 : _GLSL_450;
+
+
+// metal versions
def metallib_2_3 : metal;
def metallib_2_4 : metallib_2_3;
+// hlsl versions
+def _sm_4_0 : hlsl;
+def _sm_4_1 : _sm_4_0;
+def _sm_5_0 : _sm_4_1;
+def _sm_5_1 : _sm_5_0;
+def _sm_6_0 : _sm_5_1;
+def _sm_6_1 : _sm_6_0;
+def _sm_6_2 : _sm_6_1;
+def _sm_6_3 : _sm_6_2;
+def _sm_6_4 : _sm_6_3;
+def _sm_6_5 : _sm_6_4;
+def _sm_6_6 : _sm_6_5;
+def _sm_6_7 : _sm_6_6;
+
+def hlsl_nvapi : hlsl;
+
+// cuda versions
+def _cuda_sm_1_0 : cuda;
+def _cuda_sm_2_0 : _cuda_sm_1_0;
+def _cuda_sm_3_0 : _cuda_sm_2_0;
+def _cuda_sm_3_5 : _cuda_sm_3_0;
+def _cuda_sm_4_0 : _cuda_sm_3_5;
+def _cuda_sm_5_0 : _cuda_sm_4_0;
+def _cuda_sm_6_0 : _cuda_sm_5_0;
+def _cuda_sm_7_0 : _cuda_sm_6_0;
+def _cuda_sm_8_0 : _cuda_sm_7_0;
+def _cuda_sm_9_0 : _cuda_sm_8_0;
abstract stage;
def vertex : stage;
@@ -118,6 +161,10 @@ def miss : stage;
def mesh : stage;
def amplification : stage;
def callable : stage;
+alias any_stage = vertex | fragment | compute | hull | domain | geometry
+ | raygen | intersection | anyhit | closesthit | miss | mesh
+ | amplification | callable
+ ;
// shader stage alias's
alias pixel = fragment;
@@ -143,44 +190,6 @@ alias raytracing_stages_compute_amplification_mesh = raytracing_stages_compute |
alias raytracing_stages_compute_fragment = raytracing_stages | shader_stages_compute_fragment;
alias raytracing_stages_compute_fragment_geometry_vertex = raytracing_stages | shader_stages_compute_fragment_geometry_vertex;
-def _GLSL_130 : glsl;
-def _GLSL_140 : _GLSL_130;
-def _GLSL_150 : _GLSL_140;
-def _GLSL_330 : _GLSL_150;
-def _GLSL_400 : _GLSL_330;
-def _GLSL_410 : _GLSL_400;
-def _GLSL_420 : _GLSL_410;
-def _GLSL_430 : _GLSL_420;
-def _GLSL_440 : _GLSL_430;
-def _GLSL_450 : _GLSL_440;
-def _GLSL_460 : _GLSL_450;
-
-def _sm_4_0 : hlsl;
-def _sm_4_1 : _sm_4_0;
-def _sm_5_0 : _sm_4_1;
-def _sm_5_1 : _sm_5_0;
-def _sm_6_0 : _sm_5_1;
-def _sm_6_1 : _sm_6_0;
-def _sm_6_2 : _sm_6_1;
-def _sm_6_3 : _sm_6_2;
-def _sm_6_4 : _sm_6_3;
-def _sm_6_5 : _sm_6_4;
-def _sm_6_6 : _sm_6_5;
-def _sm_6_7 : _sm_6_6;
-
-def hlsl_nvapi : hlsl;
-
-def _cuda_sm_1_0 : cuda;
-def _cuda_sm_2_0 : _cuda_sm_1_0;
-def _cuda_sm_3_0 : _cuda_sm_2_0;
-def _cuda_sm_3_5 : _cuda_sm_3_0;
-def _cuda_sm_4_0 : _cuda_sm_3_5;
-def _cuda_sm_5_0 : _cuda_sm_4_0;
-def _cuda_sm_6_0 : _cuda_sm_5_0;
-def _cuda_sm_7_0 : _cuda_sm_6_0;
-def _cuda_sm_8_0 : _cuda_sm_7_0;
-def _cuda_sm_9_0 : _cuda_sm_8_0;
-
// SPIRV extensions.
def SOURCE_EXT_GL_NV_compute_shader_derivatives : spirv_1_0;
@@ -302,10 +311,10 @@ alias GL_ARB_derivative_control = _GL_ARB_derivative_control | spvDerivativeCont
alias GL_ARB_fragment_shader_interlock = _GL_ARB_fragment_shader_interlock | spvFragmentShaderPixelInterlockEXT;
alias GL_ARB_gpu_shader5 = _GL_ARB_gpu_shader5 | spirv_1_0;
alias GL_ARB_sparse_texture_clamp = _GL_ARB_fragment_shader_interlock | spirv_1_0;
-alias GL_EXT_texture_query_lod = _GL_EXT_texture_query_lod | spvImageQuery;
-alias GL_ARB_texture_query_levels = _GL_ARB_texture_query_levels |spvImageQuery;
+alias GL_EXT_texture_query_lod = _GL_EXT_texture_query_lod | spvImageQuery | metal;
+alias GL_ARB_texture_query_levels = _GL_ARB_texture_query_levels | spvImageQuery | metal;
alias GL_ARB_texture_cube_map = _GL_ARB_texture_cube_map | spirv_1_0;
-alias GL_ARB_texture_gather = _GL_ARB_texture_gather | spirv_1_0;
+alias GL_ARB_texture_gather = _GL_ARB_texture_gather | spirv_1_0 | metal;
alias GL_EXT_buffer_reference = _GL_ARB_fragment_shader_interlock | spirv_1_5;
alias GL_EXT_buffer_reference_uvec2 = _GL_EXT_buffer_reference_uvec2 | spirv_1_0;
alias GL_EXT_debug_printf = _GL_EXT_debug_printf | SPV_KHR_non_semantic_info;
@@ -334,8 +343,8 @@ alias GL_KHR_shader_subgroup_shuffle_relative = _GL_KHR_shader_subgroup_shuffle_
alias GL_KHR_shader_subgroup_vote = _GL_KHR_shader_subgroup_vote | spvGroupNonUniformVote;
alias GL_KHR_shader_subgroup_quad = _GL_KHR_shader_subgroup_quad | spvGroupNonUniformQuad;
alias GL_NV_compute_shader_derivatives = _GL_NV_compute_shader_derivatives | SOURCE_EXT_GL_NV_compute_shader_derivatives | SPV_NV_compute_shader_derivatives | _sm_6_6;
-alias GL_ARB_shader_image_size = _GL_ARB_shader_image_size | spvImageQuery;
-alias GL_ARB_shader_texture_image_samples = _GL_ARB_shader_texture_image_samples | spvImageQuery;
+alias GL_ARB_shader_image_size = _GL_ARB_shader_image_size | spvImageQuery | metal;
+alias GL_ARB_shader_texture_image_samples = _GL_ARB_shader_texture_image_samples | spvImageQuery | metal;
alias GL_NV_shader_atomic_fp16_vector = _GL_NV_shader_atomic_fp16_vector + _GL_NV_gpu_shader5 | spirv_1_0;
alias GL_NV_shader_subgroup_partitioned = _GL_NV_shader_subgroup_partitioned | spvGroupNonUniformPartitionedNV;
alias GL_NV_ray_tracing_motion_blur = _GL_NV_ray_tracing_motion_blur | spvRayTracingMotionBlurNV;
@@ -368,17 +377,6 @@ alias fragmentshaderbarycentric = GL_EXT_fragment_shader_barycentric | _sm_6_1;
alias shadermemorycontrol = glsl | spirv_1_0 | _sm_5_0;
alias shadermemorycontrol_compute = raytracing_stages_compute + shadermemorycontrol;
alias subpass = fragment + any_gfx_target;
-alias subgroup_basic = GL_KHR_shader_subgroup_basic | GL_KHR_shader_subgroup_basic + spirv_1_0 | _sm_6_0 | _cuda_sm_7_0;
-alias subgroup_basic_ballot = GL_KHR_shader_subgroup_basic + GL_KHR_shader_subgroup_ballot | _sm_6_0 | _cuda_sm_7_0;
-alias subgroup_vote = GL_KHR_shader_subgroup_vote | _sm_6_0 | _cuda_sm_7_0;
-alias subgroup_arithmetic = GL_KHR_shader_subgroup_arithmetic | _sm_6_0 | _cuda_sm_7_0;
-alias subgroup_ballot = GL_KHR_shader_subgroup_ballot | _sm_6_0 | _cuda_sm_7_0;
-alias subgroup_shuffle = GL_KHR_shader_subgroup_shuffle | _sm_6_0 | _cuda_sm_7_0;
-alias subgroup_shufflerelative = GL_KHR_shader_subgroup_shuffle_relative | _sm_6_0 | _cuda_sm_7_0;
-alias subgroup_clustered = GL_KHR_shader_subgroup_clustered | _sm_6_0 | _cuda_sm_7_0;
-alias subgroup_quad = GL_KHR_shader_subgroup_quad | _sm_6_0 | _cuda_sm_7_0;
-alias subgroup_partitioned = GL_NV_shader_subgroup_partitioned | _sm_6_5;
-alias shaderinvocationgroup = subgroup_vote;
alias waveprefix = _sm_6_5 | _cuda_sm_7_0 | GL_KHR_shader_subgroup_arithmetic;
alias bufferreference = GL_EXT_buffer_reference;
alias bufferreference_int64 = bufferreference + GL_EXT_shader_explicit_arithmetic_types_int64;
@@ -395,7 +393,7 @@ alias sm_4_0 = _sm_4_0
;
alias sm_4_1 = _sm_4_1
- | glsl_spirv_1_0 + sm_4_0
+ | glsl_spirv_1_0 + _GLSL_150 + sm_4_0
| spirv_1_0 + sm_4_0
| _cuda_sm_6_0
| metal
@@ -587,8 +585,8 @@ alias DX_6_7 = sm_6_7;
alias METAL_2_3 = metallib_2_3;
alias METAL_2_4 = metallib_2_4;
-alias sm_2_0_GLSL_140 = sm_4_0 | glsl | spirv_1_0 | cuda | cpp;
-alias sm_2_0_GLSL_400 = sm_4_0 | glsl | spirv_1_0 | cuda | cpp;
+alias sm_2_0_GLSL_140 = _GLSL_140 + sm_4_0 | sm_4_0;
+alias sm_2_0_GLSL_400 = _GLSL_400 + sm_4_0 | sm_4_0;
alias appendstructuredbuffer = sm_5_0 + raytracing_stages_compute_fragment;
alias atomic_hlsl = _sm_4_0;
alias atomic_hlsl_nvapi = _sm_4_0 + hlsl_nvapi;
@@ -606,15 +604,26 @@ alias fragmentprocessing_derivativecontrol = fragment + _sm_5_0
;
alias getattributeatvertex = fragment + _sm_6_1 | fragment + GL_EXT_fragment_shader_barycentric;
alias memorybarrier_compute = raytracing_stages_compute + sm_5_0;
+alias glsl_barrier = hlsl + memorybarrier_compute
+ | glsl_spirv + shader_stages_compute_tesscontrol_tesseval
+ ;
alias structuredbuffer = sm_4_0;
alias structuredbuffer_rw = sm_4_0 + raytracing_stages_compute_fragment;
-alias texture_sm_4_1 = sm_4_1 + _GLSL_150 | sm_4_1;
-alias texture_sm_4_1_samplerless = texture_sm_4_1 + GL_EXT_samplerless_texture_functions;
+alias texture_sm_4_1 = sm_4_1
+ ;
+alias texture_sm_4_1_samplerless = cpp + texture_sm_4_1
+ | cuda + texture_sm_4_1
+ | glsl + texture_sm_4_1 + GL_EXT_samplerless_texture_functions
+ | hlsl + texture_sm_4_1 + raytracing_stages_compute_fragment
+ | spirv_1_0 + texture_sm_4_1 + GL_EXT_samplerless_texture_functions
+ | metal + texture_sm_4_1
+ ;
alias texture_sm_4_1_compute_fragment = cpp + texture_sm_4_1
| cuda + texture_sm_4_1
| glsl + texture_sm_4_1
| hlsl + texture_sm_4_1 + raytracing_stages_compute_fragment
| spirv_1_0 + texture_sm_4_1
+ | metal + texture_sm_4_1
;
// supposedly works on compute but docs say nothing, so for now keep as compute_fragment
alias texture_sm_4_1_fragment = cpp + texture_sm_4_1
@@ -622,6 +631,7 @@ alias texture_sm_4_1_fragment = cpp + texture_sm_4_1
| glsl + texture_sm_4_1
| hlsl + texture_sm_4_1 + raytracing_stages_compute_fragment
| spirv_1_0 + texture_sm_4_1
+ | metal + texture_sm_4_1
;
alias texture_sm_4_1_clamp_fragment = texture_sm_4_1_fragment + GL_ARB_sparse_texture_clamp;
alias texture_sm_4_1_vertex_fragment_geometry = cpp + texture_sm_4_1
@@ -629,6 +639,7 @@ alias texture_sm_4_1_vertex_fragment_geometry = cpp + texture_sm_4_1
| glsl + texture_sm_4_1
| hlsl + texture_sm_4_1 + raytracing_stages_compute_fragment_geometry_vertex
| spirv_1_0 + texture_sm_4_1
+ | metal + texture_sm_4_1
;
alias texture_gather = texture_sm_4_1_vertex_fragment_geometry + GL_ARB_texture_gather;
alias image_samples = texture_sm_4_1_compute_fragment + GL_ARB_shader_texture_image_samples;
@@ -645,7 +656,7 @@ alias texture_querylevels_cube = texture_querylevels + GL_ARB_texture_cube_map |
alias atomic_glsl_float1 = GL_EXT_shader_atomic_float;
alias atomic_glsl_float2 = GL_EXT_shader_atomic_float2;
alias atomic_glsl_halfvec = GL_NV_shader_atomic_fp16_vector;
-alias atomic_glsl = GLSL_430_SPIRV_1_0;
+alias atomic_glsl = spirv_1_0 | _GLSL_400;
alias atomic_glsl_int64 = atomic_glsl + GL_EXT_shader_atomic_int64;
alias GLSL_430_SPIRV_1_0_compute = GLSL_430_SPIRV_1_0 + compute;
alias image_loadstore = GL_EXT_shader_image_load_store + GLSL_420;
@@ -654,8 +665,32 @@ alias printf = GL_EXT_debug_printf | _sm_4_0 | _cuda_sm_2_0 | cpp;
alias texturefootprint = GL_NV_shader_texture_footprint + GLSL_450 | hlsl_nvapi + _sm_4_0;
alias texturefootprintclamp = texturefootprint + GL_ARB_sparse_texture_clamp;
-alias shader5_sm_4_0 = GL_ARB_gpu_shader5 | sm_4_0;
-alias shader5_sm_5_0 = GL_ARB_gpu_shader5 | sm_5_0;
+alias shader5_sm_4_0 = GL_ARB_gpu_shader5 + sm_4_0 | sm_4_0;
+alias shader5_sm_5_0 = GL_ARB_gpu_shader5 + sm_4_0 | sm_5_0;
+
+alias subgroup_basic = GL_KHR_shader_subgroup_basic | _sm_6_0 | _cuda_sm_7_0;
+alias subgroup_ballot = spirv_1_0 + GL_KHR_shader_subgroup_ballot
+ | glsl + GL_KHR_shader_subgroup_ballot + shader5_sm_5_0
+ | _sm_6_0 + shader5_sm_5_0
+ | _cuda_sm_7_0 + shader5_sm_5_0
+ ;
+alias subgroup_ballot_activemask = spirv_1_0 + GL_KHR_shader_subgroup_ballot
+ | glsl + GL_KHR_shader_subgroup_ballot
+ | _sm_6_0
+ | _cuda_sm_7_0
+ ;
+alias subgroup_basic_ballot = glsl + GL_KHR_shader_subgroup_basic + subgroup_ballot
+ | spirv + GL_KHR_shader_subgroup_basic + subgroup_ballot
+ | hlsl + subgroup_ballot | cuda + subgroup_ballot
+ ;
+alias subgroup_vote = GL_KHR_shader_subgroup_vote | _sm_6_0 | _cuda_sm_7_0;
+alias shaderinvocationgroup = subgroup_vote;
+alias subgroup_arithmetic = GL_KHR_shader_subgroup_arithmetic | _sm_6_0 | _cuda_sm_7_0;
+alias subgroup_shuffle = glsl + GL_KHR_shader_subgroup_shuffle | _sm_6_0 | _cuda_sm_7_0;
+alias subgroup_shufflerelative = GL_KHR_shader_subgroup_shuffle_relative | _sm_6_0 | _cuda_sm_7_0;
+alias subgroup_clustered = GL_KHR_shader_subgroup_clustered | _sm_6_0 | _cuda_sm_7_0;
+alias subgroup_quad = GL_KHR_shader_subgroup_quad | _sm_6_0 | _cuda_sm_7_0;
+alias subgroup_partitioned = GL_NV_shader_subgroup_partitioned + subgroup_ballot_activemask | _sm_6_5;
alias atomic_glsl_hlsl_cuda = atomic_glsl | _sm_5_0 | _cuda_sm_2_0;
alias atomic_glsl_hlsl_cuda_float1 = atomic_glsl_float1 | atomic_hlsl_nvapi | _cuda_sm_2_0;
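Note on the alias definitions above: they combine capabilities with `+` (all operands required) and `|` (alternatives), and the multi-line aliases suggest that `+` binds tighter than `|`. The following is a minimal sketch of how such an expression could expand into a list of alternative conjunctions. `Conjunction`, `Disjunction`, `atom`, `orSets`, and `joinSets` are hypothetical names, not functions from the Slang sources. The sketch treats every name as an opaque atom and does not model inheritance (`def a : b`) or any handling of incompatible targets.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

using Conjunction = std::set<std::string>;     // atoms that must all be present
using Disjunction = std::vector<Conjunction>;  // acceptable alternatives

// A single name as a one-alternative, one-atom expression.
Disjunction atom(const std::string& name)
{
    return {Conjunction{name}};
}

// `a | b`: either side is acceptable, so the alternatives are concatenated.
Disjunction orSets(Disjunction a, const Disjunction& b)
{
    a.insert(a.end(), b.begin(), b.end());
    return a;
}

// `a + b`: pair every alternative of `a` with every alternative of `b`
// and merge their atoms, which distributes `+` over `|`.
Disjunction joinSets(const Disjunction& a, const Disjunction& b)
{
    Disjunction out;
    for (const auto& x : a)
    {
        for (const auto& y : b)
        {
            Conjunction merged = x;
            merged.insert(y.begin(), y.end());
            out.push_back(merged);
        }
    }
    return out;
}
```

For example, an expression shaped like `fragment + _sm_6_1 | fragment + ext` yields two alternatives, while `(x | y) + z` expands to `x + z | y + z`.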
diff --git a/source/slang/slang-capability.cpp b/source/slang/slang-capability.cpp
index 0daf83dac..5cd46f631 100644
--- a/source/slang/slang-capability.cpp
+++ b/source/slang/slang-capability.cpp
@@ -66,6 +66,12 @@ struct CapabilityAtomInfo
#include "slang-generated-capability-defs-impl.h"
+static UInt asAtomUInt(CapabilityName name)
+{
+ SLANG_ASSERT((CapabilityAtom)name < CapabilityAtom::Count);
+ return (UInt)((CapabilityAtom)name);
+}
+
static CapabilityAtom asAtom(CapabilityName name)
{
SLANG_ASSERT((CapabilityAtom)name < CapabilityAtom::Count);
@@ -110,7 +116,7 @@ bool lookupCapabilityName(const UnownedStringSlice& str, CapabilityName& value);
CapabilityName findCapabilityName(UnownedStringSlice const& name)
{
- CapabilityName result;
+ CapabilityName result{};
if (!lookupCapabilityName(name, result))
return CapabilityName::Invalid;
return result;
@@ -134,664 +140,115 @@ bool isCapabilityDerivedFrom(CapabilityAtom atom, CapabilityAtom base)
return false;
}
-//
-// CapabilityConjunctionSet
-//
-
-// The current design choice in `CapabilityConjunctionSet` is that it stores
-// an expanded, deduplicated, and sorted list of the capability
-// atoms in the set. "Expanded" here means that it includes the
-// transitive closure of the inheritance graph of those atoms.
-//
-// This choice is intended to make certain operations on
-// capability sets more efficient, since use things like
-// binary searches to efficiently detect whether an atom
-// is present in a set.
-
-CapabilityConjunctionSet::CapabilityConjunctionSet()
-{}
-
-CapabilityConjunctionSet::CapabilityConjunctionSet(Int atomCount, CapabilityAtom const* atoms)
-{
- _init(atomCount, atoms);
-}
-
-CapabilityConjunctionSet::CapabilityConjunctionSet(CapabilityAtom atom)
-{
- _init(1, &atom);
-}
+//// CapabiltySet
-CapabilityConjunctionSet::CapabilityConjunctionSet(List<CapabilityAtom> const& atoms)
+void CapabilitySet::addToTargetCapabilityWithValidUIntSetAndTargetAndStage(CapabilityName target, CapabilityName stage, CapabilityAtomSet setToAdd)
{
- _init(atoms.getCount(), atoms.getBuffer());
-}
+ SLANG_ASSERT(target != CapabilityName::Invalid && stage != CapabilityName::Invalid);
+ auto stageAtom = asAtom(stage);
+ auto targetAtom = asAtom(target);
+ CapabilityTargetSet& targetSet = m_targetSets[targetAtom];
+ targetSet.target = targetAtom;
+ targetSet.shaderStageSets.reserve(kCapabilityStageCount);
-CapabilityConjunctionSet CapabilityConjunctionSet::makeEmpty()
-{
- return CapabilityConjunctionSet();
-}
+ auto& localStageSets = targetSet.shaderStageSets[stageAtom];
+ localStageSets.stage = stageAtom;
-CapabilityConjunctionSet CapabilityConjunctionSet::makeInvalid()
-{
- // An invalid capability set will always be a singleton
- // set of the `Invalid` atom, and we will construct
- // the set directly rather than use the more expensive
- // logic in `_init()`.
- //
- CapabilityConjunctionSet result;
- result.m_expandedAtoms.add(CapabilityAtom::Invalid);
- return result;
+ localStageSets.addNewSet(std::move(setToAdd));
}
-void CapabilityConjunctionSet::_init(Int atomCount, CapabilityAtom const* atoms)
+void CapabilitySet::addToTargetCapabilityWithTargetAndStageAtom(CapabilityName target, CapabilityName stage, const ArrayView<CapabilityName>& canonicalRepresentation)
{
- // We will use an explicit hash set to deduplicate input atoms.
- //
- HashSet<CapabilityAtom> expandedAtomsSet;
- for(Int i = 0; i < atomCount; ++i)
- {
- if (expandedAtomsSet.add(atoms[i]))
- {
- auto& info = _getInfo(atoms[i]);
-
- // Add the base items that this atom implies.
- if (info.canonicalRepresentation.getCount() == 1)
- {
- // The atom must have only one conjunction.
- SLANG_ASSERT(info.canonicalRepresentation.getCount() == 1);
-
- for (auto base : info.canonicalRepresentation[0])
- {
- expandedAtomsSet.add(asAtom(base));
- }
- }
- }
- }
-
- // We can then translate the set of atoms into a list,
- // and then sort that list to produce the data that
- // we use in all our other queries.
- //
- for(auto atom : expandedAtomsSet)
+ // If no provided 'stage', set the capability as a target of all stages
+ if (stage == CapabilityName::Invalid)
{
- m_expandedAtoms.add(atom);
- }
- m_expandedAtoms.sort();
-}
-
-void CapabilityConjunctionSet::calcCompactedAtoms(List<CapabilityAtom>& outAtoms) const
-{
- // A "compacted" list of atoms is one that starts with
- // the "expanded" list and removes any atoms that are
- // implied by another atom already in the list.
- //
- // If the expanded list contains atom A, and A inherits
- // from B, then we know that the expanded list also contains B,
- // but the compacted list should not.
- //
- // We can thus look through the list of atoms A and for
- // each base B of A, add it to a set of "redundant" atoms
- // that need not appear in the compacted list.
- //
- HashSet<CapabilityAtom> redundantAtomsSet;
- for( auto atom : m_expandedAtoms )
- {
- auto& atomInfo = _getInfo(atom);
- if (atomInfo.canonicalRepresentation.getCount() != 1)
- {
- // If the atom is not a single conjunction, skip.
- continue;
- }
- for(auto baseAtom : atomInfo.canonicalRepresentation[0])
- {
- // Note: don't add atom itself into redundant set.
- if(asAtom(baseAtom) == atom)
- continue;
-
- redundantAtomsSet.add(asAtom(baseAtom));
- }
- }
-
- // Once we are done figuring out which atoms are redundant,
- // we can iterate over the expanded list and add all the
- // non-redundant ones to the compacted output list.
- //
- outAtoms.clear();
- for( auto atom : m_expandedAtoms )
- {
- if(!redundantAtomsSet.contains(atom))
- {
- outAtoms.add(atom);
- }
- }
-}
-
-bool CapabilityConjunctionSet::isEmpty() const
-{
- // Checking if a capability set is empty is trivial in any representation;
- // all we need to know is if it has zero atoms in its definition.
- //
- return m_expandedAtoms.getCount() == 0;
-}
-
-bool CapabilityConjunctionSet::isInvalid() const
-{
- // We will assume here that there is only one canonical representation of
- // an invalid capability set, which is a singleton set of the `Invalid`
- // atom.
- //
- // TODO: We should ensure that any algorithms that make new capability
- // sets by combining others properly ensure that they return the
- // canonical invalid set rather than any other set that happens to be
- // invalid (e.g., a set {A,B} would be invalid if A and B are incompatible,
- // but it would not be in the canonical form this subroutine checks).
- //
- if(m_expandedAtoms.getCount() != 1) return false;
- return m_expandedAtoms[0] == CapabilityAtom::Invalid;
-}
-
-bool CapabilityConjunctionSet::isIncompatibleWith(CapabilityAtom that) const
-{
- // Checking for incompatibility is complicated, and it is best
- // to only implement it for full (expanded) sets.
- //
- return isIncompatibleWith(CapabilityConjunctionSet(that));
-}
-
-static UIntSet _calcConflictMask(CapabilityAtom atom)
-{
- UIntSet mask;
- auto abstractBase = _getInfo(atom).abstractBase;
- if (abstractBase != CapabilityName::Invalid)
- {
- mask.add((UInt)abstractBase);
- }
- return mask;
-}
-
-static UIntSet _calcConflictMask(const CapabilityConjunctionSet& set)
-{
- // Given a capbility set, we want to compute the mask representing
- // all groups of features for which it holds a potentially-conflicting atom.
- //
- UIntSet mask;
- for (auto atom : set.getExpandedAtoms())
- {
- auto abstractBase = _getInfo(atom).abstractBase;
- if (abstractBase != CapabilityName::Invalid)
- {
- mask.add((UInt)abstractBase);
- }
- }
- return mask;
-}
-
-bool CapabilityConjunctionSet::isIncompatibleWith(CapabilityConjunctionSet const& that) const
-{
- // The `this` and `that` sets are incompatible if there exists
- // an atom A in `this` and an atom `B` in `that` such that
- // A and B are not equal, but the two have overlapping "conflict group."
- //
- // Equivalently, we can say that the two are in conflict if
- //
- // * One of the two sets contains an atom A with conflict mask M
- // * The other set contains at least one atom that conflicts with M
- // * The other set does not contain A
- //
- // Our approach here is all about minimizing the number of
- // iterations we take over lists of atoms, and trying to
- // avoid anything super-linear.
-
- // We start by identifying the OR of the conflict masks for
- // all features in `this` and `that`.
- //
- UIntSet thisMask = _calcConflictMask(*this);
- UIntSet thatMask = _calcConflictMask(that);
-
- // Note: there is a possible early-exit opportunity here if
- // `thisMask` and `thatMask` have no overlap: there could
- // be no conflicts in that case.
-
- // Next we will iterate over the two sets in tandem (O(N) time
- // in the size of the larger set), and identify any elements
- // that are present in one and not the other.
- //
- Index thisCount = this->m_expandedAtoms.getCount();
- Index thatCount = that.m_expandedAtoms.getCount();
- Index thisIndex = 0;
- Index thatIndex = 0;
- for(;;)
- {
- if(thisIndex == thisCount) break;
- if(thatIndex == thatCount) break;
-
- auto thisAtom = this->m_expandedAtoms[thisIndex];
- auto thatAtom = that.m_expandedAtoms[thatIndex];
-
- if(thisAtom == thatAtom)
- {
- thisIndex++;
- thatIndex++;
- continue;
- }
-
- if( thisAtom < thatAtom )
- {
- // `thisAtom` is present in `this` but not `that.
- //
- // If `thisAtom` has a conflict mask that overlaps
- // with `thatMask`, then we have a conflict: the
- // other set doesn't include `thisAtom`, but *does*
- // include something with an overlapping mask
- // (we don't know what at this point in the code).
- //
- auto thisConflictMask = Slang::_calcConflictMask(thisAtom);
- if(UIntSet::hasIntersection(thisConflictMask, thatMask))
- return true;
- thisIndex++;
- }
- else
- {
- SLANG_ASSERT(thisAtom > thatAtom);
-
- // `thatAtom` is present in `that` but not `this.
- //
- // The logic here is the mirror image of the case above.
- //
- auto thatConflictMask = Slang::_calcConflictMask(thatAtom);
- if(UIntSet::hasIntersection(thatConflictMask, thisMask))
- return true;
- thatIndex++;
- }
- }
-
- return false;
-}
-
-bool CapabilityConjunctionSet::implies(CapabilityConjunctionSet const& that) const
-{
- // One capability set implies another if it is a super-set
- // of the other one. Think of it this way: if your target
- // supports features {X, Y, Z}, then that implies it also
- // supports features {X,Z}.
- //
- // Because both `this` and `that` have expanded lists
- // of all the capability atoms they imply *and* those
- // lists are sorted, we can simply walk through the
- // lists in tandem and see if there are any entries
- // in `that` which are not present in `this.
-
- Index thisCount = this->m_expandedAtoms.getCount();
- Index thatCount = that.m_expandedAtoms.getCount();
-
- // We cannot possibly have `this` contain all the atoms
- // in `that` if the latter is has more atoms.
- //
- if(thatCount > thisCount)
- return false;
-
- // Note: the following iteration is O(N) in the size
- // of the larger of the two sets, which is probably
- // needlessly inefficient. We might expect that `that`
- // will often be a much smaller set, and we'd like to
- // scale in its size rather than the size of `this`.
- //
- // A more advanced algorithm here would be to do
- // something recursive:
- //
- // * If `that` is singleton set, then we can find
- // whether `this` contains it via binary search.
- //
- // * Otherwise, we can split `that` into two
- // equally-sized subsets. By taking a "pivot" value
- // from where that split took place we can then
- // use a binary search to partition `this` into
- // two subsets and recurse on each side of that
- // partition.
- //
- // In practice, the size of the sets we are dealing
- // with right now doesn't justify such a "clever" algorithm.
-
- Index thisIndex = 0;
- Index thatIndex = 0;
- for(;;)
- {
- if(thisIndex == thisCount) break;
- if(thatIndex == thatCount) break;
-
- auto thisAtom = this->m_expandedAtoms[thisIndex];
- auto thatAtom = that.m_expandedAtoms[thatIndex];
-
- if( thisAtom == thatAtom )
- {
- // We have an atom that both sets contain;
- // we should skip past it and keep looking.
- //
- thisIndex++;
- thatIndex++;
- continue;
- }
-
- if( thisAtom < thatAtom )
- {
- // We have an atom that `this` contains,
- // but `that` doesn't; that is consistent
- // with `this` being a super-set, so we
- // just skip the item and keep searching.
- //
- thisIndex++;
- }
- else
+ auto info = _getInfo(CapabilityName::any_stage);
+ List<CapabilityName> newArr;
+ auto count = canonicalRepresentation.getCount();
+ newArr.setCount(count + 1);
+ memcpy(newArr.getBuffer(), canonicalRepresentation.getBuffer(), count * sizeof(CapabilityName));
+ m_targetSets[asAtom(target)].shaderStageSets.reserve(info.canonicalRepresentation.getCount());
+ for (auto i : info.canonicalRepresentation)
{
- SLANG_ASSERT(thisAtom > thatAtom);
-
- // We have an atom in `that` which isn't
- // also in `this`, so we know it cannot
- // be a subset.
- //
- return false;
+ newArr[count] = i[0];
+ addToTargetCapabilityWithTargetAndStageAtom(target, i[0], newArr.getArrayView());
}
+ return;
}
- // We reached the end of either this or that atom.
- // If we reached the end of 'that', we know everything in 'that'
- // is also contained in this, so this implies that.
- return thatIndex == thatCount;
-}
-
- /// Helper functor for binary search on lists of `CapabilityAtom`
-struct CapabilityAtomComparator
-{
- int operator()(CapabilityAtom left, CapabilityAtom right)
- {
- return int(Int(left) - Int(right));
- }
-};
+
+ CapabilityAtomSet setToAdd = CapabilityAtomSet((UInt)CapabilityAtom::Count);
+ for(auto i : canonicalRepresentation)
+ setToAdd.add(asAtomUInt(i));
-bool CapabilityConjunctionSet::implies(CapabilityAtom atom) const
-{
- // Every non-alias atom that `this` implies should
- // be presented in the `m_expandedAtoms` list.
- //
- // Because the list is sorted, we can find out whether
- // it contains `atom` with a binary search.
- //
- Index result = m_expandedAtoms.binarySearch(atom, CapabilityAtomComparator());
- return result >= 0;
+ addToTargetCapabilityWithValidUIntSetAndTargetAndStage(target, stage, setToAdd);
}
-Int CapabilityConjunctionSet::countIntersectionWith(CapabilityConjunctionSet const& that) const
+// No target atoms have been defined yet, so add the stage to every target in the any_target capability
+void CapabilitySet::addToTargetCapabilityWithStageAtom(CapabilityName stage, const ArrayView<CapabilityName>& canonicalRepresentation)
{
- // The goal of this subroutine is to count the number of
- // elements in the intersection of `this` and `that`,
- // without explicitly forming that intersection.
- //
- // Our approach here will be to iterate over the two
- // sets in tandem (O(N) in the size of the larger set)
- // and check for elements that both contain.
- //
- // TODO: There should be an asymptotically faster
- // recursive algorithm here.
-
- Int intersectionCount = 0;
-
- Index thisCount = this->m_expandedAtoms.getCount();
- Index thatCount = that.m_expandedAtoms.getCount();
- Index thisIndex = 0;
- Index thatIndex = 0;
- for(;;)
+
+ if (m_targetSets.getCount() == 0)
{
- if(thisIndex == thisCount) break;
- if(thatIndex == thatCount) break;
-
- auto thisAtom = this->m_expandedAtoms[thisIndex];
- auto thatAtom = that.m_expandedAtoms[thatIndex];
-
- if( thisAtom == thatAtom )
- {
- // An item both contain.
-
- intersectionCount++;
- thisIndex++;
- thatIndex++;
- continue;
- }
-
- if( thisAtom < thatAtom )
- {
- // An item in `this` but not `that`.
-
- thisIndex++;
- }
- else
+ const auto anyTargetInfo = _getInfo(CapabilityName::any_target);
+ CapabilityAtomSet setToAdd;
+ setToAdd.resize((UInt)CapabilityAtom::Count);
+ for (int i = 0; i < canonicalRepresentation.getCount(); i++)
+ setToAdd.add((UInt)canonicalRepresentation[i]);
+ CapabilityName targetAtom{};
+ for (const auto& targetAtomCanonicalRep : anyTargetInfo.canonicalRepresentation)
{
- SLANG_ASSERT(thisAtom > thatAtom);
-
- // An item in `that` but not `this`.
-
- thatIndex++;
+ for (auto anyTargetAtom : targetAtomCanonicalRep)
+ {
+ setToAdd.add((UInt)anyTargetAtom);
+ if (_getInfo(anyTargetAtom).abstractBase == CapabilityName::target)
+ targetAtom = anyTargetAtom;
+ }
+ addToTargetCapabilityWithValidUIntSetAndTargetAndStage(targetAtom, stage, setToAdd);
+ for (auto anyTargetAtom : targetAtomCanonicalRep)
+ setToAdd.remove((UInt)anyTargetAtom);
}
}
- return intersectionCount;
}
-bool CapabilityConjunctionSet::isBetterForTarget(
- CapabilityConjunctionSet const& existingCaps,
- CapabilityConjunctionSet const& targetCaps) const
+void CapabilitySet::addToTargetCapabilityWithTargetAndOrStageAtom(CapabilityName target, CapabilityName stage, const ArrayView<CapabilityName>& canonicalRepresentation)
{
- auto& candidateCaps = *this;
-
- // The task here is to determine if `candidateCaps` should
- // be considered "better" than `existingCaps` in the context
- // of compilation for a target with the given `targetCaps`.
- //
- // In an ideal world, this computation could be quite simple:
- //
- // * If either `candidateCaps` or `existingCaps` is not implied by
- // `targetCaps` (that is, they include requirements that aren't
- // provided by the target), then the other is automatically "better."
- //
- // * Otherwise, one set is "better" than the other if it is a
- // super-set (which is what `implies()` tests).
- //
- // There are two main reasons we can't use that simple logic:
- //
- // 1. Currently a user of Slang can compile for a target but
- // not actually spell out its capabilities fully or correctly.
- // They might compile for `sm_5_0` but use ray tracing features
- // that require `sm_6_2` and expect the compiler to figure out
- // what they "obviously" meant. Thus we cannot assume that
- // `targetCaps` can be used to rule out candidates fully.
- //
- // 2. Sometimes there are multiple ways for a target to provide
- // the same feature (e.g., multiple extensions) and because of (1)
- // we cannot always rely on the `targetCaps` to tell us which to
- // use. Thus we cannot rely on pure subset/`implies()` to define
- // better-ness, and need some way to break ties.
- //
- // The following logic is a bunch of "do what I mean" nonsense that
- // tries to capture a reasonable intuition of what "better"-ness
- // should mean with these caveats.
-
- // First, if either candidate is fundamentally incompatible
- // with the target, we shouldn't favor it.
- //
- if(candidateCaps.isIncompatibleWith(targetCaps)) return false;
- if(existingCaps.isIncompatibleWith(targetCaps)) return true;
-
- // Next, we want to compare the candidates to the `targetCaps`
- // to figure out whether one is obviously "more specialized" for
- // the target.
- //
- // We measure the degree to which a candidate is specialized for
- // the target as the size of its set intersection with `targetCaps`.
- //
- // TODO: If both `candidateCaps` and `existingCaps` are implied
- // by `targetCaps`, then this amounts to just measuring the
- // size of each set. We probably want this size-based check to
- // come later in the overall process.
- //
- // TODO: A better model here might be to actually compute the actual
- // intersected sets, and then check if one is a super-set of the other.
- //
- auto candidateIntersectionSize = targetCaps.countIntersectionWith(candidateCaps);
- auto existingIntersectionSize = targetCaps.countIntersectionWith(existingCaps);
- if(candidateIntersectionSize != existingIntersectionSize)
- return candidateIntersectionSize > existingIntersectionSize;
-
- // Next we want to consider that if one of the two candidates
- // is actually available on the target (meaning that it is
- // implied by `targetCaps`) then we probably want to pick that one
- // (since we can use that candidate on the chosen target without
- // enabling any additional features the user didn't ask for).
- //
- // TODO: This step currently needs to come after the preceding
- // one because otherwise we risk selecting a `__target_intrinsic`
- // decoration with *no* requirements (which are currently being
- // added implicitly in many places) over any one with explicit
- // requirements (since every target implies the empty set of
- // requirements).
- //
- // In many ways the counting-based logic above amounts to a quick
- // fix to prefer a non-empty set of requirements over an empty one,
- // so long as something in that non-empty set overlaps with the target.
- //
- // TODO: The best fix is probably to figure out how "catch-all"
- // intrinsic function definitions should be encoded; we clearly
- // want them to be used only as a fallback when no target-specific
- // variants are present.
- //
- bool candidateIsAvailable = targetCaps.implies(candidateCaps);
- bool existingIsAvailable = targetCaps.implies(existingCaps);
- if(candidateIsAvailable != existingIsAvailable)
- return candidateIsAvailable;
-
- // All preceding factors being equal, we prefer
- // a candidate that is strictly more specialized than the other.
- //
- // We want to avoid choosing the candidate that uses
- // optional features if they aren't necessary.
- // For example, the set {glsl, optionalFeature} should not be preferred
- // over the set {glsl}, if optionalFeature isn't requested explicitly.
- //
- // The solution here is that we want to partition
- // `candidateCaps` and `existingCaps` into two parts: their
- // intersection with `targetCaps` and their difference with it.
- //
- // For the intersection part of things, we'd want to favor a
- // definition that is more specialized, while for the difference
- // part we'd actually want to favor a definition that is less
- // specialized.
- //
- CapabilityConjunctionSet candidateCapsIntersection;
- CapabilityConjunctionSet candidateCapsDifference;
- for (auto atom : candidateCaps.m_expandedAtoms)
- {
- if (targetCaps.implies(atom))
- candidateCapsIntersection.m_expandedAtoms.add(atom);
- else
- candidateCapsDifference.m_expandedAtoms.add(atom);
- }
- CapabilityConjunctionSet existingCapsIntersection;
- CapabilityConjunctionSet existingCapsDifference;
- for (auto atom : existingCaps.m_expandedAtoms)
- {
- if (targetCaps.implies(atom))
- existingCapsIntersection.m_expandedAtoms.add(atom);
- else
- existingCapsDifference.m_expandedAtoms.add(atom);
- }
- auto scoreCandidate = candidateCapsIntersection.m_expandedAtoms.getCount() - candidateCapsDifference.m_expandedAtoms.getCount();
- auto scoreExisting = existingCapsIntersection.m_expandedAtoms.getCount() - existingCapsDifference.m_expandedAtoms.getCount();
- if (scoreCandidate != scoreExisting)
- return scoreCandidate > scoreExisting;
-
- // At this point we have the problem that neither candidate
- // appears to be "obviously" better for the target, but we
- // want some way to disambiguate them.
- //
- // What we want to do now is scan through what makes each candidate
- // different from the other, and see if anything in either case
- // has a ranking that should make it be preferred.
- //
- auto candidateScore = candidateCapsDifference._calcDifferenceScoreWith(existingCapsDifference);
- auto existingScore = existingCapsDifference._calcDifferenceScoreWith(candidateCapsDifference);
- if(candidateScore != existingScore)
- return candidateScore > existingScore;
-
- return false;
+ if(target != CapabilityName::Invalid)
+ addToTargetCapabilityWithTargetAndStageAtom(target, stage, canonicalRepresentation);
+ else if(stage != CapabilityName::Invalid)
+ addToTargetCapabilityWithStageAtom(stage, canonicalRepresentation);
}
-uint32_t CapabilityConjunctionSet::_calcDifferenceScoreWith(CapabilityConjunctionSet const& that) const
+void CapabilitySet::addToTargetCapabilitesWithCanonicalRepresentation(const ArrayView<CapabilityName>& canonicalRepresentation)
{
- uint32_t score = 0;
-
- // Our approach here will be to scan through `this` and `that`
- // to identify atoms that are in `this` but not `that` (that is,
- // the atoms that would be present in the set difference `this - that`)
- // and then compute the maximum rank/score of those atoms.
-
- Index thisCount = this->m_expandedAtoms.getCount();
- Index thatCount = that.m_expandedAtoms.getCount();
- Index thisIndex = 0;
- Index thatIndex = 0;
- for(;;)
+ // Only the first one or two entries (i == 0/1) need to be searched to find a relevant node.
+ // The target node should ALWAYS come first and is the most important node. We stop searching once both a target and a stage have been found; this logic assumes only stage and target abstract nodes.
+ // The canonicalRepresentation of a node holds zero or one abstract node of each type, with at least one abstract node in total.
+ CapabilityName target = CapabilityName::Invalid;
+ CapabilityName stage = CapabilityName::Invalid;
+ for (const auto& i : canonicalRepresentation)
{
- if(thisIndex == thisCount) break;
- if(thatIndex == thatCount) break;
-
- auto thisAtom = this->m_expandedAtoms[thisIndex];
- auto thatAtom = that.m_expandedAtoms[thatIndex];
-
- if( thisAtom == thatAtom )
- {
- thisIndex++;
- thatIndex++;
+ const auto info = _getInfo(i);
+ if (info.abstractBase == CapabilityName::Invalid)
continue;
- }
-
- if( thisAtom < thatAtom )
- {
- // `thisAtom` is not present in `that`, so it
- // should contribute to our ranking of the difference.
- //
- auto thisAtomInfo = _getInfo(thisAtom);
- auto thisAtomRank = thisAtomInfo.rank;
-
- if( thisAtomRank > score )
- {
- score = thisAtomRank;
- }
+ else if (info.abstractBase == CapabilityName::target)
+ target = i;
+ else if (info.abstractBase == CapabilityName::stage)
+ stage = i;
- thisIndex++;
- }
- else
- {
- SLANG_ASSERT(thisAtom > thatAtom);
- thatIndex++;
- }
+ if (target != CapabilityName::Invalid && stage != CapabilityName::Invalid)
+ break;
}
- return score;
-}
-
-bool CapabilityConjunctionSet::operator==(CapabilityConjunctionSet const& other) const
-{
- return m_expandedAtoms == other.m_expandedAtoms;
+ addToTargetCapabilityWithTargetAndOrStageAtom(target, stage, canonicalRepresentation);
}
-bool CapabilityConjunctionSet::operator<(CapabilityConjunctionSet const& that) const
+void CapabilitySet::addUnexpandedCapabilites(CapabilityName atom)
{
- for (Index i = 0; i < Math::Min(m_expandedAtoms.getCount(), that.m_expandedAtoms.getCount()); i++)
- {
- if (m_expandedAtoms[i] < that.m_expandedAtoms[i])
- return true;
- else if (m_expandedAtoms[i] > that.m_expandedAtoms[i])
- return false;
- }
- return m_expandedAtoms.getCount() < that.m_expandedAtoms.getCount();
+ auto info = _getInfo(atom);
+ for (const auto& cr : info.canonicalRepresentation)
+ addToTargetCapabilitesWithCanonicalRepresentation(cr);
}
-
CapabilitySet::CapabilitySet()
{}
@@ -803,14 +260,8 @@ CapabilitySet::CapabilitySet(Int atomCount, CapabilityName const* atoms)
CapabilitySet::CapabilitySet(CapabilityName atom)
{
- auto info = _getInfo(atom);
- for (auto conjunction : info.canonicalRepresentation)
- {
- CapabilityConjunctionSet set;
- for (auto atomName : conjunction)
- set.getExpandedAtoms().add(asAtom(atomName));
- m_conjunctions.add(_Move(set));
- }
+ this->m_targetSets.reserve(kCapabilityTargetCount);
+ addUnexpandedCapabilites(atom);
}
CapabilitySet::CapabilitySet(List<CapabilityName> const& atoms)
@@ -826,13 +277,9 @@ CapabilitySet CapabilitySet::makeEmpty()
CapabilitySet CapabilitySet::makeInvalid()
{
- // An invalid capability set will always be a singleton
- // set of the `Invalid` atom, and we will construct
- // the set directly rather than use the more expensive
- // logic in `_init()`.
- //
CapabilitySet result;
- result.m_conjunctions.add(CapabilityConjunctionSet(CapabilityAtom::Invalid));
+ result.m_targetSets[CapabilityAtom::Invalid].target = CapabilityAtom::Invalid;
+
return result;
}
@@ -843,24 +290,23 @@ void CapabilitySet::addCapability(CapabilityName name)
bool CapabilitySet::isEmpty() const
{
- return m_conjunctions.getCount() == 0;
+ return m_targetSets.getCount() == 0;
}
bool CapabilitySet::isInvalid() const
{
- return m_conjunctions.getCount() == 1 && m_conjunctions[0].isInvalid();
+ return m_targetSets.containsKey(CapabilityAtom::Invalid);
}
bool CapabilitySet::isIncompatibleWith(CapabilityAtom other) const
{
+ // `other` should be a target atom or a derivative of one; otherwise this check makes no sense.
+
if (isEmpty())
return false;
-
- // If all conjunctions are incompatible with the atom, then we are incompatible.
- for (auto& c : m_conjunctions)
- if (!c.isIncompatibleWith(other))
- return false;
- return true;
+
+ CapabilitySet otherSet((CapabilityName)other);
+ return isIncompatibleWith(otherSet);
}
bool CapabilitySet::isIncompatibleWith(CapabilityName other) const
@@ -871,367 +317,440 @@ bool CapabilitySet::isIncompatibleWith(CapabilityName other) const
return isIncompatibleWith(otherSet);
}
-bool CapabilitySet::isIncompatibleWith(CapabilityConjunctionSet const& other) const
+bool CapabilitySet::isIncompatibleWith(CapabilitySet const& other) const
{
if (isEmpty())
return false;
+ if (other.isEmpty())
+ return false;
+
+ // Incompatible means that no set in `other` shares an abstract node (target and stage) with any set in `this`.
+ for (auto& otherSet : other.m_targetSets)
+ {
+ auto targetSet = this->m_targetSets.tryGetValue(otherSet.first);
+ if (!targetSet)
+ continue;
+
+ for (auto& otherStageSet : otherSet.second.shaderStageSets)
+ {
+ auto stageSet = targetSet->shaderStageSets.tryGetValue(otherStageSet.first);
+ if (!stageSet)
+ continue;
- // If all conjunctions are incompatible with the atom, then we are incompatible.
- for (auto& c : m_conjunctions)
- if (!c.isIncompatibleWith(other))
return false;
+ }
+ }
return true;
}
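The new `isIncompatibleWith` reduces to a key lookup: two non-empty sets are compatible as soon as they share one (target, stage) pair. The following is a minimal sketch of that idea only, assuming hypothetical stand-in types (`StageKeys`, `TargetMap`) built on the standard library rather than Slang's `Dictionary` and `CapabilityTargetSet`.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins: target name -> set of stage names.
using StageKeys = std::set<std::string>;
using TargetMap = std::map<std::string, StageKeys>;

// Two non-empty sets are incompatible when they share no (target, stage) pair.
bool isIncompatible(const TargetMap& a, const TargetMap& b)
{
    if (a.empty() || b.empty())
        return false; // an empty set is compatible with everything

    for (const auto& entryB : b)
    {
        auto targetInA = a.find(entryB.first);
        if (targetInA == a.end())
            continue; // this target only exists in b

        for (const auto& stage : entryB.second)
        {
            if (targetInA->second.count(stage) != 0)
                return false; // shared (target, stage) pair found
        }
    }
    return true;
}
```

No atom-level comparison is needed in this sketch; only the dictionary keys are consulted, which mirrors the early `return false` in the function above.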
-bool CapabilitySet::isIncompatibleWith(CapabilitySet const& other) const
+const CapabilityAtomSet& getAtomSetOfTargets()
{
- if (isEmpty())
- return false;
- if (other.isEmpty())
- return false;
-
- // If all conjunctions in other are incompatible with the this set, then we are incompatible.
- for (auto& oc : other.m_conjunctions)
- for (auto& c : m_conjunctions)
- if (!c.isIncompatibleWith(oc))
- return false;
- return true;
+ return kAnyTargetUIntSetBuffer;
+}
+const CapabilityAtomSet& getAtomSetOfStages()
+{
+ return kAnyStageUIntSetBuffer;
}
-bool CapabilitySet::implies(CapabilityAtom atom) const
+bool hasTargetAtom(const CapabilityAtomSet& setIn, CapabilityAtom& targetAtom)
{
- if (isEmpty())
- return false;
+ CapabilityAtomSet intersection;
+ setIn.calcIntersection(intersection, getAtomSetOfTargets(), setIn);
- for (auto& c : m_conjunctions)
- if (c.implies(atom))
- return true;
+ if (intersection.isEmpty())
+ return false;
- return false;
+ targetAtom = intersection.getElements<CapabilityAtom>().getLast();
+ return true;
}
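`hasTargetAtom` intersects the input set with a precomputed mask of all target atoms and reports the last element of the result. A rough sketch of the same pattern with `std::bitset` follows; `kAtomCount`, `AtomBits`, and `findTargetAtom` are hypothetical names, and the sketch assumes that "last element" means the highest set bit.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kAtomCount = 16; // hypothetical number of atoms
using AtomBits = std::bitset<kAtomCount>;

// Writes the highest-numbered target atom present in `atoms` to `outTarget`
// and returns true; returns false when `atoms` contains no target atom.
bool findTargetAtom(const AtomBits& atoms, const AtomBits& targetMask, std::size_t& outTarget)
{
    const AtomBits hits = atoms & targetMask; // keep only target atoms
    if (hits.none())
        return false;

    // Scan from the top bit down to find the highest set bit.
    for (std::size_t i = kAtomCount; i-- > 0;)
    {
        if (hits.test(i))
        {
            outTarget = i;
            return true;
        }
    }
    return false; // not reached: `hits` is non-empty here
}
```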
-bool CapabilitySet::implies(const CapabilityConjunctionSet& set) const
+bool CapabilitySet::implies(CapabilityAtom atom) const
{
- if (isEmpty())
+ if (isEmpty() || atom == CapabilityAtom::Invalid)
return false;
- for (auto& c : m_conjunctions)
- if (c.implies(set))
- return true;
+ CapabilitySet tmpSet = CapabilitySet(CapabilityName(atom));
- return false;
+ return this->implies(tmpSet);
}
-bool CapabilitySet::implies(CapabilitySet const& other) const
+bool CapabilitySet::implies(CapabilitySet const& other, const bool onlyRequireSingleImply) const
{
// x implies (c | d) only if (x implies c) and (x implies d).
- if (other.isEmpty())
- return true;
- for (auto& c : other.m_conjunctions)
- if (!implies(c))
- return false;
- return true;
-}
-
-bool CapabilitySet::operator==(CapabilitySet const& that) const
-{
- return m_conjunctions == that.m_conjunctions;
-}
-
-void CapabilitySet::calcCompactedAtoms(List<List<CapabilityAtom>>& outAtoms) const
-{
- for (auto& c : m_conjunctions)
- {
- List<CapabilityAtom> atoms;
- c.calcCompactedAtoms(atoms);
- outAtoms.add(atoms);
- }
-}
-void CapabilitySet::unionWith(const CapabilityConjunctionSet& conjunctionToAdd)
-{
- // We add conjunctionToAdd to resultSet only if it does not imply any existing conjunctions.
- // For example, if `resultSet` is (a), and conjunctionToAdd is (ab), then we don't want to add the conjunction
- // to form (a | ab) because that would reduce to (a).
- bool skipAdd = false;
- for (auto& c : m_conjunctions)
+ for (const auto& otherTarget : other.m_targetSets)
{
- if (conjunctionToAdd.implies(c))
+ auto thisTarget = this->m_targetSets.tryGetValue(otherTarget.first);
+ if (!thisTarget)
{
- skipAdd = true;
- break;
+ // 'this' lacks a target 'other' has.
+ return false;
}
- }
- if (!skipAdd)
- {
- // Once we added the new conjunction, any existing conjunctions that implies the new one can be
- // removed.
- // For example, if resultSet was (ab), and we are adding (a), the result should be just (a).
- for (Index i = 0; i < m_conjunctions.getCount();)
+
+ for (const auto& otherStage : otherTarget.second.shaderStageSets)
{
- if (m_conjunctions[i].implies(conjunctionToAdd))
+ auto thisStage = thisTarget->shaderStageSets.tryGetValue(otherStage.first);
+ if (!thisStage)
{
- m_conjunctions.fastRemoveAt(i);
+ // 'this' lacks a stage 'other' has.
+ return false;
}
- else
+
+ // all stage sets that are in 'other' must be contained by 'this'
+ if(thisStage->atomSet)
{
- i++;
+ auto& thisStageSet = thisStage->atomSet.value();
+ if(otherStage.second.atomSet)
+ {
+ if (!onlyRequireSingleImply)
+ {
+ if (!thisStageSet.contains(otherStage.second.atomSet.value()))
+ return false;
+ }
+ else
+ {
+ if (thisStageSet.contains(otherStage.second.atomSet.value()))
+ return true;
+ }
+ }
}
}
- m_conjunctions.add(conjunctionToAdd);
}
+ return !onlyRequireSingleImply;
}
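The control flow of the new `implies` is easier to follow on a reduced model. The sketch below assumes hypothetical types (`Atoms`, `StageMap`, `TargetMap`) built on `std::bitset` and `std::map`, and it drops the optional `atomSet` wrapper used in the real code. It keeps the two modes: by default every stage set in `other` must be contained, while `onlyRequireSingleImply` succeeds on the first contained stage set.

```cpp
#include <bitset>
#include <cassert>
#include <map>

using Atoms = std::bitset<32>;                // hypothetical atom bit set
using StageMap = std::map<int, Atoms>;        // stage id -> atoms
using TargetMap = std::map<int, StageMap>;    // target id -> stages

// True when every bit of `small` is also set in `big`.
bool containsAll(const Atoms& big, const Atoms& small)
{
    return (big & small) == small;
}

bool implies(const TargetMap& self, const TargetMap& other, bool onlyRequireSingleImply = false)
{
    for (const auto& otherTarget : other)
    {
        auto selfTarget = self.find(otherTarget.first);
        if (selfTarget == self.end())
            return false; // `self` lacks a target that `other` has

        for (const auto& otherStage : otherTarget.second)
        {
            auto selfStage = selfTarget->second.find(otherStage.first);
            if (selfStage == selfTarget->second.end())
                return false; // `self` lacks a stage that `other` has

            const bool contained = containsAll(selfStage->second, otherStage.second);
            if (!onlyRequireSingleImply && !contained)
                return false; // strict mode: one miss fails the whole check
            if (onlyRequireSingleImply && contained)
                return true;  // relaxed mode: one hit is enough
        }
    }
    // Strict mode passed every check; relaxed mode found no hit.
    return !onlyRequireSingleImply;
}
```

In this model a missing target or stage fails the check in both modes, because the key lookups run before the mode is consulted.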
-void CapabilitySet::canonicalize()
+void CapabilityTargetSet::unionWith(const CapabilityTargetSet& other)
{
- // Make sure conjunctions are sorted so equality tests are trivial.
- m_conjunctions.sort();
+ for (auto otherStageSet : other.shaderStageSets)
+ {
+ auto& thisStageSet = this->shaderStageSets[otherStageSet.first];
+ thisStageSet.stage = otherStageSet.first;
+
+ if (!thisStageSet.atomSet)
+ thisStageSet.atomSet = otherStageSet.second.atomSet;
+ else
+ if(otherStageSet.second.atomSet)
+ thisStageSet.atomSet->unionWith(*otherStageSet.second.atomSet);
+ }
}
-CapabilitySet CapabilitySet::getTargetsThisIsMissingFromOther(const CapabilitySet& other)
+void CapabilitySet::unionWith(const CapabilitySet& other)
{
- CapabilitySet conflicts{};
- List<CapabilityConjunctionSet> textualTargetsNotHandled;
- for (auto conjunction : this->m_conjunctions)
+ if (this->isInvalid() || other.isInvalid())
+ return;
+
+ this->m_targetSets.reserve(other.m_targetSets.getCount());
+ for (auto otherTargetSet : other.m_targetSets)
{
- textualTargetsNotHandled.add({});
- auto& currentList = textualTargetsNotHandled.getLast();
- for (auto thatNode : conjunction.getExpandedAtoms())
- {
- // To make this faster we can make an assumption that the nodes are:
- // {textualTarget, targetAbstract(), targetAbstract(), nonTarget}
- // this assumption is not being used since it relies on ordering of .capdef file
- if (_getInfo(thatNode).abstractBase == CapabilityName::target)
- currentList.getExpandedAtoms().add(thatNode);
- }
+ CapabilityTargetSet& thisTargetSet = this->m_targetSets[otherTargetSet.first];
+ thisTargetSet.target = otherTargetSet.first;
+ thisTargetSet.shaderStageSets.reserve(otherTargetSet.second.shaderStageSets.getCount());
+ thisTargetSet.unionWith(otherTargetSet.second);
}
- for (auto& thatConjunction : other.m_conjunctions)
- {
- // Worth the check to early leave due to ~5*5 elements to loop around
- if (textualTargetsNotHandled.getCount() == 0)
- break;
+}
- for (int i = 0 ; i < textualTargetsNotHandled.getCount(); i++)
- {
- auto& textualTargets = textualTargetsNotHandled[i];
+/// Join sets, but:
+/// 1. do not destroy target sets that are incompatible with `other` (destroying shaderStageSets is fine)
+/// 2. do not create a `CapabilityAtom::Invalid` target set.
+void CapabilitySet::nonDestructiveJoin(const CapabilitySet& other)
+{
+ if (this->isInvalid() || other.isInvalid())
+ return;
- if (textualTargets.countIntersectionWith(thatConjunction) != textualTargets.getExpandedAtoms().getCount())
- continue;
-
- textualTargetsNotHandled[i] = textualTargets.makeEmpty();
- }
+ if (this->isEmpty())
+ {
+ this->m_targetSets = other.m_targetSets;
+ return;
}
- CapabilitySet set;
- for (auto& i : textualTargetsNotHandled)
+ for (auto& thisTargetSet : this->m_targetSets)
{
- if (i.isEmpty())
- continue;
- set.unionWith(i);
+ thisTargetSet.second.tryJoin(other.m_targetSets);
}
- return set;
}
-// We only run 'join' logic on "this" conjunctions which are compatible with "other" conjunctions.
-// We only add specific nodes which satisfy the abstractMask.
-// Any conjunctions not compatible with "other"'s conjunctions will be preserved and unmodified.
-void CapabilitySet::simpleJoinWithSetMask(const CapabilitySet& other, CapabilityName abstractMask)
+void CapabilitySet::addCapability(List<List<CapabilityAtom>>& atomLists)
{
- CapabilitySet resultSet;
- HashSet<CapabilityConjunctionSet*> setUsed;
- // get used abstract mask nodes per conjunction so we can trivially check
- // if we need to add the abstract mask node to avoid duplicates
- List<HashSet<CapabilityAtom>> abstractMaskNodeInUse;
- abstractMaskNodeInUse.growToCount(m_conjunctions.getCount());
- for (int i = 0; i < m_conjunctions.getCount(); i++)
+ for (const auto& cr : atomLists)
+ addToTargetCapabilitesWithCanonicalRepresentation( (*(List<CapabilityName>*)(&cr)).getArrayView());
+}
+
+CapabilitySet CapabilitySet::getTargetsThisHasButOtherDoesNot(const CapabilitySet& other)
+{
+ CapabilitySet newSet{};
+ for (auto& i : this->m_targetSets)
{
- auto& thisConjunction = m_conjunctions[i];
- auto& setOfInUseNode = abstractMaskNodeInUse[i];
+ if (other.m_targetSets.tryGetValue(i.first))
+ continue;
- for (auto& atom : thisConjunction.getExpandedAtoms())
- {
- if (_getInfo(atom).abstractBase != abstractMask)
- continue;
- setOfInUseNode.add(atom);
- }
+ newSet.m_targetSets[i.first].target = i.first;
+ auto info = _getInfo(i.first);
+ if(info.canonicalRepresentation.getCount() > 0)
+ newSet.addToTargetCapabilityWithTargetAndStageAtom((CapabilityName)i.first, CapabilityName::Invalid, info.canonicalRepresentation[0]);
}
+ return newSet;
+}
- for (auto& thatConjunction : other.m_conjunctions)
- {
- for (int i = 0; i < m_conjunctions.getCount(); i++)
- {
- auto& thisConjunction = m_conjunctions[i];
- auto& setOfInUseNode = abstractMaskNodeInUse[i];
- CapabilityConjunctionSet conjunctionToAddToResultSet;
+/// Join `this` with a compatible stage set of `CapabilityTargetSet other`.
+/// Return false when `other` is fully incompatible.
+/// Incompatibility means `this->stage` is not a stage supported by `other.shaderStageSets`.
+bool CapabilityStageSet::tryJoin(const CapabilityTargetSet& other)
+{
+ const CapabilityStageSet* otherStageSet = other.shaderStageSets.tryGetValue(this->stage);
+ if (!otherStageSet)
+ return false;
- if (thisConjunction.isIncompatibleWith(thatConjunction))
- continue;
- conjunctionToAddToResultSet = thisConjunction;
- setUsed.add(&thisConjunction);
- for (auto atom : thatConjunction.getExpandedAtoms())
- {
- if (_getInfo(atom).abstractBase != abstractMask
- || setOfInUseNode.contains(atom))
- continue;
- conjunctionToAddToResultSet.getExpandedAtoms().add(atom);
- }
- conjunctionToAddToResultSet.getExpandedAtoms().sort();
- resultSet.unionWith(conjunctionToAddToResultSet);
- }
- }
- for (auto& c : m_conjunctions)
+ // should not exceed far beyond 2*2 or 1*1 elements
+ if(otherStageSet->atomSet && this->atomSet)
+ this->atomSet->add(*otherStageSet->atomSet);
+
+ return true;
+}
+
+/// Join a compatible target set from `this` with `CapabilityTargetSets other`.
+/// Return false when `other` is fully incompatible.
+/// Incompatibility means one of 2 scenarios is true:
+/// 1. `this->target` is not a target supported by `other`
+/// 2. `this` has completely disjoint shader stages from `other`.
+bool CapabilityTargetSet::tryJoin(const CapabilityTargetSets& other)
+{
+ const CapabilityTargetSet* otherTargetSet = other.tryGetValue(this->target);
+ if (otherTargetSet == nullptr)
+ return false;
+
+ List<CapabilityAtom> destroySet;
+ destroySet.reserve(this->shaderStageSets.getCount());
+ for (auto& shaderStageSet : this->shaderStageSets)
{
- if (!setUsed.contains(&c))
- resultSet.m_conjunctions.add(c);
+ if (!shaderStageSet.second.tryJoin(*otherTargetSet))
+ destroySet.add(shaderStageSet.first);
}
- m_conjunctions = resultSet.m_conjunctions;
-}
+ if (destroySet.getCount() == Slang::Index(this->shaderStageSets.getCount()))
+ return false;
+
+ for (const auto& i : destroySet)
+ this->shaderStageSets.remove(i);
+
+ return true;
+}
void CapabilitySet::join(const CapabilitySet& other)
{
- if (isEmpty() || other.isInvalid())
+ if (this->isEmpty() || other.isInvalid())
{
*this = other;
return;
}
- if (isInvalid())
+ if (this->isInvalid())
return;
if (other.isEmpty())
return;
- CapabilitySet resultSet;
- for (auto& thatConjunction : other.m_conjunctions)
+ List<CapabilityAtom> destroySet;
+ destroySet.reserve(this->m_targetSets.getCount());
+ for (auto& thisTargetSet : this->m_targetSets)
{
- for (auto& thisConjunction : m_conjunctions)
+ if (!thisTargetSet.second.tryJoin(other.m_targetSets))
{
- if (thisConjunction.isIncompatibleWith(thatConjunction))
- continue;
+ destroySet.add(thisTargetSet.first);
+ }
+ }
+ for (const auto& i : destroySet)
+ {
+ this->m_targetSets.remove(i);
+ }
+ // The join produced an invalid CapabilitySet.
+ if (this->m_targetSets.getCount() == 0)
+ this->m_targetSets[CapabilityAtom::Invalid].target = CapabilityAtom::Invalid;
+}
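The join above works in two passes per level: merge what is compatible, collect the keys that are not, then erase them afterwards so the container is never modified while it is being iterated. A simplified sketch of that pattern is shown here, assuming hypothetical stand-ins (`Atoms`, `StageMap`, `TargetMap`, and a `kInvalid` sentinel key) instead of Slang's own types.

```cpp
#include <bitset>
#include <cassert>
#include <map>
#include <vector>

using Atoms = std::bitset<32>;                // hypothetical atom bit set
using StageMap = std::map<int, Atoms>;        // stage id -> atoms
using TargetMap = std::map<int, StageMap>;    // target id -> stages
constexpr int kInvalid = -1;                  // hypothetical "invalid" target key

// Merge stages shared with `theirs`; drop stages `theirs` does not have.
// Returns false when no stage at all is shared.
bool tryJoinStages(StageMap& mine, const StageMap& theirs)
{
    std::vector<int> destroy;
    for (auto& stage : mine)
    {
        auto match = theirs.find(stage.first);
        if (match == theirs.end())
            destroy.push_back(stage.first);
        else
            stage.second |= match->second; // union of required atoms
    }
    if (destroy.size() == mine.size())
        return false;
    for (int key : destroy)
        mine.erase(key);
    return true;
}

void join(TargetMap& self, const TargetMap& other)
{
    if (self.empty() || other.count(kInvalid) != 0) { self = other; return; }
    if (self.count(kInvalid) != 0 || other.empty()) return;

    std::vector<int> destroy;
    for (auto& target : self)
    {
        auto match = other.find(target.first);
        if (match == other.end() || !tryJoinStages(target.second, match->second))
            destroy.push_back(target.first);
    }
    for (int key : destroy)
        self.erase(key);

    if (self.empty())
        self[kInvalid]; // nothing survived: mark the result as invalid
}
```

Collecting keys first and erasing later is a conservative choice; it avoids relying on any iterator-invalidation guarantees of the underlying container.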
- CapabilityConjunctionSet conjunction;
- CapabilityConjunctionSet *conjunctionToAdd = nullptr;
+static uint32_t _calcAtomListDifferenceScore(List<CapabilityAtom> const& thisList, List<CapabilityAtom> const& thatList)
+{
+ uint32_t score = 0;
- // Add atoms from thatConjunction that do not exist in thisConjunction.
- for (auto atom : thatConjunction.getExpandedAtoms())
- {
- if (thisConjunction.getExpandedAtoms().binarySearch(atom, CapabilityAtomComparator()) == -1)
- {
- conjunction.getExpandedAtoms().add(atom);
- }
- }
+ // Our approach here will be to scan through `this` and `that`
+ // to identify atoms that are in `this` but not `that` (that is,
+ // the atoms that would be present in the set difference `this - that`)
+ // and then compute the maximum rank/score of those atoms.
- if (conjunction.getExpandedAtoms().getCount() != 0)
- {
- // If we find any capabilities in thatConjunction that is missing from thisConjunction,
- // create a new ConjunctionSet that contains atoms from both, and add it to the disjunction set.
- conjunction.getExpandedAtoms().addRange(thisConjunction.getExpandedAtoms());
- conjunction.getExpandedAtoms().sort();
- conjunctionToAdd = &conjunction;
- }
- else
+ Index thisCount = thisList.getCount();
+ Index thatCount = thatList.getCount();
+ Index thisIndex = 0;
+ Index thatIndex = 0;
+ for (;;)
+ {
+ if (thisIndex == thisCount) break;
+ if (thatIndex == thatCount) break;
+
+ auto thisAtom = thisList[thisIndex];
+ auto thatAtom = thatList[thatIndex];
+
+ if (thisAtom == thatAtom)
+ {
+ thisIndex++;
+ thatIndex++;
+ continue;
+ }
+
+ if (thisAtom < thatAtom)
+ {
+ // `thisAtom` is not present in `that`, so it
+ // should contribute to our ranking of the difference.
+ //
+ auto thisAtomInfo = _getInfo(thisAtom);
+ auto thisAtomRank = thisAtomInfo.rank;
+
+ if (thisAtomRank > score)
{
- // Otherwise, thisConjunction implies thatConjunction, so we just add thisConjunction to resultSet.
- conjunctionToAdd = &thisConjunction;
+ score = thisAtomRank;
}
- resultSet.unionWith(*conjunctionToAdd);
+
+ thisIndex++;
+ }
+ else
+ {
+ SLANG_ASSERT(thisAtom > thatAtom);
+ thatIndex++;
}
}
- m_conjunctions = _Move(resultSet.m_conjunctions);
+ return score;
+}
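`_calcAtomListDifferenceScore` is a two-pointer walk over two sorted lists that tracks the highest rank among atoms found only in the first list. The sketch below reproduces that walk on plain integers; `differenceScore` and the `rankOf` lookup table are hypothetical stand-ins for the function above and for `_getInfo(atom).rank`. Like the function above, the walk stops as soon as either list is exhausted, so atoms left at the tail of the first list are not scored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Both lists must be sorted ascending; `rankOf[atom]` is the atom's rank.
uint32_t differenceScore(
    const std::vector<int>& thisList,
    const std::vector<int>& thatList,
    const std::vector<uint32_t>& rankOf)
{
    uint32_t score = 0;
    std::size_t thisIndex = 0;
    std::size_t thatIndex = 0;
    while (thisIndex < thisList.size() && thatIndex < thatList.size())
    {
        const int thisAtom = thisList[thisIndex];
        const int thatAtom = thatList[thatIndex];
        if (thisAtom == thatAtom)
        {
            // Present in both lists: not part of the difference.
            ++thisIndex;
            ++thatIndex;
        }
        else if (thisAtom < thatAtom)
        {
            // Present only in `thisList`: contributes its rank.
            score = std::max(score, rankOf[thisAtom]);
            ++thisIndex;
        }
        else
        {
            // Present only in `thatList`: skip it.
            ++thatIndex;
        }
    }
    return score;
}
```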
- if (m_conjunctions.getCount() == 0)
- {
- // If the result is empty, then we should return as impossible.
- *this = CapabilitySet::makeInvalid();
- }
- else
+bool CapabilitySet::hasSameTargets(const CapabilitySet& other) const
+{
+ for (const auto& i : this->m_targetSets)
{
- canonicalize();
+ if (!other.m_targetSets.tryGetValue(i.first))
+ return false;
}
+ return this->m_targetSets.getCount() == other.m_targetSets.getCount();
}
-bool CapabilitySet::isBetterForTarget(CapabilitySet const& that, CapabilitySet const& targetCaps) const
+
+// MSVC incorrectly emits warning C4702 (unreachable code) here
+#pragma warning(push)
+#pragma warning(disable:4702)
+/// Returns true if 'this' is a better fit for 'targetCaps' than 'that'.
+/// isEqual: set when `this` and `that` are equal
+/// isIncompatible: whether `this` and `that` are incompatible
+bool CapabilitySet::isBetterForTarget(CapabilitySet const& that, CapabilitySet const& targetCaps, bool& isEqual) const
{
- if (targetCaps.isIncompatibleWith(*this))
- return false;
- if (targetCaps.isIncompatibleWith(that))
+ if (this->isEmpty() && (that.isEmpty() || that.isInvalid()))
+ {
+ if(this->isEmpty() && that.isEmpty())
+ isEqual = true;
return true;
+ }
- ArrayView<CapabilityConjunctionSet> thisSets = m_conjunctions.getArrayView();
- ArrayView<CapabilityConjunctionSet> thatSets = that.m_conjunctions.getArrayView();
- CapabilityConjunctionSet emtpySet = CapabilityConjunctionSet::makeEmpty();
-
- if (isEmpty())
- thisSets = makeArrayViewSingle(emtpySet);
- if (that.isEmpty())
- thatSets = makeArrayViewSingle(emtpySet);
-
- // It is hard to think about what it means exactly to compare a general disjunction set to another with regard
- // to a target that itself is also a disjunction set.
- // Instead of trying to find a meaning for the general case, we just want to extend the logic
- // for conjunction sets to disjunction sets in a way that common situations are handled correctly.
- // Note that when we reach here, most of these sets are likely to contain only one conjunction, so
- // we just need to make sure the more general logic here yields correct result for that case.
- //
- // Right now, we define betterness for disjunctions as follows:
- // A capability set X is determined to be better for a target T than capability set Y,
- // if we find a conjunction A in X and a conjunction B in Y and a conjunction C in T such that
- // A is better then B for target C.
- //
- struct ViableConjunctionIndex
+ // required to have target.
+ for (auto& targetWeNeed : targetCaps.m_targetSets)
{
- Index index;
- UIntSet targetConjunctionIndices;
- };
- auto getViableConjunction = [&](ArrayView<CapabilityConjunctionSet> set, List<ViableConjunctionIndex>& outList)
+ auto thisTarget = this->m_targetSets.tryGetValue(targetWeNeed.first);
+ if (!thisTarget)
{
- for (Index i = 0; i < set.getCount(); i++)
- {
- auto& conjunction = set[i];
- ViableConjunctionIndex viableConjunction;
- viableConjunction.index = i;
- for (Index j = 0; j < targetCaps.m_conjunctions.getCount(); j++)
- {
- auto& targetConjunction = targetCaps.m_conjunctions[j];
- if (conjunction.isIncompatibleWith(targetConjunction))
- continue;
- viableConjunction.targetConjunctionIndices.add(j);
- }
- if (!viableConjunction.targetConjunctionIndices.isEmpty())
- {
- outList.add(viableConjunction);
- }
- }
- };
- List<ViableConjunctionIndex> viableConjunctionsThis;
- List<ViableConjunctionIndex> viableConjunctionsThat;
+ isEqual = hasSameTargets(that);
+ return false;
+ }
+ auto thatTarget = that.m_targetSets.tryGetValue(targetWeNeed.first);
+ if (!thatTarget)
+ {
+ isEqual = hasSameTargets(that);
+ return true;
+ }
- getViableConjunction(thisSets, viableConjunctionsThis);
- getViableConjunction(thatSets, viableConjunctionsThat);
-
- for (auto& thisConjunctionIndex : viableConjunctionsThis)
- {
- auto& thisConjunction = thisSets[thisConjunctionIndex.index];
- for (auto& thatConjunctionIndex : viableConjunctionsThat)
+ // required to have shader stage
+ for (auto& shaderStageSetsWeNeed : targetWeNeed.second.shaderStageSets)
{
- auto& thatConjunction = thatSets[thatConjunctionIndex.index];
- UIntSet intersection = thisConjunctionIndex.targetConjunctionIndices;
- intersection.intersectWith(thatConjunctionIndex.targetConjunctionIndices);
- if (!intersection.isEmpty())
+ auto thisStageSets = thisTarget->shaderStageSets.tryGetValue(shaderStageSetsWeNeed.first);
+ if (!thisStageSets)
+ return false;
+ auto thatStageSets = thatTarget->shaderStageSets.tryGetValue(shaderStageSetsWeNeed.first);
+ if (!thatStageSets)
+ return true;
+
+ // We want the smallest (most specialized) set which is still contained by this/that. This means:
+ // 1. target.contains(this/that)
+ // 2. choose smallest super set
+ // 3. rank each super set and their atoms, choose the smallest rank'd set (most specialized)
+ if(shaderStageSetsWeNeed.second.atomSet)
{
- for (Index targetConjunctionIndex = 0; targetConjunctionIndex < targetCaps.m_conjunctions.getCount(); targetConjunctionIndex++)
+ auto& shaderStageSetWeNeed = shaderStageSetsWeNeed.second.atomSet.value();
+
+ CapabilityAtomSet tmp_set{};
+ Index tmpCount = 0;
+ CapabilityAtomSet thisSet{};
+ Index thisSetCount = 0;
+ CapabilityAtomSet thatSet{};
+ Index thatSetCount = 0;
+
+ // Subtracting `this`/`that` from the set we need yields the atoms the target set has but `this`/`that` lacks (i.e. where `this`/`that` is less specialized)
+ if(thisStageSets->atomSet)
{
- if (!intersection.contains((UInt)targetConjunctionIndex))
- continue;
- if (thisConjunction.isBetterForTarget(thatConjunction, targetCaps.m_conjunctions[targetConjunctionIndex]))
+ auto& thisStageSet = thisStageSets->atomSet.value();
+ // if `thisStageSet` is more specialized than the target, `thisStageSet` should not be a candidate
+ if (thisStageSet == shaderStageSetWeNeed)
+ return true;
+ if (shaderStageSetWeNeed.contains(thisStageSet))
{
- return true;
+ CapabilityAtomSet::calcSubtract(tmp_set, shaderStageSetWeNeed, thisStageSet);
+ tmpCount = tmp_set.countElements();
+ if (thisSetCount < tmpCount)
+ {
+ thisSet = tmp_set;
+ thisSetCount = tmpCount;
+ }
}
}
+ if (thatStageSets->atomSet)
+ {
+ auto& thatStageSet = thatStageSets->atomSet.value();
+ if (thatStageSet == shaderStageSetWeNeed)
+ return false;
+ if (shaderStageSetWeNeed.contains(thatStageSet))
+ {
+ CapabilityAtomSet::calcSubtract(tmp_set, shaderStageSetWeNeed, thatStageSet);
+ tmpCount = tmp_set.countElements();
+ if (thatSetCount < tmpCount)
+ {
+ thatSet = tmp_set;
+ thatSetCount = tmpCount;
+ }
+ }
+ }
+
+ if (thisSet == thatSet)
+ isEqual = true;
+
+ //empty means no candidate
+ if (thisSet.areAllZero())
+ return false;
+ if (thatSet.areAllZero())
+ return true;
+ if (thisSetCount < thatSetCount)
+ return true;
+ else if (thisSetCount > thatSetCount)
+ return false;
+
+ auto thisSetElements = thisSet.getElements<CapabilityAtom>();
+ auto thatSetElements = thatSet.getElements<CapabilityAtom>();
+ auto shaderStageSetWeNeedElements = shaderStageSetWeNeed.getElements<CapabilityAtom>();
+
+ auto thisDiffScore = _calcAtomListDifferenceScore(thisSetElements, shaderStageSetWeNeedElements);
+ auto thatDiffScore = _calcAtomListDifferenceScore(thatSetElements, shaderStageSetWeNeedElements);
+
+ return thisDiffScore < thatDiffScore;
}
}
}
- return false;
+ return true;
}
+#pragma warning(pop)
-bool CapabilitySet::checkCapabilityRequirement(CapabilitySet const& available, CapabilitySet const& required, const CapabilityConjunctionSet*& outFailedAvailableSet)
+CapabilitySet::AtomSets::Iterator CapabilitySet::getAtomSets() const
+{
+ return CapabilitySet::AtomSets::Iterator(&this->getCapabilityTargetSets()).begin();
+}
+
+bool CapabilitySet::checkCapabilityRequirement(CapabilitySet const& available, CapabilitySet const& required, CapabilityAtomSet& outFailedAvailableSet)
{
// Requirements x are met by available disjoint capabilities (a | b) iff
// both 'a' satisfies x and 'b' satisfies x.
@@ -1243,75 +762,85 @@ bool CapabilitySet::checkCapabilityRequirement(CapabilitySet const& available, C
// We will check that for every capability conjunction X of F(), there is one capability conjunction Y in g() such that X implies Y.
//
- outFailedAvailableSet = nullptr;
+ // if empty there is no body, all capabilities are supported.
+ if (required.isEmpty())
+ return true;
if (required.isInvalid())
+ {
+ outFailedAvailableSet.add((UInt)CapabilityAtom::Invalid);
return false;
+ }
// If F's capability is empty, we cannot satisfy any non-empty requirements.
//
if (available.isEmpty() && !required.isEmpty())
return false;
-
- for (auto& availTargetSet : available.getExpandedAtoms())
+
+
+ // Every set in `available` must be a superset of the matching `required` set; otherwise we have an error
+ for (auto& availableTarget : available.m_targetSets)
{
- bool implied = false;
- for (auto& requiredTargetSet : required.getExpandedAtoms())
+ auto reqTarget = required.m_targetSets.tryGetValue(availableTarget.first);
+ if (!reqTarget)
{
- if (availTargetSet.implies(requiredTargetSet))
- {
- implied = true;
- break;
- }
- }
- if (!implied)
- {
- outFailedAvailableSet = &availTargetSet;
+ outFailedAvailableSet.add((UInt)availableTarget.first);
return false;
}
- }
-
- return true;
-}
-bool CapabilitySet::isExactSubset(CapabilitySet const& maybeSuperSet)
-{
- // This should only be used when absolutely required due to the
- // cost for complex sets. Simple sets are fine (glsl|spirv...)
- for (auto& thisCon : m_conjunctions)
- {
- bool foundEqualCon = false;
- for (auto& thatCon : maybeSuperSet.m_conjunctions)
+ for (auto& availableStage : availableTarget.second.shaderStageSets)
{
- if (thisCon == thatCon)
- foundEqualCon = true;
- }
- if (foundEqualCon == false)
- return false;
+ auto reqStage = reqTarget->shaderStageSets.tryGetValue(availableStage.first);
+ if (!reqStage)
+ {
+ outFailedAvailableSet.add((UInt)availableStage.first);
+ return false;
+ }
+
+ const CapabilityAtomSet* lastBadStage = nullptr;
+ if(availableStage.second.atomSet)
+ {
+ const auto& availableStageSet = availableStage.second.atomSet.value();
+ lastBadStage = nullptr;
+ if(reqStage->atomSet)
+ {
+ const auto& reqStageSet = reqStage->atomSet.value();
+ if (availableStageSet.contains(reqStageSet))
+ break;
+ else
+ lastBadStage = &reqStageSet;
+ }
+ if (lastBadStage)
+ {
+ // get missing atoms
+ CapabilityAtomSet::calcSubtract(outFailedAvailableSet, *lastBadStage, availableStageSet);
+ return false;
+ }
+ }
+ }
}
+
return true;
}
void printDiagnosticArg(StringBuilder& sb, const CapabilitySet& capSet)
{
bool isFirstSet = true;
- for (auto& set : capSet.getExpandedAtoms())
+ for (auto& set : capSet.getAtomSets())
{
- List<CapabilityAtom> compactAtomList;
- set.calcCompactedAtoms(compactAtomList);
-
if (!isFirstSet)
{
sb<< " | ";
}
bool isFirst = true;
- for (auto atom : compactAtomList)
+ for (auto atom : set)
{
+ CapabilityName formattedAtom = (CapabilityName)atom;
if (!isFirst)
{
sb << " + ";
}
- auto name = capabilityNameToString((CapabilityName)atom);
+ auto name = capabilityNameToString((CapabilityName)formattedAtom);
if (name.startsWith("_"))
name = name.tail(1);
sb << name;
@@ -1331,4 +860,211 @@ void printDiagnosticArg(StringBuilder& sb, CapabilityName name)
sb << _getInfo(name).name;
}
+#ifdef UNIT_TEST_CAPABILITIES
+
+#define CHECK_CAPS(inData) SLANG_ASSERT(inData>0)
+
+int TEST_findTargetCapSet(CapabilitySet& capSet, CapabilityAtom target)
+{
+ return true
+ && capSet.getCapabilityTargetSets().containsKey(target);
+}
+
+int TEST_findTargetStage(
+ CapabilitySet& capSet,
+ CapabilityAtom target,
+ CapabilityAtom stage)
+{
+ return capSet.getCapabilityTargetSets()[target].shaderStageSets.containsKey(stage);
+}
+
+int TEST_targetCapSetWithSpecificSetInStage(
+ CapabilitySet& capSet,
+ CapabilityAtom target,
+ CapabilityAtom stage,
+ List<CapabilityAtom> setToFind)
+{
+
+ bool containsStageKey = capSet.getCapabilityTargetSets()[target].shaderStageSets.containsKey(stage);
+ if (!containsStageKey)
+ return 0;
+
+ auto& stageSet = capSet.getCapabilityTargetSets()[target].shaderStageSets[stage];
+ if (stage != stageSet.stage)
+ return -1;
+
+ CapabilityAtomSet set;
+ for (auto i : setToFind)
+ set.add(UInt(i));
+
+ if (stageSet.atomSet)
+ {
+ auto& i = stageSet.atomSet.value();
+ if (i == set)
+ return true;
+ }
+
+ return -2;
+}
+
+void TEST_CapabilitySet_addAtom()
+{
+ CapabilitySet testCapSet{};
+
+ // ------------------------------------------------------------
+
+ testCapSet = CapabilitySet(CapabilityName::TEST_ADD_1);
+
+ CHECK_CAPS(TEST_findTargetCapSet(testCapSet, CapabilityAtom::hlsl));
+ CHECK_CAPS(TEST_targetCapSetWithSpecificSetInStage(testCapSet, CapabilityAtom::hlsl, CapabilityAtom::vertex,
+ { CapabilityAtom::textualTarget, CapabilityAtom::hlsl, CapabilityAtom::vertex,
+ CapabilityAtom::_sm_4_0, CapabilityAtom::_sm_4_1 }));
+
+ CHECK_CAPS(TEST_findTargetCapSet(testCapSet, CapabilityAtom::glsl));
+ CHECK_CAPS(TEST_targetCapSetWithSpecificSetInStage(testCapSet, CapabilityAtom::glsl, CapabilityAtom::vertex,
+ { CapabilityAtom::textualTarget, CapabilityAtom::glsl, CapabilityAtom::vertex,
+ CapabilityAtom::_GLSL_130 }));
+
+ CHECK_CAPS(TEST_findTargetCapSet(testCapSet, CapabilityAtom::spirv_1_0));
+ CHECK_CAPS(TEST_targetCapSetWithSpecificSetInStage(testCapSet, CapabilityAtom::spirv_1_0, CapabilityAtom::vertex,
+ { CapabilityAtom::spirv_1_0, CapabilityAtom::vertex,
+ CapabilityAtom::spirv_1_1 }));
+
+ CHECK_CAPS(TEST_findTargetCapSet(testCapSet, CapabilityAtom::metal));
+ CHECK_CAPS(TEST_targetCapSetWithSpecificSetInStage(testCapSet, CapabilityAtom::metal, CapabilityAtom::vertex,
+ { CapabilityAtom::textualTarget, CapabilityAtom::metal, CapabilityAtom::vertex }));
+
+ // ------------------------------------------------------------
+
+ testCapSet = CapabilitySet(CapabilityName::TEST_ADD_2);
+
+ CHECK_CAPS(TEST_findTargetCapSet(testCapSet, CapabilityAtom::hlsl));
+ CHECK_CAPS(TEST_targetCapSetWithSpecificSetInStage(testCapSet, CapabilityAtom::hlsl, CapabilityAtom::vertex,
+ { CapabilityAtom::textualTarget, CapabilityAtom::hlsl, CapabilityAtom::vertex,
+ CapabilityAtom::_sm_4_0, CapabilityAtom::_sm_4_1 }));
+ CHECK_CAPS(TEST_targetCapSetWithSpecificSetInStage(testCapSet, CapabilityAtom::hlsl, CapabilityAtom::fragment,
+ { CapabilityAtom::textualTarget, CapabilityAtom::hlsl, CapabilityAtom::fragment,
+ CapabilityAtom::_sm_4_0, CapabilityAtom::_sm_4_1 }));
+
+ // ------------------------------------------------------------
+
+ testCapSet = CapabilitySet(CapabilityName::TEST_ADD_3);
+
+ CHECK_CAPS((int)!TEST_findTargetCapSet(testCapSet, CapabilityAtom::spirv_1_0));
+ CHECK_CAPS(TEST_findTargetCapSet(testCapSet, CapabilityAtom::glsl));
+ CHECK_CAPS(TEST_targetCapSetWithSpecificSetInStage(testCapSet, CapabilityAtom::glsl, CapabilityAtom::fragment,
+ { CapabilityAtom::textualTarget, CapabilityAtom::glsl, CapabilityAtom::fragment,
+ CapabilityAtom::_GLSL_130 }));
+ // ------------------------------------------------------------
+}
+
+void TEST_CapabilitySet_join()
+{
+ CapabilitySet testCapSetA{};
+ CapabilitySet testCapSetB{};
+
+ // ------------------------------------------------------------
+
+ testCapSetA = CapabilitySet(CapabilityName::TEST_JOIN_1A);
+ testCapSetB = CapabilitySet(CapabilityName::TEST_JOIN_1B);
+ testCapSetA.join(testCapSetB);
+
+ CHECK_CAPS((int)!TEST_findTargetCapSet(testCapSetA, CapabilityAtom::hlsl));
+ CHECK_CAPS((int)!TEST_findTargetCapSet(testCapSetA, CapabilityAtom::glsl));
+
+ // ------------------------------------------------------------
+
+ testCapSetA = CapabilitySet(CapabilityName::TEST_JOIN_2A);
+ testCapSetB = CapabilitySet(CapabilityName::TEST_JOIN_2B);
+ testCapSetA.join(testCapSetB);
+
+ CHECK_CAPS(TEST_findTargetCapSet(testCapSetA, CapabilityAtom::hlsl));
+ CHECK_CAPS(TEST_targetCapSetWithSpecificSetInStage(testCapSetA, CapabilityAtom::hlsl, CapabilityAtom::vertex,
+ { CapabilityAtom::textualTarget, CapabilityAtom::hlsl, CapabilityAtom::vertex,
+ CapabilityAtom::_sm_4_0, CapabilityAtom::_sm_4_1 }));
+
+ // ------------------------------------------------------------
+
+ testCapSetA = CapabilitySet(CapabilityName::TEST_JOIN_3A);
+ testCapSetB = CapabilitySet(CapabilityName::TEST_JOIN_3B);
+ testCapSetA.join(testCapSetB);
+
+ CHECK_CAPS((int)!TEST_findTargetCapSet(testCapSetA, CapabilityAtom::spirv_1_0));
+ CHECK_CAPS(TEST_findTargetCapSet(testCapSetA, CapabilityAtom::glsl));
+ CHECK_CAPS((int)!TEST_findTargetStage(testCapSetA, CapabilityAtom::glsl, CapabilityAtom::raygen));
+ CHECK_CAPS(TEST_targetCapSetWithSpecificSetInStage(testCapSetA, CapabilityAtom::glsl, CapabilityAtom::fragment,
+ { CapabilityAtom::textualTarget, CapabilityAtom::glsl, CapabilityAtom::fragment,
+ CapabilityAtom::_GLSL_130, CapabilityAtom::_GLSL_140 }));
+ CHECK_CAPS(TEST_targetCapSetWithSpecificSetInStage(testCapSetA, CapabilityAtom::glsl, CapabilityAtom::vertex,
+ { CapabilityAtom::textualTarget, CapabilityAtom::glsl, CapabilityAtom::vertex,
+ CapabilityAtom::_GLSL_130, CapabilityAtom::_GLSL_140 }));
+ CHECK_CAPS(TEST_targetCapSetWithSpecificSetInStage(testCapSetA, CapabilityAtom::hlsl, CapabilityAtom::fragment,
+ { CapabilityAtom::textualTarget, CapabilityAtom::hlsl, CapabilityAtom::fragment,
+ CapabilityAtom::_sm_4_0, CapabilityAtom::_sm_4_1 }));
+ CHECK_CAPS(TEST_targetCapSetWithSpecificSetInStage(testCapSetA, CapabilityAtom::hlsl, CapabilityAtom::vertex,
+ { CapabilityAtom::textualTarget, CapabilityAtom::hlsl, CapabilityAtom::vertex,
+ CapabilityAtom::_sm_4_0 }));
+
+ // ------------------------------------------------------------
+
+ testCapSetA = CapabilitySet(CapabilityName::TEST_JOIN_4A);
+ testCapSetB = CapabilitySet(CapabilityName::TEST_JOIN_4B);
+ testCapSetA.join(testCapSetB);
+
+ CHECK_CAPS(TEST_findTargetCapSet(testCapSetA, CapabilityAtom::glsl));
+ CHECK_CAPS(TEST_targetCapSetWithSpecificSetInStage(testCapSetA, CapabilityAtom::glsl, CapabilityAtom::fragment,
+ { CapabilityAtom::textualTarget, CapabilityAtom::glsl, CapabilityAtom::fragment,
+ CapabilityAtom::_GLSL_130, CapabilityAtom::_GLSL_140, CapabilityAtom::_GLSL_150, CapabilityAtom::_GL_EXT_texture_query_lod, CapabilityAtom::_GL_EXT_texture_shadow_lod }));
+
+ // ------------------------------------------------------------
+
+
+}
+
+void TEST_CapabilitySet()
+{
+ TEST_CapabilitySet_addAtom();
+ TEST_CapabilitySet_join();
+}
+
+/*
+/// Test Capabilities
+
+alias TEST_ADD_1 = _sm_4_1 | _GLSL_130 | spirv_1_1 | metal
+ ;
+
+alias TEST_ADD_2 = _sm_4_1 | _sm_4_0 + shader_stages_compute_fragment
+ ;
+
+alias TEST_ADD_3 = _GLSL_130 + shader_stages_compute_fragment_geometry_vertex;
+
+//
+
+alias TEST_JOIN_1A = hlsl;
+alias TEST_JOIN_1B = glsl;
+
+alias TEST_JOIN_2A = hlsl;
+alias TEST_JOIN_2B = _sm_4_1 | glsl;
+
+alias TEST_JOIN_3A = glsl + fragment | _sm_4_0 + fragment
+ | glsl + vertex | hlsl + vertex
+ ;
+alias TEST_JOIN_3B = _sm_4_1 + fragment
+ | _sm_4_0 + vertex
+ | _sm_4_0 + compute
+ | _GLSL_140 + vertex
+ | _GLSL_140 + fragment
+ | spirv_1_0 + fragment
+ | glsl + raygen
+ | hlsl + raygen
+ ;
+
+alias TEST_JOIN_4A = _GLSL_140 + _GL_EXT_texture_query_lod;
+alias TEST_JOIN_4B = _GLSL_150 + _GL_EXT_texture_shadow_lod;
+///
+*/
+#undef CHECK_CAPS
+
+#endif
+
}
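The requirement check above walks target, then stage, then atom set, and reports the missing atoms by set subtraction. A minimal sketch of that shape follows, assuming C++17 and using `std::bitset` and `std::map` as hypothetical stand-ins for `CapabilityAtomSet` and the Slang dictionaries. The names `AtomSet`, `StageSets`, `TargetSets`, `containsAll`, and `checkRequirement` are illustrative and not part of Slang, and the sketch simplifies the control flow of the real function.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical stand-ins for illustration only; these are not the Slang types.
constexpr std::size_t kAtomCount = 64;
using AtomSet = std::bitset<kAtomCount>;
using StageSets = std::map<int, AtomSet>;     // stage atom -> atoms for that stage
using TargetSets = std::map<int, StageSets>;  // target atom -> per-stage sets

// True when every bit set in `inner` is also set in `outer`.
inline bool containsAll(const AtomSet& outer, const AtomSet& inner)
{
    return (inner & ~outer).none();
}

// Every (target, stage) pair in `available` must also exist in `required`,
// and the available atoms must cover the required atoms for that pair.
// On failure, `outMissing` receives the atoms that explain the failure.
// Assumption: target and stage ids are smaller than kAtomCount,
// otherwise std::bitset::set throws std::out_of_range.
inline bool checkRequirement(
    const TargetSets& available,
    const TargetSets& required,
    AtomSet& outMissing)
{
    if (required.empty())
        return true;   // nothing is required
    if (available.empty())
        return false;  // something is required, nothing is offered

    for (const auto& [target, stages] : available)
    {
        auto reqTarget = required.find(target);
        if (reqTarget == required.end())
        {
            outMissing.set(static_cast<std::size_t>(target));
            return false;
        }
        for (const auto& [stage, atoms] : stages)
        {
            auto reqStage = reqTarget->second.find(stage);
            if (reqStage == reqTarget->second.end())
            {
                outMissing.set(static_cast<std::size_t>(stage));
                return false;
            }
            if (!containsAll(atoms, reqStage->second))
            {
                // Missing atoms = required minus available.
                outMissing |= reqStage->second & ~atoms;
                return false;
            }
        }
    }
    return true;
}
```

With bitsets, both the containment test and the "what is missing" computation reduce to a few word-wide operations, which is the property the commit message attributes to `UIntSet`.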
diff --git a/source/slang/slang-capability.h b/source/slang/slang-capability.h
index feac03337..9e4bdb3a8 100644
--- a/source/slang/slang-capability.h
+++ b/source/slang/slang-capability.h
@@ -3,8 +3,10 @@
#include "../core/slang-list.h"
#include "../core/slang-string.h"
+#include "../core/slang-dictionary.h"
#include <stdint.h>
+#include <optional>
namespace Slang
{
@@ -46,108 +48,47 @@ namespace Slang
//
// In all cases, we represent a set of capabilities with `CapabilitySet`.
- /// A set of capabilities, representing features that are either supported or required
-struct CapabilityConjunctionSet
+struct CapabilityAtomSet : UIntSet
{
-public:
- /// Default-construct an empty capability set
- CapabilityConjunctionSet();
-
- CapabilityConjunctionSet(CapabilityConjunctionSet const& other) = default;
- CapabilityConjunctionSet& operator=(CapabilityConjunctionSet const& other) = default;
- CapabilityConjunctionSet(CapabilityConjunctionSet&& other) = default;
- CapabilityConjunctionSet& operator=(CapabilityConjunctionSet&& other) = default;
-
- /// Construct a capability set from an explicit list of atomic capabilities
- CapabilityConjunctionSet(Int atomCount, CapabilityAtom const* atoms);
-
- /// Construct a capability set from an explicit list of atomic capabilities
- explicit CapabilityConjunctionSet(List<CapabilityAtom> const& atoms);
-
- /// Construct a singleton set from a single atomic capability
- explicit CapabilityConjunctionSet(CapabilityAtom atom);
-
- /// Make an empty capability set
- static CapabilityConjunctionSet makeEmpty();
-
- /// Make an invalid capability set (such that no target could ever support it)
- static CapabilityConjunctionSet makeInvalid();
-
- /// Is this capability set empty (such that any target supports it)?
- bool isEmpty() const;
-
- /// Is this capability set invalid (such that no target could support it)?
- bool isInvalid() const;
-
- // Capabilities are "incompatible" if no target platform can ever support both
- // at the same time. For example, the `HLSL` and `GLSL` capabilities are
- // incompatible, because a single target cannot be both an HLSL target and
- // a GLSL target (at least for now).
- //
- // Note that we are using the term "incompatible" here even though it
- // seems like "disjoint" would be intuitively correct (HLSL and GLSL
- // targets sure do seem to be disjoint). The problem is that in our
- // set-theoretic representation of capabilities, incompatible capability
- // sets are *never* disjoint sets of atoms, and (valid) disjoint sets of atoms
- // *never* represent incompatible capability sets.
-
- /// Is this capability set incompatible with the given `other` set.
- bool isIncompatibleWith(CapabilityAtom other) const;
-
- /// Is this capability set incompatible with the given `other` atomic capability.
- bool isIncompatibleWith(CapabilityConjunctionSet const& other) const;
-
- // One capability set A "implies" another set B if a target that
- // supports A must also support all of B.
- //
- // In practice, this means that "A implies B" is the same as
- // "A is a subset of B" in the set-theoretic model, but
- // we ant to think of this primarily as supported/required features,
- // and not get hung up on the set theory.
-
- /// Does this capability set imply all the capabilities in `other`?
- bool implies(CapabilityConjunctionSet const& other) const;
-
-
- /// Does this capability set imply the atomic capability `other`?
- bool implies(CapabilityAtom other) const;
-
- // A capability set is equal to another if each implies the other.
-
- /// Are these two capability sets equal?
- bool operator==(CapabilityConjunctionSet const& that) const;
- bool operator<(CapabilityConjunctionSet const& that) const;
-
- /// Get access to the raw atomic capabilities that define this set.
- List<CapabilityAtom> const& getExpandedAtoms() const { return m_expandedAtoms; }
- List<CapabilityAtom>& getExpandedAtoms() { return m_expandedAtoms; }
-
- /// Calculate a list of "compacted" atoms, which excludes any atoms from the expanded list that are implies by another item in the list.
- void calcCompactedAtoms(List<CapabilityAtom>& outAtoms) const;
+ using UIntSet::UIntSet;
+};
- Int countIntersectionWith(CapabilityConjunctionSet const& that) const;
+struct CapabilityTargetSet;
+typedef Dictionary<CapabilityAtom, CapabilityTargetSet> CapabilityTargetSets;
- bool isBetterForTarget(CapabilityConjunctionSet const& that, CapabilityConjunctionSet const& targetCaps) const;
+/// CapabilityStageSet encapsulates all capabilities of a specific shader stage for a specific target.
+/// Capabilities may be disjoint, but only in rare cases:
+/// {{glsl, _GLSL_130, GL_EXT_FOO1}, {glsl, _GLSL_130, _GLSL_140, _GLSL_150}}
+struct CapabilityStageSet
+{
+ CapabilityAtom stage{};
+
+ /// Atom set for this stage; holds no value until the first set is added.
+ std::optional<CapabilityAtomSet> atomSet{};
+
+ void addNewSet(CapabilityAtomSet&& setToAdd)
+ {
+ if (!atomSet)
+ atomSet = setToAdd;
+ else
+ atomSet->add(setToAdd);
+ }
+ bool tryJoin(const CapabilityTargetSet& other);
+};
-private:
- void _init(Int atomCount, CapabilityAtom const* atoms);
+/// CapabilityTargetSet encapsulates all capabilities of a specific target
+/// Format: {shader_stage, shader_stage_set}
+typedef Dictionary<CapabilityAtom, CapabilityStageSet> CapabilityStageSets;
+struct CapabilityTargetSet
+{
+ CapabilityAtom target{};
- uint32_t _calcDifferenceScoreWith(CapabilityConjunctionSet const& other) const;
+ CapabilityStageSets shaderStageSets{};
- // The underlying representation we use is a sorted and deduplicated
- // list of all the (non-alias) atoms that are present in the set.
- // This "expanded" list uses the transitive closure over the inheritnace
- // relationship between the atoms.
- //
- List<CapabilityAtom> m_expandedAtoms;
+ bool tryJoin(const CapabilityTargetSets& other);
+ void unionWith(const CapabilityTargetSet& other);
};
- /// Are the `left` and `right` capability sets unequal?
-inline bool operator!=(CapabilityConjunctionSet const& left, CapabilityConjunctionSet const& right)
-{
- return !(left == right);
-}
-
struct CapabilitySet
{
public:
@@ -168,9 +109,6 @@ public:
/// Construct a singleton set from a single atomic capability
explicit CapabilitySet(CapabilityName atom);
- /// Construct a singleton set from conjunctions
- explicit CapabilitySet(const List<CapabilityConjunctionSet>& conjunctions);
-
/// Make an empty capability set
static CapabilitySet makeEmpty();
@@ -190,61 +128,158 @@ public:
bool isIncompatibleWith(CapabilityName other) const;
/// Is this capability set incompatible with the given `other` atomic capability.
- bool isIncompatibleWith(CapabilityConjunctionSet const& other) const;
-
- /// Is this capability set incompatible with the given `other` atomic capability.
bool isIncompatibleWith(CapabilitySet const& other) const;
/// Does this capability set imply all the capabilities in `other`?
- bool implies(CapabilitySet const& other) const;
-
- /// Does this capability set imply all the capabilities in `other`?
- bool implies(CapabilityConjunctionSet const& other) const;
+ bool implies(CapabilitySet const& other, const bool onlyRequireSingleImply = false) const;
/// Does this capability set imply the atomic capability `other`?
bool implies(CapabilityAtom other) const;
- /// Join two capability sets to form (this & other).
+ /// Join two capability sets to form ('this' & 'other').
+ /// Destroys targets/sets that are part of 'this' and incompatible between 'this' and 'other'.
+ /// `this` may be made invalid if other is fully disjoint.
void join(const CapabilitySet& other);
- void unionWith(const CapabilityConjunctionSet& other);
-
- void simpleJoinWithSetMask(const CapabilitySet& other, CapabilityName abstractMask);
+ /// Join two capability sets to form ('this' & 'other').
+ /// If a target/set has an incompatible atom, do not destroy the target/set.
+ void nonDestructiveJoin(const CapabilitySet& other);
- CapabilitySet getTargetsThisIsMissingFromOther(const CapabilitySet& other);
+ /// Add all targets/sets of 'other' into 'this'. Overlapping sets are removed.
+ void unionWith(const CapabilitySet& other);
- void canonicalize();
+ /// Return a capability set of 'target' atoms 'this' has, but 'other' does not.
+ CapabilitySet getTargetsThisHasButOtherDoesNot(const CapabilitySet& other);
/// Are these two capability sets equal?
bool operator==(CapabilitySet const& that) const;
- /// Get access to the raw atomic capabilities that define this set.
- List<CapabilityConjunctionSet>& getExpandedAtoms() { return m_conjunctions; }
- const List<CapabilityConjunctionSet>& getExpandedAtoms() const { return m_conjunctions; }
-
-
+ void addCapability(List<List<CapabilityAtom>>& atomLists);
/// Calculate a list of "compacted" atoms, which excludes any atoms from the expanded list that are implied by another item in the list.
- void calcCompactedAtoms(List<List<CapabilityAtom>>& outAtoms) const;
-
- bool isBetterForTarget(CapabilitySet const& that, CapabilitySet const& targetCaps) const;
-
- static bool checkCapabilityRequirement(CapabilitySet const& available, CapabilitySet const& required, const CapabilityConjunctionSet*& outFailedAvailableSet);
-
- bool isExactSubset(CapabilitySet const& maybeSuperSet);
+ bool isBetterForTarget(CapabilitySet const& that, CapabilitySet const& targetCaps, bool& isEqual) const;
+
+ /// Find any capability sets which are in 'available' but not in 'required'. Return false if this situation occurs.
+ static bool checkCapabilityRequirement(CapabilitySet const& available, CapabilitySet const& required, CapabilityAtomSet& outFailedAvailableSet);
+
+ inline void addToTargetCapabilityWithValidUIntSetAndTargetAndStage(CapabilityName target, CapabilityName stage, CapabilityAtomSet setToAdd);
+ inline void addToTargetCapabilityWithTargetAndStageAtom(CapabilityName target, CapabilityName stage, const ArrayView<CapabilityName>& canonicalRepresentation);
+ inline void addToTargetCapabilityWithTargetAndOrStageAtom(CapabilityName target, CapabilityName stage, const ArrayView<CapabilityName>& canonicalRepresentation);
+ inline void addToTargetCapabilityWithStageAtom(CapabilityName stage, const ArrayView<CapabilityName>& canonicalRepresentation);
+ inline void addToTargetCapabilitesWithCanonicalRepresentation(const ArrayView<CapabilityName>& atom);
+ inline void addUnexpandedCapabilites(CapabilityName atom);
+
+ CapabilityTargetSets& getCapabilityTargetSets() { return m_targetSets; }
+ const CapabilityTargetSets& getCapabilityTargetSets() const { return m_targetSets; }
+
+ struct AtomSets
+ {
+ struct Iterator
+ {
+ private:
+ const CapabilityTargetSets* context;
+ CapabilityTargetSets::ConstIterator targetNode{};
+ CapabilityStageSets::ConstIterator stageNode{};
+ const std::optional<CapabilityAtomSet>* atomSetNode;
+
+ public:
+ operator bool() const
+ {
+ return atomSetNode->has_value();
+ }
+ const CapabilityAtomSet& operator*() const
+ {
+ return *(*this->atomSetNode);
+ }
+ const CapabilityAtomSet* operator->() const
+ {
+ return &(*(*this->atomSetNode));
+ }
+ bool operator==(const Iterator& other) const
+ {
+ return other.context == this->context
+ && other.targetNode == this->targetNode
+ && other.stageNode == this->stageNode
+ ;
+ }
+ bool operator!=(const Iterator& other) const
+ {
+ return !(other == *this);
+ }
+
+ Iterator& operator++()
+ {
+ for(;;)
+ {
+ this->stageNode++;
+ if (this->stageNode == (*this->targetNode).second.shaderStageSets.end())
+ {
+ for(;;)
+ {
+ this->targetNode++;
+ if (this->targetNode == this->context->end())
+ {
+ this->stageNode = {};
+ this->atomSetNode = {};
+ return *this;
+ }
+ this->stageNode = (*this->targetNode).second.shaderStageSets.begin();
+ if (this->stageNode == (*this->targetNode).second.shaderStageSets.end())
+ continue;
+ break;
+ }
+ }
+ if (!(*this->stageNode).second.atomSet)
+ continue;
+ this->atomSetNode = &(*this->stageNode).second.atomSet;
+ break;
+ }
+ return *this;
+ }
+ Iterator& operator++(int)
+ {
+ return ++(*this);
+ }
+ Iterator begin() const
+ {
+ Iterator tmp(this->context);
+ tmp.targetNode = this->context->begin();
+ if (tmp.targetNode == this->context->end())
+ return tmp;
+ tmp.stageNode = (*tmp.targetNode).second.shaderStageSets.begin();
+ if (tmp.stageNode == (*tmp.targetNode).second.shaderStageSets.end())
+ {
+ tmp++;
+ return tmp;
+ }
+ tmp.atomSetNode = &(*tmp.stageNode).second.atomSet;
+ if (!tmp.atomSetNode->has_value())
+ tmp++;
+ return tmp;
+ }
+ Iterator end() const
+ {
+ Iterator tmp(this->context);
+ tmp.targetNode = this->context->end();
+ return tmp;
+ }
+ Iterator(const CapabilityTargetSets* mainContext)
+ {
+ context = mainContext;
+ }
+ };
+ };
+ /// Get access to the raw atomic capabilities that define this set.
+ /// Get all bottom level UIntSets for each CapabilityTargetSet.
+ CapabilitySet::AtomSets::Iterator getAtomSets() const;
private:
- // The underlying representation we use is a list of conjunctions.
- //
- List<CapabilityConjunctionSet> m_conjunctions;
+ /// underlying data of CapabilitySet.
+ CapabilityTargetSets m_targetSets{};
void addCapability(CapabilityName name);
-};
-/// Are the `left` and `right` capability sets unequal?
-inline bool operator!=(CapabilitySet const& left, CapabilitySet const& right)
-{
- return !(left == right);
-}
+ bool hasSameTargets(const CapabilitySet& other) const;
+};
/// Returns true if atom is derived from base
bool isCapabilityDerivedFrom(CapabilityAtom atom, CapabilityAtom base);
@@ -262,4 +297,14 @@ bool isDirectChildOfAbstractAtom(CapabilityAtom name);
void printDiagnosticArg(StringBuilder& sb, CapabilityAtom atom);
void printDiagnosticArg(StringBuilder& sb, CapabilityName name);
+const CapabilityAtomSet& getAtomSetOfTargets();
+const CapabilityAtomSet& getAtomSetOfStages();
+
+bool hasTargetAtom(const CapabilityAtomSet& setIn, CapabilityAtom& targetAtom);
+
+//#define UNIT_TEST_CAPABILITIES
+#ifdef UNIT_TEST_CAPABILITIES
+void TEST_CapabilitySet();
+#endif
+
}
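`CapabilitySet::AtomSets::Iterator` in the header above flattens a target -> stage -> optional atom set hierarchy into one sequence, skipping targets without stages and stages without a set. The sketch below shows the same traversal order as a plain helper function, assuming C++17. `AtomSet`, `StageEntry`, `StageMap`, `TargetMap`, and `collectAtomSets` are hypothetical names for illustration and do not exist in Slang.

```cpp
#include <bitset>
#include <cassert>
#include <map>
#include <optional>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not the Slang types.
using AtomSet = std::bitset<64>;

struct StageEntry
{
    std::optional<AtomSet> atomSet;  // empty until a set is added
};

using StageMap = std::map<int, StageEntry>;  // stage atom -> entry
using TargetMap = std::map<int, StageMap>;   // target atom -> stages

// Collects a pointer to every populated atom set, ordered by target and then
// by stage. Targets with no stages and stages with no set are skipped.
// The returned pointers are only valid while `targets` is alive and unmodified.
inline std::vector<const AtomSet*> collectAtomSets(const TargetMap& targets)
{
    std::vector<const AtomSet*> out;
    for (const auto& [target, stages] : targets)
    {
        (void)target;  // only the nested stages matter here
        for (const auto& [stage, entry] : stages)
        {
            (void)stage;
            if (entry.atomSet)
                out.push_back(&*entry.atomSet);
        }
    }
    return out;
}
```

A hand-written iterator, as in the header, avoids the temporary vector; the helper trades that for simpler control flow.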
diff --git a/source/slang/slang-check-decl.cpp b/source/slang/slang-check-decl.cpp
index 9a4ee9d71..fbf5332b8 100644
--- a/source/slang/slang-check-decl.cpp
+++ b/source/slang/slang-check-decl.cpp
@@ -744,7 +744,7 @@ namespace Slang
void visitInheritanceDecl(InheritanceDecl* inheritanceDecl);
- void diagnoseUndeclaredCapability(Decl* decl, const DiagnosticInfo& diagnosticInfo, const CapabilityConjunctionSet* failedAvailableSet);
+ void diagnoseUndeclaredCapability(Decl* decl, const DiagnosticInfo& diagnosticInfo, const CapabilityAtomSet& failedAtomsInsideAvailableSet);
};
@@ -9932,11 +9932,17 @@ namespace Slang
oldCaps);
}
}
+
+ // if stmt inside parent, set the provenance tracker to the calling function
+ if(!decl)
+ decl = visitor->getParentFuncOfVisitor();
if (referencedDecl && decl)
{
- for (auto& capSet : nodeCaps.getExpandedAtoms())
+ for (auto& capSet : nodeCaps.getAtomSets())
{
- for (auto atom : capSet.getExpandedAtoms())
+ auto elements = capSet.getElements<CapabilityAtom>();
+ decl->capabilityRequirementProvenance.reserve(decl->capabilityRequirementProvenance.getCount()+elements.getCount());
+ for (auto atom : elements)
{
decl->capabilityRequirementProvenance.addIfNotExists(atom, DeclReferenceWithLoc{ referencedDecl, referenceLoc });
}
@@ -10008,9 +10014,9 @@ namespace Slang
}
if (!maybeRequireCapability)
- targetCap = (CapabilitySet(CapabilityName::any_target).getTargetsThisIsMissingFromOther(set));
+ targetCap = (CapabilitySet(CapabilityName::any_target).getTargetsThisHasButOtherDoesNot(set));
else
- targetCap = (maybeRequireCapability->capabilitySet.getTargetsThisIsMissingFromOther(set));
+ targetCap = (maybeRequireCapability->capabilitySet.getTargetsThisHasButOtherDoesNot(set));
}
else
{
@@ -10024,10 +10030,8 @@ namespace Slang
{
diagnoseCapabilityErrors(Base::getSink(), outerContext.getOptionSet(), targetCase->body->loc, Diagnostics::conflictingCapabilityDueToStatement, bodyCap, "target_switch", oldCap);
}
- for (auto& conjunction : targetCap.getExpandedAtoms())
- set.unionWith(conjunction);
+ set.unionWith(targetCap);
}
- set.canonicalize();
handleReferenceFunc(stmt, set, stmt->loc);
}
@@ -10092,8 +10096,7 @@ namespace Slang
{
for (auto decoration : parent->getModifiersOfType<RequireCapabilityAttribute>())
{
- for (auto& set : decoration->capabilitySet.getExpandedAtoms())
- localDeclaredCaps.unionWith(set);
+ localDeclaredCaps.unionWith(decoration->capabilitySet);
}
}
else
@@ -10102,13 +10105,8 @@ namespace Slang
shouldBreak = true;
}
// Merge decl's capability declaration with the parent.
- for (auto& localConjunction : localDeclaredCaps.getExpandedAtoms())
- {
- if (declaredCaps.isIncompatibleWith(localConjunction))
- declaredCaps.unionWith(localConjunction);
- else
- declaredCaps.join(localDeclaredCaps);
- }
+ declaredCaps.nonDestructiveJoin(localDeclaredCaps);
+
// If the parent already has inferred capability requirements, we should stop now
// since that already covers transitive parents.
if (shouldBreak)
@@ -10127,27 +10125,37 @@ namespace Slang
decl->inferredCapabilityRequirements = getDeclaredCapabilitySet(decl);
}
- void SemanticsDeclCapabilityVisitor::visitFunctionDeclBase(FunctionDeclBase* funcDecl)
+ template<typename ProcessFunc>
+ static inline void _dispatchCapabilitiesVisitorOfFunctionDecl(SemanticsVisitor* visitor, FunctionDeclBase* funcDecl, ProcessFunc propegateFuncForReferences)
{
+ visitor->setParentFuncOfVisitor(funcDecl);
+
for (auto member : funcDecl->members)
{
- ensureDecl(member, DeclCheckState::CapabilityChecked);
- _propagateRequirement(this, funcDecl->inferredCapabilityRequirements, funcDecl, member, member->inferredCapabilityRequirements, member->loc);
+ visitor->ensureDecl(member, DeclCheckState::CapabilityChecked);
+ _propagateRequirement(visitor, funcDecl->inferredCapabilityRequirements, funcDecl, member, member->inferredCapabilityRequirements, member->loc);
}
- visitReferencedDecls(*this, funcDecl->body, funcDecl->loc, funcDecl->findModifier<RequireCapabilityAttribute>(), [this, funcDecl](SyntaxNode* node, const CapabilitySet& nodeCaps, SourceLoc refLoc)
- {
- _propagateRequirement(this, funcDecl->inferredCapabilityRequirements, funcDecl, node, nodeCaps, refLoc);
- });
+
+ visitReferencedDecls(*visitor, funcDecl->body, funcDecl->loc, funcDecl->findModifier<RequireCapabilityAttribute>(), propegateFuncForReferences);
if (!isEffectivelyStatic(funcDecl))
{
auto parentAggTypeDecl = getParentAggTypeDecl(funcDecl);
if (parentAggTypeDecl)
{
- ensureDecl(parentAggTypeDecl, DeclCheckState::CapabilityChecked);
- _propagateRequirement(this, funcDecl->inferredCapabilityRequirements, funcDecl, parentAggTypeDecl, parentAggTypeDecl->inferredCapabilityRequirements, funcDecl->loc);
+ visitor->ensureDecl(parentAggTypeDecl, DeclCheckState::CapabilityChecked);
+ _propagateRequirement(visitor, funcDecl->inferredCapabilityRequirements, funcDecl, parentAggTypeDecl, parentAggTypeDecl->inferredCapabilityRequirements, funcDecl->loc);
}
}
+ }
+
+ void SemanticsDeclCapabilityVisitor::visitFunctionDeclBase(FunctionDeclBase* funcDecl)
+ {
+ _dispatchCapabilitiesVisitorOfFunctionDecl(this, funcDecl,
+ [this, funcDecl](SyntaxNode* node, const CapabilitySet& nodeCaps, SourceLoc refLoc)
+ {
+ _propagateRequirement(this, funcDecl->inferredCapabilityRequirements, funcDecl, node, nodeCaps, refLoc);
+ });
auto declaredCaps = getDeclaredCapabilitySet(funcDecl);
@@ -10169,26 +10177,12 @@ namespace Slang
}
auto vis = getDeclVisibility(funcDecl);
+
+ // If no capabilities were annotated on a function, capabilities are inferred from the function body.
if (declaredCaps.isEmpty())
{
- // If the user has not declared any capabilities,
- // we should diagnose a warning if any_target is not
- // a super-set by exact atoms.
- if (vis == DeclVisibility::Public && !funcDecl->inferredCapabilityRequirements.isEmpty())
- {
- if (!getModuleDecl(funcDecl)->isInLegacyLanguage)
- {
- if (!funcDecl->inferredCapabilityRequirements.isExactSubset(getAnyPlatformCapabilitySet()))
- {
- diagnoseCapabilityErrors(
- getSink(),
- this->getOptionSet(),
- funcDecl->loc,
- Diagnostics::missingCapabilityRequirementOnPublicDecl,
- funcDecl, funcDecl->inferredCapabilityRequirements);
- }
- }
- }
+ declaredCaps = funcDecl->inferredCapabilityRequirements;
+ return;
}
else
{
@@ -10199,7 +10193,7 @@ namespace Slang
// At a minimum we will propagate shader requirements to our
// function from calling children in all cases so the parent
// can enforce shader targets correctly and propagate to `main`
- const CapabilityConjunctionSet* failedAvailableCapabilityConjunction = nullptr;
+ CapabilityAtomSet failedAvailableCapabilityConjunction;
if (!CapabilitySet::checkCapabilityRequirement(
declaredCaps,
funcDecl->inferredCapabilityRequirements,
@@ -10209,7 +10203,7 @@ namespace Slang
funcDecl->inferredCapabilityRequirements = declaredCaps;
}
else
- funcDecl->inferredCapabilityRequirements.simpleJoinWithSetMask(declaredCaps, CapabilityName::stage);
+ funcDecl->inferredCapabilityRequirements.nonDestructiveJoin(declaredCaps);
}
else
{
@@ -10241,7 +10235,7 @@ namespace Slang
ensureDecl(requirementDecl, DeclCheckState::CapabilityChecked);
ensureDecl(implDecl.declRefBase, DeclCheckState::CapabilityChecked);
- const CapabilityConjunctionSet* failedAvailableCapabilityConjunction = nullptr;
+ CapabilityAtomSet failedAvailableCapabilityConjunction;
if (!CapabilitySet::checkCapabilityRequirement(
requirementDecl->inferredCapabilityRequirements,
implDecl.getDecl()->inferredCapabilityRequirements,
@@ -10303,7 +10297,7 @@ namespace Slang
return defaultVis;
}
- void diagnoseCapabilityProvenance(CompilerOptionSet& optionSet, DiagnosticSink* sink, Decl* decl, CapabilityAtom missingAtom)
+ void diagnoseCapabilityProvenance(CompilerOptionSet& optionSet, DiagnosticSink* sink, Decl* decl, CapabilityAtom atomToFind, bool optionallyNeverPrintDecl)
{
HashSet<Decl*> printedDecls;
auto thisModule = getModuleDecl(decl);
@@ -10311,9 +10305,9 @@ namespace Slang
while (declToPrint)
{
printedDecls.add(declToPrint);
- if (auto provenance = declToPrint->capabilityRequirementProvenance.tryGetValue(missingAtom))
+ if (auto provenance = declToPrint->capabilityRequirementProvenance.tryGetValue(atomToFind))
{
- sink->diagnose(provenance->referenceLoc, Diagnostics::seeUsingOf, provenance->referencedDecl);
+ diagnoseCapabilityErrors(sink, optionSet, provenance->referenceLoc, Diagnostics::seeUsingOf, provenance->referencedDecl);
declToPrint = provenance->referencedDecl;
if (printedDecls.contains(declToPrint))
break;
@@ -10332,54 +10326,17 @@ namespace Slang
break;
}
}
- if (declToPrint)
+ if (declToPrint && !optionallyNeverPrintDecl)
{
diagnoseCapabilityErrors(sink, optionSet, declToPrint->loc, Diagnostics::seeDefinitionOf, declToPrint);
}
}
- // Print diagnostics tracing which referenced decls are not compatible with the given atom.
- void diagnoseIncompatibleAtomProvenance(SemanticsVisitor* visitor, DiagnosticSink* sink, Decl* decl, CapabilityAtom incompatibleAtom, int traceLevels = 10)
+ void SemanticsDeclCapabilityVisitor::diagnoseUndeclaredCapability(Decl* decl, const DiagnosticInfo& diagnosticInfo, const CapabilityAtomSet& failedAtomsInsideAvailableSet)
{
- Decl* refDecl = nullptr;
- SourceLoc loc;
- HashSet<Decl*> printedDecls;
- while (traceLevels > 0)
- {
- refDecl = nullptr;
- visitReferencedDecls(*visitor, decl, decl->loc, decl->findModifier<RequireCapabilityAttribute>(), [&](SyntaxNode* node, const CapabilitySet& nodeCaps, SourceLoc refLoc)
- {
- if (nodeCaps.isIncompatibleWith(incompatibleAtom))
- {
- if (auto referencedDecl = as<Decl>(node))
- {
- refDecl = referencedDecl;
- loc = refLoc;
- }
- else
- diagnoseCapabilityErrors(sink, visitor->getOptionSet(), refLoc, Diagnostics::seeDefinitionOf, "statement");
- }
- });
- if (!refDecl)
- break;
- if (printedDecls.add(refDecl))
- {
- diagnoseCapabilityErrors(sink, visitor->getOptionSet(), loc, Diagnostics::seeUsingOf, refDecl);
- decl = refDecl;
- }
- else
- {
- break;
- }
- traceLevels--;
- }
- }
-
- void SemanticsDeclCapabilityVisitor::diagnoseUndeclaredCapability(Decl* decl, const DiagnosticInfo& diagnosticInfo, const CapabilityConjunctionSet* failedAvailableSet)
- {
- if (decl->inferredCapabilityRequirements.getExpandedAtoms().getCount() == 0)
+ if (decl->inferredCapabilityRequirements.isEmpty())
return;
- if(!failedAvailableSet)
+ if(failedAtomsInsideAvailableSet.isEmpty() || failedAtomsInsideAvailableSet.contains((UInt)CapabilityAtom::Invalid))
return;
// There are two causes for why type checking failed on failedAvailableSet.
@@ -10394,90 +10351,51 @@ namespace Slang
// }
// In this case we should diagnose an error reporting that printf isn't defined on a required target.
//
- // The second scenario is when the callee is using a capability that is not provided by the requirement.
- // For example:
- // [require(hlsl,b,c)]
- // void caller()
- // {
- // useD(); // require capability (hlsl,d)
- // }
- // In this case we should report that useD() is using a capability that is not declared by caller.
- //
-
// Now, we detect if we are case 1.
- if (decl->inferredCapabilityRequirements.isIncompatibleWith(*failedAvailableSet))
+
{
- // Find the most derived atom that is leading to the incompatiblity.
- for (Index i = failedAvailableSet->getExpandedAtoms().getCount() - 1; i >= 0; i--)
+ CapabilityAtom outFailedAtom{};
+ if (hasTargetAtom(failedAtomsInsideAvailableSet, outFailedAtom))
{
- auto atom = failedAvailableSet->getExpandedAtoms()[i];
- if (!isDirectChildOfAbstractAtom(atom))
- continue;
- if (decl->inferredCapabilityRequirements.isIncompatibleWith(atom))
+ diagnoseCapabilityErrors(getSink(), this->getOptionSet(), decl->loc, Diagnostics::declHasDependenciesNotCompatibleOnTarget, decl, outFailedAtom);
+
+ // Anything defined on a non-failed target atom may be the reason the failed target capability is missing.
+ // Print out all possible culprits.
+ CapabilityAtomSet failedAtomSet;
+ failedAtomSet.add((UInt)outFailedAtom);
+ CapabilityAtomSet targetsNotUsedSet;
+ CapabilityAtomSet::calcSubtract(targetsNotUsedSet, getAtomSetOfTargets(), failedAtomSet);
+
+ for (auto atom : targetsNotUsedSet)
{
- diagnoseCapabilityErrors(getSink(), this->getOptionSet(), decl->loc, Diagnostics::declHasDependenciesNotDefinedOnTarget, decl, atom);
- diagnoseIncompatibleAtomProvenance(this, getSink(), decl, atom);
- return;
+ CapabilityAtom formattedAtom = (CapabilityAtom)atom;
+ diagnoseCapabilityProvenance(this->getOptionSet(), getSink(), decl, formattedAtom, true);
}
+ return;
}
- return;
}
- // If we reach here, we are case 2.
+ //// The second scenario is when the callee is using a capability that is not provided by the requirement.
+ //// For example:
+ //// [require(hlsl,b,c)]
+ //// void caller()
+ //// {
+ //// useD(); // require capability (hlsl,d)
+ //// }
+ //// In this case we should report that useD() is using a capability that is not declared by caller.
+ ////
- CapabilityConjunctionSet* matchingRequirement = &decl->inferredCapabilityRequirements.getExpandedAtoms().getFirst();
- CapabilityAtom missingAtom = matchingRequirement->getExpandedAtoms().getFirst();
- if (missingAtom == CapabilityAtom::Invalid)
- return;
+ //// If we reach here, we are case 2.
- if (failedAvailableSet)
+ // Report every failed atom. This is important since the provenance of multiple atoms
+ // can come from multiple referenced items in a function body.
+ for (auto i : failedAtomsInsideAvailableSet)
{
- Int maxIntersectionCount = 0;
- for (auto& usedSet : decl->inferredCapabilityRequirements.getExpandedAtoms())
- {
- auto intersection = usedSet.countIntersectionWith(*failedAvailableSet);
- if (intersection > maxIntersectionCount)
- {
- matchingRequirement = &usedSet;
- maxIntersectionCount = intersection;
- }
- }
- Index pos = 0;
- for (Index i = 0; i < matchingRequirement->getExpandedAtoms().getCount(); i++)
- {
- auto atom = matchingRequirement->getExpandedAtoms()[i];
- while (pos < failedAvailableSet->getExpandedAtoms().getCount())
- {
- if (failedAvailableSet->getExpandedAtoms()[pos] < atom)
- pos++;
- else
- break;
- }
-
- if (pos >= failedAvailableSet->getExpandedAtoms().getCount() ||
- failedAvailableSet->getExpandedAtoms()[pos] != atom)
- {
- missingAtom = atom;
- break;
- }
- }
-
- // Select the most derived atom of `missingAtom`.
- for (Index i = matchingRequirement->getExpandedAtoms().getCount() - 1; i >= 0 ; i--)
- {
- auto atom = matchingRequirement->getExpandedAtoms()[i];
- if (CapabilityConjunctionSet(atom).implies(missingAtom))
- {
- missingAtom = atom;
- break;
- }
- }
+ CapabilityAtom formattedAtom = (CapabilityAtom)i;
+ diagnoseCapabilityErrors(getSink(), this->getOptionSet(), decl->loc, diagnosticInfo, decl, formattedAtom);
+ // Print provenances.
+ diagnoseCapabilityProvenance(this->getOptionSet(), getSink(), decl, formattedAtom);
}
-
- diagnoseCapabilityErrors(getSink(), this->getOptionSet(), decl->loc, diagnosticInfo, decl, missingAtom);
-
- // Print provenances.
- diagnoseCapabilityProvenance(this->getOptionSet(), getSink(), decl, missingAtom);
}
}
diff --git a/source/slang/slang-check-impl.h b/source/slang/slang-check-impl.h
index 20139b4e4..569e27e7c 100644
--- a/source/slang/slang-check-impl.h
+++ b/source/slang/slang-check-impl.h
@@ -825,6 +825,9 @@ namespace Slang
return result;
}
+ FunctionDeclBase* getParentFuncOfVisitor() { return m_parentFunc; }
+ void setParentFuncOfVisitor(FunctionDeclBase* funcDecl) { m_parentFunc = funcDecl; }
+
SemanticsContext withParentFunc(FunctionDeclBase* parentFunc)
{
SemanticsContext result(*this);
@@ -2786,7 +2789,7 @@ namespace Slang
DeclVisibility getDeclVisibility(Decl* decl);
- void diagnoseCapabilityProvenance(CompilerOptionSet& optionSet, DiagnosticSink* sink, Decl* decl, CapabilityAtom missingAtom);
+ void diagnoseCapabilityProvenance(CompilerOptionSet& optionSet, DiagnosticSink* sink, Decl* decl, CapabilityAtom atomToFind, bool optionallyNeverPrintDecl = false);
void _ensureAllDeclsRec(
SemanticsDeclVisitorBase* visitor,
diff --git a/source/slang/slang-check-modifier.cpp b/source/slang/slang-check-modifier.cpp
index b8ff1a116..aa30f66ca 100644
--- a/source/slang/slang-check-modifier.cpp
+++ b/source/slang/slang-check-modifier.cpp
@@ -1641,8 +1641,7 @@ namespace Slang
previous = m;
continue;
}
- for(auto& con : req->capabilitySet.getExpandedAtoms())
- firstRequire->capabilitySet.unionWith(con);
+ firstRequire->capabilitySet.unionWith(req->capabilitySet);
if(previous)
previous->next = next;
continue;
diff --git a/source/slang/slang-check-shader.cpp b/source/slang/slang-check-shader.cpp
index 2c1f8651c..7a6deed5c 100644
--- a/source/slang/slang-check-shader.cpp
+++ b/source/slang/slang-check-shader.cpp
@@ -520,29 +520,27 @@ namespace Slang
if (targetCaps.isIncompatibleWith(entryPointFuncDecl->inferredCapabilityRequirements))
{
diagnoseCapabilityErrors(sink, linkage->m_optionSet, entryPointFuncDecl, Diagnostics::entryPointUsesUnavailableCapability, entryPointFuncDecl, entryPointFuncDecl->inferredCapabilityRequirements, targetCaps);
- auto& interredCapConjunctions = entryPointFuncDecl->inferredCapabilityRequirements.getExpandedAtoms();
-
+
// Find out what exactly is incompatible and print out a trace of provenance to
// help user diagnose their code.
- auto& conjunctions = targetCaps.getExpandedAtoms();
- if (conjunctions.getCount() == 1 && interredCapConjunctions.getCount() == 1)
+ // TODO: provenance should have a way to filter for provenances that are missing capability set X from their caps; otherwise big functions produce junk errors.
+ // This is specifically a problem when a function is missing a target but otherwise has identical capabilities.
+
+ const auto& interredCapConjunctions = entryPointFuncDecl->inferredCapabilityRequirements.getAtomSets();
+ const auto& compileCaps = targetCaps.getAtomSets();
+ if (compileCaps && interredCapConjunctions)
{
- for (auto atom : conjunctions[0].getExpandedAtoms())
+ for (auto inferredAtom : *interredCapConjunctions.begin())
{
- for (auto inferredAtom : interredCapConjunctions[0].getExpandedAtoms())
+ CapabilityAtom inferredAtomFormatted = (CapabilityAtom)inferredAtom;
+ if (!compileCaps->contains((UInt)inferredAtom))
{
- if (CapabilityConjunctionSet(inferredAtom).isIncompatibleWith(atom))
- {
- diagnoseCapabilityProvenance(linkage->m_optionSet, sink, entryPointFuncDecl, inferredAtom);
- goto breakLabel;
- }
+ diagnoseCapabilityProvenance(linkage->m_optionSet, sink, entryPointFuncDecl, inferredAtomFormatted);
}
}
}
}
}
- breakLabel:;
-
}
// Given an entry point specified via API or command line options,
diff --git a/source/slang/slang-check-stmt.cpp b/source/slang/slang-check-stmt.cpp
index 89ec82e48..ae817f867 100644
--- a/source/slang/slang-check-stmt.cpp
+++ b/source/slang/slang-check-stmt.cpp
@@ -340,7 +340,7 @@ namespace Slang
}
if (stmt->capabilityToken.getContentLength() != 0 &&
- (set.getExpandedAtoms().getCount() != 1 || set.isInvalid() || set.isEmpty()))
+ (set.getCapabilityTargetSets().getCount() != 1 || set.isInvalid() || set.isEmpty()))
{
getSink()->diagnose(
stmt->capabilityToken.loc,
diff --git a/source/slang/slang-compiler.cpp b/source/slang/slang-compiler.cpp
index b2b765c0e..5ef9a50b1 100644
--- a/source/slang/slang-compiler.cpp
+++ b/source/slang/slang-compiler.cpp
@@ -614,11 +614,11 @@ namespace Slang
GLSLExtensionTracker* extensionTracker,
CapabilitySet const& caps)
{
- for( auto conjunctions : caps.getExpandedAtoms() )
+ for(auto& conjunctions : caps.getAtomSets() )
{
- for (auto atom : conjunctions.getExpandedAtoms())
+ for (auto atom : conjunctions)
{
- switch (atom)
+ switch ((CapabilityAtom)atom)
{
default:
break;
diff --git a/source/slang/slang-diagnostic-defs.h b/source/slang/slang-diagnostic-defs.h
index 85989b26d..9ba1e4724 100644
--- a/source/slang/slang-diagnostic-defs.h
+++ b/source/slang/slang-diagnostic-defs.h
@@ -387,7 +387,7 @@ DIAGNOSTIC(36104, Error, useOfUndeclaredCapabilityOfInterfaceRequirement, "'$0'
DIAGNOSTIC(36105, Error, unknownCapability, "unknown capability name '$0'.")
DIAGNOSTIC(36106, Error, expectCapability, "expect a capability name.")
DIAGNOSTIC(36107, Error, entryPointUsesUnavailableCapability, "entrypoint '$0' requires capability '$1', which is incompatible with the current compilation target '$2'.")
-DIAGNOSTIC(36108, Error, declHasDependenciesNotDefinedOnTarget, "'$0' has dependencies that are not defined on the required target '$1'.")
+DIAGNOSTIC(36108, Error, declHasDependenciesNotCompatibleOnTarget, "'$0' has dependencies that are not compatible on the required target '$1'.")
DIAGNOSTIC(36109, Error, invalidTargetSwitchCase, "'$0' cannot be used as a target_switch case.")
DIAGNOSTIC(36110, Error, stageIsInCompatibleWithCapabilityDefinition, "'$0' is defined for stage '$1', which is incompatible with the declared capability set '$2'.")
@@ -725,6 +725,7 @@ DIAGNOSTIC(41000, Warning, unreachableCode, "unreachable code detected")
DIAGNOSTIC(41001, Error, recursiveType, "type '$0' contains cyclic reference to itself.")
DIAGNOSTIC(41010, Warning, missingReturn, "control flow may reach end of non-'void' function")
+DIAGNOSTIC(41011, Error, profileIncompatibleWithTargetSwitch, "__target_switch has no compatible target with current profile '$0'")
DIAGNOSTIC(41015, Error, usingUninitializedValue, "use of uninitialized value '$0'")
DIAGNOSTIC(41016, Warning, returningWithUninitializedOut, "returning without initializing out parameter '$0'")
DIAGNOSTIC(41017, Warning, returningWithPartiallyUninitializedOut, "returning without fully initializing out parameter '$0'")
diff --git a/source/slang/slang-ir-link.cpp b/source/slang/slang-ir-link.cpp
index e652745e7..26d96690f 100644
--- a/source/slang/slang-ir-link.cpp
+++ b/source/slang/slang-ir-link.cpp
@@ -1135,8 +1135,10 @@ bool isBetterForTarget(
if(newCaps.isInvalid()) return false;
if(oldCaps.isInvalid()) return true;
- if(newCaps != oldCaps)
- return newCaps.implies(oldCaps);
+ bool isEqual = false;
+ bool isNewBetter = newCaps.isBetterForTarget(oldCaps, targetCaps, isEqual);
+ if(!isEqual)
+ return isNewBetter;
// All preceding factors being equal, an `[export]` is better
// than an `[import]`.
@@ -1882,7 +1884,7 @@ LinkedIR linkIR(
}
// Specialize target_switch branches to use the best branch for the target.
- specializeTargetSwitch(targetReq, state->irModule);
+ specializeTargetSwitch(targetReq, state->irModule, codeGenContext->getSink());
// Diagnose on unresolved symbols if we are compiling into a target that does
// not allow incomplete symbols.
diff --git a/source/slang/slang-ir-specialize-target-switch.cpp b/source/slang/slang-ir-specialize-target-switch.cpp
index f4cb6bfa7..fac1dd484 100644
--- a/source/slang/slang-ir-specialize-target-switch.cpp
+++ b/source/slang/slang-ir-specialize-target-switch.cpp
@@ -7,13 +7,15 @@
namespace Slang
{
- void specializeTargetSwitch(TargetRequest* target, IRGlobalValueWithCode* code)
+ void specializeTargetSwitch(TargetRequest* target, IRGlobalValueWithCode* code, DiagnosticSink* sink)
{
bool changed = false;
for (auto block : code->getBlocks())
{
+ bool failedImplies = false;
if (auto targetSwitch = as<IRTargetSwitch>(block->getTerminator()))
{
+ bool isEqual;
CapabilitySet bestCapSet = CapabilitySet::makeInvalid();
IRBlock* targetBlock = nullptr;
for (UInt i = 0; i < targetSwitch->getCaseCount(); i++)
@@ -22,14 +24,22 @@ namespace Slang
if (target->getTargetCaps().isIncompatibleWith(cap))
continue;
CapabilitySet capSet;
- if (cap == CapabilityName::Invalid)
+ if (cap == CapabilityName::Invalid) // `default` case
capSet = CapabilitySet::makeEmpty();
else
capSet = CapabilitySet(cap);
- if (capSet.isBetterForTarget(bestCapSet, target->getTargetCaps()))
+ bool isBetterForTarget = capSet.isBetterForTarget(bestCapSet, target->getTargetCaps(), isEqual);
+ if (isBetterForTarget)
{
- targetBlock = targetSwitch->getCaseBlock(i);
- bestCapSet = capSet;
+ bool targetImpliesCapSet = (target->getTargetCaps().implies(capSet, true) || capSet.isEmpty());
+ if (targetImpliesCapSet)
+ {
+ // Now check if bestCapSet contains targetCaps. If it does not then this is an invalid target
+ targetBlock = targetSwitch->getCaseBlock(i);
+ bestCapSet = capSet;
+ }
+ else
+ failedImplies = true;
}
}
IRBuilder builder(targetSwitch);
@@ -40,6 +50,10 @@ namespace Slang
}
else
{
+ // Only error if we had the chance of setting a valid target switch but did not, due to an incompatibility within the same `target` atom.
+ // Otherwise we would have an issue when processing a `__target_switch() { case metal: return; }` for glsl targets.
+ if(failedImplies)
+ sink->diagnose(targetSwitch->sourceLoc, Diagnostics::profileIncompatibleWithTargetSwitch, target->getTargetCaps());
builder.emitMissingReturn();
}
targetSwitch->removeAndDeallocate();
@@ -53,19 +67,19 @@ namespace Slang
}
}
- void specializeTargetSwitch(TargetRequest* target, IRModule* module)
+ void specializeTargetSwitch(TargetRequest* target, IRModule* module, DiagnosticSink* sink)
{
for (auto globalInst : module->getGlobalInsts())
{
if (auto code = as<IRGlobalValueWithCode>(globalInst))
{
- specializeTargetSwitch(target, code);
+ specializeTargetSwitch(target, code, sink);
if (auto gen = as<IRGeneric>(code))
{
auto retVal = findGenericReturnVal(gen);
if (auto innerCode = as<IRGlobalValueWithCode>(retVal))
{
- specializeTargetSwitch(target, innerCode);
+ specializeTargetSwitch(target, innerCode, sink);
}
}
}
diff --git a/source/slang/slang-ir-specialize-target-switch.h b/source/slang/slang-ir-specialize-target-switch.h
index 91071cec6..03fd7d85a 100644
--- a/source/slang/slang-ir-specialize-target-switch.h
+++ b/source/slang/slang-ir-specialize-target-switch.h
@@ -5,10 +5,11 @@ namespace Slang
{
struct IRModule;
class TargetRequest;
+ class DiagnosticSink;
// Replace all target_switch insts with the case that matches current target.
//
- void specializeTargetSwitch(TargetRequest* target, IRModule* module);
+ void specializeTargetSwitch(TargetRequest* target, IRModule* module, DiagnosticSink* sink);
}
diff --git a/source/slang/slang-ir.cpp b/source/slang/slang-ir.cpp
index b53bfc9d1..c0bec9654 100644
--- a/source/slang/slang-ir.cpp
+++ b/source/slang/slang-ir.cpp
@@ -334,7 +334,7 @@ namespace Slang
for (Index i = 0; i < count; ++i)
{
auto operand = cast<IRCapabilitySet>(getOperand(i));
- result.getExpandedAtoms().addRange(operand->getCaps().getExpandedAtoms());
+ result.unionWith(operand->getCaps());
}
return result;
}
@@ -2440,13 +2440,12 @@ namespace Slang
// be a minimal list of atoms such that they will produce
// the same `CapabilitySet` when expanded.
- List<List<CapabilityAtom>> compactedAtoms;
- caps.calcCompactedAtoms(compactedAtoms);
+ auto compactedAtoms = caps.getAtomSets();
List<IRInst*> conjunctions;
- for( auto atomConjunction : compactedAtoms )
+ for( auto& atomConjunctionSet : compactedAtoms )
{
List<IRInst*> args;
- for (auto atom : atomConjunction)
+ for (auto atom : atomConjunctionSet)
args.add(getIntValue(capabilityAtomType, Int(atom)));
auto conjunctionInst = createIntrinsicInst(
capabilitySetType, kIROp_CapabilityConjunction, args.getCount(), args.getBuffer());
@@ -8284,7 +8283,8 @@ namespace Slang
continue;
}
- if(!bestDecoration || decorationCaps.isBetterForTarget(bestCaps, targetCaps))
+ bool isEqual;
+ if(!bestDecoration || decorationCaps.isBetterForTarget(bestCaps, targetCaps, isEqual))
{
bestDecoration = decoration;
bestCaps = decorationCaps;
diff --git a/source/slang/slang-serialize-ast-type-info.h b/source/slang/slang-serialize-ast-type-info.h
index 96c8a438f..1d7628cd4 100644
--- a/source/slang/slang-serialize-ast-type-info.h
+++ b/source/slang/slang-serialize-ast-type-info.h
@@ -78,47 +78,178 @@ struct PtrSerialTypeInfo<T, std::enable_if_t<std::is_base_of_v<Val, T>>>
template <typename T>
struct SerialTypeInfo<DeclRef<T>> : public SerialTypeInfo<DeclRefBase*> {};
+// UIntSet
+
template<>
-struct SerialTypeInfo<CapabilitySet>
+struct SerialTypeInfo<CapabilityAtomSet>
{
- typedef CapabilitySet NativeType;
+ typedef CapabilityAtomSet NativeType;
typedef SerialIndex SerialType;
- enum { SerialAlignment = SLANG_ALIGN_OF(SerialType) };
+ enum { SerialAlignment = SLANG_ALIGN_OF(SerialIndex) };
+ static void toSerial(SerialWriter* writer, const void* native, void* serial)
+ {
+ auto& src = *(NativeType*)native;
+ auto& dst = *(SerialType*)serial;
+
+ dst = writer->addArray(src.getBuffer().getBuffer(), src.getBuffer().getCount());
+ }
+ static void toNative(SerialReader* reader, const void* serial, void* native)
+ {
+ auto& dst = *(NativeType*)native;
+ auto& src = *(const SerialType*)serial;
+
+ List<CapabilityAtomSet::Element> UIntSetBuffer;
+ reader->getArray(src, UIntSetBuffer);
+
+ dst = CapabilityAtomSet(UIntSetBuffer);
+ }
+};
+
+// ~UIntSet
+
+template<>
+struct SerialTypeInfo<CapabilityStageSet>
+{
+ struct SerialType
+ {
+ SerialIndex stage;
+ SerialIndex atomSet;
+ };
+
+ typedef CapabilityStageSet NativeType;
+ enum { SerialAlignment = SLANG_ALIGN_OF(SerialIndex) };
static void toSerial(SerialWriter* writer, const void* native, void* serial)
{
auto& src = *(const NativeType*)native;
auto& dst = *(SerialType*)serial;
- dst = writer->addArray(src.getExpandedAtoms().getBuffer(), src.getExpandedAtoms().getCount());
+ List<SerialTypeInfo<CapabilityStageSet>::SerialType> SatomSetsList;
+ SatomSetsList.setCount(src.atomSet.has_value());
+
+ if(src.atomSet)
+ {
+ auto& i = src.atomSet.value();
+ SerialTypeInfo<CapabilityAtomSet>::toSerial(writer, &i, &SatomSetsList[0]);
+ }
+
+ SerialTypeInfo<CapabilityAtom>::toSerial(writer, &src.stage, &dst.stage);
+ dst.atomSet = writer->addSerialArray<CapabilityStageSet>(SatomSetsList.getBuffer(), SatomSetsList.getCount());
}
static void toNative(SerialReader* reader, const void* serial, void* native)
{
auto& dst = *(NativeType*)native;
auto& src = *(const SerialType*)serial;
- reader->getArray(src, dst.getExpandedAtoms());
+ CapabilityAtom stage;
+ List<CapabilityAtomSet> items;
+ SerialTypeInfo<CapabilityAtom>::toNative(reader, &src.stage, &stage);
+ reader->getArray(src.atomSet, items);
+
+ dst.stage = stage;
+
+ for (auto i : items)
+ {
+ dst.addNewSet(std::move(i));
+ }
}
};
template<>
-struct SerialTypeInfo<CapabilityConjunctionSet>
+struct SerialTypeInfo<CapabilityTargetSet>
{
- typedef CapabilityConjunctionSet NativeType;
- typedef SerialIndex SerialType;
- enum { SerialAlignment = SLANG_ALIGN_OF(SerialType) };
+ struct SerialType
+ {
+ SerialIndex target;
+ SerialIndex shaderStageSets;
+ };
+
+ typedef CapabilityTargetSet NativeType;
+ enum { SerialAlignment = SLANG_ALIGN_OF(SerialIndex) };
static void toSerial(SerialWriter* writer, const void* native, void* serial)
{
auto& src = *(const NativeType*)native;
auto& dst = *(SerialType*)serial;
- dst = writer->addArray(src.getExpandedAtoms().getBuffer(), src.getExpandedAtoms().getCount());
+ List<SerialTypeInfo<CapabilityStageSet>::SerialType> SStageSetList;
+ SStageSetList.setCount(src.shaderStageSets.getCount());
+ Index iter = 0;
+ for (auto& i : src.shaderStageSets)
+ {
+ SerialTypeInfo<CapabilityStageSet>::toSerial(writer, &i.second, &SStageSetList[iter]);
+ iter++;
+ }
+
+ SerialTypeInfo<CapabilityAtom>::toSerial(writer, &src.target, &dst.target);
+ dst.shaderStageSets = writer->addSerialArray<CapabilityStageSet>(SStageSetList.getBuffer(), SStageSetList.getCount());
}
static void toNative(SerialReader* reader, const void* serial, void* native)
{
auto& dst = *(NativeType*)native;
auto& src = *(const SerialType*)serial;
- reader->getArray(src, dst.getExpandedAtoms());
+ CapabilityAtom target;
+ List<CapabilityStageSet> items;
+ SerialTypeInfo<CapabilityAtom>::toNative(reader, &src.target, &target);
+ reader->getArray(src.shaderStageSets, items);
+
+ dst.target = target;
+
+ auto& shaderStageSets = dst.shaderStageSets;
+ shaderStageSets.clear();
+ shaderStageSets.reserve(items.getCount());
+ Index iter = 0;
+ for (auto& i : items)
+ {
+ dst.shaderStageSets[i.stage] = i;
+ iter++;
+ }
+ }
+};
+
+template<>
+struct SerialTypeInfo<CapabilitySet>
+{
+ struct SerialType
+ {
+ SerialIndex m_targetSets;
+ };
+
+ typedef CapabilitySet NativeType;
+ enum { SerialAlignment = SLANG_ALIGN_OF(SerialIndex) };
+ static void toSerial(SerialWriter* writer, const void* native, void* serial)
+ {
+ auto& src = *(const NativeType*)native;
+ auto& dst = *(SerialType*)serial;
+
+ List<SerialTypeInfo<CapabilityTargetSet>::SerialType> STargetSetList;
+ auto capabilityTargetSets = src.getCapabilityTargetSets();
+ STargetSetList.setCount(capabilityTargetSets.getCount());
+ Index iter = 0;
+ for (auto& i : capabilityTargetSets)
+ {
+ SerialTypeInfo<CapabilityTargetSet>::toSerial(writer, &i.second, &STargetSetList[iter]);
+ iter++;
+ }
+
+ dst.m_targetSets = writer->addSerialArray<CapabilityTargetSet>(STargetSetList.getBuffer(), STargetSetList.getCount());
+ }
+ static void toNative(SerialReader* reader, const void* serial, void* native)
+ {
+ auto& dst = *(NativeType*)native;
+ auto& src = *(const SerialType*)serial;
+
+ List<CapabilityTargetSet> items;
+ reader->getArray(src.m_targetSets, items);
+
+ auto& targetSets = dst.getCapabilityTargetSets();
+ targetSets.clear();
+ targetSets.reserve(items.getCount());
+ Index iter = 0;
+ for (auto& i : items)
+ {
+ targetSets[i.target] = i;
+ iter++;
+ }
}
};
diff --git a/source/slang/slang.cpp b/source/slang/slang.cpp
index e43c9a556..4d83823d2 100644
--- a/source/slang/slang.cpp
+++ b/source/slang/slang.cpp
@@ -1740,7 +1740,9 @@ CapabilitySet TargetRequest::getTargetCaps()
CapabilitySet targetCap = CapabilitySet(atoms);
CapabilitySet latestSpirvCapSet = CapabilitySet(CapabilityName::spirv_latest);
- CapabilityName latestSpirvAtom = (CapabilityName)latestSpirvCapSet.getExpandedAtoms()[0].getExpandedAtoms().getLast();
+ auto latestSpirvCapSetElements = latestSpirvCapSet.getAtomSets()->getElements<CapabilityAtom>();
+ CapabilityName latestSpirvAtom = (CapabilityName)latestSpirvCapSetElements[latestSpirvCapSetElements.getCount() - 2]; // the last element (count - 1) is the shader stage, so take the one before it
+
for (auto atomVal : optionSet.getArray(CompilerOptionName::Capability))
{
auto atom = (CapabilityName)atomVal.intValue;
diff --git a/source/slang/slang.natvis b/source/slang/slang.natvis
index c86faa065..21db4016f 100644
--- a/source/slang/slang.natvis
+++ b/source/slang/slang.natvis
@@ -842,4 +842,114 @@
<Type Name="Slang::BasicExpressionType">
<DisplayString>BasicExpressionType ({*(DeclRefBase*)m_operands.m_buffer[0].values.nodeOperand})</DisplayString>
</Type>
+
+ <Type Name="Slang::CapabilitySet">
+ <DisplayString>{m_targetSets.map.m_values}</DisplayString>
+ <Expand>
+ <CustomListItems>
+ <Item Name="m_targetSets">m_targetSets</Item>
+ <Item Name="m_targetSets.map.m_values">m_targetSets.map.m_values</Item>
+ </CustomListItems>
+ </Expand>
+ </Type>
+ <Type Name="Slang::CapabilityTargetSet">
+ <DisplayString>{{target={target}}}</DisplayString>
+ <Expand>
+ <CustomListItems>
+ <Item Name="target">target</Item>
+ <Item Name="shaderStageSets">shaderStageSets</Item>
+ <Item Name="shaderStageSets.map.m_values">shaderStageSets.map.m_values</Item>
+ </CustomListItems>
+ </Expand>
+ </Type>
+ <Type Name="Slang::CapabilityStageSet">
+ <DisplayString>{{size={atomSet}}}</DisplayString>
+ <Expand>
+ <CustomListItems>
+ <Item Name="stage">stage</Item>
+ <Item Name="atomSet">atomSet</Item>
+ </CustomListItems>
+ </Expand>
+ </Type>
+
+ <!--UIntSet-->
+ <Type Name="Slang::UIntSet">
+ <DisplayString>{{max_size={m_buffer.m_count*Slang::UIntSet::kElementSize}}}</DisplayString>
+ <Expand>
+ <Synthetic Name="[Values]">
+ <Expand>
+ <CustomListItems MaxItemsPerView="1000">
+ <Variable Name="atomType" InitialValue="0"/>
+ <Variable Name="bitValue" InitialValue="0"/>
+ <Variable Name="boolRes" InitialValue="false"/>
+ <Variable Name="bitIter" InitialValue="0"/>
+ <Variable Name="totalBitIter" InitialValue="0"/>
+ <Variable Name="value" InitialValue="0"/>
+ <Variable Name="iter" InitialValue="0"/>
+ <Exec>iter = (Slang::UIntSet::Element)0</Exec>
+ <Exec>bitIter = (Slang::UIntSet::Element)0</Exec>
+ <Exec>totalBitIter = (Slang::UIntSet::Element)0</Exec>
+ <Exec>value = 0</Exec>
+ <Loop>
+ <!-- Move to the next element only after all kElementSize bits have been visited. -->
+ <If Condition="bitIter >= Slang::UIntSet::kElementSize">
+ <Exec>bitIter = 0</Exec>
+ <Exec>iter++</Exec>
+ </If>
+ <If Condition="iter >= m_buffer.m_count">
+ <Break/>
+ </If>
+ <Exec>bitValue = (m_buffer[iter]&gt;&gt;bitIter)&amp;1</Exec>
+ <If Condition="bitValue != 0">
+ <Exec>value = totalBitIter</Exec>
+ <Item>(CapabilityAtom)value</Item>
+ </If>
+ <Exec>bitIter++</Exec>
+ <Exec>totalBitIter++</Exec>
+ </Loop>
+ </CustomListItems>
+ </Expand>
+ </Synthetic>
+ </Expand>
+ </Type>
+ <Type Name="Slang::CapabilityAtomSet">
+ <DisplayString>{{max_size={m_buffer.m_count*Slang::UIntSet::kElementSize}}}</DisplayString>
+ <Expand>
+ <Synthetic Name="[CapabilityAtomView]">
+ <Expand>
+ <CustomListItems MaxItemsPerView="1000">
+ <Variable Name="atomType" InitialValue="0"/>
+ <Variable Name="bitValue" InitialValue="0"/>
+ <Variable Name="boolRes" InitialValue="false"/>
+ <Variable Name="bitIter" InitialValue="0"/>
+ <Variable Name="totalBitIter" InitialValue="0"/>
+ <Variable Name="value" InitialValue="0"/>
+ <Variable Name="iter" InitialValue="0"/>
+ <Exec>iter = (Slang::UIntSet::Element)0</Exec>
+ <Exec>bitIter = (Slang::UIntSet::Element)0</Exec>
+ <Exec>totalBitIter = (Slang::UIntSet::Element)0</Exec>
+ <Exec>value = 0</Exec>
+ <Loop>
+ <!-- Move to the next element only after all kElementSize bits have been visited. -->
+ <If Condition="bitIter >= Slang::UIntSet::kElementSize">
+ <Exec>bitIter = 0</Exec>
+ <Exec>iter++</Exec>
+ </If>
+ <If Condition="iter >= m_buffer.m_count">
+ <Break/>
+ </If>
+ <Exec>bitValue = (m_buffer[iter]&gt;&gt;bitIter)&amp;1</Exec>
+ <If Condition="bitValue != 0">
+ <Exec>value = totalBitIter</Exec>
+ <Item>(CapabilityAtom)value</Item>
+ </If>
+ <Exec>bitIter++</Exec>
+ <Exec>totalBitIter++</Exec>
+ </Loop>
+ </CustomListItems>
+ </Expand>
+ </Synthetic>
+ </Expand>
+ </Type>
+ <!--~UIntSet-->
</AutoVisualizer>