author Tim Foley <tfoleyNV@users.noreply.github.com> 2020-12-11 08:50:43 -0800
committer GitHub <noreply@github.com> 2020-12-11 08:50:43 -0800
commit 992778e25c444932921ce92fe7934893b2aca35f
tree 4351c61079da5586c5f469dc8c989364c7a2bd4e
parent 4337338ed2d9525b4638f32c6b91ef61b69e41cd
Add first steps toward a "capability" system (#1636)
* Add first steps toward a "capability" system

We already have cases in the stdlib where we mark declarations as being specific to certain targets, e.g.:

```
// My ordinary function to add two numbers.
// Works everywhere.
//
int myFunc(int a, int b) { return a + b; }

// On the "coolgpu" target, we can use a secret intrinsic
// that adds numbers even faster!
//
__specialized_for_target(coolgpu)
int myFunc(int a, int b) { return __secretIntrinsic(a, b); }
```

The existing logic for dealing with these modifiers (`__specialized_for_target` and `__target_intrinsic`) was almost entirely string-based. We would turn the chosen compilation target into a string, and then use that to search for the "best" definition of a function at a few steps:

* During IR linking, we always pick one definition of an `[import]`ed function, and that definition will be the one with the "best" target-specialization modifier (if any)
* During final code generation, we always look up the "best" target-intrinsic modifier, and use it as the template for the code we output.

This change preserves the basic flow there, but replaces the ad hoc string-based logic with something a bit more principled, in terms of a new `CapabilitySet` type. A `CapabilitySet` represents a set of zero or more atomic features (here represented as `CapabilityAtom`s).

What a `CapabilitySet` means depends on how and where it is used:

* A compilation target implies a `CapabilitySet` where the contents of the set are the features the target *supports*.
* A `CapabilitySet` attached to a declaration (or a modifier on that declaration) describes a set of features that declaration *requires*.

The current implementation of `CapabilitySet` is wasteful and inefficient, but that is something we can iterate on over time.
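To make the supports/requires reading concrete, here is a minimal C++ sketch. It is not the code added in this change: `Atom`, `ToySet`, `baseOf`, and `targetCanUse` are hypothetical names, and each atom has at most one base here, whereas the real `CapabilityAtomInfo` allows up to four.

```cpp
#include <vector>

// Hypothetical stand-in for `CapabilityAtom` (illustration only).
enum class Atom { Invalid = -1, Target, HLSL, GLSL };

// Single-base inheritance for this sketch; `Invalid` means "no base".
inline Atom baseOf(Atom a)
{
    return (a == Atom::HLSL || a == Atom::GLSL) ? Atom::Target : Atom::Invalid;
}

// An atom implies itself and, transitively, each of its bases.
inline bool atomImplies(Atom a, Atom b)
{
    for (; a != Atom::Invalid; a = baseOf(a))
        if (a == b)
            return true;
    return false;
}

// Hypothetical stand-in for `CapabilitySet`: an unprocessed list of atoms.
struct ToySet
{
    std::vector<Atom> atoms;

    // True if some atom in this set implies `b`.
    bool implies(Atom b) const
    {
        for (Atom a : atoms)
            if (atomImplies(a, b))
                return true;
        return false;
    }

    // True if every atom of `that` is implied by this set.
    bool implies(ToySet const& that) const
    {
        for (Atom b : that.atoms)
            if (!implies(b))
                return false;
        return true;
    }
};

// A target can use a declaration when the set it *supports*
// implies the set the declaration *requires*.
inline bool targetCanUse(ToySet const& supported, ToySet const& required)
{
    return supported.implies(required);
}
```

Under these assumptions, a target whose set is `{HLSL}` can use a declaration with no requirements or one that requires `HLSL`, but not one that requires `GLSL`. An empty requirement set is implied by every target, which matches the "works everywhere" case.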
In practice, most of the current code only ever uses capability sets that are either empty (because they represent a function with no specific requirements) or singleton (because they represent a single atomic capability like "is a GLSL target," "is an HLSL target," etc.). The main goal here was to put in the skeleton of a new system, including some of the features it might need down the line, and then to leave changes that eventually use the greater flexibility for later.

Eventually, the capability system should encompass:

* Differences between shader model versions, GLSL versions, SPIR-V versions, etc. (currently tracked with other modifiers)
* Optional extensions, and functions that are made available only with certain extensions (currently tracked with other modifiers)
* Front-end checking that the call graph of a program doesn't violate any capability requirements (e.g., having a GLSL+HLSL portable function call a GLSL-only subroutine)
* Hypothetically we can also try to fold stage-specific (vertex-only, fragment-only, etc.) functions into this system, but doing so would require more linker cleverness if we allow overloading on stages (since we might have to clone a caller if it calls through to a callee with multiple stage-specific versions)

One important complication that the system has to deal with, just because of the "do what I mean" nature of the current compiler, is that sometimes a current Slang user might compile for target X and specify version N, but then use a function that actually requires version N+1 of that target. Currently the Slang compiler silently "upgrades" the version(s) used by user code in these cases, because it is often what users want in cross-compilation scenarios. Dealing with the "silent upgrade" situation requires us to be a little careful and sometimes pick a "best" capability set that doesn't appear to be supported on our target.
Refining that system and potentially getting rid of the "do what I mean" behavior over time could be a goal for future changes.

* fixup: handle case where value is incompatible during linking
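The "best definition" selection and the silent-upgrade fallback can be sketched roughly as follows. This is an illustrative assumption, not the logic this change adds to `slang-ir-link.cpp`: the `SM50`/`SM51` atoms, the three-level `rank`, and `pickBest` are hypothetical, and the incompatibility test is the simple "shared abstract base" rule, with the same limitation the TODO in `slang-capability.cpp` describes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical atoms: SM51 inherits SM50, which inherits HLSL.
enum class Atom { Invalid = -1, Target, HLSL, GLSL, SM50, SM51 };

inline Atom baseOf(Atom a)
{
    switch (a)
    {
    case Atom::HLSL: case Atom::GLSL: return Atom::Target;
    case Atom::SM50: return Atom::HLSL;
    case Atom::SM51: return Atom::SM50;
    default:         return Atom::Invalid;
    }
}

inline bool isAbstract(Atom a) { return a == Atom::Target; }

// An atom implies itself and, transitively, its bases.
inline bool atomImplies(Atom a, Atom b)
{
    for (; a != Atom::Invalid; a = baseOf(a))
        if (a == b) return true;
    return false;
}

// Unrelated atoms that share an abstract base cannot coexist.
inline bool atomsIncompatible(Atom a, Atom b)
{
    if (atomImplies(a, b) || atomImplies(b, a)) return false;
    for (Atom base = a; base != Atom::Invalid; base = baseOf(base))
        if (isAbstract(base) && atomImplies(b, base)) return true;
    return false;
}

using Set = std::vector<Atom>;

// True if every atom in `required` is implied by some atom in `s`.
inline bool implies(Set const& s, Set const& required)
{
    for (Atom r : required)
    {
        bool found = false;
        for (Atom a : s) found = found || atomImplies(a, r);
        if (!found) return false;
    }
    return true;
}

inline bool incompatible(Set const& x, Set const& y)
{
    for (Atom a : x)
        for (Atom b : y)
            if (atomsIncompatible(a, b)) return true;
    return false;
}

// 0: unusable, 1: usable only via a silent upgrade, 2: supported as-is.
inline int rank(Set const& target, Set const& required)
{
    if (incompatible(target, required)) return 0;
    return implies(target, required) ? 2 : 1;
}

// Index of the best candidate requirement set, or -1 if none is usable.
inline int pickBest(Set const& target, std::vector<Set> const& candidates)
{
    int best = -1;
    for (std::size_t i = 0; i < candidates.size(); ++i)
    {
        int r = rank(target, candidates[i]);
        if (r == 0) continue;
        if (best < 0) { best = int(i); continue; }
        int bestRank = rank(target, candidates[best]);
        // Among equally ranked candidates, prefer the more specialized one.
        bool moreSpecific = implies(candidates[i], candidates[best])
                         && !implies(candidates[best], candidates[i]);
        if (r > bestRank || (r == bestRank && moreSpecific))
            best = int(i);
    }
    return best;
}
```

With a target of `{SM50}`, this sketch prefers a definition requiring `SM50` over an unconstrained one, never picks a `GLSL`-only definition, and falls back to an `SM51` definition only when nothing better exists. That fallback is the case where the chosen set "doesn't appear to be supported" on the target.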
-rw-r--r-- build/visual-studio/slang/slang.vcxproj 3
-rw-r--r-- build/visual-studio/slang/slang.vcxproj.filters 9
-rw-r--r-- source/slang/hlsl.meta.slang 33
-rw-r--r-- source/slang/slang-capability-defs.h 62
-rw-r--r-- source/slang/slang-capability.cpp 428
-rw-r--r-- source/slang/slang-capability.h 156
-rwxr-xr-x source/slang/slang-compiler.h 4
-rw-r--r-- source/slang/slang-emit-c-like.cpp 100
-rw-r--r-- source/slang/slang-emit-c-like.h 30
-rw-r--r-- source/slang/slang-emit.cpp 6
-rw-r--r-- source/slang/slang-hlsl-intrinsic-set.cpp 4
-rw-r--r-- source/slang/slang-ir-glsl-legalize.cpp 2
-rw-r--r-- source/slang/slang-ir-inst-defs.h 4
-rw-r--r-- source/slang/slang-ir-insts.h 48
-rw-r--r-- source/slang/slang-ir-legalize-varying-params.cpp 12
-rw-r--r-- source/slang/slang-ir-link.cpp 165
-rw-r--r-- source/slang/slang-ir-specialize.cpp 27
-rw-r--r-- source/slang/slang-ir.cpp 153
-rw-r--r-- source/slang/slang-lower-to-ir.cpp 40
-rw-r--r-- source/slang/slang.cpp 68
20 files changed, 1089 insertions, 265 deletions
diff --git a/build/visual-studio/slang/slang.vcxproj b/build/visual-studio/slang/slang.vcxproj
index 9eca53310..88c4b59e2 100644
--- a/build/visual-studio/slang/slang.vcxproj
+++ b/build/visual-studio/slang/slang.vcxproj
@@ -199,6 +199,8 @@
<ClInclude Include="..\..\..\source\slang\slang-ast-support-types.h" />
<ClInclude Include="..\..\..\source\slang\slang-ast-type.h" />
<ClInclude Include="..\..\..\source\slang\slang-ast-val.h" />
+ <ClInclude Include="..\..\..\source\slang\slang-capability-defs.h" />
+ <ClInclude Include="..\..\..\source\slang\slang-capability.h" />
<ClInclude Include="..\..\..\source\slang\slang-check-impl.h" />
<ClInclude Include="..\..\..\source\slang\slang-check.h" />
<ClInclude Include="..\..\..\source\slang\slang-compiler.h" />
@@ -315,6 +317,7 @@
<ClCompile Include="..\..\..\source\slang\slang-ast-substitutions.cpp" />
<ClCompile Include="..\..\..\source\slang\slang-ast-type.cpp" />
<ClCompile Include="..\..\..\source\slang\slang-ast-val.cpp" />
+ <ClCompile Include="..\..\..\source\slang\slang-capability.cpp" />
<ClCompile Include="..\..\..\source\slang\slang-check-conformance.cpp" />
<ClCompile Include="..\..\..\source\slang\slang-check-constraint.cpp" />
<ClCompile Include="..\..\..\source\slang\slang-check-conversion.cpp" />
diff --git a/build/visual-studio/slang/slang.vcxproj.filters b/build/visual-studio/slang/slang.vcxproj.filters
index 426467529..60b3f0d65 100644
--- a/build/visual-studio/slang/slang.vcxproj.filters
+++ b/build/visual-studio/slang/slang.vcxproj.filters
@@ -48,6 +48,12 @@
<ClInclude Include="..\..\..\source\slang\slang-ast-val.h">
<Filter>Header Files</Filter>
</ClInclude>
+ <ClInclude Include="..\..\..\source\slang\slang-capability-defs.h">
+ <Filter>Header Files</Filter>
+ </ClInclude>
+ <ClInclude Include="..\..\..\source\slang\slang-capability.h">
+ <Filter>Header Files</Filter>
+ </ClInclude>
<ClInclude Include="..\..\..\source\slang\slang-check-impl.h">
<Filter>Header Files</Filter>
</ClInclude>
@@ -392,6 +398,9 @@
<ClCompile Include="..\..\..\source\slang\slang-ast-val.cpp">
<Filter>Source Files</Filter>
</ClCompile>
+ <ClCompile Include="..\..\..\source\slang\slang-capability.cpp">
+ <Filter>Source Files</Filter>
+ </ClCompile>
<ClCompile Include="..\..\..\source\slang\slang-check-conformance.cpp">
<Filter>Source Files</Filter>
</ClCompile>
diff --git a/source/slang/hlsl.meta.slang b/source/slang/hlsl.meta.slang
index 9893effea..29779e796 100644
--- a/source/slang/hlsl.meta.slang
+++ b/source/slang/hlsl.meta.slang
@@ -4438,43 +4438,50 @@ __magic_type(Texture, $(feedbackTexture2DFlavor))
__intrinsic_type($(kIROp_TextureType + (feedbackTexture2DFlavor << kIROpMeta_OtherShift)))
struct FeedbackTexture2D<T : __BuiltinSamplerFeedbackType>
{
- __target_intrinsic(hlsl)
+ __target_intrinsic
void GetDimensions(out uint width, out uint height);
- __target_intrinsic(hlsl)
+ __target_intrinsic
void GetDimensions(uint mipLevel, out uint width, out uint height, out uint numberOfLevels);
- __target_intrinsic(hlsl)
+ __target_intrinsic
void GetDimensions(out float width,out float height);
- __target_intrinsic(hlsl)
+ __target_intrinsic
void GetDimensions(uint mipLevel, out float width,out float height, out float numberOfLevels);
// With Clamp
__target_intrinsic(hlsl, "($0).WriteSamplerFeedback($1, $2, $3, $4)")
+ __target_intrinsic(cpp, "($0).WriteSamplerFeedback($1, $2, $3, $4)")
void WriteSamplerFeedback<S>(Texture2D<S> tex, SamplerState samp, float2 location, float clamp);
__target_intrinsic(hlsl, "($0).WriteSamplerFeedbackBias($1, $2, $3, $4, $5)")
+ __target_intrinsic(cpp, "($0).WriteSamplerFeedbackBias($1, $2, $3, $4, $5)")
void WriteSamplerFeedbackBias<S>(Texture2D<S> tex, SamplerState samp, float2 location, float bias, float clamp);
__target_intrinsic(hlsl, "($0).WriteSamplerFeedbackGrad($1, $2, $3, $4, $5, $6)")
+ __target_intrinsic(cpp, "($0).WriteSamplerFeedbackGrad($1, $2, $3, $4, $5, $6)")
void WriteSamplerFeedbackGrad<S>(Texture2D<S> tex, SamplerState samp, float2 location, float2 ddx, float2 ddy, float clamp);
// Level
__target_intrinsic(hlsl, "($0).WriteSamplerFeedbackLevel($1, $2, $3, $4)")
+ __target_intrinsic(cpp, "($0).WriteSamplerFeedbackLevel($1, $2, $3, $4)")
void WriteSamplerFeedbackLevel<S>(Texture2D<S> tex, SamplerState samp, float2 location, float lod);
// Without Clamp
__target_intrinsic(hlsl, "($0).WriteSamplerFeedback($1, $2, $3)")
+ __target_intrinsic(cpp, "($0).WriteSamplerFeedback($1, $2, $3)")
void WriteSamplerFeedback<S>(Texture2D<S> tex, SamplerState samp, float2 location);
__target_intrinsic(hlsl, "($0).WriteSamplerFeedbackBias($1, $2, $3, $4)")
+ __target_intrinsic(cpp, "($0).WriteSamplerFeedbackBias($1, $2, $3, $4)")
void WriteSamplerFeedbackBias<S>(Texture2D<S> tex, SamplerState samp, float2 location, float bias);
__target_intrinsic(hlsl, "($0).WriteSamplerFeedbackGrad($1, $2, $3, $4, $5)")
+ __target_intrinsic(cpp, "($0).WriteSamplerFeedbackGrad($1, $2, $3, $4, $5)")
void WriteSamplerFeedbackGrad<S>(Texture2D<S> tex, SamplerState samp, float2 location, float2 ddx, float2 ddy);
};
@@ -4484,40 +4491,50 @@ __magic_type(Texture, $(feedbackTexture2DArrayFlavor))
__intrinsic_type($(kIROp_TextureType + (feedbackTexture2DArrayFlavor << kIROpMeta_OtherShift)))
struct FeedbackTexture2DArray<T : __BuiltinSamplerFeedbackType>
{
- __target_intrinsic(hlsl)
+ __target_intrinsic
void GetDimensions(out uint width,out uint height, out uint elements);
- __target_intrinsic(hlsl)
+
+ __target_intrinsic
void GetDimensions(uint mipLevel, out uint width,out uint height, out uint elements, out uint numberOfLevels);
- __target_intrinsic(hlsl)
+
+ __target_intrinsic
void GetDimensions(out float width,out float height, out float elements);
- __target_intrinsic(hlsl)
+
+ __target_intrinsic
void GetDimensions(uint mipLevel, out float width,out float height, out float elements, out float numberOfLevels);
// With Clamp
__target_intrinsic(hlsl, "($0).WriteSamplerFeedback($1, $2, $3, $4)")
+ __target_intrinsic(cpp, "($0).WriteSamplerFeedback($1, $2, $3, $4)")
void WriteSamplerFeedback<S>(Texture2DArray<S> texArray, SamplerState samp, float3 location, float clamp);
__target_intrinsic(hlsl, "($0).WriteSamplerFeedbackBias($1, $2, $3, $4, $5)")
+ __target_intrinsic(cpp, "($0).WriteSamplerFeedbackBias($1, $2, $3, $4, $5)")
void WriteSamplerFeedbackBias<S>(Texture2DArray<S> texArray, SamplerState samp, float3 location, float bias, float clamp);
__target_intrinsic(hlsl, "($0).WriteSamplerFeedbackGrad($1, $2, $3, $4, $5, $6)")
+ __target_intrinsic(cpp, "($0).WriteSamplerFeedbackGrad($1, $2, $3, $4, $5, $6)")
void WriteSamplerFeedbackGrad<S>(Texture2DArray<S> texArray, SamplerState samp, float3 location, float3 ddx, float3 ddy, float clamp);
// Level
__target_intrinsic(hlsl, "($0).WriteSamplerFeedbackLevel($1, $2, $3, $4)")
+ __target_intrinsic(cpp, "($0).WriteSamplerFeedbackLevel($1, $2, $3, $4)")
void WriteSamplerFeedbackLevel<S>(Texture2DArray<S> texArray, SamplerState samp, float3 location, float lod);
// Without Clamp
__target_intrinsic(hlsl, "($0).WriteSamplerFeedback($1, $2, $3)")
+ __target_intrinsic(cpp, "($0).WriteSamplerFeedback($1, $2, $3)")
void WriteSamplerFeedback<S>(Texture2DArray<S> texArray, SamplerState samp, float3 location);
__target_intrinsic(hlsl, "($0).WriteSamplerFeedbackBias($1, $2, $3, $4)")
+ __target_intrinsic(cpp, "($0).WriteSamplerFeedbackBias($1, $2, $3, $4)")
void WriteSamplerFeedbackBias<S>(Texture2DArray<S> texArray, SamplerState samp, float3 location, float bias);
__target_intrinsic(hlsl, "($0).WriteSamplerFeedbackGrad($1, $2, $3, $4, $5)")
+ __target_intrinsic(cpp, "($0).WriteSamplerFeedbackGrad($1, $2, $3, $4, $5)")
void WriteSamplerFeedbackGrad<S>(Texture2DArray<S> texArray, SamplerState samp, float3 location, float3 ddx, float3 ddy);
};
diff --git a/source/slang/slang-capability-defs.h b/source/slang/slang-capability-defs.h
new file mode 100644
index 000000000..8bf1d80e9
--- /dev/null
+++ b/source/slang/slang-capability-defs.h
@@ -0,0 +1,62 @@
+// slang-capability-defs.h
+
+// This file uses macros to define the capability "atoms" that
+// are used by the `CapabilitySet` implementation.
+//
+// Any file that `#include`s this file is required to set
+// the `SLANG_CAPABILITY_ATOM` macro before including it.
+//
+#ifndef SLANG_CAPABILITY_ATOM
+#error Must define SLANG_CAPABILITY_ATOM before including.
+#endif
+//
+// It is not necessary to `#undef` the macro in the client
+// file, because this file will `#undef` it at the end.
+
+// Our representation allows each capability atom to define
+// a number of other base atoms that it "inherits" from.
+//
+// Different atoms will need different numbers of bases,
+// so we will define a few different macros that wrap
+// `SLANG_CAPABILITY_ATOM` and let us handle the cases
+// more conveniently.
+//
+// TODO: There is probably a way to handle this with
+// variadic macros.
+//
+#define SLANG_CAPABILITY_ATOM4(ENUMERATOR, NAME, FLAGS, BASE0, BASE1, BASE2, BASE3) \
+ SLANG_CAPABILITY_ATOM(ENUMERATOR, NAME, FLAGS, BASE0, BASE1, BASE2, BASE3)
+
+#define SLANG_CAPABILITY_ATOM3(ENUMERATOR, NAME, FLAGS, BASE0, BASE1, BASE2) \
+ SLANG_CAPABILITY_ATOM(ENUMERATOR, NAME, FLAGS, BASE0, BASE1, BASE2, Invalid)
+
+#define SLANG_CAPABILITY_ATOM2(ENUMERATOR, NAME, FLAGS, BASE0, BASE1) \
+ SLANG_CAPABILITY_ATOM(ENUMERATOR, NAME, FLAGS, BASE0, BASE1, Invalid, Invalid)
+
+#define SLANG_CAPABILITY_ATOM1(ENUMERATOR, NAME, FLAGS, BASE0) \
+ SLANG_CAPABILITY_ATOM(ENUMERATOR, NAME, FLAGS, BASE0, Invalid, Invalid, Invalid)
+
+#define SLANG_CAPABILITY_ATOM0(ENUMERATOR, NAME, FLAGS) \
+ SLANG_CAPABILITY_ATOM(ENUMERATOR, NAME, FLAGS, Invalid, Invalid, Invalid, Invalid)
+
+// The `__target` capability exists only to provide a common
+// abstract base for the capabilities that represent each
+// of our compilation targets.
+//
+SLANG_CAPABILITY_ATOM0(Target, __target, Abstract)
+
+SLANG_CAPABILITY_ATOM1(HLSL, hlsl, Concrete, Target)
+SLANG_CAPABILITY_ATOM1(GLSL, glsl, Concrete, Target)
+SLANG_CAPABILITY_ATOM1(C, c, Concrete, Target)
+SLANG_CAPABILITY_ATOM1(CPP, cpp, Concrete, Target)
+SLANG_CAPABILITY_ATOM1(CUDA, cuda, Concrete, Target)
+SLANG_CAPABILITY_ATOM1(SPIRV, spirv, Concrete, Target)
+
+
+#undef SLANG_CAPABILITY_ATOM0
+#undef SLANG_CAPABILITY_ATOM1
+#undef SLANG_CAPABILITY_ATOM2
+#undef SLANG_CAPABILITY_ATOM3
+#undef SLANG_CAPABILITY_ATOM4
+
+#undef SLANG_CAPABILITY_ATOM
diff --git a/source/slang/slang-capability.cpp b/source/slang/slang-capability.cpp
new file mode 100644
index 000000000..7b4361a58
--- /dev/null
+++ b/source/slang/slang-capability.cpp
@@ -0,0 +1,428 @@
+// slang-capability.cpp
+#include "slang-capability.h"
+
+// This file implements the core of the "capability" system.
+
+namespace Slang
+{
+
+//
+// CapabilityAtom
+//
+
+// We are going to divide capabilities into a few categories,
+// which will be represented as flags for now.
+//
+// Every capability will be either concrete or abstract.
+// An abstract capability basically represents a category
+// of related capabilities that all fill a similar role.
+// For example, we could have an abstract capability that
+// represents "stages" and then the concrete capabilities
+// `vertex`, `fragment`, etc. would inherit from it.
+//
+// Abstract capabilities are critical in our model for
+// knowing when two capabilities are fundamentally incompatible.
+// For example, it is meaningless to compile code for both
+// the `vertex` and `fragment` capabilities at the same time,
+// because no target processor supports both at once.
+//
+// TODO: It is possible that instead of flags this could simply
+// identify a "kind" of atom, with two different states.
+//
+// TODO: It is likely that in a future change we will want to
+// add a third case here for "alias" capabilities, which are
+// pseudo-atomic capabilities that are just equivalent to
+// the set of their bases.
+//
+typedef uint32_t CapabilityAtomFlags;
+enum : CapabilityAtomFlags
+{
+ kCapabilityAtomFlags_Concrete = 0,
+ kCapabilityAtomFlags_Abstract = 1 << 0,
+};
+
+// The macros in the `slang-capability-defs.h` file will be used
+// to fill out a `static const` array of information about each
+// capability atom.
+//
+struct CapabilityAtomInfo
+{
+ /// The API-/language-exposed name of the capability.
+ char const* name;
+
+ /// Flags to determine if the capability is concrete-vs-abstract, etc.
+ CapabilityAtomFlags flags;
+ CapabilityAtom bases[4];
+};
+//
+// The array is going to be sized to include an entry for `CapabilityAtom::Invalid`
+// which has a value of -1, so we need to size the array one larger than the `Count`
+// value.
+//
+static const CapabilityAtomInfo kCapabilityAtoms[Int(CapabilityAtom::Count) + 1] =
+{
+ { "invalid", 0, { CapabilityAtom::Invalid, CapabilityAtom::Invalid, CapabilityAtom::Invalid, CapabilityAtom::Invalid } },
+
+#define SLANG_CAPABILITY_ATOM(ENUMERATOR, NAME, FLAGS, BASE0, BASE1, BASE2, BASE3) \
+ { #NAME, kCapabilityAtomFlags_##FLAGS, { CapabilityAtom::BASE0, CapabilityAtom::BASE1, CapabilityAtom::BASE2, CapabilityAtom::BASE3 } },
+#include "slang-capability-defs.h"
+};
+
+ /// Get the extended information structure for the given capability `atom`
+static CapabilityAtomInfo const& _getInfo(CapabilityAtom atom)
+{
+ SLANG_ASSERT(Int(atom) < Int(CapabilityAtom::Count));
+ return kCapabilityAtoms[Int(atom) + 1];
+}
+
+// One capability set or capability atom A implies another set/atom B
+// if any target that supports all of the atoms in A must also support
+// all of those in B.
+
+ /// Does `thisAtom` imply `thatAtom`?
+static bool _implies(CapabilityAtom thisAtom, CapabilityAtom thatAtom)
+{
+ // When looking at atoms, the immediate easy case is when
+ // the two atoms are the same: an atomic capability always
+ // implies itself.
+ //
+ if(thisAtom == thatAtom)
+ return true;
+
+ // Otherwise, we want to look at the bases of `thisAtom`
+ // to see if any of them imply `thatAtom`, since `thisAtom`
+ // implies each of its bases.
+ //
+ auto& thisAtomInfo = _getInfo(thisAtom);
+ for( auto thisAtomBase : thisAtomInfo.bases )
+ {
+ // The lists of bases are currently using `Invalid` as
+ // a sentinel value to terminate them, so we need to
+ // bail out of the loop when we see the sentinel.
+ //
+ if(thisAtomBase == CapabilityAtom::Invalid)
+ break;
+
+ if(_implies(thisAtomBase, thatAtom))
+ return true;
+ }
+
+ return false;
+}
+
+ /// Does `base` have any abstract capabilities in common with `otherAtom`?
+ ///
+ /// This subroutine is a helper for `_isIncompatible`.
+static bool _hasAbstractBaseInCommon(CapabilityAtom base, CapabilityAtom otherAtom)
+{
+ // First we check the case where `base` itself is an abstract
+ // capability atom.
+ //
+ auto& baseAtomInfo = _getInfo(base);
+ if(baseAtomInfo.flags & kCapabilityAtomFlags_Abstract)
+ {
+ // If `base` is abstract, and `otherAtom` implies `base`,
+ // then that means that `otherAtom` includes one or
+ // more atoms that inherit from `base`, and thus the
+ // two have an abstract base in common.
+ //
+ if( _implies(otherAtom, base) )
+ return true;
+ }
+
+ // If `base` itself has bases, then we want to check if any
+ // of *those* are abstract bases that overlap with `otherAtom`.
+ //
+ for( auto baseBase : baseAtomInfo.bases )
+ {
+ if(baseBase == CapabilityAtom::Invalid)
+ break;
+
+ if(_hasAbstractBaseInCommon(baseBase, otherAtom))
+ return true;
+ }
+
+ // If we didn't manage to find any overlaps, then we conclude
+ // that there are no shared abstract bases.
+ //
+ return false;
+}
+
+ /// Is `thisAtom` incompatible with `thatAtom` (such that no target could ever support both at once)?
+static bool _isIncompatible(CapabilityAtom thisAtom, CapabilityAtom thatAtom)
+{
+ // If either atom implies the other, then they aren't incompatible.
+ //
+ // For example, if there is an atom representing `sm_5_1` that inherits
+ // from an atom representing `sm_5_0`, then clearly the two aren't
+ // in any way incompatible (a single target can support both).
+ //
+ if(_implies(thisAtom, thatAtom) || _implies(thatAtom, thisAtom))
+ return false;
+
+ // If the two atoms are not in an inheritance relationship, then one of
+ // a few cases can apply:
+ //
+ // * They have no common bases; in this case they are compatible.
+ // An example would be `vertex` and `sm_5_0`.
+ //
+ // * They have a common base, but it is not marked abstract; in
+ // this case they are compatible. E.g., two GLSL extensions that
+ // both inherit from the `glsl` capability should not conflict.
+ //
+ // * They have a common base that is marked abstract; in this
+ // case they are incompatible. An example would be `vertex`
+ // and `fragment` both inheriting from the abstract atom
+ // `__stage`.
+ //
+ // To summarize the above list, we note that two atoms are
+ // incompatible when they have an abstract base in common.
+ //
+ return _hasAbstractBaseInCommon(thisAtom, thatAtom);
+
+ // TODO: The above logic is a bit off, but in a way that doesn't
+ // matter just yet.
+ //
+ // We currently have capabilities like:
+ //
+ // abstract capability __target;
+ // capability hlsl : __target;
+ // capability glsl : __target;
+ //
+ // In this case it is clear that `hlsl` and `glsl` should
+ // be incompatible, and that the rules as implemented
+ // make that the case.
+ //
+ // A problem arises when we start to add things like extensions:
+ //
+ // capability EXT_cool_thing : glsl;
+ // capability EXT_other_stuff : glsl;
+ //
+ // In this case, it also seems clear that `EXT_cool_thing`
+ // and `EXT_other_stuff` should be mutually compatible.
+ // However, with the rules implemented here right now, they
+ // would be found incompatible because they share the
+ // abstract base `__target`.
+ //
+ // In this specific case, we know that the relationship
+ // between the extensions is fine because they both inherit
+ // from `__target` *through* the concrete atom `glsl`.
+ //
+ // Before adding capabilities that represent optional
+ // extensions like this we need to codify the semantics
+ // for how incompatibility checks should work in terms
+ // of the inheritance graph of capability atoms.
+}
+
+CapabilityAtom findCapabilityAtom(UnownedStringSlice const& name)
+{
+ // For now we are implementing a linear search over the
+ // array of capability atoms to perform name lookup.
+ //
+ for( Index i = 0; i < Index(CapabilityAtom::Count); ++i )
+ {
+ // Note: using `_getInfo` here instead of accessing
+ // the `kCapabilityAtoms` array directly lets us
+ // avoid dealing with the offset-by-one indexing
+ // choice.
+ //
+ auto& capInfo = _getInfo(CapabilityAtom(i));
+ if(name == UnownedTerminatedStringSlice(capInfo.name))
+ return CapabilityAtom(i);
+ }
+ return CapabilityAtom::Invalid;
+}
+
+//
+// CapabilitySet
+//
+
+// The current design choice in `CapabilitySet` is that it blindly
+// stores exactly the atoms it is told to, without any up-front
+// processing.
+//
+// This choice has some down-sides, and there are other representations
+// that could be much nicer in the future. Possible improvements include:
+//
+// * The list of atoms could be *expanded* so that if it contains atom A
+// and atom A implies atom B, then the list should also include B.
+//
+// * The list of atoms could be *minimized*, such that if atom A implies
+// atom B, then any list that contains A does not include B (both
+// expanded and minimized lists have different benefits).
+//
+// * The list of atoms could be deduplicated.
+//
+// * The list of atoms could be sorted.
+//
+// * The lists could be deduplicated and cached in some central place
+// (like the session) so that repeated attempts to create the
+// same capability sets return the same objects.
+//
+// In some parts of the code below we will call out how these improvements
+// could affect the algorithms used.
+
+// Given our simple choices right now, the constructors for `CapabilitySet`
+// are all straightforward: just adding the right atoms to the list.
+
+CapabilitySet::CapabilitySet()
+{}
+
+CapabilitySet::CapabilitySet(Int atomCount, CapabilityAtom const* atoms)
+{
+ m_atoms.addRange(atoms, atomCount);
+}
+
+CapabilitySet::CapabilitySet(CapabilityAtom atom)
+{
+ m_atoms.add(atom);
+}
+
+CapabilitySet::CapabilitySet(List<CapabilityAtom> const& atoms)
+ : m_atoms(atoms)
+{}
+
+
+CapabilitySet CapabilitySet::makeEmpty()
+{
+ return CapabilitySet();
+}
+
+CapabilitySet CapabilitySet::makeInvalid()
+{
+ return CapabilitySet(CapabilityAtom::Invalid);
+}
+
+bool CapabilitySet::isEmpty() const
+{
+ // Checking if a capability set is empty is trivial in any representation;
+ // all we need to know is if it has zero atoms in its definition.
+ //
+ return m_atoms.getCount() == 0;
+}
+
+bool CapabilitySet::isInvalid() const
+{
+ // We will assume here that there is only one canonical representation of
+ // an invalid capability set, which is a singleton set of the `Invalid`
+ // atom.
+ //
+ // TODO: We should ensure that any algorithms that make new capability
+ // sets by combining others properly ensure that they return the
+ // canonical invalid set rather than any other set that happens to be
+ // invalid (e.g., a set {A,B} would be invalid if A and B are incompatible,
+ // but it would not be in the canonical form this subroutine checks).
+ //
+ if(m_atoms.getCount() != 1) return false;
+ return m_atoms[0] == CapabilityAtom::Invalid;
+}
+
+bool CapabilitySet::isIncompatibleWith(CapabilityAtom that) const
+{
+ // We know that capabilities that are in an inheritance
+ // relationship with one another can't be incompatible.
+ //
+ if(this->implies(that) || CapabilitySet(that).implies(*this))
+ return false;
+
+ // Otherwise, we want to perform a check for each of the
+ // atoms in this set, whether it is incompatible with any
+ // of the atoms in the other set (which in this case is one atom).
+ //
+ for( auto thisAtom : this->m_atoms )
+ {
+ if(_isIncompatible(thisAtom, that))
+ return true;
+ }
+
+ return false;
+}
+
+bool CapabilitySet::isIncompatibleWith(CapabilitySet const& that) const
+{
+ // We need to look at the atoms in `this` that are not
+ // present in `that`, and vice versa. For each such atom
+ // we will check if it is incompatible with the other, by
+ // virtue of the other already including a concrete atom
+ // that cannot co-exist with it.
+ //
+ for( auto thisAtom : this->m_atoms )
+ {
+ if(that.isIncompatibleWith(thisAtom))
+ return true;
+ }
+ for( auto thatAtom : that.m_atoms )
+ {
+ if(this->isIncompatibleWith(thatAtom))
+ return true;
+ }
+ return false;
+
+ // TODO: If we had a representation that stored a minified,
+ // sorted, deduplicated list of atoms, then it would be easy
+ // to iterate over the two lists in tandem and identify any
+ // element that is present in one list but not the other.
+ //
+ // Those elements would be the candidates that could cause
+ // incompatibility, so that we wouldn't need to perform
+ // the check on each atom like we do above.
+}
+
+bool CapabilitySet::implies(CapabilitySet const& that) const
+{
+ // This capability set implies `that` if for every atom in `that`,
+ // that atom is present in this set's list of atoms or it is
+ // implied by something in the list of atoms.
+ //
+ for( auto atom : that.m_atoms )
+ {
+ if(!this->implies(atom))
+ return false;
+ }
+ return true;
+
+ // TODO: If we had a representation that stored an expanded,
+ // sorted, deduplicated list of atoms, then we could
+ // check the `implies` relationship by scanning through
+ // the two lists in tandem and identifying any element
+ // in the `that` list that isn't in the `this` list.
+ // Such elements would indicate that `that` is not a subset
+ // of `this`.
+}
+
+
+bool CapabilitySet::implies(CapabilityAtom atom) const
+{
+ // If our list of explicit atoms contains `atom`, then
+ // we definitely imply it.
+ //
+ // TODO: If we stored our atom lists sorted, then
+ // this operation could be logarithmic rather than
+ // linear.
+ //
+ if(m_atoms.contains(atom))
+ return true;
+
+ // If any of our atoms implies `atom` then we
+ // also imply it.
+ //
+ // TODO: If we stored an expanded atom list, then
+ // this recursion could be skipped completely, since
+ // the containment check above would cover inheritance
+ // relationships too.
+ //
+ for( auto thisAtom : m_atoms )
+ {
+ if(_implies(thisAtom, atom))
+ return true;
+ }
+
+ return false;
+}
+
+bool CapabilitySet::operator==(CapabilitySet const& other) const
+{
+ return this->implies(other) && other.implies(*this);
+}
+
+}
diff --git a/source/slang/slang-capability.h b/source/slang/slang-capability.h
new file mode 100644
index 000000000..662f7eed8
--- /dev/null
+++ b/source/slang/slang-capability.h
@@ -0,0 +1,156 @@
+// slang-capability.h
+#pragma once
+
+#include "../core/slang-list.h"
+#include "../core/slang-string.h"
+
+#include <stdint.h>
+
+namespace Slang
+{
+
+// This file defines a system for reasoning about the "capabilities" that a
+// target supports or, conversely, the capabilities that a function or other
+// symbol requires.
+//
+// The central idea is that we can think of each of these cases as a set,
+// where the elements of the set are atomic features that are either present
+// on a target or not (no in-between states). For example, an atomic feature
+// might be used to represent support for double-precision floating-point
+// operations. When compiling for a target, we need to know whether the
+// target supports double-precision or not, and for a particular function
+// it either requires double-precision math to run, or not.
+//
+// In this system, the atomic capabilities are represented as cases of
+// the `CapabilityAtom` enumeration, which is generated from declarations
+// in the `slang-capability-defs.h` file.
+//
+enum class CapabilityAtom : int32_t
+{
+ // The "invalid" capability represents an atomic feature that no
+ // platform can/will ever support. If we ever determine that a
+ // function needs the invalid capability, it would be reasonable
+ // to report that situation as an error.
+ //
+ Invalid = -1,
+
+#define SLANG_CAPABILITY_ATOM(ENUMERATOR, NAME, FLAGS, BASE0, BASE1, BASE2, BASE3) \
+ ENUMERATOR,
+
+#include "slang-capability-defs.h"
+
+ Count,
+};
+
+// Once we have a universe of suitable capability atoms, we can define
+// the capabilities of a target as simply the set of all atomic capabilities
+// that it supports.
+//
+// The situation is slightly more complicated for a function. A function
+// might require a specific set of atomic features, and that is the simple
+// case. In this simple case, we know that a target can run a function
+// if the features of the target are a super-set of those required by
+// the function.
+//
+// In the more general case, we might have a function that can be used
+// with multiple different combinations of features: e.g., you can use
+// the function if your target supports features A and B, or if it supports
+// features C and D. In our representation, that case is handled by
+// associating multiple distinct sets of capabilities with one declaration,
+// with each set expressing one way that the declaration can be legally used.
+//
+// In all cases, we represent a set of capabilities with `CapabilitySet`.
+
+ /// A set of capabilities, representing features that are either supported or required
+struct CapabilitySet
+{
+public:
+ /// Default-construct an empty capability set
+ CapabilitySet();
+
+ /// Construct a capability set from an explicit list of atomic capabilities
+ CapabilitySet(Int atomCount, CapabilityAtom const* atoms);
+
+ /// Construct a capability set from an explicit list of atomic capabilities
+ explicit CapabilitySet(List<CapabilityAtom> const& atoms);
+
+ /// Construct a singleton set from a single atomic capability
+ explicit CapabilitySet(CapabilityAtom atom);
+
+ /// Make an empty capability set
+ static CapabilitySet makeEmpty();
+
+ /// Make an invalid capability set (such that no target could ever support it)
+ static CapabilitySet makeInvalid();
+
+ /// Is this capability set empty (such that any target supports it)?
+ bool isEmpty() const;
+
+ /// Is this capability set invalid (such that no target could support it)?
+ bool isInvalid() const;
+
+ // Capabilities are "incompatible" if no target platform can ever support both
+ // at the same time. For example, the `HLSL` and `GLSL` capabilities are
+ // incompatible, because a single target cannot be both an HLSL target and
+ // a GLSL target (at least for now).
+ //
+ // Note that we are using the term "incompatible" here even though it
+ // seems like "disjoint" would be intuitively correct (HLSL and GLSL
+ // targets sure do seem to be disjoint). The problem is that in our
+ // set-theoretic representation of capabilities, incompatible capability
+ // sets are *never* disjoint sets of atoms, and (valid) disjoint sets of atoms
+ // *never* represent incompatible capability sets.
+
+ /// Is this capability set incompatible with the given `other` atomic capability?
+ bool isIncompatibleWith(CapabilityAtom other) const;
+
+ /// Is this capability set incompatible with the given `other` set?
+ bool isIncompatibleWith(CapabilitySet const& other) const;
+
+ // One capability set A "implies" another set B if a target that
+ // supports A must also support all of B.
+ //
+// In practice, this means that "A implies B" is the same as
+// "A is a superset of B" in the set-theoretic model, but
+// we want to think of this primarily as supported/required features,
+ // and not get hung up on the set theory.
+
+ /// Does this capability set imply all the capabilities in `other`?
+ bool implies(CapabilitySet const& other) const;
+
+ /// Does this capability set imply the atomic capability `other`?
+ bool implies(CapabilityAtom other) const;
+
+ // A capability set is equal to another if each implies the other.
+
+ /// Are these two capability sets equal?
+ bool operator==(CapabilitySet const& that) const;
+
+ /// Get access to the raw atomic capabilities that define this set.
+ List<CapabilityAtom> const& getAtoms() const { return m_atoms; }
+
+private:
+
+ // The underlying representation we are using is currently very simple:
+ // a capability set is stored as a list of the atoms that were passed
+ // in at the time the set was constructed.
+ //
+ // Currently, no effort is made to sort the atoms, remove duplicates,
+ // or to expand the list when one atom entails another.
+ //
+ // TODO: Much more efficient representations are possible, and we
+ // should consider them if the performance of `CapabilitySet` ever
+ // proves to be an issue.
+ //
+ List<CapabilityAtom> m_atoms;
+};
+
+ /// Are the `left` and `right` capability sets unequal?
+inline bool operator!=(CapabilitySet const& left, CapabilitySet const& right)
+{
+ return !(left == right);
+}
+
+ /// Find a capability atom with the given `name`, or return CapabilityAtom::Invalid.
+CapabilityAtom findCapabilityAtom(UnownedStringSlice const& name);
+}
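A minimal sketch of the `implies` and `operator==` semantics described in the header comments above, assuming a set is nothing more than an unsorted list of atoms. The `Atom` and `AtomSet` names are hypothetical stand-ins for `CapabilityAtom` and `CapabilitySet`, and the sketch deliberately ignores the `Invalid` atom and the notion of incompatibility, neither of which is fully specified in this header.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical atoms; the real ones are generated from slang-capability-defs.h.
enum class Atom : int32_t { A, B, C };

// Hypothetical stand-in for CapabilitySet: an unsorted list, duplicates allowed.
struct AtomSet
{
    std::vector<Atom> atoms;

    bool contains(Atom atom) const
    {
        return std::find(atoms.begin(), atoms.end(), atom) != atoms.end();
    }

    // "this implies other": every atom in `other` also appears in `this`.
    bool implies(AtomSet const& other) const
    {
        return std::all_of(other.atoms.begin(), other.atoms.end(),
            [this](Atom atom) { return contains(atom); });
    }

    // Two sets are equal when each implies the other, so neither the order
    // of the atoms nor duplicate entries affect the result.
    bool operator==(AtomSet const& other) const
    {
        return implies(other) && other.implies(*this);
    }
};
```

Under this reading, an empty set is implied by every target, which matches the "any target supports it" wording on `isEmpty()`.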
diff --git a/source/slang/slang-compiler.h b/source/slang/slang-compiler.h
index ab02a0c54..ace2cb842 100755
--- a/source/slang/slang-compiler.h
+++ b/source/slang/slang-compiler.h
@@ -8,6 +8,7 @@
#include "../../slang-com-ptr.h"
+#include "slang-capability.h"
#include "slang-diagnostics.h"
#include "slang-name.h"
#include "slang-preprocessor.h"
@@ -1145,6 +1146,8 @@ namespace Slang
SlangTargetFlags targetFlags = 0;
Slang::Profile targetProfile = Slang::Profile();
FloatingPointMode floatingPointMode = FloatingPointMode::Default;
+ CapabilitySet targetCaps = CapabilitySet::makeInvalid();
+
bool isWholeProgramRequest()
{
return (targetFlags & SLANG_TARGET_FLAG_GENERATE_WHOLE_PROGRAM) != 0;
@@ -1154,6 +1157,7 @@ namespace Slang
CodeGenTarget getTarget() { return target; }
Profile getTargetProfile() { return targetProfile; }
FloatingPointMode getFloatingPointMode() { return floatingPointMode; }
+ CapabilitySet getTargetCaps();
Session* getSession();
MatrixLayoutMode getDefaultMatrixLayoutMode();
diff --git a/source/slang/slang-emit-c-like.cpp b/source/slang/slang-emit-c-like.cpp
index fa3b7c6c4..eefa2363c 100644
--- a/source/slang/slang-emit-c-like.cpp
+++ b/source/slang/slang-emit-c-like.cpp
@@ -130,6 +130,7 @@ CLikeSourceEmitter::CLikeSourceEmitter(const Desc& desc)
SLANG_ASSERT(m_sourceLanguage != SourceLanguage::Unknown);
m_target = desc.target;
+ m_targetCaps = desc.targetCaps;
m_compileRequest = desc.compileRequest;
m_entryPointStage = desc.entryPointStage;
@@ -392,46 +393,6 @@ void CLikeSourceEmitter::maybeCloseParens(bool needClose)
if(needClose) m_writer->emit(")");
}
-bool CLikeSourceEmitter::isTargetIntrinsicModifierApplicable(const String& targetName)
-{
- switch(getSourceLanguage())
- {
- default:
- SLANG_DIAGNOSE_UNEXPECTED(getSink(), SourceLoc(), "unhandled code generation target");
- return false;
-
- case SourceLanguage::C: return targetName == "c";
- case SourceLanguage::CPP: return targetName == "cpp";
- case SourceLanguage::GLSL: return targetName == "glsl";
- case SourceLanguage::HLSL: return targetName == "hlsl";
- case SourceLanguage::CUDA: return targetName == "cuda";
- }
-}
-
-bool CLikeSourceEmitter::isTargetIntrinsicModifierApplicable(IRTargetIntrinsicDecoration* decoration)
-{
- auto targetName = String(decoration->getTargetName());
-
- // If no target name was specified, then the modifier implicitly
- // applies to all targets.
- if(targetName.getLength() == 0)
- return true;
-
- return isTargetIntrinsicModifierApplicable(targetName);
-}
-
-bool CLikeSourceEmitter::isTargetIntrinsicModifierBetter(IRTargetIntrinsicDecoration* candidate, IRTargetIntrinsicDecoration* existing)
-{
- // For now, the rule is that an empty string represents a catch-all
- // definition, which is worse than any target-specific declaration.
- // Therefore, if the new `candidate` has a non-empty target name
- // specified, then it is automatically better (or at least as
- // good) as `existing`.
- //
- SLANG_UNUSED(existing);
- return candidate->getTargetName().getLength() != 0;
-}
-
void CLikeSourceEmitter::emitStringLiteral(String const& value)
{
m_writer->emit("\"");
@@ -669,7 +630,7 @@ String CLikeSourceEmitter::generateName(IRInst* inst)
// If the instruction names something
// that should be emitted as a target intrinsic,
// then use that name instead.
- if(auto intrinsicDecoration = findTargetIntrinsicDecoration(inst))
+ if(auto intrinsicDecoration = findBestTargetIntrinsicDecorationXXX(inst))
{
return String(intrinsicDecoration->getDefinition());
}
@@ -940,6 +901,7 @@ bool CLikeSourceEmitter::shouldFoldInstIntoUseSites(IRInst* inst)
case kIROp_IntLit:
case kIROp_FloatLit:
case kIROp_BoolLit:
+ case kIROp_CapabilitySet:
return true;
// Always fold these in, because their results
@@ -1130,7 +1092,7 @@ bool CLikeSourceEmitter::shouldFoldInstIntoUseSites(IRInst* inst)
// This is significant, because a target intrinsic's definition can contain multiple accesses to the same
// parameter. This is not indicated in the call, and can lead to output code that computes something multiple
// times as it is folded into the expression of the target intrinsic, which we don't want.
- if (auto targetIntrinsicDecoration = findTargetIntrinsicDecoration(funcValue))
+ if (auto targetIntrinsicDecoration = findBestTargetIntrinsicDecorationXXX(funcValue))
{
// Find the index of the original instruction, to see if it's multiply used.
IRUse* args = callInst->getArgs();
@@ -1333,50 +1295,14 @@ void CLikeSourceEmitter::emitInstResultDecl(IRInst* inst)
m_writer->emit(" = ");
}
-IRTargetIntrinsicDecoration* CLikeSourceEmitter::findTargetIntrinsicDecoration(IRInst* inInst)
+IRTargetSpecificDecoration* CLikeSourceEmitter::findBestTargetDecoration(IRInst* inInst)
{
- // An intrinsic generic function will be invoked through a `specialize` instruction,
- // so the callee won't directly be the thing that is decorated. We will look up
- // through specializations until we can see the actual thing being called.
- //
- IRInst* inst = inInst;
- while (auto specInst = as<IRSpecialize>(inst))
- {
- inst = getSpecializedValue(specInst);
-
- // If `getSpecializedValue` can't find the result value
- // of the generic being specialized, then it returns
- // the original instruction. This would be a disaster
- // for use because this loop would go on forever.
- //
- // This case should never happen if the stdlib is well-formed
- // and the compiler is doing its job right.
- //
- SLANG_ASSERT(inst != specInst);
- }
-
- // We will search through all the `IRTargetIntrinsicDecoration`s on
- // the instruction, looking for those that are applicable to the
- // current code generation target. Among the application decorations
- // we will try to find one that is "best" in the sense that it is
- // more (or at least as) specialized for the target than the
- // others.
- //
- IRTargetIntrinsicDecoration* best = nullptr;
- for(auto dd : inst->getDecorations())
- {
- if (dd->op != kIROp_TargetIntrinsicDecoration)
- continue;
-
- auto targetIntrinsic = (IRTargetIntrinsicDecoration*)dd;
- if (!isTargetIntrinsicModifierApplicable(targetIntrinsic))
- continue;
-
- if(!best || isTargetIntrinsicModifierBetter(targetIntrinsic, best))
- best = targetIntrinsic;
- }
+ return Slang::findBestTargetDecoration(inInst, getTargetCaps());
+}
- return best;
+IRTargetIntrinsicDecoration* CLikeSourceEmitter::findBestTargetIntrinsicDecorationXXX(IRInst* inInst)
+{
+ return as<IRTargetIntrinsicDecoration>(findBestTargetDecoration(inInst));
}
/* static */bool CLikeSourceEmitter::isOrdinaryName(UnownedStringSlice const& name)
@@ -2029,7 +1955,7 @@ void CLikeSourceEmitter::emitCallExpr(IRCall* inst, EmitOpInfo outerPrec)
// We want to detect any call to an intrinsic operation,
// that we can emit it directly without mangling, etc.
- if(auto targetIntrinsic = findTargetIntrinsicDecoration(funcValue))
+ if(auto targetIntrinsic = findBestTargetIntrinsicDecorationXXX(funcValue))
{
emitIntrinsicCallExpr(inst, targetIntrinsic, outerPrec);
}
@@ -3408,7 +3334,7 @@ bool CLikeSourceEmitter::isTargetIntrinsic(IRFunc* func)
// it has a suitable decoration marking it as a
// target intrinsic for the current compilation target.
//
- return findTargetIntrinsicDecoration(func) != nullptr;
+ return findBestTargetIntrinsicDecorationXXX(func) != nullptr;
}
void CLikeSourceEmitter::emitFunc(IRFunc* func)
@@ -3441,7 +3367,7 @@ void CLikeSourceEmitter::emitStruct(IRStructType* structType)
{
// If the selected `struct` type is actually an intrinsic
// on our target, then we don't want to emit anything at all.
- if(auto intrinsicDecoration = findTargetIntrinsicDecoration(structType))
+ if(auto intrinsicDecoration = findBestTargetIntrinsicDecorationXXX(structType))
{
return;
}
diff --git a/source/slang/slang-emit-c-like.h b/source/slang/slang-emit-c-like.h
index 9c91078ee..9c6da8a64 100644
--- a/source/slang/slang-emit-c-like.h
+++ b/source/slang/slang-emit-c-like.h
@@ -22,14 +22,20 @@ public:
struct Desc
{
BackEndCompileRequest* compileRequest = nullptr;
- // The target language we want to generate code for
+
+ /// The target language we want to generate code for
CodeGenTarget target = CodeGenTarget::Unknown;
- // The stage for the entry point we are being asked to compile
+
+ /// The stage for the entry point we are being asked to compile
Stage entryPointStage = Stage::Unknown;
- // The "effective" profile that is being used to emit code,
- // combining information from the target and entry point.
+
+ /// The "effective" profile that is being used to emit code,
+ /// combining information from the target and entry point.
Profile effectiveProfile = Profile::RawEnum::Unknown;
+ /// The capabilities of the target
+ CapabilitySet targetCaps;
+
SourceWriter* sourceWriter = nullptr;
};
@@ -105,6 +111,8 @@ public:
void noteInternalErrorLoc(SourceLoc loc) { return getSink()->noteInternalErrorLoc(loc); }
+ CapabilitySet getTargetCaps() { return m_targetCaps; }
+
//
// Types
//
@@ -126,13 +134,6 @@ public:
void maybeCloseParens(bool needClose);
- bool isTargetIntrinsicModifierApplicable(String const& targetName);
-
- bool isTargetIntrinsicModifierApplicable(IRTargetIntrinsicDecoration* decoration);
-
- /// Is the `candidate` decoration more specialized for the current target than `existing`?
- bool isTargetIntrinsicModifierBetter(IRTargetIntrinsicDecoration* candidate, IRTargetIntrinsicDecoration* existing);
-
void emitStringLiteral(const String& value);
void emitVal(IRInst* val, const EmitOpInfo& outerPrec);
@@ -174,7 +175,8 @@ public:
void emitInstResultDecl(IRInst* inst);
- IRTargetIntrinsicDecoration* findTargetIntrinsicDecoration(IRInst* inst);
+ IRTargetSpecificDecoration* findBestTargetDecoration(IRInst* inst);
+ IRTargetIntrinsicDecoration* findBestTargetIntrinsicDecorationXXX(IRInst* inst);
// Check if the string being used to define a target intrinsic
// is an "ordinary" name, such that we can simply emit a call
@@ -381,6 +383,10 @@ public:
// The target language we want to generate code for
CodeGenTarget m_target;
+
+ /// The capabilities of the target
+ CapabilitySet m_targetCaps;
+
// Source language (based on the more nuanced m_target)
SourceLanguage m_sourceLanguage;
diff --git a/source/slang/slang-emit.cpp b/source/slang/slang-emit.cpp
index 574631567..2118268cd 100644
--- a/source/slang/slang-emit.cpp
+++ b/source/slang/slang-emit.cpp
@@ -722,6 +722,12 @@ SlangResult emitEntryPointsSourceFromIR(
desc.entryPointStage = entryPoint->getStage();
desc.effectiveProfile = getEffectiveProfile(entryPoint, targetRequest);
}
+ else
+ {
+ desc.entryPointStage = Stage::Unknown;
+ desc.effectiveProfile = targetRequest->getTargetProfile();
+ }
+ desc.targetCaps = targetRequest->getTargetCaps();
desc.sourceWriter = &sourceWriter;
// Define here, because must be in scope longer than the sourceEmitter, as sourceEmitter might reference
diff --git a/source/slang/slang-hlsl-intrinsic-set.cpp b/source/slang/slang-hlsl-intrinsic-set.cpp
index b73aa4e8e..13b14a548 100644
--- a/source/slang/slang-hlsl-intrinsic-set.cpp
+++ b/source/slang/slang-hlsl-intrinsic-set.cpp
@@ -514,10 +514,10 @@ HLSLIntrinsic::Op HLSLIntrinsicOpLookup::getOpFromTargetDecoration(IRInst* inIns
// not a targets transformation)
//
// It turns out that addCatchAllIntrinsicDecorationIfNeeded will add a target intrinsic with the
- // original HLSL name, which has a target of ""
+ // original HLSL name, which has an empty `CapabilitySet`.
//
// It's not 100% clear this covers all the cases, but for now lets go with that
- if (decor->getTargetName().getLength() == 0)
+ if (decor->getTargetCaps().isEmpty())
{
Op op = getOpByName(decor->getDefinition());
if (op != Op::Invalid)
diff --git a/source/slang/slang-ir-glsl-legalize.cpp b/source/slang/slang-ir-glsl-legalize.cpp
index 2abd8ba66..5a14fe1aa 100644
--- a/source/slang/slang-ir-glsl-legalize.cpp
+++ b/source/slang/slang-ir-glsl-legalize.cpp
@@ -1431,7 +1431,7 @@ void legalizeEntryPointParameterForGLSL(
// HACK: we will identify the operation based
// on the target-intrinsic definition that was
// given to it.
- auto decoration = findTargetIntrinsicDecoration(callee, "glsl");
+ auto decoration = as<IRTargetIntrinsicDecoration>(findBestTargetDecoration(callee, CapabilityAtom::GLSL));
if(!decoration)
continue;
diff --git a/source/slang/slang-ir-inst-defs.h b/source/slang/slang-ir-inst-defs.h
index 4642ed3f0..736cb0cec 100644
--- a/source/slang/slang-ir-inst-defs.h
+++ b/source/slang/slang-ir-inst-defs.h
@@ -26,6 +26,8 @@ INST(Nop, nop, 0, 0)
INST(StringType, String, 0, 0)
+ INST(CapabilitySetType, CapabilitySet, 0, 0)
+
INST(DynamicType, DynamicType, 0, 0)
INST(AnyValueType, AnyValueType, 1, 0)
@@ -236,6 +238,8 @@ INST(Block, block, 0, PARENT)
INST(StringLit, string_constant, 0, 0)
INST_RANGE(Constant, BoolLit, StringLit)
+INST(CapabilitySet, capabilitySet, 0, 0)
+
INST(undefined, undefined, 0, 0)
// A `defaultConstruct` operation creates an initialized
diff --git a/source/slang/slang-ir-insts.h b/source/slang/slang-ir-insts.h
index baa511a3c..dc42fde70 100644
--- a/source/slang/slang-ir-insts.h
+++ b/source/slang/slang-ir-insts.h
@@ -8,6 +8,7 @@
//
// TODO: the builder probably needs its own file.
+#include "slang-capability.h"
#include "slang-compiler.h"
#include "slang-ir.h"
#include "slang-syntax.h"
@@ -17,6 +18,13 @@ namespace Slang {
class Decl;
+struct IRCapabilitySet : IRInst
+{
+ IR_LEAF_ISA(CapabilitySet);
+
+ CapabilitySet getCaps();
+};
+
struct IRDecoration : IRInst
{
IR_PARENT_ISA(Decoration)
@@ -63,12 +71,9 @@ struct IRTargetSpecificDecoration : IRDecoration
{
IR_PARENT_ISA(TargetSpecificDecoration)
- IRStringLit* getTargetNameOperand() { return cast<IRStringLit>(getOperand(0)); }
+ IRCapabilitySet* getTargetCapsOperand() { return cast<IRCapabilitySet>(getOperand(0)); }
- UnownedStringSlice getTargetName()
- {
- return getTargetNameOperand()->getStringSlice();
- }
+ CapabilitySet getTargetCaps() { return getTargetCapsOperand()->getCaps(); }
};
struct IRTargetDecoration : IRTargetSpecificDecoration
@@ -1822,6 +1827,7 @@ struct IRBuilder
IRInst* getFloatValue(IRType* type, IRFloatingPointValue value);
IRStringLit* getStringValue(const UnownedStringSlice& slice);
IRPtrLit* getPtrValue(void* value);
+ IRInst* getCapabilityValue(CapabilitySet const& caps);
IRBasicType* getBasicType(BaseType baseType);
IRBasicType* getVoidType();
@@ -1830,6 +1836,7 @@ struct IRBuilder
IRBasicType* getUIntType();
IRBasicType* getUInt64Type();
IRStringType* getStringType();
+ IRType* getCapabilitySetType();
IRAssociatedType* getAssociatedType(ArrayView<IRInterfaceType*> constraintTypes);
IRThisType* getThisType(IRInterfaceType* interfaceType);
@@ -2483,14 +2490,24 @@ struct IRBuilder
addDecoration(value, kIROp_SemanticDecoration, getStringValue(text), getIntValue(getIntType(), index));
}
- void addTargetIntrinsicDecoration(IRInst* value, UnownedStringSlice const& target, UnownedStringSlice const& definition)
+ void addTargetIntrinsicDecoration(IRInst* value, IRInst* caps, UnownedStringSlice const& definition)
{
- addDecoration(value, kIROp_TargetIntrinsicDecoration, getStringValue(target), getStringValue(definition));
+ addDecoration(value, kIROp_TargetIntrinsicDecoration, caps, getStringValue(definition));
}
- void addTargetDecoration(IRInst* value, UnownedStringSlice const& target)
+ void addTargetIntrinsicDecoration(IRInst* value, CapabilitySet const& caps, UnownedStringSlice const& definition)
{
- addDecoration(value, kIROp_TargetDecoration, getStringValue(target));
+ addTargetIntrinsicDecoration(value, getCapabilityValue(caps), definition);
+ }
+
+ void addTargetDecoration(IRInst* value, IRInst* caps)
+ {
+ addDecoration(value, kIROp_TargetDecoration, caps);
+ }
+
+ void addTargetDecoration(IRInst* value, CapabilitySet const& caps)
+ {
+ addTargetDecoration(value, getCapabilityValue(caps));
}
void addRequireGLSLExtensionDecoration(IRInst* value, UnownedStringSlice const& extensionName)
@@ -2640,9 +2657,16 @@ void markConstExpr(
//
-IRTargetIntrinsicDecoration* findTargetIntrinsicDecoration(
- IRInst* val,
- String const& targetName);
+IRTargetIntrinsicDecoration* findAnyTargetIntrinsicDecoration(
+ IRInst* val);
+
+IRTargetSpecificDecoration* findBestTargetDecoration(
+ IRInst* val,
+ CapabilitySet const& targetCaps);
+
+IRTargetSpecificDecoration* findBestTargetDecoration(
+ IRInst* val,
+ CapabilityAtom targetCapabilityAtom);
}
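A rough sketch of how a capability set can round-trip through integer operands, mirroring the way `IRCapabilitySet::getCaps` (in slang-ir.cpp) casts each integer-literal operand back to a `CapabilityAtom`. The encoding direction is an assumption here, since the body of `getCapabilityValue` is not shown in full; `Atom`, `encodeAtoms`, and `decodeAtoms` are hypothetical names, and a plain `std::vector<int64_t>` stands in for the IR operand list.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical subset of atoms; the real list comes from slang-capability-defs.h.
enum class Atom : int32_t { Invalid = -1, HLSL, GLSL };

// Assumed encoding: one integer operand per atom, in list order.
static std::vector<int64_t> encodeAtoms(std::vector<Atom> const& atoms)
{
    std::vector<int64_t> operands;
    for (Atom atom : atoms)
        operands.push_back(static_cast<int64_t>(atom));
    return operands;
}

// Decoding casts each integer operand back to an atom, as getCaps does.
static std::vector<Atom> decodeAtoms(std::vector<int64_t> const& operands)
{
    std::vector<Atom> atoms;
    for (int64_t value : operands)
        atoms.push_back(static_cast<Atom>(value));
    return atoms;
}
```

With this layout an empty capability set is simply an instruction with zero operands.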
diff --git a/source/slang/slang-ir-legalize-varying-params.cpp b/source/slang/slang-ir-legalize-varying-params.cpp
index 3df651e6a..c802513e8 100644
--- a/source/slang/slang-ir-legalize-varying-params.cpp
+++ b/source/slang/slang-ir-legalize-varying-params.cpp
@@ -1092,15 +1092,15 @@ struct CUDAEntryPointVaryingParamLegalizeContext : EntryPointVaryingParamLegaliz
// a unique name).
threadIdxGlobalParam = builder.createGlobalParam(uint3Type);
- builder.addTargetIntrinsicDecoration(threadIdxGlobalParam, UnownedTerminatedStringSlice(""), UnownedTerminatedStringSlice("threadIdx"));
+ builder.addTargetIntrinsicDecoration(threadIdxGlobalParam, CapabilitySet::makeEmpty(), UnownedTerminatedStringSlice("threadIdx"));
builder.addLayoutDecoration(threadIdxGlobalParam, varLayout);
blockIdxGlobalParam = builder.createGlobalParam(uint3Type);
- builder.addTargetIntrinsicDecoration(blockIdxGlobalParam, UnownedTerminatedStringSlice(""), UnownedTerminatedStringSlice("blockIdx"));
+ builder.addTargetIntrinsicDecoration(blockIdxGlobalParam, CapabilitySet::makeEmpty(), UnownedTerminatedStringSlice("blockIdx"));
builder.addLayoutDecoration(blockIdxGlobalParam, varLayout);
blockDimGlobalParam = builder.createGlobalParam(uint3Type);
- builder.addTargetIntrinsicDecoration(blockDimGlobalParam, UnownedTerminatedStringSlice(""), UnownedTerminatedStringSlice("blockDim"));
+ builder.addTargetIntrinsicDecoration(blockDimGlobalParam, CapabilitySet::makeEmpty(), UnownedTerminatedStringSlice("blockDim"));
builder.addLayoutDecoration(blockDimGlobalParam, varLayout);
}
@@ -1220,14 +1220,14 @@ struct CPUEntryPointVaryingParamLegalizeContext : EntryPointVaryingParamLegalize
varyingInputStructType = builder.createStructType();
varyingInputStructPtrType = builder.getPtrType(varyingInputStructType);
- builder.addTargetIntrinsicDecoration(varyingInputStructType, UnownedTerminatedStringSlice(""), UnownedTerminatedStringSlice("ComputeThreadVaryingInput"));
+ builder.addTargetIntrinsicDecoration(varyingInputStructType, CapabilitySet::makeEmpty(), UnownedTerminatedStringSlice("ComputeThreadVaryingInput"));
groupIDKey = builder.createStructKey();
- builder.addTargetIntrinsicDecoration(groupIDKey, UnownedTerminatedStringSlice(""), UnownedTerminatedStringSlice("groupID"));
+ builder.addTargetIntrinsicDecoration(groupIDKey, CapabilitySet::makeEmpty(), UnownedTerminatedStringSlice("groupID"));
builder.createStructField(varyingInputStructType, groupIDKey, uint3Type);
groupThreadIDKey = builder.createStructKey();
- builder.addTargetIntrinsicDecoration(groupThreadIDKey, UnownedTerminatedStringSlice(""), UnownedTerminatedStringSlice("groupThreadID"));
+ builder.addTargetIntrinsicDecoration(groupThreadIDKey, CapabilitySet::makeEmpty(), UnownedTerminatedStringSlice("groupThreadID"));
builder.createStructField(varyingInputStructType, groupThreadIDKey, uint3Type);
}
diff --git a/source/slang/slang-ir-link.cpp b/source/slang/slang-ir-link.cpp
index c96286eec..e4c1bad85 100644
--- a/source/slang/slang-ir-link.cpp
+++ b/source/slang/slang-ir-link.cpp
@@ -1,6 +1,7 @@
// slang-ir-link.cpp
#include "slang-ir-link.h"
+#include "slang-capability.h"
#include "slang-ir.h"
#include "slang-ir-insts.h"
#include "slang-mangle.h"
@@ -37,6 +38,9 @@ struct IRSharedSpecContext
// The code-generation target in use
CodeGenTarget target;
+ // The API-level target request
+ TargetRequest* targetReq = nullptr;
+
// The specialized module we are building
RefPtr<IRModule> module;
@@ -915,79 +919,35 @@ IRFunc* specializeIRForEntryPoint(
// Get the capabilities of the target so that we can
// use them to match against target-specialization modifiers
//
-// TODO: We shouldn't be using strings for this.
-String getTargetName(IRSpecContext* context)
+CapabilitySet getTargetCapabilities(IRSpecContext* context)
{
- switch( context->shared->target )
- {
- case CodeGenTarget::HLSL:
- return "hlsl";
-
- case CodeGenTarget::GLSL:
- return "glsl";
-
- case CodeGenTarget::CSource:
- return "c";
-
- case CodeGenTarget::CPPSource:
- return "cpp";
-
- case CodeGenTarget::CUDASource:
- return "cuda";
-
- case CodeGenTarget::SPIRV:
- return "spirv";
-
-
- default:
- SLANG_UNEXPECTED("unhandled case");
- UNREACHABLE_RETURN("unknown");
- }
+ return context->getShared()->targetReq->getTargetCaps();
}
-// How specialized is a given declaration for the chosen target?
-enum class TargetSpecializationLevel
-{
- specializedForOtherTarget = 0,
- notSpecialized,
- specializedForTarget,
-};
-
-TargetSpecializationLevel getTargetSpecialiationLevel(
- IRInst* inVal,
- String const& targetName)
+ /// Get the most appropriate ("best") capability requirements for `inVal` based on the `targetCaps`.
+static CapabilitySet _getBestSpecializationCaps(
+ IRInst* inVal,
+ CapabilitySet const& targetCaps)
{
- // HACK: Currently the front-end is placing modifiers related
- // to target specialization on nodes like functions, even when
- // those functions are being returned by a generic. This
- // means that we need to try and inspect the value being
- // returned by the generic if we are looking at a generic.
- IRInst* val = inVal;
- while( auto genericVal = as<IRGeneric>(val) )
- {
- auto firstBlock = genericVal->getFirstBlock();
- if(!firstBlock) break;
+ IRInst* val = getResolvedInstForDecorations(inVal);
- auto returnInst = as<IRReturnVal>(firstBlock->getLastInst());
- if(!returnInst) break;
+ // If the instruction has no target-related decorations,
+ // then it is implied to be an unspecialized, target-independent
+ // declaration.
+ //
+ // Such a declaration amounts to an empty set of capabilities.
+ //
+ if(!val->findDecoration<IRTargetDecoration>())
+ return CapabilitySet::makeEmpty();
- val = returnInst->getVal();
+ if( auto targetDecoration = findBestTargetDecoration(inVal, targetCaps) )
+ {
+ return targetDecoration->getTargetCaps();
}
-
- TargetSpecializationLevel result = TargetSpecializationLevel::notSpecialized;
- for(auto dd : val->getDecorations())
+ else
{
- if(dd->op != kIROp_TargetDecoration)
- continue;
-
- auto decoration = (IRTargetDecoration*) dd;
- if(String(decoration->getTargetName()) == targetName)
- return TargetSpecializationLevel::specializedForTarget;
-
- result = TargetSpecializationLevel::specializedForOtherTarget;
+ return CapabilitySet::makeInvalid();
}
-
- return result;
}
// Is `newVal` marked as being a better match for our
@@ -1006,43 +966,61 @@ bool isBetterForTarget(
return true;
}
- String targetName = getTargetName(context);
-
// For right now every declaration might have zero or more
- // modifiers, representing the targets for which it is specialized.
- // Each modifier has a single string "tag" to represent a target.
- // We thus decide that a declaration is "more specialized" by:
+ // decorations, representing the capabilities for which it is specialized.
+ // Each decoration has a `CapabilitySet` to represent what it requires of a target.
+ //
+ // We need to look at all the candidate declarations for a symbol
+ // and pick the one that has the "most specialized" set of capabilities
+ // for our chosen target.
//
- // - Does it have a modifier with a tag with the string for the current target?
- // If yes, it is the most specialized it can be.
+ // In principle, this should be as simple as:
//
- // - Does it have a no tags? Then it is "unspecialized" and that is okay.
+ // * Ignore all decorations with capability sets that aren't subsets
+ // of the capabilities of our target.
//
- // - Does it have a modifier with a tag for a *different* target?
- // If yes, then it shouldn't even be usable on this target.
+ // * From the remaining decorations, pick the one that is a superset
+ // of all the others (and give an ambiguity error if there is
+ // no unique "best" option).
//
- // Longer term a better approach is to think of this in terms
- // of a "disjunction of conjunctions" that is:
+ // In practice, the choice is complicated by the way that we currently
+ // have the compiler automatically deduce dependencies on extensions
+ // or other features that were not included as part of the target
+ // description by the user.
//
- // (A and B and C) or (A and D) or (E) or (F and G) ...
+ // In order to preserve the ability to infer more specialized requirements
+ // than what the target includes, we change the two steps slightly:
//
- // A code generation target would then consist of a
- // conjunction of individual tags:
+ // * Ignore all decorations with capability sets that are *incompatible*
+ // with the capabilities of our target, such that they could never be
+ // used together.
//
- // (HLSL and SM_4_0 and Vertex and ...)
+ // * From all the remaining decorations, pick the one that is "better"
+ // than all the others in that it is either a superset of each other
+ // candidate, or for each feature that another candidate requires,
+ // it requires a "better" feature that covers the same ground.
//
- // A declaration is *applicable* on a target if one of
- // its conjunctions of tags is a subset of the target's.
+ // Note: This approach isn't really sound, so we are likely to have
+ // to tweak it over time. Most notably, we probably need/want to
+ // push back on the automatic inference of extensions/versions in
+ // the compiler as much as possible.
//
- // One declaration is *better* than another on a target
- // if it is applicable and its tags are a superset
- // of the other's.
+ CapabilitySet targetCaps = getTargetCapabilities(context);
+ CapabilitySet newCaps = _getBestSpecializationCaps(newVal, targetCaps);
+ CapabilitySet oldCaps = _getBestSpecializationCaps(oldVal, targetCaps);
- auto newLevel = getTargetSpecialiationLevel(newVal, targetName);
-
- auto oldLevel = getTargetSpecialiationLevel(oldVal, targetName);
- if(newLevel != oldLevel)
- return UInt(newLevel) > UInt(oldLevel);
+ // If either value returned an invalid capability set, it implies
+ // that it cannot be used on this target at all, and the other
+ // value should be considered better by default.
+ //
+ // Note: if both of the candidate values we have are incompatible
+ // with our target, then it doesn't matter which we favor.
+ //
+ if(newCaps.isInvalid()) return false;
+ if(oldCaps.isInvalid()) return true;
+
+ if(newCaps != oldCaps)
+ return newCaps.implies(oldCaps);
// All preceding factors being equal, an `[export]` is better
// than an `[import]`.
@@ -1300,7 +1278,8 @@ void initializeSharedSpecContext(
IRSharedSpecContext* sharedContext,
Session* session,
IRModule* module,
- CodeGenTarget target)
+ CodeGenTarget target,
+ TargetRequest* targetReq)
{
SharedIRBuilder* sharedBuilder = &sharedContext->sharedBuilderStorage;
@@ -1318,6 +1297,7 @@ void initializeSharedSpecContext(
sharedBuilder->module = module;
sharedContext->module = module;
sharedContext->target = target;
+ sharedContext->targetReq = targetReq;
}
struct IRSpecializationState
@@ -1372,7 +1352,8 @@ LinkedIR linkIR(
sharedContext,
compileRequest->getSession(),
nullptr,
- target);
+ target,
+ targetReq);
state->irModule = sharedContext->module;
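The "in principle" selection rule from the `isBetterForTarget` comment (drop candidates that are not subsets of the target, then pick the one that is a superset of all remaining candidates) can be sketched as follows. This is an illustration of that simpler rule only, not of the incompatibility-based variant the patch actually uses; `Caps`, `impliesAll`, and `pickBest` are hypothetical names, and atoms are modeled as plain integers.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model: a capability set is a list of integer atom ids.
using Caps = std::vector<int>;

// True if every atom in `b` also appears in `a` (that is, `a` implies `b`).
static bool impliesAll(Caps const& a, Caps const& b)
{
    return std::all_of(b.begin(), b.end(), [&](int atom) {
        return std::find(a.begin(), a.end(), atom) != a.end();
    });
}

// Returns the index of the best candidate for `target`, or -1 when no
// candidate is applicable or no single candidate dominates the others.
static int pickBest(std::vector<Caps> const& candidates, Caps const& target)
{
    // Step 1: keep only candidates whose requirements the target satisfies.
    std::vector<int> applicable;
    for (int i = 0; i < static_cast<int>(candidates.size()); ++i)
        if (impliesAll(target, candidates[i]))
            applicable.push_back(i);

    // Step 2: look for a candidate that is a superset of every applicable one.
    for (int i : applicable)
    {
        bool dominates = true;
        for (int j : applicable)
            dominates = dominates && impliesAll(candidates[i], candidates[j]);
        if (dominates)
            return i; // candidates with equal sets resolve to the first one
    }
    return -1;
}
```

Returning -1 covers both the "nothing applies" case and the ambiguity case that the comment says should produce an error; a real implementation would presumably distinguish the two.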
diff --git a/source/slang/slang-ir-specialize.cpp b/source/slang/slang-ir-specialize.cpp
index 693494ac1..91852ff88 100644
--- a/source/slang/slang-ir-specialize.cpp
+++ b/source/slang/slang-ir-specialize.cpp
@@ -756,27 +756,20 @@ struct SpecializationContext
}
}
- // Finds any `IRTargetDecoration` from `inst`. Recursively chasing `specialize` chains.
- IRTargetIntrinsicDecoration* findTargetIntrinsicDecorationRec(IRInst* inst)
- {
- while (auto specialize = as<IRSpecialize>(inst))
- {
- inst = specialize->getBase();
- }
- while (auto genericInst = as<IRGeneric>(inst))
- {
- inst = findGenericReturnVal(genericInst);
- }
- if (auto decor = inst->findDecoration<IRTargetIntrinsicDecoration>())
- return decor;
- return nullptr;
- }
-
// Returns true if the call inst represents a call to
// StructuredBuffer::operator[]/Load/Consume methods.
bool isBufferLoadCall(IRCall* inst)
{
- if (auto targetIntrinsic = findTargetIntrinsicDecorationRec(inst->getCallee()))
+ // TODO: We should have something like a `[knownSemantics(...)]` decoration
+ // that can identify that an `IRFunc` has semantics that are consistent
+ // with a list of compiler-known behaviors. The operand to `knownSemantics`
+ // could come from a `KnownSemantics` enumeration or something similar,
+ // so that we don't have to make string-based checks here.
+ //
+ // Note that `[knownSemantics(...)]` would be independent of any targets,
+ // and could even apply to functions that are implemented entirely in Slang.
+
+ if (auto targetIntrinsic = findAnyTargetIntrinsicDecoration(inst->getCallee()))
{
auto name = targetIntrinsic->getDefinition();
if (name == ".operator[]" || name == ".Load" || name == ".Consume")
diff --git a/source/slang/slang-ir.cpp b/source/slang/slang-ir.cpp
index 99c601051..aa72cc0c3 100644
--- a/source/slang/slang-ir.cpp
+++ b/source/slang/slang-ir.cpp
@@ -234,6 +234,23 @@ namespace Slang
}
}
+ // IRCapabilitySet
+
+ CapabilitySet IRCapabilitySet::getCaps()
+ {
+ List<CapabilityAtom> atoms;
+
+ Index count = (Index) getOperandCount();
+ for(Index i = 0; i < count; ++i)
+ {
+ auto operand = cast<IRIntLit>(getOperand(i));
+ atoms.add(CapabilityAtom(operand->getValue()));
+ }
+
+ return CapabilitySet(atoms.getCount(), atoms.getBuffer());
+ }
+
+
// IRParam
IRParam* IRParam::getNextParam()
@@ -2138,6 +2155,21 @@ namespace Slang
return (IRPtrLit*) findOrEmitConstant(this, keyInst);
}
+ IRInst* IRBuilder::getCapabilityValue(CapabilitySet const& caps)
+ {
+ IRType* capabilityAtomType = getIntType();
+ IRType* capabilitySetType = getCapabilitySetType();
+
+ List<IRInst*> args;
+ for( auto atom : caps.getAtoms() )
+ {
+ args.add(getIntValue(capabilityAtomType, Int(atom)));
+ }
+
+ return findOrEmitHoistableInst(
+ capabilitySetType, kIROp_CapabilitySet, args.getCount(), args.getBuffer());
+ }
+
IRInst* IRBuilder::findOrEmitHoistableInst(
IRType* type,
IROp op,
@@ -2406,6 +2438,11 @@ namespace Slang
return (IRStringType*)getType(kIROp_StringType);
}
+ IRType* IRBuilder::getCapabilitySetType()
+ {
+ return getType(kIROp_CapabilitySetType);
+ }
+
IRDynamicType* IRBuilder::getDynamicType() { return (IRDynamicType*)getType(kIROp_DynamicType); }
IRAssociatedType* IRBuilder::getAssociatedType(ArrayView<IRInterfaceType*> constraintTypes)
@@ -5615,23 +5652,76 @@ namespace Slang
return t;
}
- IRTargetIntrinsicDecoration* findTargetIntrinsicDecoration(
- IRInst* val,
- String const& targetName)
+ //
+ // IRTargetIntrinsicDecoration
+ //
+
+ static bool _areIntrinsicCapsBetterForTarget(
+ CapabilitySet const& candidateCaps,
+ CapabilitySet const& existingCaps,
+ CapabilitySet const& targetCaps)
+ {
+ bool candidateIsAvailable = targetCaps.implies(candidateCaps);
+ bool existingIsAvailable = targetCaps.implies(existingCaps);
+ if(candidateIsAvailable != existingIsAvailable)
+ return candidateIsAvailable;
+
+ if(candidateCaps.implies(existingCaps))
+ return true;
+
+ return false;
+ }
+
+ IRTargetIntrinsicDecoration* findAnyTargetIntrinsicDecoration(
+ IRInst* val)
{
- for(auto dd : val->getDecorations())
+ IRInst* inst = getResolvedInstForDecorations(val);
+ return inst->findDecoration<IRTargetIntrinsicDecoration>();
+ }
+
+ IRTargetSpecificDecoration* findBestTargetDecoration(
+ IRInst* inInst,
+ CapabilitySet const& targetCaps)
+ {
+ IRInst* inst = getResolvedInstForDecorations(inInst);
+
+ // We will search through all the `IRTargetIntrinsicDecoration`s on
+ // the instruction, looking for those that are applicable to the
+ // current code generation target. Among the applicable decorations
+ // we will try to find one that is "best" in the sense that it is
+ // more (or at least as) specialized for the target than the
+ // others.
+ //
+ IRTargetSpecificDecoration* bestDecoration = nullptr;
+ CapabilitySet bestCaps;
+ for(auto dd : inst->getDecorations())
{
- if(dd->op != kIROp_TargetIntrinsicDecoration)
+ auto decoration = as<IRTargetSpecificDecoration>(dd);
+ if(!decoration)
continue;
- auto decoration = (IRTargetIntrinsicDecoration*) dd;
- if(String(decoration->getTargetName()) == targetName)
- return decoration;
+ auto decorationCaps = decoration->getTargetCaps();
+ if (decorationCaps.isIncompatibleWith(targetCaps))
+ continue;
+
+ if(!bestDecoration || _areIntrinsicCapsBetterForTarget(decorationCaps, bestCaps, targetCaps))
+ {
+ bestDecoration = decoration;
+ bestCaps = decorationCaps;
+ }
}
- return nullptr;
+ return bestDecoration;
}
+ IRTargetSpecificDecoration* findBestTargetDecoration(
+ IRInst* val,
+ CapabilityAtom targetCapabilityAtom)
+ {
+ return findBestTargetDecoration(val, CapabilitySet(targetCapabilityAtom));
+ }
+
+
#if 0
IRFunc* cloneSimpleFuncWithoutRegistering(IRSpecContextBase* context, IRFunc* originalFunc)
{
@@ -5671,29 +5761,46 @@ namespace Slang
IRInst* findSpecializeReturnVal(IRSpecialize* specialize)
{
- auto generic = findSpecializedGeneric(specialize);
- if(!generic)
- return nullptr;
+ auto base = specialize->getBase();
+
+ while( auto baseSpec = as<IRSpecialize>(base) )
+ {
+ auto returnVal = findSpecializeReturnVal(baseSpec);
+ if(!returnVal)
+ break;
- return findGenericReturnVal(generic);
+ base = returnVal;
+ }
+
+ if( auto generic = as<IRGeneric>(base) )
+ {
+ return findGenericReturnVal(generic);
+ }
+
+ return nullptr;
}
IRInst* getResolvedInstForDecorations(IRInst* inst)
{
IRInst* candidate = inst;
- while(auto specInst = as<IRSpecialize>(candidate))
+ for(;;)
{
- auto genericInst = as<IRGeneric>(specInst->getBase());
- if(!genericInst)
- break;
-
- auto returnVal = findGenericReturnVal(genericInst);
- if(!returnVal)
- break;
+ if(auto specInst = as<IRSpecialize>(candidate))
+ {
+ candidate = specInst->getBase();
+ continue;
+ }
+ if( auto genericInst = as<IRGeneric>(candidate) )
+ {
+ if( auto returnVal = findGenericReturnVal(genericInst) )
+ {
+ candidate = returnVal;
+ continue;
+ }
+ }
- candidate = returnVal;
+ return candidate;
}
- return candidate;
}
bool isDefinition(
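The rewritten `getResolvedInstForDecorations` in the slang-ir.cpp hunk above now strips any mix of `specialize` and generic wrappers in a single loop. A minimal sketch of that unwrapping, using a hypothetical `ToyInst` type in place of the real `IRInst` hierarchy (the field names `base` and `returnVal` are illustrative, not Slang's API):

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for IR instructions.
enum class Kind { Func, Generic, Specialize };

struct ToyInst
{
    Kind kind;
    ToyInst* base = nullptr;      // for Specialize: the value being specialized
    ToyInst* returnVal = nullptr; // for Generic: the value the generic yields
};

// Peel off wrappers until we reach the instruction that carries decorations.
ToyInst* resolveForDecorations(ToyInst* inst)
{
    ToyInst* candidate = inst;
    for (;;)
    {
        if (candidate->kind == Kind::Specialize)
        {
            candidate = candidate->base;
            continue;
        }
        if (candidate->kind == Kind::Generic && candidate->returnVal)
        {
            candidate = candidate->returnVal;
            continue;
        }
        // Neither a specialization nor a generic with a known return
        // value: this is as far as we can resolve.
        return candidate;
    }
}
```

Compared with the removed loop, which stopped as soon as the base of a `specialize` was not directly a generic, this form also handles nested generics and a `specialize` whose base is itself a `specialize`.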
diff --git a/source/slang/slang-lower-to-ir.cpp b/source/slang/slang-lower-to-ir.cpp
index b2a7529c5..34b189b14 100644
--- a/source/slang/slang-lower-to-ir.cpp
+++ b/source/slang/slang-lower-to-ir.cpp
@@ -530,6 +530,9 @@ bool isEffectivelyStatic(
Decl* decl,
ContainerDecl* parentDecl);
+bool isStdLibMemberFuncDecl(
+ Decl* decl);
+
// Ensure that a version of the given declaration has been emitted to the IR
LoweredValInfo ensureDecl(
IRGenContext* context,
@@ -6522,7 +6525,15 @@ struct DeclLoweringVisitor : DeclVisitor<DeclLoweringVisitor, LoweredValInfo>
}
else
{
- definition = decl->getName()->text;
+ if( isStdLibMemberFuncDecl(decl) )
+ {
+ // We will mark member functions by prefixing their name
+ // with a `.`.
+ //
+ definition.append(".");
+ }
+
+ definition.append(decl->getName()->text);
}
UnownedStringSlice targetName;
@@ -6532,7 +6543,19 @@ struct DeclLoweringVisitor : DeclVisitor<DeclLoweringVisitor, LoweredValInfo>
targetName = targetToken.getContent();
}
- builder->addTargetIntrinsicDecoration(irInst, targetName, definition.getUnownedSlice());
+ CapabilitySet targetCaps;
+ if( targetName.getLength() == 0 )
+ {
+ targetCaps = CapabilitySet::makeEmpty();
+ }
+ else
+ {
+ CapabilityAtom targetCap = findCapabilityAtom(targetName);
+ SLANG_ASSERT(targetCap != CapabilityAtom::Invalid);
+ targetCaps = CapabilitySet(targetCap);
+ }
+
+ builder->addTargetIntrinsicDecoration(irInst, targetCaps, definition.getUnownedSlice());
}
if(auto nvapiMod = decl->findModifier<NVAPIMagicModifier>())
@@ -6543,8 +6566,12 @@ struct DeclLoweringVisitor : DeclVisitor<DeclLoweringVisitor, LoweredValInfo>
/// Is `decl` a member function (or effectively a member function) when considered as a stdlib declaration?
bool isStdLibMemberFuncDecl(
- CallableDecl* decl)
+ Decl* inDecl)
{
+ auto decl = as<CallableDecl>(inDecl);
+ if(!decl)
+ return false;
+
// Constructors aren't really member functions, insofar
// as they aren't called with a `this` parameter.
//
@@ -6655,7 +6682,7 @@ struct DeclLoweringVisitor : DeclVisitor<DeclLoweringVisitor, LoweredValInfo>
definition.append(getText(declForName->getName()));
- getBuilder()->addTargetIntrinsicDecoration(irInst, UnownedStringSlice(), definition.getUnownedSlice());
+ getBuilder()->addTargetIntrinsicDecoration(irInst, CapabilitySet::makeEmpty(), definition.getUnownedSlice());
}
void addParamNameHint(IRInst* inst, IRLoweringParameterInfo const& info)
@@ -6974,7 +7001,10 @@ struct DeclLoweringVisitor : DeclVisitor<DeclLoweringVisitor, LoweredValInfo>
// a specialized definition of the particular function for the given
// target, and we need to reflect that at the IR level.
- getBuilder()->addTargetDecoration(irFunc, targetMod->targetToken.getContent());
+ auto targetName = targetMod->targetToken.getContent();
+ auto targetCap = findCapabilityAtom(targetName);
+
+ getBuilder()->addTargetDecoration(irFunc, CapabilitySet(targetCap));
}
// If this declaration was marked as having a target-specific lowering
diff --git a/source/slang/slang.cpp b/source/slang/slang.cpp
index 92a08a224..ef838b871 100644
--- a/source/slang/slang.cpp
+++ b/source/slang/slang.cpp
@@ -971,6 +971,74 @@ MatrixLayoutMode TargetRequest::getDefaultMatrixLayoutMode()
return linkage->getDefaultMatrixLayoutMode();
}
+CapabilitySet TargetRequest::getTargetCaps()
+{
+ if(!targetCaps.isInvalid())
+ return targetCaps;
+
+ // The full `CapabilitySet` for the target will be computed
+ // from the combination of the code generation format, and
+ // the profile.
+ //
+ // Note: the profile might have been set in a way that is
+ // inconsistent with the output code format (e.g., an output
+ // format of SPIR-V, but a profile of Direct3D Shader Model 5.1).
+ // In those cases, the format should always override the
+ // implications in the profile.
+ //
+ // TODO: This logic isn't currently taking into account
+ // the information in the profile, because the current
+ // `CapabilityAtom`s that we support don't include any
+ // of the details there (e.g., the shader model versions).
+ //
+ // Eventually, we'd want to have a rich set of capability
+ // atoms, so that most of the information about what operations
+ // are available where can be directly encoded on the declarations.
+
+ List<CapabilityAtom> atoms;
+ switch(target)
+ {
+ case CodeGenTarget::GLSL:
+ case CodeGenTarget::GLSL_Vulkan:
+ case CodeGenTarget::GLSL_Vulkan_OneDesc:
+ case CodeGenTarget::SPIRV:
+ case CodeGenTarget::SPIRVAssembly:
+ atoms.add(CapabilityAtom::GLSL);
+ break;
+
+ case CodeGenTarget::HLSL:
+ case CodeGenTarget::DXBytecode:
+ case CodeGenTarget::DXBytecodeAssembly:
+ case CodeGenTarget::DXIL:
+ case CodeGenTarget::DXILAssembly:
+ atoms.add(CapabilityAtom::HLSL);
+ break;
+
+ case CodeGenTarget::CSource:
+ atoms.add(CapabilityAtom::C);
+ break;
+
+ case CodeGenTarget::CPPSource:
+ case CodeGenTarget::Executable:
+ case CodeGenTarget::SharedLibrary:
+ case CodeGenTarget::HostCallable:
+ atoms.add(CapabilityAtom::CPP);
+ break;
+
+ case CodeGenTarget::CUDASource:
+ case CodeGenTarget::PTX:
+ atoms.add(CapabilityAtom::CUDA);
+ break;
+
+ default:
+ break;
+ }
+
+ targetCaps = CapabilitySet(atoms);
+ return targetCaps;
+}
+
+
TypeLayout* TargetRequest::getTypeLayout(Type* type)
{
// TODO: We are not passing in a `ProgramLayout` here, although one