| author | Tim Foley <tfoleyNV@users.noreply.github.com> | 2019-11-14 13:11:07 -0800 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2019-11-14 13:11:07 -0800 |
| commit | ce4829b03622c7c23096253b0ee80b0fc923321e | |
| tree | e1232ee908b2e3fa604d1c68c08ca4df4f4f8652 | |
| parent | d631233f4fcc2e41a9e7d7e0d3e277c90c81582b | |
Initial work on direct emission of SPIR-V (#1118)
* Initial work on direct emission of SPIR-V
This change adds a first vertical slice of support for emitting SPIR-V code directly from the Slang IR, instead of generating it indirectly via GLSL.
This work isn't usable for anything valuable right now; the goal is just to get something checked in that we can incrementally extend over time.
When invoking `slangc`, the `-emit-spirv-directly` option can be used to turn on the new code path.
I have not bothered to add an equivalent API option, because this flag is only intended to be used for testing in the immediate future.
The existing `emitEntryPoint()` function has become `emitEntryPointSource()` to more accurately reflect its role in a world where we can also emit entry points to a binary format.
Much of the logic that was inside `emitEntryPoint()` had to do with linking and then optimizing/transforming Slang IR code to get it ready for emission on a particular target.
This logic has been factored into a new `linkAndOptimizeIR()` function that can be shared between the path that emits source and the new one that emits SPIR-V.
The meat of the change is then the `emitSPIRVFromIR()` function in `slang-emit-spirv.cpp`, which is called *after* all the optimizations and transformations have been applied to the Slang IR to get it ready.
Rather than repeat myself here, I will try to make the comments in `slang-emit-spirv.cpp` usable as documentation of the approach being taken.
Smaller notes:
* I've included a test case that compares `slangc` output directly to expected SPIR-V. This is perhaps not an ideal plan for how to test SPIR-V emission going forward, but it suffices for now.
* The `external/` directory needed to be added to the include dirs for the `slang` project so that the new code can depend on the SPIR-V header.
* In `slang-ir-link`, the direct SPIR-V generation path means that we now link with a target of SPIR-V instead of GLSL. In principle this can be used to ensure that appropriate variants of intrinsics are selected based on the knowledge that we are emitting SPIR-V. In practice, that isn't being used at all.
* Fixup: path for SPIR-V headers
While working on this PR I used a copy of `spirv.h` that I placed into the repository tree manually, but since I started the work we ended up with SPIR-V headers in our tree anyway, albeit at a different path.
This change tries to fix things up so that my code uses the headers that were already placed in the repository.
* fixup: 64-bit build issue
* fixup: typo fixes based on review
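As an illustration only, the refactoring described in the commit message (a shared `linkAndOptimizeIR()` step feeding either source emission or `emitSPIRVFromIR()`) could be sketched roughly as follows. Only the names `linkAndOptimizeIR`, `emitEntryPointSource`, and `emitSPIRVFromIR` come from the commit message. The `IRModule` struct, the `emitEntryPointSPIRV` wrapper, and every signature and body are hypothetical stand-ins chosen for this sketch, and the real Slang code likely differs.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a Slang IR module; the real type is much richer.
struct IRModule
{
    bool linked = false;
    bool optimized = false;
};

// Shared preparation step. The name comes from the commit message;
// the signature and body are assumptions made for this sketch.
IRModule linkAndOptimizeIR(IRModule ir)
{
    ir.linked = true;
    ir.optimized = true;
    return ir;
}

// Source path: prepare the IR, then emit text (placeholder output only).
std::string emitEntryPointSource(const IRModule& input)
{
    IRModule ir = linkAndOptimizeIR(input);
    assert(ir.linked && ir.optimized);
    return "/* generated source would go here */";
}

// Binary emitter: expects IR that has already been linked and optimized.
std::vector<uint32_t> emitSPIRVFromIR(const IRModule& ir)
{
    assert(ir.linked && ir.optimized);
    // Placeholder: only the SPIR-V magic number; real output has many more words.
    return { 0x07230203u };
}

// Hypothetical wrapper for the direct SPIR-V path (name invented for this sketch).
std::vector<uint32_t> emitEntryPointSPIRV(const IRModule& input)
{
    return emitSPIRVFromIR(linkAndOptimizeIR(input));
}
```

The point of the shape is that both paths call the same preparation function, so target-independent transformations live in one place and the emitters only see IR that is ready for output.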
| Mode | File | Lines changed |
|---|---|---|
| -rw-r--r-- | external/spirv/spirv.h | 1093 |
| -rw-r--r-- | premake5.lua | 2 |
| -rw-r--r-- | source/slang/slang-compiler.cpp | 44 |
| -rw-r--r-- | source/slang/slang-compiler.h | 3 |
| -rw-r--r-- | source/slang/slang-emit-spirv.cpp | 1141 |
| -rw-r--r-- | source/slang/slang-emit.cpp | 606 |
| -rw-r--r-- | source/slang/slang-emit.h | 2 |
| -rw-r--r-- | source/slang/slang-ir-link.cpp | 3 |
| -rw-r--r-- | source/slang/slang-options.cpp | 4 |
| -rw-r--r-- | source/slang/slang.vcxproj | 5 |
| -rw-r--r-- | source/slang/slang.vcxproj.filters | 3 |
| -rw-r--r-- | tests/spirv/direct-spirv-emit.slang | 9 |
| -rw-r--r-- | tests/spirv/direct-spirv-emit.slang.expected | 20 |
13 files changed, 2671 insertions, 264 deletions
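The vendored `spirv.h` in the diff that follows defines `SpvMagicNumber`, `SpvVersion`, `SpvOpCodeMask`, and `SpvWordCountShift`, which are the pieces needed to write the word stream of a SPIR-V binary. The sketch below shows how those constants are typically used: a five-word module header, followed by instructions whose first word packs the word count into the high 16 bits and the opcode into the low 16 bits. The constant values are copied from the header so the block compiles on its own. The helper names (`appendInst`, `buildMinimalModule`, `opcodeOf`, `wordCountOf`) are assumptions made for this example and are not taken from `slang-emit-spirv.cpp`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Values copied from external/spirv/spirv.h (redeclared here so the
// sketch is self-contained; real code would include the header instead).
static const uint32_t SpvMagicNumber = 0x07230203;
static const uint32_t SpvVersion = 0x00010300;
static const uint32_t SpvOpCodeMask = 0xffff;
static const uint32_t SpvWordCountShift = 16;
static const uint32_t SpvOpMemoryModel = 14;
static const uint32_t SpvOpCapability = 17;
static const uint32_t SpvCapabilityShader = 1;
static const uint32_t SpvAddressingModelLogical = 0;
static const uint32_t SpvMemoryModelGLSL450 = 1;

// Hypothetical helper: append one instruction to the word stream.
// The first word holds (word count << 16) | opcode; the word count
// includes that first word itself.
void appendInst(std::vector<uint32_t>& words,
                uint32_t opcode,
                const std::vector<uint32_t>& operands)
{
    uint32_t wordCount = static_cast<uint32_t>(operands.size()) + 1;
    assert(wordCount <= 0xffff && opcode <= SpvOpCodeMask);
    words.push_back((wordCount << SpvWordCountShift) | opcode);
    words.insert(words.end(), operands.begin(), operands.end());
}

// Hypothetical helper: header plus two instructions. This is not a
// complete or validated shader module, only an encoding demonstration.
std::vector<uint32_t> buildMinimalModule()
{
    std::vector<uint32_t> words;
    words.push_back(SpvMagicNumber);
    words.push_back(SpvVersion);
    words.push_back(0); // generator magic number (0 used as a placeholder)
    words.push_back(1); // bound: every <id> must be below this; none are used
    words.push_back(0); // reserved (schema)
    appendInst(words, SpvOpCapability, { SpvCapabilityShader });
    appendInst(words, SpvOpMemoryModel,
               { SpvAddressingModelLogical, SpvMemoryModelGLSL450 });
    return words;
}

// Decoding the first word of an instruction reverses the packing.
uint32_t opcodeOf(uint32_t firstWord) { return firstWord & SpvOpCodeMask; }
uint32_t wordCountOf(uint32_t firstWord) { return firstWord >> SpvWordCountShift; }
```

Calling `buildMinimalModule()` yields ten words: five for the header, two for `OpCapability Shader`, and three for `OpMemoryModel Logical GLSL450`. A reader can walk the stream by starting at word 5 and advancing by `wordCountOf(...)` each step.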
diff --git a/external/spirv/spirv.h b/external/spirv/spirv.h new file mode 100644 index 000000000..4c90c936c --- /dev/null +++ b/external/spirv/spirv.h @@ -0,0 +1,1093 @@ +/* +** Copyright (c) 2014-2018 The Khronos Group Inc. +** +** Permission is hereby granted, free of charge, to any person obtaining a copy +** of this software and/or associated documentation files (the "Materials"), +** to deal in the Materials without restriction, including without limitation +** the rights to use, copy, modify, merge, publish, distribute, sublicense, +** and/or sell copies of the Materials, and to permit persons to whom the +** Materials are furnished to do so, subject to the following conditions: +** +** The above copyright notice and this permission notice shall be included in +** all copies or substantial portions of the Materials. +** +** MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS KHRONOS +** STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS SPECIFICATIONS AND +** HEADER INFORMATION ARE LOCATED AT https://www.khronos.org/registry/ +** +** THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS +** OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +** FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +** THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +** LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +** FROM,OUT OF OR IN CONNECTION WITH THE MATERIALS OR THE USE OR OTHER DEALINGS +** IN THE MATERIALS. +*/ + +/* +** This header is automatically generated by the same tool that creates +** the Binary Section of the SPIR-V specification. 
+*/ + +/* +** Enumeration tokens for SPIR-V, in various styles: +** C, C++, C++11, JSON, Lua, Python +** +** - C will have tokens with a "Spv" prefix, e.g.: SpvSourceLanguageGLSL +** - C++ will have tokens in the "spv" name space, e.g.: spv::SourceLanguageGLSL +** - C++11 will use enum classes in the spv namespace, e.g.: spv::SourceLanguage::GLSL +** - Lua will use tables, e.g.: spv.SourceLanguage.GLSL +** - Python will use dictionaries, e.g.: spv['SourceLanguage']['GLSL'] +** +** Some tokens act like mask values, which can be OR'd together, +** while others are mutually exclusive. The mask-like ones have +** "Mask" in their name, and a parallel enum that has the shift +** amount (1 << x) for each corresponding enumerant. +*/ + +#ifndef spirv_H +#define spirv_H + +typedef unsigned int SpvId; + +#define SPV_VERSION 0x10300 +#define SPV_REVISION 1 + +static const unsigned int SpvMagicNumber = 0x07230203; +static const unsigned int SpvVersion = 0x00010300; +static const unsigned int SpvRevision = 1; +static const unsigned int SpvOpCodeMask = 0xffff; +static const unsigned int SpvWordCountShift = 16; + +typedef enum SpvSourceLanguage_ { + SpvSourceLanguageUnknown = 0, + SpvSourceLanguageESSL = 1, + SpvSourceLanguageGLSL = 2, + SpvSourceLanguageOpenCL_C = 3, + SpvSourceLanguageOpenCL_CPP = 4, + SpvSourceLanguageHLSL = 5, + SpvSourceLanguageMax = 0x7fffffff, +} SpvSourceLanguage; + +typedef enum SpvExecutionModel_ { + SpvExecutionModelVertex = 0, + SpvExecutionModelTessellationControl = 1, + SpvExecutionModelTessellationEvaluation = 2, + SpvExecutionModelGeometry = 3, + SpvExecutionModelFragment = 4, + SpvExecutionModelGLCompute = 5, + SpvExecutionModelKernel = 6, + SpvExecutionModelMax = 0x7fffffff, +} SpvExecutionModel; + +typedef enum SpvAddressingModel_ { + SpvAddressingModelLogical = 0, + SpvAddressingModelPhysical32 = 1, + SpvAddressingModelPhysical64 = 2, + SpvAddressingModelMax = 0x7fffffff, +} SpvAddressingModel; + +typedef enum SpvMemoryModel_ { + 
SpvMemoryModelSimple = 0, + SpvMemoryModelGLSL450 = 1, + SpvMemoryModelOpenCL = 2, + SpvMemoryModelMax = 0x7fffffff, +} SpvMemoryModel; + +typedef enum SpvExecutionMode_ { + SpvExecutionModeInvocations = 0, + SpvExecutionModeSpacingEqual = 1, + SpvExecutionModeSpacingFractionalEven = 2, + SpvExecutionModeSpacingFractionalOdd = 3, + SpvExecutionModeVertexOrderCw = 4, + SpvExecutionModeVertexOrderCcw = 5, + SpvExecutionModePixelCenterInteger = 6, + SpvExecutionModeOriginUpperLeft = 7, + SpvExecutionModeOriginLowerLeft = 8, + SpvExecutionModeEarlyFragmentTests = 9, + SpvExecutionModePointMode = 10, + SpvExecutionModeXfb = 11, + SpvExecutionModeDepthReplacing = 12, + SpvExecutionModeDepthGreater = 14, + SpvExecutionModeDepthLess = 15, + SpvExecutionModeDepthUnchanged = 16, + SpvExecutionModeLocalSize = 17, + SpvExecutionModeLocalSizeHint = 18, + SpvExecutionModeInputPoints = 19, + SpvExecutionModeInputLines = 20, + SpvExecutionModeInputLinesAdjacency = 21, + SpvExecutionModeTriangles = 22, + SpvExecutionModeInputTrianglesAdjacency = 23, + SpvExecutionModeQuads = 24, + SpvExecutionModeIsolines = 25, + SpvExecutionModeOutputVertices = 26, + SpvExecutionModeOutputPoints = 27, + SpvExecutionModeOutputLineStrip = 28, + SpvExecutionModeOutputTriangleStrip = 29, + SpvExecutionModeVecTypeHint = 30, + SpvExecutionModeContractionOff = 31, + SpvExecutionModeInitializer = 33, + SpvExecutionModeFinalizer = 34, + SpvExecutionModeSubgroupSize = 35, + SpvExecutionModeSubgroupsPerWorkgroup = 36, + SpvExecutionModeSubgroupsPerWorkgroupId = 37, + SpvExecutionModeLocalSizeId = 38, + SpvExecutionModeLocalSizeHintId = 39, + SpvExecutionModePostDepthCoverage = 4446, + SpvExecutionModeStencilRefReplacingEXT = 5027, + SpvExecutionModeMax = 0x7fffffff, +} SpvExecutionMode; + +typedef enum SpvStorageClass_ { + SpvStorageClassUniformConstant = 0, + SpvStorageClassInput = 1, + SpvStorageClassUniform = 2, + SpvStorageClassOutput = 3, + SpvStorageClassWorkgroup = 4, + SpvStorageClassCrossWorkgroup = 
5, + SpvStorageClassPrivate = 6, + SpvStorageClassFunction = 7, + SpvStorageClassGeneric = 8, + SpvStorageClassPushConstant = 9, + SpvStorageClassAtomicCounter = 10, + SpvStorageClassImage = 11, + SpvStorageClassStorageBuffer = 12, + SpvStorageClassMax = 0x7fffffff, +} SpvStorageClass; + +typedef enum SpvDim_ { + SpvDim1D = 0, + SpvDim2D = 1, + SpvDim3D = 2, + SpvDimCube = 3, + SpvDimRect = 4, + SpvDimBuffer = 5, + SpvDimSubpassData = 6, + SpvDimMax = 0x7fffffff, +} SpvDim; + +typedef enum SpvSamplerAddressingMode_ { + SpvSamplerAddressingModeNone = 0, + SpvSamplerAddressingModeClampToEdge = 1, + SpvSamplerAddressingModeClamp = 2, + SpvSamplerAddressingModeRepeat = 3, + SpvSamplerAddressingModeRepeatMirrored = 4, + SpvSamplerAddressingModeMax = 0x7fffffff, +} SpvSamplerAddressingMode; + +typedef enum SpvSamplerFilterMode_ { + SpvSamplerFilterModeNearest = 0, + SpvSamplerFilterModeLinear = 1, + SpvSamplerFilterModeMax = 0x7fffffff, +} SpvSamplerFilterMode; + +typedef enum SpvImageFormat_ { + SpvImageFormatUnknown = 0, + SpvImageFormatRgba32f = 1, + SpvImageFormatRgba16f = 2, + SpvImageFormatR32f = 3, + SpvImageFormatRgba8 = 4, + SpvImageFormatRgba8Snorm = 5, + SpvImageFormatRg32f = 6, + SpvImageFormatRg16f = 7, + SpvImageFormatR11fG11fB10f = 8, + SpvImageFormatR16f = 9, + SpvImageFormatRgba16 = 10, + SpvImageFormatRgb10A2 = 11, + SpvImageFormatRg16 = 12, + SpvImageFormatRg8 = 13, + SpvImageFormatR16 = 14, + SpvImageFormatR8 = 15, + SpvImageFormatRgba16Snorm = 16, + SpvImageFormatRg16Snorm = 17, + SpvImageFormatRg8Snorm = 18, + SpvImageFormatR16Snorm = 19, + SpvImageFormatR8Snorm = 20, + SpvImageFormatRgba32i = 21, + SpvImageFormatRgba16i = 22, + SpvImageFormatRgba8i = 23, + SpvImageFormatR32i = 24, + SpvImageFormatRg32i = 25, + SpvImageFormatRg16i = 26, + SpvImageFormatRg8i = 27, + SpvImageFormatR16i = 28, + SpvImageFormatR8i = 29, + SpvImageFormatRgba32ui = 30, + SpvImageFormatRgba16ui = 31, + SpvImageFormatRgba8ui = 32, + SpvImageFormatR32ui = 33, + 
SpvImageFormatRgb10a2ui = 34, + SpvImageFormatRg32ui = 35, + SpvImageFormatRg16ui = 36, + SpvImageFormatRg8ui = 37, + SpvImageFormatR16ui = 38, + SpvImageFormatR8ui = 39, + SpvImageFormatMax = 0x7fffffff, +} SpvImageFormat; + +typedef enum SpvImageChannelOrder_ { + SpvImageChannelOrderR = 0, + SpvImageChannelOrderA = 1, + SpvImageChannelOrderRG = 2, + SpvImageChannelOrderRA = 3, + SpvImageChannelOrderRGB = 4, + SpvImageChannelOrderRGBA = 5, + SpvImageChannelOrderBGRA = 6, + SpvImageChannelOrderARGB = 7, + SpvImageChannelOrderIntensity = 8, + SpvImageChannelOrderLuminance = 9, + SpvImageChannelOrderRx = 10, + SpvImageChannelOrderRGx = 11, + SpvImageChannelOrderRGBx = 12, + SpvImageChannelOrderDepth = 13, + SpvImageChannelOrderDepthStencil = 14, + SpvImageChannelOrdersRGB = 15, + SpvImageChannelOrdersRGBx = 16, + SpvImageChannelOrdersRGBA = 17, + SpvImageChannelOrdersBGRA = 18, + SpvImageChannelOrderABGR = 19, + SpvImageChannelOrderMax = 0x7fffffff, +} SpvImageChannelOrder; + +typedef enum SpvImageChannelDataType_ { + SpvImageChannelDataTypeSnormInt8 = 0, + SpvImageChannelDataTypeSnormInt16 = 1, + SpvImageChannelDataTypeUnormInt8 = 2, + SpvImageChannelDataTypeUnormInt16 = 3, + SpvImageChannelDataTypeUnormShort565 = 4, + SpvImageChannelDataTypeUnormShort555 = 5, + SpvImageChannelDataTypeUnormInt101010 = 6, + SpvImageChannelDataTypeSignedInt8 = 7, + SpvImageChannelDataTypeSignedInt16 = 8, + SpvImageChannelDataTypeSignedInt32 = 9, + SpvImageChannelDataTypeUnsignedInt8 = 10, + SpvImageChannelDataTypeUnsignedInt16 = 11, + SpvImageChannelDataTypeUnsignedInt32 = 12, + SpvImageChannelDataTypeHalfFloat = 13, + SpvImageChannelDataTypeFloat = 14, + SpvImageChannelDataTypeUnormInt24 = 15, + SpvImageChannelDataTypeUnormInt101010_2 = 16, + SpvImageChannelDataTypeMax = 0x7fffffff, +} SpvImageChannelDataType; + +typedef enum SpvImageOperandsShift_ { + SpvImageOperandsBiasShift = 0, + SpvImageOperandsLodShift = 1, + SpvImageOperandsGradShift = 2, + SpvImageOperandsConstOffsetShift = 
3, + SpvImageOperandsOffsetShift = 4, + SpvImageOperandsConstOffsetsShift = 5, + SpvImageOperandsSampleShift = 6, + SpvImageOperandsMinLodShift = 7, + SpvImageOperandsMax = 0x7fffffff, +} SpvImageOperandsShift; + +typedef enum SpvImageOperandsMask_ { + SpvImageOperandsMaskNone = 0, + SpvImageOperandsBiasMask = 0x00000001, + SpvImageOperandsLodMask = 0x00000002, + SpvImageOperandsGradMask = 0x00000004, + SpvImageOperandsConstOffsetMask = 0x00000008, + SpvImageOperandsOffsetMask = 0x00000010, + SpvImageOperandsConstOffsetsMask = 0x00000020, + SpvImageOperandsSampleMask = 0x00000040, + SpvImageOperandsMinLodMask = 0x00000080, +} SpvImageOperandsMask; + +typedef enum SpvFPFastMathModeShift_ { + SpvFPFastMathModeNotNaNShift = 0, + SpvFPFastMathModeNotInfShift = 1, + SpvFPFastMathModeNSZShift = 2, + SpvFPFastMathModeAllowRecipShift = 3, + SpvFPFastMathModeFastShift = 4, + SpvFPFastMathModeMax = 0x7fffffff, +} SpvFPFastMathModeShift; + +typedef enum SpvFPFastMathModeMask_ { + SpvFPFastMathModeMaskNone = 0, + SpvFPFastMathModeNotNaNMask = 0x00000001, + SpvFPFastMathModeNotInfMask = 0x00000002, + SpvFPFastMathModeNSZMask = 0x00000004, + SpvFPFastMathModeAllowRecipMask = 0x00000008, + SpvFPFastMathModeFastMask = 0x00000010, +} SpvFPFastMathModeMask; + +typedef enum SpvFPRoundingMode_ { + SpvFPRoundingModeRTE = 0, + SpvFPRoundingModeRTZ = 1, + SpvFPRoundingModeRTP = 2, + SpvFPRoundingModeRTN = 3, + SpvFPRoundingModeMax = 0x7fffffff, +} SpvFPRoundingMode; + +typedef enum SpvLinkageType_ { + SpvLinkageTypeExport = 0, + SpvLinkageTypeImport = 1, + SpvLinkageTypeMax = 0x7fffffff, +} SpvLinkageType; + +typedef enum SpvAccessQualifier_ { + SpvAccessQualifierReadOnly = 0, + SpvAccessQualifierWriteOnly = 1, + SpvAccessQualifierReadWrite = 2, + SpvAccessQualifierMax = 0x7fffffff, +} SpvAccessQualifier; + +typedef enum SpvFunctionParameterAttribute_ { + SpvFunctionParameterAttributeZext = 0, + SpvFunctionParameterAttributeSext = 1, + SpvFunctionParameterAttributeByVal = 2, + 
SpvFunctionParameterAttributeSret = 3, + SpvFunctionParameterAttributeNoAlias = 4, + SpvFunctionParameterAttributeNoCapture = 5, + SpvFunctionParameterAttributeNoWrite = 6, + SpvFunctionParameterAttributeNoReadWrite = 7, + SpvFunctionParameterAttributeMax = 0x7fffffff, +} SpvFunctionParameterAttribute; + +typedef enum SpvDecoration_ { + SpvDecorationRelaxedPrecision = 0, + SpvDecorationSpecId = 1, + SpvDecorationBlock = 2, + SpvDecorationBufferBlock = 3, + SpvDecorationRowMajor = 4, + SpvDecorationColMajor = 5, + SpvDecorationArrayStride = 6, + SpvDecorationMatrixStride = 7, + SpvDecorationGLSLShared = 8, + SpvDecorationGLSLPacked = 9, + SpvDecorationCPacked = 10, + SpvDecorationBuiltIn = 11, + SpvDecorationNoPerspective = 13, + SpvDecorationFlat = 14, + SpvDecorationPatch = 15, + SpvDecorationCentroid = 16, + SpvDecorationSample = 17, + SpvDecorationInvariant = 18, + SpvDecorationRestrict = 19, + SpvDecorationAliased = 20, + SpvDecorationVolatile = 21, + SpvDecorationConstant = 22, + SpvDecorationCoherent = 23, + SpvDecorationNonWritable = 24, + SpvDecorationNonReadable = 25, + SpvDecorationUniform = 26, + SpvDecorationSaturatedConversion = 28, + SpvDecorationStream = 29, + SpvDecorationLocation = 30, + SpvDecorationComponent = 31, + SpvDecorationIndex = 32, + SpvDecorationBinding = 33, + SpvDecorationDescriptorSet = 34, + SpvDecorationOffset = 35, + SpvDecorationXfbBuffer = 36, + SpvDecorationXfbStride = 37, + SpvDecorationFuncParamAttr = 38, + SpvDecorationFPRoundingMode = 39, + SpvDecorationFPFastMathMode = 40, + SpvDecorationLinkageAttributes = 41, + SpvDecorationNoContraction = 42, + SpvDecorationInputAttachmentIndex = 43, + SpvDecorationAlignment = 44, + SpvDecorationMaxByteOffset = 45, + SpvDecorationAlignmentId = 46, + SpvDecorationMaxByteOffsetId = 47, + SpvDecorationExplicitInterpAMD = 4999, + SpvDecorationOverrideCoverageNV = 5248, + SpvDecorationPassthroughNV = 5250, + SpvDecorationViewportRelativeNV = 5252, + SpvDecorationSecondaryViewportRelativeNV = 
5256, + SpvDecorationNonUniformEXT = 5300, + SpvDecorationHlslCounterBufferGOOGLE = 5634, + SpvDecorationHlslSemanticGOOGLE = 5635, + SpvDecorationMax = 0x7fffffff, +} SpvDecoration; + +typedef enum SpvBuiltIn_ { + SpvBuiltInPosition = 0, + SpvBuiltInPointSize = 1, + SpvBuiltInClipDistance = 3, + SpvBuiltInCullDistance = 4, + SpvBuiltInVertexId = 5, + SpvBuiltInInstanceId = 6, + SpvBuiltInPrimitiveId = 7, + SpvBuiltInInvocationId = 8, + SpvBuiltInLayer = 9, + SpvBuiltInViewportIndex = 10, + SpvBuiltInTessLevelOuter = 11, + SpvBuiltInTessLevelInner = 12, + SpvBuiltInTessCoord = 13, + SpvBuiltInPatchVertices = 14, + SpvBuiltInFragCoord = 15, + SpvBuiltInPointCoord = 16, + SpvBuiltInFrontFacing = 17, + SpvBuiltInSampleId = 18, + SpvBuiltInSamplePosition = 19, + SpvBuiltInSampleMask = 20, + SpvBuiltInFragDepth = 22, + SpvBuiltInHelperInvocation = 23, + SpvBuiltInNumWorkgroups = 24, + SpvBuiltInWorkgroupSize = 25, + SpvBuiltInWorkgroupId = 26, + SpvBuiltInLocalInvocationId = 27, + SpvBuiltInGlobalInvocationId = 28, + SpvBuiltInLocalInvocationIndex = 29, + SpvBuiltInWorkDim = 30, + SpvBuiltInGlobalSize = 31, + SpvBuiltInEnqueuedWorkgroupSize = 32, + SpvBuiltInGlobalOffset = 33, + SpvBuiltInGlobalLinearId = 34, + SpvBuiltInSubgroupSize = 36, + SpvBuiltInSubgroupMaxSize = 37, + SpvBuiltInNumSubgroups = 38, + SpvBuiltInNumEnqueuedSubgroups = 39, + SpvBuiltInSubgroupId = 40, + SpvBuiltInSubgroupLocalInvocationId = 41, + SpvBuiltInVertexIndex = 42, + SpvBuiltInInstanceIndex = 43, + SpvBuiltInSubgroupEqMask = 4416, + SpvBuiltInSubgroupEqMaskKHR = 4416, + SpvBuiltInSubgroupGeMask = 4417, + SpvBuiltInSubgroupGeMaskKHR = 4417, + SpvBuiltInSubgroupGtMask = 4418, + SpvBuiltInSubgroupGtMaskKHR = 4418, + SpvBuiltInSubgroupLeMask = 4419, + SpvBuiltInSubgroupLeMaskKHR = 4419, + SpvBuiltInSubgroupLtMask = 4420, + SpvBuiltInSubgroupLtMaskKHR = 4420, + SpvBuiltInBaseVertex = 4424, + SpvBuiltInBaseInstance = 4425, + SpvBuiltInDrawIndex = 4426, + SpvBuiltInDeviceIndex = 4438, + 
SpvBuiltInViewIndex = 4440, + SpvBuiltInBaryCoordNoPerspAMD = 4992, + SpvBuiltInBaryCoordNoPerspCentroidAMD = 4993, + SpvBuiltInBaryCoordNoPerspSampleAMD = 4994, + SpvBuiltInBaryCoordSmoothAMD = 4995, + SpvBuiltInBaryCoordSmoothCentroidAMD = 4996, + SpvBuiltInBaryCoordSmoothSampleAMD = 4997, + SpvBuiltInBaryCoordPullModelAMD = 4998, + SpvBuiltInFragStencilRefEXT = 5014, + SpvBuiltInViewportMaskNV = 5253, + SpvBuiltInSecondaryPositionNV = 5257, + SpvBuiltInSecondaryViewportMaskNV = 5258, + SpvBuiltInPositionPerViewNV = 5261, + SpvBuiltInViewportMaskPerViewNV = 5262, + SpvBuiltInFullyCoveredEXT = 5264, + SpvBuiltInMax = 0x7fffffff, +} SpvBuiltIn; + +typedef enum SpvSelectionControlShift_ { + SpvSelectionControlFlattenShift = 0, + SpvSelectionControlDontFlattenShift = 1, + SpvSelectionControlMax = 0x7fffffff, +} SpvSelectionControlShift; + +typedef enum SpvSelectionControlMask_ { + SpvSelectionControlMaskNone = 0, + SpvSelectionControlFlattenMask = 0x00000001, + SpvSelectionControlDontFlattenMask = 0x00000002, +} SpvSelectionControlMask; + +typedef enum SpvLoopControlShift_ { + SpvLoopControlUnrollShift = 0, + SpvLoopControlDontUnrollShift = 1, + SpvLoopControlDependencyInfiniteShift = 2, + SpvLoopControlDependencyLengthShift = 3, + SpvLoopControlMax = 0x7fffffff, +} SpvLoopControlShift; + +typedef enum SpvLoopControlMask_ { + SpvLoopControlMaskNone = 0, + SpvLoopControlUnrollMask = 0x00000001, + SpvLoopControlDontUnrollMask = 0x00000002, + SpvLoopControlDependencyInfiniteMask = 0x00000004, + SpvLoopControlDependencyLengthMask = 0x00000008, +} SpvLoopControlMask; + +typedef enum SpvFunctionControlShift_ { + SpvFunctionControlInlineShift = 0, + SpvFunctionControlDontInlineShift = 1, + SpvFunctionControlPureShift = 2, + SpvFunctionControlConstShift = 3, + SpvFunctionControlMax = 0x7fffffff, +} SpvFunctionControlShift; + +typedef enum SpvFunctionControlMask_ { + SpvFunctionControlMaskNone = 0, + SpvFunctionControlInlineMask = 0x00000001, + 
SpvFunctionControlDontInlineMask = 0x00000002, + SpvFunctionControlPureMask = 0x00000004, + SpvFunctionControlConstMask = 0x00000008, +} SpvFunctionControlMask; + +typedef enum SpvMemorySemanticsShift_ { + SpvMemorySemanticsAcquireShift = 1, + SpvMemorySemanticsReleaseShift = 2, + SpvMemorySemanticsAcquireReleaseShift = 3, + SpvMemorySemanticsSequentiallyConsistentShift = 4, + SpvMemorySemanticsUniformMemoryShift = 6, + SpvMemorySemanticsSubgroupMemoryShift = 7, + SpvMemorySemanticsWorkgroupMemoryShift = 8, + SpvMemorySemanticsCrossWorkgroupMemoryShift = 9, + SpvMemorySemanticsAtomicCounterMemoryShift = 10, + SpvMemorySemanticsImageMemoryShift = 11, + SpvMemorySemanticsMax = 0x7fffffff, +} SpvMemorySemanticsShift; + +typedef enum SpvMemorySemanticsMask_ { + SpvMemorySemanticsMaskNone = 0, + SpvMemorySemanticsAcquireMask = 0x00000002, + SpvMemorySemanticsReleaseMask = 0x00000004, + SpvMemorySemanticsAcquireReleaseMask = 0x00000008, + SpvMemorySemanticsSequentiallyConsistentMask = 0x00000010, + SpvMemorySemanticsUniformMemoryMask = 0x00000040, + SpvMemorySemanticsSubgroupMemoryMask = 0x00000080, + SpvMemorySemanticsWorkgroupMemoryMask = 0x00000100, + SpvMemorySemanticsCrossWorkgroupMemoryMask = 0x00000200, + SpvMemorySemanticsAtomicCounterMemoryMask = 0x00000400, + SpvMemorySemanticsImageMemoryMask = 0x00000800, +} SpvMemorySemanticsMask; + +typedef enum SpvMemoryAccessShift_ { + SpvMemoryAccessVolatileShift = 0, + SpvMemoryAccessAlignedShift = 1, + SpvMemoryAccessNontemporalShift = 2, + SpvMemoryAccessMax = 0x7fffffff, +} SpvMemoryAccessShift; + +typedef enum SpvMemoryAccessMask_ { + SpvMemoryAccessMaskNone = 0, + SpvMemoryAccessVolatileMask = 0x00000001, + SpvMemoryAccessAlignedMask = 0x00000002, + SpvMemoryAccessNontemporalMask = 0x00000004, +} SpvMemoryAccessMask; + +typedef enum SpvScope_ { + SpvScopeCrossDevice = 0, + SpvScopeDevice = 1, + SpvScopeWorkgroup = 2, + SpvScopeSubgroup = 3, + SpvScopeInvocation = 4, + SpvScopeMax = 0x7fffffff, +} SpvScope; + 
+typedef enum SpvGroupOperation_ { + SpvGroupOperationReduce = 0, + SpvGroupOperationInclusiveScan = 1, + SpvGroupOperationExclusiveScan = 2, + SpvGroupOperationClusteredReduce = 3, + SpvGroupOperationPartitionedReduceNV = 6, + SpvGroupOperationPartitionedInclusiveScanNV = 7, + SpvGroupOperationPartitionedExclusiveScanNV = 8, + SpvGroupOperationMax = 0x7fffffff, +} SpvGroupOperation; + +typedef enum SpvKernelEnqueueFlags_ { + SpvKernelEnqueueFlagsNoWait = 0, + SpvKernelEnqueueFlagsWaitKernel = 1, + SpvKernelEnqueueFlagsWaitWorkGroup = 2, + SpvKernelEnqueueFlagsMax = 0x7fffffff, +} SpvKernelEnqueueFlags; + +typedef enum SpvKernelProfilingInfoShift_ { + SpvKernelProfilingInfoCmdExecTimeShift = 0, + SpvKernelProfilingInfoMax = 0x7fffffff, +} SpvKernelProfilingInfoShift; + +typedef enum SpvKernelProfilingInfoMask_ { + SpvKernelProfilingInfoMaskNone = 0, + SpvKernelProfilingInfoCmdExecTimeMask = 0x00000001, +} SpvKernelProfilingInfoMask; + +typedef enum SpvCapability_ { + SpvCapabilityMatrix = 0, + SpvCapabilityShader = 1, + SpvCapabilityGeometry = 2, + SpvCapabilityTessellation = 3, + SpvCapabilityAddresses = 4, + SpvCapabilityLinkage = 5, + SpvCapabilityKernel = 6, + SpvCapabilityVector16 = 7, + SpvCapabilityFloat16Buffer = 8, + SpvCapabilityFloat16 = 9, + SpvCapabilityFloat64 = 10, + SpvCapabilityInt64 = 11, + SpvCapabilityInt64Atomics = 12, + SpvCapabilityImageBasic = 13, + SpvCapabilityImageReadWrite = 14, + SpvCapabilityImageMipmap = 15, + SpvCapabilityPipes = 17, + SpvCapabilityGroups = 18, + SpvCapabilityDeviceEnqueue = 19, + SpvCapabilityLiteralSampler = 20, + SpvCapabilityAtomicStorage = 21, + SpvCapabilityInt16 = 22, + SpvCapabilityTessellationPointSize = 23, + SpvCapabilityGeometryPointSize = 24, + SpvCapabilityImageGatherExtended = 25, + SpvCapabilityStorageImageMultisample = 27, + SpvCapabilityUniformBufferArrayDynamicIndexing = 28, + SpvCapabilitySampledImageArrayDynamicIndexing = 29, + SpvCapabilityStorageBufferArrayDynamicIndexing = 30, + 
SpvCapabilityStorageImageArrayDynamicIndexing = 31, + SpvCapabilityClipDistance = 32, + SpvCapabilityCullDistance = 33, + SpvCapabilityImageCubeArray = 34, + SpvCapabilitySampleRateShading = 35, + SpvCapabilityImageRect = 36, + SpvCapabilitySampledRect = 37, + SpvCapabilityGenericPointer = 38, + SpvCapabilityInt8 = 39, + SpvCapabilityInputAttachment = 40, + SpvCapabilitySparseResidency = 41, + SpvCapabilityMinLod = 42, + SpvCapabilitySampled1D = 43, + SpvCapabilityImage1D = 44, + SpvCapabilitySampledCubeArray = 45, + SpvCapabilitySampledBuffer = 46, + SpvCapabilityImageBuffer = 47, + SpvCapabilityImageMSArray = 48, + SpvCapabilityStorageImageExtendedFormats = 49, + SpvCapabilityImageQuery = 50, + SpvCapabilityDerivativeControl = 51, + SpvCapabilityInterpolationFunction = 52, + SpvCapabilityTransformFeedback = 53, + SpvCapabilityGeometryStreams = 54, + SpvCapabilityStorageImageReadWithoutFormat = 55, + SpvCapabilityStorageImageWriteWithoutFormat = 56, + SpvCapabilityMultiViewport = 57, + SpvCapabilitySubgroupDispatch = 58, + SpvCapabilityNamedBarrier = 59, + SpvCapabilityPipeStorage = 60, + SpvCapabilityGroupNonUniform = 61, + SpvCapabilityGroupNonUniformVote = 62, + SpvCapabilityGroupNonUniformArithmetic = 63, + SpvCapabilityGroupNonUniformBallot = 64, + SpvCapabilityGroupNonUniformShuffle = 65, + SpvCapabilityGroupNonUniformShuffleRelative = 66, + SpvCapabilityGroupNonUniformClustered = 67, + SpvCapabilityGroupNonUniformQuad = 68, + SpvCapabilitySubgroupBallotKHR = 4423, + SpvCapabilityDrawParameters = 4427, + SpvCapabilitySubgroupVoteKHR = 4431, + SpvCapabilityStorageBuffer16BitAccess = 4433, + SpvCapabilityStorageUniformBufferBlock16 = 4433, + SpvCapabilityStorageUniform16 = 4434, + SpvCapabilityUniformAndStorageBuffer16BitAccess = 4434, + SpvCapabilityStoragePushConstant16 = 4435, + SpvCapabilityStorageInputOutput16 = 4436, + SpvCapabilityDeviceGroup = 4437, + SpvCapabilityMultiView = 4439, + SpvCapabilityVariablePointersStorageBuffer = 4441, + 
SpvCapabilityVariablePointers = 4442, + SpvCapabilityAtomicStorageOps = 4445, + SpvCapabilitySampleMaskPostDepthCoverage = 4447, + SpvCapabilityStorageBuffer8BitAccess = 4448, + SpvCapabilityUniformAndStorageBuffer8BitAccess = 4449, + SpvCapabilityStoragePushConstant8 = 4450, + SpvCapabilityFloat16ImageAMD = 5008, + SpvCapabilityImageGatherBiasLodAMD = 5009, + SpvCapabilityFragmentMaskAMD = 5010, + SpvCapabilityStencilExportEXT = 5013, + SpvCapabilityImageReadWriteLodAMD = 5015, + SpvCapabilitySampleMaskOverrideCoverageNV = 5249, + SpvCapabilityGeometryShaderPassthroughNV = 5251, + SpvCapabilityShaderViewportIndexLayerEXT = 5254, + SpvCapabilityShaderViewportIndexLayerNV = 5254, + SpvCapabilityShaderViewportMaskNV = 5255, + SpvCapabilityShaderStereoViewNV = 5259, + SpvCapabilityPerViewAttributesNV = 5260, + SpvCapabilityFragmentFullyCoveredEXT = 5265, + SpvCapabilityGroupNonUniformPartitionedNV = 5297, + SpvCapabilityShaderNonUniformEXT = 5301, + SpvCapabilityRuntimeDescriptorArrayEXT = 5302, + SpvCapabilityInputAttachmentArrayDynamicIndexingEXT = 5303, + SpvCapabilityUniformTexelBufferArrayDynamicIndexingEXT = 5304, + SpvCapabilityStorageTexelBufferArrayDynamicIndexingEXT = 5305, + SpvCapabilityUniformBufferArrayNonUniformIndexingEXT = 5306, + SpvCapabilitySampledImageArrayNonUniformIndexingEXT = 5307, + SpvCapabilityStorageBufferArrayNonUniformIndexingEXT = 5308, + SpvCapabilityStorageImageArrayNonUniformIndexingEXT = 5309, + SpvCapabilityInputAttachmentArrayNonUniformIndexingEXT = 5310, + SpvCapabilityUniformTexelBufferArrayNonUniformIndexingEXT = 5311, + SpvCapabilityStorageTexelBufferArrayNonUniformIndexingEXT = 5312, + SpvCapabilitySubgroupShuffleINTEL = 5568, + SpvCapabilitySubgroupBufferBlockIOINTEL = 5569, + SpvCapabilitySubgroupImageBlockIOINTEL = 5570, + SpvCapabilityMax = 0x7fffffff, +} SpvCapability; + +typedef enum SpvOp_ { + SpvOpNop = 0, + SpvOpUndef = 1, + SpvOpSourceContinued = 2, + SpvOpSource = 3, + SpvOpSourceExtension = 4, + SpvOpName = 5, + 
SpvOpMemberName = 6, + SpvOpString = 7, + SpvOpLine = 8, + SpvOpExtension = 10, + SpvOpExtInstImport = 11, + SpvOpExtInst = 12, + SpvOpMemoryModel = 14, + SpvOpEntryPoint = 15, + SpvOpExecutionMode = 16, + SpvOpCapability = 17, + SpvOpTypeVoid = 19, + SpvOpTypeBool = 20, + SpvOpTypeInt = 21, + SpvOpTypeFloat = 22, + SpvOpTypeVector = 23, + SpvOpTypeMatrix = 24, + SpvOpTypeImage = 25, + SpvOpTypeSampler = 26, + SpvOpTypeSampledImage = 27, + SpvOpTypeArray = 28, + SpvOpTypeRuntimeArray = 29, + SpvOpTypeStruct = 30, + SpvOpTypeOpaque = 31, + SpvOpTypePointer = 32, + SpvOpTypeFunction = 33, + SpvOpTypeEvent = 34, + SpvOpTypeDeviceEvent = 35, + SpvOpTypeReserveId = 36, + SpvOpTypeQueue = 37, + SpvOpTypePipe = 38, + SpvOpTypeForwardPointer = 39, + SpvOpConstantTrue = 41, + SpvOpConstantFalse = 42, + SpvOpConstant = 43, + SpvOpConstantComposite = 44, + SpvOpConstantSampler = 45, + SpvOpConstantNull = 46, + SpvOpSpecConstantTrue = 48, + SpvOpSpecConstantFalse = 49, + SpvOpSpecConstant = 50, + SpvOpSpecConstantComposite = 51, + SpvOpSpecConstantOp = 52, + SpvOpFunction = 54, + SpvOpFunctionParameter = 55, + SpvOpFunctionEnd = 56, + SpvOpFunctionCall = 57, + SpvOpVariable = 59, + SpvOpImageTexelPointer = 60, + SpvOpLoad = 61, + SpvOpStore = 62, + SpvOpCopyMemory = 63, + SpvOpCopyMemorySized = 64, + SpvOpAccessChain = 65, + SpvOpInBoundsAccessChain = 66, + SpvOpPtrAccessChain = 67, + SpvOpArrayLength = 68, + SpvOpGenericPtrMemSemantics = 69, + SpvOpInBoundsPtrAccessChain = 70, + SpvOpDecorate = 71, + SpvOpMemberDecorate = 72, + SpvOpDecorationGroup = 73, + SpvOpGroupDecorate = 74, + SpvOpGroupMemberDecorate = 75, + SpvOpVectorExtractDynamic = 77, + SpvOpVectorInsertDynamic = 78, + SpvOpVectorShuffle = 79, + SpvOpCompositeConstruct = 80, + SpvOpCompositeExtract = 81, + SpvOpCompositeInsert = 82, + SpvOpCopyObject = 83, + SpvOpTranspose = 84, + SpvOpSampledImage = 86, + SpvOpImageSampleImplicitLod = 87, + SpvOpImageSampleExplicitLod = 88, + SpvOpImageSampleDrefImplicitLod = 89, 
+ SpvOpImageSampleDrefExplicitLod = 90, + SpvOpImageSampleProjImplicitLod = 91, + SpvOpImageSampleProjExplicitLod = 92, + SpvOpImageSampleProjDrefImplicitLod = 93, + SpvOpImageSampleProjDrefExplicitLod = 94, + SpvOpImageFetch = 95, + SpvOpImageGather = 96, + SpvOpImageDrefGather = 97, + SpvOpImageRead = 98, + SpvOpImageWrite = 99, + SpvOpImage = 100, + SpvOpImageQueryFormat = 101, + SpvOpImageQueryOrder = 102, + SpvOpImageQuerySizeLod = 103, + SpvOpImageQuerySize = 104, + SpvOpImageQueryLod = 105, + SpvOpImageQueryLevels = 106, + SpvOpImageQuerySamples = 107, + SpvOpConvertFToU = 109, + SpvOpConvertFToS = 110, + SpvOpConvertSToF = 111, + SpvOpConvertUToF = 112, + SpvOpUConvert = 113, + SpvOpSConvert = 114, + SpvOpFConvert = 115, + SpvOpQuantizeToF16 = 116, + SpvOpConvertPtrToU = 117, + SpvOpSatConvertSToU = 118, + SpvOpSatConvertUToS = 119, + SpvOpConvertUToPtr = 120, + SpvOpPtrCastToGeneric = 121, + SpvOpGenericCastToPtr = 122, + SpvOpGenericCastToPtrExplicit = 123, + SpvOpBitcast = 124, + SpvOpSNegate = 126, + SpvOpFNegate = 127, + SpvOpIAdd = 128, + SpvOpFAdd = 129, + SpvOpISub = 130, + SpvOpFSub = 131, + SpvOpIMul = 132, + SpvOpFMul = 133, + SpvOpUDiv = 134, + SpvOpSDiv = 135, + SpvOpFDiv = 136, + SpvOpUMod = 137, + SpvOpSRem = 138, + SpvOpSMod = 139, + SpvOpFRem = 140, + SpvOpFMod = 141, + SpvOpVectorTimesScalar = 142, + SpvOpMatrixTimesScalar = 143, + SpvOpVectorTimesMatrix = 144, + SpvOpMatrixTimesVector = 145, + SpvOpMatrixTimesMatrix = 146, + SpvOpOuterProduct = 147, + SpvOpDot = 148, + SpvOpIAddCarry = 149, + SpvOpISubBorrow = 150, + SpvOpUMulExtended = 151, + SpvOpSMulExtended = 152, + SpvOpAny = 154, + SpvOpAll = 155, + SpvOpIsNan = 156, + SpvOpIsInf = 157, + SpvOpIsFinite = 158, + SpvOpIsNormal = 159, + SpvOpSignBitSet = 160, + SpvOpLessOrGreater = 161, + SpvOpOrdered = 162, + SpvOpUnordered = 163, + SpvOpLogicalEqual = 164, + SpvOpLogicalNotEqual = 165, + SpvOpLogicalOr = 166, + SpvOpLogicalAnd = 167, + SpvOpLogicalNot = 168, + SpvOpSelect = 169, + 
SpvOpIEqual = 170, + SpvOpINotEqual = 171, + SpvOpUGreaterThan = 172, + SpvOpSGreaterThan = 173, + SpvOpUGreaterThanEqual = 174, + SpvOpSGreaterThanEqual = 175, + SpvOpULessThan = 176, + SpvOpSLessThan = 177, + SpvOpULessThanEqual = 178, + SpvOpSLessThanEqual = 179, + SpvOpFOrdEqual = 180, + SpvOpFUnordEqual = 181, + SpvOpFOrdNotEqual = 182, + SpvOpFUnordNotEqual = 183, + SpvOpFOrdLessThan = 184, + SpvOpFUnordLessThan = 185, + SpvOpFOrdGreaterThan = 186, + SpvOpFUnordGreaterThan = 187, + SpvOpFOrdLessThanEqual = 188, + SpvOpFUnordLessThanEqual = 189, + SpvOpFOrdGreaterThanEqual = 190, + SpvOpFUnordGreaterThanEqual = 191, + SpvOpShiftRightLogical = 194, + SpvOpShiftRightArithmetic = 195, + SpvOpShiftLeftLogical = 196, + SpvOpBitwiseOr = 197, + SpvOpBitwiseXor = 198, + SpvOpBitwiseAnd = 199, + SpvOpNot = 200, + SpvOpBitFieldInsert = 201, + SpvOpBitFieldSExtract = 202, + SpvOpBitFieldUExtract = 203, + SpvOpBitReverse = 204, + SpvOpBitCount = 205, + SpvOpDPdx = 207, + SpvOpDPdy = 208, + SpvOpFwidth = 209, + SpvOpDPdxFine = 210, + SpvOpDPdyFine = 211, + SpvOpFwidthFine = 212, + SpvOpDPdxCoarse = 213, + SpvOpDPdyCoarse = 214, + SpvOpFwidthCoarse = 215, + SpvOpEmitVertex = 218, + SpvOpEndPrimitive = 219, + SpvOpEmitStreamVertex = 220, + SpvOpEndStreamPrimitive = 221, + SpvOpControlBarrier = 224, + SpvOpMemoryBarrier = 225, + SpvOpAtomicLoad = 227, + SpvOpAtomicStore = 228, + SpvOpAtomicExchange = 229, + SpvOpAtomicCompareExchange = 230, + SpvOpAtomicCompareExchangeWeak = 231, + SpvOpAtomicIIncrement = 232, + SpvOpAtomicIDecrement = 233, + SpvOpAtomicIAdd = 234, + SpvOpAtomicISub = 235, + SpvOpAtomicSMin = 236, + SpvOpAtomicUMin = 237, + SpvOpAtomicSMax = 238, + SpvOpAtomicUMax = 239, + SpvOpAtomicAnd = 240, + SpvOpAtomicOr = 241, + SpvOpAtomicXor = 242, + SpvOpPhi = 245, + SpvOpLoopMerge = 246, + SpvOpSelectionMerge = 247, + SpvOpLabel = 248, + SpvOpBranch = 249, + SpvOpBranchConditional = 250, + SpvOpSwitch = 251, + SpvOpKill = 252, + SpvOpReturn = 253, + 
SpvOpReturnValue = 254, + SpvOpUnreachable = 255, + SpvOpLifetimeStart = 256, + SpvOpLifetimeStop = 257, + SpvOpGroupAsyncCopy = 259, + SpvOpGroupWaitEvents = 260, + SpvOpGroupAll = 261, + SpvOpGroupAny = 262, + SpvOpGroupBroadcast = 263, + SpvOpGroupIAdd = 264, + SpvOpGroupFAdd = 265, + SpvOpGroupFMin = 266, + SpvOpGroupUMin = 267, + SpvOpGroupSMin = 268, + SpvOpGroupFMax = 269, + SpvOpGroupUMax = 270, + SpvOpGroupSMax = 271, + SpvOpReadPipe = 274, + SpvOpWritePipe = 275, + SpvOpReservedReadPipe = 276, + SpvOpReservedWritePipe = 277, + SpvOpReserveReadPipePackets = 278, + SpvOpReserveWritePipePackets = 279, + SpvOpCommitReadPipe = 280, + SpvOpCommitWritePipe = 281, + SpvOpIsValidReserveId = 282, + SpvOpGetNumPipePackets = 283, + SpvOpGetMaxPipePackets = 284, + SpvOpGroupReserveReadPipePackets = 285, + SpvOpGroupReserveWritePipePackets = 286, + SpvOpGroupCommitReadPipe = 287, + SpvOpGroupCommitWritePipe = 288, + SpvOpEnqueueMarker = 291, + SpvOpEnqueueKernel = 292, + SpvOpGetKernelNDrangeSubGroupCount = 293, + SpvOpGetKernelNDrangeMaxSubGroupSize = 294, + SpvOpGetKernelWorkGroupSize = 295, + SpvOpGetKernelPreferredWorkGroupSizeMultiple = 296, + SpvOpRetainEvent = 297, + SpvOpReleaseEvent = 298, + SpvOpCreateUserEvent = 299, + SpvOpIsValidEvent = 300, + SpvOpSetUserEventStatus = 301, + SpvOpCaptureEventProfilingInfo = 302, + SpvOpGetDefaultQueue = 303, + SpvOpBuildNDRange = 304, + SpvOpImageSparseSampleImplicitLod = 305, + SpvOpImageSparseSampleExplicitLod = 306, + SpvOpImageSparseSampleDrefImplicitLod = 307, + SpvOpImageSparseSampleDrefExplicitLod = 308, + SpvOpImageSparseSampleProjImplicitLod = 309, + SpvOpImageSparseSampleProjExplicitLod = 310, + SpvOpImageSparseSampleProjDrefImplicitLod = 311, + SpvOpImageSparseSampleProjDrefExplicitLod = 312, + SpvOpImageSparseFetch = 313, + SpvOpImageSparseGather = 314, + SpvOpImageSparseDrefGather = 315, + SpvOpImageSparseTexelsResident = 316, + SpvOpNoLine = 317, + SpvOpAtomicFlagTestAndSet = 318, + SpvOpAtomicFlagClear = 
319, + SpvOpImageSparseRead = 320, + SpvOpSizeOf = 321, + SpvOpTypePipeStorage = 322, + SpvOpConstantPipeStorage = 323, + SpvOpCreatePipeFromPipeStorage = 324, + SpvOpGetKernelLocalSizeForSubgroupCount = 325, + SpvOpGetKernelMaxNumSubgroups = 326, + SpvOpTypeNamedBarrier = 327, + SpvOpNamedBarrierInitialize = 328, + SpvOpMemoryNamedBarrier = 329, + SpvOpModuleProcessed = 330, + SpvOpExecutionModeId = 331, + SpvOpDecorateId = 332, + SpvOpGroupNonUniformElect = 333, + SpvOpGroupNonUniformAll = 334, + SpvOpGroupNonUniformAny = 335, + SpvOpGroupNonUniformAllEqual = 336, + SpvOpGroupNonUniformBroadcast = 337, + SpvOpGroupNonUniformBroadcastFirst = 338, + SpvOpGroupNonUniformBallot = 339, + SpvOpGroupNonUniformInverseBallot = 340, + SpvOpGroupNonUniformBallotBitExtract = 341, + SpvOpGroupNonUniformBallotBitCount = 342, + SpvOpGroupNonUniformBallotFindLSB = 343, + SpvOpGroupNonUniformBallotFindMSB = 344, + SpvOpGroupNonUniformShuffle = 345, + SpvOpGroupNonUniformShuffleXor = 346, + SpvOpGroupNonUniformShuffleUp = 347, + SpvOpGroupNonUniformShuffleDown = 348, + SpvOpGroupNonUniformIAdd = 349, + SpvOpGroupNonUniformFAdd = 350, + SpvOpGroupNonUniformIMul = 351, + SpvOpGroupNonUniformFMul = 352, + SpvOpGroupNonUniformSMin = 353, + SpvOpGroupNonUniformUMin = 354, + SpvOpGroupNonUniformFMin = 355, + SpvOpGroupNonUniformSMax = 356, + SpvOpGroupNonUniformUMax = 357, + SpvOpGroupNonUniformFMax = 358, + SpvOpGroupNonUniformBitwiseAnd = 359, + SpvOpGroupNonUniformBitwiseOr = 360, + SpvOpGroupNonUniformBitwiseXor = 361, + SpvOpGroupNonUniformLogicalAnd = 362, + SpvOpGroupNonUniformLogicalOr = 363, + SpvOpGroupNonUniformLogicalXor = 364, + SpvOpGroupNonUniformQuadBroadcast = 365, + SpvOpGroupNonUniformQuadSwap = 366, + SpvOpSubgroupBallotKHR = 4421, + SpvOpSubgroupFirstInvocationKHR = 4422, + SpvOpSubgroupAllKHR = 4428, + SpvOpSubgroupAnyKHR = 4429, + SpvOpSubgroupAllEqualKHR = 4430, + SpvOpSubgroupReadInvocationKHR = 4432, + SpvOpGroupIAddNonUniformAMD = 5000, + 
SpvOpGroupFAddNonUniformAMD = 5001, + SpvOpGroupFMinNonUniformAMD = 5002, + SpvOpGroupUMinNonUniformAMD = 5003, + SpvOpGroupSMinNonUniformAMD = 5004, + SpvOpGroupFMaxNonUniformAMD = 5005, + SpvOpGroupUMaxNonUniformAMD = 5006, + SpvOpGroupSMaxNonUniformAMD = 5007, + SpvOpFragmentMaskFetchAMD = 5011, + SpvOpFragmentFetchAMD = 5012, + SpvOpGroupNonUniformPartitionNV = 5296, + SpvOpSubgroupShuffleINTEL = 5571, + SpvOpSubgroupShuffleDownINTEL = 5572, + SpvOpSubgroupShuffleUpINTEL = 5573, + SpvOpSubgroupShuffleXorINTEL = 5574, + SpvOpSubgroupBlockReadINTEL = 5575, + SpvOpSubgroupBlockWriteINTEL = 5576, + SpvOpSubgroupImageBlockReadINTEL = 5577, + SpvOpSubgroupImageBlockWriteINTEL = 5578, + SpvOpDecorateStringGOOGLE = 5632, + SpvOpMemberDecorateStringGOOGLE = 5633, + SpvOpMax = 0x7fffffff, +} SpvOp; + +#endif // #ifndef spirv_H + diff --git a/premake5.lua b/premake5.lua index 92668f4d2..1905bd2f4 100644 --- a/premake5.lua +++ b/premake5.lua @@ -619,6 +619,8 @@ standardProject "slang" -- defines { "SLANG_DYNAMIC_EXPORT" } + includedirs { "external/spirv-headers/include" } + -- The `standardProject` operation already added all the code in -- `source/slang/*`, but we also want to incldue the umbrella -- `slang.h` header in this prject, so we do that manually here. 
diff --git a/source/slang/slang-compiler.cpp b/source/slang/slang-compiler.cpp index ba2ad1dd8..2ab376181 100644 --- a/source/slang/slang-compiler.cpp +++ b/source/slang/slang-compiler.cpp @@ -660,7 +660,7 @@ namespace Slang } else { - return emitEntryPoint( + return emitEntryPointSource( compileRequest, entryPointIndex, CodeGenTarget::HLSL, @@ -692,7 +692,7 @@ namespace Slang } else { - return emitEntryPoint(compileRequest, entryPointIndex, CodeGenTarget::CPPSource, targetReq); + return emitEntryPointSource(compileRequest, entryPointIndex, CodeGenTarget::CPPSource, targetReq); } } @@ -732,7 +732,7 @@ namespace Slang } else { - return emitEntryPoint( + return emitEntryPointSource( compileRequest, entryPointIndex, CodeGenTarget::GLSL, @@ -1624,7 +1624,13 @@ SlangResult dissassembleDXILUsingDXC( return SLANG_OK; } - SlangResult emitSPIRVForEntryPoint( + SlangResult emitSPIRVForEntryPointDirectly( + BackEndCompileRequest* compileRequest, + Int entryPointIndex, + TargetRequest* targetReq, + List<uint8_t>& spirvOut); + + SlangResult emitSPIRVForEntryPointViaGLSL( BackEndCompileRequest* slangRequest, EntryPoint* entryPoint, Int entryPointIndex, @@ -1663,6 +1669,34 @@ SlangResult dissassembleDXILUsingDXC( return SLANG_OK; } + SlangResult emitSPIRVForEntryPoint( + BackEndCompileRequest* slangRequest, + EntryPoint* entryPoint, + Int entryPointIndex, + TargetRequest* targetReq, + EndToEndCompileRequest* endToEndReq, + List<uint8_t>& spirvOut) + { + if( slangRequest->shouldEmitSPIRVDirectly ) + { + return emitSPIRVForEntryPointDirectly( + slangRequest, + entryPointIndex, + targetReq, + spirvOut); + } + else + { + return emitSPIRVForEntryPointViaGLSL( + slangRequest, + entryPoint, + entryPointIndex, + targetReq, + endToEndReq, + spirvOut); + } + } + SlangResult emitSPIRVAssemblyForEntryPoint( BackEndCompileRequest* slangRequest, EntryPoint* entryPoint, @@ -1755,7 +1789,7 @@ SlangResult dissassembleDXILUsingDXC( case CodeGenTarget::CPPSource: case CodeGenTarget::CSource: { - 
return emitEntryPoint( + return emitEntryPointSource( compileRequest, entryPointIndex, target, diff --git a/source/slang/slang-compiler.h b/source/slang/slang-compiler.h index e3fbf57f6..69513ada6 100644 --- a/source/slang/slang-compiler.h +++ b/source/slang/slang-compiler.h @@ -1697,6 +1697,9 @@ namespace Slang // bool useUnknownImageFormatAsDefault = false; + /// Should SPIR-V be generated directly from Slang IR rather than via translation to GLSL? + bool shouldEmitSPIRVDirectly = false; + private: RefPtr<ComponentType> m_program; }; diff --git a/source/slang/slang-emit-spirv.cpp b/source/slang/slang-emit-spirv.cpp new file mode 100644 index 000000000..c6f2f7468 --- /dev/null +++ b/source/slang/slang-emit-spirv.cpp @@ -0,0 +1,1141 @@ +// slang-emit-spirv.cpp +#include "slang-emit.h" + +#include "slang-compiler.h" +#include "slang-ir.h" +#include "slang-ir-insts.h" + +#include "spirv/unified1/spirv.h" + +namespace Slang +{ + +// Our goal in this file is to convert a module in the Slang IR over to an +// equivalent module in the SPIR-V intermediate language. +// +// The Slang IR is (intentionally) similar to SPIR-V in many ways, and both +// can represent shaders at similar levels of abstraction, so much of the +// translation involves one-to-one translation of Slang IR instructions +// to their SPIR-V equivalents. +// +// SPIR-V differs from Slang IR in some key ways, and the SPIR-V +// specification places many restrictions on how the IR can be encoded. +// In some cases we will rely on earlier IR passes to convert Slang IR +// into a form closer to what SPIR-V expects (e.g., by moving all +// varying entry point parameters to global scope), but other differences +// will be handled during the translation process. +// +// The logic in this file relies on the formal [SPIR-V Specification]. +// When we are making use of or enforcing some property from the spec, +// we will try to refer to the relevant section in comments. 
+// +// [SPIR-V Specification]: https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html + +// [2.3: Physical Layout of a SPIR-V Module and Instruction] +// +// > A SPIR-V module is a single linear stream of words. +// +// [2.2: Terms] +// +// > Word: 32 bits. +// +// Despite the importance to SPIR-V, the `spirv.h` header doesn't +// define a type for words, so we'll do it here. + + /// A SPIR-V word. +typedef uint32_t SpvWord; + +// [2.3: Physical Layout of a SPIR-V Module and Instruction] +// +// > All remaining words are a linear sequence of instructions. +// > Each instruction is a stream of words +// +// After a fixed-size header, the contents of a SPIR-V module +// is just a flat sequence of instructions, each of which is +// just a sequence of words. +// +// In principle we could try to emit instructions directly +// in one pass as a stream of words, but there are additional +// constraints placed by the SPIR-V encoding that would make +// a single-pass strategy very hard, so we don't attempt it. +// +// [2.4 Logical Layout of a Module] +// +// SPIR-V imposes some global ordering constraints on instructions, +// such that certain instructions must come before or after others. +// For example, all `OpCapability` instructions must come before any +// `OpEntryPoint` instructions. +// +// While the SPIR-V spec doesn't use such a term, we will take +// the enumeration of the ordering in Section 2.4 and use it to +// define a list of *logical sections* that make up a SPIR-V module. + + /// Logical sections of a SPIR-V module. 
+enum class SpvLogicalSectionID +{ + Capabilities, + Extensions, + ExtIntInstImports, + MemoryModel, + EntryPoints, + ExecutionModes, + DebugStringsAndSource, + DebugNames, + Annotations, + Types, + Constants, + GlobalVariables, + FunctionDeclarations, + FunctionDefinitions, + + Count, +}; + +// While the SPIR-V module is nominally (according to the spec) just +// a flat sequence of instructions, in practice some of the instructions +// are logically in a parent/child relationship. +// +// In particular, functions "own" the instructions between an `OpFunction` +// and the matching `OpFunctionEnd`. We can also think of basic +// blocks within a function as owning the instructions between +// an `OpLabel` (which represents the block) and the next label +// or the end of the function. +// +// Furthermore, the common case in SPIR-V is that an instruction +// that defines some value must appear before any instruction +// that uses that value as an operand. This property is often true +// in a Slang IR module, but isn't strictly enforced for things at +// the global scope. +// +// To deal with the above issues, our strategy will be to emit +// SPIR-V instructions into a lightweight intermediate structure +// that simplifies dealing with ordering constraints on +// instructions. +// +// We will start by forward-declaring the type we will +// use to represent instructions: +// +struct SpvInst; + +// Next, we will define a base type that can serve as a parent +// to SPIR-V instructions. Both the logical sections defined +// earlier and instructions such as functions will be used +// as parents. + + /// Base type for SPIR-V instructions and logical sections of a module + /// + /// Holds and supports appending to a list of child instructions. 
+struct SpvInstParent +{ +public: + /// Add an instruction to the end of the list of children + void addInst(SpvInst* inst); + + /// Dump all children, recursively, to a flattened list of SPIR-V words + void dumpTo(List<SpvWord>& ioWords); + +private: + /// The first child, if any. + SpvInst* m_firstChild = nullptr; + + /// A pointer to the null pointer at the end of the linked list. + /// + /// If the list of children is empty this points to `m_firstChild`, + /// while if it is non-empty it points to the `nextSibling` field + /// of the last instruction. + /// + SpvInst** m_link = &m_firstChild; +}; + +// A SPIR-V instruction is then (in the general case) a potential +// parent to other instructions. + + /// A type to represent a SPIR-V instruction to be emitted. + /// + /// This type allows the instruction to be built up across + /// multiple steps in a mutable fashion. + /// +struct SpvInst : SpvInstParent +{ + // [2.3: Physical Layout of a SPIR-V Module and Instruction] + // + // > Each instruction is a stream of words + // + // > Opcode: The 16 high-order bits are the WordCount of the instruction. + // > The 16 low-order bits are the opcode enumerant. + // + // We will store the "opcode enumerant" directly in our + // intermediate structure, and compute the word count on + // the fly when writing an instruction to an output buffer. + + /// The SPIR-V opcode for the instruction + SpvOp opcode; + + // [2.3: Physical Layout of a SPIR-V Module and Instruction] + // + // > Optional instruction type <id> (presence determined by opcode) + // > Optional instruction Result <id> (presence determined by opcode) + // > Operand 1 (if needed) + // > Operand 2 (if needed) + // > ... + // + // We represent the remaining words of the instruction (after + // the opcode word) as an undifferentiated array. Any code + // that encodes an instruction is responsible for knowing the + // opcode-specific data that is required. 
+ // + // Our code does not need to process instruction operands after + // they have been written into a `SpvInst`. If we ever had + // cases where we needed to do post-processing, then we would + // need to store a more refined representation here. + + /// The additional words of the instruction after the opcode + List<SpvWord> operandWords; + + // We will store the instructions in a given `SpvInstParent` + // using an intrusive linked list. + + /// The next instruction in the same `SpvInstParent` + SpvInst* nextSibling = nullptr; + + /// The result <id> produced by this instruction, or zero if it has no result. + SpvWord id = 0; + + /// Dump the instruction (and any children, recursively) into the flat array of SPIR-V words. + void dumpTo(List<SpvWord>& ioWords) + { + // [2.2: Terms] + // + // > Word Count: The complete number of words taken by an instruction, + // > including the word holding the word count and opcode, and any optional + // > operands. An instruction’s word count is the total space taken by the instruction. + // + SpvWord wordCount = 1 + SpvWord(operandWords.getCount()); + + // [2.3: Physical Layout of a SPIR-V Module and Instruction] + // + // > Opcode: The 16 high-order bits are the WordCount of the instruction. + // > The 16 low-order bits are the opcode enumerant. + // + ioWords.add(wordCount << 16 | opcode); + + // The operand words simply follow the opcode word. 
+ // + for( auto word : operandWords ) + { + ioWords.add(word); + } + + // In our representation choice, the children of a + // parent instruction will always follow the encoded + // words of a parent: + // + // * The instructions inside a function always follow the `OpFunction` + // * The instructions inside a block always follow the `OpLabel` + // + SpvInstParent::dumpTo(ioWords); + } +}; + + /// A logical section of a SPIR-V module +struct SpvLogicalSection : SpvInstParent +{ +}; + +// Now that we've filled in the definition of `SpvInst`, we can +// go back and define the key operations on `SpvInstParent`. + +void SpvInstParent::addInst(SpvInst* inst) +{ + SLANG_ASSERT(inst); + + // The user shouldn't be trying to add multiple instructions at once. + // If they really want that then they probably wanted to give `inst` + // some children. + // + SLANG_ASSERT(!inst->nextSibling); + + *m_link = inst; + m_link = &inst->nextSibling; +} + +void SpvInstParent::dumpTo(List<SpvWord>& ioWords) +{ + for( auto child = m_firstChild; child; child = child->nextSibling ) + { + child->dumpTo(ioWords); + } +} + +// Now that we've defined the intermediate data structures we will +// use to represent SPIR-V code during emission, we will move on +// to defining the main context type that will drive SPIR-V +// code generation. + + /// Context used for translating a Slang IR module to SPIR-V +struct SPIRVEmitContext +{ + /// The Slang IR module being translated + IRModule* m_irModule; + + // [2.2: Terms] + // + // > <id>: A numerical name; the name used to refer to an object, a type, + // > a function, a label, etc. An <id> always consumes one word. + // > The <id>s defined by a module obey SSA. + // + // [2.3: Physical Layout of a SPIR-V Module and Instruction] + // + // > Bound; where all <id>s in this module are guaranteed to satisfy + // > 0 < id < Bound + // > Bound should be small, smaller is better, with all <id> in a module being densely packed and near 0. 
+ // + // Instructions will be referred to by their <id>s. + // We need to generate <id>s for instructions, and also + // compute the "bound" value that will be stored in + // the module header. + // + // We will use a single counter and allocate <id>s + // on demand. There may be some slop where we allocate + // an <id> for something that never gets referenced, + // but we expect the amount of slop to be small (and + // it can be cleaned up by other tools/passes). + + /// The next destination `<id>` to allocate. + SpvWord m_nextID = 1; + + // We will store the logical sections of the SPIR-V module + // in a single array so that we can easily look up a + // section by its `SpvLogicalSectionID`. + + /// The logical sections of the SPIR-V module + SpvLogicalSection m_sections[int(SpvLogicalSectionID::Count)]; + + /// Get a logical section based on its `SpvLogicalSectionID` + SpvLogicalSection* getSection(SpvLogicalSectionID id) + { + return &m_sections[int(id)]; + } + + // At the end of emission we need a single linear stream of words, + // so we will eventually flatten `m_sections` into a single array. + + /// The final array of SPIR-V words that defines the encoded module + List<SpvWord> m_words; + + /// Emit the concrete words that make up the binary SPIR-V module. + /// + /// This function fills in `m_words` based on the data in `m_sections`. + /// This function should only be called once. + /// + void emitPhysicalLayout() + { + // [2.3: Physical Layout of a SPIR-V Module and Instruction] + // + // > Magic Number + // + m_words.add(SpvMagicNumber); + + // > Version number + // + m_words.add(SpvVersion); + + // > Generator's magic number. + // > Its value does not affect any semantics, and is allowed to be 0. + // + // TODO: We should eventually register a non-zero + // magic number to represent Slang/slangc. 
+ // + m_words.add(0); + + // > Bound + // + // As described above, we use `m_nextID` to allocate + // <id>s, so its value when we are done emitting code + // can serve as the bound. + // + m_words.add(m_nextID); + + // > 0 (Reserved for instruction schema, if needed.) + // + m_words.add(0); + + // > First word of instruction stream + // > All remaining words are a linear sequence of instructions. + // + // Once we are done emitting the header, we emit all + // the instructions in our logical sections. + // + for( int ii = 0; ii < int(SpvLogicalSectionID::Count); ++ii ) + { + m_sections[ii].dumpTo(m_words); + } + } + + // We will often need to refer to an instruction by its + // <id>, given only the Slang IR instruction that represents + // it (e.g., when it is used as an operand of another + // instruction). + // + // To that end we will keep a map of instructions that + // have been emitted, where a Slang IR instruction maps + // to the corresponding SPIR-V instruction. + + /// Map a Slang IR instruction to the corresponding SPIR-V instruction + Dictionary<IRInst*, SpvInst*> m_mapIRInstToSpvInst; + + /// Register that `irInst` maps to `spvInst` + void registerInst(IRInst* irInst, SpvInst* spvInst) + { + m_mapIRInstToSpvInst.Add(irInst, spvInst); + } + + // When we are emitting an instruction that can produce + // a result, we will allocate an <id> to it so that other + // instructions can refer to it. + // + // We will allocate <id>s on demand as they are needed. + + /// Get the <id> for `inst`, or assign one if it doesn't have one yet + SpvWord getID(SpvInst* inst) + { + auto id = inst->id; + if( !id ) + { + id = m_nextID++; + inst->id = id; + } + return id; + } + + // We will build up `SpvInst`s in a stateful fashion, + // mostly for convenience. 
We could in theory compute + // the number of words each instruction needs, then allocate + // the words, then fill them in, but that would make the + // emit logic more complicated and we'd like to keep it simple + // until we are sure performance is an issue. + // + // Emitting an instruction starts with picking the opcode + // and allocating the `SpvInst`. + + /// Begin emitting an instruction with the given SPIR-V `opcode`. + /// + /// If `irInst` is non-null, then the resulting SPIR-V instruction + /// will be registered as corresponding to `irInst`. + /// + SpvInst* beginInst(SpvOp opcode, IRInst* irInst = nullptr) + { + // TODO: We are currently just leaking the `SpvInst`s we allocate. + // We should set up a pool allocator that this pass can use for + // both the `SpvInst`s and for their constituent words. + // + auto spvInst = new SpvInst(); + spvInst->opcode = opcode; + + if(irInst) + { + registerInst(irInst, spvInst); + } + + return spvInst; + } + + // Once an instruction has been created, we append the operand + // words to it with `emitOperand`. There are a few different + // cases of operands that we handle. + // + // The simplest case is when an instruction takes an operand + // that is just a literal SPIR-V word. + + /// Emit a literal `word` as an operand to `dst`. + void emitOperand(SpvInst* dst, SpvWord word) + { + dst->operandWords.add(word); + } + + // The most common case of operand is an <id> that represents + // some other instruction. In cases where we already have + // an <id> we can emit it as a literal and the meaning is + // the same. If we have a `SpvInst` we can look up or + // generate an <id> for it. 
+ + /// Emit an operand to the `dst` instruction, which references `src` by its <id> + void emitOperand(SpvInst* dst, SpvInst* src) + { + emitOperand(dst, getID(src)); + } + + // Commonly, we will have an operand in the form of an `IRInst` + // which might either represent an instruction we've already + // emitted (e.g., because it came earlier in a function body) + // or which we have yet to emit (because it is a global-scope + // instruction that has not been referenced before). + + /// Emit an operand to the `dst` instruction, which references `src` by its <id> + void emitOperand(SpvInst* dst, IRInst* src) + { + // We first ensure that the `src` instruction has been emitted, + // and then handle it as for any other <id> operand. + // + SpvInst* spvSrc = ensureInst(src); + emitOperand(dst, getID(spvSrc)); + } + + /// Ensure that an instruction has been emitted + SpvInst* ensureInst(IRInst* irInst) + { + SpvInst* spvInst = nullptr; + if( !m_mapIRInstToSpvInst.TryGetValue(irInst, spvInst) ) + { + // If the `irInst` hasn't already been emitted, + // then we will assume that it is a global instruction + // (a constant, type, function, etc.) and we should make + // sure it gets emitted now. + // + // Note: this step means that emitting an instruction + // can be re-entrant/recursive. Because we emit the SPIR-V + // words for an instruction into an intermediate structure + // we don't have to worry about the re-entrancy causing + // the ordering of instruction words to be interleaved. + // + spvInst = emitGlobalInst(irInst); + } + return spvInst; + } + + // Some instructions take a string as a literal operand, + // which requires us to follow the SPIR-V rules to + // encode the string into multiple operand words. 
+ + /// Emit an operand that is encoded as a literal string + void emitOperand(SpvInst* dst, UnownedStringSlice const& text) + { + // [Section 2.2.1 : Instructions] + // + // > Literal String: A nul-terminated stream of characters consuming + // > an integral number of words. The character set is Unicode in the + // > UTF-8 encoding scheme. The UTF-8 octets (8-bit bytes) are packed + // > four per word, following the little-endian convention (i.e., the + // > first octet is in the lowest-order 8 bits of the word). + // + // We start by emitting the contents of `text` in + // 4-byte chunks. + // + char const* cursor = text.begin(); + char const* end = text.end(); + while( (end - cursor) >= 4 ) + { + SpvWord word; + memcpy(&word, cursor, 4); + emitOperand(dst, word); + cursor += 4; + } + // + // > The final word contains the string’s nul-termination character (0), and + // > all contents past the end of the string in the final word are padded with 0. + // + // For the last word, the low-order bytes will + // come from the remainder of the string (if + // there is anything left), and the rest will + // be left as zeros. + // + // TODO: This code should probably assert that `text` + // doesn't contain any embedded nul bytes, since they + // could lead to invalid encoded results. + // + SLANG_ASSERT((end - cursor) <= 3); + SpvWord lastWord = 0; + memcpy(&lastWord, cursor, (end - cursor)); + emitOperand(dst, lastWord); + } + + // Sometimes we will want to pass down an argument that + // represents a result <id> operand, but we won't yet + // have access to the `SpvInst` that will get the <id>. + // We will use a dummy `enum` type to support this case. + + enum ResultIDToken { kResultID }; + + void emitOperand(SpvInst* dst, ResultIDToken) + { + // A result <id> operand uses the <id> of the instruction itself. 
+ emitOperand(dst, getID(dst)); + } + + // As another convenience, there are often cases where + // we will want to emit all of the operands of some + // IR instruction as <id> operands of a SPIR-V + // instruction. This is handy in cases where the + // Slang IR and SPIR-V instructions agree on the + // number, order, and meaning of their operands. + + /// Helper type for emitting all the operands of some IR instruction + struct OperandsOf + { + OperandsOf(IRInst* irInst) + : irInst(irInst) + {} + + IRInst* irInst = nullptr; + }; + + /// Emit operand words for all the operands of a given IR instruction + void emitOperand(SpvInst* dst, OperandsOf const& other) + { + auto irInst = other.irInst; + auto operandCount = irInst->getOperandCount(); + for( UInt ii = 0; ii < operandCount; ++ii ) + { + emitOperand(dst, irInst->getOperand(ii)); + } + } + + // With the above routines, code can easily construct a SPIR-V + // instruction with arbitrary operands over multiple lines of code. + // + // In many cases, however, it is desirable to be able to emit + // an instruction more compactly, and for that we will introduce + // a number of `emitInst()` helpers that handle creating an + // instruction, filling in its operands, and adding it to a parent. + // + // These routines are overloaded on the number of operands, and + // also templates to work with any of the types for which + // `emitOperand()` works. + // + // In all of these cases, the caller takes responsibility for + // correctly matching the SPIR-V encoding rules for the chosen + // opcode, including whether a type <id> or result <id> is + // required. 
+ + SpvInst* emitInst(SpvInstParent* parent, IRInst* irInst, SpvOp opcode) + { + auto spvInst = beginInst(opcode, irInst); + parent->addInst(spvInst); + return spvInst; + } + + template<typename A> + SpvInst* emitInst(SpvInstParent* parent, IRInst* irInst, SpvOp opcode, A const& a) + { + auto spvInst = beginInst(opcode, irInst); + emitOperand(spvInst, a); + parent->addInst(spvInst); + return spvInst; + } + + template<typename A, typename B> + SpvInst* emitInst(SpvInstParent* parent, IRInst* irInst, SpvOp opcode, A const& a, B const& b) + { + auto spvInst = beginInst(opcode, irInst); + emitOperand(spvInst, a); + emitOperand(spvInst, b); + parent->addInst(spvInst); + return spvInst; + } + + template<typename A, typename B, typename C> + SpvInst* emitInst(SpvInstParent* parent, IRInst* irInst, SpvOp opcode, A const& a, B const& b, C const& c) + { + auto spvInst = beginInst(opcode, irInst); + emitOperand(spvInst, a); + emitOperand(spvInst, b); + emitOperand(spvInst, c); + parent->addInst(spvInst); + return spvInst; + } + + template<typename A, typename B, typename C, typename D> + SpvInst* emitInst(SpvInstParent* parent, IRInst* irInst, SpvOp opcode, A const& a, B const& b, C const& c, D const& d) + { + auto spvInst = beginInst(opcode, irInst); + emitOperand(spvInst, a); + emitOperand(spvInst, b); + emitOperand(spvInst, c); + emitOperand(spvInst, d); + parent->addInst(spvInst); + return spvInst; + } + + template<typename A, typename B, typename C, typename D, typename E> + SpvInst* emitInst(SpvInstParent* parent, IRInst* irInst, SpvOp opcode, A const& a, B const& b, C const& c, D const& d, E const& e) + { + auto spvInst = beginInst(opcode, irInst); + emitOperand(spvInst, a); + emitOperand(spvInst, b); + emitOperand(spvInst, c); + emitOperand(spvInst, d); + emitOperand(spvInst, e); + parent->addInst(spvInst); + return spvInst; + } + + // Now that we've gotten the core infrastructure out of the way, + // let's start looking at emitting some instructions that make + // up 
a SPIR-V module. + // + // We will start with certain instructions that are required + // to appear in a well-formed SPIR-V module for Vulkan, but + // which do not directly relate to any instruction in the + // Slang IR. + + /// Emit the mandatory "front-matter" instructions that + /// the SPIR-V module must include to make it usable. + void emitFrontMatter() + { + // TODO: We should ideally add SPIR-V capabilities to + // the module as we emit instructions that require them. + // For now we will always emit the `Shader` capability, + // since every Vulkan shader module will use it. + // + emitInst(getSection(SpvLogicalSectionID::Capabilities), nullptr, SpvOpCapability, SpvCapabilityShader); + + // [2.4: Logical Layout of a Module] + // + // > The single required OpMemoryModel instruction. + // + // A memory model is always required in a SPIR-V module. + // + // The Vulkan spec further says: + // + // > The `Logical` addressing model must be selected + // + // It isn't clear if the GLSL450 memory model is also + // a requirement, but it is what glslang produces, + // so we will use it for now. + // + emitInst(getSection(SpvLogicalSectionID::MemoryModel), nullptr, SpvOpMemoryModel, SpvAddressingModelLogical, SpvMemoryModelGLSL450); + } + + // Next, let's look at emitting some of the instructions + // that can occur at global scope. + + /// Emit an instruction that is expected to appear at the global scope of the SPIR-V module. + /// + /// Returns the corresponding SPIR-V instruction. 
+ /// + SpvInst* emitGlobalInst(IRInst* inst) + { + switch( inst->op ) + { + // [3.32.6: Type-Declaration Instructions] + // + +#define CASE(IROP, SPVOP) \ + case IROP: return emitInst(getSection(SpvLogicalSectionID::Types), inst, SPVOP, kResultID) + + // > OpTypeVoid + CASE(kIROp_VoidType, SpvOpTypeVoid); + + // > OpTypeBool + CASE(kIROp_BoolType, SpvOpTypeBool); + +#undef CASE + + // > OpTypeInt + +#define CASE(IROP, BITS, SIGNED) \ + case IROP: return emitInst(getSection(SpvLogicalSectionID::Types), inst, SpvOpTypeInt, kResultID, BITS, SIGNED) + + CASE(kIROp_IntType, 32, 1); + CASE(kIROp_UIntType, 32, 0); + CASE(kIROp_Int64Type, 64, 1); + CASE(kIROp_UInt64Type, 64, 0); + +#undef CASE + + // > OpTypeFloat + +#define CASE(IROP, BITS) \ + case IROP: return emitInst(getSection(SpvLogicalSectionID::Types), inst, SpvOpTypeFloat, kResultID, BITS) + + CASE(kIROp_HalfType, 16); + CASE(kIROp_FloatType, 32); + CASE(kIROp_DoubleType, 64); + +#undef CASE + + // > OpTypeVector + // > OpTypeMatrix + // > OpTypeImage + // > OpTypeSampler + // > OpTypeArray + // > OpTypeRuntimeArray + // > OpTypeStruct + // > OpTypeOpaque + // > OpTypePointer + + case kIROp_FuncType: + // > OpTypeFunction + // + // Both Slang and SPIR-V encode a function type + // with the result-type operand coming first, + // followed by operands for all the parameter types. + // + return emitInst(getSection(SpvLogicalSectionID::Types), inst, SpvOpTypeFunction, kResultID, OperandsOf(inst)); + + // > OpTypeForwardPointer + + case kIROp_Func: + // [3.32.9: Function Instructions] + // + // > OpFunction + // + // Functions are complex enough that we'll handle + // them in a dedicated subroutine. + // + return emitFunc(as<IRFunc>(inst)); + + // ... 
+ + default: + SLANG_UNIMPLEMENTED_X("unhandled instruction opcode"); + UNREACHABLE_RETURN(nullptr); + } + } + + /// Emit the given `irFunc` to SPIR-V + SpvInst* emitFunc(IRFunc* irFunc) + { + // [2.4: Logical Layout of a Module] + // + // > All function declarations ("declarations" are functions + // > without a body; there is no forward declaration to a + // > function with a body). + // > ... + // > All function definitions (functions with a body). + // + // We need to treat functions differently based + // on whether they have a body or not, since these + // are encoded differently (and to different sections). + // + if( isDefinition(irFunc) ) + { + return emitFuncDefinition(irFunc); + } + else + { + return emitFuncDeclaration(irFunc); + } + } + + /// Emit a declaration for the given `irFunc` + SpvInst* emitFuncDeclaration(IRFunc* irFunc) + { + // For now we aren't handling function declarations; + // we expect to deal only with fully linked modules. + // + SLANG_UNUSED(irFunc); + SLANG_UNEXPECTED("function declaration in SPIR-V emit"); + UNREACHABLE_RETURN(nullptr); + } + + /// Emit a SPIR-V function definition for the Slang IR function `irFunc`. + SpvInst* emitFuncDefinition(IRFunc* irFunc) + { + // [2.4: Logical Layout of a Module] + // + // > All function definitions (functions with a body). + // + auto section = getSection(SpvLogicalSectionID::FunctionDefinitions); + // + // > A function definition is as follows. + // > * Function definition, using OpFunction. + // > * Function parameter declarations, using OpFunctionParameter. + // > * Block + // > * Block + // > * ... + // > * Function end, using OpFunctionEnd. + // + + // [3.24. Function Control] + // + // TODO: We should eventually support emitting the "function control" + // mask to include inline and other hint bits based on decorations + // set on `irFunc`. + // + SpvFunctionControlMask spvFunctionControl = SpvFunctionControlMaskNone; + + // [3.32.9. 
Function Instructions] + // + // > OpFunction + // + // Note that the type <id> of a SPIR-V function uses the + // *result* type of the function, while the actual function + // type is given as a later operand. Slang IR instead uses + // the type of a function instruction to store, you know, its *type*. + // + SpvInst* spvFunc = emitInst(section, irFunc, SpvOpFunction, + irFunc->getDataType()->getResultType(), + kResultID, + spvFunctionControl, + irFunc->getDataType()); + + // > OpFunctionParameter + // + // Unlike Slang, where parameters always belong to blocks, + // the parameters of a SPIR-V function must appear as direct + // children of the function instruction, and before any basic blocks. + // + for( auto irParam : irFunc->getParams() ) + { + emitInst(spvFunc, irParam, SpvOpFunctionParameter, + irParam->getFullType(), + kResultID); + } + + // [3.32.17. Control-Flow Instructions] + // + // > OpLabel + // + // A Slang `IRBlock` corresponds to a SPIR-V `OpLabel`: + // each represents a basic block in the control flow + // graph of a parent function. + // + // We will allocate SPIR-V instructions to represent + // all of the blocks in a function before we emit + // body instructions into any of them. We do this + // because it is possible for one block to make + // a forward reference to another (whereas that is + // not possible for ordinary instructions within + // the blocks in the Slang IR) + // + for( auto irBlock : irFunc->getBlocks() ) + { + emitInst(spvFunc, irBlock, SpvOpLabel, kResultID); + } + + // Once all the basic blocks have had instructions allocated + // for them, we go through and fill them in with their bodies. + // + for( auto irBlock : irFunc->getBlocks() ) + { + // Note: because we already created the block above, + // we can be sure that it will have been registered. + // + SpvInst* spvBlock = nullptr; + m_mapIRInstToSpvInst.TryGetValue(irBlock, spvBlock); + SLANG_ASSERT(spvBlock); + + // [3.32.17. 
Control-Flow Instructions] + // + // > OpPhi + // + // TODO: We eventually need to emit `OpPhi` instructions corresponding + // to the parameters of any non-entry block, with operands representing + // the values passed along incoming edges from the predecessor blocks. + + for( auto irInst : irBlock->getOrdinaryInsts() ) + { + // Any instructions local to the block will be emitted as children + // of the block. + // + emitLocalInst(spvBlock, irInst); + } + } + + // [3.32.9. Function Instructions] + // + // > OpFunctionEnd + // + // In the SPIR-V encoding a function is logically the parent of any + // instructions up to a matching `OpFunctionEnd`. In our intermediate + // structure we will make the `OpFunctionEnd` be the last child of + // the `OpFunction`. + // + emitInst(spvFunc, nullptr, SpvOpFunctionEnd); + + // We will emit any decorations pertinent to the function to the + // appropriate section of the module. + // + emitDecorations(irFunc, getID(spvFunc)); + + return spvFunc; + } + + // The instructions that appear inside the basic blocks of + // functions are what we will call "local" instructions. + // + // When emitting global instructions, we usually have to + // pick the right logical section to emit them into, while + // for local instructions they will usually emit into + // a known parent (the basic block that contains them). + + /// Emit an instruction that is local to the body of the given `parent`. + SpvInst* emitLocalInst(SpvInstParent* parent, IRInst* inst) + { + switch( inst->op ) + { + default: + SLANG_UNIMPLEMENTED_X("unhandled instruction opcode"); + break; + + // [3.32.17. Control-Flow Instructions] + // + // > OpReturn + case kIROp_ReturnVoid: return emitInst(parent, inst, SpvOpReturn); + } + } + + // Both "local" and "global" instructions can have decorations. + // When we decide to emit an instruction, we typically also want + // to emit any decorations that were attached to it that have + // a SPIR-V equivalent. 
+ + /// Emit appropriate SPIR-V decorations for the given IR `irInst`. + /// + /// The given `dstID` should be the `<id>` of the SPIR-V instruction being decorated, + /// and should correspond to `irInst`. + /// + void emitDecorations(IRInst* irInst, SpvWord dstID) + { + for( auto decoration : irInst->getDecorations() ) + { + emitDecoration(dstID, decoration); + } + } + + /// Emit an appropriate SPIR-V decoration for the given IR `decoration`, if necessary and possible. + /// + /// The given `dstID` should be the `<id>` of the SPIR-V instruction being decorated, + /// and should correspond to the parent of `decoration` in the Slang IR. + /// + void emitDecoration(SpvWord dstID, IRDecoration* decoration) + { + // Unlike in the Slang IR, decorations in SPIR-V are not children + // of the instruction they decorate, and instead are free-standing + // instructions at global scope, which reference their target + // instruction by its `<id>`. + // + // The `IRDecoration` hierarchy in Slang also maps to several + // different categories of instruction in SPIR-V, only a subset + // of which are officially called "decorations." + // + // We will continue to use the Slang terminology here, since + // this code path is a catch-all for stuff that only needs to + // be emitted if the owning instruction gets emitted. + + switch( decoration->op ) + { + default: + break; + + // [3.32.2. Debug Instructions] + // + // > OpName + // + case kIROp_NameHintDecoration: + { + auto section = getSection(SpvLogicalSectionID::DebugNames); + auto nameHint = cast<IRNameHintDecoration>(decoration); + emitInst(section, decoration, SpvOpName, dstID, nameHint->getName()); + } + break; + + // [3.32.5. Mode-Setting Instructions] + // + // > OpEntryPoint + // > Declare an entry point, its execution model, and its interface. 
+ // + case kIROp_EntryPointDecoration: + { + auto section = getSection(SpvLogicalSectionID::EntryPoints); + + // TODO: The `OpEntryPoint` is required to list any varying + // input or output parameters (by `<id>`) used by the entry point, + // although these are encoded as global variables in the IR. + // + // Currently we have a pass that moves entry-point varying + // parameters to global scope for the benefit of GLSL output, + // but we do not maintain a connection between those parameters + // and the original entry point. That pass should be updated + // to attach a decoration linking the original entry point + // to the new globals, which would be used in the SPIR-V emit case. + + auto entryPointDecor = cast<IREntryPointDecoration>(decoration); + auto spvStage = mapStageToExecutionModel(entryPointDecor->getProfile().GetStage()); + auto name = entryPointDecor->getName()->getStringSlice(); + emitInst(section, decoration, SpvOpEntryPoint, spvStage, dstID, name); + } + break; + + // > OpExecutionMode + + // [3.6. Execution Mode]: LocalSize + case kIROp_NumThreadsDecoration: + { + auto section = getSection(SpvLogicalSectionID::ExecutionModes); + + // TODO: The `LocalSize` execution mode option requires + // literal values for the X,Y,Z thread-group sizes. + // There is a `LocalSizeId` variant that takes `<id>`s + // for those sizes, and we should consider using that + // and requiring the appropriate capabilities + // if any of the operands to the decoration are not + // literals (in a future where we support non-literals + // in those positions in the Slang IR). + // + auto numThreads = cast<IRNumThreadsDecoration>(decoration); + emitInst(section, decoration, SpvOpExecutionMode, dstID, SpvExecutionModeLocalSize, + SpvWord(numThreads->getX()->getValue()), + SpvWord(numThreads->getY()->getValue()), + SpvWord(numThreads->getZ()->getValue())); + } + break; + + // ... 
+ } + } + + /// Map a Slang `Stage` to a corresponding SPIR-V execution model + SpvExecutionModel mapStageToExecutionModel(Stage stage) + { + switch( stage ) + { + default: + SLANG_UNEXPECTED("unhandled stage"); + UNREACHABLE_RETURN((SpvExecutionModel)0); + +#define CASE(STAGE, MODEL) \ + case Stage::STAGE: return SpvExecutionModel##MODEL + + CASE(Vertex, Vertex); + CASE(Hull, TessellationControl); + CASE(Domain, TessellationEvaluation); + CASE(Geometry, Geometry); + CASE(Fragment, Fragment); + CASE(Compute, GLCompute); + + // TODO: Extended execution models for ray tracing, etc. + +#undef CASE + } + } +}; + +SlangResult emitSPIRVFromIR( + BackEndCompileRequest* compileRequest, + IRModule* irModule, + IRFunc* irEntryPoint, + List<uint8_t>& spirvOut) +{ + SLANG_UNUSED(compileRequest); + SLANG_UNUSED(irModule); + SLANG_UNUSED(irEntryPoint); + + spirvOut.clear(); + + SPIRVEmitContext context; + context.m_irModule = irModule; + context.emitFrontMatter(); + context.ensureInst(irEntryPoint); + context.emitPhysicalLayout(); + + spirvOut.addRange( + (uint8_t const*) context.m_words.getBuffer(), + context.m_words.getCount() * sizeof(context.m_words[0])); + + return SLANG_OK; +} + + +} // namespace Slang diff --git a/source/slang/slang-emit.cpp b/source/slang/slang-emit.cpp index 345dfe9b2..f4db3c5c1 100644 --- a/source/slang/slang-emit.cpp +++ b/source/slang/slang-emit.cpp @@ -158,7 +158,293 @@ static void dumpIRIfEnabled( } } -String emitEntryPoint( +struct LinkingAndOptimizationOptions +{ + bool shouldLegalizeExistentialAndResourceTypes = true; + CLikeSourceEmitter* sourceEmitter = nullptr; +}; + +Result linkAndOptimizeIR( + BackEndCompileRequest* compileRequest, + Int entryPointIndex, + CodeGenTarget target, + TargetRequest* targetRequest, + LinkingAndOptimizationOptions const& options, + LinkedIR& outLinkedIR) +{ + auto sink = compileRequest->getSink(); + auto program = compileRequest->getProgram(); + auto targetProgram = program->getTargetProgram(targetRequest); + + 
auto session = targetRequest->getSession(); + + // We start out by performing "linking" at the level of the IR. + // This step will create a fresh IR module to be used for + // code generation, and will copy in any IR definitions that + // the desired entry point requires. Along the way it will + // resolve references to imported/exported symbols across + // modules, and also select between the definitions of + // any "profile-overloaded" symbols. + // + outLinkedIR = linkIR( + compileRequest, + entryPointIndex, + target, + targetProgram); + auto irModule = outLinkedIR.module; + auto irEntryPoint = outLinkedIR.entryPoint; + +#if 0 + dumpIRIfEnabled(compileRequest, irModule, "LINKED"); +#endif + + validateIRModuleIfEnabled(compileRequest, irModule); + + // If the user specified the flag that they want us to dump + // IR, then do it here, for the target-specific, but + // un-specialized IR. + dumpIRIfEnabled(compileRequest, irModule); + + // Replace any global constants with their values. + // + replaceGlobalConstants(irModule); +#if 0 + dumpIRIfEnabled(compileRequest, irModule, "GLOBAL CONSTANTS REPLACED"); +#endif + validateIRModuleIfEnabled(compileRequest, irModule); + + + // When there are top-level existential-type parameters + // to the shader, we need to take the side-band information + // on how the existential "slots" were bound to concrete + // types, and use it to introduce additional explicit + // shader parameters for those slots, to be wired up to + // use sites. + // + bindExistentialSlots(irModule, sink); +#if 0 + dumpIRIfEnabled(compileRequest, irModule, "EXISTENTIALS BOUND"); +#endif + validateIRModuleIfEnabled(compileRequest, irModule); + + + + + + // Now that we've linked the IR code, any layout/binding + // information has been attached to shader parameters + // and entry points. Now we are safe to make transformations + // that might move code without worrying about losing + // the connection between a parameter and its layout. 
+ // + // An easy transformation of this kind is to take uniform + // parameters of a shader entry point and move them into + // the global scope instead. + // + moveEntryPointUniformParamsToGlobalScope(irModule, target); +#if 0 + dumpIRIfEnabled(compileRequest, irModule, "ENTRY POINT UNIFORMS MOVED"); +#endif + validateIRModuleIfEnabled(compileRequest, irModule); + + // Desugar any union types, since these will be illegal on + // various targets. + // + desugarUnionTypes(irModule); +#if 0 + dumpIRIfEnabled(compileRequest, irModule, "UNIONS DESUGARED"); +#endif + validateIRModuleIfEnabled(compileRequest, irModule); + + // Next, we need to ensure that the code we emit for + // the target doesn't contain any operations that would + // be illegal on the target platform. For example, + // none of our targets supports generics or interfaces, + // so we need to specialize those away. + // + // Simplification of existential-based and generics-based + // code may each open up opportunities for the other, so + // the relevant specialization transformations are handled in a + // single pass that looks for all simplification opportunities. + // + // TODO: We also need to extend this pass so that it will "expose" + // existential values that are nested inside of other types, + // so that the simplifications can be applied. + // + // TODO: This pass is *also* likely to be the place where we + // perform specialization of functions based on parameter + // values that need to be compile-time constants. + // + specializeModule(irModule); + + // Debugging code for IR transformations... +#if 0 + dumpIRIfEnabled(compileRequest, irModule, "SPECIALIZED"); +#endif + validateIRModuleIfEnabled(compileRequest, irModule); + + + // Specialization can introduce dead code that could trip + // up downstream passes like type legalization, so we + // will run a DCE pass to clean up after the specialization. + // + // TODO: Are there other cleanup optimizations we should + // apply at this point? 
+ // + eliminateDeadCode(irModule); +#if 0 + dumpIRIfEnabled(compileRequest, irModule, "AFTER DCE"); +#endif + validateIRModuleIfEnabled(compileRequest, irModule); + + // We don't need the legalize pass for C/C++ based types + if(options.shouldLegalizeExistentialAndResourceTypes ) +// if (!(sourceStyle == SourceStyle::CPP || sourceStyle == SourceStyle::C)) + { + // The Slang language allows interfaces to be used like + // ordinary types (including placing them in constant + // buffers and entry-point parameter lists), but then + // getting them to lay out in a reasonable way requires + // us to treat fields/variables with interface type + // *as if* they were pointers to heap-allocated "objects." + // + // Specialization will have replaced fields/variables + // with interface types like `IFoo` with fields/variables + // with pointer-like types like `ExistentialBox<SomeType>`. + // + // We need to legalize these pointer-like types away, + // which involves two main changes: + // + // 1. Any `ExistentialBox<...>` fields need to be moved + // out of their enclosing `struct` type, so that the layout + // of the enclosing type is computed as if the field had + // zero size. + // + // 2. Once an `ExistentialBox<X>` has been floated out + // of its parent and landed somewhere permanent (e.g., either + // a dedicated variable, or a field of a constant buffer), + // we need to replace it with just an `X`, after which we + // will have (more) legal shader code. + // + legalizeExistentialTypeLayout( + irModule, + sink); + eliminateDeadCode(irModule); + +#if 0 + dumpIRIfEnabled(compileRequest, irModule, "EXISTENTIALS LEGALIZED"); +#endif + validateIRModuleIfEnabled(compileRequest, irModule); + + // Many of our target languages and/or downstream compilers + // don't support `struct` types that have resource-type fields. 
+ // In order to work around this limitation, we will rewrite the + // IR so that any structure types with resource-type fields get + // split into a "tuple" that comprises the ordinary fields (still + // bundled up as a `struct`) and one element for each resource-type + // field (recursively). + // + // What used to be individual variables/parameters/arguments/etc. + // then become multiple variables/parameters/arguments/etc. + // + legalizeResourceTypes( + irModule, + sink); + eliminateDeadCode(irModule); + + // Debugging output of legalization + #if 0 + dumpIRIfEnabled(compileRequest, irModule, "LEGALIZED"); + #endif + validateIRModuleIfEnabled(compileRequest, irModule); + } + + // Once specialization and type legalization have been performed, + // we should perform some of our basic optimization steps again, + // to see if we can clean up any temporaries created by legalization. + // (e.g., things that used to be aggregated might now be split up, + // so that we can work with the individual fields). + constructSSA(irModule); + +#if 0 + dumpIRIfEnabled(compileRequest, irModule, "AFTER SSA"); +#endif + validateIRModuleIfEnabled(compileRequest, irModule); + + // After type legalization and subsequent SSA cleanup we expect + // that any resource types passed to functions are exposed + // as their own top-level parameters (which might have + // resource or array-of-...-resource types). + // + // Many of our targets place restrictions on how certain + // resource types can be used, so that having them as + // function parameters is invalid. To clean this up, + // we will try to specialize called functions based + // on the actual resources that are being passed to them + // at specific call sites. + // + // Because the legalization may depend on what target + // we are compiling for (certain things might be okay + // for D3D targets that are not okay for Vulkan), we + // pass down the target request along with the IR. 
+ // + specializeResourceParameters(compileRequest, targetRequest, irModule); + +#if 0 + dumpIRIfEnabled(compileRequest, irModule, "AFTER RESOURCE SPECIALIZATION"); +#endif + validateIRModuleIfEnabled(compileRequest, irModule); + + + // For GLSL only, we will need to perform "legalization" of + // the entry point and any entry-point parameters. + // + // TODO: We should consider moving this legalization work + // as late as possible, so that it doesn't affect how other + // optimization passes need to work. + // + switch (target) + { + case CodeGenTarget::GLSL: + { + legalizeEntryPointForGLSL( + session, + irModule, + irEntryPoint, + compileRequest->getSink(), + options.sourceEmitter->getGLSLExtensionTracker()); + +#if 0 + dumpIRIfEnabled(compileRequest, irModule, "GLSL LEGALIZED"); +#endif + validateIRModuleIfEnabled(compileRequest, irModule); + } + break; + + default: + break; + } + + // The resource-based specialization pass above + // may create specialized versions of functions, but + // it does not try to completely eliminate the original + // functions, so there might still be invalid code in + // our IR module. + // + // To clean up the code, we will apply a fairly general + // dead-code-elimination (DCE) pass that only retains + // whatever code is "live." 
+ // + eliminateDeadCode(irModule); +#if 0 + dumpIRIfEnabled(compileRequest, irModule, "AFTER DCE"); +#endif + validateIRModuleIfEnabled(compileRequest, irModule); + + return SLANG_OK; +} + +String emitEntryPointSource( BackEndCompileRequest* compileRequest, Int entryPointIndex, CodeGenTarget target, @@ -166,7 +452,6 @@ String emitEntryPoint( { auto sink = compileRequest->getSink(); auto program = compileRequest->getProgram(); - auto targetProgram = program->getTargetProgram(targetRequest); auto entryPoint = program->getEntryPoint(entryPointIndex); @@ -226,269 +511,30 @@ String emitEntryPoint( // Outside because we want to keep IR in scope whilst we are processing emits LinkedIR linkedIR; { - auto session = targetRequest->getSession(); - - // We start out by performing "linking" at the level of the IR. - // This step will create a fresh IR module to be used for - // code generation, and will copy in any IR definitions that - // the desired entry point requires. Along the way it will - // resolve references to imported/exported symbols across - // modules, and also select between the definitions of - // any "profile-overloaded" symbols. - // - linkedIR = linkIR( - compileRequest, - entryPointIndex, - target, - targetProgram); - auto irModule = linkedIR.module; - auto irEntryPoint = linkedIR.entryPoint; - -#if 0 - dumpIRIfEnabled(compileRequest, irModule, "LINKED"); -#endif - - validateIRModuleIfEnabled(compileRequest, irModule); + LinkingAndOptimizationOptions linkingAndOptimizationOptions; - // If the user specified the flag that they want us to dump - // IR, then do it here, for the target-specific, but - // un-specialized IR. - dumpIRIfEnabled(compileRequest, irModule); - - // Replace any global constants with their values. 
- // - replaceGlobalConstants(irModule); -#if 0 - dumpIRIfEnabled(compileRequest, irModule, "GLOBAL CONSTANTS REPLACED"); -#endif - validateIRModuleIfEnabled(compileRequest, irModule); - - - // When there are top-level existential-type parameters - // to the shader, we need to take the side-band information - // on how the existential "slots" were bound to concrete - // types, and use it to introduce additional explicit - // shader parameters for those slots, to be wired up to - // use sites. - // - bindExistentialSlots(irModule, sink); -#if 0 - dumpIRIfEnabled(compileRequest, irModule, "EXISTENTIALS BOUND"); -#endif - validateIRModuleIfEnabled(compileRequest, irModule); + linkingAndOptimizationOptions.sourceEmitter = sourceEmitter; - - - - - // Now that we've linked the IR code, any layout/binding - // information has been attached to shader parameters - // and entry points. Now we are safe to make transformations - // that might move code without worrying about losing - // the connection between a parameter and its layout. - // - // An easy transformation of this kind is to take uniform - // parameters of a shader entry point and move them into - // the global scope instead. - // - moveEntryPointUniformParamsToGlobalScope(irModule, target); -#if 0 - dumpIRIfEnabled(compileRequest, irModule, "ENTRY POINT UNIFORMS MOVED"); -#endif - validateIRModuleIfEnabled(compileRequest, irModule); - - // Desguar any union types, since these will be illegal on - // various targets. - // - desugarUnionTypes(irModule); -#if 0 - dumpIRIfEnabled(compileRequest, irModule, "UNIONS DESUGARED"); -#endif - validateIRModuleIfEnabled(compileRequest, irModule); - - // Next, we need to ensure that the code we emit for - // the target doesn't contain any operations that would - // be illegal on the target platform. For example, - // none of our target supports generics, or interfaces, - // so we need to specialize those away. 
- // - // Simplification of existential-based and generics-based - // code may each open up opportunities for the other, so - // the relevant specialization transformations are handled in a - // single pass that looks for all simplification opportunities. - // - // TODO: We also need to extend this pass so that it will "expose" - // existential values that are nested inside of other types, - // so that the simplifications can be applied. - // - // TODO: This pass is *also* likely to be the place where we - // perform specialization of functions based on parameter - // values that need to be compile-time constants. - // - specializeModule(irModule); - - // Debugging code for IR transformations... -#if 0 - dumpIRIfEnabled(compileRequest, irModule, "SPECIALIZED"); -#endif - validateIRModuleIfEnabled(compileRequest, irModule); - - - // Specialization can introduce dead code that could trip - // up downstream passes like type legalization, so we - // will run a DCE pass to clean up after the specialization. - // - // TODO: Are there other cleanup optimizations we should - // apply at this point? - // - eliminateDeadCode(irModule); -#if 0 - dumpIRIfEnabled(compileRequest, irModule, "AFTER DCE"); -#endif - validateIRModuleIfEnabled(compileRequest, irModule); - - // We don't need the legalize pass for C/C++ based types - if (!(sourceStyle == SourceStyle::CPP || sourceStyle == SourceStyle::C)) + switch( sourceStyle ) { - // The Slang language allows interfaces to be used like - // ordinary types (including placing them in constant - // buffers and entry-point parameter lists), but then - // getting them to lay out in a reasonable way requires - // us to treat fields/variables with interface type - // *as if* they were pointers to heap-allocated "objects." - // - // Specialization will have replaced fields/variables - // with interface types like `IFoo` with fields/variables - // with pointer-like types like `ExistentialBox<SomeType>`. 
- // - // We need to legalize these pointer-like types away, - // which involves two main changes: - // - // 1. Any `ExistentialBox<...>` fields need to be moved - // out of their enclosing `struct` type, so that the layout - // of the enclosing type is computed as if the field had - // zero size. - // - // 2. Once an `ExistentialBox<X>` has been floated out - // of its parent and landed somwhere permanent (e.g., either - // a dedicated variable, or a field of constant buffer), - // we need to replace it with just an `X`, after which we - // will have (more) legal shader code. - // - legalizeExistentialTypeLayout( - irModule, - sink); - eliminateDeadCode(irModule); - -#if 0 - dumpIRIfEnabled(compileRequest, irModule, "EXISTENTIALS LEGALIZED"); -#endif - validateIRModuleIfEnabled(compileRequest, irModule); - - // Many of our target languages and/or downstream compilers - // don't support `struct` types that have resource-type fields. - // In order to work around this limitation, we will rewrite the - // IR so that any structure types with resource-type fields get - // split into a "tuple" that comprises the ordinary fields (still - // bundles up as a `struct`) and one element for each resource-type - // field (recursively). - // - // What used to be individual variables/parameters/arguments/etc. - // then become multiple variables/parameters/arguments/etc. - // - legalizeResourceTypes( - irModule, - sink); - eliminateDeadCode(irModule); - } - - // Debugging output of legalization -#if 0 - dumpIRIfEnabled(compileRequest, irModule, "LEGALIZED"); -#endif - validateIRModuleIfEnabled(compileRequest, irModule); - - // Once specialization and type legalization have been performed, - // we should perform some of our basic optimization steps again, - // to see if we can clean up any temporaries created by legalization. - // (e.g., things that used to be aggregated might now be split up, - // so that we can work with the individual fields). 
- constructSSA(irModule); - -#if 0 - dumpIRIfEnabled(compileRequest, irModule, "AFTER SSA"); -#endif - validateIRModuleIfEnabled(compileRequest, irModule); - - // After type legalization and subsequent SSA cleanup we expect - // that any resource types passed to functions are exposed - // as their own top-level parameters (which might have - // resource or array-of-...-resource types). - // - // Many of our targets place restrictions on how certain - // resource types can be used, so that having them as - // function parameters is invalid. To clean this up, - // we will try to specialize called functions based - // on the actual resources that are being passed to them - // at specific call sites. - // - // Because the legalization may depend on what target - // we are compiling for (certain things might be okay - // for D3D targets that are not okay for Vulkan), we - // pass down the target request along with the IR. - // - specializeResourceParameters(compileRequest, targetRequest, irModule); - -#if 0 - dumpIRIfEnabled(compileRequest, irModule, "AFTER RESOURCE SPECIALIZATION"); -#endif - validateIRModuleIfEnabled(compileRequest, irModule); - - - // For GLSL only, we will need to perform "legalization" of - // the entry point and any entry-point parameters. - // - // TODO: We should consider moving this legalization work - // as late as possible, so that it doesn't affect how other - // optimization passes need to work. 
-    //
-    switch (target)
-    {
-    case CodeGenTarget::GLSL:
-        {
-            legalizeEntryPointForGLSL(
-                session,
-                irModule,
-                irEntryPoint,
-                compileRequest->getSink(),
-                sourceEmitter->getGLSLExtensionTracker());
-
-#if 0
-            dumpIRIfEnabled(compileRequest, irModule, "GLSL LEGALIZED");
-#endif
-            validateIRModuleIfEnabled(compileRequest, irModule);
-        }
-        break;
-    default: break;
+
+    case SourceStyle::CPP:
+    case SourceStyle::C:
+        linkingAndOptimizationOptions.shouldLegalizeExistentialAndResourceTypes = false;
+        break;
     }
-    // The resource-based specialization pass above
-    // may create specialized versions of functions, but
-    // it does not try to completely eliminate the original
-    // functions, so there might still be invalid code in
-    // our IR module.
-    //
-    // To clean up the code, we will apply a fairly general
-    // dead-code-elimination (DCE) pass that only retains
-    // whatever code is "live."
-    //
-    eliminateDeadCode(irModule);
-#if 0
-    dumpIRIfEnabled(compileRequest, irModule, "AFTER DCE");
-#endif
-    validateIRModuleIfEnabled(compileRequest, irModule);
+    linkAndOptimizeIR(
+        compileRequest,
+        entryPointIndex,
+        target,
+        targetRequest,
+        linkingAndOptimizationOptions,
+        linkedIR);
+
+    auto irModule = linkedIR.module;
     // After all of the required optimization and legalization
     // passes have been performed, we can emit target code from
@@ -570,4 +616,48 @@ String emitEntryPoint(
     return finalResult;
 }
+SlangResult emitSPIRVFromIR(
+    BackEndCompileRequest* compileRequest,
+    IRModule* irModule,
+    IRFunc* irEntryPoint,
+    List<uint8_t>& spirvOut);
+
+SlangResult emitSPIRVForEntryPointDirectly(
+    BackEndCompileRequest* compileRequest,
+    Int entryPointIndex,
+    TargetRequest* targetRequest,
+    List<uint8_t>& spirvOut)
+{
+    auto sink = compileRequest->getSink();
+    auto program = compileRequest->getProgram();
+    auto targetProgram = program->getTargetProgram(targetRequest);
+    auto programLayout = targetProgram->getOrCreateLayout(sink);
+
+    RefPtr<EntryPointLayout> entryPointLayout = programLayout->entryPoints[entryPointIndex];
+
+    // Outside because we want to keep IR in scope whilst we are processing emits
+    LinkedIR linkedIR;
+    LinkingAndOptimizationOptions linkingAndOptimizationOptions;
+    linkAndOptimizeIR(
+        compileRequest,
+        entryPointIndex,
+        targetRequest->getTarget(),
+        targetRequest,
+        linkingAndOptimizationOptions,
+        linkedIR);
+
+    auto irModule = linkedIR.module;
+    auto irEntryPoint = linkedIR.entryPoint;
+
+    emitSPIRVFromIR(
+        compileRequest,
+        irModule,
+        irEntryPoint,
+        spirvOut);
+
+    return SLANG_OK;
+}
+
+
+
 } // namespace Slang
diff --git a/source/slang/slang-emit.h b/source/slang/slang-emit.h
index 46131d726..e9ee361d7 100644
--- a/source/slang/slang-emit.h
+++ b/source/slang/slang-emit.h
@@ -34,7 +34,7 @@ namespace Slang
     /// generate different HLSL output if we know it
     /// will be used to generate SPIR-V).
     ///
-    String emitEntryPoint(
+    String emitEntryPointSource(
         BackEndCompileRequest* compileRequest,
         Int entryPointIndex,
         CodeGenTarget target,
diff --git a/source/slang/slang-ir-link.cpp b/source/slang/slang-ir-link.cpp
index 88f385548..80b0cd39e 100644
--- a/source/slang/slang-ir-link.cpp
+++ b/source/slang/slang-ir-link.cpp
@@ -918,6 +918,9 @@ String getTargetName(IRSpecContext* context)
     case CodeGenTarget::CPPSource:
         return "cpp";
+    case CodeGenTarget::SPIRV:
+        return "spirv";
+
     default:
         SLANG_UNEXPECTED("unhandled case");
         UNREACHABLE_RETURN("unknown");
diff --git a/source/slang/slang-options.cpp b/source/slang/slang-options.cpp
index 360526a56..0e55c6f3d 100644
--- a/source/slang/slang-options.cpp
+++ b/source/slang/slang-options.cpp
@@ -931,6 +931,10 @@ struct OptionsParser
         {
             sink->diagnoseRaw(Severity::Note, session->getBuildTagString());
         }
+        else if( argStr == "-emit-spirv-directly" )
+        {
+            requestImpl->getBackEndReq()->shouldEmitSPIRVDirectly = true;
+        }
         else if (argStr == "--")
         {
             // The `--` option causes us to stop trying to parse options,
diff --git a/source/slang/slang.vcxproj b/source/slang/slang.vcxproj
index 9b2b4d7ea..142a90b12 100644
--- a/source/slang/slang.vcxproj
+++ b/source/slang/slang.vcxproj
@@ -99,6 +99,7 @@
       <WarningLevel>Level4</WarningLevel>
       <TreatWarningAsError>true</TreatWarningAsError>
       <PreprocessorDefinitions>_DEBUG;SLANG_DYNAMIC_EXPORT;%(PreprocessorDefinitions)</PreprocessorDefinitions>
+      <AdditionalIncludeDirectories>..\..\external\spirv-headers\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
       <DebugInformationFormat>EditAndContinue</DebugInformationFormat>
       <Optimization>Disabled</Optimization>
       <RuntimeLibrary>MultiThreadedDebug</RuntimeLibrary>
@@ -119,6 +120,7 @@
       <WarningLevel>Level4</WarningLevel>
       <TreatWarningAsError>true</TreatWarningAsError>
       <PreprocessorDefinitions>_DEBUG;SLANG_DYNAMIC_EXPORT;%(PreprocessorDefinitions)</PreprocessorDefinitions>
+      <AdditionalIncludeDirectories>..\..\external\spirv-headers\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
       <DebugInformationFormat>EditAndContinue</DebugInformationFormat>
       <Optimization>Disabled</Optimization>
       <RuntimeLibrary>MultiThreadedDebug</RuntimeLibrary>
@@ -139,6 +141,7 @@
       <WarningLevel>Level4</WarningLevel>
       <TreatWarningAsError>true</TreatWarningAsError>
       <PreprocessorDefinitions>NDEBUG;SLANG_DYNAMIC_EXPORT;%(PreprocessorDefinitions)</PreprocessorDefinitions>
+      <AdditionalIncludeDirectories>..\..\external\spirv-headers\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
       <Optimization>Full</Optimization>
       <FunctionLevelLinking>true</FunctionLevelLinking>
       <IntrinsicFunctions>true</IntrinsicFunctions>
@@ -163,6 +166,7 @@
       <WarningLevel>Level4</WarningLevel>
       <TreatWarningAsError>true</TreatWarningAsError>
       <PreprocessorDefinitions>NDEBUG;SLANG_DYNAMIC_EXPORT;%(PreprocessorDefinitions)</PreprocessorDefinitions>
+      <AdditionalIncludeDirectories>..\..\external\spirv-headers\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
       <Optimization>Full</Optimization>
       <FunctionLevelLinking>true</FunctionLevelLinking>
       <IntrinsicFunctions>true</IntrinsicFunctions>
@@ -278,6 +282,7 @@
     <ClCompile Include="slang-emit-hlsl.cpp" />
     <ClCompile Include="slang-emit-precedence.cpp" />
     <ClCompile Include="slang-emit-source-writer.cpp" />
+    <ClCompile Include="slang-emit-spirv.cpp" />
     <ClCompile Include="slang-emit.cpp" />
     <ClCompile Include="slang-file-system.cpp" />
     <ClCompile Include="slang-ir-bind-existentials.cpp" />
diff --git a/source/slang/slang.vcxproj.filters b/source/slang/slang.vcxproj.filters
index b2c045fa8..9153206b6 100644
--- a/source/slang/slang.vcxproj.filters
+++ b/source/slang/slang.vcxproj.filters
@@ -293,6 +293,9 @@
     <ClCompile Include="slang-emit-source-writer.cpp">
       <Filter>Source Files</Filter>
     </ClCompile>
+    <ClCompile Include="slang-emit-spirv.cpp">
+      <Filter>Source Files</Filter>
+    </ClCompile>
     <ClCompile Include="slang-emit.cpp">
       <Filter>Source Files</Filter>
     </ClCompile>
diff --git a/tests/spirv/direct-spirv-emit.slang b/tests/spirv/direct-spirv-emit.slang
new file mode 100644
index 000000000..99507f795
--- /dev/null
+++ b/tests/spirv/direct-spirv-emit.slang
@@ -0,0 +1,9 @@
+// direct-spirv-emit.slang
+
+//TEST:SIMPLE:-target spirv -entry computeMain -stage compute -emit-spirv-directly
+
+// Test ability to directly output SPIR-V
+
+[numthreads(4,1,1)]
+void computeMain()
+{}
diff --git a/tests/spirv/direct-spirv-emit.slang.expected b/tests/spirv/direct-spirv-emit.slang.expected
new file mode 100644
index 000000000..28b7ed85a
--- /dev/null
+++ b/tests/spirv/direct-spirv-emit.slang.expected
@@ -0,0 +1,20 @@
+result code = 0
+standard error = {
+}
+standard output = {
+// Module Version 10400
+// Generated by (magic number): 0
+// Id's are bound by 5
+
+ Capability Shader
+ MemoryModel Logical GLSL450
+ EntryPoint GLCompute 2 "computeMain"
+ ExecutionMode 2 LocalSize 4 1 1
+ Name 2 "computeMain"
+ 1: TypeVoid
+ 3: TypeFunction 1
+ 2(computeMain): 1 Function None 3
+ 4: Label
+ Return
+ FunctionEnd
+}
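The header comments in the expected output above (`Module Version`, `Generated by (magic number)`, `Id's are bound by`) correspond to the first five 32-bit words of a SPIR-V binary. As a minimal sketch, assuming a little-endian byte buffer similar to the `List<uint8_t>` that `emitSPIRVForEntryPointDirectly()` fills in, the C++ below decodes that header. The names `SpirvHeader`, `readWordLE`, `parseSpirvHeader`, `spirvMajorVersion`, and `spirvMinorVersion` are hypothetical and not part of Slang, and `std::vector` stands in for Slang's `List`. Only the five-word layout and the magic number `0x07230203` are taken from the SPIR-V specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration type (not part of Slang): the five 32-bit words
// that begin every SPIR-V module, per the SPIR-V specification.
struct SpirvHeader
{
    uint32_t magic;     // must equal 0x07230203
    uint32_t version;   // byte layout, high to low: 0 | major | minor | 0
    uint32_t generator; // tool-specific "generator magic number"
    uint32_t bound;     // every <id> in the module is less than this value
    uint32_t schema;    // reserved
};

// Read the 32-bit word at `wordIndex`, assuming little-endian byte order.
static uint32_t readWordLE(const std::vector<uint8_t>& bytes, size_t wordIndex)
{
    const size_t i = wordIndex * 4;
    assert(i + 4 <= bytes.size());
    return uint32_t(bytes[i])
        | (uint32_t(bytes[i + 1]) << 8)
        | (uint32_t(bytes[i + 2]) << 16)
        | (uint32_t(bytes[i + 3]) << 24);
}

// Returns false if the buffer is too short or the magic number does not match.
bool parseSpirvHeader(const std::vector<uint8_t>& bytes, SpirvHeader& outHeader)
{
    const size_t kHeaderWordCount = 5;
    if (bytes.size() < kHeaderWordCount * 4)
        return false;

    SpirvHeader header;
    header.magic     = readWordLE(bytes, 0);
    header.version   = readWordLE(bytes, 1);
    header.generator = readWordLE(bytes, 2);
    header.bound     = readWordLE(bytes, 3);
    header.schema    = readWordLE(bytes, 4);

    if (header.magic != 0x07230203u)
        return false;

    outHeader = header;
    return true;
}

// Extract the major and minor version numbers from the version word.
uint32_t spirvMajorVersion(const SpirvHeader& header) { return (header.version >> 16) & 0xFFu; }
uint32_t spirvMinorVersion(const SpirvHeader& header) { return (header.version >> 8) & 0xFFu; }
```

If the disassembler prints the version word in hexadecimal (an assumption about the tool that produced the expected output), `Module Version 10400` corresponds to a version word of `0x00010400`, which this sketch would report as major version 1 and minor version 4.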
