Change the policy for entry-point uniform parameters on Vulkan (#1476)

Entry point `uniform` parameters were a feature of the original Cg and HLSL, but have not been used much in production shader code. One of our goals on Slang is to reduce the (ab)use of the global scope, so bringing entry point `uniform` parameters up to a greater level of usability is an important goal.

Some policy choices about how global vs. entry-point `uniform` parameters behave have already been made, and they shape decisions looking forward:

* For DXBC/DXIL, it makes the most sense to follow the lead of fxc/dxc, by treating entry point `uniform` parameters as a kind of syntax sugar for global shader parameters. Any parameters of "ordinary" types are bundled up into an implicit constant buffer, and all the resources (including the implicit constant buffer) are assigned `register`s just as for globals. It is up to the application to decide how to bind those parameters via a root signature (using root descriptors, root constants, descriptor tables, local vs. global root signature, etc.)

* For CPU, it makes sense to pass global vs. entry-point parameters as two different pointers, although the details of what we do for CPU are the least constrained across all current targets.

* For CUDA compute, it makes the most sense to map global shader parameters to `__constant__` global data, and entry-point `uniform` parameters to kernel parameters. This choice ensures that the signature of a kernel when translated from Slang->CUDA follows the Principle of Least Surprise, at the cost of making entry-point vs. global parameters be passed via different mechanisms.

* For OptiX ray tracing, it makes sense to expand on the precedent from CUDA compute: pass global parameters via global `__constant__` data (as is already expected by OptiX for whole-launch parameters), and pass entry-point `uniform` parameters via the "shader record." This establishes a precedent that for ray-tracing shaders, global-scope parameters map to the "global root signature" concept from DXR, while entry-point `uniform` parameters map to a "local root signature" or "shader record."

* For Vulkan ray tracing, the precedent from OptiX then argues that entry-point `uniform` parameters should map to the Vulkan "shader record" concept (and thus cannot support things like resource types).

* The remaining interesting case is what to do for non-ray-tracing shaders on Vulkan.

The dev team agrees that the most reasonable choice to make for non-ray-tracing Vulkan shaders is to map entry-point `uniform` parameters to "push constants." In particular, this makes it easy to express the case of a compute kernel with direct parameters of ordinary/value types in the way that will be implemented most efficiently.

The big picture is then that a kernel like:

```hlsl
void computeMain(uniform float someValue)
{ ... }
```

will map to output GLSL like:

```glsl
layout(push_constant) uniform
{
    float someValue;
} U;

void main()
{ ... }
```

If the user really wanted a constant-buffer binding to be created instead, they can easily change their input to make the buffer explicit:

```hlsl
struct Params { float someValue; }
void computeMain(uniform ConstantBuffer<Params> params)
{ ... }
```

(Forcing the user to be explicit about the desire for a buffer here creates a nice symmetry between Vulkan and CUDA; in the first case the user sets up the data in host memory and passes it to the GPU by copy, while in the second case the user must allocate and set up a device-memory buffer for the data. This symmetry extends to D3D if the application chooses to map entry-point `uniform` parameters to root constants.)

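
The per-target policy described above can be condensed into a small lookup. This is only an illustrative sketch: the `Target` enum and the `entryPointUniformMechanism` function are hypothetical names invented here, not part of the Slang compiler, and the returned strings simply restate the policy for entry-point `uniform` parameters of ordinary types.

```cpp
#include <string>

// Hypothetical target list for illustration; not the Slang compiler's own enum.
enum class Target { D3D, CPU, CUDA, OptiX, Vulkan };

// Returns a short description of how entry-point `uniform` parameters of
// ordinary types are passed on each target, following the policy above.
std::string entryPointUniformMechanism(Target target, bool isRayTracing)
{
    switch (target)
    {
    case Target::D3D:    return "implicit constant buffer";
    case Target::CPU:    return "separate entry-point pointer";
    case Target::CUDA:   return "kernel parameters";
    case Target::OptiX:  return "shader record";
    case Target::Vulkan: return isRayTracing ? "shader record" : "push constants";
    }
    return "unknown";
}
```

Only the Vulkan case depends on the pipeline kind, which is the distinction this change introduces.
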
This change implements logic in the "parameter binding" part of the Slang compiler to make sure that entry-point `uniform` parameters are wrapped up in a push-constant buffer rather than an ordinary constant buffer for non-ray-tracing shaders on Vulkan (and in a shader record "buffer" for the ray-tracing case).

The majority of the actual work was in adding support for root/push constants to the test framework and the graphics API abstraction it uses. To be clear about that support:

* Root constant ranges are (perhaps confusingly) treated as a new kind of "slot" that can appear on a descriptor set. This choice ensures that the implicit numbering of registers/spaces used by the back-ends can account for these ranges correctly.

* The `TEST_INPUT` lines are extended to allow a `root_constants` case that behaves more or less like `cbuffer`.

* The CPU and CUDA paths can treat a `root_constants` input identically to a `cbuffer`. They already allocate the actual buffers based on reflection, and just use `cbuffer` as a directive that causes bytes to be copied in.

* On D3D12 and Vulkan, a descriptor set allocates a `List<char>` to hold the bytes of root constant data assigned into it, and these bytes are flushed to the command list when the table is actually bound (usually right before rendering).

* On D3D11, a descriptor set treats a root constant range more or less like a constant buffer range (with a single buffer), except that it also automatically allocates a buffer to hold the data. Assigning "root constant" data automatically copies it into that buffer.

The small number of tests that used entry-point `uniform` parameters of ordinary types were updated to use the new `root_constants` input type, and the bugs that surfaced were fixed.

A new test to confirm that entry-point `uniform` parameters map to the shader record for VK ray tracing was added.

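
The D3D12/Vulkan behavior described above (stage the bytes on the descriptor set, flush them when the set is bound) might look roughly like the following. This is a minimal sketch under stated assumptions: `RootConstantStaging` and its `Upload` callback are hypothetical, `std::vector<char>` stands in for the `List<char>` mentioned above, and on a real back-end the callback would presumably wrap something like `vkCmdPushConstants` or a D3D12 root-constant call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical stand-in for the part of a descriptor set that owns
// root/push constant bytes; not the actual render-test implementation.
class RootConstantStaging
{
public:
    explicit RootConstantStaging(std::size_t byteCount) : m_bytes(byteCount, 0) {}

    // Copy `size` bytes into the staged data at `offset`.
    // Nothing is sent to the GPU at this point.
    void setRootConstants(std::size_t offset, std::size_t size, const void* data)
    {
        assert(offset <= m_bytes.size() && size <= m_bytes.size() - offset);
        std::memcpy(m_bytes.data() + offset, data, size);
    }

    // Called when the set is bound: hand all staged bytes to an
    // API-specific upload routine supplied by the caller.
    using Upload = std::function<void(const char*, std::size_t)>;
    void flush(const Upload& upload) const
    {
        if (!m_bytes.empty())
            upload(m_bytes.data(), m_bytes.size());
    }

private:
    std::vector<char> m_bytes;
};
```

Deferring the upload to bind time means repeated `setRootConstants` calls only touch host memory, and the command list sees a single write per bind.
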
An important but technically unrelated change is the removal of the `DescriptorSetImpl::Binding` type and related function from the Vulkan implementation of `Renderer`. That type was created to ensure that objects that are bound into a descriptor set don't get released while the descriptor set is still alive, but the implementation relied on a complicated linear search to check for existing bindings, which could create a performance issue for descriptor sets that include large arrays of descriptors.

The new implementation makes use of the approach already present in the various `Renderer` implementations (including the Vulkan one) for assigning ranges in a descriptor set a flat/linear index for where their pertinent data is to be bound. As a result, the Vulkan `DescriptorSetImpl` now uses a single flat array of `RefPtr`s to track bound objects, and has no need for linear search when binding.

Co-authored-by: Yong He <yonghe@outlook.com>
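
The flat/linear indexing scheme for keeping bound objects alive can be sketched as follows. Everything here is an assumption made for illustration: `FlatBindingTable` and `Resource` are hypothetical, and `std::shared_ptr` stands in for Slang's `RefPtr`. Each range gets a base offset into one array, so binding is a direct store rather than a search.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Resource { int id; };   // hypothetical bound object

// Hypothetical table that keeps bound objects alive using one flat array.
class FlatBindingTable
{
public:
    // `rangeCounts[i]` is the number of descriptors in range i.
    explicit FlatBindingTable(const std::vector<std::size_t>& rangeCounts)
        : m_rangeCounts(rangeCounts)
    {
        std::size_t total = 0;
        for (std::size_t count : rangeCounts)
        {
            m_rangeBase.push_back(total);   // first flat slot of this range
            total += count;
        }
        m_bound.resize(total);
    }

    // Flat slot for element `index` of range `range`.
    std::size_t flatIndex(std::size_t range, std::size_t index) const
    {
        assert(range < m_rangeBase.size() && index < m_rangeCounts[range]);
        return m_rangeBase[range] + index;
    }

    // Constant-time bind; overwriting a slot releases the previous object.
    void bind(std::size_t range, std::size_t index, std::shared_ptr<Resource> object)
    {
        m_bound[flatIndex(range, index)] = std::move(object);
    }

    const std::shared_ptr<Resource>& get(std::size_t range, std::size_t index) const
    {
        return m_bound[flatIndex(range, index)];
    }

private:
    std::vector<std::size_t> m_rangeCounts;
    std::vector<std::size_t> m_rangeBase;
    std::vector<std::shared_ptr<Resource>> m_bound;
};
```

Because every slot has a fixed position, rebinding a descriptor replaces the old reference in place, which is what removes the need to search for an existing binding.
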
author: Tim Foley <tfoleyNV@users.noreply.github.com> 2020-08-05 11:47:18 -0700
committer: GitHub <noreply@github.com> 2020-08-05 11:47:18 -0700
commit: e713b56a63dcbf945e3e0e6d82666318795c74ff (patch)
tree: 7883169c68f9516d1ebff70c5529b6f10933e1d5 /tools/render-test
parent: 6fb2aa70a2681bffbac7e8de67e9598105389945 (diff)
3 files changed, 33 insertions, 1 deletions
diff --git a/tools/render-test/shader-input-layout.cpp b/tools/render-test/shader-input-layout.cpp
index 4c68899ef..5e487f1ad 100644
--- a/tools/render-test/shader-input-layout.cpp
+++ b/tools/render-test/shader-input-layout.cpp
@@ -140,6 +140,11 @@ namespace renderer_test
                             entry.type = ShaderInputType::Buffer;
                             entry.bufferDesc.type = InputBufferType::ConstantBuffer;
                         }
+                        else if (parser.LookAhead("root_constants"))
+                        {
+                            entry.type = ShaderInputType::Buffer;
+                            entry.bufferDesc.type = InputBufferType::RootConstantBuffer;
+                        }
                         else if (parser.LookAhead("ubuffer"))
                         {
                             entry.type = ShaderInputType::Buffer;
diff --git a/tools/render-test/shader-input-layout.h b/tools/render-test/shader-input-layout.h
index 0831f73bb..4c8d0dfa2 100644
--- a/tools/render-test/shader-input-layout.h
+++ b/tools/render-test/shader-input-layout.h
@@ -42,7 +42,8 @@ struct InputTextureDesc
 
 enum class InputBufferType
 {
-    ConstantBuffer, StorageBuffer
+    ConstantBuffer, StorageBuffer,
+    RootConstantBuffer,
 };
 
 struct InputBufferDesc
diff --git a/tools/render-test/shader-renderer-util.cpp b/tools/render-test/shader-renderer-util.cpp
index 987b63b48..b2911ffd5 100644
--- a/tools/render-test/shader-renderer-util.cpp
+++ b/tools/render-test/shader-renderer-util.cpp
@@ -195,6 +195,18 @@ static RefPtr<SamplerState> _createSamplerState(
                     case InputBufferType::StorageBuffer:
                         slotRangeDesc.type = DescriptorSlotType::StorageBuffer;
                         break;
+
+                    case InputBufferType::RootConstantBuffer:
+                        {
+                            // A root constant buffer maps to a root constant range
+                            // where the `count` of slots is equal to the number
+                            // of bytes of data.
+                            //
+                            Slang::UInt size = srcEntry.bufferData.getCount() * sizeof(srcEntry.bufferData[0]);
+                            slotRangeDesc.type = DescriptorSlotType::RootConstant;
+                            slotRangeDesc.count = size;
+                        }
+                        break;
                     }
                 }
                 break;
@@ -269,6 +281,20 @@ static RefPtr<SamplerState> _createSamplerState(
                     const InputBufferDesc& srcBuffer = srcEntry.bufferDesc;
                     const size_t bufferSize = srcEntry.bufferData.getCount() * sizeof(uint32_t);
 
+                    if( srcBuffer.type == InputBufferType::RootConstantBuffer )
+                    {
+                        // A root constant buffer at the HLSL/Slang level actually
+                        // maps to root constant data stored directly in the descriptor
+                        // set, and thus does not need/want us to allocate a buffer
+                        // to hold the data.
+                        //
+                        // Instead, we set the data directly here and then bypass
+                        // the logic that handles the buffer-backed cases below.
+                        //
+                        descriptorSet->setRootConstants(rangeIndex, 0, bufferSize, srcEntry.bufferData.getBuffer());
+                        break;
+                    }
+
                     RefPtr<BufferResource> bufferResource;
                     SLANG_RETURN_ON_FAIL(createBufferResource(srcEntry.bufferDesc, srcEntry.isOutput, bufferSize, srcEntry.bufferData.getBuffer(), renderer, bufferResource));