Initial work to support OptiX output for ray tracing shaders (#1307)

* Initial work to support OptiX output for ray tracing shaders This change represents in-progress work toward allowing Slang/HLSL ray-tracing shaders to be cross-compiled for execution on top of OptiX. The work as it exists here is incomplete, but the changes are incremental and should not disturb existing supported use cases. One major unresolved issue in this work is that the OptiX SDK does not appear to set an environment variable Changes include: * Modified the premake script to support new options for adding OptiX to the build. Right now the default path to the OptiX SDK is hard-coded because the installer doesn't seem to set an environment variable. We will want to update that to have a reasonable default path for both Windows and Unix-y platforms in a later chance. * I ran the premake generator on the project since I added new options, which resulted in a bunch of diffs to the Visual Studio project files that are unrelated to this change. Many of the diffs come from previous edits that added files using only the Visual Studio IDE rather than by re-running premake, so it is arguably better to have the checked-in project files more accurately reflect the generated files used for CI builds. * The "downstream compiler" abstraction was extended to have an explicit notion of the kind of pipeline that shaders are being compiled for (e.g., compute vs. rasterization vs. ray tracing). This option is used to tell the NVRTC case when it needs to include the OptiX SDK headers in the search path for shader compilation (and also when it should add a `#define` to make the prelude pull in OptiX). This code again uses a hard-coded default path for the OptiX SDK; we will need to modify that to have a better discovery approach and also to support an API or command-line override. * One note for the future is that instead of passing down a "pipeline type" we could instead pass down the list/set of stages for the kernels being compiled, and the OptiX support could be enabled whenever there is *any* ray tracing entry point present in a module. That approach would allow mixing RT and compute kernels during downstream compilation. We will need to revisit these choices when we start supporting code generation for multiple entry points at a time. * The CUDA emit logic is currently mostly unchanged. The biggest difference is that when emitting a ray-tracing entry point we prefix the name of the generated `__global__` function with a marker for its stage type, as required by the OptiX runtime (e.g., a `__raygen__` prefix is required on all ray-generation entry points). * The `Renderer` abstraction had a bare minimum of changes made to be able to understand that ray-tracing pipelines exist, and also that some APIs will require the name of each entry point along with its binary data in order to create a program. * The `ShaderCompileRequest` type was updated so that only a single "source" is supported (rather than distinct source for each entry point), and also the entry points have been turned into a single list where each entry identifies its stage instead of a fixed list of fields for the supported entry-point types. * The CUDA compute path had a lot of code added to support execution for the new ray-tracing pipeline type. The logic is mostly derived from the `optixHello` example in the OptiX SDK, and at present only supports running a single ray-generation shader with no parameters. The code here is not intended to be ready for use, but represents a signficiant amount of learning-by-doing. * The `slang-support.cpp` file in `render-test` was updated so that instead of having separate compilation logic for compute vs. rasterization shaders (which would mean adding a third path for ray tracing), there is now a single flow to the code that works for all pipeline types and any kind of entry points. * Implicit in the new code is dropping support for the way GLSL was being compiled for pass-through render tests, which means pass-through GLSL render tests will no longer work. It seems like we didn't have any of those to begin with, though, so it is no great loss. * Also implicit are some new invariants about how shaders without known/default entry points need to be handled. For example, the ray tracing case intentionally does not fill in entry points on the `ShaderCompileRequest` and instead fully relies on the Slang compiler's support for discovering and enumerating entry points via reflection. As a consequence of those edits the `-no-default-entry-point` flag on `render-test` is probably not working, but it seems like we don't have any test cases that use that flag anyway. Given the seemingly breaking changes in those last two bullets, I was surprised to find that all our current tests seem to pass with this change. If there are things that I'm missing, I hope they will come up in review. * fixup: issues from review and CI * Some issues noted during the review process (e.g., a missing `break`) * Fix logic for render tests with `-no-default-entry-point`. I had somehow missed that we had tests reliant on that flag. This required a bit of refactoring to pass down the relevant flag (luckily the function in question was already being passed most of what was in `Options`, so that just passing that in directly actually simplifies the call sites a bit. * There was a missing line of code to actually add the default compute entry points to the compile request. I think this was a problem that slipped in as part of some pre-PR refactoring/cleanup changes that I failed to re-test.
author: Tim Foley <tfoleyNV@users.noreply.github.com> 2020-04-08 13:57:24 -0700
committer: GitHub <noreply@github.com> 2020-04-08 13:57:24 -0700
commit: 6274e175a2b6a07f448feadd4d7da35b2784d746 (patch)
tree: 8178263ab71c158d3195a4ec7daa4db5247ed27a /tools/render-test/slang-support.cpp
parent: f38c082c7c576da5f82fcd952c2622d4f98fb3a2 (diff)
1 files changed, 147 insertions, 120 deletions
diff --git a/tools/render-test/slang-support.cpp b/tools/render-test/slang-support.cpp
index 5afcc6d24..3e5cc9a1c 100644
--- a/tools/render-test/slang-support.cpp
+++ b/tools/render-test/slang-support.cpp
@@ -18,6 +18,37 @@ using namespace Slang;
 static const char vertexEntryPointName[] = "vertexMain";
 static const char fragmentEntryPointName[] = "fragmentMain";
 static const char computeEntryPointName[] = "computeMain";
+static const char rtEntryPointName[] = "raygenMain";
+
+static gfx::StageType _translateStage(SlangStage slangStage)
+{
+    switch(slangStage)
+    {
+    default:
+        SLANG_ASSERT(!"unhandled case");
+        return gfx::StageType::Unknown;
+
+#define CASE(FROM, TO) \
+    case SLANG_STAGE_##FROM: return gfx::StageType::TO
+
+    CASE(VERTEX,    Vertex);
+    CASE(HULL,      Hull);
+    CASE(DOMAIN,    Domain);
+    CASE(GEOMETRY,  Geometry);
+    CASE(FRAGMENT,  Fragment);
+
+    CASE(COMPUTE,   Compute);
+
+    CASE(RAY_GENERATION,    RayGeneration);
+    CASE(INTERSECTION,      Intersection);
+    CASE(ANY_HIT,           AnyHit);
+    CASE(CLOSEST_HIT,       ClosestHit);
+    CASE(MISS,              Miss);
+    CASE(CALLABLE,          Callable);
+
+#undef CASE
+    }
+}
 
 /* static */ SlangResult ShaderCompilerUtil::compileProgram(SlangSession* session, const Input& input, const ShaderCompileRequest& request, Output& out)
 {
@@ -80,46 +111,12 @@ static const char computeEntryPointName[] = "computeMain";
     // the `-xslang <arg>` option to `render-test`.
     SLANG_RETURN_ON_FAIL(spProcessCommandLineArguments(slangRequest, input.args, input.argCount)); 
 
-    int computeTranslationUnit = 0;
-    int vertexTranslationUnit = 0;
-    int fragmentTranslationUnit = 0;
-    char const* vertexEntryPointName = request.vertexShader.name;
-    char const* fragmentEntryPointName = request.fragmentShader.name;
-    char const* computeEntryPointName = request.computeShader.name;
-
     const auto sourceLanguage = input.sourceLanguage;
 
-    if (sourceLanguage == SLANG_SOURCE_LANGUAGE_GLSL)
-    {
-        // GLSL presents unique challenges because, frankly, it got the whole
-        // compilation model wrong. One aspect of working around this is that
-        // we will compile the same source file multiple times: once per
-        // entry point, and we will have different preprocessor definitions
-        // active in each case.
-
-        vertexTranslationUnit = spAddTranslationUnit(slangRequest, sourceLanguage, nullptr);
-        spAddTranslationUnitSourceString(slangRequest, vertexTranslationUnit, request.source.path, request.source.dataBegin);
-        spTranslationUnit_addPreprocessorDefine(slangRequest, vertexTranslationUnit, "__GLSL_VERTEX__", "1");
-        vertexEntryPointName = "main";
-
-        fragmentTranslationUnit = spAddTranslationUnit(slangRequest, sourceLanguage, nullptr);
-        spAddTranslationUnitSourceString(slangRequest, fragmentTranslationUnit, request.source.path, request.source.dataBegin);
-        spTranslationUnit_addPreprocessorDefine(slangRequest, fragmentTranslationUnit, "__GLSL_FRAGMENT__", "1");
-        fragmentEntryPointName = "main";
-
-        computeTranslationUnit = spAddTranslationUnit(slangRequest, sourceLanguage, nullptr);
-        spAddTranslationUnitSourceString(slangRequest, computeTranslationUnit, request.source.path, request.source.dataBegin);
-        spTranslationUnit_addPreprocessorDefine(slangRequest, computeTranslationUnit, "__GLSL_COMPUTE__", "1");
-        computeEntryPointName = "main";
-    }
-    else
+    int translationUnitIndex = 0;
     {
-        int translationUnit = spAddTranslationUnit(slangRequest, sourceLanguage, nullptr);
-        spAddTranslationUnitSourceString(slangRequest, translationUnit, request.source.path, request.source.dataBegin);
-
-        vertexTranslationUnit = translationUnit;
-        fragmentTranslationUnit = translationUnit;
-        computeTranslationUnit = translationUnit;
+        translationUnitIndex = spAddTranslationUnit(slangRequest, sourceLanguage, nullptr);
+        spAddTranslationUnitSourceString(slangRequest, translationUnitIndex, request.source.path, request.source.dataBegin);
     }
 
     const int globalSpecializationArgCount = int(request.globalSpecializationArgs.getCount());
@@ -137,105 +134,100 @@ static const char computeEntryPointName[] = "computeMain";
         }
     };
 
-    if (request.computeShader.name)
-    {
-        int computeEntryPointIndex = 0;
-        if(!gOptions.dontAddDefaultEntryPoints)
+   Index explicitEntryPointCount = request.entryPoints.getCount();
+   for(Index ee = 0; ee < explicitEntryPointCount; ++ee)
+   {
+        if(gOptions.dontAddDefaultEntryPoints)
         {
-            computeEntryPointIndex = spAddEntryPoint(slangRequest, computeTranslationUnit,
-                computeEntryPointName,
-                SLANG_STAGE_COMPUTE);
-
-            setEntryPointSpecializationArgs(computeEntryPointIndex);
+            // If default entry points are not to be added, then
+            // the `request.entryPoints` array should have been
+            // left empty.
+            //
+            SLANG_ASSERT(false);
         }
 
-        spSetLineDirectiveMode(slangRequest, SLANG_LINE_DIRECTIVE_MODE_NONE);
+       auto& entryPointInfo = request.entryPoints[ee];
+       int entryPointIndex = spAddEntryPoint(
+           slangRequest,
+           translationUnitIndex,
+           entryPointInfo.name,
+           entryPointInfo.slangStage);
+       SLANG_ASSERT(entryPointIndex == ee);
 
-        const SlangResult res = spCompile(slangRequest);
+       setEntryPointSpecializationArgs(entryPointIndex);
+   }
 
-        if (auto diagnostics = spGetDiagnosticOutput(slangRequest))
-        {
-            fprintf(stderr, "%s", diagnostics);
-        }
+    spSetLineDirectiveMode(slangRequest, SLANG_LINE_DIRECTIVE_MODE_NONE);
 
-        SLANG_RETURN_ON_FAIL(res);
+    const SlangResult res = spCompile(slangRequest);
 
-        // We are going to get the entry point count... lets check what we have
-        if (input.passThrough == SLANG_PASS_THROUGH_NONE)
-        {
-            auto reflection = spGetReflection(slangRequest);
-            // Get the amount of entry points in reflection
-            const int entryPointCount = int(spReflection_getEntryPointCount(reflection));
+    if (auto diagnostics = spGetDiagnosticOutput(slangRequest))
+    {
+        fprintf(stderr, "%s", diagnostics);
+    }
 
-            // Above code assumes there is an entry point
-            SLANG_ASSERT(entryPointCount && computeEntryPointIndex < entryPointCount);
+    SLANG_RETURN_ON_FAIL(res);
 
-            auto entryPoint = spReflection_getEntryPointByIndex(reflection, computeEntryPointIndex);
+    
+    List<ShaderCompileRequest::EntryPoint> actualEntryPoints;
+    if(input.passThrough == SLANG_PASS_THROUGH_NONE)
+    {
+        // In the case where pass-through compilation is not being used,
+        // we can use the Slang reflection information to discover what
+        // the entry points were, and then use those to drive the
+        // loading of code.
+        //
+        auto reflection = slang::ProgramLayout::get(slangRequest);
 
-            // Get the entry point name
-            const char* entryPointName = spReflectionEntryPoint_getName(entryPoint);
+        // Get the amount of entry points in reflection
+        Index entryPointCount = Index(reflection->getEntryPointCount());
 
-            SLANG_ASSERT(entryPointName);
-        }
+        // We must have at least one entry point (whether explicit or implicit)
+        SLANG_ASSERT(entryPointCount);
 
+        for(Index ee = 0; ee < entryPointCount; ++ee)
         {
-            size_t codeSize = 0;
-            char const* code = (char const*) spGetEntryPointCode(slangRequest, computeEntryPointIndex, &codeSize);
+            auto entryPoint = reflection->getEntryPointByIndex(ee);
+            const char* entryPointName = entryPoint->getName();
+            SLANG_ASSERT(entryPointName);
+
+            auto slangStage = entryPoint->getStage();
 
-            ShaderProgram::KernelDesc kernelDesc;
-            kernelDesc.stage = StageType::Compute;
-            kernelDesc.codeBegin = code;
-            kernelDesc.codeEnd = code + codeSize;
+            ShaderCompileRequest::EntryPoint entryPointInfo;
+            entryPointInfo.name = entryPointName;
+            entryPointInfo.slangStage = slangStage;
 
-            out.set(PipelineType::Compute, &kernelDesc, 1);
+            actualEntryPoints.add(entryPointInfo);
         }
     }
     else
     {
-        int vertexEntryPoint = 0;
-        int fragmentEntryPoint = 1;
-        if( !gOptions.dontAddDefaultEntryPoints )
-        {
-            vertexEntryPoint = spAddEntryPoint(slangRequest, vertexTranslationUnit, vertexEntryPointName, SLANG_STAGE_VERTEX);
-            fragmentEntryPoint = spAddEntryPoint(slangRequest, fragmentTranslationUnit, fragmentEntryPointName, SLANG_STAGE_FRAGMENT);
-
-            setEntryPointSpecializationArgs(vertexEntryPoint);
-            setEntryPointSpecializationArgs(fragmentEntryPoint);
-        }
-
-        const SlangResult res = spCompile(slangRequest);
-        if (auto diagnostics = spGetDiagnosticOutput(slangRequest))
-        {
-            // TODO(tfoley): re-enable when I get a logging solution in place
-//            OutputDebugStringA(diagnostics);
-            fprintf(stderr, "%s", diagnostics);
-        }
-
-        SLANG_RETURN_ON_FAIL(res);
-
-        {
-            size_t vertexCodeSize = 0;
-            char const* vertexCode = (char const*) spGetEntryPointCode(slangRequest, vertexEntryPoint, &vertexCodeSize);
+        actualEntryPoints = request.entryPoints;
+    }
 
-            size_t fragmentCodeSize = 0;
-            char const* fragmentCode = (char const*) spGetEntryPointCode(slangRequest, fragmentEntryPoint, &fragmentCodeSize);
+    List<ShaderProgram::KernelDesc> kernelDescs;
 
-            static const int kDescCount = 2;
+    Index actualEntryPointCount = actualEntryPoints.getCount();
+    for(Index ee = 0; ee < actualEntryPointCount; ++ee)
+    {
+        auto& actualEntryPoint = actualEntryPoints[ee];
 
-            ShaderProgram::KernelDesc kernelDescs[kDescCount];
+        size_t codeSize = 0;
+        char const* code = (char const*) spGetEntryPointCode(slangRequest, int(ee), &codeSize);
 
-            kernelDescs[0].stage = StageType::Vertex;
-            kernelDescs[0].codeBegin = vertexCode;
-            kernelDescs[0].codeEnd = vertexCode + vertexCodeSize;
+        auto gfxStage = _translateStage(actualEntryPoint.slangStage);
 
-            kernelDescs[1].stage = StageType::Fragment;
-            kernelDescs[1].codeBegin = fragmentCode;
-            kernelDescs[1].codeEnd = fragmentCode + fragmentCodeSize;
+        ShaderProgram::KernelDesc kernelDesc;
+        kernelDesc.stage = gfxStage;
+        kernelDesc.codeBegin = code;
+        kernelDesc.codeEnd = code + codeSize;
+        kernelDesc.entryPointName = actualEntryPoint.name;
 
-            out.set(PipelineType::Graphics, kernelDescs, kDescCount);
-        }
+        kernelDescs.add(kernelDesc);
     }
 
+    out.set(input.pipelineType, kernelDescs.getBuffer(), kernelDescs.getCount());
+
     return SLANG_OK;
 }
 
@@ -260,8 +252,12 @@ static const char computeEntryPointName[] = "computeMain";
     return SLANG_OK;
 }
 
-/* static */SlangResult ShaderCompilerUtil::compileWithLayout(SlangSession* session, const String& sourcePath, const Slang::List<Slang::CommandLine::Arg>& compileArgs, Options::ShaderProgramType shaderType, const ShaderCompilerUtil::Input& input, OutputAndLayout& output)
+/* static */SlangResult ShaderCompilerUtil::compileWithLayout(SlangSession* session, const Options& options, const ShaderCompilerUtil::Input& input, OutputAndLayout& output)
 {
+    String sourcePath = options.sourcePath;
+    auto& compileArgs = options.compileArgs;
+    auto shaderType = options.shaderType;
+
     List<char> sourceText;
     SLANG_RETURN_ON_FAIL(readSource(sourcePath, sourceText));
 
@@ -294,6 +290,7 @@ static const char computeEntryPointName[] = "computeMain";
             break;
 
         case Options::ShaderProgramType::Compute:
+        case Options::ShaderProgramType::RayTracing:
             layout.numRenderTargets = 0;
             break;
     }
@@ -317,17 +314,47 @@ static const char computeEntryPointName[] = "computeMain";
     compileRequest.compileArgs = compileArgs;
 
     compileRequest.source = sourceInfo;
-    if (shaderType == Options::ShaderProgramType::Graphics || shaderType == Options::ShaderProgramType::GraphicsCompute)
-    {
-        compileRequest.vertexShader.source = sourceInfo;
-        compileRequest.vertexShader.name = vertexEntryPointName;
-        compileRequest.fragmentShader.source = sourceInfo;
-        compileRequest.fragmentShader.name = fragmentEntryPointName;
-    }
-    else
+
+    // Now we will add the "default" entry point names/stages that
+    // are appropriate to the pipeline type being targetted, *unless*
+    // the options specify that we should leave out the default
+    // entry points and instead rely on the Slang compiler's built-in
+    // mechanisms for discovering entry points (e.g., `[shader(...)]`
+    // attributes).
+    //
+    if( !options.dontAddDefaultEntryPoints )
     {
-        compileRequest.computeShader.source = sourceInfo;
-        compileRequest.computeShader.name = computeEntryPointName;
+        if (shaderType == Options::ShaderProgramType::Graphics || shaderType == Options::ShaderProgramType::GraphicsCompute)
+        {
+            ShaderCompileRequest::EntryPoint vertexEntryPoint;
+            vertexEntryPoint.name = vertexEntryPointName;
+            vertexEntryPoint.slangStage = SLANG_STAGE_VERTEX;
+            compileRequest.entryPoints.add(vertexEntryPoint);
+
+            ShaderCompileRequest::EntryPoint fragmentEntryPoint;
+            fragmentEntryPoint.name = fragmentEntryPointName;
+            fragmentEntryPoint.slangStage = SLANG_STAGE_FRAGMENT;
+            compileRequest.entryPoints.add(fragmentEntryPoint);
+        }
+        else if( shaderType == Options::ShaderProgramType::RayTracing )
+        {
+            // Note: Current GPU ray tracing pipelines allow for an
+            // almost arbitrary mix of entry points for different stages
+            // to be used together (e.g., a single "program" might
+            // have multiple any-hit shaders, multiple miss shaders, etc.)
+            //
+            // Rather than try to define a fixed set of entry point
+            // names and stages that the testing will support, we will
+            // instead rely on `[shader(...)]` annotations to tell us
+            // what entry points are present in the input code.
+        }
+        else
+        {
+            ShaderCompileRequest::EntryPoint computeEntryPoint;
+            computeEntryPoint.name = computeEntryPointName;
+            computeEntryPoint.slangStage = SLANG_STAGE_COMPUTE;
+            compileRequest.entryPoints.add(computeEntryPoint);
+        }
     }
     compileRequest.globalSpecializationArgs = layout.globalSpecializationArgs;
     compileRequest.entryPointSpecializationArgs = layout.entryPointSpecializationArgs;
author	Tim Foley <tfoleyNV@users.noreply.github.com>	2020-04-08 13:57:24 -0700
committer	GitHub <noreply@github.com>	2020-04-08 13:57:24 -0700
commit	6274e175a2b6a07f448feadd4d7da35b2784d746 (patch)
tree	8178263ab71c158d3195a4ec7daa4db5247ed27a /tools/render-test/slang-support.cpp
parent	f38c082c7c576da5f82fcd952c2622d4f98fb3a2 (diff)