diff options
| author | Tim Foley <tfoleyNV@users.noreply.github.com> | 2020-04-08 13:57:24 -0700 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2020-04-08 13:57:24 -0700 |
| commit | 6274e175a2b6a07f448feadd4d7da35b2784d746 (patch) | |
| tree | 8178263ab71c158d3195a4ec7daa4db5247ed27a /source | |
| parent | f38c082c7c576da5f82fcd952c2622d4f98fb3a2 (diff) | |
Initial work to support OptiX output for ray tracing shaders (#1307)
* Initial work to support OptiX output for ray tracing shaders
This change represents in-progress work toward allowing Slang/HLSL ray-tracing shaders to be cross-compiled for execution on top of OptiX. The work as it exists here is incomplete, but the changes are incremental and should not disturb existing supported use cases.
One major unresolved issue in this work is that the OptiX SDK does not appear to set an environment variable
Changes include:
* Modified the premake script to support new options for adding OptiX to the build. Right now the default path to the OptiX SDK is hard-coded because the installer doesn't seem to set an environment variable. We will want to update that to have a reasonable default path for both Windows and Unix-y platforms in a later chance.
* I ran the premake generator on the project since I added new options, which resulted in a bunch of diffs to the Visual Studio project files that are unrelated to this change. Many of the diffs come from previous edits that added files using only the Visual Studio IDE rather than by re-running premake, so it is arguably better to have the checked-in project files more accurately reflect the generated files used for CI builds.
* The "downstream compiler" abstraction was extended to have an explicit notion of the kind of pipeline that shaders are being compiled for (e.g., compute vs. rasterization vs. ray tracing). This option is used to tell the NVRTC case when it needs to include the OptiX SDK headers in the search path for shader compilation (and also when it should add a `#define` to make the prelude pull in OptiX). This code again uses a hard-coded default path for the OptiX SDK; we will need to modify that to have a better discovery approach and also to support an API or command-line override.
* One note for the future is that instead of passing down a "pipeline type" we could instead pass down the list/set of stages for the kernels being compiled, and the OptiX support could be enabled whenever there is *any* ray tracing entry point present in a module. That approach would allow mixing RT and compute kernels during downstream compilation. We will need to revisit these choices when we start supporting code generation for multiple entry points at a time.
* The CUDA emit logic is currently mostly unchanged. The biggest difference is that when emitting a ray-tracing entry point we prefix the name of the generated `__global__` function with a marker for its stage type, as required by the OptiX runtime (e.g., a `__raygen__` prefix is required on all ray-generation entry points).
* The `Renderer` abstraction had a bare minimum of changes made to be able to understand that ray-tracing pipelines exist, and also that some APIs will require the name of each entry point along with its binary data in order to create a program.
* The `ShaderCompileRequest` type was updated so that only a single "source" is supported (rather than distinct source for each entry point), and also the entry points have been turned into a single list where each entry identifies its stage instead of a fixed list of fields for the supported entry-point types.
* The CUDA compute path had a lot of code added to support execution for the new ray-tracing pipeline type. The logic is mostly derived from the `optixHello` example in the OptiX SDK, and at present only supports running a single ray-generation shader with no parameters. The code here is not intended to be ready for use, but represents a signficiant amount of learning-by-doing.
* The `slang-support.cpp` file in `render-test` was updated so that instead of having separate compilation logic for compute vs. rasterization shaders (which would mean adding a third path for ray tracing), there is now a single flow to the code that works for all pipeline types and any kind of entry points.
* Implicit in the new code is dropping support for the way GLSL was being compiled for pass-through render tests, which means pass-through GLSL render tests will no longer work. It seems like we didn't have any of those to begin with, though, so it is no great loss.
* Also implicit are some new invariants about how shaders without known/default entry points need to be handled. For example, the ray tracing case intentionally does not fill in entry points on the `ShaderCompileRequest` and instead fully relies on the Slang compiler's support for discovering and enumerating entry points via reflection. As a consequence of those edits the `-no-default-entry-point` flag on `render-test` is probably not working, but it seems like we don't have any test cases that use that flag anyway.
Given the seemingly breaking changes in those last two bullets, I was surprised to find that all our current tests seem to pass with this change. If there are things that I'm missing, I hope they will come up in review.
* fixup: issues from review and CI
* Some issues noted during the review process (e.g., a missing `break`)
* Fix logic for render tests with `-no-default-entry-point`. I had somehow missed that we had tests reliant on that flag. This required a bit of refactoring to pass down the relevant flag (luckily the function in question was already being passed most of what was in `Options`, so that just passing that in directly actually simplifies the call sites a bit.
* There was a missing line of code to actually add the default compute entry points to the compile request. I think this was a problem that slipped in as part of some pre-PR refactoring/cleanup changes that I failed to re-test.
Diffstat (limited to 'source')
| -rw-r--r-- | source/core/slang-downstream-compiler.h | 9 | ||||
| -rw-r--r-- | source/core/slang-nvrtc-compiler.cpp | 57 | ||||
| -rw-r--r-- | source/slang/hlsl.meta.slang | 1 | ||||
| -rw-r--r-- | source/slang/slang-compiler.cpp | 50 | ||||
| -rw-r--r-- | source/slang/slang-emit-cuda.cpp | 60 | ||||
| -rw-r--r-- | source/slang/slang.vcxproj | 13 | ||||
| -rw-r--r-- | source/slang/slang.vcxproj.filters | 45 |
7 files changed, 191 insertions, 44 deletions
diff --git a/source/core/slang-downstream-compiler.h b/source/core/slang-downstream-compiler.h index 3ffa32097..bcf53d065 100644 --- a/source/core/slang-downstream-compiler.h +++ b/source/core/slang-downstream-compiler.h @@ -202,6 +202,14 @@ public: Object, ///< Produce an object file }; + enum PipelineType + { + Unknown, + Compute, + Rasterization, + RayTracing, + }; + struct Define { String nameWithSig; ///< If macro takes parameters include in brackets @@ -236,6 +244,7 @@ public: TargetType targetType = TargetType::Executable; SlangSourceLanguage sourceLanguage = SLANG_SOURCE_LANGUAGE_CPP; FloatingPointMode floatingPointMode = FloatingPointMode::Default; + PipelineType pipelineType = PipelineType::Unknown; Flags flags = Flag::EnableExceptionHandling; diff --git a/source/core/slang-nvrtc-compiler.cpp b/source/core/slang-nvrtc-compiler.cpp index 0e167bf80..6b01ef070 100644 --- a/source/core/slang-nvrtc-compiler.cpp +++ b/source/core/slang-nvrtc-compiler.cpp @@ -334,8 +334,63 @@ SlangResult NVRTCDownstreamCompiler::compile(const CompileOptions& options, RefP cmdLine.addArg(builder); } + List<const char*> headers; + List<const char*> headerIncludeNames; + + // If compiling for OptiX, we need to add the appropriate search paths to the command line. + // + if(options.pipelineType == PipelineType::RayTracing) + { + // The device-side OptiX API is accessed through a constellation + // of headers provided by the OptiX SDK, so we need to set an + // include path for the compile that makes those visible. + // + // TODO: The OptiX SDK installer doesn't set any kind of environment + // variable to indicate where the SDK was installed, so we seemingly + // need to probe paths instead. The form of the path will differ + // betwene Windows and Unix-y platforms, and we will need some kind + // of approach to probe multiple versiosn and use the latest. + // + // HACK: For now I'm using the fixed path for the most recent SDK + // release on Windows. This means that OptiX cross-compilation will + // only "work" on a subset of platforms, but that doesn't matter + // for now since it doesn't really "work" at all. + // + cmdLine.addArg("-I"); + cmdLine.addArg("C:/ProgramData/NVIDIA Corporation/OptiX SDK 7.0.0/include/"); + + // The OptiX headers in turn `#include <stddef.h>` and expect that + // to work. We could try to also add in an include path from the CUDA + // SDK (which seems to provide a `stddef.h` in the most recent version), + // but using that version doesn't seem to work (and also bakes in a + // requirement that the user have the CUDA SDK installed in addition + // to the OptiX SDK). + // + // Instead, we will rely on the NVRTC feature that lets us set up + // memory buffers to be used as include files by the we compile. + // We will define a dummy `stddef.h` that includes the bare minimum + // lines required to get the OptiX headers to compile without complaint. + // + // TODO: Confirm that the `LP64` definition herei s actually needed. + // + headerIncludeNames.add("stddef.h"); + headers.add("#pragma once\n" "#define LP64\n"); + + // Finally, we want the CUDA prelude to be able to react to whether + // or not OptiX is required (most notably by `#include`ing the appropriate + // header(s)), so we will insert a preprocessor define to indicate + // the requirement. + // + cmdLine.addArg("-DSLANG_CUDA_ENABLE_OPTIX"); + } + + SLANG_ASSERT(headers.getCount() == headerIncludeNames.getCount()); + nvrtcProgram program = nullptr; - nvrtcResult res = m_nvrtcCreateProgram(&program, options.sourceContents.getBuffer(), options.sourceContentsPath.getBuffer(), 0, nullptr, nullptr); + nvrtcResult res = m_nvrtcCreateProgram(&program, options.sourceContents.getBuffer(), options.sourceContentsPath.getBuffer(), + (int) headers.getCount(), + headers.getBuffer(), + headerIncludeNames.getBuffer()); if (res != NVRTC_SUCCESS) { return _asResult(res); diff --git a/source/slang/hlsl.meta.slang b/source/slang/hlsl.meta.slang index 483c2299f..4279e4a4e 100644 --- a/source/slang/hlsl.meta.slang +++ b/source/slang/hlsl.meta.slang @@ -3162,6 +3162,7 @@ void AcceptHitAndEndSearch(); // 10.4.1 - Ray Dispatch System Values __target_intrinsic(glsl, "(gl_LaunchIDNV)") +__target_intrinsic(cuda, "optixGetLaunchIndex") uint3 DispatchRaysIndex(); __target_intrinsic(glsl, "(gl_LaunchSizeNV)") diff --git a/source/slang/slang-compiler.cpp b/source/slang/slang-compiler.cpp index 53a028483..f81a63275 100644 --- a/source/slang/slang-compiler.cpp +++ b/source/slang/slang-compiler.cpp @@ -1373,6 +1373,56 @@ SlangResult dissassembleDXILUsingDXC( default: SLANG_ASSERT(!"Unhandled floating point mode"); } + { + // We need to look at the stage of the entry point(s) we are + // being asked to compile, since this will determine the + // "pipeline" that the result should be compiled for (e.g., + // compute vs. ray tracing). + // + // TODO: This logic is kind of messy in that it assumes + // a program to be compiled will only contain kernels for + // a single pipeline type, but that invariant isn't expressed + // at all in the front-end today. It also has no error + // checking for the case where there are conflicts. + // + // HACK: Right now none of the above concerns matter + // because we always perform code generation on a single + // entry point at a time. + // + Index entryPointCount = slangRequest->getProgram()->getEntryPointCount(); + for(Index ee = 0; ee < entryPointCount; ++ee) + { + auto stage = slangRequest->getProgram()->getEntryPoint(ee)->getStage(); + switch(stage) + { + default: + break; + + case Stage::Compute: + options.pipelineType = DownstreamCompiler::PipelineType::Compute; + break; + + case Stage::Vertex: + case Stage::Hull: + case Stage::Domain: + case Stage::Geometry: + case Stage::Fragment: + options.pipelineType = DownstreamCompiler::PipelineType::Rasterization; + break; + + case Stage::RayGeneration: + case Stage::Intersection: + case Stage::AnyHit: + case Stage::ClosestHit: + case Stage::Miss: + case Stage::Callable: + options.pipelineType = DownstreamCompiler::PipelineType::RayTracing; + break; + } + } + + } + // Add all the search paths (as calculated earlier - they will only be set if this is a pass through else will be empty) options.includePaths = includePaths; diff --git a/source/slang/slang-emit-cuda.cpp b/source/slang/slang-emit-cuda.cpp index a728df755..702543fc8 100644 --- a/source/slang/slang-emit-cuda.cpp +++ b/source/slang/slang-emit-cuda.cpp @@ -606,21 +606,57 @@ void CUDASourceEmitter::emitModuleImpl(IRModule* module) IREntryPointDecoration* entryPointDecor = func->findDecoration<IREntryPointDecoration>(); - if (entryPointDecor && entryPointDecor->getProfile().GetStage() == Stage::Compute) + if (entryPointDecor) { - Int sizeAlongAxis[kThreadGroupAxisCount]; - getComputeThreadGroupSize(func, sizeAlongAxis); - - // - m_writer->emit("// [numthreads("); - for (int ii = 0; ii < kThreadGroupAxisCount; ++ii) + // We have an entry-point function in the IR module, which we + // will want to emit as a `__global__` function in the generated + // CUDA C++. + // + // The most common case will be a compute kernel, in which case + // we will emit the function more or less as-is, including + // usingits original name as the name of the global symbol. + // + String funcName = getName(func); + String globalSymbolName = funcName; + + // We also suport emitting ray tracing kernels for use with + // OptiX, and in that case the name of the global symbol + // must be prefixed to indicate to the OptiX runtime what + // stage it is to be compiled for. + // + auto stage = entryPointDecor->getProfile().GetStage(); + switch( stage ) { - if (ii != 0) m_writer->emit(", "); - m_writer->emit(sizeAlongAxis[ii]); + default: + break; + + #define CASE(STAGE, PREFIX) \ + case Stage::STAGE: globalSymbolName = #PREFIX + funcName; break + + CASE(RayGeneration, __raygen__); + // TODO: Add the other ray tracing shader stages here. + #undef CASE } - m_writer->emit(")]\n"); - String funcName = getName(func); + // As a convenience for anybody reading the generated + // CUDA C++ code, we will prefix a compute kernel + // with the information from the `[numthreads(...)]` + // attribute in the source. + // + if(stage == Stage::Compute) + { + Int sizeAlongAxis[kThreadGroupAxisCount]; + getComputeThreadGroupSize(func, sizeAlongAxis); + + // + m_writer->emit("// [numthreads("); + for (int ii = 0; ii < kThreadGroupAxisCount; ++ii) + { + if (ii != 0) m_writer->emit(", "); + m_writer->emit(sizeAlongAxis[ii]); + } + m_writer->emit(")]\n"); + } m_writer->emit("extern \"C\" __global__ "); @@ -628,7 +664,7 @@ void CUDASourceEmitter::emitModuleImpl(IRModule* module) // Emit the actual function emitEntryPointAttributes(func, entryPointDecor); - emitType(resultType, funcName); + emitType(resultType, globalSymbolName); m_writer->emit("(UniformEntryPointParams* params, UniformState* uniformState)"); emitSemantics(func); diff --git a/source/slang/slang.vcxproj b/source/slang/slang.vcxproj index 027c43a95..766893da3 100644 --- a/source/slang/slang.vcxproj +++ b/source/slang/slang.vcxproj @@ -188,7 +188,6 @@ <ItemGroup> <ClInclude Include="..\..\slang.h" /> <ClInclude Include="core.meta.slang.h" /> - <ClInclude Include="glsl.meta.slang.h" /> <ClInclude Include="hlsl.meta.slang.h" /> <ClInclude Include="slang-check-impl.h" /> <ClInclude Include="slang-check.h" /> @@ -345,17 +344,13 @@ <ClCompile Include="slang.cpp" /> </ItemGroup> <ItemGroup> - <None Include="..\core\core.natvis" /> - <None Include="slang.natvis" /> - </ItemGroup> - <ItemGroup> <CustomBuild Include="core.meta.slang"> <FileType>Document</FileType> <Command Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">"../../bin/windows-x86/debug/slang-generate" %(Identity)</Command> <Command Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">"../../bin/windows-x64/debug/slang-generate" %(Identity)</Command> <Command Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">"../../bin/windows-x86/release/slang-generate" %(Identity)</Command> <Command Condition="'$(Configuration)|$(Platform)'=='Release|x64'">"../../bin/windows-x64/release/slang-generate" %(Identity)</Command> - <Outputs>%(Identity).h</Outputs> + <Outputs>../../core.meta.slang.h</Outputs> <Message>slang-generate %(Identity)</Message> <AdditionalInputs Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">../../bin/windows-x86/debug/slang-generate.exe</AdditionalInputs> <AdditionalInputs Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">../../bin/windows-x64/debug/slang-generate.exe</AdditionalInputs> @@ -368,7 +363,7 @@ <Command Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">"../../bin/windows-x64/debug/slang-generate" %(Identity)</Command> <Command Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">"../../bin/windows-x86/release/slang-generate" %(Identity)</Command> <Command Condition="'$(Configuration)|$(Platform)'=='Release|x64'">"../../bin/windows-x64/release/slang-generate" %(Identity)</Command> - <Outputs>%(Identity).h</Outputs> + <Outputs>../../hlsl.meta.slang.h</Outputs> <Message>slang-generate %(Identity)</Message> <AdditionalInputs Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">../../bin/windows-x86/debug/slang-generate.exe</AdditionalInputs> <AdditionalInputs Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">../../bin/windows-x64/debug/slang-generate.exe</AdditionalInputs> @@ -377,6 +372,10 @@ </CustomBuild> </ItemGroup> <ItemGroup> + <Natvis Include="..\core\core.natvis" /> + <Natvis Include="slang.natvis" /> + </ItemGroup> + <ItemGroup> <ProjectReference Include="..\core\core.vcxproj"> <Project>{F9BE7957-8399-899E-0C49-E714FDDD4B65}</Project> </ProjectReference> diff --git a/source/slang/slang.vcxproj.filters b/source/slang/slang.vcxproj.filters index 9be567654..442f545c6 100644 --- a/source/slang/slang.vcxproj.filters +++ b/source/slang/slang.vcxproj.filters @@ -1,4 +1,4 @@ -<?xml version="1.0" encoding="utf-8"?> +<?xml version="1.0" encoding="utf-8"?> <Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003"> <ItemGroup> <Filter Include="Header Files"> @@ -15,9 +15,6 @@ <ClInclude Include="core.meta.slang.h"> <Filter>Header Files</Filter> </ClInclude> - <ClInclude Include="glsl.meta.slang.h"> - <Filter>Header Files</Filter> - </ClInclude> <ClInclude Include="hlsl.meta.slang.h"> <Filter>Header Files</Filter> </ClInclude> @@ -99,6 +96,9 @@ <ClInclude Include="slang-ir-glsl-legalize.h"> <Filter>Header Files</Filter> </ClInclude> + <ClInclude Include="slang-ir-inline.h"> + <Filter>Header Files</Filter> + </ClInclude> <ClInclude Include="slang-ir-inst-defs.h"> <Filter>Header Files</Filter> </ClInclude> @@ -138,6 +138,9 @@ <ClInclude Include="slang-ir-string-hash.h"> <Filter>Header Files</Filter> </ClInclude> + <ClInclude Include="slang-ir-strip-witness-tables.h"> + <Filter>Header Files</Filter> + </ClInclude> <ClInclude Include="slang-ir-strip.h"> <Filter>Header Files</Filter> </ClInclude> @@ -246,12 +249,6 @@ <ClInclude Include="slang-visitor.h"> <Filter>Header Files</Filter> </ClInclude> - <ClInclude Include="slang-ir-strip-witness-tables.h"> - <Filter>Header Files</Filter> - </ClInclude> - <ClInclude Include="slang-ir-inline.h"> - <Filter>Header Files</Filter> - </ClInclude> </ItemGroup> <ItemGroup> <ClCompile Include="slang-check-conformance.cpp"> @@ -353,6 +350,9 @@ <ClCompile Include="slang-ir-glsl-legalize.cpp"> <Filter>Source Files</Filter> </ClCompile> + <ClCompile Include="slang-ir-inline.cpp"> + <Filter>Source Files</Filter> + </ClCompile> <ClCompile Include="slang-ir-legalize-types.cpp"> <Filter>Source Files</Filter> </ClCompile> @@ -389,6 +389,9 @@ <ClCompile Include="slang-ir-string-hash.cpp"> <Filter>Source Files</Filter> </ClCompile> + <ClCompile Include="slang-ir-strip-witness-tables.cpp"> + <Filter>Source Files</Filter> + </ClCompile> <ClCompile Include="slang-ir-strip.cpp"> <Filter>Source Files</Filter> </ClCompile> @@ -470,20 +473,6 @@ <ClCompile Include="slang.cpp"> <Filter>Source Files</Filter> </ClCompile> - <ClCompile Include="slang-ir-strip-witness-tables.cpp"> - <Filter>Source Files</Filter> - </ClCompile> - <ClCompile Include="slang-ir-inline.cpp"> - <Filter>Source Files</Filter> - </ClCompile> - </ItemGroup> - <ItemGroup> - <None Include="..\core\core.natvis"> - <Filter>Source Files</Filter> - </None> - <None Include="slang.natvis"> - <Filter>Source Files</Filter> - </None> </ItemGroup> <ItemGroup> <CustomBuild Include="core.meta.slang"> @@ -493,4 +482,12 @@ <Filter>Source Files</Filter> </CustomBuild> </ItemGroup> + <ItemGroup> + <Natvis Include="..\core\core.natvis"> + <Filter>Source Files</Filter> + </Natvis> + <Natvis Include="slang.natvis"> + <Filter>Source Files</Filter> + </Natvis> + </ItemGroup> </Project>
\ No newline at end of file |
