| author | Tim Foley <tfoleyNV@users.noreply.github.com> | 2018-08-03 08:39:28 -0700 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2018-08-03 08:39:28 -0700 |
| commit | 68d705f6c805c9b4d31b386e065762e6db13ad18 (patch) | |
| tree | 97ffc0f24358101222d1bc62ac0c50affc55af12 | |
| parent | 5ea746a571ced32a8975eb3a238c562b3d487149 (diff) | |
Major overhaul of Renderer abstraction, to support a new example (#624)
The original goal here was to bring up a second example program: `model-viewer`.
While the existing `hello-world` example is enough to get somebody up to speed with the basics of the Slang API (as a drop-in replacement for `D3DCompile` or similar), it doesn't really show any of the big-picture stuff that Slang is meant to enable.
There wasn't any use of D3D12/Vulkan descriptor tables/sets, and there wasn't any use of interfaces, generics, or `ParameterBlock`s in the shader code.
The `model-viewer` example addresses these issues. Its shader code involves generics, interfaces, and multiple `ParameterBlock`s, and the host-side code demonstrates a few key things for working with Slang:
* There is an application-level abstraction for parameter blocks that combines the graphics-API descriptor set object with Slang type information
* There is a shader cache layer used to look up an appropriate variant of a rendering effect by using parameter block types to "plug in" global type variables
* There is a clear separation between the phases of compilation: a first phase that does semantic checking and enables reflection-based allocation of graphics API objects, followed by one or more code generation passes for specialized kernels.
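The shader cache idea in the list above can be sketched roughly as follows. This is only an illustration and not the code in `examples/model-viewer/main.cpp`: `ShaderVariant`, `ShaderCache`, and the `Generator` callback are hypothetical names, and the callback stands in for whatever second-phase code generation pass the application runs through Slang.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a specialized kernel produced by code generation.
struct ShaderVariant
{
    std::string              effectName;
    std::vector<std::string> typeArgs;
};

// Hypothetical cache: maps (effect name, parameter block type names)
// to a specialized variant, generating each variant at most once.
class ShaderCache
{
public:
    using Key       = std::pair<std::string, std::vector<std::string>>;
    using Generator = std::function<ShaderVariant(Key const&)>;

    explicit ShaderCache(Generator generator)
        : m_generator(std::move(generator))
    {
        assert(m_generator);
    }

    // The type names of the bound parameter blocks are "plugged in"
    // for the effect's global type variables to form the lookup key.
    ShaderVariant const& getVariant(
        std::string const&              effect,
        std::vector<std::string> const& blockTypes)
    {
        Key key(effect, blockTypes);
        auto it = m_variants.find(key);
        if (it == m_variants.end())
        {
            // Cache miss: run the (expensive) specialization step once.
            it = m_variants.emplace(key, m_generator(key)).first;
            ++m_generatedCount;
        }
        return it->second;
    }

    int generatedCount() const { return m_generatedCount; }

private:
    Generator                    m_generator;
    std::map<Key, ShaderVariant> m_variants;
    int                          m_generatedCount = 0;
};
```

The point of keying on parameter block types is that the front-end work (semantic checking, reflection) happens once per effect, while code generation happens once per distinct combination of types that is actually rendered.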
This example is certainly not perfect, and it will need to be revamped more going forward. In particular:
* The output picture is ugly as sin. We need a plan for how to get this to load better content, perhaps even popping up an error message to note that the required input data isn't present in the basic repository.
* The shader code is too simplistic. There isn't any real material variety, and the `IMaterial` abstraction is completely wrong.
* The use of parameter blocks is facile because there are no resource parameters right now. Fixing that will likely expose issues around interfacing with Slang's reflection API.
* The whole example exposes the issue that Slang's current APIs aren't really designed for the benefit of two-phase compilation (since our main client application has been stuck on one-phase compilation).
* Global type parameters are actually a Bad Idea that we only did for compatibility with existing codebases. We should not be showing them off in an example of the Right Way to use Slang, but the language support for type parameters on entry points is still not complete.
Of course, the majority of the changes here are *not* inside the example applications, and instead involve a major overhaul of the `Renderer` abstraction that is used for both tests and examples. The main thrust of the change is to make the abstraction layer be closer to the D3D12/Vulkan model than to a D3D11-style model. This is important for the `model-viewer` example, since it aspires to show how Slang can be incorporated into a renderer that targets a modern API. The most important bit is actually the use of descriptor sets and "pipeline layouts" a la Vulkan, since without these Slang's `ParameterBlock` abstraction won't make a lot of sense.
Implementation of the abstraction for the various APIs has very much been on an as-needed basis. The current implementation is just enough for the two examples to work, plus enough to get all the tests to pass in both debug and release builds on Windows.
A big missing feature in the API abstraction right now is memory lifetime management. The code had been trending toward something D3D11-like where a constant buffer could be mapped per-frame with the implementation doing behind-the-scenes allocation for targets like D3D12/Vulkan. I'd like to shift more toward a model of just exposing "transient" allocations that are only valid for one frame, because these are more representative of how an efficient renderer for next-generation APIs will work. That transition isn't actually complete, though, so there are problems with the existing examples where `hello-world` is actually scribbling into memory that the GPU might still be using, while `model-viewer` is doing full-on heavy-weight allocations on a per-frame basis with no real concern for the performance implications.
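As a rough sketch of what "transient" allocations could look like (this is not part of the change, and `TransientAllocator` is a hypothetical name), the backing store can be split into one region per frame in flight, with each region recycled only when its frame slot comes around again. The sketch uses plain CPU memory; a real renderer would presumably back this with GPU-visible memory and wait on a fence before reusing a region.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-frame linear allocator. Memory handed out during a frame
// is only valid until that frame's region is recycled `framesInFlight`
// frames later.
class TransientAllocator
{
public:
    TransientAllocator(std::size_t bytesPerFrame, std::size_t framesInFlight)
        : m_bytesPerFrame(bytesPerFrame)
        , m_frameCount(framesInFlight)
        , m_storage(bytesPerFrame * framesInFlight)
    {
        assert(framesInFlight > 0);
    }

    // Advance to the next frame's region and discard its old contents.
    // Assumption: the caller has already ensured the GPU is finished with
    // that region (e.g., by waiting on a fence).
    void beginFrame()
    {
        m_frameIndex = (m_frameIndex + 1) % m_frameCount;
        m_offset = 0;
    }

    // Bump-allocate `size` bytes from the current frame's region.
    // `alignment` must be a power of two; offsets are aligned relative to
    // the start of the region. Returns nullptr if the region is exhausted.
    void* allocate(std::size_t size, std::size_t alignment = 16)
    {
        std::size_t aligned = (m_offset + alignment - 1) & ~(alignment - 1);
        if (aligned + size > m_bytesPerFrame)
            return nullptr;
        m_offset = aligned + size;
        return m_storage.data() + m_frameIndex * m_bytesPerFrame + aligned;
    }

private:
    std::size_t               m_bytesPerFrame;
    std::size_t               m_frameCount;
    std::vector<std::uint8_t> m_storage;
    std::size_t               m_frameIndex = 0;
    std::size_t               m_offset = 0;
};
```

Under this scheme there is no per-frame heavy-weight allocation, and nothing is overwritten while an earlier frame may still be reading it, provided the frames-in-flight count matches what the GPU can actually have queued.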
Altogether, there are a lot of things here that need more work, but this branch has been way too long-lived already, and so I'd like to get this checked in as long as all the tests pass.
| -rw-r--r-- | .gitmodules | 6 | ||||
| -rw-r--r-- | README.md | 31 | ||||
| -rw-r--r-- | examples/hello-world/README.md (renamed from examples/hello/README.md) | 8 | ||||
| -rw-r--r-- | examples/hello-world/hello-world.vcxproj (renamed from examples/hello/hello.vcxproj) | 26 | ||||
| -rw-r--r-- | examples/hello-world/hello-world.vcxproj.filters (renamed from examples/hello/hello.vcxproj.filters) | 4 | ||||
| -rw-r--r-- | examples/hello-world/main.cpp (renamed from examples/hello/hello.cpp) | 290 | ||||
| -rw-r--r-- | examples/hello-world/shaders.slang (renamed from examples/hello/hello.slang) | 43 | ||||
| -rw-r--r-- | examples/hello/hello.sln | 28 | ||||
| -rw-r--r-- | examples/model-viewer/README.md | 25 | ||||
| -rw-r--r-- | examples/model-viewer/cube.mtl | 8 | ||||
| -rw-r--r-- | examples/model-viewer/cube.obj | 24 | ||||
| -rw-r--r-- | examples/model-viewer/main.cpp | 1618 | ||||
| -rw-r--r-- | examples/model-viewer/model-viewer.vcxproj | 184 | ||||
| -rw-r--r-- | examples/model-viewer/model-viewer.vcxproj.filters | 18 | ||||
| -rw-r--r-- | examples/model-viewer/shaders.slang | 178 | ||||
| m--------- | external/glm | 0 | ||||
| -rw-r--r-- | external/stb/stb_image_resize.h | 2627 | ||||
| m--------- | external/tinyobjloader | 0 | ||||
| -rw-r--r-- | premake5.lua | 43 | ||||
| -rw-r--r-- | slang.h | 79 | ||||
| -rw-r--r-- | slang.sln | 33 | ||||
| -rw-r--r-- | source/core/core.vcxproj | 3 | ||||
| -rw-r--r-- | source/core/core.vcxproj.filters | 9 | ||||
| -rw-r--r-- | source/core/smart-pointer.h | 10 | ||||
| -rw-r--r-- | source/slang/reflection.cpp | 37 | ||||
| -rw-r--r-- | source/slang/slang.cpp | 6 | ||||
| -rw-r--r-- | tools/gfx/circular-resource-heap-d3d12.cpp (renamed from tools/slang-graphics/circular-resource-heap-d3d12.cpp) | 4 | ||||
| -rw-r--r-- | tools/gfx/circular-resource-heap-d3d12.h (renamed from tools/slang-graphics/circular-resource-heap-d3d12.h) | 4 | ||||
| -rw-r--r-- | tools/gfx/d3d-util.cpp (renamed from tools/slang-graphics/d3d-util.cpp) | 2 | ||||
| -rw-r--r-- | tools/gfx/d3d-util.h (renamed from tools/slang-graphics/d3d-util.h) | 2 | ||||
| -rw-r--r-- | tools/gfx/descriptor-heap-d3d12.cpp (renamed from tools/slang-graphics/descriptor-heap-d3d12.cpp) | 4 | ||||
| -rw-r--r-- | tools/gfx/descriptor-heap-d3d12.h (renamed from tools/slang-graphics/descriptor-heap-d3d12.h) | 87 | ||||
| -rw-r--r-- | tools/gfx/gfx.vcxproj (renamed from tools/slang-graphics/slang-graphics.vcxproj) | 21 | ||||
| -rw-r--r-- | tools/gfx/gfx.vcxproj.filters (renamed from tools/slang-graphics/slang-graphics.vcxproj.filters) | 9 | ||||
| -rw-r--r-- | tools/gfx/model.cpp | 530 | ||||
| -rw-r--r-- | tools/gfx/model.h | 73 | ||||
| -rw-r--r-- | tools/gfx/render-d3d11.cpp | 2112 | ||||
| -rw-r--r-- | tools/gfx/render-d3d11.h (renamed from tools/slang-graphics/render-d3d11.h) | 4 | ||||
| -rw-r--r-- | tools/gfx/render-d3d12.cpp (renamed from tools/slang-graphics/render-d3d12.cpp) | 1632 | ||||
| -rw-r--r-- | tools/gfx/render-d3d12.h (renamed from tools/slang-graphics/render-d3d12.h) | 4 | ||||
| -rw-r--r-- | tools/gfx/render-gl.cpp (renamed from tools/slang-graphics/render-gl.cpp) | 489 | ||||
| -rw-r--r-- | tools/gfx/render-gl.h (renamed from tools/slang-graphics/render-gl.h) | 4 | ||||
| -rw-r--r-- | tools/gfx/render-vk.cpp (renamed from tools/slang-graphics/render-vk.cpp) | 1212 | ||||
| -rw-r--r-- | tools/gfx/render-vk.h (renamed from tools/slang-graphics/render-vk.h) | 4 | ||||
| -rw-r--r-- | tools/gfx/render.cpp (renamed from tools/slang-graphics/render.cpp) | 13 | ||||
| -rw-r--r-- | tools/gfx/render.h (renamed from tools/slang-graphics/render.h) | 432 | ||||
| -rw-r--r-- | tools/gfx/resource-d3d12.cpp (renamed from tools/slang-graphics/resource-d3d12.cpp) | 2 | ||||
| -rw-r--r-- | tools/gfx/resource-d3d12.h (renamed from tools/slang-graphics/resource-d3d12.h) | 2 | ||||
| -rw-r--r-- | tools/gfx/surface.cpp (renamed from tools/slang-graphics/surface.cpp) | 2 | ||||
| -rw-r--r-- | tools/gfx/surface.h (renamed from tools/slang-graphics/surface.h) | 2 | ||||
| -rw-r--r-- | tools/gfx/vector-math.h | 14 | ||||
| -rw-r--r-- | tools/gfx/vk-api.cpp (renamed from tools/slang-graphics/vk-api.cpp) | 2 | ||||
| -rw-r--r-- | tools/gfx/vk-api.h (renamed from tools/slang-graphics/vk-api.h) | 2 | ||||
| -rw-r--r-- | tools/gfx/vk-device-queue.cpp (renamed from tools/slang-graphics/vk-device-queue.cpp) | 2 | ||||
| -rw-r--r-- | tools/gfx/vk-device-queue.h (renamed from tools/slang-graphics/vk-device-queue.h) | 2 | ||||
| -rw-r--r-- | tools/gfx/vk-module.cpp (renamed from tools/slang-graphics/vk-module.cpp) | 2 | ||||
| -rw-r--r-- | tools/gfx/vk-module.h (renamed from tools/slang-graphics/vk-module.h) | 2 | ||||
| -rw-r--r-- | tools/gfx/vk-swap-chain.cpp (renamed from tools/slang-graphics/vk-swap-chain.cpp) | 2 | ||||
| -rw-r--r-- | tools/gfx/vk-swap-chain.h (renamed from tools/slang-graphics/vk-swap-chain.h) | 2 | ||||
| -rw-r--r-- | tools/gfx/vk-util.cpp (renamed from tools/slang-graphics/vk-util.cpp) | 2 | ||||
| -rw-r--r-- | tools/gfx/vk-util.h (renamed from tools/slang-graphics/vk-util.h) | 2 | ||||
| -rw-r--r-- | tools/gfx/window.cpp (renamed from tools/slang-graphics/window.cpp) | 48 | ||||
| -rw-r--r-- | tools/gfx/window.h (renamed from tools/slang-graphics/window.h) | 23 | ||||
| -rw-r--r-- | tools/render-test/main.cpp | 117 | ||||
| -rw-r--r-- | tools/render-test/options.h | 2 | ||||
| -rw-r--r-- | tools/render-test/png-serialize-util.h | 2 | ||||
| -rw-r--r-- | tools/render-test/render-test.vcxproj | 10 | ||||
| -rw-r--r-- | tools/render-test/shader-input-layout.h | 2 | ||||
| -rw-r--r-- | tools/render-test/shader-renderer-util.cpp | 251 | ||||
| -rw-r--r-- | tools/render-test/shader-renderer-util.h | 56 | ||||
| -rw-r--r-- | tools/render-test/slang-support.cpp | 4 | ||||
| -rw-r--r-- | tools/render-test/slang-support.h | 4 | ||||
| -rw-r--r-- | tools/slang-graphics/render-d3d11.cpp | 1101 |
73 files changed, 11404 insertions, 2238 deletions
diff --git a/.gitmodules b/.gitmodules index 5ee785420..d410bf9b7 100644 --- a/.gitmodules +++ b/.gitmodules @@ -1,3 +1,9 @@ [submodule "external/glslang"] path = external/glslang url = https://github.com/KhronosGroup/glslang.git +[submodule "external/tinyobjloader"] + path = external/tinyobjloader + url = https://github.com/syoyo/tinyobjloader +[submodule "external/glm"] + path = external/glm + url = https://github.com/g-truc/glm.git @@ -3,18 +3,22 @@ [](https://ci.appveyor.com/project/shader-slang/slang/branch/master) [](https://travis-ci.org/shader-slang/slang) Slang is a shading language that extends HLSL with new capabilities for building modular, extensible, and high-performance real-time shading systems. -This repository provides a command-line compiler and a plain C API for loading, compiling, and reflecting shader code in Slang or plain HLSL. +This repository provides a command-line compiler and a C/C++ API for loading, compiling, and reflecting shader code in Slang or plain HLSL. -Using Slang you can: +The extensions provided by the Slang language make it easier for you to write high-performance shader codebases with a maintainable and modular structure. For example: -* Compile your HLSL or Slang code to DX bytecode, SPIR-V, or plain source code in HLSL or GLSL (DXIL support is planned). +* Parameter blocks (exposed as `ParameterBlock<T>`) let you group together related shader parameters -- both simple uniform values and resources like samplers/textures - in ordinary `struct` types, and then specify that they should be passed to the GPU as a single coherent block. Your application code can easily map a parameter block to abstractions like descriptor tables/sets on D3D12/Vulkan, or to the facilities provided by other APIs. + +* Generics and interfaces can be used to perform static specialization of your shader code without resort to preprocessor techniques or string-pasting. 
Unlike C++ templates, Slang's generics can be checked ahead of time and don't produce cascading error messages that are difficult to diagnose. The same generic shader can be specialized for a variety of different types to produce specialized code ahead of time, or on the fly, completely under application control. + +The Slang implementation in this repository provides a library and a stand-alone compiler for Slang that can be used to: + +* Compile your HLSL or Slang code to DX bytecode, DXIL, SPIR-V, or plain source code in HLSL or GLSL. * Get full reflection information about the parameters of your shader code, with a consistent interface no matter the target graphics API. Slang doesn't silently drop unused or "dead" shader parameters from the reflection data, so you can always see the full picture. * Take ordinary HLSL code that neglects to include all those tedious `register` and `layout` bindings, and transform it into code that includes explicit bindings on every shader parameter. This frees you to write simple and clean code, while still getting completely deterministic binding locations. -* Write shading code that uses first-class support for modules, interfaces, and generics to build clean and reusable shader libraries. - ## Getting Started The fastest way to get started with Slang is to use a pre-built binary package, available through GitHub [releases](https://github.com/shader-slang/slang/releases). @@ -25,6 +29,14 @@ If you would like to build Slang from source, please consult the instructions [h ## Documentation +For users getting started with Slang, it may help to start by looking at our example programs: + +* The [`hello-world`](examples/hello-world/) example shows the basics for integrating the Slang API into an application as a more-or-less drop-in replacement for `D3DCompile`. 
+ +* The [`model-viewer`](examples/model-viewer/) example shows a more involved rendering application that uses Slang's new language features to perform efficient shader specialization and parameter binding while maintaining clear and modular shader code. + +A [paper](http://graphics.cs.cmu.edu/projects/slang/) on the Slang system was accepted into SIGGRAPH 2018, and it provides an overview of the language and the design of the impelemtnation. + The Slang [language guide](docs/language-guide.md) provides information on extended language features that Slang provides for user code. The [API user's guide](docs/api-users-guide.md) gives information on how to drive Slang programmatically from an application. @@ -34,15 +46,15 @@ Be warned, however, that the command-line tool is primarily intended for experim ## Limitations -The Slang project is in a very early state, so there are many rough edges to be aware of. +The Slang project is in an early state, so there are many rough edges to be aware of. Slang is *not* currently recommended for production use. The project is intentionally on a pre-`1.0.0` version to reflect the fact that interfaces and features may change at any time (though we try not to break user code without good reason). Major limitations to be aware of (beyond everything files in the issue tracker): -* Slang only supports outputting GLSL/SPIR-V for Vulkan, not OpenGL +* Slang only officially supports outputting GLSL/SPIR-V for Vulkan, not OpenGL -* Slang's current approach to automatically assigning registers is appropriate to D3D12, but not D3D11 +* Slang's current approach to automatically assigning registers is appropriate to D3D12, and is not ideal for D3D11 * Slang-to-GLSL cross-compilation only supports vertex, fragment, and compute shaders. Geometry and tessellation shader cross-compilation is not yet implemented. @@ -66,7 +78,6 @@ The Slang code itself is under the MIT license (see [LICENSE](LICENSE)). 
The Slang projet can be compiled to use the [`glslang`](https://github.com/KhronosGroup/glslang) project as a submodule (under `external/glslang`), and `glslang` is under a BSD license. -The Slang tests (which are not distributed with source/binary releases) include example shaders extracted from: -* Sample HLSL shaders from the Microsoft DirectX SDK, which has its own license +The Slang tests (which are not distributed with source/binary releases) include example HLSL shaders extracted from the Microsoft DirectX SDK, which has its own license Some of the Slang examples and tests use the `stb_image` and `stb_image_write` libraries (under `external/stb`) which have been placed in the public domain by their author(s). diff --git a/examples/hello/README.md b/examples/hello-world/README.md index 31a983428..ba377b8cb 100644 --- a/examples/hello/README.md +++ b/examples/hello-world/README.md @@ -3,10 +3,10 @@ Slang "Hello World" Example The goal of this example is to demonstrate an almost minimal application that uses Slang for shading. -The `hello.slang` file contains simple vertex and fragment shader entry points. The shader code should compile as either Slang or HLSL code (that is, this example does not show off any new Slang language features). +The `shaders.slang` file contains simple vertex and fragment shader entry points. The shader code should compile as either Slang or HLSL code (that is, this example does not show off any new Slang language features). -The `hello.cpp` file contains the C++ application code, showing how to use the Slang C API to load and compile the shader code to DirectX shader bytecode (DXBC). -The application perform rendering using the D3D11 API, through a platform and graphics API abstraction layer that is implemented in `tools/slang-graphics`. -Note that this abstraction layer is *not* required in order to work with Slang, and it is just there to help us write example applications more conveniently. 
+The `main.cpp` file contains the C++ application code, showing how to use the Slang API to load and compile the shader code to DirectX shader bytecode (DXBC). +The application perform rendering using the D3D11 API, through a platform and graphics API abstraction layer that is implemented in `tools/gfx`. +Note that this abstraction layer is *not* required in order to work with Slang, and it is just there to help us write example and test applications more conveniently. This example is not necessarily representative of best practices for integrating Slang into a production engine; the goal is merely to use the minimum amount of code possible to demonstrate a complete applicaiton that uses Slang. diff --git a/examples/hello/hello.vcxproj b/examples/hello-world/hello-world.vcxproj index 885c2ff86..9efb28688 100644 --- a/examples/hello/hello.vcxproj +++ b/examples/hello-world/hello-world.vcxproj @@ -19,10 +19,10 @@ </ProjectConfiguration> </ItemGroup> <PropertyGroup Label="Globals"> - <ProjectGuid>{E6385042-1649-4803-9EBD-168F8B7EF131}</ProjectGuid> + <ProjectGuid>{5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}</ProjectGuid> <IgnoreWarnCompileDuplicatedFilename>true</IgnoreWarnCompileDuplicatedFilename> <Keyword>Win32Proj</Keyword> - <RootNamespace>hello</RootNamespace> + <RootNamespace>hello-world</RootNamespace> </PropertyGroup> <Import Project="$(VCTargetsPath)\Microsoft.Cpp.Default.props" /> <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'" Label="Configuration"> @@ -68,29 +68,29 @@ <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'"> <LinkIncremental>true</LinkIncremental> <OutDir>..\..\bin\windows-x86\debug\</OutDir> - <IntDir>..\..\intermediate\windows-x86\debug\hello\</IntDir> - <TargetName>hello</TargetName> + <IntDir>..\..\intermediate\windows-x86\debug\hello-world\</IntDir> + <TargetName>hello-world</TargetName> <TargetExt>.exe</TargetExt> </PropertyGroup> <PropertyGroup 
Condition="'$(Configuration)|$(Platform)'=='Debug|x64'"> <LinkIncremental>true</LinkIncremental> <OutDir>..\..\bin\windows-x64\debug\</OutDir> - <IntDir>..\..\intermediate\windows-x64\debug\hello\</IntDir> - <TargetName>hello</TargetName> + <IntDir>..\..\intermediate\windows-x64\debug\hello-world\</IntDir> + <TargetName>hello-world</TargetName> <TargetExt>.exe</TargetExt> </PropertyGroup> <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'"> <LinkIncremental>false</LinkIncremental> <OutDir>..\..\bin\windows-x86\release\</OutDir> - <IntDir>..\..\intermediate\windows-x86\release\hello\</IntDir> - <TargetName>hello</TargetName> + <IntDir>..\..\intermediate\windows-x86\release\hello-world\</IntDir> + <TargetName>hello-world</TargetName> <TargetExt>.exe</TargetExt> </PropertyGroup> <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'"> <LinkIncremental>false</LinkIncremental> <OutDir>..\..\bin\windows-x64\release\</OutDir> - <IntDir>..\..\intermediate\windows-x64\release\hello\</IntDir> - <TargetName>hello</TargetName> + <IntDir>..\..\intermediate\windows-x64\release\hello-world\</IntDir> + <TargetName>hello-world</TargetName> <TargetExt>.exe</TargetExt> </PropertyGroup> <ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'"> @@ -162,10 +162,10 @@ </Link> </ItemDefinitionGroup> <ItemGroup> - <ClCompile Include="hello.cpp" /> + <ClCompile Include="main.cpp" /> </ItemGroup> <ItemGroup> - <None Include="hello.slang" /> + <None Include="shaders.slang" /> </ItemGroup> <ItemGroup> <ProjectReference Include="..\..\source\slang\slang.vcxproj"> @@ -174,7 +174,7 @@ <ProjectReference Include="..\..\source\core\core.vcxproj"> <Project>{F9BE7957-8399-899E-0C49-E714FDDD4B65}</Project> </ProjectReference> - <ProjectReference Include="..\..\tools\slang-graphics\slang-graphics.vcxproj"> + <ProjectReference Include="..\..\tools\gfx\gfx.vcxproj"> <Project>{222F7498-B40C-4F3F-A704-DDEB91A4484A}</Project> </ProjectReference> 
</ItemGroup> diff --git a/examples/hello/hello.vcxproj.filters b/examples/hello-world/hello-world.vcxproj.filters index 6855e69cc..a02cb79fc 100644 --- a/examples/hello/hello.vcxproj.filters +++ b/examples/hello-world/hello-world.vcxproj.filters @@ -6,12 +6,12 @@ </Filter> </ItemGroup> <ItemGroup> - <ClCompile Include="hello.cpp"> + <ClCompile Include="main.cpp"> <Filter>Source Files</Filter> </ClCompile> </ItemGroup> <ItemGroup> - <None Include="hello.slang"> + <None Include="shaders.slang"> <Filter>Source Files</Filter> </None> </ItemGroup> diff --git a/examples/hello/hello.cpp b/examples/hello-world/main.cpp index 8f2fbca0b..bac378d96 100644 --- a/examples/hello/hello.cpp +++ b/examples/hello-world/main.cpp @@ -1,7 +1,11 @@ -// hello.cpp +// main.cpp // This file implements an extremely simple example of loading and -// executing a Slang shader program. +// executing a Slang shader program. This is primarily an example +// of how to use Slang as a "drop-in" replacement for an existing +// HLSL compiler like the `D3DCompile` API. More advanced usage +// of advanced Slang language and API features is left to the +// next example. // // The comments in the file will attempt to explain concepts as // they are introduced. @@ -28,10 +32,34 @@ // with Slang may depend on an application/engine making certain // design choices in their abstraction layer. // -#include "slang-graphics/render.h" -#include "slang-graphics/render-d3d11.h" -#include "slang-graphics/window.h" -using namespace slang_graphics; +#include "gfx/render.h" +#include "gfx/render-d3d11.h" +#include "gfx/window.h" +using namespace gfx; + +// For the purposes of a small example, we will define the vertex data for a +// single triangle directly in the source file. It should be easy to extend +// this example to load data from an external source, if desired. 
+// +struct Vertex +{ + float position[3]; + float color[3]; +}; + +static const int kVertexCount = 3; +static const Vertex kVertexData[kVertexCount] = +{ + { { 0, 0, 0.5 }, { 1, 0, 0 } }, + { { 0, 1, 0.5 }, { 0, 0, 1 } }, + { { 1, 0, 0.5 }, { 0, 1, 0 } }, +}; + +// The example application will be implemented as a `struct`, so that +// we can scope the resources it allocates without using global variables. +// +struct HelloWorld +{ // We will start with a function that will invoke the Slang compiler // to generate target-specific code from a shader file, and then @@ -42,7 +70,7 @@ using namespace slang_graphics; // Slang API. This function is representative of code that a user // might write to integrate Slang into their renderer/engine. // -ShaderProgram* loadShaderProgram(Renderer* renderer) +RefPtr<gfx::ShaderProgram> loadShaderProgram(gfx::Renderer* renderer) { // First, we need to create a "session" for interacting with the Slang // compiler. This scopes all of our application's interactions @@ -50,10 +78,12 @@ ShaderProgram* loadShaderProgram(Renderer* renderer) // Slang to load and validate its standard library, so this is a // somewhat heavy-weight operation. When possible, an application // should try to re-use the same session across multiple compiles. + // SlangSession* slangSession = spCreateSession(NULL); // A compile request represents a single invocation of the compiler, // to process some inputs and produce outputs (or errors). + // SlangCompileRequest* slangRequest = spCreateCompileRequest(slangSession); // We would like to request a single target (output) format: DirectX shader bytecode (DXBC) @@ -61,6 +91,7 @@ ShaderProgram* loadShaderProgram(Renderer* renderer) // We will specify the desired "profile" for this one target in terms of the // DirectX "shader model" that should be supported. 
+ // spSetTargetProfile(slangRequest, targetIndex, spFindProfile(slangSession, "sm_4_0")); // A compile request can include one or more "translation units," which more or @@ -70,11 +101,13 @@ ShaderProgram* loadShaderProgram(Renderer* renderer) // For this example, our code will all be in the Slang language. The user may // also specify HLSL input here, but that currently doesn't affect the compiler's // behavior much. + // int translationUnitIndex = spAddTranslationUnit(slangRequest, SLANG_SOURCE_LANGUAGE_SLANG, nullptr); - // We will load source code for our translation unit from the file `hello.slang`. + // We will load source code for our translation unit from the file `shaders.slang`. // There are also variations of this API for adding source code from application-provided buffers. - spAddTranslationUnitSourceFile(slangRequest, translationUnitIndex, "hello.slang"); + // + spAddTranslationUnitSourceFile(slangRequest, translationUnitIndex, "shaders.slang"); // Next we will specify the entry points we'd like to compile. // It is often convenient to put more than one entry point in the same file, @@ -85,9 +118,9 @@ ShaderProgram* loadShaderProgram(Renderer* renderer) // translation unit in which that function can be found, and the stage // that we need to compile for (e.g., vertex, fragment, geometry, ...). 
// - char const* vertexEntryPointName = "vertexMain"; - char const* fragmentEntryPointName = "fragmentMain"; - int vertexIndex = spAddEntryPoint(slangRequest, translationUnitIndex, vertexEntryPointName, SLANG_STAGE_VERTEX); + char const* vertexEntryPointName = "vertexMain"; + char const* fragmentEntryPointName = "fragmentMain"; + int vertexIndex = spAddEntryPoint(slangRequest, translationUnitIndex, vertexEntryPointName, SLANG_STAGE_VERTEX); int fragmentIndex = spAddEntryPoint(slangRequest, translationUnitIndex, fragmentEntryPointName, SLANG_STAGE_FRAGMENT); // Once all of the input options for the compiler have been specified, @@ -100,13 +133,13 @@ ShaderProgram* loadShaderProgram(Renderer* renderer) // compiler may have produced "diagnostic" output such as warnings. // We will go ahead and print that output here. // - if (auto diagnostics = spGetDiagnosticOutput(slangRequest)) + if(auto diagnostics = spGetDiagnosticOutput(slangRequest)) { reportError("%s", diagnostics); } // If compilation failed, there is no point in continuing any further. - if (SLANG_FAILED(compileRes)) + if(SLANG_FAILED(compileRes)) { spDestroyCompileRequest(slangRequest); spDestroySession(slangSession); @@ -119,6 +152,7 @@ ShaderProgram* loadShaderProgram(Renderer* renderer) // If you are using a D3D API, then your application may want to // take advantage of the fact taht these blobs are binary compatible // with the `ID3DBlob`, `ID3D10Blob`, etc. interfaces. + // ISlangBlob* vertexShaderBlob = nullptr; spGetEntryPointCodeBlob(slangRequest, vertexIndex, 0, &vertexShaderBlob); @@ -128,13 +162,14 @@ ShaderProgram* loadShaderProgram(Renderer* renderer) // We extract the begin/end pointers to the output code buffers // using operations on the `ISlangBlob` interface. 
- char const* vertexCode = (char const*)vertexShaderBlob->getBufferPointer(); + // + char const* vertexCode = (char const*) vertexShaderBlob->getBufferPointer(); char const* vertexCodeEnd = vertexCode + vertexShaderBlob->getBufferSize(); - char const* fragmentCode = (char const*)fragmentShaderBlob->getBufferPointer(); + char const* fragmentCode = (char const*) fragmentShaderBlob->getBufferPointer(); char const* fragmentCodeEnd = fragmentCode + fragmentShaderBlob->getBufferSize(); - // Once we have extract the output blobs, it is safe to destroy + // Once we have extracted the output blobs, it is safe to destroy // the compile request and even the session. // spDestroyCompileRequest(slangRequest); @@ -146,18 +181,18 @@ ShaderProgram* loadShaderProgram(Renderer* renderer) // Reminder: this section does not involve the Slang API at all. // - ShaderProgram::KernelDesc kernelDescs[] = + gfx::ShaderProgram::KernelDesc kernelDescs[] = { - { StageType::Vertex, vertexCode, vertexCodeEnd }, - { StageType::Fragment, fragmentCode, fragmentCodeEnd }, + { gfx::StageType::Vertex, vertexCode, vertexCodeEnd }, + { gfx::StageType::Fragment, fragmentCode, fragmentCodeEnd }, }; - ShaderProgram::Desc programDesc; - programDesc.pipelineType = PipelineType::Graphics; + gfx::ShaderProgram::Desc programDesc; + programDesc.pipelineType = gfx::PipelineType::Graphics; programDesc.kernels = &kernelDescs[0]; programDesc.kernelCount = 2; - ShaderProgram* shaderProgram = renderer->createProgram(programDesc); + auto shaderProgram = renderer->createProgram(programDesc); // Once we've used the output blobs from the Slang compiler to initialize // the API-specific shader program, we can release their memory. @@ -180,26 +215,8 @@ ShaderProgram* loadShaderProgram(Renderer* renderer) // We will hard-code the size of our rendering window. 
// -static int gWindowWidth = 1024; -static int gWindowHeight = 768; - -// For the purposes of a small example, we will define the vertex data for a -// single triangle directly in the source file. It should be easy to extend -// this example to load data from an external source, if desired. -// -struct Vertex -{ - float position[3]; - float color[3]; -}; - -static const int kVertexCount = 3; -static const Vertex kVertexData[kVertexCount] = -{ - { { 0, 0, 0.5 },{ 1, 0, 0 } }, - { { 0, 1, 0.5 },{ 0, 0, 1 } }, - { { 1, 0, 0.5 },{ 0, 1, 0 } }, -}; +int gWindowWidth = 1024; +int gWindowHeight = 768; // We will define global variables for the various platform and // graphics API objects that our application needs: @@ -208,18 +225,25 @@ static const Vertex kVertexData[kVertexCount] = // of them come from the utility library we are using to simplify // building an example program. // -ApplicationContext* gAppContext; -Window* gWindow; -Renderer* gRenderer; -BufferResource* gConstantBuffer; -InputLayout* gInputLayout; -BufferResource* gVertexBuffer; -ShaderProgram* gShaderProgram; -BindingState* gBindingState; - -SlangResult initialize() +gfx::ApplicationContext* gAppContext; +gfx::Window* gWindow; +RefPtr<gfx::Renderer> gRenderer; +RefPtr<gfx::BufferResource> gConstantBuffer; + +RefPtr<gfx::PipelineLayout> gPipelineLayout; +RefPtr<gfx::PipelineState> gPipelineState; +RefPtr<gfx::DescriptorSet> gDescriptorSet; + +RefPtr<gfx::BufferResource> gVertexBuffer; + +// Now that we've covered the function that actually loads and +// compiles our Slang shade code, we can go through the rest +// of the application code without as much commentary. +// +Result initialize() { // Create a window for our application to render into. 
+ // WindowDesc windowDesc; windowDesc.title = "Hello, World!"; windowDesc.width = gWindowWidth; @@ -238,15 +262,17 @@ SlangResult initialize() rendererDesc.width = gWindowWidth; rendererDesc.height = gWindowHeight; { - const SlangResult res = gRenderer->initialize(rendererDesc, getPlatformWindowHandle(gWindow)); - if (SLANG_FAILED(res)) return res; + Result res = gRenderer->initialize(rendererDesc, getPlatformWindowHandle(gWindow)); + if(SLANG_FAILED(res)) return res; } // Create a constant buffer for passing the model-view-projection matrix. // - // TODO: A future version of this example will show how to - // use the Slang reflection API to query the required size - // for the data in this constant buffer. + // Note: the Slang API supports reflection which could be used + // to query the size of the `Uniforms` constant buffer, but we + // will not deal with that here because Slang also supports + // applications that want to hard-code things like memory + // layout and parameter locations. // int constantBufferSize = 16 * sizeof(float); @@ -258,55 +284,112 @@ SlangResult initialize() gConstantBuffer = gRenderer->createBufferResource( Resource::Usage::ConstantBuffer, constantBufferDesc); - if (!gConstantBuffer) return SLANG_FAIL; - - // Input Assembler (IA) - - // Input Layout + if(!gConstantBuffer) return SLANG_FAIL; + // Now we will create objects needed to configure the "input assembler" + // (IA) stage of the D3D pipeline. + // + // First, we create an input layout: + // InputElementDesc inputElements[] = { { "POSITION", 0, Format::RGB_Float32, offsetof(Vertex, position) }, { "COLOR", 0, Format::RGB_Float32, offsetof(Vertex, color) }, }; - gInputLayout = gRenderer->createInputLayout( + auto inputLayout = gRenderer->createInputLayout( &inputElements[0], 2); - if (!gInputLayout) return SLANG_FAIL; - - // Vertex Buffer + if(!inputLayout) return SLANG_FAIL; + // Next we allocate a vertex buffer for our pre-initialized + // vertex data. 
+ // BufferResource::Desc vertexBufferDesc; vertexBufferDesc.init(kVertexCount * sizeof(Vertex)); vertexBufferDesc.setDefaults(Resource::Usage::VertexBuffer); - gVertexBuffer = gRenderer->createBufferResource( Resource::Usage::VertexBuffer, vertexBufferDesc, &kVertexData[0]); - if (!gVertexBuffer) return SLANG_FAIL; + if(!gVertexBuffer) return SLANG_FAIL; + + // Now we will use our `loadShaderProgram` function to load + // the code from `shaders.slang` into the graphics API. + // + RefPtr<ShaderProgram> shaderProgram = loadShaderProgram(gRenderer); + if(!shaderProgram) return SLANG_FAIL; + + // Our example graphics API uses a "modern" D3D12/Vulkan style + // of resource binding, so now we will dive into describing and + // allocating "descriptor sets." + // + // First, we need to construct a descriptor set *layout*. + // + DescriptorSetLayout::SlotRangeDesc slotRanges[] = + { + DescriptorSetLayout::SlotRangeDesc(DescriptorSlotType::UniformBuffer), + }; + DescriptorSetLayout::Desc descriptorSetLayoutDesc; + descriptorSetLayoutDesc.slotRangeCount = 1; + descriptorSetLayoutDesc.slotRanges = &slotRanges[0]; + auto descriptorSetLayout = gRenderer->createDescriptorSetLayout(descriptorSetLayoutDesc); + if(!descriptorSetLayout) return SLANG_FAIL; + + // Next we will allocate a pipeline layout, which specifies + // that we will render with only a single descriptor set bound. + // + + PipelineLayout::DescriptorSetDesc descriptorSets[] = + { + PipelineLayout::DescriptorSetDesc( descriptorSetLayout ), + }; + PipelineLayout::Desc pipelineLayoutDesc; + pipelineLayoutDesc.renderTargetCount = 1; + pipelineLayoutDesc.descriptorSetCount = 1; + pipelineLayoutDesc.descriptorSets = &descriptorSets[0]; + auto pipelineLayout = gRenderer->createPipelineLayout(pipelineLayoutDesc); + if(!pipelineLayout) return SLANG_FAIL; + + gPipelineLayout = pipelineLayout; + + // Once we have the descriptor set layout, we can allocate + // and fill in a descriptor set to hold our parameters. 
+ // + auto descriptorSet = gRenderer->createDescriptorSet(descriptorSetLayout); + if(!descriptorSet) return SLANG_FAIL; - // Shaders (VS, PS, ...) + descriptorSet->setConstantBuffer(0, 0, gConstantBuffer); - gShaderProgram = loadShaderProgram(gRenderer); - if (!gShaderProgram) return SLANG_FAIL; + gDescriptorSet = descriptorSet; - // Resource binding state + // Following the D3D12/Vulkan style of API, we need a pipeline state object + // (PSO) to encapsulate the configuration of the overall graphics pipeline. + // + GraphicsPipelineStateDesc desc; + desc.pipelineLayout = gPipelineLayout; + desc.inputLayout = inputLayout; + desc.program = shaderProgram; + desc.renderTargetCount = 1; + auto pipelineState = gRenderer->createGraphicsPipelineState(desc); + if(!pipelineState) return SLANG_FAIL; - BindingState::Desc bindingStateDesc; - bindingStateDesc.addBufferResource(gConstantBuffer, BindingState::RegisterRange::makeSingle(0)); - gBindingState = gRenderer->createBindingState(bindingStateDesc); + gPipelineState = pipelineState; // Once we've initialized all the graphics API objects, // it is time to show our application window and start rendering. - + // showWindow(gWindow); return SLANG_OK; } +// With the initialization out of the way, we can now turn our attention +// to the per-frame rendering logic. As with the initialization, there is +// nothing really Slang-specific here, so the commentary doesn't need +// to be very detailed. +// void renderFrame() { - // Clear our framebuffer (color target only) + // We start by clearing our framebuffer, which only has a color target. // static const float kClearColor[] = { 0.25, 0.25, 0.25, 1.0 }; gRenderer->setClearColor(kClearColor); @@ -316,9 +399,10 @@ void renderFrame() // of the example, but we don't actually load different data // per-frame (we always use an identity projection). 
// - if (float* data = (float*)gRenderer->map(gConstantBuffer, MapFlavor::WriteDiscard)) + if(float* data = (float*) gRenderer->map(gConstantBuffer, MapFlavor::WriteDiscard)) { - static const float kIdentity[] = { + static const float kIdentity[] = + { 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, @@ -328,53 +412,59 @@ void renderFrame() gRenderer->unmap(gConstantBuffer); } - // Input Assembler (IA) + // Now we configure our graphics pipeline state by setting the + // PSO, binding our descriptor set (which references the + // constant buffer that we wrote to above), and setting + // some additional bits of state, before drawing our triangle. + // + gRenderer->setPipelineState(PipelineType::Graphics, gPipelineState); + gRenderer->setDescriptorSet(PipelineType::Graphics, gPipelineLayout, 0, gDescriptorSet); - gRenderer->setInputLayout(gInputLayout); + gRenderer->setVertexBuffer(0, gVertexBuffer, sizeof(Vertex)); gRenderer->setPrimitiveTopology(PrimitiveTopology::TriangleList); - UInt vertexStride = sizeof(Vertex); - UInt vertexBufferOffset = 0; - gRenderer->setVertexBuffers(0, 1, &gVertexBuffer, &vertexStride, &vertexBufferOffset); - - // Vertex Shader (VS) - // Pixel Shader (PS) - - gRenderer->setShaderProgram(gShaderProgram); - gRenderer->setBindingState(gBindingState); - - // - gRenderer->draw(3); + // With that, we are done drawing for one frame, and ready for the next. + // gRenderer->presentFrame(); } void finalize() { - // TODO: Proper cleanup. + // All of our graphics API objects are reference-counted, + // so there isn't any additional cleanup work that needs + // to be done in this simple example. } +}; + // This "inner" main function is used by the platform abstraction // layer to deal with differences in how an entry point needs // to be defined for different platforms. 
// void innerMain(ApplicationContext* context) { - if (SLANG_FAILED(initialize())) + // We construct an instance of our example application + // `struct` type, and then walk through the lifecycle + // of the application. + + HelloWorld app; + + if (SLANG_FAILED(app.initialize())) { return exitApplication(context, 1); } - while (dispatchEvents(context)) + while(dispatchEvents(context)) { - renderFrame(); + app.renderFrame(); } - finalize(); + app.finalize(); } // This macro instantiates an appropriate main function to // invoke the `innerMain` above. // -SG_UI_MAIN(innerMain) +GFX_UI_MAIN(innerMain) diff --git a/examples/hello/hello.slang b/examples/hello-world/shaders.slang index 5a68979ce..2df26b3d9 100644 --- a/examples/hello/hello.slang +++ b/examples/hello-world/shaders.slang @@ -1,50 +1,50 @@ -// hello.slang +// shaders.slang +// // This file provides a simple vertex and fragment shader that can be compiled // using Slang. This code should also be valid as HLSL, and thus it does not // use any of the new language features supported by Slang. +// +// Uniform data to be passed from application -> shader. cbuffer Uniforms { float4x4 modelViewProjection; } +// Per-vertex attributes to be assembled from bound vertex buffers. struct AssembledVertex { float3 position : POSITION; float3 color : COLOR; }; +// Output of the vertex shader, and input to the fragment shader. 
struct CoarseVertex { float3 color; }; +// Output of the fragment shader struct Fragment { float4 color; }; - // Vertex Shader -struct VertexStageInput -{ - AssembledVertex assembledVertex; -}; - struct VertexStageOutput { CoarseVertex coarseVertex : CoarseVertex; float4 sv_position : SV_Position; }; - -VertexStageOutput vertexMain(VertexStageInput input) +VertexStageOutput vertexMain( + AssembledVertex assembledVertex) { VertexStageOutput output; - float3 position = input.assembledVertex.position; - float3 color = input.assembledVertex.color; + float3 position = assembledVertex.position; + float3 color = assembledVertex.color; output.coarseVertex.color = color; output.sv_position = mul(modelViewProjection, float4(position, 1.0)); @@ -54,23 +54,10 @@ VertexStageOutput vertexMain(VertexStageInput input) // Fragment Shader -struct FragmentStageInput +float4 fragmentMain( + CoarseVertex coarseVertex : CoarseVertex) : SV_Target { - CoarseVertex coarseVertex : CoarseVertex; -}; + float3 color = coarseVertex.color; -struct FragmentStageOutput -{ - Fragment fragment : SV_Target; -}; - -FragmentStageOutput fragmentMain(FragmentStageInput input) -{ - FragmentStageOutput output; - - float3 color = input.coarseVertex.color; - - output.fragment.color = float4(color, 1.0); - - return output; + return float4(color, 1.0); } diff --git a/examples/hello/hello.sln b/examples/hello/hello.sln deleted file mode 100644 index 3ddf262df..000000000 --- a/examples/hello/hello.sln +++ /dev/null @@ -1,28 +0,0 @@ - -Microsoft Visual Studio Solution File, Format Version 12.00 -# Visual Studio 14 -VisualStudioVersion = 14.0.25420.1 -MinimumVisualStudioVersion = 10.0.40219.1 -Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "hello", "hello.vcxproj", "{E6385042-1649-4803-9EBD-168F8B7EF131}" -EndProject -Global - GlobalSection(SolutionConfigurationPlatforms) = preSolution - Debug|x64 = Debug|x64 - Debug|x86 = Debug|x86 - Release|x64 = Release|x64 - Release|x86 = Release|x86 - EndGlobalSection - 
GlobalSection(ProjectConfigurationPlatforms) = postSolution - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|x64.ActiveCfg = Debug|x64 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|x64.Build.0 = Debug|x64 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|x86.ActiveCfg = Debug|Win32 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|x86.Build.0 = Debug|Win32 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|x64.ActiveCfg = Release|x64 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|x64.Build.0 = Release|x64 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|x86.ActiveCfg = Release|Win32 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|x86.Build.0 = Release|Win32 - EndGlobalSection - GlobalSection(SolutionProperties) = preSolution - HideSolutionNode = FALSE - EndGlobalSection -EndGlobal diff --git a/examples/model-viewer/README.md b/examples/model-viewer/README.md new file mode 100644 index 000000000..a350a48a2 --- /dev/null +++ b/examples/model-viewer/README.md @@ -0,0 +1,25 @@ +Model Viewer Example +==================== + +This example expands on the simple Slang API integration from the "Hello, World" example by actually loading and rendering model data with extremely basic surface and light shading. + +This time, the shader code is making use of various Slang language features, so readers may want to read through `shaders.slang` to see an example of how the various mechanisms can be used to build out a more complicated shader library. +While the shader code in this example is still simplistic, it shows examples of: + +* Using multiple Slang `ParameterBlock`s to manage the space of shader parameter bindings in a graphics-API-independent fashion, while still taking advantage of the performance opportunities afforded by D3D12 and Vulkan. + +* Using `interface`s and generics to express multiple variations of a feature with static specialization, in place of more traditional preprocessor techniques. 
+ +The application code in `main.cpp` also shows a more advanced integration of the Slang API than that in the "Hello, World" example, including examples of: + +* Loading a library of Slang shader code to perform reflection on its types *without* specifying a particular entry point to generate code for + +* Using Slang's reflection information to allocate graphics-API objects to implement parameter blocks (e.g., D3D12/Vulkan descriptor tables/sets) + +* Performing on-demand specialization of Slang's generics using type information from parameter blocks to achieve simple shader specialization + +It is perhaps worth taking note of the two things this example intentionally does *not* do: + +* There is no use of the C-style preprocessor in the shader code presented, in order to demonstrate that shader specialization can be achieved without preprocessor techniques. + +* There is no use of explicit parameter binding decorations (e.g., HLSL `register` or GLSL `layout` modifiers), in order to demonstrate that these are not needed in order to achieve high-performance shader parameter binding. 
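The on-demand specialization listed in the README can be sketched in isolation, without any Slang or graphics API calls. The snippet below is only an illustration under stated assumptions: `GenericParamInfo`, `inferTypeArgs`, and `VariantCache` are hypothetical names that do not appear in the commit, type names are plain strings, and "compiling" a variant is reduced to handing out an integer ID. It shows the idea that each generic type parameter records which parameter block uses it, so the types of the blocks bound at draw time determine the type arguments, and those arguments form the key of a variant cache.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in: for one generic type parameter, the index of the
// parameter block whose element type is that generic parameter.
struct GenericParamInfo
{
    int parameterBlockIndex;
};

// Infer one type argument per generic parameter by looking at the concrete
// type name of the parameter block bound at the recorded index.
std::vector<std::string> inferTypeArgs(
    const std::vector<GenericParamInfo>& genericParams,
    const std::vector<std::string>& boundBlockTypeNames)
{
    std::vector<std::string> typeArgs;
    for (const auto& param : genericParams)
    {
        auto index = static_cast<std::size_t>(param.parameterBlockIndex);
        assert(index < boundBlockTypeNames.size());
        typeArgs.push_back(boundBlockTypeNames[index]);
    }
    return typeArgs;
}

// Hypothetical cache: maps a list of type arguments to a specialized variant.
// A real application would run code generation where `compileCount` is bumped.
struct VariantCache
{
    std::map<std::vector<std::string>, int> variants;
    int compileCount = 0;

    int getOrCreate(const std::vector<std::string>& typeArgs)
    {
        auto found = variants.find(typeArgs);
        if (found != variants.end())
            return found->second;

        int variantID = compileCount++; // placeholder for real specialization
        variants.emplace(typeArgs, variantID);
        return variantID;
    }
};
```

With two generic parameters that point at blocks 2 and 0, binding blocks of three concrete types yields two type arguments, and asking the cache for the same combination twice only "compiles" once.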
diff --git a/examples/model-viewer/cube.mtl b/examples/model-viewer/cube.mtl new file mode 100644 index 000000000..6634af823 --- /dev/null +++ b/examples/model-viewer/cube.mtl @@ -0,0 +1,8 @@ +newmtl Material +Ns 96.078431 +Ka 0.000000 0.000000 0.000000 +Kd 0.640000 0.640000 0.640000 +Ks 0.500000 0.500000 0.500000 +Ni 1.000000 +d 1.000000 +illum 2 diff --git a/examples/model-viewer/cube.obj b/examples/model-viewer/cube.obj new file mode 100644 index 000000000..7226aaa77 --- /dev/null +++ b/examples/model-viewer/cube.obj @@ -0,0 +1,24 @@ +mtllib cube.mtl +o Cube +v 1.000000 -1.000000 -1.000000 +v 1.000000 -1.000000 1.000000 +v -1.000000 -1.000000 1.000000 +v -1.000000 -1.000000 -1.000000 +v 1.000000 1.000000 -1.000000 +v 1.000000 1.000000 1.000000 +v -1.000000 1.000000 1.000000 +v -1.000000 1.000000 -1.000000 +vn 0.000000 -1.000000 0.000000 +vn 0.000000 1.000000 0.000000 +vn 1.000000 0.000000 0.000000 +vn 0.000000 0.000000 1.000000 +vn -1.000000 0.000000 0.000000 +vn 0.000000 0.000000 -1.000000 +usemtl Material +s off +f 1//1 2//1 3//1 4//1 +f 5//2 8//2 7//2 6//2 +f 1//3 5//3 6//3 2//3 +f 2//4 6//4 7//4 3//4 +f 3//5 7//5 8//5 4//5 +f 5//6 1//6 4//6 8//6 diff --git a/examples/model-viewer/main.cpp b/examples/model-viewer/main.cpp new file mode 100644 index 000000000..cd6b404ee --- /dev/null +++ b/examples/model-viewer/main.cpp @@ -0,0 +1,1618 @@ +// main.cpp + +// +// This example is much more involved than the `hello-world` example, +// so readers are encouraged to work through the simpler code first +// before diving into this application. We will gloss over parts of +// the code that are similar to the code in `hello-world`, and +// instead focus on the new code that is required to use Slang in +// more advanced ways. +// + +// We still need to include the Slang header to use the Slang API +// +#include <slang.h> + +// We will again make use of a simple graphics API abstraction +// layer, just to keep the examples short and to the point. 
+// +#include "gfx/model.h" +#include "gfx/render.h" +#include "gfx/render-d3d11.h" +#include "gfx/vector-math.h" +#include "gfx/window.h" +using namespace gfx; + +// We will use a few utilities from the C++ standard library, +// just to keep the code short. Note that the Slang API does +// not use or require any C++ standard library features. +// +#include <memory> +#include <vector> + +// A larger application will typically want to load/compile +// multiple modules/files of shader code. When using the +// Slang API, some one-time setup work can be amortized +// across multiple modules by using a single Slang +// "session" across multiple compiles. +// +// To that end, our application will use a function-`static` +// variable to create a session on demand and re-use it +// for the duration of the application. +// +SlangSession* getSlangSession() +{ + static SlangSession* slangSession = spCreateSession(NULL); + return slangSession; +} + +// This application is going to build its own layered +// application-specific abstractions on top of Slang, +// so it will have its own notion of a shader "module," +// which comprises the results of a Slang compilation, +// including the reflection information. +// +struct ShaderModule : RefObject +{ + // The file that the module was loaded from. + std::string inputPath; + + // Slang compile request and reflection data. + SlangCompileRequest* slangRequest; + slang::ShaderReflection* slangReflection; + + // Reference to the renderer, used to service requests + // that load graphics API objects based on the module. + RefPtr<gfx::Renderer> renderer; +}; +// +// In order to load a shader module from a `.slang` file on +// disk, we will use a Slang compile session, much like +// how the earlier Hello World example loaded shader code. +// +// We will point out major differences between the earlier +// example's `loadShaderProgram()` function, and how this function +// loads a module for reflection purposes. 
+// +RefPtr<ShaderModule> loadShaderModule(Renderer* renderer, char const* inputPath) +{ + auto slangSession = getSlangSession(); + SlangCompileRequest* slangRequest = spCreateCompileRequest(slangSession); + + // When *loading* the shader library, we will request that concrete + // kernel code *not* be generated, because the module might have + // unspecialized generic parameters. Instead, we will generate kernels + // on demand at runtime. + // + spSetCompileFlags( + slangRequest, + SLANG_COMPILE_FLAG_NO_CODEGEN); + + // The main logic for specifying target information and loading source + // code is the same as before with the notable change that we are *not* + // specifying specific vertex/fragment entry points to compile here. + // + // Instead, the `[shader(...)]` attributes used in `shaders.slang` will + // identify the entry points in the shader library to the compiler without + // specific action needing to be taken in the application. + // + int targetIndex = spAddCodeGenTarget(slangRequest, SLANG_DXBC); + spSetTargetProfile(slangRequest, targetIndex, spFindProfile(slangSession, "sm_4_0")); + int translationUnitIndex = spAddTranslationUnit(slangRequest, SLANG_SOURCE_LANGUAGE_SLANG, nullptr); + spAddTranslationUnitSourceFile(slangRequest, translationUnitIndex, inputPath); + int compileErr = spCompile(slangRequest); + if(auto diagnostics = spGetDiagnosticOutput(slangRequest)) + { + reportError("%s", diagnostics); + } + if(compileErr) + { + spDestroyCompileRequest(slangRequest); + spDestroySession(slangSession); + return nullptr; + } + auto slangReflection = (slang::ShaderReflection*) spGetReflection(slangRequest); + + // We will not destroy the Slang compile request here, because we want to + // keep it around to service reflection queries made from the application code. 
+ // + RefPtr<ShaderModule> module = new ShaderModule(); + module->renderer = renderer; + module->inputPath = inputPath; + module->slangRequest = slangRequest; + module->slangReflection = slangReflection; + return module; +} + +// Once a shader module has been loaded, it is possible to look up +// individual entry points by their name to get reflection information, +// including the stage for which the entry point was compiled. +// +// As with `ShaderModule` above, the `EntryPoint` type is the application's +// wrapper around a Slang entry point. In this case it caches the +// identity of the target stage as encoded for the graphics API. +// +struct EntryPoint : RefObject +{ + // Name of the entry point function + std::string name; + + // Stage targeted by the entry point (Slang version) + SlangStage slangStage; + + // Stage targeted by the entry point (graphics API version) + gfx::StageType apiStage; +}; +// +// Loading an entry point from a module is a straightforward +// application of the Slang reflection API. +// +RefPtr<EntryPoint> loadEntryPoint( + ShaderModule* module, + char const* name) +{ + auto slangReflection = module->slangReflection; + + // Look up the Slang entry point based on its name, and bail + // out with an error if it isn't found. + // + auto slangEntryPoint = slangReflection->findEntryPointByName(name); + if(!slangEntryPoint) return nullptr; + + // Extract the stage of the entry point using the Slang API, + // and then try to map it to the corresponding stage as + // exposed by the graphics API. + // + auto slangStage = slangEntryPoint->getStage(); + StageType apiStage = StageType::Unknown; + switch(slangStage) + { + default: + return nullptr; + + case SLANG_STAGE_VERTEX: apiStage = gfx::StageType::Vertex; break; + case SLANG_STAGE_FRAGMENT: apiStage = gfx::StageType::Fragment; break; + } + + // Allocate an application object to hold on to this entry point + // so that we can use it in later specialization steps. 
+ // + RefPtr<EntryPoint> entryPoint = new EntryPoint(); + entryPoint->name = name; + entryPoint->slangStage = slangEntryPoint->getStage(); + entryPoint->apiStage = apiStage; + return entryPoint; +} + +// In this application a `Program` represents a combination of entry +// points that will be used together (e.g., matching vertex and fragment +// entry points). +// +// Along with the entry points themselves, the `Program` object will +// cache information gleaned from Slang's reflection interface. Notably: +// +// * The number of `ParameterBlock`s that the program uses +// * Information about generic (type) parameters +// +struct Program : RefObject +{ + // The shader module that the program was loaded from. + RefPtr<ShaderModule> shaderModule; + + // The entry points that comprise the program + // (e.g., both a vertex and a fragment entry point). + std::vector<RefPtr<EntryPoint>> entryPoints; + + // The number of parameter blocks that are used by the shader + // program. This will be used by our rendering code later to + // decide how many descriptor set bindings should affect + // specialization/execution using this program. + // + int parameterBlockCount; + + // We will store information about the generic (type) parameters + // of the program. In particular, for each generic parameter + // we are going to find a parameter block that uses that + // generic type parameter. + // + // E.g., given input code like: + // + // type_param A; + // type_param B; + // + // ParameterBlock<B> x; // block 0 + // ParameterBlock<Foo> y; // block 1 + // ParameterBlock<A> z; // block 2 + // + // We would have two `GenericParam` entries. The first one, + // for `A`, would store a `parameterBlockIndex` of `2`, because + // `A` is used as the type of the `z` parameter block. + // + // This information will be used later when we want to specialize + // shader code, because if `z` is bound using a `ParameterBlock<Bar>` + // then we can infer that `A` should be bound to `Bar`. 
+ // + struct GenericParam + { + int parameterBlockIndex; + }; + std::vector<GenericParam> genericParams; +}; +// +// As with entry points, loading a program is done with +// the help of Slang's reflection API. +// +RefPtr<Program> loadProgram( + ShaderModule* module, + int entryPointCount, + const char* const* entryPointNames) +{ + auto slangReflection = module->slangReflection; + + RefPtr<Program> program = new Program(); + program->shaderModule = module; + + // We will loop over the entry point names that were requested, + // loading each and adding it to our program. + // + for(int ee = 0; ee < entryPointCount; ++ee) + { + auto entryPoint = loadEntryPoint(module, entryPointNames[ee]); + if(!entryPoint) + return nullptr; + program->entryPoints.push_back(entryPoint); + } + + // Next, we will look at the reflection information to see how + // many generic type parameters were declared, and allocate + // space in the `genericParams` array for them. + // + // We don't yet have enough information to fill in the + // `parameterBlockIndex` field. + // + auto genericParamCount = slangReflection->getTypeParameterCount(); + for(unsigned int pp = 0; pp < genericParamCount; ++pp) + { + auto slangGenericParam = slangReflection->getTypeParameterByIndex(pp); + + Program::GenericParam genericParam = {}; + program->genericParams.push_back(genericParam); + } + + // We want to specialize our shaders based on what gets bound + // in parameter blocks, so we will scan the shader parameters + // looking for `ParameterBlock<G>` where `G` is one of our + // generic type parameters. + // + // We do this by iterating over *all* the global shader parameters, + // and looking for those that happen to be parameter blocks, and + // of those the ones where the "element type" of the parameter block + // is a generic type parameter. 
+ // + auto paramCount = slangReflection->getParameterCount(); + int parameterBlockCounter = 0; + for(unsigned int pp = 0; pp < paramCount; ++pp) + { + auto slangParam = slangReflection->getParameterByIndex(pp); + + // Is it a parameter block? If not, skip it. + if(slangParam->getType()->getKind() != slang::TypeReflection::Kind::ParameterBlock) + continue; + + // Okay, we've found another parameter block, so we can compute its zero-based index. + int parameterBlockIndex = parameterBlockCounter++; + + // Get the element type of the parameter block, and if it isn't a generic type + // parameter, then skip it. + auto slangElementTypeLayout = slangParam->getTypeLayout()->getElementTypeLayout(); + if(slangElementTypeLayout->getKind() != slang::TypeReflection::Kind::GenericTypeParameter) + continue; + + // At this point we've found a `ParameterBlock<G>` where `G` is a `type_param`, + // so we can store the index of the parameter block back into our array of + // generic type parameter info. + // + auto genericParamIndex = slangElementTypeLayout->getGenericParamIndex(); + program->genericParams[genericParamIndex].parameterBlockIndex = parameterBlockIndex; + } + + // The above loop over the global shader parameters will have found all the + // parameter blocks that were specified in the shader code, so now we know + // how many parameter blocks are expected to be bound when this program is used. + // + program->parameterBlockCount = parameterBlockCounter; + + return program; +} +// +// As a convenience, we will define a simple wrapper around `loadProgram` for the case +// where we have just two entry points, since that is what the application actually uses. 
+// +RefPtr<Program> loadProgram(ShaderModule* module, char const* entryPoint0, char const* entryPoint1) +{ + char const* entryPointNames[] = { entryPoint0, entryPoint1 }; + return loadProgram(module, 2, entryPointNames); +} + +// The `ParameterBlock<T>` type is supported by the Slang language and compiler, +// but it is up to each application to map it down to whatever graphics API +// abstraction is most fitting. +// +// For our application, a parameter block will be implemented as a combination +// of Slang type reflection information (to determine the layout) plus a +// graphics API descriptor set object. +// +// Note: the example graphics API abstraction we are using exposes descriptor sets +// similar to those in Vulkan, and then maps these down to efficient alternatives +// on other APIs including D3D12, D3D11, and OpenGL. +// +// Every parameter block is allocated based on a particular layout, and we +// can share the same layout across multiple blocks: +// +struct ParameterBlockLayout : RefObject +{ + // The graphics API device that should be used to allocate parameter + // block instances. + // + RefPtr<gfx::Renderer> renderer; + + // The Slang type layout information that will be used to decide + // how much space is needed in instances of this layout. + // + // If the user declares a `ParameterBlock<Batman>` parameter, then + // this will be the type layout information for `Batman`. + // + slang::TypeLayoutReflection* slangTypeLayout; + + // The size of the "primary" constant buffer that will hold any + // "ordinary" (not-resource) fields in the `slangTypeLayout` above. + // + size_t primaryConstantBufferSize; + + // API-specific layout information computed from `slangTypeLayout`. + // + RefPtr<gfx::DescriptorSetLayout> descriptorSetLayout; +}; +// +// A parameter block layout can be computed for any `struct` type +// declared in the user's shader code. We extract the relevant +// information from the type using the Slang reflection API. 
+// +RefPtr<ParameterBlockLayout> getParameterBlockLayout( + ShaderModule* module, + char const* name) +{ + auto slangReflection = module->slangReflection; + auto renderer = module->renderer; + + // Look up the type with the given name, and bail out + // if no such type is found in the module. + // + auto type = slangReflection->findTypeByName(name); + if(!type) return nullptr; + + // Request layout information for the type. Note that a single + // type might be laid out differently for different compilation + // targets, or based on how it is used (e.g., as a `cbuffer` + // field vs. in a `StructuredBuffer`). + // + auto typeLayout = slangReflection->getTypeLayout(type); + if(!typeLayout) return nullptr; + + // If the type that is going in the parameter block has + // any ordinary data in it (as opposed to resources), then + // a constant buffer will be needed to hold that data. + // + // In turn any resource parameters would need to go into + // the descriptor set *after* this constant buffer. + // + size_t primaryConstantBufferSize = typeLayout->getSize(SLANG_PARAMETER_CATEGORY_UNIFORM); + + // We need to use the Slang reflection information to + // create a graphics-API-level descriptor-set layout that + // is compatible with the original declaration. + // + std::vector<gfx::DescriptorSetLayout::SlotRangeDesc> slotRanges; + + // If the type has any ordinary data, then the descriptor set + // will need a constant buffer to be the first thing it stores. + // + // Note: for a renderer only targeting D3D12, it might make + // sense to allocate this "primary" constant buffer as a root + // descriptor instead of inside the descriptor set (or at least + // do this *if* there are no non-uniform parameters). Policy + // decisions like that are up to the application, not Slang. + // This example application just does something simple. 
+ // + if(primaryConstantBufferSize) + { + slotRanges.push_back( + gfx::DescriptorSetLayout::SlotRangeDesc( + gfx::DescriptorSlotType::UniformBuffer)); + } + + // Next, the application will recursively walk + // the structure of `typeLayout` to figure out what resource + // binding ranges are required for the target API. + // + // TODO: This application doesn't yet use any resource parameters, + // so we are skipping this step, but it is obviously needed + // for a fully fleshed-out example. + + // Now that we've collected the graphics-API level binding + // information, we can construct a graphics API descriptor set + // layout. + gfx::DescriptorSetLayout::Desc descriptorSetLayoutDesc; + descriptorSetLayoutDesc.slotRangeCount = slotRanges.size(); + descriptorSetLayoutDesc.slotRanges = slotRanges.data(); + auto descriptorSetLayout = renderer->createDescriptorSetLayout(descriptorSetLayoutDesc); + if(!descriptorSetLayout) return nullptr; + + RefPtr<ParameterBlockLayout> parameterBlockLayout = new ParameterBlockLayout(); + parameterBlockLayout->renderer = renderer; + parameterBlockLayout->primaryConstantBufferSize = primaryConstantBufferSize; + parameterBlockLayout->slangTypeLayout = typeLayout; + parameterBlockLayout->descriptorSetLayout = descriptorSetLayout; + return parameterBlockLayout; +} + +// A `ParameterBlock` abstracts over the allocated storage +// for a descriptor set, based on some `ParameterBlockLayout` +// +struct ParameterBlock : RefObject +{ + // The graphics API device used to allocate this block. + RefPtr<gfx::Renderer> renderer; + + // The associated parameter block layout. + RefPtr<ParameterBlockLayout> layout; + + // The (optional) constant buffer that holds the values + // for any ordinary fields. This will be null if + // `layout->primaryConstantBufferSize` is zero. + RefPtr<BufferResource> primaryConstantBuffer; + + // The graphics-API descriptor set that provides storage + // for any resource fields. 
+ RefPtr<gfx::DescriptorSet> descriptorSet; + + // Map/unmap operations are provided to access the + // contents of the primary constant buffer. + void* map(); + void unmap(); + + // As a convenience, `pb->mapAs<X>()` may be used as + // a declaration of intent, instead of `(X*) pb->map()` + template<typename T> + T* mapAs() { return (T*)map(); } +}; + +// Allocating a parameter block is mostly a matter of allocating +// the required graphics API objects. +// +RefPtr<ParameterBlock> allocateParameterBlockImpl( + ParameterBlockLayout* layout) +{ + auto renderer = layout->renderer; + + // A descriptor set is then used to provide the storage for all + // resource parameters (including the primary constant buffer, if any). + // + auto descriptorSet = renderer->createDescriptorSet( + layout->descriptorSetLayout); + + // If the parameter block has any ordinary data, then it requires + // a "primary" constant buffer to hold that data. + // + RefPtr<gfx::BufferResource> primaryConstantBuffer = nullptr; + if(auto primaryConstantBufferSize = layout->primaryConstantBufferSize) + { + gfx::BufferResource::Desc bufferDesc; + bufferDesc.init(primaryConstantBufferSize); + bufferDesc.setDefaults(gfx::Resource::Usage::ConstantBuffer); + bufferDesc.cpuAccessFlags = gfx::Resource::AccessFlag::Write; + primaryConstantBuffer = renderer->createBufferResource( + gfx::Resource::Usage::ConstantBuffer, + bufferDesc); + + // The primary constant buffer will always be the first thing + // stored in the descriptor set for a parameter block. + // + descriptorSet->setConstantBuffer(0, 0, primaryConstantBuffer); + } + + // Now that we've allocated the graphics API objects, we can just + // allocate our application-side wrapper object to tie everything + // together.
+ // + RefPtr<ParameterBlock> parameterBlock = new ParameterBlock(); + parameterBlock->renderer = renderer; + parameterBlock->layout = layout; + parameterBlock->primaryConstantBuffer = primaryConstantBuffer; + parameterBlock->descriptorSet = descriptorSet; + return parameterBlock; +} + +// A full-featured high-performance application would likely draw +// a distinction between "persistent" parameter blocks that are +// filled in once and then used over many frames, and "transient" +// blocks that are allocated, filled in, and discarded within +// a single frame. +// +// These two cases warrant very different allocation strategies, +// but for now we are using the same logic in both cases. +// +RefPtr<ParameterBlock> allocatePersistentParameterBlock( + ParameterBlockLayout* layout) +{ + return allocateParameterBlockImpl(layout); +} +RefPtr<ParameterBlock> allocateTransientParameterBlock( + ParameterBlockLayout* layout) +{ + return allocateParameterBlockImpl(layout); +} + +// As described earlier, it is convenient to be able +// to easily map the primary constant buffer of a parameter +// block, since this will hold the values for any ordinary fields. +// +void* ParameterBlock::map() +{ + return renderer->map( + primaryConstantBuffer, + MapFlavor::WriteDiscard); +} +void ParameterBlock::unmap() +{ + renderer->unmap(primaryConstantBuffer); +} + +// Our application code has a rudimentary material system, +// to match the `IMaterial` abstraction used in the shader code. +// +struct Material : RefObject +{ + // The key feature of a material in our application is that + // it can provide a parameter block that describes it and + // its parameters. The contents of the parameter block will + // be any colors, textures, etc. that the material needs, + // while the Slang type that was used to allocate the + // block will be an implementation of `IMaterial` that + // provides the evaluation logic for the material.
+ + // Each subclass of `Material` will provide a routine to + // create a parameter block of its chosen type/layout. + virtual RefPtr<ParameterBlock> createParameterBlock() = 0; + + // The parameter block for a material will be stashed here + // after it is created. + RefPtr<ParameterBlock> parameterBlock; +}; + +// For now we have only a single implementation of `Material`, +// which corresponds to the `SimpleMaterial` type in our shader +// code. +// +struct SimpleMaterial : Material +{ + // The `SimpleMaterial` shader type has only uniform data, + // so we declare a `struct` type for that data here. + struct Uniforms + { + glm::vec3 diffuseColor; + float pad; + }; + Uniforms uniforms; + + // When asked to create a parameter block, the `SimpleMaterial` + // type will allocate a block based on the corresponding + // shader type, and fill it in based on the data in the C++ + // object. + // + RefPtr<ParameterBlock> createParameterBlock() override + { + auto parameterBlockLayout = gParameterBlockLayout; + auto parameterBlock = allocatePersistentParameterBlock( + parameterBlockLayout); + + if(auto u = parameterBlock->mapAs<Uniforms>()) + { + *u = uniforms; + parameterBlock->unmap(); + } + + return parameterBlock; + } + + // We cache the corresponding parameter block layout for + // `SimpleMaterial` in a static variable so that we don't + // load it more than once. + // + static RefPtr<ParameterBlockLayout> gParameterBlockLayout; +}; +RefPtr<ParameterBlockLayout> SimpleMaterial::gParameterBlockLayout; + +// With the `Material` abstraction defined, we can go on to define +// the representation for loaded models that we will use. +// +// A `Model` will own vertex/index buffers, along with a list of meshes, +// while each `Mesh` will own a material and a range of indices. +// For this example we will be loading models from `.obj` files, but +// that is just a simple lowest-common-denominator choice. 
+// +struct Mesh : RefObject +{ + RefPtr<Material> material; + int firstIndex; + int indexCount; +}; +struct Model : RefObject +{ + typedef ModelLoader::Vertex Vertex; + + RefPtr<BufferResource> vertexBuffer; + RefPtr<BufferResource> indexBuffer; + PrimitiveTopology primitiveTopology; + int vertexCount; + int indexCount; + std::vector<RefPtr<Mesh>> meshes; +}; +// +// Loading a model from disk is done with the help of some utility +// code for parsing the `.obj` file format, so that the application +// mostly just registers some callbacks to allocate the objects +// used for its representation. +// +RefPtr<Model> loadModel( + Renderer* renderer, + char const* inputPath, + ModelLoader::LoadFlags loadFlags = 0, + float scale = 1.0f) +{ + // The model loading interface uses a C++ interface of + // callback functions to handle creating the application-specific + // representation of meshes, materials, etc. + // + struct Callbacks : ModelLoader::ICallbacks + { + void* createMaterial(MaterialData const& data) override + { + SimpleMaterial* material = new SimpleMaterial(); + material->uniforms.diffuseColor = data.diffuseColor; + + material->parameterBlock = material->createParameterBlock(); + + return material; + } + + void* createMesh(MeshData const& data) override + { + Mesh* mesh = new Mesh(); + mesh->firstIndex = data.firstIndex; + mesh->indexCount = data.indexCount; + mesh->material = (Material*)data.material; + return mesh; + } + + void* createModel(ModelData const& data) override + { + Model* model = new Model(); + model->vertexBuffer = data.vertexBuffer; + model->indexBuffer = data.indexBuffer; + model->primitiveTopology = data.primitiveTopology; + model->vertexCount = data.vertexCount; + model->indexCount = data.indexCount; + + int meshCount = data.meshCount; + for(int ii = 0; ii < meshCount; ++ii) + model->meshes.push_back((Mesh*)data.meshes[ii]); + + return model; + } + }; + Callbacks callbacks; + + // We instantiate a model loader object and then use it to +
// try and load a model from the chosen path. + // + ModelLoader loader; + loader.renderer = renderer; + loader.loadFlags = loadFlags; + loader.scale = scale; + loader.callbacks = &callbacks; + Model* model = nullptr; + if(SLANG_FAILED(loader.load(inputPath, (void**)&model))) + { + log("failed to load '%s'\n", inputPath); + return nullptr; + } + + return model; +} + +// The core of our application's rendering abstraction is +// the notion of an "effect," which ties together a particular +// set of shader entry points (as a `Program`), with graphics +// API state objects for the fixed-function parts of the pipeline. +// +// Note that the program here is an *unspecialized* program, +// which might have unbound global `type_param`s. Thus the +// `Effect` type here is not one-to-one with a "pipeline state +// object," because the same effect could be used to instantiate +// multiple pipeline state objects based on how things get +// specialized. +// +struct Effect : RefObject +{ + // The shader program entry point(s) to execute + RefPtr<Program> program; + + // Additional state corresponding to the data needed + // to create a graphics-API pipeline state object. + RefPtr<gfx::InputLayout> inputLayout; + Int renderTargetCount; +}; + +// In order to render using the `Effect` abstraction, our +// application will be creating various specialized +// shader kernels and pipeline states on-demand. +// +// We'll start with the representation of a specialized +// "variant" of an effect. +// +struct EffectVariant : RefObject +{ + // The graphics API pipeline layout and state + // that need to be bound in order to use this + // effect. + // + RefPtr<gfx::PipelineLayout> pipelineLayout; + RefPtr<gfx::PipelineState> pipelineState; +}; +// +// A specialized variant is created based on a base effect +// and the types that will be bound to its parameter blocks. 
+// +RefPtr<EffectVariant> createEffectVaraint( + Effect* effect, + UInt parameterBlockCount, + ParameterBlockLayout* const* parameterBlockLayouts) +{ + // One note to make at the very start is that the creation + // of a specialized variant is based on the *layout* of + // the parameter blocks in use and not on the particular + // parameter blocks themselves. This is important because + // it means that, e.g., two materials that use the same code, + // but different parameter values (different textures, colors, + // etc.) do *not* require switching between different + // shader code or specialized PSOs. + + // We'll start by extracting some of the pieces of + // information that we need into local variables, + // just to simplify the remaining code. + // + auto program = effect->program; + auto shaderModule = program->shaderModule; + auto renderer = shaderModule->renderer; + + // Our specialized effect is going to need a few things: + // + // 1. A specialized pipeline layout, based on the layout + // of the bound parameter blocks. + // + // 2. Specialized shader kernels, based on "plugging in" + // the parameter block types for generic type parameters + // as needed. + // + // 3. A specialized pipeline state object that ties the + // above items together with the fixed-function state + // already specified in the effect. + // + // We will now go through these steps in order. + + // (1) The pipeline layout (aka D3D12 "root signature") will + // be determined based on the descriptor-set layouts + // already cached in the given parameter block layouts.
+ // + std::vector<PipelineLayout::DescriptorSetDesc> descriptorSets; + for(UInt pp = 0; pp < parameterBlockCount; ++pp) + { + descriptorSets.emplace_back( + parameterBlockLayouts[pp]->descriptorSetLayout); + } + PipelineLayout::Desc pipelineLayoutDesc; + pipelineLayoutDesc.renderTargetCount = 1; + pipelineLayoutDesc.descriptorSetCount = descriptorSets.size(); + pipelineLayoutDesc.descriptorSets = descriptorSets.data(); + auto pipelineLayout = renderer->createPipelineLayout(pipelineLayoutDesc); + + // (2) The final shader kernels to bind will be computed + // from the kernels we extracted into an application `EntryPoint` + // plus the types of the bound parameter blocks, as needed. + // + // We will "infer" a type argument for each of the generic + // parameters of our shader program by looking for a + // parameter block that is declared using that generic + // type. + // + std::vector<const char*> genericArgs; + for(auto gp : program->genericParams) + { + int parameterBlockIndex = gp.parameterBlockIndex; + auto typeName = parameterBlockLayouts[parameterBlockIndex]->slangTypeLayout->getName(); + genericArgs.push_back(typeName); + } + + // Now that we are ready to generate specialized shader code, + // we will invoke the Slang compiler again. This time we leave + // full code generation turned on, and we also specify the + // entry points that we want explicitly (so that we don't + // generate code for any other entry points).
+ // + auto slangSession = getSlangSession(); + SlangCompileRequest* slangRequest = spCreateCompileRequest(slangSession); + int targetIndex = spAddCodeGenTarget(slangRequest, SLANG_DXBC); + spSetTargetProfile(slangRequest, targetIndex, spFindProfile(slangSession, "sm_4_0")); + int translationUnitIndex = spAddTranslationUnit(slangRequest, SLANG_SOURCE_LANGUAGE_SLANG, nullptr); + spAddTranslationUnitSourceFile(slangRequest, translationUnitIndex, program->shaderModule->inputPath.c_str()); + + int entryPointCont = program->entryPoints.size(); + for(int ii = 0; ii < entryPointCont; ++ii) + { + auto entryPoint = program->entryPoints[ii]; + + // We are using the `spAddEntryPointEx` API so that we + // can specify the type names to use for the generic + // type parameters of the program. + // + spAddEntryPointEx( + slangRequest, + translationUnitIndex, + entryPoint->name.c_str(), + entryPoint->slangStage, + genericArgs.size(), + genericArgs.data()); + } + + // We expect compilation to go through without a hitch, because the + // code was already statically checked back in `loadShaderModule()`. + // It is still possible for errors to arise if, e.g., the application + // tries to specialize code based on a type that doesn't implement + // a required interface. + // + int compileErr = spCompile(slangRequest); + if(auto diagnostics = spGetDiagnosticOutput(slangRequest)) + { + reportError("%s", diagnostics); + } + if(compileErr) + { + spDestroyCompileRequest(slangRequest); + assert(!"unexpected"); + return nullptr; + } + + // Once compilation is done we can extract the kernel code + // for each of the entry points, and set them up for passing + // to the graphics API's loading logic.
+ // + std::vector<ISlangBlob*> kernelBlobs; + std::vector<gfx::ShaderProgram::KernelDesc> kernelDescs; + for(int ii = 0; ii < entryPointCont; ++ii) + { + auto entryPoint = program->entryPoints[ii]; + + ISlangBlob* blob = nullptr; + spGetEntryPointCodeBlob(slangRequest, ii, 0, &blob); + + kernelBlobs.push_back(blob); + + ShaderProgram::KernelDesc kernelDesc; + + char const* codeBegin = (char const*) blob->getBufferPointer(); + char const* codeEnd = codeBegin + blob->getBufferSize(); + + kernelDesc.stage = entryPoint->apiStage; + kernelDesc.codeBegin = codeBegin; + kernelDesc.codeEnd = codeEnd; + + kernelDescs.push_back(kernelDesc); + } + + // Once we've extracted the "blobs" of compiled code, + // we are done with the Slang compilation request. + // + // Note that all of our reflection was performed on the unspecialized + // shader code at load time, but we know that information is still + // applicable to specialized kernels because of the guarantees + // the Slang compiler makes about type layout. + // + spDestroyCompileRequest(slangRequest); + + // We use the graphics API to load a program into the GPU + gfx::ShaderProgram::Desc programDesc; + programDesc.pipelineType = gfx::PipelineType::Graphics; + programDesc.kernels = kernelDescs.data(); + programDesc.kernelCount = kernelDescs.size(); + auto specializedProgram = renderer->createProgram(programDesc); + + // Then we unload our "blobs" of kernel code once the graphics + // API is done with their data. + // + for(auto blob : kernelBlobs) + { + blob->release(); + } + + // (3) We construct a full graphics API pipeline state + // object that combines our new program and pipeline layout + // with the other state objects from the `Effect`.
+ // + gfx::GraphicsPipelineStateDesc pipelineStateDesc = {}; + pipelineStateDesc.program = specializedProgram; + pipelineStateDesc.pipelineLayout = pipelineLayout; + pipelineStateDesc.inputLayout = effect->inputLayout; + pipelineStateDesc.renderTargetCount = effect->renderTargetCount; + auto pipelineState = renderer->createGraphicsPipelineState(pipelineStateDesc); + + RefPtr<EffectVariant> variant = new EffectVariant(); + variant->pipelineLayout = pipelineLayout; + variant->pipelineState = pipelineState; + return variant; +} + +// A more advanced application might add logic to +// pre-populate the shader cache with shader variants +// that were compiled offline. +// +struct ShaderCache : RefObject +{ + struct VariantKey + { + Effect* effect; + UInt parameterBlockCount; + ParameterBlockLayout* parameterBlockLayouts[8]; + + // In order to be used as a hash-table key, our + // variant key representation must support + // equality comparison and a matching hashing function. + + bool operator==(VariantKey const& other) const + { + if(effect != other.effect) return false; + if(parameterBlockCount != other.parameterBlockCount) return false; + for( UInt ii = 0; ii < parameterBlockCount; ++ii ) + { + if(parameterBlockLayouts[ii] != other.parameterBlockLayouts[ii]) return false; + } + return true; + } + + UInt GetHashCode() const + { + auto hash = ::GetHashCode(effect); + hash = combineHash(hash, ::GetHashCode(parameterBlockCount)); + for( UInt ii = 0; ii < parameterBlockCount; ++ii ) + { + hash = combineHash(hash, ::GetHashCode(parameterBlockLayouts[ii])); + } + return hash; + } + }; + + // The shader cache is mostly just a dictionary mapping + // variant keys to the associated variant, generated on-demand.
+ // + // TODO: A more advanced application might support removing + // entries from the shader cache when effects get unloaded, + // or in order to respond to operations like a "hot reload" + // key in a development build (e.g., just clear the + // cache of variants and allow the ordinary loading logic + // to re-populate it). + // + Dictionary<VariantKey, RefPtr<EffectVariant> > variants; + + // Getting a variant is just a matter of looking for an + // existing entry in the dictionary, and creating one + // on demand in case of a miss. + // + RefPtr<EffectVariant> getEffectVariant( + VariantKey const& key) + { + RefPtr<EffectVariant> variant; + if(variants.TryGetValue(key, variant)) + return variant; + + variant = createEffectVaraint( + key.effect, + key.parameterBlockCount, + key.parameterBlockLayouts); + + variants.Add(key, variant); + return variant; + } +}; + + +// In order to render using the `Effect` abstraction, our +// application will use its own rendering context type +// to manage the state that it is binding. This layer +// performs a small amount of shadowing on top of the +// underlying graphics API. +// +// Note: for the purposes of our examples the "graphics API" +// is a cross-platform abstraction over multiple APIs, but +// we do not actually advocate that real applications should +// be built in terms of distinct layers for cross-platform +// GPU API abstraction and "effect" state management. +// +// A high-performance application built on top of this approach +// would instead implement the concepts like `ParameterBlock` +// and `RenderContext` on a per-API basis, making use of +// whatever is most efficient on that API without any +// additional abstraction layers in between. +// +// We've done things differently in this example program in +// order to avoid getting bogged down in the specifics of +// any one GPU API. +// +// With that disclaimer out of the way, let's talk through +// the `RenderContext` type in this application.
+// +struct RenderContext +{ +private: + // The `RenderContext` type is used to wrap the graphics + // API "context" or "command list" type for submission. + // Our current abstraction layer lumps this all together + // with the "device." + // + RefPtr<gfx::Renderer> renderer; + + // We also retain a pointer to the shader cache, which + // will be used to implement lookup of the right + // effect variant to execute based on bound parameter + // blocks. + // + RefPtr<ShaderCache> shaderCache; + + // We will establish a small upper bound on how many + // parameter blocks can be used simultaneously. In + // practice, most shaders won't need more than about + // four parameter blocks, and attempting to use more + // than that under Vulkan can cause portability issues. + // + enum { kMaxParameterBlocks = 8 }; + + // The overall "state" of the rendering context consists of: + // + // * The currently selected "effect" + // * The parameter blocks that are used to specialize and + // provide parameters for that effect. + // + RefPtr<Effect> effect; + RefPtr<ParameterBlock> parameterBlocks[kMaxParameterBlocks]; + + // Along with the retained state above, we also store + // state in exactly the form required for looking up + // an effect variant in our shader cache, to minimize + // the work that needs to be done when looking up state. + // + ShaderCache::VariantKey variantKey; + + // When state gets changed, we track a few dirty flags rather than + // flush changes to the GPU right away. + + // Tracks whether any state has changed in a way that requires computing + // and binding a new GPU pipeline state object (PSO). + // + // E.g., changing the current effect would set this flag, but changing + // a parameter block binding to one with a new layout would also set the flag. + bool pipelineStateDirty = true; + + // The `minDirtyBlockBinding` value tracks the lowest-numbered parameter + // block binding that needs to be flushed to the GPU.
That is, if + // parameter blocks [0,N) have been bound to the GPU, and then the user + // tries to set block K, then the range [0,K) will be left alone, + // while the range [K,N) needs to be set again. + // + // This is an optimization that can be exploited on the Vulkan API + // (and potentially others) if switching pipeline layouts doesn't invalidate + // all currently-bound descriptor sets. + // + int minDirtyBlockBinding = 0; + + // Finally, we cache the specialized effect variant that has been + // most recently bound to the GPU state, so that we can use the + // information it stores (specifically the pipeline layout) when + // binding descriptor sets. + // + RefPtr<EffectVariant> currentEffectVariant; + +public: + // Initializing a render context just sets its pointer to the GPU API device + RenderContext( + gfx::Renderer* renderer, + ShaderCache* shaderCache) + : renderer(renderer) + , shaderCache(shaderCache) + {} + + void setEffect( + Effect* inEffect) + { + // Bail out if nothing is changing. + if( inEffect == effect ) + return; + + effect = inEffect; + variantKey.effect = effect; + variantKey.parameterBlockCount = effect->program->parameterBlockCount; + + // Binding a new effect invalidates the current state object, since + // it will be a specialization of some other effect. + // + pipelineStateDirty = true; + } + + void setParameterBlock( + int index, + ParameterBlock* parameterBlock) + { + // Bail out if nothing is changing. + if(parameterBlock == parameterBlocks[index]) + return; + + parameterBlocks[index] = parameterBlock; + + // This parameter block needs to be bound to the GPU, and any + // parameter blocks after it in the list will also get re-bound + // (even if they haven't changed). This is a reasonable choice + // if parameter blocks are ordered based on expected frequency + // of update (so that lower-numbered blocks change less often).
+ // + minDirtyBlockBinding = std::min(index, minDirtyBlockBinding); + + // Next, check if the layout for the block we just bound + // is different than the one that was in place before, + // as stored in the "variant key" + // + auto layout = parameterBlock->layout; + if(layout.Ptr() == variantKey.parameterBlockLayouts[index]) + return; + + variantKey.parameterBlockLayouts[index] = layout; + + // Changing the layout of a parameter block (which includes + // the underlying Slang type) requires computing a new + // pipeline state object, because it may lead to differently + // specialized code being generated. + // + pipelineStateDirty = true; + } + + void flushState() + { + // The `flushState()` operation must be used by the application + // any time it binds a different effect or parameter block(s), + // to ensure that the GPU state is fully configured for rendering. + // It is thus important that this function do as little work + // as possible, especially in the common case where state + // doesn't actually need to change. + // + // The first check we do is to see if any change might require + // a different set of shader kernels. + // + if(pipelineStateDirty) + { + pipelineStateDirty = false; + + // Almost all of the logic for retrieving or creating + // a new pipeline state with specialized kernels is + // handled by our shader cache. + // + // In the common case, the desired variant will already + // be present in the cache, and this function returns + // without much effort. + // + auto variant = shaderCache->getEffectVariant(variantKey); + + // In order to adapt to a change in shader variant, + // we simply bind its PSO into the GPU state, and + // remember the variant we've selected. + // + renderer->setPipelineState(PipelineType::Graphics, variant->pipelineState); + currentEffectVariant = variant; + } + + // Even if the current pipeline state was fine, we may need to + // bind one or more descriptor sets. 
We do this by walking + // from our lowest-numbered "dirty" set up to the number + // of sets expected by the current effect and binding them. + // + // If `minDirtyBlockBinding` is greater than or equal to the + // `parameterBlockCount` of the currently bound effect, then + // this will be a no-op. + // + // The common case in a tight drawing loop will be that only + // the last block will be dirty, and we will only execute + // one iteration of this loop. + // + auto program = effect->program; + auto parameterBlockCount = program->parameterBlockCount; + auto pipelineLayout = currentEffectVariant->pipelineLayout; + for(int ii = minDirtyBlockBinding; ii < parameterBlockCount; ++ii) + { + renderer->setDescriptorSet( + PipelineType::Graphics, + pipelineLayout, + ii, + parameterBlocks[ii]->descriptorSet); + } + minDirtyBlockBinding = parameterBlockCount; + } +}; + +// We will again structure our example application as a C++ `struct`, +// so that we can scope its allocations for easy cleanup, rather than +// use global variables. +// +struct ModelViewer { + +Window* gWindow; +RefPtr<gfx::Renderer> gRenderer; +RefPtr<gfx::ResourceView> gDepthTarget; + +// We keep a pointer to the one effect we are using (for a forward +// rendering pass), plus the parameter-block layouts for our `PerView` +// and `PerModel` shader types. +// +RefPtr<Effect> gEffect; +RefPtr<ParameterBlockLayout> gPerViewParameterBlockLayout; +RefPtr<ParameterBlockLayout> gPerModelParameterBlockLayout; + +RefPtr<ShaderCache> shaderCache; + +// Most of the application state is stored in the list of loaded models. +// +std::vector<RefPtr<Model>> gModels; + +// During startup the application will load one or more models and +// add them to the `gModels` list. 
+// +void loadAndAddModel( + char const* inputPath, + ModelLoader::LoadFlags loadFlags = 0, + float scale = 1.0f) +{ + auto model = loadModel(gRenderer, inputPath, loadFlags, scale); + if(!model) return; + gModels.push_back(model); +} + +int gWindowWidth = 1024; +int gWindowHeight = 768; + +// For this more complex example we will be passing multiple +// parameter blocks into the shader code, and each will +// need its own `struct` type to define the layout of the +// uniform data. +// +struct PerView +{ + glm::mat4x4 viewProjection; + + glm::vec3 lightDir; + float pad0; + + glm::vec3 lightColor; + float pad1; +}; +struct PerModel +{ + glm::mat4x4 modelTransform; + glm::mat4x4 inverseTransposeModelTransform; +}; + +// The overall initialization logic is quite similar to +// the earlier example. The biggest difference is that we +// create instances of our application-specific parameter +// block layout and effect types instead of just creating +// raw graphics API objects. +// +Result initialize() +{ + WindowDesc windowDesc; + windowDesc.title = "Model Viewer"; + windowDesc.width = gWindowWidth; + windowDesc.height = gWindowHeight; + gWindow = createWindow(windowDesc); + + gRenderer = createD3D11Renderer(); + Renderer::Desc rendererDesc; + rendererDesc.width = gWindowWidth; + rendererDesc.height = gWindowHeight; + gRenderer->initialize(rendererDesc, getPlatformWindowHandle(gWindow)); + + InputElementDesc inputElements[] = { + {"POSITION", 0, Format::RGB_Float32, offsetof(Model::Vertex, position) }, + {"NORMAL", 0, Format::RGB_Float32, offsetof(Model::Vertex, normal) }, + {"UV", 0, Format::RG_Float32, offsetof(Model::Vertex, uv) }, + }; + auto inputLayout = gRenderer->createInputLayout( + &inputElements[0], + 3); + if(!inputLayout) return SLANG_FAIL; + + // Because we are rendering more than a single triangle this time, we + // require a depth buffer to resolve visibility.
+ // + TextureResource::Desc depthBufferDesc = gRenderer->getSwapChainTextureDesc(); + depthBufferDesc.format = Format::D_Float32; + depthBufferDesc.setDefaults(Resource::Usage::DepthWrite); + auto depthTexture = gRenderer->createTextureResource( + Resource::Usage::DepthWrite, + depthBufferDesc); + if(!depthTexture) return SLANG_FAIL; + + ResourceView::Desc textureViewDesc; + textureViewDesc.type = ResourceView::Type::DepthStencil; + auto depthTarget = gRenderer->createTextureView(depthTexture, textureViewDesc); + if (!depthTarget) return SLANG_FAIL; + + gDepthTarget = depthTarget; + + // Unlike the earlier example, we will not generate final shader kernel + // code during initialization. Instead, we simply load the shader module + // so that we can perform reflection and allocate resources. + // + auto shaderModule = loadShaderModule(gRenderer, "shaders.slang"); + if(!shaderModule) return SLANG_FAIL; + + // Once the shader code has been loaded, we can look up types declared + // in the shader code by name and perform reflection on them to determine + // parameter block layouts, etc. + // + // A more advanced application might load this information on-demand + // and potentially tie into an application-level reflection system + // that already knows the string names of its types (e.g., to connect + // the `PerView` type in shader code to the `PerView` type declared + // in the application code). + // + gPerViewParameterBlockLayout = getParameterBlockLayout( + shaderModule, "PerView"); + gPerModelParameterBlockLayout = getParameterBlockLayout( + shaderModule, "PerModel"); + // + // Note how we are able to load the type definition for `SimpleMaterial` + // from the Slang shader module even though the `SimpleMaterial` type + // is not actually *used* by any entry point in the file. 
+ // + SimpleMaterial::gParameterBlockLayout = getParameterBlockLayout( + shaderModule, "SimpleMaterial"); + + // We also load a shader program based on vertex/fragment shaders in our + // module, and then use this to create an application-level effect. + // + // Note that the `loadProgram` operation here does *not* invoke any + // Slang compilation, because the shader module was already completely + // parsed, checked, etc. by the logic in `loadShaderModule()` above. + // + auto program = loadProgram(shaderModule, "vertexMain", "fragmentMain"); + if(!program) return SLANG_FAIL; + + RefPtr<Effect> effect = new Effect(); + effect->program = program; + effect->inputLayout = inputLayout; + effect->renderTargetCount = 1; + gEffect = effect; + + // In order to create specialized variants of the effect(s) that + // get used for rendering, we will use a shader cache. + // + shaderCache = new ShaderCache(); + + // Once we have created all our graphics API and application resources, + // we can start to load models. For now we are keeping things extremely + // simple by using a trivial `.obj` file that can be checked into source + // control. + // + // Support for loading more interesting/complex models will be added + // to this example over time (although model loading is *not* the focus). + // + loadAndAddModel("cube.obj"); + + showWindow(gWindow); + + return SLANG_OK; +} + +// With the setup work done, we can look at the per-frame rendering +// logic to see how the application will drive the `RenderContext` +// type to perform both shader parameter binding and code specialization. +// +void renderFrame() +{ + // In order to see that things are rendering properly we need some + // kind of animation, so we will compute a crude delta-time value here.
+ // + static uint64_t lastTime = getCurrentTime(); + uint64_t currentTime = getCurrentTime(); + float deltaTime = float(currentTime - lastTime) / float(getTimerFrequency()); + lastTime = currentTime; + + // We will use the GLM library to do the matrix math required + // to set up our various transformation matrices. + // + glm::mat4x4 identity = glm::mat4x4(1.0f); + + glm::mat4x4 projection = glm::perspective( + glm::radians(60.0f), + float(gWindowWidth) / float(gWindowHeight), + 0.1f, + 1000.0f); + + glm::mat4x4 view = identity; + view = translate(view, glm::vec3(0, 0, -5)); + + glm::mat4x4 viewProjection = projection * view; + + // We set up a light source with a simple animation applied + // to its direction. + // + glm::vec3 lightDir = normalize(glm::vec3(10, 10, -10)); + glm::vec3 lightColor = glm::vec3(1, 1, 1); + static float angle = 0.0f; + angle += 0.5f * deltaTime; + glm::mat4x4 lightTransform = identity; + lightTransform = rotate(lightTransform, angle, glm::vec3(0, 1, 0)); + lightDir = glm::vec3(lightTransform * glm::vec4(lightDir, 0)); + + // Some of the basic rendering setup is identical to the previous example. + // + static const float kClearColor[] = { 0.25, 0.25, 0.25, 1.0 }; + gRenderer->setClearColor(kClearColor); + gRenderer->clearFrame(); + gRenderer->setDepthStencilTarget(gDepthTarget); + gRenderer->setPrimitiveTopology(PrimitiveTopology::TriangleList); + + // Now we will start in on the more interesting rendering logic, + // by creating the `RenderContext` we will use for submission. + // + // Note: in a multi-threaded submission case, the application would + // need to use a distinct `RenderContext` on each thread. + // + RenderContext context(gRenderer, shaderCache); + + // Next we set the effect that we will use for our forward rendering + // pass. Note that an example with multiple passes would use a + // distinct effect for each pass. 
+ // + context.setEffect(gEffect); + + // We are only rendering one view, so we can fill in a per-view + // parameter block once and use it across all draw calls. + // This parameter block will be different every frame, so we + // allocate a transient parameter block rather than try to + // carefully track and re-use an allocation. + // + auto viewParameterBlock = allocateTransientParameterBlock( + gPerViewParameterBlockLayout); + if(auto perView = viewParameterBlock->mapAs<PerView>()) + { + perView->viewProjection = viewProjection; + perView->lightDir = lightDir; + perView->lightColor = lightColor; + + viewParameterBlock->unmap(); + } + // + // Note: the assignment of indices to parameter blocks is driven + // by their order of declaration in the shader code, so we know + // that the per-view parameter block has index zero. Alternatively, + // an application could use reflection API operations to look up + // the index of a parameter block based on its name. + // + context.setParameterBlock(0, viewParameterBlock); + + // The majority of our rendering logic is handled as a loop + // over the models in the scene, and their meshes. + // + for(auto& model : gModels) + { + gRenderer->setVertexBuffer(0, model->vertexBuffer, sizeof(Model::Vertex)); + gRenderer->setIndexBuffer(model->indexBuffer, Format::R_UInt32); + + // For each model we provide a parameter + // block that holds the per-model transformation + // parameters, corresponding to the `PerModel` type + // in the shader code. + // + // Like the view parameter block, it makes sense + // to allocate this block as a transient allocation, + // since its contents would be different on the next + // frame anyway. 
+ // + glm::mat4x4 modelTransform = identity; + glm::mat4x4 inverseTransposeModelTransform = inverse(transpose(modelTransform)); + + auto modelParameterBlock = allocateTransientParameterBlock( + gPerModelParameterBlockLayout); + if(auto perModel = modelParameterBlock->mapAs<PerModel>()) + { + perModel->modelTransform = modelTransform; + perModel->inverseTransposeModelTransform = inverseTransposeModelTransform; + + modelParameterBlock->unmap(); + } + context.setParameterBlock(1, modelParameterBlock); + + // Now we loop over the meshes in the model. + // + // A more advanced rendering loop would sort things by material + // rather than by model, to avoid overly frequent state changes. + // We are just doing something simple for the purposes of an + // example program. + // + for(auto& mesh : model->meshes) + { + // Each mesh has a material, and each material has its own + // parameter block that was created at load time, so we + // can just re-use the persistent parameter block for the + // chosen material. + // + // Note that binding the material parameter block here is + // both selecting the values to use for various material + // parameters as well as the *code* to use for material + // evaluation (based on the concrete shader type that + // is implementing the `IMaterial` interface). + // + context.setParameterBlock( + 2, + mesh->material->parameterBlock); + + // Once we've set up all the parameter blocks needed + // for a given drawing operation, we need to flush + // any pending state changes (e.g., if the type of + // material changed, a shader switch might be + // required). 
+ // + context.flushState(); + + gRenderer->drawIndexed(mesh->indexCount, mesh->firstIndex); + } + } + + gRenderer->presentFrame(); +} + +void finalize() +{ + // Because we've stored a reference to some graphics API objects + // in a class-static variable (effectively a global) we need + // to clear those out before tearing down the application so + // that we aren't relying on C++ global destructors to tear + // down our application cleanly. + // + SimpleMaterial::gParameterBlockLayout = nullptr; +} + +}; + +void innerMain(ApplicationContext* context) +{ + ModelViewer app; + if(SLANG_FAILED(app.initialize())) + { + exitApplication(context, 1); + } + + while(dispatchEvents(context)) + { + app.renderFrame(); + } + + app.finalize(); +} +GFX_UI_MAIN(innerMain) diff --git a/examples/model-viewer/model-viewer.vcxproj b/examples/model-viewer/model-viewer.vcxproj new file mode 100644 index 000000000..ea7ee1521 --- /dev/null +++ b/examples/model-viewer/model-viewer.vcxproj @@ -0,0 +1,184 @@ +<?xml version="1.0" encoding="utf-8"?> +<Project DefaultTargets="Build" ToolsVersion="14.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003"> + <ItemGroup Label="ProjectConfigurations"> + <ProjectConfiguration Include="Debug|Win32"> + <Configuration>Debug</Configuration> + <Platform>Win32</Platform> + </ProjectConfiguration> + <ProjectConfiguration Include="Debug|x64"> + <Configuration>Debug</Configuration> + <Platform>x64</Platform> + </ProjectConfiguration> + <ProjectConfiguration Include="Release|Win32"> + <Configuration>Release</Configuration> + <Platform>Win32</Platform> + </ProjectConfiguration> + <ProjectConfiguration Include="Release|x64"> + <Configuration>Release</Configuration> + <Platform>x64</Platform> + </ProjectConfiguration> + </ItemGroup> + <PropertyGroup Label="Globals"> + <ProjectGuid>{639B13F2-CF07-CFEC-98FB-664A0427F154}</ProjectGuid> + <IgnoreWarnCompileDuplicatedFilename>true</IgnoreWarnCompileDuplicatedFilename> + <Keyword>Win32Proj</Keyword> + 
<RootNamespace>model-viewer</RootNamespace> + </PropertyGroup> + <Import Project="$(VCTargetsPath)\Microsoft.Cpp.Default.props" /> + <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'" Label="Configuration"> + <ConfigurationType>Application</ConfigurationType> + <UseDebugLibraries>true</UseDebugLibraries> + <CharacterSet>Unicode</CharacterSet> + <PlatformToolset>v140</PlatformToolset> + </PropertyGroup> + <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'" Label="Configuration"> + <ConfigurationType>Application</ConfigurationType> + <UseDebugLibraries>true</UseDebugLibraries> + <CharacterSet>Unicode</CharacterSet> + <PlatformToolset>v140</PlatformToolset> + </PropertyGroup> + <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'" Label="Configuration"> + <ConfigurationType>Application</ConfigurationType> + <UseDebugLibraries>false</UseDebugLibraries> + <CharacterSet>Unicode</CharacterSet> + <PlatformToolset>v140</PlatformToolset> + </PropertyGroup> + <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration"> + <ConfigurationType>Application</ConfigurationType> + <UseDebugLibraries>false</UseDebugLibraries> + <CharacterSet>Unicode</CharacterSet> + <PlatformToolset>v140</PlatformToolset> + </PropertyGroup> + <Import Project="$(VCTargetsPath)\Microsoft.Cpp.props" /> + <ImportGroup Label="ExtensionSettings"> + </ImportGroup> + <ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'"> + <Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" /> + </ImportGroup> + <ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Debug|x64'"> + <Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" 
Label="LocalAppDataPlatform" /> + </ImportGroup> + <ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Release|Win32'"> + <Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" /> + </ImportGroup> + <ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Release|x64'"> + <Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" /> + </ImportGroup> + <PropertyGroup Label="UserMacros" /> + <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'"> + <LinkIncremental>true</LinkIncremental> + <OutDir>..\..\bin\windows-x86\debug\</OutDir> + <IntDir>..\..\intermediate\windows-x86\debug\model-viewer\</IntDir> + <TargetName>model-viewer</TargetName> + <TargetExt>.exe</TargetExt> + </PropertyGroup> + <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'"> + <LinkIncremental>true</LinkIncremental> + <OutDir>..\..\bin\windows-x64\debug\</OutDir> + <IntDir>..\..\intermediate\windows-x64\debug\model-viewer\</IntDir> + <TargetName>model-viewer</TargetName> + <TargetExt>.exe</TargetExt> + </PropertyGroup> + <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'"> + <LinkIncremental>false</LinkIncremental> + <OutDir>..\..\bin\windows-x86\release\</OutDir> + <IntDir>..\..\intermediate\windows-x86\release\model-viewer\</IntDir> + <TargetName>model-viewer</TargetName> + <TargetExt>.exe</TargetExt> + </PropertyGroup> + <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'"> + <LinkIncremental>false</LinkIncremental> + <OutDir>..\..\bin\windows-x64\release\</OutDir> + <IntDir>..\..\intermediate\windows-x64\release\model-viewer\</IntDir> + <TargetName>model-viewer</TargetName> + <TargetExt>.exe</TargetExt> + </PropertyGroup> + 
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'"> + <ClCompile> + <PrecompiledHeader>NotUsing</PrecompiledHeader> + <WarningLevel>Level3</WarningLevel> + <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions> + <AdditionalIncludeDirectories>..\..;..\..\tools;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories> + <DebugInformationFormat>EditAndContinue</DebugInformationFormat> + <Optimization>Disabled</Optimization> + <RuntimeLibrary>MultiThreadedDebug</RuntimeLibrary> + </ClCompile> + <Link> + <SubSystem>Windows</SubSystem> + <GenerateDebugInformation>true</GenerateDebugInformation> + </Link> + </ItemDefinitionGroup> + <ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'"> + <ClCompile> + <PrecompiledHeader>NotUsing</PrecompiledHeader> + <WarningLevel>Level3</WarningLevel> + <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions> + <AdditionalIncludeDirectories>..\..;..\..\tools;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories> + <DebugInformationFormat>EditAndContinue</DebugInformationFormat> + <Optimization>Disabled</Optimization> + <RuntimeLibrary>MultiThreadedDebug</RuntimeLibrary> + </ClCompile> + <Link> + <SubSystem>Windows</SubSystem> + <GenerateDebugInformation>true</GenerateDebugInformation> + </Link> + </ItemDefinitionGroup> + <ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'"> + <ClCompile> + <PrecompiledHeader>NotUsing</PrecompiledHeader> + <WarningLevel>Level3</WarningLevel> + <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions> + <AdditionalIncludeDirectories>..\..;..\..\tools;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories> + <Optimization>Full</Optimization> + <FunctionLevelLinking>true</FunctionLevelLinking> + <IntrinsicFunctions>true</IntrinsicFunctions> + <MinimalRebuild>false</MinimalRebuild> + <StringPooling>true</StringPooling> + 
<RuntimeLibrary>MultiThreaded</RuntimeLibrary> + </ClCompile> + <Link> + <SubSystem>Windows</SubSystem> + <EnableCOMDATFolding>true</EnableCOMDATFolding> + <OptimizeReferences>true</OptimizeReferences> + </Link> + </ItemDefinitionGroup> + <ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'"> + <ClCompile> + <PrecompiledHeader>NotUsing</PrecompiledHeader> + <WarningLevel>Level3</WarningLevel> + <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions> + <AdditionalIncludeDirectories>..\..;..\..\tools;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories> + <Optimization>Full</Optimization> + <FunctionLevelLinking>true</FunctionLevelLinking> + <IntrinsicFunctions>true</IntrinsicFunctions> + <MinimalRebuild>false</MinimalRebuild> + <StringPooling>true</StringPooling> + <RuntimeLibrary>MultiThreaded</RuntimeLibrary> + </ClCompile> + <Link> + <SubSystem>Windows</SubSystem> + <EnableCOMDATFolding>true</EnableCOMDATFolding> + <OptimizeReferences>true</OptimizeReferences> + </Link> + </ItemDefinitionGroup> + <ItemGroup> + <ClCompile Include="main.cpp" /> + </ItemGroup> + <ItemGroup> + <None Include="shaders.slang" /> + </ItemGroup> + <ItemGroup> + <ProjectReference Include="..\..\source\slang\slang.vcxproj"> + <Project>{DB00DA62-0533-4AFD-B59F-A67D5B3A0808}</Project> + </ProjectReference> + <ProjectReference Include="..\..\source\core\core.vcxproj"> + <Project>{F9BE7957-8399-899E-0C49-E714FDDD4B65}</Project> + </ProjectReference> + <ProjectReference Include="..\..\tools\gfx\gfx.vcxproj"> + <Project>{222F7498-B40C-4F3F-A704-DDEB91A4484A}</Project> + </ProjectReference> + </ItemGroup> + <Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" /> + <ImportGroup Label="ExtensionTargets"> + </ImportGroup> +</Project>
\ No newline at end of file diff --git a/examples/model-viewer/model-viewer.vcxproj.filters b/examples/model-viewer/model-viewer.vcxproj.filters new file mode 100644 index 000000000..a02cb79fc --- /dev/null +++ b/examples/model-viewer/model-viewer.vcxproj.filters @@ -0,0 +1,18 @@ +<?xml version="1.0" encoding="utf-8"?> +<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003"> + <ItemGroup> + <Filter Include="Source Files"> + <UniqueIdentifier>{E9C7FDCE-D52A-8D73-7EB0-C5296AF258F6}</UniqueIdentifier> + </Filter> + </ItemGroup> + <ItemGroup> + <ClCompile Include="main.cpp"> + <Filter>Source Files</Filter> + </ClCompile> + </ItemGroup> + <ItemGroup> + <None Include="shaders.slang"> + <Filter>Source Files</Filter> + </None> + </ItemGroup> +</Project>
\ No newline at end of file diff --git a/examples/model-viewer/shaders.slang b/examples/model-viewer/shaders.slang new file mode 100644 index 000000000..b79636d15 --- /dev/null +++ b/examples/model-viewer/shaders.slang @@ -0,0 +1,178 @@ +// shaders.slang + +// +// This example builds on the simplistic shaders presented in the +// "Hello, World" example by adding support for (intentionally +// simplistic) surface material and light shading. +// +// The code here is not meant to exemplify state-of-the-art material +// and lighting techniques, but rather to show how a shader +// library can be developed in a modular fashion without reliance +// on the C preprocessor or manual parameter-binding decorations. +// + +// We will start with a `struct` for per-view parameters that +// will be allocated into a `ParameterBlock`. +// +// As written, this isn't very different from using an HLSL +// `cbuffer` declaration, but importantly this code will +// continue to work if we add one or more resources (e.g., +// an environment map texture) to the `PerView` type. +// +struct PerView +{ + float4x4 viewProjection; + + float3 lightDir; + float3 lightColor; +}; +ParameterBlock<PerView> gViewParams; + +// Declaring a block for per-model parameter data is +// similarly simple. +// +struct PerModel +{ + float4x4 modelTransform; + float4x4 inverseTransposeModelTransform; +}; +ParameterBlock<PerModel> gModelParams; + + +// Next, we are going to demonstrate a simplistic interface +// for surface materials. As written, materials can only +// determine how to compute the diffuse color component +// of a surface; a more advanced example would fold +// the entire BRDF into the material interface. +// +interface IMaterial +{ + float3 getDiffuseColor(); +}; + +// In order for our shader to be able to take a material +// as a parameter, we need to declare a `ParameterBlock<M>` +// for some material type `M`. 
Rather than hard-code the +// specific material type to use, or select one via the +// preprocessor, we will use Slang's support for generics, +// by defining a "global type parameter": +// +type_param TMaterial : IMaterial; +// +// This declaration declares a shader parameter `TMaterial` +// that is a to-be-determined *type*. The `TMaterial` +// type parameter is *constrained* to only support types +// that implement our `IMaterial` interface. +// +// With the `TMaterial` parameter declared, we can +// declare that our shader takes as input a parameter block +// containing material data: +// +ParameterBlock<TMaterial> gMaterial; + +// For now, we will define only a single implementation +// of the `IMaterial` interface, which is a simple material +// with a uniform diffuse color: +// +struct SimpleMaterial : IMaterial +{ + float3 diffuseColor; + + float3 getDiffuseColor() + { + return diffuseColor; + } +}; +// +// Note that no other code in this file statically +// references the `SimpleMaterial` type, and instead +// it is up to the application to "plug in" this type, +// or another `IMaterial` implementation for the +// `TMaterial` parameter. +// + +// Our vertex shader entry point is only marginally more +// complicated than the Hello World example. We will +// start by declaring the various "connector" `struct`s. +// +struct AssembledVertex +{ + float3 position : POSITION; + float3 normal : NORMAL; + float2 uv : UV; +}; +struct CoarseVertex +{ + float3 worldPosition; + float3 worldNormal; + float2 uv; +}; +struct VertexStageOutput +{ + CoarseVertex coarseVertex : CoarseVertex; + float4 sv_position : SV_Position; +}; + +// Perhaps the most interesting new feature of the entry +// point declarations is that we use a `[shader(...)]` +// attribute (as introduced in HLSL Shader Model 6.x) +// in order to tag our entry points. 
+// +// This attribute informs the Slang compiler which +// functions are intended to be compiled as shader +// entry points (and what stage they target), so that +// the programmer no longer needs to specify the +// entry point name/stage through the API (or on +// the command line when using `slangc`). +// +// While HLSL added this feature only in newer versions, +// the Slang compiler supports this attribute across +// *all* targets, so that it is okay to use whether you +// want DXBC, DXIL, or SPIR-V output. +// +[shader("vertex")] +VertexStageOutput vertexMain( + AssembledVertex assembledVertex) +{ + VertexStageOutput output; + + float3 position = assembledVertex.position; + float3 normal = assembledVertex.normal; + float2 uv = assembledVertex.uv; + + float3 worldPosition = mul(gModelParams.modelTransform, float4(position, 1.0)).xyz; + float3 worldNormal = mul(gModelParams.inverseTransposeModelTransform, float4(normal, 0.0)).xyz; + + output.coarseVertex.worldPosition = worldPosition; + output.coarseVertex.worldNormal = worldNormal; + output.coarseVertex.uv = uv; + + output.sv_position = mul(gViewParams.viewProjection, float4(worldPosition, 1.0)); + + return output; +} + +// Our fragment shader is almost trivial, with the most interesting +// thing being how it uses the `TMaterial` type parameter (through the +// value stored in the `gMaterial` parameter block) to dispatch to +// the correct implementation of the `getDiffuseColor()` method +// in the `IMaterial` interface. +// +// The `gMaterial` parameter block declaration thus serves not only +// to group certain shader parameters for efficient CPU-to-GPU +// communication, but also to select the code that will execute +// in specialized versions of the `fragmentMain` entry point. 
+// +[shader("fragment")] +float4 fragmentMain( + CoarseVertex coarseVertex : CoarseVertex) : SV_Target +{ + float3 N = normalize(coarseVertex.worldNormal); + float3 L = normalize(gViewParams.lightDir); + + float4 color; + color.xyz = gMaterial.getDiffuseColor() * max(0, dot(N, L)); + color.w = 1.0f; + + return color; +} diff --git a/external/glm b/external/glm new file mode 160000 +Subproject 0d973b40a49e550b1ea7df22a8573bc5fff84f2 diff --git a/external/stb/stb_image_resize.h b/external/stb/stb_image_resize.h new file mode 100644 index 000000000..031ca99dc --- /dev/null +++ b/external/stb/stb_image_resize.h @@ -0,0 +1,2627 @@ +/* stb_image_resize - v0.95 - public domain image resizing + by Jorge L Rodriguez (@VinoBS) - 2014 + http://github.com/nothings/stb + + Written with emphasis on usability, portability, and efficiency. (No + SIMD or threads, so it be easily outperformed by libs that use those.) + Only scaling and translation is supported, no rotations or shears. + Easy API downsamples w/Mitchell filter, upsamples w/cubic interpolation. + + COMPILING & LINKING + In one C/C++ file that #includes this file, do this: + #define STB_IMAGE_RESIZE_IMPLEMENTATION + before the #include. That will create the implementation in that file. + + QUICKSTART + stbir_resize_uint8( input_pixels , in_w , in_h , 0, + output_pixels, out_w, out_h, 0, num_channels) + stbir_resize_float(...) + stbir_resize_uint8_srgb( input_pixels , in_w , in_h , 0, + output_pixels, out_w, out_h, 0, + num_channels , alpha_chan , 0) + stbir_resize_uint8_srgb_edgemode( + input_pixels , in_w , in_h , 0, + output_pixels, out_w, out_h, 0, + num_channels , alpha_chan , 0, STBIR_EDGE_CLAMP) + // WRAP/REFLECT/ZERO + + FULL API + See the "header file" section of the source for API documentation. + + ADDITIONAL DOCUMENTATION + + SRGB & FLOATING POINT REPRESENTATION + The sRGB functions presume IEEE floating point. If you do not have + IEEE floating point, define STBIR_NON_IEEE_FLOAT. 
This will use + a slower implementation. + + MEMORY ALLOCATION + The resize functions here perform a single memory allocation using + malloc. To control the memory allocation, before the #include that + triggers the implementation, do: + + #define STBIR_MALLOC(size,context) ... + #define STBIR_FREE(ptr,context) ... + + Each resize function makes exactly one call to malloc/free, so to use + temp memory, store the temp memory in the context and return that. + + ASSERT + Define STBIR_ASSERT(boolval) to override assert() and not use assert.h + + OPTIMIZATION + Define STBIR_SATURATE_INT to compute clamp values in-range using + integer operations instead of float operations. This may be faster + on some platforms. + + DEFAULT FILTERS + For functions which don't provide explicit control over what filters + to use, you can change the compile-time defaults with + + #define STBIR_DEFAULT_FILTER_UPSAMPLE STBIR_FILTER_something + #define STBIR_DEFAULT_FILTER_DOWNSAMPLE STBIR_FILTER_something + + See stbir_filter in the header-file section for the list of filters. + + NEW FILTERS + A number of 1D filter kernels are used. For a list of + supported filters see the stbir_filter enum. To add a new filter, + write a filter function and add it to stbir__filter_info_table. + + PROGRESS + For interactive use with slow resize operations, you can install + a progress-report callback: + + #define STBIR_PROGRESS_REPORT(val) some_func(val) + + The parameter val is a float which goes from 0 to 1 as progress is made. + + For example: + + static void my_progress_report(float progress); + #define STBIR_PROGRESS_REPORT(val) my_progress_report(val) + + #define STB_IMAGE_RESIZE_IMPLEMENTATION + #include "stb_image_resize.h" + + static void my_progress_report(float progress) + { + printf("Progress: %f%%\n", progress*100); + } + + MAX CHANNELS + If your image has more than 64 channels, define STBIR_MAX_CHANNELS + to the max you'll have. 
+ + ALPHA CHANNEL + Most of the resizing functions provide the ability to control how + the alpha channel of an image is processed. The important things + to know about this: + + 1. The best mathematically-behaved version of alpha to use is + called "premultiplied alpha", in which the other color channels + have had the alpha value multiplied in. If you use premultiplied + alpha, linear filtering (such as image resampling done by this + library, or performed in texture units on GPUs) does the "right + thing". While premultiplied alpha is standard in the movie CGI + industry, it is still uncommon in the videogame/real-time world. + + If you linearly filter non-premultiplied alpha, strange effects + occur. (For example, the 50/50 average of 99% transparent bright green + and 1% transparent black produces 50% transparent dark green when + non-premultiplied, whereas premultiplied it produces 50% + transparent near-black. The former introduces green energy + that doesn't exist in the source image.) + + 2. Artists should not edit premultiplied-alpha images; artists + want non-premultiplied alpha images. Thus, art tools generally output + non-premultiplied alpha images. + + 3. You will get best results in most cases by converting images + to premultiplied alpha before processing them mathematically. + + 4. If you pass the flag STBIR_FLAG_ALPHA_PREMULTIPLIED, the + resizer does not do anything special for the alpha channel; + it is resampled identically to other channels. This produces + the correct results for premultiplied-alpha images, but produces + less-than-ideal results for non-premultiplied-alpha images. + + 5. If you do not pass the flag STBIR_FLAG_ALPHA_PREMULTIPLIED, + then the resizer weights the contribution of input pixels + based on their alpha values, or, equivalently, it multiplies + the alpha value into the color channels, resamples, then divides + by the resultant alpha value. 
Input pixels which have alpha=0 do + not contribute at all to output pixels unless _all_ of the input + pixels affecting that output pixel have alpha=0, in which case + the result for that pixel is the same as it would be without + STBIR_FLAG_ALPHA_PREMULTIPLIED. However, this is only true for + input images in integer formats. For input images in float format, + input pixels with alpha=0 have no effect, and output pixels + which have alpha=0 will be 0 in all channels. (For float images, + you can manually achieve the same result by adding a tiny epsilon + value to the alpha channel of every image, and then subtracting + or clamping it at the end.) + + 6. You can suppress the behavior described in #5 and make + all-0-alpha pixels have 0 in all channels by #defining + STBIR_NO_ALPHA_EPSILON. + + 7. You can separately control whether the alpha channel is + interpreted as linear or affected by the colorspace. By default + it is linear; you almost never want to apply the colorspace. + (For example, graphics hardware does not apply sRGB conversion + to the alpha channel.) + + CONTRIBUTORS + Jorge L Rodriguez: Implementation + Sean Barrett: API design, optimizations + Aras Pranckevicius: bugfix + Nathan Reed: warning fixes + + REVISIONS + 0.95 (2017-07-23) fixed warnings + 0.94 (2017-03-18) fixed warnings + 0.93 (2017-03-03) fixed bug with certain combinations of heights + 0.92 (2017-01-02) fix integer overflow on large (>2GB) images + 0.91 (2016-04-02) fix warnings; fix handling of subpixel regions + 0.90 (2014-09-17) first released version + + LICENSE + See end of file for license information. + + TODO + Don't decode all of the image data when only processing a partial tile + Don't use full-width decode buffers when only processing a partial tile + When processing wide images, break processing into tiles so data fits in L1 cache + Installable filters? 
+ Resize that respects alpha test coverage + (Reference code: FloatImage::alphaTestCoverage and FloatImage::scaleAlphaToCoverage: + https://code.google.com/p/nvidia-texture-tools/source/browse/trunk/src/nvimage/FloatImage.cpp ) +*/ + +#ifndef STBIR_INCLUDE_STB_IMAGE_RESIZE_H +#define STBIR_INCLUDE_STB_IMAGE_RESIZE_H + +#ifdef _MSC_VER +typedef unsigned char stbir_uint8; +typedef unsigned short stbir_uint16; +typedef unsigned int stbir_uint32; +#else +#include <stdint.h> +typedef uint8_t stbir_uint8; +typedef uint16_t stbir_uint16; +typedef uint32_t stbir_uint32; +#endif + +#ifdef STB_IMAGE_RESIZE_STATIC +#define STBIRDEF static +#else +#ifdef __cplusplus +#define STBIRDEF extern "C" +#else +#define STBIRDEF extern +#endif +#endif + + +////////////////////////////////////////////////////////////////////////////// +// +// Easy-to-use API: +// +// * "input pixels" points to an array of image data with 'num_channels' channels (e.g. RGB=3, RGBA=4) +// * input_w is input image width (x-axis), input_h is input image height (y-axis) +// * stride is the offset between successive rows of image data in memory, in bytes. you can +// specify 0 to mean packed continuously in memory +// * alpha channel is treated identically to other channels. +// * colorspace is linear or sRGB as specified by function name +// * returned result is 1 for success or 0 in case of an error. +// #define STBIR_ASSERT() to trigger an assert on parameter validation errors. +// * Memory required grows approximately linearly with input and output size, but with +// discontinuities at input_w == output_w and input_h == output_h. +// * These functions use a "default" resampling filter defined at compile time. To change the filter, +// you can change the compile-time defaults by #defining STBIR_DEFAULT_FILTER_UPSAMPLE +// and STBIR_DEFAULT_FILTER_DOWNSAMPLE, or you can use the medium-complexity API. 
+ +STBIRDEF int stbir_resize_uint8( const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels); + +STBIRDEF int stbir_resize_float( const float *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + float *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels); + + +// The following functions interpret image data as gamma-corrected sRGB. +// Specify STBIR_ALPHA_CHANNEL_NONE if you have no alpha channel, +// or otherwise provide the index of the alpha channel. Flags value +// of 0 will probably do the right thing if you're not sure what +// the flags mean. + +#define STBIR_ALPHA_CHANNEL_NONE -1 + +// Set this flag if your texture has premultiplied alpha. Otherwise, stbir will +// use alpha-weighted resampling (effectively premultiplying, resampling, +// then unpremultiplying). +#define STBIR_FLAG_ALPHA_PREMULTIPLIED (1 << 0) +// The specified alpha channel should be handled as gamma-corrected value even +// when doing sRGB operations. +#define STBIR_FLAG_ALPHA_USES_COLORSPACE (1 << 1) + +STBIRDEF int stbir_resize_uint8_srgb(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags); + + +typedef enum +{ + STBIR_EDGE_CLAMP = 1, + STBIR_EDGE_REFLECT = 2, + STBIR_EDGE_WRAP = 3, + STBIR_EDGE_ZERO = 4, +} stbir_edge; + +// This function adds the ability to specify how requests to sample off the edge of the image are handled. 
+STBIRDEF int stbir_resize_uint8_srgb_edgemode(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_wrap_mode); + +////////////////////////////////////////////////////////////////////////////// +// +// Medium-complexity API +// +// This extends the easy-to-use API as follows: +// +// * Alpha-channel can be processed separately +// * If alpha_channel is not STBIR_ALPHA_CHANNEL_NONE +// * Alpha channel will not be gamma corrected (unless flags&STBIR_FLAG_ALPHA_USES_COLORSPACE) +// * Filters will be weighted by alpha channel (unless flags&STBIR_FLAG_ALPHA_PREMULTIPLIED) +// * Filter can be selected explicitly +// * uint16 image type +// * sRGB colorspace available for all types +// * context parameter for passing to STBIR_MALLOC + +typedef enum +{ + STBIR_FILTER_DEFAULT = 0, // use same filter type that easy-to-use API chooses + STBIR_FILTER_BOX = 1, // A trapezoid w/1-pixel wide ramps, same result as box for integer scale ratios + STBIR_FILTER_TRIANGLE = 2, // On upsampling, produces same results as bilinear texture filtering + STBIR_FILTER_CUBICBSPLINE = 3, // The cubic b-spline (aka Mitchell-Netravali with B=1,C=0), gaussian-esque + STBIR_FILTER_CATMULLROM = 4, // An interpolating cubic spline + STBIR_FILTER_MITCHELL = 5, // Mitchell-Netravali filter with B=1/3, C=1/3 +} stbir_filter; + +typedef enum +{ + STBIR_COLORSPACE_LINEAR, + STBIR_COLORSPACE_SRGB, + + STBIR_MAX_COLORSPACES, +} stbir_colorspace; + +// The following functions are all identical except for the type of the image data + +STBIRDEF int stbir_resize_uint8_generic( const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags, + stbir_edge
edge_wrap_mode, stbir_filter filter, stbir_colorspace space, + void *alloc_context); + +STBIRDEF int stbir_resize_uint16_generic(const stbir_uint16 *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + stbir_uint16 *output_pixels , int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, + void *alloc_context); + +STBIRDEF int stbir_resize_float_generic( const float *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + float *output_pixels , int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, + void *alloc_context); + + + +////////////////////////////////////////////////////////////////////////////// +// +// Full-complexity API +// +// This extends the medium API as follows: +// +// * uint32 image type +// * not typesafe +// * separate filter types for each axis +// * separate edge modes for each axis +// * can specify scale explicitly for subpixel correctness +// * can specify image source tile using texture coordinates + +typedef enum +{ + STBIR_TYPE_UINT8 , + STBIR_TYPE_UINT16, + STBIR_TYPE_UINT32, + STBIR_TYPE_FLOAT , + + STBIR_MAX_TYPES +} stbir_datatype; + +STBIRDEF int stbir_resize( const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + void *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + stbir_datatype datatype, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, + stbir_filter filter_horizontal, stbir_filter filter_vertical, + stbir_colorspace space, void *alloc_context); + +STBIRDEF int stbir_resize_subpixel(const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + void *output_pixels, int output_w, int output_h, int 
output_stride_in_bytes, + stbir_datatype datatype, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, + stbir_filter filter_horizontal, stbir_filter filter_vertical, + stbir_colorspace space, void *alloc_context, + float x_scale, float y_scale, + float x_offset, float y_offset); + +STBIRDEF int stbir_resize_region( const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + void *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + stbir_datatype datatype, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, + stbir_filter filter_horizontal, stbir_filter filter_vertical, + stbir_colorspace space, void *alloc_context, + float s0, float t0, float s1, float t1); +// (s0, t0) & (s1, t1) are the top-left and bottom right corner (uv addressing style: [0, 1]x[0, 1]) of a region of the input image to use. + +// +// +//// end header file ///////////////////////////////////////////////////// +#endif // STBIR_INCLUDE_STB_IMAGE_RESIZE_H + + + + + +#ifdef STB_IMAGE_RESIZE_IMPLEMENTATION + +#ifndef STBIR_ASSERT +#include <assert.h> +#define STBIR_ASSERT(x) assert(x) +#endif + +// For memset +#include <string.h> + +#include <math.h> + +#ifndef STBIR_MALLOC +#include <stdlib.h> +// use comma operator to evaluate c, to avoid "unused parameter" warnings +#define STBIR_MALLOC(size,c) ((void)(c), malloc(size)) +#define STBIR_FREE(ptr,c) ((void)(c), free(ptr)) +#endif + +#ifndef _MSC_VER +#ifdef __cplusplus +#define stbir__inline inline +#else +#define stbir__inline +#endif +#else +#define stbir__inline __forceinline +#endif + + +// should produce compiler error if size is wrong +typedef unsigned char stbir__validate_uint32[sizeof(stbir_uint32) == 4 ? 
1 : -1]; + +#ifdef _MSC_VER +#define STBIR__NOTUSED(v) (void)(v) +#else +#define STBIR__NOTUSED(v) (void)sizeof(v) +#endif + +#define STBIR__ARRAY_SIZE(a) (sizeof((a))/sizeof((a)[0])) + +#ifndef STBIR_DEFAULT_FILTER_UPSAMPLE +#define STBIR_DEFAULT_FILTER_UPSAMPLE STBIR_FILTER_CATMULLROM +#endif + +#ifndef STBIR_DEFAULT_FILTER_DOWNSAMPLE +#define STBIR_DEFAULT_FILTER_DOWNSAMPLE STBIR_FILTER_MITCHELL +#endif + +#ifndef STBIR_PROGRESS_REPORT +#define STBIR_PROGRESS_REPORT(float_0_to_1) +#endif + +#ifndef STBIR_MAX_CHANNELS +#define STBIR_MAX_CHANNELS 64 +#endif + +#if STBIR_MAX_CHANNELS > 65536 +#error "Too many channels; STBIR_MAX_CHANNELS must be no more than 65536." +// because we store the indices in 16-bit variables +#endif + +// This value is added to alpha just before premultiplication to avoid +// zeroing out color values. It is equivalent to 2^-80. If you don't want +// that behavior (it may interfere if you have floating point images with +// very small alpha values) then you can define STBIR_NO_ALPHA_EPSILON to +// disable it. +#ifndef STBIR_ALPHA_EPSILON +#define STBIR_ALPHA_EPSILON ((float)1 / (1 << 20) / (1 << 20) / (1 << 20) / (1 << 20)) +#endif + + + +#ifdef _MSC_VER +#define STBIR__UNUSED_PARAM(v) (void)(v) +#else +#define STBIR__UNUSED_PARAM(v) (void)sizeof(v) +#endif + +// must match stbir_datatype +static unsigned char stbir__type_size[] = { + 1, // STBIR_TYPE_UINT8 + 2, // STBIR_TYPE_UINT16 + 4, // STBIR_TYPE_UINT32 + 4, // STBIR_TYPE_FLOAT +}; + +// Kernel function centered at 0 +typedef float (stbir__kernel_fn)(float x, float scale); +typedef float (stbir__support_fn)(float scale); + +typedef struct +{ + stbir__kernel_fn* kernel; + stbir__support_fn* support; +} stbir__filter_info; + +// When upsampling, the contributors are which source pixels contribute. +// When downsampling, the contributors are which destination pixels are contributed to. 
+typedef struct +{ + int n0; // First contributing pixel + int n1; // Last contributing pixel +} stbir__contributors; + +typedef struct +{ + const void* input_data; + int input_w; + int input_h; + int input_stride_bytes; + + void* output_data; + int output_w; + int output_h; + int output_stride_bytes; + + float s0, t0, s1, t1; + + float horizontal_shift; // Units: output pixels + float vertical_shift; // Units: output pixels + float horizontal_scale; + float vertical_scale; + + int channels; + int alpha_channel; + stbir_uint32 flags; + stbir_datatype type; + stbir_filter horizontal_filter; + stbir_filter vertical_filter; + stbir_edge edge_horizontal; + stbir_edge edge_vertical; + stbir_colorspace colorspace; + + stbir__contributors* horizontal_contributors; + float* horizontal_coefficients; + + stbir__contributors* vertical_contributors; + float* vertical_coefficients; + + int decode_buffer_pixels; + float* decode_buffer; + + float* horizontal_buffer; + + // cache these because ceil/floor are inexplicably showing up in profile + int horizontal_coefficient_width; + int vertical_coefficient_width; + int horizontal_filter_pixel_width; + int vertical_filter_pixel_width; + int horizontal_filter_pixel_margin; + int vertical_filter_pixel_margin; + int horizontal_num_contributors; + int vertical_num_contributors; + + int ring_buffer_length_bytes; // The length of an individual entry in the ring buffer. The total number of ring buffers is stbir__get_filter_pixel_width(filter) + int ring_buffer_num_entries; // Total number of entries in the ring buffer. + int ring_buffer_first_scanline; + int ring_buffer_last_scanline; + int ring_buffer_begin_index; // first_scanline is at this index in the ring buffer + float* ring_buffer; + + float* encode_buffer; // A temporary buffer to store floats so we don't lose precision while we do multiply-adds. 
+ + int horizontal_contributors_size; + int horizontal_coefficients_size; + int vertical_contributors_size; + int vertical_coefficients_size; + int decode_buffer_size; + int horizontal_buffer_size; + int ring_buffer_size; + int encode_buffer_size; +} stbir__info; + + +static const float stbir__max_uint8_as_float = 255.0f; +static const float stbir__max_uint16_as_float = 65535.0f; +static const double stbir__max_uint32_as_float = 4294967295.0; + + +static stbir__inline int stbir__min(int a, int b) +{ + return a < b ? a : b; +} + +static stbir__inline float stbir__saturate(float x) +{ + if (x < 0) + return 0; + + if (x > 1) + return 1; + + return x; +} + +#ifdef STBIR_SATURATE_INT +static stbir__inline stbir_uint8 stbir__saturate8(int x) +{ + if ((unsigned int) x <= 255) + return x; + + if (x < 0) + return 0; + + return 255; +} + +static stbir__inline stbir_uint16 stbir__saturate16(int x) +{ + if ((unsigned int) x <= 65535) + return x; + + if (x < 0) + return 0; + + return 65535; +} +#endif + +static float stbir__srgb_uchar_to_linear_float[256] = { + 0.000000f, 0.000304f, 0.000607f, 0.000911f, 0.001214f, 0.001518f, 0.001821f, 0.002125f, 0.002428f, 0.002732f, 0.003035f, + 0.003347f, 0.003677f, 0.004025f, 0.004391f, 0.004777f, 0.005182f, 0.005605f, 0.006049f, 0.006512f, 0.006995f, 0.007499f, + 0.008023f, 0.008568f, 0.009134f, 0.009721f, 0.010330f, 0.010960f, 0.011612f, 0.012286f, 0.012983f, 0.013702f, 0.014444f, + 0.015209f, 0.015996f, 0.016807f, 0.017642f, 0.018500f, 0.019382f, 0.020289f, 0.021219f, 0.022174f, 0.023153f, 0.024158f, + 0.025187f, 0.026241f, 0.027321f, 0.028426f, 0.029557f, 0.030713f, 0.031896f, 0.033105f, 0.034340f, 0.035601f, 0.036889f, + 0.038204f, 0.039546f, 0.040915f, 0.042311f, 0.043735f, 0.045186f, 0.046665f, 0.048172f, 0.049707f, 0.051269f, 0.052861f, + 0.054480f, 0.056128f, 0.057805f, 0.059511f, 0.061246f, 0.063010f, 0.064803f, 0.066626f, 0.068478f, 0.070360f, 0.072272f, + 0.074214f, 0.076185f, 0.078187f, 0.080220f, 0.082283f, 0.084376f, 
0.086500f, 0.088656f, 0.090842f, 0.093059f, 0.095307f, + 0.097587f, 0.099899f, 0.102242f, 0.104616f, 0.107023f, 0.109462f, 0.111932f, 0.114435f, 0.116971f, 0.119538f, 0.122139f, + 0.124772f, 0.127438f, 0.130136f, 0.132868f, 0.135633f, 0.138432f, 0.141263f, 0.144128f, 0.147027f, 0.149960f, 0.152926f, + 0.155926f, 0.158961f, 0.162029f, 0.165132f, 0.168269f, 0.171441f, 0.174647f, 0.177888f, 0.181164f, 0.184475f, 0.187821f, + 0.191202f, 0.194618f, 0.198069f, 0.201556f, 0.205079f, 0.208637f, 0.212231f, 0.215861f, 0.219526f, 0.223228f, 0.226966f, + 0.230740f, 0.234551f, 0.238398f, 0.242281f, 0.246201f, 0.250158f, 0.254152f, 0.258183f, 0.262251f, 0.266356f, 0.270498f, + 0.274677f, 0.278894f, 0.283149f, 0.287441f, 0.291771f, 0.296138f, 0.300544f, 0.304987f, 0.309469f, 0.313989f, 0.318547f, + 0.323143f, 0.327778f, 0.332452f, 0.337164f, 0.341914f, 0.346704f, 0.351533f, 0.356400f, 0.361307f, 0.366253f, 0.371238f, + 0.376262f, 0.381326f, 0.386430f, 0.391573f, 0.396755f, 0.401978f, 0.407240f, 0.412543f, 0.417885f, 0.423268f, 0.428691f, + 0.434154f, 0.439657f, 0.445201f, 0.450786f, 0.456411f, 0.462077f, 0.467784f, 0.473532f, 0.479320f, 0.485150f, 0.491021f, + 0.496933f, 0.502887f, 0.508881f, 0.514918f, 0.520996f, 0.527115f, 0.533276f, 0.539480f, 0.545725f, 0.552011f, 0.558340f, + 0.564712f, 0.571125f, 0.577581f, 0.584078f, 0.590619f, 0.597202f, 0.603827f, 0.610496f, 0.617207f, 0.623960f, 0.630757f, + 0.637597f, 0.644480f, 0.651406f, 0.658375f, 0.665387f, 0.672443f, 0.679543f, 0.686685f, 0.693872f, 0.701102f, 0.708376f, + 0.715694f, 0.723055f, 0.730461f, 0.737911f, 0.745404f, 0.752942f, 0.760525f, 0.768151f, 0.775822f, 0.783538f, 0.791298f, + 0.799103f, 0.806952f, 0.814847f, 0.822786f, 0.830770f, 0.838799f, 0.846873f, 0.854993f, 0.863157f, 0.871367f, 0.879622f, + 0.887923f, 0.896269f, 0.904661f, 0.913099f, 0.921582f, 0.930111f, 0.938686f, 0.947307f, 0.955974f, 0.964686f, 0.973445f, + 0.982251f, 0.991102f, 1.0f +}; + +static float stbir__srgb_to_linear(float f) +{ + if (f <= 
0.04045f) + return f / 12.92f; + else + return (float)pow((f + 0.055f) / 1.055f, 2.4f); +} + +static float stbir__linear_to_srgb(float f) +{ + if (f <= 0.0031308f) + return f * 12.92f; + else + return 1.055f * (float)pow(f, 1 / 2.4f) - 0.055f; +} + +#ifndef STBIR_NON_IEEE_FLOAT +// From https://gist.github.com/rygorous/2203834 + +typedef union +{ + stbir_uint32 u; + float f; +} stbir__FP32; + +static const stbir_uint32 fp32_to_srgb8_tab4[104] = { + 0x0073000d, 0x007a000d, 0x0080000d, 0x0087000d, 0x008d000d, 0x0094000d, 0x009a000d, 0x00a1000d, + 0x00a7001a, 0x00b4001a, 0x00c1001a, 0x00ce001a, 0x00da001a, 0x00e7001a, 0x00f4001a, 0x0101001a, + 0x010e0033, 0x01280033, 0x01410033, 0x015b0033, 0x01750033, 0x018f0033, 0x01a80033, 0x01c20033, + 0x01dc0067, 0x020f0067, 0x02430067, 0x02760067, 0x02aa0067, 0x02dd0067, 0x03110067, 0x03440067, + 0x037800ce, 0x03df00ce, 0x044600ce, 0x04ad00ce, 0x051400ce, 0x057b00c5, 0x05dd00bc, 0x063b00b5, + 0x06970158, 0x07420142, 0x07e30130, 0x087b0120, 0x090b0112, 0x09940106, 0x0a1700fc, 0x0a9500f2, + 0x0b0f01cb, 0x0bf401ae, 0x0ccb0195, 0x0d950180, 0x0e56016e, 0x0f0d015e, 0x0fbc0150, 0x10630143, + 0x11070264, 0x1238023e, 0x1357021d, 0x14660201, 0x156601e9, 0x165a01d3, 0x174401c0, 0x182401af, + 0x18fe0331, 0x1a9602fe, 0x1c1502d2, 0x1d7e02ad, 0x1ed4028d, 0x201a0270, 0x21520256, 0x227d0240, + 0x239f0443, 0x25c003fe, 0x27bf03c4, 0x29a10392, 0x2b6a0367, 0x2d1d0341, 0x2ebe031f, 0x304d0300, + 0x31d105b0, 0x34a80555, 0x37520507, 0x39d504c5, 0x3c37048b, 0x3e7c0458, 0x40a8042a, 0x42bd0401, + 0x44c20798, 0x488e071e, 0x4c1c06b6, 0x4f76065d, 0x52a50610, 0x55ac05cc, 0x5892058f, 0x5b590559, + 0x5e0c0a23, 0x631c0980, 0x67db08f6, 0x6c55087f, 0x70940818, 0x74a007bd, 0x787d076c, 0x7c330723, +}; + +static stbir_uint8 stbir__linear_to_srgb_uchar(float in) +{ + static const stbir__FP32 almostone = { 0x3f7fffff }; // 1-eps + static const stbir__FP32 minval = { (127-13) << 23 }; + stbir_uint32 tab,bias,scale,t; + stbir__FP32 f; + + // Clamp to [2^(-13), 1-eps]; 
these two values map to 0 and 1, respectively. + // The tests are carefully written so that NaNs map to 0, same as in the reference + // implementation. + if (!(in > minval.f)) // written this way to catch NaNs + in = minval.f; + if (in > almostone.f) + in = almostone.f; + + // Do the table lookup and unpack bias, scale + f.f = in; + tab = fp32_to_srgb8_tab4[(f.u - minval.u) >> 20]; + bias = (tab >> 16) << 9; + scale = tab & 0xffff; + + // Grab next-highest mantissa bits and perform linear interpolation + t = (f.u >> 12) & 0xff; + return (unsigned char) ((bias + scale*t) >> 16); +} + +#else +// sRGB transition values, scaled by 1<<28 +static int stbir__srgb_offset_to_linear_scaled[256] = +{ + 0, 40738, 122216, 203693, 285170, 366648, 448125, 529603, + 611080, 692557, 774035, 855852, 942009, 1033024, 1128971, 1229926, + 1335959, 1447142, 1563542, 1685229, 1812268, 1944725, 2082664, 2226148, + 2375238, 2529996, 2690481, 2856753, 3028870, 3206888, 3390865, 3580856, + 3776916, 3979100, 4187460, 4402049, 4622919, 4850123, 5083710, 5323731, + 5570236, 5823273, 6082892, 6349140, 6622065, 6901714, 7188133, 7481369, + 7781466, 8088471, 8402427, 8723380, 9051372, 9386448, 9728650, 10078021, + 10434603, 10798439, 11169569, 11548036, 11933879, 12327139, 12727857, 13136073, + 13551826, 13975156, 14406100, 14844697, 15290987, 15745007, 16206795, 16676389, + 17153826, 17639142, 18132374, 18633560, 19142734, 19659934, 20185196, 20718552, + 21260042, 21809696, 22367554, 22933648, 23508010, 24090680, 24681686, 25281066, + 25888850, 26505076, 27129772, 27762974, 28404716, 29055026, 29713942, 30381490, + 31057708, 31742624, 32436272, 33138682, 33849884, 34569912, 35298800, 36036568, + 36783260, 37538896, 38303512, 39077136, 39859796, 40651528, 41452360, 42262316, + 43081432, 43909732, 44747252, 45594016, 46450052, 47315392, 48190064, 49074096, + 49967516, 50870356, 51782636, 52704392, 53635648, 54576432, 55526772, 56486700, + 57456236, 58435408, 59424248, 60422780, 61431036, 62449032, 
63476804, 64514376, + 65561776, 66619028, 67686160, 68763192, 69850160, 70947088, 72053992, 73170912, + 74297864, 75434880, 76581976, 77739184, 78906536, 80084040, 81271736, 82469648, + 83677792, 84896192, 86124888, 87363888, 88613232, 89872928, 91143016, 92423512, + 93714432, 95015816, 96327688, 97650056, 98982952, 100326408, 101680440, 103045072, + 104420320, 105806224, 107202800, 108610064, 110028048, 111456776, 112896264, 114346544, + 115807632, 117279552, 118762328, 120255976, 121760536, 123276016, 124802440, 126339832, + 127888216, 129447616, 131018048, 132599544, 134192112, 135795792, 137410592, 139036528, + 140673648, 142321952, 143981456, 145652208, 147334208, 149027488, 150732064, 152447968, + 154175200, 155913792, 157663776, 159425168, 161197984, 162982240, 164777968, 166585184, + 168403904, 170234160, 172075968, 173929344, 175794320, 177670896, 179559120, 181458992, + 183370528, 185293776, 187228736, 189175424, 191133888, 193104112, 195086128, 197079968, + 199085648, 201103184, 203132592, 205173888, 207227120, 209292272, 211369392, 213458480, + 215559568, 217672656, 219797792, 221934976, 224084240, 226245600, 228419056, 230604656, + 232802400, 235012320, 237234432, 239468736, 241715280, 243974080, 246245120, 248528464, + 250824112, 253132064, 255452368, 257785040, 260130080, 262487520, 264857376, 267239664, +}; + +static stbir_uint8 stbir__linear_to_srgb_uchar(float f) +{ + int x = (int) (f * (1 << 28)); // has headroom so you don't need to clamp + int v = 0; + int i; + + // Refine the guess with a short binary search. 
+ i = v + 128; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + i = v + 64; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + i = v + 32; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + i = v + 16; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + i = v + 8; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + i = v + 4; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + i = v + 2; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + i = v + 1; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + + return (stbir_uint8) v; +} +#endif + +static float stbir__filter_trapezoid(float x, float scale) +{ + float halfscale = scale / 2; + float t = 0.5f + halfscale; + STBIR_ASSERT(scale <= 1); + + x = (float)fabs(x); + + if (x >= t) + return 0; + else + { + float r = 0.5f - halfscale; + if (x <= r) + return 1; + else + return (t - x) / scale; + } +} + +static float stbir__support_trapezoid(float scale) +{ + STBIR_ASSERT(scale <= 1); + return 0.5f + scale / 2; +} + +static float stbir__filter_triangle(float x, float s) +{ + STBIR__UNUSED_PARAM(s); + + x = (float)fabs(x); + + if (x <= 1.0f) + return 1 - x; + else + return 0; +} + +static float stbir__filter_cubic(float x, float s) +{ + STBIR__UNUSED_PARAM(s); + + x = (float)fabs(x); + + if (x < 1.0f) + return (4 + x*x*(3*x - 6))/6; + else if (x < 2.0f) + return (8 + x*(-12 + x*(6 - x)))/6; + + return (0.0f); +} + +static float stbir__filter_catmullrom(float x, float s) +{ + STBIR__UNUSED_PARAM(s); + + x = (float)fabs(x); + + if (x < 1.0f) + return 1 - x*x*(2.5f - 1.5f*x); + else if (x < 2.0f) + return 2 - x*(4 + x*(0.5f*x - 2.5f)); + + return (0.0f); +} + +static float stbir__filter_mitchell(float x, float s) +{ + STBIR__UNUSED_PARAM(s); + + x = (float)fabs(x); + + if (x < 1.0f) + return (16 + x*x*(21 * x - 36))/18; + else if (x < 2.0f) + return (32 + x*(-60 + x*(36 - 7*x)))/18; + + return (0.0f); +} + +static float stbir__support_zero(float s) +{ + 
STBIR__UNUSED_PARAM(s); + return 0; +} + +static float stbir__support_one(float s) +{ + STBIR__UNUSED_PARAM(s); + return 1; +} + +static float stbir__support_two(float s) +{ + STBIR__UNUSED_PARAM(s); + return 2; +} + +static stbir__filter_info stbir__filter_info_table[] = { + { NULL, stbir__support_zero }, + { stbir__filter_trapezoid, stbir__support_trapezoid }, + { stbir__filter_triangle, stbir__support_one }, + { stbir__filter_cubic, stbir__support_two }, + { stbir__filter_catmullrom, stbir__support_two }, + { stbir__filter_mitchell, stbir__support_two }, +}; + +stbir__inline static int stbir__use_upsampling(float ratio) +{ + return ratio > 1; +} + +stbir__inline static int stbir__use_width_upsampling(stbir__info* stbir_info) +{ + return stbir__use_upsampling(stbir_info->horizontal_scale); +} + +stbir__inline static int stbir__use_height_upsampling(stbir__info* stbir_info) +{ + return stbir__use_upsampling(stbir_info->vertical_scale); +} + +// This is the maximum number of input samples that can affect an output sample +// with the given filter +static int stbir__get_filter_pixel_width(stbir_filter filter, float scale) +{ + STBIR_ASSERT(filter != 0); + STBIR_ASSERT(filter < STBIR__ARRAY_SIZE(stbir__filter_info_table)); + + if (stbir__use_upsampling(scale)) + return (int)ceil(stbir__filter_info_table[filter].support(1/scale) * 2); + else + return (int)ceil(stbir__filter_info_table[filter].support(scale) * 2 / scale); +} + +// This is how much to expand buffers to account for filters seeking outside +// the image boundaries. 
+static int stbir__get_filter_pixel_margin(stbir_filter filter, float scale) +{ + return stbir__get_filter_pixel_width(filter, scale) / 2; +} + +static int stbir__get_coefficient_width(stbir_filter filter, float scale) +{ + if (stbir__use_upsampling(scale)) + return (int)ceil(stbir__filter_info_table[filter].support(1 / scale) * 2); + else + return (int)ceil(stbir__filter_info_table[filter].support(scale) * 2); +} + +static int stbir__get_contributors(float scale, stbir_filter filter, int input_size, int output_size) +{ + if (stbir__use_upsampling(scale)) + return output_size; + else + return (input_size + stbir__get_filter_pixel_margin(filter, scale) * 2); +} + +static int stbir__get_total_horizontal_coefficients(stbir__info* info) +{ + return info->horizontal_num_contributors + * stbir__get_coefficient_width (info->horizontal_filter, info->horizontal_scale); +} + +static int stbir__get_total_vertical_coefficients(stbir__info* info) +{ + return info->vertical_num_contributors + * stbir__get_coefficient_width (info->vertical_filter, info->vertical_scale); +} + +static stbir__contributors* stbir__get_contributor(stbir__contributors* contributors, int n) +{ + return &contributors[n]; +} + +// For perf reasons this code is duplicated in stbir__resample_horizontal_upsample/downsample, +// if you change it here change it there too. 
+static float* stbir__get_coefficient(float* coefficients, stbir_filter filter, float scale, int n, int c) +{ + int width = stbir__get_coefficient_width(filter, scale); + return &coefficients[width*n + c]; +} + +static int stbir__edge_wrap_slow(stbir_edge edge, int n, int max) +{ + switch (edge) + { + case STBIR_EDGE_ZERO: + return 0; // we'll decode the wrong pixel here, and then overwrite with 0s later + + case STBIR_EDGE_CLAMP: + if (n < 0) + return 0; + + if (n >= max) + return max - 1; + + return n; // NOTREACHED + + case STBIR_EDGE_REFLECT: + { + if (n < 0) + { + if (n < max) + return -n; + else + return max - 1; + } + + if (n >= max) + { + int max2 = max * 2; + if (n >= max2) + return 0; + else + return max2 - n - 1; + } + + return n; // NOTREACHED + } + + case STBIR_EDGE_WRAP: + if (n >= 0) + return (n % max); + else + { + int m = (-n) % max; + + if (m != 0) + m = max - m; + + return (m); + } + // NOTREACHED + + default: + STBIR_ASSERT(!"Unimplemented edge type"); + return 0; + } +} + +stbir__inline static int stbir__edge_wrap(stbir_edge edge, int n, int max) +{ + // avoid per-pixel switch + if (n >= 0 && n < max) + return n; + return stbir__edge_wrap_slow(edge, n, max); +} + +// What input pixels contribute to this output pixel? 
+static void stbir__calculate_sample_range_upsample(int n, float out_filter_radius, float scale_ratio, float out_shift, int* in_first_pixel, int* in_last_pixel, float* in_center_of_out) +{ + float out_pixel_center = (float)n + 0.5f; + float out_pixel_influence_lowerbound = out_pixel_center - out_filter_radius; + float out_pixel_influence_upperbound = out_pixel_center + out_filter_radius; + + float in_pixel_influence_lowerbound = (out_pixel_influence_lowerbound + out_shift) / scale_ratio; + float in_pixel_influence_upperbound = (out_pixel_influence_upperbound + out_shift) / scale_ratio; + + *in_center_of_out = (out_pixel_center + out_shift) / scale_ratio; + *in_first_pixel = (int)(floor(in_pixel_influence_lowerbound + 0.5)); + *in_last_pixel = (int)(floor(in_pixel_influence_upperbound - 0.5)); +} + +// What output pixels does this input pixel contribute to? +static void stbir__calculate_sample_range_downsample(int n, float in_pixels_radius, float scale_ratio, float out_shift, int* out_first_pixel, int* out_last_pixel, float* out_center_of_in) +{ + float in_pixel_center = (float)n + 0.5f; + float in_pixel_influence_lowerbound = in_pixel_center - in_pixels_radius; + float in_pixel_influence_upperbound = in_pixel_center + in_pixels_radius; + + float out_pixel_influence_lowerbound = in_pixel_influence_lowerbound * scale_ratio - out_shift; + float out_pixel_influence_upperbound = in_pixel_influence_upperbound * scale_ratio - out_shift; + + *out_center_of_in = in_pixel_center * scale_ratio - out_shift; + *out_first_pixel = (int)(floor(out_pixel_influence_lowerbound + 0.5)); + *out_last_pixel = (int)(floor(out_pixel_influence_upperbound - 0.5)); +} + +static void stbir__calculate_coefficients_upsample(stbir_filter filter, float scale, int in_first_pixel, int in_last_pixel, float in_center_of_out, stbir__contributors* contributor, float* coefficient_group) +{ + int i; + float total_filter = 0; + float filter_scale; + + STBIR_ASSERT(in_last_pixel - in_first_pixel <= 
(int)ceil(stbir__filter_info_table[filter].support(1/scale) * 2)); // Taken directly from stbir__get_coefficient_width() which we can't call because we don't know if we're horizontal or vertical. + + contributor->n0 = in_first_pixel; + contributor->n1 = in_last_pixel; + + STBIR_ASSERT(contributor->n1 >= contributor->n0); + + for (i = 0; i <= in_last_pixel - in_first_pixel; i++) + { + float in_pixel_center = (float)(i + in_first_pixel) + 0.5f; + coefficient_group[i] = stbir__filter_info_table[filter].kernel(in_center_of_out - in_pixel_center, 1 / scale); + + // If the coefficient is zero, skip it. (Don't do the <0 check here, we want the influence of those outside pixels.) + if (i == 0 && !coefficient_group[i]) + { + contributor->n0 = ++in_first_pixel; + i--; + continue; + } + + total_filter += coefficient_group[i]; + } + + STBIR_ASSERT(stbir__filter_info_table[filter].kernel((float)(in_last_pixel + 1) + 0.5f - in_center_of_out, 1/scale) == 0); + + STBIR_ASSERT(total_filter > 0.9); + STBIR_ASSERT(total_filter < 1.1f); // Make sure it's not way off. + + // Make sure the sum of all coefficients is 1. + filter_scale = 1 / total_filter; + + for (i = 0; i <= in_last_pixel - in_first_pixel; i++) + coefficient_group[i] *= filter_scale; + + for (i = in_last_pixel - in_first_pixel; i >= 0; i--) + { + if (coefficient_group[i]) + break; + + // This line has no weight. We can skip it. + contributor->n1 = contributor->n0 + i - 1; + } +} + +static void stbir__calculate_coefficients_downsample(stbir_filter filter, float scale_ratio, int out_first_pixel, int out_last_pixel, float out_center_of_in, stbir__contributors* contributor, float* coefficient_group) +{ + int i; + + STBIR_ASSERT(out_last_pixel - out_first_pixel <= (int)ceil(stbir__filter_info_table[filter].support(scale_ratio) * 2)); // Taken directly from stbir__get_coefficient_width() which we can't call because we don't know if we're horizontal or vertical. 
+ + contributor->n0 = out_first_pixel; + contributor->n1 = out_last_pixel; + + STBIR_ASSERT(contributor->n1 >= contributor->n0); + + for (i = 0; i <= out_last_pixel - out_first_pixel; i++) + { + float out_pixel_center = (float)(i + out_first_pixel) + 0.5f; + float x = out_pixel_center - out_center_of_in; + coefficient_group[i] = stbir__filter_info_table[filter].kernel(x, scale_ratio) * scale_ratio; + } + + STBIR_ASSERT(stbir__filter_info_table[filter].kernel((float)(out_last_pixel + 1) + 0.5f - out_center_of_in, scale_ratio) == 0); + + for (i = out_last_pixel - out_first_pixel; i >= 0; i--) + { + if (coefficient_group[i]) + break; + + // This line has no weight. We can skip it. + contributor->n1 = contributor->n0 + i - 1; + } +} + +static void stbir__normalize_downsample_coefficients(stbir__contributors* contributors, float* coefficients, stbir_filter filter, float scale_ratio, int input_size, int output_size) +{ + int num_contributors = stbir__get_contributors(scale_ratio, filter, input_size, output_size); + int num_coefficients = stbir__get_coefficient_width(filter, scale_ratio); + int i, j; + int skip; + + for (i = 0; i < output_size; i++) + { + float scale; + float total = 0; + + for (j = 0; j < num_contributors; j++) + { + if (i >= contributors[j].n0 && i <= contributors[j].n1) + { + float coefficient = *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i - contributors[j].n0); + total += coefficient; + } + else if (i < contributors[j].n0) + break; + } + + STBIR_ASSERT(total > 0.9f); + STBIR_ASSERT(total < 1.1f); + + scale = 1 / total; + + for (j = 0; j < num_contributors; j++) + { + if (i >= contributors[j].n0 && i <= contributors[j].n1) + *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i - contributors[j].n0) *= scale; + else if (i < contributors[j].n0) + break; + } + } + + // Optimize: Skip zero coefficients and contributions outside of image bounds. + // Do this after normalizing because normalization depends on the n0/n1 values. 
+ for (j = 0; j < num_contributors; j++) + { + int range, max, width; + + skip = 0; + while (*stbir__get_coefficient(coefficients, filter, scale_ratio, j, skip) == 0) + skip++; + + contributors[j].n0 += skip; + + while (contributors[j].n0 < 0) + { + contributors[j].n0++; + skip++; + } + + range = contributors[j].n1 - contributors[j].n0 + 1; + max = stbir__min(num_coefficients, range); + + width = stbir__get_coefficient_width(filter, scale_ratio); + for (i = 0; i < max; i++) + { + if (i + skip >= width) + break; + + *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i) = *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i + skip); + } + + continue; + } + + // Using min to avoid writing into invalid pixels. + for (i = 0; i < num_contributors; i++) + contributors[i].n1 = stbir__min(contributors[i].n1, output_size - 1); +} + +// Each scan line uses the same kernel values so we should calculate the kernel +// values once and then we can use them for every scan line. +static void stbir__calculate_filters(stbir__contributors* contributors, float* coefficients, stbir_filter filter, float scale_ratio, float shift, int input_size, int output_size) +{ + int n; + int total_contributors = stbir__get_contributors(scale_ratio, filter, input_size, output_size); + + if (stbir__use_upsampling(scale_ratio)) + { + float out_pixels_radius = stbir__filter_info_table[filter].support(1 / scale_ratio) * scale_ratio; + + // Looping through out pixels + for (n = 0; n < total_contributors; n++) + { + float in_center_of_out; // Center of the current out pixel in the in pixel space + int in_first_pixel, in_last_pixel; + + stbir__calculate_sample_range_upsample(n, out_pixels_radius, scale_ratio, shift, &in_first_pixel, &in_last_pixel, &in_center_of_out); + + stbir__calculate_coefficients_upsample(filter, scale_ratio, in_first_pixel, in_last_pixel, in_center_of_out, stbir__get_contributor(contributors, n), stbir__get_coefficient(coefficients, filter, scale_ratio, n, 0)); + 
} + } + else + { + float in_pixels_radius = stbir__filter_info_table[filter].support(scale_ratio) / scale_ratio; + + // Looping through in pixels + for (n = 0; n < total_contributors; n++) + { + float out_center_of_in; // Center of the current in pixel in the out pixel space + int out_first_pixel, out_last_pixel; + int n_adjusted = n - stbir__get_filter_pixel_margin(filter, scale_ratio); + + stbir__calculate_sample_range_downsample(n_adjusted, in_pixels_radius, scale_ratio, shift, &out_first_pixel, &out_last_pixel, &out_center_of_in); + + stbir__calculate_coefficients_downsample(filter, scale_ratio, out_first_pixel, out_last_pixel, out_center_of_in, stbir__get_contributor(contributors, n), stbir__get_coefficient(coefficients, filter, scale_ratio, n, 0)); + } + + stbir__normalize_downsample_coefficients(contributors, coefficients, filter, scale_ratio, input_size, output_size); + } +} + +static float* stbir__get_decode_buffer(stbir__info* stbir_info) +{ + // The 0 index of the decode buffer starts after the margin. This makes + // it okay to use negative indexes on the decode buffer. 
+ return &stbir_info->decode_buffer[stbir_info->horizontal_filter_pixel_margin * stbir_info->channels]; +} + +#define STBIR__DECODE(type, colorspace) ((type) * (STBIR_MAX_COLORSPACES) + (colorspace)) + +static void stbir__decode_scanline(stbir__info* stbir_info, int n) +{ + int c; + int channels = stbir_info->channels; + int alpha_channel = stbir_info->alpha_channel; + int type = stbir_info->type; + int colorspace = stbir_info->colorspace; + int input_w = stbir_info->input_w; + size_t input_stride_bytes = stbir_info->input_stride_bytes; + float* decode_buffer = stbir__get_decode_buffer(stbir_info); + stbir_edge edge_horizontal = stbir_info->edge_horizontal; + stbir_edge edge_vertical = stbir_info->edge_vertical; + size_t in_buffer_row_offset = stbir__edge_wrap(edge_vertical, n, stbir_info->input_h) * input_stride_bytes; + const void* input_data = (char *) stbir_info->input_data + in_buffer_row_offset; + int max_x = input_w + stbir_info->horizontal_filter_pixel_margin; + int decode = STBIR__DECODE(type, colorspace); + + int x = -stbir_info->horizontal_filter_pixel_margin; + + // special handling for STBIR_EDGE_ZERO because it needs to return an item that doesn't appear in the input, + // and we want to avoid paying overhead on every pixel if not STBIR_EDGE_ZERO + if (edge_vertical == STBIR_EDGE_ZERO && (n < 0 || n >= stbir_info->input_h)) + { + for (; x < max_x; x++) + for (c = 0; c < channels; c++) + decode_buffer[x*channels + c] = 0; + return; + } + + switch (decode) + { + case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_LINEAR): + for (; x < max_x; x++) + { + int decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = ((float)((const unsigned char*)input_data)[input_pixel_index + c]) / stbir__max_uint8_as_float; + } + break; + + case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_SRGB): + for (; x < max_x; x++) + { + int 
decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = stbir__srgb_uchar_to_linear_float[((const unsigned char*)input_data)[input_pixel_index + c]]; + + if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE)) + decode_buffer[decode_pixel_index + alpha_channel] = ((float)((const unsigned char*)input_data)[input_pixel_index + alpha_channel]) / stbir__max_uint8_as_float; + } + break; + + case STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_LINEAR): + for (; x < max_x; x++) + { + int decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = ((float)((const unsigned short*)input_data)[input_pixel_index + c]) / stbir__max_uint16_as_float; + } + break; + + case STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_SRGB): + for (; x < max_x; x++) + { + int decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = stbir__srgb_to_linear(((float)((const unsigned short*)input_data)[input_pixel_index + c]) / stbir__max_uint16_as_float); + + if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE)) + decode_buffer[decode_pixel_index + alpha_channel] = ((float)((const unsigned short*)input_data)[input_pixel_index + alpha_channel]) / stbir__max_uint16_as_float; + } + break; + + case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_LINEAR): + for (; x < max_x; x++) + { + int decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = (float)(((double)((const unsigned int*)input_data)[input_pixel_index + c]) / stbir__max_uint32_as_float); + } + 
break; + + case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_SRGB): + for (; x < max_x; x++) + { + int decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = stbir__srgb_to_linear((float)(((double)((const unsigned int*)input_data)[input_pixel_index + c]) / stbir__max_uint32_as_float)); + + if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE)) + decode_buffer[decode_pixel_index + alpha_channel] = (float)(((double)((const unsigned int*)input_data)[input_pixel_index + alpha_channel]) / stbir__max_uint32_as_float); + } + break; + + case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_LINEAR): + for (; x < max_x; x++) + { + int decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = ((const float*)input_data)[input_pixel_index + c]; + } + break; + + case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_SRGB): + for (; x < max_x; x++) + { + int decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = stbir__srgb_to_linear(((const float*)input_data)[input_pixel_index + c]); + + if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE)) + decode_buffer[decode_pixel_index + alpha_channel] = ((const float*)input_data)[input_pixel_index + alpha_channel]; + } + + break; + + default: + STBIR_ASSERT(!"Unknown type/colorspace/channels combination."); + break; + } + + if (!(stbir_info->flags & STBIR_FLAG_ALPHA_PREMULTIPLIED)) + { + for (x = -stbir_info->horizontal_filter_pixel_margin; x < max_x; x++) + { + int decode_pixel_index = x * channels; + + // If the alpha value is 0 it will clobber the color values. Make sure it's not. 
+ float alpha = decode_buffer[decode_pixel_index + alpha_channel]; +#ifndef STBIR_NO_ALPHA_EPSILON + if (stbir_info->type != STBIR_TYPE_FLOAT) { + alpha += STBIR_ALPHA_EPSILON; + decode_buffer[decode_pixel_index + alpha_channel] = alpha; + } +#endif + for (c = 0; c < channels; c++) + { + if (c == alpha_channel) + continue; + + decode_buffer[decode_pixel_index + c] *= alpha; + } + } + } + + if (edge_horizontal == STBIR_EDGE_ZERO) + { + for (x = -stbir_info->horizontal_filter_pixel_margin; x < 0; x++) + { + for (c = 0; c < channels; c++) + decode_buffer[x*channels + c] = 0; + } + for (x = input_w; x < max_x; x++) + { + for (c = 0; c < channels; c++) + decode_buffer[x*channels + c] = 0; + } + } +} + +static float* stbir__get_ring_buffer_entry(float* ring_buffer, int index, int ring_buffer_length) +{ + return &ring_buffer[index * ring_buffer_length]; +} + +static float* stbir__add_empty_ring_buffer_entry(stbir__info* stbir_info, int n) +{ + int ring_buffer_index; + float* ring_buffer; + + stbir_info->ring_buffer_last_scanline = n; + + if (stbir_info->ring_buffer_begin_index < 0) + { + ring_buffer_index = stbir_info->ring_buffer_begin_index = 0; + stbir_info->ring_buffer_first_scanline = n; + } + else + { + ring_buffer_index = (stbir_info->ring_buffer_begin_index + (stbir_info->ring_buffer_last_scanline - stbir_info->ring_buffer_first_scanline)) % stbir_info->ring_buffer_num_entries; + STBIR_ASSERT(ring_buffer_index != stbir_info->ring_buffer_begin_index); + } + + ring_buffer = stbir__get_ring_buffer_entry(stbir_info->ring_buffer, ring_buffer_index, stbir_info->ring_buffer_length_bytes / sizeof(float)); + memset(ring_buffer, 0, stbir_info->ring_buffer_length_bytes); + + return ring_buffer; +} + + +static void stbir__resample_horizontal_upsample(stbir__info* stbir_info, float* output_buffer) +{ + int x, k; + int output_w = stbir_info->output_w; + int channels = stbir_info->channels; + float* decode_buffer = stbir__get_decode_buffer(stbir_info); + stbir__contributors* 
horizontal_contributors = stbir_info->horizontal_contributors; + float* horizontal_coefficients = stbir_info->horizontal_coefficients; + int coefficient_width = stbir_info->horizontal_coefficient_width; + + for (x = 0; x < output_w; x++) + { + int n0 = horizontal_contributors[x].n0; + int n1 = horizontal_contributors[x].n1; + + int out_pixel_index = x * channels; + int coefficient_group = coefficient_width * x; + int coefficient_counter = 0; + + STBIR_ASSERT(n1 >= n0); + STBIR_ASSERT(n0 >= -stbir_info->horizontal_filter_pixel_margin); + STBIR_ASSERT(n1 >= -stbir_info->horizontal_filter_pixel_margin); + STBIR_ASSERT(n0 < stbir_info->input_w + stbir_info->horizontal_filter_pixel_margin); + STBIR_ASSERT(n1 < stbir_info->input_w + stbir_info->horizontal_filter_pixel_margin); + + switch (channels) { + case 1: + for (k = n0; k <= n1; k++) + { + int in_pixel_index = k * 1; + float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++]; + STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient; + } + break; + case 2: + for (k = n0; k <= n1; k++) + { + int in_pixel_index = k * 2; + float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++]; + STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient; + output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient; + } + break; + case 3: + for (k = n0; k <= n1; k++) + { + int in_pixel_index = k * 3; + float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++]; + STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient; + output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient; + output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient; + } + break; + case 4: + for (k = n0; k <= 
n1; k++) + { + int in_pixel_index = k * 4; + float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++]; + STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient; + output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient; + output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient; + output_buffer[out_pixel_index + 3] += decode_buffer[in_pixel_index + 3] * coefficient; + } + break; + default: + for (k = n0; k <= n1; k++) + { + int in_pixel_index = k * channels; + float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++]; + int c; + STBIR_ASSERT(coefficient != 0); + for (c = 0; c < channels; c++) + output_buffer[out_pixel_index + c] += decode_buffer[in_pixel_index + c] * coefficient; + } + break; + } + } +} + +static void stbir__resample_horizontal_downsample(stbir__info* stbir_info, float* output_buffer) +{ + int x, k; + int input_w = stbir_info->input_w; + int channels = stbir_info->channels; + float* decode_buffer = stbir__get_decode_buffer(stbir_info); + stbir__contributors* horizontal_contributors = stbir_info->horizontal_contributors; + float* horizontal_coefficients = stbir_info->horizontal_coefficients; + int coefficient_width = stbir_info->horizontal_coefficient_width; + int filter_pixel_margin = stbir_info->horizontal_filter_pixel_margin; + int max_x = input_w + filter_pixel_margin * 2; + + STBIR_ASSERT(!stbir__use_width_upsampling(stbir_info)); + + switch (channels) { + case 1: + for (x = 0; x < max_x; x++) + { + int n0 = horizontal_contributors[x].n0; + int n1 = horizontal_contributors[x].n1; + + int in_x = x - filter_pixel_margin; + int in_pixel_index = in_x * 1; + int max_n = n1; + int coefficient_group = coefficient_width * x; + + for (k = n0; k <= max_n; k++) + { + int out_pixel_index = k * 1; + float coefficient = horizontal_coefficients[coefficient_group + k - n0]; + 
STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient; + } + } + break; + + case 2: + for (x = 0; x < max_x; x++) + { + int n0 = horizontal_contributors[x].n0; + int n1 = horizontal_contributors[x].n1; + + int in_x = x - filter_pixel_margin; + int in_pixel_index = in_x * 2; + int max_n = n1; + int coefficient_group = coefficient_width * x; + + for (k = n0; k <= max_n; k++) + { + int out_pixel_index = k * 2; + float coefficient = horizontal_coefficients[coefficient_group + k - n0]; + STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient; + output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient; + } + } + break; + + case 3: + for (x = 0; x < max_x; x++) + { + int n0 = horizontal_contributors[x].n0; + int n1 = horizontal_contributors[x].n1; + + int in_x = x - filter_pixel_margin; + int in_pixel_index = in_x * 3; + int max_n = n1; + int coefficient_group = coefficient_width * x; + + for (k = n0; k <= max_n; k++) + { + int out_pixel_index = k * 3; + float coefficient = horizontal_coefficients[coefficient_group + k - n0]; + STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient; + output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient; + output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient; + } + } + break; + + case 4: + for (x = 0; x < max_x; x++) + { + int n0 = horizontal_contributors[x].n0; + int n1 = horizontal_contributors[x].n1; + + int in_x = x - filter_pixel_margin; + int in_pixel_index = in_x * 4; + int max_n = n1; + int coefficient_group = coefficient_width * x; + + for (k = n0; k <= max_n; k++) + { + int out_pixel_index = k * 4; + float coefficient = horizontal_coefficients[coefficient_group + k - n0]; + STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] 
+= decode_buffer[in_pixel_index + 0] * coefficient; + output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient; + output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient; + output_buffer[out_pixel_index + 3] += decode_buffer[in_pixel_index + 3] * coefficient; + } + } + break; + + default: + for (x = 0; x < max_x; x++) + { + int n0 = horizontal_contributors[x].n0; + int n1 = horizontal_contributors[x].n1; + + int in_x = x - filter_pixel_margin; + int in_pixel_index = in_x * channels; + int max_n = n1; + int coefficient_group = coefficient_width * x; + + for (k = n0; k <= max_n; k++) + { + int c; + int out_pixel_index = k * channels; + float coefficient = horizontal_coefficients[coefficient_group + k - n0]; + STBIR_ASSERT(coefficient != 0); + for (c = 0; c < channels; c++) + output_buffer[out_pixel_index + c] += decode_buffer[in_pixel_index + c] * coefficient; + } + } + break; + } +} + +static void stbir__decode_and_resample_upsample(stbir__info* stbir_info, int n) +{ + // Decode the nth scanline from the source image into the decode buffer. + stbir__decode_scanline(stbir_info, n); + + // Now resample it into the ring buffer. + if (stbir__use_width_upsampling(stbir_info)) + stbir__resample_horizontal_upsample(stbir_info, stbir__add_empty_ring_buffer_entry(stbir_info, n)); + else + stbir__resample_horizontal_downsample(stbir_info, stbir__add_empty_ring_buffer_entry(stbir_info, n)); + + // Now it's sitting in the ring buffer ready to be used as source for the vertical sampling. +} + +static void stbir__decode_and_resample_downsample(stbir__info* stbir_info, int n) +{ + // Decode the nth scanline from the source image into the decode buffer. + stbir__decode_scanline(stbir_info, n); + + memset(stbir_info->horizontal_buffer, 0, stbir_info->output_w * stbir_info->channels * sizeof(float)); + + // Now resample it into the horizontal buffer. 
+ if (stbir__use_width_upsampling(stbir_info)) + stbir__resample_horizontal_upsample(stbir_info, stbir_info->horizontal_buffer); + else + stbir__resample_horizontal_downsample(stbir_info, stbir_info->horizontal_buffer); + + // Now it's sitting in the horizontal buffer ready to be distributed into the ring buffers. +} + +// Get the specified scan line from the ring buffer. +static float* stbir__get_ring_buffer_scanline(int get_scanline, float* ring_buffer, int begin_index, int first_scanline, int ring_buffer_num_entries, int ring_buffer_length) +{ + int ring_buffer_index = (begin_index + (get_scanline - first_scanline)) % ring_buffer_num_entries; + return stbir__get_ring_buffer_entry(ring_buffer, ring_buffer_index, ring_buffer_length); +} + + +static void stbir__encode_scanline(stbir__info* stbir_info, int num_pixels, void *output_buffer, float *encode_buffer, int channels, int alpha_channel, int decode) +{ + int x; + int n; + int num_nonalpha; + stbir_uint16 nonalpha[STBIR_MAX_CHANNELS]; + + if (!(stbir_info->flags&STBIR_FLAG_ALPHA_PREMULTIPLIED)) + { + for (x=0; x < num_pixels; ++x) + { + int pixel_index = x*channels; + + float alpha = encode_buffer[pixel_index + alpha_channel]; + float reciprocal_alpha = alpha ? 1.0f / alpha : 0; + + // unrolling this produced a 1% slowdown upscaling a large RGBA linear-space image on my machine - stb + for (n = 0; n < channels; n++) + if (n != alpha_channel) + encode_buffer[pixel_index + n] *= reciprocal_alpha; + + // We added in a small epsilon to prevent the color channel from being deleted with zero alpha. + // Because we only add it for integer types, it will automatically be discarded on integer + // conversion, so we don't need to subtract it back out (which would be problematic for + // numeric precision reasons). + } + } + + // build a table of all channels that need colorspace correction, so + // we don't perform colorspace correction on channels that don't need it. 
+ for (x = 0, num_nonalpha = 0; x < channels; ++x) + { + if (x != alpha_channel || (stbir_info->flags & STBIR_FLAG_ALPHA_USES_COLORSPACE)) + { + nonalpha[num_nonalpha++] = (stbir_uint16)x; + } + } + + #define STBIR__ROUND_INT(f) ((int) ((f)+0.5)) + #define STBIR__ROUND_UINT(f) ((stbir_uint32) ((f)+0.5)) + + #ifdef STBIR__SATURATE_INT + #define STBIR__ENCODE_LINEAR8(f) stbir__saturate8 (STBIR__ROUND_INT((f) * stbir__max_uint8_as_float )) + #define STBIR__ENCODE_LINEAR16(f) stbir__saturate16(STBIR__ROUND_INT((f) * stbir__max_uint16_as_float)) + #else + #define STBIR__ENCODE_LINEAR8(f) (unsigned char ) STBIR__ROUND_INT(stbir__saturate(f) * stbir__max_uint8_as_float ) + #define STBIR__ENCODE_LINEAR16(f) (unsigned short) STBIR__ROUND_INT(stbir__saturate(f) * stbir__max_uint16_as_float) + #endif + + switch (decode) + { + case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_LINEAR): + for (x=0; x < num_pixels; ++x) + { + int pixel_index = x*channels; + + for (n = 0; n < channels; n++) + { + int index = pixel_index + n; + ((unsigned char*)output_buffer)[index] = STBIR__ENCODE_LINEAR8(encode_buffer[index]); + } + } + break; + + case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_SRGB): + for (x=0; x < num_pixels; ++x) + { + int pixel_index = x*channels; + + for (n = 0; n < num_nonalpha; n++) + { + int index = pixel_index + nonalpha[n]; + ((unsigned char*)output_buffer)[index] = stbir__linear_to_srgb_uchar(encode_buffer[index]); + } + + if (!(stbir_info->flags & STBIR_FLAG_ALPHA_USES_COLORSPACE)) + ((unsigned char *)output_buffer)[pixel_index + alpha_channel] = STBIR__ENCODE_LINEAR8(encode_buffer[pixel_index+alpha_channel]); + } + break; + + case STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_LINEAR): + for (x=0; x < num_pixels; ++x) + { + int pixel_index = x*channels; + + for (n = 0; n < channels; n++) + { + int index = pixel_index + n; + ((unsigned short*)output_buffer)[index] = STBIR__ENCODE_LINEAR16(encode_buffer[index]); + } + } + break; + + case 
STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_SRGB): + for (x=0; x < num_pixels; ++x) + { + int pixel_index = x*channels; + + for (n = 0; n < num_nonalpha; n++) + { + int index = pixel_index + nonalpha[n]; + ((unsigned short*)output_buffer)[index] = (unsigned short)STBIR__ROUND_INT(stbir__linear_to_srgb(stbir__saturate(encode_buffer[index])) * stbir__max_uint16_as_float); + } + + if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE)) + ((unsigned short*)output_buffer)[pixel_index + alpha_channel] = STBIR__ENCODE_LINEAR16(encode_buffer[pixel_index + alpha_channel]); + } + + break; + + case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_LINEAR): + for (x=0; x < num_pixels; ++x) + { + int pixel_index = x*channels; + + for (n = 0; n < channels; n++) + { + int index = pixel_index + n; + ((unsigned int*)output_buffer)[index] = (unsigned int)STBIR__ROUND_UINT(((double)stbir__saturate(encode_buffer[index])) * stbir__max_uint32_as_float); + } + } + break; + + case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_SRGB): + for (x=0; x < num_pixels; ++x) + { + int pixel_index = x*channels; + + for (n = 0; n < num_nonalpha; n++) + { + int index = pixel_index + nonalpha[n]; + ((unsigned int*)output_buffer)[index] = (unsigned int)STBIR__ROUND_UINT(((double)stbir__linear_to_srgb(stbir__saturate(encode_buffer[index]))) * stbir__max_uint32_as_float); + } + + if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE)) + ((unsigned int*)output_buffer)[pixel_index + alpha_channel] = (unsigned int)STBIR__ROUND_INT(((double)stbir__saturate(encode_buffer[pixel_index + alpha_channel])) * stbir__max_uint32_as_float); + } + break; + + case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_LINEAR): + for (x=0; x < num_pixels; ++x) + { + int pixel_index = x*channels; + + for (n = 0; n < channels; n++) + { + int index = pixel_index + n; + ((float*)output_buffer)[index] = encode_buffer[index]; + } + } + break; + + case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_SRGB): + for (x=0; 
x < num_pixels; ++x) + { + int pixel_index = x*channels; + + for (n = 0; n < num_nonalpha; n++) + { + int index = pixel_index + nonalpha[n]; + ((float*)output_buffer)[index] = stbir__linear_to_srgb(encode_buffer[index]); + } + + if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE)) + ((float*)output_buffer)[pixel_index + alpha_channel] = encode_buffer[pixel_index + alpha_channel]; + } + break; + + default: + STBIR_ASSERT(!"Unknown type/colorspace/channels combination."); + break; + } +} + +static void stbir__resample_vertical_upsample(stbir__info* stbir_info, int n) +{ + int x, k; + int output_w = stbir_info->output_w; + stbir__contributors* vertical_contributors = stbir_info->vertical_contributors; + float* vertical_coefficients = stbir_info->vertical_coefficients; + int channels = stbir_info->channels; + int alpha_channel = stbir_info->alpha_channel; + int type = stbir_info->type; + int colorspace = stbir_info->colorspace; + int ring_buffer_entries = stbir_info->ring_buffer_num_entries; + void* output_data = stbir_info->output_data; + float* encode_buffer = stbir_info->encode_buffer; + int decode = STBIR__DECODE(type, colorspace); + int coefficient_width = stbir_info->vertical_coefficient_width; + int coefficient_counter; + int contributor = n; + + float* ring_buffer = stbir_info->ring_buffer; + int ring_buffer_begin_index = stbir_info->ring_buffer_begin_index; + int ring_buffer_first_scanline = stbir_info->ring_buffer_first_scanline; + int ring_buffer_length = stbir_info->ring_buffer_length_bytes/sizeof(float); + + int n0,n1, output_row_start; + int coefficient_group = coefficient_width * contributor; + + n0 = vertical_contributors[contributor].n0; + n1 = vertical_contributors[contributor].n1; + + output_row_start = n * stbir_info->output_stride_bytes; + + STBIR_ASSERT(stbir__use_height_upsampling(stbir_info)); + + memset(encode_buffer, 0, output_w * sizeof(float) * channels); + + // I tried reblocking this for better cache usage of encode_buffer + // 
(using x_outer, k, x_inner), but it lost speed. -- stb + + coefficient_counter = 0; + switch (channels) { + case 1: + for (k = n0; k <= n1; k++) + { + int coefficient_index = coefficient_counter++; + float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length); + float coefficient = vertical_coefficients[coefficient_group + coefficient_index]; + for (x = 0; x < output_w; ++x) + { + int in_pixel_index = x * 1; + encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient; + } + } + break; + case 2: + for (k = n0; k <= n1; k++) + { + int coefficient_index = coefficient_counter++; + float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length); + float coefficient = vertical_coefficients[coefficient_group + coefficient_index]; + for (x = 0; x < output_w; ++x) + { + int in_pixel_index = x * 2; + encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient; + encode_buffer[in_pixel_index + 1] += ring_buffer_entry[in_pixel_index + 1] * coefficient; + } + } + break; + case 3: + for (k = n0; k <= n1; k++) + { + int coefficient_index = coefficient_counter++; + float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length); + float coefficient = vertical_coefficients[coefficient_group + coefficient_index]; + for (x = 0; x < output_w; ++x) + { + int in_pixel_index = x * 3; + encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient; + encode_buffer[in_pixel_index + 1] += ring_buffer_entry[in_pixel_index + 1] * coefficient; + encode_buffer[in_pixel_index + 2] += ring_buffer_entry[in_pixel_index + 2] * coefficient; + } + } + break; + case 4: + for (k = n0; k <= n1; k++) + { + 
int coefficient_index = coefficient_counter++; + float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length); + float coefficient = vertical_coefficients[coefficient_group + coefficient_index]; + for (x = 0; x < output_w; ++x) + { + int in_pixel_index = x * 4; + encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient; + encode_buffer[in_pixel_index + 1] += ring_buffer_entry[in_pixel_index + 1] * coefficient; + encode_buffer[in_pixel_index + 2] += ring_buffer_entry[in_pixel_index + 2] * coefficient; + encode_buffer[in_pixel_index + 3] += ring_buffer_entry[in_pixel_index + 3] * coefficient; + } + } + break; + default: + for (k = n0; k <= n1; k++) + { + int coefficient_index = coefficient_counter++; + float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length); + float coefficient = vertical_coefficients[coefficient_group + coefficient_index]; + for (x = 0; x < output_w; ++x) + { + int in_pixel_index = x * channels; + int c; + for (c = 0; c < channels; c++) + encode_buffer[in_pixel_index + c] += ring_buffer_entry[in_pixel_index + c] * coefficient; + } + } + break; + } + stbir__encode_scanline(stbir_info, output_w, (char *) output_data + output_row_start, encode_buffer, channels, alpha_channel, decode); +} + +static void stbir__resample_vertical_downsample(stbir__info* stbir_info, int n) +{ + int x, k; + int output_w = stbir_info->output_w; + stbir__contributors* vertical_contributors = stbir_info->vertical_contributors; + float* vertical_coefficients = stbir_info->vertical_coefficients; + int channels = stbir_info->channels; + int ring_buffer_entries = stbir_info->ring_buffer_num_entries; + float* horizontal_buffer = stbir_info->horizontal_buffer; + int coefficient_width = stbir_info->vertical_coefficient_width; + int 
contributor = n + stbir_info->vertical_filter_pixel_margin; + + float* ring_buffer = stbir_info->ring_buffer; + int ring_buffer_begin_index = stbir_info->ring_buffer_begin_index; + int ring_buffer_first_scanline = stbir_info->ring_buffer_first_scanline; + int ring_buffer_length = stbir_info->ring_buffer_length_bytes/sizeof(float); + int n0,n1; + + n0 = vertical_contributors[contributor].n0; + n1 = vertical_contributors[contributor].n1; + + STBIR_ASSERT(!stbir__use_height_upsampling(stbir_info)); + + for (k = n0; k <= n1; k++) + { + int coefficient_index = k - n0; + int coefficient_group = coefficient_width * contributor; + float coefficient = vertical_coefficients[coefficient_group + coefficient_index]; + + float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length); + + switch (channels) { + case 1: + for (x = 0; x < output_w; x++) + { + int in_pixel_index = x * 1; + ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient; + } + break; + case 2: + for (x = 0; x < output_w; x++) + { + int in_pixel_index = x * 2; + ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient; + ring_buffer_entry[in_pixel_index + 1] += horizontal_buffer[in_pixel_index + 1] * coefficient; + } + break; + case 3: + for (x = 0; x < output_w; x++) + { + int in_pixel_index = x * 3; + ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient; + ring_buffer_entry[in_pixel_index + 1] += horizontal_buffer[in_pixel_index + 1] * coefficient; + ring_buffer_entry[in_pixel_index + 2] += horizontal_buffer[in_pixel_index + 2] * coefficient; + } + break; + case 4: + for (x = 0; x < output_w; x++) + { + int in_pixel_index = x * 4; + ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient; + ring_buffer_entry[in_pixel_index + 1] += 
horizontal_buffer[in_pixel_index + 1] * coefficient; + ring_buffer_entry[in_pixel_index + 2] += horizontal_buffer[in_pixel_index + 2] * coefficient; + ring_buffer_entry[in_pixel_index + 3] += horizontal_buffer[in_pixel_index + 3] * coefficient; + } + break; + default: + for (x = 0; x < output_w; x++) + { + int in_pixel_index = x * channels; + + int c; + for (c = 0; c < channels; c++) + ring_buffer_entry[in_pixel_index + c] += horizontal_buffer[in_pixel_index + c] * coefficient; + } + break; + } + } +} + +static void stbir__buffer_loop_upsample(stbir__info* stbir_info) +{ + int y; + float scale_ratio = stbir_info->vertical_scale; + float out_scanlines_radius = stbir__filter_info_table[stbir_info->vertical_filter].support(1/scale_ratio) * scale_ratio; + + STBIR_ASSERT(stbir__use_height_upsampling(stbir_info)); + + for (y = 0; y < stbir_info->output_h; y++) + { + float in_center_of_out = 0; // Center of the current out scanline in the in scanline space + int in_first_scanline = 0, in_last_scanline = 0; + + stbir__calculate_sample_range_upsample(y, out_scanlines_radius, scale_ratio, stbir_info->vertical_shift, &in_first_scanline, &in_last_scanline, &in_center_of_out); + + STBIR_ASSERT(in_last_scanline - in_first_scanline + 1 <= stbir_info->ring_buffer_num_entries); + + if (stbir_info->ring_buffer_begin_index >= 0) + { + // Get rid of whatever we don't need anymore. + while (in_first_scanline > stbir_info->ring_buffer_first_scanline) + { + if (stbir_info->ring_buffer_first_scanline == stbir_info->ring_buffer_last_scanline) + { + // We just popped the last scanline off the ring buffer. + // Reset it to the empty state. 
+ stbir_info->ring_buffer_begin_index = -1; + stbir_info->ring_buffer_first_scanline = 0; + stbir_info->ring_buffer_last_scanline = 0; + break; + } + else + { + stbir_info->ring_buffer_first_scanline++; + stbir_info->ring_buffer_begin_index = (stbir_info->ring_buffer_begin_index + 1) % stbir_info->ring_buffer_num_entries; + } + } + } + + // Load in new ones. + if (stbir_info->ring_buffer_begin_index < 0) + stbir__decode_and_resample_upsample(stbir_info, in_first_scanline); + + while (in_last_scanline > stbir_info->ring_buffer_last_scanline) + stbir__decode_and_resample_upsample(stbir_info, stbir_info->ring_buffer_last_scanline + 1); + + // Now all buffers should be ready to write a row of vertical sampling. + stbir__resample_vertical_upsample(stbir_info, y); + + STBIR_PROGRESS_REPORT((float)y / stbir_info->output_h); + } +} + +static void stbir__empty_ring_buffer(stbir__info* stbir_info, int first_necessary_scanline) +{ + int output_stride_bytes = stbir_info->output_stride_bytes; + int channels = stbir_info->channels; + int alpha_channel = stbir_info->alpha_channel; + int type = stbir_info->type; + int colorspace = stbir_info->colorspace; + int output_w = stbir_info->output_w; + void* output_data = stbir_info->output_data; + int decode = STBIR__DECODE(type, colorspace); + + float* ring_buffer = stbir_info->ring_buffer; + int ring_buffer_length = stbir_info->ring_buffer_length_bytes/sizeof(float); + + if (stbir_info->ring_buffer_begin_index >= 0) + { + // Get rid of whatever we don't need anymore. 
+ while (first_necessary_scanline > stbir_info->ring_buffer_first_scanline) + { + if (stbir_info->ring_buffer_first_scanline >= 0 && stbir_info->ring_buffer_first_scanline < stbir_info->output_h) + { + int output_row_start = stbir_info->ring_buffer_first_scanline * output_stride_bytes; + float* ring_buffer_entry = stbir__get_ring_buffer_entry(ring_buffer, stbir_info->ring_buffer_begin_index, ring_buffer_length); + stbir__encode_scanline(stbir_info, output_w, (char *) output_data + output_row_start, ring_buffer_entry, channels, alpha_channel, decode); + STBIR_PROGRESS_REPORT((float)stbir_info->ring_buffer_first_scanline / stbir_info->output_h); + } + + if (stbir_info->ring_buffer_first_scanline == stbir_info->ring_buffer_last_scanline) + { + // We just popped the last scanline off the ring buffer. + // Reset it to the empty state. + stbir_info->ring_buffer_begin_index = -1; + stbir_info->ring_buffer_first_scanline = 0; + stbir_info->ring_buffer_last_scanline = 0; + break; + } + else + { + stbir_info->ring_buffer_first_scanline++; + stbir_info->ring_buffer_begin_index = (stbir_info->ring_buffer_begin_index + 1) % stbir_info->ring_buffer_num_entries; + } + } + } +} + +static void stbir__buffer_loop_downsample(stbir__info* stbir_info) +{ + int y; + float scale_ratio = stbir_info->vertical_scale; + int output_h = stbir_info->output_h; + float in_pixels_radius = stbir__filter_info_table[stbir_info->vertical_filter].support(scale_ratio) / scale_ratio; + int pixel_margin = stbir_info->vertical_filter_pixel_margin; + int max_y = stbir_info->input_h + pixel_margin; + + STBIR_ASSERT(!stbir__use_height_upsampling(stbir_info)); + + for (y = -pixel_margin; y < max_y; y++) + { + float out_center_of_in; // Center of the current out scanline in the in scanline space + int out_first_scanline, out_last_scanline; + + stbir__calculate_sample_range_downsample(y, in_pixels_radius, scale_ratio, stbir_info->vertical_shift, &out_first_scanline, &out_last_scanline, &out_center_of_in); + + 
STBIR_ASSERT(out_last_scanline - out_first_scanline + 1 <= stbir_info->ring_buffer_num_entries); + + if (out_last_scanline < 0 || out_first_scanline >= output_h) + continue; + + stbir__empty_ring_buffer(stbir_info, out_first_scanline); + + stbir__decode_and_resample_downsample(stbir_info, y); + + // Load in new ones. + if (stbir_info->ring_buffer_begin_index < 0) + stbir__add_empty_ring_buffer_entry(stbir_info, out_first_scanline); + + while (out_last_scanline > stbir_info->ring_buffer_last_scanline) + stbir__add_empty_ring_buffer_entry(stbir_info, stbir_info->ring_buffer_last_scanline + 1); + + // Now the horizontal buffer is ready to write to all ring buffer rows. + stbir__resample_vertical_downsample(stbir_info, y); + } + + stbir__empty_ring_buffer(stbir_info, stbir_info->output_h); +} + +static void stbir__setup(stbir__info *info, int input_w, int input_h, int output_w, int output_h, int channels) +{ + info->input_w = input_w; + info->input_h = input_h; + info->output_w = output_w; + info->output_h = output_h; + info->channels = channels; +} + +static void stbir__calculate_transform(stbir__info *info, float s0, float t0, float s1, float t1, float *transform) +{ + info->s0 = s0; + info->t0 = t0; + info->s1 = s1; + info->t1 = t1; + + if (transform) + { + info->horizontal_scale = transform[0]; + info->vertical_scale = transform[1]; + info->horizontal_shift = transform[2]; + info->vertical_shift = transform[3]; + } + else + { + info->horizontal_scale = ((float)info->output_w / info->input_w) / (s1 - s0); + info->vertical_scale = ((float)info->output_h / info->input_h) / (t1 - t0); + + info->horizontal_shift = s0 * info->output_w / (s1 - s0); + info->vertical_shift = t0 * info->output_h / (t1 - t0); + } +} + +static void stbir__choose_filter(stbir__info *info, stbir_filter h_filter, stbir_filter v_filter) +{ + if (h_filter == 0) + h_filter = stbir__use_upsampling(info->horizontal_scale) ? 
STBIR_DEFAULT_FILTER_UPSAMPLE : STBIR_DEFAULT_FILTER_DOWNSAMPLE; + if (v_filter == 0) + v_filter = stbir__use_upsampling(info->vertical_scale) ? STBIR_DEFAULT_FILTER_UPSAMPLE : STBIR_DEFAULT_FILTER_DOWNSAMPLE; + info->horizontal_filter = h_filter; + info->vertical_filter = v_filter; +} + +static stbir_uint32 stbir__calculate_memory(stbir__info *info) +{ + int pixel_margin = stbir__get_filter_pixel_margin(info->horizontal_filter, info->horizontal_scale); + int filter_height = stbir__get_filter_pixel_width(info->vertical_filter, info->vertical_scale); + + info->horizontal_num_contributors = stbir__get_contributors(info->horizontal_scale, info->horizontal_filter, info->input_w, info->output_w); + info->vertical_num_contributors = stbir__get_contributors(info->vertical_scale , info->vertical_filter , info->input_h, info->output_h); + + // One extra entry because floating point precision problems sometimes cause an extra to be necessary. + info->ring_buffer_num_entries = filter_height + 1; + + info->horizontal_contributors_size = info->horizontal_num_contributors * sizeof(stbir__contributors); + info->horizontal_coefficients_size = stbir__get_total_horizontal_coefficients(info) * sizeof(float); + info->vertical_contributors_size = info->vertical_num_contributors * sizeof(stbir__contributors); + info->vertical_coefficients_size = stbir__get_total_vertical_coefficients(info) * sizeof(float); + info->decode_buffer_size = (info->input_w + pixel_margin * 2) * info->channels * sizeof(float); + info->horizontal_buffer_size = info->output_w * info->channels * sizeof(float); + info->ring_buffer_size = info->output_w * info->channels * info->ring_buffer_num_entries * sizeof(float); + info->encode_buffer_size = info->output_w * info->channels * sizeof(float); + + STBIR_ASSERT(info->horizontal_filter != 0); + STBIR_ASSERT(info->horizontal_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table)); // this now happens too late + STBIR_ASSERT(info->vertical_filter != 0); + 
STBIR_ASSERT(info->vertical_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table)); // this now happens too late + + if (stbir__use_height_upsampling(info)) + // The horizontal buffer is for when we're downsampling the height and we + // can't output the result of sampling the decode buffer directly into the + // ring buffers. + info->horizontal_buffer_size = 0; + else + // The encode buffer is to retain precision in the height upsampling method + // and isn't used when height downsampling. + info->encode_buffer_size = 0; + + return info->horizontal_contributors_size + info->horizontal_coefficients_size + + info->vertical_contributors_size + info->vertical_coefficients_size + + info->decode_buffer_size + info->horizontal_buffer_size + + info->ring_buffer_size + info->encode_buffer_size; +} + +static int stbir__resize_allocated(stbir__info *info, + const void* input_data, int input_stride_in_bytes, + void* output_data, int output_stride_in_bytes, + int alpha_channel, stbir_uint32 flags, stbir_datatype type, + stbir_edge edge_horizontal, stbir_edge edge_vertical, stbir_colorspace colorspace, + void* tempmem, size_t tempmem_size_in_bytes) +{ + size_t memory_required = stbir__calculate_memory(info); + + int width_stride_input = input_stride_in_bytes ? input_stride_in_bytes : info->channels * info->input_w * stbir__type_size[type]; + int width_stride_output = output_stride_in_bytes ? 
output_stride_in_bytes : info->channels * info->output_w * stbir__type_size[type]; + +#ifdef STBIR_DEBUG_OVERWRITE_TEST +#define OVERWRITE_ARRAY_SIZE 8 + unsigned char overwrite_output_before_pre[OVERWRITE_ARRAY_SIZE]; + unsigned char overwrite_tempmem_before_pre[OVERWRITE_ARRAY_SIZE]; + unsigned char overwrite_output_after_pre[OVERWRITE_ARRAY_SIZE]; + unsigned char overwrite_tempmem_after_pre[OVERWRITE_ARRAY_SIZE]; + + size_t begin_forbidden = width_stride_output * (info->output_h - 1) + info->output_w * info->channels * stbir__type_size[type]; + memcpy(overwrite_output_before_pre, &((unsigned char*)output_data)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE); + memcpy(overwrite_output_after_pre, &((unsigned char*)output_data)[begin_forbidden], OVERWRITE_ARRAY_SIZE); + memcpy(overwrite_tempmem_before_pre, &((unsigned char*)tempmem)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE); + memcpy(overwrite_tempmem_after_pre, &((unsigned char*)tempmem)[tempmem_size_in_bytes], OVERWRITE_ARRAY_SIZE); +#endif + + STBIR_ASSERT(info->channels >= 0); + STBIR_ASSERT(info->channels <= STBIR_MAX_CHANNELS); + + if (info->channels < 0 || info->channels > STBIR_MAX_CHANNELS) + return 0; + + STBIR_ASSERT(info->horizontal_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table)); + STBIR_ASSERT(info->vertical_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table)); + + if (info->horizontal_filter >= STBIR__ARRAY_SIZE(stbir__filter_info_table)) + return 0; + if (info->vertical_filter >= STBIR__ARRAY_SIZE(stbir__filter_info_table)) + return 0; + + if (alpha_channel < 0) + flags |= STBIR_FLAG_ALPHA_USES_COLORSPACE | STBIR_FLAG_ALPHA_PREMULTIPLIED; + + if (!(flags&STBIR_FLAG_ALPHA_USES_COLORSPACE) || !(flags&STBIR_FLAG_ALPHA_PREMULTIPLIED)) + STBIR_ASSERT(alpha_channel >= 0 && alpha_channel < info->channels); + + if (alpha_channel >= info->channels) + return 0; + + STBIR_ASSERT(tempmem); + + if (!tempmem) + return 0; + + STBIR_ASSERT(tempmem_size_in_bytes >= memory_required); + + if 
(tempmem_size_in_bytes < memory_required) + return 0; + + memset(tempmem, 0, tempmem_size_in_bytes); + + info->input_data = input_data; + info->input_stride_bytes = width_stride_input; + + info->output_data = output_data; + info->output_stride_bytes = width_stride_output; + + info->alpha_channel = alpha_channel; + info->flags = flags; + info->type = type; + info->edge_horizontal = edge_horizontal; + info->edge_vertical = edge_vertical; + info->colorspace = colorspace; + + info->horizontal_coefficient_width = stbir__get_coefficient_width (info->horizontal_filter, info->horizontal_scale); + info->vertical_coefficient_width = stbir__get_coefficient_width (info->vertical_filter , info->vertical_scale ); + info->horizontal_filter_pixel_width = stbir__get_filter_pixel_width (info->horizontal_filter, info->horizontal_scale); + info->vertical_filter_pixel_width = stbir__get_filter_pixel_width (info->vertical_filter , info->vertical_scale ); + info->horizontal_filter_pixel_margin = stbir__get_filter_pixel_margin(info->horizontal_filter, info->horizontal_scale); + info->vertical_filter_pixel_margin = stbir__get_filter_pixel_margin(info->vertical_filter , info->vertical_scale ); + + info->ring_buffer_length_bytes = info->output_w * info->channels * sizeof(float); + info->decode_buffer_pixels = info->input_w + info->horizontal_filter_pixel_margin * 2; + +#define STBIR__NEXT_MEMPTR(current, newtype) (newtype*)(((unsigned char*)current) + current##_size) + + info->horizontal_contributors = (stbir__contributors *) tempmem; + info->horizontal_coefficients = STBIR__NEXT_MEMPTR(info->horizontal_contributors, float); + info->vertical_contributors = STBIR__NEXT_MEMPTR(info->horizontal_coefficients, stbir__contributors); + info->vertical_coefficients = STBIR__NEXT_MEMPTR(info->vertical_contributors, float); + info->decode_buffer = STBIR__NEXT_MEMPTR(info->vertical_coefficients, float); + + if (stbir__use_height_upsampling(info)) + { + info->horizontal_buffer = NULL; + info->ring_buffer 
= STBIR__NEXT_MEMPTR(info->decode_buffer, float); + info->encode_buffer = STBIR__NEXT_MEMPTR(info->ring_buffer, float); + + STBIR_ASSERT((size_t)STBIR__NEXT_MEMPTR(info->encode_buffer, unsigned char) == (size_t)tempmem + tempmem_size_in_bytes); + } + else + { + info->horizontal_buffer = STBIR__NEXT_MEMPTR(info->decode_buffer, float); + info->ring_buffer = STBIR__NEXT_MEMPTR(info->horizontal_buffer, float); + info->encode_buffer = NULL; + + STBIR_ASSERT((size_t)STBIR__NEXT_MEMPTR(info->ring_buffer, unsigned char) == (size_t)tempmem + tempmem_size_in_bytes); + } + +#undef STBIR__NEXT_MEMPTR + + // This signals that the ring buffer is empty + info->ring_buffer_begin_index = -1; + + stbir__calculate_filters(info->horizontal_contributors, info->horizontal_coefficients, info->horizontal_filter, info->horizontal_scale, info->horizontal_shift, info->input_w, info->output_w); + stbir__calculate_filters(info->vertical_contributors, info->vertical_coefficients, info->vertical_filter, info->vertical_scale, info->vertical_shift, info->input_h, info->output_h); + + STBIR_PROGRESS_REPORT(0); + + if (stbir__use_height_upsampling(info)) + stbir__buffer_loop_upsample(info); + else + stbir__buffer_loop_downsample(info); + + STBIR_PROGRESS_REPORT(1); + +#ifdef STBIR_DEBUG_OVERWRITE_TEST + STBIR_ASSERT(memcmp(overwrite_output_before_pre, &((unsigned char*)output_data)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE) == 0); + STBIR_ASSERT(memcmp(overwrite_output_after_pre, &((unsigned char*)output_data)[begin_forbidden], OVERWRITE_ARRAY_SIZE) == 0); + STBIR_ASSERT(memcmp(overwrite_tempmem_before_pre, &((unsigned char*)tempmem)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE) == 0); + STBIR_ASSERT(memcmp(overwrite_tempmem_after_pre, &((unsigned char*)tempmem)[tempmem_size_in_bytes], OVERWRITE_ARRAY_SIZE) == 0); +#endif + + return 1; +} + + +static int stbir__resize_arbitrary( + void *alloc_context, + const void* input_data, int input_w, int input_h, int input_stride_in_bytes, + void* 
output_data, int output_w, int output_h, int output_stride_in_bytes, + float s0, float t0, float s1, float t1, float *transform, + int channels, int alpha_channel, stbir_uint32 flags, stbir_datatype type, + stbir_filter h_filter, stbir_filter v_filter, + stbir_edge edge_horizontal, stbir_edge edge_vertical, stbir_colorspace colorspace) +{ + stbir__info info; + int result; + size_t memory_required; + void* extra_memory; + + stbir__setup(&info, input_w, input_h, output_w, output_h, channels); + stbir__calculate_transform(&info, s0,t0,s1,t1,transform); + stbir__choose_filter(&info, h_filter, v_filter); + memory_required = stbir__calculate_memory(&info); + extra_memory = STBIR_MALLOC(memory_required, alloc_context); + + if (!extra_memory) + return 0; + + result = stbir__resize_allocated(&info, input_data, input_stride_in_bytes, + output_data, output_stride_in_bytes, + alpha_channel, flags, type, + edge_horizontal, edge_vertical, + colorspace, extra_memory, memory_required); + + STBIR_FREE(extra_memory, alloc_context); + + return result; +} + +STBIRDEF int stbir_resize_uint8( const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels) +{ + return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes, + output_pixels, output_w, output_h, output_stride_in_bytes, + 0,0,1,1,NULL,num_channels,-1,0, STBIR_TYPE_UINT8, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT, + STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_LINEAR); +} + +STBIRDEF int stbir_resize_float( const float *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + float *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels) +{ + return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes, + output_pixels, output_w, output_h, output_stride_in_bytes, + 
0,0,1,1,NULL,num_channels,-1,0, STBIR_TYPE_FLOAT, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT, + STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_LINEAR); +} + +STBIRDEF int stbir_resize_uint8_srgb(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags) +{ + return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes, + output_pixels, output_w, output_h, output_stride_in_bytes, + 0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT8, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT, + STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB); +} + +STBIRDEF int stbir_resize_uint8_srgb_edgemode(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_wrap_mode) +{ + return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes, + output_pixels, output_w, output_h, output_stride_in_bytes, + 0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT8, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT, + edge_wrap_mode, edge_wrap_mode, STBIR_COLORSPACE_SRGB); +} + +STBIRDEF int stbir_resize_uint8_generic( const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, + void *alloc_context) +{ + return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes, + output_pixels, output_w, output_h, output_stride_in_bytes, + 0,0,1,1,NULL,num_channels,alpha_channel,flags, 
STBIR_TYPE_UINT8, filter, filter, + edge_wrap_mode, edge_wrap_mode, space); +} + +STBIRDEF int stbir_resize_uint16_generic(const stbir_uint16 *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + stbir_uint16 *output_pixels , int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, + void *alloc_context) +{ + return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes, + output_pixels, output_w, output_h, output_stride_in_bytes, + 0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT16, filter, filter, + edge_wrap_mode, edge_wrap_mode, space); +} + + +STBIRDEF int stbir_resize_float_generic( const float *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + float *output_pixels , int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, + void *alloc_context) +{ + return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes, + output_pixels, output_w, output_h, output_stride_in_bytes, + 0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_FLOAT, filter, filter, + edge_wrap_mode, edge_wrap_mode, space); +} + + +STBIRDEF int stbir_resize( const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + void *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + stbir_datatype datatype, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, + stbir_filter filter_horizontal, stbir_filter filter_vertical, + stbir_colorspace space, void *alloc_context) +{ + return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes, + output_pixels, output_w, output_h, 
output_stride_in_bytes, + 0,0,1,1,NULL,num_channels,alpha_channel,flags, datatype, filter_horizontal, filter_vertical, + edge_mode_horizontal, edge_mode_vertical, space); +} + + +STBIRDEF int stbir_resize_subpixel(const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + void *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + stbir_datatype datatype, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, + stbir_filter filter_horizontal, stbir_filter filter_vertical, + stbir_colorspace space, void *alloc_context, + float x_scale, float y_scale, + float x_offset, float y_offset) +{ + float transform[4]; + transform[0] = x_scale; + transform[1] = y_scale; + transform[2] = x_offset; + transform[3] = y_offset; + return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes, + output_pixels, output_w, output_h, output_stride_in_bytes, + 0,0,1,1,transform,num_channels,alpha_channel,flags, datatype, filter_horizontal, filter_vertical, + edge_mode_horizontal, edge_mode_vertical, space); +} + +STBIRDEF int stbir_resize_region( const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + void *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + stbir_datatype datatype, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, + stbir_filter filter_horizontal, stbir_filter filter_vertical, + stbir_colorspace space, void *alloc_context, + float s0, float t0, float s1, float t1) +{ + return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes, + output_pixels, output_w, output_h, output_stride_in_bytes, + s0,t0,s1,t1,NULL,num_channels,alpha_channel,flags, datatype, filter_horizontal, filter_vertical, + edge_mode_horizontal, edge_mode_vertical, space); +} + +#endif // 
STB_IMAGE_RESIZE_IMPLEMENTATION + +/* +------------------------------------------------------------------------------ +This software is available under 2 licenses -- choose whichever you prefer. +------------------------------------------------------------------------------ +ALTERNATIVE A - MIT License +Copyright (c) 2017 Sean Barrett +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +------------------------------------------------------------------------------ +ALTERNATIVE B - Public Domain (www.unlicense.org) +This is free and unencumbered software released into the public domain. +Anyone is free to copy, modify, publish, use, compile, sell, or distribute this +software, either in source code form or as a compiled binary, for any purpose, +commercial or non-commercial, and by any means. +In jurisdictions that recognize copyright laws, the author or authors of this +software dedicate any and all copyright interest in the software to the public +domain. 
We make this dedication for the benefit of the public at large and to +the detriment of our heirs and successors. We intend this dedication to be an +overt act of relinquishment in perpetuity of all present and future rights to +this software under copyright law. +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN +ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. +------------------------------------------------------------------------------ +*/ diff --git a/external/tinyobjloader b/external/tinyobjloader new file mode 160000 +Subproject d541711a794343de4ef5ea76f037c9fb9c127a5 diff --git a/premake5.lua b/premake5.lua index 2953922b2..1c900c876 100644 --- a/premake5.lua +++ b/premake5.lua @@ -131,6 +131,14 @@ function baseSlangProject(name, baseDir) -- project(name) + -- We need every project to have a stable UUID for + -- output formats (like Visual Studio and XCode projects) + -- that use UUIDs rather than names to uniquely identify + -- projects. If we don't have a stable UUID, then the + -- output files might have spurious diffs whenever we + -- re-run premake generation. + uuid(os.uuid(projectDir)) + -- Set the location where the project file will be placed. -- We set the project files to reside in their source -- directory, because in Visual Studio the default @@ -252,11 +260,16 @@ function example(name) -- if it is going to use Slang, so we might as well set up a suitable -- include path here rather than make each example do it. -- - includedirs { "." } + -- Most of the examples also need the `gfx` library, + -- which lives under `tools/`, so we will add that to the path as well. 
+ -- + includedirs { ".", "tools" } -- The examples also need to link against the slang library, - -- so we specify that here rather than in each example. - links { "slang" } + -- and the `gfx` abstraction layer (which in turn + -- depends on the `core` library). We specify all of that here, + -- rather than in each example. + links { "slang", "core", "gfx" } end -- @@ -264,23 +277,17 @@ end -- actual projects quite simply. For example, here is the entire -- declaration of the "Hello, World" example project: -- -example "hello" - uuid "E6385042-1649-4803-9EBD-168F8B7EF131" - includedirs { ".", "tools" } - links { "core", "slang-graphics" } +example "hello-world" -- -- Note how we are calling our custom `example()` subroutine with -- the same syntax sugar that Premake usually advocates for their -- `project()` function. This allows us to treat `example` as -- a kind of specialized "subclass" of `project` -- --- The call to `uuid()` in the definition of `hello` establishes --- the UUID/GUID that will be used for the project in generated --- formats that use these as unique identifiers (e.g., Visual --- Studio solutions). Without this call, Premake will generate --- a fresh UUID for a project each time its generation logic --- runs, which can create spurious diffs. --- + +-- Let's go ahead and set up the projects for our other example now. +example "model-viewer" + -- Most of the other projects have more interesting configuration going -- on, so let's walk through them in order of increasing complexity. 
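Note on the `uuid(os.uuid(projectDir))` change in the hunk above: the point of seeding the UUID from the project directory is that the same input string gives the same identifier on every premake run, so regenerated Visual Studio/Xcode files do not pick up spurious diffs. The C sketch below only illustrates that idea. It is not Premake's actual algorithm; the FNV-1a hash, the seed constant, and the name `stable_id_from_name` are assumptions made for the example.

```c
#include <stdint.h>
#include <stdio.h>

/* 64-bit FNV-1a hash. The `seed` argument is a hypothetical tweak that
 * lets us derive two different 64-bit halves from the same string. */
static uint64_t fnv1a64(const char *s, uint64_t seed)
{
    uint64_t h = 14695981039346656037ULL ^ seed;
    for (; *s; ++s) {
        h ^= (unsigned char)*s;
        h *= 1099511628211ULL;
    }
    return h;
}

/* Hypothetical helper: writes a 36-character, UUID-shaped string
 * (plus the terminating NUL) into out[37]. The result depends only
 * on `name`, so repeated runs produce identical output. */
static void stable_id_from_name(const char *name, char out[37])
{
    uint64_t hi = fnv1a64(name, 0);
    uint64_t lo = fnv1a64(name, 0x9e3779b97f4a7c15ULL);

    snprintf(out, 37, "%08X-%04X-%04X-%04X-%012llX",
             (unsigned)(hi >> 32),
             (unsigned)((hi >> 16) & 0xFFFF),
             (unsigned)(hi & 0xFFFF),
             (unsigned)(lo >> 48),
             (unsigned long long)(lo & 0xFFFFFFFFFFFFULL));
}
```

Calling `stable_id_from_name("examples/hello-world", buf)` twice fills `buf` with the same string both times, while a different directory name gives a different string. That is the property the removed per-project `uuid "..."` lines used to provide by hand.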
@@ -364,8 +371,8 @@ tool "slang-eval-test" tool "render-test" uuid "96610759-07B9-4EEB-A974-5C634A2E742B" - includedirs { ".", "external", "source", "tools/slang-graphics" } - links { "core", "slang", "slang-graphics" } + includedirs { ".", "external", "source", "tools/gfx" } + links { "core", "slang", "gfx" } filter { "system:windows" } systemversion "10.0.14393.0" @@ -376,12 +383,12 @@ tool "render-test" postbuildcommands { '"$(SolutionDir)tools\\copy-hlsl-libs.bat" "$(WindowsSdkDir)Redist/D3D/%{cfg.platform:lower()}/" "%{cfg.targetdir}/"'} -- --- `slang-graphics` is a utility library for doing GPU rendering +-- `gfx` is a utility library for doing GPU rendering -- and compute, which is used by both our testing and exmaples. -- It depends on teh `core` library, so we need to declare that: -- -tool "slang-graphics" +tool "gfx" uuid "222F7498-B40C-4F3F-A704-DDEB91A4484A" -- Unlike most of the code under `tools/`, this is a library -- rather than a stand-alone executable. @@ -474,13 +474,13 @@ extern "C" This type is generally compatible with the Windows API `HRESULT` type. In particular, negative values indicate failure results, while zero or positive results indicate success. - In general, Slang APIs always return a zero result on success, unless documented otherwise. Strictly speaking + In general, Slang APIs always return a zero result on success, unless documented otherwise. Strictly speaking a negative value indicates an error, a positive (or 0) value indicates success. This can be tested for with the macros SLANG_SUCCEEDED(x) or SLANG_FAILED(x). - - It can represent if the call was successful or not. It can also specify in an extensible manner what facility + + It can represent if the call was successful or not. It can also specify in an extensible manner what facility produced the result (as the integral 'facility') as well as what caused it (as an integral 'code'). - Under the covers SlangResult is represented as a int32_t. 
+ Under the covers SlangResult is represented as a int32_t. SlangResult is designed to be compatible with COM HRESULT. @@ -493,12 +493,12 @@ extern "C" Severity - 1 fail, 0 is success - as SlangResult is signed 32 bits, means negative number indicates failure. Facility is where the error originated from. Code is the code specific to the facility. - Result codes have the following styles, + Result codes have the following styles, 1) SLANG_name 2) SLANG_s_f_name 3) SLANG_s_name - where s is S for success, E for error + where s is S for success, E for error f is the short version of the facility name Style 1 is reserved for SLANG_OK and SLANG_FAIL as they are so commonly used. @@ -516,7 +516,7 @@ extern "C" //! Get the facility the result is associated with #define SLANG_GET_RESULT_FACILITY(r) ((int32_t)(((r) >> 16) & 0x7fff)) - //! Get the result code for the facility + //! Get the result code for the facility #define SLANG_GET_RESULT_CODE(r) ((int32_t)((r) & 0xffff)) #define SLANG_MAKE_ERROR(fac, code) ((((int32_t)(fac)) << 16) | ((int32_t)(code)) | 0x80000000) @@ -530,7 +530,7 @@ extern "C" #define SLANG_FACILITY_WIN_API 7 //! Base facility -> so as to not clash with HRESULT values (values in 0x200 range do not appear used) -#define SLANG_FACILITY_BASE 0x200 +#define SLANG_FACILITY_BASE 0x200 /*! Facilities numbers must be unique across a project to make the resulting result a unique number. It can be useful to have a consistent short name for a facility, as used in the name prefix */ @@ -539,7 +539,7 @@ extern "C" should never be part of a public API. */ #define SLANG_FACILITY_INTERNAL SLANG_FACILITY_BASE + 1 - /// Base for external facilities. Facilities should be unique across modules. + /// Base for external facilities. Facilities should be unique across modules. 
#define SLANG_FACILITY_EXTERNAL_BASE 0x210 /* ************************ Win COM compatible Results ******************************/ @@ -551,18 +551,18 @@ extern "C" #define SLANG_FAIL SLANG_MAKE_ERROR(SLANG_FACILITY_WIN_INTERFACE, 5) #define SLANG_MAKE_WIN_INTERFACE_ERROR(code) SLANG_MAKE_ERROR(SLANG_FACILITY_WIN_INTERFACE, code) -#define SLANG_MAKE_WIN_API_ERROR(code) SLANG_MAKE_ERROR(SLANG_FACILITY_WIN_API, code) +#define SLANG_MAKE_WIN_API_ERROR(code) SLANG_MAKE_ERROR(SLANG_FACILITY_WIN_API, code) - //! Functionality is not implemented + //! Functionality is not implemented #define SLANG_E_NOT_IMPLEMENTED SLANG_MAKE_WIN_INTERFACE_ERROR(1) - //! Interface not be found + //! Interface not be found #define SLANG_E_NO_INTERFACE SLANG_MAKE_WIN_INTERFACE_ERROR(2) - //! Operation was aborted (did not correctly complete) + //! Operation was aborted (did not correctly complete) #define SLANG_E_ABORT SLANG_MAKE_WIN_INTERFACE_ERROR(4) - //! Indicates that a handle passed in as parameter to a method is invalid. + //! Indicates that a handle passed in as parameter to a method is invalid. #define SLANG_E_INVALID_HANDLE SLANG_MAKE_ERROR(SLANG_FACILITY_WIN_API, 6) - //! Indicates that an argument passed in as parameter to a method is invalid. + //! Indicates that an argument passed in as parameter to a method is invalid. #define SLANG_E_INVALID_ARG SLANG_MAKE_ERROR(SLANG_FACILITY_WIN_API, 0x57) //! Operation could not complete - ran out of memory #define SLANG_E_OUT_OF_MEMORY SLANG_MAKE_ERROR(SLANG_FACILITY_WIN_API, 0xe) @@ -573,10 +573,10 @@ extern "C" // Supplied buffer is too small to be able to complete #define SLANG_E_BUFFER_TOO_SMALL SLANG_MAKE_CORE_ERROR(1) - //! Used to identify a Result that has yet to be initialized. - //! It defaults to failure such that if used incorrectly will fail, as similar in concept to using an uninitialized variable. + //! Used to identify a Result that has yet to be initialized. + //! 
It defaults to failure such that if used incorrectly will fail, as similar in concept to using an uninitialized variable. #define SLANG_E_UNINITIALIZED SLANG_MAKE_CORE_ERROR(2) - //! Returned from an async method meaning the output is invalid (thus an error), but a result for the request is pending, and will be returned on a subsequent call with the async handle. + //! Returned from an async method meaning the output is invalid (thus an error), but a result for the request is pending, and will be returned on a subsequent call with the async handle. #define SLANG_E_PENDING SLANG_MAKE_CORE_ERROR(3) //! Indicates a file/resource could not be opened #define SLANG_E_CANNOT_OPEN SLANG_MAKE_CORE_ERROR(4) @@ -747,7 +747,7 @@ extern "C" - SLANG_GLSL. Generates GLSL code. - SLANG_HLSL. Generates HLSL code. - SLANG_SPIRV. Generates SPIR-V code. - */ + */ SLANG_API void spSetCodeGenTarget( SlangCompileRequest* request, SlangCompileTarget target); @@ -816,7 +816,7 @@ extern "C" /*! @brief Set options using arguments as if specified via command line. - @return Returns SlangResult. On success SLANG_SUCCEEDED(result) is true. + @return Returns SlangResult. On success SLANG_SUCCEEDED(result) is true. 
*/ SLANG_API SlangResult spProcessCommandLineArguments( SlangCompileRequest* request, @@ -1276,6 +1276,8 @@ extern "C" SLANG_API SlangMatrixLayoutMode spReflectionTypeLayout_GetMatrixLayoutMode(SlangReflectionTypeLayout* type); + SLANG_API int spReflectionTypeLayout_getGenericParamIndex(SlangReflectionTypeLayout* type); + // Variable Reflection SLANG_API char const* spReflectionVariable_GetName(SlangReflectionVariable* var); @@ -1353,8 +1355,9 @@ extern "C" SLANG_API SlangReflectionTypeLayout* spReflection_GetTypeLayout(SlangReflection* reflection, SlangReflectionType* reflectionType, SlangLayoutRules rules); SLANG_API SlangUInt spReflection_getEntryPointCount(SlangReflection* reflection); - SLANG_API SlangReflectionEntryPoint* spReflection_getEntryPointByIndex(SlangReflection* reflection, SlangUInt index); + SLANG_API SlangReflectionEntryPoint* spReflection_findEntryPointByName(SlangReflection* reflection, char const* name); + SLANG_API SlangUInt spReflection_getGlobalConstantBufferBinding(SlangReflection* reflection); SLANG_API size_t spReflection_getGlobalConstantBufferSize(SlangReflection* reflection); @@ -1638,6 +1641,11 @@ namespace slang return spReflectionTypeLayout_GetMatrixLayoutMode((SlangReflectionTypeLayout*) this); } + int getGenericParamIndex() + { + return spReflectionTypeLayout_getGenericParamIndex( + (SlangReflectionTypeLayout*) this); + } }; struct Modifier @@ -1800,6 +1808,11 @@ namespace slang } }; + enum class LayoutRules : SlangLayoutRules + { + Default = SLANG_LAYOUT_RULES_DEFAULT, + }; + struct ShaderReflection { unsigned getParameterCount() @@ -1851,6 +1864,30 @@ namespace slang { return spReflection_getGlobalConstantBufferSize((SlangReflection*)this); } + + TypeReflection* findTypeByName(const char* name) + { + return (TypeReflection*)spReflection_FindTypeByName( + (SlangReflection*) this, + name); + } + + TypeLayoutReflection* getTypeLayout( + TypeReflection* type, + LayoutRules rules = LayoutRules::Default) + { + return 
(TypeLayoutReflection*)spReflection_GetTypeLayout( + (SlangReflection*) this, + (SlangReflectionType*)type, + SlangLayoutRules(rules)); + } + + EntryPointReflection* findEntryPointByName(const char* name) + { + return (EntryPointReflection*)spReflection_findEntryPointByName( + (SlangReflection*) this, + name); + } }; } @@ -3,7 +3,9 @@ Microsoft Visual Studio Solution File, Format Version 12.00 # Visual Studio 14 Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "examples", "examples", "{EB5FC2C6-D72D-B6CC-C0C1-26F3AC2E9231}" EndProject -Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "hello", "examples\hello\hello.vcxproj", "{E6385042-1649-4803-9EBD-168F8B7EF131}" +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "hello-world", "examples\hello-world\hello-world.vcxproj", "{5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}" +EndProject +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "model-viewer", "examples\model-viewer\model-viewer.vcxproj", "{639B13F2-CF07-CFEC-98FB-664A0427F154}" EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "core", "source\core\core.vcxproj", "{F9BE7957-8399-899E-0C49-E714FDDD4B65}" EndProject @@ -19,7 +21,7 @@ Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "slang-eval-test", "tools\sl EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "render-test", "tools\render-test\render-test.vcxproj", "{96610759-07B9-4EEB-A974-5C634A2E742B}" EndProject -Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "slang-graphics", "tools\slang-graphics\slang-graphics.vcxproj", "{222F7498-B40C-4F3F-A704-DDEB91A4484A}" +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "gfx", "tools\gfx\gfx.vcxproj", "{222F7498-B40C-4F3F-A704-DDEB91A4484A}" EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "slangc", "source\slangc\slangc.vcxproj", "{D56CBCEB-1EB5-4CA8-AEC4-48EA35ED61C7}" EndProject @@ -38,14 +40,22 @@ Global Release|x64 = Release|x64 EndGlobalSection GlobalSection(ProjectConfigurationPlatforms) = postSolution - 
{E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|Win32.ActiveCfg = Debug|Win32 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|Win32.Build.0 = Debug|Win32 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|x64.ActiveCfg = Debug|x64 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|x64.Build.0 = Debug|x64 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|Win32.ActiveCfg = Release|Win32 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|Win32.Build.0 = Release|Win32 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|x64.ActiveCfg = Release|x64 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|x64.Build.0 = Release|x64 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Debug|Win32.ActiveCfg = Debug|Win32 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Debug|Win32.Build.0 = Debug|Win32 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Debug|x64.ActiveCfg = Debug|x64 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Debug|x64.Build.0 = Debug|x64 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Release|Win32.ActiveCfg = Release|Win32 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Release|Win32.Build.0 = Release|Win32 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Release|x64.ActiveCfg = Release|x64 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Release|x64.Build.0 = Release|x64 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Debug|Win32.ActiveCfg = Debug|Win32 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Debug|Win32.Build.0 = Debug|Win32 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Debug|x64.ActiveCfg = Debug|x64 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Debug|x64.Build.0 = Debug|x64 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Release|Win32.ActiveCfg = Release|Win32 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Release|Win32.Build.0 = Release|Win32 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Release|x64.ActiveCfg = Release|x64 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Release|x64.Build.0 = Release|x64 {F9BE7957-8399-899E-0C49-E714FDDD4B65}.Debug|Win32.ActiveCfg = Debug|Win32 {F9BE7957-8399-899E-0C49-E714FDDD4B65}.Debug|Win32.Build.0 = Debug|Win32 
{F9BE7957-8399-899E-0C49-E714FDDD4B65}.Debug|x64.ActiveCfg = Debug|x64 @@ -131,7 +141,8 @@ Global HideSolutionNode = FALSE EndGlobalSection GlobalSection(NestedProjects) = preSolution - {E6385042-1649-4803-9EBD-168F8B7EF131} = {EB5FC2C6-D72D-B6CC-C0C1-26F3AC2E9231} + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA} = {EB5FC2C6-D72D-B6CC-C0C1-26F3AC2E9231} + {639B13F2-CF07-CFEC-98FB-664A0427F154} = {EB5FC2C6-D72D-B6CC-C0C1-26F3AC2E9231} {66174227-8541-41FC-A6DF-4764FC66F78E} = {FD47AE19-69FD-260F-F2F1-20E65EA61D13} {0C768A18-1D25-4000-9F37-DA5FE99E3B64} = {FD47AE19-69FD-260F-F2F1-20E65EA61D13} {22C45F4F-FB6B-4535-BED1-D3F5D0C71047} = {FD47AE19-69FD-260F-F2F1-20E65EA61D13} diff --git a/source/core/core.vcxproj b/source/core/core.vcxproj index ecd8ee07b..3dbfaac3f 100644 --- a/source/core/core.vcxproj +++ b/source/core/core.vcxproj @@ -182,12 +182,9 @@ <ClInclude Include="list.h" /> <ClInclude Include="platform.h" /> <ClInclude Include="secure-crt.h" /> - <ClInclude Include="slang-com-ptr.h" /> - <ClInclude Include="slang-defines.h" /> <ClInclude Include="slang-free-list.h" /> <ClInclude Include="slang-io.h" /> <ClInclude Include="slang-math.h" /> - <ClInclude Include="slang-result.h" /> <ClInclude Include="slang-string-util.h" /> <ClInclude Include="slang-string.h" /> <ClInclude Include="smart-pointer.h" /> diff --git a/source/core/core.vcxproj.filters b/source/core/core.vcxproj.filters index 39a164770..27b0fe82f 100644 --- a/source/core/core.vcxproj.filters +++ b/source/core/core.vcxproj.filters @@ -45,12 +45,6 @@ <ClInclude Include="secure-crt.h"> <Filter>Header Files</Filter> </ClInclude> - <ClInclude Include="slang-com-ptr.h"> - <Filter>Header Files</Filter> - </ClInclude> - <ClInclude Include="slang-defines.h"> - <Filter>Header Files</Filter> - </ClInclude> <ClInclude Include="slang-free-list.h"> <Filter>Header Files</Filter> </ClInclude> @@ -60,9 +54,6 @@ <ClInclude Include="slang-math.h"> <Filter>Header Files</Filter> </ClInclude> - <ClInclude Include="slang-result.h"> 
- <Filter>Header Files</Filter> - </ClInclude> <ClInclude Include="slang-string-util.h"> <Filter>Header Files</Filter> </ClInclude> diff --git a/source/core/smart-pointer.h b/source/core/smart-pointer.h index bc1683a5b..4c6744d1b 100644 --- a/source/core/smart-pointer.h +++ b/source/core/smart-pointer.h @@ -6,6 +6,8 @@ #include <assert.h> +#include "../../slang.h" + namespace Slang { // TODO: Need to centralize these typedefs @@ -199,13 +201,17 @@ namespace Slang T* detach() { - if (pointer) - dynamic_cast<RefObject*>(pointer)->decreaseReference(); auto rs = pointer; pointer = nullptr; return rs; } + /// Get ready for writing (nulls contents) + SLANG_FORCE_INLINE T** writeRef() { *this = nullptr; return &pointer; } + + /// Get for read access + SLANG_FORCE_INLINE T*const* readRef() const { return &pointer; } + private: T* pointer; diff --git a/source/slang/reflection.cpp b/source/slang/reflection.cpp index f8d12b9e9..6661850ae 100644 --- a/source/slang/reflection.cpp +++ b/source/slang/reflection.cpp @@ -443,7 +443,7 @@ SLANG_API SlangReflectionType * spReflection_FindTypeByName(SlangReflection * re SLANG_API SlangReflectionTypeLayout* spReflection_GetTypeLayout( SlangReflection* reflection, - SlangReflectionType* inType, + SlangReflectionType* inType, SlangLayoutRules /*rules*/) { auto context = convert(reflection); @@ -674,6 +674,21 @@ SLANG_API SlangMatrixLayoutMode spReflectionTypeLayout_GetMatrixLayoutMode(Slang } +SLANG_API int spReflectionTypeLayout_getGenericParamIndex(SlangReflectionTypeLayout* inTypeLayout) +{ + auto typeLayout = convert(inTypeLayout); + if(!typeLayout) return -1; + + if(auto genericParamTypeLayout = dynamic_cast<GenericParamTypeLayout*>(typeLayout)) + { + return genericParamTypeLayout->paramIndex; + } + else + { + return -1; + } +} + // Variable Reflection @@ -925,7 +940,7 @@ namespace Slang return 0; } - + static VarLayout* getParameterByIndex(RefPtr<TypeLayout> typeLayout, unsigned index) { if(auto parameterGroupLayout = 
typeLayout.As<ParameterGroupTypeLayout>()) @@ -1147,6 +1162,24 @@ SLANG_API SlangReflectionEntryPoint* spReflection_getEntryPointByIndex(SlangRefl return convert(program->entryPoints[(int) index].Ptr()); } +SLANG_API SlangReflectionEntryPoint* spReflection_findEntryPointByName(SlangReflection* inProgram, char const* name) +{ + auto program = convert(inProgram); + if(!program) return 0; + + // TODO: improve on dumb linear search + for(auto ep : program->entryPoints) + { + if(ep->entryPoint->getName()->text == name) + { + return convert(ep); + } + } + + return nullptr; +} + + SLANG_API SlangUInt spReflection_getGlobalConstantBufferBinding(SlangReflection* inProgram) { auto program = convert(inProgram); diff --git a/source/slang/slang.cpp b/source/slang/slang.cpp index 2b1857e07..b94e146dd 100644 --- a/source/slang/slang.cpp +++ b/source/slang/slang.cpp @@ -979,9 +979,6 @@ SLANG_API void spDestroySession( { if(!session) return; delete SESSION(session); -#ifdef _MSC_VER - _CrtDumpMemoryLeaks(); -#endif } SLANG_API void spAddBuiltins( @@ -1483,7 +1480,8 @@ SLANG_API SlangResult spGetEntryPointCodeBlob( } Slang::CompileResult& result = targetReq->entryPointResults[entryPointIndex]; - *outBlob = result.getBlob().detach(); + auto blob = result.getBlob(); + *outBlob = blob.detach(); return SLANG_OK; } diff --git a/tools/slang-graphics/circular-resource-heap-d3d12.cpp b/tools/gfx/circular-resource-heap-d3d12.cpp index 8b63819a5..20e47c4dd 100644 --- a/tools/slang-graphics/circular-resource-heap-d3d12.cpp +++ b/tools/gfx/circular-resource-heap-d3d12.cpp @@ -1,6 +1,6 @@ #include "circular-resource-heap-d3d12.h" -namespace slang_graphics { +namespace gfx { using namespace Slang; D3D12CircularResourceHeap::D3D12CircularResourceHeap(): @@ -219,4 +219,4 @@ D3D12CircularResourceHeap::Cursor D3D12CircularResourceHeap::allocate(size_t siz return cursor; } -} // namespace slang_graphics +} // namespace gfx diff --git a/tools/slang-graphics/circular-resource-heap-d3d12.h 
b/tools/gfx/circular-resource-heap-d3d12.h index a200d3bbc..cca981601 100644 --- a/tools/slang-graphics/circular-resource-heap-d3d12.h +++ b/tools/gfx/circular-resource-heap-d3d12.h @@ -6,7 +6,7 @@ #include "resource-d3d12.h" -namespace slang_graphics { +namespace gfx { /*! \brief The D3D12CircularResourceHeap is a heap that is suited for size constrained real-time resources allocation that is transitory in nature. It is designed to allocate resources which are used and discarded, often used where in @@ -202,5 +202,5 @@ class D3D12CircularResourceHeap ID3D12Device* m_device; ///< The device that resources will be constructed on }; -} // namespace slang_graphics +} // namespace gfx diff --git a/tools/slang-graphics/d3d-util.cpp b/tools/gfx/d3d-util.cpp index b2c3f87ee..19135707b 100644 --- a/tools/slang-graphics/d3d-util.cpp +++ b/tools/gfx/d3d-util.cpp @@ -6,7 +6,7 @@ // We will use the C standard library just for printing error messages. #include <stdio.h> -namespace slang_graphics { +namespace gfx { using namespace Slang; /* static */D3D_PRIMITIVE_TOPOLOGY D3DUtil::getPrimitiveTopology(PrimitiveTopology topology) diff --git a/tools/slang-graphics/d3d-util.h b/tools/gfx/d3d-util.h index b5f154e6e..04bfae63d 100644 --- a/tools/slang-graphics/d3d-util.h +++ b/tools/gfx/d3d-util.h @@ -13,7 +13,7 @@ #include <D3Dcommon.h> #include <DXGIFormat.h> -namespace slang_graphics { +namespace gfx { class D3DUtil { diff --git a/tools/slang-graphics/descriptor-heap-d3d12.cpp b/tools/gfx/descriptor-heap-d3d12.cpp index 23c56d46d..382fc3219 100644 --- a/tools/slang-graphics/descriptor-heap-d3d12.cpp +++ b/tools/gfx/descriptor-heap-d3d12.cpp @@ -1,7 +1,7 @@ #include "descriptor-heap-d3d12.h" -namespace slang_graphics { +namespace gfx { using namespace Slang; D3D12DescriptorHeap::D3D12DescriptorHeap(): @@ -43,5 +43,5 @@ Result D3D12DescriptorHeap::init(ID3D12Device* device, const D3D12_CPU_DESCRIPTO return SLANG_OK; } -} // namespace slang_graphics +} // namespace gfx diff --git 
a/tools/slang-graphics/descriptor-heap-d3d12.h b/tools/gfx/descriptor-heap-d3d12.h index 6ddb583dc..2a814583b 100644 --- a/tools/slang-graphics/descriptor-heap-d3d12.h +++ b/tools/gfx/descriptor-heap-d3d12.h @@ -5,8 +5,9 @@ #include <d3d12.h> #include "../../slang-com-ptr.h" +#include "../../source/core/list.h" -namespace slang_graphics { +namespace gfx { /*! \brief A simple class to manage an underlying Dx12 Descriptor Heap. Allocations are made linearly in order. It is not possible to free individual allocations, but all allocations can be deallocated with 'deallocateAll'. */ @@ -62,6 +63,88 @@ protected: int m_descriptorSize; ///< The size of each descriptor }; +/// A host-visible descriptor, used as "backing storage" for a view. +/// +/// This type is intended to be used to represent descriptors that +/// are allocated and freed through a `HostVisibleDescriptorAllocator`. +struct D3D12HostVisibleDescriptor +{ + D3D12_CPU_DESCRIPTOR_HANDLE cpuHandle; +}; + +/// An allocator for host-visible descriptors. +/// +/// Unlike the `D3D12DescriptorHeap` type, this class allows for both +/// allocation and freeing of descriptors, by maintaining a free list. +/// In order to keep the implementation simple, this class only supports +/// allocation of single descriptors and not ranges. 
+/// +class D3D12HostVisibleDescriptorAllocator +{ + ID3D12Device* m_device; + int m_chunkSize; + D3D12_DESCRIPTOR_HEAP_TYPE m_type; + + D3D12DescriptorHeap m_heap; + Slang::List<D3D12HostVisibleDescriptor> m_freeList; + Slang::List<D3D12DescriptorHeap> m_heaps; + +public: + D3D12HostVisibleDescriptorAllocator() + {} + + Slang::Result init(ID3D12Device* device, int chunkSize, D3D12_DESCRIPTOR_HEAP_TYPE type) + { + m_device = device; + m_chunkSize = chunkSize; + m_type = type; + + SLANG_RETURN_ON_FAIL(m_heap.init(m_device, m_chunkSize, m_type, D3D12_DESCRIPTOR_HEAP_FLAG_NONE)); + + return SLANG_OK; + } + + Slang::Result allocate(D3D12HostVisibleDescriptor* outDescriptor) + { + // TODO: this allocator would take some work to make thread-safe + + if(m_freeList.Count() > 0) + { + auto descriptor = m_freeList[0]; + m_freeList.FastRemoveAt(0); + + *outDescriptor = descriptor; + return SLANG_OK; + } + + int index = m_heap.allocate(); + if(index < 0) + { + // Allocate a new heap and try again. + m_heaps.Add(m_heap); + SLANG_RETURN_ON_FAIL(m_heap.init(m_device, m_chunkSize, m_type, D3D12_DESCRIPTOR_HEAP_FLAG_NONE)); + + index = m_heap.allocate(); + if(index < 0) + { + assert(!"descriptor allocation failed on fresh heap"); + return SLANG_FAIL; + } + } + + D3D12HostVisibleDescriptor descriptor; + descriptor.cpuHandle = m_heap.getCpuHandle(index); + + *outDescriptor = descriptor; + return SLANG_OK; + } + + void free(D3D12HostVisibleDescriptor descriptor) + { + m_freeList.Add(descriptor); + } +}; + // --------------------------------------------------------------------------- int D3D12DescriptorHeap::allocate() { @@ -111,5 +194,5 @@ SLANG_FORCE_INLINE D3D12_GPU_DESCRIPTOR_HANDLE D3D12DescriptorHeap::getGpuHandle return dst; } -} // namespace slang_graphics +} // namespace gfx diff --git a/tools/slang-graphics/slang-graphics.vcxproj b/tools/gfx/gfx.vcxproj index ce7502326..cbafe84b1 100644 --- a/tools/slang-graphics/slang-graphics.vcxproj +++ b/tools/gfx/gfx.vcxproj @@ -22,7 
+22,7 @@ <ProjectGuid>{222F7498-B40C-4F3F-A704-DDEB91A4484A}</ProjectGuid> <IgnoreWarnCompileDuplicatedFilename>true</IgnoreWarnCompileDuplicatedFilename> <Keyword>Win32Proj</Keyword> - <RootNamespace>slang-graphics</RootNamespace> + <RootNamespace>gfx</RootNamespace> <WindowsTargetPlatformVersion>10.0.14393.0</WindowsTargetPlatformVersion> </PropertyGroup> <Import Project="$(VCTargetsPath)\Microsoft.Cpp.Default.props" /> @@ -68,26 +68,26 @@ <PropertyGroup Label="UserMacros" /> <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'"> <OutDir>..\..\bin\windows-x86\debug\</OutDir> - <IntDir>..\..\intermediate\windows-x86\debug\slang-graphics\</IntDir> - <TargetName>slang-graphics</TargetName> + <IntDir>..\..\intermediate\windows-x86\debug\gfx\</IntDir> + <TargetName>gfx</TargetName> <TargetExt>.lib</TargetExt> </PropertyGroup> <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'"> <OutDir>..\..\bin\windows-x64\debug\</OutDir> - <IntDir>..\..\intermediate\windows-x64\debug\slang-graphics\</IntDir> - <TargetName>slang-graphics</TargetName> + <IntDir>..\..\intermediate\windows-x64\debug\gfx\</IntDir> + <TargetName>gfx</TargetName> <TargetExt>.lib</TargetExt> </PropertyGroup> <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'"> <OutDir>..\..\bin\windows-x86\release\</OutDir> - <IntDir>..\..\intermediate\windows-x86\release\slang-graphics\</IntDir> - <TargetName>slang-graphics</TargetName> + <IntDir>..\..\intermediate\windows-x86\release\gfx\</IntDir> + <TargetName>gfx</TargetName> <TargetExt>.lib</TargetExt> </PropertyGroup> <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'"> <OutDir>..\..\bin\windows-x64\release\</OutDir> - <IntDir>..\..\intermediate\windows-x64\release\slang-graphics\</IntDir> - <TargetName>slang-graphics</TargetName> + <IntDir>..\..\intermediate\windows-x64\release\gfx\</IntDir> + <TargetName>gfx</TargetName> <TargetExt>.lib</TargetExt> </PropertyGroup> <ItemDefinitionGroup 
Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'"> @@ -174,6 +174,7 @@ <ClInclude Include="circular-resource-heap-d3d12.h" /> <ClInclude Include="d3d-util.h" /> <ClInclude Include="descriptor-heap-d3d12.h" /> + <ClInclude Include="model.h" /> <ClInclude Include="render-d3d11.h" /> <ClInclude Include="render-d3d12.h" /> <ClInclude Include="render-gl.h" /> @@ -181,6 +182,7 @@ <ClInclude Include="render.h" /> <ClInclude Include="resource-d3d12.h" /> <ClInclude Include="surface.h" /> + <ClInclude Include="vector-math.h" /> <ClInclude Include="vk-api.h" /> <ClInclude Include="vk-device-queue.h" /> <ClInclude Include="vk-module.h" /> @@ -192,6 +194,7 @@ <ClCompile Include="circular-resource-heap-d3d12.cpp" /> <ClCompile Include="d3d-util.cpp" /> <ClCompile Include="descriptor-heap-d3d12.cpp" /> + <ClCompile Include="model.cpp" /> <ClCompile Include="render-d3d11.cpp" /> <ClCompile Include="render-d3d12.cpp" /> <ClCompile Include="render-gl.cpp" /> diff --git a/tools/slang-graphics/slang-graphics.vcxproj.filters b/tools/gfx/gfx.vcxproj.filters index b1e4c42a3..f1c7f7f5e 100644 --- a/tools/slang-graphics/slang-graphics.vcxproj.filters +++ b/tools/gfx/gfx.vcxproj.filters @@ -18,6 +18,9 @@ <ClInclude Include="descriptor-heap-d3d12.h"> <Filter>Header Files</Filter> </ClInclude> + <ClInclude Include="model.h"> + <Filter>Header Files</Filter> + </ClInclude> <ClInclude Include="render-d3d11.h"> <Filter>Header Files</Filter> </ClInclude> @@ -39,6 +42,9 @@ <ClInclude Include="surface.h"> <Filter>Header Files</Filter> </ClInclude> + <ClInclude Include="vector-math.h"> + <Filter>Header Files</Filter> + </ClInclude> <ClInclude Include="vk-api.h"> <Filter>Header Files</Filter> </ClInclude> @@ -68,6 +74,9 @@ <ClCompile Include="descriptor-heap-d3d12.cpp"> <Filter>Source Files</Filter> </ClCompile> + <ClCompile Include="model.cpp"> + <Filter>Source Files</Filter> + </ClCompile> <ClCompile Include="render-d3d11.cpp"> <Filter>Source Files</Filter> </ClCompile> diff --git 
a/tools/gfx/model.cpp b/tools/gfx/model.cpp new file mode 100644 index 000000000..c8218102e --- /dev/null +++ b/tools/gfx/model.cpp @@ -0,0 +1,530 @@ +// model.cpp +#include "model.h" + +#define TINYOBJLOADER_IMPLEMENTATION +#include "../../external/tinyobjloader/tiny_obj_loader.h" + +#define STB_IMAGE_IMPLEMENTATION +#include "../../external/stb/stb_image.h" + +#define STB_IMAGE_RESIZE_IMPLEMENTATION +#include "../../external/stb/stb_image_resize.h" + +#include "../../external/glm/glm/glm.hpp" +#include "../../external/glm/glm/gtc/matrix_transform.hpp" +#include "../../external/glm/glm/gtc/constants.hpp" + +#include <memory> +#include <unordered_map> +#include <unordered_set> + +namespace gfx { + +// TinyObj provides a tuple type that bundles up indices, but doesn't +// provide equality comparison or hashing for that type. We'd like +// to have a hash function so that we can unique indices. +// +// In the simplest case, we could define hashing and operator== operations +// directly on `tinyobj::index_t`, but that would create problems if they +// revise their API. +// +// We will instead define our own wrapper type that supports equality +// comparisons. 
+// +struct ObjIndexKey +{ + tinyobj::index_t index; +}; + +bool operator==(ObjIndexKey const& left, ObjIndexKey const& right) +{ + return left.index.vertex_index == right.index.vertex_index + && left.index.normal_index == right.index.normal_index + && left.index.texcoord_index == right.index.texcoord_index; +} + +struct Hasher +{ + template<typename T> + void add(T const& v) + { + state ^= std::hash<T>()(v) + 0x9e3779b9 + (state << 6) + (state >> 2); + } + size_t state = 0; +}; + +struct SmoothingGroupVertexID +{ + size_t smoothingGroup; + size_t positionID; +}; +bool operator==(SmoothingGroupVertexID const& left, SmoothingGroupVertexID const& right) +{ + return left.smoothingGroup == right.smoothingGroup + && left.positionID == right.positionID; +} + +} + +namespace std +{ + template<> struct hash<gfx::ObjIndexKey> + { + size_t operator()(gfx::ObjIndexKey const& key) const + { + gfx::Hasher hasher; + hasher.add(key.index.vertex_index); + hasher.add(key.index.normal_index); + hasher.add(key.index.texcoord_index); + return hasher.state; + } + }; + + template<> struct hash<gfx::SmoothingGroupVertexID> + { + size_t operator()(gfx::SmoothingGroupVertexID const& id) const + { + gfx::Hasher hasher; + hasher.add(id.smoothingGroup); + hasher.add(id.positionID); + return hasher.state; + } + }; +} + +namespace gfx +{ + +RefPtr<TextureResource> loadTextureImage( + Renderer* renderer, + char const* path) +{ + int extentX = 0; + int extentY = 0; + int originalChannelCount = 0; + int requestedChannelCount = 4; // force to 4-component result + stbi_uc* data = stbi_load( + path, + &extentX, + &extentY, + &originalChannelCount, + requestedChannelCount); + if(!data) + return nullptr; + + int channelCount = requestedChannelCount ? 
requestedChannelCount : originalChannelCount; + + Format format; + switch(channelCount) + { + default: + return nullptr; + + case 4: format = Format::RGBA_Unorm_UInt8; + + // TODO: handle other cases here if/when we stop forcing 4-component + // results when loading the image with stb_image. + } + + std::vector<void*> subresourceInitData; + std::vector<ptrdiff_t> mipRowStrides; + + ptrdiff_t stride = extentX * channelCount * sizeof(stbi_uc); + + subresourceInitData.push_back(data); + mipRowStrides.push_back(stride); + + // create down-sampled images for the different mip levels + bool generateMips = true; + if(generateMips) + { + int prevExtentX = extentX; + int prevExtentY = extentY; + stbi_uc* prevData = data; + ptrdiff_t prevStride = stride; + + for(;;) + { + if(prevExtentX == 1 && prevExtentY == 1) + break; + + int newExtentX = prevExtentX / 2; + int newExtentY = prevExtentY / 2; + + if(!newExtentX) newExtentX = 1; + if(!newExtentY) newExtentY = 1; + + stbi_uc* newData = (stbi_uc*) malloc(newExtentX * newExtentY * channelCount * sizeof(stbi_uc)); + ptrdiff_t newStride = newExtentX * channelCount * sizeof(stbi_uc); + + stbir_resize_uint8_srgb( + prevData, prevExtentX, prevExtentY, prevStride, + newData, newExtentX, newExtentY, newStride, + channelCount, + STBIR_ALPHA_CHANNEL_NONE, + STBIR_FLAG_ALPHA_PREMULTIPLIED); + + subresourceInitData.push_back(newData); + mipRowStrides.push_back(newStride); + + prevExtentX = newExtentX; + prevExtentY = newExtentY; + prevData = newData; + prevStride = newStride; + } + } + + int mipCount = (int) mipRowStrides.size(); + + TextureResource::Desc desc; + desc.init2D(Resource::Type::Texture2D, format, extentX, extentY, mipCount); + + TextureResource::Data initData; + initData.numSubResources = mipCount; + initData.numMips = mipCount; + initData.subResources = &subresourceInitData[0]; + initData.mipRowStrides = &mipRowStrides[0]; + + auto texture = renderer->createTextureResource( + Resource::Usage::PixelShaderResource, + desc, + 
&initData); + + free(data); + + return texture; +} + +Result ModelLoader::load( + char const* inputPath, + void** outModel) +{ + // TODO: need to actually allocate/load the data + + tinyobj::attrib_t objVertexAttributes; + std::vector<tinyobj::shape_t> objShapes; + std::vector<tinyobj::material_t> objMaterials; + + std::string diagnostics; + bool shouldTriangulate = true; + bool success = tinyobj::LoadObj( + &objVertexAttributes, + &objShapes, + &objMaterials, + &diagnostics, + inputPath, + nullptr, + shouldTriangulate); + + if(!diagnostics.empty()) + { + log("%s", diagnostics.c_str()); + } + if(!success) + { + return SLANG_FAIL; + } + + // Translate each material imported by TinyObj into a format that + // we can actually use for rendering. + // + std::vector<void*> materials; + for(auto& objMaterial : objMaterials) + { + MaterialData materialData; + + materialData.diffuseColor = glm::vec3( + objMaterial.diffuse[0], + objMaterial.diffuse[1], + objMaterial.diffuse[2]); + + // load any referenced textures here + if(objMaterial.diffuse_texname.length()) + { + materialData.diffuseMap = loadTextureImage( + renderer, + objMaterial.diffuse_texname.c_str()); + } + + auto material = callbacks->createMaterial(materialData); + materials.push_back(material); + } + + // Flip the winding order on all faces if we are asked to... 
+ // + if(loadFlags & LoadFlag::FlipWinding) + { + for(auto& objShape : objShapes) + { + size_t objIndexCounter = 0; + size_t objFaceCounter = 0; + for(auto objFaceVertexCount : objShape.mesh.num_face_vertices) + { + size_t beginIndex = objIndexCounter; + size_t endIndex = beginIndex + objFaceVertexCount; + objIndexCounter = endIndex; + + size_t halfCount = objFaceVertexCount / 2; + for(size_t ii = 0; ii < halfCount; ++ii) + { + std::swap( + objShape.mesh.indices[beginIndex + ii], + objShape.mesh.indices[endIndex - (ii + 1)]); + } + } + } + + } + + // Identify cases where a face has a vertex without a normal, and in that + // case remember that the given vertex needs to be "smoothed" as part of + // the smoothing group for that face. Note that it is possible for the + // same vertex (position) to be part of faces in distinct smoothing groups. + // + std::unordered_map<SmoothingGroupVertexID, size_t> smoothedVertexNormals; + size_t firstSmoothedNormalID = objVertexAttributes.normals.size() / 3; + size_t flatFaceCounter = 0; + for(auto& objShape : objShapes) + { + size_t objIndexCounter = 0; + size_t objFaceCounter = 0; + for(auto objFaceVertexCount : objShape.mesh.num_face_vertices) + { + size_t flatFaceIndex = flatFaceCounter++; + size_t objFaceIndex = objFaceCounter++; + size_t smoothingGroup = objShape.mesh.smoothing_group_ids[objFaceIndex]; + if(!smoothingGroup) + { + smoothingGroup = ~flatFaceIndex; + } + + for(size_t objFaceVertex = 0; objFaceVertex < objFaceVertexCount; ++objFaceVertex) + { + tinyobj::index_t& objIndex = objShape.mesh.indices[objIndexCounter++]; + + if(objIndex.normal_index < 0) + { + SmoothingGroupVertexID smoothVertexID; + smoothVertexID.positionID = objIndex.vertex_index; + smoothVertexID.smoothingGroup = smoothingGroup; + + if(smoothedVertexNormals.find(smoothVertexID) == smoothedVertexNormals.end()) + { + size_t normalID = objVertexAttributes.normals.size() / 3; + objVertexAttributes.normals.push_back(0); + 
objVertexAttributes.normals.push_back(0); + objVertexAttributes.normals.push_back(0); + + smoothedVertexNormals.insert(std::make_pair(smoothVertexID, normalID)); + + objIndex.normal_index = normalID; + } + } + } + } + } + // + // Having identified which vertices we need to smooth, we will make another + // pass to compute face normals and apply them to the vertices that belong + // to the same smoothing group. + // + flatFaceCounter = 0; + for(auto& objShape : objShapes) + { + size_t objIndexCounter = 0; + size_t objFaceCounter = 0; + for(auto objFaceVertexCount : objShape.mesh.num_face_vertices) + { + size_t flatFaceIndex = flatFaceCounter++; + size_t objFaceIndex = objFaceCounter++; + unsigned int smoothingGroup = objShape.mesh.smoothing_group_ids[objFaceIndex]; + if(!smoothingGroup) + { + smoothingGroup = ~flatFaceIndex; + } + + glm::vec3 faceNormal; + if(objFaceVertexCount >= 3) + { + glm::vec3 v[3]; + for(size_t objFaceVertex = 0; objFaceVertex < 3; ++objFaceVertex) + { + tinyobj::index_t objIndex = objShape.mesh.indices[objIndexCounter + objFaceVertex]; + if(objIndex.vertex_index >= 0) + { + v[objFaceVertex] = glm::vec3( + objVertexAttributes.vertices[3 * objIndex.vertex_index + 0], + objVertexAttributes.vertices[3 * objIndex.vertex_index + 1], + objVertexAttributes.vertices[3 * objIndex.vertex_index + 2]); + } + } + faceNormal = cross(v[1] - v[0], v[2] - v[0]); + } + + // Add this face normal to any to-be-smoothed vertex on the face. 
+ for(size_t objFaceVertex = 0; objFaceVertex < objFaceVertexCount; ++objFaceVertex) + { + tinyobj::index_t objIndex = objShape.mesh.indices[objIndexCounter++]; + + SmoothingGroupVertexID smoothVertexID; + smoothVertexID.positionID = objIndex.vertex_index; + smoothVertexID.smoothingGroup = smoothingGroup; + + auto ii = smoothedVertexNormals.find(smoothVertexID); + if(ii != smoothedVertexNormals.end()) + { + size_t normalID = ii->second; + objVertexAttributes.normals[normalID * 3 + 0] += faceNormal.x; + objVertexAttributes.normals[normalID * 3 + 1] += faceNormal.y; + objVertexAttributes.normals[normalID * 3 + 2] += faceNormal.z; + } + } + } + } + // + // Once we've added all contributions from each smoothing group, + // we can normalize the normals to compute the area-weighted average. + // + size_t normalCount = objVertexAttributes.normals.size() / 3; + for(size_t ii = firstSmoothedNormalID; ii < normalCount; ++ii) + { + glm::vec3 normal = glm::vec3( + objVertexAttributes.normals[3 * ii + 0], + objVertexAttributes.normals[3 * ii + 1], + objVertexAttributes.normals[3 * ii + 2]); + + normal = normalize(normal); + + objVertexAttributes.normals[3 * ii + 0] = normal.x; + objVertexAttributes.normals[3 * ii + 1] = normal.y; + objVertexAttributes.normals[3 * ii + 2] = normal.z; + } + + // TODO: we should sort the faces to group faces with + // the same material ID together, in case they weren't + // grouped in the original file. + + // We need to undo the .obj indexing stuff so that we have + // standard position/normal/etc. 
data in a single flat array + + std::unordered_map<ObjIndexKey, Index> mapObjIndexToFlatIndex; + std::vector<Vertex> flatVertices; + std::vector<Index> flatIndices; + + MeshData* currentMesh = nullptr; + MeshData currentMeshStorage; + + std::vector<void*> meshes; + + for(auto& objShape : objShapes) + { + size_t objIndexCounter = 0; + size_t objFaceCounter = 0; + for(auto objFaceVertexCount : objShape.mesh.num_face_vertices) + { + size_t objFaceIndex = objFaceCounter++; + int faceMaterialID = objShape.mesh.material_ids[objFaceIndex]; + void* faceMaterial = materials[faceMaterialID]; + + if(!currentMesh || (faceMaterial != currentMesh->material)) + { + // finish old mesh. + if(currentMesh) + { + meshes.push_back(callbacks->createMesh(*currentMesh)); + } + + // Need to start a new mesh. + currentMesh = &currentMeshStorage; + currentMesh->material = faceMaterial; + currentMesh->firstIndex = (int)flatIndices.size(); + currentMesh->indexCount = 0; + } + + for(size_t objFaceVertex = 0; objFaceVertex < objFaceVertexCount; ++objFaceVertex) + { + tinyobj::index_t objIndex = objShape.mesh.indices[objIndexCounter++]; + ObjIndexKey objIndexKey; objIndexKey.index = objIndex; + + + Index flatIndex = Index(-1); + auto iter = mapObjIndexToFlatIndex.find(objIndexKey); + if(iter != mapObjIndexToFlatIndex.end()) + { + flatIndex = iter->second; + } + else + { + Vertex flatVertex; + if(objIndex.vertex_index >= 0) + { + flatVertex.position = scale * glm::vec3( + objVertexAttributes.vertices[3 * objIndex.vertex_index + 0], + objVertexAttributes.vertices[3 * objIndex.vertex_index + 1], + objVertexAttributes.vertices[3 * objIndex.vertex_index + 2]); + } + if(objIndex.normal_index >= 0) + { + flatVertex.normal = glm::vec3( + objVertexAttributes.normals[3 * objIndex.normal_index + 0], + objVertexAttributes.normals[3 * objIndex.normal_index + 1], + objVertexAttributes.normals[3 * objIndex.normal_index + 2]); + } + if(objIndex.texcoord_index >= 0) + { + flatVertex.uv = glm::vec2( +
objVertexAttributes.texcoords[2 * objIndex.texcoord_index + 0], + objVertexAttributes.texcoords[2 * objIndex.texcoord_index + 1]); + } + + flatIndex = flatVertices.size(); + mapObjIndexToFlatIndex.insert(std::make_pair(objIndexKey, flatIndex)); + flatVertices.push_back(flatVertex); + } + + flatIndices.push_back(flatIndex); + currentMesh->indexCount++; + } + } + } + + // finish last mesh. + if(currentMesh) + { + meshes.push_back(callbacks->createMesh(*currentMesh)); + } + + ModelData modelData; + + modelData.vertexCount = (int)flatVertices.size(); + modelData.indexCount = (int)flatIndices.size(); + + modelData.meshCount = meshes.size(); + modelData.meshes = meshes.data(); + + BufferResource::Desc vertexBufferDesc; + vertexBufferDesc.init(modelData.vertexCount * sizeof(Vertex)); + vertexBufferDesc.setDefaults(Resource::Usage::VertexBuffer); + + modelData.vertexBuffer = renderer->createBufferResource( + Resource::Usage::VertexBuffer, + vertexBufferDesc, + flatVertices.data()); + if(!modelData.vertexBuffer) return SLANG_FAIL; + + BufferResource::Desc indexBufferDesc; + indexBufferDesc.init(modelData.indexCount * sizeof(Index)); + indexBufferDesc.setDefaults(Resource::Usage::IndexBuffer); + + modelData.indexBuffer = renderer->createBufferResource( + Resource::Usage::IndexBuffer, + indexBufferDesc, + flatIndices.data()); + if(!modelData.indexBuffer) return SLANG_FAIL; + + *outModel = callbacks->createModel(modelData); + + return SLANG_OK; +} + +} // gfx diff --git a/tools/gfx/model.h b/tools/gfx/model.h new file mode 100644 index 000000000..046b9764b --- /dev/null +++ b/tools/gfx/model.h @@ -0,0 +1,73 @@ +// model.h +#pragma once + +#include "render.h" +#include "vector-math.h" + +#include <vector> + +namespace gfx { + +struct ModelLoader +{ + struct MaterialData + { + glm::vec3 diffuseColor; + RefPtr<TextureResource> diffuseMap; + }; + + struct Vertex + { + glm::vec3 position; + glm::vec3 normal; + glm::vec2 uv; + }; + + typedef uint32_t Index; + + struct MeshData + {
+ int firstIndex; + int indexCount; + + void* material; + }; + + struct ModelData + { + RefPtr<BufferResource> vertexBuffer; + RefPtr<BufferResource> indexBuffer; + PrimitiveTopology primitiveTopology; + int vertexCount; + int indexCount; + int meshCount; + void* const* meshes; + }; + + struct ICallbacks + { + typedef ModelLoader::MaterialData MaterialData; + typedef ModelLoader::MeshData MeshData; + typedef ModelLoader::ModelData ModelData; + + virtual void* createMaterial(MaterialData const& data) = 0; + virtual void* createMesh(MeshData const& data) = 0; + virtual void* createModel(ModelData const& data) = 0; + }; + + typedef uint32_t LoadFlags; + enum LoadFlag : LoadFlags + { + FlipWinding = 1 << 0, + }; + + ICallbacks* callbacks = nullptr; + RefPtr<Renderer> renderer; + LoadFlags loadFlags = 0; + float scale = 1.0f; + + Result load(char const* inputPath, void** outModel); +}; + + +} // gfx diff --git a/tools/gfx/render-d3d11.cpp b/tools/gfx/render-d3d11.cpp new file mode 100644 index 000000000..57c0672bd --- /dev/null +++ b/tools/gfx/render-d3d11.cpp @@ -0,0 +1,2112 @@ +// render-d3d11.cpp + +#define _CRT_SECURE_NO_WARNINGS + +#include "render-d3d11.h" + +//WORKING: #include "options.h" +#include "render.h" +#include "d3d-util.h" + +#include "surface.h" + +// In order to use the Slang API, we need to include its header + +//#include <slang.h> + +#include "../../slang-com-ptr.h" + +// We will be rendering with Direct3D 11, so we need to include +// the Windows and D3D11 headers + +#define WIN32_LEAN_AND_MEAN +#define NOMINMAX +#include <Windows.h> +#undef WIN32_LEAN_AND_MEAN +#undef NOMINMAX + +#include <d3d11_2.h> +#include <d3dcompiler.h> + +// We will use the C standard library just for printing error messages. 
+#include <stdio.h> + +#ifdef _MSC_VER +#include <stddef.h> +#if (_MSC_VER < 1900) +#define snprintf sprintf_s +#endif +#endif +// +using namespace Slang; + +namespace gfx { + +class D3D11Renderer : public Renderer +{ +public: + enum + { + kMaxUAVs = 64, + kMaxRTVs = 8, + }; + + // Renderer implementation + virtual SlangResult initialize(const Desc& desc, void* inWindowHandle) override; + virtual void setClearColor(const float color[4]) override; + virtual void clearFrame() override; + virtual void presentFrame() override; + TextureResource::Desc getSwapChainTextureDesc() override; + + Result createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData, TextureResource** outResource) override; + Result createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& desc, const void* initData, BufferResource** outResource) override; + Result createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) override; + + Result createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) override; + Result createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) override; + + Result createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount, InputLayout** outLayout) override; + + Result createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) override; + Result createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) override; + Result createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) override; + + Result createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) override; + Result createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) override; + Result createComputePipelineState(const ComputePipelineStateDesc& desc, 
PipelineState** outState) override; + + virtual SlangResult captureScreenSurface(Surface& surfaceOut) override; + + virtual void* map(BufferResource* buffer, MapFlavor flavor) override; + virtual void unmap(BufferResource* buffer) override; + virtual void setPrimitiveTopology(PrimitiveTopology topology) override; + + virtual void setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) override; + + virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) override; + virtual void setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) override; + virtual void setDepthStencilTarget(ResourceView* depthStencilView) override; + virtual void setPipelineState(PipelineType pipelineType, PipelineState* state) override; + virtual void draw(UInt vertexCount, UInt startVertex) override; + virtual void drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) override; + virtual void dispatchCompute(int x, int y, int z) override; + virtual void submitGpuWork() override {} + virtual void waitForGpu() override {} + virtual RendererType getRendererType() const override { return RendererType::DirectX11; } + + protected: + +#if 0 + struct BindingDetail + { + ComPtr<ID3D11ShaderResourceView> m_srv; + ComPtr<ID3D11UnorderedAccessView> m_uav; + ComPtr<ID3D11SamplerState> m_samplerState; + }; + + class BindingStateImpl: public BindingState + { + public: + typedef BindingState Parent; + + /// Ctor + BindingStateImpl(const Desc& desc): + Parent(desc) + {} + + List<BindingDetail> m_bindingDetails; + }; +#endif + + enum class D3D11DescriptorSlotType + { + ConstantBuffer, + ShaderResourceView, + UnorderedAccessView, + Sampler, + + CombinedTextureSampler, + + CountOf, + }; + + class DescriptorSetLayoutImpl : public DescriptorSetLayout + { + public: + struct RangeInfo + { + D3D11DescriptorSlotType type; + UInt arrayIndex; + UInt 
pairedSamplerArrayIndex; + }; + List<RangeInfo> m_ranges; + + UInt m_counts[int(D3D11DescriptorSlotType::CountOf)]; + }; + + class PipelineLayoutImpl : public PipelineLayout + { + public: + struct DescriptorSetInfo + { + RefPtr<DescriptorSetLayoutImpl> layout; + UInt baseIndices[int(D3D11DescriptorSlotType::CountOf)]; + }; + + List<DescriptorSetInfo> m_descriptorSets; + UINT m_uavCount; + }; + + class DescriptorSetImpl : public DescriptorSet + { + public: + virtual void setConstantBuffer(UInt range, UInt index, BufferResource* buffer) override; + virtual void setResource(UInt range, UInt index, ResourceView* view) override; + virtual void setSampler(UInt range, UInt index, SamplerState* sampler) override; + virtual void setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) override; + + RefPtr<DescriptorSetLayoutImpl> m_layout; + + List<ComPtr<ID3D11Buffer>> m_cbs; + List<ComPtr<ID3D11ShaderResourceView>> m_srvs; + List<ComPtr<ID3D11UnorderedAccessView>> m_uavs; + List<ComPtr<ID3D11SamplerState>> m_samplers; + }; + + class ShaderProgramImpl: public ShaderProgram + { + public: + ComPtr<ID3D11VertexShader> m_vertexShader; + ComPtr<ID3D11PixelShader> m_pixelShader; + ComPtr<ID3D11ComputeShader> m_computeShader; + }; + + class BufferResourceImpl: public BufferResource + { + public: + typedef BufferResource Parent; + + BufferResourceImpl(const Desc& desc, Usage initialUsage): + Parent(desc), + m_initialUsage(initialUsage) + { + } + + MapFlavor m_mapFlavor; + Usage m_initialUsage; + ComPtr<ID3D11Buffer> m_buffer; + ComPtr<ID3D11Buffer> m_staging; + }; + class TextureResourceImpl : public TextureResource + { + public: + typedef TextureResource Parent; + + TextureResourceImpl(const Desc& desc, Usage initialUsage) : + Parent(desc), + m_initialUsage(initialUsage) + { + } + Usage m_initialUsage; + ComPtr<ID3D11Resource> m_resource; + + }; + + class SamplerStateImpl : public SamplerState + { + public: + 
ComPtr<ID3D11SamplerState> m_sampler; + }; + + + class ResourceViewImpl : public ResourceView + { + public: + enum class Type + { + SRV, + UAV, + DSV, + RTV, + }; + Type m_type; + }; + + class ShaderResourceViewImpl : public ResourceViewImpl + { + public: + ComPtr<ID3D11ShaderResourceView> m_srv; + }; + + class UnorderedAccessViewImpl : public ResourceViewImpl + { + public: + ComPtr<ID3D11UnorderedAccessView> m_uav; + }; + + class DepthStencilViewImpl : public ResourceViewImpl + { + public: + ComPtr<ID3D11DepthStencilView> m_dsv; + }; + + class RenderTargetViewImpl : public ResourceViewImpl + { + public: + ComPtr<ID3D11RenderTargetView> m_rtv; + }; + + class InputLayoutImpl: public InputLayout + { + public: + ComPtr<ID3D11InputLayout> m_layout; + }; + + class PipelineStateImpl : public PipelineState + { + public: + RefPtr<ShaderProgramImpl> m_program; + RefPtr<PipelineLayoutImpl> m_pipelineLayout; + }; + + + class GraphicsPipelineStateImpl : public PipelineStateImpl + { + public: + UINT m_rtvCount; + + RefPtr<InputLayoutImpl> m_inputLayout; + ComPtr<ID3D11DepthStencilState> m_depthStencilState; + ComPtr<ID3D11RasterizerState> m_rasterizerState; + + UINT m_stencilRef; + }; + + class ComputePipelineStateImpl : public PipelineStateImpl + { + public: + }; + + /// Capture a texture to a file + static HRESULT captureTextureToSurface(ID3D11Device* device, ID3D11DeviceContext* context, ID3D11Texture2D* texture, Surface& surfaceOut); + + void _flushGraphicsState(); + void _flushComputeState(); + + ComPtr<IDXGISwapChain> m_swapChain; + ComPtr<ID3D11Device> m_device; + ComPtr<ID3D11DeviceContext> m_immediateContext; + ComPtr<ID3D11Texture2D> m_backBufferTexture; + + RefPtr<TextureResourceImpl> m_primaryRenderTargetTexture; + RefPtr<RenderTargetViewImpl> m_primaryRenderTargetView; + +// List<ComPtr<ID3D11RenderTargetView> > m_renderTargetViews; +// List<ComPtr<ID3D11Texture2D> > m_renderTargetTextures; + + bool m_renderTargetBindingsDirty = false; + + 
RefPtr<GraphicsPipelineStateImpl> m_currentGraphicsState; + RefPtr<ComputePipelineStateImpl> m_currentComputeState; + + ComPtr<ID3D11RenderTargetView> m_rtvBindings[kMaxRTVs]; + ComPtr<ID3D11DepthStencilView> m_dsvBinding; + ComPtr<ID3D11UnorderedAccessView> m_uavBindings[int(PipelineType::CountOf)][kMaxUAVs]; + bool m_targetBindingsDirty[int(PipelineType::CountOf)]; + + Desc m_desc; + + float m_clearColor[4] = { 0, 0, 0, 0 }; +}; + +Renderer* createD3D11Renderer() +{ + return new D3D11Renderer(); +} + +/* static */HRESULT D3D11Renderer::captureTextureToSurface(ID3D11Device* device, ID3D11DeviceContext* context, ID3D11Texture2D* texture, Surface& surfaceOut) +{ + if (!context) return E_INVALIDARG; + if (!texture) return E_INVALIDARG; + + D3D11_TEXTURE2D_DESC textureDesc; + texture->GetDesc(&textureDesc); + + // Don't bother supporting MSAA for right now + if (textureDesc.SampleDesc.Count > 1) + { + fprintf(stderr, "ERROR: cannot capture multi-sample texture\n"); + return E_INVALIDARG; + } + + HRESULT hr = S_OK; + ComPtr<ID3D11Texture2D> stagingTexture; + + if (textureDesc.Usage == D3D11_USAGE_STAGING && (textureDesc.CPUAccessFlags & D3D11_CPU_ACCESS_READ)) + { + stagingTexture = texture; + } + else + { + // Modify the descriptor to give us a staging texture + textureDesc.BindFlags = 0; + textureDesc.MiscFlags &= ~D3D11_RESOURCE_MISC_TEXTURECUBE; + textureDesc.CPUAccessFlags = D3D11_CPU_ACCESS_READ; + textureDesc.Usage = D3D11_USAGE_STAGING; + + hr = device->CreateTexture2D(&textureDesc, 0, stagingTexture.writeRef()); + if (FAILED(hr)) + { + fprintf(stderr, "ERROR: failed to create staging texture\n"); + return hr; + } + + context->CopyResource(stagingTexture, texture); + } + + // Now just read back texels from the staging textures + { + D3D11_MAPPED_SUBRESOURCE mappedResource; + SLANG_RETURN_ON_FAIL(context->Map(stagingTexture, 0, D3D11_MAP_READ, 0, &mappedResource)); + + Result res = surfaceOut.set(textureDesc.Width, textureDesc.Height, Format::RGBA_Unorm_UInt8, 
mappedResource.RowPitch, mappedResource.pData, SurfaceAllocator::getMallocAllocator()); + + // Make sure to unmap + context->Unmap(stagingTexture, 0); + return res; + } +} + +// !!!!!!!!!!!!!!!!!!!!!!!!!!!! Renderer interface !!!!!!!!!!!!!!!!!!!!!!!!!! + +SlangResult D3D11Renderer::initialize(const Desc& desc, void* inWindowHandle) +{ + auto windowHandle = (HWND)inWindowHandle; + m_desc = desc; + + // Rather than statically link against D3D, we load it dynamically. + HMODULE d3dModule = LoadLibraryA("d3d11.dll"); + if (!d3dModule) + { + fprintf(stderr, "error: failed to load 'd3d11.dll'\n"); + return SLANG_FAIL; + } + + PFN_D3D11_CREATE_DEVICE_AND_SWAP_CHAIN D3D11CreateDeviceAndSwapChain_ = + (PFN_D3D11_CREATE_DEVICE_AND_SWAP_CHAIN)GetProcAddress(d3dModule, "D3D11CreateDeviceAndSwapChain"); + if (!D3D11CreateDeviceAndSwapChain_) + { + fprintf(stderr, + "error: failed to load symbol 'D3D11CreateDeviceAndSwapChain'\n"); + return SLANG_FAIL; + } + + UINT deviceFlags = 0; + +#ifdef _DEBUG + // We will enable the D3D debug mode for debug builds. + // + // TODO: we should probably provide a command-line option + // to override this kind of default rather than leave it + // up to each back-end to specify. + deviceFlags |= D3D11_CREATE_DEVICE_DEBUG; +#endif + + // Our swap chain uses RGBA8 with sRGB, with double buffering. + DXGI_SWAP_CHAIN_DESC swapChainDesc = { 0 }; + swapChainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT; + + // Note(tfoley): Disabling sRGB for DX back buffer for now, so that we + // can get consistent output with OpenGL, where setting up sRGB will + // probably be more involved.
+ // swapChainDesc.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM_SRGB; + swapChainDesc.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM; + + swapChainDesc.SampleDesc.Count = 1; + swapChainDesc.SampleDesc.Quality = 0; + swapChainDesc.BufferCount = 2; + swapChainDesc.OutputWindow = windowHandle; + swapChainDesc.Windowed = TRUE; + swapChainDesc.SwapEffect = DXGI_SWAP_EFFECT_DISCARD; + swapChainDesc.Flags = 0; + + // We will ask for the highest feature level that can be supported. + const D3D_FEATURE_LEVEL featureLevels[] = { + D3D_FEATURE_LEVEL_11_1, + D3D_FEATURE_LEVEL_11_0, + D3D_FEATURE_LEVEL_10_1, + D3D_FEATURE_LEVEL_10_0, + D3D_FEATURE_LEVEL_9_3, + D3D_FEATURE_LEVEL_9_2, + D3D_FEATURE_LEVEL_9_1, + }; + D3D_FEATURE_LEVEL featureLevel = D3D_FEATURE_LEVEL_9_1; + const int totalNumFeatureLevels = SLANG_COUNT_OF(featureLevels); + + // On a machine that does not have an up-to-date version of D3D installed, + // the `D3D11CreateDeviceAndSwapChain` call will fail with `E_INVALIDARG` + // if you ask for feature level 11_1. The workaround is to call + // `D3D11CreateDeviceAndSwapChain` up to twice: the first time with 11_1 + // at the start of the list of requested feature levels, and the second + // time without it. + + for (int ii = 0; ii < 2; ++ii) + { + const HRESULT hr = D3D11CreateDeviceAndSwapChain_( + nullptr, // adapter (use default) + D3D_DRIVER_TYPE_REFERENCE, +// D3D_DRIVER_TYPE_HARDWARE, + nullptr, // software + deviceFlags, + &featureLevels[ii], + totalNumFeatureLevels - ii, + D3D11_SDK_VERSION, + &swapChainDesc, + m_swapChain.writeRef(), + m_device.writeRef(), + &featureLevel, + m_immediateContext.writeRef()); + + // Failures with `E_INVALIDARG` might be due to feature level 11_1 + // not being supported. + if (hr == E_INVALIDARG) + { + continue; + } + + // Other failures are real, though.
+ SLANG_RETURN_ON_FAIL(hr); + // We must have a swap chain + break; + } + + // TODO: Add support for debugging to help detect leaks: + // + // ComPtr<ID3D11Debug> gDebug; + // m_device->QueryInterface(IID_PPV_ARGS(gDebug.writeRef())); + // + + // After we've created the swap chain, we can request a pointer to the + // back buffer as a D3D11 texture, and create a render-target view from it. + + static const IID kIID_ID3D11Texture2D = { + 0x6f15aaf2, 0xd208, 0x4e89, 0x9a, 0xb4, 0x48, + 0x95, 0x35, 0xd3, 0x4f, 0x9c }; + + SLANG_RETURN_ON_FAIL(m_swapChain->GetBuffer(0, kIID_ID3D11Texture2D, (void**)m_backBufferTexture.writeRef())); + +// for (int i = 0; i < 8; i++) + { + ComPtr<ID3D11Texture2D> texture; + D3D11_TEXTURE2D_DESC textureDesc; + m_backBufferTexture->GetDesc(&textureDesc); + SLANG_RETURN_ON_FAIL(m_device->CreateTexture2D(&textureDesc, nullptr, texture.writeRef())); + + ComPtr<ID3D11RenderTargetView> rtv; + D3D11_RENDER_TARGET_VIEW_DESC rtvDesc; + rtvDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM; + rtvDesc.Texture2D.MipSlice = 0; + rtvDesc.ViewDimension = D3D11_RTV_DIMENSION_TEXTURE2D; + SLANG_RETURN_ON_FAIL(m_device->CreateRenderTargetView(texture, &rtvDesc, rtv.writeRef())); + + TextureResource::Desc resourceDesc; + resourceDesc.init2D(Resource::Type::Texture2D, Format::RGBA_Unorm_UInt8, textureDesc.Width, textureDesc.Height, 1); + + RefPtr<TextureResource> primaryRenderTargetTexture; + SLANG_RETURN_ON_FAIL(createTextureResource(Resource::Usage::RenderTarget, resourceDesc, nullptr, primaryRenderTargetTexture.writeRef())); + + ResourceView::Desc viewDesc; + viewDesc.format = resourceDesc.format; + viewDesc.type = ResourceView::Type::RenderTarget; + RefPtr<ResourceView> primaryRenderTargetView; + SLANG_RETURN_ON_FAIL(createTextureView(primaryRenderTargetTexture, viewDesc, primaryRenderTargetView.writeRef())); + + m_primaryRenderTargetTexture = (TextureResourceImpl*) primaryRenderTargetTexture.Ptr(); + m_primaryRenderTargetView = (RenderTargetViewImpl*) 
primaryRenderTargetView.Ptr(); + } + +// m_immediateContext->OMSetRenderTargets(1, m_primaryRenderTargetView->m_rtv.readRef(), nullptr); + m_rtvBindings[0] = m_primaryRenderTargetView->m_rtv; + m_targetBindingsDirty[int(PipelineType::Graphics)] = true; + + // Similarly, we are going to set up a viewport once, and then never + // switch, since this is a simple test app. + D3D11_VIEWPORT viewport; + viewport.TopLeftX = 0; + viewport.TopLeftY = 0; + viewport.Width = (float)desc.width; + viewport.Height = (float)desc.height; + viewport.MaxDepth = 1; // TODO(tfoley): use reversed depth + viewport.MinDepth = 0; + m_immediateContext->RSSetViewports(1, &viewport); + + return SLANG_OK; +} + +void D3D11Renderer::setClearColor(const float color[4]) +{ + memcpy(m_clearColor, color, sizeof(m_clearColor)); +} + +void D3D11Renderer::clearFrame() +{ + m_immediateContext->ClearRenderTargetView(m_primaryRenderTargetView->m_rtv, m_clearColor); + + if(m_dsvBinding) + { + m_immediateContext->ClearDepthStencilView(m_dsvBinding, D3D11_CLEAR_DEPTH | D3D11_CLEAR_STENCIL, 1.0f, 0); + } +} + +void D3D11Renderer::presentFrame() +{ + m_immediateContext->CopyResource(m_backBufferTexture, m_primaryRenderTargetTexture->m_resource); + m_swapChain->Present(0, 0); +} + +TextureResource::Desc D3D11Renderer::getSwapChainTextureDesc() +{ + D3D11_TEXTURE2D_DESC dxDesc; + ((ID3D11Texture2D*)m_primaryRenderTargetTexture->m_resource.get())->GetDesc(&dxDesc); + + TextureResource::Desc desc; + desc.init2D(Resource::Type::Texture2D, Format::Unknown, dxDesc.Width, dxDesc.Height, 1); + + return desc; +} + +SlangResult D3D11Renderer::captureScreenSurface(Surface& surfaceOut) +{ + return captureTextureToSurface(m_device, m_immediateContext, (ID3D11Texture2D*) m_primaryRenderTargetTexture->m_resource.get(), surfaceOut); +} + +static D3D11_BIND_FLAG _calcResourceFlag(Resource::BindFlag::Enum bindFlag) +{ + typedef Resource::BindFlag BindFlag; + switch (bindFlag) + { + case BindFlag::VertexBuffer: return 
D3D11_BIND_VERTEX_BUFFER; + case BindFlag::IndexBuffer: return D3D11_BIND_INDEX_BUFFER; + case BindFlag::ConstantBuffer: return D3D11_BIND_CONSTANT_BUFFER; + case BindFlag::StreamOutput: return D3D11_BIND_STREAM_OUTPUT; + case BindFlag::RenderTarget: return D3D11_BIND_RENDER_TARGET; + case BindFlag::DepthStencil: return D3D11_BIND_DEPTH_STENCIL; + case BindFlag::UnorderedAccess: return D3D11_BIND_UNORDERED_ACCESS; + case BindFlag::PixelShaderResource: return D3D11_BIND_SHADER_RESOURCE; + case BindFlag::NonPixelShaderResource: return D3D11_BIND_SHADER_RESOURCE; + default: return D3D11_BIND_FLAG(0); + } +} + +static int _calcResourceBindFlags(int bindFlags) +{ + int dstFlags = 0; + while (bindFlags) + { + int lsb = bindFlags & -bindFlags; + + dstFlags |= _calcResourceFlag(Resource::BindFlag::Enum(lsb)); + bindFlags &= ~lsb; + } + return dstFlags; +} + +static int _calcResourceAccessFlags(int accessFlags) +{ + switch (accessFlags) + { + case 0: return 0; + case Resource::AccessFlag::Read: return D3D11_CPU_ACCESS_READ; + case Resource::AccessFlag::Write: return D3D11_CPU_ACCESS_WRITE; + case Resource::AccessFlag::Read | + Resource::AccessFlag::Write: return D3D11_CPU_ACCESS_READ | D3D11_CPU_ACCESS_WRITE; + default: assert(!"Invalid flags"); return 0; + } +} + +Result D3D11Renderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData, TextureResource** outResource) +{ + TextureResource::Desc srcDesc(descIn); + srcDesc.setDefaults(initialUsage); + + const int effectiveArraySize = srcDesc.calcEffectiveArraySize(); + + if(initData) + { + assert(initData->numSubResources == srcDesc.numMipLevels * effectiveArraySize * srcDesc.size.depth); + } + + const DXGI_FORMAT format = D3DUtil::getMapFormat(srcDesc.format); + if (format == DXGI_FORMAT_UNKNOWN) + { + return SLANG_FAIL; + } + + const int bindFlags = _calcResourceBindFlags(srcDesc.bindFlags); + + // Set up the initialize data + 
List<D3D11_SUBRESOURCE_DATA> subRes; + D3D11_SUBRESOURCE_DATA* subResourcesPtr = nullptr; + if(initData) + { + subRes.SetSize(srcDesc.numMipLevels * effectiveArraySize); + { + int subResourceIndex = 0; + for (int i = 0; i < effectiveArraySize; i++) + { + for (int j = 0; j < srcDesc.numMipLevels; j++) + { + const int mipHeight = TextureResource::calcMipSize(srcDesc.size.height, j); + + D3D11_SUBRESOURCE_DATA& data = subRes[subResourceIndex]; + + data.pSysMem = initData->subResources[subResourceIndex]; + + data.SysMemPitch = UINT(initData->mipRowStrides[j]); + data.SysMemSlicePitch = UINT(initData->mipRowStrides[j] * mipHeight); + + subResourceIndex++; + } + } + } + subResourcesPtr = subRes.Buffer(); + } + + const int accessFlags = _calcResourceAccessFlags(srcDesc.cpuAccessFlags); + + RefPtr<TextureResourceImpl> texture(new TextureResourceImpl(srcDesc, initialUsage)); + + switch (srcDesc.type) + { + case Resource::Type::Texture1D: + { + D3D11_TEXTURE1D_DESC desc = { 0 }; + desc.BindFlags = bindFlags; + desc.CPUAccessFlags = accessFlags; + desc.Format = format; + desc.MiscFlags = 0; + desc.MipLevels = srcDesc.numMipLevels; + desc.ArraySize = effectiveArraySize; + desc.Width = srcDesc.size.width; + desc.Usage = D3D11_USAGE_DEFAULT; + + ComPtr<ID3D11Texture1D> texture1D; + SLANG_RETURN_ON_FAIL(m_device->CreateTexture1D(&desc, subResourcesPtr, texture1D.writeRef())); + + texture->m_resource = texture1D; + break; + } + case Resource::Type::TextureCube: + case Resource::Type::Texture2D: + { + D3D11_TEXTURE2D_DESC desc = { 0 }; + desc.BindFlags = bindFlags; + desc.CPUAccessFlags = accessFlags; + desc.Format = format; + desc.MiscFlags = 0; + desc.MipLevels = srcDesc.numMipLevels; + desc.ArraySize = effectiveArraySize; + + desc.Width = srcDesc.size.width; + desc.Height = srcDesc.size.height; + desc.Usage = D3D11_USAGE_DEFAULT; + desc.SampleDesc.Count = srcDesc.sampleDesc.numSamples; + desc.SampleDesc.Quality = srcDesc.sampleDesc.quality; + + if (srcDesc.type == 
Resource::Type::TextureCube) + { + desc.MiscFlags |= D3D11_RESOURCE_MISC_TEXTURECUBE; + } + + ComPtr<ID3D11Texture2D> texture2D; + SLANG_RETURN_ON_FAIL(m_device->CreateTexture2D(&desc, subResourcesPtr, texture2D.writeRef())); + + texture->m_resource = texture2D; + break; + } + case Resource::Type::Texture3D: + { + D3D11_TEXTURE3D_DESC desc = { 0 }; + desc.BindFlags = bindFlags; + desc.CPUAccessFlags = accessFlags; + desc.Format = format; + desc.MiscFlags = 0; + desc.MipLevels = srcDesc.numMipLevels; + desc.Width = srcDesc.size.width; + desc.Height = srcDesc.size.height; + desc.Depth = srcDesc.size.depth; + desc.Usage = D3D11_USAGE_DEFAULT; + + ComPtr<ID3D11Texture3D> texture3D; + SLANG_RETURN_ON_FAIL(m_device->CreateTexture3D(&desc, subResourcesPtr, texture3D.writeRef())); + + texture->m_resource = texture3D; + break; + } + default: + return SLANG_FAIL; + } + + *outResource = texture.detach(); + return SLANG_OK; +} + +Result D3D11Renderer::createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData, BufferResource** outResource) +{ + BufferResource::Desc srcDesc(descIn); + srcDesc.setDefaults(initialUsage); + + // Make aligned to 256 bytes... not sure why, but if you remove this the tests do fail. 
+ const size_t alignedSizeInBytes = D3DUtil::calcAligned(srcDesc.sizeInBytes, 256); + + // Hack to make the initialization never read from out of bounds memory, by copying into a buffer + List<uint8_t> initDataBuffer; + if (initData && alignedSizeInBytes > srcDesc.sizeInBytes) + { + initDataBuffer.SetSize(alignedSizeInBytes); + ::memcpy(initDataBuffer.Buffer(), initData, srcDesc.sizeInBytes); + initData = initDataBuffer.Buffer(); + } + + D3D11_BUFFER_DESC bufferDesc = { 0 }; + bufferDesc.ByteWidth = UINT(alignedSizeInBytes); + bufferDesc.BindFlags = _calcResourceBindFlags(srcDesc.bindFlags); + // For read we'll need to do some staging + bufferDesc.CPUAccessFlags = _calcResourceAccessFlags(descIn.cpuAccessFlags & Resource::AccessFlag::Write); + bufferDesc.Usage = D3D11_USAGE_DEFAULT; + + // If written by CPU, make it dynamic + if (descIn.cpuAccessFlags & Resource::AccessFlag::Write) + { + bufferDesc.Usage = D3D11_USAGE_DYNAMIC; + } + + switch (initialUsage) + { + case Resource::Usage::ConstantBuffer: + { + // We'll just assume ConstantBuffers are dynamic for now + bufferDesc.Usage = D3D11_USAGE_DYNAMIC; + break; + } + default: break; + } + + if (bufferDesc.BindFlags & (D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE)) + { + //desc.BindFlags = D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE; + if (srcDesc.elementSize != 0) + { + bufferDesc.StructureByteStride = srcDesc.elementSize; + bufferDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED; + } + else + { + bufferDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS; + } + } + + D3D11_SUBRESOURCE_DATA subResourceData = { 0 }; + subResourceData.pSysMem = initData; + + RefPtr<BufferResourceImpl> buffer(new BufferResourceImpl(srcDesc, initialUsage)); + + SLANG_RETURN_ON_FAIL(m_device->CreateBuffer(&bufferDesc, initData ? 
&subResourceData : nullptr, buffer->m_buffer.writeRef())); + + if (srcDesc.cpuAccessFlags & Resource::AccessFlag::Read) + { + D3D11_BUFFER_DESC bufDesc = {}; + bufDesc.BindFlags = 0; + bufDesc.ByteWidth = (UINT)alignedSizeInBytes; + bufDesc.CPUAccessFlags = D3D11_CPU_ACCESS_READ; + bufDesc.Usage = D3D11_USAGE_STAGING; + + SLANG_RETURN_ON_FAIL(m_device->CreateBuffer(&bufDesc, nullptr, buffer->m_staging.writeRef())); + } + + *outResource = buffer.detach(); + return SLANG_OK; +} + +D3D11_FILTER_TYPE translateFilterMode(TextureFilteringMode mode) +{ + switch (mode) + { + default: + return D3D11_FILTER_TYPE(0); + +#define CASE(SRC, DST) \ + case TextureFilteringMode::SRC: return D3D11_FILTER_TYPE_##DST + + CASE(Point, POINT); + CASE(Linear, LINEAR); + +#undef CASE + } +} + +D3D11_FILTER_REDUCTION_TYPE translateFilterReduction(TextureReductionOp op) +{ + switch (op) + { + default: + return D3D11_FILTER_REDUCTION_TYPE(0); + +#define CASE(SRC, DST) \ + case TextureReductionOp::SRC: return D3D11_FILTER_REDUCTION_TYPE_##DST + + CASE(Average, STANDARD); + CASE(Comparison, COMPARISON); + CASE(Minimum, MINIMUM); + CASE(Maximum, MAXIMUM); + +#undef CASE + } +} + +D3D11_TEXTURE_ADDRESS_MODE translateAddressingMode(TextureAddressingMode mode) +{ + switch (mode) + { + default: + return D3D11_TEXTURE_ADDRESS_MODE(0); + +#define CASE(SRC, DST) \ + case TextureAddressingMode::SRC: return D3D11_TEXTURE_ADDRESS_##DST + + CASE(Wrap, WRAP); + CASE(ClampToEdge, CLAMP); + CASE(ClampToBorder, BORDER); + CASE(MirrorRepeat, MIRROR); + CASE(MirrorOnce, MIRROR_ONCE); + +#undef CASE + } +} + +static D3D11_COMPARISON_FUNC translateComparisonFunc(ComparisonFunc func) +{ + switch (func) + { + default: + // TODO: need to report failures + return D3D11_COMPARISON_ALWAYS; + +#define CASE(FROM, TO) \ + case ComparisonFunc::FROM: return D3D11_COMPARISON_##TO + + CASE(Never, NEVER); + CASE(Less, LESS); + CASE(Equal, EQUAL); + CASE(LessEqual, LESS_EQUAL); + CASE(Greater, GREATER); + CASE(NotEqual, 
NOT_EQUAL); + CASE(GreaterEqual, GREATER_EQUAL); + CASE(Always, ALWAYS); +#undef CASE + } +} + +Result D3D11Renderer::createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) +{ + D3D11_FILTER_REDUCTION_TYPE dxReduction = translateFilterReduction(desc.reductionOp); + D3D11_FILTER dxFilter; + if (desc.maxAnisotropy > 1) + { + dxFilter = D3D11_ENCODE_ANISOTROPIC_FILTER(dxReduction); + } + else + { + D3D11_FILTER_TYPE dxMin = translateFilterMode(desc.minFilter); + D3D11_FILTER_TYPE dxMag = translateFilterMode(desc.magFilter); + D3D11_FILTER_TYPE dxMip = translateFilterMode(desc.mipFilter); + + dxFilter = D3D11_ENCODE_BASIC_FILTER(dxMin, dxMag, dxMip, dxReduction); + } + + D3D11_SAMPLER_DESC dxDesc = {}; + dxDesc.Filter = dxFilter; + dxDesc.AddressU = translateAddressingMode(desc.addressU); + dxDesc.AddressV = translateAddressingMode(desc.addressV); + dxDesc.AddressW = translateAddressingMode(desc.addressW); + dxDesc.MipLODBias = desc.mipLODBias; + dxDesc.MaxAnisotropy = desc.maxAnisotropy; + dxDesc.ComparisonFunc = translateComparisonFunc(desc.comparisonFunc); + for (int ii = 0; ii < 4; ++ii) + dxDesc.BorderColor[ii] = desc.borderColor[ii]; + dxDesc.MinLOD = desc.minLOD; + dxDesc.MaxLOD = desc.maxLOD; + + ComPtr<ID3D11SamplerState> sampler; + SLANG_RETURN_ON_FAIL(m_device->CreateSamplerState( + &dxDesc, + sampler.writeRef())); + + RefPtr<SamplerStateImpl> samplerImpl = new SamplerStateImpl(); + samplerImpl->m_sampler = sampler; + *outSampler = samplerImpl.detach(); + return SLANG_OK; +} + +Result D3D11Renderer::createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) +{ + auto resourceImpl = (TextureResourceImpl*) texture; + + switch (desc.type) + { + default: + return SLANG_FAIL; + + case ResourceView::Type::RenderTarget: + { + ComPtr<ID3D11RenderTargetView> rtv; + SLANG_RETURN_ON_FAIL(m_device->CreateRenderTargetView(resourceImpl->m_resource, nullptr, rtv.writeRef())); + + RefPtr<RenderTargetViewImpl> 
viewImpl = new RenderTargetViewImpl(); + viewImpl->m_type = ResourceViewImpl::Type::RTV; + viewImpl->m_rtv = rtv; + *outView = viewImpl.detach(); + return SLANG_OK; + } + break; + + case ResourceView::Type::DepthStencil: + { + ComPtr<ID3D11DepthStencilView> dsv; + SLANG_RETURN_ON_FAIL(m_device->CreateDepthStencilView(resourceImpl->m_resource, nullptr, dsv.writeRef())); + + RefPtr<DepthStencilViewImpl> viewImpl = new DepthStencilViewImpl(); + viewImpl->m_type = ResourceViewImpl::Type::DSV; + viewImpl->m_dsv = dsv; + *outView = viewImpl.detach(); + return SLANG_OK; + } + break; + + case ResourceView::Type::UnorderedAccess: + { + ComPtr<ID3D11UnorderedAccessView> uav; + SLANG_RETURN_ON_FAIL(m_device->CreateUnorderedAccessView(resourceImpl->m_resource, nullptr, uav.writeRef())); + + RefPtr<UnorderedAccessViewImpl> viewImpl = new UnorderedAccessViewImpl(); + viewImpl->m_type = ResourceViewImpl::Type::UAV; + viewImpl->m_uav = uav; + *outView = viewImpl.detach(); + return SLANG_OK; + } + break; + + case ResourceView::Type::ShaderResource: + { + ComPtr<ID3D11ShaderResourceView> srv; + SLANG_RETURN_ON_FAIL(m_device->CreateShaderResourceView(resourceImpl->m_resource, nullptr, srv.writeRef())); + + RefPtr<ShaderResourceViewImpl> viewImpl = new ShaderResourceViewImpl(); + viewImpl->m_type = ResourceViewImpl::Type::SRV; + viewImpl->m_srv = srv; + *outView = viewImpl.detach(); + return SLANG_OK; + } + break; + } +} + +Result D3D11Renderer::createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) +{ + auto resourceImpl = (BufferResourceImpl*) buffer; + auto resourceDesc = resourceImpl->getDesc(); + + switch (desc.type) + { + default: + return SLANG_FAIL; + + case ResourceView::Type::UnorderedAccess: + { + D3D11_UNORDERED_ACCESS_VIEW_DESC uavDesc = {}; + uavDesc.ViewDimension = D3D11_UAV_DIMENSION_BUFFER; + uavDesc.Format = D3DUtil::getMapFormat(desc.format); + uavDesc.Buffer.FirstElement = 0; + uavDesc.Buffer.NumElements = 
resourceDesc.sizeInBytes; + + if(resourceDesc.elementSize) + { + uavDesc.Buffer.NumElements = resourceDesc.sizeInBytes / resourceDesc.elementSize; + } + else if(desc.format == Format::Unknown) + { + uavDesc.Buffer.Flags |= D3D11_BUFFER_UAV_FLAG_RAW; + uavDesc.Format = DXGI_FORMAT_R32_TYPELESS; + } + + ComPtr<ID3D11UnorderedAccessView> uav; + SLANG_RETURN_ON_FAIL(m_device->CreateUnorderedAccessView(resourceImpl->m_buffer, &uavDesc, uav.writeRef())); + + RefPtr<UnorderedAccessViewImpl> viewImpl = new UnorderedAccessViewImpl(); + viewImpl->m_type = ResourceViewImpl::Type::UAV; + viewImpl->m_uav = uav; + *outView = viewImpl.detach(); + return SLANG_OK; + } + break; + + case ResourceView::Type::ShaderResource: + { + D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc = {}; + srvDesc.ViewDimension = D3D11_SRV_DIMENSION_BUFFER; + srvDesc.Format = D3DUtil::getMapFormat(desc.format); + srvDesc.Buffer.ElementOffset = 0; + srvDesc.Buffer.ElementWidth = 1; + srvDesc.Buffer.FirstElement = 0; + srvDesc.Buffer.NumElements = resourceDesc.sizeInBytes; + + if(resourceDesc.elementSize) + { + srvDesc.Buffer.ElementWidth = resourceDesc.elementSize; + srvDesc.Buffer.NumElements = resourceDesc.sizeInBytes / resourceDesc.elementSize; + } + + ComPtr<ID3D11ShaderResourceView> srv; + SLANG_RETURN_ON_FAIL(m_device->CreateShaderResourceView(resourceImpl->m_buffer, &srvDesc, srv.writeRef())); + + RefPtr<ShaderResourceViewImpl> viewImpl = new ShaderResourceViewImpl(); + viewImpl->m_type = ResourceViewImpl::Type::SRV; + viewImpl->m_srv = srv; + *outView = viewImpl.detach(); + return SLANG_OK; + } + break; + } +} + +Result D3D11Renderer::createInputLayout(const InputElementDesc* inputElementsIn, UInt inputElementCount, InputLayout** outLayout) +{ + D3D11_INPUT_ELEMENT_DESC inputElements[16] = {}; + + char hlslBuffer[1024]; + char* hlslCursor = &hlslBuffer[0]; + + hlslCursor += sprintf(hlslCursor, "float4 main(\n"); + + for (UInt ii = 0; ii < inputElementCount; ++ii) + { + inputElements[ii].SemanticName = 
inputElementsIn[ii].semanticName; + inputElements[ii].SemanticIndex = (UINT)inputElementsIn[ii].semanticIndex; + inputElements[ii].Format = D3DUtil::getMapFormat(inputElementsIn[ii].format); + inputElements[ii].InputSlot = 0; + inputElements[ii].AlignedByteOffset = (UINT)inputElementsIn[ii].offset; + inputElements[ii].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA; + inputElements[ii].InstanceDataStepRate = 0; + + if (ii != 0) + { + hlslCursor += sprintf(hlslCursor, ",\n"); + } + + char const* typeName = "Unknown"; + switch (inputElementsIn[ii].format) + { + case Format::RGBA_Float32: + typeName = "float4"; + break; + case Format::RGB_Float32: + typeName = "float3"; + break; + case Format::RG_Float32: + typeName = "float2"; + break; + case Format::R_Float32: + typeName = "float"; + break; + default: + return SLANG_FAIL; + } + + hlslCursor += sprintf(hlslCursor, "%s a%d : %s%d", + typeName, + (int)ii, + inputElementsIn[ii].semanticName, + (int)inputElementsIn[ii].semanticIndex); + } + + hlslCursor += sprintf(hlslCursor, "\n) : SV_Position { return 0; }"); + + ComPtr<ID3DBlob> vertexShaderBlob; + SLANG_RETURN_ON_FAIL(D3DUtil::compileHLSLShader("inputLayout", hlslBuffer, "main", "vs_5_0", vertexShaderBlob)); + + ComPtr<ID3D11InputLayout> inputLayout; + SLANG_RETURN_ON_FAIL(m_device->CreateInputLayout(&inputElements[0], (UINT)inputElementCount, vertexShaderBlob->GetBufferPointer(), vertexShaderBlob->GetBufferSize(), + inputLayout.writeRef())); + + RefPtr<InputLayoutImpl> impl = new InputLayoutImpl; + impl->m_layout.swap(inputLayout); + + *outLayout = impl.detach(); + return SLANG_OK; +} + +void* D3D11Renderer::map(BufferResource* bufferIn, MapFlavor flavor) +{ + BufferResourceImpl* bufferResource = static_cast<BufferResourceImpl*>(bufferIn); + + D3D11_MAP mapType; + ID3D11Buffer* buffer = bufferResource->m_buffer; + + switch (flavor) + { + case MapFlavor::WriteDiscard: + mapType = D3D11_MAP_WRITE_DISCARD; + break; + case MapFlavor::HostWrite: + mapType = D3D11_MAP_WRITE; 
+ break; + case MapFlavor::HostRead: + mapType = D3D11_MAP_READ; + + buffer = bufferResource->m_staging; + if (!buffer) + { + return nullptr; + } + + // Okay copy the data over + m_immediateContext->CopyResource(buffer, bufferResource->m_buffer); + + break; + default: + return nullptr; + } + + // We update our constant buffer per-frame, just for the purposes + // of the example, but we don't actually load different data + // per-frame (we always use an identity projection). + D3D11_MAPPED_SUBRESOURCE mappedSub; + SLANG_RETURN_NULL_ON_FAIL(m_immediateContext->Map(buffer, 0, mapType, 0, &mappedSub)); + + bufferResource->m_mapFlavor = flavor; + + return mappedSub.pData; +} + +void D3D11Renderer::unmap(BufferResource* bufferIn) +{ + BufferResourceImpl* bufferResource = static_cast<BufferResourceImpl*>(bufferIn); + ID3D11Buffer* buffer = (bufferResource->m_mapFlavor == MapFlavor::HostRead) ? bufferResource->m_staging : bufferResource->m_buffer; + m_immediateContext->Unmap(buffer, 0); +} + +#if 0 +void D3D11Renderer::setInputLayout(InputLayout* inputLayoutIn) +{ + auto inputLayout = static_cast<InputLayoutImpl*>(inputLayoutIn); + m_immediateContext->IASetInputLayout(inputLayout->m_layout); +} +#endif + +void D3D11Renderer::setPrimitiveTopology(PrimitiveTopology topology) +{ + m_immediateContext->IASetPrimitiveTopology(D3DUtil::getPrimitiveTopology(topology)); +} + +void D3D11Renderer::setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffersIn, const UInt* stridesIn, const UInt* offsetsIn) +{ + static const int kMaxVertexBuffers = 16; + assert(slotCount <= kMaxVertexBuffers); + + UINT vertexStrides[kMaxVertexBuffers]; + UINT vertexOffsets[kMaxVertexBuffers]; + ID3D11Buffer* dxBuffers[kMaxVertexBuffers]; + + auto buffers = (BufferResourceImpl*const*)buffersIn; + + for (UInt ii = 0; ii < slotCount; ++ii) + { + vertexStrides[ii] = (UINT)stridesIn[ii]; + vertexOffsets[ii] = (UINT)offsetsIn[ii]; + dxBuffers[ii] = buffers[ii]->m_buffer; + } + + 
m_immediateContext->IASetVertexBuffers((UINT)startSlot, (UINT)slotCount, dxBuffers, &vertexStrides[0], &vertexOffsets[0]); +} + +void D3D11Renderer::setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) +{ + DXGI_FORMAT dxFormat = D3DUtil::getMapFormat(indexFormat); + m_immediateContext->IASetIndexBuffer(((BufferResourceImpl*)buffer)->m_buffer, dxFormat, offset); +} + +void D3D11Renderer::setDepthStencilTarget(ResourceView* depthStencilView) +{ + m_dsvBinding = ((DepthStencilViewImpl*) depthStencilView)->m_dsv; + m_targetBindingsDirty[int(PipelineType::Graphics)] = true; +} + +void D3D11Renderer::setPipelineState(PipelineType pipelineType, PipelineState* state) +{ + switch(pipelineType) + { + default: + break; + + case PipelineType::Graphics: + { + auto stateImpl = (GraphicsPipelineStateImpl*) state; + auto programImpl = stateImpl->m_program; + + // TODO: We could conceivably do some lightweight state + // differencing here (e.g., check if `programImpl` is the + // same as the program that is currently bound). + // + // It isn't clear how much that would pay off given that + // the D3D11 runtime seems to do its own state diffing. + + // IA + + m_immediateContext->IASetInputLayout(stateImpl->m_inputLayout->m_layout); + + // VS + + m_immediateContext->VSSetShader(programImpl->m_vertexShader, nullptr, 0); + + // HS + + // DS + + // GS + + // RS + + m_immediateContext->RSSetState(stateImpl->m_rasterizerState); + + // PS + + m_immediateContext->PSSetShader(programImpl->m_pixelShader, nullptr, 0); + + // OM + + m_immediateContext->OMSetDepthStencilState(stateImpl->m_depthStencilState, stateImpl->m_stencilRef); + + m_currentGraphicsState = stateImpl; + } + break; + + case PipelineType::Compute: + { + auto stateImpl = (ComputePipelineStateImpl*) state; + auto programImpl = stateImpl->m_program; + + // CS + + m_immediateContext->CSSetShader(programImpl->m_computeShader, nullptr, 0); + + m_currentComputeState = stateImpl; + } + break; + } + + /// ... 
+} + +void D3D11Renderer::draw(UInt vertexCount, UInt startVertex) +{ + _flushGraphicsState(); + m_immediateContext->Draw((UINT)vertexCount, (UINT)startVertex); +} + +void D3D11Renderer::drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) +{ + _flushGraphicsState(); + m_immediateContext->DrawIndexed((UINT)indexCount, (UINT)startIndex, (INT)baseVertex); +} + +Result D3D11Renderer::createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) +{ + if (desc.pipelineType == PipelineType::Compute) + { + auto computeKernel = desc.findKernel(StageType::Compute); + + ComPtr<ID3D11ComputeShader> computeShader; + SLANG_RETURN_ON_FAIL(m_device->CreateComputeShader(computeKernel->codeBegin, computeKernel->getCodeSize(), nullptr, computeShader.writeRef())); + + RefPtr<ShaderProgramImpl> shaderProgram = new ShaderProgramImpl(); + shaderProgram->m_computeShader.swap(computeShader); + + *outProgram = shaderProgram.detach(); + return SLANG_OK; + } + else + { + auto vertexKernel = desc.findKernel(StageType::Vertex); + auto fragmentKernel = desc.findKernel(StageType::Fragment); + + ComPtr<ID3D11VertexShader> vertexShader; + ComPtr<ID3D11PixelShader> pixelShader; + + SLANG_RETURN_ON_FAIL(m_device->CreateVertexShader(vertexKernel->codeBegin, vertexKernel->getCodeSize(), nullptr, vertexShader.writeRef())); + SLANG_RETURN_ON_FAIL(m_device->CreatePixelShader(fragmentKernel->codeBegin, fragmentKernel->getCodeSize(), nullptr, pixelShader.writeRef())); + + RefPtr<ShaderProgramImpl> shaderProgram = new ShaderProgramImpl(); + shaderProgram->m_vertexShader.swap(vertexShader); + shaderProgram->m_pixelShader.swap(pixelShader); + + *outProgram = shaderProgram.detach(); + return SLANG_OK; + } +} + +static D3D11_STENCIL_OP translateStencilOp(StencilOp op) +{ + switch(op) + { + default: + // TODO: need to report failures + return D3D11_STENCIL_OP_KEEP; + +#define CASE(FROM, TO) \ + case StencilOp::FROM: return D3D11_STENCIL_OP_##TO + + CASE(Keep, KEEP); + CASE(Zero, ZERO); + 
CASE(Replace, REPLACE); + CASE(IncrementSaturate, INCR_SAT); + CASE(DecrementSaturate, DECR_SAT); + CASE(Invert, INVERT); + CASE(IncrementWrap, INCR); + CASE(DecrementWrap, DECR); +#undef CASE + + } +} + +static D3D11_FILL_MODE translateFillMode(FillMode mode) +{ + switch(mode) + { + default: + // TODO: need to report failures + return D3D11_FILL_SOLID; + + case FillMode::Solid: return D3D11_FILL_SOLID; + case FillMode::Wireframe: return D3D11_FILL_WIREFRAME; + } +} + +static D3D11_CULL_MODE translateCullMode(CullMode mode) +{ + switch(mode) + { + default: + // TODO: need to report failures + return D3D11_CULL_NONE; + + case CullMode::None: return D3D11_CULL_NONE; + case CullMode::Back: return D3D11_CULL_BACK; + case CullMode::Front: return D3D11_CULL_FRONT; + } +} + +Result D3D11Renderer::createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) +{ + auto programImpl = (ShaderProgramImpl*) desc.program; + + ComPtr<ID3D11DepthStencilState> depthStencilState; + { + D3D11_DEPTH_STENCIL_DESC dsDesc; + dsDesc.DepthEnable = desc.depthStencil.depthTestEnable; + dsDesc.DepthWriteMask = desc.depthStencil.depthWriteEnable ? 
D3D11_DEPTH_WRITE_MASK_ALL : D3D11_DEPTH_WRITE_MASK_ZERO; + dsDesc.DepthFunc = translateComparisonFunc(desc.depthStencil.depthFunc); + dsDesc.StencilEnable = desc.depthStencil.stencilEnable; + dsDesc.StencilReadMask = desc.depthStencil.stencilReadMask; + dsDesc.StencilWriteMask = desc.depthStencil.stencilWriteMask; + + #define FACE(DST, SRC) \ + dsDesc.DST.StencilFailOp = translateStencilOp( desc.depthStencil.SRC.stencilFailOp); \ + dsDesc.DST.StencilDepthFailOp = translateStencilOp( desc.depthStencil.SRC.stencilDepthFailOp); \ + dsDesc.DST.StencilPassOp = translateStencilOp( desc.depthStencil.SRC.stencilPassOp); \ + dsDesc.DST.StencilFunc = translateComparisonFunc(desc.depthStencil.SRC.stencilFunc); \ + /* end */ + + FACE(FrontFace, frontFace); + FACE(BackFace, backFace); + + SLANG_RETURN_ON_FAIL(m_device->CreateDepthStencilState( + &dsDesc, + depthStencilState.writeRef())); + } + + ComPtr<ID3D11RasterizerState> rasterizerState; + { + D3D11_RASTERIZER_DESC rsDesc; + rsDesc.FillMode = translateFillMode(desc.rasterizer.fillMode); + rsDesc.CullMode = translateCullMode(desc.rasterizer.cullMode); + rsDesc.FrontCounterClockwise = desc.rasterizer.frontFace == FrontFaceMode::Clockwise; + rsDesc.DepthBias = desc.rasterizer.depthBias; + rsDesc.DepthBiasClamp = desc.rasterizer.depthBiasClamp; + rsDesc.SlopeScaledDepthBias = desc.rasterizer.slopeScaledDepthBias; + rsDesc.DepthClipEnable = desc.rasterizer.depthClipEnable; + rsDesc.ScissorEnable = desc.rasterizer.scissorEnable; + rsDesc.MultisampleEnable = desc.rasterizer.multisampleEnable; + rsDesc.AntialiasedLineEnable = desc.rasterizer.antialiasedLineEnable; + + SLANG_RETURN_ON_FAIL(m_device->CreateRasterizerState( + &rsDesc, + rasterizerState.writeRef())); + + } + + RefPtr<GraphicsPipelineStateImpl> state = new GraphicsPipelineStateImpl(); + state->m_program = programImpl; + state->m_stencilRef = desc.depthStencil.stencilRef; + state->m_depthStencilState = depthStencilState; + state->m_rasterizerState = rasterizerState; + 
state->m_pipelineLayout = (PipelineLayoutImpl*) desc.pipelineLayout; + state->m_inputLayout = (InputLayoutImpl*) desc.inputLayout; + state->m_rtvCount = desc.renderTargetCount; + + *outState = state.detach(); + return SLANG_OK; +} + +Result D3D11Renderer::createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) +{ + auto programImpl = (ShaderProgramImpl*) desc.program; + auto pipelineLayoutImpl = (PipelineLayoutImpl*) desc.pipelineLayout; + + RefPtr<ComputePipelineStateImpl> state = new ComputePipelineStateImpl(); + state->m_program = programImpl; + state->m_pipelineLayout = pipelineLayoutImpl; + + *outState = state.detach(); + return SLANG_OK; +} + +void D3D11Renderer::dispatchCompute(int x, int y, int z) +{ + _flushComputeState(); + m_immediateContext->Dispatch(x, y, z); +} + +Result D3D11Renderer::createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) +{ + RefPtr<DescriptorSetLayoutImpl> descriptorSetLayoutImpl = new DescriptorSetLayoutImpl(); + + UInt counts[int(D3D11DescriptorSlotType::CountOf)] = { 0, }; + + UInt rangeCount = desc.slotRangeCount; + for(UInt rr = 0; rr < rangeCount; ++rr) + { + auto rangeDesc = desc.slotRanges[rr]; + + DescriptorSetLayoutImpl::RangeInfo rangeInfo; + + switch(rangeDesc.type) + { + default: + assert(!"invalid slot type"); + return SLANG_FAIL; + + case DescriptorSlotType::Sampler: + rangeInfo.type = D3D11DescriptorSlotType::Sampler; + break; + + case DescriptorSlotType::CombinedImageSampler: + rangeInfo.type = D3D11DescriptorSlotType::CombinedTextureSampler; + break; + + case DescriptorSlotType::UniformBuffer: + case DescriptorSlotType::DynamicUniformBuffer: + rangeInfo.type = D3D11DescriptorSlotType::ConstantBuffer; + break; + + case DescriptorSlotType::SampledImage: + case DescriptorSlotType::UniformTexelBuffer: + case DescriptorSlotType::InputAttachment: + rangeInfo.type = D3D11DescriptorSlotType::ShaderResourceView; + break; + + case 
DescriptorSlotType::StorageImage: + case DescriptorSlotType::StorageTexelBuffer: + case DescriptorSlotType::StorageBuffer: + case DescriptorSlotType::DynamicStorageBuffer: + rangeInfo.type = D3D11DescriptorSlotType::UnorderedAccessView; + break; + } + + if(rangeInfo.type == D3D11DescriptorSlotType::CombinedTextureSampler) + { + auto srvTypeIndex = int(D3D11DescriptorSlotType::ShaderResourceView); + auto samplerTypeIndex = int(D3D11DescriptorSlotType::Sampler); + + rangeInfo.arrayIndex = counts[srvTypeIndex]; + rangeInfo.pairedSamplerArrayIndex = counts[samplerTypeIndex]; + + counts[srvTypeIndex] += rangeDesc.count; + counts[samplerTypeIndex] += rangeDesc.count; + } + else + { + auto typeIndex = int(rangeInfo.type); + + rangeInfo.arrayIndex = counts[typeIndex]; + counts[typeIndex] += rangeDesc.count; + } + descriptorSetLayoutImpl->m_ranges.Add(rangeInfo); + } + + for(int ii = 0; ii < int(D3D11DescriptorSlotType::CountOf); ++ii) + { + descriptorSetLayoutImpl->m_counts[ii] = counts[ii]; + } + + *outLayout = descriptorSetLayoutImpl.detach(); + return SLANG_OK; +} + +Result D3D11Renderer::createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) +{ + RefPtr<PipelineLayoutImpl> pipelineLayoutImpl = new PipelineLayoutImpl(); + + UInt counts[int(D3D11DescriptorSlotType::CountOf)] = { 0, }; + + UInt setCount = desc.descriptorSetCount; + for(UInt ii = 0; ii < setCount; ++ii) + { + auto setDesc = desc.descriptorSets[ii]; + PipelineLayoutImpl::DescriptorSetInfo setInfo; + + setInfo.layout = (DescriptorSetLayoutImpl*) setDesc.layout; + + for(int jj = 0; jj < int(D3D11DescriptorSlotType::CountOf); ++jj) + { + setInfo.baseIndices[jj] = counts[jj]; + counts[jj] += setInfo.layout->m_counts[jj]; + } + + pipelineLayoutImpl->m_descriptorSets.Add(setInfo); + } + + pipelineLayoutImpl->m_uavCount = counts[int(D3D11DescriptorSlotType::UnorderedAccessView)]; + + *outLayout = pipelineLayoutImpl.detach(); + return SLANG_OK; +} + +Result 
D3D11Renderer::createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) +{ + auto layoutImpl = (DescriptorSetLayoutImpl*)layout; + + RefPtr<DescriptorSetImpl> descriptorSetImpl = new DescriptorSetImpl(); + + descriptorSetImpl->m_layout = layoutImpl; + descriptorSetImpl->m_cbs .SetSize(layoutImpl->m_counts[int(D3D11DescriptorSlotType::ConstantBuffer)]); + descriptorSetImpl->m_srvs .SetSize(layoutImpl->m_counts[int(D3D11DescriptorSlotType::ShaderResourceView)]); + descriptorSetImpl->m_uavs .SetSize(layoutImpl->m_counts[int(D3D11DescriptorSlotType::UnorderedAccessView)]); + descriptorSetImpl->m_samplers.SetSize(layoutImpl->m_counts[int(D3D11DescriptorSlotType::Sampler)]); + + *outDescriptorSet = descriptorSetImpl.detach(); + return SLANG_OK; +} + + +#if 0 +BindingState* D3D11Renderer::createBindingState(const BindingState::Desc& bindingStateDesc) +{ + RefPtr<BindingStateImpl> bindingState(new BindingStateImpl(bindingStateDesc)); + + const auto& srcBindings = bindingStateDesc.m_bindings; + const int numBindings = int(srcBindings.Count()); + + auto& dstDetails = bindingState->m_bindingDetails; + dstDetails.SetSize(numBindings); + + for (int i = 0; i < numBindings; ++i) + { + auto& dstDetail = dstDetails[i]; + const auto& srcBinding = srcBindings[i]; + + assert(srcBinding.registerRange.isSingle()); + + switch (srcBinding.bindingType) + { + case BindingType::Buffer: + { + assert(srcBinding.resource && srcBinding.resource->isBuffer()); + + BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(srcBinding.resource.Ptr()); + const BufferResource::Desc& desc = buffer->getDesc(); + + const int elemSize = bufferDesc.elementSize <= 0 ? 
1 : bufferDesc.elementSize; + + if (bufferDesc.bindFlags & Resource::BindFlag::UnorderedAccess) + { + D3D11_UNORDERED_ACCESS_VIEW_DESC viewDesc; + memset(&viewDesc, 0, sizeof(viewDesc)); + viewDesc.Buffer.FirstElement = 0; + viewDesc.Buffer.NumElements = (UINT)(bufferDesc.sizeInBytes / elemSize); + viewDesc.Buffer.Flags = 0; + viewDesc.ViewDimension = D3D11_UAV_DIMENSION_BUFFER; + viewDesc.Format = D3DUtil::getMapFormat(bufferDesc.format); + + if (bufferDesc.elementSize == 0 && bufferDesc.format == Format::Unknown) + { + viewDesc.Buffer.Flags |= D3D11_BUFFER_UAV_FLAG_RAW; + viewDesc.Format = DXGI_FORMAT_R32_TYPELESS; + } + + SLANG_RETURN_NULL_ON_FAIL(m_device->CreateUnorderedAccessView(buffer->m_buffer, &viewDesc, dstDetail.m_uav.writeRef())); + } + if (bufferDesc.bindFlags & (Resource::BindFlag::NonPixelShaderResource | Resource::BindFlag::PixelShaderResource)) + { + D3D11_SHADER_RESOURCE_VIEW_DESC viewDesc; + memset(&viewDesc, 0, sizeof(viewDesc)); + viewDesc.Buffer.FirstElement = 0; + viewDesc.Buffer.ElementWidth = elemSize; + viewDesc.Buffer.NumElements = (UINT)(bufferDesc.sizeInBytes / elemSize); + viewDesc.Buffer.ElementOffset = 0; + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_BUFFER; + viewDesc.Format = DXGI_FORMAT_UNKNOWN; + + if (bufferDesc.elementSize == 0) + { + viewDesc.Format = DXGI_FORMAT_R32_FLOAT; + } + + SLANG_RETURN_NULL_ON_FAIL(m_device->CreateShaderResourceView(buffer->m_buffer, &viewDesc, dstDetail.m_srv.writeRef())); + } + break; + } + case BindingType::Texture: + case BindingType::CombinedTextureSampler: + { + assert(srcBinding.resource && srcBinding.resource->isTexture()); + + TextureResourceImpl* texture = static_cast<TextureResourceImpl*>(srcBinding.resource.Ptr()); + + const TextureResource::Desc& textureDesc = texture->getDesc(); + + D3D11_SHADER_RESOURCE_VIEW_DESC viewDesc; + viewDesc.Format = D3DUtil::getMapFormat(textureDesc.format); + + switch (texture->getType()) + { + case Resource::Type::Texture1D: + { + if (textureDesc.arraySize 
<= 0) + { + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE1D; + viewDesc.Texture1D.MipLevels = textureDesc.numMipLevels; + viewDesc.Texture1D.MostDetailedMip = 0; + } + else + { + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE1DARRAY; + viewDesc.Texture1DArray.ArraySize = textureDesc.arraySize; + viewDesc.Texture1DArray.FirstArraySlice = 0; + viewDesc.Texture1DArray.MipLevels = textureDesc.numMipLevels; + viewDesc.Texture1DArray.MostDetailedMip = 0; + } + break; + } + case Resource::Type::Texture2D: + { + if (textureDesc.arraySize <= 0) + { + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D; + viewDesc.Texture2D.MipLevels = textureDesc.numMipLevels; + viewDesc.Texture2D.MostDetailedMip = 0; + } + else + { + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2DARRAY; + viewDesc.Texture2DArray.ArraySize = textureDesc.arraySize; + viewDesc.Texture2DArray.FirstArraySlice = 0; + viewDesc.Texture2DArray.MipLevels = textureDesc.numMipLevels; + viewDesc.Texture2DArray.MostDetailedMip = 0; + } + break; + } + case Resource::Type::TextureCube: + { + if (textureDesc.arraySize <= 0) + { + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURECUBE; + viewDesc.TextureCube.MipLevels = textureDesc.numMipLevels; + viewDesc.TextureCube.MostDetailedMip = 0; + } + else + { + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURECUBEARRAY; + viewDesc.TextureCubeArray.MipLevels = textureDesc.numMipLevels; + viewDesc.TextureCubeArray.MostDetailedMip = 0; + viewDesc.TextureCubeArray.First2DArrayFace = 0; + viewDesc.TextureCubeArray.NumCubes = textureDesc.arraySize; + } + break; + } + case Resource::Type::Texture3D: + { + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE3D; + viewDesc.Texture3D.MipLevels = textureDesc.numMipLevels; // Old code fixed as one + viewDesc.Texture3D.MostDetailedMip = 0; + break; + } + default: + { + assert(!"Unhandled type"); + return nullptr; + } + } + + SLANG_RETURN_NULL_ON_FAIL(m_device->CreateShaderResourceView(texture->m_resource, 
&viewDesc, dstDetail.m_srv.writeRef())); + break; + } + case BindingType::Sampler: + { + const BindingState::SamplerDesc& samplerDesc = bindingStateDesc.m_samplerDescs[srcBinding.descIndex]; + + D3D11_SAMPLER_DESC desc = {}; + desc.AddressU = desc.AddressV = desc.AddressW = D3D11_TEXTURE_ADDRESS_WRAP; + + if (samplerDesc.isCompareSampler) + { + desc.ComparisonFunc = D3D11_COMPARISON_LESS_EQUAL; + desc.Filter = D3D11_FILTER_MIN_LINEAR_MAG_MIP_POINT; + desc.MinLOD = desc.MaxLOD = 0.0f; + } + else + { + desc.Filter = D3D11_FILTER_ANISOTROPIC; + desc.MaxAnisotropy = 8; + desc.MinLOD = 0.0f; + desc.MaxLOD = 100.0f; + } + SLANG_RETURN_NULL_ON_FAIL(m_device->CreateSamplerState(&desc, dstDetail.m_samplerState.writeRef())); + break; + } + default: + { + assert(!"Unhandled type"); + return nullptr; + } + } + } + + // Done + return bindingState.detach(); +} + +void D3D11Renderer::_applyBindingState(bool isCompute) +{ + auto context = m_immediateContext.get(); + + const auto& details = m_currentBindings->m_bindingDetails; + const auto& bindings = m_currentBindings->getDesc().m_bindings; + + const int numBindings = int(bindings.Count()); + + for (int i = 0; i < numBindings; ++i) + { + const auto& binding = bindings[i]; + const auto& detail = details[i]; + + const int bindingIndex = binding.registerRange.getSingleIndex(); + + switch (binding.bindingType) + { + case BindingType::Buffer: + { + assert(binding.resource && binding.resource->isBuffer()); + if (binding.resource->canBind(Resource::BindFlag::ConstantBuffer)) + { + ID3D11Buffer* buffer = static_cast<BufferResourceImpl*>(binding.resource.Ptr())->m_buffer; + if (isCompute) + context->CSSetConstantBuffers(bindingIndex, 1, &buffer); + else + { + context->VSSetConstantBuffers(bindingIndex, 1, &buffer); + context->PSSetConstantBuffers(bindingIndex, 1, &buffer); + } + } + else if (detail.m_uav) + { + if (isCompute) + context->CSSetUnorderedAccessViews(bindingIndex, 1, detail.m_uav.readRef(), nullptr); + else + 
context->OMSetRenderTargetsAndUnorderedAccessViews( + m_currentBindings->getDesc().m_numRenderTargets, + m_renderTargetViews.Buffer()->readRef(), + m_depthStencilView, + bindingIndex, + 1, + detail.m_uav.readRef(), + nullptr); + } + else + { + if (isCompute) + context->CSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); + else + { + context->PSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); + context->VSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); + } + } + break; + } + case BindingType::Texture: + { + if (detail.m_uav) + { + if (isCompute) + context->CSSetUnorderedAccessViews(bindingIndex, 1, detail.m_uav.readRef(), nullptr); + else + context->OMSetRenderTargetsAndUnorderedAccessViews(D3D11_KEEP_RENDER_TARGETS_AND_DEPTH_STENCIL, + nullptr, nullptr, bindingIndex, 1, detail.m_uav.readRef(), nullptr); + } + else + { + if (isCompute) + context->CSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); + else + { + context->PSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); + context->VSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); + } + } + break; + } + case BindingType::Sampler: + { + if (isCompute) + context->CSSetSamplers(bindingIndex, 1, detail.m_samplerState.readRef()); + else + { + context->PSSetSamplers(bindingIndex, 1, detail.m_samplerState.readRef()); + context->VSSetSamplers(bindingIndex, 1, detail.m_samplerState.readRef()); + } + break; + } + default: + { + assert(!"Not implemented"); + return; + } + } + } +} + +void D3D11Renderer::setBindingState(BindingState* state) +{ + m_currentBindings = static_cast<BindingStateImpl*>(state); +} +#endif + +void D3D11Renderer::_flushGraphicsState() +{ + auto pipelineType = int(PipelineType::Graphics); + if(m_targetBindingsDirty[pipelineType]) + { + m_targetBindingsDirty[pipelineType] = false; + + auto pipelineState = m_currentGraphicsState.Ptr(); + + auto rtvCount = pipelineState->m_rtvCount; + auto uavCount = 
pipelineState->m_pipelineLayout->m_uavCount; + + m_immediateContext->OMSetRenderTargetsAndUnorderedAccessViews( + rtvCount, + m_rtvBindings[0].readRef(), + m_dsvBinding, + rtvCount, + uavCount, + m_uavBindings[pipelineType][0].readRef(), + nullptr); + } +} + +void D3D11Renderer::_flushComputeState() +{ + auto pipelineType = int(PipelineType::Compute); + if(m_targetBindingsDirty[pipelineType]) + { + m_targetBindingsDirty[pipelineType] = false; + + auto pipelineState = m_currentComputeState.Ptr(); + + auto uavCount = pipelineState->m_pipelineLayout->m_uavCount; + + m_immediateContext->CSSetUnorderedAccessViews( + 0, + uavCount, + m_uavBindings[pipelineType][0].readRef(), + nullptr); + } +} + +void D3D11Renderer::DescriptorSetImpl::setConstantBuffer(UInt range, UInt index, BufferResource* buffer) +{ + auto bufferImpl = (BufferResourceImpl*) buffer; + auto& rangeInfo = m_layout->m_ranges[range]; + + assert(rangeInfo.type == D3D11DescriptorSlotType::ConstantBuffer); + + m_cbs[rangeInfo.arrayIndex + index] = bufferImpl->m_buffer; +} + +void D3D11Renderer::DescriptorSetImpl::setResource(UInt range, UInt index, ResourceView* view) +{ + auto viewImpl = (ResourceViewImpl*)view; + auto& rangeInfo = m_layout->m_ranges[range]; + + switch (rangeInfo.type) + { + case D3D11DescriptorSlotType::ShaderResourceView: + { + assert(viewImpl->m_type == ResourceViewImpl::Type::SRV); + auto srvImpl = (ShaderResourceViewImpl*)viewImpl; + m_srvs[rangeInfo.arrayIndex + index] = srvImpl->m_srv; + } + break; + + case D3D11DescriptorSlotType::UnorderedAccessView: + { + assert(viewImpl->m_type == ResourceViewImpl::Type::UAV); + auto uavImpl = (UnorderedAccessViewImpl*)viewImpl; + m_uavs[rangeInfo.arrayIndex + index] = uavImpl->m_uav; + } + break; + + default: + assert(!"invalid to bind a resource view to this descriptor range"); + break; + } +} + +void D3D11Renderer::DescriptorSetImpl::setSampler(UInt range, UInt index, SamplerState* sampler) +{ + auto samplerImpl = (SamplerStateImpl*) sampler; + 
auto& rangeInfo = m_layout->m_ranges[range]; + + assert(rangeInfo.type == D3D11DescriptorSlotType::Sampler); + + m_samplers[rangeInfo.arrayIndex + index] = samplerImpl->m_sampler; +} + +void D3D11Renderer::DescriptorSetImpl::setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) +{ + auto viewImpl = (ResourceViewImpl*) textureView; + auto samplerImpl = (SamplerStateImpl*)sampler; + + auto& rangeInfo = m_layout->m_ranges[range]; + assert(rangeInfo.type == D3D11DescriptorSlotType::CombinedTextureSampler); + + assert(viewImpl->m_type == ResourceViewImpl::Type::SRV); + auto srvImpl = (ShaderResourceViewImpl*)viewImpl; + m_srvs[rangeInfo.arrayIndex + index] = srvImpl->m_srv; + + m_samplers[rangeInfo.arrayIndex + index] = samplerImpl->m_sampler; + + // TODO: need a place to bind the matching sampler... + m_srvs[rangeInfo.pairedSamplerArrayIndex + index] = srvImpl->m_srv; +} + +void D3D11Renderer::setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) +{ + auto pipelineLayoutImpl = (PipelineLayoutImpl*)layout; + auto descriptorSetImpl = (DescriptorSetImpl*) descriptorSet; + + auto descriptorSetLayoutImpl = descriptorSetImpl->m_layout; + auto& setInfo = pipelineLayoutImpl->m_descriptorSets[index]; + + // Note: `setInfo->layout` and `descriptorSetLayoutImpl` need to be compatible + + // TODO: If/when we add per-stage visibility masks, it would be best to organize + // this as a loop over stages, so that we only do the binding that is required + // for each stage. + + { + int slotType = int(D3D11DescriptorSlotType::ConstantBuffer); + UInt slotCount = setInfo.layout->m_counts[slotType]; + if(slotCount) + { + UInt startSlot = setInfo.baseIndices[slotType]; + + auto cbs = descriptorSetImpl->m_cbs[0].readRef(); + + m_immediateContext->VSSetConstantBuffers(startSlot, slotCount, cbs); + // ... 
+ m_immediateContext->PSSetConstantBuffers(startSlot, slotCount, cbs); + + m_immediateContext->CSSetConstantBuffers(startSlot, slotCount, cbs); + } + } + + { + int slotType = int(D3D11DescriptorSlotType::ShaderResourceView); + UInt slotCount = setInfo.layout->m_counts[slotType]; + if(slotCount) + { + UInt startSlot = setInfo.baseIndices[slotType]; + + auto srvs = descriptorSetImpl->m_srvs[0].readRef(); + + m_immediateContext->VSSetShaderResources(startSlot, slotCount, srvs); + // ... + m_immediateContext->PSSetShaderResources(startSlot, slotCount, srvs); + + m_immediateContext->CSSetShaderResources(startSlot, slotCount, srvs); + } + } + + { + int slotType = int(D3D11DescriptorSlotType::Sampler); + UInt slotCount = setInfo.layout->m_counts[slotType]; + if(slotCount) + { + UInt startSlot = setInfo.baseIndices[slotType]; + + auto samplers = descriptorSetImpl->m_samplers[0].readRef(); + + m_immediateContext->VSSetSamplers(startSlot, slotCount, samplers); + // ... + m_immediateContext->PSSetSamplers(startSlot, slotCount, samplers); + + m_immediateContext->CSSetSamplers(startSlot, slotCount, samplers); + } + } + + { + // Note: UAVs are handled differently from other bindings, because + // D3D11 requires all UAVs to be set with a single call, rather + // than allowing incremental updates. We will therefore shadow + // the UAV bindings with `m_uavBindings` and then flush them + // as needed right before a draw/dispatch. 
+ // + int slotType = int(D3D11DescriptorSlotType::UnorderedAccessView); + UInt slotCount = setInfo.layout->m_counts[slotType]; + if(slotCount) + { + UInt startSlot = setInfo.baseIndices[slotType]; + + auto uavs = descriptorSetImpl->m_uavs[0].readRef(); + + for(UINT ii = 0; ii < slotCount; ++ii) + { + m_uavBindings[int(pipelineType)][startSlot + ii] = uavs[ii]; + } + m_targetBindingsDirty[int(pipelineType)] = true; + } + } + + +} + +} // renderer_test diff --git a/tools/slang-graphics/render-d3d11.h b/tools/gfx/render-d3d11.h index 7b3d25e9f..9e671d541 100644 --- a/tools/slang-graphics/render-d3d11.h +++ b/tools/gfx/render-d3d11.h @@ -1,10 +1,10 @@ // render-d3d11.h #pragma once -namespace slang_graphics { +namespace gfx { class Renderer; Renderer* createD3D11Renderer(); -} // slang_graphics +} // gfx diff --git a/tools/slang-graphics/render-d3d12.cpp b/tools/gfx/render-d3d12.cpp index 24c9ecacb..2d3b8f521 100644 --- a/tools/slang-graphics/render-d3d12.cpp +++ b/tools/gfx/render-d3d12.cpp @@ -1,4 +1,4 @@ -// render-d3d12.cpp +// render-d3d12.cpp #define _CRT_SECURE_NO_WARNINGS #include "render-d3d12.h" @@ -46,7 +46,7 @@ #define ENABLE_DEBUG_LAYER 1 -namespace slang_graphics { +namespace gfx { using namespace Slang; class D3D12Renderer : public Renderer @@ -57,20 +57,40 @@ public: virtual void setClearColor(const float color[4]) override; virtual void clearFrame() override; virtual void presentFrame() override; - virtual TextureResource* createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData) override; - virtual BufferResource* createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& bufferDesc, const void* initData) override; + TextureResource::Desc getSwapChainTextureDesc() override; + + Result createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData, TextureResource** outResource) override; + Result 
createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& desc, const void* initData, BufferResource** outResource) override; + Result createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) override; + + Result createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) override; + Result createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) override; + + Result createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount, InputLayout** outLayout) override; + + Result createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) override; + Result createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) override; + Result createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) override; + + Result createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) override; + Result createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) override; + Result createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) override; + virtual SlangResult captureScreenSurface(Surface& surfaceOut) override; - virtual InputLayout* createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount) override; - virtual BindingState* createBindingState(const BindingState::Desc& bindingStateDesc) override; - virtual ShaderProgram* createProgram(const ShaderProgram::Desc& desc) override; + virtual void* map(BufferResource* buffer, MapFlavor flavor) override; virtual void unmap(BufferResource* buffer) override; - virtual void setInputLayout(InputLayout* inputLayout) override; +// virtual void setInputLayout(InputLayout* inputLayout) override; virtual void setPrimitiveTopology(PrimitiveTopology topology) override; - virtual void setBindingState(BindingState* state); 
+ + virtual void setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) override; + virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) override; - virtual void setShaderProgram(ShaderProgram* inProgram) override; + virtual void setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) override; + virtual void setDepthStencilTarget(ResourceView* depthStencilView) override; + virtual void setPipelineState(PipelineType pipelineType, PipelineState* state) override; virtual void draw(UInt vertexCount, UInt startVertex) override; + virtual void drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) override; virtual void dispatchCompute(int x, int y, int z) override; virtual void submitGpuWork() override; virtual void waitForGpu() override; @@ -82,6 +102,9 @@ protected: static const Int kMaxNumRenderFrames = 4; static const Int kMaxNumRenderTargets = 3; + static const Int kMaxRTVCount = 8; + static const Int kMaxDescriptorSetCount = 16; + struct Submitter { virtual void setRootConstantBufferView(int index, D3D12_GPU_VIRTUAL_ADDRESS gpuBufferLocation) = 0; @@ -153,11 +176,28 @@ protected: static BackingStyle _calcResourceBackingStyle(Usage usage) { + // Note: the D3D12 back-end has support for "versioning" of constant buffers, + // where the same logical `BufferResource` can actually point to different + // backing storage over its lifetime, to emulate the ability to modify the + // buffer contents as in D3D11, etc. + // + // The VK back-end doesn't have the same behavior, and it is difficult + // to both support this degree of flexibility *and* efficiently exploit + // descriptor tables (since any table referencing the buffer would need + // to be updated when a new buffer "version" gets allocated). 
+ // + // I'm choosing to disable this for now, and make all buffers be memory-backed, + // although this creates synchronization issues that we'll have to address + // next. + + return BackingStyle::ResourceBacked; +#if 0 switch (usage) { case Usage::ConstantBuffer: return BackingStyle::MemoryBacked; default: return BackingStyle::ResourceBacked; } +#endif } BackingStyle m_backingStyle; ///< How the resource is 'backed' - either as a resource or cpu memory. Cpu memory is typically used for constant buffers. @@ -183,6 +223,19 @@ protected: D3D12Resource m_resource; }; + class SamplerStateImpl : public SamplerState + { + public: + D3D12_CPU_DESCRIPTOR_HANDLE m_cpuHandle; + }; + + class ResourceViewImpl : public ResourceView + { + public: + RefPtr<Resource> m_resource; + D3D12HostVisibleDescriptor m_descriptor; + }; + class InputLayoutImpl: public InputLayout { public: @@ -190,6 +243,7 @@ protected: List<char> m_text; ///< Holds all strings to keep in scope }; +#if 0 struct BindingDetail { int m_srvIndex = -1; @@ -216,20 +270,99 @@ protected: {} List<BindingDetail> m_bindingDetails; ///< These match 1-1 to the bindings in the m_desc - - D3D12DescriptorHeap m_viewHeap; ///< Cbv, Srv, Uav - D3D12DescriptorHeap m_samplerHeap; ///< Heap for samplers }; +#endif - class RenderState: public RefObject + class DescriptorSetLayoutImpl : public DescriptorSetLayout { - public: - D3D12_PRIMITIVE_TOPOLOGY_TYPE m_primitiveTopologyType; - RefPtr<BindingStateImpl> m_bindingState; - RefPtr<InputLayoutImpl> m_inputLayout; - RefPtr<ShaderProgramImpl> m_shaderProgram; + public: + struct RangeInfo + { + DescriptorSlotType type; + Int count; + Int arrayIndex; + }; + + List<RangeInfo> m_ranges; + List<D3D12_DESCRIPTOR_RANGE> m_dxRanges; + List<D3D12_ROOT_PARAMETER> m_dxRootParameters; + + Int m_resourceCount; + Int m_samplerCount; + }; + + class PipelineLayoutImpl : public PipelineLayout + { + public: ComPtr<ID3D12RootSignature> m_rootSignature; + UInt m_descriptorSetCount; + }; + + class 
DescriptorSetImpl : public DescriptorSet + { + public: + virtual void setConstantBuffer(UInt range, UInt index, BufferResource* buffer) override; + virtual void setResource(UInt range, UInt index, ResourceView* view) override; + virtual void setSampler(UInt range, UInt index, SamplerState* sampler) override; + virtual void setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) override; + + RefPtr<D3D12Renderer> m_renderer; + RefPtr<DescriptorSetLayoutImpl> m_layout; + + D3D12DescriptorHeap* m_resourceHeap = nullptr; + D3D12DescriptorHeap* m_samplerHeap = nullptr; + + Int m_resourceTable = 0; + Int m_samplerTable = 0; + + // The following arrays are used to retain the relevant + // objects so that they will not be released while this + // descriptor-set is still alive. + // + // For the `m_resourceObjects` array, the values are either + // the relevant `ResourceViewImpl` for SRV/UAV slots, or + // a `BufferResourceImpl` for a CBV slot. + // + List<RefPtr<RefObject>> m_resourceObjects; + List<RefPtr<SamplerStateImpl>> m_samplerObjects; + }; + + + // During command submission, we need all the descriptor tables that get + // used to come from a single heap (for each descriptor heap type). + // + // We will thus keep a single heap of each type that we hope will hold + // all the descriptors that actually get needed in a frame. + // + // TODO: we need an allocation policy to reallocate and resize these + // if/when we run out of space during a frame. 
+ // + D3D12DescriptorHeap m_viewHeap; ///< Cbv, Srv, Uav + D3D12DescriptorHeap m_samplerHeap; ///< Heap for samplers + + D3D12HostVisibleDescriptorAllocator m_rtvAllocator; + D3D12HostVisibleDescriptorAllocator m_dsvAllocator; + + D3D12HostVisibleDescriptorAllocator m_viewAllocator; + D3D12HostVisibleDescriptorAllocator m_samplerAllocator; + + // Space in the GPU-visible heaps is precious, so we will also keep + // around CPU-visible heaps for storing descriptors in a format + // that is ready for copying into the GPU-visible heaps as needed. + // + D3D12DescriptorHeap m_cpuViewHeap; ///< Cbv, Srv, Uav + D3D12DescriptorHeap m_cpuSamplerHeap; ///< Heap for samplers + + class PipelineStateImpl : public PipelineState + { + public: + PipelineType m_pipelineType; + RefPtr<PipelineLayoutImpl> m_pipelineLayout; ComPtr<ID3D12PipelineState> m_pipelineState; }; @@ -240,6 +373,7 @@ protected: int m_offset; }; +#if 0 struct BindParameters { enum @@ -261,6 +395,7 @@ protected: D3D12_ROOT_PARAMETER m_parameters[kMaxParameters]; int m_paramIndex; }; +#endif struct GraphicsSubmitter : public Submitter { @@ -329,15 +464,16 @@ protected: ID3D12GraphicsCommandList* getCommandList() const { return m_commandList; } - RenderState* calcRenderState(); +// RenderState* calcRenderState(); + /// From current bindings calculate the root signature and pipeline state - Result calcGraphicsPipelineState(ComPtr<ID3D12RootSignature>& sigOut, ComPtr<ID3D12PipelineState>& pipelineStateOut); - Result calcComputePipelineState(ComPtr<ID3D12RootSignature>& signatureOut, ComPtr<ID3D12PipelineState>& pipelineStateOut); +// Result calcGraphicsPipelineState(ComPtr<ID3D12RootSignature>& sigOut, ComPtr<ID3D12PipelineState>& pipelineStateOut); +// Result calcComputePipelineState(ComPtr<ID3D12RootSignature>& signatureOut, ComPtr<ID3D12PipelineState>& pipelineStateOut); - Result _bindRenderState(RenderState* renderState, ID3D12GraphicsCommandList* commandList, Submitter* submitter); + Result 
_bindRenderState(PipelineStateImpl* pipelineStateImpl, ID3D12GraphicsCommandList* commandList, Submitter* submitter); - Result _calcBindParameters(BindParameters& params); - RenderState* findRenderState(PipelineType pipelineType); +// Result _calcBindParameters(BindParameters& params); +// RenderState* findRenderState(PipelineType pipelineType); PFN_D3D12_SERIALIZE_ROOT_SIGNATURE m_D3D12SerializeRootSignature = nullptr; @@ -347,9 +483,17 @@ protected: List<BoundVertexBuffer> m_boundVertexBuffers; - RefPtr<ShaderProgramImpl> m_boundShaderProgram; - RefPtr<InputLayoutImpl> m_boundInputLayout; - RefPtr<BindingStateImpl> m_boundBindingState; + RefPtr<BufferResourceImpl> m_boundIndexBuffer; + DXGI_FORMAT m_boundIndexFormat; + UINT m_boundIndexOffset; + + RefPtr<PipelineStateImpl> m_currentPipelineState; + +// RefPtr<ShaderProgramImpl> m_boundShaderProgram; +// RefPtr<InputLayoutImpl> m_boundInputLayout; + +// RefPtr<BindingStateImpl> m_boundBindingState; + RefPtr<DescriptorSetImpl> m_boundDescriptorSets[int(PipelineType::CountOf)][kMaxDescriptorSetCount]; DXGI_FORMAT m_targetFormat = DXGI_FORMAT_R8G8B8A8_UNORM; DXGI_FORMAT m_depthStencilFormat = DXGI_FORMAT_D24_UNORM_S8_UINT; @@ -376,17 +520,17 @@ protected: ComPtr<ID3D12Device> m_device; ComPtr<IDXGISwapChain3> m_swapChain; ComPtr<ID3D12CommandQueue> m_commandQueue; - ComPtr<ID3D12DescriptorHeap> m_rtvHeap; +// ComPtr<ID3D12DescriptorHeap> m_rtvHeap; ComPtr<ID3D12GraphicsCommandList> m_commandList; D3D12_RECT m_scissorRect = {}; - List<RefPtr<RenderState> > m_renderStates; ///< Holds list of all render state combinations - RenderState* m_currentRenderState = nullptr; ///< The current combination +// List<RefPtr<RenderState> > m_renderStates; ///< Holds list of all render state combinations +// RenderState* m_currentRenderState = nullptr; ///< The current combination UINT m_rtvDescriptorSize = 0; - ComPtr<ID3D12DescriptorHeap> m_dsvHeap; +// ComPtr<ID3D12DescriptorHeap> m_dsvHeap; UINT m_dsvDescriptorSize = 0; // 
Synchronization objects. @@ -408,8 +552,8 @@ protected: D3D12Resource m_backBufferResources[kMaxNumRenderTargets]; D3D12Resource m_renderTargetResources[kMaxNumRenderTargets]; - D3D12Resource m_depthStencil; - D3D12_CPU_DESCRIPTOR_HANDLE m_depthStencilView = {}; + RefPtr<ResourceViewImpl> m_rtvs[kMaxRTVCount]; + RefPtr<ResourceViewImpl> m_dsv; int32_t m_depthStencilUsageFlags = 0; ///< D3DUtil::UsageFlag combination for depth stencil int32_t m_targetUsageFlags = 0; ///< D3DUtil::UsageFlag combination for target @@ -615,15 +759,12 @@ void D3D12Renderer::_resetCommandList() ID3D12GraphicsCommandList* commandList = getCommandList(); commandList->Reset(frame.m_commandAllocator, nullptr); - D3D12_CPU_DESCRIPTOR_HANDLE rtvHandle = { m_rtvHeap->GetCPUDescriptorHandleForHeapStart().ptr + m_renderTargetIndex * m_rtvDescriptorSize }; - if (m_depthStencil) - { - commandList->OMSetRenderTargets(1, &rtvHandle, FALSE, &m_depthStencilView); - } - else - { - commandList->OMSetRenderTargets(1, &rtvHandle, FALSE, nullptr); - } + // TIM: when should this get set? +// commandList->OMSetRenderTargets( +// 1, +// &m_rtvs[0]->m_descriptor.cpuHandle, +// FALSE, +// m_dsv ? &m_dsv->m_descriptor.cpuHandle : nullptr); // Set necessary state. 
commandList->RSSetViewports(1, &m_viewport); @@ -797,6 +938,7 @@ Result D3D12Renderer::captureTextureToSurface(D3D12Resource& resource, Surface& } } +#if 0 Result D3D12Renderer::calcComputePipelineState(ComPtr<ID3D12RootSignature>& signatureOut, ComPtr<ID3D12PipelineState>& pipelineStateOut) { BindParameters bindParameters; @@ -832,123 +974,9 @@ Result D3D12Renderer::calcComputePipelineState(ComPtr<ID3D12RootSignature>& sign return SLANG_OK; } +#endif -Result D3D12Renderer::calcGraphicsPipelineState(ComPtr<ID3D12RootSignature>& signatureOut, ComPtr<ID3D12PipelineState>& pipelineStateOut) -{ - BindParameters bindParameters; - _calcBindParameters(bindParameters); - - ComPtr<ID3D12RootSignature> rootSignature; - ComPtr<ID3D12PipelineState> pipelineState; - - { - // Deny unnecessary access to certain pipeline stages - D3D12_ROOT_SIGNATURE_DESC rootSignatureDesc; - rootSignatureDesc.NumParameters = bindParameters.m_paramIndex; - rootSignatureDesc.pParameters = bindParameters.m_parameters; - rootSignatureDesc.NumStaticSamplers = 0; - rootSignatureDesc.pStaticSamplers = nullptr; - rootSignatureDesc.Flags = m_boundInputLayout ? 
D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT : D3D12_ROOT_SIGNATURE_FLAG_NONE; - - ComPtr<ID3DBlob> signature; - ComPtr<ID3DBlob> error; - SLANG_RETURN_ON_FAIL(m_D3D12SerializeRootSignature(&rootSignatureDesc, D3D_ROOT_SIGNATURE_VERSION_1, signature.writeRef(), error.writeRef())); - SLANG_RETURN_ON_FAIL(m_device->CreateRootSignature(0, signature->GetBufferPointer(), signature->GetBufferSize(), IID_PPV_ARGS(rootSignature.writeRef()))); - } - - { - // Describe and create the graphics pipeline state object (PSO) - D3D12_GRAPHICS_PIPELINE_STATE_DESC psoDesc = {}; - - psoDesc.pRootSignature = rootSignature; - - psoDesc.VS = { m_boundShaderProgram->m_vertexShader.Buffer(), m_boundShaderProgram->m_vertexShader.Count() }; - psoDesc.PS = { m_boundShaderProgram->m_pixelShader.Buffer(), m_boundShaderProgram->m_pixelShader.Count() }; - - { - psoDesc.InputLayout = { m_boundInputLayout->m_elements.Buffer(), UINT(m_boundInputLayout->m_elements.Count()) }; - psoDesc.PrimitiveTopologyType = m_primitiveTopologyType; - - { - const int numRenderTargets = m_boundBindingState ? 
m_boundBindingState->getDesc().m_numRenderTargets : 1; - - psoDesc.DSVFormat = m_depthStencilFormat; - psoDesc.NumRenderTargets = numRenderTargets; - for (Int i = 0; i < numRenderTargets; i++) - { - psoDesc.RTVFormats[i] = m_targetFormat; - } - - psoDesc.SampleDesc.Count = 1; - psoDesc.SampleDesc.Quality = 0; - - psoDesc.SampleMask = UINT_MAX; - } - - { - auto& rs = psoDesc.RasterizerState; - rs.FillMode = D3D12_FILL_MODE_SOLID; - rs.CullMode = D3D12_CULL_MODE_NONE; - rs.FrontCounterClockwise = FALSE; - rs.DepthBias = D3D12_DEFAULT_DEPTH_BIAS; - rs.DepthBiasClamp = D3D12_DEFAULT_DEPTH_BIAS_CLAMP; - rs.SlopeScaledDepthBias = D3D12_DEFAULT_SLOPE_SCALED_DEPTH_BIAS; - rs.DepthClipEnable = TRUE; - rs.MultisampleEnable = FALSE; - rs.AntialiasedLineEnable = FALSE; - rs.ForcedSampleCount = 0; - rs.ConservativeRaster = D3D12_CONSERVATIVE_RASTERIZATION_MODE_OFF; - } - - { - D3D12_BLEND_DESC& blend = psoDesc.BlendState; - - blend.AlphaToCoverageEnable = FALSE; - blend.IndependentBlendEnable = FALSE; - const D3D12_RENDER_TARGET_BLEND_DESC defaultRenderTargetBlendDesc = - { - FALSE,FALSE, - D3D12_BLEND_ONE, D3D12_BLEND_ZERO, D3D12_BLEND_OP_ADD, - D3D12_BLEND_ONE, D3D12_BLEND_ZERO, D3D12_BLEND_OP_ADD, - D3D12_LOGIC_OP_NOOP, - D3D12_COLOR_WRITE_ENABLE_ALL, - }; - for (UINT i = 0; i < D3D12_SIMULTANEOUS_RENDER_TARGET_COUNT; ++i) - { - blend.RenderTarget[i] = defaultRenderTargetBlendDesc; - } - } - - { - auto& ds = psoDesc.DepthStencilState; - - ds.DepthEnable = FALSE; - ds.DepthWriteMask = D3D12_DEPTH_WRITE_MASK_ALL; - ds.DepthFunc = D3D12_COMPARISON_FUNC_ALWAYS; - //ds.DepthFunc = D3D12_COMPARISON_FUNC_LESS; - ds.StencilEnable = FALSE; - ds.StencilReadMask = D3D12_DEFAULT_STENCIL_READ_MASK; - ds.StencilWriteMask = D3D12_DEFAULT_STENCIL_WRITE_MASK; - const D3D12_DEPTH_STENCILOP_DESC defaultStencilOp = - { - D3D12_STENCIL_OP_KEEP, D3D12_STENCIL_OP_KEEP, D3D12_STENCIL_OP_KEEP, D3D12_COMPARISON_FUNC_ALWAYS - }; - ds.FrontFace = defaultStencilOp; - ds.BackFace = defaultStencilOp; - } 
- } - - psoDesc.PrimitiveTopologyType = m_primitiveTopologyType; - - SLANG_RETURN_ON_FAIL(m_device->CreateGraphicsPipelineState(&psoDesc, IID_PPV_ARGS(pipelineState.writeRef()))); - } - - signatureOut.swap(rootSignature); - pipelineStateOut.swap(pipelineState); - - return SLANG_OK; -} - +#if 0 D3D12Renderer::RenderState* D3D12Renderer::findRenderState(PipelineType pipelineType) { switch (pipelineType) @@ -1172,72 +1200,74 @@ Result D3D12Renderer::_calcBindParameters(BindParameters& params) } return SLANG_OK; } +#endif -Result D3D12Renderer::_bindRenderState(RenderState* renderState, ID3D12GraphicsCommandList* commandList, Submitter* submitter) +Result D3D12Renderer::_bindRenderState(PipelineStateImpl* pipelineStateImpl, ID3D12GraphicsCommandList* commandList, Submitter* submitter) { - BindingStateImpl* bindingState = m_boundBindingState; + // TODO: we should only set some of this state as needed... - submitter->setRootSignature(renderState->m_rootSignature); - commandList->SetPipelineState(renderState->m_pipelineState); + auto pipelineTypeIndex = (int) pipelineStateImpl->m_pipelineType; + auto pipelineLayout = pipelineStateImpl->m_pipelineLayout; - if (bindingState) - { - ID3D12DescriptorHeap* heaps[] = - { - bindingState->m_viewHeap.getHeap(), - bindingState->m_samplerHeap.getHeap(), - }; - commandList->SetDescriptorHeaps(SLANG_COUNT_OF(heaps), heaps); - } - else + submitter->setRootSignature(pipelineLayout->m_rootSignature); + commandList->SetPipelineState(pipelineStateImpl->m_pipelineState); + + ID3D12DescriptorHeap* heaps[] = { - commandList->SetDescriptorHeaps(0, nullptr); - } + m_viewHeap.getHeap(), + m_samplerHeap.getHeap(), + }; + commandList->SetDescriptorHeaps(SLANG_COUNT_OF(heaps), heaps); + + // We need to copy descriptors over from the descriptor sets + // (where they are stored in CPU-visible heaps) to the GPU-visible + // heaps so that they can be accessed by shader code. 
+ Int descriptorSetCount = pipelineLayout->m_descriptorSetCount; + Int rootParameterIndex = 0; + for(Int dd = 0; dd < descriptorSetCount; ++dd) { - int index = 0; + auto descriptorSet = m_boundDescriptorSets[pipelineTypeIndex][dd]; + auto descriptorSetLayout = descriptorSet->m_layout; + + // TODO: require that `descriptorSetLayout` is compatible with + // `pipelineLayout->descriptorSetlayouts[dd]`. - int numConstantBuffers = 0; { - if (bindingState) + if(auto descriptorCount = descriptorSetLayout->m_resourceCount) { - D3D12DescriptorHeap& heap = bindingState->m_viewHeap; - const auto& details = bindingState->m_bindingDetails; - const auto& bindings = bindingState->getDesc().m_bindings; - const int numBindings = int(details.Count()); + auto& gpuHeap = m_viewHeap; + auto gpuDescriptorTable = gpuHeap.allocate(descriptorCount); - for (int i = 0; i < numBindings; i++) - { - const auto& detail = details[i]; - const auto& binding = bindings[i]; + auto& cpuHeap = *descriptorSet->m_resourceHeap; + auto cpuDescriptorTable = descriptorSet->m_resourceTable; - if (binding.bindingType == BindingType::Buffer) - { - assert(binding.resource && binding.resource->isBuffer()); - if (binding.resource->canBind(Resource::BindFlag::ConstantBuffer)) - { - BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(binding.resource.Ptr()); - buffer->bindConstantBufferView(m_circularResourceHeap, index++, submitter); - numConstantBuffers++; - } - } + m_device->CopyDescriptorsSimple( + descriptorCount, + gpuHeap.getCpuHandle(gpuDescriptorTable), + cpuHeap.getCpuHandle(cpuDescriptorTable), + D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV); - if (detail.m_srvIndex >= 0) - { - submitter->setRootDescriptorTable(index++, heap.getGpuHandle(detail.m_srvIndex)); - } - - if (detail.m_uavIndex >= 0) - { - submitter->setRootDescriptorTable(index++, heap.getGpuHandle(detail.m_uavIndex)); - } - } + submitter->setRootDescriptorTable(rootParameterIndex++, gpuHeap.getGpuHandle(gpuDescriptorTable)); } } - - if 
(bindingState && bindingState->m_samplerHeap.getUsedSize() > 0) { - submitter->setRootDescriptorTable(index, bindingState->m_samplerHeap.getGpuStart()); + if(auto descriptorCount = descriptorSetLayout->m_samplerCount) + { + auto& gpuHeap = m_samplerHeap; + auto gpuDescriptorTable = gpuHeap.allocate(descriptorCount); + + auto& cpuHeap = *descriptorSet->m_samplerHeap; + auto cpuDescriptorTable = descriptorSet->m_samplerTable; + + m_device->CopyDescriptorsSimple( + descriptorCount, + gpuHeap.getCpuHandle(gpuDescriptorTable), + cpuHeap.getCpuHandle(cpuDescriptorTable), + D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER); + + submitter->setRootDescriptorTable(rootParameterIndex++, gpuHeap.getGpuHandle(gpuDescriptorTable)); + } } } @@ -1339,9 +1369,15 @@ Result D3D12Renderer::initialize(const Desc& desc, void* inWindowHandle) if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE) { + // TODO: may want to allow software driver as fallback } - else if (SUCCEEDED(D3D12CreateDevice_(candidateAdapter, featureLevel, IID_PPV_ARGS(m_device.writeRef())))) + else + { + continue; + } + + if (SUCCEEDED(D3D12CreateDevice_(candidateAdapter, featureLevel, IID_PPV_ARGS(m_device.writeRef())))) { // We found one! adapter = candidateAdapter; @@ -1356,6 +1392,27 @@ Result D3D12Renderer::initialize(const Desc& desc, void* inWindowHandle) return SLANG_FAIL; } + // set up debug layer +#ifndef NDEBUG + { + + LOAD_D3D_PROC(PFN_D3D12_GET_DEBUG_INTERFACE, D3D12GetDebugInterface); + if (!D3D12GetDebugInterface_) + { + return SLANG_FAIL; + } + + ComPtr<ID3D12Debug> debug; + + if (!SUCCEEDED(D3D12GetDebugInterface_(IID_PPV_ARGS(debug.writeRef())))) + { + return SLANG_FAIL; + } + + debug->EnableDebugLayer(); + } +#endif + m_numRenderFrames = 3; m_numRenderTargets = 2; @@ -1432,27 +1489,17 @@ Result D3D12Renderer::initialize(const Desc& desc, void* inWindowHandle) m_renderTargetIndex = m_swapChain->GetCurrentBackBufferIndex(); // Create descriptor heaps. - { - // Describe and create a render target view (RTV) descriptor heap. 
- D3D12_DESCRIPTOR_HEAP_DESC rtvHeapDesc = {}; - rtvHeapDesc.NumDescriptors = m_numRenderTargets; - rtvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV; - rtvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE; - SLANG_RETURN_ON_FAIL(m_device->CreateDescriptorHeap(&rtvHeapDesc, IID_PPV_ARGS(m_rtvHeap.writeRef()))); - m_rtvDescriptorSize = m_device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_RTV); - } + SLANG_RETURN_ON_FAIL(m_viewHeap.init (m_device, 256, D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV, D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE)); + SLANG_RETURN_ON_FAIL(m_samplerHeap.init(m_device, 16, D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER, D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE)); - { - // Describe and create a depth stencil view (DSV) descriptor heap. - D3D12_DESCRIPTOR_HEAP_DESC dsvHeapDesc = {}; - dsvHeapDesc.NumDescriptors = 1; - dsvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_DSV; - dsvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE; - SLANG_RETURN_ON_FAIL(m_device->CreateDescriptorHeap(&dsvHeapDesc, IID_PPV_ARGS(m_dsvHeap.writeRef()))); + SLANG_RETURN_ON_FAIL(m_cpuViewHeap.init (m_device, 1024, D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV, D3D12_DESCRIPTOR_HEAP_FLAG_NONE)); + SLANG_RETURN_ON_FAIL(m_cpuSamplerHeap.init(m_device, 64, D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER, D3D12_DESCRIPTOR_HEAP_FLAG_NONE)); - m_dsvDescriptorSize = m_device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_DSV); - } + SLANG_RETURN_ON_FAIL(m_rtvAllocator.init (m_device, 16, D3D12_DESCRIPTOR_HEAP_TYPE_RTV)); + SLANG_RETURN_ON_FAIL(m_dsvAllocator.init (m_device, 16, D3D12_DESCRIPTOR_HEAP_TYPE_DSV)); + SLANG_RETURN_ON_FAIL(m_viewAllocator.init (m_device, 64, D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV)); + SLANG_RETURN_ON_FAIL(m_samplerAllocator.init(m_device, 16, D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER)); // Setup frame resources { @@ -1488,7 +1535,7 @@ Result D3D12Renderer::createFrameResources() { // Create back buffers { - D3D12_CPU_DESCRIPTOR_HANDLE 
rtvStart(m_rtvHeap->GetCPUDescriptorHandleForHeapStart()); +// D3D12_CPU_DESCRIPTOR_HANDLE rtvStart(m_rtvHeap->GetCPUDescriptorHandleForHeapStart()); // Work out target format D3D12_RESOURCE_DESC resourceDesc; @@ -1543,8 +1590,10 @@ Result D3D12Renderer::createFrameResources() m_renderTargets[i] = &m_renderTargetResources[i]; } - D3D12_CPU_DESCRIPTOR_HANDLE rtvHandle = { rtvStart.ptr + i * m_rtvDescriptorSize }; - m_device->CreateRenderTargetView(*m_renderTargets[i], nullptr, rtvHandle); + D3D12HostVisibleDescriptor rtvDescriptor; + SLANG_RETURN_ON_FAIL(m_rtvAllocator.allocate(&rtvDescriptor)); + + m_device->CreateRenderTargetView(*m_renderTargets[i], nullptr, rtvDescriptor.cpuHandle); } } @@ -1594,6 +1643,7 @@ Result D3D12Renderer::createFrameResources() resourceDesc.Flags = D3D12_RESOURCE_FLAG_ALLOW_DEPTH_STENCIL; resourceDesc.Alignment = 0; +#if 0 SLANG_RETURN_ON_FAIL(m_depthStencil.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, resourceDesc, D3D12_RESOURCE_STATE_DEPTH_WRITE, &clearValue)); // Set the depth stencil @@ -1605,6 +1655,7 @@ Result D3D12Renderer::createFrameResources() // Set up as the depth stencil view m_device->CreateDepthStencilView(m_depthStencil, &depthStencilDesc, m_dsvHeap->GetCPUDescriptorHandleForHeapStart()); m_depthStencilView = m_dsvHeap->GetCPUDescriptorHandleForHeapStart(); +#endif } m_viewport.Width = static_cast<float>(m_desc.width); @@ -1625,11 +1676,13 @@ void D3D12Renderer::setClearColor(const float color[4]) void D3D12Renderer::clearFrame() { // Record commands - D3D12_CPU_DESCRIPTOR_HANDLE rtvHandle = { m_rtvHeap->GetCPUDescriptorHandleForHeapStart().ptr + m_renderTargetIndex * m_rtvDescriptorSize }; - m_commandList->ClearRenderTargetView(rtvHandle, m_clearColor, 0, nullptr); - if (m_depthStencil) + if(auto rtv = m_rtvs[0]) + { + m_commandList->ClearRenderTargetView(rtv->m_descriptor.cpuHandle, m_clearColor, 0, nullptr); + } + if (m_dsv) { - m_commandList->ClearDepthStencilView(m_depthStencilView, 
D3D12_CLEAR_FLAG_DEPTH, 1.0f, 0, 0, nullptr); + m_commandList->ClearDepthStencilView(m_dsv->m_descriptor.cpuHandle, D3D12_CLEAR_FLAG_DEPTH, 1.0f, 0, 0, nullptr); } } @@ -1677,6 +1730,14 @@ void D3D12Renderer::presentFrame() beginRender(); } +TextureResource::Desc D3D12Renderer::getSwapChainTextureDesc() +{ + TextureResource::Desc desc; + desc.init2D(Resource::Type::Texture2D, Format::Unknown, m_desc.width, m_desc.height, 1); + + return desc; +} + SlangResult D3D12Renderer::captureScreenSurface(Surface& surfaceOut) { return captureTextureToSurface(*m_renderTargets[m_renderTargetIndex], surfaceOut); @@ -1743,7 +1804,7 @@ static D3D12_RESOURCE_DIMENSION _calcResourceDimension(Resource::Type type) } } -TextureResource* D3D12Renderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData) +Result D3D12Renderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData, TextureResource** outResource) { // Description of uploading on Dx12 // https://msdn.microsoft.com/en-us/library/windows/desktop/dn899215%28v=vs.85%29.aspx @@ -1754,7 +1815,7 @@ TextureResource* D3D12Renderer::createTextureResource(Resource::Usage initialUsa const DXGI_FORMAT pixelFormat = D3DUtil::getMapFormat(srcDesc.format); if (pixelFormat == DXGI_FORMAT_UNKNOWN) { - return nullptr; + return SLANG_FAIL; } const int arraySize = srcDesc.calcEffectiveArraySize(); @@ -1762,7 +1823,7 @@ TextureResource* D3D12Renderer::createTextureResource(Resource::Usage initialUsa const D3D12_RESOURCE_DIMENSION dimension = _calcResourceDimension(srcDesc.type); if (dimension == D3D12_RESOURCE_DIMENSION_UNKNOWN) { - return nullptr; + return SLANG_FAIL; } const int numMipMaps = srcDesc.numMipLevels; @@ -1796,7 +1857,7 @@ TextureResource* D3D12Renderer::createTextureResource(Resource::Usage initialUsa heapProps.CreationNodeMask = 1; heapProps.VisibleNodeMask = 1; - 
SLANG_RETURN_NULL_ON_FAIL(texture->m_resource.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, resourceDesc, D3D12_RESOURCE_STATE_COPY_DEST, nullptr)); + SLANG_RETURN_ON_FAIL(texture->m_resource.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, resourceDesc, D3D12_RESOURCE_STATE_COPY_DEST, nullptr)); texture->m_resource.setDebugName(L"Texture"); } @@ -1849,7 +1910,7 @@ TextureResource* D3D12Renderer::createTextureResource(Resource::Usage initialUsa uploadResourceDesc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR; uploadResourceDesc.Alignment = 0; - SLANG_RETURN_NULL_ON_FAIL(uploadTexture.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, uploadResourceDesc, D3D12_RESOURCE_STATE_GENERIC_READ, nullptr)); + SLANG_RETURN_ON_FAIL(uploadTexture.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, uploadResourceDesc, D3D12_RESOURCE_STATE_GENERIC_READ, nullptr)); uploadTexture.setDebugName(L"TextureUpload"); } @@ -1911,35 +1972,42 @@ TextureResource* D3D12Renderer::createTextureResource(Resource::Usage initialUsa subResourceIndex++; } - { - // const D3D12_RESOURCE_STATES finalState = D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE; - const D3D12_RESOURCE_STATES finalState = _calcResourceState(initialUsage); + // Block - waiting for copy to complete (so can drop upload texture) + submitGpuWorkAndWait(); + } - D3D12BarrierSubmitter submitter(m_commandList); - texture->m_resource.transition(finalState, submitter); - } + { + const D3D12_RESOURCE_STATES finalState = _calcResourceState(initialUsage); + D3D12BarrierSubmitter submitter(m_commandList); + texture->m_resource.transition(finalState, submitter); - // Block - waiting for copy to complete (so can drop upload texture) submitGpuWorkAndWait(); } - return texture.detach(); + *outResource = texture.detach(); + return SLANG_OK; } -BufferResource* D3D12Renderer::createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData) +Result 
D3D12Renderer::createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData, BufferResource** outResource) { typedef BufferResourceImpl::BackingStyle Style; BufferResource::Desc srcDesc(descIn); srcDesc.setDefaults(initialUsage); + // Always align up to 256 bytes, since that is required for constant buffers. + // + // TODO: only do this for buffers that could potentially be bound as constant buffers... + // + const size_t alignedSizeInBytes = D3DUtil::calcAligned(srcDesc.sizeInBytes, 256); + RefPtr<BufferResourceImpl> buffer(new BufferResourceImpl(initialUsage, srcDesc)); // Save the style buffer->m_backingStyle = BufferResourceImpl::_calcResourceBackingStyle(initialUsage); D3D12_RESOURCE_DESC bufferDesc; - _initBufferResourceDesc(srcDesc.sizeInBytes, bufferDesc); + _initBufferResourceDesc(alignedSizeInBytes, bufferDesc); bufferDesc.Flags = _calcResourceBindFlags(initialUsage, srcDesc.bindFlags); @@ -1949,7 +2017,7 @@ BufferResource* D3D12Renderer::createBufferResource(Resource::Usage initialUsage { // Assume the constant buffer will change every frame. 
We'll just keep a copy of the contents // in regular memory until it needed - buffer->m_memory.SetSize(UInt(srcDesc.sizeInBytes)); + buffer->m_memory.SetSize(UInt(alignedSizeInBytes)); // Initialize if (initData) { @@ -1960,16 +2028,268 @@ BufferResource* D3D12Renderer::createBufferResource(Resource::Usage initialUsage case Style::ResourceBacked: { const D3D12_RESOURCE_STATES initialState = _calcResourceState(initialUsage); - SLANG_RETURN_NULL_ON_FAIL(createBuffer(bufferDesc, initData, buffer->m_uploadResource, initialState, buffer->m_resource)); + SLANG_RETURN_ON_FAIL(createBuffer(bufferDesc, initData, buffer->m_uploadResource, initialState, buffer->m_resource)); break; } - default: return nullptr; + default: + return SLANG_FAIL; + } + + *outResource = buffer.detach(); + return SLANG_OK; +} + +D3D12_FILTER_TYPE translateFilterMode(TextureFilteringMode mode) +{ + switch (mode) + { + default: + return D3D12_FILTER_TYPE(0); + +#define CASE(SRC, DST) \ + case TextureFilteringMode::SRC: return D3D12_FILTER_TYPE_##DST + + CASE(Point, POINT); + CASE(Linear, LINEAR); + +#undef CASE + } +} + +D3D12_FILTER_REDUCTION_TYPE translateFilterReduction(TextureReductionOp op) +{ + switch (op) + { + default: + return D3D12_FILTER_REDUCTION_TYPE(0); + +#define CASE(SRC, DST) \ + case TextureReductionOp::SRC: return D3D12_FILTER_REDUCTION_TYPE_##DST + + CASE(Average, STANDARD); + CASE(Comparison, COMPARISON); + CASE(Minimum, MINIMUM); + CASE(Maximum, MAXIMUM); + +#undef CASE + } +} + +D3D12_TEXTURE_ADDRESS_MODE translateAddressingMode(TextureAddressingMode mode) +{ + switch (mode) + { + default: + return D3D12_TEXTURE_ADDRESS_MODE(0); + +#define CASE(SRC, DST) \ + case TextureAddressingMode::SRC: return D3D12_TEXTURE_ADDRESS_MODE_##DST + + CASE(Wrap, WRAP); + CASE(ClampToEdge, CLAMP); + CASE(ClampToBorder, BORDER); + CASE(MirrorRepeat, MIRROR); + CASE(MirrorOnce, MIRROR_ONCE); + +#undef CASE + } +} + +static D3D12_COMPARISON_FUNC translateComparisonFunc(ComparisonFunc func) +{ + 
switch (func) + { + default: + // TODO: need to report failures + return D3D12_COMPARISON_FUNC_ALWAYS; + +#define CASE(FROM, TO) \ + case ComparisonFunc::FROM: return D3D12_COMPARISON_FUNC_##TO + + CASE(Never, NEVER); + CASE(Less, LESS); + CASE(Equal, EQUAL); + CASE(LessEqual, LESS_EQUAL); + CASE(Greater, GREATER); + CASE(NotEqual, NOT_EQUAL); + CASE(GreaterEqual, GREATER_EQUAL); + CASE(Always, ALWAYS); +#undef CASE + } +} + +Result D3D12Renderer::createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) +{ + D3D12_FILTER_REDUCTION_TYPE dxReduction = translateFilterReduction(desc.reductionOp); + D3D12_FILTER dxFilter; + if (desc.maxAnisotropy > 1) + { + dxFilter = D3D12_ENCODE_ANISOTROPIC_FILTER(dxReduction); + } + else + { + D3D12_FILTER_TYPE dxMin = translateFilterMode(desc.minFilter); + D3D12_FILTER_TYPE dxMag = translateFilterMode(desc.magFilter); + D3D12_FILTER_TYPE dxMip = translateFilterMode(desc.mipFilter); + + dxFilter = D3D12_ENCODE_BASIC_FILTER(dxMin, dxMag, dxMip, dxReduction); + } + + D3D12_SAMPLER_DESC dxDesc = {}; + dxDesc.Filter = dxFilter; + dxDesc.AddressU = translateAddressingMode(desc.addressU); + dxDesc.AddressV = translateAddressingMode(desc.addressV); + dxDesc.AddressW = translateAddressingMode(desc.addressW); + dxDesc.MipLODBias = desc.mipLODBias; + dxDesc.MaxAnisotropy = desc.maxAnisotropy; + dxDesc.ComparisonFunc = translateComparisonFunc(desc.comparisonFunc); + for (int ii = 0; ii < 4; ++ii) + dxDesc.BorderColor[ii] = desc.borderColor[ii]; + dxDesc.MinLOD = desc.minLOD; + dxDesc.MaxLOD = desc.maxLOD; + + auto samplerHeap = &m_cpuSamplerHeap; + + int indexInSamplerHeap = samplerHeap->allocate(); + if(indexInSamplerHeap < 0) + { + // We ran out of room in our CPU sampler heap. + // + // TODO: this should not be a catastrophic failure, because + // we should just allocate another CPU sampler heap that + // can service subsequent allocation. 
+ // + return SLANG_FAIL; + } + auto cpuDescriptorHandle = samplerHeap->getCpuHandle(indexInSamplerHeap); + + m_device->CreateSampler(&dxDesc, cpuDescriptorHandle); + + // TODO: We really ought to have a free-list of sampler-heap + // entries that we check before we go to the heap, and then + // when we are done with a sampler we simply add it to the free list. + // + RefPtr<SamplerStateImpl> samplerImpl = new SamplerStateImpl(); + samplerImpl->m_cpuHandle = cpuDescriptorHandle; + *outSampler = samplerImpl.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) +{ + auto resourceImpl = (TextureResourceImpl*) texture; + + RefPtr<ResourceViewImpl> viewImpl = new ResourceViewImpl(); + viewImpl->m_resource = resourceImpl; + + switch (desc.type) + { + default: + return SLANG_FAIL; + + case ResourceView::Type::RenderTarget: + { + SLANG_RETURN_ON_FAIL(m_rtvAllocator.allocate(&viewImpl->m_descriptor)); + m_device->CreateRenderTargetView(resourceImpl->m_resource, nullptr, viewImpl->m_descriptor.cpuHandle); + } + break; + + case ResourceView::Type::DepthStencil: + { + SLANG_RETURN_ON_FAIL(m_dsvAllocator.allocate(&viewImpl->m_descriptor)); + m_device->CreateDepthStencilView(resourceImpl->m_resource, nullptr, viewImpl->m_descriptor.cpuHandle); + } + break; + + case ResourceView::Type::UnorderedAccess: + { + // TODO: need to support the separate "counter resource" for the case + // of append/consume buffers with attached counters. 
+ + SLANG_RETURN_ON_FAIL(m_viewAllocator.allocate(&viewImpl->m_descriptor)); + m_device->CreateUnorderedAccessView(resourceImpl->m_resource, nullptr, nullptr, viewImpl->m_descriptor.cpuHandle); + } + break; + + case ResourceView::Type::ShaderResource: + { + SLANG_RETURN_ON_FAIL(m_viewAllocator.allocate(&viewImpl->m_descriptor)); + m_device->CreateShaderResourceView(resourceImpl->m_resource, nullptr, viewImpl->m_descriptor.cpuHandle); + } + break; } - return buffer.detach(); + *outView = viewImpl.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) +{ + auto resourceImpl = (BufferResourceImpl*) buffer; + auto resourceDesc = resourceImpl->getDesc(); + + RefPtr<ResourceViewImpl> viewImpl = new ResourceViewImpl(); + viewImpl->m_resource = resourceImpl; + + switch (desc.type) + { + default: + return SLANG_FAIL; + + case ResourceView::Type::UnorderedAccess: + { + D3D12_UNORDERED_ACCESS_VIEW_DESC uavDesc = {}; + uavDesc.ViewDimension = D3D12_UAV_DIMENSION_BUFFER; + uavDesc.Format = D3DUtil::getMapFormat(desc.format); + uavDesc.Buffer.FirstElement = 0; + uavDesc.Buffer.NumElements = resourceDesc.sizeInBytes; + + if(resourceDesc.elementSize) + { + uavDesc.Buffer.StructureByteStride = resourceDesc.elementSize; + uavDesc.Buffer.NumElements = resourceDesc.sizeInBytes / resourceDesc.elementSize; + } + else if(desc.format == Format::Unknown) + { + uavDesc.Buffer.Flags |= D3D12_BUFFER_UAV_FLAG_RAW; + uavDesc.Format = DXGI_FORMAT_R32_TYPELESS; + } + + + // TODO: need to support the separate "counter resource" for the case + // of append/consume buffers with attached counters. 
+ + SLANG_RETURN_ON_FAIL(m_viewAllocator.allocate(&viewImpl->m_descriptor)); + m_device->CreateUnorderedAccessView(resourceImpl->m_resource, nullptr, &uavDesc, viewImpl->m_descriptor.cpuHandle); + } + break; + + case ResourceView::Type::ShaderResource: + { + D3D12_SHADER_RESOURCE_VIEW_DESC srvDesc = {}; + srvDesc.ViewDimension = D3D12_SRV_DIMENSION_BUFFER; + srvDesc.Format = D3DUtil::getMapFormat(desc.format); + srvDesc.Buffer.StructureByteStride = 0; + srvDesc.Buffer.FirstElement = 0; + srvDesc.Buffer.NumElements = resourceDesc.sizeInBytes; + + if(resourceDesc.elementSize) + { + srvDesc.Buffer.StructureByteStride = resourceDesc.elementSize; + srvDesc.Buffer.NumElements = resourceDesc.sizeInBytes / resourceDesc.elementSize; + } + + SLANG_RETURN_ON_FAIL(m_viewAllocator.allocate(&viewImpl->m_descriptor)); + m_device->CreateShaderResourceView(resourceImpl->m_resource, &srvDesc, viewImpl->m_descriptor.cpuHandle); + } + break; + } + + *outView = viewImpl.detach(); + return SLANG_OK; } -InputLayout* D3D12Renderer::createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount) +Result D3D12Renderer::createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount, InputLayout** outLayout) { RefPtr<InputLayoutImpl> layout(new InputLayoutImpl); @@ -2012,7 +2332,8 @@ InputLayout* D3D12Renderer::createInputLayout(const InputElementDesc* inputEleme dstEle.InstanceDataStepRate = 0; } - return layout.detach(); + *outLayout = layout.detach(); + return SLANG_OK; } void* D3D12Renderer::map(BufferResource* bufferIn, MapFlavor flavor) @@ -2164,10 +2485,12 @@ void D3D12Renderer::unmap(BufferResource* bufferIn) } } +#if 0 void D3D12Renderer::setInputLayout(InputLayout* inputLayout) { m_boundInputLayout = static_cast<InputLayoutImpl*>(inputLayout); } +#endif void D3D12Renderer::setPrimitiveTopology(PrimitiveTopology topology) { @@ -2211,28 +2534,37 @@ void D3D12Renderer::setVertexBuffers(UInt startSlot, UInt slotCount, BufferResou } } -void 
D3D12Renderer::setShaderProgram(ShaderProgram* inProgram) +void D3D12Renderer::setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) +{ + m_boundIndexBuffer = (BufferResourceImpl*) buffer; + m_boundIndexFormat = D3DUtil::getMapFormat(indexFormat); + m_boundIndexOffset = offset; +} + +void D3D12Renderer::setDepthStencilTarget(ResourceView* depthStencilView) { - m_boundShaderProgram = static_cast<ShaderProgramImpl*>(inProgram); +} + +void D3D12Renderer::setPipelineState(PipelineType pipelineType, PipelineState* state) +{ + m_currentPipelineState = (PipelineStateImpl*)state; } void D3D12Renderer::draw(UInt vertexCount, UInt startVertex) { ID3D12GraphicsCommandList* commandList = m_commandList; - RenderState* renderState = calcRenderState(); - if (!renderState) + auto pipelineState = m_currentPipelineState.Ptr(); + if (!pipelineState || (pipelineState->m_pipelineType != PipelineType::Graphics)) { - assert(!"Couldn't create render state"); + assert(!"No graphics pipeline state set"); return; } - BindingStateImpl* bindingState = m_boundBindingState; - // Submit - setting for graphics { GraphicsSubmitter submitter(commandList); - _bindRenderState(renderState, commandList, &submitter); + _bindRenderState(pipelineState, commandList, &submitter); } commandList->IASetPrimitiveTopology(m_primitiveTopology); @@ -2248,31 +2580,49 @@ void D3D12Renderer::draw(UInt vertexCount, UInt startVertex) if (buffer) { D3D12_VERTEX_BUFFER_VIEW& vertexView = vertexViews[numVertexViews++]; - vertexView.BufferLocation = buffer->m_resource.getResource()->GetGPUVirtualAddress(); - vertexView.SizeInBytes = int(buffer->getDesc().sizeInBytes); + vertexView.BufferLocation = buffer->m_resource.getResource()->GetGPUVirtualAddress() + + boundVertexBuffer.m_offset; + vertexView.SizeInBytes = buffer->getDesc().sizeInBytes - boundVertexBuffer.m_offset; vertexView.StrideInBytes = boundVertexBuffer.m_stride; } } commandList->IASetVertexBuffers(0, numVertexViews, vertexViews); } + // Set up 
index buffer + if(m_boundIndexBuffer) + { + D3D12_INDEX_BUFFER_VIEW indexBufferView; + indexBufferView.BufferLocation = m_boundIndexBuffer->m_resource.getResource()->GetGPUVirtualAddress() + + m_boundIndexOffset; + indexBufferView.SizeInBytes = m_boundIndexBuffer->getDesc().sizeInBytes - m_boundIndexOffset; + indexBufferView.Format = m_boundIndexFormat; + + commandList->IASetIndexBuffer(&indexBufferView); + } + commandList->DrawInstanced(UINT(vertexCount), 1, UINT(startVertex), 0); } +void D3D12Renderer::drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) +{ +} + void D3D12Renderer::dispatchCompute(int x, int y, int z) { ID3D12GraphicsCommandList* commandList = m_commandList; - RenderState* renderState = calcRenderState(); + auto pipelineStateImpl = m_currentPipelineState; // Submit binding for compute { ComputeSubmitter submitter(commandList); - _bindRenderState(renderState, commandList, &submitter); + _bindRenderState(pipelineStateImpl, commandList, &submitter); } commandList->Dispatch(x, y, z); } +#if 0 BindingState* D3D12Renderer::createBindingState(const BindingState::Desc& bindingStateDesc) { RefPtr<BindingStateImpl> bindingState(new BindingStateImpl(bindingStateDesc)); @@ -2298,7 +2648,7 @@ BindingState* D3D12Renderer::createBindingState(const BindingState::Desc& bindin { assert(srcEntry.resource && srcEntry.resource->isBuffer()); BufferResourceImpl* bufferResource = static_cast<BufferResourceImpl*>(srcEntry.resource.Ptr()); - const BufferResource::Desc& bufferDesc = bufferResource->getDesc(); + const BufferResource::Desc& desc = bufferResource->getDesc(); const size_t bufferSize = bufferDesc.sizeInBytes; const int elemSize = bufferDesc.elementSize <= 0 ? 
sizeof(uint32_t) : bufferDesc.elementSize; @@ -2440,8 +2790,160 @@ void D3D12Renderer::setBindingState(BindingState* state) { m_boundBindingState = static_cast<BindingStateImpl*>(state); } +#endif + +void D3D12Renderer::DescriptorSetImpl::setConstantBuffer(UInt range, UInt index, BufferResource* buffer) +{ + auto dxDevice = m_renderer->m_device; + + + auto resourceImpl = (BufferResourceImpl*) buffer; + auto resourceDesc = resourceImpl->getDesc(); + + // Constant buffer view size must be a multiple of 256 bytes, so we round it up here. + const size_t alignedSizeInBytes = D3DUtil::calcAligned(resourceDesc.sizeInBytes, 256); + + D3D12_CONSTANT_BUFFER_VIEW_DESC cbvDesc = {}; + cbvDesc.BufferLocation = resourceImpl->m_resource.getResource()->GetGPUVirtualAddress(); + cbvDesc.SizeInBytes = alignedSizeInBytes; + + auto& rangeInfo = m_layout->m_ranges[range]; + +#ifdef _DEBUG + switch(rangeInfo.type) + { + default: + assert(!"incorrect slot type"); + break; + + case DescriptorSlotType::UniformBuffer: + case DescriptorSlotType::DynamicUniformBuffer: + break; + } +#endif + + auto arrayIndex = rangeInfo.arrayIndex + index; + auto descriptorIndex = m_resourceTable + arrayIndex; + + m_resourceObjects[arrayIndex] = resourceImpl; + dxDevice->CreateConstantBufferView( + &cbvDesc, + m_resourceHeap->getCpuHandle(descriptorIndex)); +} + +void D3D12Renderer::DescriptorSetImpl::setResource(UInt range, UInt index, ResourceView* view) +{ + auto dxDevice = m_renderer->m_device; + + auto viewImpl = (ResourceViewImpl*) view; + + auto& rangeInfo = m_layout->m_ranges[range]; + + // TODO: validation that slot type matches view + + auto arrayIndex = rangeInfo.arrayIndex + index; + auto descriptorIndex = m_resourceTable + arrayIndex; + + m_resourceObjects[arrayIndex] = viewImpl; + dxDevice->CopyDescriptorsSimple( + 1, + m_resourceHeap->getCpuHandle(descriptorIndex), + viewImpl->m_descriptor.cpuHandle, + D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV); +} + +void 
D3D12Renderer::DescriptorSetImpl::setSampler(UInt range, UInt index, SamplerState* sampler) +{ + auto dxDevice = m_renderer->m_device; + + auto samplerImpl = (SamplerStateImpl*) sampler; + + auto& rangeInfo = m_layout->m_ranges[range]; + +#ifdef _DEBUG + switch(rangeInfo.type) + { + default: + assert(!"incorrect slot type"); + break; + + case DescriptorSlotType::Sampler: + break; + } +#endif + + auto arrayIndex = rangeInfo.arrayIndex + index; + auto descriptorIndex = m_resourceTable + arrayIndex; + + m_samplerObjects[arrayIndex] = samplerImpl; + dxDevice->CopyDescriptorsSimple( + 1, + m_samplerHeap->getCpuHandle(descriptorIndex), + samplerImpl->m_cpuHandle, + D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER); +} -ShaderProgram* D3D12Renderer::createProgram(const ShaderProgram::Desc& desc) +void D3D12Renderer::DescriptorSetImpl::setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) +{ + auto dxDevice = m_renderer->m_device; + + auto viewImpl = (ResourceViewImpl*) textureView; + auto samplerImpl = (SamplerStateImpl*) sampler; + + auto& rangeInfo = m_layout->m_ranges[range]; + +#ifdef _DEBUG + switch(rangeInfo.type) + { + default: + assert(!"incorrect slot type"); + break; + + case DescriptorSlotType::CombinedImageSampler: + break; + } +#endif + + auto arrayIndex = rangeInfo.arrayIndex + index; + auto resourceDescriptorIndex = m_resourceTable + arrayIndex; + auto samplerDescriptorIndex = m_samplerTable + arrayIndex; + + m_resourceObjects[arrayIndex] = viewImpl; + dxDevice->CopyDescriptorsSimple( + 1, + m_resourceHeap->getCpuHandle(resourceDescriptorIndex), + viewImpl->m_descriptor.cpuHandle, + D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV); + + m_samplerObjects[arrayIndex] = samplerImpl; + dxDevice->CopyDescriptorsSimple( + 1, + m_samplerHeap->getCpuHandle(samplerDescriptorIndex), + samplerImpl->m_cpuHandle, + D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER); +} + +void D3D12Renderer::setDescriptorSet(PipelineType pipelineType, PipelineLayout* 
layout, UInt index, DescriptorSet* descriptorSet) +{ + // In D3D12, unlike Vulkan, binding a root signature invalidates *all* descriptor table + // bindings (rather than preserving those that are part of the longest common prefix + // between the old and new layout). + // + // In order to accommodate having descriptor-set bindings that persist across changes + // in pipeline state (which may also change pipeline layout), we will shadow the + // descriptor-set bindings and only flush them on-demand at draw time once the final + // pipeline layout is known. + // + + auto descriptorSetImpl = (DescriptorSetImpl*) descriptorSet; + m_boundDescriptorSets[int(pipelineType)][index] = descriptorSetImpl; +} + +Result D3D12Renderer::createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) { RefPtr<ShaderProgramImpl> program(new ShaderProgramImpl()); program->m_pipelineType = desc.pipelineType; @@ -2460,8 +2962,596 @@ ShaderProgram* D3D12Renderer::createProgram(const ShaderProgram::Desc& desc) program->m_pixelShader.InsertRange(0, (const uint8_t*) fragmentKernel->codeBegin, fragmentKernel->getCodeSize()); } - return program.detach(); + *outProgram = program.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) +{ + Int rangeCount = desc.slotRangeCount; + + // For our purposes, there are three main cases of descriptor ranges to consider: + // + // 1. Resources: CBV, SRV, UAV + // + // 2. Samplers + // + // 3. Combined texture/sampler pairs + // + // The combined case presents challenges, because we will implement + // them as both a resource slot and a sampler slot, and for convenience + // in the indexing logic, it would be nice if they "lined up." + // + // We will start by counting how many ranges, and how many + // descriptors, of each type we have. 
+ // + + Int dedicatedResourceCount = 0; + Int dedicatedSamplerCount = 0; + Int combinedCount = 0; + + Int dedicatedResourceRangeCount = 0; + Int dedicatedSamplerRangeCount = 0; + Int combinedRangeCount = 0; + + for(Int rr = 0; rr < rangeCount; ++rr) + { + auto rangeDesc = desc.slotRanges[rr]; + switch(rangeDesc.type) + { + case DescriptorSlotType::Sampler: + dedicatedSamplerCount += rangeDesc.count; + dedicatedSamplerRangeCount++; + break; + + case DescriptorSlotType::CombinedImageSampler: + combinedCount += rangeDesc.count; + combinedRangeCount++; + break; + + default: + dedicatedResourceCount += rangeDesc.count; + dedicatedResourceRangeCount++; + break; + } + } + + // Now we know how many ranges we have to allocate space for, + // and also how they need to be arranged. + // + // Each "combined" range will map to two ranges in the D3D + // descriptor tables. + + RefPtr<DescriptorSetLayoutImpl> descriptorSetLayoutImpl = new DescriptorSetLayoutImpl(); + + // We know the total number of resource and sampler "slots" that an instance + // of this descriptor-set layout would need: + // + descriptorSetLayoutImpl->m_resourceCount = combinedCount + dedicatedResourceCount; + descriptorSetLayoutImpl->m_samplerCount = combinedCount + dedicatedSamplerCount; + + // We can start by allocating the D3D root parameter info needed for the + // descriptor set, based on the total number of ranges we need, which + // we can compute from the combined and dedicated counts: + // + Int totalResourceRangeCount = combinedRangeCount + dedicatedResourceRangeCount; + Int totalSamplerRangeCount = combinedRangeCount + dedicatedSamplerRangeCount; + + if( totalResourceRangeCount ) + { + D3D12_ROOT_PARAMETER dxRootParameter = {}; + dxRootParameter.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; + dxRootParameter.DescriptorTable.NumDescriptorRanges = totalResourceRangeCount; + descriptorSetLayoutImpl->m_dxRootParameters.Add(dxRootParameter); + } + if( totalSamplerRangeCount ) + { + 
D3D12_ROOT_PARAMETER dxRootParameter = {}; + dxRootParameter.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; + dxRootParameter.DescriptorTable.NumDescriptorRanges = totalSamplerRangeCount; + descriptorSetLayoutImpl->m_dxRootParameters.Add(dxRootParameter); + } + + // Next we can allocate space for all the D3D register ranges we need, + // again based on totals that we can compute easily: + // + Int totalRangeCount = totalResourceRangeCount + totalSamplerRangeCount; + descriptorSetLayoutImpl->m_dxRanges.SetSize(totalRangeCount); + + // Now we will walk through the ranges in the order they were + // specified, so that we can fill in the "range info" required for + // binding parameters into descriptor sets allocated with this layout. + // + // This effectively determines the space required in two arrays + // in each descriptor set: one for resources, and one for samplers. + // A "combined" descriptor requires space in both arrays. The entries + // for "dedicated" samplers/resources always come after those for + // "combined" descriptors in the same array, so that a single index + // can be used for both arrays in the combined case. + // + + { + Int samplerCounter = 0; + Int resourceCounter = 0; + Int combinedCounter = 0; + for(Int rr = 0; rr < rangeCount; ++rr) + { + auto rangeDesc = desc.slotRanges[rr]; + + DescriptorSetLayoutImpl::RangeInfo rangeInfo; + + rangeInfo.type = rangeDesc.type; + rangeInfo.count = rangeDesc.count; + + switch(rangeDesc.type) + { + default: + // Default case is a dedicated resource, and its index in the + // resource array will come after all the combined entries. + rangeInfo.arrayIndex = combinedCount + resourceCounter; + resourceCounter += rangeInfo.count; + break; + + case DescriptorSlotType::Sampler: + // A dedicated sampler comes after all the entries for + // combined texture/samplers in the sampler array. 
+ rangeInfo.arrayIndex = combinedCount + samplerCounter; + samplerCounter += rangeInfo.count; + break; + + case DescriptorSlotType::CombinedImageSampler: + // Combined descriptors take entries at the front of + // the resource and sampler arrays. + rangeInfo.arrayIndex = combinedCounter; + combinedCounter += rangeInfo.count; + break; + } + + descriptorSetLayoutImpl->m_ranges.Add(rangeInfo); + } + } + + // Finally, we will go through and fill in ready-to-go D3D + // register range information. + { + UInt cbvCounter = 0; + UInt srvCounter = 0; + UInt uavCounter = 0; + UInt samplerCounter = 0; + + Int resourceRangeCounter = 0; + Int samplerRangeCounter = 0; + Int combinedRangeCounter = 0; + + for(Int rr = 0; rr < rangeCount; ++rr) + { + auto rangeDesc = desc.slotRanges[rr]; + Int bindingCount = rangeDesc.count; + + // All of these descriptor ranges will be initialized + // with a "space" of zero, with the assumption that + // the actual space number will come from when they are + // used as part of a pipeline layout. + // + Int bindingSpace = 0; + + Int dxRangeIndex = -1; + Int dxPairedSamplerRangeIndex = -1; + + switch(rangeDesc.type) + { + default: + // Default case is a dedicated resource, and its index in the + // resource array will come after all the combined entries. + dxRangeIndex = combinedRangeCount + resourceRangeCounter; + resourceRangeCounter++; + break; + + case DescriptorSlotType::Sampler: + // A dedicated sampler comes after all the entries for + // combined texture/samplers in the sampler array. + dxRangeIndex = totalResourceRangeCount + combinedRangeCount + samplerRangeCounter; + samplerRangeCounter++; + break; + + case DescriptorSlotType::CombinedImageSampler: + // Combined descriptors take entries at the front of + // the resource and sampler arrays. 
+ dxRangeIndex = combinedRangeCounter; + dxPairedSamplerRangeIndex = totalResourceRangeCount + combinedRangeCounter; + combinedRangeCounter++; + break; + } + + D3D12_DESCRIPTOR_RANGE& dxRange = descriptorSetLayoutImpl->m_dxRanges[dxRangeIndex]; + memset(&dxRange, 0, sizeof(dxRange)); + + switch(rangeDesc.type) + { + default: + // ERROR: unsupported slot type. + break; + + case DescriptorSlotType::Sampler: + { + UInt bindingIndex = samplerCounter; samplerCounter += bindingCount; + + dxRange.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SAMPLER; + dxRange.NumDescriptors = bindingCount; + dxRange.BaseShaderRegister = bindingIndex; + dxRange.RegisterSpace = bindingSpace; + dxRange.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + } + break; + + case DescriptorSlotType::SampledImage: + case DescriptorSlotType::UniformTexelBuffer: + { + UInt bindingIndex = srvCounter; srvCounter += bindingCount; + + dxRange.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SRV; + dxRange.NumDescriptors = bindingCount; + dxRange.BaseShaderRegister = bindingIndex; + dxRange.RegisterSpace = bindingSpace; + dxRange.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + } + break; + + case DescriptorSlotType::CombinedImageSampler: + { + // The combined texture/sampler case basically just + // does the work of both the SRV and sampler cases above. + + { + // Here's the SRV logic: + + UInt bindingIndex = srvCounter; srvCounter += bindingCount; + + dxRange.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SRV; + dxRange.NumDescriptors = bindingCount; + dxRange.BaseShaderRegister = bindingIndex; + dxRange.RegisterSpace = bindingSpace; + dxRange.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + } + + { + // And here we do the sampler logic at the "paired" index. 
+ D3D12_DESCRIPTOR_RANGE& dxPairedSamplerRange = descriptorSetLayoutImpl->m_dxRanges[dxPairedSamplerRangeIndex]; + memset(&dxPairedSamplerRange, 0, sizeof(dxPairedSamplerRange)); + + UInt pairedSamplerBindingIndex = srvCounter; srvCounter += bindingCount; + + dxPairedSamplerRange.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SAMPLER; + dxPairedSamplerRange.NumDescriptors = bindingCount; + dxPairedSamplerRange.BaseShaderRegister = pairedSamplerBindingIndex; + dxPairedSamplerRange.RegisterSpace = bindingSpace; + dxPairedSamplerRange.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + } + + } + break; + + + case DescriptorSlotType::InputAttachment: + case DescriptorSlotType::StorageImage: + case DescriptorSlotType::StorageTexelBuffer: + case DescriptorSlotType::StorageBuffer: + case DescriptorSlotType::DynamicStorageBuffer: + { + UInt bindingIndex = uavCounter; uavCounter += bindingCount; + + dxRange.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_UAV; + dxRange.NumDescriptors = bindingCount; + dxRange.BaseShaderRegister = bindingIndex; + dxRange.RegisterSpace = bindingSpace; + dxRange.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + } + break; + + case DescriptorSlotType::UniformBuffer: + case DescriptorSlotType::DynamicUniformBuffer: + { + UInt bindingIndex = cbvCounter; cbvCounter += bindingCount; + + dxRange.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_CBV; + dxRange.NumDescriptors = bindingCount; + dxRange.BaseShaderRegister = bindingIndex; + dxRange.RegisterSpace = bindingSpace; + dxRange.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + } + break; + } + } + } + + *outLayout = descriptorSetLayoutImpl.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) +{ + static const UInt kMaxRanges = 16; + static const UInt kMaxRootParameters = 32; + + D3D12_DESCRIPTOR_RANGE ranges[kMaxRanges]; + D3D12_ROOT_PARAMETER 
rootParameters[kMaxRootParameters]; + + UInt rangeCount = 0; + UInt rootParameterCount = 0; + + auto descriptorSetCount = desc.descriptorSetCount; + + // We are going to make two passes over the descriptor set layouts + // that are being used to build the pipeline layout. In the first + // pass we will collect all the descriptor ranges that have been + // specified, applying an offset to their register spaces as needed. + // + for(UInt dd = 0; dd < descriptorSetCount; ++dd) + { + auto& descriptorSetInfo = desc.descriptorSets[dd]; + auto descriptorSetLayout = (DescriptorSetLayoutImpl*) descriptorSetInfo.layout; + + // For now we assume that the register space used for + // logical descriptor set #N will be space N. + // + // TODO: This might need to be revisited in the future because + // a single logical descriptor set might need to encompass stuff + // that comes from multiple spaces (e.g., if it contains an unbounded + // array). + // + UInt bindingSpace = dd; + + // Copy descriptor range information from the set layout into our + // temporary copy (this is required because the same set layout + // might be applied to different ranges). + // + // API design note: this copy step could be avoided if the D3D + // API allowed for a "space offset" to be applied as part of + // a descriptor-table root parameter. + // + for(auto setDescriptorRange : descriptorSetLayout->m_dxRanges) + { + auto& range = ranges[rangeCount++]; + range = setDescriptorRange; + range.RegisterSpace = bindingSpace; + + // HACK: in order to deal with SM5.0 shaders, `u` registers + // in `space0` need to start with a number *after* the number + // of `SV_Target` outputs that will be used. + // + // TODO: This is clearly a mess, and doing this behavior here + // means it *won't* work for SM5.1 where the restriction is + // lifted. 
The only real alternative is to rely on explicit + // register numbers (e.g., from shader reflection) but that + // goes against the simplicity that this API layer strives for + // (everything so far has been set up to work correctly with + // automatic assignment of bindings). + // + if( range.RegisterSpace == 0 + && range.RangeType == D3D12_DESCRIPTOR_RANGE_TYPE_UAV ) + { + range.BaseShaderRegister += desc.renderTargetCount; + } + } + } + + // In our second pass, we will copy over root parameters, which + // may end up pointing into the list of ranges from the first step. + // + auto rangePtr = &ranges[0]; + for(UInt dd = 0; dd < descriptorSetCount; ++dd) + { + auto& descriptorSetInfo = desc.descriptorSets[dd]; + auto descriptorSetLayout = (DescriptorSetLayoutImpl*) descriptorSetInfo.layout; + + // Copy root parameter information from the set layout to our + // overall pipeline layout. + for( auto setRootParameter : descriptorSetLayout->m_dxRootParameters ) + { + auto& rootParameter = rootParameters[rootParameterCount++]; + rootParameter = setRootParameter; + + // In the case where this parameter is a descriptor table, it + // needs to point into our array of ranges (with offsets applied), + // so we will fix up those pointers here. + // + if(rootParameter.ParameterType == D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE) + { + rootParameter.DescriptorTable.pDescriptorRanges = rangePtr; + rangePtr += rootParameter.DescriptorTable.NumDescriptorRanges; + } + } + } + + D3D12_ROOT_SIGNATURE_DESC rootSignatureDesc = {}; + rootSignatureDesc.NumParameters = rootParameterCount; + rootSignatureDesc.pParameters = rootParameters; + + // TODO: static samplers should be reasonably easy to support... + rootSignatureDesc.NumStaticSamplers = 0; + rootSignatureDesc.pStaticSamplers = nullptr; + + // TODO: only set this flag if needed (requires creating root + // signature at same time as pipeline state...). 
+ // + rootSignatureDesc.Flags = D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT; + + ComPtr<ID3DBlob> signature; + ComPtr<ID3DBlob> error; + if( SLANG_FAILED(m_D3D12SerializeRootSignature(&rootSignatureDesc, D3D_ROOT_SIGNATURE_VERSION_1, signature.writeRef(), error.writeRef())) ) + { + fprintf(stderr, "error: D3D12SerializeRootSignature failed"); + if( error ) + { + fprintf(stderr, ": %s\n", (const char*) error->GetBufferPointer()); + } + return SLANG_FAIL; + } + + ComPtr<ID3D12RootSignature> rootSignature; + SLANG_RETURN_ON_FAIL(m_device->CreateRootSignature(0, signature->GetBufferPointer(), signature->GetBufferSize(), IID_PPV_ARGS(rootSignature.writeRef()))); + + + RefPtr<PipelineLayoutImpl> pipelineLayoutImpl = new PipelineLayoutImpl(); + pipelineLayoutImpl->m_rootSignature = rootSignature; + pipelineLayoutImpl->m_descriptorSetCount = descriptorSetCount; + *outLayout = pipelineLayoutImpl.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) +{ + auto layoutImpl = (DescriptorSetLayoutImpl*) layout; + + RefPtr<DescriptorSetImpl> descriptorSetImpl = new DescriptorSetImpl(); + descriptorSetImpl->m_renderer = this; + descriptorSetImpl->m_layout = layoutImpl; + + // We allocate CPU-visible descriptor tables to provide the + // backing storage for each descriptor set. GPU-visible storage + // will only be allocated as needed during per-frame logic in + // order to ensure that a descriptor set is available for use + // in rendering. 
+ // + Int resourceCount = layoutImpl->m_resourceCount; + if( resourceCount ) + { + auto resourceHeap = &m_cpuViewHeap; + descriptorSetImpl->m_resourceHeap = resourceHeap; + descriptorSetImpl->m_resourceTable = resourceHeap->allocate(resourceCount); + descriptorSetImpl->m_resourceObjects.SetSize(resourceCount); + } + + Int samplerCount = layoutImpl->m_samplerCount; + if( samplerCount ) + { + auto samplerHeap = &m_cpuSamplerHeap; + descriptorSetImpl->m_samplerHeap = samplerHeap; + descriptorSetImpl->m_samplerTable = samplerHeap->allocate(samplerCount); + descriptorSetImpl->m_samplerObjects.SetSize(samplerCount); + } + + *outDescriptorSet = descriptorSetImpl.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) +{ + auto pipelineLayoutImpl = (PipelineLayoutImpl*) desc.pipelineLayout; + auto programImpl = (ShaderProgramImpl*) desc.program; + auto inputLayoutImpl = (InputLayoutImpl*) desc.inputLayout; + + // Describe and create the graphics pipeline state object (PSO) + D3D12_GRAPHICS_PIPELINE_STATE_DESC psoDesc = {}; + + psoDesc.pRootSignature = pipelineLayoutImpl->m_rootSignature; + + psoDesc.VS = { programImpl->m_vertexShader.Buffer(), programImpl->m_vertexShader.Count() }; + psoDesc.PS = { programImpl->m_pixelShader .Buffer(), programImpl->m_pixelShader .Count() }; + + psoDesc.InputLayout = { inputLayoutImpl->m_elements.Buffer(), UINT(inputLayoutImpl->m_elements.Count()) }; + psoDesc.PrimitiveTopologyType = m_primitiveTopologyType; + + { + const int numRenderTargets = desc.renderTargetCount; + + psoDesc.DSVFormat = m_depthStencilFormat; + psoDesc.NumRenderTargets = numRenderTargets; + for (Int i = 0; i < numRenderTargets; i++) + { + psoDesc.RTVFormats[i] = m_targetFormat; + } + + psoDesc.SampleDesc.Count = 1; + psoDesc.SampleDesc.Quality = 0; + + psoDesc.SampleMask = UINT_MAX; + } + + { + auto& rs = psoDesc.RasterizerState; + rs.FillMode = D3D12_FILL_MODE_SOLID; + 
rs.CullMode = D3D12_CULL_MODE_NONE; + rs.FrontCounterClockwise = FALSE; + rs.DepthBias = D3D12_DEFAULT_DEPTH_BIAS; + rs.DepthBiasClamp = D3D12_DEFAULT_DEPTH_BIAS_CLAMP; + rs.SlopeScaledDepthBias = D3D12_DEFAULT_SLOPE_SCALED_DEPTH_BIAS; + rs.DepthClipEnable = TRUE; + rs.MultisampleEnable = FALSE; + rs.AntialiasedLineEnable = FALSE; + rs.ForcedSampleCount = 0; + rs.ConservativeRaster = D3D12_CONSERVATIVE_RASTERIZATION_MODE_OFF; + } + + { + D3D12_BLEND_DESC& blend = psoDesc.BlendState; + + blend.AlphaToCoverageEnable = FALSE; + blend.IndependentBlendEnable = FALSE; + const D3D12_RENDER_TARGET_BLEND_DESC defaultRenderTargetBlendDesc = + { + FALSE,FALSE, + D3D12_BLEND_ONE, D3D12_BLEND_ZERO, D3D12_BLEND_OP_ADD, + D3D12_BLEND_ONE, D3D12_BLEND_ZERO, D3D12_BLEND_OP_ADD, + D3D12_LOGIC_OP_NOOP, + D3D12_COLOR_WRITE_ENABLE_ALL, + }; + for (UINT i = 0; i < D3D12_SIMULTANEOUS_RENDER_TARGET_COUNT; ++i) + { + blend.RenderTarget[i] = defaultRenderTargetBlendDesc; + } + } + + { + auto& ds = psoDesc.DepthStencilState; + + ds.DepthEnable = FALSE; + ds.DepthWriteMask = D3D12_DEPTH_WRITE_MASK_ALL; + ds.DepthFunc = D3D12_COMPARISON_FUNC_ALWAYS; + //ds.DepthFunc = D3D12_COMPARISON_FUNC_LESS; + ds.StencilEnable = FALSE; + ds.StencilReadMask = D3D12_DEFAULT_STENCIL_READ_MASK; + ds.StencilWriteMask = D3D12_DEFAULT_STENCIL_WRITE_MASK; + const D3D12_DEPTH_STENCILOP_DESC defaultStencilOp = + { + D3D12_STENCIL_OP_KEEP, D3D12_STENCIL_OP_KEEP, D3D12_STENCIL_OP_KEEP, D3D12_COMPARISON_FUNC_ALWAYS + }; + ds.FrontFace = defaultStencilOp; + ds.BackFace = defaultStencilOp; + } + + psoDesc.PrimitiveTopologyType = m_primitiveTopologyType; + + ComPtr<ID3D12PipelineState> pipelineState; + SLANG_RETURN_ON_FAIL(m_device->CreateGraphicsPipelineState(&psoDesc, IID_PPV_ARGS(pipelineState.writeRef()))); + + RefPtr<PipelineStateImpl> pipelineStateImpl = new PipelineStateImpl(); + pipelineStateImpl->m_pipelineType = PipelineType::Graphics; + pipelineStateImpl->m_pipelineLayout = pipelineLayoutImpl; + 
pipelineStateImpl->m_pipelineState = pipelineState; + *outState = pipelineStateImpl.detach(); + return SLANG_OK; } +Result D3D12Renderer::createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) +{ + auto pipelineLayoutImpl = (PipelineLayoutImpl*) desc.pipelineLayout; + auto programImpl = (ShaderProgramImpl*) desc.program; + + // Describe and create the compute pipeline state object + D3D12_COMPUTE_PIPELINE_STATE_DESC computeDesc = {}; + computeDesc.pRootSignature = pipelineLayoutImpl->m_rootSignature; + computeDesc.CS = { programImpl->m_computeShader.Buffer(), programImpl->m_computeShader.Count() }; + + ComPtr<ID3D12PipelineState> pipelineState; + SLANG_RETURN_ON_FAIL(m_device->CreateComputePipelineState(&computeDesc, IID_PPV_ARGS(pipelineState.writeRef()))); + + RefPtr<PipelineStateImpl> pipelineStateImpl = new PipelineStateImpl(); + pipelineStateImpl->m_pipelineType = PipelineType::Compute; + pipelineStateImpl->m_pipelineLayout = pipelineLayoutImpl; + pipelineStateImpl->m_pipelineState = pipelineState; + *outState = pipelineStateImpl.detach(); + return SLANG_OK; +} } // renderer_test diff --git a/tools/slang-graphics/render-d3d12.h b/tools/gfx/render-d3d12.h index 5f0eea4d2..b8a3104c0 100644 --- a/tools/slang-graphics/render-d3d12.h +++ b/tools/gfx/render-d3d12.h @@ -1,10 +1,10 @@ // render-d3d12.h #pragma once -namespace slang_graphics { +namespace gfx { class Renderer; Renderer* createD3D12Renderer(); -} // slang_graphics +} // gfx diff --git a/tools/slang-graphics/render-gl.cpp b/tools/gfx/render-gl.cpp index f85a81ca4..3ab818fdd 100644 --- a/tools/slang-graphics/render-gl.cpp +++ b/tools/gfx/render-gl.cpp @@ -73,7 +73,7 @@ using namespace Slang; -namespace slang_graphics { +namespace gfx { class GLRenderer : public Renderer { @@ -84,20 +84,39 @@ public: virtual void setClearColor(const float color[4]) override; virtual void clearFrame() override; virtual void presentFrame() override; - virtual TextureResource* 
createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData) override; - virtual BufferResource* createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData) override; + TextureResource::Desc getSwapChainTextureDesc() override; + + Result createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData, TextureResource** outResource) override; + Result createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& desc, const void* initData, BufferResource** outResource) override; + Result createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) override; + + Result createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) override; + Result createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) override; + + Result createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount, InputLayout** outLayout) override; + + Result createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) override; + Result createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) override; + Result createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) override; + + Result createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) override; + Result createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) override; + Result createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) override; + virtual SlangResult captureScreenSurface(Surface& surfaceOut) override; - virtual InputLayout* createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount) override; - virtual BindingState* 
createBindingState(const BindingState::Desc& bindingStateDesc) override; - virtual ShaderProgram* createProgram(const ShaderProgram::Desc& desc) override; + virtual void* map(BufferResource* buffer, MapFlavor flavor) override; virtual void unmap(BufferResource* buffer) override; - virtual void setInputLayout(InputLayout* inputLayout) override; virtual void setPrimitiveTopology(PrimitiveTopology topology) override; - virtual void setBindingState(BindingState* state); + + virtual void setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) override; + virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) override; - virtual void setShaderProgram(ShaderProgram* inProgram) override; + virtual void setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) override; + virtual void setDepthStencilTarget(ResourceView* depthStencilView) override; + virtual void setPipelineState(PipelineType pipelineType, PipelineState* state) override; virtual void draw(UInt vertexCount, UInt startVertex) override; + virtual void drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) override; virtual void dispatchCompute(int x, int y, int z) override; virtual void submitGpuWork() override {} virtual void waitForGpu() override {} @@ -107,6 +126,7 @@ public: enum { kMaxVertexStreams = 16, + kMaxDescriptorSetCount = 8, }; struct VertexAttributeFormat @@ -184,33 +204,78 @@ public: GLuint m_handle; }; - struct BindingDetail + class SamplerStateImpl : public SamplerState { - GLuint m_samplerHandle = 0; + public: + GLuint m_samplerID; }; - class BindingStateImpl: public BindingState + class ResourceViewImpl : public ResourceView { - public: - typedef BindingState Parent; + }; - /// Ctor - BindingStateImpl(const Desc& desc, GLRenderer* renderer): - Parent(desc), - m_renderer(renderer) - { - } + class TextureViewImpl : public ResourceViewImpl + 
{ + public: + RefPtr<TextureResourceImpl> m_resource; + GLuint m_textureID; + }; - ~BindingStateImpl() - { - if (m_renderer) - { - m_renderer->destroyBindingEntries(getDesc(), m_bindingDetails.Buffer()); - } - } + class BufferViewImpl : public ResourceViewImpl + { + public: + RefPtr<BufferResourceImpl> m_resource; + GLuint m_bufferID; + }; - GLRenderer* m_renderer; - List<BindingDetail> m_bindingDetails; + enum class GLDescriptorSlotType + { + ConstantBuffer, + CombinedTextureSampler, + + CountOf, + }; + + class DescriptorSetLayoutImpl : public DescriptorSetLayout + { + public: + struct RangeInfo + { + GLDescriptorSlotType type; + UInt arrayIndex; + }; + List<RangeInfo> m_ranges; + Int m_counts[int(GLDescriptorSlotType::CountOf)]; + }; + + class PipelineLayoutImpl : public PipelineLayout + { + public: + struct DescriptorSetInfo + { + RefPtr<DescriptorSetLayoutImpl> layout; + UInt baseArrayIndex[int(GLDescriptorSlotType::CountOf)]; + }; + + List<DescriptorSetInfo> m_sets; + }; + + class DescriptorSetImpl : public DescriptorSet + { + public: + virtual void setConstantBuffer(UInt range, UInt index, BufferResource* buffer) override; + virtual void setResource(UInt range, UInt index, ResourceView* view) override; + virtual void setSampler(UInt range, UInt index, SamplerState* sampler) override; + virtual void setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) override; + + RefPtr<DescriptorSetLayoutImpl> m_layout; + List<RefPtr<BufferResourceImpl>> m_constantBuffers; + List<RefPtr<TextureViewImpl>> m_textures; + List<RefPtr<SamplerStateImpl>> m_samplers; }; class ShaderProgramImpl : public ShaderProgram @@ -233,6 +298,14 @@ public: GLRenderer* m_renderer; }; + class PipelineStateImpl : public PipelineState + { + public: + RefPtr<ShaderProgramImpl> m_program; + RefPtr<PipelineLayoutImpl> m_pipelineLayout; + RefPtr<InputLayoutImpl> m_inputLayout; + }; + enum class GlPixelFormat { Unknown, @@ -247,7 +320,7 @@ 
public: GLenum formatType; // such as GL_UNSIGNED_BYTE }; - void destroyBindingEntries(const BindingState::Desc& desc, const BindingDetail* details); +// void destroyBindingEntries(const BindingState::Desc& desc, const BindingDetail* details); void bindBufferImpl(int target, UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* offsets); void flushStateForDraw(); @@ -266,8 +339,11 @@ public: HGLRC m_glContext; float m_clearColor[4] = { 0, 0, 0, 0 }; - RefPtr<ShaderProgramImpl> m_boundShaderProgram; - RefPtr<InputLayoutImpl> m_boundInputLayout; + RefPtr<PipelineStateImpl> m_currentPipelineState; +// RefPtr<ShaderProgramImpl> m_boundShaderProgram; +// RefPtr<InputLayoutImpl> m_boundInputLayout; + + RefPtr<DescriptorSetImpl> m_boundDescriptorSets[kMaxDescriptorSetCount]; GLenum m_boundPrimitiveTopology = GL_TRIANGLES; GLuint m_boundVertexStreamBuffers[kMaxVertexStreams]; @@ -365,11 +441,11 @@ void GLRenderer::bindBufferImpl(int target, UInt startSlot, UInt slotCount, Buff void GLRenderer::flushStateForDraw() { - auto layout = m_boundInputLayout.Ptr(); - auto attrCount = layout->m_attributeCount; + auto inputLayout = m_currentPipelineState->m_inputLayout.Ptr(); + auto attrCount = inputLayout->m_attributeCount; for (UInt ii = 0; ii < attrCount; ++ii) { - auto& attr = layout->m_attributes[ii]; + auto& attr = inputLayout->m_attributes[ii]; auto streamIndex = attr.streamIndex; @@ -389,6 +465,57 @@ void GLRenderer::flushStateForDraw() { glDisableVertexAttribArray((GLuint)ii); } + + // Next bind the descriptor sets as required by the layout + auto pipelineLayout = m_currentPipelineState->m_pipelineLayout; + auto descriptorSetCount = pipelineLayout->m_sets.Count(); + for(UInt ii = 0; ii < descriptorSetCount; ++ii) + { + auto descriptorSet = m_boundDescriptorSets[ii]; + auto descriptorSetInfo = pipelineLayout->m_sets[ii]; + auto descriptorSetLayout = descriptorSetInfo.layout; + + // TODO: need to validate that `descriptorSet->m_layout` matches + // 
`descriptorSetLayout`. + + { + // First we will bind any uniform buffers that were specified. + + auto slotTypeIndex = int(GLDescriptorSlotType::ConstantBuffer); + auto count = descriptorSetLayout->m_counts[slotTypeIndex]; + auto baseIndex = descriptorSetInfo.baseArrayIndex[slotTypeIndex]; + + for(Int ii = 0; ii < count; ++ii) + { + auto bufferImpl = descriptorSet->m_constantBuffers[ii]; + glBindBufferBase(GL_UNIFORM_BUFFER, ii, bufferImpl->m_handle); + } + } + + + { + // Next we will bind any combined texture/sampler slots. + + auto slotTypeIndex = int(GLDescriptorSlotType::CombinedTextureSampler); + auto count = descriptorSetLayout->m_counts[slotTypeIndex]; + auto baseIndex = descriptorSetInfo.baseArrayIndex[slotTypeIndex]; + + // TODO: We should be able to use a single call to glBindTextures here, + // rather than a loop. This would also eliminate the need to retain + // the appropriate target (e.g., `GL_TEXTURE_2D` for binding). + + for(Int ii = 0; ii < count; ++ii) + { + auto textureViewImpl = descriptorSet->m_textures[ii]; + auto samplerImpl = descriptorSet->m_samplers[ii]; + + glActiveTexture(GL_TEXTURE0 + ii); + glBindTexture(GL_TEXTURE_2D, textureViewImpl->m_textureID); + + glBindSampler(baseIndex + ii, samplerImpl->m_samplerID); + } + } + } } GLuint GLRenderer::loadShader(GLenum stage, const char* source) @@ -502,6 +629,7 @@ GLuint GLRenderer::loadShader(GLenum stage, const char* source) return shaderID; } +#if 0 void GLRenderer::destroyBindingEntries(const BindingState::Desc& desc, const BindingDetail* details) { const auto& bindings = desc.m_bindings; @@ -517,6 +645,7 @@ void GLRenderer::destroyBindingEntries(const BindingState::Desc& desc, const Bin } } } +#endif // !!!!!!!!!!!!!!!!!!!!!!!!!!!! Renderer interface !!!!!!!!!!!!!!!!!!!!!!!!!! 
@@ -581,6 +710,13 @@ void GLRenderer::presentFrame() ::SwapBuffers(m_hdc); } +TextureResource::Desc GLRenderer::getSwapChainTextureDesc() +{ + TextureResource::Desc desc; + desc.init2D(Resource::Type::Texture2D, Format::Unknown, m_desc.width, m_desc.height, 1); + return desc; +} + SlangResult GLRenderer::captureScreenSurface(Surface& surfaceOut) { SLANG_RETURN_ON_FAIL(surfaceOut.allocate(m_desc.width, m_desc.height, Format::RGBA_Unorm_UInt8, 1, SurfaceAllocator::getMallocAllocator())); @@ -589,7 +725,7 @@ SlangResult GLRenderer::captureScreenSurface(Surface& surfaceOut) return SLANG_OK; } -TextureResource* GLRenderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData) +Result GLRenderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData, TextureResource** outResource) { TextureResource::Desc srcDesc(descIn); srcDesc.setDefaults(initialUsage); @@ -597,7 +733,7 @@ TextureResource* GLRenderer::createTextureResource(Resource::Usage initialUsage, GlPixelFormat pixelFormat = _getGlPixelFormat(srcDesc.format); if (pixelFormat == GlPixelFormat::Unknown) { - return nullptr; + return SLANG_FAIL; } const GlPixelFormatInfo& info = s_pixelFormatInfos[int(pixelFormat)]; @@ -713,7 +849,8 @@ TextureResource* GLRenderer::createTextureResource(Resource::Usage initialUsage, } break; } - default: return nullptr; + default: + return SLANG_FAIL; } glTexParameteri(target, GL_TEXTURE_WRAP_S, GL_REPEAT); @@ -727,7 +864,8 @@ TextureResource* GLRenderer::createTextureResource(Resource::Usage initialUsage, texture->m_target = target; - return texture.detach(); + *outResource = texture.detach(); + return SLANG_OK; } static GLenum _calcUsage(Resource::Usage usage) @@ -750,7 +888,7 @@ static GLenum _calcTarget(Resource::Usage usage) } } -BufferResource* GLRenderer::createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& 
descIn, const void* initData) +Result GLRenderer::createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData, BufferResource** outResource) { BufferResource::Desc desc(descIn); desc.setDefaults(initialUsage); @@ -765,12 +903,51 @@ BufferResource* GLRenderer::createBufferResource(Resource::Usage initialUsage, c glBufferData(target, descIn.sizeInBytes, initData, usage); - return new BufferResourceImpl(initialUsage, desc, this, bufferID, target); + RefPtr<BufferResourceImpl> resourceImpl = new BufferResourceImpl(initialUsage, desc, this, bufferID, target); + *outResource = resourceImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) +{ + GLuint samplerID; + glCreateSamplers(1, &samplerID); + + RefPtr<SamplerStateImpl> samplerImpl = new SamplerStateImpl(); + samplerImpl->m_samplerID = samplerID; + *outSampler = samplerImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) +{ + auto resourceImpl = (TextureResourceImpl*) texture; + + // TODO: actually do something? + + RefPtr<TextureViewImpl> viewImpl = new TextureViewImpl(); + viewImpl->m_resource = resourceImpl; + viewImpl->m_textureID = resourceImpl->m_handle; + *outView = viewImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) +{ + auto resourceImpl = (BufferResourceImpl*) buffer; + + // TODO: actually do something? 
+ + RefPtr<BufferViewImpl> viewImpl = new BufferViewImpl(); + viewImpl->m_resource = resourceImpl; + viewImpl->m_bufferID = resourceImpl->m_handle; + *outView = viewImpl.detach(); + return SLANG_OK; } -InputLayout* GLRenderer::createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount) +Result GLRenderer::createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount, InputLayout** outLayout) { - InputLayoutImpl* inputLayout = new InputLayoutImpl; + RefPtr<InputLayoutImpl> inputLayout = new InputLayoutImpl; inputLayout->m_attributeCount = inputElementCount; for (UInt ii = 0; ii < inputElementCount; ++ii) @@ -783,7 +960,8 @@ InputLayout* GLRenderer::createInputLayout(const InputElementDesc* inputElements glAttr.offset = (GLsizei)inputAttr.offset; } - return (InputLayout*)inputLayout; + *outLayout = inputLayout.detach(); + return SLANG_OK; } void* GLRenderer::map(BufferResource* bufferIn, MapFlavor flavor) @@ -815,11 +993,6 @@ void GLRenderer::unmap(BufferResource* bufferIn) glUnmapBuffer(buffer->m_target); } -void GLRenderer::setInputLayout(InputLayout* inputLayout) -{ - m_boundInputLayout = static_cast<InputLayoutImpl*>(inputLayout); -} - void GLRenderer::setPrimitiveTopology(PrimitiveTopology topology) { GLenum glTopology = 0; @@ -849,12 +1022,23 @@ void GLRenderer::setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource } } -void GLRenderer::setShaderProgram(ShaderProgram* programIn) +void GLRenderer::setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) +{ +} + +void GLRenderer::setDepthStencilTarget(ResourceView* depthStencilView) { - ShaderProgramImpl* program = static_cast<ShaderProgramImpl*>(programIn); - m_boundShaderProgram = program; +} + +void GLRenderer::setPipelineState(PipelineType pipelineType, PipelineState* state) +{ + auto pipelineStateImpl = (PipelineStateImpl*) state; + + m_currentPipelineState = pipelineStateImpl; + + auto program = pipelineStateImpl->m_program; GLuint programID = 
program ? program->m_id : 0; - glUseProgram(programID); + glUseProgram(programID); } void GLRenderer::draw(UInt vertexCount, UInt startVertex = 0) @@ -864,11 +1048,17 @@ void GLRenderer::draw(UInt vertexCount, UInt startVertex = 0) glDrawArrays(m_boundPrimitiveTopology, (GLint)startVertex, (GLsizei)vertexCount); } +void GLRenderer::drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) +{ + assert(!"unimplemented"); +} + void GLRenderer::dispatchCompute(int x, int y, int z) { glDispatchCompute(x, y, z); } +#if 0 BindingState* GLRenderer::createBindingState(const BindingState::Desc& bindingStateDesc) { RefPtr<BindingStateImpl> bindingState(new BindingStateImpl(bindingStateDesc, this)); @@ -990,8 +1180,168 @@ void GLRenderer::setBindingState(BindingState* stateIn) } } } +#endif + +void GLRenderer::DescriptorSetImpl::setConstantBuffer(UInt range, UInt index, BufferResource* buffer) +{ + auto resourceImpl = (BufferResourceImpl*) buffer; + + auto layout = m_layout; + auto rangeInfo = layout->m_ranges[range]; + auto arrayIndex = rangeInfo.arrayIndex + index; + + m_constantBuffers[arrayIndex] = resourceImpl; +} + +void GLRenderer::DescriptorSetImpl::setResource(UInt range, UInt index, ResourceView* view) +{ + auto viewImpl = (ResourceViewImpl*) view; + + auto layout = m_layout; + auto rangeInfo = layout->m_ranges[range]; + auto arrayIndex = rangeInfo.arrayIndex + index; + + assert(!"unimplemented"); +} + +void GLRenderer::DescriptorSetImpl::setSampler(UInt range, UInt index, SamplerState* sampler) +{ + assert(!"unsupported"); +} + +void GLRenderer::DescriptorSetImpl::setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) +{ + auto viewImpl = (TextureViewImpl*) textureView; + auto samplerImpl = (SamplerStateImpl*) sampler; + + auto layout = m_layout; + auto rangeInfo = layout->m_ranges[range]; + auto arrayIndex = rangeInfo.arrayIndex + index; + + m_textures[arrayIndex] = viewImpl; + m_samplers[arrayIndex] = 
samplerImpl; +} + +void GLRenderer::setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) +{ + auto descriptorSetImpl = (DescriptorSetImpl*)descriptorSet; + + // TODO: can we just bind things immediately here, rather than shadowing the state? + + m_boundDescriptorSets[index] = descriptorSetImpl; +} -ShaderProgram* GLRenderer::createProgram(const ShaderProgram::Desc& desc) +Result GLRenderer::createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) +{ + RefPtr<DescriptorSetLayoutImpl> layoutImpl = new DescriptorSetLayoutImpl(); + + Int counts[int(GLDescriptorSlotType::CountOf)] = { 0, }; + + Int rangeCount = desc.slotRangeCount; + for(Int rr = 0; rr < rangeCount; ++rr) + { + auto rangeDesc = desc.slotRanges[rr]; + DescriptorSetLayoutImpl::RangeInfo rangeInfo; + + GLDescriptorSlotType glSlotType; + switch( rangeDesc.type ) + { + default: + assert(!"unsupported"); + break; + + // TODO: There are many other slot types we could support here, + // in particular including storage buffers. 
+ + case DescriptorSlotType::CombinedImageSampler: + glSlotType = GLDescriptorSlotType::CombinedTextureSampler; + break; + + case DescriptorSlotType::UniformBuffer: + case DescriptorSlotType::DynamicUniformBuffer: + glSlotType = GLDescriptorSlotType::ConstantBuffer; + break; + } + + rangeInfo.type = glSlotType; + rangeInfo.arrayIndex = counts[int(glSlotType)]; + counts[int(glSlotType)] += rangeDesc.count; + + layoutImpl->m_ranges.Add(rangeInfo); + } + + for( Int ii = 0; ii < int(GLDescriptorSlotType::CountOf); ++ii ) + { + layoutImpl->m_counts[ii] = counts[ii]; + } + + *outLayout = layoutImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) +{ + RefPtr<PipelineLayoutImpl> layoutImpl = new PipelineLayoutImpl(); + + static const int kSlotTypeCount = int(GLDescriptorSlotType::CountOf); + Int counts[kSlotTypeCount] = { 0, }; + + Int setCount = desc.descriptorSetCount; + for( Int ii = 0; ii < setCount; ++ii ) + { + auto setLayout = (DescriptorSetLayoutImpl*) desc.descriptorSets[ii].layout; + + PipelineLayoutImpl::DescriptorSetInfo setInfo; + setInfo.layout = setLayout; + + for( Int ii = 0; ii < int(GLDescriptorSlotType::CountOf); ++ii ) + { + setInfo.baseArrayIndex[ii] = counts[ii]; + counts[ii] += setLayout->m_counts[ii]; + } + + layoutImpl->m_sets.Add(setInfo); + } + + *outLayout = layoutImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) +{ + auto layoutImpl = (DescriptorSetLayoutImpl*) layout; + + RefPtr<DescriptorSetImpl> descriptorSetImpl = new DescriptorSetImpl(); + + descriptorSetImpl->m_layout = layoutImpl; + + // TODO: storage for the arrays of bound objects could be tail allocated + // as part of the descriptor set, with offsets pre-computed in the + // descriptor set layout. 
+ + { + auto slotTypeIndex = int(GLDescriptorSlotType::ConstantBuffer); + auto slotCount = layoutImpl->m_counts[slotTypeIndex]; + descriptorSetImpl->m_constantBuffers.SetSize(slotCount); + } + + { + auto slotTypeIndex = int(GLDescriptorSlotType::CombinedTextureSampler); + auto slotCount = layoutImpl->m_counts[slotTypeIndex]; + + descriptorSetImpl->m_textures.SetSize(slotCount); + descriptorSetImpl->m_samplers.SetSize(slotCount); + } + + *outDescriptorSet = descriptorSetImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) { auto programID = glCreateProgram(); if(desc.pipelineType == PipelineType::Compute ) @@ -1039,10 +1389,37 @@ ShaderProgram* GLRenderer::createProgram(const ShaderProgram::Desc& desc) ::free(infoBuffer); glDeleteProgram(programID); - return nullptr; + return SLANG_FAIL; } - return new ShaderProgramImpl(this, programID); + *outProgram = new ShaderProgramImpl(this, programID); + return SLANG_OK; +} + +Result GLRenderer::createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) +{ + auto programImpl = (ShaderProgramImpl*) desc.program; + auto pipelineLayoutImpl = (PipelineLayoutImpl*) desc.pipelineLayout; + auto inputLayoutImpl = (InputLayoutImpl*) desc.inputLayout; + + RefPtr<PipelineStateImpl> pipelineStateImpl = new PipelineStateImpl(); + pipelineStateImpl->m_program = programImpl; + pipelineStateImpl->m_pipelineLayout = pipelineLayoutImpl; + pipelineStateImpl->m_inputLayout = inputLayoutImpl; + *outState = pipelineStateImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) +{ + auto programImpl = (ShaderProgramImpl*) desc.program; + auto pipelineLayoutImpl = (PipelineLayoutImpl*) desc.pipelineLayout; + + RefPtr<PipelineStateImpl> pipelineStateImpl = new PipelineStateImpl(); + pipelineStateImpl->m_program = programImpl; + 
pipelineStateImpl->m_pipelineLayout = pipelineLayoutImpl; + *outState = pipelineStateImpl.detach(); + return SLANG_OK; } diff --git a/tools/slang-graphics/render-gl.h b/tools/gfx/render-gl.h index 2b211cda4..055031d38 100644 --- a/tools/slang-graphics/render-gl.h +++ b/tools/gfx/render-gl.h @@ -1,10 +1,10 @@ // render-d3d11.h #pragma once -namespace slang_graphics { +namespace gfx { class Renderer; Renderer* createGLRenderer(); -} // slang_graphics +} // gfx diff --git a/tools/slang-graphics/render-vk.cpp b/tools/gfx/render-vk.cpp index d7cd93e67..27926e0e6 100644 --- a/tools/slang-graphics/render-vk.cpp +++ b/tools/gfx/render-vk.cpp @@ -26,33 +26,58 @@ # endif #endif -namespace slang_graphics { +namespace gfx { using namespace Slang; class VKRenderer : public Renderer { public: - enum { kMaxRenderTargets = 8, kMaxAttachments = kMaxRenderTargets + 1 }; + enum + { + kMaxRenderTargets = 8, + kMaxAttachments = kMaxRenderTargets + 1, + + kMaxDescriptorSets = 4, + }; // Renderer implementation virtual SlangResult initialize(const Desc& desc, void* inWindowHandle) override; virtual void setClearColor(const float color[4]) override; virtual void clearFrame() override; virtual void presentFrame() override; - virtual TextureResource* createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData) override; - virtual BufferResource* createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& bufferDesc, const void* initData) override; + TextureResource::Desc getSwapChainTextureDesc() override; + + Result createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData, TextureResource** outResource) override; + Result createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& desc, const void* initData, BufferResource** outResource) override; + Result createSamplerState(SamplerState::Desc const& desc, SamplerState** 
outSampler) override; + + Result createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) override; + Result createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) override; + + Result createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount, InputLayout** outLayout) override; + + Result createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) override; + Result createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) override; + Result createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) override; + + Result createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) override; + Result createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) override; + Result createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) override; + virtual SlangResult captureScreenSurface(Surface& surface) override; - virtual InputLayout* createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount) override; - virtual BindingState* createBindingState(const BindingState::Desc& bindingStateDesc) override; - virtual ShaderProgram* createProgram(const ShaderProgram::Desc& desc) override; + virtual void* map(BufferResource* buffer, MapFlavor flavor) override; virtual void unmap(BufferResource* buffer) override; - virtual void setInputLayout(InputLayout* inputLayout) override; virtual void setPrimitiveTopology(PrimitiveTopology topology) override; - virtual void setBindingState(BindingState* state); + + virtual void setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) override; + virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) override; - 
virtual void setShaderProgram(ShaderProgram* inProgram) override; + virtual void setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) override; + virtual void setDepthStencilTarget(ResourceView* depthStencilView) override; + virtual void setPipelineState(PipelineType pipelineType, PipelineState* state) override; virtual void draw(UInt vertexCount, UInt startVertex) override; + virtual void drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) override; virtual void dispatchCompute(int x, int y, int z) override; virtual void submitGpuWork() override; virtual void waitForGpu() override; @@ -155,7 +180,63 @@ public: const VulkanApi* m_api; }; - class ShaderProgramImpl: public ShaderProgram + class SamplerStateImpl : public SamplerState + { + public: + VkSampler m_sampler; + }; + + class ResourceViewImpl : public ResourceView + { + public: + enum class ViewType + { + Texture, + TexelBuffer, + PlainBuffer, + }; + ViewType m_type; + }; + + class TextureResourceViewImpl : public ResourceViewImpl + { + public: + TextureResourceViewImpl() + { + m_type = ViewType::Texture; + } + + RefPtr<TextureResourceImpl> m_texture; + VkImageView m_view; + VkImageLayout m_layout; + }; + + class TexelBufferResourceViewImpl : public ResourceViewImpl + { + public: + TexelBufferResourceViewImpl() + { + m_type = ViewType::TexelBuffer; + } + + RefPtr<BufferResourceImpl> m_buffer; + VkBufferView m_view; + }; + + class PlainBufferResourceViewImpl : public ResourceViewImpl + { + public: + PlainBufferResourceViewImpl() + { + m_type = ViewType::PlainBuffer; + } + + RefPtr<BufferResourceImpl> m_buffer; + VkDeviceSize offset; + VkDeviceSize size; + }; + + class ShaderProgramImpl: public ShaderProgram { public: @@ -172,6 +253,85 @@ public: List<char> m_buffers[2]; //< To keep storage of code in scope }; + class DescriptorSetLayoutImpl : public DescriptorSetLayout + { + public: + DescriptorSetLayoutImpl(const VulkanApi& api) + : m_api(&api) + { + } + + 
~DescriptorSetLayoutImpl() + { + if(m_descriptorSetLayout != VK_NULL_HANDLE) + { + m_api->vkDestroyDescriptorSetLayout(m_api->m_device, m_descriptorSetLayout, nullptr); + } + if (m_descriptorPool != VK_NULL_HANDLE) + { + m_api->vkDestroyDescriptorPool(m_api->m_device, m_descriptorPool, nullptr); + } + } + + VulkanApi const* m_api; + VkDescriptorSetLayout m_descriptorSetLayout = VK_NULL_HANDLE; + VkDescriptorPool m_descriptorPool = VK_NULL_HANDLE; + + struct RangeInfo + { + VkDescriptorType descriptorType; + }; + List<RangeInfo> m_ranges; + }; + + class PipelineLayoutImpl : public PipelineLayout + { + public: + PipelineLayoutImpl(const VulkanApi& api) + : m_api(&api) + { + } + + ~PipelineLayoutImpl() + { + if (m_pipelineLayout != VK_NULL_HANDLE) + { + m_api->vkDestroyPipelineLayout(m_api->m_device, m_pipelineLayout, nullptr); + } + } + + VulkanApi const* m_api; + VkPipelineLayout m_pipelineLayout = VK_NULL_HANDLE; + UInt m_descriptorSetCount = 0; + }; + + class DescriptorSetImpl : public DescriptorSet + { + public: + DescriptorSetImpl(VKRenderer* renderer) + : m_renderer(renderer) + { + } + + ~DescriptorSetImpl() + { + } + + virtual void setConstantBuffer(UInt range, UInt index, BufferResource* buffer) override; + virtual void setResource(UInt range, UInt index, ResourceView* view) override; + virtual void setSampler(UInt range, UInt index, SamplerState* sampler) override; + virtual void setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) override; + + RefPtr<VKRenderer> m_renderer; + RefPtr<DescriptorSetLayoutImpl> m_layout; + VkDescriptorSet m_descriptorSet = VK_NULL_HANDLE; + }; + +#if 0 struct BindingDetail { VkImageView m_srv = VK_NULL_HANDLE; @@ -212,6 +372,7 @@ public: const VulkanApi* m_api; List<BindingDetail> m_bindingDetails; }; +#endif struct BoundVertexBuffer { @@ -220,44 +381,30 @@ public: int m_offset; }; - class Pipeline : public RefObject + class PipelineStateImpl : public PipelineState { 
public: - Pipeline(const VulkanApi& api): + PipelineStateImpl(const VulkanApi& api): m_api(&api) { } - ~Pipeline() + ~PipelineStateImpl() { if (m_pipeline != VK_NULL_HANDLE) { m_api->vkDestroyPipeline(m_api->m_device, m_pipeline, nullptr); } - if (m_descriptorPool != VK_NULL_HANDLE) - { - m_api->vkDestroyDescriptorPool(m_api->m_device, m_descriptorPool, nullptr); - } - if (m_pipelineLayout != VK_NULL_HANDLE) - { - m_api->vkDestroyPipelineLayout(m_api->m_device, m_pipelineLayout, nullptr); - } - if(m_descriptorSetLayout != VK_NULL_HANDLE) - { - m_api->vkDestroyDescriptorSetLayout(m_api->m_device, m_descriptorSetLayout, nullptr); - } } const VulkanApi* m_api; - VkPrimitiveTopology m_primitiveTopology; - RefPtr<BindingStateImpl> m_bindingState; - RefPtr<InputLayoutImpl> m_inputLayout; +// VkPrimitiveTopology m_primitiveTopology; + + RefPtr<PipelineLayoutImpl> m_pipelineLayout; + +// RefPtr<InputLayoutImpl> m_inputLayout; RefPtr<ShaderProgramImpl> m_shaderProgram; - VkDescriptorSetLayout m_descriptorSetLayout = VK_NULL_HANDLE; - VkPipelineLayout m_pipelineLayout = VK_NULL_HANDLE; - VkDescriptorPool m_descriptorPool = VK_NULL_HANDLE; - VkDescriptorSet m_descriptorSet = VK_NULL_HANDLE; VkPipeline m_pipeline = VK_NULL_HANDLE; }; @@ -273,9 +420,9 @@ public: size_t location, int32_t msgCode, const char* pLayerPrefix, const char* pMsg, void* pUserData); /// Returns true if m_currentPipeline matches the current configuration - Pipeline* _getPipeline(); - bool _isEqual(const Pipeline& pipeline) const; - Slang::Result _createPipeline(RefPtr<Pipeline>& pipelineOut); +// Pipeline* _getPipeline(); +// bool _isEqual(const Pipeline& pipeline) const; +// Slang::Result _createPipeline(RefPtr<Pipeline>& pipelineOut); void _beginRender(); void _endRender(); @@ -285,12 +432,18 @@ public: VkDebugReportCallbackEXT m_debugReportCallback; - RefPtr<InputLayoutImpl> m_currentInputLayout; - RefPtr<BindingStateImpl> m_currentBindingState; - RefPtr<ShaderProgramImpl> m_currentProgram; +// 
RefPtr<InputLayoutImpl> m_currentInputLayout; + +// RefPtr<BindingStateImpl> m_currentBindingState; + RefPtr<PipelineLayoutImpl> m_currentPipelineLayout; + + RefPtr<DescriptorSetImpl> m_currentDescriptorSetImpls [kMaxDescriptorSets]; + VkDescriptorSet m_currentDescriptorSets [kMaxDescriptorSets]; + +// RefPtr<ShaderProgramImpl> m_currentProgram; - List<RefPtr<Pipeline> > m_pipelineCache; - Pipeline* m_currentPipeline = nullptr; +// List<RefPtr<Pipeline> > m_pipelineCache; + RefPtr<PipelineStateImpl> m_currentPipeline; List<BoundVertexBuffer> m_boundVertexBuffers; @@ -349,10 +502,11 @@ Result VKRenderer::Buffer::init(const VulkanApi& api, size_t bufferSize, VkBuffe /* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! VkRenderer !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! */ +#if 0 bool VKRenderer::_isEqual(const Pipeline& pipeline) const { return - pipeline.m_bindingState == m_currentBindingState && + pipeline.m_pipelineLayout == m_currentPipelineLayout && pipeline.m_primitiveTopology == m_primitiveTopology && pipeline.m_inputLayout == m_currentInputLayout && pipeline.m_shaderProgram == m_currentProgram; @@ -389,266 +543,13 @@ Slang::Result VKRenderer::_createPipeline(RefPtr<Pipeline>& pipelineOut) // Initialize the state pipeline->m_primitiveTopology = m_primitiveTopology; - pipeline->m_bindingState = m_currentBindingState; + pipeline->m_pipelineLayout = m_currentPipelineLayout; pipeline->m_shaderProgram = m_currentProgram; pipeline->m_inputLayout = m_currentInputLayout; // Must be equal at this point if all the items are correctly set in pipeline assert(_isEqual(*pipeline)); - // First create a pipeline layout based on what is bound - - const auto& srcDetails = m_currentBindingState->m_bindingDetails; - const auto& srcBindings = m_currentBindingState->getDesc().m_bindings; - - const int numBindings = int(srcBindings.Count()); - - int numBuffers = 0; - int numImages = 0; - - int numDescriptorByType[VK_DESCRIPTOR_TYPE_RANGE_SIZE] = { 0, }; - - 
Slang::List<VkDescriptorSetLayoutBinding> dstBindings; - for (int i = 0; i < numBindings; ++i) - { - const auto& srcDetail = srcDetails[i]; - const auto& srcBinding = srcBindings[i]; - - VkDescriptorSetLayoutBinding dstBinding = {}; - - dstBinding.descriptorCount = 1; - - switch (srcBinding.bindingType) - { - case BindingType::Buffer: - { - BufferResourceImpl* bufferResource = static_cast<BufferResourceImpl*>(srcBinding.resource.Ptr()); - const BufferResource::Desc& bufferResourceDesc = bufferResource->getDesc(); - - if (bufferResourceDesc.bindFlags & Resource::BindFlag::UnorderedAccess) - { - dstBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; - dstBinding.stageFlags = VK_SHADER_STAGE_ALL; - dstBindings.Add(dstBinding); - - numDescriptorByType[dstBinding.descriptorType] ++; - numBuffers++; - } - else if (bufferResourceDesc.bindFlags & Resource::BindFlag::ConstantBuffer) - { - dstBinding.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; - dstBinding.stageFlags = VK_SHADER_STAGE_ALL; - dstBindings.Add(dstBinding); - - numDescriptorByType[dstBinding.descriptorType] ++; - numBuffers++; - } - break; - } - case BindingType::Texture: - { - dstBinding.descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE; - dstBinding.stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT; - dstBindings.Add(dstBinding); - - numDescriptorByType[dstBinding.descriptorType] ++; - numImages++; - break; - } - case BindingType::Sampler: - { - dstBinding.descriptorType = VK_DESCRIPTOR_TYPE_SAMPLER; - dstBinding.stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT; - dstBindings.Add(dstBinding); - - numDescriptorByType[dstBinding.descriptorType] ++; - numImages++; - break; - } - - case BindingType::CombinedTextureSampler: - { - dstBinding.descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER; - dstBinding.stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT; - dstBindings.Add(dstBinding); - - numDescriptorByType[dstBinding.descriptorType] ++; - numImages++; - break; - } - default: - { - assert(!"Unhandled type"); - 
return SLANG_FAIL; - } - } - } - - // Create a descriptor pool for allocating sets - { -#if 0 - VkDescriptorPoolSize poolSizes[] = - { - { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, 128 }, - { VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, 128 }, - { VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE, 128 }, - }; -#endif - - List<VkDescriptorPoolSize> poolSizes; - for (int i = 0; i < SLANG_COUNT_OF(numDescriptorByType); ++i) - { - int numDescriptors = numDescriptorByType[i]; - if (numDescriptors > 0) - { - const VkDescriptorPoolSize poolSize = { VkDescriptorType(i), uint32_t(numDescriptors) }; - poolSizes.Add(poolSize); - } - } - VkDescriptorPoolCreateInfo descriptorPoolInfo = { VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO }; - - descriptorPoolInfo.maxSets = 128; // TODO: actually pick a size. - descriptorPoolInfo.poolSizeCount = uint32_t(poolSizes.Count()); - descriptorPoolInfo.pPoolSizes = poolSizes.Buffer(); - - SLANG_VK_CHECK(m_api.vkCreateDescriptorPool(m_device, &descriptorPoolInfo, nullptr, &pipeline->m_descriptorPool)); - } - - // Create the layout - { - VkDescriptorSetLayoutCreateInfo descriptorSetLayoutInfo = { VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO }; - descriptorSetLayoutInfo.bindingCount = uint32_t(dstBindings.Count()); - descriptorSetLayoutInfo.pBindings = dstBindings.Buffer(); - - SLANG_VK_CHECK(m_api.vkCreateDescriptorSetLayout(m_device, &descriptorSetLayoutInfo, nullptr, &pipeline->m_descriptorSetLayout)); - } - - // Create a descriptor set based on our layout - { - VkDescriptorSetAllocateInfo descriptorSetAllocInfo = { VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO }; - descriptorSetAllocInfo.descriptorPool = pipeline->m_descriptorPool; - descriptorSetAllocInfo.descriptorSetCount = 1; - descriptorSetAllocInfo.pSetLayouts = &pipeline->m_descriptorSetLayout; - - SLANG_VK_CHECK(m_api.vkAllocateDescriptorSets(m_device, &descriptorSetAllocInfo, &pipeline->m_descriptorSet)); - } - - // Fill in the descriptor set, using our binding information - - List<VkDescriptorImageInfo> 
imageInfos; - List<VkDescriptorBufferInfo> bufferInfos; - List<VkWriteDescriptorSet> writes; - - // Make sure there is enough space... - imageInfos.Reserve(numImages); - bufferInfos.Reserve(numBuffers); - - int elementIndex = 0; - - for (int i = 0; i < numBindings; ++i) - { - const auto& srcDetail = srcDetails[i]; - const auto& srcBinding = srcBindings[i]; - - const int bindingIndex = srcBinding.registerRange.getSingleIndex(); - - VkWriteDescriptorSet writeInfo = { VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET }; - writeInfo.descriptorCount = 1; - writeInfo.dstSet = pipeline->m_descriptorSet; - writeInfo.dstBinding = bindingIndex; - writeInfo.dstArrayElement = 0; - - switch (srcBinding.bindingType) - { - case BindingType::Buffer: - { - assert(srcBinding.resource && srcBinding.resource->isBuffer()); - BufferResourceImpl* bufferResource = static_cast<BufferResourceImpl*>(srcBinding.resource.Ptr()); - const BufferResource::Desc& bufferResourceDesc = bufferResource->getDesc(); - - { - VkDescriptorBufferInfo bufferInfo; - bufferInfo.buffer = bufferResource->m_buffer.m_buffer; - bufferInfo.offset = 0; - bufferInfo.range = bufferResourceDesc.sizeInBytes; - - bufferInfos.Add(bufferInfo); - } - - writeInfo.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; - if (bufferResource->m_initialUsage == Resource::Usage::UnorderedAccess) - { - writeInfo.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; - } - else if (bufferResource->m_initialUsage == Resource::Usage::ConstantBuffer) - { - writeInfo.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; - } - - writeInfo.pBufferInfo = &bufferInfos.Last(); - - writes.Add(writeInfo); - break; - } - case BindingType::Texture: - { - assert(srcBinding.resource && srcBinding.resource->isTexture()); - - TextureResourceImpl* textureResource = static_cast<TextureResourceImpl*>(srcBinding.resource.Ptr()); - const TextureResource::Desc& textureResourceDesc = textureResource->getDesc(); - - { - VkDescriptorImageInfo imageInfo = {}; - 
imageInfo.imageView = srcDetail.m_srv; - imageInfo.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL; - imageInfos.Add(imageInfo); - } - - writeInfo.descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE; - writeInfo.pImageInfo = &imageInfos.Last(); - - writes.Add(writeInfo); - break; - } - case BindingType::Sampler: - { - { - VkDescriptorImageInfo imageInfo = {}; - imageInfo.sampler = srcDetail.m_sampler; - //imageInfo.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL; - imageInfos.Add(imageInfo); - } - - writeInfo.descriptorType = VK_DESCRIPTOR_TYPE_SAMPLER; - writeInfo.pImageInfo = &imageInfos.Last(); - - writes.Add(writeInfo); - break; - } - default: - { - assert(!"Binding not currently handled"); - return SLANG_FAIL; - } - } - } - - assert(imageInfos.Count() == numImages); - assert(bufferInfos.Count() == numBuffers); - - // Write into the descriptor set - { - m_api.vkUpdateDescriptorSets(m_device, uint32_t(writes.Count()), writes.Buffer(), 0, nullptr); - } - - // Create a pipeline layout based on our descriptor set layout(s) - - VkPipelineLayoutCreateInfo pipelineLayoutInfo = { VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO }; - pipelineLayoutInfo.setLayoutCount = 1; - pipelineLayoutInfo.pSetLayouts = &pipeline->m_descriptorSetLayout; - - SLANG_VK_CHECK(m_api.vkCreatePipelineLayout(m_device, &pipelineLayoutInfo, nullptr, &pipeline->m_pipelineLayout)); - VkPipelineCache pipelineCache = VK_NULL_HANDLE; if (m_currentProgram->m_pipelineType == PipelineType::Compute) @@ -657,7 +558,7 @@ Slang::Result VKRenderer::_createPipeline(RefPtr<Pipeline>& pipelineOut) VkComputePipelineCreateInfo computePipelineInfo = { VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO }; computePipelineInfo.stage = m_currentProgram->m_compute; - computePipelineInfo.layout = pipeline->m_pipelineLayout; + computePipelineInfo.layout = pipeline->m_pipelineLayout->m_pipelineLayout; SLANG_VK_CHECK(m_api.vkCreateComputePipelines(m_device, pipelineCache, 1, &computePipelineInfo, nullptr, 
&pipeline->m_pipeline)); } @@ -762,7 +663,7 @@ Slang::Result VKRenderer::_createPipeline(RefPtr<Pipeline>& pipelineOut) pipelineInfo.pRasterizationState = &rasterizer; pipelineInfo.pMultisampleState = &multisampling; pipelineInfo.pColorBlendState = &colorBlending; - pipelineInfo.layout = pipeline->m_pipelineLayout; + pipelineInfo.layout = pipeline->m_pipelineLayout->m_pipelineLayout; pipelineInfo.renderPass = m_renderPass; pipelineInfo.subpass = 0; pipelineInfo.basePipelineHandle = VK_NULL_HANDLE; @@ -778,6 +679,7 @@ Slang::Result VKRenderer::_createPipeline(RefPtr<Pipeline>& pipelineOut) pipelineOut = pipeline; return SLANG_OK; } +#endif Result VKRenderer::_beginPass() { @@ -1190,6 +1092,13 @@ void VKRenderer::presentFrame() _beginRender(); } +TextureResource::Desc VKRenderer::getSwapChainTextureDesc() +{ + TextureResource::Desc desc; + desc.init2D(Resource::Type::Texture2D, Format::Unknown, m_desc.width, m_desc.height, 1); + return desc; +} + SlangResult VKRenderer::captureScreenSurface(Surface& surfaceOut) { return SLANG_FAIL; @@ -1345,7 +1254,7 @@ void VKRenderer::_transitionImageLayout(VkImage image, VkFormat format, const Te m_api.vkCmdPipelineBarrier(commandBuffer, sourceStage, destinationStage, 0, 0, nullptr, 0, nullptr, 1, &barrier); } -TextureResource* VKRenderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData) +Result VKRenderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData, TextureResource** outResource) { TextureResource::Desc desc(descIn); desc.setDefaults(initialUsage); @@ -1354,7 +1263,7 @@ TextureResource* VKRenderer::createTextureResource(Resource::Usage initialUsage, if (format == VK_FORMAT_UNDEFINED) { assert(!"Unhandled image format"); - return nullptr; + return SLANG_FAIL; } const int arraySize = desc.calcEffectiveArraySize(); @@ -1397,7 +1306,7 @@ TextureResource* 
VKRenderer::createTextureResource(Resource::Usage initialUsage, default: { assert(!"Unhandled type"); - return nullptr; + return SLANG_FAIL; } } @@ -1413,7 +1322,7 @@ TextureResource* VKRenderer::createTextureResource(Resource::Usage initialUsage, imageInfo.samples = VK_SAMPLE_COUNT_1_BIT; imageInfo.flags = 0; // Optional - SLANG_VK_RETURN_NULL_ON_FAIL(m_api.vkCreateImage(m_device, &imageInfo, nullptr, &texture->m_image)); + SLANG_VK_RETURN_ON_FAIL(m_api.vkCreateImage(m_device, &imageInfo, nullptr, &texture->m_image)); } VkMemoryRequirements memRequirements; @@ -1433,7 +1342,7 @@ TextureResource* VKRenderer::createTextureResource(Resource::Usage initialUsage, allocInfo.allocationSize = memRequirements.size; allocInfo.memoryTypeIndex = memoryTypeIndex; - SLANG_VK_RETURN_NULL_ON_FAIL(m_api.vkAllocateMemory(m_device, &allocInfo, nullptr, &texture->m_imageMemory)); + SLANG_VK_RETURN_ON_FAIL(m_api.vkAllocateMemory(m_device, &allocInfo, nullptr, &texture->m_imageMemory)); } // Bind the memory to the image @@ -1468,7 +1377,7 @@ TextureResource* VKRenderer::createTextureResource(Resource::Usage initialUsage, bufferSize *= arraySize; Buffer uploadBuffer; - SLANG_RETURN_NULL_ON_FAIL(uploadBuffer.init(m_api, bufferSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)); + SLANG_RETURN_ON_FAIL(uploadBuffer.init(m_api, bufferSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)); assert(mipSizes.Count() == numMipMaps); @@ -1554,10 +1463,11 @@ TextureResource* VKRenderer::createTextureResource(Resource::Usage initialUsage, m_deviceQueue.flushAndWait(); } - return texture.detach(); + *outResource = texture.detach(); + return SLANG_OK; } -BufferResource* VKRenderer::createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData) +Result VKRenderer::createBufferResource(Resource::Usage initialUsage, const 
BufferResource::Desc& descIn, const void* initData, BufferResource** outResource) { BufferResource::Desc desc(descIn); desc.setDefaults(initialUsage); @@ -1579,11 +1489,11 @@ BufferResource* VKRenderer::createBufferResource(Resource::Usage initialUsage, c } RefPtr<BufferResourceImpl> buffer(new BufferResourceImpl(initialUsage, desc, this)); - SLANG_RETURN_NULL_ON_FAIL(buffer->m_buffer.init(m_api, desc.sizeInBytes, usage, reqMemoryProperties)); + SLANG_RETURN_ON_FAIL(buffer->m_buffer.init(m_api, desc.sizeInBytes, usage, reqMemoryProperties)); if ((desc.cpuAccessFlags & Resource::AccessFlag::Write) || initData) { - SLANG_RETURN_NULL_ON_FAIL(buffer->m_uploadBuffer.init(m_api, bufferSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)); + SLANG_RETURN_ON_FAIL(buffer->m_uploadBuffer.init(m_api, bufferSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)); } if (initData) @@ -1607,10 +1517,200 @@ BufferResource* VKRenderer::createBufferResource(Resource::Usage initialUsage, c //flushCommandBuffer(commandBuffer); } - return buffer.detach(); + *outResource = buffer.detach(); + return SLANG_OK; +} + + +VkFilter translateFilterMode(TextureFilteringMode mode) +{ + switch (mode) + { + default: + return VkFilter(0); + +#define CASE(SRC, DST) \ + case TextureFilteringMode::SRC: return VK_FILTER_##DST + + CASE(Point, NEAREST); + CASE(Linear, LINEAR); + +#undef CASE + } +} + +VkSamplerMipmapMode translateMipFilterMode(TextureFilteringMode mode) +{ + switch (mode) + { + default: + return VkSamplerMipmapMode(0); + +#define CASE(SRC, DST) \ + case TextureFilteringMode::SRC: return VK_SAMPLER_MIPMAP_MODE_##DST + + CASE(Point, NEAREST); + CASE(Linear, LINEAR); + +#undef CASE + } +} + +VkSamplerAddressMode translateAddressingMode(TextureAddressingMode mode) +{ + switch (mode) + { + default: + return VkSamplerAddressMode(0); + +#define CASE(SRC, DST) \ + case 
TextureAddressingMode::SRC: return VK_SAMPLER_ADDRESS_MODE_##DST + + CASE(Wrap, REPEAT); + CASE(ClampToEdge, CLAMP_TO_EDGE); + CASE(ClampToBorder, CLAMP_TO_BORDER); + CASE(MirrorRepeat, MIRRORED_REPEAT); + CASE(MirrorOnce, MIRROR_CLAMP_TO_EDGE); + +#undef CASE + } +} + +static VkCompareOp translateComparisonFunc(ComparisonFunc func) +{ + switch (func) + { + default: + // TODO: need to report failures + return VK_COMPARE_OP_ALWAYS; + +#define CASE(FROM, TO) \ + case ComparisonFunc::FROM: return VK_COMPARE_OP_##TO + + CASE(Never, NEVER); + CASE(Less, LESS); + CASE(Equal, EQUAL); + CASE(LessEqual, LESS_OR_EQUAL); + CASE(Greater, GREATER); + CASE(NotEqual, NOT_EQUAL); + CASE(GreaterEqual, GREATER_OR_EQUAL); + CASE(Always, ALWAYS); +#undef CASE + } +} + +Result VKRenderer::createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) +{ + VkSamplerCreateInfo samplerInfo = { VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO }; + + samplerInfo.magFilter = translateFilterMode(desc.magFilter); + samplerInfo.minFilter = translateFilterMode(desc.minFilter); + + samplerInfo.addressModeU = translateAddressingMode(desc.addressU); + samplerInfo.addressModeV = translateAddressingMode(desc.addressV); + samplerInfo.addressModeW = translateAddressingMode(desc.addressW); + + samplerInfo.anisotropyEnable = desc.maxAnisotropy > 1; + samplerInfo.maxAnisotropy = (float) desc.maxAnisotropy; + + // TODO: support translation of border color...
+ samplerInfo.borderColor = VK_BORDER_COLOR_INT_OPAQUE_BLACK; + + samplerInfo.unnormalizedCoordinates = VK_FALSE; + samplerInfo.compareEnable = desc.reductionOp == TextureReductionOp::Comparison; + samplerInfo.compareOp = translateComparisonFunc(desc.comparisonFunc); + samplerInfo.mipmapMode = translateMipFilterMode(desc.mipFilter); + + VkSampler sampler; + SLANG_VK_RETURN_ON_FAIL(m_api.vkCreateSampler(m_device, &samplerInfo, nullptr, &sampler)); + + RefPtr<SamplerStateImpl> samplerImpl = new SamplerStateImpl(); + samplerImpl->m_sampler = sampler; + *outSampler = samplerImpl.detach(); + return SLANG_OK; +} + +Result VKRenderer::createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) +{ + assert(!"unimplemented"); + return SLANG_FAIL; +} + +Result VKRenderer::createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) +{ + auto resourceImpl = (BufferResourceImpl*) buffer; + + // TODO: These should come from the `ResourceView::Desc` + VkDeviceSize offset = 0; + VkDeviceSize size = resourceImpl->getDesc().sizeInBytes; + + // There are two different cases we need to think about for buffers. + // + // One is when we have a "uniform texel buffer" or "storage texel buffer," + // in which case we need to construct a `VkBufferView` to represent the + // formatting that is applied to the buffer. This case would correspond + // to a `textureBuffer` or `imageBuffer` in GLSL, and more or less to + // `Buffer<..>` or `RWBuffer<...>` in HLSL. + // + // The other case is a `storage buffer` which is the catch-all for any + // non-formatted R/W access to a buffer. In GLSL this is a `buffer { ... }` + // declaration, while in HLSL it covers a bunch of different `RW*Buffer` + // cases. In these cases we do *not* need a `VkBufferView`, but in + // order to be compatible with other APIs that require views for any + // potentially writable access, we will have to create one anyway. 
+ // + // We will distinguish the two cases by looking at whether the view + // is being requested with a format or not. + // + + switch(desc.type) + { + default: + assert(!"unhandled"); + return SLANG_FAIL; + + case ResourceView::Type::UnorderedAccess: + // Is this a formatted view? + // + if(desc.format == Format::Unknown) + { + // Buffer usage that doesn't involve formatting doesn't + // require a view in Vulkan. + RefPtr<PlainBufferResourceViewImpl> viewImpl = new PlainBufferResourceViewImpl(); + viewImpl->m_buffer = resourceImpl; + viewImpl->offset = 0; + viewImpl->size = size; + *outView = viewImpl.detach(); + return SLANG_OK; + } + // + // If the view is formatted, then we need to handle + // it just like we would for a "sampled" buffer: + // + // FALLTHROUGH + case ResourceView::Type::ShaderResource: + { + VkBufferViewCreateInfo info = { VK_STRUCTURE_TYPE_BUFFER_VIEW_CREATE_INFO }; + + info.format = VulkanUtil::getVkFormat(desc.format); + info.buffer = resourceImpl->m_buffer.m_buffer; + info.offset = offset; + info.range = size; + + VkBufferView view; + SLANG_VK_RETURN_ON_FAIL(m_api.vkCreateBufferView(m_device, &info, nullptr, &view)); + + RefPtr<TexelBufferResourceViewImpl> viewImpl = new TexelBufferResourceViewImpl(); + viewImpl->m_buffer = resourceImpl; + viewImpl->m_view = view; + *outView = viewImpl.detach(); + return SLANG_OK; + } + break; + } } -InputLayout* VKRenderer::createInputLayout(const InputElementDesc* elements, UInt numElements) +Result VKRenderer::createInputLayout(const InputElementDesc* elements, UInt numElements, InputLayout** outLayout) { RefPtr<InputLayoutImpl> layout(new InputLayoutImpl); @@ -1629,7 +1729,7 @@ InputLayout* VKRenderer::createInputLayout(const InputElementDesc* elements, UIn dstDesc.format = VulkanUtil::getVkFormat(srcDesc.format); if (dstDesc.format == VK_FORMAT_UNDEFINED) { - return nullptr; + return SLANG_FAIL; } dstDesc.offset = uint32_t(srcDesc.offset); @@ -1643,7 +1743,8 @@ InputLayout* 
VKRenderer::createInputLayout(const InputElementDesc* elements, UIn // Work out the overall size layout->m_vertexSize = int(vertexSize); - return layout.detach(); + *outLayout = layout.detach(); + return SLANG_OK; } void* VKRenderer::map(BufferResource* bufferIn, MapFlavor flavor) @@ -1738,11 +1839,6 @@ void VKRenderer::unmap(BufferResource* bufferIn) buffer->m_mapFlavor = MapFlavor::Unknown; } -void VKRenderer::setInputLayout(InputLayout* inputLayout) -{ - m_currentInputLayout = static_cast<InputLayoutImpl*>(inputLayout); -} - void VKRenderer::setPrimitiveTopology(PrimitiveTopology topology) { m_primitiveTopology = VulkanUtil::getVkPrimitiveTopology(topology); @@ -1773,14 +1869,22 @@ void VKRenderer::setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource } } -void VKRenderer::setShaderProgram(ShaderProgram* program) +void VKRenderer::setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) { - m_currentProgram = (ShaderProgramImpl*)program; +} + +void VKRenderer::setDepthStencilTarget(ResourceView* depthStencilView) +{ +} + +void VKRenderer::setPipelineState(PipelineType pipelineType, PipelineState* state) +{ + m_currentPipeline = (PipelineStateImpl*)state; } void VKRenderer::draw(UInt vertexCount, UInt startVertex = 0) { - Pipeline* pipeline = _getPipeline(); + auto pipeline = m_currentPipeline; if (!pipeline || pipeline->m_shaderProgram->m_pipelineType != PipelineType::Graphics) { assert(!"Invalid render pipeline"); @@ -1793,8 +1897,12 @@ void VKRenderer::draw(UInt vertexCount, UInt startVertex = 0) VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); m_api.vkCmdBindPipeline(commandBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline->m_pipeline); - m_api.vkCmdBindDescriptorSets(commandBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline->m_pipelineLayout, - 0, 1, &pipeline->m_descriptorSet, 0, nullptr); + + auto pipelineLayoutImpl = pipeline->m_pipelineLayout.Ptr(); + m_api.vkCmdBindDescriptorSets(commandBuffer, 
VK_PIPELINE_BIND_POINT_GRAPHICS, pipelineLayoutImpl->m_pipelineLayout, + 0, pipelineLayoutImpl->m_descriptorSetCount, + &m_currentDescriptorSets[0], + 0, nullptr); // Bind the vertex buffer if (m_boundVertexBuffers.Count() > 0 && m_boundVertexBuffers[0].m_buffer) @@ -1812,12 +1920,16 @@ void VKRenderer::draw(UInt vertexCount, UInt startVertex = 0) _endPass(); } +void VKRenderer::drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) +{ +} + void VKRenderer::dispatchCompute(int x, int y, int z) { - Pipeline* pipeline = _getPipeline(); + auto pipeline = m_currentPipeline; if (!pipeline || pipeline->m_shaderProgram->m_pipelineType != PipelineType::Compute) { - assert(!"Invalid render pipeline"); + assert(!"Invalid compute pipeline"); return; } @@ -1826,8 +1938,11 @@ void VKRenderer::dispatchCompute(int x, int y, int z) m_api.vkCmdBindPipeline(commandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, pipeline->m_pipeline); - m_api.vkCmdBindDescriptorSets(commandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, pipeline->m_pipelineLayout, - 0, 1, &pipeline->m_descriptorSet, 0, nullptr); + auto pipelineLayoutImpl = pipeline->m_pipelineLayout.Ptr(); + m_api.vkCmdBindDescriptorSets(commandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, pipelineLayoutImpl->m_pipelineLayout, + 0, pipelineLayoutImpl->m_descriptorSetCount, + &m_currentDescriptorSets[0], + 0, nullptr); m_api.vkCmdDispatch(commandBuffer, x, y, z); } @@ -1855,7 +1970,7 @@ static VkImageViewType _calcImageViewType(TextureResource::Type type, const Text return VK_IMAGE_VIEW_TYPE_MAX_ENUM; } - +#if 0 BindingState* VKRenderer::createBindingState(const BindingState::Desc& bindingStateDesc) { RefPtr<BindingStateImpl> bindingState(new BindingStateImpl(bindingStateDesc, &m_api)); @@ -1991,13 +2106,268 @@ BindingState* VKRenderer::createBindingState(const BindingState::Desc& bindingSt return bindingState.detach();; } +#endif + +static VkDescriptorType translateDescriptorType(DescriptorSlotType type) +{ + switch(type) + { + default: + return 
VK_DESCRIPTOR_TYPE_MAX_ENUM; + +#define CASE(SRC, DST) \ + case DescriptorSlotType::SRC: return VK_DESCRIPTOR_TYPE_##DST + + CASE(Sampler, SAMPLER); + CASE(CombinedImageSampler, COMBINED_IMAGE_SAMPLER); + CASE(SampledImage, SAMPLED_IMAGE); + CASE(StorageImage, STORAGE_IMAGE); + CASE(UniformTexelBuffer, UNIFORM_TEXEL_BUFFER); + CASE(StorageTexelBuffer, STORAGE_TEXEL_BUFFER); + CASE(UniformBuffer, UNIFORM_BUFFER); + CASE(StorageBuffer, STORAGE_BUFFER); + CASE(DynamicUniformBuffer, UNIFORM_BUFFER_DYNAMIC); + CASE(DynamicStorageBuffer, STORAGE_BUFFER_DYNAMIC); + CASE(InputAttachment, INPUT_ATTACHMENT); + +#undef CASE + } +} + +Result VKRenderer::createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) +{ + RefPtr<DescriptorSetLayoutImpl> descriptorSetLayoutImpl = new DescriptorSetLayoutImpl(m_api); + + Slang::List<VkDescriptorSetLayoutBinding> dstBindings; + + uint32_t descriptorCountForTypes[VK_DESCRIPTOR_TYPE_RANGE_SIZE] = { 0, }; + + UInt rangeCount = desc.slotRangeCount; + for(UInt rr = 0; rr < rangeCount; ++rr) + { + auto& srcRange = desc.slotRanges[rr]; + + VkDescriptorType dstDescriptorType = translateDescriptorType(srcRange.type); + + VkDescriptorSetLayoutBinding dstBinding; + dstBinding.binding = rr; + dstBinding.descriptorType = dstDescriptorType; + dstBinding.descriptorCount = srcRange.count; + dstBinding.stageFlags = VK_SHADER_STAGE_ALL; + dstBinding.pImmutableSamplers = nullptr; + + descriptorCountForTypes[dstDescriptorType] += srcRange.count; + + dstBindings.Add(dstBinding); + + DescriptorSetLayoutImpl::RangeInfo rangeInfo; + rangeInfo.descriptorType = dstDescriptorType; + descriptorSetLayoutImpl->m_ranges.Add(rangeInfo); + } + + VkDescriptorSetLayoutCreateInfo descriptorSetLayoutInfo = { VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO }; + descriptorSetLayoutInfo.bindingCount = uint32_t(dstBindings.Count()); + descriptorSetLayoutInfo.pBindings = dstBindings.Buffer(); + + VkDescriptorSetLayout 
descriptorSetLayout = VK_NULL_HANDLE; + SLANG_VK_CHECK(m_api.vkCreateDescriptorSetLayout(m_device, &descriptorSetLayoutInfo, nullptr, &descriptorSetLayout)); + + // Create a pool while we are at it, to allocate descriptor sets of this type. + + VkDescriptorPoolSize poolSizes[VK_DESCRIPTOR_TYPE_RANGE_SIZE]; + uint32_t poolSizeCount = 0; + for (int ii = 0; ii < SLANG_COUNT_OF(descriptorCountForTypes); ++ii) + { + auto descriptorCount = descriptorCountForTypes[ii]; + if (descriptorCount > 0) + { + poolSizes[poolSizeCount].type = VkDescriptorType(ii); + poolSizes[poolSizeCount].descriptorCount = descriptorCount; + poolSizeCount++; + } + } + + VkDescriptorPoolCreateInfo descriptorPoolInfo = { VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO }; + descriptorPoolInfo.maxSets = 128; // TODO: actually pick a size. + descriptorPoolInfo.poolSizeCount = poolSizeCount; + descriptorPoolInfo.pPoolSizes = &poolSizes[0]; + + VkDescriptorPool descriptorPool = VK_NULL_HANDLE; + SLANG_VK_CHECK(m_api.vkCreateDescriptorPool(m_device, &descriptorPoolInfo, nullptr, &descriptorPool)); + + descriptorSetLayoutImpl->m_descriptorSetLayout = descriptorSetLayout; + descriptorSetLayoutImpl->m_descriptorPool = descriptorPool; + + *outLayout = descriptorSetLayoutImpl.detach(); + return SLANG_OK; +} + +Result VKRenderer::createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) +{ + UInt descriptorSetCount = desc.descriptorSetCount; + + VkDescriptorSetLayout descriptorSetLayouts[kMaxDescriptorSets]; + for(UInt ii = 0; ii < descriptorSetCount; ++ii) + { + descriptorSetLayouts[ii] = ((DescriptorSetLayoutImpl*) desc.descriptorSets[ii].layout)->m_descriptorSetLayout; + } + + VkPipelineLayoutCreateInfo pipelineLayoutInfo = { VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO }; + pipelineLayoutInfo.setLayoutCount = desc.descriptorSetCount; + pipelineLayoutInfo.pSetLayouts = &descriptorSetLayouts[0]; + + VkPipelineLayout pipelineLayout; + 
SLANG_VK_CHECK(m_api.vkCreatePipelineLayout(m_device, &pipelineLayoutInfo, nullptr, &pipelineLayout)); + + RefPtr<PipelineLayoutImpl> pipelineLayoutImpl = new PipelineLayoutImpl(m_api); + pipelineLayoutImpl->m_pipelineLayout = pipelineLayout; + pipelineLayoutImpl->m_descriptorSetCount = descriptorSetCount; + + *outLayout = pipelineLayoutImpl.detach(); + return SLANG_OK; +} + +Result VKRenderer::createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) +{ + auto layoutImpl = (DescriptorSetLayoutImpl*)layout; + + VkDescriptorSetAllocateInfo descriptorSetAllocInfo = { VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO }; + descriptorSetAllocInfo.descriptorPool = layoutImpl->m_descriptorPool; + descriptorSetAllocInfo.descriptorSetCount = 1; + descriptorSetAllocInfo.pSetLayouts = &layoutImpl->m_descriptorSetLayout; + + VkDescriptorSet descriptorSet; + SLANG_VK_CHECK(m_api.vkAllocateDescriptorSets(m_device, &descriptorSetAllocInfo, &descriptorSet)); + + RefPtr<DescriptorSetImpl> descriptorSetImpl = new DescriptorSetImpl(this); + descriptorSetImpl->m_layout = layoutImpl; + descriptorSetImpl->m_descriptorSet = descriptorSet; + *outDescriptorSet = descriptorSetImpl.detach(); + return SLANG_OK; +} + +void VKRenderer::DescriptorSetImpl::setConstantBuffer(UInt range, UInt index, BufferResource* buffer) +{ + auto bufferImpl = (BufferResourceImpl*)buffer; + + VkDescriptorBufferInfo bufferInfo = {}; + bufferInfo.buffer = bufferImpl->m_buffer.m_buffer; + bufferInfo.offset = 0; + bufferInfo.range = bufferImpl->getDesc().sizeInBytes; + + VkWriteDescriptorSet writeInfo = { VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET }; + writeInfo.dstSet = m_descriptorSet; + writeInfo.dstBinding = range; + writeInfo.dstArrayElement = index; + writeInfo.descriptorCount = 1; + writeInfo.descriptorType = m_layout->m_ranges[range].descriptorType; + writeInfo.pBufferInfo = &bufferInfo; + + m_renderer->m_api.vkUpdateDescriptorSets(m_renderer->m_device, 1, &writeInfo, 0, nullptr); +} + +void 
VKRenderer::DescriptorSetImpl::setResource(UInt range, UInt index, ResourceView* view) +{ + auto viewImpl = (ResourceViewImpl*)view; + switch (viewImpl->m_type) + { + case ResourceViewImpl::ViewType::Texture: + { + auto textureViewImpl = (TextureResourceViewImpl*)viewImpl; + VkDescriptorImageInfo imageInfo = {}; + imageInfo.imageView = textureViewImpl->m_view; + imageInfo.imageLayout = textureViewImpl->m_layout; + // imageInfo.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL; + + VkWriteDescriptorSet writeInfo = { VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET }; + writeInfo.dstSet = m_descriptorSet; + writeInfo.dstBinding = range; + writeInfo.dstArrayElement = index; + writeInfo.descriptorCount = 1; + writeInfo.descriptorType = m_layout->m_ranges[range].descriptorType; + writeInfo.pImageInfo = &imageInfo; + + m_renderer->m_api.vkUpdateDescriptorSets(m_renderer->m_device, 1, &writeInfo, 0, nullptr); + } + break; + + case ResourceViewImpl::ViewType::TexelBuffer: + { + auto bufferViewImpl = (TexelBufferResourceViewImpl*)viewImpl; + + VkWriteDescriptorSet writeInfo = { VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET }; + writeInfo.dstSet = m_descriptorSet; + writeInfo.dstBinding = range; + writeInfo.dstArrayElement = index; + writeInfo.descriptorCount = 1; + writeInfo.descriptorType = m_layout->m_ranges[range].descriptorType; + writeInfo.pTexelBufferView = &bufferViewImpl->m_view; + + m_renderer->m_api.vkUpdateDescriptorSets(m_renderer->m_device, 1, &writeInfo, 0, nullptr); + } + break; + + case ResourceViewImpl::ViewType::PlainBuffer: + { + auto bufferViewImpl = (PlainBufferResourceViewImpl*) viewImpl; + + VkDescriptorBufferInfo bufferInfo = {}; + bufferInfo.buffer = bufferViewImpl->m_buffer->m_buffer.m_buffer; + bufferInfo.offset = bufferViewImpl->offset; + bufferInfo.range = bufferViewImpl->size; + + VkWriteDescriptorSet writeInfo = { VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET }; + writeInfo.dstSet = m_descriptorSet; + writeInfo.dstBinding = range; + writeInfo.dstArrayElement = 
index; + writeInfo.descriptorCount = 1; + writeInfo.descriptorType = m_layout->m_ranges[range].descriptorType; + writeInfo.pBufferInfo = &bufferInfo; + + m_renderer->m_api.vkUpdateDescriptorSets(m_renderer->m_device, 1, &writeInfo, 0, nullptr); + } + break; + + } +} + +void VKRenderer::DescriptorSetImpl::setSampler(UInt range, UInt index, SamplerState* sampler) +{ +} + +void VKRenderer::DescriptorSetImpl::setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) +{ +} -void VKRenderer::setBindingState(BindingState* state) +void VKRenderer::setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) { - m_currentBindingState = static_cast<BindingStateImpl*>(state); + // Ideally this should eventually be as simple as: + // + // m_api.vkCmdBindDescriptorSets( + // commandBuffer, + // translatePipelineBindPoint(pipelineType), + // layout->m_pipelineLayout, + // index, + // 1, + // ((DescriptorSetImpl*) descriptorSet)->m_descriptorSet, + // 0, + // nullptr); + // + // For now we are lazily flushing state right before drawing, so + // we will hang onto the parameters that were passed in and then + // use them later. 
+ // + + auto descriptorSetImpl = (DescriptorSetImpl*)descriptorSet; + m_currentDescriptorSetImpls[index] = descriptorSetImpl; + m_currentDescriptorSets[index] = descriptorSetImpl->m_descriptorSet; } -ShaderProgram* VKRenderer::createProgram(const ShaderProgram::Desc& desc) +Result VKRenderer::createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) { ShaderProgramImpl* impl = new ShaderProgramImpl(desc.pipelineType); if( desc.pipelineType == PipelineType::Compute) @@ -2013,7 +2383,187 @@ ShaderProgram* VKRenderer::createProgram(const ShaderProgram::Desc& desc) impl->m_vertex = compileEntryPoint(*vertexKernel, VK_SHADER_STAGE_VERTEX_BIT, impl->m_buffers[0]); impl->m_fragment = compileEntryPoint(*fragmentKernel, VK_SHADER_STAGE_FRAGMENT_BIT, impl->m_buffers[1]); } - return impl; + *outProgram = impl; + return SLANG_OK; +} + +Result VKRenderer::createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) +{ + VkPipelineCache pipelineCache = VK_NULL_HANDLE; + + auto programImpl = (ShaderProgramImpl*) desc.program; + auto pipelineLayoutImpl = (PipelineLayoutImpl*) desc.pipelineLayout; + auto inputLayoutImpl = (InputLayoutImpl*) desc.inputLayout; + + int width = desc.framebufferWidth; + int height = desc.framebufferHeight; + + // Shader Stages + // + // Currently only handles vertex/fragment. 
+ + static const uint32_t kMaxShaderStages = 2; + VkPipelineShaderStageCreateInfo shaderStages[kMaxShaderStages]; + + uint32_t shaderStageCount = 0; + shaderStages[shaderStageCount++] = programImpl->m_vertex; + shaderStages[shaderStageCount++] = programImpl->m_fragment; + + // VertexBuffer/s + // Currently only handles one + + VkPipelineVertexInputStateCreateInfo vertexInputInfo = { VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO }; + vertexInputInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO; + vertexInputInfo.vertexBindingDescriptionCount = 0; + vertexInputInfo.vertexAttributeDescriptionCount = 0; + + VkVertexInputBindingDescription vertexInputBindingDescription; + + if (inputLayoutImpl) + { + vertexInputBindingDescription.binding = 0; + vertexInputBindingDescription.stride = inputLayoutImpl->m_vertexSize; + vertexInputBindingDescription.inputRate = VK_VERTEX_INPUT_RATE_VERTEX; + + const auto& srcAttributeDescs = inputLayoutImpl->m_vertexDescs; + + vertexInputInfo.vertexBindingDescriptionCount = 1; + vertexInputInfo.pVertexBindingDescriptions = &vertexInputBindingDescription; + + vertexInputInfo.vertexAttributeDescriptionCount = static_cast<uint32_t>(srcAttributeDescs.Count()); + vertexInputInfo.pVertexAttributeDescriptions = srcAttributeDescs.Buffer(); + } + + VkPipelineInputAssemblyStateCreateInfo inputAssembly = {}; + inputAssembly.sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO; + inputAssembly.topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST; + inputAssembly.primitiveRestartEnable = VK_FALSE; + + VkViewport viewport = {}; + viewport.x = 0.0f; + viewport.y = 0.0f; + viewport.width = (float)width; + viewport.height = (float)height; + viewport.minDepth = 0.0f; + viewport.maxDepth = 1.0f; + + VkRect2D scissor = {}; + scissor.offset = { 0, 0 }; + scissor.extent = { uint32_t(width), uint32_t(height) }; + + VkPipelineViewportStateCreateInfo viewportState = {}; + viewportState.sType = 
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO; + viewportState.viewportCount = 1; + viewportState.pViewports = &viewport; + viewportState.scissorCount = 1; + viewportState.pScissors = &scissor; + + VkPipelineRasterizationStateCreateInfo rasterizer = {}; + rasterizer.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO; + rasterizer.depthClampEnable = VK_FALSE; + rasterizer.rasterizerDiscardEnable = VK_FALSE; + rasterizer.polygonMode = VK_POLYGON_MODE_FILL; + rasterizer.lineWidth = 1.0f; + rasterizer.cullMode = VK_CULL_MODE_NONE; + rasterizer.frontFace = VK_FRONT_FACE_CLOCKWISE; + rasterizer.depthBiasEnable = VK_FALSE; + + VkPipelineMultisampleStateCreateInfo multisampling = {}; + multisampling.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO; + multisampling.sampleShadingEnable = VK_FALSE; + multisampling.rasterizationSamples = VK_SAMPLE_COUNT_1_BIT; + + VkPipelineColorBlendAttachmentState colorBlendAttachment = {}; + colorBlendAttachment.colorWriteMask = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT | VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT; + colorBlendAttachment.blendEnable = VK_FALSE; + + VkPipelineColorBlendStateCreateInfo colorBlending = {}; + colorBlending.sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO; + colorBlending.logicOpEnable = VK_FALSE; + colorBlending.logicOp = VK_LOGIC_OP_COPY; + colorBlending.attachmentCount = 1; + colorBlending.pAttachments = &colorBlendAttachment; + colorBlending.blendConstants[0] = 0.0f; + colorBlending.blendConstants[1] = 0.0f; + colorBlending.blendConstants[2] = 0.0f; + colorBlending.blendConstants[3] = 0.0f; + + VkGraphicsPipelineCreateInfo pipelineInfo = { VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO }; + + pipelineInfo.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO; + pipelineInfo.stageCount = 2; + pipelineInfo.pStages = shaderStages; + pipelineInfo.pVertexInputState = &vertexInputInfo; + pipelineInfo.pInputAssemblyState = &inputAssembly; 
+ pipelineInfo.pViewportState = &viewportState; + pipelineInfo.pRasterizationState = &rasterizer; + pipelineInfo.pMultisampleState = &multisampling; + pipelineInfo.pColorBlendState = &colorBlending; + pipelineInfo.layout = pipelineLayoutImpl->m_pipelineLayout; + pipelineInfo.renderPass = m_renderPass; + pipelineInfo.subpass = 0; + pipelineInfo.basePipelineHandle = VK_NULL_HANDLE; + + VkPipeline pipeline = VK_NULL_HANDLE; + SLANG_VK_CHECK(m_api.vkCreateGraphicsPipelines(m_device, pipelineCache, 1, &pipelineInfo, nullptr, &pipeline)); + + RefPtr<PipelineStateImpl> pipelineStateImpl; + pipelineStateImpl->m_pipeline = pipeline; + pipelineStateImpl->m_pipelineLayout = pipelineLayoutImpl; + pipelineStateImpl->m_shaderProgram = programImpl; + *outState = pipelineStateImpl.detach(); + return SLANG_OK; +} + +Result VKRenderer::createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) +{ + VkPipelineCache pipelineCache = VK_NULL_HANDLE; + + auto programImpl = (ShaderProgramImpl*) desc.program; + auto pipelineLayoutImpl = (PipelineLayoutImpl*) desc.pipelineLayout; + + VkComputePipelineCreateInfo computePipelineInfo = { VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO }; + computePipelineInfo.stage = programImpl->m_compute; + computePipelineInfo.layout = pipelineLayoutImpl->m_pipelineLayout; + + VkPipeline pipeline = VK_NULL_HANDLE; + SLANG_VK_CHECK(m_api.vkCreateComputePipelines(m_device, pipelineCache, 1, &computePipelineInfo, nullptr, &pipeline)); + + RefPtr<PipelineStateImpl> pipelineStateImpl = new PipelineStateImpl(m_api); + pipelineStateImpl->m_pipeline = pipeline; + pipelineStateImpl->m_pipelineLayout = pipelineLayoutImpl; + pipelineStateImpl->m_shaderProgram = programImpl; + *outState = pipelineStateImpl.detach(); + return SLANG_OK; } + +#if 0 + else if (m_currentProgram->m_pipelineType == PipelineType::Graphics) + { + // Create the graphics pipeline + + const int width = m_swapChain.getWidth(); + const int height = 
m_swapChain.getHeight(); + + + + + + // + + + } + else + { + assert(!"Unhandled program type"); + return SLANG_FAIL; + } + + pipelineOut = pipeline; + return SLANG_OK; + + +#endif + } // renderer_test diff --git a/tools/slang-graphics/render-vk.h b/tools/gfx/render-vk.h index 720f35a2c..14a8e403a 100644 --- a/tools/slang-graphics/render-vk.h +++ b/tools/gfx/render-vk.h @@ -1,10 +1,10 @@ // render-vk.h #pragma once -namespace slang_graphics { +namespace gfx { class Renderer; Renderer* createVKRenderer(); -} // slang_graphics +} // gfx diff --git a/tools/slang-graphics/render.cpp b/tools/gfx/render.cpp index 3595f73c1..8f887b491 100644 --- a/tools/slang-graphics/render.cpp +++ b/tools/gfx/render.cpp @@ -3,7 +3,7 @@ #include "../../source/core/slang-math.h" -namespace slang_graphics { +namespace gfx { using namespace Slang; /* static */const Resource::BindFlag::Enum Resource::s_requiredBinding[] = @@ -77,7 +77,7 @@ const Resource::DescBase& Resource::getDescBase() const } /* !!!!!!!!!!!!!!!!!!!!!!!!!!! BindingState::Desc !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! */ - +#if 0 void BindingState::Desc::addSampler(const SamplerDesc& desc, const RegisterRange& registerRange) { int descIndex = int(m_samplerDescs.Count()); @@ -143,6 +143,7 @@ int BindingState::Desc::findBindingIndex(Resource::BindFlag::Enum bindFlag, int return -1; } +#endif /* !!!!!!!!!!!!!!!!!!!!!!!!!!! TextureResource::Size !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
*/ @@ -290,7 +291,7 @@ void TextureResource::Desc::init1D(Format formatIn, int widthIn, int numMipMapsI this->type = Type::Texture1D; this->size.init(widthIn); - this->format = format; + this->format = formatIn; this->arraySize = 0; this->numMipLevels = numMipMapsIn; this->sampleDesc.init(); @@ -303,10 +304,10 @@ void TextureResource::Desc::init2D(Type typeIn, Format formatIn, int widthIn, in { assert(typeIn == Type::Texture2D || typeIn == Type::TextureCube); - this->type = type; + this->type = typeIn; this->size.init(widthIn, heightIn); - this->format = format; + this->format = formatIn; this->arraySize = 0; this->numMipLevels = numMipMapsIn; this->sampleDesc.init(); @@ -320,7 +321,7 @@ void TextureResource::Desc::init3D(Format formatIn, int widthIn, int heightIn, i this->type = Type::Texture3D; this->size.init(widthIn, heightIn, depthIn); - this->format = format; + this->format = formatIn; this->arraySize = 0; this->numMipLevels = numMipMapsIn; this->sampleDesc.init(); diff --git a/tools/slang-graphics/render.h b/tools/gfx/render.h index 92e9c0930..b43152b68 100644 --- a/tools/slang-graphics/render.h +++ b/tools/gfx/render.h @@ -1,4 +1,4 @@ -// render.h +// render.h #pragma once #include "window.h" @@ -9,8 +9,18 @@ #include "../../source/core/smart-pointer.h" #include "../../source/core/list.h" +#include "../../source/core/dictionary.h" -namespace slang_graphics { +namespace gfx { + +using Slang::RefObject; +using Slang::RefPtr; +using Slang::Dictionary; +using Slang::GetHashCode; +using Slang::combineHash; +using Slang::List; + +typedef SlangResult Result; // Had to move here, because Options needs types defined here typedef intptr_t Int; @@ -413,90 +423,251 @@ class TextureResource: public Resource Desc m_desc; }; -enum class BindingType +enum class ComparisonFunc : uint8_t +{ + Never = 0, + Less = 0x01, + Equal = 0x02, + LessEqual = 0x03, + Greater = 0x04, + NotEqual = 0x05, + GreaterEqual = 0x06, + Always = 0x07, +}; + +enum class TextureFilteringMode +{ + 
Point, + Linear, +}; + +enum class TextureAddressingMode +{ + Wrap, + ClampToEdge, + ClampToBorder, + MirrorRepeat, + MirrorOnce, +}; + +enum class TextureReductionOp +{ + Average, + Comparison, + Minimum, + Maximum, +}; + +class SamplerState : public Slang::RefObject +{ +public: + struct Desc + { + TextureFilteringMode minFilter = TextureFilteringMode::Linear; + TextureFilteringMode magFilter = TextureFilteringMode::Linear; + TextureFilteringMode mipFilter = TextureFilteringMode::Linear; + TextureReductionOp reductionOp = TextureReductionOp::Average; + TextureAddressingMode addressU = TextureAddressingMode::Wrap; + TextureAddressingMode addressV = TextureAddressingMode::Wrap; + TextureAddressingMode addressW = TextureAddressingMode::Wrap; + float mipLODBias = 0.0f; + uint32_t maxAnisotropy = 1; + ComparisonFunc comparisonFunc = ComparisonFunc::Never; + float borderColor[4] = { 1.0f, 1.0f, 1.0f, 1.0f }; + float minLOD = -FLT_MAX; + float maxLOD = FLT_MAX; + }; +}; + +enum class DescriptorSlotType { Unknown, + Sampler, - Buffer, - Texture, - CombinedTextureSampler, - CountOf, + CombinedImageSampler, + SampledImage, + StorageImage, + UniformTexelBuffer, + StorageTexelBuffer, + UniformBuffer, + StorageBuffer, + DynamicUniformBuffer, + DynamicStorageBuffer, + InputAttachment, }; -class BindingState : public Slang::RefObject +class DescriptorSetLayout : public Slang::RefObject { public: - /// A register set consists of one or more contiguous indices. - /// To be valid index >= 0 and size >= 1 - struct RegisterRange - { - /// True if contains valid contents - bool isValid() const { return size > 0; } - /// True if valid single value - bool isSingle() const { return size == 1; } - /// Get as a single index (must be at least one index) - int getSingleIndex() const { return (size == 1) ? index : -1; } - /// Return the first index - int getFirstIndex() const { return (size > 0) ? 
index : -1; } - /// True if contains register index - bool hasRegister(int registerIndex) const { return registerIndex >= index && registerIndex < index + size; } - - static RegisterRange makeInvalid() { return RegisterRange{ -1, 0 }; } - static RegisterRange makeSingle(int index) { return RegisterRange{ int16_t(index), 1 }; } - static RegisterRange makeRange(int index, int size) { return RegisterRange{ int16_t(index), uint16_t(size) }; } - - int16_t index; ///< The base index - uint16_t size; ///< The amount of register indices + struct SlotRangeDesc + { + DescriptorSlotType type = DescriptorSlotType::Unknown; + UInt count = 1; + + SlotRangeDesc() + {} + + SlotRangeDesc( + DescriptorSlotType type, + UInt count = 1) + : type(type) + , count(count) + {} }; - struct SamplerDesc + struct Desc { - bool isCompareSampler; + UInt slotRangeCount = 0; + SlotRangeDesc const* slotRanges = nullptr; }; +}; - struct Binding +class PipelineLayout : public Slang::RefObject +{ +public: + struct DescriptorSetDesc { - BindingType bindingType; ///< Type of binding - int descIndex; ///< The description index associated with type. -1 if not used. For example if bindingType is Sampler, the descIndex is into m_samplerDescs. - Slang::RefPtr<Resource> resource; ///< Associated resource. 
nullptr if not used - RegisterRange registerRange; /// Defines the registers for binding + DescriptorSetLayout* layout = nullptr; + + DescriptorSetDesc() + {} + + DescriptorSetDesc( + DescriptorSetLayout* layout) + : layout(layout) + {} }; struct Desc { - /// Add a resource - assumed that the binding will match the Desc of the resource - void addResource(BindingType bindingType, Resource* resource, const RegisterRange& registerRange); - /// Add a sampler - void addSampler(const SamplerDesc& desc, const RegisterRange& registerRange); - /// Add a BufferResource - void addBufferResource(BufferResource* resource, const RegisterRange& registerRange) { addResource(BindingType::Buffer, resource, registerRange); } - /// Add a texture - void addTextureResource(TextureResource* resource, const RegisterRange& registerRange) { addResource(BindingType::Texture, resource, registerRange); } - /// Add combined texture a - void addCombinedTextureSampler(TextureResource* resource, const SamplerDesc& samplerDesc, const RegisterRange& registerRange); - - /// Returns the bind index, that has the bind flag, and indexes the specified register - int findBindingIndex(Resource::BindFlag::Enum bindFlag, int registerIndex) const; + UInt renderTargetCount = 0; + UInt descriptorSetCount = 0; + DescriptorSetDesc const* descriptorSets = nullptr; + }; +}; - /// Clear the contents - void clear(); +class ResourceView : public Slang::RefObject +{ +public: + enum class Type + { + Unknown, - Slang::List<Binding> m_bindings; ///< All of the bindings in order - Slang::List<SamplerDesc> m_samplerDescs; ///< Holds the SamplerDesc for the binding - indexed by the descIndex member of Binding + RenderTarget, + DepthStencil, + ShaderResource, + UnorderedAccess, + }; - int m_numRenderTargets = 1; + struct Desc + { + Type type; + Format format; }; +}; - /// Get the Desc used to create this binding - SLANG_FORCE_INLINE const Desc& getDesc() const { return m_desc; } +class DescriptorSet : public Slang::RefObject 
+{ +public: + virtual void setConstantBuffer(UInt range, UInt index, BufferResource* buffer) = 0; + virtual void setResource(UInt range, UInt index, ResourceView* view) = 0; + virtual void setSampler(UInt range, UInt index, SamplerState* sampler) = 0; + virtual void setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) = 0; +}; - protected: - BindingState(const Desc& desc): - m_desc(desc) - { - } +enum class StencilOp : uint8_t +{ + Keep, + Zero, + Replace, + IncrementSaturate, + DecrementSaturate, + Invert, + IncrementWrap, + DecrementWrap, +}; - Desc m_desc; +enum class FillMode : uint8_t +{ + Solid, + Wireframe, +}; + +enum class CullMode : uint8_t +{ + None, + Front, + Back, +}; + +enum class FrontFaceMode : uint8_t +{ + CounterClockwise, + Clockwise, +}; + +struct DepthStencilOpDesc +{ + StencilOp stencilFailOp = StencilOp::Keep; + StencilOp stencilDepthFailOp = StencilOp::Keep; + StencilOp stencilPassOp = StencilOp::Keep; + ComparisonFunc stencilFunc = ComparisonFunc::Always; +}; + +struct DepthStencilDesc +{ + bool depthTestEnable = true; + bool depthWriteEnable = true; + ComparisonFunc depthFunc = ComparisonFunc::Less; + + bool stencilEnable = false; + uint32_t stencilReadMask = 0xFFFFFFFF; + uint32_t stencilWriteMask = 0xFFFFFFFF; + DepthStencilOpDesc frontFace; + DepthStencilOpDesc backFace; + + uint32_t stencilRef = 0; +}; + +struct RasterizerDesc +{ + FillMode fillMode = FillMode::Solid; + CullMode cullMode = CullMode::Back; + FrontFaceMode frontFace = FrontFaceMode::CounterClockwise; + int32_t depthBias = 0; + float depthBiasClamp = 0.0f; + float slopeScaledDepthBias = 0.0f; + bool depthClipEnable = true; + bool scissorEnable = false; + bool multisampleEnable = false; + bool antialiasedLineEnable = false; +}; + +struct GraphicsPipelineStateDesc +{ + ShaderProgram* program; + PipelineLayout* pipelineLayout; + InputLayout* inputLayout; + UInt framebufferWidth; + UInt framebufferHeight; + UInt 
renderTargetCount; + DepthStencilDesc depthStencil; + RasterizerDesc rasterizer; +}; + +struct ComputePipelineStateDesc +{ + ShaderProgram* program; + PipelineLayout* pipelineLayout; +}; + +class PipelineState : public Slang::RefObject +{ +public: }; class Renderer: public Slang::RefObject @@ -516,32 +687,147 @@ public: virtual void presentFrame() = 0; + virtual TextureResource::Desc getSwapChainTextureDesc() = 0; + /// Create a texture resource. initData holds the initialize data to set the contents of the texture when constructed. - virtual TextureResource* createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData = nullptr) { return nullptr; } + virtual Result createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData, TextureResource** outResource) = 0; + + /// Create a texture resource. initData holds the initialize data to set the contents of the texture when constructed. + inline RefPtr<TextureResource> createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData = nullptr) + { + RefPtr<TextureResource> resource; + SLANG_RETURN_NULL_ON_FAIL(createTextureResource(initialUsage, desc, initData, resource.writeRef())); + return resource; + } + /// Create a buffer resource - virtual BufferResource* createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& desc, const void* initData = nullptr) { return nullptr; } + virtual Result createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& desc, const void* initData, BufferResource** outResource) = 0; - /// Captures the back buffer and stores the result in surfaceOut. If the surface contains data - it will either be overwritten (if same size and format), or freed and a re-allocated. 
- virtual SlangResult captureScreenSurface(Surface& surfaceOut) = 0; + inline RefPtr<BufferResource> createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& desc, const void* initData = nullptr) + { + RefPtr<BufferResource> resource; + SLANG_RETURN_NULL_ON_FAIL(createBufferResource(initialUsage, desc, initData, resource.writeRef())); + return resource; + } - virtual InputLayout* createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount) = 0; - virtual BindingState* createBindingState(const BindingState::Desc& desc) { return nullptr; } + virtual Result createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) = 0; - virtual ShaderProgram* createProgram(const ShaderProgram::Desc& desc) = 0; + inline RefPtr<SamplerState> createSamplerState(SamplerState::Desc const& desc) + { + RefPtr<SamplerState> sampler; + SLANG_RETURN_NULL_ON_FAIL(createSamplerState(desc, sampler.writeRef())); + return sampler; + } + + virtual Result createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) = 0; + + inline RefPtr<ResourceView> createTextureView(TextureResource* texture, ResourceView::Desc const& desc) + { + RefPtr<ResourceView> view; + SLANG_RETURN_NULL_ON_FAIL(createTextureView(texture, desc, view.writeRef())); + return view; + } + + virtual Result createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) = 0; + + inline RefPtr<ResourceView> createBufferView(BufferResource* buffer, ResourceView::Desc const& desc) + { + RefPtr<ResourceView> view; + SLANG_RETURN_NULL_ON_FAIL(createBufferView(buffer, desc, view.writeRef())); + return view; + } + + virtual Result createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount, InputLayout** outLayout) = 0; + + inline RefPtr<InputLayout> createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount) + { + RefPtr<InputLayout> layout; + 
SLANG_RETURN_NULL_ON_FAIL(createInputLayout(inputElements, inputElementCount, layout.writeRef())); + return layout; + } + + virtual Result createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) = 0; + + inline RefPtr<DescriptorSetLayout> createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc) + { + RefPtr<DescriptorSetLayout> layout; + SLANG_RETURN_NULL_ON_FAIL(createDescriptorSetLayout(desc, layout.writeRef())); + return layout; + } + + virtual Result createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) = 0; + + inline RefPtr<PipelineLayout> createPipelineLayout(const PipelineLayout::Desc& desc) + { + RefPtr<PipelineLayout> layout; + SLANG_RETURN_NULL_ON_FAIL(createPipelineLayout(desc, layout.writeRef())); + return layout; + } + + virtual Result createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) = 0; + + inline RefPtr<DescriptorSet> createDescriptorSet(DescriptorSetLayout* layout) + { + RefPtr<DescriptorSet> descriptorSet; + SLANG_RETURN_NULL_ON_FAIL(createDescriptorSet(layout, descriptorSet.writeRef())); + return descriptorSet; + } + + virtual Result createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) = 0; + + inline RefPtr<ShaderProgram> createProgram(const ShaderProgram::Desc& desc) + { + RefPtr<ShaderProgram> program; + SLANG_RETURN_NULL_ON_FAIL(createProgram(desc, program.writeRef())); + return program; + } + + virtual Result createGraphicsPipelineState( + const GraphicsPipelineStateDesc& desc, + PipelineState** outState) = 0; + + inline RefPtr<PipelineState> createGraphicsPipelineState( + const GraphicsPipelineStateDesc& desc) + { + RefPtr<PipelineState> state; + SLANG_RETURN_NULL_ON_FAIL(createGraphicsPipelineState(desc, state.writeRef())); + return state; + } + + virtual Result createComputePipelineState( + const ComputePipelineStateDesc& desc, + PipelineState** outState) = 0; + + inline RefPtr<PipelineState> 
createComputePipelineState( + const ComputePipelineStateDesc& desc) + { + RefPtr<PipelineState> state; + SLANG_RETURN_NULL_ON_FAIL(createComputePipelineState(desc, state.writeRef())); + return state; + } + + /// Captures the back buffer and stores the result in surfaceOut. If the surface contains data - it will either be overwritten (if same size and format), or freed and a re-allocated. + virtual SlangResult captureScreenSurface(Surface& surfaceOut) = 0; virtual void* map(BufferResource* buffer, MapFlavor flavor) = 0; virtual void unmap(BufferResource* buffer) = 0; - virtual void setInputLayout(InputLayout* inputLayout) = 0; virtual void setPrimitiveTopology(PrimitiveTopology topology) = 0; - virtual void setBindingState(BindingState* state) = 0; - virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) = 0; + virtual void setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) = 0; + + virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) = 0; inline void setVertexBuffer(UInt slot, BufferResource* buffer, UInt stride, UInt offset = 0); - virtual void setShaderProgram(ShaderProgram* program) = 0; + virtual void setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset = 0) = 0; + + virtual void setDepthStencilTarget(ResourceView* depthStencilView) = 0; + + virtual void setPipelineState(PipelineType pipelineType, PipelineState* state) = 0; virtual void draw(UInt vertexCount, UInt startVertex = 0) = 0; + virtual void drawIndexed(UInt indexCount, UInt startIndex = 0, UInt baseVertex = 0) = 0; + virtual void dispatchCompute(int x, int y, int z) = 0; /// Commit any buffered state changes or draw calls. 
diff --git a/tools/slang-graphics/resource-d3d12.cpp b/tools/gfx/resource-d3d12.cpp index bb39d2529..2e0f78371 100644 --- a/tools/slang-graphics/resource-d3d12.cpp +++ b/tools/gfx/resource-d3d12.cpp @@ -1,7 +1,7 @@ // resource-d3d12.cpp #include "resource-d3d12.h" -namespace slang_graphics { +namespace gfx { using namespace Slang; /* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! D3D12BarrierSubmitter !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! */ diff --git a/tools/slang-graphics/resource-d3d12.h b/tools/gfx/resource-d3d12.h index 6040291cc..1764adf9d 100644 --- a/tools/slang-graphics/resource-d3d12.h +++ b/tools/gfx/resource-d3d12.h @@ -13,7 +13,7 @@ #include "../../slang-com-ptr.h" #include "d3d-util.h" -namespace slang_graphics { +namespace gfx { // Enables more conservative barriers - restoring the state of resources after they are used. // Should not need to be enabled in normal builds, as the barriers should correctly sync resources diff --git a/tools/slang-graphics/surface.cpp b/tools/gfx/surface.cpp index 9d91f8778..4b53d278a 100644 --- a/tools/slang-graphics/surface.cpp +++ b/tools/gfx/surface.cpp @@ -6,7 +6,7 @@ #include "../../source/core/list.h" -namespace slang_graphics { +namespace gfx { using namespace Slang; class MallocSurfaceAllocator: public SurfaceAllocator diff --git a/tools/slang-graphics/surface.h b/tools/gfx/surface.h index 026ba25ed..3e0f6f0aa 100644 --- a/tools/slang-graphics/surface.h +++ b/tools/gfx/surface.h @@ -3,7 +3,7 @@ #include "render.h" -namespace slang_graphics { +namespace gfx { class Surface; diff --git a/tools/gfx/vector-math.h b/tools/gfx/vector-math.h new file mode 100644 index 000000000..88cb0c1d9 --- /dev/null +++ b/tools/gfx/vector-math.h @@ -0,0 +1,14 @@ +// vector-math.h +#pragma once + +// We will use the GLM library for our vector math types, just for simplicity. 
+ +#include "../../external/glm/glm/glm.hpp" +#include "../../external/glm/glm/gtc/matrix_transform.hpp" +#include "../../external/glm/glm/gtc/constants.hpp" + +namespace gfx { + +using namespace glm; + +} // gfx diff --git a/tools/slang-graphics/vk-api.cpp b/tools/gfx/vk-api.cpp index 0ffbf46eb..4030e43ba 100644 --- a/tools/slang-graphics/vk-api.cpp +++ b/tools/gfx/vk-api.cpp @@ -3,7 +3,7 @@ #include "../../source/core/list.h" -namespace slang_graphics { +namespace gfx { using namespace Slang; // !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! VulkanApi !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! diff --git a/tools/slang-graphics/vk-api.h b/tools/gfx/vk-api.h index 0cbd3faf7..5ec28ef6e 100644 --- a/tools/slang-graphics/vk-api.h +++ b/tools/gfx/vk-api.h @@ -3,7 +3,7 @@ #include "vk-module.h" -namespace slang_graphics { +namespace gfx { #define VK_API_GLOBAL_PROCS(x) \ x(vkGetInstanceProcAddr) \ diff --git a/tools/slang-graphics/vk-device-queue.cpp b/tools/gfx/vk-device-queue.cpp index 9e978117f..10a3d0e3b 100644 --- a/tools/slang-graphics/vk-device-queue.cpp +++ b/tools/gfx/vk-device-queue.cpp @@ -5,7 +5,7 @@ #include <stdio.h> #include <assert.h> -namespace slang_graphics { +namespace gfx { using namespace Slang; VulkanDeviceQueue::~VulkanDeviceQueue() diff --git a/tools/slang-graphics/vk-device-queue.h b/tools/gfx/vk-device-queue.h index 01ed16f5d..d57483ec0 100644 --- a/tools/slang-graphics/vk-device-queue.h +++ b/tools/gfx/vk-device-queue.h @@ -3,7 +3,7 @@ #include "vk-api.h" -namespace slang_graphics { +namespace gfx { struct VulkanDeviceQueue { diff --git a/tools/slang-graphics/vk-module.cpp b/tools/gfx/vk-module.cpp index 460e6550c..4e92a3d2c 100644 --- a/tools/slang-graphics/vk-module.cpp +++ b/tools/gfx/vk-module.cpp @@ -11,7 +11,7 @@ # include <dlfcn.h> #endif -namespace slang_graphics { +namespace gfx { using namespace Slang; // !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! VulkanModule !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
diff --git a/tools/slang-graphics/vk-module.h b/tools/gfx/vk-module.h index c72334db5..55e26f335 100644 --- a/tools/slang-graphics/vk-module.h +++ b/tools/gfx/vk-module.h @@ -14,7 +14,7 @@ #define VK_NO_PROTOTYPES #include <vulkan/vulkan.h> -namespace slang_graphics { +namespace gfx { struct VulkanModule { diff --git a/tools/slang-graphics/vk-swap-chain.cpp b/tools/gfx/vk-swap-chain.cpp index a6704e9d7..6e89c946c 100644 --- a/tools/slang-graphics/vk-swap-chain.cpp +++ b/tools/gfx/vk-swap-chain.cpp @@ -8,7 +8,7 @@ #include <stdlib.h> #include <stdio.h> -namespace slang_graphics { +namespace gfx { using namespace Slang; static int _indexOf(List<VkSurfaceFormatKHR>& formatsIn, VkFormat format) diff --git a/tools/slang-graphics/vk-swap-chain.h b/tools/gfx/vk-swap-chain.h index 7c04af70c..12feb0ed5 100644 --- a/tools/slang-graphics/vk-swap-chain.h +++ b/tools/gfx/vk-swap-chain.h @@ -8,7 +8,7 @@ #include "../../source/core/list.h" -namespace slang_graphics { +namespace gfx { struct VulkanSwapChain { diff --git a/tools/slang-graphics/vk-util.cpp b/tools/gfx/vk-util.cpp index 374a3876d..e8940d1b2 100644 --- a/tools/slang-graphics/vk-util.cpp +++ b/tools/gfx/vk-util.cpp @@ -4,7 +4,7 @@ #include <stdlib.h> #include <stdio.h> -namespace slang_graphics { +namespace gfx { /* static */VkFormat VulkanUtil::getVkFormat(Format format) { diff --git a/tools/slang-graphics/vk-util.h b/tools/gfx/vk-util.h index 420c0a57a..edba3a7d2 100644 --- a/tools/slang-graphics/vk-util.h +++ b/tools/gfx/vk-util.h @@ -15,7 +15,7 @@ /// Is similar to SLANG_VK_RETURN_ON_FAIL, but does not return. Will call checkFail on failure - which asserts on debug builds. 
#define SLANG_VK_CHECK(x) { VkResult _res = x; if (_res != VK_SUCCESS) { VulkanUtil::checkFail(_res); } } -namespace slang_graphics { +namespace gfx { // Utility functions for Vulkan struct VulkanUtil diff --git a/tools/slang-graphics/window.cpp b/tools/gfx/window.cpp index 7aef88c12..ee9f50813 100644 --- a/tools/slang-graphics/window.cpp +++ b/tools/gfx/window.cpp @@ -11,6 +11,8 @@ #endif #endif +#include <stdint.h> + #if _WIN32 #include <Windows.h> @@ -18,7 +20,7 @@ #error "The slang-graphics library currently only supports Windows platforms" #endif -namespace slang_graphics { +namespace gfx { #if _WIN32 @@ -70,6 +72,16 @@ struct ApplicationContext int resultCode = 0; }; +static uint64_t gTimerFrequency; + + +static void initApplication(ApplicationContext* context) +{ + LARGE_INTEGER timerFrequency; + QueryPerformanceFrequency(&timerFrequency); + gTimerFrequency = timerFrequency.QuadPart; +} + /// Run an application given the specified callback and command-line arguments. int runApplication( ApplicationFunc func, @@ -78,6 +90,7 @@ int runApplication( { ApplicationContext context; context.instance = (HINSTANCE) GetModuleHandle(0); + initApplication(&context); func(&context); return context.resultCode; } @@ -90,6 +103,7 @@ int runWindowsApplication( ApplicationContext context; context.instance = (HINSTANCE) instance; context.showCommand = showCommand; + initApplication(&context); func(&context); return context.resultCode; } @@ -216,6 +230,24 @@ void exitApplication(ApplicationContext* context, int resultCode) ExitProcess(resultCode); } +void log(char const* message, ...) 
+{ + va_list args; + va_start(args, message); + + static const int kBufferSize = 1024; + char messageBuffer[kBufferSize]; + vsnprintf(messageBuffer, kBufferSize - 1, message, args); + messageBuffer[kBufferSize - 1] = 0; + + va_end(args); + + fputs(messageBuffer, stderr); + + OSString wideMessageBuffer(messageBuffer); + OutputDebugStringW(wideMessageBuffer); +} + int reportError(char const* message, ...) { va_list args; @@ -236,10 +268,22 @@ int reportError(char const* message, ...) return 1; } +uint64_t getCurrentTime() +{ + LARGE_INTEGER counter; + QueryPerformanceCounter(&counter); + return counter.QuadPart; +} + +uint64_t getTimerFrequency() +{ + return gTimerFrequency; +} + #else // TODO: put an SDL version here #endif -} // slang_graphics +} // gfx diff --git a/tools/slang-graphics/window.h b/tools/gfx/window.h index 91c8286d5..6e557d26c 100644 --- a/tools/slang-graphics/window.h +++ b/tools/gfx/window.h @@ -1,7 +1,9 @@ // window.h #pragma once -namespace slang_graphics { +#include <stdint.h> + +namespace gfx { struct WindowDesc { @@ -30,18 +32,25 @@ bool dispatchEvents(ApplicationContext* context); /// Exit the application with a given result code void exitApplication(ApplicationContext* context, int resultCode); +/// Log a message to an appropriate logging destination. +void log(char const* message, ...); + /// Report an error to an appropriate logging destination. int reportError(char const* message, ...); +uint64_t getCurrentTime(); + +uint64_t getTimerFrequency(); + /// Run an application given the specified callback and command-line arguments. 
int runApplication( ApplicationFunc func, int argc, char const* const* argv); -#define SG_CONSOLE_MAIN(APPLICATION_ENTRY) \ +#define GFX_CONSOLE_MAIN(APPLICATION_ENTRY) \ int main(int argc, char** argv) { \ - return slang_graphics::runApplication(&(APPLIATION_ENTRY), argc, argv); \ + return gfx::runApplication(&(APPLIATION_ENTRY), argc, argv); \ } #ifdef _WIN32 @@ -51,19 +60,19 @@ int runWindowsApplication( void* instance, int showCommand); -#define SG_UI_MAIN(APPLICATION_ENTRY) \ +#define GFX_UI_MAIN(APPLICATION_ENTRY) \ int __stdcall WinMain( \ void* instance, \ void* /* prevInstance */, \ void* /* commandLine */, \ int showCommand) { \ - return slang_graphics::runWindowsApplication(&(APPLICATION_ENTRY), instance, showCommand); \ + return gfx::runWindowsApplication(&(APPLICATION_ENTRY), instance, showCommand); \ } #else -#define SG_UI_MAIN(APPLICATION_ENTRY) SG_CONSOLE_MAIN(APPLICATION_ENTRY) +#define GFX_UI_MAIN(APPLICATION_ENTRY) GFX_CONSOLE_MAIN(APPLICATION_ENTRY) #endif -} // slang_graphics +} // gfx diff --git a/tools/render-test/main.cpp b/tools/render-test/main.cpp index 935b9bc98..4734c2c8f 100644 --- a/tools/render-test/main.cpp +++ b/tools/render-test/main.cpp @@ -29,6 +29,8 @@ namespace renderer_test { +using Slang::Result; + int gWindowWidth = 1024; int gWindowHeight = 768; @@ -45,7 +47,7 @@ struct Vertex float uv[2]; }; -static const Vertex kVertexData[] = +static const Vertex kVertexData[] = { { { 0, 0, 0.5 }, {1, 0, 0} , {0, 0} }, { { 0, 1, 0.5 }, {0, 0, 1} , {1, 0} }, @@ -61,15 +63,15 @@ class RenderTestApp // At initialization time, we are going to load and compile our Slang shader // code, and then create the API objects we need for rendering. 
- Result initialize(Renderer* renderer, ShaderCompiler* shaderCompiler); + Result initialize(Renderer* renderer, ShaderCompiler* shaderCompiler); void runCompute(); void renderFrame(); void finalize(); - BindingState* getBindingState() const { return m_bindingState; } + BindingStateImpl* getBindingState() const { return m_bindingState; } Result writeBindingOutput(const char* fileName); - + Result writeScreen(const char* filename); protected: @@ -85,7 +87,8 @@ class RenderTestApp RefPtr<InputLayout> m_inputLayout; RefPtr<BufferResource> m_vertexBuffer; RefPtr<ShaderProgram> m_shaderProgram; - RefPtr<BindingState> m_bindingState; + RefPtr<PipelineState> m_pipelineState; + RefPtr<BindingStateImpl> m_bindingState; ShaderInputLayout m_shaderInputLayout; ///< The binding layout int m_numAddedConstantBuffers; ///< Constant buffers can be added to the binding directly. Will be added at the end. @@ -117,22 +120,26 @@ SlangResult RenderTestApp::initialize(Renderer* renderer, ShaderCompiler* shader } { - BindingState::Desc bindingStateDesc; - SLANG_RETURN_ON_FAIL(ShaderRendererUtil::createBindingStateDesc(m_shaderInputLayout, m_renderer, bindingStateDesc)); - - //! Hack -> if bindings are specified, just set up the constant buffer binding - // Should probably be more sophisticated than this - with 'dynamic' constant buffer/s binding always being specified - // in the test file - - if ((gOptions.shaderType == Options::ShaderProgramType::Graphics || gOptions.shaderType == Options::ShaderProgramType::GraphicsCompute) - && bindingStateDesc.findBindingIndex(Resource::BindFlag::ConstantBuffer, 0) < 0) + //! 
Hack -> if doing a graphics test, add an extra binding for our dynamic constant buffer + // + // TODO: Should probably be more sophisticated than this - with 'dynamic' constant buffer/s binding always being specified + // in the test file + RefPtr<BufferResource> addedConstantBuffer; + switch(gOptions.shaderType) { - bindingStateDesc.addResource(BindingType::Buffer, m_constantBuffer, BindingState::RegisterRange::makeSingle(0) ); + default: + break; + case Options::ShaderProgramType::Graphics: + case Options::ShaderProgramType::GraphicsCompute: + addedConstantBuffer = m_constantBuffer; m_numAddedConstantBuffers++; + break; } - m_bindingState = m_renderer->createBindingState(bindingStateDesc); + BindingStateImpl* bindingState = nullptr; + SLANG_RETURN_ON_FAIL(ShaderRendererUtil::createBindingState(m_shaderInputLayout, m_renderer, addedConstantBuffer, &bindingState)); + m_bindingState = bindingState; } // Do other initialization that doesn't depend on the source language. @@ -156,6 +163,38 @@ SlangResult RenderTestApp::initialize(Renderer* renderer, ShaderCompiler* shader if(!m_vertexBuffer) return SLANG_FAIL; + { + switch(gOptions.shaderType) + { + default: + assert(!"unexpected test shader type"); + return SLANG_FAIL; + + case Options::ShaderProgramType::Compute: + { + ComputePipelineStateDesc desc; + desc.pipelineLayout = m_bindingState->pipelineLayout; + desc.program = m_shaderProgram; + + m_pipelineState = renderer->createComputePipelineState(desc); + } + break; + + case Options::ShaderProgramType::Graphics: + case Options::ShaderProgramType::GraphicsCompute: + { + GraphicsPipelineStateDesc desc; + desc.pipelineLayout = m_bindingState->pipelineLayout; + desc.program = m_shaderProgram; + desc.inputLayout = m_inputLayout; + desc.renderTargetCount = m_bindingState->m_numRenderTargets; + + m_pipelineState = renderer->createGraphicsPipelineState(desc); + } + break; + } + } + return SLANG_OK; } @@ -182,6 +221,16 @@ Result 
RenderTestApp::initializeShaders(ShaderCompiler* shaderCompiler) fclose(sourceFile); sourceText[sourceSize] = 0; + switch( gOptions.shaderType ) + { + default: + m_shaderInputLayout.numRenderTargets = 1; + break; + + case Options::ShaderProgramType::Compute: + m_shaderInputLayout.numRenderTargets = 0; + break; + } m_shaderInputLayout.Parse(sourceText); ShaderCompileRequest::SourceInfo sourceInfo; @@ -220,31 +269,27 @@ void RenderTestApp::renderFrame() { const ProjectionStyle projectionStyle = RendererUtil::getProjectionStyle(m_renderer->getRendererType()); RendererUtil::getIdentityProjection(projectionStyle, (float*)mappedData); - + m_renderer->unmap(m_constantBuffer); } - // Input Assembler (IA) + auto pipelineType = PipelineType::Graphics; - m_renderer->setInputLayout(m_inputLayout); - m_renderer->setPrimitiveTopology(PrimitiveTopology::TriangleList); + m_renderer->setPipelineState(pipelineType, m_pipelineState); + m_renderer->setPrimitiveTopology(PrimitiveTopology::TriangleList); m_renderer->setVertexBuffer(0, m_vertexBuffer, sizeof(Vertex)); - // Vertex Shader (VS) - // Pixel Shader (PS) - - m_renderer->setShaderProgram(m_shaderProgram); - m_renderer->setBindingState(m_bindingState); - // + m_bindingState->apply(m_renderer, pipelineType); m_renderer->draw(3); } void RenderTestApp::runCompute() { - m_renderer->setShaderProgram(m_shaderProgram); - m_renderer->setBindingState(m_bindingState); + auto pipelineType = PipelineType::Compute; + m_renderer->setPipelineState(pipelineType, m_pipelineState); + m_bindingState->apply(m_renderer, pipelineType); m_renderer->dispatchCompute(1, 1, 1); } @@ -265,18 +310,12 @@ Result RenderTestApp::writeBindingOutput(const char* fileName) return SLANG_FAIL; } - const BindingState::Desc& bindingStateDesc = m_bindingState->getDesc(); - // Must be the same amount of entries - assert(bindingStateDesc.m_bindings.Count() == m_shaderInputLayout.entries.Count() + m_numAddedConstantBuffers); - - const int numBindings = 
int(m_shaderInputLayout.entries.Count()); - - for (int i = 0; i < numBindings; ++i) + for(auto binding : m_bindingState->outputBindings) { + auto i = binding.entryIndex; const auto& layoutBinding = m_shaderInputLayout.entries[i]; - const auto& binding = bindingStateDesc.m_bindings[i]; - if (layoutBinding.isOutput) + assert(layoutBinding.isOutput); { if (binding.resource && binding.resource->isBuffer()) { @@ -524,11 +563,11 @@ SlangResult innerMain(int argc, char** argv) else { Result res = app.writeScreen(gOptions.outputPath); - + if (SLANG_FAILED(res)) { fprintf(stderr, "ERROR: failed to write screen capture to file\n"); - return res; + return res; } } return SLANG_OK; diff --git a/tools/render-test/options.h b/tools/render-test/options.h index 82c018f66..78f673796 100644 --- a/tools/render-test/options.h +++ b/tools/render-test/options.h @@ -9,7 +9,7 @@ namespace renderer_test { -using namespace slang_graphics; +using namespace gfx; struct Options { diff --git a/tools/render-test/png-serialize-util.h b/tools/render-test/png-serialize-util.h index dad17ae74..1ec5204f7 100644 --- a/tools/render-test/png-serialize-util.h +++ b/tools/render-test/png-serialize-util.h @@ -5,7 +5,7 @@ namespace renderer_test { -using namespace slang_graphics; +using namespace gfx; struct PngSerializeUtil { diff --git a/tools/render-test/render-test.vcxproj b/tools/render-test/render-test.vcxproj index 66ad9e7ed..91c8bd997 100644 --- a/tools/render-test/render-test.vcxproj +++ b/tools/render-test/render-test.vcxproj @@ -99,7 +99,7 @@ <PrecompiledHeader>NotUsing</PrecompiledHeader> <WarningLevel>Level3</WarningLevel> <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions> - <AdditionalIncludeDirectories>..\..;..\..\external;..\..\source;..\slang-graphics;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories> + <AdditionalIncludeDirectories>..\..;..\..\external;..\..\source;..\gfx;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories> 
<DebugInformationFormat>EditAndContinue</DebugInformationFormat> <Optimization>Disabled</Optimization> <RuntimeLibrary>MultiThreadedDebug</RuntimeLibrary> @@ -117,7 +117,7 @@ <PrecompiledHeader>NotUsing</PrecompiledHeader> <WarningLevel>Level3</WarningLevel> <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions> - <AdditionalIncludeDirectories>..\..;..\..\external;..\..\source;..\slang-graphics;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories> + <AdditionalIncludeDirectories>..\..;..\..\external;..\..\source;..\gfx;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories> <DebugInformationFormat>EditAndContinue</DebugInformationFormat> <Optimization>Disabled</Optimization> <RuntimeLibrary>MultiThreadedDebug</RuntimeLibrary> @@ -135,7 +135,7 @@ <PrecompiledHeader>NotUsing</PrecompiledHeader> <WarningLevel>Level3</WarningLevel> <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions> - <AdditionalIncludeDirectories>..\..;..\..\external;..\..\source;..\slang-graphics;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories> + <AdditionalIncludeDirectories>..\..;..\..\external;..\..\source;..\gfx;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories> <Optimization>Full</Optimization> <FunctionLevelLinking>true</FunctionLevelLinking> <IntrinsicFunctions>true</IntrinsicFunctions> @@ -157,7 +157,7 @@ <PrecompiledHeader>NotUsing</PrecompiledHeader> <WarningLevel>Level3</WarningLevel> <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions> - <AdditionalIncludeDirectories>..\..;..\..\external;..\..\source;..\slang-graphics;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories> + <AdditionalIncludeDirectories>..\..;..\..\external;..\..\source;..\gfx;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories> <Optimization>Full</Optimization> <FunctionLevelLinking>true</FunctionLevelLinking> <IntrinsicFunctions>true</IntrinsicFunctions> @@ -196,7 
+196,7 @@ <ProjectReference Include="..\..\source\slang\slang.vcxproj"> <Project>{DB00DA62-0533-4AFD-B59F-A67D5B3A0808}</Project> </ProjectReference> - <ProjectReference Include="..\slang-graphics\slang-graphics.vcxproj"> + <ProjectReference Include="..\gfx\gfx.vcxproj"> <Project>{222F7498-B40C-4F3F-A704-DDEB91A4484A}</Project> </ProjectReference> </ItemGroup> diff --git a/tools/render-test/shader-input-layout.h b/tools/render-test/shader-input-layout.h index 19a7e59d0..92dd516a7 100644 --- a/tools/render-test/shader-input-layout.h +++ b/tools/render-test/shader-input-layout.h @@ -7,7 +7,7 @@ namespace renderer_test { -using namespace slang_graphics; +using namespace gfx; enum class ShaderInputType { diff --git a/tools/render-test/shader-renderer-util.cpp b/tools/render-test/shader-renderer-util.cpp index e46c725bc..f6c0366bb 100644 --- a/tools/render-test/shader-renderer-util.cpp +++ b/tools/render-test/shader-renderer-util.cpp @@ -5,6 +5,16 @@ namespace renderer_test { using namespace Slang; +using Slang::Result; + +void BindingStateImpl::apply(Renderer* renderer, PipelineType pipelineType) +{ + renderer->setDescriptorSet( + pipelineType, + pipelineLayout, + 0, + descriptorSet); +} /* static */Result ShaderRendererUtil::generateTextureResource(const InputTextureDesc& inputDesc, int bindFlags, Renderer* renderer, RefPtr<TextureResource>& textureOut) { @@ -125,16 +135,27 @@ using namespace Slang; return SLANG_OK; } -static BindingState::SamplerDesc _calcSamplerDesc(const InputSamplerDesc& srcDesc) +static SamplerState::Desc _calcSamplerDesc(const InputSamplerDesc& srcDesc) { - BindingState::SamplerDesc dstDesc; - dstDesc.isCompareSampler = srcDesc.isCompareSampler; + SamplerState::Desc dstDesc; + if (srcDesc.isCompareSampler) + { + dstDesc.reductionOp = TextureReductionOp::Comparison; + dstDesc.comparisonFunc = ComparisonFunc::Less; + } return dstDesc; } -/* static */BindingState::RegisterRange ShaderRendererUtil::calcRegisterRange(Renderer* renderer, const 
ShaderInputLayoutEntry& entry) +static RefPtr<SamplerState> _createSamplerState( + Renderer* renderer, + const InputSamplerDesc& srcDesc) { - typedef BindingState::RegisterRange RegisterRange; + return renderer->createSamplerState(_calcSamplerDesc(srcDesc)); +} + +/* static */BindingStateImpl::RegisterRange ShaderRendererUtil::calcRegisterRange(Renderer* renderer, const ShaderInputLayoutEntry& entry) +{ + typedef BindingStateImpl::RegisterRange RegisterRange; BindingStyle bindingStyle = RendererUtil::getBindingStyle(renderer->getRendererType()); @@ -179,71 +200,227 @@ static BindingState::SamplerDesc _calcSamplerDesc(const InputSamplerDesc& srcDes return RegisterRange::makeInvalid(); } -/* static */Result ShaderRendererUtil::createBindingStateDesc(ShaderInputLayoutEntry* srcEntries, int numEntries, Renderer* renderer, BindingState::Desc& descOut) +/* static */Result ShaderRendererUtil::createBindingState(const ShaderInputLayout& layout, Renderer* renderer, BufferResource* addedConstantBuffer, BindingStateImpl** outBindingState) { + auto srcEntries = layout.entries.Buffer(); + auto numEntries = int(layout.entries.Count()); + const int textureBindFlags = Resource::BindFlag::NonPixelShaderResource | Resource::BindFlag::PixelShaderResource; - descOut.clear(); + List<DescriptorSetLayout::SlotRangeDesc> slotRangeDescs; + + if(addedConstantBuffer) + { + DescriptorSetLayout::SlotRangeDesc slotRangeDesc; + slotRangeDesc.type = DescriptorSlotType::UniformBuffer; + + slotRangeDescs.Add(slotRangeDesc); + } + for (int i = 0; i < numEntries; i++) { const ShaderInputLayoutEntry& srcEntry = srcEntries[i]; - const BindingState::RegisterRange registerSet = calcRegisterRange(renderer, srcEntry); + const BindingStateImpl::RegisterRange registerSet = calcRegisterRange(renderer, srcEntry); if (!registerSet.isValid()) { assert(!"Couldn't find a binding"); return SLANG_FAIL; } + DescriptorSetLayout::SlotRangeDesc slotRangeDesc; + switch (srcEntry.type) { case ShaderInputType::Buffer: - { 
- const InputBufferDesc& srcBuffer = srcEntry.bufferDesc; + { + const InputBufferDesc& srcBuffer = srcEntry.bufferDesc; + + switch (srcBuffer.type) + { + case InputBufferType::ConstantBuffer: + slotRangeDesc.type = DescriptorSlotType::UniformBuffer; + break; + + case InputBufferType::StorageBuffer: + slotRangeDesc.type = DescriptorSlotType::StorageBuffer; + break; + } + } + break; - const size_t bufferSize = srcEntry.bufferData.Count() * sizeof(uint32_t); + case ShaderInputType::CombinedTextureSampler: + { + slotRangeDesc.type = DescriptorSlotType::CombinedImageSampler; + } + break; - RefPtr<BufferResource> bufferResource; - SLANG_RETURN_ON_FAIL(createBufferResource(srcEntry.bufferDesc, srcEntry.isOutput, bufferSize, srcEntry.bufferData.Buffer(), renderer, bufferResource)); + case ShaderInputType::Texture: + { + if (srcEntry.textureDesc.isRWTexture) + { + slotRangeDesc.type = DescriptorSlotType::StorageImage; + } + else + { + slotRangeDesc.type = DescriptorSlotType::SampledImage; + } + } + break; - descOut.addBufferResource(bufferResource, registerSet); + case ShaderInputType::Sampler: + slotRangeDesc.type = DescriptorSlotType::Sampler; break; - } + + default: + assert(!"Unhandled type"); + return SLANG_FAIL; + } + slotRangeDescs.Add(slotRangeDesc); + } + + DescriptorSetLayout::Desc descriptorSetLayoutDesc; + descriptorSetLayoutDesc.slotRangeCount = slotRangeDescs.Count(); + descriptorSetLayoutDesc.slotRanges = slotRangeDescs.Buffer(); + + auto descriptorSetLayout = renderer->createDescriptorSetLayout(descriptorSetLayoutDesc); + if(!descriptorSetLayout) return SLANG_FAIL; + + List<PipelineLayout::DescriptorSetDesc> pipelineDescriptorSets; + pipelineDescriptorSets.Add(PipelineLayout::DescriptorSetDesc(descriptorSetLayout)); + + PipelineLayout::Desc pipelineLayoutDesc; + pipelineLayoutDesc.renderTargetCount = layout.numRenderTargets; + pipelineLayoutDesc.descriptorSetCount = pipelineDescriptorSets.Count(); + pipelineLayoutDesc.descriptorSets = 
pipelineDescriptorSets.Buffer(); + + auto pipelineLayout = renderer->createPipelineLayout(pipelineLayoutDesc); + if(!pipelineLayout) return SLANG_FAIL; + + auto descriptorSet = renderer->createDescriptorSet(descriptorSetLayout); + if(!descriptorSet) return SLANG_FAIL; + + List<BindingStateImpl::OutputBinding> outputBindings; + + if(addedConstantBuffer) + { + descriptorSet->setConstantBuffer(0, 0, addedConstantBuffer); + } + for (int i = 0; i < numEntries; i++) + { + const ShaderInputLayoutEntry& srcEntry = srcEntries[i]; + + auto rangeIndex = i + (addedConstantBuffer ? 1 : 0); + + switch (srcEntry.type) + { + case ShaderInputType::Buffer: + { + const InputBufferDesc& srcBuffer = srcEntry.bufferDesc; + const size_t bufferSize = srcEntry.bufferData.Count() * sizeof(uint32_t); + + RefPtr<BufferResource> bufferResource; + SLANG_RETURN_ON_FAIL(createBufferResource(srcEntry.bufferDesc, srcEntry.isOutput, bufferSize, srcEntry.bufferData.Buffer(), renderer, bufferResource)); + + switch(srcBuffer.type) + { + case InputBufferType::ConstantBuffer: + descriptorSet->setConstantBuffer(rangeIndex, 0, bufferResource); + break; + + case InputBufferType::StorageBuffer: + { + ResourceView::Desc viewDesc; + viewDesc.type = ResourceView::Type::UnorderedAccess; + viewDesc.format = srcBuffer.format; + auto bufferView = renderer->createBufferView( + bufferResource, + viewDesc); + descriptorSet->setResource(rangeIndex, 0, bufferView); + } + break; + } + + if(srcEntry.isOutput) + { + BindingStateImpl::OutputBinding binding; + binding.entryIndex = i; + binding.resource = bufferResource; + outputBindings.Add(binding); + } + } + break; + case ShaderInputType::CombinedTextureSampler: - { - RefPtr<TextureResource> texture; - SLANG_RETURN_ON_FAIL(generateTextureResource(srcEntry.textureDesc, textureBindFlags, renderer, texture)); - descOut.addCombinedTextureSampler(texture, _calcSamplerDesc(srcEntry.samplerDesc), registerSet); + { + RefPtr<TextureResource> texture; + 
SLANG_RETURN_ON_FAIL(generateTextureResource(srcEntry.textureDesc, textureBindFlags, renderer, texture)); + + auto sampler = _createSamplerState(renderer, srcEntry.samplerDesc); + + ResourceView::Desc viewDesc; + viewDesc.type = ResourceView::Type::ShaderResource; + auto textureView = renderer->createTextureView( + texture, + viewDesc); + + descriptorSet->setCombinedTextureSampler(rangeIndex, 0, textureView, sampler); + + if(srcEntry.isOutput) + { + BindingStateImpl::OutputBinding binding; + binding.entryIndex = i; + binding.resource = texture; + outputBindings.Add(binding); + } + } break; - } - case ShaderInputType::Texture: - { - RefPtr<TextureResource> texture; - SLANG_RETURN_ON_FAIL(generateTextureResource(srcEntry.textureDesc, textureBindFlags, renderer, texture)); - descOut.addTextureResource(texture, registerSet); + case ShaderInputType::Texture: + { + RefPtr<TextureResource> texture; + SLANG_RETURN_ON_FAIL(generateTextureResource(srcEntry.textureDesc, textureBindFlags, renderer, texture)); + + // TODO: support UAV textures... 
+ + ResourceView::Desc viewDesc; + viewDesc.type = ResourceView::Type::ShaderResource; + auto textureView = renderer->createTextureView( + texture, + viewDesc); + + descriptorSet->setResource(rangeIndex, 0, textureView); + + if(srcEntry.isOutput) + { + BindingStateImpl::OutputBinding binding; + binding.entryIndex = i; + binding.resource = texture; + outputBindings.Add(binding); + } + } break; - } + case ShaderInputType::Sampler: - { - descOut.addSampler(_calcSamplerDesc(srcEntry.samplerDesc), registerSet); + { + auto sampler = _createSamplerState(renderer, srcEntry.samplerDesc); + descriptorSet->setSampler(rangeIndex, 0, sampler); + } break; - } + default: - { assert(!"Unhandled type"); return SLANG_FAIL; - } } } - return SLANG_OK; -} + BindingStateImpl* bindingState = new BindingStateImpl(); + bindingState->descriptorSet = descriptorSet; + bindingState->pipelineLayout = pipelineLayout; + bindingState->outputBindings = outputBindings; + bindingState->m_numRenderTargets = layout.numRenderTargets; -/* static */Result ShaderRendererUtil::createBindingStateDesc(const ShaderInputLayout& layout, Renderer* renderer, BindingState::Desc& descOut) -{ - SLANG_RETURN_ON_FAIL(createBindingStateDesc(layout.entries.Buffer(), int(layout.entries.Count()), renderer, descOut)); - descOut.m_numRenderTargets = layout.numRenderTargets; + *outBindingState = bindingState; return SLANG_OK; } diff --git a/tools/render-test/shader-renderer-util.h b/tools/render-test/shader-renderer-util.h index 849e68754..bbdea2af6 100644 --- a/tools/render-test/shader-renderer-util.h +++ b/tools/render-test/shader-renderer-util.h @@ -6,26 +6,68 @@ namespace renderer_test { -/// Utility class containing functions that construct items on the renderer using the ShaderInputLayout representation -struct ShaderRendererUtil +using namespace Slang; + +struct BindingStateImpl : public Slang::RefObject +{ + /// A register set consists of one or more contiguous indices. 
+ /// To be valid index >= 0 and size >= 1 + struct RegisterRange + { + /// True if contains valid contents + bool isValid() const { return size > 0; } + /// True if valid single value + bool isSingle() const { return size == 1; } + /// Get as a single index (must be at least one index) + int getSingleIndex() const { return (size == 1) ? index : -1; } + /// Return the first index + int getFirstIndex() const { return (size > 0) ? index : -1; } + /// True if contains register index + bool hasRegister(int registerIndex) const { return registerIndex >= index && registerIndex < index + size; } + + static RegisterRange makeInvalid() { return RegisterRange{ -1, 0 }; } + static RegisterRange makeSingle(int index) { return RegisterRange{ int16_t(index), 1 }; } + static RegisterRange makeRange(int index, int size) { return RegisterRange{ int16_t(index), uint16_t(size) }; } + + int16_t index; ///< The base index + uint16_t size; ///< The amount of register indices + }; + + void apply(Renderer* renderer, PipelineType pipelineType); + + struct OutputBinding + { + RefPtr<Resource> resource; + Slang::UInt entryIndex; + }; + List<OutputBinding> outputBindings; + + RefPtr<PipelineLayout> pipelineLayout; + RefPtr<DescriptorSet> descriptorSet; + int m_numRenderTargets = 1; +}; + +/// Utility class containing functions that construct items on the renderer using the ShaderInputLayout representation +struct ShaderRendererUtil { /// Generate a texture using the InputTextureDesc and construct a TextureResource using the Renderer with the contents static Slang::Result generateTextureResource(const InputTextureDesc& inputDesc, int bindFlags, Renderer* renderer, Slang::RefPtr<TextureResource>& textureOut); /// Create texture resource using inputDesc, and texData to describe format, and contents static Slang::Result createTextureResource(const InputTextureDesc& inputDesc, const TextureData& texData, int bindFlags, Renderer* renderer, Slang::RefPtr<TextureResource>& textureOut); - + /// Create 
the BufferResource using the renderer from the contents of inputDesc static Slang::Result createBufferResource(const InputBufferDesc& inputDesc, bool isOutput, size_t bufferSize, const void* initData, Renderer* renderer, Slang::RefPtr<BufferResource>& bufferOut); /// Create BindingState::Desc from the contents of layout - static Slang::Result createBindingStateDesc(const ShaderInputLayout& layout, Renderer* renderer, BindingState::Desc& descOut); - /// Create BindingState::Desc from a list of ShaderInputLayout entries - static Slang::Result createBindingStateDesc(ShaderInputLayoutEntry* srcEntries, int numEntries, Renderer* renderer, BindingState::Desc& descOut); + static Slang::Result createBindingState(const ShaderInputLayout& layout, Renderer* renderer, BufferResource* addedConstantBuffer, BindingStateImpl** outBindingState); /// Get the binding register associated with this binding (or -1 if none defined) - static BindingState::RegisterRange calcRegisterRange(Renderer* renderer, const ShaderInputLayoutEntry& entry); + static BindingStateImpl::RegisterRange calcRegisterRange(Renderer* renderer, const ShaderInputLayoutEntry& entry); +private: + /// Create BindingState::Desc from a list of ShaderInputLayout entries + static Slang::Result _createBindingState(ShaderInputLayoutEntry* srcEntries, int numEntries, Renderer* renderer, BufferResource* addedConstantBuffer, BindingStateImpl** outBindingState); }; } // renderer_test diff --git a/tools/render-test/slang-support.cpp b/tools/render-test/slang-support.cpp index a6c252843..26e856295 100644 --- a/tools/render-test/slang-support.cpp +++ b/tools/render-test/slang-support.cpp @@ -11,7 +11,7 @@ namespace renderer_test { -ShaderProgram* ShaderCompiler::compileProgram( +RefPtr<ShaderProgram> ShaderCompiler::compileProgram( ShaderCompileRequest const& request) { SlangSession* slangSession = spCreateSession(NULL); @@ -92,7 +92,7 @@ ShaderProgram* ShaderCompiler::compileProgram( } - ShaderProgram * shaderProgram = nullptr; 
+ RefPtr<ShaderProgram> shaderProgram; Slang::List<const char*> rawTypeNames; for (auto typeName : request.entryPointTypeArguments) rawTypeNames.Add(typeName.Buffer()); diff --git a/tools/render-test/slang-support.h b/tools/render-test/slang-support.h index 8697abcb8..03de062d1 100644 --- a/tools/render-test/slang-support.h +++ b/tools/render-test/slang-support.h @@ -11,13 +11,13 @@ namespace renderer_test { struct ShaderCompiler { - Renderer* renderer; + RefPtr<Renderer> renderer; SlangCompileTarget target; SlangSourceLanguage sourceLanguage; SlangPassThrough passThrough; char const* profile; - ShaderProgram* compileProgram( + RefPtr<ShaderProgram> compileProgram( ShaderCompileRequest const& request); }; diff --git a/tools/slang-graphics/render-d3d11.cpp b/tools/slang-graphics/render-d3d11.cpp deleted file mode 100644 index 4f9749e39..000000000 --- a/tools/slang-graphics/render-d3d11.cpp +++ /dev/null @@ -1,1101 +0,0 @@ -// render-d3d11.cpp - -#define _CRT_SECURE_NO_WARNINGS - -#include "render-d3d11.h" - -//WORKING: #include "options.h" -#include "render.h" -#include "d3d-util.h" - -#include "surface.h" - -// In order to use the Slang API, we need to include its header - -//#include <slang.h> - -#include "../../slang-com-ptr.h" - -// We will be rendering with Direct3D 11, so we need to include -// the Windows and D3D11 headers - -#define WIN32_LEAN_AND_MEAN -#define NOMINMAX -#include <Windows.h> -#undef WIN32_LEAN_AND_MEAN -#undef NOMINMAX - -#include <d3d11_2.h> -#include <d3dcompiler.h> - -// We will use the C standard library just for printing error messages. 
-#include <stdio.h> - -#ifdef _MSC_VER -#include <stddef.h> -#if (_MSC_VER < 1900) -#define snprintf sprintf_s -#endif -#endif -// -using namespace Slang; - -namespace slang_graphics { - -class D3D11Renderer : public Renderer -{ -public: - // Renderer implementation - virtual SlangResult initialize(const Desc& desc, void* inWindowHandle) override; - virtual void setClearColor(const float color[4]) override; - virtual void clearFrame() override; - virtual void presentFrame() override; - virtual TextureResource* createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData) override; - virtual BufferResource* createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& bufferDesc, const void* initData) override; - virtual SlangResult captureScreenSurface(Surface& surfaceOut) override; - virtual InputLayout* createInputLayout( const InputElementDesc* inputElements, UInt inputElementCount) override; - virtual BindingState* createBindingState(const BindingState::Desc& desc) override; - virtual ShaderProgram* createProgram(const ShaderProgram::Desc& desc) override; - virtual void* map(BufferResource* buffer, MapFlavor flavor) override; - virtual void unmap(BufferResource* buffer) override; - virtual void setInputLayout(InputLayout* inputLayout) override; - virtual void setPrimitiveTopology(PrimitiveTopology topology) override; - virtual void setBindingState(BindingState * state); - virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) override; - virtual void setShaderProgram(ShaderProgram* inProgram) override; - virtual void draw(UInt vertexCount, UInt startVertex) override; - virtual void dispatchCompute(int x, int y, int z) override; - virtual void submitGpuWork() override {} - virtual void waitForGpu() override {} - virtual RendererType getRendererType() const override { return RendererType::DirectX11; } - - 
protected: - - struct BindingDetail - { - ComPtr<ID3D11ShaderResourceView> m_srv; - ComPtr<ID3D11UnorderedAccessView> m_uav; - ComPtr<ID3D11SamplerState> m_samplerState; - }; - - class BindingStateImpl: public BindingState - { - public: - typedef BindingState Parent; - - /// Ctor - BindingStateImpl(const Desc& desc): - Parent(desc) - {} - - List<BindingDetail> m_bindingDetails; - }; - - class ShaderProgramImpl: public ShaderProgram - { - public: - ComPtr<ID3D11VertexShader> m_vertexShader; - ComPtr<ID3D11PixelShader> m_pixelShader; - ComPtr<ID3D11ComputeShader> m_computeShader; - }; - - class BufferResourceImpl: public BufferResource - { - public: - typedef BufferResource Parent; - - BufferResourceImpl(const Desc& desc, Usage initialUsage): - Parent(desc), - m_initialUsage(initialUsage) - { - } - - MapFlavor m_mapFlavor; - Usage m_initialUsage; - ComPtr<ID3D11Buffer> m_buffer; - ComPtr<ID3D11Buffer> m_staging; - }; - class TextureResourceImpl : public TextureResource - { - public: - typedef TextureResource Parent; - - TextureResourceImpl(const Desc& desc, Usage initialUsage) : - Parent(desc), - m_initialUsage(initialUsage) - { - } - Usage m_initialUsage; - ComPtr<ID3D11Resource> m_resource; - }; - - class InputLayoutImpl: public InputLayout - { - public: - ComPtr<ID3D11InputLayout> m_layout; - }; - - /// Capture a texture to a file - static HRESULT captureTextureToSurface(ID3D11Device* device, ID3D11DeviceContext* context, ID3D11Texture2D* texture, Surface& surfaceOut); - - void _applyBindingState(bool isCompute); - - ComPtr<IDXGISwapChain> m_swapChain; - ComPtr<ID3D11Device> m_device; - ComPtr<ID3D11DeviceContext> m_immediateContext; - ComPtr<ID3D11Texture2D> m_backBufferTexture; - - List<ComPtr<ID3D11RenderTargetView> > m_renderTargetViews; - List<ComPtr<ID3D11Texture2D> > m_renderTargetTextures; - - RefPtr<BindingStateImpl> m_currentBindings; - - Desc m_desc; - - float m_clearColor[4] = { 0, 0, 0, 0 }; -}; - -Renderer* createD3D11Renderer() -{ - return new 
D3D11Renderer(); -} - -/* static */HRESULT D3D11Renderer::captureTextureToSurface(ID3D11Device* device, ID3D11DeviceContext* context, ID3D11Texture2D* texture, Surface& surfaceOut) -{ - if (!context) return E_INVALIDARG; - if (!texture) return E_INVALIDARG; - - D3D11_TEXTURE2D_DESC textureDesc; - texture->GetDesc(&textureDesc); - - // Don't bother supporting MSAA for right now - if (textureDesc.SampleDesc.Count > 1) - { - fprintf(stderr, "ERROR: cannot capture multi-sample texture\n"); - return E_INVALIDARG; - } - - HRESULT hr = S_OK; - ComPtr<ID3D11Texture2D> stagingTexture; - - if (textureDesc.Usage == D3D11_USAGE_STAGING && (textureDesc.CPUAccessFlags & D3D11_CPU_ACCESS_READ)) - { - stagingTexture = texture; - } - else - { - // Modify the descriptor to give us a staging texture - textureDesc.BindFlags = 0; - textureDesc.MiscFlags &= ~D3D11_RESOURCE_MISC_TEXTURECUBE; - textureDesc.CPUAccessFlags = D3D11_CPU_ACCESS_READ; - textureDesc.Usage = D3D11_USAGE_STAGING; - - hr = device->CreateTexture2D(&textureDesc, 0, stagingTexture.writeRef()); - if (FAILED(hr)) - { - fprintf(stderr, "ERROR: failed to create staging texture\n"); - return hr; - } - - context->CopyResource(stagingTexture, texture); - } - - // Now just read back texels from the staging textures - { - D3D11_MAPPED_SUBRESOURCE mappedResource; - SLANG_RETURN_ON_FAIL(context->Map(stagingTexture, 0, D3D11_MAP_READ, 0, &mappedResource)); - - Result res = surfaceOut.set(textureDesc.Width, textureDesc.Height, Format::RGBA_Unorm_UInt8, mappedResource.RowPitch, mappedResource.pData, SurfaceAllocator::getMallocAllocator()); - - // Make sure to unmap - context->Unmap(stagingTexture, 0); - return res; - } -} - -// !!!!!!!!!!!!!!!!!!!!!!!!!!!! Renderer interface !!!!!!!!!!!!!!!!!!!!!!!!!! - -SlangResult D3D11Renderer::initialize(const Desc& desc, void* inWindowHandle) -{ - auto windowHandle = (HWND)inWindowHandle; - m_desc = desc; - - // Rather than statically link against D3D, we load it dynamically. 
- HMODULE d3dModule = LoadLibraryA("d3d11.dll"); - if (!d3dModule) - { - fprintf(stderr, "error: failed load 'd3d11.dll'\n"); - return SLANG_FAIL; - } - - PFN_D3D11_CREATE_DEVICE_AND_SWAP_CHAIN D3D11CreateDeviceAndSwapChain_ = - (PFN_D3D11_CREATE_DEVICE_AND_SWAP_CHAIN)GetProcAddress(d3dModule, "D3D11CreateDeviceAndSwapChain"); - if (!D3D11CreateDeviceAndSwapChain_) - { - fprintf(stderr, - "error: failed load symbol 'D3D11CreateDeviceAndSwapChain'\n"); - return SLANG_FAIL; - } - - // We create our device in debug mode, just so that we can check that the - // example doesn't trigger warnings. - UINT deviceFlags = 0; - deviceFlags |= D3D11_CREATE_DEVICE_DEBUG; - - // Our swap chain uses RGBA8 with sRGB, with double buffering. - DXGI_SWAP_CHAIN_DESC swapChainDesc = { 0 }; - swapChainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT; - - // Note(tfoley): Disabling sRGB for DX back buffer for now, so that we - // can get consistent output with OpenGL, where setting up sRGB will - // probably be more involved. - // swapChainDesc.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM_SRGB; - swapChainDesc.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM; - - swapChainDesc.SampleDesc.Count = 1; - swapChainDesc.SampleDesc.Quality = 0; - swapChainDesc.BufferCount = 2; - swapChainDesc.OutputWindow = windowHandle; - swapChainDesc.Windowed = TRUE; - swapChainDesc.SwapEffect = DXGI_SWAP_EFFECT_DISCARD; - swapChainDesc.Flags = 0; - - // We will ask for the highest feature level that can be supported. 
- const D3D_FEATURE_LEVEL featureLevels[] = { - D3D_FEATURE_LEVEL_11_1, - D3D_FEATURE_LEVEL_11_0, - D3D_FEATURE_LEVEL_10_1, - D3D_FEATURE_LEVEL_10_0, - D3D_FEATURE_LEVEL_9_3, - D3D_FEATURE_LEVEL_9_2, - D3D_FEATURE_LEVEL_9_1, - }; - D3D_FEATURE_LEVEL featureLevel = D3D_FEATURE_LEVEL_9_1; - const int totalNumFeatureLevels = SLANG_COUNT_OF(featureLevels); - - // On a machine that does not have an up-to-date version of D3D installed, - // the `D3D11CreateDeviceAndSwapChain` call will fail with `E_INVALIDARG` - // if you ask for feature level 11_1. The workaround is to call - // `D3D11CreateDeviceAndSwapChain` up to twice: the first time with 11_1 - // at the start of the list of requested feature levels, and the second - // time without it. - - for (int ii = 0; ii < 2; ++ii) - { - const HRESULT hr = D3D11CreateDeviceAndSwapChain_( - nullptr, // adapter (use default) - D3D_DRIVER_TYPE_REFERENCE, - //D3D_DRIVER_TYPE_HARDWARE, - nullptr, // software - deviceFlags, - &featureLevels[ii], - totalNumFeatureLevels - ii, - D3D11_SDK_VERSION, - &swapChainDesc, - m_swapChain.writeRef(), - m_device.writeRef(), - &featureLevel, - m_immediateContext.writeRef()); - - // Failures with `E_INVALIDARG` might be due to feature level 11_1 - // not being supported. - if (hr == E_INVALIDARG) - { - continue; - } - - // Other failures are real, though. - SLANG_RETURN_ON_FAIL(hr); - // We must have a swap chain - break; - } - - // After we've created the swap chain, we can request a pointer to the - // back buffer as a D3D11 texture, and create a render-target view from it. 
- - static const IID kIID_ID3D11Texture2D = { - 0x6f15aaf2, 0xd208, 0x4e89, 0x9a, 0xb4, 0x48, - 0x95, 0x35, 0xd3, 0x4f, 0x9c }; - - SLANG_RETURN_ON_FAIL(m_swapChain->GetBuffer(0, kIID_ID3D11Texture2D, (void**)m_backBufferTexture.writeRef())); - - for (int i = 0; i < 8; i++) - { - ComPtr<ID3D11Texture2D> texture; - D3D11_TEXTURE2D_DESC textureDesc; - m_backBufferTexture->GetDesc(&textureDesc); - SLANG_RETURN_ON_FAIL(m_device->CreateTexture2D(&textureDesc, nullptr, texture.writeRef())); - - ComPtr<ID3D11RenderTargetView> rtv; - D3D11_RENDER_TARGET_VIEW_DESC rtvDesc; - rtvDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM; - rtvDesc.Texture2D.MipSlice = 0; - rtvDesc.ViewDimension = D3D11_RTV_DIMENSION_TEXTURE2D; - SLANG_RETURN_ON_FAIL(m_device->CreateRenderTargetView(texture, &rtvDesc, rtv.writeRef())); - - m_renderTargetViews.Add(rtv); - m_renderTargetTextures.Add(texture); - } - - m_immediateContext->OMSetRenderTargets((UINT)m_renderTargetViews.Count(), m_renderTargetViews.Buffer()->readRef(), nullptr); - - // Similarly, we are going to set up a viewport once, and then never - // switch, since this is a simple test app. 
- D3D11_VIEWPORT viewport; - viewport.TopLeftX = 0; - viewport.TopLeftY = 0; - viewport.Width = (float)desc.width; - viewport.Height = (float)desc.height; - viewport.MaxDepth = 1; // TODO(tfoley): use reversed depth - viewport.MinDepth = 0; - m_immediateContext->RSSetViewports(1, &viewport); - - return SLANG_OK; -} - -void D3D11Renderer::setClearColor(const float color[4]) -{ - memcpy(m_clearColor, color, sizeof(m_clearColor)); -} - -void D3D11Renderer::clearFrame() -{ - for (auto i = 0u; i < m_renderTargetViews.Count(); i++) - { - m_immediateContext->ClearRenderTargetView(m_renderTargetViews[i], m_clearColor); - } -} - -void D3D11Renderer::presentFrame() -{ - m_immediateContext->CopyResource(m_backBufferTexture, m_renderTargetTextures[0]); - m_swapChain->Present(0, 0); -} - -SlangResult D3D11Renderer::captureScreenSurface(Surface& surfaceOut) -{ - return captureTextureToSurface(m_device, m_immediateContext, m_renderTargetTextures[0], surfaceOut); -} - -static D3D11_BIND_FLAG _calcResourceFlag(Resource::BindFlag::Enum bindFlag) -{ - typedef Resource::BindFlag BindFlag; - switch (bindFlag) - { - case BindFlag::VertexBuffer: return D3D11_BIND_VERTEX_BUFFER; - case BindFlag::IndexBuffer: return D3D11_BIND_INDEX_BUFFER; - case BindFlag::ConstantBuffer: return D3D11_BIND_CONSTANT_BUFFER; - case BindFlag::StreamOutput: return D3D11_BIND_STREAM_OUTPUT; - case BindFlag::RenderTarget: return D3D11_BIND_RENDER_TARGET; - case BindFlag::DepthStencil: return D3D11_BIND_DEPTH_STENCIL; - case BindFlag::UnorderedAccess: return D3D11_BIND_UNORDERED_ACCESS; - case BindFlag::PixelShaderResource: return D3D11_BIND_SHADER_RESOURCE; - case BindFlag::NonPixelShaderResource: return D3D11_BIND_SHADER_RESOURCE; - default: return D3D11_BIND_FLAG(0); - } -} - -static int _calcResourceBindFlags(int bindFlags) -{ - int dstFlags = 0; - while (bindFlags) - { - int lsb = bindFlags & -bindFlags; - - dstFlags |= _calcResourceFlag(Resource::BindFlag::Enum(lsb)); - bindFlags &= ~lsb; - } - return 
dstFlags; -} - -static int _calcResourceAccessFlags(int accessFlags) -{ - switch (accessFlags) - { - case 0: return 0; - case Resource::AccessFlag::Read: return D3D11_CPU_ACCESS_READ; - case Resource::AccessFlag::Write: return D3D11_CPU_ACCESS_WRITE; - case Resource::AccessFlag::Read | - Resource::AccessFlag::Write: return D3D11_CPU_ACCESS_READ | D3D11_CPU_ACCESS_WRITE; - default: assert(!"Invalid flags"); return 0; - } -} - -TextureResource* D3D11Renderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData) -{ - TextureResource::Desc srcDesc(descIn); - srcDesc.setDefaults(initialUsage); - - const int effectiveArraySize = srcDesc.calcEffectiveArraySize(); - - assert(initData); - assert(initData->numSubResources == srcDesc.numMipLevels * effectiveArraySize * srcDesc.size.depth); - - const DXGI_FORMAT format = D3DUtil::getMapFormat(srcDesc.format); - if (format == DXGI_FORMAT_UNKNOWN) - { - return nullptr; - } - - const int bindFlags = _calcResourceBindFlags(srcDesc.bindFlags); - - // Set up the initialize data - List<D3D11_SUBRESOURCE_DATA> subRes; - subRes.SetSize(srcDesc.numMipLevels * effectiveArraySize); - { - int subResourceIndex = 0; - for (int i = 0; i < effectiveArraySize; i++) - { - for (int j = 0; j < srcDesc.numMipLevels; j++) - { - const int mipHeight = TextureResource::calcMipSize(srcDesc.size.height, j); - - D3D11_SUBRESOURCE_DATA& data = subRes[subResourceIndex]; - - data.pSysMem = initData->subResources[subResourceIndex]; - - data.SysMemPitch = UINT(initData->mipRowStrides[j]); - data.SysMemSlicePitch = UINT(initData->mipRowStrides[j] * mipHeight); - - subResourceIndex++; - } - } - } - - const int accessFlags = _calcResourceAccessFlags(srcDesc.cpuAccessFlags); - - RefPtr<TextureResourceImpl> texture(new TextureResourceImpl(srcDesc, initialUsage)); - - switch (srcDesc.type) - { - case Resource::Type::Texture1D: - { - D3D11_TEXTURE1D_DESC desc = { 0 }; - desc.BindFlags = 
bindFlags; - desc.CPUAccessFlags = accessFlags; - desc.Format = format; - desc.MiscFlags = 0; - desc.MipLevels = srcDesc.numMipLevels; - desc.ArraySize = effectiveArraySize; - desc.Width = srcDesc.size.width; - desc.Usage = D3D11_USAGE_DEFAULT; - - ComPtr<ID3D11Texture1D> texture1D; - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateTexture1D(&desc, subRes.Buffer(), texture1D.writeRef())); - - texture->m_resource = texture1D; - break; - } - case Resource::Type::TextureCube: - case Resource::Type::Texture2D: - { - D3D11_TEXTURE2D_DESC desc = { 0 }; - desc.BindFlags = bindFlags; - desc.CPUAccessFlags = accessFlags; - desc.Format = format; - desc.MiscFlags = 0; - desc.MipLevels = srcDesc.numMipLevels; - desc.ArraySize = effectiveArraySize; - - desc.Width = srcDesc.size.width; - desc.Height = srcDesc.size.height; - desc.Usage = D3D11_USAGE_DEFAULT; - desc.SampleDesc.Count = srcDesc.sampleDesc.numSamples; - desc.SampleDesc.Quality = srcDesc.sampleDesc.quality; - - if (srcDesc.type == Resource::Type::TextureCube) - { - desc.MiscFlags |= D3D11_RESOURCE_MISC_TEXTURECUBE; - } - - ComPtr<ID3D11Texture2D> texture2D; - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateTexture2D(&desc, subRes.Buffer(), texture2D.writeRef())); - - texture->m_resource = texture2D; - break; - } - case Resource::Type::Texture3D: - { - D3D11_TEXTURE3D_DESC desc = { 0 }; - desc.BindFlags = bindFlags; - desc.CPUAccessFlags = accessFlags; - desc.Format = format; - desc.MiscFlags = 0; - desc.MipLevels = srcDesc.numMipLevels; - desc.Width = srcDesc.size.width; - desc.Height = srcDesc.size.height; - desc.Depth = srcDesc.size.depth; - desc.Usage = D3D11_USAGE_DEFAULT; - - ComPtr<ID3D11Texture3D> texture3D; - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateTexture3D(&desc, subRes.Buffer(), texture3D.writeRef())); - - texture->m_resource = texture3D; - break; - } - default: return nullptr; - } - - return texture.detach(); -} - -BufferResource* D3D11Renderer::createBufferResource(Resource::Usage initialUsage, const 
BufferResource::Desc& descIn, const void* initData) -{ - BufferResource::Desc srcDesc(descIn); - srcDesc.setDefaults(initialUsage); - - // Make aligned to 256 bytes... not sure why, but if you remove this the tests do fail. - const size_t alignedSizeInBytes = D3DUtil::calcAligned(srcDesc.sizeInBytes, 256); - - // Hack to make the initialization never read from out of bounds memory, by copying into a buffer - List<uint8_t> initDataBuffer; - if (initData && alignedSizeInBytes > srcDesc.sizeInBytes) - { - initDataBuffer.SetSize(alignedSizeInBytes); - ::memcpy(initDataBuffer.Buffer(), initData, srcDesc.sizeInBytes); - initData = initDataBuffer.Buffer(); - } - - D3D11_BUFFER_DESC bufferDesc = { 0 }; - bufferDesc.ByteWidth = UINT(alignedSizeInBytes); - bufferDesc.BindFlags = _calcResourceBindFlags(srcDesc.bindFlags); - // For read we'll need to do some staging - bufferDesc.CPUAccessFlags = _calcResourceAccessFlags(descIn.cpuAccessFlags & Resource::AccessFlag::Write); - bufferDesc.Usage = D3D11_USAGE_DEFAULT; - - // If written by CPU, make it dynamic - if (descIn.cpuAccessFlags & Resource::AccessFlag::Write) - { - bufferDesc.Usage = D3D11_USAGE_DYNAMIC; - } - - switch (initialUsage) - { - case Resource::Usage::ConstantBuffer: - { - // We'll just assume ConstantBuffers are dynamic for now - bufferDesc.Usage = D3D11_USAGE_DYNAMIC; - break; - } - default: break; - } - - if (bufferDesc.BindFlags & (D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE)) - { - //desc.BindFlags = D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE; - if (srcDesc.elementSize != 0) - { - bufferDesc.StructureByteStride = srcDesc.elementSize; - bufferDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED; - } - else - { - bufferDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS; - } - } - - D3D11_SUBRESOURCE_DATA subResourceData = { 0 }; - subResourceData.pSysMem = initData; - - RefPtr<BufferResourceImpl> buffer(new BufferResourceImpl(srcDesc, initialUsage)); - - 
SLANG_RETURN_NULL_ON_FAIL(m_device->CreateBuffer(&bufferDesc, initData ? &subResourceData : nullptr, buffer->m_buffer.writeRef())); - - if (srcDesc.cpuAccessFlags & Resource::AccessFlag::Read) - { - D3D11_BUFFER_DESC bufDesc = {}; - bufDesc.BindFlags = 0; - bufDesc.ByteWidth = (UINT)alignedSizeInBytes; - bufDesc.CPUAccessFlags = D3D11_CPU_ACCESS_READ; - bufDesc.Usage = D3D11_USAGE_STAGING; - - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateBuffer(&bufDesc, nullptr, buffer->m_staging.writeRef())); - } - - return buffer.detach(); -} - -InputLayout* D3D11Renderer::createInputLayout(const InputElementDesc* inputElementsIn, UInt inputElementCount) -{ - D3D11_INPUT_ELEMENT_DESC inputElements[16] = {}; - - char hlslBuffer[1024]; - char* hlslCursor = &hlslBuffer[0]; - - hlslCursor += sprintf(hlslCursor, "float4 main(\n"); - - for (UInt ii = 0; ii < inputElementCount; ++ii) - { - inputElements[ii].SemanticName = inputElementsIn[ii].semanticName; - inputElements[ii].SemanticIndex = (UINT)inputElementsIn[ii].semanticIndex; - inputElements[ii].Format = D3DUtil::getMapFormat(inputElementsIn[ii].format); - inputElements[ii].InputSlot = 0; - inputElements[ii].AlignedByteOffset = (UINT)inputElementsIn[ii].offset; - inputElements[ii].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA; - inputElements[ii].InstanceDataStepRate = 0; - - if (ii != 0) - { - hlslCursor += sprintf(hlslCursor, ",\n"); - } - - char const* typeName = "Unknown"; - switch (inputElementsIn[ii].format) - { - case Format::RGBA_Float32: - typeName = "float4"; - break; - case Format::RGB_Float32: - typeName = "float3"; - break; - case Format::RG_Float32: - typeName = "float2"; - break; - case Format::R_Float32: - typeName = "float"; - break; - default: - return nullptr; - } - - hlslCursor += sprintf(hlslCursor, "%s a%d : %s%d", - typeName, - (int)ii, - inputElementsIn[ii].semanticName, - (int)inputElementsIn[ii].semanticIndex); - } - - hlslCursor += sprintf(hlslCursor, "\n) : SV_Position { return 0; }"); - - ComPtr<ID3DBlob> 
vertexShaderBlob; - SLANG_RETURN_NULL_ON_FAIL(D3DUtil::compileHLSLShader("inputLayout", hlslBuffer, "main", "vs_5_0", vertexShaderBlob)); - - ComPtr<ID3D11InputLayout> inputLayout; - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateInputLayout(&inputElements[0], (UINT)inputElementCount, vertexShaderBlob->GetBufferPointer(), vertexShaderBlob->GetBufferSize(), - inputLayout.writeRef())); - - InputLayoutImpl* impl = new InputLayoutImpl; - impl->m_layout.swap(inputLayout); - - return impl; -} - -void* D3D11Renderer::map(BufferResource* bufferIn, MapFlavor flavor) -{ - BufferResourceImpl* bufferResource = static_cast<BufferResourceImpl*>(bufferIn); - - D3D11_MAP mapType; - ID3D11Buffer* buffer = bufferResource->m_buffer; - - switch (flavor) - { - case MapFlavor::WriteDiscard: - mapType = D3D11_MAP_WRITE_DISCARD; - break; - case MapFlavor::HostWrite: - mapType = D3D11_MAP_WRITE; - break; - case MapFlavor::HostRead: - mapType = D3D11_MAP_READ; - - buffer = bufferResource->m_staging; - if (!buffer) - { - return nullptr; - } - - // Okay copy the data over - m_immediateContext->CopyResource(buffer, bufferResource->m_buffer); - - break; - default: - return nullptr; - } - - // We update our constant buffer per-frame, just for the purposes - // of the example, but we don't actually load different data - // per-frame (we always use an identity projection). - D3D11_MAPPED_SUBRESOURCE mappedSub; - SLANG_RETURN_NULL_ON_FAIL(m_immediateContext->Map(buffer, 0, mapType, 0, &mappedSub)); - - bufferResource->m_mapFlavor = flavor; - - return mappedSub.pData; -} - -void D3D11Renderer::unmap(BufferResource* bufferIn) -{ - BufferResourceImpl* bufferResource = static_cast<BufferResourceImpl*>(bufferIn); - ID3D11Buffer* buffer = (bufferResource->m_mapFlavor == MapFlavor::HostRead) ? 
bufferResource->m_staging : bufferResource->m_buffer; - m_immediateContext->Unmap(buffer, 0); -} - -void D3D11Renderer::setInputLayout(InputLayout* inputLayoutIn) -{ - auto inputLayout = static_cast<InputLayoutImpl*>(inputLayoutIn); - m_immediateContext->IASetInputLayout(inputLayout->m_layout); -} - -void D3D11Renderer::setPrimitiveTopology(PrimitiveTopology topology) -{ - m_immediateContext->IASetPrimitiveTopology(D3DUtil::getPrimitiveTopology(topology)); -} - -void D3D11Renderer::setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffersIn, const UInt* stridesIn, const UInt* offsetsIn) -{ - static const int kMaxVertexBuffers = 16; - assert(slotCount <= kMaxVertexBuffers); - - UINT vertexStrides[kMaxVertexBuffers]; - UINT vertexOffsets[kMaxVertexBuffers]; - ID3D11Buffer* dxBuffers[kMaxVertexBuffers]; - - auto buffers = (BufferResourceImpl*const*)buffersIn; - - for (UInt ii = 0; ii < slotCount; ++ii) - { - vertexStrides[ii] = (UINT)stridesIn[ii]; - vertexOffsets[ii] = (UINT)offsetsIn[ii]; - dxBuffers[ii] = buffers[ii]->m_buffer; - } - - m_immediateContext->IASetVertexBuffers((UINT)startSlot, (UINT)slotCount, dxBuffers, &vertexStrides[0], &vertexOffsets[0]); -} - -void D3D11Renderer::setShaderProgram(ShaderProgram* programIn) -{ - auto program = (ShaderProgramImpl*)programIn; - m_immediateContext->CSSetShader(program->m_computeShader, nullptr, 0); - m_immediateContext->VSSetShader(program->m_vertexShader, nullptr, 0); - m_immediateContext->PSSetShader(program->m_pixelShader, nullptr, 0); -} - -void D3D11Renderer::draw(UInt vertexCount, UInt startVertex) -{ - _applyBindingState(false); - m_immediateContext->Draw((UINT)vertexCount, (UINT)startVertex); -} - -ShaderProgram* D3D11Renderer::createProgram(const ShaderProgram::Desc& desc) -{ - if (desc.pipelineType == PipelineType::Compute) - { - auto computeKernel = desc.findKernel(StageType::Compute); - - ComPtr<ID3D11ComputeShader> computeShader; - 
-        SLANG_RETURN_NULL_ON_FAIL(m_device->CreateComputeShader(computeKernel->codeBegin, computeKernel->getCodeSize(), nullptr, computeShader.writeRef()));
-
-        ShaderProgramImpl* shaderProgram = new ShaderProgramImpl();
-        shaderProgram->m_computeShader.swap(computeShader);
-        return shaderProgram;
-    }
-    else
-    {
-        auto vertexKernel = desc.findKernel(StageType::Vertex);
-        auto fragmentKernel = desc.findKernel(StageType::Fragment);
-
-        ComPtr<ID3D11VertexShader> vertexShader;
-        ComPtr<ID3D11PixelShader> pixelShader;
-
-        SLANG_RETURN_NULL_ON_FAIL(m_device->CreateVertexShader(vertexKernel->codeBegin, vertexKernel->getCodeSize(), nullptr, vertexShader.writeRef()));
-        SLANG_RETURN_NULL_ON_FAIL(m_device->CreatePixelShader(fragmentKernel->codeBegin, fragmentKernel->getCodeSize(), nullptr, pixelShader.writeRef()));
-
-        ShaderProgramImpl* shaderProgram = new ShaderProgramImpl();
-        shaderProgram->m_vertexShader.swap(vertexShader);
-        shaderProgram->m_pixelShader.swap(pixelShader);
-        return shaderProgram;
-    }
-}
-
-void D3D11Renderer::dispatchCompute(int x, int y, int z)
-{
-    _applyBindingState(true);
-    m_immediateContext->Dispatch(x, y, z);
-}
-
-BindingState* D3D11Renderer::createBindingState(const BindingState::Desc& bindingStateDesc)
-{
-    RefPtr<BindingStateImpl> bindingState(new BindingStateImpl(bindingStateDesc));
-
-    const auto& srcBindings = bindingStateDesc.m_bindings;
-    const int numBindings = int(srcBindings.Count());
-
-    auto& dstDetails = bindingState->m_bindingDetails;
-    dstDetails.SetSize(numBindings);
-
-    for (int i = 0; i < numBindings; ++i)
-    {
-        auto& dstDetail = dstDetails[i];
-        const auto& srcBinding = srcBindings[i];
-
-        assert(srcBinding.registerRange.isSingle());
-
-        switch (srcBinding.bindingType)
-        {
-            case BindingType::Buffer:
-            {
-                assert(srcBinding.resource && srcBinding.resource->isBuffer());
-
-                BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(srcBinding.resource.Ptr());
-                const BufferResource::Desc& bufferDesc = buffer->getDesc();
-
-                const int elemSize = bufferDesc.elementSize <= 0 ? 1 : bufferDesc.elementSize;
-
-                if (bufferDesc.bindFlags & Resource::BindFlag::UnorderedAccess)
-                {
-                    D3D11_UNORDERED_ACCESS_VIEW_DESC viewDesc;
-                    memset(&viewDesc, 0, sizeof(viewDesc));
-                    viewDesc.Buffer.FirstElement = 0;
-                    viewDesc.Buffer.NumElements = (UINT)(bufferDesc.sizeInBytes / elemSize);
-                    viewDesc.Buffer.Flags = 0;
-                    viewDesc.ViewDimension = D3D11_UAV_DIMENSION_BUFFER;
-                    viewDesc.Format = D3DUtil::getMapFormat(bufferDesc.format);
-
-                    if (bufferDesc.elementSize == 0 && bufferDesc.format == Format::Unknown)
-                    {
-                        viewDesc.Buffer.Flags |= D3D11_BUFFER_UAV_FLAG_RAW;
-                        viewDesc.Format = DXGI_FORMAT_R32_TYPELESS;
-                    }
-
-                    SLANG_RETURN_NULL_ON_FAIL(m_device->CreateUnorderedAccessView(buffer->m_buffer, &viewDesc, dstDetail.m_uav.writeRef()));
-                }
-                if (bufferDesc.bindFlags & (Resource::BindFlag::NonPixelShaderResource | Resource::BindFlag::PixelShaderResource))
-                {
-                    D3D11_SHADER_RESOURCE_VIEW_DESC viewDesc;
-                    memset(&viewDesc, 0, sizeof(viewDesc));
-                    viewDesc.Buffer.FirstElement = 0;
-                    viewDesc.Buffer.ElementWidth = elemSize;
-                    viewDesc.Buffer.NumElements = (UINT)(bufferDesc.sizeInBytes / elemSize);
-                    viewDesc.Buffer.ElementOffset = 0;
-                    viewDesc.ViewDimension = D3D11_SRV_DIMENSION_BUFFER;
-                    viewDesc.Format = DXGI_FORMAT_UNKNOWN;
-
-                    if (bufferDesc.elementSize == 0)
-                    {
-                        viewDesc.Format = DXGI_FORMAT_R32_FLOAT;
-                    }
-
-                    SLANG_RETURN_NULL_ON_FAIL(m_device->CreateShaderResourceView(buffer->m_buffer, &viewDesc, dstDetail.m_srv.writeRef()));
-                }
-                break;
-            }
-            case BindingType::Texture:
-            case BindingType::CombinedTextureSampler:
-            {
-                assert(srcBinding.resource && srcBinding.resource->isTexture());
-
-                TextureResourceImpl* texture = static_cast<TextureResourceImpl*>(srcBinding.resource.Ptr());
-
-                const TextureResource::Desc& textureDesc = texture->getDesc();
-
-                D3D11_SHADER_RESOURCE_VIEW_DESC viewDesc;
-                viewDesc.Format = D3DUtil::getMapFormat(textureDesc.format);
-
-                switch (texture->getType())
-                {
-                    case Resource::Type::Texture1D:
-                    {
-                        if (textureDesc.arraySize <= 0)
-                        {
-                            viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE1D;
-                            viewDesc.Texture1D.MipLevels = textureDesc.numMipLevels;
-                            viewDesc.Texture1D.MostDetailedMip = 0;
-                        }
-                        else
-                        {
-                            viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE1DARRAY;
-                            viewDesc.Texture1DArray.ArraySize = textureDesc.arraySize;
-                            viewDesc.Texture1DArray.FirstArraySlice = 0;
-                            viewDesc.Texture1DArray.MipLevels = textureDesc.numMipLevels;
-                            viewDesc.Texture1DArray.MostDetailedMip = 0;
-                        }
-                        break;
-                    }
-                    case Resource::Type::Texture2D:
-                    {
-                        if (textureDesc.arraySize <= 0)
-                        {
-                            viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D;
-                            viewDesc.Texture2D.MipLevels = textureDesc.numMipLevels;
-                            viewDesc.Texture2D.MostDetailedMip = 0;
-                        }
-                        else
-                        {
-                            viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2DARRAY;
-                            viewDesc.Texture2DArray.ArraySize = textureDesc.arraySize;
-                            viewDesc.Texture2DArray.FirstArraySlice = 0;
-                            viewDesc.Texture2DArray.MipLevels = textureDesc.numMipLevels;
-                            viewDesc.Texture2DArray.MostDetailedMip = 0;
-                        }
-                        break;
-                    }
-                    case Resource::Type::TextureCube:
-                    {
-                        if (textureDesc.arraySize <= 0)
-                        {
-                            viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURECUBE;
-                            viewDesc.TextureCube.MipLevels = textureDesc.numMipLevels;
-                            viewDesc.TextureCube.MostDetailedMip = 0;
-                        }
-                        else
-                        {
-                            viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURECUBEARRAY;
-                            viewDesc.TextureCubeArray.MipLevels = textureDesc.numMipLevels;
-                            viewDesc.TextureCubeArray.MostDetailedMip = 0;
-                            viewDesc.TextureCubeArray.First2DArrayFace = 0;
-                            viewDesc.TextureCubeArray.NumCubes = textureDesc.arraySize;
-                        }
-                        break;
-                    }
-                    case Resource::Type::Texture3D:
-                    {
-                        viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE3D;
-                        viewDesc.Texture3D.MipLevels = textureDesc.numMipLevels; // Old code fixed as one
-                        viewDesc.Texture3D.MostDetailedMip = 0;
-                        break;
-                    }
-                    default:
-                    {
-                        assert(!"Unhandled type");
-                        return nullptr;
-                    }
-                }
-
-                SLANG_RETURN_NULL_ON_FAIL(m_device->CreateShaderResourceView(texture->m_resource, &viewDesc, dstDetail.m_srv.writeRef()));
-                break;
-            }
-            case BindingType::Sampler:
-            {
-                const BindingState::SamplerDesc& samplerDesc = bindingStateDesc.m_samplerDescs[srcBinding.descIndex];
-
-                D3D11_SAMPLER_DESC desc = {};
-                desc.AddressU = desc.AddressV = desc.AddressW = D3D11_TEXTURE_ADDRESS_WRAP;
-
-                if (samplerDesc.isCompareSampler)
-                {
-                    desc.ComparisonFunc = D3D11_COMPARISON_LESS_EQUAL;
-                    desc.Filter = D3D11_FILTER_MIN_LINEAR_MAG_MIP_POINT;
-                    desc.MinLOD = desc.MaxLOD = 0.0f;
-                }
-                else
-                {
-                    desc.Filter = D3D11_FILTER_ANISOTROPIC;
-                    desc.MaxAnisotropy = 8;
-                    desc.MinLOD = 0.0f;
-                    desc.MaxLOD = 100.0f;
-                }
-                SLANG_RETURN_NULL_ON_FAIL(m_device->CreateSamplerState(&desc, dstDetail.m_samplerState.writeRef()));
-                break;
-            }
-            default:
-            {
-                assert(!"Unhandled type");
-                return nullptr;
-            }
-        }
-    }
-
-    // Done
-    return bindingState.detach();
-}
-
-void D3D11Renderer::_applyBindingState(bool isCompute)
-{
-    auto context = m_immediateContext.get();
-
-    const auto& details = m_currentBindings->m_bindingDetails;
-    const auto& bindings = m_currentBindings->getDesc().m_bindings;
-
-    const int numBindings = int(bindings.Count());
-
-    for (int i = 0; i < numBindings; ++i)
-    {
-        const auto& binding = bindings[i];
-        const auto& detail = details[i];
-
-        const int bindingIndex = binding.registerRange.getSingleIndex();
-
-        switch (binding.bindingType)
-        {
-            case BindingType::Buffer:
-            {
-                assert(binding.resource && binding.resource->isBuffer());
-                if (binding.resource->canBind(Resource::BindFlag::ConstantBuffer))
-                {
-                    ID3D11Buffer* buffer = static_cast<BufferResourceImpl*>(binding.resource.Ptr())->m_buffer;
-                    if (isCompute)
-                        context->CSSetConstantBuffers(bindingIndex, 1, &buffer);
-                    else
-                    {
-                        context->VSSetConstantBuffers(bindingIndex, 1, &buffer);
-                        context->PSSetConstantBuffers(bindingIndex, 1, &buffer);
-                    }
-                }
-                else if (detail.m_uav)
-                {
-                    if (isCompute)
-                        context->CSSetUnorderedAccessViews(bindingIndex, 1, detail.m_uav.readRef(), nullptr);
-                    else
-                        context->OMSetRenderTargetsAndUnorderedAccessViews(m_currentBindings->getDesc().m_numRenderTargets,
-                            m_renderTargetViews.Buffer()->readRef(), nullptr, bindingIndex, 1, detail.m_uav.readRef(), nullptr);
-                }
-                else
-                {
-                    if (isCompute)
-                        context->CSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef());
-                    else
-                    {
-                        context->PSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef());
-                        context->VSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef());
-                    }
-                }
-                break;
-            }
-            case BindingType::Texture:
-            {
-                if (detail.m_uav)
-                {
-                    if (isCompute)
-                        context->CSSetUnorderedAccessViews(bindingIndex, 1, detail.m_uav.readRef(), nullptr);
-                    else
-                        context->OMSetRenderTargetsAndUnorderedAccessViews(D3D11_KEEP_RENDER_TARGETS_AND_DEPTH_STENCIL,
-                            nullptr, nullptr, bindingIndex, 1, detail.m_uav.readRef(), nullptr);
-                }
-                else
-                {
-                    if (isCompute)
-                        context->CSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef());
-                    else
-                    {
-                        context->PSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef());
-                        context->VSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef());
-                    }
-                }
-                break;
-            }
-            case BindingType::Sampler:
-            {
-                if (isCompute)
-                    context->CSSetSamplers(bindingIndex, 1, detail.m_samplerState.readRef());
-                else
-                {
-                    context->PSSetSamplers(bindingIndex, 1, detail.m_samplerState.readRef());
-                    context->VSSetSamplers(bindingIndex, 1, detail.m_samplerState.readRef());
-                }
-                break;
-            }
-            default:
-            {
-                assert(!"Not implemented");
-                return;
-            }
-        }
-    }
-}
-
-void D3D11Renderer::setBindingState(BindingState* state)
-{
-    m_currentBindings = static_cast<BindingStateImpl*>(state);
-}
-
-} // renderer_test
