From 68d705f6c805c9b4d31b386e065762e6db13ad18 Mon Sep 17 00:00:00 2001 From: Tim Foley Date: Fri, 3 Aug 2018 08:39:28 -0700 Subject: Major overhaul of Renderer abstraction, to support a new example (#624) The original goal here was to bring up a second example program: `model-viewer`. While the existing `hello-world` example is enough to get somebody up to speed with the basics of the Slang API (as a drop-in replacement for `D3DCompile` or similar), it doesn't really show any of the big-picture stuff that Slang is meant to enable. There wasn't any use of D3D12/Vulkan descriptor tables/sets, and there wasn't any use of interfaces, generics, or `ParameterBlock`s in the shader code. The `model-viewer` example addresses these issues. Its shader code involves generics, interfaces, and multiple `ParameterBlock`s, and the host-side code demonstrates a few key things for working with Slang: * There is an application-level abstraction for parameter blocks, that combines the graphics-API descriptor set object with Slang type information * There is a shader cache layer used to look up an appropriate variant of a rendering effect by using parameter block types to "plug in" global type variables * There is a clear separation between the phases of compilation: a first phase that does semantic checking and enables reflection-based allocation of graphics API objects, followed by one or more code generation passes for specialized kernels. This example is certainly not perfect, and it will need to be revamped more going forward. In particular: * The output picture is ugly as sin. We need a plan for how to get this to load better content, perhaps even popping up an error message to note that the required input data isn't present in the basic repository. * The shader code is too simplistic. There isn't any real material variety, and the `IMaterial` abstraction is completely wrong. * The use of parameter blocks is facile because there are no resource parameters right now. 
Fixing that will likely expose issues around interfacing with Slang's reflection API. * The whole example exposes the issue that Slang's current APIs aren't really designed for the benefit of two-phase compilation (since our main client application has been stuck on one-phase compilation). * Global type parameters are actually a Bad Idea that we only did for compatibility with existing codebases. We should not be showing them off in an example of the Right Way to use Slang, but the language support for type parameters on entry points is still not complete. Of course, the majority of the changes here are *not* inside the example applications, and instead involve a major overhaul of the `Renderer` abstraction that is used for both tests and examples. The main thrust of the change is to make the abstraction layer closer to the D3D12/Vulkan model than to a D3D11-style model. This is important for the `model-viewer` example, since it aspires to show how Slang can be incorporated into a renderer that targets a modern API. The most important bit is actually the use of descriptor sets and "pipeline layouts" a la Vulkan, since without these Slang's `ParameterBlock` abstraction won't make a lot of sense. Implementation of the abstraction for the various APIs has very much been on an as-needed basis. The current implementation is just enough for the two examples to work, plus enough to get all the tests to pass in both debug and release builds on Windows. A big missing feature in the API abstraction right now is memory lifetime management. The code had been trending toward something D3D11-like where a constant buffer could be mapped per-frame with the implementation doing behind-the-scenes allocation for targets like D3D12/Vulkan. I'd like to shift more toward a model of just exposing "transient" allocations that are only valid for one frame, because these are more representative of how an efficient renderer for next-generation APIs will work. 
That transition isn't actually complete, though, so there are problems with the existing examples where `hello-world` is actually scribbling into memory that the GPU might still be using, while `model-viewer` is doing full-on heavy-weight allocations on a per-frame basis with no real concern for the performance implications. All together, there are a lot of things here that need more work, but this branch has been way too long-lived already, and so I'd like to get this checked in as long as all the tests pass. --- .gitmodules | 6 + README.md | 31 +- examples/hello-world/README.md | 12 + examples/hello-world/hello-world.vcxproj | 184 + examples/hello-world/hello-world.vcxproj.filters | 18 + examples/hello-world/main.cpp | 470 +++ examples/hello-world/shaders.slang | 63 + examples/hello/README.md | 12 - examples/hello/hello.cpp | 380 --- examples/hello/hello.slang | 76 - examples/hello/hello.sln | 28 - examples/hello/hello.vcxproj | 184 - examples/hello/hello.vcxproj.filters | 18 - examples/model-viewer/README.md | 25 + examples/model-viewer/cube.mtl | 8 + examples/model-viewer/cube.obj | 24 + examples/model-viewer/main.cpp | 1618 +++++++++ examples/model-viewer/model-viewer.vcxproj | 184 + examples/model-viewer/model-viewer.vcxproj.filters | 18 + examples/model-viewer/shaders.slang | 178 + external/glm | 1 + external/stb/stb_image_resize.h | 2627 +++++++++++++++ external/tinyobjloader | 1 + premake5.lua | 43 +- slang.h | 79 +- slang.sln | 33 +- source/core/core.vcxproj | 3 - source/core/core.vcxproj.filters | 9 - source/core/smart-pointer.h | 10 +- source/slang/reflection.cpp | 37 +- source/slang/slang.cpp | 6 +- tools/gfx/circular-resource-heap-d3d12.cpp | 222 ++ tools/gfx/circular-resource-heap-d3d12.h | 206 ++ tools/gfx/d3d-util.cpp | 306 ++ tools/gfx/d3d-util.h | 61 + tools/gfx/descriptor-heap-d3d12.cpp | 47 + tools/gfx/descriptor-heap-d3d12.h | 198 ++ tools/gfx/gfx.vcxproj | 215 ++ tools/gfx/gfx.vcxproj.filters | 120 + tools/gfx/model.cpp | 530 +++ 
tools/gfx/model.h | 73 + tools/gfx/render-d3d11.cpp | 2112 ++++++++++++ tools/gfx/render-d3d11.h | 10 + tools/gfx/render-d3d12.cpp | 3557 ++++++++++++++++++++ tools/gfx/render-d3d12.h | 10 + tools/gfx/render-gl.cpp | 1426 ++++++++ tools/gfx/render-gl.h | 10 + tools/gfx/render-vk.cpp | 2569 ++++++++++++++ tools/gfx/render-vk.h | 10 + tools/gfx/render.cpp | 391 +++ tools/gfx/render.h | 869 +++++ tools/gfx/resource-d3d12.cpp | 214 ++ tools/gfx/resource-d3d12.h | 178 + tools/gfx/surface.cpp | 222 ++ tools/gfx/surface.h | 86 + tools/gfx/vector-math.h | 14 + tools/gfx/vk-api.cpp | 138 + tools/gfx/vk-api.h | 196 ++ tools/gfx/vk-device-queue.cpp | 199 ++ tools/gfx/vk-device-queue.h | 94 + tools/gfx/vk-module.cpp | 76 + tools/gfx/vk-module.h | 39 + tools/gfx/vk-swap-chain.cpp | 421 +++ tools/gfx/vk-swap-chain.h | 141 + tools/gfx/vk-util.cpp | 59 + tools/gfx/vk-util.h | 41 + tools/gfx/window.cpp | 289 ++ tools/gfx/window.h | 78 + tools/render-test/main.cpp | 117 +- tools/render-test/options.h | 2 +- tools/render-test/png-serialize-util.h | 2 +- tools/render-test/render-test.vcxproj | 10 +- tools/render-test/shader-input-layout.h | 2 +- tools/render-test/shader-renderer-util.cpp | 251 +- tools/render-test/shader-renderer-util.h | 56 +- tools/render-test/slang-support.cpp | 4 +- tools/render-test/slang-support.h | 4 +- .../circular-resource-heap-d3d12.cpp | 222 -- .../slang-graphics/circular-resource-heap-d3d12.h | 206 -- tools/slang-graphics/d3d-util.cpp | 306 -- tools/slang-graphics/d3d-util.h | 61 - tools/slang-graphics/descriptor-heap-d3d12.cpp | 47 - tools/slang-graphics/descriptor-heap-d3d12.h | 115 - tools/slang-graphics/render-d3d11.cpp | 1101 ------ tools/slang-graphics/render-d3d11.h | 10 - tools/slang-graphics/render-d3d12.cpp | 2467 -------------- tools/slang-graphics/render-d3d12.h | 10 - tools/slang-graphics/render-gl.cpp | 1049 ------ tools/slang-graphics/render-gl.h | 10 - tools/slang-graphics/render-vk.cpp | 2019 ----------- tools/slang-graphics/render-vk.h | 
10 - tools/slang-graphics/render.cpp | 390 --- tools/slang-graphics/render.h | 583 ---- tools/slang-graphics/resource-d3d12.cpp | 214 -- tools/slang-graphics/resource-d3d12.h | 178 - tools/slang-graphics/slang-graphics.vcxproj | 212 -- .../slang-graphics/slang-graphics.vcxproj.filters | 111 - tools/slang-graphics/surface.cpp | 222 -- tools/slang-graphics/surface.h | 86 - tools/slang-graphics/vk-api.cpp | 138 - tools/slang-graphics/vk-api.h | 196 -- tools/slang-graphics/vk-device-queue.cpp | 199 -- tools/slang-graphics/vk-device-queue.h | 94 - tools/slang-graphics/vk-module.cpp | 76 - tools/slang-graphics/vk-module.h | 39 - tools/slang-graphics/vk-swap-chain.cpp | 421 --- tools/slang-graphics/vk-swap-chain.h | 141 - tools/slang-graphics/vk-util.cpp | 59 - tools/slang-graphics/vk-util.h | 41 - tools/slang-graphics/window.cpp | 245 -- tools/slang-graphics/window.h | 69 - 111 files changed, 21388 insertions(+), 12220 deletions(-) create mode 100644 examples/hello-world/README.md create mode 100644 examples/hello-world/hello-world.vcxproj create mode 100644 examples/hello-world/hello-world.vcxproj.filters create mode 100644 examples/hello-world/main.cpp create mode 100644 examples/hello-world/shaders.slang delete mode 100644 examples/hello/README.md delete mode 100644 examples/hello/hello.cpp delete mode 100644 examples/hello/hello.slang delete mode 100644 examples/hello/hello.sln delete mode 100644 examples/hello/hello.vcxproj delete mode 100644 examples/hello/hello.vcxproj.filters create mode 100644 examples/model-viewer/README.md create mode 100644 examples/model-viewer/cube.mtl create mode 100644 examples/model-viewer/cube.obj create mode 100644 examples/model-viewer/main.cpp create mode 100644 examples/model-viewer/model-viewer.vcxproj create mode 100644 examples/model-viewer/model-viewer.vcxproj.filters create mode 100644 examples/model-viewer/shaders.slang create mode 160000 external/glm create mode 100644 external/stb/stb_image_resize.h create mode 160000 
external/tinyobjloader create mode 100644 tools/gfx/circular-resource-heap-d3d12.cpp create mode 100644 tools/gfx/circular-resource-heap-d3d12.h create mode 100644 tools/gfx/d3d-util.cpp create mode 100644 tools/gfx/d3d-util.h create mode 100644 tools/gfx/descriptor-heap-d3d12.cpp create mode 100644 tools/gfx/descriptor-heap-d3d12.h create mode 100644 tools/gfx/gfx.vcxproj create mode 100644 tools/gfx/gfx.vcxproj.filters create mode 100644 tools/gfx/model.cpp create mode 100644 tools/gfx/model.h create mode 100644 tools/gfx/render-d3d11.cpp create mode 100644 tools/gfx/render-d3d11.h create mode 100644 tools/gfx/render-d3d12.cpp create mode 100644 tools/gfx/render-d3d12.h create mode 100644 tools/gfx/render-gl.cpp create mode 100644 tools/gfx/render-gl.h create mode 100644 tools/gfx/render-vk.cpp create mode 100644 tools/gfx/render-vk.h create mode 100644 tools/gfx/render.cpp create mode 100644 tools/gfx/render.h create mode 100644 tools/gfx/resource-d3d12.cpp create mode 100644 tools/gfx/resource-d3d12.h create mode 100644 tools/gfx/surface.cpp create mode 100644 tools/gfx/surface.h create mode 100644 tools/gfx/vector-math.h create mode 100644 tools/gfx/vk-api.cpp create mode 100644 tools/gfx/vk-api.h create mode 100644 tools/gfx/vk-device-queue.cpp create mode 100644 tools/gfx/vk-device-queue.h create mode 100644 tools/gfx/vk-module.cpp create mode 100644 tools/gfx/vk-module.h create mode 100644 tools/gfx/vk-swap-chain.cpp create mode 100644 tools/gfx/vk-swap-chain.h create mode 100644 tools/gfx/vk-util.cpp create mode 100644 tools/gfx/vk-util.h create mode 100644 tools/gfx/window.cpp create mode 100644 tools/gfx/window.h delete mode 100644 tools/slang-graphics/circular-resource-heap-d3d12.cpp delete mode 100644 tools/slang-graphics/circular-resource-heap-d3d12.h delete mode 100644 tools/slang-graphics/d3d-util.cpp delete mode 100644 tools/slang-graphics/d3d-util.h delete mode 100644 tools/slang-graphics/descriptor-heap-d3d12.cpp delete mode 100644 
tools/slang-graphics/descriptor-heap-d3d12.h delete mode 100644 tools/slang-graphics/render-d3d11.cpp delete mode 100644 tools/slang-graphics/render-d3d11.h delete mode 100644 tools/slang-graphics/render-d3d12.cpp delete mode 100644 tools/slang-graphics/render-d3d12.h delete mode 100644 tools/slang-graphics/render-gl.cpp delete mode 100644 tools/slang-graphics/render-gl.h delete mode 100644 tools/slang-graphics/render-vk.cpp delete mode 100644 tools/slang-graphics/render-vk.h delete mode 100644 tools/slang-graphics/render.cpp delete mode 100644 tools/slang-graphics/render.h delete mode 100644 tools/slang-graphics/resource-d3d12.cpp delete mode 100644 tools/slang-graphics/resource-d3d12.h delete mode 100644 tools/slang-graphics/slang-graphics.vcxproj delete mode 100644 tools/slang-graphics/slang-graphics.vcxproj.filters delete mode 100644 tools/slang-graphics/surface.cpp delete mode 100644 tools/slang-graphics/surface.h delete mode 100644 tools/slang-graphics/vk-api.cpp delete mode 100644 tools/slang-graphics/vk-api.h delete mode 100644 tools/slang-graphics/vk-device-queue.cpp delete mode 100644 tools/slang-graphics/vk-device-queue.h delete mode 100644 tools/slang-graphics/vk-module.cpp delete mode 100644 tools/slang-graphics/vk-module.h delete mode 100644 tools/slang-graphics/vk-swap-chain.cpp delete mode 100644 tools/slang-graphics/vk-swap-chain.h delete mode 100644 tools/slang-graphics/vk-util.cpp delete mode 100644 tools/slang-graphics/vk-util.h delete mode 100644 tools/slang-graphics/window.cpp delete mode 100644 tools/slang-graphics/window.h diff --git a/.gitmodules b/.gitmodules index 5ee785420..d410bf9b7 100644 --- a/.gitmodules +++ b/.gitmodules @@ -1,3 +1,9 @@ [submodule "external/glslang"] path = external/glslang url = https://github.com/KhronosGroup/glslang.git +[submodule "external/tinyobjloader"] + path = external/tinyobjloader + url = https://github.com/syoyo/tinyobjloader +[submodule "external/glm"] + path = external/glm + url = 
https://github.com/g-truc/glm.git diff --git a/README.md b/README.md index a9e01340d..01d88eae1 100644 --- a/README.md +++ b/README.md @@ -3,18 +3,22 @@ [![AppVeyor build status](https://ci.appveyor.com/api/projects/status/3jptgsry13k6wdwp/branch/master?svg=true)](https://ci.appveyor.com/project/shader-slang/slang/branch/master) [![Travis build status](https://travis-ci.org/shader-slang/slang.svg?branch=master)](https://travis-ci.org/shader-slang/slang) Slang is a shading language that extends HLSL with new capabilities for building modular, extensible, and high-performance real-time shading systems. -This repository provides a command-line compiler and a plain C API for loading, compiling, and reflecting shader code in Slang or plain HLSL. +This repository provides a command-line compiler and a C/C++ API for loading, compiling, and reflecting shader code in Slang or plain HLSL. -Using Slang you can: +The extensions provided by the Slang language make it easier for you to write high-performance shader codebases with a maintainable and modular structure. For example: -* Compile your HLSL or Slang code to DX bytecode, SPIR-V, or plain source code in HLSL or GLSL (DXIL support is planned). +* Parameter blocks (exposed as `ParameterBlock`) let you group together related shader parameters -- both simple uniform values and resources like samplers/textures - in ordinary `struct` types, and then specify that they should be passed to the GPU as a single coherent block. Your application code can easily map a parameter block to abstractions like descriptor tables/sets on D3D12/Vulkan, or to the facilities provided by other APIs. + +* Generics and interfaces can be used to perform static specialization of your shader code without resort to preprocessor techniques or string-pasting. Unlike C++ templates, Slang's generics can be checked ahead of time and don't produce cascading error messages that are difficult to diagnose. 
The same generic shader can be specialized for a variety of different types to produce specialized code ahead of time, or on the fly, completely under application control. + +The Slang implementation in this repository provides a library and a stand-alone compiler for Slang that can be used to: + +* Compile your HLSL or Slang code to DX bytecode, DXIL, SPIR-V, or plain source code in HLSL or GLSL. * Get full reflection information about the parameters of your shader code, with a consistent interface no matter the target graphics API. Slang doesn't silently drop unused or "dead" shader parameters from the reflection data, so you can always see the full picture. * Take ordinary HLSL code that neglects to include all those tedious `register` and `layout` bindings, and transform it into code that includes explicit bindings on every shader parameter. This frees you to write simple and clean code, while still getting completely deterministic binding locations. -* Write shading code that uses first-class support for modules, interfaces, and generics to build clean and reusable shader libraries. - ## Getting Started The fastest way to get started with Slang is to use a pre-built binary package, available through GitHub [releases](https://github.com/shader-slang/slang/releases). @@ -25,6 +29,14 @@ If you would like to build Slang from source, please consult the instructions [h ## Documentation +For users getting started with Slang, it may help to start by looking at our example programs: + +* The [`hello-world`](examples/hello-world/) example shows the basics for integrating the Slang API into an application as a more-or-less drop-in replacement for `D3DCompile`. + +* The [`model-viewer`](examples/model-viewer/) example shows a more involved rendering application that uses Slang's new language features to perform efficient shader specialization and parameter binding while maintaining clear and modular shader code. 
+ +A [paper](http://graphics.cs.cmu.edu/projects/slang/) on the Slang system was accepted into SIGGRAPH 2018, and it provides an overview of the language and the design of the implementation. + The Slang [language guide](docs/language-guide.md) provides information on extended language features that Slang provides for user code. The [API user's guide](docs/api-users-guide.md) gives information on how to drive Slang programmatically from an application. @@ -34,15 +46,15 @@ Be warned, however, that the command-line tool is primarily intended for experim ## Limitations -The Slang project is in a very early state, so there are many rough edges to be aware of. +The Slang project is in an early state, so there are many rough edges to be aware of. Slang is *not* currently recommended for production use. The project is intentionally on a pre-`1.0.0` version to reflect the fact that interfaces and features may change at any time (though we try not to break user code without good reason). Major limitations to be aware of (beyond everything files in the issue tracker): -* Slang only supports outputting GLSL/SPIR-V for Vulkan, not OpenGL +* Slang only officially supports outputting GLSL/SPIR-V for Vulkan, not OpenGL -* Slang's current approach to automatically assigning registers is appropriate to D3D12, but not D3D11 +* Slang's current approach to automatically assigning registers is appropriate to D3D12, and is not ideal for D3D11 * Slang-to-GLSL cross-compilation only supports vertex, fragment, and compute shaders. Geometry and tessellation shader cross-compilation is not yet implemented. @@ -66,7 +78,6 @@ The Slang code itself is under the MIT license (see [LICENSE](LICENSE)). The Slang projet can be compiled to use the [`glslang`](https://github.com/KhronosGroup/glslang) project as a submodule (under `external/glslang`), and `glslang` is under a BSD license. 
-The Slang tests (which are not distributed with source/binary releases) include example shaders extracted from: -* Sample HLSL shaders from the Microsoft DirectX SDK, which has its own license +The Slang tests (which are not distributed with source/binary releases) include example HLSL shaders extracted from the Microsoft DirectX SDK, which has its own license Some of the Slang examples and tests use the `stb_image` and `stb_image_write` libraries (under `external/stb`) which have been placed in the public domain by their author(s). diff --git a/examples/hello-world/README.md b/examples/hello-world/README.md new file mode 100644 index 000000000..ba377b8cb --- /dev/null +++ b/examples/hello-world/README.md @@ -0,0 +1,12 @@ +Slang "Hello World" Example +=========================== + +The goal of this example is to demonstrate an almost minimal application that uses Slang for shading. + +The `shaders.slang` file contains simple vertex and fragment shader entry points. The shader code should compile as either Slang or HLSL code (that is, this example does not show off any new Slang language features). + +The `main.cpp` file contains the C++ application code, showing how to use the Slang API to load and compile the shader code to DirectX shader bytecode (DXBC). +The application performs rendering using the D3D11 API, through a platform and graphics API abstraction layer that is implemented in `tools/gfx`. +Note that this abstraction layer is *not* required in order to work with Slang, and it is just there to help us write example and test applications more conveniently. + +This example is not necessarily representative of best practices for integrating Slang into a production engine; the goal is merely to use the minimum amount of code possible to demonstrate a complete application that uses Slang. 
diff --git a/examples/hello-world/hello-world.vcxproj b/examples/hello-world/hello-world.vcxproj new file mode 100644 index 000000000..9efb28688 --- /dev/null +++ b/examples/hello-world/hello-world.vcxproj @@ -0,0 +1,184 @@ + + + + + Debug + Win32 + + + Debug + x64 + + + Release + Win32 + + + Release + x64 + + + + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA} + true + Win32Proj + hello-world + + + + Application + true + Unicode + v140 + + + Application + true + Unicode + v140 + + + Application + false + Unicode + v140 + + + Application + false + Unicode + v140 + + + + + + + + + + + + + + + + + + + true + ..\..\bin\windows-x86\debug\ + ..\..\intermediate\windows-x86\debug\hello-world\ + hello-world + .exe + + + true + ..\..\bin\windows-x64\debug\ + ..\..\intermediate\windows-x64\debug\hello-world\ + hello-world + .exe + + + false + ..\..\bin\windows-x86\release\ + ..\..\intermediate\windows-x86\release\hello-world\ + hello-world + .exe + + + false + ..\..\bin\windows-x64\release\ + ..\..\intermediate\windows-x64\release\hello-world\ + hello-world + .exe + + + + NotUsing + Level3 + _DEBUG;%(PreprocessorDefinitions) + ..\..;..\..\tools;%(AdditionalIncludeDirectories) + EditAndContinue + Disabled + MultiThreadedDebug + + + Windows + true + + + + + NotUsing + Level3 + _DEBUG;%(PreprocessorDefinitions) + ..\..;..\..\tools;%(AdditionalIncludeDirectories) + EditAndContinue + Disabled + MultiThreadedDebug + + + Windows + true + + + + + NotUsing + Level3 + NDEBUG;%(PreprocessorDefinitions) + ..\..;..\..\tools;%(AdditionalIncludeDirectories) + Full + true + true + false + true + MultiThreaded + + + Windows + true + true + + + + + NotUsing + Level3 + NDEBUG;%(PreprocessorDefinitions) + ..\..;..\..\tools;%(AdditionalIncludeDirectories) + Full + true + true + false + true + MultiThreaded + + + Windows + true + true + + + + + + + + + + + {DB00DA62-0533-4AFD-B59F-A67D5B3A0808} + + + {F9BE7957-8399-899E-0C49-E714FDDD4B65} + + + {222F7498-B40C-4F3F-A704-DDEB91A4484A} + + + + + + \ No 
newline at end of file diff --git a/examples/hello-world/hello-world.vcxproj.filters b/examples/hello-world/hello-world.vcxproj.filters new file mode 100644 index 000000000..a02cb79fc --- /dev/null +++ b/examples/hello-world/hello-world.vcxproj.filters @@ -0,0 +1,18 @@ + + + + + {E9C7FDCE-D52A-8D73-7EB0-C5296AF258F6} + + + + + Source Files + + + + + Source Files + + + \ No newline at end of file diff --git a/examples/hello-world/main.cpp b/examples/hello-world/main.cpp new file mode 100644 index 000000000..bac378d96 --- /dev/null +++ b/examples/hello-world/main.cpp @@ -0,0 +1,470 @@ +// main.cpp + +// This file implements an extremely simple example of loading and +// executing a Slang shader program. This is primarily an example +// of how to use Slang as a "drop-in" replacement for an existing +// HLSL compiler like the `D3DCompile` API. More advanced usage +// of advanced Slang language and API features is left to the +// next example. +// +// The comments in the file will attempt to explain concepts as +// they are introduced. +// +// Of course, in order to use the Slang API, we need to include +// its header. We have set up the build options for this project +// so that it is as simple as: +// +#include +// +// Other build setups are possible, and Slang doesn't assume that +// its include directory must be added to your global include +// path. + +// For the purposes of keeping the demo code as simple as possible, +// while still retaining some level of portability, our examples +// make use of a small platform and graphics API abstraction layer, +// which is included in the Slang source distribution under the +// `tools/` directory. +// +// Applications can of course use Slang without ever touching this +// abstraction layer, so we will not focus on it when explaining +// examples, except in places where best practices for interacting +// with Slang may depend on an application/engine making certain +// design choices in their abstraction layer. 
+// +#include "gfx/render.h" +#include "gfx/render-d3d11.h" +#include "gfx/window.h" +using namespace gfx; + +// For the purposes of a small example, we will define the vertex data for a +// single triangle directly in the source file. It should be easy to extend +// this example to load data from an external source, if desired. +// +struct Vertex +{ + float position[3]; + float color[3]; +}; + +static const int kVertexCount = 3; +static const Vertex kVertexData[kVertexCount] = +{ + { { 0, 0, 0.5 }, { 1, 0, 0 } }, + { { 0, 1, 0.5 }, { 0, 0, 1 } }, + { { 1, 0, 0.5 }, { 0, 1, 0 } }, +}; + +// The example application will be implemented as a `struct`, so that +// we can scope the resources it allocates without using global variables. +// +struct HelloWorld +{ + +// We will start with a function that will invoke the Slang compiler +// to generate target-specific code from a shader file, and then +// use that to initialize an API shader program. +// +// Note that `Renderer` and `ShaderProgram` here are types from +// the graphics API abstraction layer, and *not* part of the +// Slang API. This function is representative of code that a user +// might write to integrate Slang into their renderer/engine. +// +RefPtr loadShaderProgram(gfx::Renderer* renderer) +{ + // First, we need to create a "session" for interacting with the Slang + // compiler. This scopes all of our application's interactions + // with the Slang library. At the moment, creating a session causes + // Slang to load and validate its standard library, so this is a + // somewhat heavy-weight operation. When possible, an application + // should try to re-use the same session across multiple compiles. + // + SlangSession* slangSession = spCreateSession(NULL); + + // A compile request represents a single invocation of the compiler, + // to process some inputs and produce outputs (or errors). 
+ // + SlangCompileRequest* slangRequest = spCreateCompileRequest(slangSession); + + // We would like to request a single target (output) format: DirectX shader bytecode (DXBC) + int targetIndex = spAddCodeGenTarget(slangRequest, SLANG_DXBC); + + // We will specify the desired "profile" for this one target in terms of the + // DirectX "shader model" that should be supported. + // + spSetTargetProfile(slangRequest, targetIndex, spFindProfile(slangSession, "sm_4_0")); + + // A compile request can include one or more "translation units," which more or + // less amount to individual source files (think `.c` files, not the `.h` files they + // might include). + // + // For this example, our code will all be in the Slang language. The user may + // also specify HLSL input here, but that currently doesn't affect the compiler's + // behavior much. + // + int translationUnitIndex = spAddTranslationUnit(slangRequest, SLANG_SOURCE_LANGUAGE_SLANG, nullptr); + + // We will load source code for our translation unit from the file `shaders.slang`. + // There are also variations of this API for adding source code from application-provided buffers. + // + spAddTranslationUnitSourceFile(slangRequest, translationUnitIndex, "shaders.slang"); + + // Next we will specify the entry points we'd like to compile. + // It is often convenient to put more than one entry point in the same file, + // and the Slang API makes it convenient to use a single run of the compiler + // to compile all entry points. + // + // For each entry point, we need to specify the name of a function, the + // translation unit in which that function can be found, and the stage + // that we need to compile for (e.g., vertex, fragment, geometry, ...). 
+ // + char const* vertexEntryPointName = "vertexMain"; + char const* fragmentEntryPointName = "fragmentMain"; + int vertexIndex = spAddEntryPoint(slangRequest, translationUnitIndex, vertexEntryPointName, SLANG_STAGE_VERTEX); + int fragmentIndex = spAddEntryPoint(slangRequest, translationUnitIndex, fragmentEntryPointName, SLANG_STAGE_FRAGMENT); + + // Once all of the input options for the compiler have been specified, + // we can invoke `spCompile` to run the compiler and see if any errors + // were detected. + // + const SlangResult compileRes = spCompile(slangRequest); + + // Even if there were no errors that forced compilation to fail, the + // compiler may have produced "diagnostic" output such as warnings. + // We will go ahead and print that output here. + // + if(auto diagnostics = spGetDiagnosticOutput(slangRequest)) + { + reportError("%s", diagnostics); + } + + // If compilation failed, there is no point in continuing any further. + if(SLANG_FAILED(compileRes)) + { + spDestroyCompileRequest(slangRequest); + spDestroySession(slangSession); + return nullptr; + } + + // If compilation was successful, then we will extract the code for + // our two entry points as "blobs". + // + // If you are using a D3D API, then your application may want to + // take advantage of the fact that these blobs are binary compatible + // with the `ID3DBlob`, `ID3D10Blob`, etc. interfaces. + // + + ISlangBlob* vertexShaderBlob = nullptr; + spGetEntryPointCodeBlob(slangRequest, vertexIndex, 0, &vertexShaderBlob); + + ISlangBlob* fragmentShaderBlob = nullptr; + spGetEntryPointCodeBlob(slangRequest, fragmentIndex, 0, &fragmentShaderBlob); + + // We extract the begin/end pointers to the output code buffers + // using operations on the `ISlangBlob` interface. 
+ // + char const* vertexCode = (char const*) vertexShaderBlob->getBufferPointer(); + char const* vertexCodeEnd = vertexCode + vertexShaderBlob->getBufferSize(); + + char const* fragmentCode = (char const*) fragmentShaderBlob->getBufferPointer(); + char const* fragmentCodeEnd = fragmentCode + fragmentShaderBlob->getBufferSize(); + + // Once we have extracted the output blobs, it is safe to destroy + // the compile request and even the session. + // + spDestroyCompileRequest(slangRequest); + spDestroySession(slangSession); + + // Now we use the operations of the example graphics API abstraction + // layer to load shader code into the underlying API. + // + // Reminder: this section does not involve the Slang API at all. + // + + gfx::ShaderProgram::KernelDesc kernelDescs[] = + { + { gfx::StageType::Vertex, vertexCode, vertexCodeEnd }, + { gfx::StageType::Fragment, fragmentCode, fragmentCodeEnd }, + }; + + gfx::ShaderProgram::Desc programDesc; + programDesc.pipelineType = gfx::PipelineType::Graphics; + programDesc.kernels = &kernelDescs[0]; + programDesc.kernelCount = 2; + + auto shaderProgram = renderer->createProgram(programDesc); + + // Once we've used the output blobs from the Slang compiler to initialize + // the API-specific shader program, we can release their memory. + // + vertexShaderBlob->release(); + fragmentShaderBlob->release(); + + return shaderProgram; +} + +// +// The above function shows the core of what is required to use the +// Slang API as a simple compiler (e.g., a drop-in replacement for +// fxc or dxc). +// +// The rest of this file implements an extremely simple rendering application +// that will execute the vertex/fragment shaders loaded with the function +// we have just defined. +// + +// We will hard-code the size of our rendering window. 
+// +int gWindowWidth = 1024; +int gWindowHeight = 768; + +// We will define global variables for the various platform and +// graphics API objects that our application needs: +// +// As a reminder, *none* of these are Slang API objects. All +// of them come from the utility library we are using to simplify +// building an example program. +// +gfx::ApplicationContext* gAppContext; +gfx::Window* gWindow; +RefPtr<gfx::Renderer> gRenderer; +RefPtr<gfx::BufferResource> gConstantBuffer; + +RefPtr<gfx::PipelineLayout> gPipelineLayout; +RefPtr<gfx::PipelineState> gPipelineState; +RefPtr<gfx::DescriptorSet> gDescriptorSet; + +RefPtr<gfx::BufferResource> gVertexBuffer; + +// Now that we've covered the function that actually loads and +// compiles our Slang shader code, we can go through the rest +// of the application code without as much commentary. +// +Result initialize() +{ + // Create a window for our application to render into. + // + WindowDesc windowDesc; + windowDesc.title = "Hello, World!"; + windowDesc.width = gWindowWidth; + windowDesc.height = gWindowHeight; + gWindow = createWindow(windowDesc); + + // Initialize the rendering layer. + // + // Note: for now we are hard-coding logic to use the + // Direct3D11 back-end for the graphics API abstraction. + // A future version of this example may support multiple + // platforms/APIs. + // + gRenderer = createD3D11Renderer(); + Renderer::Desc rendererDesc; + rendererDesc.width = gWindowWidth; + rendererDesc.height = gWindowHeight; + { + Result res = gRenderer->initialize(rendererDesc, getPlatformWindowHandle(gWindow)); + if(SLANG_FAILED(res)) return res; + } + + // Create a constant buffer for passing the model-view-projection matrix. 
+ // + // Note: the Slang API supports reflection which could be used + // to query the size of the `Uniforms` constant buffer, but we + // will not deal with that here because Slang also supports + // applications that want to hard-code things like memory + // layout and parameter locations. 
+ // + int constantBufferSize = 16 * sizeof(float); + + BufferResource::Desc constantBufferDesc; + constantBufferDesc.init(constantBufferSize); + constantBufferDesc.setDefaults(Resource::Usage::ConstantBuffer); + constantBufferDesc.cpuAccessFlags = Resource::AccessFlag::Write; + + gConstantBuffer = gRenderer->createBufferResource( + Resource::Usage::ConstantBuffer, + constantBufferDesc); + if(!gConstantBuffer) return SLANG_FAIL; + + // Now we will create objects needed to configure the "input assembler" + // (IA) stage of the D3D pipeline. + // + // First, we create an input layout: + // + InputElementDesc inputElements[] = { + { "POSITION", 0, Format::RGB_Float32, offsetof(Vertex, position) }, + { "COLOR", 0, Format::RGB_Float32, offsetof(Vertex, color) }, + }; + auto inputLayout = gRenderer->createInputLayout( + &inputElements[0], + 2); + if(!inputLayout) return SLANG_FAIL; + + // Next we allocate a vertex buffer for our pre-initialized + // vertex data. + // + BufferResource::Desc vertexBufferDesc; + vertexBufferDesc.init(kVertexCount * sizeof(Vertex)); + vertexBufferDesc.setDefaults(Resource::Usage::VertexBuffer); + gVertexBuffer = gRenderer->createBufferResource( + Resource::Usage::VertexBuffer, + vertexBufferDesc, + &kVertexData[0]); + if(!gVertexBuffer) return SLANG_FAIL; + + // Now we will use our `loadShaderProgram` function to load + // the code from `shaders.slang` into the graphics API. + // + RefPtr<ShaderProgram> shaderProgram = loadShaderProgram(gRenderer); + if(!shaderProgram) return SLANG_FAIL; + + // Our example graphics API uses a "modern" D3D12/Vulkan style + // of resource binding, so now we will dive into describing and + // allocating "descriptor sets." + // + // First, we need to construct a descriptor set *layout*. 
+ // + DescriptorSetLayout::SlotRangeDesc slotRanges[] = + { + DescriptorSetLayout::SlotRangeDesc(DescriptorSlotType::UniformBuffer), + }; + DescriptorSetLayout::Desc descriptorSetLayoutDesc; + descriptorSetLayoutDesc.slotRangeCount = 1; + descriptorSetLayoutDesc.slotRanges = &slotRanges[0]; + auto descriptorSetLayout = gRenderer->createDescriptorSetLayout(descriptorSetLayoutDesc); + if(!descriptorSetLayout) return SLANG_FAIL; + + // Next we will allocate a pipeline layout, which specifies + // that we will render with only a single descriptor set bound. + // + + PipelineLayout::DescriptorSetDesc descriptorSets[] = + { + PipelineLayout::DescriptorSetDesc( descriptorSetLayout ), + }; + PipelineLayout::Desc pipelineLayoutDesc; + pipelineLayoutDesc.renderTargetCount = 1; + pipelineLayoutDesc.descriptorSetCount = 1; + pipelineLayoutDesc.descriptorSets = &descriptorSets[0]; + auto pipelineLayout = gRenderer->createPipelineLayout(pipelineLayoutDesc); + if(!pipelineLayout) return SLANG_FAIL; + + gPipelineLayout = pipelineLayout; + + // Once we have the descriptor set layout, we can allocate + // and fill in a descriptor set to hold our parameters. + // + auto descriptorSet = gRenderer->createDescriptorSet(descriptorSetLayout); + if(!descriptorSet) return SLANG_FAIL; + + descriptorSet->setConstantBuffer(0, 0, gConstantBuffer); + + gDescriptorSet = descriptorSet; + + // Following the D3D12/Vulkan style of API, we need a pipeline state object + // (PSO) to encapsulate the configuration of the overall graphics pipeline. + // + GraphicsPipelineStateDesc desc; + desc.pipelineLayout = gPipelineLayout; + desc.inputLayout = inputLayout; + desc.program = shaderProgram; + desc.renderTargetCount = 1; + auto pipelineState = gRenderer->createGraphicsPipelineState(desc); + if(!pipelineState) return SLANG_FAIL; + + gPipelineState = pipelineState; + + // Once we've initialized all the graphics API objects, + // it is time to show our application window and start rendering. 
+ // + showWindow(gWindow); + + return SLANG_OK; +} + +// With the initialization out of the way, we can now turn our attention +// to the per-frame rendering logic. As with the initialization, there is +// nothing really Slang-specific here, so the commentary doesn't need +// to be very detailed. +// +void renderFrame() +{ + // We start by clearing our framebuffer, which only has a color target. + // + static const float kClearColor[] = { 0.25, 0.25, 0.25, 1.0 }; + gRenderer->setClearColor(kClearColor); + gRenderer->clearFrame(); + + // We update our constant buffer per-frame, just for the purposes + // of the example, but we don't actually load different data + // per-frame (we always use an identity projection). + // + if(float* data = (float*) gRenderer->map(gConstantBuffer, MapFlavor::WriteDiscard)) + { + static const float kIdentity[] = + { + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1 }; + memcpy(data, kIdentity, sizeof(kIdentity)); + + gRenderer->unmap(gConstantBuffer); + } + + // Now we configure our graphics pipeline state by setting the + // PSO, binding our descriptor set (which references the + // constant buffer that we wrote to above), and setting + // some additional bits of state, before drawing our triangle. + // + gRenderer->setPipelineState(PipelineType::Graphics, gPipelineState); + gRenderer->setDescriptorSet(PipelineType::Graphics, gPipelineLayout, 0, gDescriptorSet); + + gRenderer->setVertexBuffer(0, gVertexBuffer, sizeof(Vertex)); + gRenderer->setPrimitiveTopology(PrimitiveTopology::TriangleList); + + gRenderer->draw(3); + + // With that, we are done drawing for one frame, and ready for the next. + // + gRenderer->presentFrame(); +} + +void finalize() +{ + // All of our graphics API objects are reference-counted, + // so there isn't any additional cleanup work that needs + // to be done in this simple example. 
+} + +}; + +// This "inner" main function is used by the platform abstraction +// layer to deal with differences in how an entry point needs +// to be defined for different platforms. +// +void innerMain(ApplicationContext* context) +{ + // We construct an instance of our example application + // `struct` type, and then walk through the lifecycle + // of the application. + + HelloWorld app; + + if (SLANG_FAILED(app.initialize())) + { + return exitApplication(context, 1); + } + + while(dispatchEvents(context)) + { + app.renderFrame(); + } + + app.finalize(); +} + +// This macro instantiates an appropriate main function to +// invoke the `innerMain` above. +// +GFX_UI_MAIN(innerMain) diff --git a/examples/hello-world/shaders.slang b/examples/hello-world/shaders.slang new file mode 100644 index 000000000..2df26b3d9 --- /dev/null +++ b/examples/hello-world/shaders.slang @@ -0,0 +1,63 @@ +// shaders.slang + +// +// This file provides a simple vertex and fragment shader that can be compiled +// using Slang. This code should also be valid as HLSL, and thus it does not +// use any of the new language features supported by Slang. +// + +// Uniform data to be passed from application -> shader. +cbuffer Uniforms +{ + float4x4 modelViewProjection; +} + +// Per-vertex attributes to be assembled from bound vertex buffers. +struct AssembledVertex +{ + float3 position : POSITION; + float3 color : COLOR; +}; + +// Output of the vertex shader, and input to the fragment shader. 
+struct CoarseVertex +{ + float3 color; +}; + +// Output of the fragment shader +struct Fragment +{ + float4 color; +}; + +// Vertex Shader + +struct VertexStageOutput +{ + CoarseVertex coarseVertex : CoarseVertex; + float4 sv_position : SV_Position; +}; +VertexStageOutput vertexMain( + AssembledVertex assembledVertex) +{ + VertexStageOutput output; + + float3 position = assembledVertex.position; + float3 color = assembledVertex.color; + + output.coarseVertex.color = color; + output.sv_position = mul(modelViewProjection, float4(position, 1.0)); + + return output; +} + +// Fragment Shader + +float4 fragmentMain( + CoarseVertex coarseVertex : CoarseVertex) : SV_Target +{ + float3 color = coarseVertex.color; + + return float4(color, 1.0); +} diff --git a/examples/hello/README.md b/examples/hello/README.md deleted file mode 100644 index 31a983428..000000000 --- a/examples/hello/README.md +++ /dev/null @@ -1,12 +0,0 @@ -Slang "Hello World" Example -=========================== - -The goal of this example is to demonstrate an almost minimal application that uses Slang for shading. - -The `hello.slang` file contains simple vertex and fragment shader entry points. The shader code should compile as either Slang or HLSL code (that is, this example does not show off any new Slang language features). - -The `hello.cpp` file contains the C++ application code, showing how to use the Slang C API to load and compile the shader code to DirectX shader bytecode (DXBC). -The application perform rendering using the D3D11 API, through a platform and graphics API abstraction layer that is implemented in `tools/slang-graphics`. -Note that this abstraction layer is *not* required in order to work with Slang, and it is just there to help us write example applications more conveniently. 
- -This example is not necessarily representative of best practices for integrating Slang into a production engine; the goal is merely to use the minimum amount of code possible to demonstrate a complete applicaiton that uses Slang. diff --git a/examples/hello/hello.cpp b/examples/hello/hello.cpp deleted file mode 100644 index 8f2fbca0b..000000000 --- a/examples/hello/hello.cpp +++ /dev/null @@ -1,380 +0,0 @@ -// hello.cpp - -// This file implements an extremely simple example of loading and -// executing a Slang shader program. -// -// The comments in the file will attempt to explain concepts as -// they are introduced. -// -// Of course, in order to use the Slang API, we need to include -// its header. We have set up the build options for this project -// so that it is as simple as: -// -#include -// -// Other build setups are possible, and Slang doesn't assume that -// its include directory must be added to your global include -// path. - -// For the purposes of keeping the demo code as simple as possible, -// while still retaining some level of portability, our examples -// make use of a small platform and graphics API abstraction layer, -// which is included in the Slang source distribution under the -// `tools/` directory. -// -// Applications can of course use Slang without ever touching this -// abstraction layer, so we will not focus on it when explaining -// examples, except in places where best practices for interacting -// with Slang may depend on an application/engine making certain -// design choices in their abstraction layer. -// -#include "slang-graphics/render.h" -#include "slang-graphics/render-d3d11.h" -#include "slang-graphics/window.h" -using namespace slang_graphics; - -// We will start with a function that will invoke the Slang compiler -// to generate target-specific code from a shader file, and then -// use that to initialize an API shader program. 
-// -// Note that `Renderer` and `ShaderProgram` here are types from -// the graphics API abstraction layer, and *not* part of the -// Slang API. This function is representative of code that a user -// might write to integrate Slang into their renderer/engine. -// -ShaderProgram* loadShaderProgram(Renderer* renderer) -{ - // First, we need to create a "session" for interacting with the Slang - // compiler. This scopes all of our application's interactions - // with the Slang library. At the moment, creating a session causes - // Slang to load and validate its standard library, so this is a - // somewhat heavy-weight operation. When possible, an application - // should try to re-use the same session across multiple compiles. - SlangSession* slangSession = spCreateSession(NULL); - - // A compile request represents a single invocation of the compiler, - // to process some inputs and produce outputs (or errors). - SlangCompileRequest* slangRequest = spCreateCompileRequest(slangSession); - - // We would like to request a single target (output) format: DirectX shader bytecode (DXBC) - int targetIndex = spAddCodeGenTarget(slangRequest, SLANG_DXBC); - - // We will specify the desired "profile" for this one target in terms of the - // DirectX "shader model" that should be supported. - spSetTargetProfile(slangRequest, targetIndex, spFindProfile(slangSession, "sm_4_0")); - - // A compile request can include one or more "translation units," which more or - // less amount to individual source files (think `.c` files, not the `.h` files they - // might include). - // - // For this example, our code will all be in the Slang language. The user may - // also specify HLSL input here, but that currently doesn't affect the compiler's - // behavior much. - int translationUnitIndex = spAddTranslationUnit(slangRequest, SLANG_SOURCE_LANGUAGE_SLANG, nullptr); - - // We will load source code for our translation unit from the file `hello.slang`. 
- // There are also variations of this API for adding source code from application-provided buffers. - spAddTranslationUnitSourceFile(slangRequest, translationUnitIndex, "hello.slang"); - - // Next we will specify the entry points we'd like to compile. - // It is often convenient to put more than one entry point in the same file, - // and the Slang API makes it convenient to use a single run of the compiler - // to compile all entry points. - // - // For each entry point, we need to specify the name of a function, the - // translation unit in which that function can be found, and the stage - // that we need to compile for (e.g., vertex, fragment, geometry, ...). - // - char const* vertexEntryPointName = "vertexMain"; - char const* fragmentEntryPointName = "fragmentMain"; - int vertexIndex = spAddEntryPoint(slangRequest, translationUnitIndex, vertexEntryPointName, SLANG_STAGE_VERTEX); - int fragmentIndex = spAddEntryPoint(slangRequest, translationUnitIndex, fragmentEntryPointName, SLANG_STAGE_FRAGMENT); - - // Once all of the input options for the compiler have been specified, - // we can invoke `spCompile` to run the compiler and see if any errors - // were detected. - // - const SlangResult compileRes = spCompile(slangRequest); - - // Even if there were no errors that forced compilation to fail, the - // compiler may have produced "diagnostic" output such as warnings. - // We will go ahead and print that output here. - // - if (auto diagnostics = spGetDiagnosticOutput(slangRequest)) - { - reportError("%s", diagnostics); - } - - // If compilation failed, there is no point in continuing any further. - if (SLANG_FAILED(compileRes)) - { - spDestroyCompileRequest(slangRequest); - spDestroySession(slangSession); - return nullptr; - } - - // If compilation was successful, then we will extract the code for - // our two entry points as "blobs". 
- // - // If you are using a D3D API, then your application may want to - // take advantage of the fact taht these blobs are binary compatible - // with the `ID3DBlob`, `ID3D10Blob`, etc. interfaces. - - ISlangBlob* vertexShaderBlob = nullptr; - spGetEntryPointCodeBlob(slangRequest, vertexIndex, 0, &vertexShaderBlob); - - ISlangBlob* fragmentShaderBlob = nullptr; - spGetEntryPointCodeBlob(slangRequest, fragmentIndex, 0, &fragmentShaderBlob); - - // We extract the begin/end pointers to the output code buffers - // using operations on the `ISlangBlob` interface. - char const* vertexCode = (char const*)vertexShaderBlob->getBufferPointer(); - char const* vertexCodeEnd = vertexCode + vertexShaderBlob->getBufferSize(); - - char const* fragmentCode = (char const*)fragmentShaderBlob->getBufferPointer(); - char const* fragmentCodeEnd = fragmentCode + fragmentShaderBlob->getBufferSize(); - - // Once we have extract the output blobs, it is safe to destroy - // the compile request and even the session. - // - spDestroyCompileRequest(slangRequest); - spDestroySession(slangSession); - - // Now we use the operations of the example graphics API abstraction - // layer to load shader code into the underlying API. - // - // Reminder: this section does not involve the Slang API at all. - // - - ShaderProgram::KernelDesc kernelDescs[] = - { - { StageType::Vertex, vertexCode, vertexCodeEnd }, - { StageType::Fragment, fragmentCode, fragmentCodeEnd }, - }; - - ShaderProgram::Desc programDesc; - programDesc.pipelineType = PipelineType::Graphics; - programDesc.kernels = &kernelDescs[0]; - programDesc.kernelCount = 2; - - ShaderProgram* shaderProgram = renderer->createProgram(programDesc); - - // Once we've used the output blobs from the Slang compiler to initialize - // the API-specific shader program, we can release their memory. 
- // - vertexShaderBlob->release(); - fragmentShaderBlob->release(); - - return shaderProgram; -} - -// -// The above function shows the core of what is required to use the -// Slang API as a simple compiler (e.g., a drop-in replacement for -// fxc or dxc). -// -// The rest of this file implements an extremely simple rendering application -// that will execute the vertex/fragment shaders loaded with the function -// we have just defined. -// - -// We will hard-code the size of our rendering window. -// -static int gWindowWidth = 1024; -static int gWindowHeight = 768; - -// For the purposes of a small example, we will define the vertex data for a -// single triangle directly in the source file. It should be easy to extend -// this example to load data from an external source, if desired. -// -struct Vertex -{ - float position[3]; - float color[3]; -}; - -static const int kVertexCount = 3; -static const Vertex kVertexData[kVertexCount] = -{ - { { 0, 0, 0.5 },{ 1, 0, 0 } }, - { { 0, 1, 0.5 },{ 0, 0, 1 } }, - { { 1, 0, 0.5 },{ 0, 1, 0 } }, -}; - -// We will define global variables for the various platform and -// graphics API objects that our application needs: -// -// As a reminder, *none* of these are Slang API objects. All -// of them come from the utility library we are using to simplify -// building an example program. -// -ApplicationContext* gAppContext; -Window* gWindow; -Renderer* gRenderer; -BufferResource* gConstantBuffer; -InputLayout* gInputLayout; -BufferResource* gVertexBuffer; -ShaderProgram* gShaderProgram; -BindingState* gBindingState; - -SlangResult initialize() -{ - // Create a window for our application to render into. - WindowDesc windowDesc; - windowDesc.title = "Hello, World!"; - windowDesc.width = gWindowWidth; - windowDesc.height = gWindowHeight; - gWindow = createWindow(windowDesc); - - // Initialize the rendering layer. - // - // Note: for now we are hard-coding logic to use the - // Direct3D11 back-end for the graphics API abstraction. 
- // A future version of this example may support multiple - // platforms/APIs. - // - gRenderer = createD3D11Renderer(); - Renderer::Desc rendererDesc; - rendererDesc.width = gWindowWidth; - rendererDesc.height = gWindowHeight; - { - const SlangResult res = gRenderer->initialize(rendererDesc, getPlatformWindowHandle(gWindow)); - if (SLANG_FAILED(res)) return res; - } - - // Create a constant buffer for passing the model-view-projection matrix. - // - // TODO: A future version of this example will show how to - // use the Slang reflection API to query the required size - // for the data in this constant buffer. - // - int constantBufferSize = 16 * sizeof(float); - - BufferResource::Desc constantBufferDesc; - constantBufferDesc.init(constantBufferSize); - constantBufferDesc.setDefaults(Resource::Usage::ConstantBuffer); - constantBufferDesc.cpuAccessFlags = Resource::AccessFlag::Write; - - gConstantBuffer = gRenderer->createBufferResource( - Resource::Usage::ConstantBuffer, - constantBufferDesc); - if (!gConstantBuffer) return SLANG_FAIL; - - // Input Assembler (IA) - - // Input Layout - - InputElementDesc inputElements[] = { - { "POSITION", 0, Format::RGB_Float32, offsetof(Vertex, position) }, - { "COLOR", 0, Format::RGB_Float32, offsetof(Vertex, color) }, - }; - gInputLayout = gRenderer->createInputLayout( - &inputElements[0], - 2); - if (!gInputLayout) return SLANG_FAIL; - - // Vertex Buffer - - BufferResource::Desc vertexBufferDesc; - vertexBufferDesc.init(kVertexCount * sizeof(Vertex)); - vertexBufferDesc.setDefaults(Resource::Usage::VertexBuffer); - - gVertexBuffer = gRenderer->createBufferResource( - Resource::Usage::VertexBuffer, - vertexBufferDesc, - &kVertexData[0]); - if (!gVertexBuffer) return SLANG_FAIL; - - // Shaders (VS, PS, ...) 
- - gShaderProgram = loadShaderProgram(gRenderer); - if (!gShaderProgram) return SLANG_FAIL; - - // Resource binding state - - BindingState::Desc bindingStateDesc; - bindingStateDesc.addBufferResource(gConstantBuffer, BindingState::RegisterRange::makeSingle(0)); - gBindingState = gRenderer->createBindingState(bindingStateDesc); - - // Once we've initialized all the graphics API objects, - // it is time to show our application window and start rendering. - - showWindow(gWindow); - - return SLANG_OK; -} - -void renderFrame() -{ - // Clear our framebuffer (color target only) - // - static const float kClearColor[] = { 0.25, 0.25, 0.25, 1.0 }; - gRenderer->setClearColor(kClearColor); - gRenderer->clearFrame(); - - // We update our constant buffer per-frame, just for the purposes - // of the example, but we don't actually load different data - // per-frame (we always use an identity projection). - // - if (float* data = (float*)gRenderer->map(gConstantBuffer, MapFlavor::WriteDiscard)) - { - static const float kIdentity[] = { - 1, 0, 0, 0, - 0, 1, 0, 0, - 0, 0, 1, 0, - 0, 0, 0, 1 }; - memcpy(data, kIdentity, sizeof(kIdentity)); - - gRenderer->unmap(gConstantBuffer); - } - - // Input Assembler (IA) - - gRenderer->setInputLayout(gInputLayout); - gRenderer->setPrimitiveTopology(PrimitiveTopology::TriangleList); - - UInt vertexStride = sizeof(Vertex); - UInt vertexBufferOffset = 0; - gRenderer->setVertexBuffers(0, 1, &gVertexBuffer, &vertexStride, &vertexBufferOffset); - - // Vertex Shader (VS) - // Pixel Shader (PS) - - gRenderer->setShaderProgram(gShaderProgram); - gRenderer->setBindingState(gBindingState); - - // - - gRenderer->draw(3); - - gRenderer->presentFrame(); -} - -void finalize() -{ - // TODO: Proper cleanup. -} - -// This "inner" main function is used by the platform abstraction -// layer to deal with differences in how an entry point needs -// to be defined for different platforms. 
-// -void innerMain(ApplicationContext* context) -{ - if (SLANG_FAILED(initialize())) - { - return exitApplication(context, 1); - } - - while (dispatchEvents(context)) - { - renderFrame(); - } - - finalize(); -} - -// This macro instantiates an appropriate main function to -// invoke the `innerMain` above. -// -SG_UI_MAIN(innerMain) diff --git a/examples/hello/hello.slang b/examples/hello/hello.slang deleted file mode 100644 index 5a68979ce..000000000 --- a/examples/hello/hello.slang +++ /dev/null @@ -1,76 +0,0 @@ -// hello.slang - -// This file provides a simple vertex and fragment shader that can be compiled -// using Slang. This code should also be valid as HLSL, and thus it does not -// use any of the new language features supported by Slang. - -cbuffer Uniforms -{ - float4x4 modelViewProjection; -} - -struct AssembledVertex -{ - float3 position : POSITION; - float3 color : COLOR; -}; - -struct CoarseVertex -{ - float3 color; -}; - -struct Fragment -{ - float4 color; -}; - - -// Vertex Shader - -struct VertexStageInput -{ - AssembledVertex assembledVertex; -}; - -struct VertexStageOutput -{ - CoarseVertex coarseVertex : CoarseVertex; - float4 sv_position : SV_Position; -}; - -VertexStageOutput vertexMain(VertexStageInput input) -{ - VertexStageOutput output; - - float3 position = input.assembledVertex.position; - float3 color = input.assembledVertex.color; - - output.coarseVertex.color = color; - output.sv_position = mul(modelViewProjection, float4(position, 1.0)); - - return output; -} - -// Fragment Shader - -struct FragmentStageInput -{ - CoarseVertex coarseVertex : CoarseVertex; -}; - -struct FragmentStageOutput -{ - Fragment fragment : SV_Target; -}; - -FragmentStageOutput fragmentMain(FragmentStageInput input) -{ - FragmentStageOutput output; - - float3 color = input.coarseVertex.color; - - output.fragment.color = float4(color, 1.0); - - return output; -} diff --git a/examples/hello/hello.sln b/examples/hello/hello.sln deleted file mode 100644 index 
3ddf262df..000000000 --- a/examples/hello/hello.sln +++ /dev/null @@ -1,28 +0,0 @@ - -Microsoft Visual Studio Solution File, Format Version 12.00 -# Visual Studio 14 -VisualStudioVersion = 14.0.25420.1 -MinimumVisualStudioVersion = 10.0.40219.1 -Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "hello", "hello.vcxproj", "{E6385042-1649-4803-9EBD-168F8B7EF131}" -EndProject -Global - GlobalSection(SolutionConfigurationPlatforms) = preSolution - Debug|x64 = Debug|x64 - Debug|x86 = Debug|x86 - Release|x64 = Release|x64 - Release|x86 = Release|x86 - EndGlobalSection - GlobalSection(ProjectConfigurationPlatforms) = postSolution - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|x64.ActiveCfg = Debug|x64 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|x64.Build.0 = Debug|x64 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|x86.ActiveCfg = Debug|Win32 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|x86.Build.0 = Debug|Win32 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|x64.ActiveCfg = Release|x64 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|x64.Build.0 = Release|x64 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|x86.ActiveCfg = Release|Win32 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|x86.Build.0 = Release|Win32 - EndGlobalSection - GlobalSection(SolutionProperties) = preSolution - HideSolutionNode = FALSE - EndGlobalSection -EndGlobal diff --git a/examples/hello/hello.vcxproj b/examples/hello/hello.vcxproj deleted file mode 100644 index 885c2ff86..000000000 --- a/examples/hello/hello.vcxproj +++ /dev/null @@ -1,184 +0,0 @@ - - - - - Debug - Win32 - - - Debug - x64 - - - Release - Win32 - - - Release - x64 - - - - {E6385042-1649-4803-9EBD-168F8B7EF131} - true - Win32Proj - hello - - - - Application - true - Unicode - v140 - - - Application - true - Unicode - v140 - - - Application - false - Unicode - v140 - - - Application - false - Unicode - v140 - - - - - - - - - - - - - - - - - - - true - ..\..\bin\windows-x86\debug\ - ..\..\intermediate\windows-x86\debug\hello\ 
- hello - .exe - - - true - ..\..\bin\windows-x64\debug\ - ..\..\intermediate\windows-x64\debug\hello\ - hello - .exe - - - false - ..\..\bin\windows-x86\release\ - ..\..\intermediate\windows-x86\release\hello\ - hello - .exe - - - false - ..\..\bin\windows-x64\release\ - ..\..\intermediate\windows-x64\release\hello\ - hello - .exe - - - - NotUsing - Level3 - _DEBUG;%(PreprocessorDefinitions) - ..\..;..\..\tools;%(AdditionalIncludeDirectories) - EditAndContinue - Disabled - MultiThreadedDebug - - - Windows - true - - - - - NotUsing - Level3 - _DEBUG;%(PreprocessorDefinitions) - ..\..;..\..\tools;%(AdditionalIncludeDirectories) - EditAndContinue - Disabled - MultiThreadedDebug - - - Windows - true - - - - - NotUsing - Level3 - NDEBUG;%(PreprocessorDefinitions) - ..\..;..\..\tools;%(AdditionalIncludeDirectories) - Full - true - true - false - true - MultiThreaded - - - Windows - true - true - - - - - NotUsing - Level3 - NDEBUG;%(PreprocessorDefinitions) - ..\..;..\..\tools;%(AdditionalIncludeDirectories) - Full - true - true - false - true - MultiThreaded - - - Windows - true - true - - - - - - - - - - - {DB00DA62-0533-4AFD-B59F-A67D5B3A0808} - - - {F9BE7957-8399-899E-0C49-E714FDDD4B65} - - - {222F7498-B40C-4F3F-A704-DDEB91A4484A} - - - - - - \ No newline at end of file diff --git a/examples/hello/hello.vcxproj.filters b/examples/hello/hello.vcxproj.filters deleted file mode 100644 index 6855e69cc..000000000 --- a/examples/hello/hello.vcxproj.filters +++ /dev/null @@ -1,18 +0,0 @@ - - - - - {E9C7FDCE-D52A-8D73-7EB0-C5296AF258F6} - - - - - Source Files - - - - - Source Files - - - \ No newline at end of file diff --git a/examples/model-viewer/README.md b/examples/model-viewer/README.md new file mode 100644 index 000000000..a350a48a2 --- /dev/null +++ b/examples/model-viewer/README.md @@ -0,0 +1,25 @@ +Model Viewer Example +==================== + +This example expands on the simple Slang API integration from the "Hello, World" example by actually loading and rendering 
model data with extremely basic surface and light shading. + +This time, the shader code is making use of various Slang language features, so readers may want to read through `shaders.slang` to see an example of how the various mechanisms can be used to build out a more complicated shader library. +While the shader code in this example is still simplistic, it shows examples of: + +* Using multiple Slang `ParameterBlock`s to manage the space of shader parameter bindings in a graphics-API-independent fashion, while still taking advantage of the performance opportunities afforded by D3D12 and Vulkan. + +* Using `interface`s and generics to express multiple variations of a feature with static specialization, in place of more traditional preprocessor techniques. + +The application code in `main.cpp` also shows a more advanced integration of the Slang API than that in the "Hello, World" example, including examples of: + +* Loading a library of Slang shader code to perform reflection on its types *without* specifying a particular entry point to generate code for + +* Using Slang's reflection information to allocate graphics-API objects to implement parameter blocks (e.g., D3D12/Vulkan descriptor tables/sets) + +* Performing on-demand specialization of Slang's generics using type information from parameter blocks to achieve simple shader specialization + +It is perhaps worth taking note of the two things this example intentionally does *not* do: + +* There is no use of the C-style preprocessor in the shader code presented, in order to demonstrate that shader specialization can be achieved without preprocessor techniques. + +* There is no use of explicit parameter binding decorations (e.g., HLSL `register` or GLSL `layout` modifiers), in order to demonstrate that these are not needed in order to achieve high-performance shader parameter binding. 
diff --git a/examples/model-viewer/cube.mtl b/examples/model-viewer/cube.mtl new file mode 100644 index 000000000..6634af823 --- /dev/null +++ b/examples/model-viewer/cube.mtl @@ -0,0 +1,8 @@ +newmtl Material +Ns 96.078431 +Ka 0.000000 0.000000 0.000000 +Kd 0.640000 0.640000 0.640000 +Ks 0.500000 0.500000 0.500000 +Ni 1.000000 +d 1.000000 +illum 2 diff --git a/examples/model-viewer/cube.obj b/examples/model-viewer/cube.obj new file mode 100644 index 000000000..7226aaa77 --- /dev/null +++ b/examples/model-viewer/cube.obj @@ -0,0 +1,24 @@ +mtllib cube.mtl +o Cube +v 1.000000 -1.000000 -1.000000 +v 1.000000 -1.000000 1.000000 +v -1.000000 -1.000000 1.000000 +v -1.000000 -1.000000 -1.000000 +v 1.000000 1.000000 -1.000000 +v 1.000000 1.000000 1.000000 +v -1.000000 1.000000 1.000000 +v -1.000000 1.000000 -1.000000 +vn 0.000000 -1.000000 0.000000 +vn 0.000000 1.000000 0.000000 +vn 1.000000 0.000000 0.000000 +vn 0.000000 0.000000 1.000000 +vn -1.000000 0.000000 0.000000 +vn 0.000000 0.000000 -1.000000 +usemtl Material +s off +f 1//1 2//1 3//1 4//1 +f 5//2 8//2 7//2 6//2 +f 1//3 5//3 6//3 2//3 +f 2//4 6//4 7//4 3//4 +f 3//5 7//5 8//5 4//5 +f 5//6 1//6 4//6 8//6 diff --git a/examples/model-viewer/main.cpp b/examples/model-viewer/main.cpp new file mode 100644 index 000000000..cd6b404ee --- /dev/null +++ b/examples/model-viewer/main.cpp @@ -0,0 +1,1618 @@ +// main.cpp + +// +// This example is much more involved than the `hello-world` example, +// so readers are encouraged to work through the simpler code first +// before diving into this application. We will gloss over parts of +// the code that are similar to the code in `hello-world`, and +// instead focus on the new code that is required to use Slang in +// more advanced ways. +// + +// We still need to include the Slang header to use the Slang API +// +#include <slang.h> + +// We will again make use of a simple graphics API abstraction +// layer, just to keep the examples short and to the point. 
+// +#include "gfx/model.h" +#include "gfx/render.h" +#include "gfx/render-d3d11.h" +#include "gfx/vector-math.h" +#include "gfx/window.h" +using namespace gfx; + +// We will use a few utilities from the C++ standard library, +// just to keep the code short. Note that the Slang API does +// not use or require any C++ standard library features. +// +#include +#include + +// A larger application will typically want to load/compile +// multiple modules/files of shader code. When using the +// Slang API, some one-time setup work can be amortized +// across multiple modules by using a single Slang +// "session" across multiple compiles. +// +// To that end, our application will use a function-`static` +// variable to create a session on demand and re-use it +// for the duration of the application. +// +SlangSession* getSlangSession() +{ + static SlangSession* slangSession = spCreateSession(NULL); + return slangSession; +} + +// This application is going to build its own layered +// application-specific abstractions on top of Slang, +// so it will have its own notion of a shader "module," +// which comprises the results of a Slang compilation, +// including the reflection information. +// +struct ShaderModule : RefObject +{ + // The file that the module was loaded from. + std::string inputPath; + + // Slang compile request and reflection data. + SlangCompileRequest* slangRequest; + slang::ShaderReflection* slangReflection; + + // Reference to the renderer, used to service requests + // that load graphics API objects based on the module. + RefPtr renderer; +}; +// +// In order to load a shader module from a `.slang` file on +// disk, we will use a Slang compile session, much like +// how the earlier Hello World example loaded shader code. +// +// We will point out major differences between the earlier +// example's `loadShaderProgram()` function, and how this function +// loads a module for reflection purposes. 
+// +RefPtr loadShaderModule(Renderer* renderer, char const* inputPath) +{ + auto slangSession = getSlangSession(); + SlangCompileRequest* slangRequest = spCreateCompileRequest(slangSession); + + // When *loading* the shader library, we will request that concrete + // kernel code *not* be generated, because the module might have + // unspecialized generic parameters. Instead, we will generate kernels + // on demand at runtime. + // + spSetCompileFlags( + slangRequest, + SLANG_COMPILE_FLAG_NO_CODEGEN); + + // The main logic for specifying target information and loading source + // code is the same as before with the notable change that we are *not* + // specifying specific vertex/fragment entry points to compile here. + // + // Instead, the `[shader(...)]` attributes used in `shaders.slang` will + // identify the entry points in the shader library to the compiler without + // specific action needing to be taken in the application. + // + int targetIndex = spAddCodeGenTarget(slangRequest, SLANG_DXBC); + spSetTargetProfile(slangRequest, targetIndex, spFindProfile(slangSession, "sm_4_0")); + int translationUnitIndex = spAddTranslationUnit(slangRequest, SLANG_SOURCE_LANGUAGE_SLANG, nullptr); + spAddTranslationUnitSourceFile(slangRequest, translationUnitIndex, inputPath); + int compileErr = spCompile(slangRequest); + if(auto diagnostics = spGetDiagnosticOutput(slangRequest)) + { + reportError("%s", diagnostics); + } + if(compileErr) + { + spDestroyCompileRequest(slangRequest); + spDestroySession(slangSession); + return nullptr; + } + auto slangReflection = (slang::ShaderReflection*) spGetReflection(slangRequest); + + // We will not destroy the Slang compile request here, because we want to + // keep it around to service reflection queries made from the application code.
+ // + RefPtr module = new ShaderModule(); + module->renderer = renderer; + module->inputPath = inputPath; + module->slangRequest = slangRequest; + module->slangReflection = slangReflection; + return module; +} + +// Once a shader module has been loaded, it is possible to look up +// individual entry points by their name to get reflection information, +// including the stage for which the entry point was compiled. +// +// As with `ShaderModule` above, the `EntryPoint` type is the application's +// wrapper around a Slang entry point. In this case it caches the +// identity of the target stage as encoded for the graphics API. +// +struct EntryPoint : RefObject +{ + // Name of the entry point function + std::string name; + + // Stage targeted by the entry point (Slang version) + SlangStage slangStage; + + // Stage targeted by the entry point (graphics API version) + gfx::StageType apiStage; +}; +// +// Loading an entry point from a module is a straightforward +// application of the Slang reflection API. +// +RefPtr loadEntryPoint( + ShaderModule* module, + char const* name) +{ + auto slangReflection = module->slangReflection; + + // Look up the Slang entry point based on its name, and bail + // out with an error if it isn't found. + // + auto slangEntryPoint = slangReflection->findEntryPointByName(name); + if(!slangEntryPoint) return nullptr; + + // Extract the stage of the entry point using the Slang API, + // and then try to map it to the corresponding stage as + // exposed by the graphics API. + // + auto slangStage = slangEntryPoint->getStage(); + StageType apiStage = StageType::Unknown; + switch(slangStage) + { + default: + return nullptr; + + case SLANG_STAGE_VERTEX: apiStage = gfx::StageType::Vertex; break; + case SLANG_STAGE_FRAGMENT: apiStage = gfx::StageType::Fragment; break; + } + + // Allocate an application object to hold on to this entry point + // so that we can use it in later specialization steps.
+ // + RefPtr entryPoint = new EntryPoint(); + entryPoint->name = name; + entryPoint->slangStage = slangEntryPoint->getStage(); + entryPoint->apiStage = apiStage; + return entryPoint; +} + +// In this application a `Program` represents a combination of entry +// points that will be used together (e.g., matching vertex and fragment +// entry points). +// +// Along with the entry points themselves, the `Program` object will +// cache information gleaned from Slang's reflection interface. Notably: +// +// * The number of `ParameterBlock`s that the program uses +// * Information about generic (type) parameters +// +struct Program : RefObject +{ + // The shader module that the program was loaded from. + RefPtr shaderModule; + + // The entry points that comprise the program + // (e.g., both a vertex and a fragment entry point). + std::vector> entryPoints; + + // The number of parameter blocks that are used by the shader + // program. This will be used by our rendering code later to + // decide how many descriptor set bindings should affect + // specialization/execution using this program. + // + int parameterBlockCount; + + // We will store information about the generic (type) parameters + // of the program. In particular, for each generic parameter + // we are going to find a parameter block that uses that + // generic type parameter. + // + // E.g., given input code like: + // + // type_param A; + // type_param B; + // + // ParameterBlock x; // block 0 + // ParameterBlock y; // block 1 + // ParameterBlock z; // block 2 + // + // We would have two `GenericParam` entries. The first one, + // for `A`, would store a `parameterBlockIndex` of `2`, because + // `A` is used as the type of the `z` parameter block. + // + // This information will be used later when we want to specialize + // shader code, because if `z` is bound using a `ParameterBlock` + // then we can infer that `A` should be bound to `Bar`.
+ // + struct GenericParam + { + int parameterBlockIndex; + }; + std::vector genericParams; +}; +// +// As with entry points, loading a program is done with +// the help of Slang's reflection API. +// +RefPtr loadProgram( + ShaderModule* module, + int entryPointCount, + const char* const* entryPointNames) +{ + auto slangReflection = module->slangReflection; + + RefPtr program = new Program(); + program->shaderModule = module; + + // We will loop over the entry point names that were requested, + // loading each and adding it to our program. + // + for(int ee = 0; ee < entryPointCount; ++ee) + { + auto entryPoint = loadEntryPoint(module, entryPointNames[ee]); + if(!entryPoint) + return nullptr; + program->entryPoints.push_back(entryPoint); + } + + // Next, we will look at the reflection information to see how + // many generic type parameters were declared, and allocate + // space in the `genericParams` array for them. + // + // We don't yet have enough information to fill in the + // `parameterBlockIndex` field. + // + auto genericParamCount = slangReflection->getTypeParameterCount(); + for(unsigned int pp = 0; pp < genericParamCount; ++pp) + { + auto slangGenericParam = slangReflection->getTypeParameterByIndex(pp); + + Program::GenericParam genericParam = {}; + program->genericParams.push_back(genericParam); + } + + // We want to specialize our shaders based on what gets bound + // in parameter blocks, so we will scan the shader parameters + // looking for `ParameterBlock` where `G` is one of our + // generic type parameters. + // + // We do this by iterating over *all* the global shader parameters, + // and looking for those that happen to be parameter blocks, and + // of those the ones where the "element type" of the parameter block + // is a generic type parameter.
+ // + auto paramCount = slangReflection->getParameterCount(); + int parameterBlockCounter = 0; + for(unsigned int pp = 0; pp < paramCount; ++pp) + { + auto slangParam = slangReflection->getParameterByIndex(pp); + + // Is it a parameter block? If not, skip it. + if(slangParam->getType()->getKind() != slang::TypeReflection::Kind::ParameterBlock) + continue; + + // Okay, we've found another parameter block, so we can compute its zero-based index. + int parameterBlockIndex = parameterBlockCounter++; + + // Get the element type of the parameter block, and if it isn't a generic type + // parameter, then skip it. + auto slangElementTypeLayout = slangParam->getTypeLayout()->getElementTypeLayout(); + if(slangElementTypeLayout->getKind() != slang::TypeReflection::Kind::GenericTypeParameter) + continue; + + // At this point we've found a `ParameterBlock` where `G` is a `type_param`, + // so we can store the index of the parameter block back into our array of + // generic type parameter info. + // + auto genericParamIndex = slangElementTypeLayout->getGenericParamIndex(); + program->genericParams[genericParamIndex].parameterBlockIndex = parameterBlockIndex; + } + + // The above loop over the global shader parameters will have found all the + // parameter blocks that were specified in the shader code, so now we know + // how many parameter blocks are expected to be bound when this program is used. + // + program->parameterBlockCount = parameterBlockCounter; + + return program; +} +// +// As a convenience, we will define a simple wrapper around `loadProgram` for the case +// where we have just two entry points, since that is what the application actually uses. 
+// +RefPtr loadProgram(ShaderModule* module, char const* entryPoint0, char const* entryPoint1) +{ + char const* entryPointNames[] = { entryPoint0, entryPoint1 }; + return loadProgram(module, 2, entryPointNames); +} + +// The `ParameterBlock` type is supported by the Slang language and compiler, +// but it is up to each application to map it down to whatever graphics API +// abstraction is most fitting. +// +// For our application, a parameter block will be implemented as a combination +// of Slang type reflection information (to determine the layout) plus a +// graphics API descriptor set object. +// +// Note: the example graphics API abstraction we are using exposes descriptor sets +// similar to those in Vulkan, and then maps these down to efficient alternatives +// on other APIs including D3D12, D3D11, and OpenGL. +// +// Every parameter block is allocated based on a particular layout, and we +// can share the same layout across multiple blocks: +// +struct ParameterBlockLayout : RefObject +{ + // The graphics API device that should be used to allocate parameter + // block instances. + // + RefPtr renderer; + + // The Slang type layout information that will be used to decide + // how much space is needed in instances of this layout. + // + // If the user declares a `ParameterBlock` parameter, then + // this will be the type layout information for `Batman`. + // + slang::TypeLayoutReflection* slangTypeLayout; + + // The size of the "primary" constant buffer that will hold any + // "ordinary" (not-resource) fields in the `slangTypeLayout` above. + // + size_t primaryConstantBufferSize; + + // API-specific layout information computed from `slangTypeLayout`. + // + RefPtr descriptorSetLayout; +}; +// +// A parameter block layout can be computed for any `struct` type +// declared in the user's shader code. We extract the relevant +// information from the type using the Slang reflection API.
+// +RefPtr getParameterBlockLayout( + ShaderModule* module, + char const* name) +{ + auto slangReflection = module->slangReflection; + auto renderer = module->renderer; + + // Look up the type with the given name, and bail out + // if no such type is found in the module. + // + auto type = slangReflection->findTypeByName(name); + if(!type) return nullptr; + + // Request layout information for the type. Note that a single + // type might be laid out differently for different compilation + // targets, or based on how it is used (e.g., as a `cbuffer` + // field vs. in a `StructuredBuffer`). + // + auto typeLayout = slangReflection->getTypeLayout(type); + if(!typeLayout) return nullptr; + + // If the type that is going in the parameter block has + // any ordinary data in it (as opposed to resources), then + // a constant buffer will be needed to hold that data. + // + // In turn any resource parameters would need to go into + // the descriptor set *after* this constant buffer. + // + size_t primaryConstantBufferSize = typeLayout->getSize(SLANG_PARAMETER_CATEGORY_UNIFORM); + + // We need to use the Slang reflection information to + // create a graphics-API-level descriptor-set layout that + // is compatible with the original declaration. + // + std::vector slotRanges; + + // If the type has any ordinary data, then the descriptor set + // will need a constant buffer to be the first thing it stores. + // + // Note: for a renderer only targeting D3D12, it might make + // sense to allocate this "primary" constant buffer as a root + // descriptor instead of inside the descriptor set (or at least + // do this *if* there are no non-uniform parameters). Policy + // decisions like that are up to the application, not Slang. + // This example application just does something simple.
+ // + if(primaryConstantBufferSize) + { + slotRanges.push_back( + gfx::DescriptorSetLayout::SlotRangeDesc( + gfx::DescriptorSlotType::UniformBuffer)); + } + + // Next, the application will recursively walk + // the structure of `typeLayout` to figure out what resource + // binding ranges are required for the target API. + // + // TODO: This application doesn't yet use any resource parameters, + // so we are skipping this step, but it is obviously needed + // for a fully fleshed-out example. + + // Now that we've collected the graphics-API level binding + // information, we can construct a graphics API descriptor set + // layout. + gfx::DescriptorSetLayout::Desc descriptorSetLayoutDesc; + descriptorSetLayoutDesc.slotRangeCount = slotRanges.size(); + descriptorSetLayoutDesc.slotRanges = slotRanges.data(); + auto descriptorSetLayout = renderer->createDescriptorSetLayout(descriptorSetLayoutDesc); + if(!descriptorSetLayout) return nullptr; + + RefPtr parameterBlockLayout = new ParameterBlockLayout(); + parameterBlockLayout->renderer = renderer; + parameterBlockLayout->primaryConstantBufferSize = primaryConstantBufferSize; + parameterBlockLayout->slangTypeLayout = typeLayout; + parameterBlockLayout->descriptorSetLayout = descriptorSetLayout; + return parameterBlockLayout; +} + +// A `ParameterBlock` abstracts over the allocated storage +// for a descriptor set, based on some `ParameterBlockLayout` +// +struct ParameterBlock : RefObject +{ + // The graphics API device used to allocate this block. + RefPtr renderer; + + // The associated parameter block layout. + RefPtr layout; + + // The (optional) constant buffer that holds the values + // for any ordinary fields. This will be null if + // `layout->primaryConstantBufferSize` is zero. + RefPtr primaryConstantBuffer; + + // The graphics-API descriptor set that provides storage + // for any resource fields.
+ RefPtr descriptorSet; + + // Map/unmap operations are provided to access the + // contents of the primary constant buffer. + void* map(); + void unmap(); + + // As a convenience, `pb->mapAs()` may be used as + // a declaration of intent, instead of `(X*) pb->map()` + template + T* mapAs() { return (T*)map(); } +}; + +// Allocating a parameter block is mostly a matter of allocating +// the required graphics API objects. +// +RefPtr allocateParameterBlockImpl( + ParameterBlockLayout* layout) +{ + auto renderer = layout->renderer; + + // A descriptor set is then used to provide the storage for all + // resource parameters (including the primary constant buffer, if any). + // + auto descriptorSet = renderer->createDescriptorSet( + layout->descriptorSetLayout); + + // If the parameter block has any ordinary data, then it requires + // a "primary" constant buffer to hold that data. + // + RefPtr primaryConstantBuffer = nullptr; + if(auto primaryConstantBufferSize = layout->primaryConstantBufferSize) + { + gfx::BufferResource::Desc bufferDesc; + bufferDesc.init(primaryConstantBufferSize); + bufferDesc.setDefaults(gfx::Resource::Usage::ConstantBuffer); + bufferDesc.cpuAccessFlags = gfx::Resource::AccessFlag::Write; + primaryConstantBuffer = renderer->createBufferResource( + gfx::Resource::Usage::ConstantBuffer, + bufferDesc); + + // The primary constant buffer will always be the first thing + // stored in the descriptor set for a parameter block. + // + descriptorSet->setConstantBuffer(0, 0, primaryConstantBuffer); + } + + // Now that we've allocated the graphics API objects, we can just + // allocate our application-side wrapper object to tie everything + // together.
+ // + RefPtr parameterBlock = new ParameterBlock(); + parameterBlock->renderer = renderer; + parameterBlock->layout = layout; + parameterBlock->primaryConstantBuffer = primaryConstantBuffer; + parameterBlock->descriptorSet = descriptorSet; + return parameterBlock; +} + +// A full-featured high-performance application would likely draw +// a distinction between "persistent" parameter blocks that are +// filled in once and then used over many frames, and "transient" +// blocks that are allocated, filled in, and discarded within +// a single frame. +// +// These two cases warrant very different allocation strategies, +// but for now we are using the same logic in both cases. +// +RefPtr allocatePersistentParameterBlock( + ParameterBlockLayout* layout) +{ + return allocateParameterBlockImpl(layout); +} +RefPtr allocateTransientParameterBlock( + ParameterBlockLayout* layout) +{ + return allocateParameterBlockImpl(layout); +} + +// As described earlier, it is convenient to be able +// to easily map the primary constant buffer of a parameter +// block, since this will hold the values for any ordinary fields. +// +void* ParameterBlock::map() +{ + return renderer->map( + primaryConstantBuffer, + MapFlavor::WriteDiscard); +} +void ParameterBlock::unmap() +{ + renderer->unmap(primaryConstantBuffer); +} + +// Our application code has a rudimentary material system, +// to match the `IMaterial` abstraction used in the shader code. +// +struct Material : RefObject +{ + // The key feature of a material in our application is that + // it can provide a parameter block that describes it and + // its parameters. The contents of the parameter block will + // be any colors, textures, etc. that the material needs, + // while the Slang type that was used to allocate the + // block will be an implementation of `IMaterial` that + // provides the evaluation logic for the material.
+ + // Each subclass of `Material` will provide a routine to + // create a parameter block of its chosen type/layout. + virtual RefPtr createParameterBlock() = 0; + + // The parameter block for a material will be stashed here + // after it is created. + RefPtr parameterBlock; +}; + +// For now we have only a single implementation of `Material`, +// which corresponds to the `SimpleMaterial` type in our shader +// code. +// +struct SimpleMaterial : Material +{ + // The `SimpleMaterial` shader type has only uniform data, + // so we declare a `struct` type for that data here. + struct Uniforms + { + glm::vec3 diffuseColor; + float pad; + }; + Uniforms uniforms; + + // When asked to create a parameter block, the `SimpleMaterial` + // type will allocate a block based on the corresponding + // shader type, and fill it in based on the data in the C++ + // object. + // + RefPtr createParameterBlock() override + { + auto parameterBlockLayout = gParameterBlockLayout; + auto parameterBlock = allocatePersistentParameterBlock( + parameterBlockLayout); + + if(auto u = parameterBlock->mapAs()) + { + *u = uniforms; + parameterBlock->unmap(); + } + + return parameterBlock; + } + + // We cache the corresponding parameter block layout for + // `SimpleMaterial` in a static variable so that we don't + // load it more than once. + // + static RefPtr gParameterBlockLayout; +}; +RefPtr SimpleMaterial::gParameterBlockLayout; + +// With the `Material` abstraction defined, we can go on to define +// the representation for loaded models that we will use. +// +// A `Model` will own vertex/index buffers, along with a list of meshes, +// while each `Mesh` will own a material and a range of indices. +// For this example we will be loading models from `.obj` files, but +// that is just a simple lowest-common-denominator choice. 
+// +struct Mesh : RefObject +{ + RefPtr material; + int firstIndex; + int indexCount; +}; +struct Model : RefObject +{ + typedef ModelLoader::Vertex Vertex; + + RefPtr vertexBuffer; + RefPtr indexBuffer; + PrimitiveTopology primitiveTopology; + int vertexCount; + int indexCount; + std::vector> meshes; +}; +// +// Loading a model from disk is done with the help of some utility +// code for parsing the `.obj` file format, so that the application +// mostly just registers some callbacks to allocate the objects +// used for its representation. +// +RefPtr loadModel( + Renderer* renderer, + char const* inputPath, + ModelLoader::LoadFlags loadFlags = 0, + float scale = 1.0f) +{ + // The model loading interface uses a C++ interface of + // callback functions to handle creating the application-specific + // representation of meshes, materials, etc. + // + struct Callbacks : ModelLoader::ICallbacks + { + void* createMaterial(MaterialData const& data) override + { + SimpleMaterial* material = new SimpleMaterial(); + material->uniforms.diffuseColor = data.diffuseColor; + + material->parameterBlock = material->createParameterBlock(); + + return material; + } + + void* createMesh(MeshData const& data) override + { + Mesh* mesh = new Mesh(); + mesh->firstIndex = data.firstIndex; + mesh->indexCount = data.indexCount; + mesh->material = (Material*)data.material; + return mesh; + } + + void* createModel(ModelData const& data) override + { + Model* model = new Model(); + model->vertexBuffer = data.vertexBuffer; + model->indexBuffer = data.indexBuffer; + model->primitiveTopology = data.primitiveTopology; + model->vertexCount = data.vertexCount; + model->indexCount = data.indexCount; + + int meshCount = data.meshCount; + for(int ii = 0; ii < meshCount; ++ii) + model->meshes.push_back((Mesh*)data.meshes[ii]); + + return model; + } + }; + Callbacks callbacks; + + // We instantiate a model loader object and then use it to + // try and load a model from the chosen path.
+ // + ModelLoader loader; + loader.renderer = renderer; + loader.loadFlags = loadFlags; + loader.scale = scale; + loader.callbacks = &callbacks; + Model* model = nullptr; + if(SLANG_FAILED(loader.load(inputPath, (void**)&model))) + { + log("failed to load '%s'\n", inputPath); + return nullptr; + } + + return model; +} + +// The core of our application's rendering abstraction is +// the notion of an "effect," which ties together a particular +// set of shader entry points (as a `Program`), with graphics +// API state objects for the fixed-function parts of the pipeline. +// +// Note that the program here is an *unspecialized* program, +// which might have unbound global `type_param`s. Thus the +// `Effect` type here is not one-to-one with a "pipeline state +// object," because the same effect could be used to instantiate +// multiple pipeline state objects based on how things get +// specialized. +// +struct Effect : RefObject +{ + // The shader program entry point(s) to execute + RefPtr program; + + // Additional state corresponding to the data needed + // to create a graphics-API pipeline state object. + RefPtr inputLayout; + Int renderTargetCount; +}; + +// In order to render using the `Effect` abstraction, our +// application will be creating various specialized +// shader kernels and pipeline states on-demand. +// +// We'll start with the representation of a specialized +// "variant" of an effect. +// +struct EffectVariant : RefObject +{ + // The graphics API pipeline layout and state + // that need to be bound in order to use this + // effect. + // + RefPtr pipelineLayout; + RefPtr pipelineState; +}; +// +// A specialized variant is created based on a base effect +// and the types that will be bound to its parameter blocks. 
+// +RefPtr createEffectVaraint( + Effect* effect, + UInt parameterBlockCount, + ParameterBlockLayout* const* parameterBlockLayouts) +{ + // One note to make at the very start is that the creation + // of a specialized variant is based on the *layout* of + // the parameter blocks in use and not on the particular + // parameter blocks themselves. This is important because + // it means that, e.g., two materials that use the same code, + // but different parameter values (different textures, colors, + // etc.) do *not* require switching between different + // shader code or specialized PSOs. + + // We'll start by extracting some of the pieces of + // information that we need into local variables, + // just to simplify the remaining code. + // + auto program = effect->program; + auto shaderModule = program->shaderModule; + auto renderer = shaderModule->renderer; + + // Our specialized effect is going to need a few things: + // + // 1. A specialized pipeline layout, based on the layout + // of the bound parameter blocks. + // + // 2. Specialized shader kernels, based on "plugging in" + // the parameter block types for generic type parameters + // as needed. + // + // 3. A specialized pipeline state object that ties the + // above items together with the fixed-function state + // already specified in the effect. + // + // We will now go through these steps in order. + + // (1) The pipeline layout (aka D3D12 "root signature") will + // be determined based on the descriptor-set layouts + // already cached in the given parameter block layouts.
+ // + std::vector descriptorSets; + for(UInt pp = 0; pp < parameterBlockCount; ++pp) + { + descriptorSets.emplace_back( + parameterBlockLayouts[pp]->descriptorSetLayout); + } + PipelineLayout::Desc pipelineLayoutDesc; + pipelineLayoutDesc.renderTargetCount = 1; + pipelineLayoutDesc.descriptorSetCount = descriptorSets.size(); + pipelineLayoutDesc.descriptorSets = descriptorSets.data(); + auto pipelineLayout = renderer->createPipelineLayout(pipelineLayoutDesc); + + // (2) The final shader kernels to bind will be computed + // from the kernels we extracted into an application `EntryPoint` + // plus the types of the bound parameter blocks, as needed. + // + // We will "infer" a type argument for each of the generic + // parameters of our shader program by looking for a + // parameter block that is declared using that generic + // type. + // + std::vector genericArgs; + for(auto gp : program->genericParams) + { + int parameterBlockIndex = gp.parameterBlockIndex; + auto typeName = parameterBlockLayouts[parameterBlockIndex]->slangTypeLayout->getName(); + genericArgs.push_back(typeName); + } + + // Now that we are ready to generate specialized shader code, + // we will invoke the Slang compiler again. This time we leave + // full code generation turned on, and we also specify the + // entry points that we want explicitly (so that we don't + // generate code for any other entry points).
+ // + auto slangSession = getSlangSession(); + SlangCompileRequest* slangRequest = spCreateCompileRequest(slangSession); + int targetIndex = spAddCodeGenTarget(slangRequest, SLANG_DXBC); + spSetTargetProfile(slangRequest, targetIndex, spFindProfile(slangSession, "sm_4_0")); + int translationUnitIndex = spAddTranslationUnit(slangRequest, SLANG_SOURCE_LANGUAGE_SLANG, nullptr); + spAddTranslationUnitSourceFile(slangRequest, translationUnitIndex, program->shaderModule->inputPath.c_str()); + + int entryPointCont = program->entryPoints.size(); + for(int ii = 0; ii < entryPointCont; ++ii) + { + auto entryPoint = program->entryPoints[ii]; + + // We are using the `spAddEntryPointEx` API so that we + // can specify the type names to use for the generic + // type parameters of the program. + // + spAddEntryPointEx( + slangRequest, + translationUnitIndex, + entryPoint->name.c_str(), + entryPoint->slangStage, + genericArgs.size(), + genericArgs.data()); + } + + // We expect compilation to go through without a hitch, because the + // code was already statically checked back in `loadShaderModule()`. + // It is still possible for errors to arise if, e.g., the application + // tries to specialize code based on a type that doesn't implement + // a required interface. + // + int compileErr = spCompile(slangRequest); + if(auto diagnostics = spGetDiagnosticOutput(slangRequest)) + { + reportError("%s", diagnostics); + } + if(compileErr) + { + spDestroyCompileRequest(slangRequest); + assert(!"unexpected"); + return nullptr; + } + + // Once compilation is done we can extract the kernel code + // for each of the entry points, and set them up for passing + // to the graphics API's loading logic.
+ // + std::vector kernelBlobs; + std::vector kernelDescs; + for(int ii = 0; ii < entryPointCont; ++ii) + { + auto entryPoint = program->entryPoints[ii]; + + ISlangBlob* blob = nullptr; + spGetEntryPointCodeBlob(slangRequest, ii, 0, &blob); + + kernelBlobs.push_back(blob); + + ShaderProgram::KernelDesc kernelDesc; + + char const* codeBegin = (char const*) blob->getBufferPointer(); + char const* codeEnd = codeBegin + blob->getBufferSize(); + + kernelDesc.stage = entryPoint->apiStage; + kernelDesc.codeBegin = codeBegin; + kernelDesc.codeEnd = codeEnd; + + kernelDescs.push_back(kernelDesc); + } + + // Once we've extracted the "blobs" of compiled code, + // we are done with the Slang compilation request. + // + // Note that all of our reflection was performed on the unspecialized + // shader code at load time, but we know that information is still + // applicable to specialized kernels because of the guarantees + // the Slang compiler makes about type layout. + // + spDestroyCompileRequest(slangRequest); + + // We use the graphics API to load a program into the GPU + gfx::ShaderProgram::Desc programDesc; + programDesc.pipelineType = gfx::PipelineType::Graphics; + programDesc.kernels = kernelDescs.data(); + programDesc.kernelCount = kernelDescs.size(); + auto specializedProgram = renderer->createProgram(programDesc); + + // Then we unload our "blobs" of kernel code once the graphics + // API is done with their data. + // + for(auto blob : kernelBlobs) + { + blob->release(); + } + + // (3) We construct a full graphics API pipeline state + // object that combines our new program and pipeline layout + // with the other state objects from the `Effect`.
+ // + gfx::GraphicsPipelineStateDesc pipelineStateDesc = {}; + pipelineStateDesc.program = specializedProgram; + pipelineStateDesc.pipelineLayout = pipelineLayout; + pipelineStateDesc.inputLayout = effect->inputLayout; + pipelineStateDesc.renderTargetCount = effect->renderTargetCount; + auto pipelineState = renderer->createGraphicsPipelineState(pipelineStateDesc); + + RefPtr<EffectVariant> variant = new EffectVariant(); + variant->pipelineLayout = pipelineLayout; + variant->pipelineState = pipelineState; + return variant; +} + +// A more advanced application might add logic to +// pre-populate the shader cache with shader variants +// that were compiled offline. +// +struct ShaderCache : RefObject +{ + struct VariantKey + { + Effect* effect; + UInt parameterBlockCount; + ParameterBlockLayout* parameterBlockLayouts[8]; + + // In order to be used as a hash-table key, our + // variant key representation must support + // equality comparison and a matching hashing function. + + bool operator==(VariantKey const& other) const + { + if(effect != other.effect) return false; + if(parameterBlockCount != other.parameterBlockCount) return false; + for( UInt ii = 0; ii < parameterBlockCount; ++ii ) + { + if(parameterBlockLayouts[ii] != other.parameterBlockLayouts[ii]) return false; + } + return true; + } + + UInt GetHashCode() const + { + auto hash = ::GetHashCode(effect); + hash = combineHash(hash, ::GetHashCode(parameterBlockCount)); + for( UInt ii = 0; ii < parameterBlockCount; ++ii ) + { + hash = combineHash(hash, ::GetHashCode(parameterBlockLayouts[ii])); + } + return hash; + } + }; + + // The shader cache is mostly just a dictionary mapping + // variant keys to the associated variant, generated on-demand.
+ // + // TODO: A more advanced application might support removing + // entries from the shader cache when effects get unloaded, + // or in order to respond to operations like a "hot reload" + // key in a development build (e.g., just clear the + // cache of variants and allow the ordinary loading logic + // to re-populate it). + // + Dictionary<VariantKey, RefPtr<EffectVariant> > variants; + + // Getting a variant is just a matter of looking for an + // existing entry in the dictionary, and creating one + // on demand in case of a miss. + // + RefPtr<EffectVariant> getEffectVariant( + VariantKey const& key) + { + RefPtr<EffectVariant> variant; + if(variants.TryGetValue(key, variant)) + return variant; + + variant = createEffectVaraint( + key.effect, + key.parameterBlockCount, + key.parameterBlockLayouts); + + variants.Add(key, variant); + return variant; + } +}; + + +// In order to render using the `Effect` abstraction, our +// application will use its own rendering context type +// to manage the state that it is binding. This layer +// performs a small amount of shadowing on top of the +// underlying graphics API. +// +// Note: for the purposes of our examples the "graphics API" +// is a cross-platform abstraction over multiple APIs, but +// we do not actually advocate that real applications should +// be built in terms of distinct layers for cross-platform +// GPU API abstraction and "effect" state management. +// +// A high-performance application built on top of this approach +// would instead implement the concepts like `ParameterBlock` +// and `RenderContext` on a per-API basis, making use of +// whatever is most efficient on that API without any +// additional abstraction layers in between. +// +// We've done things differently in this example program in +// order to avoid getting bogged down in the specifics of +// any one GPU API. +// +// With that disclaimer out of the way, let's talk through +// the `RenderContext` type in this application.
+// +struct RenderContext +{ +private: + // The `RenderContext` type is used to wrap the graphics + // API "context" or "command list" type for submission. + // Our current abstraction layer lumps this all together + // with the "device." + // + RefPtr<gfx::Renderer> renderer; + + // We also retain a pointer to the shader cache, which + // will be used to implement lookup of the right + // effect variant to execute based on bound parameter + // blocks. + // + RefPtr<ShaderCache> shaderCache; + + // We will establish a small upper bound on how many + // parameter blocks can be used simultaneously. In + // practice, most shaders won't need more than about + // four parameter blocks, and attempting to use more + // than that under Vulkan can cause portability issues. + // + enum { kMaxParameterBlocks = 8 }; + + // The overall "state" of the rendering context consists of: + // + // * The currently selected "effect" + // * The parameter blocks that are used to specialize and + // provide parameters for that effect. + // + RefPtr<Effect> effect; + RefPtr<ParameterBlock> parameterBlocks[kMaxParameterBlocks]; + + // Along with the retained state above, we also store + // state in exactly the form required for looking up + // an effect variant in our shader cache, to minimize + // the work that needs to be done when looking up state. + // + ShaderCache::VariantKey variantKey; + + // When state gets changed, we track a few dirty flags rather than + // flush changes to the GPU right away. + + // Tracks whether any state has changed in a way that requires computing + // and binding a new GPU pipeline state object (PSO). + // + // E.g., changing the current effect would set this flag, but changing + // a parameter block binding to one with a new layout would also set the flag. + bool pipelineStateDirty = true; + + // The `minDirtyBlockBinding` flag tracks the lowest-numbered parameter + // block binding that needs to be flushed to the GPU.
That is, if + // parameter blocks [0,N) have been bound to the GPU, and then the user + // tries to set block K, then the range [0,K) will be left alone, + // while the range [K,N) needs to be set again. + // + // This is an optimization that can be exploited on the Vulkan API + // (and potentially others) if switching pipeline layouts doesn't invalidate + // all currently-bound descriptor sets. + // + int minDirtyBlockBinding = 0; + + // Finally, we cache the specialized effect variant that has been + // most recently bound to the GPU state, so that we can use the + // information it stores (specifically the pipeline layout) when + // binding descriptor sets. + // + RefPtr<EffectVariant> currentEffectVariant; + +public: + // Initializing a render context just sets its pointer to the GPU API device + RenderContext( + gfx::Renderer* renderer, + ShaderCache* shaderCache) + : renderer(renderer) + , shaderCache(shaderCache) + {} + + void setEffect( + Effect* inEffect) + { + // Bail out if nothing is changing. + if( inEffect == effect ) + return; + + effect = inEffect; + variantKey.effect = effect; + variantKey.parameterBlockCount = effect->program->parameterBlockCount; + + // Binding a new effect invalidates the current state object, since + // it will be a specialization of some other effect. + // + pipelineStateDirty = true; + } + + void setParameterBlock( + int index, + ParameterBlock* parameterBlock) + { + // Bail out if nothing is changing. + if(parameterBlock == parameterBlocks[index]) + return; + + parameterBlocks[index] = parameterBlock; + + // This parameter block needs to be bound to the GPU, and any + // parameter blocks after it in the list will also get re-bound + // (even if they haven't changed). This is a reasonable choice + // if parameter blocks are ordered based on expected frequency + // of update (so that lower-numbered blocks change less often).
+ // + minDirtyBlockBinding = std::min(index, minDirtyBlockBinding); + + // Next, check if the layout for the block we just bound + // is different than the one that was in place before, + // as stored in the "variant key" + // + auto layout = parameterBlock->layout; + if(layout.Ptr() == variantKey.parameterBlockLayouts[index]) + return; + + variantKey.parameterBlockLayouts[index] = layout; + + // Changing the layout of a parameter block (which includes + // the underlying Slang type) requires computing a new + // pipeline state object, because it may lead to differently + // specialized code being generated. + // + pipelineStateDirty = true; + } + + void flushState() + { + // The `flushState()` operation must be used by the application + // any time it binds a different effect or parameter block(s), + // to ensure that the GPU state is fully configured for rendering. + // It is thus important that this function do as little work + // as possible, especially in the common case where state + // doesn't actually need to change. + // + // The first check we do is to see if any change might require + // a different set of shader kernels. + // + if(pipelineStateDirty) + { + pipelineStateDirty = false; + + // Almost all of the logic for retrieving or creating + // a new pipeline state with specialized kernels is + // handled by our shader cache. + // + // In the common case, the desired variant will already + // be present in the cache, and this function returns + // without much effort. + // + auto variant = shaderCache->getEffectVariant(variantKey); + + // In order to adapt to a change in shader variant, + // we simply bind its PSO into the GPU state, and + // remember the variant we've selected. + // + renderer->setPipelineState(PipelineType::Graphics, variant->pipelineState); + currentEffectVariant = variant; + } + + // Even if the current pipeline state was fine, we may need to + // bind one or more descriptor sets. 
We do this by walking + // from our lowest-numbered "dirty" set up to the number + // of sets expected by the current effect and binding them. + // + // If `minDirtyBlockBinding` is greater than or equal to the + // `parameterBlockCount` of the currently bound effect, then + // this will be a no-op. + // + // The common case in a tight drawing loop will be that only + // the last block will be dirty, and we will only execute + // one iteration of this loop. + // + auto program = effect->program; + auto parameterBlockCount = program->parameterBlockCount; + auto pipelineLayout = currentEffectVariant->pipelineLayout; + for(int ii = minDirtyBlockBinding; ii < parameterBlockCount; ++ii) + { + renderer->setDescriptorSet( + PipelineType::Graphics, + pipelineLayout, + ii, + parameterBlocks[ii]->descriptorSet); + } + minDirtyBlockBinding = parameterBlockCount; + } +}; + +// We will again structure our example application as a C++ `struct`, +// so that we can scope its allocations for easy cleanup, rather than +// use global variables. +// +struct ModelViewer { + +Window* gWindow; +RefPtr<Renderer> gRenderer; +RefPtr<ResourceView> gDepthTarget; + +// We keep a pointer to the one effect we are using (for a forward +// rendering pass), plus the parameter-block layouts for our `PerView` +// and `PerModel` shader types. +// +RefPtr<Effect> gEffect; +RefPtr<ParameterBlockLayout> gPerViewParameterBlockLayout; +RefPtr<ParameterBlockLayout> gPerModelParameterBlockLayout; + +RefPtr<ShaderCache> shaderCache; + +// Most of the application state is stored in the list of loaded models. +// +std::vector<RefPtr<Model>> gModels; + +// During startup the application will load one or more models and +// add them to the `gModels` list.
+// +void loadAndAddModel( + char const* inputPath, + ModelLoader::LoadFlags loadFlags = 0, + float scale = 1.0f) +{ + auto model = loadModel(gRenderer, inputPath, loadFlags, scale); + if(!model) return; + gModels.push_back(model); +} + +int gWindowWidth = 1024; +int gWindowHeight = 768; + +// For this more complex example we will be passing multiple +// parameter blocks into the shader code, and each will +// need its own `struct` type to define the layout of the +// uniform data. +// +struct PerView +{ + glm::mat4x4 viewProjection; + + glm::vec3 lightDir; + float pad0; + + glm::vec3 lightColor; + float pad1; +}; +struct PerModel +{ + glm::mat4x4 modelTransform; + glm::mat4x4 inverseTransposeModelTransform; +}; + +// The overall initialization logic is quite similar to +// the earlier example. The biggest difference is that we +// create instances of our application-specific parameter +// block layout and effect types instead of just creating +// raw graphics API objects. +// +Result initialize() +{ + WindowDesc windowDesc; + windowDesc.title = "Model Viewer"; + windowDesc.width = gWindowWidth; + windowDesc.height = gWindowHeight; + gWindow = createWindow(windowDesc); + + gRenderer = createD3D11Renderer(); + Renderer::Desc rendererDesc; + rendererDesc.width = gWindowWidth; + rendererDesc.height = gWindowHeight; + gRenderer->initialize(rendererDesc, getPlatformWindowHandle(gWindow)); + + InputElementDesc inputElements[] = { + {"POSITION", 0, Format::RGB_Float32, offsetof(Model::Vertex, position) }, + {"NORMAL", 0, Format::RGB_Float32, offsetof(Model::Vertex, normal) }, + {"UV", 0, Format::RG_Float32, offsetof(Model::Vertex, uv) }, + }; + auto inputLayout = gRenderer->createInputLayout( + &inputElements[0], + 3); + if(!inputLayout) return SLANG_FAIL; + + // Because we are rendering more than a single triangle this time, we + // require a depth buffer to resolve visibility.
+ // + TextureResource::Desc depthBufferDesc = gRenderer->getSwapChainTextureDesc(); + depthBufferDesc.format = Format::D_Float32; + depthBufferDesc.setDefaults(Resource::Usage::DepthWrite); + auto depthTexture = gRenderer->createTextureResource( + Resource::Usage::DepthWrite, + depthBufferDesc); + if(!depthTexture) return SLANG_FAIL; + + ResourceView::Desc textureViewDesc; + textureViewDesc.type = ResourceView::Type::DepthStencil; + auto depthTarget = gRenderer->createTextureView(depthTexture, textureViewDesc); + if (!depthTarget) return SLANG_FAIL; + + gDepthTarget = depthTarget; + + // Unlike the earlier example, we will not generate final shader kernel + // code during initialization. Instead, we simply load the shader module + // so that we can perform reflection and allocate resources. + // + auto shaderModule = loadShaderModule(gRenderer, "shaders.slang"); + if(!shaderModule) return SLANG_FAIL; + + // Once the shader code has been loaded, we can look up types declared + // in the shader code by name and perform reflection on them to determine + // parameter block layouts, etc. + // + // A more advanced application might load this information on-demand + // and potentially tie into an application-level reflection system + // that already knows the string names of its types (e.g., to connect + // the `PerView` type in shader code to the `PerView` type declared + // in the application code). + // + gPerViewParameterBlockLayout = getParameterBlockLayout( + shaderModule, "PerView"); + gPerModelParameterBlockLayout = getParameterBlockLayout( + shaderModule, "PerModel"); + // + // Note how we are able to load the type definition for `SimpleMaterial` + // from the Slang shader module even though the `SimpleMaterial` type + // is not actually *used* by any entry point in the file. 
+ // + SimpleMaterial::gParameterBlockLayout = getParameterBlockLayout( + shaderModule, "SimpleMaterial"); + + // We also load a shader program based on vertex/fragment shaders in our + // module, and then use this to create an application-level effect. + // + // Note that the `loadProgram` operation here does *not* invoke any + // Slang compilation, because the shader module was already completely + // parsed, checked, etc. by the logic in `loadShaderModule()` above. + // + auto program = loadProgram(shaderModule, "vertexMain", "fragmentMain"); + if(!program) return SLANG_FAIL; + + RefPtr<Effect> effect = new Effect(); + effect->program = program; + effect->inputLayout = inputLayout; + effect->renderTargetCount = 1; + gEffect = effect; + + // In order to create specialized variants of the effect(s) that + // get used for rendering, we will use a shader cache. + // + shaderCache = new ShaderCache(); + + // Once we have created all our graphics API and application resources, + // we can start to load models. For now we are keeping things extremely + // simple by using a trivial `.obj` file that can be checked into source + // control. + // + // Support for loading more interesting/complex models will be added + // to this example over time (although model loading is *not* the focus). + // + loadAndAddModel("cube.obj"); + + showWindow(gWindow); + + return SLANG_OK; +} + +// With the setup work done, we can look at the per-frame rendering +// logic to see how the application will drive the `RenderContext` +// type to perform both shader parameter binding and code specialization. +// +void renderFrame() +{ + // In order to see that things are rendering properly we need some + // kind of animation, so we will compute a crude delta-time value here.
+ // + static uint64_t lastTime = getCurrentTime(); + uint64_t currentTime = getCurrentTime(); + float deltaTime = float(currentTime - lastTime) / float(getTimerFrequency()); + lastTime = currentTime; + + // We will use the GLM library to do the matrix math required + // to set up our various transformation matrices. + // + glm::mat4x4 identity = glm::mat4x4(1.0f); + + glm::mat4x4 projection = glm::perspective( + glm::radians(60.0f), + float(gWindowWidth) / float(gWindowHeight), + 0.1f, + 1000.0f); + + glm::mat4x4 view = identity; + view = translate(view, glm::vec3(0, 0, -5)); + + glm::mat4x4 viewProjection = projection * view; + + // We set up a light source with a simple animation applied + // to its direction. + // + glm::vec3 lightDir = normalize(glm::vec3(10, 10, -10)); + glm::vec3 lightColor = glm::vec3(1, 1, 1); + static float angle = 0.0f; + angle += 0.5f * deltaTime; + glm::mat4x4 lightTransform = identity; + lightTransform = rotate(lightTransform, angle, glm::vec3(0, 1, 0)); + lightDir = glm::vec3(lightTransform * glm::vec4(lightDir, 0)); + + // Some of the basic rendering setup is identical to the previous example. + // + static const float kClearColor[] = { 0.25, 0.25, 0.25, 1.0 }; + gRenderer->setClearColor(kClearColor); + gRenderer->clearFrame(); + gRenderer->setDepthStencilTarget(gDepthTarget); + gRenderer->setPrimitiveTopology(PrimitiveTopology::TriangleList); + + // Now we will start in on the more interesting rendering logic, + // by creating the `RenderContext` we will use for submission. + // + // Note: in a multi-threaded submission case, the application would + // need to use a distinct `RenderContext` on each thread. + // + RenderContext context(gRenderer, shaderCache); + + // Next we set the effect that we will use for our forward rendering + // pass. Note that an example with multiple passes would use a + // distinct effect for each pass. 
+ // + context.setEffect(gEffect); + + // We are only rendering one view, so we can fill in a per-view + // parameter block once and use it across all draw calls. + // This parameter block will be different every frame, so we + // allocate a transient parameter block rather than try to + // carefully track and re-use an allocation. + // + auto viewParameterBlock = allocateTransientParameterBlock( + gPerViewParameterBlockLayout); + if(auto perView = viewParameterBlock->mapAs<PerView>()) + { + perView->viewProjection = viewProjection; + perView->lightDir = lightDir; + perView->lightColor = lightColor; + + viewParameterBlock->unmap(); + } + // + // Note: the assignment of indices to parameter blocks is driven + // by their order of declaration in the shader code, so we know + // that the per-view parameter block has index zero. Alternatively, + // an application could use reflection API operations to look up + // the index of a parameter block based on its name. + // + context.setParameterBlock(0, viewParameterBlock); + + // The majority of our rendering logic is handled as a loop + // over the models in the scene, and their meshes. + // + for(auto& model : gModels) + { + gRenderer->setVertexBuffer(0, model->vertexBuffer, sizeof(Model::Vertex)); + gRenderer->setIndexBuffer(model->indexBuffer, Format::R_UInt32); + + // For each model we provide a parameter + // block that holds the per-model transformation + // parameters, corresponding to the `PerModel` type + // in the shader code. + // + // Like the view parameter block, it makes sense + // to allocate this block as a transient allocation, + // since its contents would be different on the next + // frame anyway.
+ // + glm::mat4x4 modelTransform = identity; + glm::mat4x4 inverseTransposeModelTransform = inverse(transpose(modelTransform)); + + auto modelParameterBlock = allocateTransientParameterBlock( + gPerModelParameterBlockLayout); + if(auto perModel = modelParameterBlock->mapAs<PerModel>()) + { + perModel->modelTransform = modelTransform; + perModel->inverseTransposeModelTransform = inverseTransposeModelTransform; + + modelParameterBlock->unmap(); + } + context.setParameterBlock(1, modelParameterBlock); + + // Now we loop over the meshes in the model. + // + // A more advanced rendering loop would sort things by material + // rather than by model, to avoid overly frequent state changes. + // We are just doing something simple for the purposes of an + // example program. + // + for(auto& mesh : model->meshes) + { + // Each mesh has a material, and each material has its own + // parameter block that was created at load time, so we + // can just re-use the persistent parameter block for the + // chosen material. + // + // Note that binding the material parameter block here is + // both selecting the values to use for various material + // parameters as well as the *code* to use for material + // evaluation (based on the concrete shader type that + // is implementing the `IMaterial` interface). + // + context.setParameterBlock( + 2, + mesh->material->parameterBlock); + + // Once we've set up all the parameter blocks needed + // for a given drawing operation, we need to flush + // any pending state changes (e.g., if the type of + // material changed, a shader switch might be + // required).
+ // + context.flushState(); + + gRenderer->drawIndexed(mesh->indexCount, mesh->firstIndex); + } + } + + gRenderer->presentFrame(); +} + +void finalize() +{ + // Because we've stored a reference to some graphics API objects + // in a class-static variable (effectively a global) we need + // to clear those out before tearing down the application so + // that we aren't relying on C++ global destructors to tear + // down our application cleanly. + // + SimpleMaterial::gParameterBlockLayout = nullptr; +} + +}; + +void innerMain(ApplicationContext* context) +{ + ModelViewer app; + if(SLANG_FAILED(app.initialize())) + { + exitApplication(context, 1); + } + + while(dispatchEvents(context)) + { + app.renderFrame(); + } + + app.finalize(); +} +GFX_UI_MAIN(innerMain) diff --git a/examples/model-viewer/model-viewer.vcxproj b/examples/model-viewer/model-viewer.vcxproj new file mode 100644 index 000000000..ea7ee1521 --- /dev/null +++ b/examples/model-viewer/model-viewer.vcxproj @@ -0,0 +1,184 @@ + + + + + Debug + Win32 + + + Debug + x64 + + + Release + Win32 + + + Release + x64 + + + + {639B13F2-CF07-CFEC-98FB-664A0427F154} + true + Win32Proj + model-viewer + + + + Application + true + Unicode + v140 + + + Application + true + Unicode + v140 + + + Application + false + Unicode + v140 + + + Application + false + Unicode + v140 + + + + + + + + + + + + + + + + + + + true + ..\..\bin\windows-x86\debug\ + ..\..\intermediate\windows-x86\debug\model-viewer\ + model-viewer + .exe + + + true + ..\..\bin\windows-x64\debug\ + ..\..\intermediate\windows-x64\debug\model-viewer\ + model-viewer + .exe + + + false + ..\..\bin\windows-x86\release\ + ..\..\intermediate\windows-x86\release\model-viewer\ + model-viewer + .exe + + + false + ..\..\bin\windows-x64\release\ + ..\..\intermediate\windows-x64\release\model-viewer\ + model-viewer + .exe + + + + NotUsing + Level3 + _DEBUG;%(PreprocessorDefinitions) + ..\..;..\..\tools;%(AdditionalIncludeDirectories) + EditAndContinue + Disabled + 
MultiThreadedDebug + + + Windows + true + + + + + NotUsing + Level3 + _DEBUG;%(PreprocessorDefinitions) + ..\..;..\..\tools;%(AdditionalIncludeDirectories) + EditAndContinue + Disabled + MultiThreadedDebug + + + Windows + true + + + + + NotUsing + Level3 + NDEBUG;%(PreprocessorDefinitions) + ..\..;..\..\tools;%(AdditionalIncludeDirectories) + Full + true + true + false + true + MultiThreaded + + + Windows + true + true + + + + + NotUsing + Level3 + NDEBUG;%(PreprocessorDefinitions) + ..\..;..\..\tools;%(AdditionalIncludeDirectories) + Full + true + true + false + true + MultiThreaded + + + Windows + true + true + + + + + + + + + + + {DB00DA62-0533-4AFD-B59F-A67D5B3A0808} + + + {F9BE7957-8399-899E-0C49-E714FDDD4B65} + + + {222F7498-B40C-4F3F-A704-DDEB91A4484A} + + + + + + \ No newline at end of file diff --git a/examples/model-viewer/model-viewer.vcxproj.filters b/examples/model-viewer/model-viewer.vcxproj.filters new file mode 100644 index 000000000..a02cb79fc --- /dev/null +++ b/examples/model-viewer/model-viewer.vcxproj.filters @@ -0,0 +1,18 @@ + + + + + {E9C7FDCE-D52A-8D73-7EB0-C5296AF258F6} + + + + + Source Files + + + + + Source Files + + + \ No newline at end of file diff --git a/examples/model-viewer/shaders.slang b/examples/model-viewer/shaders.slang new file mode 100644 index 000000000..b79636d15 --- /dev/null +++ b/examples/model-viewer/shaders.slang @@ -0,0 +1,178 @@ +// shaders.slang + +// +// This example builds on the simplistic shaders presented in the +// "Hello, World" example by adding support for (intentionally +// simplistic) surface materil and light shading. +// +// The code here is not meant to exemplify state-of-the-art material +// and lighting techniques, but rather to show how a shader +// library can be developed in a modular fashion without reliance +// on the C preprocessor manual parameter-binding decorations. +// + +// We will start with a `struct` for per-view parameters that +// will be allocated into a `ParameterBlock`. 
+// +// As written, this isn't very different from using an HLSL +// `cbuffer` declaration, but importantly this code will +// continue to work if we add one or more resources (e.g., +// an environment map texture) to the `PerView` type. +// +struct PerView +{ + float4x4 viewProjection; + + float3 lightDir; + float3 lightColor; +}; +ParameterBlock<PerView> gViewParams; + +// Declaring a block for per-model parameter data is +// similarly simple. +// +struct PerModel +{ + float4x4 modelTransform; + float4x4 inverseTransposeModelTransform; +}; +ParameterBlock<PerModel> gModelParams; + + +// Next, we are going to demonstrate a simplistic interface +// for surface materials. As written, materials can only +// determine how to compute the diffuse color component +// of a surface; a more advanced example would fold +// the entire BRDF into the material interface. +// +interface IMaterial +{ + float3 getDiffuseColor(); +}; + +// In order for our shader to be able to take a material +// as a parameter, we need to declare a `ParameterBlock` +// for some material type `M`. Rather than hard-code the +// specific material type to use, or select one via the +// preprocessor, we will use Slang's support for generics, +// by defining a "global type parameter": +// +type_param TMaterial : IMaterial; +// +// This declaration declares a shader parameter `TMaterial` +// that is a to-be-determined *type*. The `TMaterial` +// type parameter is *constrained* to only support types +// that implement our `IMaterial` interface.
+// +// With the `TMaterial` parameter declared, we can +// declare that our shader takes as input a parameter block +// containing material data: +// +ParameterBlock<TMaterial> gMaterial; + +// For now, we will define only a single implementation +// of the `IMaterial` interface, which is a simple material +// with a uniform diffuse color: +// +struct SimpleMaterial : IMaterial +{ + float3 diffuseColor; + + float3 getDiffuseColor() + { + return diffuseColor; + } +}; +// +// Note that no other code in this file statically +// references the `SimpleMaterial` type, and instead +// it is up to the application to "plug in" this type, +// or another `IMaterial` implementation for the +// `TMaterial` parameter. +// + +// Our vertex shader entry point is only marginally more +// complicated than the Hello World example. We will +// start by declaring the various "connector" `struct`s. +// +struct AssembledVertex +{ + float3 position : POSITION; + float3 normal : NORMAL; + float2 uv : UV; +}; +struct CoarseVertex +{ + float3 worldPosition; + float3 worldNormal; + float2 uv; +}; +struct VertexStageOutput +{ + CoarseVertex coarseVertex : CoarseVertex; + float4 sv_position : SV_Position; +}; + +// Perhaps the most interesting new feature of the entry +// point declarations is that we use a `[shader(...)]` +// attribute (as introduced in HLSL Shader Model 6.x) +// in order to tag our entry points. +// +// This attribute informs the Slang compiler which +// functions are intended to be compiled as shader +// entry points (and what stage they target), so that +// the programmer no longer needs to specify the +// entry point name/stage through the API (or on +// the command line when using `slangc`). +// +// While HLSL added this feature only in newer versions, +// the Slang compiler supports this attribute across +// *all* targets, so that it is okay to use whether you +// want DXBC, DXIL, or SPIR-V output.
+// +[shader("vertex")] +VertexStageOutput vertexMain( + AssembledVertex assembledVertex) +{ + VertexStageOutput output; + + float3 position = assembledVertex.position; + float3 normal = assembledVertex.normal; + float2 uv = assembledVertex.uv; + + float3 worldPosition = mul(gModelParams.modelTransform, float4(position, 1.0)).xyz; + float3 worldNormal = mul(gModelParams.inverseTransposeModelTransform, float4(normal, 0.0)).xyz; + + output.coarseVertex.worldPosition = worldPosition; + output.coarseVertex.worldNormal = worldNormal; + output.coarseVertex.uv = uv; + + output.sv_position = mul(gViewParams.viewProjection, float4(worldPosition, 1.0)); + + return output; +} + +// Our fragment shader is almost trivial, with the most interesting +// thing being how it uses the `TMaterial` type parameter (through the +// value stored in the `gMaterial` parameter block) to dispatch to +// the correct implementation of the `getDiffuseColor()` method +// in the `IMaterial` interface. +// +// The `gMaterial` parameter block declaration thus serves not only +// to group certain shader parameters for efficient CPU-to-GPU +// communication, but also to select the code that will execute +// in specialized versions of the `fragmentMain` entry point. 
+// +[shader("fragment")] +float4 fragmentMain( + CoarseVertex coarseVertex : CoarseVertex) : SV_Target +{ + float3 N = normalize(coarseVertex.worldNormal); + float3 L = normalize(gViewParams.lightDir); + + float4 color; + color.xyz = gMaterial.getDiffuseColor() * max(0, dot(N, L)); + color.w = 1.0f; + + return color; +} diff --git a/external/glm b/external/glm new file mode 160000 index 000000000..0d973b40a --- /dev/null +++ b/external/glm @@ -0,0 +1 @@ +Subproject commit 0d973b40a49e550b1ea7df22a8573bc5fff84f24 diff --git a/external/stb/stb_image_resize.h b/external/stb/stb_image_resize.h new file mode 100644 index 000000000..031ca99dc --- /dev/null +++ b/external/stb/stb_image_resize.h @@ -0,0 +1,2627 @@ +/* stb_image_resize - v0.95 - public domain image resizing + by Jorge L Rodriguez (@VinoBS) - 2014 + http://github.com/nothings/stb + + Written with emphasis on usability, portability, and efficiency. (No + SIMD or threads, so it be easily outperformed by libs that use those.) + Only scaling and translation is supported, no rotations or shears. + Easy API downsamples w/Mitchell filter, upsamples w/cubic interpolation. + + COMPILING & LINKING + In one C/C++ file that #includes this file, do this: + #define STB_IMAGE_RESIZE_IMPLEMENTATION + before the #include. That will create the implementation in that file. + + QUICKSTART + stbir_resize_uint8( input_pixels , in_w , in_h , 0, + output_pixels, out_w, out_h, 0, num_channels) + stbir_resize_float(...) + stbir_resize_uint8_srgb( input_pixels , in_w , in_h , 0, + output_pixels, out_w, out_h, 0, + num_channels , alpha_chan , 0) + stbir_resize_uint8_srgb_edgemode( + input_pixels , in_w , in_h , 0, + output_pixels, out_w, out_h, 0, + num_channels , alpha_chan , 0, STBIR_EDGE_CLAMP) + // WRAP/REFLECT/ZERO + + FULL API + See the "header file" section of the source for API documentation. + + ADDITIONAL DOCUMENTATION + + SRGB & FLOATING POINT REPRESENTATION + The sRGB functions presume IEEE floating point. 
If you do not have + IEEE floating point, define STBIR_NON_IEEE_FLOAT. This will use + a slower implementation. + + MEMORY ALLOCATION + The resize functions here perform a single memory allocation using + malloc. To control the memory allocation, before the #include that + triggers the implementation, do: + + #define STBIR_MALLOC(size,context) ... + #define STBIR_FREE(ptr,context) ... + + Each resize function makes exactly one call to malloc/free, so to use + temp memory, store the temp memory in the context and return that. + + ASSERT + Define STBIR_ASSERT(boolval) to override assert() and not use assert.h + + OPTIMIZATION + Define STBIR_SATURATE_INT to compute clamp values in-range using + integer operations instead of float operations. This may be faster + on some platforms. + + DEFAULT FILTERS + For functions which don't provide explicit control over what filters + to use, you can change the compile-time defaults with + + #define STBIR_DEFAULT_FILTER_UPSAMPLE STBIR_FILTER_something + #define STBIR_DEFAULT_FILTER_DOWNSAMPLE STBIR_FILTER_something + + See stbir_filter in the header-file section for the list of filters. + + NEW FILTERS + A number of 1D filter kernels are used. For a list of + supported filters see the stbir_filter enum. To add a new filter, + write a filter function and add it to stbir__filter_info_table. + + PROGRESS + For interactive use with slow resize operations, you can install + a progress-report callback: + + #define STBIR_PROGRESS_REPORT(val) some_func(val) + + The parameter val is a float which goes from 0 to 1 as progress is made. 
+ + For example: + + static void my_progress_report(float progress); + #define STBIR_PROGRESS_REPORT(val) my_progress_report(val) + + #define STB_IMAGE_RESIZE_IMPLEMENTATION + #include "stb_image_resize.h" + + static void my_progress_report(float progress) + { + printf("Progress: %f%%\n", progress*100); + } + + MAX CHANNELS + If your image has more than 64 channels, define STBIR_MAX_CHANNELS + to the max you'll have. + + ALPHA CHANNEL + Most of the resizing functions provide the ability to control how + the alpha channel of an image is processed. The important things + to know about this: + + 1. The best mathematically-behaved version of alpha to use is + called "premultiplied alpha", in which the other color channels + have had the alpha value multiplied in. If you use premultiplied + alpha, linear filtering (such as image resampling done by this + library, or performed in texture units on GPUs) does the "right + thing". While premultiplied alpha is standard in the movie CGI + industry, it is still uncommon in the videogame/real-time world. + + If you linearly filter non-premultiplied alpha, strange effects + occur. (For example, the 50/50 average of 99% transparent bright green + and 1% transparent black produces 50% transparent dark green when + non-premultiplied, whereas premultiplied it produces 50% + transparent near-black. The former introduces green energy + that doesn't exist in the source image.) + + 2. Artists should not edit premultiplied-alpha images; artists + want non-premultiplied alpha images. Thus, art tools generally output + non-premultiplied alpha images. + + 3. You will get best results in most cases by converting images + to premultiplied alpha before processing them mathematically. + + 4. If you pass the flag STBIR_FLAG_ALPHA_PREMULTIPLIED, the + resizer does not do anything special for the alpha channel; + it is resampled identically to other channels. 
This produces + the correct results for premultiplied-alpha images, but produces + less-than-ideal results for non-premultiplied-alpha images. + + 5. If you do not pass the flag STBIR_FLAG_ALPHA_PREMULTIPLIED, + then the resizer weights the contribution of input pixels + based on their alpha values, or, equivalently, it multiplies + the alpha value into the color channels, resamples, then divides + by the resultant alpha value. Input pixels which have alpha=0 do + not contribute at all to output pixels unless _all_ of the input + pixels affecting that output pixel have alpha=0, in which case + the result for that pixel is the same as it would be without + STBIR_FLAG_ALPHA_PREMULTIPLIED. However, this is only true for + input images in integer formats. For input images in float format, + input pixels with alpha=0 have no effect, and output pixels + which have alpha=0 will be 0 in all channels. (For float images, + you can manually achieve the same result by adding a tiny epsilon + value to the alpha channel of every image, and then subtracting + or clamping it at the end.) + + 6. You can suppress the behavior described in #5 and make + all-0-alpha pixels have 0 in all channels by #defining + STBIR_NO_ALPHA_EPSILON. + + 7. You can separately control whether the alpha channel is + interpreted as linear or affected by the colorspace. By default + it is linear; you almost never want to apply the colorspace. + (For example, graphics hardware does not apply sRGB conversion + to the alpha channel.) 
+ + CONTRIBUTORS + Jorge L Rodriguez: Implementation + Sean Barrett: API design, optimizations + Aras Pranckevicius: bugfix + Nathan Reed: warning fixes + + REVISIONS + 0.95 (2017-07-23) fixed warnings + 0.94 (2017-03-18) fixed warnings + 0.93 (2017-03-03) fixed bug with certain combinations of heights + 0.92 (2017-01-02) fix integer overflow on large (>2GB) images + 0.91 (2016-04-02) fix warnings; fix handling of subpixel regions + 0.90 (2014-09-17) first released version + + LICENSE + See end of file for license information. + + TODO + Don't decode all of the image data when only processing a partial tile + Don't use full-width decode buffers when only processing a partial tile + When processing wide images, break processing into tiles so data fits in L1 cache + Installable filters? + Resize that respects alpha test coverage + (Reference code: FloatImage::alphaTestCoverage and FloatImage::scaleAlphaToCoverage: + https://code.google.com/p/nvidia-texture-tools/source/browse/trunk/src/nvimage/FloatImage.cpp ) +*/ + +#ifndef STBIR_INCLUDE_STB_IMAGE_RESIZE_H +#define STBIR_INCLUDE_STB_IMAGE_RESIZE_H + +#ifdef _MSC_VER +typedef unsigned char stbir_uint8; +typedef unsigned short stbir_uint16; +typedef unsigned int stbir_uint32; +#else +#include <stdint.h> +typedef uint8_t stbir_uint8; +typedef uint16_t stbir_uint16; +typedef uint32_t stbir_uint32; +#endif + +#ifdef STB_IMAGE_RESIZE_STATIC +#define STBIRDEF static +#else +#ifdef __cplusplus +#define STBIRDEF extern "C" +#else +#define STBIRDEF extern +#endif +#endif + + +////////////////////////////////////////////////////////////////////////////// +// +// Easy-to-use API: +// +// * "input pixels" points to an array of image data with 'num_channels' channels (e.g. RGB=3, RGBA=4) +// * input_w is input image width (x-axis), input_h is input image height (y-axis) +// * stride is the offset between successive rows of image data in memory, in bytes.
you can +// specify 0 to mean packed continuously in memory +// * alpha channel is treated identically to other channels. +// * colorspace is linear or sRGB as specified by function name +// * returned result is 1 for success or 0 in case of an error. +// #define STBIR_ASSERT() to trigger an assert on parameter validation errors. +// * Memory required grows approximately linearly with input and output size, but with +// discontinuities at input_w == output_w and input_h == output_h. +// * These functions use a "default" resampling filter defined at compile time. To change the filter, +// you can change the compile-time defaults by #defining STBIR_DEFAULT_FILTER_UPSAMPLE +// and STBIR_DEFAULT_FILTER_DOWNSAMPLE, or you can use the medium-complexity API. + +STBIRDEF int stbir_resize_uint8( const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels); + +STBIRDEF int stbir_resize_float( const float *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + float *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels); + + +// The following functions interpret image data as gamma-corrected sRGB. +// Specify STBIR_ALPHA_CHANNEL_NONE if you have no alpha channel, +// or otherwise provide the index of the alpha channel. Flags value +// of 0 will probably do the right thing if you're not sure what +// the flags mean. + +#define STBIR_ALPHA_CHANNEL_NONE -1 + +// Set this flag if your texture has premultiplied alpha. Otherwise, stbir will +// use alpha-weighted resampling (effectively premultiplying, resampling, +// then unpremultiplying). +#define STBIR_FLAG_ALPHA_PREMULTIPLIED (1 << 0) +// The specified alpha channel should be handled as gamma-corrected value even +// when doing sRGB operations. 
+#define STBIR_FLAG_ALPHA_USES_COLORSPACE (1 << 1) + +STBIRDEF int stbir_resize_uint8_srgb(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags); + + +typedef enum +{ + STBIR_EDGE_CLAMP = 1, + STBIR_EDGE_REFLECT = 2, + STBIR_EDGE_WRAP = 3, + STBIR_EDGE_ZERO = 4, +} stbir_edge; + +// This function adds the ability to specify how requests to sample off the edge of the image are handled. +STBIRDEF int stbir_resize_uint8_srgb_edgemode(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_wrap_mode); + +////////////////////////////////////////////////////////////////////////////// +// +// Medium-complexity API +// +// This extends the easy-to-use API as follows: +// +// * Alpha-channel can be processed separately +// * If alpha_channel is not STBIR_ALPHA_CHANNEL_NONE +// * Alpha channel will not be gamma corrected (unless flags&STBIR_FLAG_GAMMA_CORRECT) +// * Filters will be weighted by alpha channel (unless flags&STBIR_FLAG_ALPHA_PREMULTIPLIED) +// * Filter can be selected explicitly +// * uint16 image type +// * sRGB colorspace available for all types +// * context parameter for passing to STBIR_MALLOC + +typedef enum +{ + STBIR_FILTER_DEFAULT = 0, // use same filter type that easy-to-use API chooses + STBIR_FILTER_BOX = 1, // A trapezoid w/1-pixel wide ramps, same result as box for integer scale ratios + STBIR_FILTER_TRIANGLE = 2, // On upsampling, produces same results as bilinear texture filtering + STBIR_FILTER_CUBICBSPLINE = 3, // The cubic b-spline (aka Mitchell-Netrevalli with B=1,C=0), gaussian-esque + STBIR_FILTER_CATMULLROM = 4, // An interpolating cubic spline + STBIR_FILTER_MITCHELL = 5, // 
Mitchell-Netrevalli filter with B=1/3, C=1/3 +} stbir_filter; + +typedef enum +{ + STBIR_COLORSPACE_LINEAR, + STBIR_COLORSPACE_SRGB, + + STBIR_MAX_COLORSPACES, +} stbir_colorspace; + +// The following functions are all identical except for the type of the image data + +STBIRDEF int stbir_resize_uint8_generic( const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, + void *alloc_context); + +STBIRDEF int stbir_resize_uint16_generic(const stbir_uint16 *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + stbir_uint16 *output_pixels , int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, + void *alloc_context); + +STBIRDEF int stbir_resize_float_generic( const float *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + float *output_pixels , int output_w, int output_h, int output_stride_in_bytes, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, + void *alloc_context); + + + +////////////////////////////////////////////////////////////////////////////// +// +// Full-complexity API +// +// This extends the medium API as follows: +// +// * uint32 image type +// * not typesafe +// * separate filter types for each axis +// * separate edge modes for each axis +// * can specify scale explicitly for subpixel correctness +// * can specify image source tile using texture coordinates + +typedef enum +{ + STBIR_TYPE_UINT8 , + STBIR_TYPE_UINT16, + STBIR_TYPE_UINT32, + STBIR_TYPE_FLOAT , + + STBIR_MAX_TYPES +} stbir_datatype; + +STBIRDEF int stbir_resize( const void *input_pixels , int input_w , int 
input_h , int input_stride_in_bytes, + void *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + stbir_datatype datatype, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, + stbir_filter filter_horizontal, stbir_filter filter_vertical, + stbir_colorspace space, void *alloc_context); + +STBIRDEF int stbir_resize_subpixel(const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + void *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + stbir_datatype datatype, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, + stbir_filter filter_horizontal, stbir_filter filter_vertical, + stbir_colorspace space, void *alloc_context, + float x_scale, float y_scale, + float x_offset, float y_offset); + +STBIRDEF int stbir_resize_region( const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes, + void *output_pixels, int output_w, int output_h, int output_stride_in_bytes, + stbir_datatype datatype, + int num_channels, int alpha_channel, int flags, + stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, + stbir_filter filter_horizontal, stbir_filter filter_vertical, + stbir_colorspace space, void *alloc_context, + float s0, float t0, float s1, float t1); +// (s0, t0) & (s1, t1) are the top-left and bottom right corner (uv addressing style: [0, 1]x[0, 1]) of a region of the input image to use. 
+ +// +// +//// end header file ///////////////////////////////////////////////////// +#endif // STBIR_INCLUDE_STB_IMAGE_RESIZE_H + + + + + +#ifdef STB_IMAGE_RESIZE_IMPLEMENTATION + +#ifndef STBIR_ASSERT +#include <assert.h> +#define STBIR_ASSERT(x) assert(x) +#endif + +// For memset +#include <string.h> + +#include <math.h> + +#ifndef STBIR_MALLOC +#include <stdlib.h> +// use comma operator to evaluate c, to avoid "unused parameter" warnings +#define STBIR_MALLOC(size,c) ((void)(c), malloc(size)) +#define STBIR_FREE(ptr,c) ((void)(c), free(ptr)) +#endif + +#ifndef _MSC_VER +#ifdef __cplusplus +#define stbir__inline inline +#else +#define stbir__inline +#endif +#else +#define stbir__inline __forceinline +#endif + + +// should produce compiler error if size is wrong +typedef unsigned char stbir__validate_uint32[sizeof(stbir_uint32) == 4 ? 1 : -1]; + +#ifdef _MSC_VER +#define STBIR__NOTUSED(v) (void)(v) +#else +#define STBIR__NOTUSED(v) (void)sizeof(v) +#endif + +#define STBIR__ARRAY_SIZE(a) (sizeof((a))/sizeof((a)[0])) + +#ifndef STBIR_DEFAULT_FILTER_UPSAMPLE +#define STBIR_DEFAULT_FILTER_UPSAMPLE STBIR_FILTER_CATMULLROM +#endif + +#ifndef STBIR_DEFAULT_FILTER_DOWNSAMPLE +#define STBIR_DEFAULT_FILTER_DOWNSAMPLE STBIR_FILTER_MITCHELL +#endif + +#ifndef STBIR_PROGRESS_REPORT +#define STBIR_PROGRESS_REPORT(float_0_to_1) +#endif + +#ifndef STBIR_MAX_CHANNELS +#define STBIR_MAX_CHANNELS 64 +#endif + +#if STBIR_MAX_CHANNELS > 65536 +#error "Too many channels; STBIR_MAX_CHANNELS must be no more than 65536." +// because we store the indices in 16-bit variables +#endif + +// This value is added to alpha just before premultiplication to avoid +// zeroing out color values. It is equivalent to 2^-80. If you don't want +// that behavior (it may interfere if you have floating point images with +// very small alpha values) then you can define STBIR_NO_ALPHA_EPSILON to +// disable it.
+#ifndef STBIR_ALPHA_EPSILON +#define STBIR_ALPHA_EPSILON ((float)1 / (1 << 20) / (1 << 20) / (1 << 20) / (1 << 20)) +#endif + + + +#ifdef _MSC_VER +#define STBIR__UNUSED_PARAM(v) (void)(v) +#else +#define STBIR__UNUSED_PARAM(v) (void)sizeof(v) +#endif + +// must match stbir_datatype +static unsigned char stbir__type_size[] = { + 1, // STBIR_TYPE_UINT8 + 2, // STBIR_TYPE_UINT16 + 4, // STBIR_TYPE_UINT32 + 4, // STBIR_TYPE_FLOAT +}; + +// Kernel function centered at 0 +typedef float (stbir__kernel_fn)(float x, float scale); +typedef float (stbir__support_fn)(float scale); + +typedef struct +{ + stbir__kernel_fn* kernel; + stbir__support_fn* support; +} stbir__filter_info; + +// When upsampling, the contributors are which source pixels contribute. +// When downsampling, the contributors are which destination pixels are contributed to. +typedef struct +{ + int n0; // First contributing pixel + int n1; // Last contributing pixel +} stbir__contributors; + +typedef struct +{ + const void* input_data; + int input_w; + int input_h; + int input_stride_bytes; + + void* output_data; + int output_w; + int output_h; + int output_stride_bytes; + + float s0, t0, s1, t1; + + float horizontal_shift; // Units: output pixels + float vertical_shift; // Units: output pixels + float horizontal_scale; + float vertical_scale; + + int channels; + int alpha_channel; + stbir_uint32 flags; + stbir_datatype type; + stbir_filter horizontal_filter; + stbir_filter vertical_filter; + stbir_edge edge_horizontal; + stbir_edge edge_vertical; + stbir_colorspace colorspace; + + stbir__contributors* horizontal_contributors; + float* horizontal_coefficients; + + stbir__contributors* vertical_contributors; + float* vertical_coefficients; + + int decode_buffer_pixels; + float* decode_buffer; + + float* horizontal_buffer; + + // cache these because ceil/floor are inexplicably showing up in profile + int horizontal_coefficient_width; + int vertical_coefficient_width; + int horizontal_filter_pixel_width; + 
int vertical_filter_pixel_width; + int horizontal_filter_pixel_margin; + int vertical_filter_pixel_margin; + int horizontal_num_contributors; + int vertical_num_contributors; + + int ring_buffer_length_bytes; // The length of an individual entry in the ring buffer. The total number of ring buffers is stbir__get_filter_pixel_width(filter) + int ring_buffer_num_entries; // Total number of entries in the ring buffer. + int ring_buffer_first_scanline; + int ring_buffer_last_scanline; + int ring_buffer_begin_index; // first_scanline is at this index in the ring buffer + float* ring_buffer; + + float* encode_buffer; // A temporary buffer to store floats so we don't lose precision while we do multiply-adds. + + int horizontal_contributors_size; + int horizontal_coefficients_size; + int vertical_contributors_size; + int vertical_coefficients_size; + int decode_buffer_size; + int horizontal_buffer_size; + int ring_buffer_size; + int encode_buffer_size; +} stbir__info; + + +static const float stbir__max_uint8_as_float = 255.0f; +static const float stbir__max_uint16_as_float = 65535.0f; +static const double stbir__max_uint32_as_float = 4294967295.0; + + +static stbir__inline int stbir__min(int a, int b) +{ + return a < b ? 
a : b; +} + +static stbir__inline float stbir__saturate(float x) +{ + if (x < 0) + return 0; + + if (x > 1) + return 1; + + return x; +} + +#ifdef STBIR_SATURATE_INT +static stbir__inline stbir_uint8 stbir__saturate8(int x) +{ + if ((unsigned int) x <= 255) + return x; + + if (x < 0) + return 0; + + return 255; +} + +static stbir__inline stbir_uint16 stbir__saturate16(int x) +{ + if ((unsigned int) x <= 65535) + return x; + + if (x < 0) + return 0; + + return 65535; +} +#endif + +static float stbir__srgb_uchar_to_linear_float[256] = { + 0.000000f, 0.000304f, 0.000607f, 0.000911f, 0.001214f, 0.001518f, 0.001821f, 0.002125f, 0.002428f, 0.002732f, 0.003035f, + 0.003347f, 0.003677f, 0.004025f, 0.004391f, 0.004777f, 0.005182f, 0.005605f, 0.006049f, 0.006512f, 0.006995f, 0.007499f, + 0.008023f, 0.008568f, 0.009134f, 0.009721f, 0.010330f, 0.010960f, 0.011612f, 0.012286f, 0.012983f, 0.013702f, 0.014444f, + 0.015209f, 0.015996f, 0.016807f, 0.017642f, 0.018500f, 0.019382f, 0.020289f, 0.021219f, 0.022174f, 0.023153f, 0.024158f, + 0.025187f, 0.026241f, 0.027321f, 0.028426f, 0.029557f, 0.030713f, 0.031896f, 0.033105f, 0.034340f, 0.035601f, 0.036889f, + 0.038204f, 0.039546f, 0.040915f, 0.042311f, 0.043735f, 0.045186f, 0.046665f, 0.048172f, 0.049707f, 0.051269f, 0.052861f, + 0.054480f, 0.056128f, 0.057805f, 0.059511f, 0.061246f, 0.063010f, 0.064803f, 0.066626f, 0.068478f, 0.070360f, 0.072272f, + 0.074214f, 0.076185f, 0.078187f, 0.080220f, 0.082283f, 0.084376f, 0.086500f, 0.088656f, 0.090842f, 0.093059f, 0.095307f, + 0.097587f, 0.099899f, 0.102242f, 0.104616f, 0.107023f, 0.109462f, 0.111932f, 0.114435f, 0.116971f, 0.119538f, 0.122139f, + 0.124772f, 0.127438f, 0.130136f, 0.132868f, 0.135633f, 0.138432f, 0.141263f, 0.144128f, 0.147027f, 0.149960f, 0.152926f, + 0.155926f, 0.158961f, 0.162029f, 0.165132f, 0.168269f, 0.171441f, 0.174647f, 0.177888f, 0.181164f, 0.184475f, 0.187821f, + 0.191202f, 0.194618f, 0.198069f, 0.201556f, 0.205079f, 0.208637f, 0.212231f, 0.215861f, 0.219526f, 
0.223228f, 0.226966f, + 0.230740f, 0.234551f, 0.238398f, 0.242281f, 0.246201f, 0.250158f, 0.254152f, 0.258183f, 0.262251f, 0.266356f, 0.270498f, + 0.274677f, 0.278894f, 0.283149f, 0.287441f, 0.291771f, 0.296138f, 0.300544f, 0.304987f, 0.309469f, 0.313989f, 0.318547f, + 0.323143f, 0.327778f, 0.332452f, 0.337164f, 0.341914f, 0.346704f, 0.351533f, 0.356400f, 0.361307f, 0.366253f, 0.371238f, + 0.376262f, 0.381326f, 0.386430f, 0.391573f, 0.396755f, 0.401978f, 0.407240f, 0.412543f, 0.417885f, 0.423268f, 0.428691f, + 0.434154f, 0.439657f, 0.445201f, 0.450786f, 0.456411f, 0.462077f, 0.467784f, 0.473532f, 0.479320f, 0.485150f, 0.491021f, + 0.496933f, 0.502887f, 0.508881f, 0.514918f, 0.520996f, 0.527115f, 0.533276f, 0.539480f, 0.545725f, 0.552011f, 0.558340f, + 0.564712f, 0.571125f, 0.577581f, 0.584078f, 0.590619f, 0.597202f, 0.603827f, 0.610496f, 0.617207f, 0.623960f, 0.630757f, + 0.637597f, 0.644480f, 0.651406f, 0.658375f, 0.665387f, 0.672443f, 0.679543f, 0.686685f, 0.693872f, 0.701102f, 0.708376f, + 0.715694f, 0.723055f, 0.730461f, 0.737911f, 0.745404f, 0.752942f, 0.760525f, 0.768151f, 0.775822f, 0.783538f, 0.791298f, + 0.799103f, 0.806952f, 0.814847f, 0.822786f, 0.830770f, 0.838799f, 0.846873f, 0.854993f, 0.863157f, 0.871367f, 0.879622f, + 0.887923f, 0.896269f, 0.904661f, 0.913099f, 0.921582f, 0.930111f, 0.938686f, 0.947307f, 0.955974f, 0.964686f, 0.973445f, + 0.982251f, 0.991102f, 1.0f +}; + +static float stbir__srgb_to_linear(float f) +{ + if (f <= 0.04045f) + return f / 12.92f; + else + return (float)pow((f + 0.055f) / 1.055f, 2.4f); +} + +static float stbir__linear_to_srgb(float f) +{ + if (f <= 0.0031308f) + return f * 12.92f; + else + return 1.055f * (float)pow(f, 1 / 2.4f) - 0.055f; +} + +#ifndef STBIR_NON_IEEE_FLOAT +// From https://gist.github.com/rygorous/2203834 + +typedef union +{ + stbir_uint32 u; + float f; +} stbir__FP32; + +static const stbir_uint32 fp32_to_srgb8_tab4[104] = { + 0x0073000d, 0x007a000d, 0x0080000d, 0x0087000d, 0x008d000d, 0x0094000d, 
0x009a000d, 0x00a1000d, + 0x00a7001a, 0x00b4001a, 0x00c1001a, 0x00ce001a, 0x00da001a, 0x00e7001a, 0x00f4001a, 0x0101001a, + 0x010e0033, 0x01280033, 0x01410033, 0x015b0033, 0x01750033, 0x018f0033, 0x01a80033, 0x01c20033, + 0x01dc0067, 0x020f0067, 0x02430067, 0x02760067, 0x02aa0067, 0x02dd0067, 0x03110067, 0x03440067, + 0x037800ce, 0x03df00ce, 0x044600ce, 0x04ad00ce, 0x051400ce, 0x057b00c5, 0x05dd00bc, 0x063b00b5, + 0x06970158, 0x07420142, 0x07e30130, 0x087b0120, 0x090b0112, 0x09940106, 0x0a1700fc, 0x0a9500f2, + 0x0b0f01cb, 0x0bf401ae, 0x0ccb0195, 0x0d950180, 0x0e56016e, 0x0f0d015e, 0x0fbc0150, 0x10630143, + 0x11070264, 0x1238023e, 0x1357021d, 0x14660201, 0x156601e9, 0x165a01d3, 0x174401c0, 0x182401af, + 0x18fe0331, 0x1a9602fe, 0x1c1502d2, 0x1d7e02ad, 0x1ed4028d, 0x201a0270, 0x21520256, 0x227d0240, + 0x239f0443, 0x25c003fe, 0x27bf03c4, 0x29a10392, 0x2b6a0367, 0x2d1d0341, 0x2ebe031f, 0x304d0300, + 0x31d105b0, 0x34a80555, 0x37520507, 0x39d504c5, 0x3c37048b, 0x3e7c0458, 0x40a8042a, 0x42bd0401, + 0x44c20798, 0x488e071e, 0x4c1c06b6, 0x4f76065d, 0x52a50610, 0x55ac05cc, 0x5892058f, 0x5b590559, + 0x5e0c0a23, 0x631c0980, 0x67db08f6, 0x6c55087f, 0x70940818, 0x74a007bd, 0x787d076c, 0x7c330723, +}; + +static stbir_uint8 stbir__linear_to_srgb_uchar(float in) +{ + static const stbir__FP32 almostone = { 0x3f7fffff }; // 1-eps + static const stbir__FP32 minval = { (127-13) << 23 }; + stbir_uint32 tab,bias,scale,t; + stbir__FP32 f; + + // Clamp to [2^(-13), 1-eps]; these two values map to 0 and 1, respectively. + // The tests are carefully written so that NaNs map to 0, same as in the reference + // implementation. 
+ if (!(in > minval.f)) // written this way to catch NaNs + in = minval.f; + if (in > almostone.f) + in = almostone.f; + + // Do the table lookup and unpack bias, scale + f.f = in; + tab = fp32_to_srgb8_tab4[(f.u - minval.u) >> 20]; + bias = (tab >> 16) << 9; + scale = tab & 0xffff; + + // Grab next-highest mantissa bits and perform linear interpolation + t = (f.u >> 12) & 0xff; + return (unsigned char) ((bias + scale*t) >> 16); +} + +#else +// sRGB transition values, scaled by 1<<28 +static int stbir__srgb_offset_to_linear_scaled[256] = +{ + 0, 40738, 122216, 203693, 285170, 366648, 448125, 529603, + 611080, 692557, 774035, 855852, 942009, 1033024, 1128971, 1229926, + 1335959, 1447142, 1563542, 1685229, 1812268, 1944725, 2082664, 2226148, + 2375238, 2529996, 2690481, 2856753, 3028870, 3206888, 3390865, 3580856, + 3776916, 3979100, 4187460, 4402049, 4622919, 4850123, 5083710, 5323731, + 5570236, 5823273, 6082892, 6349140, 6622065, 6901714, 7188133, 7481369, + 7781466, 8088471, 8402427, 8723380, 9051372, 9386448, 9728650, 10078021, + 10434603, 10798439, 11169569, 11548036, 11933879, 12327139, 12727857, 13136073, + 13551826, 13975156, 14406100, 14844697, 15290987, 15745007, 16206795, 16676389, + 17153826, 17639142, 18132374, 18633560, 19142734, 19659934, 20185196, 20718552, + 21260042, 21809696, 22367554, 22933648, 23508010, 24090680, 24681686, 25281066, + 25888850, 26505076, 27129772, 27762974, 28404716, 29055026, 29713942, 30381490, + 31057708, 31742624, 32436272, 33138682, 33849884, 34569912, 35298800, 36036568, + 36783260, 37538896, 38303512, 39077136, 39859796, 40651528, 41452360, 42262316, + 43081432, 43909732, 44747252, 45594016, 46450052, 47315392, 48190064, 49074096, + 49967516, 50870356, 51782636, 52704392, 53635648, 54576432, 55526772, 56486700, + 57456236, 58435408, 59424248, 60422780, 61431036, 62449032, 63476804, 64514376, + 65561776, 66619028, 67686160, 68763192, 69850160, 70947088, 72053992, 73170912, + 74297864, 75434880, 76581976, 77739184, 
78906536, 80084040, 81271736, 82469648, + 83677792, 84896192, 86124888, 87363888, 88613232, 89872928, 91143016, 92423512, + 93714432, 95015816, 96327688, 97650056, 98982952, 100326408, 101680440, 103045072, + 104420320, 105806224, 107202800, 108610064, 110028048, 111456776, 112896264, 114346544, + 115807632, 117279552, 118762328, 120255976, 121760536, 123276016, 124802440, 126339832, + 127888216, 129447616, 131018048, 132599544, 134192112, 135795792, 137410592, 139036528, + 140673648, 142321952, 143981456, 145652208, 147334208, 149027488, 150732064, 152447968, + 154175200, 155913792, 157663776, 159425168, 161197984, 162982240, 164777968, 166585184, + 168403904, 170234160, 172075968, 173929344, 175794320, 177670896, 179559120, 181458992, + 183370528, 185293776, 187228736, 189175424, 191133888, 193104112, 195086128, 197079968, + 199085648, 201103184, 203132592, 205173888, 207227120, 209292272, 211369392, 213458480, + 215559568, 217672656, 219797792, 221934976, 224084240, 226245600, 228419056, 230604656, + 232802400, 235012320, 237234432, 239468736, 241715280, 243974080, 246245120, 248528464, + 250824112, 253132064, 255452368, 257785040, 260130080, 262487520, 264857376, 267239664, +}; + +static stbir_uint8 stbir__linear_to_srgb_uchar(float f) +{ + int x = (int) (f * (1 << 28)); // has headroom so you don't need to clamp + int v = 0; + int i; + + // Refine the guess with a short binary search. 
+ i = v + 128; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + i = v + 64; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + i = v + 32; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + i = v + 16; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + i = v + 8; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + i = v + 4; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + i = v + 2; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + i = v + 1; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i; + + return (stbir_uint8) v; +} +#endif + +static float stbir__filter_trapezoid(float x, float scale) +{ + float halfscale = scale / 2; + float t = 0.5f + halfscale; + STBIR_ASSERT(scale <= 1); + + x = (float)fabs(x); + + if (x >= t) + return 0; + else + { + float r = 0.5f - halfscale; + if (x <= r) + return 1; + else + return (t - x) / scale; + } +} + +static float stbir__support_trapezoid(float scale) +{ + STBIR_ASSERT(scale <= 1); + return 0.5f + scale / 2; +} + +static float stbir__filter_triangle(float x, float s) +{ + STBIR__UNUSED_PARAM(s); + + x = (float)fabs(x); + + if (x <= 1.0f) + return 1 - x; + else + return 0; +} + +static float stbir__filter_cubic(float x, float s) +{ + STBIR__UNUSED_PARAM(s); + + x = (float)fabs(x); + + if (x < 1.0f) + return (4 + x*x*(3*x - 6))/6; + else if (x < 2.0f) + return (8 + x*(-12 + x*(6 - x)))/6; + + return (0.0f); +} + +static float stbir__filter_catmullrom(float x, float s) +{ + STBIR__UNUSED_PARAM(s); + + x = (float)fabs(x); + + if (x < 1.0f) + return 1 - x*x*(2.5f - 1.5f*x); + else if (x < 2.0f) + return 2 - x*(4 + x*(0.5f*x - 2.5f)); + + return (0.0f); +} + +static float stbir__filter_mitchell(float x, float s) +{ + STBIR__UNUSED_PARAM(s); + + x = (float)fabs(x); + + if (x < 1.0f) + return (16 + x*x*(21 * x - 36))/18; + else if (x < 2.0f) + return (32 + x*(-60 + x*(36 - 7*x)))/18; + + return (0.0f); +} + +static float stbir__support_zero(float s) +{ + 
STBIR__UNUSED_PARAM(s); + return 0; +} + +static float stbir__support_one(float s) +{ + STBIR__UNUSED_PARAM(s); + return 1; +} + +static float stbir__support_two(float s) +{ + STBIR__UNUSED_PARAM(s); + return 2; +} + +static stbir__filter_info stbir__filter_info_table[] = { + { NULL, stbir__support_zero }, + { stbir__filter_trapezoid, stbir__support_trapezoid }, + { stbir__filter_triangle, stbir__support_one }, + { stbir__filter_cubic, stbir__support_two }, + { stbir__filter_catmullrom, stbir__support_two }, + { stbir__filter_mitchell, stbir__support_two }, +}; + +stbir__inline static int stbir__use_upsampling(float ratio) +{ + return ratio > 1; +} + +stbir__inline static int stbir__use_width_upsampling(stbir__info* stbir_info) +{ + return stbir__use_upsampling(stbir_info->horizontal_scale); +} + +stbir__inline static int stbir__use_height_upsampling(stbir__info* stbir_info) +{ + return stbir__use_upsampling(stbir_info->vertical_scale); +} + +// This is the maximum number of input samples that can affect an output sample +// with the given filter +static int stbir__get_filter_pixel_width(stbir_filter filter, float scale) +{ + STBIR_ASSERT(filter != 0); + STBIR_ASSERT(filter < STBIR__ARRAY_SIZE(stbir__filter_info_table)); + + if (stbir__use_upsampling(scale)) + return (int)ceil(stbir__filter_info_table[filter].support(1/scale) * 2); + else + return (int)ceil(stbir__filter_info_table[filter].support(scale) * 2 / scale); +} + +// This is how much to expand buffers to account for filters seeking outside +// the image boundaries. 
+static int stbir__get_filter_pixel_margin(stbir_filter filter, float scale) +{ + return stbir__get_filter_pixel_width(filter, scale) / 2; +} + +static int stbir__get_coefficient_width(stbir_filter filter, float scale) +{ + if (stbir__use_upsampling(scale)) + return (int)ceil(stbir__filter_info_table[filter].support(1 / scale) * 2); + else + return (int)ceil(stbir__filter_info_table[filter].support(scale) * 2); +} + +static int stbir__get_contributors(float scale, stbir_filter filter, int input_size, int output_size) +{ + if (stbir__use_upsampling(scale)) + return output_size; + else + return (input_size + stbir__get_filter_pixel_margin(filter, scale) * 2); +} + +static int stbir__get_total_horizontal_coefficients(stbir__info* info) +{ + return info->horizontal_num_contributors + * stbir__get_coefficient_width (info->horizontal_filter, info->horizontal_scale); +} + +static int stbir__get_total_vertical_coefficients(stbir__info* info) +{ + return info->vertical_num_contributors + * stbir__get_coefficient_width (info->vertical_filter, info->vertical_scale); +} + +static stbir__contributors* stbir__get_contributor(stbir__contributors* contributors, int n) +{ + return &contributors[n]; +} + +// For perf reasons this code is duplicated in stbir__resample_horizontal_upsample/downsample, +// if you change it here change it there too. 
+static float* stbir__get_coefficient(float* coefficients, stbir_filter filter, float scale, int n, int c) +{ + int width = stbir__get_coefficient_width(filter, scale); + return &coefficients[width*n + c]; +} + +static int stbir__edge_wrap_slow(stbir_edge edge, int n, int max) +{ + switch (edge) + { + case STBIR_EDGE_ZERO: + return 0; // we'll decode the wrong pixel here, and then overwrite with 0s later + + case STBIR_EDGE_CLAMP: + if (n < 0) + return 0; + + if (n >= max) + return max - 1; + + return n; // NOTREACHED + + case STBIR_EDGE_REFLECT: + { + if (n < 0) + { + if (-n < max) + return -n; + else + return max - 1; + } + + if (n >= max) + { + int max2 = max * 2; + if (n >= max2) + return 0; + else + return max2 - n - 1; + } + + return n; // NOTREACHED + } + + case STBIR_EDGE_WRAP: + if (n >= 0) + return (n % max); + else + { + int m = (-n) % max; + + if (m != 0) + m = max - m; + + return (m); + } + // NOTREACHED + + default: + STBIR_ASSERT(!"Unimplemented edge type"); + return 0; + } +} + +stbir__inline static int stbir__edge_wrap(stbir_edge edge, int n, int max) +{ + // avoid per-pixel switch + if (n >= 0 && n < max) + return n; + return stbir__edge_wrap_slow(edge, n, max); +} + +// What input pixels contribute to this output pixel? 
+static void stbir__calculate_sample_range_upsample(int n, float out_filter_radius, float scale_ratio, float out_shift, int* in_first_pixel, int* in_last_pixel, float* in_center_of_out) +{ + float out_pixel_center = (float)n + 0.5f; + float out_pixel_influence_lowerbound = out_pixel_center - out_filter_radius; + float out_pixel_influence_upperbound = out_pixel_center + out_filter_radius; + + float in_pixel_influence_lowerbound = (out_pixel_influence_lowerbound + out_shift) / scale_ratio; + float in_pixel_influence_upperbound = (out_pixel_influence_upperbound + out_shift) / scale_ratio; + + *in_center_of_out = (out_pixel_center + out_shift) / scale_ratio; + *in_first_pixel = (int)(floor(in_pixel_influence_lowerbound + 0.5)); + *in_last_pixel = (int)(floor(in_pixel_influence_upperbound - 0.5)); +} + +// What output pixels does this input pixel contribute to? +static void stbir__calculate_sample_range_downsample(int n, float in_pixels_radius, float scale_ratio, float out_shift, int* out_first_pixel, int* out_last_pixel, float* out_center_of_in) +{ + float in_pixel_center = (float)n + 0.5f; + float in_pixel_influence_lowerbound = in_pixel_center - in_pixels_radius; + float in_pixel_influence_upperbound = in_pixel_center + in_pixels_radius; + + float out_pixel_influence_lowerbound = in_pixel_influence_lowerbound * scale_ratio - out_shift; + float out_pixel_influence_upperbound = in_pixel_influence_upperbound * scale_ratio - out_shift; + + *out_center_of_in = in_pixel_center * scale_ratio - out_shift; + *out_first_pixel = (int)(floor(out_pixel_influence_lowerbound + 0.5)); + *out_last_pixel = (int)(floor(out_pixel_influence_upperbound - 0.5)); +} + +static void stbir__calculate_coefficients_upsample(stbir_filter filter, float scale, int in_first_pixel, int in_last_pixel, float in_center_of_out, stbir__contributors* contributor, float* coefficient_group) +{ + int i; + float total_filter = 0; + float filter_scale; + + STBIR_ASSERT(in_last_pixel - in_first_pixel <= 
(int)ceil(stbir__filter_info_table[filter].support(1/scale) * 2)); // Taken directly from stbir__get_coefficient_width() which we can't call because we don't know if we're horizontal or vertical. + + contributor->n0 = in_first_pixel; + contributor->n1 = in_last_pixel; + + STBIR_ASSERT(contributor->n1 >= contributor->n0); + + for (i = 0; i <= in_last_pixel - in_first_pixel; i++) + { + float in_pixel_center = (float)(i + in_first_pixel) + 0.5f; + coefficient_group[i] = stbir__filter_info_table[filter].kernel(in_center_of_out - in_pixel_center, 1 / scale); + + // If the coefficient is zero, skip it. (Don't do the <0 check here, we want the influence of those outside pixels.) + if (i == 0 && !coefficient_group[i]) + { + contributor->n0 = ++in_first_pixel; + i--; + continue; + } + + total_filter += coefficient_group[i]; + } + + STBIR_ASSERT(stbir__filter_info_table[filter].kernel((float)(in_last_pixel + 1) + 0.5f - in_center_of_out, 1/scale) == 0); + + STBIR_ASSERT(total_filter > 0.9); + STBIR_ASSERT(total_filter < 1.1f); // Make sure it's not way off. + + // Make sure the sum of all coefficients is 1. + filter_scale = 1 / total_filter; + + for (i = 0; i <= in_last_pixel - in_first_pixel; i++) + coefficient_group[i] *= filter_scale; + + for (i = in_last_pixel - in_first_pixel; i >= 0; i--) + { + if (coefficient_group[i]) + break; + + // This line has no weight. We can skip it. + contributor->n1 = contributor->n0 + i - 1; + } +} + +static void stbir__calculate_coefficients_downsample(stbir_filter filter, float scale_ratio, int out_first_pixel, int out_last_pixel, float out_center_of_in, stbir__contributors* contributor, float* coefficient_group) +{ + int i; + + STBIR_ASSERT(out_last_pixel - out_first_pixel <= (int)ceil(stbir__filter_info_table[filter].support(scale_ratio) * 2)); // Taken directly from stbir__get_coefficient_width() which we can't call because we don't know if we're horizontal or vertical. 
+ + contributor->n0 = out_first_pixel; + contributor->n1 = out_last_pixel; + + STBIR_ASSERT(contributor->n1 >= contributor->n0); + + for (i = 0; i <= out_last_pixel - out_first_pixel; i++) + { + float out_pixel_center = (float)(i + out_first_pixel) + 0.5f; + float x = out_pixel_center - out_center_of_in; + coefficient_group[i] = stbir__filter_info_table[filter].kernel(x, scale_ratio) * scale_ratio; + } + + STBIR_ASSERT(stbir__filter_info_table[filter].kernel((float)(out_last_pixel + 1) + 0.5f - out_center_of_in, scale_ratio) == 0); + + for (i = out_last_pixel - out_first_pixel; i >= 0; i--) + { + if (coefficient_group[i]) + break; + + // This line has no weight. We can skip it. + contributor->n1 = contributor->n0 + i - 1; + } +} + +static void stbir__normalize_downsample_coefficients(stbir__contributors* contributors, float* coefficients, stbir_filter filter, float scale_ratio, int input_size, int output_size) +{ + int num_contributors = stbir__get_contributors(scale_ratio, filter, input_size, output_size); + int num_coefficients = stbir__get_coefficient_width(filter, scale_ratio); + int i, j; + int skip; + + for (i = 0; i < output_size; i++) + { + float scale; + float total = 0; + + for (j = 0; j < num_contributors; j++) + { + if (i >= contributors[j].n0 && i <= contributors[j].n1) + { + float coefficient = *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i - contributors[j].n0); + total += coefficient; + } + else if (i < contributors[j].n0) + break; + } + + STBIR_ASSERT(total > 0.9f); + STBIR_ASSERT(total < 1.1f); + + scale = 1 / total; + + for (j = 0; j < num_contributors; j++) + { + if (i >= contributors[j].n0 && i <= contributors[j].n1) + *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i - contributors[j].n0) *= scale; + else if (i < contributors[j].n0) + break; + } + } + + // Optimize: Skip zero coefficients and contributions outside of image bounds. + // Do this after normalizing because normalization depends on the n0/n1 values. 
+ for (j = 0; j < num_contributors; j++) + { + int range, max, width; + + skip = 0; + while (*stbir__get_coefficient(coefficients, filter, scale_ratio, j, skip) == 0) + skip++; + + contributors[j].n0 += skip; + + while (contributors[j].n0 < 0) + { + contributors[j].n0++; + skip++; + } + + range = contributors[j].n1 - contributors[j].n0 + 1; + max = stbir__min(num_coefficients, range); + + width = stbir__get_coefficient_width(filter, scale_ratio); + for (i = 0; i < max; i++) + { + if (i + skip >= width) + break; + + *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i) = *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i + skip); + } + + continue; + } + + // Using min to avoid writing into invalid pixels. + for (i = 0; i < num_contributors; i++) + contributors[i].n1 = stbir__min(contributors[i].n1, output_size - 1); +} + +// Each scan line uses the same kernel values so we should calculate the kernel +// values once and then we can use them for every scan line. +static void stbir__calculate_filters(stbir__contributors* contributors, float* coefficients, stbir_filter filter, float scale_ratio, float shift, int input_size, int output_size) +{ + int n; + int total_contributors = stbir__get_contributors(scale_ratio, filter, input_size, output_size); + + if (stbir__use_upsampling(scale_ratio)) + { + float out_pixels_radius = stbir__filter_info_table[filter].support(1 / scale_ratio) * scale_ratio; + + // Looping through out pixels + for (n = 0; n < total_contributors; n++) + { + float in_center_of_out; // Center of the current out pixel in the in pixel space + int in_first_pixel, in_last_pixel; + + stbir__calculate_sample_range_upsample(n, out_pixels_radius, scale_ratio, shift, &in_first_pixel, &in_last_pixel, &in_center_of_out); + + stbir__calculate_coefficients_upsample(filter, scale_ratio, in_first_pixel, in_last_pixel, in_center_of_out, stbir__get_contributor(contributors, n), stbir__get_coefficient(coefficients, filter, scale_ratio, n, 0)); + 
} + } + else + { + float in_pixels_radius = stbir__filter_info_table[filter].support(scale_ratio) / scale_ratio; + + // Looping through in pixels + for (n = 0; n < total_contributors; n++) + { + float out_center_of_in; // Center of the current in pixel in the out pixel space + int out_first_pixel, out_last_pixel; + int n_adjusted = n - stbir__get_filter_pixel_margin(filter, scale_ratio); + + stbir__calculate_sample_range_downsample(n_adjusted, in_pixels_radius, scale_ratio, shift, &out_first_pixel, &out_last_pixel, &out_center_of_in); + + stbir__calculate_coefficients_downsample(filter, scale_ratio, out_first_pixel, out_last_pixel, out_center_of_in, stbir__get_contributor(contributors, n), stbir__get_coefficient(coefficients, filter, scale_ratio, n, 0)); + } + + stbir__normalize_downsample_coefficients(contributors, coefficients, filter, scale_ratio, input_size, output_size); + } +} + +static float* stbir__get_decode_buffer(stbir__info* stbir_info) +{ + // The 0 index of the decode buffer starts after the margin. This makes + // it okay to use negative indexes on the decode buffer. 
+ return &stbir_info->decode_buffer[stbir_info->horizontal_filter_pixel_margin * stbir_info->channels]; +} + +#define STBIR__DECODE(type, colorspace) ((type) * (STBIR_MAX_COLORSPACES) + (colorspace)) + +static void stbir__decode_scanline(stbir__info* stbir_info, int n) +{ + int c; + int channels = stbir_info->channels; + int alpha_channel = stbir_info->alpha_channel; + int type = stbir_info->type; + int colorspace = stbir_info->colorspace; + int input_w = stbir_info->input_w; + size_t input_stride_bytes = stbir_info->input_stride_bytes; + float* decode_buffer = stbir__get_decode_buffer(stbir_info); + stbir_edge edge_horizontal = stbir_info->edge_horizontal; + stbir_edge edge_vertical = stbir_info->edge_vertical; + size_t in_buffer_row_offset = stbir__edge_wrap(edge_vertical, n, stbir_info->input_h) * input_stride_bytes; + const void* input_data = (char *) stbir_info->input_data + in_buffer_row_offset; + int max_x = input_w + stbir_info->horizontal_filter_pixel_margin; + int decode = STBIR__DECODE(type, colorspace); + + int x = -stbir_info->horizontal_filter_pixel_margin; + + // special handling for STBIR_EDGE_ZERO because it needs to return an item that doesn't appear in the input, + // and we want to avoid paying overhead on every pixel if not STBIR_EDGE_ZERO + if (edge_vertical == STBIR_EDGE_ZERO && (n < 0 || n >= stbir_info->input_h)) + { + for (; x < max_x; x++) + for (c = 0; c < channels; c++) + decode_buffer[x*channels + c] = 0; + return; + } + + switch (decode) + { + case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_LINEAR): + for (; x < max_x; x++) + { + int decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = ((float)((const unsigned char*)input_data)[input_pixel_index + c]) / stbir__max_uint8_as_float; + } + break; + + case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_SRGB): + for (; x < max_x; x++) + { + int 
decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = stbir__srgb_uchar_to_linear_float[((const unsigned char*)input_data)[input_pixel_index + c]]; + + if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE)) + decode_buffer[decode_pixel_index + alpha_channel] = ((float)((const unsigned char*)input_data)[input_pixel_index + alpha_channel]) / stbir__max_uint8_as_float; + } + break; + + case STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_LINEAR): + for (; x < max_x; x++) + { + int decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = ((float)((const unsigned short*)input_data)[input_pixel_index + c]) / stbir__max_uint16_as_float; + } + break; + + case STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_SRGB): + for (; x < max_x; x++) + { + int decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = stbir__srgb_to_linear(((float)((const unsigned short*)input_data)[input_pixel_index + c]) / stbir__max_uint16_as_float); + + if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE)) + decode_buffer[decode_pixel_index + alpha_channel] = ((float)((const unsigned short*)input_data)[input_pixel_index + alpha_channel]) / stbir__max_uint16_as_float; + } + break; + + case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_LINEAR): + for (; x < max_x; x++) + { + int decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = (float)(((double)((const unsigned int*)input_data)[input_pixel_index + c]) / stbir__max_uint32_as_float); + } + 
break; + + case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_SRGB): + for (; x < max_x; x++) + { + int decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = stbir__srgb_to_linear((float)(((double)((const unsigned int*)input_data)[input_pixel_index + c]) / stbir__max_uint32_as_float)); + + if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE)) + decode_buffer[decode_pixel_index + alpha_channel] = (float)(((double)((const unsigned int*)input_data)[input_pixel_index + alpha_channel]) / stbir__max_uint32_as_float); + } + break; + + case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_LINEAR): + for (; x < max_x; x++) + { + int decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = ((const float*)input_data)[input_pixel_index + c]; + } + break; + + case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_SRGB): + for (; x < max_x; x++) + { + int decode_pixel_index = x * channels; + int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels; + for (c = 0; c < channels; c++) + decode_buffer[decode_pixel_index + c] = stbir__srgb_to_linear(((const float*)input_data)[input_pixel_index + c]); + + if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE)) + decode_buffer[decode_pixel_index + alpha_channel] = ((const float*)input_data)[input_pixel_index + alpha_channel]; + } + + break; + + default: + STBIR_ASSERT(!"Unknown type/colorspace/channels combination."); + break; + } + + if (!(stbir_info->flags & STBIR_FLAG_ALPHA_PREMULTIPLIED)) + { + for (x = -stbir_info->horizontal_filter_pixel_margin; x < max_x; x++) + { + int decode_pixel_index = x * channels; + + // If the alpha value is 0 it will clobber the color values. Make sure it's not. 
+ float alpha = decode_buffer[decode_pixel_index + alpha_channel]; +#ifndef STBIR_NO_ALPHA_EPSILON + if (stbir_info->type != STBIR_TYPE_FLOAT) { + alpha += STBIR_ALPHA_EPSILON; + decode_buffer[decode_pixel_index + alpha_channel] = alpha; + } +#endif + for (c = 0; c < channels; c++) + { + if (c == alpha_channel) + continue; + + decode_buffer[decode_pixel_index + c] *= alpha; + } + } + } + + if (edge_horizontal == STBIR_EDGE_ZERO) + { + for (x = -stbir_info->horizontal_filter_pixel_margin; x < 0; x++) + { + for (c = 0; c < channels; c++) + decode_buffer[x*channels + c] = 0; + } + for (x = input_w; x < max_x; x++) + { + for (c = 0; c < channels; c++) + decode_buffer[x*channels + c] = 0; + } + } +} + +static float* stbir__get_ring_buffer_entry(float* ring_buffer, int index, int ring_buffer_length) +{ + return &ring_buffer[index * ring_buffer_length]; +} + +static float* stbir__add_empty_ring_buffer_entry(stbir__info* stbir_info, int n) +{ + int ring_buffer_index; + float* ring_buffer; + + stbir_info->ring_buffer_last_scanline = n; + + if (stbir_info->ring_buffer_begin_index < 0) + { + ring_buffer_index = stbir_info->ring_buffer_begin_index = 0; + stbir_info->ring_buffer_first_scanline = n; + } + else + { + ring_buffer_index = (stbir_info->ring_buffer_begin_index + (stbir_info->ring_buffer_last_scanline - stbir_info->ring_buffer_first_scanline)) % stbir_info->ring_buffer_num_entries; + STBIR_ASSERT(ring_buffer_index != stbir_info->ring_buffer_begin_index); + } + + ring_buffer = stbir__get_ring_buffer_entry(stbir_info->ring_buffer, ring_buffer_index, stbir_info->ring_buffer_length_bytes / sizeof(float)); + memset(ring_buffer, 0, stbir_info->ring_buffer_length_bytes); + + return ring_buffer; +} + + +static void stbir__resample_horizontal_upsample(stbir__info* stbir_info, float* output_buffer) +{ + int x, k; + int output_w = stbir_info->output_w; + int channels = stbir_info->channels; + float* decode_buffer = stbir__get_decode_buffer(stbir_info); + stbir__contributors* 
horizontal_contributors = stbir_info->horizontal_contributors; + float* horizontal_coefficients = stbir_info->horizontal_coefficients; + int coefficient_width = stbir_info->horizontal_coefficient_width; + + for (x = 0; x < output_w; x++) + { + int n0 = horizontal_contributors[x].n0; + int n1 = horizontal_contributors[x].n1; + + int out_pixel_index = x * channels; + int coefficient_group = coefficient_width * x; + int coefficient_counter = 0; + + STBIR_ASSERT(n1 >= n0); + STBIR_ASSERT(n0 >= -stbir_info->horizontal_filter_pixel_margin); + STBIR_ASSERT(n1 >= -stbir_info->horizontal_filter_pixel_margin); + STBIR_ASSERT(n0 < stbir_info->input_w + stbir_info->horizontal_filter_pixel_margin); + STBIR_ASSERT(n1 < stbir_info->input_w + stbir_info->horizontal_filter_pixel_margin); + + switch (channels) { + case 1: + for (k = n0; k <= n1; k++) + { + int in_pixel_index = k * 1; + float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++]; + STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient; + } + break; + case 2: + for (k = n0; k <= n1; k++) + { + int in_pixel_index = k * 2; + float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++]; + STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient; + output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient; + } + break; + case 3: + for (k = n0; k <= n1; k++) + { + int in_pixel_index = k * 3; + float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++]; + STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient; + output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient; + output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient; + } + break; + case 4: + for (k = n0; k <= 
n1; k++) + { + int in_pixel_index = k * 4; + float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++]; + STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient; + output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient; + output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient; + output_buffer[out_pixel_index + 3] += decode_buffer[in_pixel_index + 3] * coefficient; + } + break; + default: + for (k = n0; k <= n1; k++) + { + int in_pixel_index = k * channels; + float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++]; + int c; + STBIR_ASSERT(coefficient != 0); + for (c = 0; c < channels; c++) + output_buffer[out_pixel_index + c] += decode_buffer[in_pixel_index + c] * coefficient; + } + break; + } + } +} + +static void stbir__resample_horizontal_downsample(stbir__info* stbir_info, float* output_buffer) +{ + int x, k; + int input_w = stbir_info->input_w; + int channels = stbir_info->channels; + float* decode_buffer = stbir__get_decode_buffer(stbir_info); + stbir__contributors* horizontal_contributors = stbir_info->horizontal_contributors; + float* horizontal_coefficients = stbir_info->horizontal_coefficients; + int coefficient_width = stbir_info->horizontal_coefficient_width; + int filter_pixel_margin = stbir_info->horizontal_filter_pixel_margin; + int max_x = input_w + filter_pixel_margin * 2; + + STBIR_ASSERT(!stbir__use_width_upsampling(stbir_info)); + + switch (channels) { + case 1: + for (x = 0; x < max_x; x++) + { + int n0 = horizontal_contributors[x].n0; + int n1 = horizontal_contributors[x].n1; + + int in_x = x - filter_pixel_margin; + int in_pixel_index = in_x * 1; + int max_n = n1; + int coefficient_group = coefficient_width * x; + + for (k = n0; k <= max_n; k++) + { + int out_pixel_index = k * 1; + float coefficient = horizontal_coefficients[coefficient_group + k - n0]; + 
STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient; + } + } + break; + + case 2: + for (x = 0; x < max_x; x++) + { + int n0 = horizontal_contributors[x].n0; + int n1 = horizontal_contributors[x].n1; + + int in_x = x - filter_pixel_margin; + int in_pixel_index = in_x * 2; + int max_n = n1; + int coefficient_group = coefficient_width * x; + + for (k = n0; k <= max_n; k++) + { + int out_pixel_index = k * 2; + float coefficient = horizontal_coefficients[coefficient_group + k - n0]; + STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient; + output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient; + } + } + break; + + case 3: + for (x = 0; x < max_x; x++) + { + int n0 = horizontal_contributors[x].n0; + int n1 = horizontal_contributors[x].n1; + + int in_x = x - filter_pixel_margin; + int in_pixel_index = in_x * 3; + int max_n = n1; + int coefficient_group = coefficient_width * x; + + for (k = n0; k <= max_n; k++) + { + int out_pixel_index = k * 3; + float coefficient = horizontal_coefficients[coefficient_group + k - n0]; + STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient; + output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient; + output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient; + } + } + break; + + case 4: + for (x = 0; x < max_x; x++) + { + int n0 = horizontal_contributors[x].n0; + int n1 = horizontal_contributors[x].n1; + + int in_x = x - filter_pixel_margin; + int in_pixel_index = in_x * 4; + int max_n = n1; + int coefficient_group = coefficient_width * x; + + for (k = n0; k <= max_n; k++) + { + int out_pixel_index = k * 4; + float coefficient = horizontal_coefficients[coefficient_group + k - n0]; + STBIR_ASSERT(coefficient != 0); + output_buffer[out_pixel_index + 0] 
+= decode_buffer[in_pixel_index + 0] * coefficient; + output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient; + output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient; + output_buffer[out_pixel_index + 3] += decode_buffer[in_pixel_index + 3] * coefficient; + } + } + break; + + default: + for (x = 0; x < max_x; x++) + { + int n0 = horizontal_contributors[x].n0; + int n1 = horizontal_contributors[x].n1; + + int in_x = x - filter_pixel_margin; + int in_pixel_index = in_x * channels; + int max_n = n1; + int coefficient_group = coefficient_width * x; + + for (k = n0; k <= max_n; k++) + { + int c; + int out_pixel_index = k * channels; + float coefficient = horizontal_coefficients[coefficient_group + k - n0]; + STBIR_ASSERT(coefficient != 0); + for (c = 0; c < channels; c++) + output_buffer[out_pixel_index + c] += decode_buffer[in_pixel_index + c] * coefficient; + } + } + break; + } +} + +static void stbir__decode_and_resample_upsample(stbir__info* stbir_info, int n) +{ + // Decode the nth scanline from the source image into the decode buffer. + stbir__decode_scanline(stbir_info, n); + + // Now resample it into the ring buffer. + if (stbir__use_width_upsampling(stbir_info)) + stbir__resample_horizontal_upsample(stbir_info, stbir__add_empty_ring_buffer_entry(stbir_info, n)); + else + stbir__resample_horizontal_downsample(stbir_info, stbir__add_empty_ring_buffer_entry(stbir_info, n)); + + // Now it's sitting in the ring buffer ready to be used as source for the vertical sampling. +} + +static void stbir__decode_and_resample_downsample(stbir__info* stbir_info, int n) +{ + // Decode the nth scanline from the source image into the decode buffer. + stbir__decode_scanline(stbir_info, n); + + memset(stbir_info->horizontal_buffer, 0, stbir_info->output_w * stbir_info->channels * sizeof(float)); + + // Now resample it into the horizontal buffer. 
+ if (stbir__use_width_upsampling(stbir_info)) + stbir__resample_horizontal_upsample(stbir_info, stbir_info->horizontal_buffer); + else + stbir__resample_horizontal_downsample(stbir_info, stbir_info->horizontal_buffer); + + // Now it's sitting in the horizontal buffer ready to be distributed into the ring buffers. +} + +// Get the specified scan line from the ring buffer. +static float* stbir__get_ring_buffer_scanline(int get_scanline, float* ring_buffer, int begin_index, int first_scanline, int ring_buffer_num_entries, int ring_buffer_length) +{ + int ring_buffer_index = (begin_index + (get_scanline - first_scanline)) % ring_buffer_num_entries; + return stbir__get_ring_buffer_entry(ring_buffer, ring_buffer_index, ring_buffer_length); +} + + +static void stbir__encode_scanline(stbir__info* stbir_info, int num_pixels, void *output_buffer, float *encode_buffer, int channels, int alpha_channel, int decode) +{ + int x; + int n; + int num_nonalpha; + stbir_uint16 nonalpha[STBIR_MAX_CHANNELS]; + + if (!(stbir_info->flags&STBIR_FLAG_ALPHA_PREMULTIPLIED)) + { + for (x=0; x < num_pixels; ++x) + { + int pixel_index = x*channels; + + float alpha = encode_buffer[pixel_index + alpha_channel]; + float reciprocal_alpha = alpha ? 1.0f / alpha : 0; + + // unrolling this produced a 1% slowdown upscaling a large RGBA linear-space image on my machine - stb + for (n = 0; n < channels; n++) + if (n != alpha_channel) + encode_buffer[pixel_index + n] *= reciprocal_alpha; + + // We added in a small epsilon to prevent the color channel from being deleted with zero alpha. + // Because we only add it for integer types, it will automatically be discarded on integer + // conversion, so we don't need to subtract it back out (which would be problematic for + // numeric precision reasons). + } + } + + // build a table of all channels that need colorspace correction, so + // we don't perform colorspace correction on channels that don't need it. 
+ for (x = 0, num_nonalpha = 0; x < channels; ++x) + { + if (x != alpha_channel || (stbir_info->flags & STBIR_FLAG_ALPHA_USES_COLORSPACE)) + { + nonalpha[num_nonalpha++] = (stbir_uint16)x; + } + } + + #define STBIR__ROUND_INT(f) ((int) ((f)+0.5)) + #define STBIR__ROUND_UINT(f) ((stbir_uint32) ((f)+0.5)) + + #ifdef STBIR__SATURATE_INT + #define STBIR__ENCODE_LINEAR8(f) stbir__saturate8 (STBIR__ROUND_INT((f) * stbir__max_uint8_as_float )) + #define STBIR__ENCODE_LINEAR16(f) stbir__saturate16(STBIR__ROUND_INT((f) * stbir__max_uint16_as_float)) + #else + #define STBIR__ENCODE_LINEAR8(f) (unsigned char ) STBIR__ROUND_INT(stbir__saturate(f) * stbir__max_uint8_as_float ) + #define STBIR__ENCODE_LINEAR16(f) (unsigned short) STBIR__ROUND_INT(stbir__saturate(f) * stbir__max_uint16_as_float) + #endif + + switch (decode) + { + case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_LINEAR): + for (x=0; x < num_pixels; ++x) + { + int pixel_index = x*channels; + + for (n = 0; n < channels; n++) + { + int index = pixel_index + n; + ((unsigned char*)output_buffer)[index] = STBIR__ENCODE_LINEAR8(encode_buffer[index]); + } + } + break; + + case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_SRGB): + for (x=0; x < num_pixels; ++x) + { + int pixel_index = x*channels; + + for (n = 0; n < num_nonalpha; n++) + { + int index = pixel_index + nonalpha[n]; + ((unsigned char*)output_buffer)[index] = stbir__linear_to_srgb_uchar(encode_buffer[index]); + } + + if (!(stbir_info->flags & STBIR_FLAG_ALPHA_USES_COLORSPACE)) + ((unsigned char *)output_buffer)[pixel_index + alpha_channel] = STBIR__ENCODE_LINEAR8(encode_buffer[pixel_index+alpha_channel]); + } + break; + + case STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_LINEAR): + for (x=0; x < num_pixels; ++x) + { + int pixel_index = x*channels; + + for (n = 0; n < channels; n++) + { + int index = pixel_index + n; + ((unsigned short*)output_buffer)[index] = STBIR__ENCODE_LINEAR16(encode_buffer[index]); + } + } + break; + + case 
STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_SRGB):
+            for (x=0; x < num_pixels; ++x)
+            {
+                int pixel_index = x*channels;
+
+                for (n = 0; n < num_nonalpha; n++)
+                {
+                    int index = pixel_index + nonalpha[n];
+                    ((unsigned short*)output_buffer)[index] = (unsigned short)STBIR__ROUND_INT(stbir__linear_to_srgb(stbir__saturate(encode_buffer[index])) * stbir__max_uint16_as_float);
+                }
+
+                if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
+                    ((unsigned short*)output_buffer)[pixel_index + alpha_channel] = STBIR__ENCODE_LINEAR16(encode_buffer[pixel_index + alpha_channel]);
+            }
+
+            break;
+
+        case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_LINEAR):
+            for (x=0; x < num_pixels; ++x)
+            {
+                int pixel_index = x*channels;
+
+                for (n = 0; n < channels; n++)
+                {
+                    int index = pixel_index + n;
+                    ((unsigned int*)output_buffer)[index] = (unsigned int)STBIR__ROUND_UINT(((double)stbir__saturate(encode_buffer[index])) * stbir__max_uint32_as_float);
+                }
+            }
+            break;
+
+        case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_SRGB):
+            for (x=0; x < num_pixels; ++x)
+            {
+                int pixel_index = x*channels;
+
+                for (n = 0; n < num_nonalpha; n++)
+                {
+                    int index = pixel_index + nonalpha[n];
+                    ((unsigned int*)output_buffer)[index] = (unsigned int)STBIR__ROUND_UINT(((double)stbir__linear_to_srgb(stbir__saturate(encode_buffer[index]))) * stbir__max_uint32_as_float);
+                }
+
+                if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
+                    ((unsigned int*)output_buffer)[pixel_index + alpha_channel] = (unsigned int)STBIR__ROUND_INT(((double)stbir__saturate(encode_buffer[pixel_index + alpha_channel])) * stbir__max_uint32_as_float);
+            }
+            break;
+
+        case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_LINEAR):
+            for (x=0; x < num_pixels; ++x)
+            {
+                int pixel_index = x*channels;
+
+                for (n = 0; n < channels; n++)
+                {
+                    int index = pixel_index + n;
+                    ((float*)output_buffer)[index] = encode_buffer[index];
+                }
+            }
+            break;
+
+        case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_SRGB):
+            for (x=0; x < num_pixels; ++x)
+            {
+                int pixel_index = x*channels;
+
+                for (n = 0; n < num_nonalpha; n++)
+                {
+                    int index = pixel_index + nonalpha[n];
+                    ((float*)output_buffer)[index] = stbir__linear_to_srgb(encode_buffer[index]);
+                }
+
+                if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
+                    ((float*)output_buffer)[pixel_index + alpha_channel] = encode_buffer[pixel_index + alpha_channel];
+            }
+            break;
+
+        default:
+            STBIR_ASSERT(!"Unknown type/colorspace/channels combination.");
+            break;
+    }
+}
+
+static void stbir__resample_vertical_upsample(stbir__info* stbir_info, int n)
+{
+    int x, k;
+    int output_w = stbir_info->output_w;
+    stbir__contributors* vertical_contributors = stbir_info->vertical_contributors;
+    float* vertical_coefficients = stbir_info->vertical_coefficients;
+    int channels = stbir_info->channels;
+    int alpha_channel = stbir_info->alpha_channel;
+    int type = stbir_info->type;
+    int colorspace = stbir_info->colorspace;
+    int ring_buffer_entries = stbir_info->ring_buffer_num_entries;
+    void* output_data = stbir_info->output_data;
+    float* encode_buffer = stbir_info->encode_buffer;
+    int decode = STBIR__DECODE(type, colorspace);
+    int coefficient_width = stbir_info->vertical_coefficient_width;
+    int coefficient_counter;
+    int contributor = n;
+
+    float* ring_buffer = stbir_info->ring_buffer;
+    int ring_buffer_begin_index = stbir_info->ring_buffer_begin_index;
+    int ring_buffer_first_scanline = stbir_info->ring_buffer_first_scanline;
+    int ring_buffer_length = stbir_info->ring_buffer_length_bytes/sizeof(float);
+
+    int n0,n1, output_row_start;
+    int coefficient_group = coefficient_width * contributor;
+
+    n0 = vertical_contributors[contributor].n0;
+    n1 = vertical_contributors[contributor].n1;
+
+    output_row_start = n * stbir_info->output_stride_bytes;
+
+    STBIR_ASSERT(stbir__use_height_upsampling(stbir_info));
+
+    memset(encode_buffer, 0, output_w * sizeof(float) * channels);
+
+    // I tried reblocking this for better cache usage of encode_buffer
+    // (using x_outer, k, x_inner), but it lost speed. -- stb
+
+    coefficient_counter = 0;
+    switch (channels) {
+        case 1:
+            for (k = n0; k <= n1; k++)
+            {
+                int coefficient_index = coefficient_counter++;
+                float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length);
+                float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
+                for (x = 0; x < output_w; ++x)
+                {
+                    int in_pixel_index = x * 1;
+                    encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient;
+                }
+            }
+            break;
+        case 2:
+            for (k = n0; k <= n1; k++)
+            {
+                int coefficient_index = coefficient_counter++;
+                float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length);
+                float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
+                for (x = 0; x < output_w; ++x)
+                {
+                    int in_pixel_index = x * 2;
+                    encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient;
+                    encode_buffer[in_pixel_index + 1] += ring_buffer_entry[in_pixel_index + 1] * coefficient;
+                }
+            }
+            break;
+        case 3:
+            for (k = n0; k <= n1; k++)
+            {
+                int coefficient_index = coefficient_counter++;
+                float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length);
+                float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
+                for (x = 0; x < output_w; ++x)
+                {
+                    int in_pixel_index = x * 3;
+                    encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient;
+                    encode_buffer[in_pixel_index + 1] += ring_buffer_entry[in_pixel_index + 1] * coefficient;
+                    encode_buffer[in_pixel_index + 2] += ring_buffer_entry[in_pixel_index + 2] * coefficient;
+                }
+            }
+            break;
+        case 4:
+            for (k = n0; k <= n1; k++)
+            {
+                int coefficient_index = coefficient_counter++;
+                float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length);
+                float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
+                for (x = 0; x < output_w; ++x)
+                {
+                    int in_pixel_index = x * 4;
+                    encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient;
+                    encode_buffer[in_pixel_index + 1] += ring_buffer_entry[in_pixel_index + 1] * coefficient;
+                    encode_buffer[in_pixel_index + 2] += ring_buffer_entry[in_pixel_index + 2] * coefficient;
+                    encode_buffer[in_pixel_index + 3] += ring_buffer_entry[in_pixel_index + 3] * coefficient;
+                }
+            }
+            break;
+        default:
+            for (k = n0; k <= n1; k++)
+            {
+                int coefficient_index = coefficient_counter++;
+                float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length);
+                float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
+                for (x = 0; x < output_w; ++x)
+                {
+                    int in_pixel_index = x * channels;
+                    int c;
+                    for (c = 0; c < channels; c++)
+                        encode_buffer[in_pixel_index + c] += ring_buffer_entry[in_pixel_index + c] * coefficient;
+                }
+            }
+            break;
+    }
+    stbir__encode_scanline(stbir_info, output_w, (char *) output_data + output_row_start, encode_buffer, channels, alpha_channel, decode);
+}
+
+static void stbir__resample_vertical_downsample(stbir__info* stbir_info, int n)
+{
+    int x, k;
+    int output_w = stbir_info->output_w;
+    stbir__contributors* vertical_contributors = stbir_info->vertical_contributors;
+    float* vertical_coefficients = stbir_info->vertical_coefficients;
+    int channels = stbir_info->channels;
+    int ring_buffer_entries = stbir_info->ring_buffer_num_entries;
+    float* horizontal_buffer = stbir_info->horizontal_buffer;
+    int coefficient_width = stbir_info->vertical_coefficient_width;
+    int contributor = n + stbir_info->vertical_filter_pixel_margin;
+
+    float* ring_buffer = stbir_info->ring_buffer;
+    int ring_buffer_begin_index = stbir_info->ring_buffer_begin_index;
+    int ring_buffer_first_scanline = stbir_info->ring_buffer_first_scanline;
+    int ring_buffer_length = stbir_info->ring_buffer_length_bytes/sizeof(float);
+    int n0,n1;
+
+    n0 = vertical_contributors[contributor].n0;
+    n1 = vertical_contributors[contributor].n1;
+
+    STBIR_ASSERT(!stbir__use_height_upsampling(stbir_info));
+
+    for (k = n0; k <= n1; k++)
+    {
+        int coefficient_index = k - n0;
+        int coefficient_group = coefficient_width * contributor;
+        float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
+
+        float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length);
+
+        switch (channels) {
+            case 1:
+                for (x = 0; x < output_w; x++)
+                {
+                    int in_pixel_index = x * 1;
+                    ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient;
+                }
+                break;
+            case 2:
+                for (x = 0; x < output_w; x++)
+                {
+                    int in_pixel_index = x * 2;
+                    ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient;
+                    ring_buffer_entry[in_pixel_index + 1] += horizontal_buffer[in_pixel_index + 1] * coefficient;
+                }
+                break;
+            case 3:
+                for (x = 0; x < output_w; x++)
+                {
+                    int in_pixel_index = x * 3;
+                    ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient;
+                    ring_buffer_entry[in_pixel_index + 1] += horizontal_buffer[in_pixel_index + 1] * coefficient;
+                    ring_buffer_entry[in_pixel_index + 2] += horizontal_buffer[in_pixel_index + 2] * coefficient;
+                }
+                break;
+            case 4:
+                for (x = 0; x < output_w; x++)
+                {
+                    int in_pixel_index = x * 4;
+                    ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient;
+                    ring_buffer_entry[in_pixel_index + 1] += horizontal_buffer[in_pixel_index + 1] * coefficient;
+                    ring_buffer_entry[in_pixel_index + 2] += horizontal_buffer[in_pixel_index + 2] * coefficient;
+                    ring_buffer_entry[in_pixel_index + 3] += horizontal_buffer[in_pixel_index + 3] * coefficient;
+                }
+                break;
+            default:
+                for (x = 0; x < output_w; x++)
+                {
+                    int in_pixel_index = x * channels;
+
+                    int c;
+                    for (c = 0; c < channels; c++)
+                        ring_buffer_entry[in_pixel_index + c] += horizontal_buffer[in_pixel_index + c] * coefficient;
+                }
+                break;
+        }
+    }
+}
+
+static void stbir__buffer_loop_upsample(stbir__info* stbir_info)
+{
+    int y;
+    float scale_ratio = stbir_info->vertical_scale;
+    float out_scanlines_radius = stbir__filter_info_table[stbir_info->vertical_filter].support(1/scale_ratio) * scale_ratio;
+
+    STBIR_ASSERT(stbir__use_height_upsampling(stbir_info));
+
+    for (y = 0; y < stbir_info->output_h; y++)
+    {
+        float in_center_of_out = 0; // Center of the current out scanline in the in scanline space
+        int in_first_scanline = 0, in_last_scanline = 0;
+
+        stbir__calculate_sample_range_upsample(y, out_scanlines_radius, scale_ratio, stbir_info->vertical_shift, &in_first_scanline, &in_last_scanline, &in_center_of_out);
+
+        STBIR_ASSERT(in_last_scanline - in_first_scanline + 1 <= stbir_info->ring_buffer_num_entries);
+
+        if (stbir_info->ring_buffer_begin_index >= 0)
+        {
+            // Get rid of whatever we don't need anymore.
+            while (in_first_scanline > stbir_info->ring_buffer_first_scanline)
+            {
+                if (stbir_info->ring_buffer_first_scanline == stbir_info->ring_buffer_last_scanline)
+                {
+                    // We just popped the last scanline off the ring buffer.
+                    // Reset it to the empty state.
+                    stbir_info->ring_buffer_begin_index = -1;
+                    stbir_info->ring_buffer_first_scanline = 0;
+                    stbir_info->ring_buffer_last_scanline = 0;
+                    break;
+                }
+                else
+                {
+                    stbir_info->ring_buffer_first_scanline++;
+                    stbir_info->ring_buffer_begin_index = (stbir_info->ring_buffer_begin_index + 1) % stbir_info->ring_buffer_num_entries;
+                }
+            }
+        }
+
+        // Load in new ones.
+        if (stbir_info->ring_buffer_begin_index < 0)
+            stbir__decode_and_resample_upsample(stbir_info, in_first_scanline);
+
+        while (in_last_scanline > stbir_info->ring_buffer_last_scanline)
+            stbir__decode_and_resample_upsample(stbir_info, stbir_info->ring_buffer_last_scanline + 1);
+
+        // Now all buffers should be ready to write a row of vertical sampling.
+        stbir__resample_vertical_upsample(stbir_info, y);
+
+        STBIR_PROGRESS_REPORT((float)y / stbir_info->output_h);
+    }
+}
+
+static void stbir__empty_ring_buffer(stbir__info* stbir_info, int first_necessary_scanline)
+{
+    int output_stride_bytes = stbir_info->output_stride_bytes;
+    int channels = stbir_info->channels;
+    int alpha_channel = stbir_info->alpha_channel;
+    int type = stbir_info->type;
+    int colorspace = stbir_info->colorspace;
+    int output_w = stbir_info->output_w;
+    void* output_data = stbir_info->output_data;
+    int decode = STBIR__DECODE(type, colorspace);
+
+    float* ring_buffer = stbir_info->ring_buffer;
+    int ring_buffer_length = stbir_info->ring_buffer_length_bytes/sizeof(float);
+
+    if (stbir_info->ring_buffer_begin_index >= 0)
+    {
+        // Get rid of whatever we don't need anymore.
+        while (first_necessary_scanline > stbir_info->ring_buffer_first_scanline)
+        {
+            if (stbir_info->ring_buffer_first_scanline >= 0 && stbir_info->ring_buffer_first_scanline < stbir_info->output_h)
+            {
+                int output_row_start = stbir_info->ring_buffer_first_scanline * output_stride_bytes;
+                float* ring_buffer_entry = stbir__get_ring_buffer_entry(ring_buffer, stbir_info->ring_buffer_begin_index, ring_buffer_length);
+                stbir__encode_scanline(stbir_info, output_w, (char *) output_data + output_row_start, ring_buffer_entry, channels, alpha_channel, decode);
+                STBIR_PROGRESS_REPORT((float)stbir_info->ring_buffer_first_scanline / stbir_info->output_h);
+            }
+
+            if (stbir_info->ring_buffer_first_scanline == stbir_info->ring_buffer_last_scanline)
+            {
+                // We just popped the last scanline off the ring buffer.
+                // Reset it to the empty state.
+                stbir_info->ring_buffer_begin_index = -1;
+                stbir_info->ring_buffer_first_scanline = 0;
+                stbir_info->ring_buffer_last_scanline = 0;
+                break;
+            }
+            else
+            {
+                stbir_info->ring_buffer_first_scanline++;
+                stbir_info->ring_buffer_begin_index = (stbir_info->ring_buffer_begin_index + 1) % stbir_info->ring_buffer_num_entries;
+            }
+        }
+    }
+}
+
+static void stbir__buffer_loop_downsample(stbir__info* stbir_info)
+{
+    int y;
+    float scale_ratio = stbir_info->vertical_scale;
+    int output_h = stbir_info->output_h;
+    float in_pixels_radius = stbir__filter_info_table[stbir_info->vertical_filter].support(scale_ratio) / scale_ratio;
+    int pixel_margin = stbir_info->vertical_filter_pixel_margin;
+    int max_y = stbir_info->input_h + pixel_margin;
+
+    STBIR_ASSERT(!stbir__use_height_upsampling(stbir_info));
+
+    for (y = -pixel_margin; y < max_y; y++)
+    {
+        float out_center_of_in; // Center of the current out scanline in the in scanline space
+        int out_first_scanline, out_last_scanline;
+
+        stbir__calculate_sample_range_downsample(y, in_pixels_radius, scale_ratio, stbir_info->vertical_shift, &out_first_scanline, &out_last_scanline, &out_center_of_in);
+
+        STBIR_ASSERT(out_last_scanline - out_first_scanline + 1 <= stbir_info->ring_buffer_num_entries);
+
+        if (out_last_scanline < 0 || out_first_scanline >= output_h)
+            continue;
+
+        stbir__empty_ring_buffer(stbir_info, out_first_scanline);
+
+        stbir__decode_and_resample_downsample(stbir_info, y);
+
+        // Load in new ones.
+        if (stbir_info->ring_buffer_begin_index < 0)
+            stbir__add_empty_ring_buffer_entry(stbir_info, out_first_scanline);
+
+        while (out_last_scanline > stbir_info->ring_buffer_last_scanline)
+            stbir__add_empty_ring_buffer_entry(stbir_info, stbir_info->ring_buffer_last_scanline + 1);
+
+        // Now the horizontal buffer is ready to write to all ring buffer rows.
+        stbir__resample_vertical_downsample(stbir_info, y);
+    }
+
+    stbir__empty_ring_buffer(stbir_info, stbir_info->output_h);
+}
+
+static void stbir__setup(stbir__info *info, int input_w, int input_h, int output_w, int output_h, int channels)
+{
+    info->input_w = input_w;
+    info->input_h = input_h;
+    info->output_w = output_w;
+    info->output_h = output_h;
+    info->channels = channels;
+}
+
+static void stbir__calculate_transform(stbir__info *info, float s0, float t0, float s1, float t1, float *transform)
+{
+    info->s0 = s0;
+    info->t0 = t0;
+    info->s1 = s1;
+    info->t1 = t1;
+
+    if (transform)
+    {
+        info->horizontal_scale = transform[0];
+        info->vertical_scale = transform[1];
+        info->horizontal_shift = transform[2];
+        info->vertical_shift = transform[3];
+    }
+    else
+    {
+        info->horizontal_scale = ((float)info->output_w / info->input_w) / (s1 - s0);
+        info->vertical_scale = ((float)info->output_h / info->input_h) / (t1 - t0);
+
+        info->horizontal_shift = s0 * info->output_w / (s1 - s0);
+        info->vertical_shift = t0 * info->output_h / (t1 - t0);
+    }
+}
+
+static void stbir__choose_filter(stbir__info *info, stbir_filter h_filter, stbir_filter v_filter)
+{
+    if (h_filter == 0)
+        h_filter = stbir__use_upsampling(info->horizontal_scale) ? STBIR_DEFAULT_FILTER_UPSAMPLE : STBIR_DEFAULT_FILTER_DOWNSAMPLE;
+    if (v_filter == 0)
+        v_filter = stbir__use_upsampling(info->vertical_scale) ? STBIR_DEFAULT_FILTER_UPSAMPLE : STBIR_DEFAULT_FILTER_DOWNSAMPLE;
+    info->horizontal_filter = h_filter;
+    info->vertical_filter = v_filter;
+}
+
+static stbir_uint32 stbir__calculate_memory(stbir__info *info)
+{
+    int pixel_margin = stbir__get_filter_pixel_margin(info->horizontal_filter, info->horizontal_scale);
+    int filter_height = stbir__get_filter_pixel_width(info->vertical_filter, info->vertical_scale);
+
+    info->horizontal_num_contributors = stbir__get_contributors(info->horizontal_scale, info->horizontal_filter, info->input_w, info->output_w);
+    info->vertical_num_contributors = stbir__get_contributors(info->vertical_scale , info->vertical_filter , info->input_h, info->output_h);
+
+    // One extra entry because floating point precision problems sometimes cause an extra to be necessary.
+    info->ring_buffer_num_entries = filter_height + 1;
+
+    info->horizontal_contributors_size = info->horizontal_num_contributors * sizeof(stbir__contributors);
+    info->horizontal_coefficients_size = stbir__get_total_horizontal_coefficients(info) * sizeof(float);
+    info->vertical_contributors_size = info->vertical_num_contributors * sizeof(stbir__contributors);
+    info->vertical_coefficients_size = stbir__get_total_vertical_coefficients(info) * sizeof(float);
+    info->decode_buffer_size = (info->input_w + pixel_margin * 2) * info->channels * sizeof(float);
+    info->horizontal_buffer_size = info->output_w * info->channels * sizeof(float);
+    info->ring_buffer_size = info->output_w * info->channels * info->ring_buffer_num_entries * sizeof(float);
+    info->encode_buffer_size = info->output_w * info->channels * sizeof(float);
+
+    STBIR_ASSERT(info->horizontal_filter != 0);
+    STBIR_ASSERT(info->horizontal_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table)); // this now happens too late
+    STBIR_ASSERT(info->vertical_filter != 0);
+    STBIR_ASSERT(info->vertical_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table)); // this now happens too late
+
+    if (stbir__use_height_upsampling(info))
+        // The horizontal buffer is for when we're downsampling the height and we
+        // can't output the result of sampling the decode buffer directly into the
+        // ring buffers.
+        info->horizontal_buffer_size = 0;
+    else
+        // The encode buffer is to retain precision in the height upsampling method
+        // and isn't used when height downsampling.
+        info->encode_buffer_size = 0;
+
+    return info->horizontal_contributors_size + info->horizontal_coefficients_size
+        + info->vertical_contributors_size + info->vertical_coefficients_size
+        + info->decode_buffer_size + info->horizontal_buffer_size
+        + info->ring_buffer_size + info->encode_buffer_size;
+}
+
+static int stbir__resize_allocated(stbir__info *info,
+                                   const void* input_data, int input_stride_in_bytes,
+                                   void* output_data, int output_stride_in_bytes,
+                                   int alpha_channel, stbir_uint32 flags, stbir_datatype type,
+                                   stbir_edge edge_horizontal, stbir_edge edge_vertical, stbir_colorspace colorspace,
+                                   void* tempmem, size_t tempmem_size_in_bytes)
+{
+    size_t memory_required = stbir__calculate_memory(info);
+
+    int width_stride_input = input_stride_in_bytes ? input_stride_in_bytes : info->channels * info->input_w * stbir__type_size[type];
+    int width_stride_output = output_stride_in_bytes ? output_stride_in_bytes : info->channels * info->output_w * stbir__type_size[type];
+
+#ifdef STBIR_DEBUG_OVERWRITE_TEST
+#define OVERWRITE_ARRAY_SIZE 8
+    unsigned char overwrite_output_before_pre[OVERWRITE_ARRAY_SIZE];
+    unsigned char overwrite_tempmem_before_pre[OVERWRITE_ARRAY_SIZE];
+    unsigned char overwrite_output_after_pre[OVERWRITE_ARRAY_SIZE];
+    unsigned char overwrite_tempmem_after_pre[OVERWRITE_ARRAY_SIZE];
+
+    size_t begin_forbidden = width_stride_output * (info->output_h - 1) + info->output_w * info->channels * stbir__type_size[type];
+    memcpy(overwrite_output_before_pre, &((unsigned char*)output_data)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE);
+    memcpy(overwrite_output_after_pre, &((unsigned char*)output_data)[begin_forbidden], OVERWRITE_ARRAY_SIZE);
+    memcpy(overwrite_tempmem_before_pre, &((unsigned char*)tempmem)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE);
+    memcpy(overwrite_tempmem_after_pre, &((unsigned char*)tempmem)[tempmem_size_in_bytes], OVERWRITE_ARRAY_SIZE);
+#endif
+
+    STBIR_ASSERT(info->channels >= 0);
+    STBIR_ASSERT(info->channels <= STBIR_MAX_CHANNELS);
+
+    if (info->channels < 0 || info->channels > STBIR_MAX_CHANNELS)
+        return 0;
+
+    STBIR_ASSERT(info->horizontal_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table));
+    STBIR_ASSERT(info->vertical_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table));
+
+    if (info->horizontal_filter >= STBIR__ARRAY_SIZE(stbir__filter_info_table))
+        return 0;
+    if (info->vertical_filter >= STBIR__ARRAY_SIZE(stbir__filter_info_table))
+        return 0;
+
+    if (alpha_channel < 0)
+        flags |= STBIR_FLAG_ALPHA_USES_COLORSPACE | STBIR_FLAG_ALPHA_PREMULTIPLIED;
+
+    if (!(flags&STBIR_FLAG_ALPHA_USES_COLORSPACE) || !(flags&STBIR_FLAG_ALPHA_PREMULTIPLIED))
+        STBIR_ASSERT(alpha_channel >= 0 && alpha_channel < info->channels);
+
+    if (alpha_channel >= info->channels)
+        return 0;
+
+    STBIR_ASSERT(tempmem);
+
+    if (!tempmem)
+        return 0;
+
+    STBIR_ASSERT(tempmem_size_in_bytes >= memory_required);
+
+    if (tempmem_size_in_bytes < memory_required)
+        return 0;
+
+    memset(tempmem, 0, tempmem_size_in_bytes);
+
+    info->input_data = input_data;
+    info->input_stride_bytes = width_stride_input;
+
+    info->output_data = output_data;
+    info->output_stride_bytes = width_stride_output;
+
+    info->alpha_channel = alpha_channel;
+    info->flags = flags;
+    info->type = type;
+    info->edge_horizontal = edge_horizontal;
+    info->edge_vertical = edge_vertical;
+    info->colorspace = colorspace;
+
+    info->horizontal_coefficient_width = stbir__get_coefficient_width (info->horizontal_filter, info->horizontal_scale);
+    info->vertical_coefficient_width = stbir__get_coefficient_width (info->vertical_filter , info->vertical_scale );
+    info->horizontal_filter_pixel_width = stbir__get_filter_pixel_width (info->horizontal_filter, info->horizontal_scale);
+    info->vertical_filter_pixel_width = stbir__get_filter_pixel_width (info->vertical_filter , info->vertical_scale );
+    info->horizontal_filter_pixel_margin = stbir__get_filter_pixel_margin(info->horizontal_filter, info->horizontal_scale);
+    info->vertical_filter_pixel_margin = stbir__get_filter_pixel_margin(info->vertical_filter , info->vertical_scale );
+
+    info->ring_buffer_length_bytes = info->output_w * info->channels * sizeof(float);
+    info->decode_buffer_pixels = info->input_w + info->horizontal_filter_pixel_margin * 2;
+
+#define STBIR__NEXT_MEMPTR(current, newtype) (newtype*)(((unsigned char*)current) + current##_size)
+
+    info->horizontal_contributors = (stbir__contributors *) tempmem;
+    info->horizontal_coefficients = STBIR__NEXT_MEMPTR(info->horizontal_contributors, float);
+    info->vertical_contributors = STBIR__NEXT_MEMPTR(info->horizontal_coefficients, stbir__contributors);
+    info->vertical_coefficients = STBIR__NEXT_MEMPTR(info->vertical_contributors, float);
+    info->decode_buffer = STBIR__NEXT_MEMPTR(info->vertical_coefficients, float);
+
+    if (stbir__use_height_upsampling(info))
+    {
+        info->horizontal_buffer = NULL;
+        info->ring_buffer = STBIR__NEXT_MEMPTR(info->decode_buffer, float);
+        info->encode_buffer = STBIR__NEXT_MEMPTR(info->ring_buffer, float);
+
+        STBIR_ASSERT((size_t)STBIR__NEXT_MEMPTR(info->encode_buffer, unsigned char) == (size_t)tempmem + tempmem_size_in_bytes);
+    }
+    else
+    {
+        info->horizontal_buffer = STBIR__NEXT_MEMPTR(info->decode_buffer, float);
+        info->ring_buffer = STBIR__NEXT_MEMPTR(info->horizontal_buffer, float);
+        info->encode_buffer = NULL;
+
+        STBIR_ASSERT((size_t)STBIR__NEXT_MEMPTR(info->ring_buffer, unsigned char) == (size_t)tempmem + tempmem_size_in_bytes);
+    }
+
+#undef STBIR__NEXT_MEMPTR
+
+    // This signals that the ring buffer is empty
+    info->ring_buffer_begin_index = -1;
+
+    stbir__calculate_filters(info->horizontal_contributors, info->horizontal_coefficients, info->horizontal_filter, info->horizontal_scale, info->horizontal_shift, info->input_w, info->output_w);
+    stbir__calculate_filters(info->vertical_contributors, info->vertical_coefficients, info->vertical_filter, info->vertical_scale, info->vertical_shift, info->input_h, info->output_h);
+
+    STBIR_PROGRESS_REPORT(0);
+
+    if (stbir__use_height_upsampling(info))
+        stbir__buffer_loop_upsample(info);
+    else
+        stbir__buffer_loop_downsample(info);
+
+    STBIR_PROGRESS_REPORT(1);
+
+#ifdef STBIR_DEBUG_OVERWRITE_TEST
+    STBIR_ASSERT(memcmp(overwrite_output_before_pre, &((unsigned char*)output_data)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE) == 0);
+    STBIR_ASSERT(memcmp(overwrite_output_after_pre, &((unsigned char*)output_data)[begin_forbidden], OVERWRITE_ARRAY_SIZE) == 0);
+    STBIR_ASSERT(memcmp(overwrite_tempmem_before_pre, &((unsigned char*)tempmem)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE) == 0);
+    STBIR_ASSERT(memcmp(overwrite_tempmem_after_pre, &((unsigned char*)tempmem)[tempmem_size_in_bytes], OVERWRITE_ARRAY_SIZE) == 0);
+#endif
+
+    return 1;
+}
+
+
+static int stbir__resize_arbitrary(
+    void *alloc_context,
+    const void* input_data, int input_w, int input_h, int input_stride_in_bytes,
+    void* output_data, int output_w, int output_h, int output_stride_in_bytes,
+    float s0, float t0, float s1, float t1, float *transform,
+    int channels, int alpha_channel, stbir_uint32 flags, stbir_datatype type,
+    stbir_filter h_filter, stbir_filter v_filter,
+    stbir_edge edge_horizontal, stbir_edge edge_vertical, stbir_colorspace colorspace)
+{
+    stbir__info info;
+    int result;
+    size_t memory_required;
+    void* extra_memory;
+
+    stbir__setup(&info, input_w, input_h, output_w, output_h, channels);
+    stbir__calculate_transform(&info, s0,t0,s1,t1,transform);
+    stbir__choose_filter(&info, h_filter, v_filter);
+    memory_required = stbir__calculate_memory(&info);
+    extra_memory = STBIR_MALLOC(memory_required, alloc_context);
+
+    if (!extra_memory)
+        return 0;
+
+    result = stbir__resize_allocated(&info, input_data, input_stride_in_bytes,
+                                     output_data, output_stride_in_bytes,
+                                     alpha_channel, flags, type,
+                                     edge_horizontal, edge_vertical,
+                                     colorspace, extra_memory, memory_required);
+
+    STBIR_FREE(extra_memory, alloc_context);
+
+    return result;
+}
+
+STBIRDEF int stbir_resize_uint8( const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                 unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                 int num_channels)
+{
+    return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,-1,0, STBIR_TYPE_UINT8, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT,
+        STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_LINEAR);
+}
+
+STBIRDEF int stbir_resize_float( const float *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                 float *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                 int num_channels)
+{
+    return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,-1,0, STBIR_TYPE_FLOAT, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT,
+        STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_LINEAR);
+}
+
+STBIRDEF int stbir_resize_uint8_srgb(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                     unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                     int num_channels, int alpha_channel, int flags)
+{
+    return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT8, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT,
+        STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB);
+}
+
+STBIRDEF int stbir_resize_uint8_srgb_edgemode(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                              unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                              int num_channels, int alpha_channel, int flags,
+                                              stbir_edge edge_wrap_mode)
+{
+    return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT8, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT,
+        edge_wrap_mode, edge_wrap_mode, STBIR_COLORSPACE_SRGB);
+}
+
+STBIRDEF int stbir_resize_uint8_generic( const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                         unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                         int num_channels, int alpha_channel, int flags,
+                                         stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space,
+                                         void *alloc_context)
+{
+    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT8, filter, filter,
+        edge_wrap_mode, edge_wrap_mode, space);
+}
+
+STBIRDEF int stbir_resize_uint16_generic(const stbir_uint16 *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                         stbir_uint16 *output_pixels , int output_w, int output_h, int output_stride_in_bytes,
+                                         int num_channels, int alpha_channel, int flags,
+                                         stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space,
+                                         void *alloc_context)
+{
+    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT16, filter, filter,
+        edge_wrap_mode, edge_wrap_mode, space);
+}
+
+
+STBIRDEF int stbir_resize_float_generic( const float *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                         float *output_pixels , int output_w, int output_h, int output_stride_in_bytes,
+                                         int num_channels, int alpha_channel, int flags,
+                                         stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space,
+                                         void *alloc_context)
+{
+    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_FLOAT, filter, filter,
+        edge_wrap_mode, edge_wrap_mode, space);
+}
+
+
+STBIRDEF int stbir_resize( const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                           void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                           stbir_datatype datatype,
+                           int num_channels, int alpha_channel, int flags,
+                           stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical,
+                           stbir_filter filter_horizontal, stbir_filter filter_vertical,
+                           stbir_colorspace space, void *alloc_context)
+{
+    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,alpha_channel,flags, datatype, filter_horizontal, filter_vertical,
+        edge_mode_horizontal, edge_mode_vertical, space);
+}
+
+
+STBIRDEF int stbir_resize_subpixel(const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                   void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                   stbir_datatype datatype,
+                                   int num_channels, int alpha_channel, int flags,
+                                   stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical,
+                                   stbir_filter filter_horizontal, stbir_filter filter_vertical,
+                                   stbir_colorspace space, void *alloc_context,
+                                   float x_scale, float y_scale,
+                                   float x_offset, float y_offset)
+{
+    float transform[4];
+    transform[0] = x_scale;
+    transform[1] = y_scale;
+    transform[2] = x_offset;
+    transform[3] = y_offset;
+    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,transform,num_channels,alpha_channel,flags, datatype, filter_horizontal, filter_vertical,
+        edge_mode_horizontal, edge_mode_vertical, space);
+}
+
+STBIRDEF int stbir_resize_region( const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                  void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                  stbir_datatype datatype,
+                                  int num_channels, int alpha_channel, int flags,
+                                  stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical,
+                                  stbir_filter filter_horizontal, stbir_filter filter_vertical,
+                                  stbir_colorspace space, void *alloc_context,
+                                  float s0, float t0, float s1, float t1)
+{
+    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        s0,t0,s1,t1,NULL,num_channels,alpha_channel,flags, datatype, filter_horizontal, filter_vertical,
+        edge_mode_horizontal, edge_mode_vertical, space);
+}
+
+#endif // STB_IMAGE_RESIZE_IMPLEMENTATION
+
+/*
+------------------------------------------------------------------------------
+This software is available under 2 licenses -- choose whichever you prefer.
+------------------------------------------------------------------------------
+ALTERNATIVE A - MIT License
+Copyright (c) 2017 Sean Barrett
+Permission is hereby granted, free of charge, to any person obtaining a copy of
+this software and associated documentation files (the "Software"), to deal in
+the Software without restriction, including without limitation the rights to
+use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
+of the Software, and to permit persons to whom the Software is furnished to do
+so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+------------------------------------------------------------------------------
+ALTERNATIVE B - Public Domain (www.unlicense.org)
+This is free and unencumbered software released into the public domain.
+Anyone is free to copy, modify, publish, use, compile, sell, or distribute this
+software, either in source code form or as a compiled binary, for any purpose,
+commercial or non-commercial, and by any means.
+In jurisdictions that recognize copyright laws, the author or authors of this
+software dedicate any and all copyright interest in the software to the public
+domain. We make this dedication for the benefit of the public at large and to
+the detriment of our heirs and successors. We intend this dedication to be an
+overt act of relinquishment in perpetuity of all present and future rights to
+this software under copyright law.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+------------------------------------------------------------------------------
+*/
diff --git a/external/tinyobjloader b/external/tinyobjloader
new file mode 160000
index 000000000..d541711a7
--- /dev/null
+++ b/external/tinyobjloader
@@ -0,0 +1 @@
+Subproject commit d541711a794343de4ef5ea76f037c9fb9c127a55
diff --git a/premake5.lua b/premake5.lua
index 2953922b2..1c900c876 100644
--- a/premake5.lua
+++ b/premake5.lua
@@ -131,6 +131,14 @@ function baseSlangProject(name, baseDir)
     --
     project(name)
+    -- We need every project to have a stable UUID for
+    -- output formats (like Visual Studio and XCode projects)
+    -- that use UUIDs rather than names to uniquely identify
+    -- projects. If we don't have a stable UUID, then the
+    -- output files might have spurious diffs whenever we
+    -- re-run premake generation.
+    uuid(os.uuid(projectDir))
+
+    -- Set the location where the project file will be placed.
     -- We set the project files to reside in their source
     -- directory, because in Visual Studio the default
@@ -252,11 +260,16 @@ function example(name)
     -- if it is going to use Slang, so we might as well set up a suitable
     -- include path here rather than make each example do it.
     --
-    includedirs { "."
} + -- Most of the examples also need the `gfx` library, + -- which lives under `tools/`, so we will add that to the path as well. + -- + includedirs { ".", "tools" } -- The examples also need to link against the slang library, - -- so we specify that here rather than in each example. - links { "slang" } + -- and the `gfx` abstraction layer (which in turn + -- depends on the `core` library). We specify all of that here, + -- rather than in each example. + links { "slang", "core", "gfx" } end -- @@ -264,23 +277,17 @@ end -- actual projects quite simply. For example, here is the entire -- declaration of the "Hello, World" example project: -- -example "hello" - uuid "E6385042-1649-4803-9EBD-168F8B7EF131" - includedirs { ".", "tools" } - links { "core", "slang-graphics" } +example "hello-world" -- -- Note how we are calling our custom `example()` subroutine with -- the same syntax sugar that Premake usually advocates for their -- `project()` function. This allows us to treat `example` as -- a kind of specialized "subclass" of `project` -- --- The call to `uuid()` in the definition of `hello` establishes --- the UUID/GUID that will be used for the project in generated --- formats that use these as unique identifiers (e.g., Visual --- Studio solutions). Without this call, Premake will generate --- a fresh UUID for a project each time its generation logic --- runs, which can create spurious diffs. --- + +-- Let's go ahead and set up the projects for our other example now. +example "model-viewer" + -- Most of the other projects have more interesting configuration going -- on, so let's walk through them in order of increasing complexity. 
@@ -364,8 +371,8 @@ tool "slang-eval-test" tool "render-test" uuid "96610759-07B9-4EEB-A974-5C634A2E742B" - includedirs { ".", "external", "source", "tools/slang-graphics" } - links { "core", "slang", "slang-graphics" } + includedirs { ".", "external", "source", "tools/gfx" } + links { "core", "slang", "gfx" } filter { "system:windows" } systemversion "10.0.14393.0" @@ -376,12 +383,12 @@ tool "render-test" postbuildcommands { '"$(SolutionDir)tools\\copy-hlsl-libs.bat" "$(WindowsSdkDir)Redist/D3D/%{cfg.platform:lower()}/" "%{cfg.targetdir}/"'} -- --- `slang-graphics` is a utility library for doing GPU rendering +-- `gfx` is a utility library for doing GPU rendering -- and compute, which is used by both our testing and exmaples. -- It depends on teh `core` library, so we need to declare that: -- -tool "slang-graphics" +tool "gfx" uuid "222F7498-B40C-4F3F-A704-DDEB91A4484A" -- Unlike most of the code under `tools/`, this is a library -- rather than a stand-alone executable. diff --git a/slang.h b/slang.h index 3ff5feb84..6dcfbb503 100644 --- a/slang.h +++ b/slang.h @@ -474,13 +474,13 @@ extern "C" This type is generally compatible with the Windows API `HRESULT` type. In particular, negative values indicate failure results, while zero or positive results indicate success. - In general, Slang APIs always return a zero result on success, unless documented otherwise. Strictly speaking + In general, Slang APIs always return a zero result on success, unless documented otherwise. Strictly speaking a negative value indicates an error, a positive (or 0) value indicates success. This can be tested for with the macros SLANG_SUCCEEDED(x) or SLANG_FAILED(x). - - It can represent if the call was successful or not. It can also specify in an extensible manner what facility + + It can represent if the call was successful or not. It can also specify in an extensible manner what facility produced the result (as the integral 'facility') as well as what caused it (as an integral 'code'). 
- Under the covers SlangResult is represented as a int32_t. + Under the covers SlangResult is represented as a int32_t. SlangResult is designed to be compatible with COM HRESULT. @@ -493,12 +493,12 @@ extern "C" Severity - 1 fail, 0 is success - as SlangResult is signed 32 bits, means negative number indicates failure. Facility is where the error originated from. Code is the code specific to the facility. - Result codes have the following styles, + Result codes have the following styles, 1) SLANG_name 2) SLANG_s_f_name 3) SLANG_s_name - where s is S for success, E for error + where s is S for success, E for error f is the short version of the facility name Style 1 is reserved for SLANG_OK and SLANG_FAIL as they are so commonly used. @@ -516,7 +516,7 @@ extern "C" //! Get the facility the result is associated with #define SLANG_GET_RESULT_FACILITY(r) ((int32_t)(((r) >> 16) & 0x7fff)) - //! Get the result code for the facility + //! Get the result code for the facility #define SLANG_GET_RESULT_CODE(r) ((int32_t)((r) & 0xffff)) #define SLANG_MAKE_ERROR(fac, code) ((((int32_t)(fac)) << 16) | ((int32_t)(code)) | 0x80000000) @@ -530,7 +530,7 @@ extern "C" #define SLANG_FACILITY_WIN_API 7 //! Base facility -> so as to not clash with HRESULT values (values in 0x200 range do not appear used) -#define SLANG_FACILITY_BASE 0x200 +#define SLANG_FACILITY_BASE 0x200 /*! Facilities numbers must be unique across a project to make the resulting result a unique number. It can be useful to have a consistent short name for a facility, as used in the name prefix */ @@ -539,7 +539,7 @@ extern "C" should never be part of a public API. */ #define SLANG_FACILITY_INTERNAL SLANG_FACILITY_BASE + 1 - /// Base for external facilities. Facilities should be unique across modules. + /// Base for external facilities. Facilities should be unique across modules. 
#define SLANG_FACILITY_EXTERNAL_BASE 0x210 /* ************************ Win COM compatible Results ******************************/ @@ -551,18 +551,18 @@ extern "C" #define SLANG_FAIL SLANG_MAKE_ERROR(SLANG_FACILITY_WIN_INTERFACE, 5) #define SLANG_MAKE_WIN_INTERFACE_ERROR(code) SLANG_MAKE_ERROR(SLANG_FACILITY_WIN_INTERFACE, code) -#define SLANG_MAKE_WIN_API_ERROR(code) SLANG_MAKE_ERROR(SLANG_FACILITY_WIN_API, code) +#define SLANG_MAKE_WIN_API_ERROR(code) SLANG_MAKE_ERROR(SLANG_FACILITY_WIN_API, code) - //! Functionality is not implemented + //! Functionality is not implemented #define SLANG_E_NOT_IMPLEMENTED SLANG_MAKE_WIN_INTERFACE_ERROR(1) - //! Interface not be found + //! Interface not be found #define SLANG_E_NO_INTERFACE SLANG_MAKE_WIN_INTERFACE_ERROR(2) - //! Operation was aborted (did not correctly complete) + //! Operation was aborted (did not correctly complete) #define SLANG_E_ABORT SLANG_MAKE_WIN_INTERFACE_ERROR(4) - //! Indicates that a handle passed in as parameter to a method is invalid. + //! Indicates that a handle passed in as parameter to a method is invalid. #define SLANG_E_INVALID_HANDLE SLANG_MAKE_ERROR(SLANG_FACILITY_WIN_API, 6) - //! Indicates that an argument passed in as parameter to a method is invalid. + //! Indicates that an argument passed in as parameter to a method is invalid. #define SLANG_E_INVALID_ARG SLANG_MAKE_ERROR(SLANG_FACILITY_WIN_API, 0x57) //! Operation could not complete - ran out of memory #define SLANG_E_OUT_OF_MEMORY SLANG_MAKE_ERROR(SLANG_FACILITY_WIN_API, 0xe) @@ -573,10 +573,10 @@ extern "C" // Supplied buffer is too small to be able to complete #define SLANG_E_BUFFER_TOO_SMALL SLANG_MAKE_CORE_ERROR(1) - //! Used to identify a Result that has yet to be initialized. - //! It defaults to failure such that if used incorrectly will fail, as similar in concept to using an uninitialized variable. + //! Used to identify a Result that has yet to be initialized. + //! 
It defaults to failure such that if used incorrectly will fail, as similar in concept to using an uninitialized variable. #define SLANG_E_UNINITIALIZED SLANG_MAKE_CORE_ERROR(2) - //! Returned from an async method meaning the output is invalid (thus an error), but a result for the request is pending, and will be returned on a subsequent call with the async handle. + //! Returned from an async method meaning the output is invalid (thus an error), but a result for the request is pending, and will be returned on a subsequent call with the async handle. #define SLANG_E_PENDING SLANG_MAKE_CORE_ERROR(3) //! Indicates a file/resource could not be opened #define SLANG_E_CANNOT_OPEN SLANG_MAKE_CORE_ERROR(4) @@ -747,7 +747,7 @@ extern "C" - SLANG_GLSL. Generates GLSL code. - SLANG_HLSL. Generates HLSL code. - SLANG_SPIRV. Generates SPIR-V code. - */ + */ SLANG_API void spSetCodeGenTarget( SlangCompileRequest* request, SlangCompileTarget target); @@ -816,7 +816,7 @@ extern "C" /*! @brief Set options using arguments as if specified via command line. - @return Returns SlangResult. On success SLANG_SUCCEEDED(result) is true. + @return Returns SlangResult. On success SLANG_SUCCEEDED(result) is true. 
*/ SLANG_API SlangResult spProcessCommandLineArguments( SlangCompileRequest* request, @@ -1276,6 +1276,8 @@ extern "C" SLANG_API SlangMatrixLayoutMode spReflectionTypeLayout_GetMatrixLayoutMode(SlangReflectionTypeLayout* type); + SLANG_API int spReflectionTypeLayout_getGenericParamIndex(SlangReflectionTypeLayout* type); + // Variable Reflection SLANG_API char const* spReflectionVariable_GetName(SlangReflectionVariable* var); @@ -1353,8 +1355,9 @@ extern "C" SLANG_API SlangReflectionTypeLayout* spReflection_GetTypeLayout(SlangReflection* reflection, SlangReflectionType* reflectionType, SlangLayoutRules rules); SLANG_API SlangUInt spReflection_getEntryPointCount(SlangReflection* reflection); - SLANG_API SlangReflectionEntryPoint* spReflection_getEntryPointByIndex(SlangReflection* reflection, SlangUInt index); + SLANG_API SlangReflectionEntryPoint* spReflection_findEntryPointByName(SlangReflection* reflection, char const* name); + SLANG_API SlangUInt spReflection_getGlobalConstantBufferBinding(SlangReflection* reflection); SLANG_API size_t spReflection_getGlobalConstantBufferSize(SlangReflection* reflection); @@ -1638,6 +1641,11 @@ namespace slang return spReflectionTypeLayout_GetMatrixLayoutMode((SlangReflectionTypeLayout*) this); } + int getGenericParamIndex() + { + return spReflectionTypeLayout_getGenericParamIndex( + (SlangReflectionTypeLayout*) this); + } }; struct Modifier @@ -1800,6 +1808,11 @@ namespace slang } }; + enum class LayoutRules : SlangLayoutRules + { + Default = SLANG_LAYOUT_RULES_DEFAULT, + }; + struct ShaderReflection { unsigned getParameterCount() @@ -1851,6 +1864,30 @@ namespace slang { return spReflection_getGlobalConstantBufferSize((SlangReflection*)this); } + + TypeReflection* findTypeByName(const char* name) + { + return (TypeReflection*)spReflection_FindTypeByName( + (SlangReflection*) this, + name); + } + + TypeLayoutReflection* getTypeLayout( + TypeReflection* type, + LayoutRules rules = LayoutRules::Default) + { + return 
(TypeLayoutReflection*)spReflection_GetTypeLayout( + (SlangReflection*) this, + (SlangReflectionType*)type, + SlangLayoutRules(rules)); + } + + EntryPointReflection* findEntryPointByName(const char* name) + { + return (EntryPointReflection*)spReflection_findEntryPointByName( + (SlangReflection*) this, + name); + } }; } diff --git a/slang.sln b/slang.sln index 654d72a88..8adbb3822 100644 --- a/slang.sln +++ b/slang.sln @@ -3,7 +3,9 @@ Microsoft Visual Studio Solution File, Format Version 12.00 # Visual Studio 14 Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "examples", "examples", "{EB5FC2C6-D72D-B6CC-C0C1-26F3AC2E9231}" EndProject -Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "hello", "examples\hello\hello.vcxproj", "{E6385042-1649-4803-9EBD-168F8B7EF131}" +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "hello-world", "examples\hello-world\hello-world.vcxproj", "{5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}" +EndProject +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "model-viewer", "examples\model-viewer\model-viewer.vcxproj", "{639B13F2-CF07-CFEC-98FB-664A0427F154}" EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "core", "source\core\core.vcxproj", "{F9BE7957-8399-899E-0C49-E714FDDD4B65}" EndProject @@ -19,7 +21,7 @@ Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "slang-eval-test", "tools\sl EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "render-test", "tools\render-test\render-test.vcxproj", "{96610759-07B9-4EEB-A974-5C634A2E742B}" EndProject -Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "slang-graphics", "tools\slang-graphics\slang-graphics.vcxproj", "{222F7498-B40C-4F3F-A704-DDEB91A4484A}" +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "gfx", "tools\gfx\gfx.vcxproj", "{222F7498-B40C-4F3F-A704-DDEB91A4484A}" EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "slangc", "source\slangc\slangc.vcxproj", "{D56CBCEB-1EB5-4CA8-AEC4-48EA35ED61C7}" EndProject @@ -38,14 +40,22 @@ Global 
Release|x64 = Release|x64 EndGlobalSection GlobalSection(ProjectConfigurationPlatforms) = postSolution - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|Win32.ActiveCfg = Debug|Win32 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|Win32.Build.0 = Debug|Win32 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|x64.ActiveCfg = Debug|x64 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Debug|x64.Build.0 = Debug|x64 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|Win32.ActiveCfg = Release|Win32 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|Win32.Build.0 = Release|Win32 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|x64.ActiveCfg = Release|x64 - {E6385042-1649-4803-9EBD-168F8B7EF131}.Release|x64.Build.0 = Release|x64 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Debug|Win32.ActiveCfg = Debug|Win32 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Debug|Win32.Build.0 = Debug|Win32 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Debug|x64.ActiveCfg = Debug|x64 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Debug|x64.Build.0 = Debug|x64 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Release|Win32.ActiveCfg = Release|Win32 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Release|Win32.Build.0 = Release|Win32 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Release|x64.ActiveCfg = Release|x64 + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA}.Release|x64.Build.0 = Release|x64 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Debug|Win32.ActiveCfg = Debug|Win32 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Debug|Win32.Build.0 = Debug|Win32 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Debug|x64.ActiveCfg = Debug|x64 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Debug|x64.Build.0 = Debug|x64 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Release|Win32.ActiveCfg = Release|Win32 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Release|Win32.Build.0 = Release|Win32 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Release|x64.ActiveCfg = Release|x64 + {639B13F2-CF07-CFEC-98FB-664A0427F154}.Release|x64.Build.0 = Release|x64 {F9BE7957-8399-899E-0C49-E714FDDD4B65}.Debug|Win32.ActiveCfg = 
Debug|Win32 {F9BE7957-8399-899E-0C49-E714FDDD4B65}.Debug|Win32.Build.0 = Debug|Win32 {F9BE7957-8399-899E-0C49-E714FDDD4B65}.Debug|x64.ActiveCfg = Debug|x64 @@ -131,7 +141,8 @@ Global HideSolutionNode = FALSE EndGlobalSection GlobalSection(NestedProjects) = preSolution - {E6385042-1649-4803-9EBD-168F8B7EF131} = {EB5FC2C6-D72D-B6CC-C0C1-26F3AC2E9231} + {5CF41E7B-4883-A844-F1A1-BC3FDD0FB9EA} = {EB5FC2C6-D72D-B6CC-C0C1-26F3AC2E9231} + {639B13F2-CF07-CFEC-98FB-664A0427F154} = {EB5FC2C6-D72D-B6CC-C0C1-26F3AC2E9231} {66174227-8541-41FC-A6DF-4764FC66F78E} = {FD47AE19-69FD-260F-F2F1-20E65EA61D13} {0C768A18-1D25-4000-9F37-DA5FE99E3B64} = {FD47AE19-69FD-260F-F2F1-20E65EA61D13} {22C45F4F-FB6B-4535-BED1-D3F5D0C71047} = {FD47AE19-69FD-260F-F2F1-20E65EA61D13} diff --git a/source/core/core.vcxproj b/source/core/core.vcxproj index ecd8ee07b..3dbfaac3f 100644 --- a/source/core/core.vcxproj +++ b/source/core/core.vcxproj @@ -182,12 +182,9 @@ - - - diff --git a/source/core/core.vcxproj.filters b/source/core/core.vcxproj.filters index 39a164770..27b0fe82f 100644 --- a/source/core/core.vcxproj.filters +++ b/source/core/core.vcxproj.filters @@ -45,12 +45,6 @@ Header Files - - Header Files - - - Header Files - Header Files @@ -60,9 +54,6 @@ Header Files - - Header Files - Header Files diff --git a/source/core/smart-pointer.h b/source/core/smart-pointer.h index bc1683a5b..4c6744d1b 100644 --- a/source/core/smart-pointer.h +++ b/source/core/smart-pointer.h @@ -6,6 +6,8 @@ #include +#include "../../slang.h" + namespace Slang { // TODO: Need to centralize these typedefs @@ -199,13 +201,17 @@ namespace Slang T* detach() { - if (pointer) - dynamic_cast(pointer)->decreaseReference(); auto rs = pointer; pointer = nullptr; return rs; } + /// Get ready for writing (nulls contents) + SLANG_FORCE_INLINE T** writeRef() { *this = nullptr; return &pointer; } + + /// Get for read access + SLANG_FORCE_INLINE T*const* readRef() const { return &pointer; } + private: T* pointer; diff --git 
a/source/slang/reflection.cpp b/source/slang/reflection.cpp index f8d12b9e9..6661850ae 100644 --- a/source/slang/reflection.cpp +++ b/source/slang/reflection.cpp @@ -443,7 +443,7 @@ SLANG_API SlangReflectionType * spReflection_FindTypeByName(SlangReflection * re SLANG_API SlangReflectionTypeLayout* spReflection_GetTypeLayout( SlangReflection* reflection, - SlangReflectionType* inType, + SlangReflectionType* inType, SlangLayoutRules /*rules*/) { auto context = convert(reflection); @@ -674,6 +674,21 @@ SLANG_API SlangMatrixLayoutMode spReflectionTypeLayout_GetMatrixLayoutMode(Slang } +SLANG_API int spReflectionTypeLayout_getGenericParamIndex(SlangReflectionTypeLayout* inTypeLayout) +{ + auto typeLayout = convert(inTypeLayout); + if(!typeLayout) return -1; + + if(auto genericParamTypeLayout = dynamic_cast(typeLayout)) + { + return genericParamTypeLayout->paramIndex; + } + else + { + return -1; + } +} + // Variable Reflection @@ -925,7 +940,7 @@ namespace Slang return 0; } - + static VarLayout* getParameterByIndex(RefPtr typeLayout, unsigned index) { if(auto parameterGroupLayout = typeLayout.As()) @@ -1147,6 +1162,24 @@ SLANG_API SlangReflectionEntryPoint* spReflection_getEntryPointByIndex(SlangRefl return convert(program->entryPoints[(int) index].Ptr()); } +SLANG_API SlangReflectionEntryPoint* spReflection_findEntryPointByName(SlangReflection* inProgram, char const* name) +{ + auto program = convert(inProgram); + if(!program) return 0; + + // TODO: improve on dumb linear search + for(auto ep : program->entryPoints) + { + if(ep->entryPoint->getName()->text == name) + { + return convert(ep); + } + } + + return nullptr; +} + + SLANG_API SlangUInt spReflection_getGlobalConstantBufferBinding(SlangReflection* inProgram) { auto program = convert(inProgram); diff --git a/source/slang/slang.cpp b/source/slang/slang.cpp index 2b1857e07..b94e146dd 100644 --- a/source/slang/slang.cpp +++ b/source/slang/slang.cpp @@ -979,9 +979,6 @@ SLANG_API void spDestroySession( { if(!session) 
return; delete SESSION(session); -#ifdef _MSC_VER - _CrtDumpMemoryLeaks(); -#endif } SLANG_API void spAddBuiltins( @@ -1483,7 +1480,8 @@ SLANG_API SlangResult spGetEntryPointCodeBlob( } Slang::CompileResult& result = targetReq->entryPointResults[entryPointIndex]; - *outBlob = result.getBlob().detach(); + auto blob = result.getBlob(); + *outBlob = blob.detach(); return SLANG_OK; } diff --git a/tools/gfx/circular-resource-heap-d3d12.cpp b/tools/gfx/circular-resource-heap-d3d12.cpp new file mode 100644 index 000000000..20e47c4dd --- /dev/null +++ b/tools/gfx/circular-resource-heap-d3d12.cpp @@ -0,0 +1,222 @@ +#include "circular-resource-heap-d3d12.h" + +namespace gfx { +using namespace Slang; + +D3D12CircularResourceHeap::D3D12CircularResourceHeap(): + m_fence(nullptr), + m_device(nullptr), + m_blockFreeList(sizeof(Block), SLANG_ALIGN_OF(Block), 16), + m_blocks(nullptr) +{ + m_back.m_block = nullptr; + m_back.m_position = nullptr; + m_front.m_block = nullptr; + m_front.m_position = nullptr; +} + +D3D12CircularResourceHeap::~D3D12CircularResourceHeap() +{ + _freeBlockListResources(m_blocks); +} + +void D3D12CircularResourceHeap::_freeBlockListResources(const Block* start) +{ + if (start) + { + const Block* block = start; + do + { + ID3D12Resource* resource = block->m_resource; + + resource->Unmap(0, nullptr); + resource->Release(); + + // Next in list + block = block->m_next; + + } while (block != start); + } +} + +Result D3D12CircularResourceHeap::init(ID3D12Device* device, const Desc& desc, D3D12CounterFence* fence) +{ + assert(m_blocks == nullptr); + assert(desc.m_blockSize > 0); + + m_fence = fence; + m_desc = desc; + m_device = device; + + return SLANG_OK; +} + +void D3D12CircularResourceHeap::addSync(uint64_t signalValue) +{ + assert(signalValue == m_fence->getCurrentValue()); + PendingEntry entry; + entry.m_completedValue = signalValue; + entry.m_cursor = m_front; + m_pendingQueue.Add(entry); +} + +void D3D12CircularResourceHeap::updateCompleted() +{ + const 
uint64_t completedValue = m_fence->getCompletedValue(); + +#if 0 + while (m_pendingQueue.Count() != 0) + { + const PendingEntry& entry = m_pendingQueue[0]; + if (entry.m_completedValue <= completedValue) + { + m_back = entry.m_cursor; + m_pendingQueue.RemoveAt(0); + } + else + { + break; + } + } +#else + // A more efficient implementation if m_pendingQueue is implemented as a vector-like type + const int size = int(m_pendingQueue.Count()); + int end = 0; + while (end < size && m_pendingQueue[end].m_completedValue <= completedValue) + { + end++; + } + + if (end > 0) + { + // Set the back position + m_back = m_pendingQueue[end - 1].m_cursor; + if (end == size) + { + m_pendingQueue.Clear(); + } + else + { + m_pendingQueue.RemoveRange(0, end); + } + } +#endif +} + +D3D12CircularResourceHeap::Block* D3D12CircularResourceHeap::_newBlock() +{ + D3D12_RESOURCE_DESC desc; + + desc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER; + desc.Alignment = 0; + desc.Width = m_desc.m_blockSize; + desc.Height = 1; + desc.DepthOrArraySize = 1; + desc.MipLevels = 1; + desc.Format = DXGI_FORMAT_UNKNOWN; + desc.SampleDesc.Count = 1; + desc.SampleDesc.Quality = 0; + desc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR; + desc.Flags = D3D12_RESOURCE_FLAG_NONE; + + ComPtr<ID3D12Resource> resource; + Result res = m_device->CreateCommittedResource(&m_desc.m_heapProperties, m_desc.m_heapFlags, &desc, m_desc.m_initialState, nullptr, IID_PPV_ARGS(resource.writeRef())); + if (SLANG_FAILED(res)) + { + assert(!"Resource allocation failed"); + return nullptr; + } + + uint8_t* data = nullptr; + if (m_desc.m_heapProperties.Type == D3D12_HEAP_TYPE_READBACK) + { + } + else + { + // Map it, and keep it mapped + resource->Map(0, nullptr, (void**)&data); + } + + // We have no blocks -> so lets allocate the first + Block* block = (Block*)m_blockFreeList.allocate(); + block->m_next = nullptr; + + block->m_resource = resource.detach(); + block->m_start = data; + return block; +} + +D3D12CircularResourceHeap::Cursor 
D3D12CircularResourceHeap::allocate(size_t size, size_t alignment) +{ + const size_t blockSize = getBlockSize(); + + assert(size <= blockSize); + + // If nothing is allocated add the first block + if (m_blocks == nullptr) + { + Block* block = _newBlock(); + if (!block) + { + Cursor cursor = {}; + return cursor; + } + m_blocks = block; + // Make circular + block->m_next = block; + + // Point front and back to same position, as currently it is all free + m_back = { block, block->m_start }; + m_front = m_back; + } + + // If front and back are in the same block then front MUST be ahead of back (this is defined as + // an invariant, and is required for block insertion to be possible) + Block* block = m_front.m_block; + + // Check the invariant + assert(block != m_back.m_block || m_front.m_position >= m_back.m_position); + + { + uint8_t* cur = (uint8_t*)((size_t(m_front.m_position) + alignment - 1) & ~(alignment - 1)); + // Does the allocation fit? + if (cur + size <= block->m_start + blockSize) + { + // It fits + // Move the front forward + m_front.m_position = cur + size; + Cursor cursor = { block, cur }; + return cursor; + } + } + + // Okay I can't fit into current block... + + // If the next block contains back, we need to add a block, else we can use that block + if (block->m_next == m_back.m_block) + { + Block* newBlock = _newBlock(); + // Insert into the list + newBlock->m_next = block->m_next; + block->m_next = newBlock; + } + + // Use the block we are going to add to + block = block->m_next; + uint8_t* cur = (uint8_t*)((size_t(block->m_start) + alignment - 1) & ~(alignment - 1)); + // Does the allocation fit? + if (cur + size > block->m_start + blockSize) + { + assert(!"Couldn't fit into a free block(!) 
Alignment breaks it?"); + Cursor cursor = {}; + return cursor; + } + // It fits + // Move the front forward + m_front.m_block = block; + m_front.m_position = cur + size; + Cursor cursor = { block, cur }; + return cursor; +} + +} // namespace gfx diff --git a/tools/gfx/circular-resource-heap-d3d12.h b/tools/gfx/circular-resource-heap-d3d12.h new file mode 100644 index 000000000..cca981601 --- /dev/null +++ b/tools/gfx/circular-resource-heap-d3d12.h @@ -0,0 +1,206 @@ +#pragma once + +#include "../../slang-com-ptr.h" +#include "../../source/core/list.h" +#include "../../source/core/slang-free-list.h" + +#include "resource-d3d12.h" + +namespace gfx { + +/*! \brief The D3D12CircularResourceHeap is a heap suited to size-constrained, real-time allocation of resources that +are transitory in nature. It is designed to allocate resources which are used and then discarded, typically in places where +previous versions of DirectX would have used the 'DISCARD' flag. + +The idea is to have a heap from which chunks of resource can be allocated and used for GPU execution, +and which, through the addSync/updateCompleted idiom, is able to track when the usage of those resources has +completed, allowing them to be reused. The heap is arranged circularly, with new allocations made from the front, and the back +being advanced when the heap is informed that any GPU work using prior parts of the heap has completed. In this +arrangement all of the heap between the back and the front can be thought of as in use or potentially in use by the GPU. All of the heap +from the front around to the back is free and can be allocated from. It is the responsibility of the user of the heap to make +sure the invariant holds, but in most normal usage it does so naturally. + +Another feature of the heap is that it does not require upfront knowledge of how big a heap is needed. The backing resources will be expanded +dynamically with requests as needed. 
The only requirement is that no single request can be larger than the m_blockSize specified in the Desc +used to initialize the heap. This is because all the backing resources are allocated to a single size. This limitation means the D3D12CircularResourceHeap +may not be the best choice for, say, uploading a texture, because its design is really built around transitory uploads or write-backs, and so it is more suited +to constant buffers, vertex buffers, index buffers and the like. + +To upload a texture at program startup it is most likely better to use a D3D12ResourceScopeManager. + +\code{.cpp} + +typedef D3D12CircularResourceHeap Heap; + +Heap::Cursor cursor = heap.allocateVertexBuffer(sizeof(Vertex) * numVerts); +::memcpy(cursor.m_position, verts, sizeof(Vertex) * numVerts); + +// Do a command using the GPU handle +m_commandList->... +// Do another command using the GPU handle + +m_commandList->... + +// Execute the command list on the command queue +{ + ID3D12CommandList* lists[] = { m_commandList }; + m_commandQueue->ExecuteCommandLists(SLANG_COUNT_OF(lists), lists); +} + +// Add a sync point +const uint64_t signalValue = m_fence.nextSignal(m_commandQueue); +heap.addSync(signalValue); + +// The cursors cannot be used anymore + +// At some later point call updateCompleted. This will see where the GPU is at, and make resources available that the GPU no longer accesses. +heap.updateCompleted(); + +\endcode + +### Implementation + +Front and back can be in the same block, but ONLY if back is behind front, because we have to always be able to insert +new blocks in front of front. So it must be possible to do a block insertion between the two of them. + +|--B---F-----| |----------| + +When B and F are on top of one another it means there is nothing in the list. NOTE this also means that a move of front can never place it +on top of the back. 
+ +https://msdn.microsoft.com/en-us/library/windows/desktop/dn899125%28v=vs.85%29.aspx +https://msdn.microsoft.com/en-us/library/windows/desktop/mt426646%28v=vs.85%29.aspx +*/ + +class D3D12CircularResourceHeap +{ + protected: + struct Block; + public: + typedef D3D12CircularResourceHeap ThisType; + + /// The alignment used for VERTEX_BUFFER allocations + /// Strictly speaking it seems the hardware can handle 4 byte alignment, but since often in use + /// data will be copied from CPU memory to the allocation, using 16 byte alignment is superior as allows + /// significantly faster memcpy. + /// The sample that shows sizeof(float) - 4 bytes is appropriate is at the link below. + /// https://msdn.microsoft.com/en-us/library/windows/desktop/mt426646%28v=vs.85%29.aspx + enum + { + VERTEX_BUFFER_ALIGNMENT = 16, + }; + + struct Desc + { + void init() + { + { + D3D12_HEAP_PROPERTIES& props = m_heapProperties; + + props.Type = D3D12_HEAP_TYPE_UPLOAD; + props.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; + props.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; + props.CreationNodeMask = 1; + props.VisibleNodeMask = 1; + } + m_heapFlags = D3D12_HEAP_FLAG_NONE; + m_initialState = D3D12_RESOURCE_STATE_GENERIC_READ; + m_blockSize = 0; + } + + D3D12_HEAP_PROPERTIES m_heapProperties; + D3D12_HEAP_FLAGS m_heapFlags; + D3D12_RESOURCE_STATES m_initialState; + size_t m_blockSize; + }; + + /// Cursor position + struct Cursor + { + /// Get GpuHandle + SLANG_FORCE_INLINE D3D12_GPU_VIRTUAL_ADDRESS getGpuHandle() const { return m_block->m_resource->GetGPUVirtualAddress() + size_t(m_position - m_block->m_start); } + /// Must have a block and position + SLANG_FORCE_INLINE bool isValid() const { return m_block != nullptr; } + /// Calculate the offset into the underlying resource + SLANG_FORCE_INLINE size_t getOffset() const { return size_t(m_position - m_block->m_start); } + /// Get the underlying resource + SLANG_FORCE_INLINE ID3D12Resource* getResource() const { return 
m_block->m_resource; } + + Block* m_block; ///< The block the cursor points into + uint8_t* m_position; ///< The current position + }; + + /// Get the desc used to initialize the heap + SLANG_FORCE_INLINE const Desc& getDesc() const { return m_desc; } + + /// Must be called before use. + /// Block size must be at least as large as the _largest_ thing allocated + /// Also note depending on alignment of a resource allocation, the block size might also need to take into account the + /// maximum alignment used. It is a REQUIREMENT that a newly allocated resource block is large enough to hold any + /// allocation taking into account the alignment used. + Slang::Result init(ID3D12Device* device, const Desc& desc, D3D12CounterFence* fence); + + /// Get the block size + SLANG_FORCE_INLINE size_t getBlockSize() const { return m_desc.m_blockSize; } + + /// Allocate memory of the specified size and alignment + Cursor allocate(size_t size, size_t alignment); + + /// Allocate a constant buffer + SLANG_FORCE_INLINE Cursor allocateConstantBuffer(size_t size) { return allocate(size, D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT); } + /// Allocate a vertex buffer + SLANG_FORCE_INLINE Cursor allocateVertexBuffer(size_t size) { return allocate(size, VERTEX_BUFFER_ALIGNMENT); } + + /// Create a constant buffer filled in from the given data + SLANG_FORCE_INLINE Cursor newConstantBuffer(const void* data, size_t size) { Cursor cursor = allocateConstantBuffer(size); ::memcpy(cursor.m_position, data, size); return cursor; } + /// Create a constant buffer filled in from a value of type T + template <typename T> + SLANG_FORCE_INLINE Cursor newConstantBuffer(const T& in) { return newConstantBuffer(&in, sizeof(T)); } + + /// Look where the GPU has got to and release anything not currently used + void updateCompleted(); + /// Add a sync point - meaning that when this point is hit in the queue + /// all of the resources up to this point will no longer be used. 
+ void addSync(uint64_t signalValue); + + /// Get the gpu address of this cursor + D3D12_GPU_VIRTUAL_ADDRESS getGpuHandle(const Cursor& cursor) const { return cursor.m_block->m_resource->GetGPUVirtualAddress() + size_t(cursor.m_position - cursor.m_block->m_start); } + + /// Ctor + D3D12CircularResourceHeap(); + /// Dtor + ~D3D12CircularResourceHeap(); + + protected: + + struct Block + { + ID3D12Resource* m_resource; ///< The mapped resource + uint8_t* m_start; ///< Once created the resource is mapped to here + Block* m_next; ///< Points to next block in the list + }; + struct PendingEntry + { + uint64_t m_completedValue; ///< The value when this is completed + Cursor m_cursor; ///< the cursor at that point + }; + void _freeBlockListResources(const Block* block); + /// Create a new block (with associated resource), but do not add it to the block list + Block* _newBlock(); + + Block* m_blocks; ///< Circular singly linked list of blocks. nullptr initially + Slang::FreeList m_blockFreeList; ///< Free list of actual allocations of blocks + Slang::List<PendingEntry> m_pendingQueue; ///< Holds the list of pending positions. When the fence value is greater than the value on the queue entry, the entry is done. + + // Allocation is made from the front, and freed from the back. + Cursor m_back; ///< Current back position. + Cursor m_front; ///< Current front position. + + Desc m_desc; ///< Describes the heap + + D3D12CounterFence* m_fence; ///< The fence to use + ID3D12Device* m_device; ///< The device that resources will be constructed on +}; + +} // namespace gfx + diff --git a/tools/gfx/d3d-util.cpp b/tools/gfx/d3d-util.cpp new file mode 100644 index 000000000..19135707b --- /dev/null +++ b/tools/gfx/d3d-util.cpp @@ -0,0 +1,306 @@ +// d3d-util.cpp +#include "d3d-util.h" + +#include + +// We will use the C standard library just for printing error messages. 
+#include + +namespace gfx { +using namespace Slang; + +/* static */D3D_PRIMITIVE_TOPOLOGY D3DUtil::getPrimitiveTopology(PrimitiveTopology topology) +{ + switch (topology) + { + case PrimitiveTopology::TriangleList: + { + return D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST; + } + default: break; + } + return D3D11_PRIMITIVE_TOPOLOGY_UNDEFINED; +} + +/* static */DXGI_FORMAT D3DUtil::getMapFormat(Format format) +{ + switch (format) + { + case Format::RGBA_Float32: return DXGI_FORMAT_R32G32B32A32_FLOAT; + case Format::RGB_Float32: return DXGI_FORMAT_R32G32B32_FLOAT; + case Format::RG_Float32: return DXGI_FORMAT_R32G32_FLOAT; + case Format::R_Float32: return DXGI_FORMAT_R32_FLOAT; + case Format::RGBA_Unorm_UInt8: return DXGI_FORMAT_R8G8B8A8_UNORM; + case Format::R_UInt32: return DXGI_FORMAT_R32_UINT; + + case Format::D_Float32: return DXGI_FORMAT_D32_FLOAT; + case Format::D_Unorm24_S8: return DXGI_FORMAT_D24_UNORM_S8_UINT; + + default: return DXGI_FORMAT_UNKNOWN; + } +} + +/* static */DXGI_FORMAT D3DUtil::calcResourceFormat(UsageType usage, Int usageFlags, DXGI_FORMAT format) +{ + SLANG_UNUSED(usage); + if (usageFlags) + { + switch (format) + { + case DXGI_FORMAT_R32_FLOAT: /* fallthru */ + case DXGI_FORMAT_R32_UINT: + case DXGI_FORMAT_D32_FLOAT: + { + return DXGI_FORMAT_R32_TYPELESS; + } + case DXGI_FORMAT_D24_UNORM_S8_UINT: return DXGI_FORMAT_R24G8_TYPELESS; + default: break; + } + return format; + } + return format; +} + +/* static */DXGI_FORMAT D3DUtil::calcFormat(UsageType usage, DXGI_FORMAT format) +{ + switch (usage) + { + case USAGE_COUNT_OF: + case USAGE_UNKNOWN: + { + return DXGI_FORMAT_UNKNOWN; + } + case USAGE_DEPTH_STENCIL: + { + switch (format) + { + case DXGI_FORMAT_D32_FLOAT: /* fallthru */ + case DXGI_FORMAT_R32_TYPELESS: + { + return DXGI_FORMAT_D32_FLOAT; + } + case DXGI_FORMAT_R24_UNORM_X8_TYPELESS: return DXGI_FORMAT_D24_UNORM_S8_UINT; + case DXGI_FORMAT_R24G8_TYPELESS: return DXGI_FORMAT_D24_UNORM_S8_UINT; + default: break; + } + return format; + } + 
case USAGE_TARGET: + { + switch (format) + { + case DXGI_FORMAT_D32_FLOAT: /* fallthru */ + case DXGI_FORMAT_D24_UNORM_S8_UINT: + { + return DXGI_FORMAT_UNKNOWN; + } + case DXGI_FORMAT_R32_TYPELESS: return DXGI_FORMAT_R32_FLOAT; + default: break; + } + return format; + } + case USAGE_SRV: + { + switch (format) + { + case DXGI_FORMAT_D32_FLOAT: /* fallthru */ + case DXGI_FORMAT_R32_TYPELESS: + { + return DXGI_FORMAT_R32_FLOAT; + } + case DXGI_FORMAT_R24_UNORM_X8_TYPELESS: return DXGI_FORMAT_R24_UNORM_X8_TYPELESS; + default: break; + } + + return format; + } + } + + assert(!"Not reachable"); + return DXGI_FORMAT_UNKNOWN; +} + +bool D3DUtil::isTypeless(DXGI_FORMAT format) +{ + switch (format) + { + case DXGI_FORMAT_R32G32B32A32_TYPELESS: + case DXGI_FORMAT_R32G32B32_TYPELESS: + case DXGI_FORMAT_R16G16B16A16_TYPELESS: + case DXGI_FORMAT_R32G32_TYPELESS: + case DXGI_FORMAT_R32G8X24_TYPELESS: + case DXGI_FORMAT_R32_FLOAT_X8X24_TYPELESS: + case DXGI_FORMAT_R10G10B10A2_TYPELESS: + case DXGI_FORMAT_R8G8B8A8_TYPELESS: + case DXGI_FORMAT_R16G16_TYPELESS: + case DXGI_FORMAT_R32_TYPELESS: + case DXGI_FORMAT_R24_UNORM_X8_TYPELESS: + case DXGI_FORMAT_R24G8_TYPELESS: + case DXGI_FORMAT_R8G8_TYPELESS: + case DXGI_FORMAT_R16_TYPELESS: + case DXGI_FORMAT_R8_TYPELESS: + case DXGI_FORMAT_BC1_TYPELESS: + case DXGI_FORMAT_BC2_TYPELESS: + case DXGI_FORMAT_BC3_TYPELESS: + case DXGI_FORMAT_BC4_TYPELESS: + case DXGI_FORMAT_BC5_TYPELESS: + case DXGI_FORMAT_B8G8R8A8_TYPELESS: + case DXGI_FORMAT_BC6H_TYPELESS: + case DXGI_FORMAT_BC7_TYPELESS: + { + return true; + } + default: break; + } + return false; +} + +/* static */Int D3DUtil::getNumColorChannelBits(DXGI_FORMAT fmt) +{ + switch (fmt) + { + case DXGI_FORMAT_R32G32B32A32_TYPELESS: + case DXGI_FORMAT_R32G32B32A32_FLOAT: + case DXGI_FORMAT_R32G32B32A32_UINT: + case DXGI_FORMAT_R32G32B32A32_SINT: + case DXGI_FORMAT_R32G32B32_TYPELESS: + case DXGI_FORMAT_R32G32B32_FLOAT: + case DXGI_FORMAT_R32G32B32_UINT: + case DXGI_FORMAT_R32G32B32_SINT: + { 
+ return 32; + } + case DXGI_FORMAT_R16G16B16A16_TYPELESS: + case DXGI_FORMAT_R16G16B16A16_FLOAT: + case DXGI_FORMAT_R16G16B16A16_UNORM: + case DXGI_FORMAT_R16G16B16A16_UINT: + case DXGI_FORMAT_R16G16B16A16_SNORM: + case DXGI_FORMAT_R16G16B16A16_SINT: + { + return 16; + } + case DXGI_FORMAT_R10G10B10A2_TYPELESS: + case DXGI_FORMAT_R10G10B10A2_UNORM: + case DXGI_FORMAT_R10G10B10A2_UINT: + case DXGI_FORMAT_R10G10B10_XR_BIAS_A2_UNORM: + { + return 10; + } + case DXGI_FORMAT_R8G8B8A8_TYPELESS: + case DXGI_FORMAT_R8G8B8A8_UNORM: + case DXGI_FORMAT_R8G8B8A8_UNORM_SRGB: + case DXGI_FORMAT_R8G8B8A8_UINT: + case DXGI_FORMAT_R8G8B8A8_SNORM: + case DXGI_FORMAT_R8G8B8A8_SINT: + case DXGI_FORMAT_B8G8R8A8_UNORM: + case DXGI_FORMAT_B8G8R8X8_UNORM: + case DXGI_FORMAT_B8G8R8A8_TYPELESS: + case DXGI_FORMAT_B8G8R8A8_UNORM_SRGB: + case DXGI_FORMAT_B8G8R8X8_TYPELESS: + case DXGI_FORMAT_B8G8R8X8_UNORM_SRGB: + { + return 8; + } + case DXGI_FORMAT_B5G6R5_UNORM: + case DXGI_FORMAT_B5G5R5A1_UNORM: + { + return 5; + } + case DXGI_FORMAT_B4G4R4A4_UNORM: + return 4; + + default: + return 0; + } +} + +// Note: this subroutine is now only used by D3D11 for generating bytecode to go into input layouts. +// +// TODO: we can probably remove that code completely by switching to a PSO-like model across all APIs. +// +/* static */Result D3DUtil::compileHLSLShader(char const* sourcePath, char const* source, char const* entryPointName, char const* dxProfileName, ComPtr<ID3DBlob>& shaderBlobOut) +{ + // Rather than statically link against the `d3dcompile` library, we + // dynamically load it. 
+ // + // Note: A more realistic application would compile from HLSL text to D3D + // shader bytecode as part of an offline process, rather than doing it + // on-the-fly like this + // + static pD3DCompile compileFunc = nullptr; + if (!compileFunc) + { + // TODO(tfoley): maybe want to search for one of a few versions of the DLL + HMODULE compilerModule = LoadLibraryA("d3dcompiler_47.dll"); + if (!compilerModule) + { + fprintf(stderr, "error: failed to load 'd3dcompiler_47.dll'\n"); + return SLANG_FAIL; + } + + compileFunc = (pD3DCompile)GetProcAddress(compilerModule, "D3DCompile"); + if (!compileFunc) + { + fprintf(stderr, "error: failed to load symbol 'D3DCompile'\n"); + return SLANG_FAIL; + } + } + + // For this example, we turn on debug output, and turn off all + // optimization. A real application would only use these flags + // when shader debugging is needed. + UINT flags = 0; + flags |= D3DCOMPILE_DEBUG; + flags |= D3DCOMPILE_OPTIMIZATION_LEVEL0 | D3DCOMPILE_SKIP_OPTIMIZATION; + + // We will always define `__HLSL__` when compiling here, so that + // input code can react differently to being compiled as pure HLSL. + D3D_SHADER_MACRO defines[] = { + { "__HLSL__", "1" }, + { nullptr, nullptr }, + }; + + // The `D3DCompile` entry point takes a bunch of parameters, but we + // don't really need most of them for Slang-generated code. + ComPtr<ID3DBlob> shaderBlob; + ComPtr<ID3DBlob> errorBlob; + + HRESULT hr = compileFunc(source, strlen(source), sourcePath, &defines[0], nullptr, entryPointName, dxProfileName, flags, 0, + shaderBlob.writeRef(), errorBlob.writeRef()); + + // If the HLSL-to-bytecode compilation produced any diagnostic messages + // then we will print them out (whether or not the compilation failed). 
+ if (errorBlob) + { + ::fputs((const char*)errorBlob->GetBufferPointer(), stderr); + ::fflush(stderr); + ::OutputDebugStringA((const char*)errorBlob->GetBufferPointer()); + } + + SLANG_RETURN_ON_FAIL(hr); + shaderBlobOut.swap(shaderBlob); + return SLANG_OK; +} + +/* static */void D3DUtil::appendWideChars(const char* in, List<WCHAR>& out) +{ + size_t len = ::strlen(in); + + const DWORD dwFlags = 0; + int outSize = ::MultiByteToWideChar(CP_UTF8, dwFlags, in, int(len), nullptr, 0); + + if (outSize > 0) + { + const UInt prevSize = out.Count(); + out.SetSize(prevSize + len + 1); + + WCHAR* dst = out.Buffer() + prevSize; + ::MultiByteToWideChar(CP_UTF8, dwFlags, in, int(len), dst, outSize); + // Make null terminated + dst[outSize] = 0; + // Remove terminating 0 from array + out.UnsafeShrinkToSize(prevSize + outSize); + } +} + +} // namespace gfx diff --git a/tools/gfx/d3d-util.h b/tools/gfx/d3d-util.h new file mode 100644 index 000000000..04bfae63d --- /dev/null +++ b/tools/gfx/d3d-util.h @@ -0,0 +1,61 @@ +// d3d-util.h +#pragma once + +#include + +#include "../../slang-com-helper.h" + +#include "../../slang-com-ptr.h" +#include "../../source/core/list.h" + +#include "render.h" + +#include +#include + +namespace gfx { + +class D3DUtil +{ + public: + enum UsageType + { + USAGE_UNKNOWN, ///< Generally used to mark an error + USAGE_TARGET, ///< Format should be used when written as target + USAGE_DEPTH_STENCIL, ///< Format should be used when written as depth stencil + USAGE_SRV, ///< Format if being read as srv + USAGE_COUNT_OF, + }; + enum UsageFlag + { + USAGE_FLAG_MULTI_SAMPLE = 0x1, ///< If set will be used for multisampling (such as MSAA) + USAGE_FLAG_SRV = 0x2, ///< If set means will be used as a shader resource view (SRV) + }; + + /// Get primitive topology as D3D primitive topology + static D3D_PRIMITIVE_TOPOLOGY getPrimitiveTopology(PrimitiveTopology prim); + + /// Calculate size taking into account alignment. 
Alignment must be a power of 2 + static UInt calcAligned(UInt size, UInt alignment) { return (size + alignment - 1) & ~(alignment - 1); } + + /// Compile HLSL code to DXBC + static Slang::Result compileHLSLShader(char const* sourcePath, char const* source, char const* entryPointName, char const* dxProfileName, Slang::ComPtr<ID3DBlob>& shaderBlobOut); + + /// Given a slang pixel format returns the equivalent DXGI_ pixel format. If the format is not known, will return DXGI_FORMAT_UNKNOWN + static DXGI_FORMAT getMapFormat(Format format); + + /// Given the usage and format will return the most suitable format. Will return DXGI_FORMAT_UNKNOWN if combination is not possible + static DXGI_FORMAT calcFormat(UsageType usage, DXGI_FORMAT format); + /// Calculate appropriate format for creating a buffer for usage and flags + static DXGI_FORMAT calcResourceFormat(UsageType usage, Int usageFlags, DXGI_FORMAT format); + /// True if the type is 'typeless' + static bool isTypeless(DXGI_FORMAT format); + + /// Returns number of bits used for color channel for format (for channels with multiple sizes, returns smallest ie RGB565 -> 5) + static Int getNumColorChannelBits(DXGI_FORMAT fmt); + + /// Append the UTF-8 text in `in` to the wide char array `out` + static void appendWideChars(const char* in, Slang::List<WCHAR>& out); +}; + +} // namespace gfx diff --git a/tools/gfx/descriptor-heap-d3d12.cpp b/tools/gfx/descriptor-heap-d3d12.cpp new file mode 100644 index 000000000..382fc3219 --- /dev/null +++ b/tools/gfx/descriptor-heap-d3d12.cpp @@ -0,0 +1,47 @@ + +#include "descriptor-heap-d3d12.h" + +namespace gfx { +using namespace Slang; + +D3D12DescriptorHeap::D3D12DescriptorHeap(): + m_totalSize(0), + m_currentIndex(0), + m_descriptorSize(0) +{ +} + +Result D3D12DescriptorHeap::init(ID3D12Device* device, int size, D3D12_DESCRIPTOR_HEAP_TYPE type, D3D12_DESCRIPTOR_HEAP_FLAGS flags) +{ + D3D12_DESCRIPTOR_HEAP_DESC srvHeapDesc = {}; + srvHeapDesc.NumDescriptors = size; + srvHeapDesc.Flags 
= flags; + srvHeapDesc.Type = type; 
+ SLANG_RETURN_ON_FAIL(device->CreateDescriptorHeap(&srvHeapDesc, IID_PPV_ARGS(m_heap.writeRef()))); + + m_descriptorSize = device->GetDescriptorHandleIncrementSize(type); + m_totalSize = size; + + return SLANG_OK; +} + +Result D3D12DescriptorHeap::init(ID3D12Device* device, const D3D12_CPU_DESCRIPTOR_HANDLE* handles, int numHandles, D3D12_DESCRIPTOR_HEAP_TYPE type, D3D12_DESCRIPTOR_HEAP_FLAGS flags) +{ + SLANG_RETURN_ON_FAIL(init(device, numHandles, type, flags)); + D3D12_CPU_DESCRIPTOR_HANDLE dst = m_heap->GetCPUDescriptorHandleForHeapStart(); + + // Copy them all + for (int i = 0; i < numHandles; i++, dst.ptr += m_descriptorSize) + { + D3D12_CPU_DESCRIPTOR_HANDLE src = handles[i]; + if (src.ptr != 0) + { + device->CopyDescriptorsSimple(1, dst, src, type); + } + } + + return SLANG_OK; +} + +} // namespace gfx + diff --git a/tools/gfx/descriptor-heap-d3d12.h b/tools/gfx/descriptor-heap-d3d12.h new file mode 100644 index 000000000..2a814583b --- /dev/null +++ b/tools/gfx/descriptor-heap-d3d12.h @@ -0,0 +1,198 @@ +#pragma once + + +#include +#include + +#include "../../slang-com-ptr.h" +#include "../../source/core/list.h" + +namespace gfx { + +/*! \brief A simple class to manage an underlying Dx12 Descriptor Heap. Allocations are made linearly in order. It is not possible to free +individual allocations, but all allocations can be deallocated with 'deallocateAll'. 
*/ +class D3D12DescriptorHeap +{ + public: + typedef D3D12DescriptorHeap ThisType; + + /// Initialize + Slang::Result init(ID3D12Device* device, int size, D3D12_DESCRIPTOR_HEAP_TYPE type, D3D12_DESCRIPTOR_HEAP_FLAGS flags); + /// Initialize with an array of handles copying over the representation + Slang::Result init(ID3D12Device* device, const D3D12_CPU_DESCRIPTOR_HANDLE* handles, int numHandles, D3D12_DESCRIPTOR_HEAP_TYPE type, D3D12_DESCRIPTOR_HEAP_FLAGS flags); + + /// Returns the number of slots that have been used + SLANG_FORCE_INLINE int getUsedSize() const { return m_currentIndex; } + + /// Get the total amount of descriptors possible on the heap + SLANG_FORCE_INLINE int getTotalSize() const { return m_totalSize; } + /// Allocate a descriptor. Returns the index, or -1 if none left. + SLANG_FORCE_INLINE int allocate(); + /// Allocate a number of descriptors. Returns the start index (or -1 if not possible) + SLANG_FORCE_INLINE int allocate(int numDescriptors); + + /// + SLANG_FORCE_INLINE int placeAt(int index); + + /// Deallocates all allocations, and starts allocation from the start of the underlying heap again + SLANG_FORCE_INLINE void deallocateAll() { m_currentIndex = 0; } + + /// Get the size of each + SLANG_FORCE_INLINE int getDescriptorSize() const { return m_descriptorSize; } + + /// Get the GPU heap start + SLANG_FORCE_INLINE D3D12_GPU_DESCRIPTOR_HANDLE getGpuStart() const { return m_heap->GetGPUDescriptorHandleForHeapStart(); } + /// Get the CPU heap start + SLANG_FORCE_INLINE D3D12_CPU_DESCRIPTOR_HANDLE getCpuStart() const { return m_heap->GetCPUDescriptorHandleForHeapStart(); } + + /// Get the GPU handle at the specified index + SLANG_FORCE_INLINE D3D12_GPU_DESCRIPTOR_HANDLE getGpuHandle(int index) const; + /// Get the CPU handle at the specified index + SLANG_FORCE_INLINE D3D12_CPU_DESCRIPTOR_HANDLE getCpuHandle(int index) const; + + /// Get the underlying heap + SLANG_FORCE_INLINE ID3D12DescriptorHeap* getHeap() const { return m_heap; } + + /// 
Ctor + D3D12DescriptorHeap(); + +protected: + Slang::ComPtr<ID3D12DescriptorHeap> m_heap; ///< The underlying heap being allocated from + int m_totalSize; ///< Total amount of allocations available on the heap + int m_currentIndex; ///< The current descriptor + int m_descriptorSize; ///< The size of each descriptor +}; + +/// A host-visible descriptor, used as "backing storage" for a view. +/// +/// This type is intended to be used to represent descriptors that +/// are allocated and freed through a `D3D12HostVisibleDescriptorAllocator`. +struct D3D12HostVisibleDescriptor +{ + D3D12_CPU_DESCRIPTOR_HANDLE cpuHandle; +}; + +/// An allocator for host-visible descriptors. +/// +/// Unlike the `D3D12DescriptorHeap` type, this class allows for both +/// allocation and freeing of descriptors, by maintaining a free list. +/// In order to keep the implementation simple, this class only supports +/// allocation of single descriptors and not ranges. +/// +class D3D12HostVisibleDescriptorAllocator +{ + ID3D12Device* m_device; + int m_chunkSize; + D3D12_DESCRIPTOR_HEAP_TYPE m_type; + + D3D12DescriptorHeap m_heap; + Slang::List<D3D12HostVisibleDescriptor> m_freeList; + Slang::List<D3D12DescriptorHeap> m_heaps; + +public: + D3D12HostVisibleDescriptorAllocator() + {} + + Slang::Result init(ID3D12Device* device, int chunkSize, D3D12_DESCRIPTOR_HEAP_TYPE type) + { + m_device = device; + m_chunkSize = chunkSize; + m_type = type; + + SLANG_RETURN_ON_FAIL(m_heap.init(m_device, m_chunkSize, m_type, D3D12_DESCRIPTOR_HEAP_FLAG_NONE)); + + return SLANG_OK; + } + + Slang::Result allocate(D3D12HostVisibleDescriptor* outDescriptor) + { + // TODO: this allocator would take some work to make thread-safe + + if(m_freeList.Count() > 0) + { + auto descriptor = m_freeList[0]; + m_freeList.FastRemoveAt(0); + + *outDescriptor = descriptor; + return SLANG_OK; + } + + int index = m_heap.allocate(); + if(index < 0) + { + // Allocate a new heap and try again. 
+ m_heaps.Add(m_heap); + SLANG_RETURN_ON_FAIL(m_heap.init(m_device, m_chunkSize, m_type, D3D12_DESCRIPTOR_HEAP_FLAG_NONE)); + + index = m_heap.allocate(); + if(index < 0) + { + assert(!"descriptor allocation failed on fresh heap"); + return SLANG_FAIL; + } + } + + D3D12HostVisibleDescriptor descriptor; + descriptor.cpuHandle = m_heap.getCpuHandle(index); + + *outDescriptor = descriptor; + return SLANG_OK; + } + + void free(D3D12HostVisibleDescriptor descriptor) + { + m_freeList.Add(descriptor); + } +}; + +// --------------------------------------------------------------------------- +int D3D12DescriptorHeap::allocate() +{ + assert(m_currentIndex < m_totalSize); + if (m_currentIndex < m_totalSize) + { + return m_currentIndex++; + } + return -1; +} +// --------------------------------------------------------------------------- +int D3D12DescriptorHeap::allocate(int numDescriptors) +{ + assert(m_currentIndex + numDescriptors <= m_totalSize); + if (m_currentIndex + numDescriptors <= m_totalSize) + { + const int index = m_currentIndex; + m_currentIndex += numDescriptors; + return index; + } + return -1; +} +// --------------------------------------------------------------------------- +SLANG_FORCE_INLINE int D3D12DescriptorHeap::placeAt(int index) +{ + assert(index >= 0 && index < m_totalSize); + m_currentIndex = index + 1; + return index; +} + +// --------------------------------------------------------------------------- +SLANG_FORCE_INLINE D3D12_CPU_DESCRIPTOR_HANDLE D3D12DescriptorHeap::getCpuHandle(int index) const +{ + assert(index >= 0 && index < m_totalSize); + D3D12_CPU_DESCRIPTOR_HANDLE start = m_heap->GetCPUDescriptorHandleForHeapStart(); + D3D12_CPU_DESCRIPTOR_HANDLE dst; + dst.ptr = start.ptr + m_descriptorSize * index; + return dst; +} +// --------------------------------------------------------------------------- +SLANG_FORCE_INLINE D3D12_GPU_DESCRIPTOR_HANDLE D3D12DescriptorHeap::getGpuHandle(int index) const +{ + assert(index >= 0 && 
index < 
m_totalSize); + D3D12_GPU_DESCRIPTOR_HANDLE start = m_heap->GetGPUDescriptorHandleForHeapStart(); + D3D12_GPU_DESCRIPTOR_HANDLE dst; + dst.ptr = start.ptr + m_descriptorSize * index; + return dst; +} + +} // namespace gfx + diff --git a/tools/gfx/gfx.vcxproj b/tools/gfx/gfx.vcxproj new file mode 100644 index 000000000..cbafe84b1 --- /dev/null +++ b/tools/gfx/gfx.vcxproj @@ -0,0 +1,215 @@ + + + + + Debug + Win32 + + + Debug + x64 + + + Release + Win32 + + + Release + x64 + + + + {222F7498-B40C-4F3F-A704-DDEB91A4484A} + true + Win32Proj + gfx + 10.0.14393.0 + + + + StaticLibrary + true + Unicode + v140 + + + StaticLibrary + true + Unicode + v140 + + + StaticLibrary + false + Unicode + v140 + + + StaticLibrary + false + Unicode + v140 + + + + + + + + + + + + + + + + + + + ..\..\bin\windows-x86\debug\ + ..\..\intermediate\windows-x86\debug\gfx\ + gfx + .lib + + + ..\..\bin\windows-x64\debug\ + ..\..\intermediate\windows-x64\debug\gfx\ + gfx + .lib + + + ..\..\bin\windows-x86\release\ + ..\..\intermediate\windows-x86\release\gfx\ + gfx + .lib + + + ..\..\bin\windows-x64\release\ + ..\..\intermediate\windows-x64\release\gfx\ + gfx + .lib + + + + NotUsing + Level3 + _DEBUG;%(PreprocessorDefinitions) + ..\..;..\..\external;..\..\source;%(AdditionalIncludeDirectories) + EditAndContinue + Disabled + MultiThreadedDebug + + + Windows + true + + + "$(SolutionDir)tools\copy-hlsl-libs.bat" "$(WindowsSdkDir)Redist/D3D/x86/" "../../bin/windows-x86/debug/" + + + + + NotUsing + Level3 + _DEBUG;%(PreprocessorDefinitions) + ..\..;..\..\external;..\..\source;%(AdditionalIncludeDirectories) + EditAndContinue + Disabled + MultiThreadedDebug + + + Windows + true + + + "$(SolutionDir)tools\copy-hlsl-libs.bat" "$(WindowsSdkDir)Redist/D3D/x64/" "../../bin/windows-x64/debug/" + + + + + NotUsing + Level3 + NDEBUG;%(PreprocessorDefinitions) + ..\..;..\..\external;..\..\source;%(AdditionalIncludeDirectories) + Full + true + true + false + true + MultiThreaded + + + Windows + true + true + + + 
"$(SolutionDir)tools\copy-hlsl-libs.bat" "$(WindowsSdkDir)Redist/D3D/x86/" "../../bin/windows-x86/release/" + + + + + NotUsing + Level3 + NDEBUG;%(PreprocessorDefinitions) + ..\..;..\..\external;..\..\source;%(AdditionalIncludeDirectories) + Full + true + true + false + true + MultiThreaded + + + Windows + true + true + + + "$(SolutionDir)tools\copy-hlsl-libs.bat" "$(WindowsSdkDir)Redist/D3D/x64/" "../../bin/windows-x64/release/" + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/tools/gfx/gfx.vcxproj.filters b/tools/gfx/gfx.vcxproj.filters new file mode 100644 index 000000000..f1c7f7f5e --- /dev/null +++ b/tools/gfx/gfx.vcxproj.filters @@ -0,0 +1,120 @@ + + + + + {21EB8090-0D4E-1035-B6D3-48EBA215DCB7} + + + {E9C7FDCE-D52A-8D73-7EB0-C5296AF258F6} + + + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + Header Files + + + + + Source Files + + + Source Files + + + Source Files + + + Source Files + + + Source Files + + + Source Files + + + Source Files + + + Source Files + + + Source Files + + + Source Files + + + Source Files + + + Source Files + + + Source Files + + + Source Files + + + Source Files + + + Source Files + + + Source Files + + + \ No newline at end of file diff --git a/tools/gfx/model.cpp b/tools/gfx/model.cpp new file mode 100644 index 000000000..c8218102e --- /dev/null +++ b/tools/gfx/model.cpp @@ -0,0 +1,530 @@ +// model.cpp +#include "model.h" + +#define TINYOBJLOADER_IMPLEMENTATION +#include "../../external/tinyobjloader/tiny_obj_loader.h" + +#define STB_IMAGE_IMPLEMENTATION +#include "../../external/stb/stb_image.h" + +#define STB_IMAGE_RESIZE_IMPLEMENTATION +#include 
"../../external/stb/stb_image_resize.h" + +#include "../../external/glm/glm/glm.hpp" +#include "../../external/glm/glm/gtc/matrix_transform.hpp" +#include "../../external/glm/glm/gtc/constants.hpp" + +#include +#include +#include + +namespace gfx { + +// TinyObj provides a tuple type that bundles up indices, but doesn't +// provide equality comparison or hashing for that type. We'd like +// to have a hash function so that we can unique indices. +// +// In the simplest case, we could define hashing and operator== operations +// directly on `tinobj::index_t`, but that would create problems if they +// revise their API. +// +// We will instead define our own wrapper type that supports equality +// comparisons. +// +struct ObjIndexKey +{ + tinyobj::index_t index; +}; + +bool operator==(ObjIndexKey const& left, ObjIndexKey const& right) +{ + return left.index.vertex_index == right.index.vertex_index + && left.index.normal_index == right.index.normal_index + && left.index.texcoord_index == right.index.texcoord_index; +} + +struct Hasher +{ + template + void add(T const& v) + { + state ^= std::hash()(v) + 0x9e3779b9 + (state << 6) + (state >> 2); + } + size_t state = 0; +}; + +struct SmoothingGroupVertexID +{ + size_t smoothingGroup; + size_t positionID; +}; +bool operator==(SmoothingGroupVertexID const& left, SmoothingGroupVertexID const& right) +{ + return left.smoothingGroup == right.smoothingGroup + && left.positionID == right.positionID; +} + +} + +namespace std +{ + template<> struct hash + { + size_t operator()(gfx::ObjIndexKey const& key) const + { + gfx::Hasher hasher; + hasher.add(key.index.vertex_index); + hasher.add(key.index.normal_index); + hasher.add(key.index.texcoord_index); + return hasher.state; + } + }; + + template<> struct hash + { + size_t operator()(gfx::SmoothingGroupVertexID const& id) const + { + gfx::Hasher hasher; + hasher.add(id.smoothingGroup); + hasher.add(id.positionID); + return hasher.state; + } + }; +} + +namespace gfx +{ + +RefPtr 
loadTextureImage( + Renderer* renderer, + char const* path) +{ + int extentX = 0; + int extentY = 0; + int originalChannelCount = 0; + int requestedChannelCount = 4; // force to 4-component result + stbi_uc* data = stbi_load( + path, + &extentX, + &extentY, + &originalChannelCount, + requestedChannelCount); + if(!data) + return nullptr; + + int channelCount = requestedChannelCount ? requestedChannelCount : originalChannelCount; + + Format format; + switch(channelCount) + { + default: + return nullptr; + + case 4: format = Format::RGBA_Unorm_UInt8; + + // TODO: handle other cases here if/when we stop forcing 4-component + // results when loading the image with stb_image. + } + + std::vector subresourceInitData; + std::vector mipRowStrides; + + ptrdiff_t stride = extentX * channelCount * sizeof(stbi_uc); + + subresourceInitData.push_back(data); + mipRowStrides.push_back(stride); + + // create down-sampled images for the different mip levels + bool generateMips = true; + if(generateMips) + { + int prevExtentX = extentX; + int prevExtentY = extentY; + stbi_uc* prevData = data; + ptrdiff_t prevStride = stride; + + for(;;) + { + if(prevExtentX == 1 && prevExtentY == 1) + break; + + int newExtentX = prevExtentX / 2; + int newExtentY = prevExtentY / 2; + + if(!newExtentX) newExtentX = 1; + if(!newExtentY) newExtentY = 1; + + stbi_uc* newData = (stbi_uc*) malloc(newExtentX * newExtentY * channelCount * sizeof(stbi_uc)); + ptrdiff_t newStride = newExtentX * channelCount * sizeof(stbi_uc); + + stbir_resize_uint8_srgb( + prevData, prevExtentX, prevExtentY, prevStride, + newData, newExtentX, newExtentY, newStride, + channelCount, + STBIR_ALPHA_CHANNEL_NONE, + STBIR_FLAG_ALPHA_PREMULTIPLIED); + + subresourceInitData.push_back(newData); + mipRowStrides.push_back(newStride); + + prevExtentX = newExtentX; + prevExtentY = newExtentY; + prevData = newData; + prevStride = newStride; + } + } + + int mipCount = (int) mipRowStrides.size(); + + TextureResource::Desc desc; + 
desc.init2D(Resource::Type::Texture2D, format, extentX, extentY, mipCount); + + TextureResource::Data initData; + initData.numSubResources = mipCount; + initData.numMips = mipCount; + initData.subResources = &subresourceInitData[0]; + initData.mipRowStrides = &mipRowStrides[0]; + + auto texture = renderer->createTextureResource( + Resource::Usage::PixelShaderResource, + desc, + &initData); + + free(data); + + return texture; +} + +Result ModelLoader::load( + char const* inputPath, + void** outModel) +{ + // TODO: need to actually allocate/load the data + + tinyobj::attrib_t objVertexAttributes; + std::vector objShapes; + std::vector objMaterials; + + std::string diagnostics; + bool shouldTriangulate = true; + bool success = tinyobj::LoadObj( + &objVertexAttributes, + &objShapes, + &objMaterials, + &diagnostics, + inputPath, + nullptr, + shouldTriangulate); + + if(!diagnostics.empty()) + { + log("%s", diagnostics.c_str()); + } + if(!success) + { + return SLANG_FAIL; + } + + // Translate each material imported by TinyObj into a format that + // we can actually use for rendering. + // + std::vector materials; + for(auto& objMaterial : objMaterials) + { + MaterialData materialData; + + materialData.diffuseColor = glm::vec3( + objMaterial.diffuse[0], + objMaterial.diffuse[1], + objMaterial.diffuse[2]); + + // load any referenced textures here + if(objMaterial.diffuse_texname.length()) + { + materialData.diffuseMap = loadTextureImage( + renderer, + objMaterial.diffuse_texname.c_str()); + } + + auto material = callbacks->createMaterial(materialData); + materials.push_back(material); + } + + // Flip the winding order on all faces if we are asked to... 
+ // + if(loadFlags & LoadFlag::FlipWinding) + { + for(auto& objShape : objShapes) + { + size_t objIndexCounter = 0; + size_t objFaceCounter = 0; + for(auto objFaceVertexCount : objShape.mesh.num_face_vertices) + { + size_t beginIndex = objIndexCounter; + size_t endIndex = beginIndex + objFaceVertexCount; + objIndexCounter = endIndex; + + size_t halfCount = objFaceVertexCount / 2; + for(size_t ii = 0; ii < halfCount; ++ii) + { + std::swap( + objShape.mesh.indices[beginIndex + ii], + objShape.mesh.indices[endIndex - (ii + 1)]); + } + } + } + + } + + // Identify cases where a face has a vertex without a normal, and in that + // case remember that the given vertex needs to be "smoothed" as part of + // the smoothing group for that face. Note that it is possible for the + // same vertex (position) to be part of faces in distinct smoothing groups. + // + std::unordered_map smoothedVertexNormals; + size_t firstSmoothedNormalID = objVertexAttributes.normals.size() / 3; + size_t flatFaceCounter = 0; + for(auto& objShape : objShapes) + { + size_t objIndexCounter = 0; + size_t objFaceCounter = 0; + for(auto objFaceVertexCount : objShape.mesh.num_face_vertices) + { + size_t flatFaceIndex = flatFaceCounter++; + size_t objFaceIndex = objFaceCounter++; + size_t smoothingGroup = objShape.mesh.smoothing_group_ids[objFaceIndex]; + if(!smoothingGroup) + { + smoothingGroup = ~flatFaceIndex; + } + + for(size_t objFaceVertex = 0; objFaceVertex < objFaceVertexCount; ++objFaceVertex) + { + tinyobj::index_t& objIndex = objShape.mesh.indices[objIndexCounter++]; + + if(objIndex.normal_index < 0) + { + SmoothingGroupVertexID smoothVertexID; + smoothVertexID.positionID = objIndex.vertex_index; + smoothVertexID.smoothingGroup = smoothingGroup; + + if(smoothedVertexNormals.find(smoothVertexID) == smoothedVertexNormals.end()) + { + size_t normalID = objVertexAttributes.normals.size() / 3; + objVertexAttributes.normals.push_back(0); + objVertexAttributes.normals.push_back(0); + 
objVertexAttributes.normals.push_back(0); + + smoothedVertexNormals.insert(std::make_pair(smoothVertexID, normalID)); + + objIndex.normal_index = normalID; + } + } + } + } + } + // + // Having identified which vertices we need to smooth, we will make another + // pass to compute face normals and apply them to the vertices that belong + // to the same smoothing group. + // + flatFaceCounter = 0; + for(auto& objShape : objShapes) + { + size_t objIndexCounter = 0; + size_t objFaceCounter = 0; + for(auto objFaceVertexCount : objShape.mesh.num_face_vertices) + { + size_t flatFaceIndex = flatFaceCounter++; + size_t objFaceIndex = objFaceCounter++; + size_t smoothingGroup = objShape.mesh.smoothing_group_ids[objFaceIndex]; + if(!smoothingGroup) + { + smoothingGroup = ~flatFaceIndex; + } + + glm::vec3 faceNormal; + if(objFaceVertexCount >= 3) + { + glm::vec3 v[3]; + for(size_t objFaceVertex = 0; objFaceVertex < 3; ++objFaceVertex) + { + tinyobj::index_t objIndex = objShape.mesh.indices[objIndexCounter + objFaceVertex]; + if(objIndex.vertex_index >= 0) + { + v[objFaceVertex] = glm::vec3( + objVertexAttributes.vertices[3 * objIndex.vertex_index + 0], + objVertexAttributes.vertices[3 * objIndex.vertex_index + 1], + objVertexAttributes.vertices[3 * objIndex.vertex_index + 2]); + } + } + faceNormal = cross(v[1] - v[0], v[2] - v[0]); + } + + // Add this face normal to any to-be-smoothed vertex on the face. 
+ for(size_t objFaceVertex = 0; objFaceVertex < objFaceVertexCount; ++objFaceVertex) + { + tinyobj::index_t objIndex = objShape.mesh.indices[objIndexCounter++]; + + SmoothingGroupVertexID smoothVertexID; + smoothVertexID.positionID = objIndex.vertex_index; + smoothVertexID.smoothingGroup = smoothingGroup; + + auto ii = smoothedVertexNormals.find(smoothVertexID); + if(ii != smoothedVertexNormals.end()) + { + size_t normalID = ii->second; + objVertexAttributes.normals[normalID * 3 + 0] += faceNormal.x; + objVertexAttributes.normals[normalID * 3 + 1] += faceNormal.y; + objVertexAttributes.normals[normalID * 3 + 2] += faceNormal.z; + } + } + } + } + // + // Once we've added all contributions from each smoothing group, + // we can normalize the normals to compute the area-weighted average. + // + size_t normalCount = objVertexAttributes.normals.size() / 3; + for(size_t ii = firstSmoothedNormalID; ii < normalCount; ++ii) + { + glm::vec3 normal = glm::vec3( + objVertexAttributes.normals[3 * ii + 0], + objVertexAttributes.normals[3 * ii + 1], + objVertexAttributes.normals[3 * ii + 2]); + + normal = normalize(normal); + + objVertexAttributes.normals[3 * ii + 0] = normal.x; + objVertexAttributes.normals[3 * ii + 1] = normal.y; + objVertexAttributes.normals[3 * ii + 2] = normal.z; + } + + // TODO: we should sort the faces to group faces with + // the same material ID together, in case they weren't + // grouped in the original file. + + // We need to undo the .obj indexing stuff so that we have + // standard position/normal/etc. 
data in a single flat array + + std::unordered_map mapObjIndexToFlatIndex; + std::vector flatVertices; + std::vector flatIndices; + + MeshData* currentMesh = nullptr; + MeshData currentMeshStorage; + + std::vector meshes; + + for(auto& objShape : objShapes) + { + size_t objIndexCounter = 0; + size_t objFaceCounter = 0; + for(auto objFaceVertexCount : objShape.mesh.num_face_vertices) + { + size_t objFaceIndex = objFaceCounter++; + int faceMaterialID = objShape.mesh.material_ids[objFaceIndex]; + void* faceMaterial = materials[faceMaterialID]; + + if(!currentMesh || (faceMaterial != currentMesh->material)) + { + // finish old mesh. + if(currentMesh) + { + meshes.push_back(callbacks->createMesh(*currentMesh)); + } + + // Need to start a new mesh. + currentMesh = &currentMeshStorage; + currentMesh->material = faceMaterial; + currentMesh->firstIndex = (int)flatIndices.size(); + currentMesh->indexCount = 0; + } + + for(size_t objFaceVertex = 0; objFaceVertex < objFaceVertexCount; ++objFaceVertex) + { + tinyobj::index_t objIndex = objShape.mesh.indices[objIndexCounter++]; + ObjIndexKey objIndexKey; objIndexKey.index = objIndex; + + + Index flatIndex = Index(-1); + auto iter = mapObjIndexToFlatIndex.find(objIndexKey); + if(iter != mapObjIndexToFlatIndex.end()) + { + flatIndex = iter->second; + } + else + { + Vertex flatVertex; + if(objIndex.vertex_index >= 0) + { + flatVertex.position = scale * glm::vec3( + objVertexAttributes.vertices[3 * objIndex.vertex_index + 0], + objVertexAttributes.vertices[3 * objIndex.vertex_index + 1], + objVertexAttributes.vertices[3 * objIndex.vertex_index + 2]); + } + if(objIndex.normal_index >= 0) + { + flatVertex.normal = glm::vec3( + objVertexAttributes.normals[3 * objIndex.normal_index + 0], + objVertexAttributes.normals[3 * objIndex.normal_index + 1], + objVertexAttributes.normals[3 * objIndex.normal_index + 2]); + } + if(objIndex.texcoord_index >= 0) + { + flatVertex.uv = glm::vec2( + objVertexAttributes.texcoords[2 * objIndex.texcoord_index + 
0], + objVertexAttributes.texcoords[2 * objIndex.texcoord_index + 1]); + } + + flatIndex = flatVertices.size(); + mapObjIndexToFlatIndex.insert(std::make_pair(objIndexKey, flatIndex)); + flatVertices.push_back(flatVertex); + } + + flatIndices.push_back(flatIndex); + currentMesh->indexCount++; + } + } + } + + // finish last mesh. + if(currentMesh) + { + meshes.push_back(callbacks->createMesh(*currentMesh)); + } + + ModelData modelData; + + modelData.vertexCount = (int)flatVertices.size(); + modelData.indexCount = (int)flatIndices.size(); + + modelData.meshCount = meshes.size(); + modelData.meshes = meshes.data(); + + BufferResource::Desc vertexBufferDesc; + vertexBufferDesc.init(modelData.vertexCount * sizeof(Vertex)); + vertexBufferDesc.setDefaults(Resource::Usage::VertexBuffer); + + modelData.vertexBuffer = renderer->createBufferResource( + Resource::Usage::VertexBuffer, + vertexBufferDesc, + flatVertices.data()); + if(!modelData.vertexBuffer) return SLANG_FAIL; + + BufferResource::Desc indexBufferDesc; + indexBufferDesc.init(modelData.indexCount * sizeof(Index)); + indexBufferDesc.setDefaults(Resource::Usage::IndexBuffer); + + modelData.indexBuffer = renderer->createBufferResource( + Resource::Usage::IndexBuffer, + indexBufferDesc, + flatIndices.data()); + if(!modelData.indexBuffer) return SLANG_FAIL; + + *outModel = callbacks->createModel(modelData); + + return SLANG_OK; +} + +} // gfx diff --git a/tools/gfx/model.h b/tools/gfx/model.h new file mode 100644 index 000000000..046b9764b --- /dev/null +++ b/tools/gfx/model.h @@ -0,0 +1,73 @@ +// model.h +#pragma once + +#include "render.h" +#include "vector-math.h" + +#include + +namespace gfx { + +struct ModelLoader +{ + struct MaterialData + { + glm::vec3 diffuseColor; + RefPtr diffuseMap; + }; + + struct Vertex + { + glm::vec3 position; + glm::vec3 normal; + glm::vec2 uv; + }; + + typedef uint32_t Index; + + struct MeshData + { + int firstIndex; + int indexCount; + + void* material; + }; + + struct ModelData + { 
+ RefPtr vertexBuffer; + RefPtr indexBuffer; + PrimitiveTopology primitiveTopology; + int vertexCount; + int indexCount; + int meshCount; + void* const* meshes; + }; + + struct ICallbacks + { + typedef ModelLoader::MaterialData MaterialData; + typedef ModelLoader::MeshData MeshData; + typedef ModelLoader::ModelData ModelData; + + virtual void* createMaterial(MaterialData const& data) = 0; + virtual void* createMesh(MeshData const& data) = 0; + virtual void* createModel(ModelData const& data) = 0; + }; + + typedef uint32_t LoadFlags; + enum LoadFlag : LoadFlags + { + FlipWinding = 1 << 0, + }; + + ICallbacks* callbacks = nullptr; + RefPtr renderer; + LoadFlags loadFlags = 0; + float scale = 1.0f; + + Result load(char const* inputPath, void** outModel); +}; + + +} // gfx diff --git a/tools/gfx/render-d3d11.cpp b/tools/gfx/render-d3d11.cpp new file mode 100644 index 000000000..57c0672bd --- /dev/null +++ b/tools/gfx/render-d3d11.cpp @@ -0,0 +1,2112 @@ +// render-d3d11.cpp + +#define _CRT_SECURE_NO_WARNINGS + +#include "render-d3d11.h" + +//WORKING: #include "options.h" +#include "render.h" +#include "d3d-util.h" + +#include "surface.h" + +// In order to use the Slang API, we need to include its header + +//#include + +#include "../../slang-com-ptr.h" + +// We will be rendering with Direct3D 11, so we need to include +// the Windows and D3D11 headers + +#define WIN32_LEAN_AND_MEAN +#define NOMINMAX +#include +#undef WIN32_LEAN_AND_MEAN +#undef NOMINMAX + +#include +#include + +// We will use the C standard library just for printing error messages. 
+#include + +#ifdef _MSC_VER +#include +#if (_MSC_VER < 1900) +#define snprintf sprintf_s +#endif +#endif +// +using namespace Slang; + +namespace gfx { + +class D3D11Renderer : public Renderer +{ +public: + enum + { + kMaxUAVs = 64, + kMaxRTVs = 8, + }; + + // Renderer implementation + virtual SlangResult initialize(const Desc& desc, void* inWindowHandle) override; + virtual void setClearColor(const float color[4]) override; + virtual void clearFrame() override; + virtual void presentFrame() override; + TextureResource::Desc getSwapChainTextureDesc() override; + + Result createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData, TextureResource** outResource) override; + Result createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& desc, const void* initData, BufferResource** outResource) override; + Result createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) override; + + Result createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) override; + Result createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) override; + + Result createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount, InputLayout** outLayout) override; + + Result createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) override; + Result createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) override; + Result createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) override; + + Result createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) override; + Result createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) override; + Result createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) 
override; + + virtual SlangResult captureScreenSurface(Surface& surfaceOut) override; + + virtual void* map(BufferResource* buffer, MapFlavor flavor) override; + virtual void unmap(BufferResource* buffer) override; + virtual void setPrimitiveTopology(PrimitiveTopology topology) override; + + virtual void setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) override; + + virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) override; + virtual void setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) override; + virtual void setDepthStencilTarget(ResourceView* depthStencilView) override; + virtual void setPipelineState(PipelineType pipelineType, PipelineState* state) override; + virtual void draw(UInt vertexCount, UInt startVertex) override; + virtual void drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) override; + virtual void dispatchCompute(int x, int y, int z) override; + virtual void submitGpuWork() override {} + virtual void waitForGpu() override {} + virtual RendererType getRendererType() const override { return RendererType::DirectX11; } + + protected: + +#if 0 + struct BindingDetail + { + ComPtr m_srv; + ComPtr m_uav; + ComPtr m_samplerState; + }; + + class BindingStateImpl: public BindingState + { + public: + typedef BindingState Parent; + + /// Ctor + BindingStateImpl(const Desc& desc): + Parent(desc) + {} + + List m_bindingDetails; + }; +#endif + + enum class D3D11DescriptorSlotType + { + ConstantBuffer, + ShaderResourceView, + UnorderedAccessView, + Sampler, + + CombinedTextureSampler, + + CountOf, + }; + + class DescriptorSetLayoutImpl : public DescriptorSetLayout + { + public: + struct RangeInfo + { + D3D11DescriptorSlotType type; + UInt arrayIndex; + UInt pairedSamplerArrayIndex; + }; + List m_ranges; + + UInt m_counts[int(D3D11DescriptorSlotType::CountOf)]; + }; + + class 
PipelineLayoutImpl : public PipelineLayout + { + public: + struct DescriptorSetInfo + { + RefPtr layout; + UInt baseIndices[int(D3D11DescriptorSlotType::CountOf)]; + }; + + List m_descriptorSets; + UINT m_uavCount; + }; + + class DescriptorSetImpl : public DescriptorSet + { + public: + virtual void setConstantBuffer(UInt range, UInt index, BufferResource* buffer) override; + virtual void setResource(UInt range, UInt index, ResourceView* view) override; + virtual void setSampler(UInt range, UInt index, SamplerState* sampler) override; + virtual void setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) override; + + RefPtr m_layout; + + List> m_cbs; + List> m_srvs; + List> m_uavs; + List> m_samplers; + }; + + class ShaderProgramImpl: public ShaderProgram + { + public: + ComPtr m_vertexShader; + ComPtr m_pixelShader; + ComPtr m_computeShader; + }; + + class BufferResourceImpl: public BufferResource + { + public: + typedef BufferResource Parent; + + BufferResourceImpl(const Desc& desc, Usage initialUsage): + Parent(desc), + m_initialUsage(initialUsage) + { + } + + MapFlavor m_mapFlavor; + Usage m_initialUsage; + ComPtr m_buffer; + ComPtr m_staging; + }; + class TextureResourceImpl : public TextureResource + { + public: + typedef TextureResource Parent; + + TextureResourceImpl(const Desc& desc, Usage initialUsage) : + Parent(desc), + m_initialUsage(initialUsage) + { + } + Usage m_initialUsage; + ComPtr m_resource; + + }; + + class SamplerStateImpl : public SamplerState + { + public: + ComPtr m_sampler; + }; + + + class ResourceViewImpl : public ResourceView + { + public: + enum class Type + { + SRV, + UAV, + DSV, + RTV, + }; + Type m_type; + }; + + class ShaderResourceViewImpl : public ResourceViewImpl + { + public: + ComPtr m_srv; + }; + + class UnorderedAccessViewImpl : public ResourceViewImpl + { + public: + ComPtr m_uav; + }; + + class DepthStencilViewImpl : public ResourceViewImpl + { + public: + ComPtr m_dsv; 
+ }; + + class RenderTargetViewImpl : public ResourceViewImpl + { + public: + ComPtr m_rtv; + }; + + class InputLayoutImpl: public InputLayout + { + public: + ComPtr m_layout; + }; + + class PipelineStateImpl : public PipelineState + { + public: + RefPtr m_program; + RefPtr m_pipelineLayout; + }; + + + class GraphicsPipelineStateImpl : public PipelineStateImpl + { + public: + UINT m_rtvCount; + + RefPtr m_inputLayout; + ComPtr m_depthStencilState; + ComPtr m_rasterizerState; + + UINT m_stencilRef; + }; + + class ComputePipelineStateImpl : public PipelineStateImpl + { + public: + }; + + /// Capture a texture to a file + static HRESULT captureTextureToSurface(ID3D11Device* device, ID3D11DeviceContext* context, ID3D11Texture2D* texture, Surface& surfaceOut); + + void _flushGraphicsState(); + void _flushComputeState(); + + ComPtr m_swapChain; + ComPtr m_device; + ComPtr m_immediateContext; + ComPtr m_backBufferTexture; + + RefPtr m_primaryRenderTargetTexture; + RefPtr m_primaryRenderTargetView; + +// List > m_renderTargetViews; +// List > m_renderTargetTextures; + + bool m_renderTargetBindingsDirty = false; + + RefPtr m_currentGraphicsState; + RefPtr m_currentComputeState; + + ComPtr m_rtvBindings[kMaxRTVs]; + ComPtr m_dsvBinding; + ComPtr m_uavBindings[int(PipelineType::CountOf)][kMaxUAVs]; + bool m_targetBindingsDirty[int(PipelineType::CountOf)]; + + Desc m_desc; + + float m_clearColor[4] = { 0, 0, 0, 0 }; +}; + +Renderer* createD3D11Renderer() +{ + return new D3D11Renderer(); +} + +/* static */HRESULT D3D11Renderer::captureTextureToSurface(ID3D11Device* device, ID3D11DeviceContext* context, ID3D11Texture2D* texture, Surface& surfaceOut) +{ + if (!context) return E_INVALIDARG; + if (!texture) return E_INVALIDARG; + + D3D11_TEXTURE2D_DESC textureDesc; + texture->GetDesc(&textureDesc); + + // Don't bother supporting MSAA for right now + if (textureDesc.SampleDesc.Count > 1) + { + fprintf(stderr, "ERROR: cannot capture multi-sample texture\n"); + return E_INVALIDARG; + 
} + + HRESULT hr = S_OK; + ComPtr stagingTexture; + + if (textureDesc.Usage == D3D11_USAGE_STAGING && (textureDesc.CPUAccessFlags & D3D11_CPU_ACCESS_READ)) + { + stagingTexture = texture; + } + else + { + // Modify the descriptor to give us a staging texture + textureDesc.BindFlags = 0; + textureDesc.MiscFlags &= ~D3D11_RESOURCE_MISC_TEXTURECUBE; + textureDesc.CPUAccessFlags = D3D11_CPU_ACCESS_READ; + textureDesc.Usage = D3D11_USAGE_STAGING; + + hr = device->CreateTexture2D(&textureDesc, 0, stagingTexture.writeRef()); + if (FAILED(hr)) + { + fprintf(stderr, "ERROR: failed to create staging texture\n"); + return hr; + } + + context->CopyResource(stagingTexture, texture); + } + + // Now just read back texels from the staging textures + { + D3D11_MAPPED_SUBRESOURCE mappedResource; + SLANG_RETURN_ON_FAIL(context->Map(stagingTexture, 0, D3D11_MAP_READ, 0, &mappedResource)); + + Result res = surfaceOut.set(textureDesc.Width, textureDesc.Height, Format::RGBA_Unorm_UInt8, mappedResource.RowPitch, mappedResource.pData, SurfaceAllocator::getMallocAllocator()); + + // Make sure to unmap + context->Unmap(stagingTexture, 0); + return res; + } +} + +// !!!!!!!!!!!!!!!!!!!!!!!!!!!! Renderer interface !!!!!!!!!!!!!!!!!!!!!!!!!! + +SlangResult D3D11Renderer::initialize(const Desc& desc, void* inWindowHandle) +{ + auto windowHandle = (HWND)inWindowHandle; + m_desc = desc; + + // Rather than statically link against D3D, we load it dynamically. 
+ HMODULE d3dModule = LoadLibraryA("d3d11.dll"); + if (!d3dModule) + { + fprintf(stderr, "error: failed to load 'd3d11.dll'\n"); + return SLANG_FAIL; + } + + PFN_D3D11_CREATE_DEVICE_AND_SWAP_CHAIN D3D11CreateDeviceAndSwapChain_ = + (PFN_D3D11_CREATE_DEVICE_AND_SWAP_CHAIN)GetProcAddress(d3dModule, "D3D11CreateDeviceAndSwapChain"); + if (!D3D11CreateDeviceAndSwapChain_) + { + fprintf(stderr, + "error: failed to load symbol 'D3D11CreateDeviceAndSwapChain'\n"); + return SLANG_FAIL; + } + + UINT deviceFlags = 0; + +#ifdef _DEBUG + // We will enable the D3D debug mode for debug builds. + // + // TODO: we should probably provide a command-line option + // to override this kind of default rather than leave it + // up to each back-end to specify. + deviceFlags |= D3D11_CREATE_DEVICE_DEBUG; +#endif + + // Our swap chain uses RGBA8 with sRGB, with double buffering. + DXGI_SWAP_CHAIN_DESC swapChainDesc = { 0 }; + swapChainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT; + + // Note(tfoley): Disabling sRGB for DX back buffer for now, so that we + // can get consistent output with OpenGL, where setting up sRGB will + // probably be more involved. + // swapChainDesc.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM_SRGB; + swapChainDesc.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM; + + swapChainDesc.SampleDesc.Count = 1; + swapChainDesc.SampleDesc.Quality = 0; + swapChainDesc.BufferCount = 2; + swapChainDesc.OutputWindow = windowHandle; + swapChainDesc.Windowed = TRUE; + swapChainDesc.SwapEffect = DXGI_SWAP_EFFECT_DISCARD; + swapChainDesc.Flags = 0; + + // We will ask for the highest feature level that can be supported. 
+ const D3D_FEATURE_LEVEL featureLevels[] = { + D3D_FEATURE_LEVEL_11_1, + D3D_FEATURE_LEVEL_11_0, + D3D_FEATURE_LEVEL_10_1, + D3D_FEATURE_LEVEL_10_0, + D3D_FEATURE_LEVEL_9_3, + D3D_FEATURE_LEVEL_9_2, + D3D_FEATURE_LEVEL_9_1, + }; + D3D_FEATURE_LEVEL featureLevel = D3D_FEATURE_LEVEL_9_1; + const int totalNumFeatureLevels = SLANG_COUNT_OF(featureLevels); + + // On a machine that does not have an up-to-date version of D3D installed, + // the `D3D11CreateDeviceAndSwapChain` call will fail with `E_INVALIDARG` + // if you ask for feature level 11_1. The workaround is to call + // `D3D11CreateDeviceAndSwapChain` up to twice: the first time with 11_1 + // at the start of the list of requested feature levels, and the second + // time without it. + + for (int ii = 0; ii < 2; ++ii) + { + const HRESULT hr = D3D11CreateDeviceAndSwapChain_( + nullptr, // adapter (use default) + D3D_DRIVER_TYPE_REFERENCE, +// D3D_DRIVER_TYPE_HARDWARE, + nullptr, // software + deviceFlags, + &featureLevels[ii], + totalNumFeatureLevels - ii, + D3D11_SDK_VERSION, + &swapChainDesc, + m_swapChain.writeRef(), + m_device.writeRef(), + &featureLevel, + m_immediateContext.writeRef()); + + // Failures with `E_INVALIDARG` might be due to feature level 11_1 + // not being supported. + if (hr == E_INVALIDARG) + { + continue; + } + + // Other failures are real, though. + SLANG_RETURN_ON_FAIL(hr); + // We must have a swap chain + break; + } + + // TODO: Add support for debugging to help detect leaks: + // + // ComPtr gDebug; + // m_device->QueryInterface(IID_PPV_ARGS(gDebug.writeRef())); + // + + // After we've created the swap chain, we can request a pointer to the + // back buffer as a D3D11 texture, and create a render-target view from it. 
+ + static const IID kIID_ID3D11Texture2D = { + 0x6f15aaf2, 0xd208, 0x4e89, 0x9a, 0xb4, 0x48, + 0x95, 0x35, 0xd3, 0x4f, 0x9c }; + + SLANG_RETURN_ON_FAIL(m_swapChain->GetBuffer(0, kIID_ID3D11Texture2D, (void**)m_backBufferTexture.writeRef())); + +// for (int i = 0; i < 8; i++) + { + ComPtr texture; + D3D11_TEXTURE2D_DESC textureDesc; + m_backBufferTexture->GetDesc(&textureDesc); + SLANG_RETURN_ON_FAIL(m_device->CreateTexture2D(&textureDesc, nullptr, texture.writeRef())); + + ComPtr rtv; + D3D11_RENDER_TARGET_VIEW_DESC rtvDesc; + rtvDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM; + rtvDesc.Texture2D.MipSlice = 0; + rtvDesc.ViewDimension = D3D11_RTV_DIMENSION_TEXTURE2D; + SLANG_RETURN_ON_FAIL(m_device->CreateRenderTargetView(texture, &rtvDesc, rtv.writeRef())); + + TextureResource::Desc resourceDesc; + resourceDesc.init2D(Resource::Type::Texture2D, Format::RGBA_Unorm_UInt8, textureDesc.Width, textureDesc.Height, 1); + + RefPtr primaryRenderTargetTexture; + SLANG_RETURN_ON_FAIL(createTextureResource(Resource::Usage::RenderTarget, resourceDesc, nullptr, primaryRenderTargetTexture.writeRef())); + + ResourceView::Desc viewDesc; + viewDesc.format = resourceDesc.format; + viewDesc.type = ResourceView::Type::RenderTarget; + RefPtr primaryRenderTargetView; + SLANG_RETURN_ON_FAIL(createTextureView(primaryRenderTargetTexture, viewDesc, primaryRenderTargetView.writeRef())); + + m_primaryRenderTargetTexture = (TextureResourceImpl*) primaryRenderTargetTexture.Ptr(); + m_primaryRenderTargetView = (RenderTargetViewImpl*) primaryRenderTargetView.Ptr(); + } + +// m_immediateContext->OMSetRenderTargets(1, m_primaryRenderTargetView->m_rtv.readRef(), nullptr); + m_rtvBindings[0] = m_primaryRenderTargetView->m_rtv; + m_targetBindingsDirty[int(PipelineType::Graphics)] = true; + + // Similarly, we are going to set up a viewport once, and then never + // switch, since this is a simple test app. 
+ D3D11_VIEWPORT viewport; + viewport.TopLeftX = 0; + viewport.TopLeftY = 0; + viewport.Width = (float)desc.width; + viewport.Height = (float)desc.height; + viewport.MaxDepth = 1; // TODO(tfoley): use reversed depth + viewport.MinDepth = 0; + m_immediateContext->RSSetViewports(1, &viewport); + + return SLANG_OK; +} + +void D3D11Renderer::setClearColor(const float color[4]) +{ + memcpy(m_clearColor, color, sizeof(m_clearColor)); +} + +void D3D11Renderer::clearFrame() +{ + m_immediateContext->ClearRenderTargetView(m_primaryRenderTargetView->m_rtv, m_clearColor); + + if(m_dsvBinding) + { + m_immediateContext->ClearDepthStencilView(m_dsvBinding, D3D11_CLEAR_DEPTH | D3D11_CLEAR_STENCIL, 1.0f, 0); + } +} + +void D3D11Renderer::presentFrame() +{ + m_immediateContext->CopyResource(m_backBufferTexture, m_primaryRenderTargetTexture->m_resource); + m_swapChain->Present(0, 0); +} + +TextureResource::Desc D3D11Renderer::getSwapChainTextureDesc() +{ + D3D11_TEXTURE2D_DESC dxDesc; + ((ID3D11Texture2D*)m_primaryRenderTargetTexture->m_resource.get())->GetDesc(&dxDesc); + + TextureResource::Desc desc; + desc.init2D(Resource::Type::Texture2D, Format::Unknown, dxDesc.Width, dxDesc.Height, 1); + + return desc; +} + +SlangResult D3D11Renderer::captureScreenSurface(Surface& surfaceOut) +{ + return captureTextureToSurface(m_device, m_immediateContext, (ID3D11Texture2D*) m_primaryRenderTargetTexture->m_resource.get(), surfaceOut); +} + +static D3D11_BIND_FLAG _calcResourceFlag(Resource::BindFlag::Enum bindFlag) +{ + typedef Resource::BindFlag BindFlag; + switch (bindFlag) + { + case BindFlag::VertexBuffer: return D3D11_BIND_VERTEX_BUFFER; + case BindFlag::IndexBuffer: return D3D11_BIND_INDEX_BUFFER; + case BindFlag::ConstantBuffer: return D3D11_BIND_CONSTANT_BUFFER; + case BindFlag::StreamOutput: return D3D11_BIND_STREAM_OUTPUT; + case BindFlag::RenderTarget: return D3D11_BIND_RENDER_TARGET; + case BindFlag::DepthStencil: return D3D11_BIND_DEPTH_STENCIL; + case BindFlag::UnorderedAccess: 
return D3D11_BIND_UNORDERED_ACCESS; + case BindFlag::PixelShaderResource: return D3D11_BIND_SHADER_RESOURCE; + case BindFlag::NonPixelShaderResource: return D3D11_BIND_SHADER_RESOURCE; + default: return D3D11_BIND_FLAG(0); + } +} + +static int _calcResourceBindFlags(int bindFlags) +{ + int dstFlags = 0; + while (bindFlags) + { + int lsb = bindFlags & -bindFlags; + + dstFlags |= _calcResourceFlag(Resource::BindFlag::Enum(lsb)); + bindFlags &= ~lsb; + } + return dstFlags; +} + +static int _calcResourceAccessFlags(int accessFlags) +{ + switch (accessFlags) + { + case 0: return 0; + case Resource::AccessFlag::Read: return D3D11_CPU_ACCESS_READ; + case Resource::AccessFlag::Write: return D3D11_CPU_ACCESS_WRITE; + case Resource::AccessFlag::Read | + Resource::AccessFlag::Write: return D3D11_CPU_ACCESS_READ | D3D11_CPU_ACCESS_WRITE; + default: assert(!"Invalid flags"); return 0; + } +} + +Result D3D11Renderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData, TextureResource** outResource) +{ + TextureResource::Desc srcDesc(descIn); + srcDesc.setDefaults(initialUsage); + + const int effectiveArraySize = srcDesc.calcEffectiveArraySize(); + + if(initData) + { + assert(initData->numSubResources == srcDesc.numMipLevels * effectiveArraySize * srcDesc.size.depth); + } + + const DXGI_FORMAT format = D3DUtil::getMapFormat(srcDesc.format); + if (format == DXGI_FORMAT_UNKNOWN) + { + return SLANG_FAIL; + } + + const int bindFlags = _calcResourceBindFlags(srcDesc.bindFlags); + + // Set up the initialize data + List subRes; + D3D11_SUBRESOURCE_DATA* subResourcesPtr = nullptr; + if(initData) + { + subRes.SetSize(srcDesc.numMipLevels * effectiveArraySize); + { + int subResourceIndex = 0; + for (int i = 0; i < effectiveArraySize; i++) + { + for (int j = 0; j < srcDesc.numMipLevels; j++) + { + const int mipHeight = TextureResource::calcMipSize(srcDesc.size.height, j); + + D3D11_SUBRESOURCE_DATA& data = 
subRes[subResourceIndex]; + + data.pSysMem = initData->subResources[subResourceIndex]; + + data.SysMemPitch = UINT(initData->mipRowStrides[j]); + data.SysMemSlicePitch = UINT(initData->mipRowStrides[j] * mipHeight); + + subResourceIndex++; + } + } + } + subResourcesPtr = subRes.Buffer(); + } + + const int accessFlags = _calcResourceAccessFlags(srcDesc.cpuAccessFlags); + + RefPtr texture(new TextureResourceImpl(srcDesc, initialUsage)); + + switch (srcDesc.type) + { + case Resource::Type::Texture1D: + { + D3D11_TEXTURE1D_DESC desc = { 0 }; + desc.BindFlags = bindFlags; + desc.CPUAccessFlags = accessFlags; + desc.Format = format; + desc.MiscFlags = 0; + desc.MipLevels = srcDesc.numMipLevels; + desc.ArraySize = effectiveArraySize; + desc.Width = srcDesc.size.width; + desc.Usage = D3D11_USAGE_DEFAULT; + + ComPtr texture1D; + SLANG_RETURN_ON_FAIL(m_device->CreateTexture1D(&desc, subResourcesPtr, texture1D.writeRef())); + + texture->m_resource = texture1D; + break; + } + case Resource::Type::TextureCube: + case Resource::Type::Texture2D: + { + D3D11_TEXTURE2D_DESC desc = { 0 }; + desc.BindFlags = bindFlags; + desc.CPUAccessFlags = accessFlags; + desc.Format = format; + desc.MiscFlags = 0; + desc.MipLevels = srcDesc.numMipLevels; + desc.ArraySize = effectiveArraySize; + + desc.Width = srcDesc.size.width; + desc.Height = srcDesc.size.height; + desc.Usage = D3D11_USAGE_DEFAULT; + desc.SampleDesc.Count = srcDesc.sampleDesc.numSamples; + desc.SampleDesc.Quality = srcDesc.sampleDesc.quality; + + if (srcDesc.type == Resource::Type::TextureCube) + { + desc.MiscFlags |= D3D11_RESOURCE_MISC_TEXTURECUBE; + } + + ComPtr texture2D; + SLANG_RETURN_ON_FAIL(m_device->CreateTexture2D(&desc, subResourcesPtr, texture2D.writeRef())); + + texture->m_resource = texture2D; + break; + } + case Resource::Type::Texture3D: + { + D3D11_TEXTURE3D_DESC desc = { 0 }; + desc.BindFlags = bindFlags; + desc.CPUAccessFlags = accessFlags; + desc.Format = format; + desc.MiscFlags = 0; + desc.MipLevels = 
srcDesc.numMipLevels; + desc.Width = srcDesc.size.width; + desc.Height = srcDesc.size.height; + desc.Depth = srcDesc.size.depth; + desc.Usage = D3D11_USAGE_DEFAULT; + + ComPtr texture3D; + SLANG_RETURN_ON_FAIL(m_device->CreateTexture3D(&desc, subResourcesPtr, texture3D.writeRef())); + + texture->m_resource = texture3D; + break; + } + default: + return SLANG_FAIL; + } + + *outResource = texture.detach(); + return SLANG_OK; +} + +Result D3D11Renderer::createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData, BufferResource** outResource) +{ + BufferResource::Desc srcDesc(descIn); + srcDesc.setDefaults(initialUsage); + + // Make aligned to 256 bytes... not sure why, but if you remove this the tests do fail. + const size_t alignedSizeInBytes = D3DUtil::calcAligned(srcDesc.sizeInBytes, 256); + + // Hack to make the initialization never read from out of bounds memory, by copying into a buffer + List initDataBuffer; + if (initData && alignedSizeInBytes > srcDesc.sizeInBytes) + { + initDataBuffer.SetSize(alignedSizeInBytes); + ::memcpy(initDataBuffer.Buffer(), initData, srcDesc.sizeInBytes); + initData = initDataBuffer.Buffer(); + } + + D3D11_BUFFER_DESC bufferDesc = { 0 }; + bufferDesc.ByteWidth = UINT(alignedSizeInBytes); + bufferDesc.BindFlags = _calcResourceBindFlags(srcDesc.bindFlags); + // For read we'll need to do some staging + bufferDesc.CPUAccessFlags = _calcResourceAccessFlags(descIn.cpuAccessFlags & Resource::AccessFlag::Write); + bufferDesc.Usage = D3D11_USAGE_DEFAULT; + + // If written by CPU, make it dynamic + if (descIn.cpuAccessFlags & Resource::AccessFlag::Write) + { + bufferDesc.Usage = D3D11_USAGE_DYNAMIC; + } + + switch (initialUsage) + { + case Resource::Usage::ConstantBuffer: + { + // We'll just assume ConstantBuffers are dynamic for now + bufferDesc.Usage = D3D11_USAGE_DYNAMIC; + break; + } + default: break; + } + + if (bufferDesc.BindFlags & (D3D11_BIND_UNORDERED_ACCESS | 
D3D11_BIND_SHADER_RESOURCE)) + { + //desc.BindFlags = D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE; + if (srcDesc.elementSize != 0) + { + bufferDesc.StructureByteStride = srcDesc.elementSize; + bufferDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED; + } + else + { + bufferDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS; + } + } + + D3D11_SUBRESOURCE_DATA subResourceData = { 0 }; + subResourceData.pSysMem = initData; + + RefPtr buffer(new BufferResourceImpl(srcDesc, initialUsage)); + + SLANG_RETURN_ON_FAIL(m_device->CreateBuffer(&bufferDesc, initData ? &subResourceData : nullptr, buffer->m_buffer.writeRef())); + + if (srcDesc.cpuAccessFlags & Resource::AccessFlag::Read) + { + D3D11_BUFFER_DESC bufDesc = {}; + bufDesc.BindFlags = 0; + bufDesc.ByteWidth = (UINT)alignedSizeInBytes; + bufDesc.CPUAccessFlags = D3D11_CPU_ACCESS_READ; + bufDesc.Usage = D3D11_USAGE_STAGING; + + SLANG_RETURN_ON_FAIL(m_device->CreateBuffer(&bufDesc, nullptr, buffer->m_staging.writeRef())); + } + + *outResource = buffer.detach(); + return SLANG_OK; +} + +D3D11_FILTER_TYPE translateFilterMode(TextureFilteringMode mode) +{ + switch (mode) + { + default: + return D3D11_FILTER_TYPE(0); + +#define CASE(SRC, DST) \ + case TextureFilteringMode::SRC: return D3D11_FILTER_TYPE_##DST + + CASE(Point, POINT); + CASE(Linear, LINEAR); + +#undef CASE + } +} + +D3D11_FILTER_REDUCTION_TYPE translateFilterReduction(TextureReductionOp op) +{ + switch (op) + { + default: + return D3D11_FILTER_REDUCTION_TYPE(0); + +#define CASE(SRC, DST) \ + case TextureReductionOp::SRC: return D3D11_FILTER_REDUCTION_TYPE_##DST + + CASE(Average, STANDARD); + CASE(Comparison, COMPARISON); + CASE(Minimum, MINIMUM); + CASE(Maximum, MAXIMUM); + +#undef CASE + } +} + +D3D11_TEXTURE_ADDRESS_MODE translateAddressingMode(TextureAddressingMode mode) +{ + switch (mode) + { + default: + return D3D11_TEXTURE_ADDRESS_MODE(0); + +#define CASE(SRC, DST) \ + case TextureAddressingMode::SRC: return 
D3D11_TEXTURE_ADDRESS_##DST + + CASE(Wrap, WRAP); + CASE(ClampToEdge, CLAMP); + CASE(ClampToBorder, BORDER); + CASE(MirrorRepeat, MIRROR); + CASE(MirrorOnce, MIRROR_ONCE); + +#undef CASE + } +} + +static D3D11_COMPARISON_FUNC translateComparisonFunc(ComparisonFunc func) +{ + switch (func) + { + default: + // TODO: need to report failures + return D3D11_COMPARISON_ALWAYS; + +#define CASE(FROM, TO) \ + case ComparisonFunc::FROM: return D3D11_COMPARISON_##TO + + CASE(Never, NEVER); + CASE(Less, LESS); + CASE(Equal, EQUAL); + CASE(LessEqual, LESS_EQUAL); + CASE(Greater, GREATER); + CASE(NotEqual, NOT_EQUAL); + CASE(GreaterEqual, GREATER_EQUAL); + CASE(Always, ALWAYS); +#undef CASE + } +} + +Result D3D11Renderer::createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) +{ + D3D11_FILTER_REDUCTION_TYPE dxReduction = translateFilterReduction(desc.reductionOp); + D3D11_FILTER dxFilter; + if (desc.maxAnisotropy > 1) + { + dxFilter = D3D11_ENCODE_ANISOTROPIC_FILTER(dxReduction); + } + else + { + D3D11_FILTER_TYPE dxMin = translateFilterMode(desc.minFilter); + D3D11_FILTER_TYPE dxMag = translateFilterMode(desc.magFilter); + D3D11_FILTER_TYPE dxMip = translateFilterMode(desc.mipFilter); + + dxFilter = D3D11_ENCODE_BASIC_FILTER(dxMin, dxMag, dxMip, dxReduction); + } + + D3D11_SAMPLER_DESC dxDesc = {}; + dxDesc.Filter = dxFilter; + dxDesc.AddressU = translateAddressingMode(desc.addressU); + dxDesc.AddressV = translateAddressingMode(desc.addressV); + dxDesc.AddressW = translateAddressingMode(desc.addressW); + dxDesc.MipLODBias = desc.mipLODBias; + dxDesc.MaxAnisotropy = desc.maxAnisotropy; + dxDesc.ComparisonFunc = translateComparisonFunc(desc.comparisonFunc); + for (int ii = 0; ii < 4; ++ii) + dxDesc.BorderColor[ii] = desc.borderColor[ii]; + dxDesc.MinLOD = desc.minLOD; + dxDesc.MaxLOD = desc.maxLOD; + + ComPtr sampler; + SLANG_RETURN_ON_FAIL(m_device->CreateSamplerState( + &dxDesc, + sampler.writeRef())); + + RefPtr samplerImpl = new SamplerStateImpl(); + 
samplerImpl->m_sampler = sampler; + *outSampler = samplerImpl.detach(); + return SLANG_OK; +} + +Result D3D11Renderer::createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) +{ + auto resourceImpl = (TextureResourceImpl*) texture; + + switch (desc.type) + { + default: + return SLANG_FAIL; + + case ResourceView::Type::RenderTarget: + { + ComPtr rtv; + SLANG_RETURN_ON_FAIL(m_device->CreateRenderTargetView(resourceImpl->m_resource, nullptr, rtv.writeRef())); + + RefPtr viewImpl = new RenderTargetViewImpl(); + viewImpl->m_type = ResourceViewImpl::Type::RTV; + viewImpl->m_rtv = rtv; + *outView = viewImpl.detach(); + return SLANG_OK; + } + break; + + case ResourceView::Type::DepthStencil: + { + ComPtr dsv; + SLANG_RETURN_ON_FAIL(m_device->CreateDepthStencilView(resourceImpl->m_resource, nullptr, dsv.writeRef())); + + RefPtr viewImpl = new DepthStencilViewImpl(); + viewImpl->m_type = ResourceViewImpl::Type::DSV; + viewImpl->m_dsv = dsv; + *outView = viewImpl.detach(); + return SLANG_OK; + } + break; + + case ResourceView::Type::UnorderedAccess: + { + ComPtr uav; + SLANG_RETURN_ON_FAIL(m_device->CreateUnorderedAccessView(resourceImpl->m_resource, nullptr, uav.writeRef())); + + RefPtr viewImpl = new UnorderedAccessViewImpl(); + viewImpl->m_type = ResourceViewImpl::Type::UAV; + viewImpl->m_uav = uav; + *outView = viewImpl.detach(); + return SLANG_OK; + } + break; + + case ResourceView::Type::ShaderResource: + { + ComPtr srv; + SLANG_RETURN_ON_FAIL(m_device->CreateShaderResourceView(resourceImpl->m_resource, nullptr, srv.writeRef())); + + RefPtr viewImpl = new ShaderResourceViewImpl(); + viewImpl->m_type = ResourceViewImpl::Type::SRV; + viewImpl->m_srv = srv; + *outView = viewImpl.detach(); + return SLANG_OK; + } + break; + } +} + +Result D3D11Renderer::createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) +{ + auto resourceImpl = (BufferResourceImpl*) buffer; + auto resourceDesc = 
resourceImpl->getDesc(); + + switch (desc.type) + { + default: + return SLANG_FAIL; + + case ResourceView::Type::UnorderedAccess: + { + D3D11_UNORDERED_ACCESS_VIEW_DESC uavDesc = {}; + uavDesc.ViewDimension = D3D11_UAV_DIMENSION_BUFFER; + uavDesc.Format = D3DUtil::getMapFormat(desc.format); + uavDesc.Buffer.FirstElement = 0; + uavDesc.Buffer.NumElements = resourceDesc.sizeInBytes; + + if(resourceDesc.elementSize) + { + uavDesc.Buffer.NumElements = resourceDesc.sizeInBytes / resourceDesc.elementSize; + } + else if(desc.format == Format::Unknown) + { + uavDesc.Buffer.Flags |= D3D11_BUFFER_UAV_FLAG_RAW; + uavDesc.Format = DXGI_FORMAT_R32_TYPELESS; + } + + ComPtr uav; + SLANG_RETURN_ON_FAIL(m_device->CreateUnorderedAccessView(resourceImpl->m_buffer, &uavDesc, uav.writeRef())); + + RefPtr viewImpl = new UnorderedAccessViewImpl(); + viewImpl->m_type = ResourceViewImpl::Type::UAV; + viewImpl->m_uav = uav; + *outView = viewImpl.detach(); + return SLANG_OK; + } + break; + + case ResourceView::Type::ShaderResource: + { + D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc = {}; + srvDesc.ViewDimension = D3D11_SRV_DIMENSION_BUFFER; + srvDesc.Format = D3DUtil::getMapFormat(desc.format); + srvDesc.Buffer.ElementOffset = 0; + srvDesc.Buffer.ElementWidth = 1; + srvDesc.Buffer.FirstElement = 0; + srvDesc.Buffer.NumElements = resourceDesc.sizeInBytes; + + if(resourceDesc.elementSize) + { + srvDesc.Buffer.ElementWidth = resourceDesc.elementSize; + srvDesc.Buffer.NumElements = resourceDesc.sizeInBytes / resourceDesc.elementSize; + } + + ComPtr srv; + SLANG_RETURN_ON_FAIL(m_device->CreateShaderResourceView(resourceImpl->m_buffer, &srvDesc, srv.writeRef())); + + RefPtr viewImpl = new ShaderResourceViewImpl(); + viewImpl->m_type = ResourceViewImpl::Type::SRV; + viewImpl->m_srv = srv; + *outView = viewImpl.detach(); + return SLANG_OK; + } + break; + } +} + +Result D3D11Renderer::createInputLayout(const InputElementDesc* inputElementsIn, UInt inputElementCount, InputLayout** outLayout) +{ + 
D3D11_INPUT_ELEMENT_DESC inputElements[16] = {}; + + char hlslBuffer[1024]; + char* hlslCursor = &hlslBuffer[0]; + + hlslCursor += sprintf(hlslCursor, "float4 main(\n"); + + for (UInt ii = 0; ii < inputElementCount; ++ii) + { + inputElements[ii].SemanticName = inputElementsIn[ii].semanticName; + inputElements[ii].SemanticIndex = (UINT)inputElementsIn[ii].semanticIndex; + inputElements[ii].Format = D3DUtil::getMapFormat(inputElementsIn[ii].format); + inputElements[ii].InputSlot = 0; + inputElements[ii].AlignedByteOffset = (UINT)inputElementsIn[ii].offset; + inputElements[ii].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA; + inputElements[ii].InstanceDataStepRate = 0; + + if (ii != 0) + { + hlslCursor += sprintf(hlslCursor, ",\n"); + } + + char const* typeName = "Unknown"; + switch (inputElementsIn[ii].format) + { + case Format::RGBA_Float32: + typeName = "float4"; + break; + case Format::RGB_Float32: + typeName = "float3"; + break; + case Format::RG_Float32: + typeName = "float2"; + break; + case Format::R_Float32: + typeName = "float"; + break; + default: + return SLANG_FAIL; + } + + hlslCursor += sprintf(hlslCursor, "%s a%d : %s%d", + typeName, + (int)ii, + inputElementsIn[ii].semanticName, + (int)inputElementsIn[ii].semanticIndex); + } + + hlslCursor += sprintf(hlslCursor, "\n) : SV_Position { return 0; }"); + + ComPtr vertexShaderBlob; + SLANG_RETURN_ON_FAIL(D3DUtil::compileHLSLShader("inputLayout", hlslBuffer, "main", "vs_5_0", vertexShaderBlob)); + + ComPtr inputLayout; + SLANG_RETURN_ON_FAIL(m_device->CreateInputLayout(&inputElements[0], (UINT)inputElementCount, vertexShaderBlob->GetBufferPointer(), vertexShaderBlob->GetBufferSize(), + inputLayout.writeRef())); + + RefPtr impl = new InputLayoutImpl; + impl->m_layout.swap(inputLayout); + + *outLayout = impl.detach(); + return SLANG_OK; +} + +void* D3D11Renderer::map(BufferResource* bufferIn, MapFlavor flavor) +{ + BufferResourceImpl* bufferResource = static_cast(bufferIn); + + D3D11_MAP mapType; + ID3D11Buffer* 
buffer = bufferResource->m_buffer; + + switch (flavor) + { + case MapFlavor::WriteDiscard: + mapType = D3D11_MAP_WRITE_DISCARD; + break; + case MapFlavor::HostWrite: + mapType = D3D11_MAP_WRITE; + break; + case MapFlavor::HostRead: + mapType = D3D11_MAP_READ; + + buffer = bufferResource->m_staging; + if (!buffer) + { + return nullptr; + } + + // Copy the GPU buffer into the staging buffer so the CPU can read it + m_immediateContext->CopyResource(buffer, bufferResource->m_buffer); + + break; + default: + return nullptr; + } + + // Map the selected buffer for CPU access. For `HostRead` this is + // the staging copy populated above; otherwise it is the GPU + // buffer itself. + D3D11_MAPPED_SUBRESOURCE mappedSub; + SLANG_RETURN_NULL_ON_FAIL(m_immediateContext->Map(buffer, 0, mapType, 0, &mappedSub)); + + bufferResource->m_mapFlavor = flavor; + + return mappedSub.pData; +} + +void D3D11Renderer::unmap(BufferResource* bufferIn) +{ + BufferResourceImpl* bufferResource = static_cast<BufferResourceImpl*>(bufferIn); + ID3D11Buffer* buffer = (bufferResource->m_mapFlavor == MapFlavor::HostRead) ? 
bufferResource->m_staging : bufferResource->m_buffer; + m_immediateContext->Unmap(buffer, 0); +} + +#if 0 +void D3D11Renderer::setInputLayout(InputLayout* inputLayoutIn) +{ + auto inputLayout = static_cast<InputLayoutImpl*>(inputLayoutIn); + m_immediateContext->IASetInputLayout(inputLayout->m_layout); +} +#endif + +void D3D11Renderer::setPrimitiveTopology(PrimitiveTopology topology) +{ + m_immediateContext->IASetPrimitiveTopology(D3DUtil::getPrimitiveTopology(topology)); +} + +void D3D11Renderer::setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffersIn, const UInt* stridesIn, const UInt* offsetsIn) +{ + static const int kMaxVertexBuffers = 16; + assert(slotCount <= kMaxVertexBuffers); + + UINT vertexStrides[kMaxVertexBuffers]; + UINT vertexOffsets[kMaxVertexBuffers]; + ID3D11Buffer* dxBuffers[kMaxVertexBuffers]; + + auto buffers = (BufferResourceImpl*const*)buffersIn; + + for (UInt ii = 0; ii < slotCount; ++ii) + { + vertexStrides[ii] = (UINT)stridesIn[ii]; + vertexOffsets[ii] = (UINT)offsetsIn[ii]; + dxBuffers[ii] = buffers[ii]->m_buffer; + } + + m_immediateContext->IASetVertexBuffers((UINT)startSlot, (UINT)slotCount, dxBuffers, &vertexStrides[0], &vertexOffsets[0]); +} + +void D3D11Renderer::setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) +{ + DXGI_FORMAT dxFormat = D3DUtil::getMapFormat(indexFormat); + m_immediateContext->IASetIndexBuffer(((BufferResourceImpl*)buffer)->m_buffer, dxFormat, (UINT)offset); +} + +void D3D11Renderer::setDepthStencilTarget(ResourceView* depthStencilView) +{ + m_dsvBinding = ((DepthStencilViewImpl*) depthStencilView)->m_dsv; + m_targetBindingsDirty[int(PipelineType::Graphics)] = true; +} + +void D3D11Renderer::setPipelineState(PipelineType pipelineType, PipelineState* state) +{ + switch(pipelineType) + { + default: + break; + + case PipelineType::Graphics: + { + auto stateImpl = (GraphicsPipelineStateImpl*) state; + auto programImpl = stateImpl->m_program; + + // TODO: We could conceivably do some lightweight 
state + // differencing here (e.g., check if `programImpl` is the + // same as the program that is currently bound). + // + // It isn't clear how much that would pay off given that + // the D3D11 runtime seems to do its own state diffing. + + // IA + + m_immediateContext->IASetInputLayout(stateImpl->m_inputLayout->m_layout); + + // VS + + m_immediateContext->VSSetShader(programImpl->m_vertexShader, nullptr, 0); + + // HS + + // DS + + // GS + + // RS + + m_immediateContext->RSSetState(stateImpl->m_rasterizerState); + + // PS + + m_immediateContext->PSSetShader(programImpl->m_pixelShader, nullptr, 0); + + // OM + + m_immediateContext->OMSetDepthStencilState(stateImpl->m_depthStencilState, stateImpl->m_stencilRef); + + m_currentGraphicsState = stateImpl; + } + break; + + case PipelineType::Compute: + { + auto stateImpl = (ComputePipelineStateImpl*) state; + auto programImpl = stateImpl->m_program; + + // CS + + m_immediateContext->CSSetShader(programImpl->m_computeShader, nullptr, 0); + + m_currentComputeState = stateImpl; + } + break; + } + + /// ... 
+} + +void D3D11Renderer::draw(UInt vertexCount, UInt startVertex) +{ + _flushGraphicsState(); + m_immediateContext->Draw((UINT)vertexCount, (UINT)startVertex); +} + +void D3D11Renderer::drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) +{ + _flushGraphicsState(); + m_immediateContext->DrawIndexed((UINT)indexCount, (UINT)startIndex, (INT)baseVertex); +} + +Result D3D11Renderer::createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) +{ + if (desc.pipelineType == PipelineType::Compute) + { + auto computeKernel = desc.findKernel(StageType::Compute); + + ComPtr<ID3D11ComputeShader> computeShader; + SLANG_RETURN_ON_FAIL(m_device->CreateComputeShader(computeKernel->codeBegin, computeKernel->getCodeSize(), nullptr, computeShader.writeRef())); + + RefPtr<ShaderProgramImpl> shaderProgram = new ShaderProgramImpl(); + shaderProgram->m_computeShader.swap(computeShader); + + *outProgram = shaderProgram.detach(); + return SLANG_OK; + } + else + { + auto vertexKernel = desc.findKernel(StageType::Vertex); + auto fragmentKernel = desc.findKernel(StageType::Fragment); + + ComPtr<ID3D11VertexShader> vertexShader; + ComPtr<ID3D11PixelShader> pixelShader; + + SLANG_RETURN_ON_FAIL(m_device->CreateVertexShader(vertexKernel->codeBegin, vertexKernel->getCodeSize(), nullptr, vertexShader.writeRef())); + SLANG_RETURN_ON_FAIL(m_device->CreatePixelShader(fragmentKernel->codeBegin, fragmentKernel->getCodeSize(), nullptr, pixelShader.writeRef())); + + RefPtr<ShaderProgramImpl> shaderProgram = new ShaderProgramImpl(); + shaderProgram->m_vertexShader.swap(vertexShader); + shaderProgram->m_pixelShader.swap(pixelShader); + + *outProgram = shaderProgram.detach(); + return SLANG_OK; + } +} + +static D3D11_STENCIL_OP translateStencilOp(StencilOp op) +{ + switch(op) + { + default: + // TODO: need to report failures + return D3D11_STENCIL_OP_KEEP; + +#define CASE(FROM, TO) \ + case StencilOp::FROM: return D3D11_STENCIL_OP_##TO + + CASE(Keep, KEEP); + CASE(Zero, ZERO); + CASE(Replace, REPLACE); + CASE(IncrementSaturate, INCR_SAT); + CASE(DecrementSaturate, DECR_SAT); 
+ CASE(Invert, INVERT); + CASE(IncrementWrap, INCR); + CASE(DecrementWrap, DECR); +#undef CASE + + } +} + +static D3D11_FILL_MODE translateFillMode(FillMode mode) +{ + switch(mode) + { + default: + // TODO: need to report failures + return D3D11_FILL_SOLID; + + case FillMode::Solid: return D3D11_FILL_SOLID; + case FillMode::Wireframe: return D3D11_FILL_WIREFRAME; + } +} + +static D3D11_CULL_MODE translateCullMode(CullMode mode) +{ + switch(mode) + { + default: + // TODO: need to report failures + return D3D11_CULL_NONE; + + case CullMode::None: return D3D11_CULL_NONE; + case CullMode::Back: return D3D11_CULL_BACK; + case CullMode::Front: return D3D11_CULL_FRONT; + } +} + +Result D3D11Renderer::createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) +{ + auto programImpl = (ShaderProgramImpl*) desc.program; + + ComPtr depthStencilState; + { + D3D11_DEPTH_STENCIL_DESC dsDesc; + dsDesc.DepthEnable = desc.depthStencil.depthTestEnable; + dsDesc.DepthWriteMask = desc.depthStencil.depthWriteEnable ? 
D3D11_DEPTH_WRITE_MASK_ALL : D3D11_DEPTH_WRITE_MASK_ZERO; + dsDesc.DepthFunc = translateComparisonFunc(desc.depthStencil.depthFunc); + dsDesc.StencilEnable = desc.depthStencil.stencilEnable; + dsDesc.StencilReadMask = desc.depthStencil.stencilReadMask; + dsDesc.StencilWriteMask = desc.depthStencil.stencilWriteMask; + + #define FACE(DST, SRC) \ + dsDesc.DST.StencilFailOp = translateStencilOp( desc.depthStencil.SRC.stencilFailOp); \ + dsDesc.DST.StencilDepthFailOp = translateStencilOp( desc.depthStencil.SRC.stencilDepthFailOp); \ + dsDesc.DST.StencilPassOp = translateStencilOp( desc.depthStencil.SRC.stencilPassOp); \ + dsDesc.DST.StencilFunc = translateComparisonFunc(desc.depthStencil.SRC.stencilFunc); \ + /* end */ + + FACE(FrontFace, frontFace); + FACE(BackFace, backFace); + + SLANG_RETURN_ON_FAIL(m_device->CreateDepthStencilState( + &dsDesc, + depthStencilState.writeRef())); + } + + ComPtr rasterizerState; + { + D3D11_RASTERIZER_DESC rsDesc; + rsDesc.FillMode = translateFillMode(desc.rasterizer.fillMode); + rsDesc.CullMode = translateCullMode(desc.rasterizer.cullMode); + rsDesc.FrontCounterClockwise = desc.rasterizer.frontFace == FrontFaceMode::Clockwise; + rsDesc.DepthBias = desc.rasterizer.depthBias; + rsDesc.DepthBiasClamp = desc.rasterizer.depthBiasClamp; + rsDesc.SlopeScaledDepthBias = desc.rasterizer.slopeScaledDepthBias; + rsDesc.DepthClipEnable = desc.rasterizer.depthClipEnable; + rsDesc.ScissorEnable = desc.rasterizer.scissorEnable; + rsDesc.MultisampleEnable = desc.rasterizer.multisampleEnable; + rsDesc.AntialiasedLineEnable = desc.rasterizer.antialiasedLineEnable; + + SLANG_RETURN_ON_FAIL(m_device->CreateRasterizerState( + &rsDesc, + rasterizerState.writeRef())); + + } + + RefPtr state = new GraphicsPipelineStateImpl(); + state->m_program = programImpl; + state->m_stencilRef = desc.depthStencil.stencilRef; + state->m_depthStencilState = depthStencilState; + state->m_rasterizerState = rasterizerState; + state->m_pipelineLayout = (PipelineLayoutImpl*) 
desc.pipelineLayout; + state->m_inputLayout = (InputLayoutImpl*) desc.inputLayout; + state->m_rtvCount = desc.renderTargetCount; + + *outState = state.detach(); + return SLANG_OK; +} + +Result D3D11Renderer::createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) +{ + auto programImpl = (ShaderProgramImpl*) desc.program; + auto pipelineLayoutImpl = (PipelineLayoutImpl*) desc.pipelineLayout; + + RefPtr state = new ComputePipelineStateImpl(); + state->m_program = programImpl; + state->m_pipelineLayout = pipelineLayoutImpl; + + *outState = state.detach(); + return SLANG_OK; +} + +void D3D11Renderer::dispatchCompute(int x, int y, int z) +{ + _flushComputeState(); + m_immediateContext->Dispatch(x, y, z); +} + +Result D3D11Renderer::createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) +{ + RefPtr descriptorSetLayoutImpl = new DescriptorSetLayoutImpl(); + + UInt counts[int(D3D11DescriptorSlotType::CountOf)] = { 0, }; + + UInt rangeCount = desc.slotRangeCount; + for(UInt rr = 0; rr < rangeCount; ++rr) + { + auto rangeDesc = desc.slotRanges[rr]; + + DescriptorSetLayoutImpl::RangeInfo rangeInfo; + + switch(rangeDesc.type) + { + default: + assert(!"invalid slot type"); + return SLANG_FAIL; + + case DescriptorSlotType::Sampler: + rangeInfo.type = D3D11DescriptorSlotType::Sampler; + break; + + case DescriptorSlotType::CombinedImageSampler: + rangeInfo.type = D3D11DescriptorSlotType::CombinedTextureSampler; + break; + + case DescriptorSlotType::UniformBuffer: + case DescriptorSlotType::DynamicUniformBuffer: + rangeInfo.type = D3D11DescriptorSlotType::ConstantBuffer; + break; + + case DescriptorSlotType::SampledImage: + case DescriptorSlotType::UniformTexelBuffer: + case DescriptorSlotType::InputAttachment: + rangeInfo.type = D3D11DescriptorSlotType::ShaderResourceView; + break; + + case DescriptorSlotType::StorageImage: + case DescriptorSlotType::StorageTexelBuffer: + case 
DescriptorSlotType::StorageBuffer: + case DescriptorSlotType::DynamicStorageBuffer: + rangeInfo.type = D3D11DescriptorSlotType::UnorderedAccessView; + break; + } + + if(rangeInfo.type == D3D11DescriptorSlotType::CombinedTextureSampler) + { + auto srvTypeIndex = int(D3D11DescriptorSlotType::ShaderResourceView); + auto samplerTypeIndex = int(D3D11DescriptorSlotType::Sampler); + + rangeInfo.arrayIndex = counts[srvTypeIndex]; + rangeInfo.pairedSamplerArrayIndex = counts[samplerTypeIndex]; + + counts[srvTypeIndex] += rangeDesc.count; + counts[samplerTypeIndex] += rangeDesc.count; + } + else + { + auto typeIndex = int(rangeInfo.type); + + rangeInfo.arrayIndex = counts[typeIndex]; + counts[typeIndex] += rangeDesc.count; + } + descriptorSetLayoutImpl->m_ranges.Add(rangeInfo); + } + + for(int ii = 0; ii < int(D3D11DescriptorSlotType::CountOf); ++ii) + { + descriptorSetLayoutImpl->m_counts[ii] = counts[ii]; + } + + *outLayout = descriptorSetLayoutImpl.detach(); + return SLANG_OK; +} + +Result D3D11Renderer::createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) +{ + RefPtr pipelineLayoutImpl = new PipelineLayoutImpl(); + + UInt counts[int(D3D11DescriptorSlotType::CountOf)] = { 0, }; + + UInt setCount = desc.descriptorSetCount; + for(UInt ii = 0; ii < setCount; ++ii) + { + auto setDesc = desc.descriptorSets[ii]; + PipelineLayoutImpl::DescriptorSetInfo setInfo; + + setInfo.layout = (DescriptorSetLayoutImpl*) setDesc.layout; + + for(int jj = 0; jj < int(D3D11DescriptorSlotType::CountOf); ++jj) + { + setInfo.baseIndices[jj] = counts[jj]; + counts[jj] += setInfo.layout->m_counts[jj]; + } + + pipelineLayoutImpl->m_descriptorSets.Add(setInfo); + } + + pipelineLayoutImpl->m_uavCount = counts[int(D3D11DescriptorSlotType::UnorderedAccessView)]; + + *outLayout = pipelineLayoutImpl.detach(); + return SLANG_OK; +} + +Result D3D11Renderer::createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) +{ + auto layoutImpl = 
(DescriptorSetLayoutImpl*)layout; + + RefPtr descriptorSetImpl = new DescriptorSetImpl(); + + descriptorSetImpl->m_layout = layoutImpl; + descriptorSetImpl->m_cbs .SetSize(layoutImpl->m_counts[int(D3D11DescriptorSlotType::ConstantBuffer)]); + descriptorSetImpl->m_srvs .SetSize(layoutImpl->m_counts[int(D3D11DescriptorSlotType::ShaderResourceView)]); + descriptorSetImpl->m_uavs .SetSize(layoutImpl->m_counts[int(D3D11DescriptorSlotType::UnorderedAccessView)]); + descriptorSetImpl->m_samplers.SetSize(layoutImpl->m_counts[int(D3D11DescriptorSlotType::Sampler)]); + + *outDescriptorSet = descriptorSetImpl.detach(); + return SLANG_OK; +} + + +#if 0 +BindingState* D3D11Renderer::createBindingState(const BindingState::Desc& bindingStateDesc) +{ + RefPtr bindingState(new BindingStateImpl(bindingStateDesc)); + + const auto& srcBindings = bindingStateDesc.m_bindings; + const int numBindings = int(srcBindings.Count()); + + auto& dstDetails = bindingState->m_bindingDetails; + dstDetails.SetSize(numBindings); + + for (int i = 0; i < numBindings; ++i) + { + auto& dstDetail = dstDetails[i]; + const auto& srcBinding = srcBindings[i]; + + assert(srcBinding.registerRange.isSingle()); + + switch (srcBinding.bindingType) + { + case BindingType::Buffer: + { + assert(srcBinding.resource && srcBinding.resource->isBuffer()); + + BufferResourceImpl* buffer = static_cast(srcBinding.resource.Ptr()); + const BufferResource::Desc& desc = buffer->getDesc(); + + const int elemSize = bufferDesc.elementSize <= 0 ? 
1 : bufferDesc.elementSize; + + if (bufferDesc.bindFlags & Resource::BindFlag::UnorderedAccess) + { + D3D11_UNORDERED_ACCESS_VIEW_DESC viewDesc; + memset(&viewDesc, 0, sizeof(viewDesc)); + viewDesc.Buffer.FirstElement = 0; + viewDesc.Buffer.NumElements = (UINT)(bufferDesc.sizeInBytes / elemSize); + viewDesc.Buffer.Flags = 0; + viewDesc.ViewDimension = D3D11_UAV_DIMENSION_BUFFER; + viewDesc.Format = D3DUtil::getMapFormat(bufferDesc.format); + + if (bufferDesc.elementSize == 0 && bufferDesc.format == Format::Unknown) + { + viewDesc.Buffer.Flags |= D3D11_BUFFER_UAV_FLAG_RAW; + viewDesc.Format = DXGI_FORMAT_R32_TYPELESS; + } + + SLANG_RETURN_NULL_ON_FAIL(m_device->CreateUnorderedAccessView(buffer->m_buffer, &viewDesc, dstDetail.m_uav.writeRef())); + } + if (bufferDesc.bindFlags & (Resource::BindFlag::NonPixelShaderResource | Resource::BindFlag::PixelShaderResource)) + { + D3D11_SHADER_RESOURCE_VIEW_DESC viewDesc; + memset(&viewDesc, 0, sizeof(viewDesc)); + viewDesc.Buffer.FirstElement = 0; + viewDesc.Buffer.ElementWidth = elemSize; + viewDesc.Buffer.NumElements = (UINT)(bufferDesc.sizeInBytes / elemSize); + viewDesc.Buffer.ElementOffset = 0; + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_BUFFER; + viewDesc.Format = DXGI_FORMAT_UNKNOWN; + + if (bufferDesc.elementSize == 0) + { + viewDesc.Format = DXGI_FORMAT_R32_FLOAT; + } + + SLANG_RETURN_NULL_ON_FAIL(m_device->CreateShaderResourceView(buffer->m_buffer, &viewDesc, dstDetail.m_srv.writeRef())); + } + break; + } + case BindingType::Texture: + case BindingType::CombinedTextureSampler: + { + assert(srcBinding.resource && srcBinding.resource->isTexture()); + + TextureResourceImpl* texture = static_cast(srcBinding.resource.Ptr()); + + const TextureResource::Desc& textureDesc = texture->getDesc(); + + D3D11_SHADER_RESOURCE_VIEW_DESC viewDesc; + viewDesc.Format = D3DUtil::getMapFormat(textureDesc.format); + + switch (texture->getType()) + { + case Resource::Type::Texture1D: + { + if (textureDesc.arraySize <= 0) + { + 
viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE1D; + viewDesc.Texture1D.MipLevels = textureDesc.numMipLevels; + viewDesc.Texture1D.MostDetailedMip = 0; + } + else + { + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE1DARRAY; + viewDesc.Texture1DArray.ArraySize = textureDesc.arraySize; + viewDesc.Texture1DArray.FirstArraySlice = 0; + viewDesc.Texture1DArray.MipLevels = textureDesc.numMipLevels; + viewDesc.Texture1DArray.MostDetailedMip = 0; + } + break; + } + case Resource::Type::Texture2D: + { + if (textureDesc.arraySize <= 0) + { + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D; + viewDesc.Texture2D.MipLevels = textureDesc.numMipLevels; + viewDesc.Texture2D.MostDetailedMip = 0; + } + else + { + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2DARRAY; + viewDesc.Texture2DArray.ArraySize = textureDesc.arraySize; + viewDesc.Texture2DArray.FirstArraySlice = 0; + viewDesc.Texture2DArray.MipLevels = textureDesc.numMipLevels; + viewDesc.Texture2DArray.MostDetailedMip = 0; + } + break; + } + case Resource::Type::TextureCube: + { + if (textureDesc.arraySize <= 0) + { + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURECUBE; + viewDesc.TextureCube.MipLevels = textureDesc.numMipLevels; + viewDesc.TextureCube.MostDetailedMip = 0; + } + else + { + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURECUBEARRAY; + viewDesc.TextureCubeArray.MipLevels = textureDesc.numMipLevels; + viewDesc.TextureCubeArray.MostDetailedMip = 0; + viewDesc.TextureCubeArray.First2DArrayFace = 0; + viewDesc.TextureCubeArray.NumCubes = textureDesc.arraySize; + } + break; + } + case Resource::Type::Texture3D: + { + viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE3D; + viewDesc.Texture3D.MipLevels = textureDesc.numMipLevels; // Old code fixed as one + viewDesc.Texture3D.MostDetailedMip = 0; + break; + } + default: + { + assert(!"Unhandled type"); + return nullptr; + } + } + + SLANG_RETURN_NULL_ON_FAIL(m_device->CreateShaderResourceView(texture->m_resource, &viewDesc, 
dstDetail.m_srv.writeRef())); + break; + } + case BindingType::Sampler: + { + const BindingState::SamplerDesc& samplerDesc = bindingStateDesc.m_samplerDescs[srcBinding.descIndex]; + + D3D11_SAMPLER_DESC desc = {}; + desc.AddressU = desc.AddressV = desc.AddressW = D3D11_TEXTURE_ADDRESS_WRAP; + + if (samplerDesc.isCompareSampler) + { + desc.ComparisonFunc = D3D11_COMPARISON_LESS_EQUAL; + desc.Filter = D3D11_FILTER_MIN_LINEAR_MAG_MIP_POINT; + desc.MinLOD = desc.MaxLOD = 0.0f; + } + else + { + desc.Filter = D3D11_FILTER_ANISOTROPIC; + desc.MaxAnisotropy = 8; + desc.MinLOD = 0.0f; + desc.MaxLOD = 100.0f; + } + SLANG_RETURN_NULL_ON_FAIL(m_device->CreateSamplerState(&desc, dstDetail.m_samplerState.writeRef())); + break; + } + default: + { + assert(!"Unhandled type"); + return nullptr; + } + } + } + + // Done + return bindingState.detach(); +} + +void D3D11Renderer::_applyBindingState(bool isCompute) +{ + auto context = m_immediateContext.get(); + + const auto& details = m_currentBindings->m_bindingDetails; + const auto& bindings = m_currentBindings->getDesc().m_bindings; + + const int numBindings = int(bindings.Count()); + + for (int i = 0; i < numBindings; ++i) + { + const auto& binding = bindings[i]; + const auto& detail = details[i]; + + const int bindingIndex = binding.registerRange.getSingleIndex(); + + switch (binding.bindingType) + { + case BindingType::Buffer: + { + assert(binding.resource && binding.resource->isBuffer()); + if (binding.resource->canBind(Resource::BindFlag::ConstantBuffer)) + { + ID3D11Buffer* buffer = static_cast(binding.resource.Ptr())->m_buffer; + if (isCompute) + context->CSSetConstantBuffers(bindingIndex, 1, &buffer); + else + { + context->VSSetConstantBuffers(bindingIndex, 1, &buffer); + context->PSSetConstantBuffers(bindingIndex, 1, &buffer); + } + } + else if (detail.m_uav) + { + if (isCompute) + context->CSSetUnorderedAccessViews(bindingIndex, 1, detail.m_uav.readRef(), nullptr); + else + 
context->OMSetRenderTargetsAndUnorderedAccessViews( + m_currentBindings->getDesc().m_numRenderTargets, + m_renderTargetViews.Buffer()->readRef(), + m_depthStencilView, + bindingIndex, + 1, + detail.m_uav.readRef(), + nullptr); + } + else + { + if (isCompute) + context->CSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); + else + { + context->PSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); + context->VSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); + } + } + break; + } + case BindingType::Texture: + { + if (detail.m_uav) + { + if (isCompute) + context->CSSetUnorderedAccessViews(bindingIndex, 1, detail.m_uav.readRef(), nullptr); + else + context->OMSetRenderTargetsAndUnorderedAccessViews(D3D11_KEEP_RENDER_TARGETS_AND_DEPTH_STENCIL, + nullptr, nullptr, bindingIndex, 1, detail.m_uav.readRef(), nullptr); + } + else + { + if (isCompute) + context->CSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); + else + { + context->PSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); + context->VSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); + } + } + break; + } + case BindingType::Sampler: + { + if (isCompute) + context->CSSetSamplers(bindingIndex, 1, detail.m_samplerState.readRef()); + else + { + context->PSSetSamplers(bindingIndex, 1, detail.m_samplerState.readRef()); + context->VSSetSamplers(bindingIndex, 1, detail.m_samplerState.readRef()); + } + break; + } + default: + { + assert(!"Not implemented"); + return; + } + } + } +} + +void D3D11Renderer::setBindingState(BindingState* state) +{ + m_currentBindings = static_cast(state); +} +#endif + +void D3D11Renderer::_flushGraphicsState() +{ + auto pipelineType = int(PipelineType::Graphics); + if(m_targetBindingsDirty[pipelineType]) + { + m_targetBindingsDirty[pipelineType] = false; + + auto pipelineState = m_currentGraphicsState.Ptr(); + + auto rtvCount = pipelineState->m_rtvCount; + auto uavCount = pipelineState->m_pipelineLayout->m_uavCount; + + 
m_immediateContext->OMSetRenderTargetsAndUnorderedAccessViews( + rtvCount, + m_rtvBindings[0].readRef(), + m_dsvBinding, + rtvCount, + uavCount, + m_uavBindings[pipelineType][0].readRef(), + nullptr); + } +} + +void D3D11Renderer::_flushComputeState() +{ + auto pipelineType = int(PipelineType::Compute); + if(m_targetBindingsDirty[pipelineType]) + { + m_targetBindingsDirty[pipelineType] = false; + + auto pipelineState = m_currentComputeState.Ptr(); + + auto uavCount = pipelineState->m_pipelineLayout->m_uavCount; + + m_immediateContext->CSSetUnorderedAccessViews( + 0, + uavCount, + m_uavBindings[pipelineType][0].readRef(), + nullptr); + } +} + +void D3D11Renderer::DescriptorSetImpl::setConstantBuffer(UInt range, UInt index, BufferResource* buffer) +{ + auto bufferImpl = (BufferResourceImpl*) buffer; + auto& rangeInfo = m_layout->m_ranges[range]; + + assert(rangeInfo.type == D3D11DescriptorSlotType::ConstantBuffer); + + m_cbs[rangeInfo.arrayIndex + index] = bufferImpl->m_buffer; +} + +void D3D11Renderer::DescriptorSetImpl::setResource(UInt range, UInt index, ResourceView* view) +{ + auto viewImpl = (ResourceViewImpl*)view; + auto& rangeInfo = m_layout->m_ranges[range]; + + switch (rangeInfo.type) + { + case D3D11DescriptorSlotType::ShaderResourceView: + { + assert(viewImpl->m_type == ResourceViewImpl::Type::SRV); + auto srvImpl = (ShaderResourceViewImpl*)viewImpl; + m_srvs[rangeInfo.arrayIndex + index] = srvImpl->m_srv; + } + break; + + case D3D11DescriptorSlotType::UnorderedAccessView: + { + assert(viewImpl->m_type == ResourceViewImpl::Type::UAV); + auto uavImpl = (UnorderedAccessViewImpl*)viewImpl; + m_uavs[rangeInfo.arrayIndex + index] = uavImpl->m_uav; + } + break; + + default: + assert(!"invalid to bind a resource view to this descriptor range"); + break; + } +} + +void D3D11Renderer::DescriptorSetImpl::setSampler(UInt range, UInt index, SamplerState* sampler) +{ + auto samplerImpl = (SamplerStateImpl*) sampler; + auto& rangeInfo = m_layout->m_ranges[range]; + + 
assert(rangeInfo.type == D3D11DescriptorSlotType::Sampler); + + m_samplers[rangeInfo.arrayIndex + index] = samplerImpl->m_sampler; +} + +void D3D11Renderer::DescriptorSetImpl::setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) +{ + auto viewImpl = (ResourceViewImpl*) textureView; + auto samplerImpl = (SamplerStateImpl*)sampler; + + auto& rangeInfo = m_layout->m_ranges[range]; + assert(rangeInfo.type == D3D11DescriptorSlotType::CombinedTextureSampler); + + assert(viewImpl->m_type == ResourceViewImpl::Type::SRV); + auto srvImpl = (ShaderResourceViewImpl*)viewImpl; + m_srvs[rangeInfo.arrayIndex + index] = srvImpl->m_srv; + + m_samplers[rangeInfo.arrayIndex + index] = samplerImpl->m_sampler; + + // TODO: need a place to bind the matching sampler... + m_srvs[rangeInfo.pairedSamplerArrayIndex + index] = srvImpl->m_srv; +} + +void D3D11Renderer::setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) +{ + auto pipelineLayoutImpl = (PipelineLayoutImpl*)layout; + auto descriptorSetImpl = (DescriptorSetImpl*) descriptorSet; + + auto descriptorSetLayoutImpl = descriptorSetImpl->m_layout; + auto& setInfo = pipelineLayoutImpl->m_descriptorSets[index]; + + // Note: `setInfo->layout` and `descriptorSetLayoutImpl` need to be compatible + + // TODO: If/when we add per-stage visibility masks, it would be best to organize + // this as a loop over stages, so that we only do the binding that is required + // for each stage. + + { + int slotType = int(D3D11DescriptorSlotType::ConstantBuffer); + UInt slotCount = setInfo.layout->m_counts[slotType]; + if(slotCount) + { + UInt startSlot = setInfo.baseIndices[slotType]; + + auto cbs = descriptorSetImpl->m_cbs[0].readRef(); + + m_immediateContext->VSSetConstantBuffers(startSlot, slotCount, cbs); + // ... 
+ m_immediateContext->PSSetConstantBuffers(startSlot, slotCount, cbs); + + m_immediateContext->CSSetConstantBuffers(startSlot, slotCount, cbs); + } + } + + { + int slotType = int(D3D11DescriptorSlotType::ShaderResourceView); + UInt slotCount = setInfo.layout->m_counts[slotType]; + if(slotCount) + { + UInt startSlot = setInfo.baseIndices[slotType]; + + auto srvs = descriptorSetImpl->m_srvs[0].readRef(); + + m_immediateContext->VSSetShaderResources(startSlot, slotCount, srvs); + // ... + m_immediateContext->PSSetShaderResources(startSlot, slotCount, srvs); + + m_immediateContext->CSSetShaderResources(startSlot, slotCount, srvs); + } + } + + { + int slotType = int(D3D11DescriptorSlotType::Sampler); + UInt slotCount = setInfo.layout->m_counts[slotType]; + if(slotCount) + { + UInt startSlot = setInfo.baseIndices[slotType]; + + auto samplers = descriptorSetImpl->m_samplers[0].readRef(); + + m_immediateContext->VSSetSamplers(startSlot, slotCount, samplers); + // ... + m_immediateContext->PSSetSamplers(startSlot, slotCount, samplers); + + m_immediateContext->CSSetSamplers(startSlot, slotCount, samplers); + } + } + + { + // Note: UAVs are handled differently from other bindings, because + // D3D11 requires all UAVs to be set with a single call, rather + // than allowing incremental updates. We will therefore shadow + // the UAV bindings with `m_uavBindings` and then flush them + // as needed right before a draw/dispatch. 
+ // + int slotType = int(D3D11DescriptorSlotType::UnorderedAccessView); + UInt slotCount = setInfo.layout->m_counts[slotType]; + if(slotCount) + { + UInt startSlot = setInfo.baseIndices[slotType]; + + auto uavs = descriptorSetImpl->m_uavs[0].readRef(); + + for(UINT ii = 0; ii < slotCount; ++ii) + { + m_uavBindings[int(pipelineType)][startSlot + ii] = uavs[ii]; + } + m_targetBindingsDirty[int(pipelineType)] = true; + } + } + + +} + +} // renderer_test diff --git a/tools/gfx/render-d3d11.h b/tools/gfx/render-d3d11.h new file mode 100644 index 000000000..9e671d541 --- /dev/null +++ b/tools/gfx/render-d3d11.h @@ -0,0 +1,10 @@ +// render-d3d11.h +#pragma once + +namespace gfx { + +class Renderer; + +Renderer* createD3D11Renderer(); + +} // gfx diff --git a/tools/gfx/render-d3d12.cpp b/tools/gfx/render-d3d12.cpp new file mode 100644 index 000000000..2d3b8f521 --- /dev/null +++ b/tools/gfx/render-d3d12.cpp @@ -0,0 +1,3557 @@ +// render-d3d12.cpp +#define _CRT_SECURE_NO_WARNINGS + +#include "render-d3d12.h" + +//WORKING:#include "options.h" +#include "render.h" + +#include "surface.h" + +// In order to use the Slang API, we need to include its header + +//WORKING:#include + +// We will be rendering with Direct3D 12, so we need to include +// the Windows and D3D12 headers + +#define WIN32_LEAN_AND_MEAN +#define NOMINMAX +#include +#undef WIN32_LEAN_AND_MEAN +#undef NOMINMAX + +#include +#include +#include + +#include "../../slang-com-ptr.h" + +#include "resource-d3d12.h" +#include "descriptor-heap-d3d12.h" +#include "circular-resource-heap-d3d12.h" + +#include "d3d-util.h" + +// We will use the C standard library just for printing error messages. 
+#include + +#ifdef _MSC_VER +#include +#if (_MSC_VER < 1900) +#define snprintf sprintf_s +#endif +#endif +// + +#define ENABLE_DEBUG_LAYER 1 + +namespace gfx { +using namespace Slang; + +class D3D12Renderer : public Renderer +{ +public: + // Renderer implementation + virtual SlangResult initialize(const Desc& desc, void* inWindowHandle) override; + virtual void setClearColor(const float color[4]) override; + virtual void clearFrame() override; + virtual void presentFrame() override; + TextureResource::Desc getSwapChainTextureDesc() override; + + Result createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData, TextureResource** outResource) override; + Result createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& desc, const void* initData, BufferResource** outResource) override; + Result createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) override; + + Result createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) override; + Result createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) override; + + Result createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount, InputLayout** outLayout) override; + + Result createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) override; + Result createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) override; + Result createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) override; + + Result createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) override; + Result createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) override; + Result createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) override; + + virtual 
SlangResult captureScreenSurface(Surface& surfaceOut) override; + + virtual void* map(BufferResource* buffer, MapFlavor flavor) override; + virtual void unmap(BufferResource* buffer) override; +// virtual void setInputLayout(InputLayout* inputLayout) override; + virtual void setPrimitiveTopology(PrimitiveTopology topology) override; + + virtual void setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) override; + + virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) override; + virtual void setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) override; + virtual void setDepthStencilTarget(ResourceView* depthStencilView) override; + virtual void setPipelineState(PipelineType pipelineType, PipelineState* state) override; + virtual void draw(UInt vertexCount, UInt startVertex) override; + virtual void drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) override; + virtual void dispatchCompute(int x, int y, int z) override; + virtual void submitGpuWork() override; + virtual void waitForGpu() override; + virtual RendererType getRendererType() const override { return RendererType::DirectX12; } + + ~D3D12Renderer(); + +protected: + static const Int kMaxNumRenderFrames = 4; + static const Int kMaxNumRenderTargets = 3; + + static const Int kMaxRTVCount = 8; + static const Int kMaxDescriptorSetCount = 16; + + struct Submitter + { + virtual void setRootConstantBufferView(int index, D3D12_GPU_VIRTUAL_ADDRESS gpuBufferLocation) = 0; + virtual void setRootDescriptorTable(int index, D3D12_GPU_DESCRIPTOR_HANDLE BaseDescriptor) = 0; + virtual void setRootSignature(ID3D12RootSignature* rootSignature) = 0; + }; + + struct FrameInfo + { + FrameInfo() :m_fenceValue(0) {} + void reset() + { + m_commandAllocator.setNull(); + } + ComPtr m_commandAllocator; ///< The command allocator for this frame + UINT64 m_fenceValue; 
///< The fence value when rendering this Frame is complete + }; + + class ShaderProgramImpl: public ShaderProgram + { + public: + PipelineType m_pipelineType; + List m_vertexShader; + List m_pixelShader; + List m_computeShader; + }; + + class BufferResourceImpl: public BufferResource + { + public: + typedef BufferResource Parent; + + enum class BackingStyle + { + Unknown, + ResourceBacked, ///< The contents is only held within the resource + MemoryBacked, ///< The current contents is held in m_memory and copied to GPU every time it's used (typically used for constant buffers) + }; + + void bindConstantBufferView(D3D12CircularResourceHeap& circularHeap, int index, Submitter* submitter) const + { + switch (m_backingStyle) + { + case BackingStyle::MemoryBacked: + { + const size_t bufferSize = m_memory.Count(); + D3D12CircularResourceHeap::Cursor cursor = circularHeap.allocateConstantBuffer(bufferSize); + ::memcpy(cursor.m_position, m_memory.Buffer(), bufferSize); + // Set the constant buffer + submitter->setRootConstantBufferView(index, circularHeap.getGpuHandle(cursor)); + break; + } + case BackingStyle::ResourceBacked: + { + // Set the constant buffer + submitter->setRootConstantBufferView(index, m_resource.getResource()->GetGPUVirtualAddress()); + break; + } + default: break; + } + } + + BufferResourceImpl(Resource::Usage initialUsage, const Desc& desc): + Parent(desc), + m_mapFlavor(MapFlavor::HostRead), + m_initialUsage(initialUsage) + { + } + + static BackingStyle _calcResourceBackingStyle(Usage usage) + { + // Note: the D3D12 back-end has support for "versioning" of constant buffers, + // where the same logical `BufferResource` can actually point to different + // backing storage over its lifetime, to emulate the ability to modify the + // buffer contents as in D3D11, etc. 
+ // + // The VK back-end doesn't have the same behavior, and it is difficult + // to both support this degree of flexibility *and* efficiently exploit + // descriptor tables (since any table referencing the buffer would need + // to be updated when a new buffer "version" gets allocated). + // + // I'm choosing to disable this for now, and make all buffers be resource-backed, + // although this creates synchronization issues that we'll have to address + // next. + + return BackingStyle::ResourceBacked; +#if 0 + switch (usage) + { + case Usage::ConstantBuffer: return BackingStyle::MemoryBacked; + default: return BackingStyle::ResourceBacked; + } +#endif + } + + BackingStyle m_backingStyle; ///< How the resource is 'backed' - either as a resource or cpu memory. Cpu memory is typically used for constant buffers. + D3D12Resource m_resource; ///< The resource typically in gpu memory + D3D12Resource m_uploadResource; ///< If the resource can be written to, and is in gpu memory (ie not Memory backed), will have upload resource + + Usage m_initialUsage; + + List m_memory; ///< Cpu memory buffer, used if the m_backingStyle is MemoryBacked + MapFlavor m_mapFlavor; ///< If the resource is mapped holds the current mapping flavor + }; + + class TextureResourceImpl: public TextureResource + { + public: + typedef TextureResource Parent; + + TextureResourceImpl(const Desc& desc): + Parent(desc) + { + } + + D3D12Resource m_resource; + }; + + class SamplerStateImpl : public SamplerState + { + public: + D3D12_CPU_DESCRIPTOR_HANDLE m_cpuHandle; + }; + + class ResourceViewImpl : public ResourceView + { + public: + RefPtr m_resource; + D3D12HostVisibleDescriptor m_descriptor; + }; + + class InputLayoutImpl: public InputLayout + { + public: + List m_elements; + List m_text; ///< Holds all strings to keep in scope + }; + +#if 0 + struct BindingDetail + { + int m_srvIndex = -1; + int m_uavIndex = -1; + int m_samplerIndex = -1; + }; + + class BindingStateImpl: public BindingState + { + 
public: + typedef BindingState Parent; + + Result init(ID3D12Device* device) + { + // Set up descriptor heaps + SLANG_RETURN_ON_FAIL(m_viewHeap.init(device, 256, D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV, D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE)); + SLANG_RETURN_ON_FAIL(m_samplerHeap.init(device, 16, D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER, D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE)); + return SLANG_OK; + } + + /// Ctor + BindingStateImpl(const Desc& desc) : + Parent(desc) + {} + + List m_bindingDetails; ///< These match 1-1 to the bindings in the m_desc + }; +#endif + + class DescriptorSetLayoutImpl : public DescriptorSetLayout + { + public: + struct RangeInfo + { + DescriptorSlotType type; + Int count; + Int arrayIndex; + }; + + List m_ranges; + + List m_dxRanges; + List m_dxRootParameters; + + Int m_resourceCount; + Int m_samplerCount; + }; + + class PipelineLayoutImpl : public PipelineLayout + { + public: + ComPtr m_rootSignature; + UInt m_descriptorSetCount; + }; + + class DescriptorSetImpl : public DescriptorSet + { + public: + virtual void setConstantBuffer(UInt range, UInt index, BufferResource* buffer) override; + virtual void setResource(UInt range, UInt index, ResourceView* view) override; + virtual void setSampler(UInt range, UInt index, SamplerState* sampler) override; + virtual void setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) override; + + RefPtr m_renderer; + RefPtr m_layout; + + D3D12DescriptorHeap* m_resourceHeap = nullptr; + D3D12DescriptorHeap* m_samplerHeap = nullptr; + + Int m_resourceTable = 0; + Int m_samplerTable = 0; + + // The following arrays are used to retain the relevant + // objects so that they will not be released while this + // descriptor-set is still alive. + // + // For the `m_resourceObjects` array, the values are either + // the relevant `ResourceViewImpl` for SRV/UAV slots, or + // a `BufferResourceImpl` for a CBV slot. 
+ // + List> m_resourceObjects; + List> m_samplerObjects; + }; + + + // During command submission, we need all the descriptor tables that get + // used to come from a single heap (for each descriptor heap type). + // + // We will thus keep a single heap of each type that we hope will hold + // all the descriptors that actually get needed in a frame. + // + // TODO: we need an allocation policy to reallocate and resize these + // if/when we run out of space during a frame. + // + D3D12DescriptorHeap m_viewHeap; ///< Cbv, Srv, Uav + D3D12DescriptorHeap m_samplerHeap; ///< Heap for samplers + + D3D12HostVisibleDescriptorAllocator m_rtvAllocator; + D3D12HostVisibleDescriptorAllocator m_dsvAllocator; + + D3D12HostVisibleDescriptorAllocator m_viewAllocator; + D3D12HostVisibleDescriptorAllocator m_samplerAllocator; + + // Space in the GPU-visible heaps is precious, so we will also keep + // around CPU-visible heaps for storing descriptors in a format + // that is ready for copying into the GPU-visible heaps as needed. 
+ // + D3D12DescriptorHeap m_cpuViewHeap; ///< Cbv, Srv, Uav + D3D12DescriptorHeap m_cpuSamplerHeap; ///< Heap for samplers + + class PipelineStateImpl : public PipelineState + { + public: + PipelineType m_pipelineType; + RefPtr m_pipelineLayout; + ComPtr m_pipelineState; + }; + + struct BoundVertexBuffer + { + RefPtr m_buffer; + int m_stride; + int m_offset; + }; + +#if 0 + struct BindParameters + { + enum + { + kMaxRanges = 16, + kMaxParameters = 32 + }; + + D3D12_DESCRIPTOR_RANGE& nextRange() { return m_ranges[m_rangeIndex++]; } + D3D12_ROOT_PARAMETER& nextParameter() { return m_parameters[m_paramIndex++]; } + + BindParameters(): + m_rangeIndex(0), + m_paramIndex(0) + {} + + D3D12_DESCRIPTOR_RANGE m_ranges[kMaxRanges]; + int m_rangeIndex; + D3D12_ROOT_PARAMETER m_parameters[kMaxParameters]; + int m_paramIndex; + }; +#endif + + struct GraphicsSubmitter : public Submitter + { + virtual void setRootConstantBufferView(int index, D3D12_GPU_VIRTUAL_ADDRESS gpuBufferLocation) override + { + m_commandList->SetGraphicsRootConstantBufferView(index, gpuBufferLocation); + } + virtual void setRootDescriptorTable(int index, D3D12_GPU_DESCRIPTOR_HANDLE baseDescriptor) override + { + m_commandList->SetGraphicsRootDescriptorTable(index, baseDescriptor); + } + void setRootSignature(ID3D12RootSignature* rootSignature) + { + m_commandList->SetGraphicsRootSignature(rootSignature); + } + + GraphicsSubmitter(ID3D12GraphicsCommandList* commandList): + m_commandList(commandList) + { + } + + ID3D12GraphicsCommandList* m_commandList; + }; + + struct ComputeSubmitter : public Submitter + { + virtual void setRootConstantBufferView(int index, D3D12_GPU_VIRTUAL_ADDRESS gpuBufferLocation) override + { + m_commandList->SetComputeRootConstantBufferView(index, gpuBufferLocation); + } + virtual void setRootDescriptorTable(int index, D3D12_GPU_DESCRIPTOR_HANDLE baseDescriptor) override + { + m_commandList->SetComputeRootDescriptorTable(index, baseDescriptor); + } + void 
setRootSignature(ID3D12RootSignature* rootSignature) + { + m_commandList->SetComputeRootSignature(rootSignature); + } + + ComputeSubmitter(ID3D12GraphicsCommandList* commandList) : + m_commandList(commandList) + { + } + + ID3D12GraphicsCommandList* m_commandList; + }; + + static PROC loadProc(HMODULE module, char const* name); + Result createFrameResources(); + /// Blocks until gpu has completed all work + void releaseFrameResources(); + + Result createBuffer(const D3D12_RESOURCE_DESC& resourceDesc, const void* srcData, D3D12Resource& uploadResource, D3D12_RESOURCE_STATES finalState, D3D12Resource& resourceOut); + + void beginRender(); + + void endRender(); + + void submitGpuWorkAndWait(); + void _resetCommandList(); + + Result captureTextureToSurface(D3D12Resource& resource, Surface& surfaceOut); + + FrameInfo& getFrame() { return m_frameInfos[m_frameIndex]; } + const FrameInfo& getFrame() const { return m_frameInfos[m_frameIndex]; } + + ID3D12GraphicsCommandList* getCommandList() const { return m_commandList; } + +// RenderState* calcRenderState(); + + /// From current bindings calculate the root signature and pipeline state +// Result calcGraphicsPipelineState(ComPtr& sigOut, ComPtr& pipelineStateOut); +// Result calcComputePipelineState(ComPtr& signatureOut, ComPtr& pipelineStateOut); + + Result _bindRenderState(PipelineStateImpl* pipelineStateImpl, ID3D12GraphicsCommandList* commandList, Submitter* submitter); + +// Result _calcBindParameters(BindParameters& params); +// RenderState* findRenderState(PipelineType pipelineType); + + PFN_D3D12_SERIALIZE_ROOT_SIGNATURE m_D3D12SerializeRootSignature = nullptr; + + D3D12CircularResourceHeap m_circularResourceHeap; + + int m_commandListOpenCount = 0; ///< If >0 the command list should be open + + List m_boundVertexBuffers; + + RefPtr m_boundIndexBuffer; + DXGI_FORMAT m_boundIndexFormat; + UINT m_boundIndexOffset; + + RefPtr m_currentPipelineState; + +// RefPtr m_boundShaderProgram; +// RefPtr m_boundInputLayout; + 
+// RefPtr m_boundBindingState; + RefPtr m_boundDescriptorSets[int(PipelineType::CountOf)][kMaxDescriptorSetCount]; + + DXGI_FORMAT m_targetFormat = DXGI_FORMAT_R8G8B8A8_UNORM; + DXGI_FORMAT m_depthStencilFormat = DXGI_FORMAT_D24_UNORM_S8_UINT; + bool m_hasVsync = true; + bool m_isFullSpeed = false; + bool m_allowFullScreen = false; + bool m_isMultiSampled = false; + int m_numTargetSamples = 1; ///< The number of multi sample samples + int m_targetSampleQuality = 0; ///< The multi sample quality + + Desc m_desc; + + bool m_isInitialized = false; + + D3D12_PRIMITIVE_TOPOLOGY_TYPE m_primitiveTopologyType = D3D12_PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE; + D3D12_PRIMITIVE_TOPOLOGY m_primitiveTopology = D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST; + + float m_clearColor[4] = { 0, 0, 0, 0 }; + + D3D12_VIEWPORT m_viewport = {}; + + ComPtr m_dxDebug; + + ComPtr m_device; + ComPtr m_swapChain; + ComPtr m_commandQueue; +// ComPtr m_rtvHeap; + ComPtr m_commandList; + + D3D12_RECT m_scissorRect = {}; + +// List > m_renderStates; ///< Holds list of all render state combinations +// RenderState* m_currentRenderState = nullptr; ///< The current combination + + UINT m_rtvDescriptorSize = 0; + +// ComPtr m_dsvHeap; + UINT m_dsvDescriptorSize = 0; + + // Synchronization objects. 
+ D3D12CounterFence m_fence; + + HANDLE m_swapChainWaitableObject; + + // Frame specific data + int m_numRenderFrames = 0; + UINT m_frameIndex = 0; + FrameInfo m_frameInfos[kMaxNumRenderFrames]; + + int m_numRenderTargets = 2; + int m_renderTargetIndex = 0; + + D3D12Resource* m_backBuffers[kMaxNumRenderTargets]; + D3D12Resource* m_renderTargets[kMaxNumRenderTargets]; + + D3D12Resource m_backBufferResources[kMaxNumRenderTargets]; + D3D12Resource m_renderTargetResources[kMaxNumRenderTargets]; + + RefPtr m_rtvs[kMaxRTVCount]; + RefPtr m_dsv; + + int32_t m_depthStencilUsageFlags = 0; ///< D3DUtil::UsageFlag combination for depth stencil + int32_t m_targetUsageFlags = 0; ///< D3DUtil::UsageFlag combination for target + + HWND m_hwnd = nullptr; +}; + +Renderer* createD3D12Renderer() +{ + return new D3D12Renderer; +} + +/* static */PROC D3D12Renderer::loadProc(HMODULE module, char const* name) +{ + PROC proc = ::GetProcAddress(module, name); + if (!proc) + { + fprintf(stderr, "error: failed to load symbol '%s'\n", name); + return nullptr; + } + return proc; +} + +void D3D12Renderer::releaseFrameResources() +{ + // https://msdn.microsoft.com/en-us/library/windows/desktop/bb174577%28v=vs.85%29.aspx + + // Release the resources holding references to the swap chain (requirement of + // IDXGISwapChain::ResizeBuffers) and reset the frame fence values to the + // current fence value. + for (int i = 0; i < m_numRenderFrames; i++) + { + FrameInfo& info = m_frameInfos[i]; + info.reset(); + info.m_fenceValue = m_fence.getCurrentValue(); + } + for (int i = 0; i < m_numRenderTargets; i++) + { + m_backBuffers[i]->setResourceNull(); + m_renderTargets[i]->setResourceNull(); + } +} + +void D3D12Renderer::waitForGpu() +{ + m_fence.nextSignalAndWait(m_commandQueue); +} + +D3D12Renderer::~D3D12Renderer() +{ + if (m_isInitialized) + { + // Ensure that the GPU is no longer referencing resources that are about to be + // cleaned up by the destructor. 
+ waitForGpu(); + } +} + +static void _initSrvDesc(Resource::Type resourceType, const TextureResource::Desc& textureDesc, const D3D12_RESOURCE_DESC& desc, DXGI_FORMAT pixelFormat, D3D12_SHADER_RESOURCE_VIEW_DESC& descOut) +{ + // create SRV + descOut = D3D12_SHADER_RESOURCE_VIEW_DESC(); + + descOut.Format = (pixelFormat == DXGI_FORMAT_UNKNOWN) ? D3DUtil::calcFormat(D3DUtil::USAGE_SRV, desc.Format) : pixelFormat; + descOut.Shader4ComponentMapping = D3D12_DEFAULT_SHADER_4_COMPONENT_MAPPING; + if (desc.DepthOrArraySize == 1) + { + switch (desc.Dimension) + { + case D3D12_RESOURCE_DIMENSION_TEXTURE1D: descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE1D; break; + case D3D12_RESOURCE_DIMENSION_TEXTURE2D: descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE2D; break; + case D3D12_RESOURCE_DIMENSION_TEXTURE3D: descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE3D; break; + default: assert(!"Unknown dimension"); + } + + descOut.Texture2D.MipLevels = desc.MipLevels; + descOut.Texture2D.MostDetailedMip = 0; + descOut.Texture2D.PlaneSlice = 0; + descOut.Texture2D.ResourceMinLODClamp = 0.0f; + } + else if (resourceType == Resource::Type::TextureCube) + { + if (textureDesc.arraySize > 1) + { + descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURECUBEARRAY; + + descOut.TextureCubeArray.NumCubes = textureDesc.arraySize; + descOut.TextureCubeArray.First2DArrayFace = 0; + descOut.TextureCubeArray.MipLevels = desc.MipLevels; + descOut.TextureCubeArray.MostDetailedMip = 0; + descOut.TextureCubeArray.ResourceMinLODClamp = 0; + } + else + { + descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURECUBE; + + descOut.TextureCube.MipLevels = desc.MipLevels; + descOut.TextureCube.MostDetailedMip = 0; + descOut.TextureCube.ResourceMinLODClamp = 0; + } + } + else + { + assert(desc.DepthOrArraySize > 1); + + switch (desc.Dimension) + { + case D3D12_RESOURCE_DIMENSION_TEXTURE1D: descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE1DARRAY; break; + case D3D12_RESOURCE_DIMENSION_TEXTURE2D: 
descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE2DARRAY; break; + case D3D12_RESOURCE_DIMENSION_TEXTURE3D: descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE3D; break; + + default: assert(!"Unknown dimension"); + } + + descOut.Texture2DArray.ArraySize = desc.DepthOrArraySize; + descOut.Texture2DArray.MostDetailedMip = 0; + descOut.Texture2DArray.MipLevels = desc.MipLevels; + descOut.Texture2DArray.FirstArraySlice = 0; + descOut.Texture2DArray.PlaneSlice = 0; + descOut.Texture2DArray.ResourceMinLODClamp = 0; + } +} + +static void _initBufferResourceDesc(size_t bufferSize, D3D12_RESOURCE_DESC& out) +{ + out = {}; + + out.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER; + out.Alignment = 0; + out.Width = bufferSize; + out.Height = 1; + out.DepthOrArraySize = 1; + out.MipLevels = 1; + out.Format = DXGI_FORMAT_UNKNOWN; + out.SampleDesc.Count = 1; + out.SampleDesc.Quality = 0; + out.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR; + out.Flags = D3D12_RESOURCE_FLAG_NONE; +} + +Result D3D12Renderer::createBuffer(const D3D12_RESOURCE_DESC& resourceDesc, const void* srcData, D3D12Resource& uploadResource, D3D12_RESOURCE_STATES finalState, D3D12Resource& resourceOut) +{ + const size_t bufferSize = size_t(resourceDesc.Width); + + { + D3D12_HEAP_PROPERTIES heapProps; + heapProps.Type = D3D12_HEAP_TYPE_DEFAULT; + heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; + heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; + heapProps.CreationNodeMask = 1; + heapProps.VisibleNodeMask = 1; + + const D3D12_RESOURCE_STATES initialState = srcData ? 
D3D12_RESOURCE_STATE_COPY_DEST : finalState; + + SLANG_RETURN_ON_FAIL(resourceOut.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, resourceDesc, initialState, nullptr)); + } + + { + D3D12_HEAP_PROPERTIES heapProps; + heapProps.Type = D3D12_HEAP_TYPE_UPLOAD; + heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; + heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; + heapProps.CreationNodeMask = 1; + heapProps.VisibleNodeMask = 1; + + D3D12_RESOURCE_DESC uploadResourceDesc(resourceDesc); + uploadResourceDesc.Flags = D3D12_RESOURCE_FLAG_NONE; + + SLANG_RETURN_ON_FAIL(uploadResource.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, uploadResourceDesc, D3D12_RESOURCE_STATE_GENERIC_READ, nullptr)); + } + + if (srcData) + { + // Copy data to the intermediate upload heap and then schedule a copy + // from the upload heap to the vertex buffer. + UINT8* dstData; + D3D12_RANGE readRange = {}; // We do not intend to read from this resource on the CPU. + + ID3D12Resource* dxUploadResource = uploadResource.getResource(); + + SLANG_RETURN_ON_FAIL(dxUploadResource->Map(0, &readRange, reinterpret_cast(&dstData))); + ::memcpy(dstData, srcData, bufferSize); + dxUploadResource->Unmap(0, nullptr); + + m_commandList->CopyBufferRegion(resourceOut, 0, uploadResource, 0, bufferSize); + + // Make sure it's in the right state + { + D3D12BarrierSubmitter submitter(m_commandList); + resourceOut.transition(finalState, submitter); + } + + submitGpuWorkAndWait(); + } + + return SLANG_OK; +} + +void D3D12Renderer::_resetCommandList() +{ + const FrameInfo& frame = getFrame(); + + ID3D12GraphicsCommandList* commandList = getCommandList(); + commandList->Reset(frame.m_commandAllocator, nullptr); + + // TIM: when should this get set? +// commandList->OMSetRenderTargets( +// 1, +// &m_rtvs[0]->m_descriptor.cpuHandle, +// FALSE, +// m_dsv ? &m_dsv->m_descriptor.cpuHandle : nullptr); + + // Set necessary state. 
+ commandList->RSSetViewports(1, &m_viewport); + commandList->RSSetScissorRects(1, &m_scissorRect); +} + +void D3D12Renderer::beginRender() +{ + // Should currently not be open! + assert(m_commandListOpenCount == 0); + + m_circularResourceHeap.updateCompleted(); + + getFrame().m_commandAllocator->Reset(); + + _resetCommandList(); + + // Indicate that the render target needs to be writable + { + D3D12BarrierSubmitter submitter(m_commandList); + m_renderTargets[m_renderTargetIndex]->transition(D3D12_RESOURCE_STATE_RENDER_TARGET, submitter); + } + + m_commandListOpenCount = 1; +} + +void D3D12Renderer::endRender() +{ + assert(m_commandListOpenCount == 1); + + { + const UInt64 signalValue = m_fence.nextSignal(m_commandQueue); + m_circularResourceHeap.addSync(signalValue); + } + + D3D12Resource& backBuffer = *m_backBuffers[m_renderTargetIndex]; + if (m_isMultiSampled) + { + // MSAA resolve + D3D12Resource& renderTarget = *m_renderTargets[m_renderTargetIndex]; + assert(&renderTarget != &backBuffer); + // Barriers to wait for the render target, and the backbuffer to be in correct state + { + D3D12BarrierSubmitter submitter(m_commandList); + renderTarget.transition(D3D12_RESOURCE_STATE_RESOLVE_SOURCE, submitter); + backBuffer.transition(D3D12_RESOURCE_STATE_RESOLVE_DEST, submitter); + } + + // Do the resolve... + m_commandList->ResolveSubresource(backBuffer, 0, renderTarget, 0, m_targetFormat); + } + + // Make the back buffer presentable + { + D3D12BarrierSubmitter submitter(m_commandList); + backBuffer.transition(D3D12_RESOURCE_STATE_PRESENT, submitter); + } + + SLANG_ASSERT_VOID_ON_FAIL(m_commandList->Close()); + + { + // Execute the command list. 
+ ID3D12CommandList* commandLists[] = { m_commandList }; + m_commandQueue->ExecuteCommandLists(SLANG_COUNT_OF(commandLists), commandLists); + } + + assert(m_commandListOpenCount == 1); + // Must be 0 + m_commandListOpenCount = 0; +} + +void D3D12Renderer::submitGpuWork() +{ + assert(m_commandListOpenCount); + ID3D12GraphicsCommandList* commandList = getCommandList(); + + SLANG_ASSERT_VOID_ON_FAIL(commandList->Close()); + { + // Execute the command list. + ID3D12CommandList* commandLists[] = { commandList }; + m_commandQueue->ExecuteCommandLists(SLANG_COUNT_OF(commandLists), commandLists); + } + + // Reset the render target + _resetCommandList(); +} + +void D3D12Renderer::submitGpuWorkAndWait() +{ + submitGpuWork(); + waitForGpu(); +} + +Result D3D12Renderer::captureTextureToSurface(D3D12Resource& resource, Surface& surfaceOut) +{ + const D3D12_RESOURCE_STATES initialState = resource.getState(); + + const D3D12_RESOURCE_DESC desc = resource.getResource()->GetDesc(); + + // Don't bother supporting MSAA for right now + if (desc.SampleDesc.Count > 1) + { + fprintf(stderr, "ERROR: cannot capture multi-sample texture\n"); + return SLANG_FAIL; + } + + size_t bytesPerPixel = sizeof(uint32_t); + size_t rowPitch = int(desc.Width) * bytesPerPixel; + size_t bufferSize = rowPitch * int(desc.Height); + + D3D12Resource stagingResource; + { + D3D12_RESOURCE_DESC stagingDesc; + _initBufferResourceDesc(bufferSize, stagingDesc); + + D3D12_HEAP_PROPERTIES heapProps; + heapProps.Type = D3D12_HEAP_TYPE_READBACK; + heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; + heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; + heapProps.CreationNodeMask = 1; + heapProps.VisibleNodeMask = 1; + + SLANG_RETURN_ON_FAIL(stagingResource.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, stagingDesc, D3D12_RESOURCE_STATE_COPY_DEST, nullptr)); + } + + { + D3D12BarrierSubmitter submitter(m_commandList); + resource.transition(D3D12_RESOURCE_STATE_COPY_SOURCE, submitter); + } + + 
// Do the copy + { + D3D12_TEXTURE_COPY_LOCATION srcLoc; + srcLoc.pResource = resource; + srcLoc.Type = D3D12_TEXTURE_COPY_TYPE_SUBRESOURCE_INDEX; + srcLoc.SubresourceIndex = 0; + + D3D12_TEXTURE_COPY_LOCATION dstLoc; + dstLoc.pResource = stagingResource; + dstLoc.Type = D3D12_TEXTURE_COPY_TYPE_PLACED_FOOTPRINT; + dstLoc.PlacedFootprint.Offset = 0; + dstLoc.PlacedFootprint.Footprint.Format = desc.Format; + dstLoc.PlacedFootprint.Footprint.Width = UINT(desc.Width); + dstLoc.PlacedFootprint.Footprint.Height = UINT(desc.Height); + dstLoc.PlacedFootprint.Footprint.Depth = 1; + dstLoc.PlacedFootprint.Footprint.RowPitch = UINT(rowPitch); + + m_commandList->CopyTextureRegion(&dstLoc, 0, 0, 0, &srcLoc, nullptr); + } + + { + D3D12BarrierSubmitter submitter(m_commandList); + resource.transition(initialState, submitter); + } + + // Submit the copy, and wait for copy to complete + submitGpuWorkAndWait(); + + { + ID3D12Resource* dxResource = stagingResource; + + UINT8* data; + D3D12_RANGE readRange = {0, bufferSize}; + + SLANG_RETURN_ON_FAIL(dxResource->Map(0, &readRange, reinterpret_cast<void**>(&data))); + + Result res = surfaceOut.set(int(desc.Width), int(desc.Height), Format::RGBA_Unorm_UInt8, int(rowPitch), data, SurfaceAllocator::getMallocAllocator()); + + dxResource->Unmap(0, nullptr); + return res; + } +} + +#if 0 +Result D3D12Renderer::calcComputePipelineState(ComPtr<ID3D12RootSignature>& signatureOut, ComPtr<ID3D12PipelineState>& pipelineStateOut) +{ + BindParameters bindParameters; + _calcBindParameters(bindParameters); + + ComPtr<ID3D12RootSignature> rootSignature; + ComPtr<ID3D12PipelineState> pipelineState; + + { + D3D12_ROOT_SIGNATURE_DESC rootSignatureDesc; + rootSignatureDesc.NumParameters = bindParameters.m_paramIndex; + rootSignatureDesc.pParameters = bindParameters.m_parameters; + rootSignatureDesc.NumStaticSamplers = 0; + rootSignatureDesc.pStaticSamplers = nullptr; + rootSignatureDesc.Flags = D3D12_ROOT_SIGNATURE_FLAG_NONE; + + ComPtr<ID3DBlob> signature; + ComPtr<ID3DBlob> error; + SLANG_RETURN_ON_FAIL(m_D3D12SerializeRootSignature(&rootSignatureDesc, 
D3D_ROOT_SIGNATURE_VERSION_1, signature.writeRef(), error.writeRef())); + SLANG_RETURN_ON_FAIL(m_device->CreateRootSignature(0, signature->GetBufferPointer(), signature->GetBufferSize(), IID_PPV_ARGS(rootSignature.writeRef()))); + } + + { + // Describe and create the compute pipeline state object + D3D12_COMPUTE_PIPELINE_STATE_DESC computeDesc = {}; + computeDesc.pRootSignature = rootSignature; + computeDesc.CS = { m_boundShaderProgram->m_computeShader.Buffer(), m_boundShaderProgram->m_computeShader.Count() }; + SLANG_RETURN_ON_FAIL(m_device->CreateComputePipelineState(&computeDesc, IID_PPV_ARGS(pipelineState.writeRef()))); + } + + signatureOut.swap(rootSignature); + pipelineStateOut.swap(pipelineState); + + return SLANG_OK; +} +#endif + +#if 0 +D3D12Renderer::RenderState* D3D12Renderer::findRenderState(PipelineType pipelineType) +{ + switch (pipelineType) + { + case PipelineType::Compute: + { + // Check if current state is a match + if (m_currentRenderState) + { + if (m_currentRenderState->m_bindingState == m_boundBindingState && + m_currentRenderState->m_shaderProgram == m_boundShaderProgram) + { + return m_currentRenderState; + } + } + + const int num = int(m_renderStates.Count()); + for (int i = 0; i < num; i++) + { + RenderState* renderState = m_renderStates[i]; + if (renderState->m_bindingState == m_boundBindingState && + renderState->m_shaderProgram == m_boundShaderProgram) + { + return renderState; + } + } + break; + } + case PipelineType::Graphics: + { + if (m_currentRenderState) + { + if (m_currentRenderState->m_bindingState == m_boundBindingState && + m_currentRenderState->m_inputLayout == m_boundInputLayout && + m_currentRenderState->m_shaderProgram == m_boundShaderProgram && + m_currentRenderState->m_primitiveTopologyType == m_primitiveTopologyType) + { + return m_currentRenderState; + } + } + // See if matches one in the list + { + const int num = int(m_renderStates.Count()); + for (int i = 0; i < num; i++) + { + RenderState* renderState = 
m_renderStates[i]; + if (renderState->m_bindingState == m_boundBindingState && + renderState->m_inputLayout == m_boundInputLayout && + renderState->m_shaderProgram == m_boundShaderProgram && + renderState->m_primitiveTopologyType == m_primitiveTopologyType) + { + // Okay we have a match + return renderState; + } + } + } + break; + } + default: break; + } + return nullptr; +} + +D3D12Renderer::RenderState* D3D12Renderer::calcRenderState() +{ + if (!m_boundShaderProgram) + { + return nullptr; + } + m_currentRenderState = findRenderState(m_boundShaderProgram->m_pipelineType); + if (m_currentRenderState) + { + return m_currentRenderState; + } + + ComPtr rootSignature; + ComPtr pipelineState; + + switch (m_boundShaderProgram->m_pipelineType) + { + case PipelineType::Compute: + { + if (SLANG_FAILED(calcComputePipelineState(rootSignature, pipelineState))) + { + return nullptr; + } + break; + } + case PipelineType::Graphics: + { + if (SLANG_FAILED(calcGraphicsPipelineState(rootSignature, pipelineState))) + { + return nullptr; + } + break; + } + default: return nullptr; + } + + RenderState* renderState = new RenderState; + + renderState->m_primitiveTopologyType = m_primitiveTopologyType; + renderState->m_bindingState = m_boundBindingState; + renderState->m_inputLayout = m_boundInputLayout; + renderState->m_shaderProgram = m_boundShaderProgram; + + renderState->m_rootSignature.swap(rootSignature); + renderState->m_pipelineState.swap(pipelineState); + + m_renderStates.Add(renderState); + + m_currentRenderState = renderState; + + return renderState; +} + +Result D3D12Renderer::_calcBindParameters(BindParameters& params) +{ + int numConstantBuffers = 0; + { + if (m_boundBindingState) + { + const int numBoundConstantBuffers = numConstantBuffers; + + const BindingState::Desc& bindingStateDesc = m_boundBindingState->getDesc(); + + const auto& bindings = bindingStateDesc.m_bindings; + const auto& details = m_boundBindingState->m_bindingDetails; + + const int numBindings = 
int(bindings.Count()); + + for (int i = 0; i < numBindings; i++) + { + const auto& binding = bindings[i]; + const auto& detail = details[i]; + + const int bindingIndex = binding.registerRange.getSingleIndex(); + + if (binding.bindingType == BindingType::Buffer) + { + assert(binding.resource && binding.resource->isBuffer()); + if (binding.resource->canBind(Resource::BindFlag::ConstantBuffer)) + { + // Make sure it's not overlapping the ones we just statically defined + //assert(binding.m_binding < numBoundConstantBuffers); + + D3D12_ROOT_PARAMETER& param = params.nextParameter(); + param.ParameterType = D3D12_ROOT_PARAMETER_TYPE_CBV; + param.ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL; + + D3D12_ROOT_DESCRIPTOR& descriptor = param.Descriptor; + descriptor.ShaderRegister = bindingIndex; + descriptor.RegisterSpace = 0; + + numConstantBuffers++; + } + } + + if (detail.m_srvIndex >= 0) + { + D3D12_DESCRIPTOR_RANGE& range = params.nextRange(); + + range.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SRV; + range.NumDescriptors = 1; + range.BaseShaderRegister = bindingIndex; + range.RegisterSpace = 0; + range.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + + D3D12_ROOT_PARAMETER& param = params.nextParameter(); + + param.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; + param.ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL; + + D3D12_ROOT_DESCRIPTOR_TABLE& table = param.DescriptorTable; + table.NumDescriptorRanges = 1; + table.pDescriptorRanges = &range; + } + + if (detail.m_uavIndex >= 0) + { + D3D12_DESCRIPTOR_RANGE& range = params.nextRange(); + + range.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_UAV; + range.NumDescriptors = 1; + range.BaseShaderRegister = bindingIndex; + range.RegisterSpace = 0; + range.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + + D3D12_ROOT_PARAMETER& param = params.nextParameter(); + + param.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; + param.ShaderVisibility = 
D3D12_SHADER_VISIBILITY_ALL; + + D3D12_ROOT_DESCRIPTOR_TABLE& table = param.DescriptorTable; + table.NumDescriptorRanges = 1; + table.pDescriptorRanges = &range; + } + } + } + } + + // All the samplers are in one continuous section of the sampler heap + if (m_boundBindingState && m_boundBindingState->m_samplerHeap.getUsedSize() > 0) + { + D3D12_DESCRIPTOR_RANGE& range = params.nextRange(); + + range.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SAMPLER; + range.NumDescriptors = m_boundBindingState->m_samplerHeap.getUsedSize(); + range.BaseShaderRegister = 0; + range.RegisterSpace = 0; + range.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + + D3D12_ROOT_PARAMETER& param = params.nextParameter(); + + param.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; + param.ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL; + + D3D12_ROOT_DESCRIPTOR_TABLE& table = param.DescriptorTable; + table.NumDescriptorRanges = 1; + table.pDescriptorRanges = &range; + } + return SLANG_OK; +} +#endif + +Result D3D12Renderer::_bindRenderState(PipelineStateImpl* pipelineStateImpl, ID3D12GraphicsCommandList* commandList, Submitter* submitter) +{ + // TODO: we should only set some of this state as needed... + + auto pipelineTypeIndex = (int) pipelineStateImpl->m_pipelineType; + auto pipelineLayout = pipelineStateImpl->m_pipelineLayout; + + submitter->setRootSignature(pipelineLayout->m_rootSignature); + commandList->SetPipelineState(pipelineStateImpl->m_pipelineState); + + ID3D12DescriptorHeap* heaps[] = + { + m_viewHeap.getHeap(), + m_samplerHeap.getHeap(), + }; + commandList->SetDescriptorHeaps(SLANG_COUNT_OF(heaps), heaps); + + // We need to copy descriptors over from the descriptor sets + // (where they are stored in CPU-visible heaps) to the GPU-visible + // heaps so that they can be accessed by shader code. 
+ + Int descriptorSetCount = pipelineLayout->m_descriptorSetCount; + Int rootParameterIndex = 0; + for(Int dd = 0; dd < descriptorSetCount; ++dd) + { + auto descriptorSet = m_boundDescriptorSets[pipelineTypeIndex][dd]; + auto descriptorSetLayout = descriptorSet->m_layout; + + // TODO: require that `descriptorSetLayout` is compatible with + // `pipelineLayout->descriptorSetlayouts[dd]`. + + { + if(auto descriptorCount = descriptorSetLayout->m_resourceCount) + { + auto& gpuHeap = m_viewHeap; + auto gpuDescriptorTable = gpuHeap.allocate(descriptorCount); + + auto& cpuHeap = *descriptorSet->m_resourceHeap; + auto cpuDescriptorTable = descriptorSet->m_resourceTable; + + m_device->CopyDescriptorsSimple( + descriptorCount, + gpuHeap.getCpuHandle(gpuDescriptorTable), + cpuHeap.getCpuHandle(cpuDescriptorTable), + D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV); + + submitter->setRootDescriptorTable(rootParameterIndex++, gpuHeap.getGpuHandle(gpuDescriptorTable)); + } + } + { + if(auto descriptorCount = descriptorSetLayout->m_samplerCount) + { + auto& gpuHeap = m_samplerHeap; + auto gpuDescriptorTable = gpuHeap.allocate(descriptorCount); + + auto& cpuHeap = *descriptorSet->m_samplerHeap; + auto cpuDescriptorTable = descriptorSet->m_samplerTable; + + m_device->CopyDescriptorsSimple( + descriptorCount, + gpuHeap.getCpuHandle(gpuDescriptorTable), + cpuHeap.getCpuHandle(cpuDescriptorTable), + D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER); + + submitter->setRootDescriptorTable(rootParameterIndex++, gpuHeap.getGpuHandle(gpuDescriptorTable)); + } + } + } + + return SLANG_OK; +} + +// !!!!!!!!!!!!!!!!!!!!!!!!!!!! Renderer interface !!!!!!!!!!!!!!!!!!!!!!!!!! + +Result D3D12Renderer::initialize(const Desc& desc, void* inWindowHandle) +{ + m_hwnd = (HWND)inWindowHandle; + // Rather than statically link against D3D, we load it dynamically. 
+ + HMODULE d3dModule = LoadLibraryA("d3d12.dll"); + if (!d3dModule) + { + fprintf(stderr, "error: failed load 'd3d12.dll'\n"); + return SLANG_FAIL; + } + + HMODULE dxgiModule = LoadLibraryA("Dxgi.dll"); + if (!dxgiModule) + { + fprintf(stderr, "error: failed load 'dxgi.dll'\n"); + return SLANG_FAIL; + } + + +#define LOAD_D3D_PROC(TYPE, NAME) \ + TYPE NAME##_ = (TYPE) loadProc(d3dModule, #NAME); +#define LOAD_DXGI_PROC(TYPE, NAME) \ + TYPE NAME##_ = (TYPE) loadProc(dxgiModule, #NAME); + + UINT dxgiFactoryFlags = 0; + +#if ENABLE_DEBUG_LAYER + { + LOAD_D3D_PROC(PFN_D3D12_GET_DEBUG_INTERFACE, D3D12GetDebugInterface); + if (D3D12GetDebugInterface_) + { + if (SUCCEEDED(D3D12GetDebugInterface_(IID_PPV_ARGS(m_dxDebug.writeRef())))) + { + m_dxDebug->EnableDebugLayer(); + dxgiFactoryFlags |= DXGI_CREATE_FACTORY_DEBUG; + } + } + } +#endif + + m_D3D12SerializeRootSignature = (PFN_D3D12_SERIALIZE_ROOT_SIGNATURE)loadProc(d3dModule, "D3D12SerializeRootSignature"); + if (!m_D3D12SerializeRootSignature) + { + return SLANG_FAIL; + } + + // Try and create DXGIFactory + ComPtr dxgiFactory; + { + typedef HRESULT(WINAPI *PFN_DXGI_CREATE_FACTORY_2)(UINT Flags, REFIID riid, _COM_Outptr_ void **ppFactory); + LOAD_DXGI_PROC(PFN_DXGI_CREATE_FACTORY_2, CreateDXGIFactory2); + if (!CreateDXGIFactory2_) + { + return SLANG_FAIL; + } + SLANG_RETURN_ON_FAIL(CreateDXGIFactory2_(dxgiFactoryFlags, IID_PPV_ARGS(dxgiFactory.writeRef()))); + } + + D3D_FEATURE_LEVEL featureLevel = D3D_FEATURE_LEVEL_11_0; + + // Search for an adapter that meets our requirements + ComPtr adapter; + + LOAD_D3D_PROC(PFN_D3D12_CREATE_DEVICE, D3D12CreateDevice); + if (!D3D12CreateDevice_) + { + return SLANG_FAIL; + } + + const bool useWarp = false; + + if (useWarp) + { + SLANG_RETURN_ON_FAIL(dxgiFactory->EnumWarpAdapter(IID_PPV_ARGS(adapter.writeRef()))); + SLANG_RETURN_ON_FAIL(D3D12CreateDevice_(adapter, featureLevel, IID_PPV_ARGS(m_device.writeRef()))); + } + else + { + UINT adapterCounter = 0; + for (;;) + { + UINT 
adapterIndex = adapterCounter++; + + ComPtr candidateAdapter; + if (dxgiFactory->EnumAdapters1(adapterIndex, candidateAdapter.writeRef()) == DXGI_ERROR_NOT_FOUND) + break; + + DXGI_ADAPTER_DESC1 desc; + candidateAdapter->GetDesc1(&desc); + + if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE) + { + + // TODO: may want to allow software driver as fallback + } + else + { + continue; + } + + if (SUCCEEDED(D3D12CreateDevice_(candidateAdapter, featureLevel, IID_PPV_ARGS(m_device.writeRef())))) + { + // We found one! + adapter = candidateAdapter; + break; + } + } + } + + if (!adapter) + { + // Couldn't find an adapter + return SLANG_FAIL; + } + + // set up debug layer +#ifndef NDEBUG + { + + LOAD_D3D_PROC(PFN_D3D12_GET_DEBUG_INTERFACE, D3D12GetDebugInterface); + if (!D3D12GetDebugInterface_) + { + return SLANG_FAIL; + } + + ComPtr debug; + + if (!SUCCEEDED(D3D12GetDebugInterface_(IID_PPV_ARGS(debug.writeRef())))) + { + return SLANG_FAIL; + } + + debug->EnableDebugLayer(); + } +#endif + + m_numRenderFrames = 3; + m_numRenderTargets = 2; + + m_desc = desc; + + // set viewport + { + m_viewport.Width = float(m_desc.width); + m_viewport.Height = float(m_desc.height); + m_viewport.MinDepth = 0; + m_viewport.MaxDepth = 1; + m_viewport.TopLeftX = 0; + m_viewport.TopLeftY = 0; + } + + { + m_scissorRect.left = 0; + m_scissorRect.top = 0; + m_scissorRect.right = m_desc.width; + m_scissorRect.bottom = m_desc.height; + } + + // Describe and create the command queue. + D3D12_COMMAND_QUEUE_DESC queueDesc = {}; + queueDesc.Flags = D3D12_COMMAND_QUEUE_FLAG_NONE; + queueDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT; + + SLANG_RETURN_ON_FAIL(m_device->CreateCommandQueue(&queueDesc, IID_PPV_ARGS(m_commandQueue.writeRef()))); + + // Describe the swap chain. 
+ DXGI_SWAP_CHAIN_DESC swapChainDesc = {}; + swapChainDesc.BufferCount = m_numRenderTargets; + swapChainDesc.BufferDesc.Width = m_desc.width; + swapChainDesc.BufferDesc.Height = m_desc.height; + swapChainDesc.BufferDesc.Format = m_targetFormat; + swapChainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT; + swapChainDesc.SwapEffect = DXGI_SWAP_EFFECT_FLIP_DISCARD; + swapChainDesc.OutputWindow = m_hwnd; + swapChainDesc.SampleDesc.Count = 1; + swapChainDesc.Windowed = TRUE; + + if (m_isFullSpeed) + { + m_hasVsync = false; + m_allowFullScreen = false; + } + + if (!m_hasVsync) + { + swapChainDesc.Flags |= DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT; + } + + // Swap chain needs the queue so that it can force a flush on it. + ComPtr swapChain; + SLANG_RETURN_ON_FAIL(dxgiFactory->CreateSwapChain(m_commandQueue, &swapChainDesc, swapChain.writeRef())); + SLANG_RETURN_ON_FAIL(swapChain->QueryInterface(m_swapChain.writeRef())); + + if (!m_hasVsync) + { + m_swapChainWaitableObject = m_swapChain->GetFrameLatencyWaitableObject(); + + int maxLatency = m_numRenderTargets - 2; + + // Make sure the maximum latency is in the range required by dx12 runtime + maxLatency = (maxLatency < 1) ? 1 : maxLatency; + maxLatency = (maxLatency > DXGI_MAX_SWAP_CHAIN_BUFFERS) ? DXGI_MAX_SWAP_CHAIN_BUFFERS : maxLatency; + + m_swapChain->SetMaximumFrameLatency(maxLatency); + } + + // This sample does not support fullscreen transitions. + SLANG_RETURN_ON_FAIL(dxgiFactory->MakeWindowAssociation(m_hwnd, DXGI_MWA_NO_ALT_ENTER)); + + m_renderTargetIndex = m_swapChain->GetCurrentBackBufferIndex(); + + // Create descriptor heaps. 
+ + SLANG_RETURN_ON_FAIL(m_viewHeap.init (m_device, 256, D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV, D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE)); + SLANG_RETURN_ON_FAIL(m_samplerHeap.init(m_device, 16, D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER, D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE)); + + SLANG_RETURN_ON_FAIL(m_cpuViewHeap.init (m_device, 1024, D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV, D3D12_DESCRIPTOR_HEAP_FLAG_NONE)); + SLANG_RETURN_ON_FAIL(m_cpuSamplerHeap.init(m_device, 64, D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER, D3D12_DESCRIPTOR_HEAP_FLAG_NONE)); + + SLANG_RETURN_ON_FAIL(m_rtvAllocator.init (m_device, 16, D3D12_DESCRIPTOR_HEAP_TYPE_RTV)); + SLANG_RETURN_ON_FAIL(m_dsvAllocator.init (m_device, 16, D3D12_DESCRIPTOR_HEAP_TYPE_DSV)); + SLANG_RETURN_ON_FAIL(m_viewAllocator.init (m_device, 64, D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV)); + SLANG_RETURN_ON_FAIL(m_samplerAllocator.init(m_device, 16, D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER)); + + // Setup frame resources + { + SLANG_RETURN_ON_FAIL(createFrameResources()); + } + + // Setup fence, and close the command list (as default state without begin/endRender is closed) + { + SLANG_RETURN_ON_FAIL(m_fence.init(m_device)); + // Create the command list. When command lists are created they are open, so close it. 
+ FrameInfo& frame = m_frameInfos[m_frameIndex]; + SLANG_RETURN_ON_FAIL(m_device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT, frame.m_commandAllocator, nullptr, IID_PPV_ARGS(m_commandList.writeRef()))); + m_commandList->Close(); + } + + { + D3D12CircularResourceHeap::Desc desc; + desc.init(); + // Define size + desc.m_blockSize = 65536; + // Set up the heap + m_circularResourceHeap.init(m_device, desc, &m_fence); + } + + // Setup for rendering + beginRender(); + + m_isInitialized = true; + return SLANG_OK; +} + +Result D3D12Renderer::createFrameResources() +{ + // Create back buffers + { +// D3D12_CPU_DESCRIPTOR_HANDLE rtvStart(m_rtvHeap->GetCPUDescriptorHandleForHeapStart()); + + // Work out target format + D3D12_RESOURCE_DESC resourceDesc; + { + ComPtr backBuffer; + SLANG_RETURN_ON_FAIL(m_swapChain->GetBuffer(0, IID_PPV_ARGS(backBuffer.writeRef()))); + resourceDesc = backBuffer->GetDesc(); + } + const DXGI_FORMAT resourceFormat = D3DUtil::calcResourceFormat(D3DUtil::USAGE_TARGET, m_targetUsageFlags, resourceDesc.Format); + const DXGI_FORMAT targetFormat = D3DUtil::calcFormat(D3DUtil::USAGE_TARGET, resourceFormat); + + // Set the target format + m_targetFormat = targetFormat; + + // Create a RTV, and a command allocator for each frame. + for (int i = 0; i < m_numRenderTargets; i++) + { + // Get the back buffer + ComPtr backBuffer; + SLANG_RETURN_ON_FAIL(m_swapChain->GetBuffer(UINT(i), IID_PPV_ARGS(backBuffer.writeRef()))); + + // Set up resource for back buffer + m_backBufferResources[i].setResource(backBuffer, D3D12_RESOURCE_STATE_COMMON); + m_backBuffers[i] = &m_backBufferResources[i]; + // Assume they are the same thing for now... 
+ m_renderTargets[i] = &m_backBufferResources[i]; + + // If we are multi-sampling - create a render target separate from the back buffer + if (m_isMultiSampled) + { + D3D12_HEAP_PROPERTIES heapProps; + heapProps.Type = D3D12_HEAP_TYPE_DEFAULT; + heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; + heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; + heapProps.CreationNodeMask = 1; + heapProps.VisibleNodeMask = 1; + D3D12_CLEAR_VALUE clearValue = {}; + clearValue.Format = m_targetFormat; + + // Don't know targets alignment, so just memory copy + ::memcpy(clearValue.Color, m_clearColor, sizeof(m_clearColor)); + + D3D12_RESOURCE_DESC desc(resourceDesc); + + desc.Format = resourceFormat; + desc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D; + desc.SampleDesc.Count = m_numTargetSamples; + desc.SampleDesc.Quality = m_targetSampleQuality; + desc.Alignment = 0; + + SLANG_RETURN_ON_FAIL(m_renderTargetResources[i].initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, desc, D3D12_RESOURCE_STATE_RENDER_TARGET, &clearValue)); + m_renderTargets[i] = &m_renderTargetResources[i]; + } + + D3D12HostVisibleDescriptor rtvDescriptor; + SLANG_RETURN_ON_FAIL(m_rtvAllocator.allocate(&rtvDescriptor)); + + m_device->CreateRenderTargetView(*m_renderTargets[i], nullptr, rtvDescriptor.cpuHandle); + } + } + + // Set up frames + for (int i = 0; i < m_numRenderFrames; i++) + { + FrameInfo& frame = m_frameInfos[i]; + SLANG_RETURN_ON_FAIL(m_device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT, IID_PPV_ARGS(frame.m_commandAllocator.writeRef()))); + } + + { + D3D12_RESOURCE_DESC desc = m_backBuffers[0]->getResource()->GetDesc(); + assert(desc.Width == UINT64(m_desc.width) && desc.Height == UINT64(m_desc.height)); + } + + // Create the depth stencil view. 
+ { + D3D12_HEAP_PROPERTIES heapProps; + heapProps.Type = D3D12_HEAP_TYPE_DEFAULT; + heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; + heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; + heapProps.CreationNodeMask = 1; + heapProps.VisibleNodeMask = 1; + + DXGI_FORMAT resourceFormat = D3DUtil::calcResourceFormat(D3DUtil::USAGE_DEPTH_STENCIL, m_depthStencilUsageFlags, m_depthStencilFormat); + DXGI_FORMAT depthStencilFormat = D3DUtil::calcFormat(D3DUtil::USAGE_DEPTH_STENCIL, resourceFormat); + + // Set the depth stencil format + m_depthStencilFormat = depthStencilFormat; + + // Setup default clear + D3D12_CLEAR_VALUE clearValue = {}; + clearValue.Format = depthStencilFormat; + clearValue.DepthStencil.Depth = 1.0f; + clearValue.DepthStencil.Stencil = 0; + + D3D12_RESOURCE_DESC resourceDesc = {}; + resourceDesc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D; + resourceDesc.Format = resourceFormat; + resourceDesc.Width = m_desc.width; + resourceDesc.Height = m_desc.height; + resourceDesc.DepthOrArraySize = 1; + resourceDesc.MipLevels = 1; + resourceDesc.SampleDesc.Count = m_numTargetSamples; + resourceDesc.SampleDesc.Quality = m_targetSampleQuality; + resourceDesc.Layout = D3D12_TEXTURE_LAYOUT_UNKNOWN; + resourceDesc.Flags = D3D12_RESOURCE_FLAG_ALLOW_DEPTH_STENCIL; + resourceDesc.Alignment = 0; + +#if 0 + SLANG_RETURN_ON_FAIL(m_depthStencil.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, resourceDesc, D3D12_RESOURCE_STATE_DEPTH_WRITE, &clearValue)); + + // Set the depth stencil + D3D12_DEPTH_STENCIL_VIEW_DESC depthStencilDesc = {}; + depthStencilDesc.Format = depthStencilFormat; + depthStencilDesc.ViewDimension = m_isMultiSampled ? 
D3D12_DSV_DIMENSION_TEXTURE2DMS : D3D12_DSV_DIMENSION_TEXTURE2D; + depthStencilDesc.Flags = D3D12_DSV_FLAG_NONE; + + // Set up as the depth stencil view + m_device->CreateDepthStencilView(m_depthStencil, &depthStencilDesc, m_dsvHeap->GetCPUDescriptorHandleForHeapStart()); + m_depthStencilView = m_dsvHeap->GetCPUDescriptorHandleForHeapStart(); +#endif + } + + m_viewport.Width = static_cast<float>(m_desc.width); + m_viewport.Height = static_cast<float>(m_desc.height); + m_viewport.MaxDepth = 1.0f; + + m_scissorRect.right = static_cast<LONG>(m_desc.width); + m_scissorRect.bottom = static_cast<LONG>(m_desc.height); + + return SLANG_OK; +} + +void D3D12Renderer::setClearColor(const float color[4]) +{ + memcpy(m_clearColor, color, sizeof(m_clearColor)); +} + +void D3D12Renderer::clearFrame() +{ + // Record commands + if(auto rtv = m_rtvs[0]) + { + m_commandList->ClearRenderTargetView(rtv->m_descriptor.cpuHandle, m_clearColor, 0, nullptr); + } + if (m_dsv) + { + m_commandList->ClearDepthStencilView(m_dsv->m_descriptor.cpuHandle, D3D12_CLEAR_FLAG_DEPTH, 1.0f, 0, 0, nullptr); + } +} + +void D3D12Renderer::presentFrame() +{ + endRender(); + + if (m_swapChainWaitableObject) + { + // check if now is good time to present + // This doesn't wait - because the wait time is 0. If it returns WAIT_TIMEOUT it means that no frame is waiting to be displayed + // so there is no point doing a present. + const bool shouldPresent = (WaitForSingleObjectEx(m_swapChainWaitableObject, 0, TRUE) != WAIT_TIMEOUT); + if (shouldPresent) + { + m_swapChain->Present(0, 0); + } + } + else + { + if (SLANG_FAILED(m_swapChain->Present(1, 0))) + { + assert(!"Problem presenting"); + beginRender(); + return; + } + } + + // Increment the fence value. 
Save on the frame - we'll know that frame is done when the fence value >= + m_frameInfos[m_frameIndex].m_fenceValue = m_fence.nextSignal(m_commandQueue); + + // increment frame index after signal + m_frameIndex = (m_frameIndex + 1) % m_numRenderFrames; + // Update the render target index. + m_renderTargetIndex = m_swapChain->GetCurrentBackBufferIndex(); + + // On the current frame wait until it is completed + { + FrameInfo& frame = m_frameInfos[m_frameIndex]; + // If the next frame is not ready to be rendered yet, wait until it is ready. + m_fence.waitUntilCompleted(frame.m_fenceValue); + } + + // Setup such that rendering can restart + beginRender(); +} + +TextureResource::Desc D3D12Renderer::getSwapChainTextureDesc() +{ + TextureResource::Desc desc; + desc.init2D(Resource::Type::Texture2D, Format::Unknown, m_desc.width, m_desc.height, 1); + + return desc; +} + +SlangResult D3D12Renderer::captureScreenSurface(Surface& surfaceOut) +{ + return captureTextureToSurface(*m_renderTargets[m_renderTargetIndex], surfaceOut); +} + +static D3D12_RESOURCE_STATES _calcResourceState(Resource::Usage usage) +{ + typedef Resource::Usage Usage; + switch (usage) + { + case Usage::VertexBuffer: return D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER; + case Usage::IndexBuffer: return D3D12_RESOURCE_STATE_INDEX_BUFFER; + case Usage::ConstantBuffer: return D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER; + case Usage::StreamOutput: return D3D12_RESOURCE_STATE_STREAM_OUT; + case Usage::RenderTarget: return D3D12_RESOURCE_STATE_RENDER_TARGET; + case Usage::DepthWrite: return D3D12_RESOURCE_STATE_DEPTH_WRITE; + case Usage::DepthRead: return D3D12_RESOURCE_STATE_DEPTH_READ; + case Usage::UnorderedAccess: return D3D12_RESOURCE_STATE_UNORDERED_ACCESS; + case Usage::PixelShaderResource: return D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE; + case Usage::NonPixelShaderResource: return D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE; + case Usage::GenericRead: return 
D3D12_RESOURCE_STATE_GENERIC_READ; + default: return D3D12_RESOURCE_STATES(0); + } +} + +static D3D12_RESOURCE_FLAGS _calcResourceFlag(Resource::BindFlag::Enum bindFlag) +{ + typedef Resource::BindFlag BindFlag; + switch (bindFlag) + { + case BindFlag::RenderTarget: return D3D12_RESOURCE_FLAG_ALLOW_RENDER_TARGET; + case BindFlag::DepthStencil: return D3D12_RESOURCE_FLAG_ALLOW_DEPTH_STENCIL; + case BindFlag::UnorderedAccess: return D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS; + default: return D3D12_RESOURCE_FLAG_NONE; + } +} + +static D3D12_RESOURCE_FLAGS _calcResourceBindFlags(Resource::Usage initialUsage, int bindFlags) +{ + int dstFlags = 0; + while (bindFlags) + { + int lsb = bindFlags & -bindFlags; + + dstFlags |= _calcResourceFlag(Resource::BindFlag::Enum(lsb)); + bindFlags &= ~lsb; + } + return D3D12_RESOURCE_FLAGS(dstFlags); +} + +static D3D12_RESOURCE_DIMENSION _calcResourceDimension(Resource::Type type) +{ + switch (type) + { + case Resource::Type::Buffer: return D3D12_RESOURCE_DIMENSION_BUFFER; + case Resource::Type::Texture1D: return D3D12_RESOURCE_DIMENSION_TEXTURE1D; + case Resource::Type::TextureCube: + case Resource::Type::Texture2D: + { + return D3D12_RESOURCE_DIMENSION_TEXTURE2D; + } + case Resource::Type::Texture3D: return D3D12_RESOURCE_DIMENSION_TEXTURE3D; + default: return D3D12_RESOURCE_DIMENSION_UNKNOWN; + } +} + +Result D3D12Renderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData, TextureResource** outResource) +{ + // Description of uploading on Dx12 + // https://msdn.microsoft.com/en-us/library/windows/desktop/dn899215%28v=vs.85%29.aspx + + TextureResource::Desc srcDesc(descIn); + srcDesc.setDefaults(initialUsage); + + const DXGI_FORMAT pixelFormat = D3DUtil::getMapFormat(srcDesc.format); + if (pixelFormat == DXGI_FORMAT_UNKNOWN) + { + return SLANG_FAIL; + } + + const int arraySize = srcDesc.calcEffectiveArraySize(); + + const D3D12_RESOURCE_DIMENSION 
dimension = _calcResourceDimension(srcDesc.type); + if (dimension == D3D12_RESOURCE_DIMENSION_UNKNOWN) + { + return SLANG_FAIL; + } + + const int numMipMaps = srcDesc.numMipLevels; + + // Setup desc + D3D12_RESOURCE_DESC resourceDesc; + + resourceDesc.Dimension = dimension; + resourceDesc.Format = pixelFormat; + resourceDesc.Width = srcDesc.size.width; + resourceDesc.Height = srcDesc.size.height; + resourceDesc.DepthOrArraySize = (srcDesc.size.depth > 1) ? srcDesc.size.depth : arraySize; + + resourceDesc.MipLevels = numMipMaps; + resourceDesc.SampleDesc.Count = srcDesc.sampleDesc.numSamples; + resourceDesc.SampleDesc.Quality = srcDesc.sampleDesc.quality; + + resourceDesc.Flags = D3D12_RESOURCE_FLAG_NONE; + resourceDesc.Layout = D3D12_TEXTURE_LAYOUT_UNKNOWN; + resourceDesc.Alignment = 0; + + RefPtr texture(new TextureResourceImpl(srcDesc)); + + // Create the target resource + { + D3D12_HEAP_PROPERTIES heapProps; + + heapProps.Type = D3D12_HEAP_TYPE_DEFAULT; + heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; + heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; + heapProps.CreationNodeMask = 1; + heapProps.VisibleNodeMask = 1; + + SLANG_RETURN_ON_FAIL(texture->m_resource.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, resourceDesc, D3D12_RESOURCE_STATE_COPY_DEST, nullptr)); + + texture->m_resource.setDebugName(L"Texture"); + } + + // Calculate the layout + List layouts; + layouts.SetSize(numMipMaps); + List mipRowSizeInBytes; + mipRowSizeInBytes.SetSize(numMipMaps); + List mipNumRows; + mipNumRows.SetSize(numMipMaps); + + // Since textures are effectively immutable currently initData must be set + assert(initData); + // We should have this many sub resources + assert(initData->numSubResources == numMipMaps * srcDesc.size.depth * arraySize); + + // This is just the size for one array upload -> not for the whole texure + UInt64 requiredSize = 0; + m_device->GetCopyableFootprints(&resourceDesc, 0, numMipMaps, 0, layouts.begin(), 
mipNumRows.begin(), mipRowSizeInBytes.begin(), &requiredSize); + + // Sub resource indexing + // https://msdn.microsoft.com/en-us/library/windows/desktop/dn705766(v=vs.85).aspx#subresource_indexing + + int subResourceIndex = 0; + for (int i = 0; i < arraySize; i++) + { + // Create the upload texture + D3D12Resource uploadTexture; + { + D3D12_HEAP_PROPERTIES heapProps; + + heapProps.Type = D3D12_HEAP_TYPE_UPLOAD; + heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; + heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; + heapProps.CreationNodeMask = 1; + heapProps.VisibleNodeMask = 1; + + D3D12_RESOURCE_DESC uploadResourceDesc; + + uploadResourceDesc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER; + uploadResourceDesc.Format = DXGI_FORMAT_UNKNOWN; + uploadResourceDesc.Width = requiredSize; + uploadResourceDesc.Height = 1; + uploadResourceDesc.DepthOrArraySize = 1; + uploadResourceDesc.MipLevels = 1; + uploadResourceDesc.SampleDesc.Count = 1; + uploadResourceDesc.SampleDesc.Quality = 0; + uploadResourceDesc.Flags = D3D12_RESOURCE_FLAG_NONE; + uploadResourceDesc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR; + uploadResourceDesc.Alignment = 0; + + SLANG_RETURN_ON_FAIL(uploadTexture.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, uploadResourceDesc, D3D12_RESOURCE_STATE_GENERIC_READ, nullptr)); + + uploadTexture.setDebugName(L"TextureUpload"); + } + + ID3D12Resource* uploadResource = uploadTexture; + + uint8_t* p; + uploadResource->Map(0, nullptr, reinterpret_cast(&p)); + + for (int j = 0; j < numMipMaps; ++j) + { + const D3D12_PLACED_SUBRESOURCE_FOOTPRINT& layout = layouts[j]; + const D3D12_SUBRESOURCE_FOOTPRINT& footprint = layout.Footprint; + + const TextureResource::Size mipSize = srcDesc.size.calcMipSize(j); + + assert(footprint.Width == mipSize.width && footprint.Height == mipSize.height && footprint.Depth == mipSize.depth); + + const ptrdiff_t dstMipRowPitch = ptrdiff_t(layouts[j].Footprint.RowPitch); + const ptrdiff_t srcMipRowPitch = 
ptrdiff_t(initData->mipRowStrides[j]); + + assert(dstMipRowPitch >= srcMipRowPitch); + + const uint8_t* srcRow = (const uint8_t*)initData->subResources[subResourceIndex]; + uint8_t* dstRow = p + layouts[j].Offset; + + // Copy the depth each mip + for (int l = 0; l < mipSize.depth; l++) + { + // Copy rows + for (int k = 0; k < mipSize.height; ++k) + { + ::memcpy(dstRow, srcRow, srcMipRowPitch); + + srcRow += srcMipRowPitch; + dstRow += dstMipRowPitch; + } + } + + //assert(srcRow == (const uint8_t*)(srcMip.Buffer() + srcMip.Count())); + } + uploadResource->Unmap(0, nullptr); + + for (int mipIndex = 0; mipIndex < numMipMaps; ++mipIndex) + { + // https://msdn.microsoft.com/en-us/library/windows/desktop/dn903862(v=vs.85).aspx + + D3D12_TEXTURE_COPY_LOCATION src; + src.pResource = uploadTexture; + src.Type = D3D12_TEXTURE_COPY_TYPE_PLACED_FOOTPRINT; + src.PlacedFootprint = layouts[mipIndex]; + + D3D12_TEXTURE_COPY_LOCATION dst; + dst.pResource = texture->m_resource; + dst.Type = D3D12_TEXTURE_COPY_TYPE_SUBRESOURCE_INDEX; + dst.SubresourceIndex = subResourceIndex; + m_commandList->CopyTextureRegion(&dst, 0, 0, 0, &src, nullptr); + + subResourceIndex++; + } + + // Block - waiting for copy to complete (so can drop upload texture) + submitGpuWorkAndWait(); + } + + { + const D3D12_RESOURCE_STATES finalState = _calcResourceState(initialUsage); + D3D12BarrierSubmitter submitter(m_commandList); + texture->m_resource.transition(finalState, submitter); + + submitGpuWorkAndWait(); + } + + *outResource = texture.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData, BufferResource** outResource) +{ + typedef BufferResourceImpl::BackingStyle Style; + + BufferResource::Desc srcDesc(descIn); + srcDesc.setDefaults(initialUsage); + + // Always align up to 256 bytes, since that is required for constant buffers. 
+ // + // TODO: only do this for buffers that could potentially be bound as constant buffers... + // + const size_t alignedSizeInBytes = D3DUtil::calcAligned(srcDesc.sizeInBytes, 256); + + RefPtr<BufferResourceImpl> buffer(new BufferResourceImpl(initialUsage, srcDesc)); + + // Save the style + buffer->m_backingStyle = BufferResourceImpl::_calcResourceBackingStyle(initialUsage); + + D3D12_RESOURCE_DESC bufferDesc; + _initBufferResourceDesc(alignedSizeInBytes, bufferDesc); + + bufferDesc.Flags = _calcResourceBindFlags(initialUsage, srcDesc.bindFlags); + + switch (buffer->m_backingStyle) + { + case Style::MemoryBacked: + { + // Assume the constant buffer will change every frame. We'll just keep a copy of the contents + // in regular memory until it is needed + buffer->m_memory.SetSize(UInt(alignedSizeInBytes)); + // Initialize + if (initData) + { + ::memcpy(buffer->m_memory.Buffer(), initData, srcDesc.sizeInBytes); + } + break; + } + case Style::ResourceBacked: + { + const D3D12_RESOURCE_STATES initialState = _calcResourceState(initialUsage); + SLANG_RETURN_ON_FAIL(createBuffer(bufferDesc, initData, buffer->m_uploadResource, initialState, buffer->m_resource)); + break; + } + default: + return SLANG_FAIL; + } + + *outResource = buffer.detach(); + return SLANG_OK; +} + +D3D12_FILTER_TYPE translateFilterMode(TextureFilteringMode mode) +{ + switch (mode) + { + default: + return D3D12_FILTER_TYPE(0); + +#define CASE(SRC, DST) \ + case TextureFilteringMode::SRC: return D3D12_FILTER_TYPE_##DST + + CASE(Point, POINT); + CASE(Linear, LINEAR); + +#undef CASE + } +} + +D3D12_FILTER_REDUCTION_TYPE translateFilterReduction(TextureReductionOp op) +{ + switch (op) + { + default: + return D3D12_FILTER_REDUCTION_TYPE(0); + +#define CASE(SRC, DST) \ + case TextureReductionOp::SRC: return D3D12_FILTER_REDUCTION_TYPE_##DST + + CASE(Average, STANDARD); + CASE(Comparison, COMPARISON); + CASE(Minimum, MINIMUM); + CASE(Maximum, MAXIMUM); + +#undef CASE + } +} + +D3D12_TEXTURE_ADDRESS_MODE 
translateAddressingMode(TextureAddressingMode mode) +{ + switch (mode) + { + default: + return D3D12_TEXTURE_ADDRESS_MODE(0); + +#define CASE(SRC, DST) \ + case TextureAddressingMode::SRC: return D3D12_TEXTURE_ADDRESS_MODE_##DST + + CASE(Wrap, WRAP); + CASE(ClampToEdge, CLAMP); + CASE(ClampToBorder, BORDER); + CASE(MirrorRepeat, MIRROR); + CASE(MirrorOnce, MIRROR_ONCE); + +#undef CASE + } +} + +static D3D12_COMPARISON_FUNC translateComparisonFunc(ComparisonFunc func) +{ + switch (func) + { + default: + // TODO: need to report failures + return D3D12_COMPARISON_FUNC_ALWAYS; + +#define CASE(FROM, TO) \ + case ComparisonFunc::FROM: return D3D12_COMPARISON_FUNC_##TO + + CASE(Never, NEVER); + CASE(Less, LESS); + CASE(Equal, EQUAL); + CASE(LessEqual, LESS_EQUAL); + CASE(Greater, GREATER); + CASE(NotEqual, NOT_EQUAL); + CASE(GreaterEqual, GREATER_EQUAL); + CASE(Always, ALWAYS); +#undef CASE + } +} + +Result D3D12Renderer::createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) +{ + D3D12_FILTER_REDUCTION_TYPE dxReduction = translateFilterReduction(desc.reductionOp); + D3D12_FILTER dxFilter; + if (desc.maxAnisotropy > 1) + { + dxFilter = D3D12_ENCODE_ANISOTROPIC_FILTER(dxReduction); + } + else + { + D3D12_FILTER_TYPE dxMin = translateFilterMode(desc.minFilter); + D3D12_FILTER_TYPE dxMag = translateFilterMode(desc.magFilter); + D3D12_FILTER_TYPE dxMip = translateFilterMode(desc.mipFilter); + + dxFilter = D3D12_ENCODE_BASIC_FILTER(dxMin, dxMag, dxMip, dxReduction); + } + + D3D12_SAMPLER_DESC dxDesc = {}; + dxDesc.Filter = dxFilter; + dxDesc.AddressU = translateAddressingMode(desc.addressU); + dxDesc.AddressV = translateAddressingMode(desc.addressV); + dxDesc.AddressW = translateAddressingMode(desc.addressW); + dxDesc.MipLODBias = desc.mipLODBias; + dxDesc.MaxAnisotropy = desc.maxAnisotropy; + dxDesc.ComparisonFunc = translateComparisonFunc(desc.comparisonFunc); + for (int ii = 0; ii < 4; ++ii) + dxDesc.BorderColor[ii] = desc.borderColor[ii]; + 
dxDesc.MinLOD = desc.minLOD; + dxDesc.MaxLOD = desc.maxLOD; + + auto samplerHeap = &m_cpuSamplerHeap; + + int indexInSamplerHeap = samplerHeap->allocate(); + if(indexInSamplerHeap < 0) + { + // We ran out of room in our CPU sampler heap. + // + // TODO: this should not be a catastrophic failure, because + // we should just allocate another CPU sampler heap that + // can service subsequent allocation. + // + return SLANG_FAIL; + } + auto cpuDescriptorHandle = samplerHeap->getCpuHandle(indexInSamplerHeap); + + m_device->CreateSampler(&dxDesc, cpuDescriptorHandle); + + // TODO: We really ought to have a free-list of sampler-heap + // entries that we check before we go to the heap, and then + // when we are done with a sampler we simply add it to the free list. + // + RefPtr samplerImpl = new SamplerStateImpl(); + samplerImpl->m_cpuHandle = cpuDescriptorHandle; + *outSampler = samplerImpl.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) +{ + auto resourceImpl = (TextureResourceImpl*) texture; + + RefPtr viewImpl = new ResourceViewImpl(); + viewImpl->m_resource = resourceImpl; + + switch (desc.type) + { + default: + return SLANG_FAIL; + + case ResourceView::Type::RenderTarget: + { + SLANG_RETURN_ON_FAIL(m_rtvAllocator.allocate(&viewImpl->m_descriptor)); + m_device->CreateRenderTargetView(resourceImpl->m_resource, nullptr, viewImpl->m_descriptor.cpuHandle); + } + break; + + case ResourceView::Type::DepthStencil: + { + SLANG_RETURN_ON_FAIL(m_dsvAllocator.allocate(&viewImpl->m_descriptor)); + m_device->CreateDepthStencilView(resourceImpl->m_resource, nullptr, viewImpl->m_descriptor.cpuHandle); + } + break; + + case ResourceView::Type::UnorderedAccess: + { + // TODO: need to support the separate "counter resource" for the case + // of append/consume buffers with attached counters. 
+ + SLANG_RETURN_ON_FAIL(m_viewAllocator.allocate(&viewImpl->m_descriptor)); + m_device->CreateUnorderedAccessView(resourceImpl->m_resource, nullptr, nullptr, viewImpl->m_descriptor.cpuHandle); + } + break; + + case ResourceView::Type::ShaderResource: + { + SLANG_RETURN_ON_FAIL(m_viewAllocator.allocate(&viewImpl->m_descriptor)); + m_device->CreateShaderResourceView(resourceImpl->m_resource, nullptr, viewImpl->m_descriptor.cpuHandle); + } + break; + } + + *outView = viewImpl.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) +{ + auto resourceImpl = (BufferResourceImpl*) buffer; + auto resourceDesc = resourceImpl->getDesc(); + + RefPtr viewImpl = new ResourceViewImpl(); + viewImpl->m_resource = resourceImpl; + + switch (desc.type) + { + default: + return SLANG_FAIL; + + case ResourceView::Type::UnorderedAccess: + { + D3D12_UNORDERED_ACCESS_VIEW_DESC uavDesc = {}; + uavDesc.ViewDimension = D3D12_UAV_DIMENSION_BUFFER; + uavDesc.Format = D3DUtil::getMapFormat(desc.format); + uavDesc.Buffer.FirstElement = 0; + uavDesc.Buffer.NumElements = resourceDesc.sizeInBytes; + + if(resourceDesc.elementSize) + { + uavDesc.Buffer.StructureByteStride = resourceDesc.elementSize; + uavDesc.Buffer.NumElements = resourceDesc.sizeInBytes / resourceDesc.elementSize; + } + else if(desc.format == Format::Unknown) + { + uavDesc.Buffer.Flags |= D3D12_BUFFER_UAV_FLAG_RAW; + uavDesc.Format = DXGI_FORMAT_R32_TYPELESS; + } + + + // TODO: need to support the separate "counter resource" for the case + // of append/consume buffers with attached counters. 
+ + SLANG_RETURN_ON_FAIL(m_viewAllocator.allocate(&viewImpl->m_descriptor)); + m_device->CreateUnorderedAccessView(resourceImpl->m_resource, nullptr, &uavDesc, viewImpl->m_descriptor.cpuHandle); + } + break; + + case ResourceView::Type::ShaderResource: + { + D3D12_SHADER_RESOURCE_VIEW_DESC srvDesc = {}; + srvDesc.ViewDimension = D3D12_SRV_DIMENSION_BUFFER; + srvDesc.Format = D3DUtil::getMapFormat(desc.format); + srvDesc.Buffer.StructureByteStride = 0; + srvDesc.Buffer.FirstElement = 0; + srvDesc.Buffer.NumElements = resourceDesc.sizeInBytes; + + if(resourceDesc.elementSize) + { + srvDesc.Buffer.StructureByteStride = resourceDesc.elementSize; + srvDesc.Buffer.NumElements = resourceDesc.sizeInBytes / resourceDesc.elementSize; + } + + SLANG_RETURN_ON_FAIL(m_viewAllocator.allocate(&viewImpl->m_descriptor)); + m_device->CreateShaderResourceView(resourceImpl->m_resource, &srvDesc, viewImpl->m_descriptor.cpuHandle); + } + break; + } + + *outView = viewImpl.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount, InputLayout** outLayout) +{ + RefPtr layout(new InputLayoutImpl); + + // Work out a buffer size to hold all text + size_t textSize = 0; + for (int i = 0; i < Int(inputElementCount); ++i) + { + const char* text = inputElements[i].semanticName; + textSize += text ? 
(::strlen(text) + 1) : 0; + } + layout->m_text.SetSize(textSize); + char* textPos = layout->m_text.Buffer(); + + // + List& elements = layout->m_elements; + elements.SetSize(inputElementCount); + + + for (UInt i = 0; i < inputElementCount; ++i) + { + const InputElementDesc& srcEle = inputElements[i]; + D3D12_INPUT_ELEMENT_DESC& dstEle = elements[i]; + + // Add text to the buffer + const char* semanticName = srcEle.semanticName; + if (semanticName) + { + const int len = int(::strlen(semanticName)); + ::memcpy(textPos, semanticName, len + 1); + semanticName = textPos; + textPos += len + 1; + } + + dstEle.SemanticName = semanticName; + dstEle.SemanticIndex = (UINT)srcEle.semanticIndex; + dstEle.Format = D3DUtil::getMapFormat(srcEle.format); + dstEle.InputSlot = 0; + dstEle.AlignedByteOffset = (UINT)srcEle.offset; + dstEle.InputSlotClass = D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA; + dstEle.InstanceDataStepRate = 0; + } + + *outLayout = layout.detach(); + return SLANG_OK; +} + +void* D3D12Renderer::map(BufferResource* bufferIn, MapFlavor flavor) +{ + typedef BufferResourceImpl::BackingStyle Style; + + BufferResourceImpl* buffer = static_cast(bufferIn); + buffer->m_mapFlavor = flavor; + + const size_t bufferSize = buffer->getDesc().sizeInBytes; + + switch (buffer->m_backingStyle) + { + case Style::ResourceBacked: + { + // We need this in a state so we can upload + switch (flavor) + { + case MapFlavor::HostWrite: + case MapFlavor::WriteDiscard: + { + D3D12BarrierSubmitter submitter(m_commandList); + buffer->m_uploadResource.transition(D3D12_RESOURCE_STATE_GENERIC_READ, submitter); + buffer->m_resource.transition(D3D12_RESOURCE_STATE_COPY_DEST, submitter); + + const D3D12_RANGE readRange = {}; + + void* uploadData; + SLANG_RETURN_NULL_ON_FAIL(buffer->m_uploadResource.getResource()->Map(0, &readRange, reinterpret_cast(&uploadData))); + return uploadData; + + break; + } + case MapFlavor::HostRead: + { + // This will be slow!!! 
- it blocks CPU on GPU completion + D3D12Resource& resource = buffer->m_resource; + + // Readback heap + D3D12_HEAP_PROPERTIES heapProps; + heapProps.Type = D3D12_HEAP_TYPE_READBACK; + heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; + heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; + heapProps.CreationNodeMask = 1; + heapProps.VisibleNodeMask = 1; + + // Resource to readback to + D3D12_RESOURCE_DESC stagingDesc; + _initBufferResourceDesc(bufferSize, stagingDesc); + + D3D12Resource stageBuf; + SLANG_RETURN_NULL_ON_FAIL(stageBuf.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, stagingDesc, D3D12_RESOURCE_STATE_COPY_DEST, nullptr)); + + const D3D12_RESOURCE_STATES initialState = resource.getState(); + + // Make it a source + { + D3D12BarrierSubmitter submitter(m_commandList); + resource.transition(D3D12_RESOURCE_STATE_COPY_SOURCE, submitter); + } + // Do the copy + m_commandList->CopyBufferRegion(stageBuf, 0, resource, 0, bufferSize); + // Switch it back + { + D3D12BarrierSubmitter submitter(m_commandList); + resource.transition(initialState, submitter); + } + + // Wait until complete + submitGpuWorkAndWait(); + + // Map and copy + { + UINT8* data; + D3D12_RANGE readRange = { 0, bufferSize }; + + SLANG_RETURN_NULL_ON_FAIL(stageBuf.getResource()->Map(0, &readRange, reinterpret_cast(&data))); + + // Copy to memory buffer + buffer->m_memory.SetSize(bufferSize); + ::memcpy(buffer->m_memory.Buffer(), data, bufferSize); + + stageBuf.getResource()->Unmap(0, nullptr); + } + + return buffer->m_memory.Buffer(); + } + } + break; + } + case Style::MemoryBacked: + { + return buffer->m_memory.Buffer(); + } + default: return nullptr; + } + + return nullptr; +} + +void D3D12Renderer::unmap(BufferResource* bufferIn) +{ + typedef BufferResourceImpl::BackingStyle Style; + BufferResourceImpl* buffer = static_cast(bufferIn); + + switch (buffer->m_backingStyle) + { + case Style::MemoryBacked: + { + // Don't need to do anything, as will be uploaded 
automatically when used + break; + } + case Style::ResourceBacked: + { + // We need this in a state so we can upload + switch (buffer->m_mapFlavor) + { + case MapFlavor::HostWrite: + case MapFlavor::WriteDiscard: + { + // Unmap + ID3D12Resource* uploadResource = buffer->m_uploadResource; + ID3D12Resource* resource = buffer->m_resource; + + uploadResource->Unmap(0, nullptr); + + const D3D12_RESOURCE_STATES initialState = buffer->m_resource.getState(); + + { + D3D12BarrierSubmitter submitter(m_commandList); + buffer->m_uploadResource.transition(D3D12_RESOURCE_STATE_GENERIC_READ, submitter); + buffer->m_resource.transition(D3D12_RESOURCE_STATE_COPY_DEST, submitter); + } + + m_commandList->CopyBufferRegion(resource, 0, uploadResource, 0, buffer->getDesc().sizeInBytes); + + { + D3D12BarrierSubmitter submitter(m_commandList); + buffer->m_resource.transition(initialState, submitter); + } + break; + } + case MapFlavor::HostRead: + { + break; + } + } + } + } +} + +#if 0 +void D3D12Renderer::setInputLayout(InputLayout* inputLayout) +{ + m_boundInputLayout = static_cast(inputLayout); +} +#endif + +void D3D12Renderer::setPrimitiveTopology(PrimitiveTopology topology) +{ + switch (topology) + { + case PrimitiveTopology::TriangleList: + { + m_primitiveTopologyType = D3D12_PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE; + m_primitiveTopology = D3DUtil::getPrimitiveTopology(topology); + break; + } + default: + { + assert(!"Unhandled type"); + } + } +} + +void D3D12Renderer::setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) +{ + { + const UInt num = startSlot + slotCount; + if (num > m_boundVertexBuffers.Count()) + { + m_boundVertexBuffers.SetSize(num); + } + } + + for (UInt i = 0; i < slotCount; i++) + { + BufferResourceImpl* buffer = static_cast(buffers[i]); + if (buffer) + { + assert(buffer->m_initialUsage == Resource::Usage::VertexBuffer); + } + + BoundVertexBuffer& boundBuffer = m_boundVertexBuffers[startSlot + i]; + 
boundBuffer.m_buffer = buffer; + boundBuffer.m_stride = int(strides[i]); + boundBuffer.m_offset = int(offsets[i]); + } +} + +void D3D12Renderer::setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) +{ + m_boundIndexBuffer = (BufferResourceImpl*) buffer; + m_boundIndexFormat = D3DUtil::getMapFormat(indexFormat); + m_boundIndexOffset = offset; +} + +void D3D12Renderer::setDepthStencilTarget(ResourceView* depthStencilView) +{ +} + +void D3D12Renderer::setPipelineState(PipelineType pipelineType, PipelineState* state) +{ + m_currentPipelineState = (PipelineStateImpl*)state; +} + +void D3D12Renderer::draw(UInt vertexCount, UInt startVertex) +{ + ID3D12GraphicsCommandList* commandList = m_commandList; + + auto pipelineState = m_currentPipelineState.Ptr(); + if (!pipelineState || (pipelineState->m_pipelineType != PipelineType::Graphics)) + { + assert(!"No graphics pipeline state set"); + return; + } + + // Submit - setting for graphics + { + GraphicsSubmitter submitter(commandList); + _bindRenderState(pipelineState, commandList, &submitter); + } + + commandList->IASetPrimitiveTopology(m_primitiveTopology); + + // Set up vertex buffer views + { + int numVertexViews = 0; + D3D12_VERTEX_BUFFER_VIEW vertexViews[16]; + for (int i = 0; i < int(m_boundVertexBuffers.Count()); i++) + { + const BoundVertexBuffer& boundVertexBuffer = m_boundVertexBuffers[i]; + BufferResourceImpl* buffer = boundVertexBuffer.m_buffer; + if (buffer) + { + D3D12_VERTEX_BUFFER_VIEW& vertexView = vertexViews[numVertexViews++]; + vertexView.BufferLocation = buffer->m_resource.getResource()->GetGPUVirtualAddress() + + boundVertexBuffer.m_offset; + vertexView.SizeInBytes = buffer->getDesc().sizeInBytes - boundVertexBuffer.m_offset; + vertexView.StrideInBytes = boundVertexBuffer.m_stride; + } + } + commandList->IASetVertexBuffers(0, numVertexViews, vertexViews); + } + + // Set up index buffer + if(m_boundIndexBuffer) + { + D3D12_INDEX_BUFFER_VIEW indexBufferView; + 
indexBufferView.BufferLocation = m_boundIndexBuffer->m_resource.getResource()->GetGPUVirtualAddress() + + m_boundIndexOffset; + indexBufferView.SizeInBytes = m_boundIndexBuffer->getDesc().sizeInBytes - m_boundIndexOffset; + indexBufferView.Format = m_boundIndexFormat; + + commandList->IASetIndexBuffer(&indexBufferView); + } + + commandList->DrawInstanced(UINT(vertexCount), 1, UINT(startVertex), 0); +} + +void D3D12Renderer::drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) +{ +} + +void D3D12Renderer::dispatchCompute(int x, int y, int z) +{ + ID3D12GraphicsCommandList* commandList = m_commandList; + auto pipelineStateImpl = m_currentPipelineState; + + // Submit binding for compute + { + ComputeSubmitter submitter(commandList); + _bindRenderState(pipelineStateImpl, commandList, &submitter); + } + + commandList->Dispatch(x, y, z); +} + +#if 0 +BindingState* D3D12Renderer::createBindingState(const BindingState::Desc& bindingStateDesc) +{ + RefPtr bindingState(new BindingStateImpl(bindingStateDesc)); + + SLANG_RETURN_NULL_ON_FAIL(bindingState->init(m_device)); + + const auto& srcBindings = bindingStateDesc.m_bindings; + const int numBindings = int(srcBindings.Count()); + + auto& dstDetails = bindingState->m_bindingDetails; + dstDetails.SetSize(numBindings); + + for (int i = 0; i < numBindings; ++i) + { + const auto& srcEntry = srcBindings[i]; + auto& dstDetail = dstDetails[i]; + + const int bindingIndex = srcEntry.registerRange.getSingleIndex(); + + switch (srcEntry.bindingType) + { + case BindingType::Buffer: + { + assert(srcEntry.resource && srcEntry.resource->isBuffer()); + BufferResourceImpl* bufferResource = static_cast(srcEntry.resource.Ptr()); + const BufferResource::Desc& desc = bufferResource->getDesc(); + + const size_t bufferSize = bufferDesc.sizeInBytes; + const int elemSize = bufferDesc.elementSize <= 0 ? sizeof(uint32_t) : bufferDesc.elementSize; + + const bool createSrv = false; + + // NOTE! 
In this arrangement the buffer can either be a ConstantBuffer or a 'StorageBuffer'. + // If it's a storage buffer then it has a 'uav'. + // In neither circumstance is there an associated srv + // This departs a little from dx11 code - in that it will create srv and uav for a storage buffer. + if (bufferDesc.bindFlags & Resource::BindFlag::UnorderedAccess) + { + dstDetail.m_uavIndex = bindingState->m_viewHeap.allocate(); + if (dstDetail.m_uavIndex < 0) + { + return nullptr; + } + + D3D12_UNORDERED_ACCESS_VIEW_DESC uavDesc = {}; + + uavDesc.ViewDimension = D3D12_UAV_DIMENSION_BUFFER; + uavDesc.Format = D3DUtil::getMapFormat(bufferDesc.format); + + uavDesc.Buffer.StructureByteStride = elemSize; + + uavDesc.Buffer.FirstElement = 0; + uavDesc.Buffer.NumElements = (UINT)(bufferSize / elemSize); + uavDesc.Buffer.Flags = D3D12_BUFFER_UAV_FLAG_NONE; + + if (bufferDesc.elementSize == 0 && bufferDesc.format == Format::Unknown) + { + uavDesc.Buffer.Flags |= D3D12_BUFFER_UAV_FLAG_RAW; + uavDesc.Format = DXGI_FORMAT_R32_TYPELESS; + + uavDesc.Buffer.StructureByteStride = 0; + } + else if( bufferDesc.format != Format::Unknown ) + { + uavDesc.Buffer.StructureByteStride = 0; + } + + m_device->CreateUnorderedAccessView(bufferResource->m_resource, nullptr, &uavDesc, bindingState->m_viewHeap.getCpuHandle(dstDetail.m_uavIndex)); + } + if (createSrv && (bufferDesc.bindFlags & (Resource::BindFlag::NonPixelShaderResource | Resource::BindFlag::PixelShaderResource))) + { + dstDetail.m_srvIndex = bindingState->m_viewHeap.allocate(); + if (dstDetail.m_srvIndex < 0) + { + return nullptr; + } + + D3D12_SHADER_RESOURCE_VIEW_DESC srvDesc; + + srvDesc.ViewDimension = D3D12_SRV_DIMENSION_BUFFER; + srvDesc.Format = DXGI_FORMAT_UNKNOWN; + srvDesc.Shader4ComponentMapping = D3D12_DEFAULT_SHADER_4_COMPONENT_MAPPING; + + srvDesc.Buffer.FirstElement = 0; + srvDesc.Buffer.NumElements = (UINT)(bufferSize / elemSize); + srvDesc.Buffer.StructureByteStride = elemSize; + srvDesc.Buffer.Flags = 
D3D12_BUFFER_SRV_FLAG_NONE; + + if (bufferDesc.elementSize == 0) + { + srvDesc.Format = DXGI_FORMAT_R32_FLOAT; + } + + m_device->CreateShaderResourceView(bufferResource->m_resource, &srvDesc, bindingState->m_viewHeap.getCpuHandle(dstDetail.m_srvIndex)); + } + + break; + } + case BindingType::Texture: + { + assert(srcEntry.resource && srcEntry.resource->isTexture()); + + TextureResourceImpl* textureResource = static_cast(srcEntry.resource.Ptr()); + + dstDetail.m_srvIndex = bindingState->m_viewHeap.allocate(); + if (dstDetail.m_srvIndex < 0) + { + return nullptr; + } + + { + const D3D12_RESOURCE_DESC resourceDesc = textureResource->m_resource.getResource()->GetDesc(); + const DXGI_FORMAT pixelFormat = resourceDesc.Format; + + D3D12_SHADER_RESOURCE_VIEW_DESC srvDesc; + _initSrvDesc(textureResource->getType(), textureResource->getDesc(), resourceDesc, pixelFormat, srvDesc); + + // Create descriptor + m_device->CreateShaderResourceView(textureResource->m_resource, &srvDesc, bindingState->m_viewHeap.getCpuHandle(dstDetail.m_srvIndex)); + } + + break; + } + case BindingType::Sampler: + { + const BindingState::SamplerDesc& samplerDesc = bindingStateDesc.m_samplerDescs[srcEntry.descIndex]; + + const int samplerIndex = bindingIndex; + dstDetail.m_samplerIndex = samplerIndex; + bindingState->m_samplerHeap.placeAt(samplerIndex); + + D3D12_SAMPLER_DESC desc = {}; + desc.AddressU = desc.AddressV = desc.AddressW = D3D12_TEXTURE_ADDRESS_MODE_WRAP; + desc.ComparisonFunc = D3D12_COMPARISON_FUNC_ALWAYS; + + if (samplerDesc.isCompareSampler) + { + desc.ComparisonFunc = D3D12_COMPARISON_FUNC_LESS_EQUAL; + desc.Filter = D3D12_FILTER_MIN_LINEAR_MAG_MIP_POINT; + } + else + { + desc.Filter = D3D12_FILTER_ANISOTROPIC; + desc.MaxAnisotropy = 8; + desc.MinLOD = 0.0f; + desc.MaxLOD = 100.0f; + } + + m_device->CreateSampler(&desc, bindingState->m_samplerHeap.getCpuHandle(samplerIndex)); + + break; + } + case BindingType::CombinedTextureSampler: + { + assert(!"Not implemented"); + return 
nullptr; + } + } + } + + return bindingState.detach(); +} + +void D3D12Renderer::setBindingState(BindingState* state) +{ + m_boundBindingState = static_cast(state); +} +#endif + +void D3D12Renderer::DescriptorSetImpl::setConstantBuffer(UInt range, UInt index, BufferResource* buffer) +{ + auto dxDevice = m_renderer->m_device; + + + auto resourceImpl = (BufferResourceImpl*) buffer; + auto resourceDesc = resourceImpl->getDesc(); + + // Constant buffer view size must be a multiple of 256 bytes, so we round it up here. + const size_t alignedSizeInBytes = D3DUtil::calcAligned(resourceDesc.sizeInBytes, 256); + + D3D12_CONSTANT_BUFFER_VIEW_DESC cbvDesc = {}; + cbvDesc.BufferLocation = resourceImpl->m_resource.getResource()->GetGPUVirtualAddress(); + cbvDesc.SizeInBytes = alignedSizeInBytes; + + auto& rangeInfo = m_layout->m_ranges[range]; + +#ifdef _DEBUG + switch(rangeInfo.type) + { + default: + assert(!"incorrect slot type"); + break; + + case DescriptorSlotType::UniformBuffer: + case DescriptorSlotType::DynamicUniformBuffer: + break; + } +#endif + + auto arrayIndex = rangeInfo.arrayIndex + index; + auto descriptorIndex = m_resourceTable + arrayIndex; + + m_resourceObjects[arrayIndex] = resourceImpl; + dxDevice->CreateConstantBufferView( + &cbvDesc, + m_resourceHeap->getCpuHandle(descriptorIndex)); +} + +void D3D12Renderer::DescriptorSetImpl::setResource(UInt range, UInt index, ResourceView* view) +{ + auto dxDevice = m_renderer->m_device; + + auto viewImpl = (ResourceViewImpl*) view; + + auto& rangeInfo = m_layout->m_ranges[range]; + + // TODO: validation that slot type matches view + + auto arrayIndex = rangeInfo.arrayIndex + index; + auto descriptorIndex = m_resourceTable + arrayIndex; + + m_resourceObjects[arrayIndex] = viewImpl; + dxDevice->CopyDescriptorsSimple( + 1, + m_resourceHeap->getCpuHandle(descriptorIndex), + viewImpl->m_descriptor.cpuHandle, + D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV); +} + +void D3D12Renderer::DescriptorSetImpl::setSampler(UInt range, UInt 
index, SamplerState* sampler) +{ + auto dxDevice = m_renderer->m_device; + + auto samplerImpl = (SamplerStateImpl*) sampler; + + auto& rangeInfo = m_layout->m_ranges[range]; + +#ifdef _DEBUG + switch(rangeInfo.type) + { + default: + assert(!"incorrect slot type"); + break; + + case DescriptorSlotType::Sampler: + break; + } +#endif + + auto arrayIndex = rangeInfo.arrayIndex + index; + auto descriptorIndex = m_resourceTable + arrayIndex; + + m_samplerObjects[arrayIndex] = samplerImpl; + dxDevice->CopyDescriptorsSimple( + 1, + m_samplerHeap->getCpuHandle(descriptorIndex), + samplerImpl->m_cpuHandle, + D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER); +} + +void D3D12Renderer::DescriptorSetImpl::setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) +{ + auto dxDevice = m_renderer->m_device; + + auto viewImpl = (ResourceViewImpl*) textureView; + auto samplerImpl = (SamplerStateImpl*) sampler; + + auto& rangeInfo = m_layout->m_ranges[range]; + +#ifdef _DEBUG + switch(rangeInfo.type) + { + default: + assert(!"incorrect slot type"); + break; + + case DescriptorSlotType::CombinedImageSampler: + break; + } +#endif + + auto arrayIndex = rangeInfo.arrayIndex + index; + auto resourceDescriptorIndex = m_resourceTable + arrayIndex; + auto samplerDescriptorIndex = m_samplerTable + arrayIndex; + + m_resourceObjects[arrayIndex] = viewImpl; + dxDevice->CopyDescriptorsSimple( + 1, + m_resourceHeap->getCpuHandle(resourceDescriptorIndex), + viewImpl->m_descriptor.cpuHandle, + D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV); + + m_samplerObjects[arrayIndex] = samplerImpl; + dxDevice->CopyDescriptorsSimple( + 1, + m_samplerHeap->getCpuHandle(samplerDescriptorIndex), + samplerImpl->m_cpuHandle, + D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER); +} + +void D3D12Renderer::setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) +{ + // In D3D12, unlike Vulkan, binding a root signature invalidates *all* descriptor 
table + // bindings (rather than preserving those that are part of the longest common prefix + // between the old and new layout). + // + // In order to accommodate having descriptor-set bindings that persist across changes + // in pipeline state (which may also change pipeline layout), we will shadow the + // descriptor-set bindings and only flush them on-demand at draw time once the final + // pipeline layout is known. + // + + auto descriptorSetImpl = (DescriptorSetImpl*) descriptorSet; + m_boundDescriptorSets[int(pipelineType)][index] = descriptorSetImpl; +} + +Result D3D12Renderer::createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) +{ + RefPtr<ShaderProgramImpl> program(new ShaderProgramImpl()); + program->m_pipelineType = desc.pipelineType; + + if (desc.pipelineType == PipelineType::Compute) + { + auto computeKernel = desc.findKernel(StageType::Compute); + program->m_computeShader.InsertRange(0, (const uint8_t*) computeKernel->codeBegin, computeKernel->getCodeSize()); + } + else + { + auto vertexKernel = desc.findKernel(StageType::Vertex); + auto fragmentKernel = desc.findKernel(StageType::Fragment); + + program->m_vertexShader.InsertRange(0, (const uint8_t*) vertexKernel->codeBegin, vertexKernel->getCodeSize()); + program->m_pixelShader.InsertRange(0, (const uint8_t*) fragmentKernel->codeBegin, fragmentKernel->getCodeSize()); + } + + *outProgram = program.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) +{ + Int rangeCount = desc.slotRangeCount; + + // For our purposes, there are three main cases of descriptor ranges to consider: + // + // 1. Resources: CBV, SRV, UAV + // + // 2. Samplers + // + // 3. Combined texture/sampler pairs + // + // The combined case presents challenges, because we will implement + // them as both a resource slot and a sampler slot, and for convenience + // in the indexing logic, it would be nice if they "lined up."
+ // + // We will start by counting how many ranges, and how many + // descriptors, of each type we have. + // + + Int dedicatedResourceCount = 0; + Int dedicatedSamplerCount = 0; + Int combinedCount = 0; + + Int dedicatedResourceRangeCount = 0; + Int dedicatedSamplerRangeCount = 0; + Int combinedRangeCount = 0; + + for(Int rr = 0; rr < rangeCount; ++rr) + { + auto rangeDesc = desc.slotRanges[rr]; + switch(rangeDesc.type) + { + case DescriptorSlotType::Sampler: + dedicatedSamplerCount += rangeDesc.count; + dedicatedSamplerRangeCount++; + break; + + case DescriptorSlotType::CombinedImageSampler: + combinedCount += rangeDesc.count; + combinedRangeCount++; + break; + + default: + dedicatedResourceCount += rangeDesc.count; + dedicatedResourceRangeCount++; + break; + } + } + + // Now we know how many ranges we have to allocate space for, + // and also how they need to be arranged. + // + // Each "combined" range will map to two ranges in the D3D + // descriptor tables. + + RefPtr<DescriptorSetLayoutImpl> descriptorSetLayoutImpl = new DescriptorSetLayoutImpl(); + + // We know the total number of resource and sampler "slots" that an instance + // of this descriptor-set layout would need: + // + descriptorSetLayoutImpl->m_resourceCount = combinedCount + dedicatedResourceCount; + descriptorSetLayoutImpl->m_samplerCount = combinedCount + dedicatedSamplerCount; + + // We can start by allocating the D3D root parameter info needed for the + // descriptor set, based on the total number of ranges we need, which + // we can compute from the combined and dedicated counts: + // + Int totalResourceRangeCount = combinedRangeCount + dedicatedResourceRangeCount; + Int totalSamplerRangeCount = combinedRangeCount + dedicatedSamplerRangeCount; + + if( totalResourceRangeCount ) + { + D3D12_ROOT_PARAMETER dxRootParameter = {}; + dxRootParameter.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; + dxRootParameter.DescriptorTable.NumDescriptorRanges = totalResourceRangeCount; +
descriptorSetLayoutImpl->m_dxRootParameters.Add(dxRootParameter); + } + if( totalSamplerRangeCount ) + { + D3D12_ROOT_PARAMETER dxRootParameter = {}; + dxRootParameter.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; + dxRootParameter.DescriptorTable.NumDescriptorRanges = totalSamplerRangeCount; + descriptorSetLayoutImpl->m_dxRootParameters.Add(dxRootParameter); + } + + // Next we can allocate space for all the D3D register ranges we need, + // again based on totals that we can compute easily: + // + Int totalRangeCount = totalResourceRangeCount + totalSamplerRangeCount; + descriptorSetLayoutImpl->m_dxRanges.SetSize(totalRangeCount); + + // Now we will walk through the ranges in the order they were + // specified, so that we can fill in the "range info" required for + // binding parameters into descriptor sets allocated with this layout. + // + // This effectively determines the space required in two arrays + // in each descriptor set: one for resources, and one for samplers. + // A "combined" descriptor requires space in both arrays. The entries + // for "dedicated" samplers/resources always come after those for + // "combined" descriptors in the same array, so that a single index + // can be used for both arrays in the combined case. + // + + { + Int samplerCounter = 0; + Int resourceCounter = 0; + Int combinedCounter = 0; + for(Int rr = 0; rr < rangeCount; ++rr) + { + auto rangeDesc = desc.slotRanges[rr]; + + DescriptorSetLayoutImpl::RangeInfo rangeInfo; + + rangeInfo.type = rangeDesc.type; + rangeInfo.count = rangeDesc.count; + + switch(rangeDesc.type) + { + default: + // Default case is a dedicated resource, and its index in the + // resource array will come after all the combined entries. + rangeInfo.arrayIndex = combinedCount + resourceCounter; + resourceCounter += rangeInfo.count; + break; + + case DescriptorSlotType::Sampler: + // A dedicated sampler comes after all the entries for + // combined texture/samplers in the sampler array. 
+ rangeInfo.arrayIndex = combinedCount + samplerCounter; + samplerCounter += rangeInfo.count; + break; + + case DescriptorSlotType::CombinedImageSampler: + // Combined descriptors take entries at the front of + // the resource and sampler arrays. + rangeInfo.arrayIndex = combinedCounter; + combinedCounter += rangeInfo.count; + break; + } + + descriptorSetLayoutImpl->m_ranges.Add(rangeInfo); + } + } + + // Finally, we will go through and fill in ready-to-go D3D + // register range information. + { + UInt cbvCounter = 0; + UInt srvCounter = 0; + UInt uavCounter = 0; + UInt samplerCounter = 0; + + Int resourceRangeCounter = 0; + Int samplerRangeCounter = 0; + Int combinedRangeCounter = 0; + + for(Int rr = 0; rr < rangeCount; ++rr) + { + auto rangeDesc = desc.slotRanges[rr]; + Int bindingCount = rangeDesc.count; + + // All of these descriptor ranges will be initialized + // with a "space" of zero, with the assumption that + // the actual space number will come from when they are + // used as part of a pipeline layout. + // + Int bindingSpace = 0; + + Int dxRangeIndex = -1; + Int dxPairedSamplerRangeIndex = -1; + + switch(rangeDesc.type) + { + default: + // Default case is a dedicated resource, and its index in the + // resource array will come after all the combined entries. + dxRangeIndex = combinedRangeCount + resourceRangeCounter; + resourceRangeCounter++; + break; + + case DescriptorSlotType::Sampler: + // A dedicated sampler comes after all the entries for + // combined texture/samplers in the sampler array. + dxRangeIndex = totalResourceRangeCount + combinedRangeCount + samplerRangeCounter; + samplerRangeCounter++; + break; + + case DescriptorSlotType::CombinedImageSampler: + // Combined descriptors take entries at the front of + // the resource and sampler arrays. 
+ dxRangeIndex = combinedRangeCounter; + dxPairedSamplerRangeIndex = totalResourceRangeCount + combinedRangeCounter; + combinedRangeCounter++; + break; + } + + D3D12_DESCRIPTOR_RANGE& dxRange = descriptorSetLayoutImpl->m_dxRanges[dxRangeIndex]; + memset(&dxRange, 0, sizeof(dxRange)); + + switch(rangeDesc.type) + { + default: + // ERROR: unsupported slot type. + break; + + case DescriptorSlotType::Sampler: + { + UInt bindingIndex = samplerCounter; samplerCounter += bindingCount; + + dxRange.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SAMPLER; + dxRange.NumDescriptors = bindingCount; + dxRange.BaseShaderRegister = bindingIndex; + dxRange.RegisterSpace = bindingSpace; + dxRange.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + } + break; + + case DescriptorSlotType::SampledImage: + case DescriptorSlotType::UniformTexelBuffer: + { + UInt bindingIndex = srvCounter; srvCounter += bindingCount; + + dxRange.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SRV; + dxRange.NumDescriptors = bindingCount; + dxRange.BaseShaderRegister = bindingIndex; + dxRange.RegisterSpace = bindingSpace; + dxRange.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + } + break; + + case DescriptorSlotType::CombinedImageSampler: + { + // The combined texture/sampler case basically just + // does the work of both the SRV and sampler cases above. + + { + // Here's the SRV logic: + + UInt bindingIndex = srvCounter; srvCounter += bindingCount; + + dxRange.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SRV; + dxRange.NumDescriptors = bindingCount; + dxRange.BaseShaderRegister = bindingIndex; + dxRange.RegisterSpace = bindingSpace; + dxRange.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + } + + { + // And here we do the sampler logic at the "paired" index. 
+ D3D12_DESCRIPTOR_RANGE& dxPairedSamplerRange = descriptorSetLayoutImpl->m_dxRanges[dxPairedSamplerRangeIndex]; + memset(&dxPairedSamplerRange, 0, sizeof(dxPairedSamplerRange)); + + UInt pairedSamplerBindingIndex = samplerCounter; samplerCounter += bindingCount; + + dxPairedSamplerRange.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SAMPLER; + dxPairedSamplerRange.NumDescriptors = bindingCount; + dxPairedSamplerRange.BaseShaderRegister = pairedSamplerBindingIndex; + dxPairedSamplerRange.RegisterSpace = bindingSpace; + dxPairedSamplerRange.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + } + + } + break; + + + case DescriptorSlotType::InputAttachment: + case DescriptorSlotType::StorageImage: + case DescriptorSlotType::StorageTexelBuffer: + case DescriptorSlotType::StorageBuffer: + case DescriptorSlotType::DynamicStorageBuffer: + { + UInt bindingIndex = uavCounter; uavCounter += bindingCount; + + dxRange.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_UAV; + dxRange.NumDescriptors = bindingCount; + dxRange.BaseShaderRegister = bindingIndex; + dxRange.RegisterSpace = bindingSpace; + dxRange.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + } + break; + + case DescriptorSlotType::UniformBuffer: + case DescriptorSlotType::DynamicUniformBuffer: + { + UInt bindingIndex = cbvCounter; cbvCounter += bindingCount; + + dxRange.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_CBV; + dxRange.NumDescriptors = bindingCount; + dxRange.BaseShaderRegister = bindingIndex; + dxRange.RegisterSpace = bindingSpace; + dxRange.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; + } + break; + } + } + } + + *outLayout = descriptorSetLayoutImpl.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) +{ + static const UInt kMaxRanges = 16; + static const UInt kMaxRootParameters = 32; + + D3D12_DESCRIPTOR_RANGE ranges[kMaxRanges]; + D3D12_ROOT_PARAMETER
rootParameters[kMaxRootParameters]; + + UInt rangeCount = 0; + UInt rootParameterCount = 0; + + auto descriptorSetCount = desc.descriptorSetCount; + + // We are going to make two passes over the descriptor set layouts + // that are being used to build the pipeline layout. In the first + // pass we will collect all the descriptor ranges that have been + // specified, applying an offset to their register spaces as needed. + // + for(UInt dd = 0; dd < descriptorSetCount; ++dd) + { + auto& descriptorSetInfo = desc.descriptorSets[dd]; + auto descriptorSetLayout = (DescriptorSetLayoutImpl*) descriptorSetInfo.layout; + + // For now we assume that the register space used for + // logical descriptor set #N will be space N. + // + // TODO: This might need to be revisited in the future because + // a single logical descriptor set might need to encompass stuff + // that comes from multiple spaces (e.g., if it contains an unbounded + // array). + // + UInt bindingSpace = dd; + + // Copy descriptor range information from the set layout into our + // temporary copy (this is required because the same set layout + // might be applied to different ranges). + // + // API design note: this copy step could be avoided if the D3D + // API allowed for a "space offset" to be applied as part of + // a descriptor-table root parameter. + // + for(auto setDescriptorRange : descriptorSetLayout->m_dxRanges) + { + auto& range = ranges[rangeCount++]; + range = setDescriptorRange; + range.RegisterSpace = bindingSpace; + + // HACK: in order to deal with SM5.0 shaders, `u` registers + // in `space0` need to start with a number *after* the number + // of `SV_Target` outputs that will be used. + // + // TODO: This is clearly a mess, and doing this behavior here + // means it *won't* work for SM5.1 where the restriction is + // lifted.
The only real alternative is to rely on explicit + // register numbers (e.g., from shader reflection) but that + // goes against the simplicity that this API layer strives for + // (everything so far has been set up to work correctly with + // automatic assignment of bindings). + // + if( range.RegisterSpace == 0 + && range.RangeType == D3D12_DESCRIPTOR_RANGE_TYPE_UAV ) + { + range.BaseShaderRegister += desc.renderTargetCount; + } + } + } + + // In our second pass, we will copy over root parameters, which + // may end up pointing into the list of ranges from the first step. + // + auto rangePtr = &ranges[0]; + for(UInt dd = 0; dd < descriptorSetCount; ++dd) + { + auto& descriptorSetInfo = desc.descriptorSets[dd]; + auto descriptorSetLayout = (DescriptorSetLayoutImpl*) descriptorSetInfo.layout; + + // Copy root parameter information from the set layout to our + // overall pipeline layout. + for( auto setRootParameter : descriptorSetLayout->m_dxRootParameters ) + { + auto& rootParameter = rootParameters[rootParameterCount++]; + rootParameter = setRootParameter; + + // In the case where this parameter is a descriptor table, it + // needs to point into our array of ranges (with offsets applied), + // so we will fix up those pointers here. + // + if(rootParameter.ParameterType == D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE) + { + rootParameter.DescriptorTable.pDescriptorRanges = rangePtr; + rangePtr += rootParameter.DescriptorTable.NumDescriptorRanges; + } + } + } + + D3D12_ROOT_SIGNATURE_DESC rootSignatureDesc = {}; + rootSignatureDesc.NumParameters = rootParameterCount; + rootSignatureDesc.pParameters = rootParameters; + + // TODO: static samplers should be reasonably easy to support... + rootSignatureDesc.NumStaticSamplers = 0; + rootSignatureDesc.pStaticSamplers = nullptr; + + // TODO: only set this flag if needed (requires creating root + // signature at same time as pipeline state...). 
+ // + rootSignatureDesc.Flags = D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT; + + ComPtr<ID3DBlob> signature; + ComPtr<ID3DBlob> error; + if( SLANG_FAILED(m_D3D12SerializeRootSignature(&rootSignatureDesc, D3D_ROOT_SIGNATURE_VERSION_1, signature.writeRef(), error.writeRef())) ) + { + fprintf(stderr, "error: D3D12SerializeRootSignature failed"); + if( error ) + { + fprintf(stderr, ": %s\n", (const char*) error->GetBufferPointer()); + } + return SLANG_FAIL; + } + + ComPtr<ID3D12RootSignature> rootSignature; + SLANG_RETURN_ON_FAIL(m_device->CreateRootSignature(0, signature->GetBufferPointer(), signature->GetBufferSize(), IID_PPV_ARGS(rootSignature.writeRef()))); + + + RefPtr<PipelineLayoutImpl> pipelineLayoutImpl = new PipelineLayoutImpl(); + pipelineLayoutImpl->m_rootSignature = rootSignature; + pipelineLayoutImpl->m_descriptorSetCount = descriptorSetCount; + *outLayout = pipelineLayoutImpl.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) +{ + auto layoutImpl = (DescriptorSetLayoutImpl*) layout; + + RefPtr<DescriptorSetImpl> descriptorSetImpl = new DescriptorSetImpl(); + descriptorSetImpl->m_renderer = this; + descriptorSetImpl->m_layout = layoutImpl; + + // We allocate CPU-visible descriptor tables to provide the + // backing storage for each descriptor set. GPU-visible storage + // will only be allocated as needed during per-frame logic in + // order to ensure that a descriptor set is available for use + // in rendering.
+ // + Int resourceCount = layoutImpl->m_resourceCount; + if( resourceCount ) + { + auto resourceHeap = &m_cpuViewHeap; + descriptorSetImpl->m_resourceHeap = resourceHeap; + descriptorSetImpl->m_resourceTable = resourceHeap->allocate(resourceCount); + descriptorSetImpl->m_resourceObjects.SetSize(resourceCount); + } + + Int samplerCount = layoutImpl->m_samplerCount; + if( samplerCount ) + { + auto samplerHeap = &m_cpuSamplerHeap; + descriptorSetImpl->m_samplerHeap = samplerHeap; + descriptorSetImpl->m_samplerTable = samplerHeap->allocate(samplerCount); + descriptorSetImpl->m_samplerObjects.SetSize(samplerCount); + } + + *outDescriptorSet = descriptorSetImpl.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) +{ + auto pipelineLayoutImpl = (PipelineLayoutImpl*) desc.pipelineLayout; + auto programImpl = (ShaderProgramImpl*) desc.program; + auto inputLayoutImpl = (InputLayoutImpl*) desc.inputLayout; + + // Describe and create the graphics pipeline state object (PSO) + D3D12_GRAPHICS_PIPELINE_STATE_DESC psoDesc = {}; + + psoDesc.pRootSignature = pipelineLayoutImpl->m_rootSignature; + + psoDesc.VS = { programImpl->m_vertexShader.Buffer(), programImpl->m_vertexShader.Count() }; + psoDesc.PS = { programImpl->m_pixelShader .Buffer(), programImpl->m_pixelShader .Count() }; + + psoDesc.InputLayout = { inputLayoutImpl->m_elements.Buffer(), UINT(inputLayoutImpl->m_elements.Count()) }; + psoDesc.PrimitiveTopologyType = m_primitiveTopologyType; + + { + const int numRenderTargets = desc.renderTargetCount; + + psoDesc.DSVFormat = m_depthStencilFormat; + psoDesc.NumRenderTargets = numRenderTargets; + for (Int i = 0; i < numRenderTargets; i++) + { + psoDesc.RTVFormats[i] = m_targetFormat; + } + + psoDesc.SampleDesc.Count = 1; + psoDesc.SampleDesc.Quality = 0; + + psoDesc.SampleMask = UINT_MAX; + } + + { + auto& rs = psoDesc.RasterizerState; + rs.FillMode = D3D12_FILL_MODE_SOLID; + 
rs.CullMode = D3D12_CULL_MODE_NONE; + rs.FrontCounterClockwise = FALSE; + rs.DepthBias = D3D12_DEFAULT_DEPTH_BIAS; + rs.DepthBiasClamp = D3D12_DEFAULT_DEPTH_BIAS_CLAMP; + rs.SlopeScaledDepthBias = D3D12_DEFAULT_SLOPE_SCALED_DEPTH_BIAS; + rs.DepthClipEnable = TRUE; + rs.MultisampleEnable = FALSE; + rs.AntialiasedLineEnable = FALSE; + rs.ForcedSampleCount = 0; + rs.ConservativeRaster = D3D12_CONSERVATIVE_RASTERIZATION_MODE_OFF; + } + + { + D3D12_BLEND_DESC& blend = psoDesc.BlendState; + + blend.AlphaToCoverageEnable = FALSE; + blend.IndependentBlendEnable = FALSE; + const D3D12_RENDER_TARGET_BLEND_DESC defaultRenderTargetBlendDesc = + { + FALSE,FALSE, + D3D12_BLEND_ONE, D3D12_BLEND_ZERO, D3D12_BLEND_OP_ADD, + D3D12_BLEND_ONE, D3D12_BLEND_ZERO, D3D12_BLEND_OP_ADD, + D3D12_LOGIC_OP_NOOP, + D3D12_COLOR_WRITE_ENABLE_ALL, + }; + for (UINT i = 0; i < D3D12_SIMULTANEOUS_RENDER_TARGET_COUNT; ++i) + { + blend.RenderTarget[i] = defaultRenderTargetBlendDesc; + } + } + + { + auto& ds = psoDesc.DepthStencilState; + + ds.DepthEnable = FALSE; + ds.DepthWriteMask = D3D12_DEPTH_WRITE_MASK_ALL; + ds.DepthFunc = D3D12_COMPARISON_FUNC_ALWAYS; + //ds.DepthFunc = D3D12_COMPARISON_FUNC_LESS; + ds.StencilEnable = FALSE; + ds.StencilReadMask = D3D12_DEFAULT_STENCIL_READ_MASK; + ds.StencilWriteMask = D3D12_DEFAULT_STENCIL_WRITE_MASK; + const D3D12_DEPTH_STENCILOP_DESC defaultStencilOp = + { + D3D12_STENCIL_OP_KEEP, D3D12_STENCIL_OP_KEEP, D3D12_STENCIL_OP_KEEP, D3D12_COMPARISON_FUNC_ALWAYS + }; + ds.FrontFace = defaultStencilOp; + ds.BackFace = defaultStencilOp; + } + + psoDesc.PrimitiveTopologyType = m_primitiveTopologyType; + + ComPtr pipelineState; + SLANG_RETURN_ON_FAIL(m_device->CreateGraphicsPipelineState(&psoDesc, IID_PPV_ARGS(pipelineState.writeRef()))); + + RefPtr pipelineStateImpl = new PipelineStateImpl(); + pipelineStateImpl->m_pipelineType = PipelineType::Graphics; + pipelineStateImpl->m_pipelineLayout = pipelineLayoutImpl; + pipelineStateImpl->m_pipelineState = pipelineState; + 
*outState = pipelineStateImpl.detach(); + return SLANG_OK; +} + +Result D3D12Renderer::createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) +{ + auto pipelineLayoutImpl = (PipelineLayoutImpl*) desc.pipelineLayout; + auto programImpl = (ShaderProgramImpl*) desc.program; + + // Describe and create the compute pipeline state object + D3D12_COMPUTE_PIPELINE_STATE_DESC computeDesc = {}; + computeDesc.pRootSignature = pipelineLayoutImpl->m_rootSignature; + computeDesc.CS = { programImpl->m_computeShader.Buffer(), programImpl->m_computeShader.Count() }; + + ComPtr pipelineState; + SLANG_RETURN_ON_FAIL(m_device->CreateComputePipelineState(&computeDesc, IID_PPV_ARGS(pipelineState.writeRef()))); + + RefPtr pipelineStateImpl = new PipelineStateImpl(); + pipelineStateImpl->m_pipelineType = PipelineType::Compute; + pipelineStateImpl->m_pipelineLayout = pipelineLayoutImpl; + pipelineStateImpl->m_pipelineState = pipelineState; + *outState = pipelineStateImpl.detach(); + return SLANG_OK; +} + +} // renderer_test diff --git a/tools/gfx/render-d3d12.h b/tools/gfx/render-d3d12.h new file mode 100644 index 000000000..b8a3104c0 --- /dev/null +++ b/tools/gfx/render-d3d12.h @@ -0,0 +1,10 @@ +// render-d3d12.h +#pragma once + +namespace gfx { + +class Renderer; + +Renderer* createD3D12Renderer(); + +} // gfx diff --git a/tools/gfx/render-gl.cpp b/tools/gfx/render-gl.cpp new file mode 100644 index 000000000..3ab818fdd --- /dev/null +++ b/tools/gfx/render-gl.cpp @@ -0,0 +1,1426 @@ +// render-gl.cpp +#include "render-gl.h" + +//WORKING:#include "options.h" +#include "render.h" + +#include +#include +#include +#include "core/basic.h" +#include "core/secure-crt.h" +#include "external/stb/stb_image_write.h" + +#include "surface.h" + +// TODO(tfoley): eventually we should be able to run these +// tests on non-Windows targets to confirm that cross-compilation +// at least *works* on those platforms... 
+#define WIN32_LEAN_AND_MEAN +#define NOMINMAX +#include +#undef WIN32_LEAN_AND_MEAN +#undef NOMINMAX + +#ifdef _MSC_VER +#include +#if (_MSC_VER < 1900) +#define snprintf sprintf_s +#endif +#endif + +#pragma comment(lib, "opengl32") + +#include +#include "external/glext.h" + +// We define an "X-macro" for mapping over loadable OpenGL +// extension entry point that we will use, so that we can +// easily write generic code to iterate over them. +#define MAP_GL_EXTENSION_FUNCS(F) \ + F(glCreateProgram, PFNGLCREATEPROGRAMPROC) \ + F(glCreateShader, PFNGLCREATESHADERPROC) \ + F(glShaderSource, PFNGLSHADERSOURCEPROC) \ + F(glCompileShader, PFNGLCOMPILESHADERPROC) \ + F(glGetShaderiv, PFNGLGETSHADERIVPROC) \ + F(glDeleteShader, PFNGLDELETESHADERPROC) \ + F(glAttachShader, PFNGLATTACHSHADERPROC) \ + F(glLinkProgram, PFNGLLINKPROGRAMPROC) \ + F(glGetProgramiv, PFNGLGETPROGRAMIVPROC) \ + F(glGetProgramInfoLog, PFNGLGETPROGRAMINFOLOGPROC) \ + F(glDeleteProgram, PFNGLDELETEPROGRAMPROC) \ + F(glGetShaderInfoLog, PFNGLGETSHADERINFOLOGPROC) \ + F(glGenBuffers, PFNGLGENBUFFERSPROC) \ + F(glBindBuffer, PFNGLBINDBUFFERPROC) \ + F(glBufferData, PFNGLBUFFERDATAPROC) \ + F(glDeleteBuffers, PFNGLDELETEBUFFERSPROC) \ + F(glMapBuffer, PFNGLMAPBUFFERPROC) \ + F(glUnmapBuffer, PFNGLUNMAPBUFFERPROC) \ + F(glUseProgram, PFNGLUSEPROGRAMPROC) \ + F(glBindBufferBase, PFNGLBINDBUFFERBASEPROC) \ + F(glVertexAttribPointer, PFNGLVERTEXATTRIBPOINTERPROC) \ + F(glEnableVertexAttribArray, PFNGLENABLEVERTEXATTRIBARRAYPROC) \ + F(glDisableVertexAttribArray, PFNGLDISABLEVERTEXATTRIBARRAYPROC) \ + F(glDebugMessageCallback, PFNGLDEBUGMESSAGECALLBACKPROC) \ + F(glDispatchCompute, PFNGLDISPATCHCOMPUTEPROC) \ + F(glActiveTexture, PFNGLACTIVETEXTUREPROC) \ + F(glCreateSamplers, PFNGLCREATESAMPLERSPROC) \ + F(glDeleteSamplers, PFNGLDELETESAMPLERSPROC) \ + F(glBindSampler, PFNGLBINDSAMPLERPROC) \ + F(glTexImage3D, PFNGLTEXIMAGE3DPROC) \ + F(glSamplerParameteri, PFNGLSAMPLERPARAMETERIPROC) \ + /* end */ + +using 
namespace Slang; + +namespace gfx { + +class GLRenderer : public Renderer +{ +public: + + // Renderer implementation + virtual SlangResult initialize(const Desc& desc, void* inWindowHandle) override; + virtual void setClearColor(const float color[4]) override; + virtual void clearFrame() override; + virtual void presentFrame() override; + TextureResource::Desc getSwapChainTextureDesc() override; + + Result createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData, TextureResource** outResource) override; + Result createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& desc, const void* initData, BufferResource** outResource) override; + Result createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) override; + + Result createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) override; + Result createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) override; + + Result createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount, InputLayout** outLayout) override; + + Result createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) override; + Result createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) override; + Result createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) override; + + Result createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) override; + Result createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) override; + Result createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) override; + + virtual SlangResult captureScreenSurface(Surface& surfaceOut) override; + + virtual void* map(BufferResource* buffer, MapFlavor flavor) override; + virtual void 
unmap(BufferResource* buffer) override; + virtual void setPrimitiveTopology(PrimitiveTopology topology) override; + + virtual void setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) override; + + virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) override; + virtual void setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) override; + virtual void setDepthStencilTarget(ResourceView* depthStencilView) override; + virtual void setPipelineState(PipelineType pipelineType, PipelineState* state) override; + virtual void draw(UInt vertexCount, UInt startVertex) override; + virtual void drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) override; + virtual void dispatchCompute(int x, int y, int z) override; + virtual void submitGpuWork() override {} + virtual void waitForGpu() override {} + virtual RendererType getRendererType() const override { return RendererType::OpenGl; } + + protected: + enum + { + kMaxVertexStreams = 16, + kMaxDescriptorSetCount = 8, + }; + + struct VertexAttributeFormat + { + GLint componentCount; + GLenum componentType; + GLboolean normalized; + }; + + struct VertexAttributeDesc + { + VertexAttributeFormat format; + GLuint streamIndex; + GLsizei offset; + }; + + class InputLayoutImpl: public InputLayout + { + public: + VertexAttributeDesc m_attributes[kMaxVertexStreams]; + UInt m_attributeCount = 0; + }; + + class BufferResourceImpl: public BufferResource + { + public: + typedef BufferResource Parent; + + BufferResourceImpl(Usage initialUsage, const Desc& desc, GLRenderer* renderer, GLuint id, GLenum target): + Parent(desc), + m_renderer(renderer), + m_handle(id), + m_initialUsage(initialUsage), + m_target(target) + {} + ~BufferResourceImpl() + { + if (m_renderer) + { + m_renderer->glDeleteBuffers(1, &m_handle); + } + } + + Usage m_initialUsage; + GLRenderer* m_renderer; + 
GLuint m_handle; + GLenum m_target; + }; + + class TextureResourceImpl: public TextureResource + { + public: + typedef TextureResource Parent; + + TextureResourceImpl(Usage initialUsage, const Desc& desc, GLRenderer* renderer): + Parent(desc), + m_initialUsage(initialUsage), + m_renderer(renderer) + { + m_target = 0; + m_handle = 0; + } + + ~TextureResourceImpl() + { + if (m_handle) + { + glDeleteTextures(1, &m_handle); + } + } + + Usage m_initialUsage; + GLRenderer* m_renderer; + GLenum m_target; + GLuint m_handle; + }; + + class SamplerStateImpl : public SamplerState + { + public: + GLuint m_samplerID; + }; + + class ResourceViewImpl : public ResourceView + { + }; + + class TextureViewImpl : public ResourceViewImpl + { + public: + RefPtr m_resource; + GLuint m_textureID; + }; + + class BufferViewImpl : public ResourceViewImpl + { + public: + RefPtr m_resource; + GLuint m_bufferID; + }; + + enum class GLDescriptorSlotType + { + ConstantBuffer, + CombinedTextureSampler, + + CountOf, + }; + + class DescriptorSetLayoutImpl : public DescriptorSetLayout + { + public: + struct RangeInfo + { + GLDescriptorSlotType type; + UInt arrayIndex; + }; + List m_ranges; + Int m_counts[int(GLDescriptorSlotType::CountOf)]; + }; + + class PipelineLayoutImpl : public PipelineLayout + { + public: + struct DescriptorSetInfo + { + RefPtr layout; + UInt baseArrayIndex[int(GLDescriptorSlotType::CountOf)]; + }; + + List m_sets; + }; + + class DescriptorSetImpl : public DescriptorSet + { + public: + virtual void setConstantBuffer(UInt range, UInt index, BufferResource* buffer) override; + virtual void setResource(UInt range, UInt index, ResourceView* view) override; + virtual void setSampler(UInt range, UInt index, SamplerState* sampler) override; + virtual void setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) override; + + RefPtr m_layout; + List> m_constantBuffers; + List> m_textures; + List> m_samplers; + }; + + class 
ShaderProgramImpl : public ShaderProgram + { + public: + ShaderProgramImpl(GLRenderer* renderer, GLuint id): + m_renderer(renderer), + m_id(id) + { + } + ~ShaderProgramImpl() + { + if (m_renderer) + { + m_renderer->glDeleteProgram(m_id); + } + } + + GLuint m_id; + GLRenderer* m_renderer; + }; + + class PipelineStateImpl : public PipelineState + { + public: + RefPtr m_program; + RefPtr m_pipelineLayout; + RefPtr m_inputLayout; + }; + + enum class GlPixelFormat + { + Unknown, + RGBA_Unorm_UInt8, + CountOf, + }; + + struct GlPixelFormatInfo + { + GLint internalFormat; // such as GL_RGBA8 + GLenum format; // such as GL_RGBA + GLenum formatType; // such as GL_UNSIGNED_BYTE + }; + +// void destroyBindingEntries(const BindingState::Desc& desc, const BindingDetail* details); + + void bindBufferImpl(int target, UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* offsets); + void flushStateForDraw(); + GLuint loadShader(GLenum stage, char const* source); + void debugCallback(GLenum source, GLenum type, GLuint id, GLenum severity, GLsizei length, const GLchar* message); + + /// Returns GlPixelFormat::Unknown if not an equivalent + static GlPixelFormat _getGlPixelFormat(Format format); + + static void APIENTRY staticDebugCallback(GLenum source, GLenum type, GLuint id, GLenum severity, GLsizei length, const GLchar* message, const void* userParam); + static VertexAttributeFormat getVertexAttributeFormat(Format format); + + static void compileTimeAsserts(); + + HDC m_hdc; + HGLRC m_glContext; + float m_clearColor[4] = { 0, 0, 0, 0 }; + + RefPtr m_currentPipelineState; +// RefPtr m_boundShaderProgram; +// RefPtr m_boundInputLayout; + + RefPtr m_boundDescriptorSets[kMaxDescriptorSetCount]; + + GLenum m_boundPrimitiveTopology = GL_TRIANGLES; + GLuint m_boundVertexStreamBuffers[kMaxVertexStreams]; + UInt m_boundVertexStreamStrides[kMaxVertexStreams]; + UInt m_boundVertexStreamOffsets[kMaxVertexStreams]; + + Desc m_desc; + + // Declare a function pointer for 
each OpenGL + // extension function we need to load +#define DECLARE_GL_EXTENSION_FUNC(NAME, TYPE) TYPE NAME; + MAP_GL_EXTENSION_FUNCS(DECLARE_GL_EXTENSION_FUNC) +#undef DECLARE_GL_EXTENSION_FUNC + + static const GlPixelFormatInfo s_pixelFormatInfos[]; /// Maps GlPixelFormat to a format info +}; + +/* static */GLRenderer::GlPixelFormat GLRenderer::_getGlPixelFormat(Format format) +{ + switch (format) + { + case Format::RGBA_Unorm_UInt8: return GlPixelFormat::RGBA_Unorm_UInt8; + default: return GlPixelFormat::Unknown; + } +} + +/* static */ const GLRenderer::GlPixelFormatInfo GLRenderer::s_pixelFormatInfos[] = +{ + // internalType, format, formatType + { 0, 0, 0}, // GlPixelFormat::Unknown + { GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE }, // GlPixelFormat::RGBA_Unorm_UInt8 +}; + +/* static */void GLRenderer::compileTimeAsserts() +{ + SLANG_COMPILE_TIME_ASSERT(SLANG_COUNT_OF(s_pixelFormatInfos) == int(GlPixelFormat::CountOf)); +} + +Renderer* createGLRenderer() +{ + return new GLRenderer(); +} + +void GLRenderer::debugCallback(GLenum source, GLenum type, GLuint id, GLenum severity, GLsizei length, const GLchar* message) +{ + ::OutputDebugStringA("GL: "); + ::OutputDebugStringA(message); + ::OutputDebugStringA("\n"); + + switch (type) + { + case GL_DEBUG_TYPE_ERROR: + break; + default: + break; + } +} + +/* static */void APIENTRY GLRenderer::staticDebugCallback(GLenum source, GLenum type, GLuint id, GLenum severity, GLsizei length, const GLchar* message, const void* userParam) +{ + ((GLRenderer*)userParam)->debugCallback(source, type, id, severity, length, message); +} + +/* static */GLRenderer::VertexAttributeFormat GLRenderer::getVertexAttributeFormat(Format format) +{ + switch (format) + { + default: assert(!"unexpected"); return VertexAttributeFormat(); + +#define CASE(NAME, COUNT, TYPE, NORMALIZED) \ + case Format::NAME: do { VertexAttributeFormat result = {COUNT, TYPE, NORMALIZED}; return result; } while (0) + + CASE(RGBA_Float32, 4, GL_FLOAT, GL_FALSE); + 
CASE(RGB_Float32, 3, GL_FLOAT, GL_FALSE); + CASE(RG_Float32, 2, GL_FLOAT, GL_FALSE); + CASE(R_Float32, 1, GL_FLOAT, GL_FALSE); +#undef CASE + } +} + +void GLRenderer::bindBufferImpl(int target, UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* offsets) +{ + for (UInt ii = 0; ii < slotCount; ++ii) + { + UInt slot = startSlot + ii; + + BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(buffers[ii]); + GLuint bufferID = buffer ? buffer->m_handle : 0; + + assert(!offsets || !offsets[ii]); + + glBindBufferBase(target, (GLuint)slot, bufferID); + } +} + +void GLRenderer::flushStateForDraw() +{ + auto inputLayout = m_currentPipelineState->m_inputLayout.Ptr(); + auto attrCount = inputLayout->m_attributeCount; + for (UInt ii = 0; ii < attrCount; ++ii) + { + auto& attr = inputLayout->m_attributes[ii]; + + auto streamIndex = attr.streamIndex; + + glBindBuffer(GL_ARRAY_BUFFER, m_boundVertexStreamBuffers[streamIndex]); + + glVertexAttribPointer( + (GLuint)ii, + attr.format.componentCount, + attr.format.componentType, + attr.format.normalized, + (GLsizei)m_boundVertexStreamStrides[streamIndex], + (GLvoid*)(attr.offset + m_boundVertexStreamOffsets[streamIndex])); + + glEnableVertexAttribArray((GLuint)ii); + } + for (UInt ii = attrCount; ii < kMaxVertexStreams; ++ii) + { + glDisableVertexAttribArray((GLuint)ii); + } + + // Next bind the descriptor sets as required by the layout + auto pipelineLayout = m_currentPipelineState->m_pipelineLayout; + auto descriptorSetCount = pipelineLayout->m_sets.Count(); + for(UInt ii = 0; ii < descriptorSetCount; ++ii) + { + auto descriptorSet = m_boundDescriptorSets[ii]; + auto descriptorSetInfo = pipelineLayout->m_sets[ii]; + auto descriptorSetLayout = descriptorSetInfo.layout; + + // TODO: need to validate that `descriptorSet->m_layout` matches + // `descriptorSetLayout`. + + { + // First we will bind any uniform buffers that were specified.
+ + auto slotTypeIndex = int(GLDescriptorSlotType::ConstantBuffer); + auto count = descriptorSetLayout->m_counts[slotTypeIndex]; + auto baseIndex = descriptorSetInfo.baseArrayIndex[slotTypeIndex]; + + for(Int ii = 0; ii < count; ++ii) + { + auto bufferImpl = descriptorSet->m_constantBuffers[ii]; + glBindBufferBase(GL_UNIFORM_BUFFER, (GLuint)(baseIndex + ii), bufferImpl->m_handle); + } + } + + + { + // Next we will bind any combined texture/sampler slots. + + auto slotTypeIndex = int(GLDescriptorSlotType::CombinedTextureSampler); + auto count = descriptorSetLayout->m_counts[slotTypeIndex]; + auto baseIndex = descriptorSetInfo.baseArrayIndex[slotTypeIndex]; + + // TODO: We should be able to use a single call to glBindTextures here, + // rather than a loop. This would also eliminate the need to retain + // the appropriate target (e.g., `GL_TEXTURE_2D` for binding). + + for(Int ii = 0; ii < count; ++ii) + { + auto textureViewImpl = descriptorSet->m_textures[ii]; + auto samplerImpl = descriptorSet->m_samplers[ii]; + + glActiveTexture(GL_TEXTURE0 + (GLenum)(baseIndex + ii)); + glBindTexture(GL_TEXTURE_2D, textureViewImpl->m_textureID); + + glBindSampler((GLuint)(baseIndex + ii), samplerImpl->m_samplerID); + } + } + } +} + +GLuint GLRenderer::loadShader(GLenum stage, const char* source) +{ + // GLSL is monumentally stupid. It officially requires the `#version` directive + // to be the first thing in the file, which wouldn't be so bad but the API + // doesn't provide a way to pass a `#define` into your shader other than by + // prepending it to the whole thing. + // + // We are going to solve this problem by doing some surgery on the source + // that was passed in. + + const char* sourceBegin = source; + const char* sourceEnd = source + strlen(source); + + // Look for a version directive in the user-provided source.
+ const char* versionBegin = strstr(source, "#version"); + const char* versionEnd = nullptr; + if (versionBegin) + { + // If we found a directive, then scan for the end-of-line + // after it, and use that to specify the slice. + versionEnd = strchr(versionBegin, '\n'); + if (!versionEnd) + { + versionEnd = sourceEnd; + } + else + { + versionEnd = versionEnd + 1; + } + } + else + { + // If we didn't find a directive, then treat it as being + // a zero-byte slice at the start of the string + versionBegin = sourceBegin; + versionEnd = sourceBegin; + } + + enum { kMaxSourceStringCount = 16 }; + const GLchar* sourceStrings[kMaxSourceStringCount]; + GLint sourceStringLengths[kMaxSourceStringCount]; + + int sourceStringCount = 0; + + const char* stagePrelude = "\n"; + switch (stage) + { +#define CASE(NAME) case GL_##NAME##_SHADER: stagePrelude = "#define __GLSL_" #NAME "__ 1\n"; break + + CASE(VERTEX); + CASE(TESS_CONTROL); + CASE(TESS_EVALUATION); + CASE(GEOMETRY); + CASE(FRAGMENT); + CASE(COMPUTE); + +#undef CASE + } + + const char* prelude = + "#define __GLSL__ 1\n" + ; + +#define ADD_SOURCE_STRING_SPAN(BEGIN, END) \ + sourceStrings[sourceStringCount] = BEGIN; \ + sourceStringLengths[sourceStringCount++] = GLint(END - BEGIN) \ + /* end */ + +#define ADD_SOURCE_STRING(BEGIN) \ + sourceStrings[sourceStringCount] = BEGIN; \ + sourceStringLengths[sourceStringCount++] = GLint(strlen(BEGIN)) \ + /* end */ + + ADD_SOURCE_STRING_SPAN(versionBegin, versionEnd); + ADD_SOURCE_STRING(stagePrelude); + ADD_SOURCE_STRING(prelude); + ADD_SOURCE_STRING_SPAN(sourceBegin, versionBegin); + ADD_SOURCE_STRING_SPAN(versionEnd, sourceEnd); + + auto shaderID = glCreateShader(stage); + glShaderSource( + shaderID, + sourceStringCount, + &sourceStrings[0], + &sourceStringLengths[0]); + glCompileShader(shaderID); + + GLint success = GL_FALSE; + glGetShaderiv(shaderID, GL_COMPILE_STATUS, &success); + if (!success) + { + int maxSize = 0; + glGetShaderiv(shaderID, GL_INFO_LOG_LENGTH, &maxSize); + + 
auto infoBuffer = (char*)malloc(maxSize); + + int infoSize = 0; + glGetShaderInfoLog(shaderID, maxSize, &infoSize, infoBuffer); + if (infoSize > 0) + { + fprintf(stderr, "%s", infoBuffer); + ::OutputDebugStringA(infoBuffer); + } + + glDeleteShader(shaderID); + return 0; + } + + return shaderID; +} + +#if 0 +void GLRenderer::destroyBindingEntries(const BindingState::Desc& desc, const BindingDetail* details) +{ + const auto& bindings = desc.m_bindings; + const int numBindings = int(bindings.Count()); + for (int i = 0; i < numBindings; ++i) + { + const auto& binding = bindings[i]; + const auto& detail = details[i]; + + if (binding.bindingType == BindingType::Sampler && detail.m_samplerHandle != 0) + { + glDeleteSamplers(1, &detail.m_samplerHandle); + } + } +} +#endif + +// !!!!!!!!!!!!!!!!!!!!!!!!!!!! Renderer interface !!!!!!!!!!!!!!!!!!!!!!!!!! + +SlangResult GLRenderer::initialize(const Desc& desc, void* inWindowHandle) +{ + auto windowHandle = (HWND)inWindowHandle; + m_desc = desc; + + m_hdc = ::GetDC(windowHandle); + + PIXELFORMATDESCRIPTOR pixelFormatDesc = { sizeof(PIXELFORMATDESCRIPTOR) }; + pixelFormatDesc.nVersion = 1; + pixelFormatDesc.dwFlags = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL | PFD_DOUBLEBUFFER; + pixelFormatDesc.iPixelType = PFD_TYPE_RGBA; + pixelFormatDesc.cColorBits = 32; + pixelFormatDesc.cDepthBits = 24; + pixelFormatDesc.cStencilBits = 8; + pixelFormatDesc.iLayerType = PFD_MAIN_PLANE; + + int pixelFormatIndex = ChoosePixelFormat(m_hdc, &pixelFormatDesc); + SetPixelFormat(m_hdc, pixelFormatIndex, &pixelFormatDesc); + + m_glContext = wglCreateContext(m_hdc); + wglMakeCurrent(m_hdc, m_glContext); + + auto renderer = glGetString(GL_RENDERER); + auto extensions = glGetString(GL_EXTENSIONS); + + // Load each of our extension functions by name + +#define LOAD_GL_EXTENSION_FUNC(NAME, TYPE) NAME = (TYPE) wglGetProcAddress(#NAME); + MAP_GL_EXTENSION_FUNCS(LOAD_GL_EXTENSION_FUNC) +#undef LOAD_GL_EXTENSION_FUNC + + glDisable(GL_DEPTH_TEST); + 
glDisable(GL_CULL_FACE); + + glViewport(0, 0, desc.width, desc.height); + + if (glDebugMessageCallback) + { + glEnable(GL_DEBUG_OUTPUT); + glDebugMessageCallback(staticDebugCallback, this); + } + + return SLANG_OK; +} + +void GLRenderer::setClearColor(const float color[4]) +{ + glClearColor(color[0], color[1], color[2], color[3]); +} + +void GLRenderer::clearFrame() +{ + glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT); +} + +void GLRenderer::presentFrame() +{ + glFlush(); + ::SwapBuffers(m_hdc); +} + +TextureResource::Desc GLRenderer::getSwapChainTextureDesc() +{ + TextureResource::Desc desc; + desc.init2D(Resource::Type::Texture2D, Format::Unknown, m_desc.width, m_desc.height, 1); + return desc; +} + +SlangResult GLRenderer::captureScreenSurface(Surface& surfaceOut) +{ + SLANG_RETURN_ON_FAIL(surfaceOut.allocate(m_desc.width, m_desc.height, Format::RGBA_Unorm_UInt8, 1, SurfaceAllocator::getMallocAllocator())); + glReadPixels(0, 0, m_desc.width, m_desc.height, GL_RGBA, GL_UNSIGNED_BYTE, surfaceOut.m_data); + surfaceOut.flipInplaceVertically(); + return SLANG_OK; +} + +Result GLRenderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData, TextureResource** outResource) +{ + TextureResource::Desc srcDesc(descIn); + srcDesc.setDefaults(initialUsage); + + GlPixelFormat pixelFormat = _getGlPixelFormat(srcDesc.format); + if (pixelFormat == GlPixelFormat::Unknown) + { + return SLANG_FAIL; + } + + const GlPixelFormatInfo& info = s_pixelFormatInfos[int(pixelFormat)]; + + const GLint internalFormat = info.internalFormat; + const GLenum format = info.format; + const GLenum formatType = info.formatType; + + RefPtr texture(new TextureResourceImpl(initialUsage, srcDesc, this)); + + GLenum target = 0; + GLuint handle = 0; + glGenTextures(1, &handle); + + const int effectiveArraySize = srcDesc.calcEffectiveArraySize(); + + assert(initData); + assert(initData->numSubResources == 
srcDesc.numMipLevels * srcDesc.size.depth * effectiveArraySize); + + // Set on texture so will be freed if failure + texture->m_handle = handle; + const void*const*const data = initData->subResources; + + switch (srcDesc.type) + { + case Resource::Type::Texture1D: + { + if (srcDesc.arraySize > 0) + { + target = GL_TEXTURE_1D_ARRAY; + glBindTexture(target, handle); + + int slice = 0; + for (int i = 0; i < effectiveArraySize; i++) + { + for (int j = 0; j < srcDesc.numMipLevels; j++) + { + glTexImage2D(target, j, internalFormat, srcDesc.size.width, i, 0, format, formatType, data[slice++]); + } + } + } + else + { + target = GL_TEXTURE_1D; + glBindTexture(target, handle); + for (int i = 0; i < srcDesc.numMipLevels; i++) + { + glTexImage1D(target, i, internalFormat, srcDesc.size.width, 0, format, formatType, data[i]); + } + } + break; + } + case Resource::Type::TextureCube: + case Resource::Type::Texture2D: + { + if (srcDesc.arraySize > 0) + { + if (srcDesc.type == Resource::Type::TextureCube) + { + target = GL_TEXTURE_CUBE_MAP_ARRAY; + } + else + { + target = GL_TEXTURE_2D_ARRAY; + } + + glBindTexture(target, handle); + + int slice = 0; + for (int i = 0; i < effectiveArraySize; i++) + { + for (int j = 0; j < srcDesc.numMipLevels; j++) + { + glTexImage3D(target, j, internalFormat, srcDesc.size.width, srcDesc.size.height, slice, 0, format, formatType, data[slice++]); + } + } + } + else + { + if (srcDesc.type == Resource::Type::TextureCube) + { + target = GL_TEXTURE_CUBE_MAP; + glBindTexture(target, handle); + + int slice = 0; + for (int j = 0; j < 6; j++) + { + for (int i = 0; i < srcDesc.numMipLevels; i++) + { + glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X + j, i, internalFormat, srcDesc.size.width, srcDesc.size.height, 0, format, formatType, data[slice++]); + } + } + } + else + { + target = GL_TEXTURE_2D; + glBindTexture(target, handle); + for (int i = 0; i < srcDesc.numMipLevels; i++) + { + glTexImage2D(target, i, internalFormat, srcDesc.size.width, srcDesc.size.height, 
0, format, formatType, data[i]); + } + } + } + break; + } + case Resource::Type::Texture3D: + { + target = GL_TEXTURE_3D; + glBindTexture(target, handle); + for (int i = 0; i < srcDesc.numMipLevels; i++) + { + glTexImage3D(target, i, internalFormat, srcDesc.size.width, srcDesc.size.height, srcDesc.size.depth, 0, format, formatType, data[i]); + } + break; + } + default: + return SLANG_FAIL; + } + + glTexParameteri(target, GL_TEXTURE_WRAP_S, GL_REPEAT); + glTexParameteri(target, GL_TEXTURE_WRAP_T, GL_REPEAT); + glTexParameteri(target, GL_TEXTURE_WRAP_R, GL_REPEAT); + + // Assume regular sampling (might be superseded - if a combined sampler wanted) + glTexParameteri(target, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR); + glTexParameteri(target, GL_TEXTURE_MAG_FILTER, GL_LINEAR); + glTexParameterf(target, GL_TEXTURE_MAX_ANISOTROPY_EXT, 8.0f); + + texture->m_target = target; + + *outResource = texture.detach(); + return SLANG_OK; +} + +static GLenum _calcUsage(Resource::Usage usage) +{ + typedef Resource::Usage Usage; + switch (usage) + { + case Usage::ConstantBuffer: return GL_DYNAMIC_DRAW; + default: return GL_STATIC_READ; + } +} + +static GLenum _calcTarget(Resource::Usage usage) +{ + typedef Resource::Usage Usage; + switch (usage) + { + case Usage::ConstantBuffer: return GL_UNIFORM_BUFFER; + default: return GL_SHADER_STORAGE_BUFFER; + } +} + +Result GLRenderer::createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData, BufferResource** outResource) +{ + BufferResource::Desc desc(descIn); + desc.setDefaults(initialUsage); + + const GLenum target = _calcTarget(initialUsage); + // TODO: should derive from desc... 
+ const GLenum usage = _calcUsage(initialUsage); + + GLuint bufferID = 0; + glGenBuffers(1, &bufferID); + glBindBuffer(target, bufferID); + + glBufferData(target, descIn.sizeInBytes, initData, usage); + + RefPtr<BufferResourceImpl> resourceImpl = new BufferResourceImpl(initialUsage, desc, this, bufferID, target); + *outResource = resourceImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) +{ + GLuint samplerID; + glCreateSamplers(1, &samplerID); + + RefPtr<SamplerStateImpl> samplerImpl = new SamplerStateImpl(); + samplerImpl->m_samplerID = samplerID; + *outSampler = samplerImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) +{ + auto resourceImpl = (TextureResourceImpl*) texture; + + // TODO: actually do something? + + RefPtr<TextureViewImpl> viewImpl = new TextureViewImpl(); + viewImpl->m_resource = resourceImpl; + viewImpl->m_textureID = resourceImpl->m_handle; + *outView = viewImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) +{ + auto resourceImpl = (BufferResourceImpl*) buffer; + + // TODO: actually do something?
+ + RefPtr<BufferViewImpl> viewImpl = new BufferViewImpl(); + viewImpl->m_resource = resourceImpl; + viewImpl->m_bufferID = resourceImpl->m_handle; + *outView = viewImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount, InputLayout** outLayout) +{ + RefPtr<InputLayoutImpl> inputLayout = new InputLayoutImpl; + + inputLayout->m_attributeCount = inputElementCount; + for (UInt ii = 0; ii < inputElementCount; ++ii) + { + auto& inputAttr = inputElements[ii]; + auto& glAttr = inputLayout->m_attributes[ii]; + + glAttr.streamIndex = 0; + glAttr.format = getVertexAttributeFormat(inputAttr.format); + glAttr.offset = (GLsizei)inputAttr.offset; + } + + *outLayout = inputLayout.detach(); + return SLANG_OK; +} + +void* GLRenderer::map(BufferResource* bufferIn, MapFlavor flavor) +{ + BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(bufferIn); + + //GLenum target = GL_UNIFORM_BUFFER; + + GLuint access = 0; + switch (flavor) + { + case MapFlavor::WriteDiscard: + case MapFlavor::HostWrite: + access = GL_WRITE_ONLY; + break; + case MapFlavor::HostRead: + access = GL_READ_ONLY; + break; + } + + glBindBuffer(buffer->m_target, buffer->m_handle); + + return glMapBuffer(buffer->m_target, access); +} + +void GLRenderer::unmap(BufferResource* bufferIn) +{ + BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(bufferIn); + glUnmapBuffer(buffer->m_target); +} + +void GLRenderer::setPrimitiveTopology(PrimitiveTopology topology) +{ + GLenum glTopology = 0; + switch (topology) + { +#define CASE(NAME, VALUE) case PrimitiveTopology::NAME: glTopology = VALUE; break + + CASE(TriangleList, GL_TRIANGLES); + +#undef CASE + } + m_boundPrimitiveTopology = glTopology; +} + +void GLRenderer::setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) +{ + for (UInt ii = 0; ii < slotCount; ++ii) + { + UInt slot = startSlot + ii; + + BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(buffers[ii]); + GLuint bufferID = buffer ?
buffer->m_handle : 0; + + m_boundVertexStreamBuffers[slot] = bufferID; + m_boundVertexStreamStrides[slot] = strides[ii]; + m_boundVertexStreamOffsets[slot] = offsets[ii]; + } +} + +void GLRenderer::setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) +{ +} + +void GLRenderer::setDepthStencilTarget(ResourceView* depthStencilView) +{ +} + +void GLRenderer::setPipelineState(PipelineType pipelineType, PipelineState* state) +{ + auto pipelineStateImpl = (PipelineStateImpl*) state; + + m_currentPipelineState = pipelineStateImpl; + + auto program = pipelineStateImpl->m_program; + GLuint programID = program ? program->m_id : 0; + glUseProgram(programID); +} + +void GLRenderer::draw(UInt vertexCount, UInt startVertex = 0) +{ + flushStateForDraw(); + + glDrawArrays(m_boundPrimitiveTopology, (GLint)startVertex, (GLsizei)vertexCount); +} + +void GLRenderer::drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) +{ + assert(!"unimplemented"); +} + +void GLRenderer::dispatchCompute(int x, int y, int z) +{ + glDispatchCompute(x, y, z); +} + +#if 0 +BindingState* GLRenderer::createBindingState(const BindingState::Desc& bindingStateDesc) +{ + RefPtr bindingState(new BindingStateImpl(bindingStateDesc, this)); + + const auto& srcBindings = bindingStateDesc.m_bindings; + const int numBindings = int(srcBindings.Count()); + + auto& dstDetails = bindingState->m_bindingDetails; + dstDetails.SetSize(numBindings); + + for (int i = 0; i < numBindings; ++i) + { + auto& dstDetail = dstDetails[i]; + const auto& srcBinding = srcBindings[i]; + + + switch (srcBinding.bindingType) + { + case BindingType::Texture: + case BindingType::Buffer: + { + break; + } + case BindingType::CombinedTextureSampler: + { + assert(srcBinding.resource && srcBinding.resource->isTexture()); + TextureResourceImpl* texture = static_cast(srcBinding.resource.Ptr()); + const BindingState::SamplerDesc& samplerDesc = bindingStateDesc.m_samplerDescs[srcBinding.descIndex]; + + if 
(samplerDesc.isCompareSampler) + { + auto target = texture->m_target; + + glTexParameteri(target, GL_TEXTURE_MIN_FILTER, GL_LINEAR); + glTexParameteri(target, GL_TEXTURE_MAG_FILTER, GL_LINEAR); + glTexParameteri(target, GL_TEXTURE_COMPARE_MODE, GL_COMPARE_REF_TO_TEXTURE); + glTexParameteri(target, GL_TEXTURE_COMPARE_FUNC, GL_LEQUAL); + } + break; + } + case BindingType::Sampler: + { + const BindingState::SamplerDesc& samplerDesc = bindingStateDesc.m_samplerDescs[srcBinding.descIndex]; + + GLuint handle; + + glCreateSamplers(1, &handle); + glSamplerParameteri(handle, GL_TEXTURE_WRAP_S, GL_REPEAT); + glSamplerParameteri(handle, GL_TEXTURE_WRAP_T, GL_REPEAT); + glSamplerParameteri(handle, GL_TEXTURE_WRAP_R, GL_REPEAT); + + if (samplerDesc.isCompareSampler) + { + glSamplerParameteri(handle, GL_TEXTURE_MIN_FILTER, GL_LINEAR); + glSamplerParameteri(handle, GL_TEXTURE_MAG_FILTER, GL_LINEAR); + glSamplerParameteri(handle, GL_TEXTURE_COMPARE_MODE, GL_COMPARE_REF_TO_TEXTURE); + glSamplerParameteri(handle, GL_TEXTURE_COMPARE_FUNC, GL_LEQUAL); + } + else + { + glSamplerParameteri(handle, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR); + glSamplerParameteri(handle, GL_TEXTURE_MAG_FILTER, GL_LINEAR); + glSamplerParameteri(handle, GL_TEXTURE_MAX_ANISOTROPY_EXT, 8); + } + + dstDetail.m_samplerHandle = handle; + break; + } + } + } + + return bindingState.detach(); +} + +void GLRenderer::setBindingState(BindingState* stateIn) +{ + BindingStateImpl* state = static_cast(stateIn); + + const auto& bindingDesc = state->getDesc(); + + const auto& details = state->m_bindingDetails; + const auto& bindings = bindingDesc.m_bindings; + const int numBindings = int(bindings.Count()); + + for (int i = 0; i < numBindings; ++i) + { + const auto& binding = bindings[i]; + const auto& detail = details[i]; + + switch (binding.bindingType) + { + case BindingType::Buffer: + { + const int bindingIndex = binding.registerRange.getSingleIndex(); + + BufferResourceImpl* buffer = 
static_cast(binding.resource.Ptr()); + glBindBufferBase(buffer->m_target, bindingIndex, buffer->m_handle); + break; + } + case BindingType::Sampler: + { + for (int index = binding.registerRange.index; index < binding.registerRange.index + binding.registerRange.size; ++index) + { + glBindSampler(index, detail.m_samplerHandle); + } + break; + } + case BindingType::Texture: + case BindingType::CombinedTextureSampler: + { + BufferResourceImpl* buffer = static_cast(binding.resource.Ptr()); + + const int bindingIndex = binding.registerRange.getSingleIndex(); + + glActiveTexture(GL_TEXTURE0 + bindingIndex); + glBindTexture(buffer->m_target, buffer->m_handle); + break; + } + } + } +} +#endif + +void GLRenderer::DescriptorSetImpl::setConstantBuffer(UInt range, UInt index, BufferResource* buffer) +{ + auto resourceImpl = (BufferResourceImpl*) buffer; + + auto layout = m_layout; + auto rangeInfo = layout->m_ranges[range]; + auto arrayIndex = rangeInfo.arrayIndex + index; + + m_constantBuffers[arrayIndex] = resourceImpl; +} + +void GLRenderer::DescriptorSetImpl::setResource(UInt range, UInt index, ResourceView* view) +{ + auto viewImpl = (ResourceViewImpl*) view; + + auto layout = m_layout; + auto rangeInfo = layout->m_ranges[range]; + auto arrayIndex = rangeInfo.arrayIndex + index; + + assert(!"unimplemented"); +} + +void GLRenderer::DescriptorSetImpl::setSampler(UInt range, UInt index, SamplerState* sampler) +{ + assert(!"unsupported"); +} + +void GLRenderer::DescriptorSetImpl::setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) +{ + auto viewImpl = (TextureViewImpl*) textureView; + auto samplerImpl = (SamplerStateImpl*) sampler; + + auto layout = m_layout; + auto rangeInfo = layout->m_ranges[range]; + auto arrayIndex = rangeInfo.arrayIndex + index; + + m_textures[arrayIndex] = viewImpl; + m_samplers[arrayIndex] = samplerImpl; +} + +void GLRenderer::setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, 
UInt index, DescriptorSet* descriptorSet) +{ + auto descriptorSetImpl = (DescriptorSetImpl*)descriptorSet; + + // TODO: can we just bind things immediately here, rather than shadowing the state? + + m_boundDescriptorSets[index] = descriptorSetImpl; +} + +Result GLRenderer::createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) +{ + RefPtr<DescriptorSetLayoutImpl> layoutImpl = new DescriptorSetLayoutImpl(); + + Int counts[int(GLDescriptorSlotType::CountOf)] = { 0, }; + + Int rangeCount = desc.slotRangeCount; + for(Int rr = 0; rr < rangeCount; ++rr) + { + auto rangeDesc = desc.slotRanges[rr]; + DescriptorSetLayoutImpl::RangeInfo rangeInfo; + + GLDescriptorSlotType glSlotType; + switch( rangeDesc.type ) + { + default: + assert(!"unsupported"); + break; + + // TODO: There are many other slot types we could support here, + // in particular including storage buffers. + + case DescriptorSlotType::CombinedImageSampler: + glSlotType = GLDescriptorSlotType::CombinedTextureSampler; + break; + + case DescriptorSlotType::UniformBuffer: + case DescriptorSlotType::DynamicUniformBuffer: + glSlotType = GLDescriptorSlotType::ConstantBuffer; + break; + } + + rangeInfo.type = glSlotType; + rangeInfo.arrayIndex = counts[int(glSlotType)]; + counts[int(glSlotType)] += rangeDesc.count; + + layoutImpl->m_ranges.Add(rangeInfo); + } + + for( Int ii = 0; ii < int(GLDescriptorSlotType::CountOf); ++ii ) + { + layoutImpl->m_counts[ii] = counts[ii]; + } + + *outLayout = layoutImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) +{ + RefPtr<PipelineLayoutImpl> layoutImpl = new PipelineLayoutImpl(); + + static const int kSlotTypeCount = int(GLDescriptorSlotType::CountOf); + Int counts[kSlotTypeCount] = { 0, }; + + Int setCount = desc.descriptorSetCount; + for( Int ii = 0; ii < setCount; ++ii ) + { + auto setLayout = (DescriptorSetLayoutImpl*) desc.descriptorSets[ii].layout; + +
PipelineLayoutImpl::DescriptorSetInfo setInfo; + setInfo.layout = setLayout; + + for( Int ii = 0; ii < int(GLDescriptorSlotType::CountOf); ++ii ) + { + setInfo.baseArrayIndex[ii] = counts[ii]; + counts[ii] += setLayout->m_counts[ii]; + } + + layoutImpl->m_sets.Add(setInfo); + } + + *outLayout = layoutImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) +{ + auto layoutImpl = (DescriptorSetLayoutImpl*) layout; + + RefPtr<DescriptorSetImpl> descriptorSetImpl = new DescriptorSetImpl(); + + descriptorSetImpl->m_layout = layoutImpl; + + // TODO: storage for the arrays of bound objects could be tail allocated + // as part of the descriptor set, with offsets pre-computed in the + // descriptor set layout. + + { + auto slotTypeIndex = int(GLDescriptorSlotType::ConstantBuffer); + auto slotCount = layoutImpl->m_counts[slotTypeIndex]; + descriptorSetImpl->m_constantBuffers.SetSize(slotCount); + } + + { + auto slotTypeIndex = int(GLDescriptorSlotType::CombinedTextureSampler); + auto slotCount = layoutImpl->m_counts[slotTypeIndex]; + + descriptorSetImpl->m_textures.SetSize(slotCount); + descriptorSetImpl->m_samplers.SetSize(slotCount); + } + + *outDescriptorSet = descriptorSetImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) +{ + auto programID = glCreateProgram(); + if(desc.pipelineType == PipelineType::Compute ) + { + auto computeKernel = desc.findKernel(StageType::Compute); + auto computeShaderID = loadShader(GL_COMPUTE_SHADER, (char const*) computeKernel->codeBegin); + glAttachShader(programID, computeShaderID); + glLinkProgram(programID); + glDeleteShader(computeShaderID); + } + else + { + auto vertexKernel = desc.findKernel(StageType::Vertex); + auto fragmentKernel = desc.findKernel(StageType::Fragment); + + auto vertexShaderID = loadShader(GL_VERTEX_SHADER, (char const*) vertexKernel->codeBegin); + auto fragmentShaderID
= loadShader(GL_FRAGMENT_SHADER, (char const*) fragmentKernel->codeBegin); + + glAttachShader(programID, vertexShaderID); + glAttachShader(programID, fragmentShaderID); + + + glLinkProgram(programID); + + glDeleteShader(vertexShaderID); + glDeleteShader(fragmentShaderID); + } + GLint success = GL_FALSE; + glGetProgramiv(programID, GL_LINK_STATUS, &success); + if (!success) + { + int maxSize = 0; + glGetProgramiv(programID, GL_INFO_LOG_LENGTH, &maxSize); + + auto infoBuffer = (char*)::malloc(maxSize); + + int infoSize = 0; + glGetProgramInfoLog(programID, maxSize, &infoSize, infoBuffer); + if (infoSize > 0) + { + fprintf(stderr, "%s", infoBuffer); + OutputDebugStringA(infoBuffer); + } + + ::free(infoBuffer); + + glDeleteProgram(programID); + return SLANG_FAIL; + } + + *outProgram = new ShaderProgramImpl(this, programID); + return SLANG_OK; +} + +Result GLRenderer::createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) +{ + auto programImpl = (ShaderProgramImpl*) desc.program; + auto pipelineLayoutImpl = (PipelineLayoutImpl*) desc.pipelineLayout; + auto inputLayoutImpl = (InputLayoutImpl*) desc.inputLayout; + + RefPtr<PipelineStateImpl> pipelineStateImpl = new PipelineStateImpl(); + pipelineStateImpl->m_program = programImpl; + pipelineStateImpl->m_pipelineLayout = pipelineLayoutImpl; + pipelineStateImpl->m_inputLayout = inputLayoutImpl; + *outState = pipelineStateImpl.detach(); + return SLANG_OK; +} + +Result GLRenderer::createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) +{ + auto programImpl = (ShaderProgramImpl*) desc.program; + auto pipelineLayoutImpl = (PipelineLayoutImpl*) desc.pipelineLayout; + + RefPtr<PipelineStateImpl> pipelineStateImpl = new PipelineStateImpl(); + pipelineStateImpl->m_program = programImpl; + pipelineStateImpl->m_pipelineLayout = pipelineLayoutImpl; + *outState = pipelineStateImpl.detach(); + return SLANG_OK; +} + + +} // gfx diff --git a/tools/gfx/render-gl.h b/tools/gfx/render-gl.h new
file mode 100644 index 000000000..055031d38 --- /dev/null +++ b/tools/gfx/render-gl.h @@ -0,0 +1,10 @@ +// render-gl.h +#pragma once + +namespace gfx { + +class Renderer; + +Renderer* createGLRenderer(); + +} // gfx diff --git a/tools/gfx/render-vk.cpp b/tools/gfx/render-vk.cpp new file mode 100644 index 000000000..27926e0e6 --- /dev/null +++ b/tools/gfx/render-vk.cpp @@ -0,0 +1,2569 @@ +// render-vk.cpp +#include "render-vk.h" + +//WORKING:#include "options.h" +#include "render.h" + +#include "../../source/core/smart-pointer.h" + +#include "vk-api.h" +#include "vk-util.h" +#include "vk-device-queue.h" +#include "vk-swap-chain.h" + +#include "surface.h" + +// Vulkan has a different coordinate system to ogl +// http://anki3d.org/vulkan-coordinate-system/ + +#define ENABLE_VALIDATION_LAYER 1 + +#ifdef _MSC_VER +# include +# pragma warning(disable: 4996) +# if (_MSC_VER < 1900) +# define snprintf sprintf_s +# endif +#endif + +namespace gfx { +using namespace Slang; + +class VKRenderer : public Renderer +{ +public: + enum + { + kMaxRenderTargets = 8, + kMaxAttachments = kMaxRenderTargets + 1, + + kMaxDescriptorSets = 4, + }; + + // Renderer implementation + virtual SlangResult initialize(const Desc& desc, void* inWindowHandle) override; + virtual void setClearColor(const float color[4]) override; + virtual void clearFrame() override; + virtual void presentFrame() override; + TextureResource::Desc getSwapChainTextureDesc() override; + + Result createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData, TextureResource** outResource) override; + Result createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& desc, const void* initData, BufferResource** outResource) override; + Result createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) override; + + Result createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView)
override; + Result createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) override; + + Result createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount, InputLayout** outLayout) override; + + Result createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) override; + Result createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) override; + Result createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) override; + + Result createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) override; + Result createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) override; + Result createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) override; + + virtual SlangResult captureScreenSurface(Surface& surface) override; + + virtual void* map(BufferResource* buffer, MapFlavor flavor) override; + virtual void unmap(BufferResource* buffer) override; + virtual void setPrimitiveTopology(PrimitiveTopology topology) override; + + virtual void setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) override; + + virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) override; + virtual void setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) override; + virtual void setDepthStencilTarget(ResourceView* depthStencilView) override; + virtual void setPipelineState(PipelineType pipelineType, PipelineState* state) override; + virtual void draw(UInt vertexCount, UInt startVertex) override; + virtual void drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) override; + virtual void dispatchCompute(int x, int y, int z) override; + virtual void submitGpuWork() override; + virtual void 
waitForGpu() override; + virtual RendererType getRendererType() const override { return RendererType::Vulkan; } + + /// Dtor + ~VKRenderer(); + + protected: + + class Buffer + { + public: + /// Initialize a buffer with specified size, and memory props + Result init(const VulkanApi& api, size_t bufferSize, VkBufferUsageFlags usage, VkMemoryPropertyFlags reqMemoryProperties); + + /// Returns true if has been initialized + bool isInitialized() const { return m_api != nullptr; } + + // Default Ctor + Buffer(): + m_api(nullptr) + {} + + /// Dtor + ~Buffer() + { + if (m_api) + { + m_api->vkDestroyBuffer(m_api->m_device, m_buffer, nullptr); + m_api->vkFreeMemory(m_api->m_device, m_memory, nullptr); + } + } + + VkBuffer m_buffer; + VkDeviceMemory m_memory; + const VulkanApi* m_api; + }; + + class InputLayoutImpl : public InputLayout + { + public: + List m_vertexDescs; + int m_vertexSize; + }; + + class BufferResourceImpl: public BufferResource + { + public: + typedef BufferResource Parent; + + BufferResourceImpl(Resource::Usage initialUsage, const BufferResource::Desc& desc, VKRenderer* renderer): + Parent(desc), + m_renderer(renderer), + m_initialUsage(initialUsage) + { + assert(renderer); + } + + Resource::Usage m_initialUsage; + VKRenderer* m_renderer; + Buffer m_buffer; + Buffer m_uploadBuffer; + List m_readBuffer; ///< Stores the contents when a map read is performed + + MapFlavor m_mapFlavor = MapFlavor::Unknown; ///< If resource is mapped, records what kind of mapping else Unknown (if not mapped) + }; + + class TextureResourceImpl : public TextureResource + { + public: + typedef TextureResource Parent; + + TextureResourceImpl(const Desc& desc, Usage initialUsage, const VulkanApi* api) : + Parent(desc), + m_initialUsage(initialUsage), + m_api(api) + { + } + ~TextureResourceImpl() + { + if (m_api) + { + if (m_imageMemory != VK_NULL_HANDLE) + { + m_api->vkFreeMemory(m_api->m_device, m_imageMemory, nullptr); + } + if (m_image != VK_NULL_HANDLE) + { + 
m_api->vkDestroyImage(m_api->m_device, m_image, nullptr); + } + } + } + + Usage m_initialUsage; + + VkImage m_image = VK_NULL_HANDLE; + VkDeviceMemory m_imageMemory = VK_NULL_HANDLE; + + const VulkanApi* m_api; + }; + + class SamplerStateImpl : public SamplerState + { + public: + VkSampler m_sampler; + }; + + class ResourceViewImpl : public ResourceView + { + public: + enum class ViewType + { + Texture, + TexelBuffer, + PlainBuffer, + }; + ViewType m_type; + }; + + class TextureResourceViewImpl : public ResourceViewImpl + { + public: + TextureResourceViewImpl() + { + m_type = ViewType::Texture; + } + + RefPtr m_texture; + VkImageView m_view; + VkImageLayout m_layout; + }; + + class TexelBufferResourceViewImpl : public ResourceViewImpl + { + public: + TexelBufferResourceViewImpl() + { + m_type = ViewType::TexelBuffer; + } + + RefPtr m_buffer; + VkBufferView m_view; + }; + + class PlainBufferResourceViewImpl : public ResourceViewImpl + { + public: + PlainBufferResourceViewImpl() + { + m_type = ViewType::PlainBuffer; + } + + RefPtr m_buffer; + VkDeviceSize offset; + VkDeviceSize size; + }; + + class ShaderProgramImpl: public ShaderProgram + { + public: + + ShaderProgramImpl(PipelineType pipelineType): + m_pipelineType(pipelineType) + {} + + PipelineType m_pipelineType; + + VkPipelineShaderStageCreateInfo m_compute; + VkPipelineShaderStageCreateInfo m_vertex; + VkPipelineShaderStageCreateInfo m_fragment; + + List m_buffers[2]; //< To keep storage of code in scope + }; + + class DescriptorSetLayoutImpl : public DescriptorSetLayout + { + public: + DescriptorSetLayoutImpl(const VulkanApi& api) + : m_api(&api) + { + } + + ~DescriptorSetLayoutImpl() + { + if(m_descriptorSetLayout != VK_NULL_HANDLE) + { + m_api->vkDestroyDescriptorSetLayout(m_api->m_device, m_descriptorSetLayout, nullptr); + } + if (m_descriptorPool != VK_NULL_HANDLE) + { + m_api->vkDestroyDescriptorPool(m_api->m_device, m_descriptorPool, nullptr); + } + } + + VulkanApi const* m_api; + VkDescriptorSetLayout 
m_descriptorSetLayout = VK_NULL_HANDLE; + VkDescriptorPool m_descriptorPool = VK_NULL_HANDLE; + + struct RangeInfo + { + VkDescriptorType descriptorType; + }; + List m_ranges; + }; + + class PipelineLayoutImpl : public PipelineLayout + { + public: + PipelineLayoutImpl(const VulkanApi& api) + : m_api(&api) + { + } + + ~PipelineLayoutImpl() + { + if (m_pipelineLayout != VK_NULL_HANDLE) + { + m_api->vkDestroyPipelineLayout(m_api->m_device, m_pipelineLayout, nullptr); + } + } + + VulkanApi const* m_api; + VkPipelineLayout m_pipelineLayout = VK_NULL_HANDLE; + UInt m_descriptorSetCount = 0; + }; + + class DescriptorSetImpl : public DescriptorSet + { + public: + DescriptorSetImpl(VKRenderer* renderer) + : m_renderer(renderer) + { + } + + ~DescriptorSetImpl() + { + } + + virtual void setConstantBuffer(UInt range, UInt index, BufferResource* buffer) override; + virtual void setResource(UInt range, UInt index, ResourceView* view) override; + virtual void setSampler(UInt range, UInt index, SamplerState* sampler) override; + virtual void setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) override; + + RefPtr m_renderer; + RefPtr m_layout; + VkDescriptorSet m_descriptorSet = VK_NULL_HANDLE; + }; + +#if 0 + struct BindingDetail + { + VkImageView m_srv = VK_NULL_HANDLE; + VkBufferView m_uav = VK_NULL_HANDLE; + VkSampler m_sampler = VK_NULL_HANDLE; + }; + + class BindingStateImpl: public BindingState + { + public: + typedef BindingState Parent; + + BindingStateImpl(const Desc& desc, const VulkanApi* api): + Parent(desc), + m_api(api) + { + } + ~BindingStateImpl() + { + for (int i = 0; i < int(m_bindingDetails.Count()); ++i) + { + BindingDetail& detail = m_bindingDetails[i]; + if (detail.m_sampler != VK_NULL_HANDLE) + { + m_api->vkDestroySampler(m_api->m_device, detail.m_sampler, nullptr); + } + if (detail.m_srv != VK_NULL_HANDLE) + { + m_api->vkDestroyImageView(m_api->m_device, detail.m_srv, nullptr); + } + if (detail.m_uav 
!= VK_NULL_HANDLE) + { + m_api->vkDestroyBufferView(m_api->m_device, detail.m_uav, nullptr); + } + } + } + + const VulkanApi* m_api; + List m_bindingDetails; + }; +#endif + + struct BoundVertexBuffer + { + RefPtr m_buffer; + int m_stride; + int m_offset; + }; + + class PipelineStateImpl : public PipelineState + { + public: + PipelineStateImpl(const VulkanApi& api): + m_api(&api) + { + } + ~PipelineStateImpl() + { + if (m_pipeline != VK_NULL_HANDLE) + { + m_api->vkDestroyPipeline(m_api->m_device, m_pipeline, nullptr); + } + } + + const VulkanApi* m_api; + +// VkPrimitiveTopology m_primitiveTopology; + + RefPtr m_pipelineLayout; + +// RefPtr m_inputLayout; + RefPtr m_shaderProgram; + + VkPipeline m_pipeline = VK_NULL_HANDLE; + }; + + VkBool32 handleDebugMessage(VkDebugReportFlagsEXT flags, VkDebugReportObjectTypeEXT objType, uint64_t srcObject, + size_t location, int32_t msgCode, const char* pLayerPrefix, const char* pMsg); + + VkPipelineShaderStageCreateInfo compileEntryPoint( + ShaderProgram::KernelDesc const& kernelDesc, + VkShaderStageFlagBits stage, + List& bufferOut); + + static VKAPI_ATTR VkBool32 VKAPI_CALL debugMessageCallback(VkDebugReportFlagsEXT flags, VkDebugReportObjectTypeEXT objType, uint64_t srcObject, + size_t location, int32_t msgCode, const char* pLayerPrefix, const char* pMsg, void* pUserData); + + /// Returns true if m_currentPipeline matches the current configuration +// Pipeline* _getPipeline(); +// bool _isEqual(const Pipeline& pipeline) const; +// Slang::Result _createPipeline(RefPtr& pipelineOut); + void _beginRender(); + void _endRender(); + + Slang::Result _beginPass(); + void _endPass(); + void _transitionImageLayout(VkImage image, VkFormat format, const TextureResource::Desc& desc, VkImageLayout oldLayout, VkImageLayout newLayout); + + VkDebugReportCallbackEXT m_debugReportCallback; + +// RefPtr m_currentInputLayout; + +// RefPtr m_currentBindingState; + RefPtr m_currentPipelineLayout; + + RefPtr m_currentDescriptorSetImpls 
[kMaxDescriptorSets]; + VkDescriptorSet m_currentDescriptorSets [kMaxDescriptorSets]; + +// RefPtr m_currentProgram; + +// List > m_pipelineCache; + RefPtr m_currentPipeline; + + List m_boundVertexBuffers; + + VkPrimitiveTopology m_primitiveTopology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST; + + VkDevice m_device = VK_NULL_HANDLE; + + VulkanModule m_module; + VulkanApi m_api; + + VulkanDeviceQueue m_deviceQueue; + VulkanSwapChain m_swapChain; + + VkRenderPass m_renderPass = VK_NULL_HANDLE; + + int m_swapChainImageIndex = -1; + + float m_clearColor[4] = { 0, 0, 0, 0 }; + + Desc m_desc; +}; + +/* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! VkRenderer::Buffer !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! */ + +Result VKRenderer::Buffer::init(const VulkanApi& api, size_t bufferSize, VkBufferUsageFlags usage, VkMemoryPropertyFlags reqMemoryProperties) +{ + assert(!isInitialized()); + + m_api = &api; + m_memory = VK_NULL_HANDLE; + m_buffer = VK_NULL_HANDLE; + + VkBufferCreateInfo bufferCreateInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO }; + bufferCreateInfo.size = bufferSize; + bufferCreateInfo.usage = usage; + + SLANG_VK_CHECK(api.vkCreateBuffer(api.m_device, &bufferCreateInfo, nullptr, &m_buffer)); + + VkMemoryRequirements memoryReqs = {}; + api.vkGetBufferMemoryRequirements(api.m_device, m_buffer, &memoryReqs); + + int memoryTypeIndex = api.findMemoryTypeIndex(memoryReqs.memoryTypeBits, reqMemoryProperties); + assert(memoryTypeIndex >= 0); + + VkMemoryPropertyFlags actualMemoryProperites = api.m_deviceMemoryProperties.memoryTypes[memoryTypeIndex].propertyFlags; + + VkMemoryAllocateInfo allocateInfo = { VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO }; + allocateInfo.allocationSize = memoryReqs.size; + allocateInfo.memoryTypeIndex = memoryTypeIndex; + + SLANG_VK_CHECK(api.vkAllocateMemory(api.m_device, &allocateInfo, nullptr, &m_memory)); + SLANG_VK_CHECK(api.vkBindBufferMemory(api.m_device, m_buffer, m_memory, 0)); + + return SLANG_OK; +} + +/* 
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! VkRenderer !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! */ + +#if 0 +bool VKRenderer::_isEqual(const Pipeline& pipeline) const +{ + return + pipeline.m_pipelineLayout == m_currentPipelineLayout && + pipeline.m_primitiveTopology == m_primitiveTopology && + pipeline.m_inputLayout == m_currentInputLayout && + pipeline.m_shaderProgram == m_currentProgram; +} + +VKRenderer::Pipeline* VKRenderer::_getPipeline() +{ + if (m_currentPipeline && _isEqual(*m_currentPipeline)) + { + return m_currentPipeline; + } + + // Look for a match in the cache + for (int i = 0; i < int(m_pipelineCache.Count()); ++i) + { + Pipeline* pipeline = m_pipelineCache[i]; + if (_isEqual(*pipeline)) + { + m_currentPipeline = pipeline; + return pipeline; + } + } + + RefPtr pipeline; + SLANG_RETURN_NULL_ON_FAIL(_createPipeline(pipeline)); + m_pipelineCache.Add(pipeline); + m_currentPipeline = pipeline; + return pipeline; +} + +Slang::Result VKRenderer::_createPipeline(RefPtr& pipelineOut) +{ + RefPtr pipeline(new Pipeline(m_api)); + + // Initialize the state + pipeline->m_primitiveTopology = m_primitiveTopology; + pipeline->m_pipelineLayout = m_currentPipelineLayout; + pipeline->m_shaderProgram = m_currentProgram; + pipeline->m_inputLayout = m_currentInputLayout; + + // Must be equal at this point if all the items are correctly set in pipeline + assert(_isEqual(*pipeline)); + + VkPipelineCache pipelineCache = VK_NULL_HANDLE; + + if (m_currentProgram->m_pipelineType == PipelineType::Compute) + { + // Then create a pipeline to use that layout + + VkComputePipelineCreateInfo computePipelineInfo = { VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO }; + computePipelineInfo.stage = m_currentProgram->m_compute; + computePipelineInfo.layout = pipeline->m_pipelineLayout->m_pipelineLayout; + + SLANG_VK_CHECK(m_api.vkCreateComputePipelines(m_device, pipelineCache, 1, &computePipelineInfo, nullptr, &pipeline->m_pipeline)); + } + else if 
(m_currentProgram->m_pipelineType == PipelineType::Graphics) + { + // Create the graphics pipeline + + const int width = m_swapChain.getWidth(); + const int height = m_swapChain.getHeight(); + + VkPipelineShaderStageCreateInfo shaderStages[] = { m_currentProgram->m_vertex, m_currentProgram->m_fragment }; + + // VertexBuffer/s + // Currently only handles one + + VkPipelineVertexInputStateCreateInfo vertexInputInfo = { VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO }; + vertexInputInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO; + vertexInputInfo.vertexBindingDescriptionCount = 0; + vertexInputInfo.vertexAttributeDescriptionCount = 0; + + VkVertexInputBindingDescription vertexInputBindingDescription; + + if (m_currentInputLayout) + { + vertexInputBindingDescription.binding = 0; + vertexInputBindingDescription.stride = m_currentInputLayout->m_vertexSize; + vertexInputBindingDescription.inputRate = VK_VERTEX_INPUT_RATE_VERTEX; + + const auto& srcAttributeDescs = m_currentInputLayout->m_vertexDescs; + + vertexInputInfo.vertexBindingDescriptionCount = 1; + vertexInputInfo.pVertexBindingDescriptions = &vertexInputBindingDescription; + + vertexInputInfo.vertexAttributeDescriptionCount = static_cast(srcAttributeDescs.Count()); + vertexInputInfo.pVertexAttributeDescriptions = srcAttributeDescs.Buffer(); + } + + // + + VkPipelineInputAssemblyStateCreateInfo inputAssembly = {}; + inputAssembly.sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO; + inputAssembly.topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST; + inputAssembly.primitiveRestartEnable = VK_FALSE; + + VkViewport viewport = {}; + viewport.x = 0.0f; + viewport.y = 0.0f; + viewport.width = (float)width; + viewport.height = (float)height; + viewport.minDepth = 0.0f; + viewport.maxDepth = 1.0f; + + VkRect2D scissor = {}; + scissor.offset = { 0, 0 }; + scissor.extent = { uint32_t(width), uint32_t(height) }; + + VkPipelineViewportStateCreateInfo viewportState = {}; + 
viewportState.sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO; + viewportState.viewportCount = 1; + viewportState.pViewports = &viewport; + viewportState.scissorCount = 1; + viewportState.pScissors = &scissor; + + VkPipelineRasterizationStateCreateInfo rasterizer = {}; + rasterizer.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO; + rasterizer.depthClampEnable = VK_FALSE; + rasterizer.rasterizerDiscardEnable = VK_FALSE; + rasterizer.polygonMode = VK_POLYGON_MODE_FILL; + rasterizer.lineWidth = 1.0f; + rasterizer.cullMode = VK_CULL_MODE_NONE; + rasterizer.frontFace = VK_FRONT_FACE_CLOCKWISE; + rasterizer.depthBiasEnable = VK_FALSE; + + VkPipelineMultisampleStateCreateInfo multisampling = {}; + multisampling.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO; + multisampling.sampleShadingEnable = VK_FALSE; + multisampling.rasterizationSamples = VK_SAMPLE_COUNT_1_BIT; + + VkPipelineColorBlendAttachmentState colorBlendAttachment = {}; + colorBlendAttachment.colorWriteMask = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT | VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT; + colorBlendAttachment.blendEnable = VK_FALSE; + + VkPipelineColorBlendStateCreateInfo colorBlending = {}; + colorBlending.sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO; + colorBlending.logicOpEnable = VK_FALSE; + colorBlending.logicOp = VK_LOGIC_OP_COPY; + colorBlending.attachmentCount = 1; + colorBlending.pAttachments = &colorBlendAttachment; + colorBlending.blendConstants[0] = 0.0f; + colorBlending.blendConstants[1] = 0.0f; + colorBlending.blendConstants[2] = 0.0f; + colorBlending.blendConstants[3] = 0.0f; + + VkGraphicsPipelineCreateInfo pipelineInfo = { VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO }; + + pipelineInfo.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO; + pipelineInfo.stageCount = 2; + pipelineInfo.pStages = shaderStages; + pipelineInfo.pVertexInputState = &vertexInputInfo; + 
pipelineInfo.pInputAssemblyState = &inputAssembly; + pipelineInfo.pViewportState = &viewportState; + pipelineInfo.pRasterizationState = &rasterizer; + pipelineInfo.pMultisampleState = &multisampling; + pipelineInfo.pColorBlendState = &colorBlending; + pipelineInfo.layout = pipeline->m_pipelineLayout->m_pipelineLayout; + pipelineInfo.renderPass = m_renderPass; + pipelineInfo.subpass = 0; + pipelineInfo.basePipelineHandle = VK_NULL_HANDLE; + + SLANG_VK_CHECK(m_api.vkCreateGraphicsPipelines(m_device, pipelineCache, 1, &pipelineInfo, nullptr, &pipeline->m_pipeline)); + } + else + { + assert(!"Unhandled program type"); + return SLANG_FAIL; + } + + pipelineOut = pipeline; + return SLANG_OK; +} +#endif + +Result VKRenderer::_beginPass() +{ + if (m_swapChainImageIndex < 0) + { + return SLANG_FAIL; + } + + const int numRenderTargets = 1; + + const VulkanSwapChain::Image& image = m_swapChain.getImages()[m_swapChainImageIndex]; + + int numAttachments = 0; + + // Start render pass + VkClearValue clearValues[kMaxAttachments]; + clearValues[numAttachments++] = VkClearValue{ m_clearColor[0], m_clearColor[1], m_clearColor[2], m_clearColor[3] }; + + bool hasDepthBuffer = false; + if (hasDepthBuffer) + { + VkClearValue& clearValue = clearValues[numAttachments++]; + + clearValue.depthStencil.depth = 1.0f; + clearValue.depthStencil.stencil = 0; + } + + const int width = m_swapChain.getWidth(); + const int height = m_swapChain.getHeight(); + + VkCommandBuffer cmdBuffer = m_deviceQueue.getCommandBuffer(); + + VkRenderPassBeginInfo renderPassBegin = {}; + renderPassBegin.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO; + renderPassBegin.renderPass = m_renderPass; + renderPassBegin.framebuffer = image.m_frameBuffer; + renderPassBegin.renderArea.offset.x = 0; + renderPassBegin.renderArea.offset.y = 0; + renderPassBegin.renderArea.extent.width = width; + renderPassBegin.renderArea.extent.height = height; + renderPassBegin.clearValueCount = numAttachments; + renderPassBegin.pClearValues = 
clearValues; + + m_api.vkCmdBeginRenderPass(cmdBuffer, &renderPassBegin, VK_SUBPASS_CONTENTS_INLINE); + + // Set up scissor and viewport + { + VkRect2D rects[kMaxRenderTargets] = {}; + VkViewport viewports[kMaxRenderTargets] = {}; + for (int i = 0; i < numRenderTargets; ++i) + { + rects[i] = VkRect2D{ 0, 0, uint32_t(width), uint32_t(height) }; + + VkViewport& dstViewport = viewports[i]; + + dstViewport.x = 0.0f; + dstViewport.y = 0.0f; + dstViewport.width = float(width); + dstViewport.height = float(height); + dstViewport.minDepth = 0.0f; + dstViewport.maxDepth = 1.0f; + } + + m_api.vkCmdSetScissor(cmdBuffer, 0, numRenderTargets, rects); + m_api.vkCmdSetViewport(cmdBuffer, 0, numRenderTargets, viewports); + } + + return SLANG_OK; +} + +void VKRenderer::_endPass() +{ + VkCommandBuffer cmdBuffer = m_deviceQueue.getCommandBuffer(); + m_api.vkCmdEndRenderPass(cmdBuffer); +} + +void VKRenderer::_beginRender() +{ + m_swapChainImageIndex = m_swapChain.nextFrontImageIndex(); + + if (m_swapChainImageIndex < 0) + { + return; + } +} + +void VKRenderer::_endRender() +{ + m_deviceQueue.flush(); +} + +Renderer* createVKRenderer() +{ + return new VKRenderer; +} + +VKRenderer::~VKRenderer() +{ + if (m_renderPass != VK_NULL_HANDLE) + { + m_api.vkDestroyRenderPass(m_device, m_renderPass, nullptr); + m_renderPass = VK_NULL_HANDLE; + } +} + + +VkBool32 VKRenderer::handleDebugMessage(VkDebugReportFlagsEXT flags, VkDebugReportObjectTypeEXT objType, uint64_t srcObject, + size_t location, int32_t msgCode, const char* pLayerPrefix, const char* pMsg) +{ + char const* severity = "message"; + if (flags & VK_DEBUG_REPORT_WARNING_BIT_EXT) + severity = "warning"; + if (flags & VK_DEBUG_REPORT_ERROR_BIT_EXT) + severity = "error"; + + // pMsg can be really big (it can be assembler dump for example) + // Use a dynamic buffer to store + size_t bufferSize = strlen(pMsg) + 1 + 1024; + List bufferArray; + bufferArray.SetSize(bufferSize); + char* buffer = bufferArray.Buffer(); + + sprintf_s(buffer, + 
bufferSize, + "%s: %s %d: %s\n", + pLayerPrefix, + severity, + msgCode, + pMsg); + + fprintf(stderr, "%s", buffer); + fflush(stderr); + + OutputDebugStringA(buffer); + + return VK_FALSE; +} + +/* static */VKAPI_ATTR VkBool32 VKAPI_CALL VKRenderer::debugMessageCallback(VkDebugReportFlagsEXT flags, VkDebugReportObjectTypeEXT objType, uint64_t srcObject, + size_t location, int32_t msgCode, const char* pLayerPrefix, const char* pMsg, void* pUserData) +{ + return ((VKRenderer*)pUserData)->handleDebugMessage(flags, objType, srcObject, location, msgCode, pLayerPrefix, pMsg); +} + +VkPipelineShaderStageCreateInfo VKRenderer::compileEntryPoint( + ShaderProgram::KernelDesc const& kernelDesc, + VkShaderStageFlagBits stage, + List& bufferOut) +{ + char const* dataBegin = (char const*) kernelDesc.codeBegin; + char const* dataEnd = (char const*) kernelDesc.codeEnd; + + // We need to make a copy of the code, since the Slang compiler + // will free the memory after a compile request is closed. + size_t codeSize = dataEnd - dataBegin; + + bufferOut.InsertRange(0, dataBegin, codeSize); + + char* codeBegin = bufferOut.Buffer(); + + VkShaderModuleCreateInfo moduleCreateInfo = { VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO }; + moduleCreateInfo.pCode = (uint32_t*)codeBegin; + moduleCreateInfo.codeSize = codeSize; + + VkShaderModule module; + SLANG_VK_CHECK(m_api.vkCreateShaderModule(m_device, &moduleCreateInfo, nullptr, &module)); + + VkPipelineShaderStageCreateInfo shaderStageCreateInfo = { VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO }; + shaderStageCreateInfo.stage = stage; + + shaderStageCreateInfo.module = module; + shaderStageCreateInfo.pName = "main"; + + return shaderStageCreateInfo; +} + +// !!!!!!!!!!!!!!!!!!!!!!!!!!!! Renderer interface !!!!!!!!!!!!!!!!!!!!!!!!!! 
+ +SlangResult VKRenderer::initialize(const Desc& desc, void* inWindowHandle) +{ + SLANG_RETURN_ON_FAIL(m_module.init()); + SLANG_RETURN_ON_FAIL(m_api.initGlobalProcs(m_module)); + + m_desc = desc; + + VkApplicationInfo applicationInfo = { VK_STRUCTURE_TYPE_APPLICATION_INFO }; + applicationInfo.pApplicationName = "slang-render-test"; + applicationInfo.pEngineName = "slang-render-test"; + applicationInfo.apiVersion = VK_API_VERSION_1_0; + + char const* instanceExtensions[] = + { + VK_KHR_SURFACE_EXTENSION_NAME, + +#if SLANG_WINDOWS_FAMILY + VK_KHR_WIN32_SURFACE_EXTENSION_NAME, +#else + VK_KHR_XLIB_SURFACE_EXTENSION_NAME, +#endif + +#if ENABLE_VALIDATION_LAYER + VK_EXT_DEBUG_REPORT_EXTENSION_NAME, +#endif + }; + + VkInstance instance = VK_NULL_HANDLE; + + VkInstanceCreateInfo instanceCreateInfo = { VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO }; + instanceCreateInfo.pApplicationInfo = &applicationInfo; + + instanceCreateInfo.enabledExtensionCount = SLANG_COUNT_OF(instanceExtensions); + instanceCreateInfo.ppEnabledExtensionNames = &instanceExtensions[0]; + +#if ENABLE_VALIDATION_LAYER + const char* layerNames[] = { "VK_LAYER_LUNARG_standard_validation" }; + instanceCreateInfo.enabledLayerCount = SLANG_COUNT_OF(layerNames); + instanceCreateInfo.ppEnabledLayerNames = layerNames; +#endif + + SLANG_VK_RETURN_ON_FAIL(m_api.vkCreateInstance(&instanceCreateInfo, nullptr, &instance)); + SLANG_RETURN_ON_FAIL(m_api.initInstanceProcs(instance)); + +#if ENABLE_VALIDATION_LAYER + VkDebugReportFlagsEXT debugFlags = VK_DEBUG_REPORT_ERROR_BIT_EXT | VK_DEBUG_REPORT_WARNING_BIT_EXT; + + VkDebugReportCallbackCreateInfoEXT debugCreateInfo = { VK_STRUCTURE_TYPE_DEBUG_REPORT_CREATE_INFO_EXT }; + debugCreateInfo.pfnCallback = &debugMessageCallback; + debugCreateInfo.pUserData = this; + debugCreateInfo.flags = debugFlags; + + SLANG_VK_RETURN_ON_FAIL(m_api.vkCreateDebugReportCallbackEXT(instance, &debugCreateInfo, nullptr, &m_debugReportCallback)); +#endif + + uint32_t numPhysicalDevices = 0; + 
SLANG_VK_RETURN_ON_FAIL(m_api.vkEnumeratePhysicalDevices(instance, &numPhysicalDevices, nullptr)); + + List physicalDevices; + physicalDevices.SetSize(numPhysicalDevices); + SLANG_VK_RETURN_ON_FAIL(m_api.vkEnumeratePhysicalDevices(instance, &numPhysicalDevices, physicalDevices.Buffer())); + + // TODO: allow override of selected device + uint32_t selectedDeviceIndex = 0; + + SLANG_RETURN_ON_FAIL(m_api.initPhysicalDevice(physicalDevices[selectedDeviceIndex])); + + int queueFamilyIndex = m_api.findQueue(VK_QUEUE_GRAPHICS_BIT | VK_QUEUE_COMPUTE_BIT); + assert(queueFamilyIndex >= 0); + + float queuePriority = 0.0f; + VkDeviceQueueCreateInfo queueCreateInfo = { VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO }; + queueCreateInfo.queueFamilyIndex = queueFamilyIndex; + queueCreateInfo.queueCount = 1; + queueCreateInfo.pQueuePriorities = &queuePriority; + + char const* const deviceExtensions[] = + { + VK_KHR_SWAPCHAIN_EXTENSION_NAME, + }; + + VkDeviceCreateInfo deviceCreateInfo = { VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO }; + deviceCreateInfo.queueCreateInfoCount = 1; + deviceCreateInfo.pQueueCreateInfos = &queueCreateInfo; + deviceCreateInfo.pEnabledFeatures = &m_api.m_deviceFeatures; + + deviceCreateInfo.enabledExtensionCount = SLANG_COUNT_OF(deviceExtensions); + deviceCreateInfo.ppEnabledExtensionNames = deviceExtensions; + + SLANG_VK_RETURN_ON_FAIL(m_api.vkCreateDevice(m_api.m_physicalDevice, &deviceCreateInfo, nullptr, &m_device)); + SLANG_RETURN_ON_FAIL(m_api.initDeviceProcs(m_device)); + + { + VkQueue queue; + m_api.vkGetDeviceQueue(m_device, queueFamilyIndex, 0, &queue); + SLANG_RETURN_ON_FAIL(m_deviceQueue.init(m_api, queue, queueFamilyIndex)); + } + + // set up swap chain + + { + VulkanSwapChain::Desc desc; + VulkanSwapChain::PlatformDesc* platformDesc = nullptr; + + desc.init(); + desc.m_format = Format::RGBA_Unorm_UInt8; + +#if SLANG_WINDOWS_FAMILY + VulkanSwapChain::WinPlatformDesc winPlatformDesc; + winPlatformDesc.m_hinstance = ::GetModuleHandle(nullptr); + 
winPlatformDesc.m_hwnd = (HWND)inWindowHandle; + platformDesc = &winPlatformDesc; +#endif + + SLANG_RETURN_ON_FAIL(m_swapChain.init(&m_deviceQueue, desc, platformDesc)); + } + + // depth/stencil? + + // render pass? + + { + const int numRenderTargets = 1; + bool shouldClear = true; + bool shouldClearDepth = false; + bool shouldClearStencil = false; + bool hasDepthBuffer = false; + + Format depthFormat = Format::Unknown; + VkFormat colorFormat = m_swapChain.getVkFormat(); + + int numAttachments = 0; + // We need extra space if we have depth buffer + VkAttachmentDescription attachmentDesc[kMaxRenderTargets + 1] = {}; + for (int i = 0; i < numRenderTargets; ++i) + { + VkAttachmentDescription& dst = attachmentDesc[numAttachments ++]; + + dst.flags = 0; + dst.format = colorFormat; + dst.samples = VK_SAMPLE_COUNT_1_BIT; + dst.loadOp = shouldClear ? VK_ATTACHMENT_LOAD_OP_CLEAR : VK_ATTACHMENT_LOAD_OP_LOAD; + dst.storeOp = VK_ATTACHMENT_STORE_OP_STORE; + dst.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE; + dst.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE; + dst.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED; // VK_IMAGE_LAYOUT_PRESENT_SRC_KHR; + dst.finalLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR; + } + if (hasDepthBuffer) + { + VkAttachmentDescription& dst = attachmentDesc[numAttachments++]; + + dst.flags = 0; + dst.format = VulkanUtil::getVkFormat(depthFormat); + dst.samples = VK_SAMPLE_COUNT_1_BIT; + dst.loadOp = shouldClearDepth ? VK_ATTACHMENT_LOAD_OP_CLEAR : VK_ATTACHMENT_LOAD_OP_LOAD; + dst.storeOp = VK_ATTACHMENT_STORE_OP_STORE; + dst.stencilLoadOp = shouldClearStencil ? 
VK_ATTACHMENT_LOAD_OP_CLEAR : VK_ATTACHMENT_LOAD_OP_LOAD; + dst.stencilStoreOp = VK_ATTACHMENT_STORE_OP_STORE; + dst.initialLayout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL; + dst.finalLayout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL; + } + + VkAttachmentReference colorAttachments[kMaxRenderTargets] = {}; + for (int i = 0; i < numRenderTargets; ++i) + { + VkAttachmentReference& dst = colorAttachments[i]; + dst.attachment = i; + dst.layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL; + } + + VkAttachmentReference depthAttachment = {}; + depthAttachment.attachment = numRenderTargets; + depthAttachment.layout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL; + + VkSubpassDescription subpassDesc = {}; + subpassDesc.flags = 0; + subpassDesc.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS; + subpassDesc.inputAttachmentCount = 0u; + subpassDesc.pInputAttachments = nullptr; + subpassDesc.colorAttachmentCount = numRenderTargets; + subpassDesc.pColorAttachments = colorAttachments; + subpassDesc.pResolveAttachments = nullptr; + subpassDesc.pDepthStencilAttachment = hasDepthBuffer ? 
&depthAttachment : nullptr; + subpassDesc.preserveAttachmentCount = 0u; + subpassDesc.pPreserveAttachments = nullptr; + + VkRenderPassCreateInfo renderPassCreateInfo = {}; + renderPassCreateInfo.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO; + renderPassCreateInfo.attachmentCount = numAttachments; + renderPassCreateInfo.pAttachments = attachmentDesc; + renderPassCreateInfo.subpassCount = 1; + renderPassCreateInfo.pSubpasses = &subpassDesc; + SLANG_VK_RETURN_ON_FAIL(m_api.vkCreateRenderPass(m_device, &renderPassCreateInfo, nullptr, &m_renderPass)); + } + + // frame buffer + SLANG_RETURN_ON_FAIL(m_swapChain.createFrameBuffers(m_renderPass)); + + _beginRender(); + + return SLANG_OK; +} + +void VKRenderer::submitGpuWork() +{ + m_deviceQueue.flush(); +} + +void VKRenderer::waitForGpu() +{ + m_deviceQueue.flushAndWait(); +} + +void VKRenderer::setClearColor(const float color[4]) +{ + for (int ii = 0; ii < 4; ++ii) + m_clearColor[ii] = color[ii]; +} + +void VKRenderer::clearFrame() +{ +} + +void VKRenderer::presentFrame() +{ + _endRender(); + + const bool vsync = true; + m_swapChain.present(vsync); + + _beginRender(); +} + +TextureResource::Desc VKRenderer::getSwapChainTextureDesc() +{ + TextureResource::Desc desc; + desc.init2D(Resource::Type::Texture2D, Format::Unknown, m_desc.width, m_desc.height, 1); + return desc; +} + +SlangResult VKRenderer::captureScreenSurface(Surface& surfaceOut) +{ + return SLANG_FAIL; +} + +static VkBufferUsageFlagBits _calcBufferUsageFlags(Resource::BindFlag::Enum bind) +{ + typedef Resource::BindFlag BindFlag; + + switch (bind) + { + case BindFlag::VertexBuffer: return VK_BUFFER_USAGE_VERTEX_BUFFER_BIT; + case BindFlag::IndexBuffer: return VK_BUFFER_USAGE_INDEX_BUFFER_BIT; + case BindFlag::ConstantBuffer: return VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT; + case BindFlag::StreamOutput: + case BindFlag::RenderTarget: + case BindFlag::DepthStencil: + { + assert(!"Not supported yet"); + return VkBufferUsageFlagBits(0); + } + case 
BindFlag::UnorderedAccess: return VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT; + case BindFlag::PixelShaderResource: return VK_BUFFER_USAGE_STORAGE_BUFFER_BIT; + case BindFlag::NonPixelShaderResource: return VK_BUFFER_USAGE_STORAGE_BUFFER_BIT; + default: return VkBufferUsageFlagBits(0); + } +} + +static VkBufferUsageFlagBits _calcBufferUsageFlags(int bindFlags) +{ + int dstFlags = 0; + while (bindFlags) + { + int lsb = bindFlags & -bindFlags; + dstFlags |= _calcBufferUsageFlags(Resource::BindFlag::Enum(lsb)); + bindFlags &= ~lsb; + } + return VkBufferUsageFlagBits(dstFlags); +} + +static VkBufferUsageFlags _calcBufferUsageFlags(int bindFlags, int cpuAccessFlags, const void* initData) +{ + VkBufferUsageFlags usage = _calcBufferUsageFlags(bindFlags); + + if (cpuAccessFlags & Resource::AccessFlag::Read) + { + // If it can be read from, set this + usage |= VK_BUFFER_USAGE_TRANSFER_SRC_BIT; + } + if ((cpuAccessFlags & Resource::AccessFlag::Write) || initData) + { + usage |= VK_BUFFER_USAGE_TRANSFER_DST_BIT; + } + + return usage; +} + +static VkImageUsageFlagBits _calcImageUsageFlags(Resource::BindFlag::Enum bind) +{ + typedef Resource::BindFlag BindFlag; + + switch (bind) + { + case BindFlag::RenderTarget: return VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT; + case BindFlag::DepthStencil: return VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT; + case BindFlag::NonPixelShaderResource: + case BindFlag::PixelShaderResource: + { + // Ignore + return VkImageUsageFlagBits(0); + } + default: + { + assert(!"Unsupported"); + return VkImageUsageFlagBits(0); + } + } +} + +static VkImageUsageFlagBits _calcImageUsageFlags(int bindFlags) +{ + int dstFlags = 0; + while (bindFlags) + { + int lsb = bindFlags & -bindFlags; + dstFlags |= _calcImageUsageFlags(Resource::BindFlag::Enum(lsb)); + bindFlags &= ~lsb; + } + return VkImageUsageFlagBits(dstFlags); +} + +static VkImageUsageFlags _calcImageUsageFlags(int bindFlags, int cpuAccessFlags, const void* initData) +{ + VkImageUsageFlags usage = 
_calcImageUsageFlags(bindFlags); + + usage |= VK_IMAGE_USAGE_SAMPLED_BIT; + + if (cpuAccessFlags & Resource::AccessFlag::Read) + { + // If it can be read from, set this + usage |= VK_IMAGE_USAGE_TRANSFER_SRC_BIT; + } + if ((cpuAccessFlags & Resource::AccessFlag::Write) || initData) + { + usage |= VK_IMAGE_USAGE_TRANSFER_DST_BIT; + } + + return usage; +} + +void VKRenderer::_transitionImageLayout(VkImage image, VkFormat format, const TextureResource::Desc& desc, VkImageLayout oldLayout, VkImageLayout newLayout) +{ + VkImageMemoryBarrier barrier = {}; + barrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER; + barrier.oldLayout = oldLayout; + barrier.newLayout = newLayout; + barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED; + barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED; + barrier.image = image; + barrier.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT; + barrier.subresourceRange.baseMipLevel = 0; + barrier.subresourceRange.levelCount = desc.numMipLevels; + barrier.subresourceRange.baseArrayLayer = 0; + barrier.subresourceRange.layerCount = 1; + + VkPipelineStageFlags sourceStage; + VkPipelineStageFlags destinationStage; + + if (oldLayout == VK_IMAGE_LAYOUT_UNDEFINED && newLayout == VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL) + { + barrier.srcAccessMask = 0; + barrier.dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT; + + sourceStage = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT; + destinationStage = VK_PIPELINE_STAGE_TRANSFER_BIT; + } + else if (oldLayout == VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL && newLayout == VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL) + { + barrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT; + barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT; + + sourceStage = VK_PIPELINE_STAGE_TRANSFER_BIT; + destinationStage = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT; + } + else + { + assert(!"unsupported layout transition!"); + return; + } + + VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); + + m_api.vkCmdPipelineBarrier(commandBuffer, 
sourceStage, destinationStage, 0, 0, nullptr, 0, nullptr, 1, &barrier); +} + +Result VKRenderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData, TextureResource** outResource) +{ + TextureResource::Desc desc(descIn); + desc.setDefaults(initialUsage); + + const VkFormat format = VulkanUtil::getVkFormat(desc.format); + if (format == VK_FORMAT_UNDEFINED) + { + assert(!"Unhandled image format"); + return SLANG_FAIL; + } + + const int arraySize = desc.calcEffectiveArraySize(); + + RefPtr<TextureResourceImpl> texture(new TextureResourceImpl(desc, initialUsage, &m_api)); + + // Create the image + { + VkImageCreateInfo imageInfo = {VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO}; + + switch (desc.type) + { + case Resource::Type::Texture1D: + { + imageInfo.imageType = VK_IMAGE_TYPE_1D; + imageInfo.extent = VkExtent3D{ uint32_t(descIn.size.width), 1, 1 }; + break; + } + case Resource::Type::Texture2D: + { + imageInfo.imageType = VK_IMAGE_TYPE_2D; + imageInfo.extent = VkExtent3D{ uint32_t(descIn.size.width), uint32_t(descIn.size.height), 1 }; + break; + } + case Resource::Type::TextureCube: + { + imageInfo.imageType = VK_IMAGE_TYPE_2D; + imageInfo.extent = VkExtent3D{ uint32_t(descIn.size.width), uint32_t(descIn.size.height), 1 }; + break; + } + case Resource::Type::Texture3D: + { + // Can't have an array and 3d texture + assert(desc.arraySize <= 1); + + imageInfo.imageType = VK_IMAGE_TYPE_3D; + imageInfo.extent = VkExtent3D{ uint32_t(descIn.size.width), uint32_t(descIn.size.height), uint32_t(descIn.size.depth) }; + break; + } + default: + { + assert(!"Unhandled type"); + return SLANG_FAIL; + } + } + + imageInfo.mipLevels = desc.numMipLevels; + imageInfo.arrayLayers = arraySize; + + imageInfo.format = format; + + imageInfo.tiling = VK_IMAGE_TILING_OPTIMAL; + imageInfo.usage = _calcImageUsageFlags(desc.bindFlags, desc.cpuAccessFlags, initData); + imageInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE; + + imageInfo.samples = 
VK_SAMPLE_COUNT_1_BIT; + imageInfo.flags = 0; // Optional + + SLANG_VK_RETURN_ON_FAIL(m_api.vkCreateImage(m_device, &imageInfo, nullptr, &texture->m_image)); + } + + VkMemoryRequirements memRequirements; + m_api.vkGetImageMemoryRequirements(m_device, texture->m_image, &memRequirements); + + // Allocate the memory + { + VkMemoryPropertyFlags reqMemoryProperties = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT; + + VkMemoryAllocateInfo allocInfo = {VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO}; + + int memoryTypeIndex = m_api.findMemoryTypeIndex(memRequirements.memoryTypeBits, reqMemoryProperties); + assert(memoryTypeIndex >= 0); + + VkMemoryPropertyFlags actualMemoryProperites = m_api.m_deviceMemoryProperties.memoryTypes[memoryTypeIndex].propertyFlags; + + allocInfo.allocationSize = memRequirements.size; + allocInfo.memoryTypeIndex = memoryTypeIndex; + + SLANG_VK_RETURN_ON_FAIL(m_api.vkAllocateMemory(m_device, &allocInfo, nullptr, &texture->m_imageMemory)); + } + + // Bind the memory to the image + m_api.vkBindImageMemory(m_device, texture->m_image, texture->m_imageMemory, 0); + + if (initData) + { + List<TextureResource::Size> mipSizes; + + VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); + + const int numMipMaps = desc.numMipLevels; + assert(initData->numMips == numMipMaps); + + // Calculate how large the buffer has to be + size_t bufferSize = 0; + // Calculate how large an array entry is + for (int j = 0; j < numMipMaps; ++j) + { + const TextureResource::Size mipSize = desc.size.calcMipSize(j); + + const int rowSizeInBytes = Surface::calcRowSize(desc.format, mipSize.width); + const int numRows = Surface::calcNumRows(desc.format, mipSize.height); + + mipSizes.Add(mipSize); + + bufferSize += (rowSizeInBytes * numRows) * mipSize.depth; + } + + + // Calculate the total size taking into account the array + bufferSize *= arraySize; + + Buffer uploadBuffer; + SLANG_RETURN_ON_FAIL(uploadBuffer.init(m_api, bufferSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | 
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)); + + assert(mipSizes.Count() == numMipMaps); + + // Copy into upload buffer + { + int subResourceIndex = 0; + + uint8_t* dstData; + m_api.vkMapMemory(m_device, uploadBuffer.m_memory, 0, bufferSize, 0, (void**)&dstData); + + for (int i = 0; i < arraySize; ++i) + { + for (int j = 0; j < int(mipSizes.Count()); ++j) + { + const auto& mipSize = mipSizes[j]; + + const ptrdiff_t srcRowStride = initData->mipRowStrides[j]; + const int dstRowSizeInBytes = Surface::calcRowSize(desc.format, mipSize.width); + const int numRows = Surface::calcNumRows(desc.format, mipSize.height); + + for (int k = 0; k < mipSize.depth; k++) + { + const uint8_t* srcData = (const uint8_t*)(initData->subResources[subResourceIndex]); + + for (int l = 0; l < numRows; l++) + { + ::memcpy(dstData, srcData, dstRowSizeInBytes); + + dstData += dstRowSizeInBytes; + srcData += srcRowStride; + } + + subResourceIndex++; + } + } + } + + m_api.vkUnmapMemory(m_device, uploadBuffer.m_memory); + } + + _transitionImageLayout(texture->m_image, format, texture->getDesc(), VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL); + + { + size_t srcOffset = 0; + for (int i = 0; i < arraySize; ++i) + { + for (int j = 0; j < int(mipSizes.Count()); ++j) + { + const auto& mipSize = mipSizes[j]; + + const int rowSizeInBytes = Surface::calcRowSize(desc.format, mipSize.width); + const int numRows = Surface::calcNumRows(desc.format, mipSize.height); + + // https://www.khronos.org/registry/vulkan/specs/1.1-extensions/man/html/VkBufferImageCopy.html + // bufferRowLength and bufferImageHeight specify the data in buffer memory as a subregion of a larger two- or three-dimensional image, + // and control the addressing calculations of data in buffer memory. If either of these values is zero, that aspect of the buffer memory + // is considered to be tightly packed according to the imageExtent. 
+ + VkBufferImageCopy region = {}; + + region.bufferOffset = srcOffset; + region.bufferRowLength = 0; //rowSizeInBytes; + region.bufferImageHeight = 0; + + region.imageSubresource.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT; + region.imageSubresource.mipLevel = j; + region.imageSubresource.baseArrayLayer = i; + region.imageSubresource.layerCount = 1; + region.imageOffset = { 0, 0, 0 }; + region.imageExtent = { uint32_t(mipSize.width), uint32_t(mipSize.height), uint32_t(mipSize.depth) }; + + // Do the copy (do all depths in a single go) + m_api.vkCmdCopyBufferToImage(commandBuffer, uploadBuffer.m_buffer, texture->m_image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &region); + + // Next + srcOffset += rowSizeInBytes * numRows * mipSize.depth; + } + } + } + + _transitionImageLayout(texture->m_image, format, texture->getDesc(), VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL); + + m_deviceQueue.flushAndWait(); + } + + *outResource = texture.detach(); + return SLANG_OK; +} + +Result VKRenderer::createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData, BufferResource** outResource) +{ + BufferResource::Desc desc(descIn); + desc.setDefaults(initialUsage); + + const size_t bufferSize = desc.sizeInBytes; + + VkMemoryPropertyFlags reqMemoryProperties = 0; + + VkBufferUsageFlags usage = _calcBufferUsageFlags(desc.bindFlags, desc.cpuAccessFlags, initData); + + switch (initialUsage) + { + case Resource::Usage::ConstantBuffer: + { + reqMemoryProperties = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT; + break; + } + default: break; + } + + RefPtr<BufferResourceImpl> buffer(new BufferResourceImpl(initialUsage, desc, this)); + SLANG_RETURN_ON_FAIL(buffer->m_buffer.init(m_api, desc.sizeInBytes, usage, reqMemoryProperties)); + + if ((desc.cpuAccessFlags & Resource::AccessFlag::Write) || initData) + { + SLANG_RETURN_ON_FAIL(buffer->m_uploadBuffer.init(m_api, bufferSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, 
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)); + } + + if (initData) + { + // TODO: only create staging buffer if the memory type + // used for the buffer doesn't let us fill things in + // directly. + // Copy into staging buffer + void* mappedData = nullptr; + SLANG_VK_CHECK(m_api.vkMapMemory(m_device, buffer->m_uploadBuffer.m_memory, 0, bufferSize, 0, &mappedData)); + ::memcpy(mappedData, initData, bufferSize); + m_api.vkUnmapMemory(m_device, buffer->m_uploadBuffer.m_memory); + + // Copy from staging buffer to real buffer + VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); + + VkBufferCopy copyInfo = {}; + copyInfo.size = bufferSize; + m_api.vkCmdCopyBuffer(commandBuffer, buffer->m_uploadBuffer.m_buffer, buffer->m_buffer.m_buffer, 1, &copyInfo); + + //flushCommandBuffer(commandBuffer); + } + + *outResource = buffer.detach(); + return SLANG_OK; +} + + +VkFilter translateFilterMode(TextureFilteringMode mode) +{ + switch (mode) + { + default: + return VkFilter(0); + +#define CASE(SRC, DST) \ + case TextureFilteringMode::SRC: return VK_FILTER_##DST + + CASE(Point, NEAREST); + CASE(Linear, LINEAR); + +#undef CASE + } +} + +VkSamplerMipmapMode translateMipFilterMode(TextureFilteringMode mode) +{ + switch (mode) + { + default: + return VkSamplerMipmapMode(0); + +#define CASE(SRC, DST) \ + case TextureFilteringMode::SRC: return VK_SAMPLER_MIPMAP_MODE_##DST + + CASE(Point, NEAREST); + CASE(Linear, LINEAR); + +#undef CASE + } +} + +VkSamplerAddressMode translateAddressingMode(TextureAddressingMode mode) +{ + switch (mode) + { + default: + return VkSamplerAddressMode(0); + +#define CASE(SRC, DST) \ + case TextureAddressingMode::SRC: return VK_SAMPLER_ADDRESS_MODE_##DST + + CASE(Wrap, REPEAT); + CASE(ClampToEdge, CLAMP_TO_EDGE); + CASE(ClampToBorder, CLAMP_TO_BORDER); + CASE(MirrorRepeat, MIRRORED_REPEAT); + CASE(MirrorOnce, MIRROR_CLAMP_TO_EDGE); + +#undef CASE + } +} + +static VkCompareOp translateComparisonFunc(ComparisonFunc 
func) +{ + switch (func) + { + default: + // TODO: need to report failures + return VK_COMPARE_OP_ALWAYS; + +#define CASE(FROM, TO) \ + case ComparisonFunc::FROM: return VK_COMPARE_OP_##TO + + CASE(Never, NEVER); + CASE(Less, LESS); + CASE(Equal, EQUAL); + CASE(LessEqual, LESS_OR_EQUAL); + CASE(Greater, GREATER); + CASE(NotEqual, NOT_EQUAL); + CASE(GreaterEqual, GREATER_OR_EQUAL); + CASE(Always, ALWAYS); +#undef CASE + } +} + +Result VKRenderer::createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) +{ + VkSamplerCreateInfo samplerInfo = { VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO }; + + samplerInfo.magFilter = translateFilterMode(desc.magFilter); + samplerInfo.minFilter = translateFilterMode(desc.minFilter); + + samplerInfo.addressModeU = translateAddressingMode(desc.addressU); + samplerInfo.addressModeV = translateAddressingMode(desc.addressV); + samplerInfo.addressModeW = translateAddressingMode(desc.addressW); + + samplerInfo.anisotropyEnable = desc.maxAnisotropy > 1; + samplerInfo.maxAnisotropy = (float) desc.maxAnisotropy; + + // TODO: support translation of border color... 
+ samplerInfo.borderColor = VK_BORDER_COLOR_INT_OPAQUE_BLACK; + + samplerInfo.unnormalizedCoordinates = VK_FALSE; + samplerInfo.compareEnable = desc.reductionOp == TextureReductionOp::Comparison; + samplerInfo.compareOp = translateComparisonFunc(desc.comparisonFunc); + samplerInfo.mipmapMode = translateMipFilterMode(desc.mipFilter); + + VkSampler sampler; + SLANG_VK_RETURN_ON_FAIL(m_api.vkCreateSampler(m_device, &samplerInfo, nullptr, &sampler)); + + RefPtr<SamplerStateImpl> samplerImpl = new SamplerStateImpl(); + samplerImpl->m_sampler = sampler; + *outSampler = samplerImpl.detach(); + return SLANG_OK; +} + +Result VKRenderer::createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) +{ + assert(!"unimplemented"); + return SLANG_FAIL; +} + +Result VKRenderer::createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) +{ + auto resourceImpl = (BufferResourceImpl*) buffer; + + // TODO: These should come from the `ResourceView::Desc` + VkDeviceSize offset = 0; + VkDeviceSize size = resourceImpl->getDesc().sizeInBytes; + + // There are two different cases we need to think about for buffers. + // + // One is when we have a "uniform texel buffer" or "storage texel buffer," + // in which case we need to construct a `VkBufferView` to represent the + // formatting that is applied to the buffer. This case would correspond + // to a `textureBuffer` or `imageBuffer` in GLSL, and more or less to + // `Buffer<..>` or `RWBuffer<...>` in HLSL. + // + // The other case is a `storage buffer` which is the catch-all for any + // non-formatted R/W access to a buffer. In GLSL this is a `buffer { ... }` + // declaration, while in HLSL it covers a bunch of different `RW*Buffer` + // cases. In these cases we do *not* need a `VkBufferView`, but in + // order to be compatible with other APIs that require views for any + // potentially writable access, we will have to create one anyway. 
+ // + // We will distinguish the two cases by looking at whether the view + // is being requested with a format or not. + // + + switch(desc.type) + { + default: + assert(!"unhandled"); + return SLANG_FAIL; + + case ResourceView::Type::UnorderedAccess: + // Is this a formatted view? + // + if(desc.format == Format::Unknown) + { + // Buffer usage that doesn't involve formatting doesn't + // require a view in Vulkan. + RefPtr<PlainBufferResourceViewImpl> viewImpl = new PlainBufferResourceViewImpl(); + viewImpl->m_buffer = resourceImpl; + viewImpl->offset = 0; + viewImpl->size = size; + *outView = viewImpl.detach(); + return SLANG_OK; + } + // + // If the view is formatted, then we need to handle + // it just like we would for a "sampled" buffer: + // + // FALLTHROUGH + case ResourceView::Type::ShaderResource: + { + VkBufferViewCreateInfo info = { VK_STRUCTURE_TYPE_BUFFER_VIEW_CREATE_INFO }; + + info.format = VulkanUtil::getVkFormat(desc.format); + info.buffer = resourceImpl->m_buffer.m_buffer; + info.offset = offset; + info.range = size; + + VkBufferView view; + SLANG_VK_RETURN_ON_FAIL(m_api.vkCreateBufferView(m_device, &info, nullptr, &view)); + + RefPtr<TexelBufferResourceViewImpl> viewImpl = new TexelBufferResourceViewImpl(); + viewImpl->m_buffer = resourceImpl; + viewImpl->m_view = view; + *outView = viewImpl.detach(); + return SLANG_OK; + } + break; + } +} + +Result VKRenderer::createInputLayout(const InputElementDesc* elements, UInt numElements, InputLayout** outLayout) +{ + RefPtr<InputLayoutImpl> layout(new InputLayoutImpl); + + List<VkVertexInputAttributeDescription>& dstVertexDescs = layout->m_vertexDescs; + + size_t vertexSize = 0; + dstVertexDescs.SetSize(numElements); + + for (UInt i = 0; i < numElements; ++i) + { + const InputElementDesc& srcDesc = elements[i]; + VkVertexInputAttributeDescription& dstDesc = dstVertexDescs[i]; + + dstDesc.location = uint32_t(i); + dstDesc.binding = 0; + dstDesc.format = VulkanUtil::getVkFormat(srcDesc.format); + if (dstDesc.format == VK_FORMAT_UNDEFINED) + { + return SLANG_FAIL; + } + + dstDesc.offset = 
uint32_t(srcDesc.offset); + + const size_t elementSize = RendererUtil::getFormatSize(srcDesc.format); + assert(elementSize > 0); + const size_t endElement = srcDesc.offset + elementSize; + + vertexSize = (vertexSize < endElement) ? endElement : vertexSize; + } + + // Work out the overall size + layout->m_vertexSize = int(vertexSize); + *outLayout = layout.detach(); + return SLANG_OK; +} + +void* VKRenderer::map(BufferResource* bufferIn, MapFlavor flavor) +{ + BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(bufferIn); + assert(buffer->m_mapFlavor == MapFlavor::Unknown); + + // Make sure everything has completed before reading... + m_deviceQueue.flushAndWait(); + + const size_t bufferSize = buffer->getDesc().sizeInBytes; + + switch (flavor) + { + case MapFlavor::WriteDiscard: + case MapFlavor::HostWrite: + { + if (!buffer->m_uploadBuffer.isInitialized()) + { + return nullptr; + } + + void* mappedData = nullptr; + SLANG_VK_CHECK(m_api.vkMapMemory(m_device, buffer->m_uploadBuffer.m_memory, 0, bufferSize, 0, &mappedData)); + buffer->m_mapFlavor = flavor; + return mappedData; + } + case MapFlavor::HostRead: + { + // Make sure there is space in the read buffer + buffer->m_readBuffer.SetSize(bufferSize); + + // create staging buffer + Buffer staging; + + SLANG_RETURN_NULL_ON_FAIL(staging.init(m_api, bufferSize, VK_BUFFER_USAGE_TRANSFER_DST_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)); + + // Copy from real buffer to staging buffer + VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); + + VkBufferCopy copyInfo = {}; + copyInfo.size = bufferSize; + m_api.vkCmdCopyBuffer(commandBuffer, buffer->m_buffer.m_buffer, staging.m_buffer, 1, &copyInfo); + + m_deviceQueue.flushAndWait(); + + // Write out the data from the buffer + void* mappedData = nullptr; + SLANG_VK_CHECK(m_api.vkMapMemory(m_device, staging.m_memory, 0, bufferSize, 0, &mappedData)); + + ::memcpy(buffer->m_readBuffer.Buffer(), mappedData, bufferSize); + 
m_api.vkUnmapMemory(m_device, staging.m_memory); + + buffer->m_mapFlavor = flavor; + + return buffer->m_readBuffer.Buffer(); + } + default: + return nullptr; + } +} + +void VKRenderer::unmap(BufferResource* bufferIn) +{ + BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(bufferIn); + assert(buffer->m_mapFlavor != MapFlavor::Unknown); + + const size_t bufferSize = buffer->getDesc().sizeInBytes; + + switch (buffer->m_mapFlavor) + { + case MapFlavor::WriteDiscard: + case MapFlavor::HostWrite: + { + m_api.vkUnmapMemory(m_device, buffer->m_uploadBuffer.m_memory); + + // Copy from staging buffer to real buffer + VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); + + VkBufferCopy copyInfo = {}; + copyInfo.size = bufferSize; + m_api.vkCmdCopyBuffer(commandBuffer, buffer->m_uploadBuffer.m_buffer, buffer->m_buffer.m_buffer, 1, &copyInfo); + + // TODO: is this necessary? + //m_deviceQueue.flushAndWait(); + break; + } + default: break; + } + + // Mark as no longer mapped + buffer->m_mapFlavor = MapFlavor::Unknown; +} + +void VKRenderer::setPrimitiveTopology(PrimitiveTopology topology) +{ + m_primitiveTopology = VulkanUtil::getVkPrimitiveTopology(topology); +} + +void VKRenderer::setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) +{ + { + const UInt num = startSlot + slotCount; + if (num > m_boundVertexBuffers.Count()) + { + m_boundVertexBuffers.SetSize(num); + } + } + + for (UInt i = 0; i < slotCount; i++) + { + BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(buffers[i]); + if (buffer) + { + assert(buffer->m_initialUsage == Resource::Usage::VertexBuffer); + } + + BoundVertexBuffer& boundBuffer = m_boundVertexBuffers[startSlot + i]; + boundBuffer.m_buffer = buffer; + boundBuffer.m_stride = int(strides[i]); + boundBuffer.m_offset = int(offsets[i]); + } +} + +void VKRenderer::setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset) +{ +} + +void VKRenderer::setDepthStencilTarget(ResourceView* 
depthStencilView) +{ +} + +void VKRenderer::setPipelineState(PipelineType pipelineType, PipelineState* state) +{ + m_currentPipeline = (PipelineStateImpl*)state; +} + +void VKRenderer::draw(UInt vertexCount, UInt startVertex = 0) +{ + auto pipeline = m_currentPipeline; + if (!pipeline || pipeline->m_shaderProgram->m_pipelineType != PipelineType::Graphics) + { + assert(!"Invalid render pipeline"); + return; + } + + SLANG_RETURN_VOID_ON_FAIL(_beginPass()); + + // Also create descriptor sets based on the given pipeline layout + VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); + + m_api.vkCmdBindPipeline(commandBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline->m_pipeline); + + auto pipelineLayoutImpl = pipeline->m_pipelineLayout.Ptr(); + m_api.vkCmdBindDescriptorSets(commandBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS, pipelineLayoutImpl->m_pipelineLayout, + 0, pipelineLayoutImpl->m_descriptorSetCount, + &m_currentDescriptorSets[0], + 0, nullptr); + + // Bind the vertex buffer + if (m_boundVertexBuffers.Count() > 0 && m_boundVertexBuffers[0].m_buffer) + { + const BoundVertexBuffer& boundVertexBuffer = m_boundVertexBuffers[0]; + + VkBuffer vertexBuffers[] = { boundVertexBuffer.m_buffer->m_buffer.m_buffer }; + VkDeviceSize offsets[] = { VkDeviceSize(boundVertexBuffer.m_offset) }; + + m_api.vkCmdBindVertexBuffers(commandBuffer, 0, 1, vertexBuffers, offsets); + } + + m_api.vkCmdDraw(commandBuffer, static_cast<uint32_t>(vertexCount), 1, 0, 0); + + _endPass(); +} + +void VKRenderer::drawIndexed(UInt indexCount, UInt startIndex, UInt baseVertex) +{ +} + +void VKRenderer::dispatchCompute(int x, int y, int z) +{ + auto pipeline = m_currentPipeline; + if (!pipeline || pipeline->m_shaderProgram->m_pipelineType != PipelineType::Compute) + { + assert(!"Invalid compute pipeline"); + return; + } + + // Also create descriptor sets based on the given pipeline layout + VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); + + m_api.vkCmdBindPipeline(commandBuffer, 
VK_PIPELINE_BIND_POINT_COMPUTE, pipeline->m_pipeline); + + auto pipelineLayoutImpl = pipeline->m_pipelineLayout.Ptr(); + m_api.vkCmdBindDescriptorSets(commandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, pipelineLayoutImpl->m_pipelineLayout, + 0, pipelineLayoutImpl->m_descriptorSetCount, + &m_currentDescriptorSets[0], + 0, nullptr); + + m_api.vkCmdDispatch(commandBuffer, x, y, z); +} + +static VkImageViewType _calcImageViewType(TextureResource::Type type, const TextureResource::Desc& desc) +{ + switch (type) + { + case Resource::Type::Texture1D: return desc.arraySize > 1 ? VK_IMAGE_VIEW_TYPE_1D_ARRAY : VK_IMAGE_VIEW_TYPE_1D; + case Resource::Type::Texture2D: return desc.arraySize > 1 ? VK_IMAGE_VIEW_TYPE_2D_ARRAY : VK_IMAGE_VIEW_TYPE_2D; + case Resource::Type::TextureCube: return desc.arraySize > 1 ? VK_IMAGE_VIEW_TYPE_CUBE_ARRAY : VK_IMAGE_VIEW_TYPE_CUBE; + case Resource::Type::Texture3D: + { + // Can't have an array and 3d texture + assert(desc.arraySize <= 1); + if (desc.arraySize <= 1) + { + return VK_IMAGE_VIEW_TYPE_3D; + } + break; + } + default: break; + } + + return VK_IMAGE_VIEW_TYPE_MAX_ENUM; +} + +#if 0 +BindingState* VKRenderer::createBindingState(const BindingState::Desc& bindingStateDesc) +{ + RefPtr<BindingStateImpl> bindingState(new BindingStateImpl(bindingStateDesc, &m_api)); + + const auto& srcBindings = bindingStateDesc.m_bindings; + const int numBindings = int(srcBindings.Count()); + + auto& dstDetails = bindingState->m_bindingDetails; + dstDetails.SetSize(numBindings); + + for (int i = 0; i < numBindings; ++i) + { + auto& dstDetail = dstDetails[i]; + const auto& srcBinding = srcBindings[i]; + + switch (srcBinding.bindingType) + { + case BindingType::Buffer: + { + if (!srcBinding.resource || !srcBinding.resource->isBuffer()) + { + assert(!"Needs to have a buffer resource set"); + return nullptr; + } + + BufferResourceImpl* bufferResource = static_cast<BufferResourceImpl*>(srcBinding.resource.Ptr()); + const BufferResource::Desc& bufferResourceDesc = bufferResource->getDesc(); + + if 
(bufferResourceDesc.bindFlags & Resource::BindFlag::UnorderedAccess) + { + // VkBufferView uav + + VkBufferViewCreateInfo info = { VK_STRUCTURE_TYPE_BUFFER_VIEW_CREATE_INFO }; + + info.format = VK_FORMAT_R32_SFLOAT; + // TODO: + // Not sure how to handle typeless? + if (bufferResourceDesc.elementSize == 0) + { + info.format = VK_FORMAT_R32_SFLOAT; // DXGI_FORMAT_R32_TYPELESS ? + } + + info.buffer = bufferResource->m_buffer.m_buffer; + info.offset = 0; + info.range = bufferResourceDesc.sizeInBytes; + + SLANG_VK_RETURN_NULL_ON_FAIL(m_api.vkCreateBufferView(m_device, &info, nullptr, &dstDetail.m_uav)); + } + + // TODO: Setup views. + // VkImageView srv + + + break; + } + case BindingType::Sampler: + { + VkSamplerCreateInfo samplerInfo = { VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO }; + + samplerInfo.magFilter = VK_FILTER_LINEAR; + samplerInfo.minFilter = VK_FILTER_LINEAR; + + samplerInfo.addressModeU = VK_SAMPLER_ADDRESS_MODE_REPEAT; + samplerInfo.addressModeV = VK_SAMPLER_ADDRESS_MODE_REPEAT; + samplerInfo.addressModeW = VK_SAMPLER_ADDRESS_MODE_REPEAT; + + samplerInfo.anisotropyEnable = VK_FALSE; + samplerInfo.maxAnisotropy = 1; + + samplerInfo.borderColor = VK_BORDER_COLOR_INT_OPAQUE_BLACK; + samplerInfo.unnormalizedCoordinates = VK_FALSE; + samplerInfo.compareEnable = VK_FALSE; + samplerInfo.compareOp = VK_COMPARE_OP_ALWAYS; + samplerInfo.mipmapMode = VK_SAMPLER_MIPMAP_MODE_LINEAR; + + SLANG_VK_RETURN_NULL_ON_FAIL(m_api.vkCreateSampler(m_device, &samplerInfo, nullptr, &dstDetail.m_sampler)); + + break; + } + case BindingType::Texture: + { + if (!srcBinding.resource || !srcBinding.resource->isTexture()) + { + assert(!"Needs to have a texture resource set"); + return nullptr; + } + + TextureResourceImpl* textureResource = static_cast<TextureResourceImpl*>(srcBinding.resource.Ptr()); + const TextureResource::Desc& texDesc = textureResource->getDesc(); + + VkImageViewType imageViewType = _calcImageViewType(textureResource->getType(), texDesc); + if (imageViewType == VK_IMAGE_VIEW_TYPE_MAX_ENUM) 
+ { + assert(!"Invalid view type"); + return nullptr; + } + const VkFormat format = VulkanUtil::getVkFormat(texDesc.format); + if (format == VK_FORMAT_UNDEFINED) + { + assert(!"Unhandled image format"); + return nullptr; + } + + // Create the image view + + VkImageViewCreateInfo viewInfo = {}; + viewInfo.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO; + viewInfo.image = textureResource->m_image; + viewInfo.viewType = imageViewType; + viewInfo.format = format; + viewInfo.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT; + viewInfo.subresourceRange.baseMipLevel = 0; + viewInfo.subresourceRange.levelCount = 1; + viewInfo.subresourceRange.baseArrayLayer = 0; + viewInfo.subresourceRange.layerCount = 1; + + viewInfo.components.r = VK_COMPONENT_SWIZZLE_IDENTITY; + viewInfo.components.g = VK_COMPONENT_SWIZZLE_IDENTITY; + viewInfo.components.b = VK_COMPONENT_SWIZZLE_IDENTITY; + viewInfo.components.a = VK_COMPONENT_SWIZZLE_IDENTITY; + + SLANG_VK_RETURN_NULL_ON_FAIL(m_api.vkCreateImageView(m_device, &viewInfo, nullptr, &dstDetail.m_srv)); + + break; + } + case BindingType::CombinedTextureSampler: + { + assert(!"not implemented"); + return nullptr; + } + } + } + + return bindingState.detach();; +} +#endif + +static VkDescriptorType translateDescriptorType(DescriptorSlotType type) +{ + switch(type) + { + default: + return VK_DESCRIPTOR_TYPE_MAX_ENUM; + +#define CASE(SRC, DST) \ + case DescriptorSlotType::SRC: return VK_DESCRIPTOR_TYPE_##DST + + CASE(Sampler, SAMPLER); + CASE(CombinedImageSampler, COMBINED_IMAGE_SAMPLER); + CASE(SampledImage, SAMPLED_IMAGE); + CASE(StorageImage, STORAGE_IMAGE); + CASE(UniformTexelBuffer, UNIFORM_TEXEL_BUFFER); + CASE(StorageTexelBuffer, STORAGE_TEXEL_BUFFER); + CASE(UniformBuffer, UNIFORM_BUFFER); + CASE(StorageBuffer, STORAGE_BUFFER); + CASE(DynamicUniformBuffer, UNIFORM_BUFFER_DYNAMIC); + CASE(DynamicStorageBuffer, STORAGE_BUFFER_DYNAMIC); + CASE(InputAttachment, INPUT_ATTACHMENT); + +#undef CASE + } +} + +Result 
VKRenderer::createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) +{ + RefPtr<DescriptorSetLayoutImpl> descriptorSetLayoutImpl = new DescriptorSetLayoutImpl(m_api); + + Slang::List<VkDescriptorSetLayoutBinding> dstBindings; + + uint32_t descriptorCountForTypes[VK_DESCRIPTOR_TYPE_RANGE_SIZE] = { 0, }; + + UInt rangeCount = desc.slotRangeCount; + for(UInt rr = 0; rr < rangeCount; ++rr) + { + auto& srcRange = desc.slotRanges[rr]; + + VkDescriptorType dstDescriptorType = translateDescriptorType(srcRange.type); + + VkDescriptorSetLayoutBinding dstBinding; + dstBinding.binding = rr; + dstBinding.descriptorType = dstDescriptorType; + dstBinding.descriptorCount = srcRange.count; + dstBinding.stageFlags = VK_SHADER_STAGE_ALL; + dstBinding.pImmutableSamplers = nullptr; + + descriptorCountForTypes[dstDescriptorType] += srcRange.count; + + dstBindings.Add(dstBinding); + + DescriptorSetLayoutImpl::RangeInfo rangeInfo; + rangeInfo.descriptorType = dstDescriptorType; + descriptorSetLayoutImpl->m_ranges.Add(rangeInfo); + } + + VkDescriptorSetLayoutCreateInfo descriptorSetLayoutInfo = { VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO }; + descriptorSetLayoutInfo.bindingCount = uint32_t(dstBindings.Count()); + descriptorSetLayoutInfo.pBindings = dstBindings.Buffer(); + + VkDescriptorSetLayout descriptorSetLayout = VK_NULL_HANDLE; + SLANG_VK_CHECK(m_api.vkCreateDescriptorSetLayout(m_device, &descriptorSetLayoutInfo, nullptr, &descriptorSetLayout)); + + // Create a pool while we are at it, to allocate descriptor sets of this type. 
+ + VkDescriptorPoolSize poolSizes[VK_DESCRIPTOR_TYPE_RANGE_SIZE]; + uint32_t poolSizeCount = 0; + for (int ii = 0; ii < SLANG_COUNT_OF(descriptorCountForTypes); ++ii) + { + auto descriptorCount = descriptorCountForTypes[ii]; + if (descriptorCount > 0) + { + poolSizes[poolSizeCount].type = VkDescriptorType(ii); + poolSizes[poolSizeCount].descriptorCount = descriptorCount; + poolSizeCount++; + } + } + + VkDescriptorPoolCreateInfo descriptorPoolInfo = { VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO }; + descriptorPoolInfo.maxSets = 128; // TODO: actually pick a size. + descriptorPoolInfo.poolSizeCount = poolSizeCount; + descriptorPoolInfo.pPoolSizes = &poolSizes[0]; + + VkDescriptorPool descriptorPool = VK_NULL_HANDLE; + SLANG_VK_CHECK(m_api.vkCreateDescriptorPool(m_device, &descriptorPoolInfo, nullptr, &descriptorPool)); + + descriptorSetLayoutImpl->m_descriptorSetLayout = descriptorSetLayout; + descriptorSetLayoutImpl->m_descriptorPool = descriptorPool; + + *outLayout = descriptorSetLayoutImpl.detach(); + return SLANG_OK; +} + +Result VKRenderer::createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) +{ + UInt descriptorSetCount = desc.descriptorSetCount; + + VkDescriptorSetLayout descriptorSetLayouts[kMaxDescriptorSets]; + for(UInt ii = 0; ii < descriptorSetCount; ++ii) + { + descriptorSetLayouts[ii] = ((DescriptorSetLayoutImpl*) desc.descriptorSets[ii].layout)->m_descriptorSetLayout; + } + + VkPipelineLayoutCreateInfo pipelineLayoutInfo = { VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO }; + pipelineLayoutInfo.setLayoutCount = desc.descriptorSetCount; + pipelineLayoutInfo.pSetLayouts = &descriptorSetLayouts[0]; + + VkPipelineLayout pipelineLayout; + SLANG_VK_CHECK(m_api.vkCreatePipelineLayout(m_device, &pipelineLayoutInfo, nullptr, &pipelineLayout)); + + RefPtr<PipelineLayoutImpl> pipelineLayoutImpl = new PipelineLayoutImpl(m_api); + pipelineLayoutImpl->m_pipelineLayout = pipelineLayout; + pipelineLayoutImpl->m_descriptorSetCount = descriptorSetCount; + + 
*outLayout = pipelineLayoutImpl.detach(); + return SLANG_OK; +} + +Result VKRenderer::createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) +{ + auto layoutImpl = (DescriptorSetLayoutImpl*)layout; + + VkDescriptorSetAllocateInfo descriptorSetAllocInfo = { VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO }; + descriptorSetAllocInfo.descriptorPool = layoutImpl->m_descriptorPool; + descriptorSetAllocInfo.descriptorSetCount = 1; + descriptorSetAllocInfo.pSetLayouts = &layoutImpl->m_descriptorSetLayout; + + VkDescriptorSet descriptorSet; + SLANG_VK_CHECK(m_api.vkAllocateDescriptorSets(m_device, &descriptorSetAllocInfo, &descriptorSet)); + + RefPtr<DescriptorSetImpl> descriptorSetImpl = new DescriptorSetImpl(this); + descriptorSetImpl->m_layout = layoutImpl; + descriptorSetImpl->m_descriptorSet = descriptorSet; + *outDescriptorSet = descriptorSetImpl.detach(); + return SLANG_OK; +} + +void VKRenderer::DescriptorSetImpl::setConstantBuffer(UInt range, UInt index, BufferResource* buffer) +{ + auto bufferImpl = (BufferResourceImpl*)buffer; + + VkDescriptorBufferInfo bufferInfo = {}; + bufferInfo.buffer = bufferImpl->m_buffer.m_buffer; + bufferInfo.offset = 0; + bufferInfo.range = bufferImpl->getDesc().sizeInBytes; + + VkWriteDescriptorSet writeInfo = { VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET }; + writeInfo.dstSet = m_descriptorSet; + writeInfo.dstBinding = range; + writeInfo.dstArrayElement = index; + writeInfo.descriptorCount = 1; + writeInfo.descriptorType = m_layout->m_ranges[range].descriptorType; + writeInfo.pBufferInfo = &bufferInfo; + + m_renderer->m_api.vkUpdateDescriptorSets(m_renderer->m_device, 1, &writeInfo, 0, nullptr); +} + +void VKRenderer::DescriptorSetImpl::setResource(UInt range, UInt index, ResourceView* view) +{ + auto viewImpl = (ResourceViewImpl*)view; + switch (viewImpl->m_type) + { + case ResourceViewImpl::ViewType::Texture: + { + auto textureViewImpl = (TextureResourceViewImpl*)viewImpl; + VkDescriptorImageInfo imageInfo = {}; + 
imageInfo.imageView = textureViewImpl->m_view; + imageInfo.imageLayout = textureViewImpl->m_layout; + // imageInfo.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL; + + VkWriteDescriptorSet writeInfo = { VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET }; + writeInfo.dstSet = m_descriptorSet; + writeInfo.dstBinding = range; + writeInfo.dstArrayElement = index; + writeInfo.descriptorCount = 1; + writeInfo.descriptorType = m_layout->m_ranges[range].descriptorType; + writeInfo.pImageInfo = &imageInfo; + + m_renderer->m_api.vkUpdateDescriptorSets(m_renderer->m_device, 1, &writeInfo, 0, nullptr); + } + break; + + case ResourceViewImpl::ViewType::TexelBuffer: + { + auto bufferViewImpl = (TexelBufferResourceViewImpl*)viewImpl; + + VkWriteDescriptorSet writeInfo = { VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET }; + writeInfo.dstSet = m_descriptorSet; + writeInfo.dstBinding = range; + writeInfo.dstArrayElement = index; + writeInfo.descriptorCount = 1; + writeInfo.descriptorType = m_layout->m_ranges[range].descriptorType; + writeInfo.pTexelBufferView = &bufferViewImpl->m_view; + + m_renderer->m_api.vkUpdateDescriptorSets(m_renderer->m_device, 1, &writeInfo, 0, nullptr); + } + break; + + case ResourceViewImpl::ViewType::PlainBuffer: + { + auto bufferViewImpl = (PlainBufferResourceViewImpl*) viewImpl; + + VkDescriptorBufferInfo bufferInfo = {}; + bufferInfo.buffer = bufferViewImpl->m_buffer->m_buffer.m_buffer; + bufferInfo.offset = bufferViewImpl->offset; + bufferInfo.range = bufferViewImpl->size; + + VkWriteDescriptorSet writeInfo = { VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET }; + writeInfo.dstSet = m_descriptorSet; + writeInfo.dstBinding = range; + writeInfo.dstArrayElement = index; + writeInfo.descriptorCount = 1; + writeInfo.descriptorType = m_layout->m_ranges[range].descriptorType; + writeInfo.pBufferInfo = &bufferInfo; + + m_renderer->m_api.vkUpdateDescriptorSets(m_renderer->m_device, 1, &writeInfo, 0, nullptr); + } + break; + + } +} + +void 
VKRenderer::DescriptorSetImpl::setSampler(UInt range, UInt index, SamplerState* sampler) +{ +} + +void VKRenderer::DescriptorSetImpl::setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) +{ +} + +void VKRenderer::setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) +{ + // Ideally this should eventually be as simple as: + // + // m_api.vkCmdBindDescriptorSets( + // commandBuffer, + // translatePipelineBindPoint(pipelineType), + // layout->m_pipelineLayout, + // index, + // 1, + // &((DescriptorSetImpl*) descriptorSet)->m_descriptorSet, + // 0, + // nullptr); + // + // For now we are lazily flushing state right before drawing, so + // we will hang onto the parameters that were passed in and then + // use them later. + // + + auto descriptorSetImpl = (DescriptorSetImpl*)descriptorSet; + m_currentDescriptorSetImpls[index] = descriptorSetImpl; + m_currentDescriptorSets[index] = descriptorSetImpl->m_descriptorSet; +} + +Result VKRenderer::createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) +{ + ShaderProgramImpl* impl = new ShaderProgramImpl(desc.pipelineType); + if( desc.pipelineType == PipelineType::Compute) + { + auto computeKernel = desc.findKernel(StageType::Compute); + impl->m_compute = compileEntryPoint(*computeKernel, VK_SHADER_STAGE_COMPUTE_BIT, impl->m_buffers[0]); + } + else + { + auto vertexKernel = desc.findKernel(StageType::Vertex); + auto fragmentKernel = desc.findKernel(StageType::Fragment); + + impl->m_vertex = compileEntryPoint(*vertexKernel, VK_SHADER_STAGE_VERTEX_BIT, impl->m_buffers[0]); + impl->m_fragment = compileEntryPoint(*fragmentKernel, VK_SHADER_STAGE_FRAGMENT_BIT, impl->m_buffers[1]); + } + *outProgram = impl; + return SLANG_OK; +} + +Result VKRenderer::createGraphicsPipelineState(const GraphicsPipelineStateDesc& desc, PipelineState** outState) +{ + VkPipelineCache pipelineCache = VK_NULL_HANDLE; + + auto 
programImpl = (ShaderProgramImpl*) desc.program; + auto pipelineLayoutImpl = (PipelineLayoutImpl*) desc.pipelineLayout; + auto inputLayoutImpl = (InputLayoutImpl*) desc.inputLayout; + + int width = desc.framebufferWidth; + int height = desc.framebufferHeight; + + // Shader Stages + // + // Currently only handles vertex/fragment. + + static const uint32_t kMaxShaderStages = 2; + VkPipelineShaderStageCreateInfo shaderStages[kMaxShaderStages]; + + uint32_t shaderStageCount = 0; + shaderStages[shaderStageCount++] = programImpl->m_vertex; + shaderStages[shaderStageCount++] = programImpl->m_fragment; + + // VertexBuffer/s + // Currently only handles one + + VkPipelineVertexInputStateCreateInfo vertexInputInfo = { VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO }; + vertexInputInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO; + vertexInputInfo.vertexBindingDescriptionCount = 0; + vertexInputInfo.vertexAttributeDescriptionCount = 0; + + VkVertexInputBindingDescription vertexInputBindingDescription; + + if (inputLayoutImpl) + { + vertexInputBindingDescription.binding = 0; + vertexInputBindingDescription.stride = inputLayoutImpl->m_vertexSize; + vertexInputBindingDescription.inputRate = VK_VERTEX_INPUT_RATE_VERTEX; + + const auto& srcAttributeDescs = inputLayoutImpl->m_vertexDescs; + + vertexInputInfo.vertexBindingDescriptionCount = 1; + vertexInputInfo.pVertexBindingDescriptions = &vertexInputBindingDescription; + + vertexInputInfo.vertexAttributeDescriptionCount = static_cast<uint32_t>(srcAttributeDescs.Count()); + vertexInputInfo.pVertexAttributeDescriptions = srcAttributeDescs.Buffer(); + } + + VkPipelineInputAssemblyStateCreateInfo inputAssembly = {}; + inputAssembly.sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO; + inputAssembly.topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST; + inputAssembly.primitiveRestartEnable = VK_FALSE; + + VkViewport viewport = {}; + viewport.x = 0.0f; + viewport.y = 0.0f; + viewport.width = 
(float)width; + viewport.height = (float)height; + viewport.minDepth = 0.0f; + viewport.maxDepth = 1.0f; + + VkRect2D scissor = {}; + scissor.offset = { 0, 0 }; + scissor.extent = { uint32_t(width), uint32_t(height) }; + + VkPipelineViewportStateCreateInfo viewportState = {}; + viewportState.sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO; + viewportState.viewportCount = 1; + viewportState.pViewports = &viewport; + viewportState.scissorCount = 1; + viewportState.pScissors = &scissor; + + VkPipelineRasterizationStateCreateInfo rasterizer = {}; + rasterizer.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO; + rasterizer.depthClampEnable = VK_FALSE; + rasterizer.rasterizerDiscardEnable = VK_FALSE; + rasterizer.polygonMode = VK_POLYGON_MODE_FILL; + rasterizer.lineWidth = 1.0f; + rasterizer.cullMode = VK_CULL_MODE_NONE; + rasterizer.frontFace = VK_FRONT_FACE_CLOCKWISE; + rasterizer.depthBiasEnable = VK_FALSE; + + VkPipelineMultisampleStateCreateInfo multisampling = {}; + multisampling.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO; + multisampling.sampleShadingEnable = VK_FALSE; + multisampling.rasterizationSamples = VK_SAMPLE_COUNT_1_BIT; + + VkPipelineColorBlendAttachmentState colorBlendAttachment = {}; + colorBlendAttachment.colorWriteMask = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT | VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT; + colorBlendAttachment.blendEnable = VK_FALSE; + + VkPipelineColorBlendStateCreateInfo colorBlending = {}; + colorBlending.sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO; + colorBlending.logicOpEnable = VK_FALSE; + colorBlending.logicOp = VK_LOGIC_OP_COPY; + colorBlending.attachmentCount = 1; + colorBlending.pAttachments = &colorBlendAttachment; + colorBlending.blendConstants[0] = 0.0f; + colorBlending.blendConstants[1] = 0.0f; + colorBlending.blendConstants[2] = 0.0f; + colorBlending.blendConstants[3] = 0.0f; + + VkGraphicsPipelineCreateInfo pipelineInfo = { 
VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO }; + + pipelineInfo.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO; + pipelineInfo.stageCount = 2; + pipelineInfo.pStages = shaderStages; + pipelineInfo.pVertexInputState = &vertexInputInfo; + pipelineInfo.pInputAssemblyState = &inputAssembly; + pipelineInfo.pViewportState = &viewportState; + pipelineInfo.pRasterizationState = &rasterizer; + pipelineInfo.pMultisampleState = &multisampling; + pipelineInfo.pColorBlendState = &colorBlending; + pipelineInfo.layout = pipelineLayoutImpl->m_pipelineLayout; + pipelineInfo.renderPass = m_renderPass; + pipelineInfo.subpass = 0; + pipelineInfo.basePipelineHandle = VK_NULL_HANDLE; + + VkPipeline pipeline = VK_NULL_HANDLE; + SLANG_VK_CHECK(m_api.vkCreateGraphicsPipelines(m_device, pipelineCache, 1, &pipelineInfo, nullptr, &pipeline)); + + RefPtr<PipelineStateImpl> pipelineStateImpl = new PipelineStateImpl(m_api); + pipelineStateImpl->m_pipeline = pipeline; + pipelineStateImpl->m_pipelineLayout = pipelineLayoutImpl; + pipelineStateImpl->m_shaderProgram = programImpl; + *outState = pipelineStateImpl.detach(); + return SLANG_OK; +} + +Result VKRenderer::createComputePipelineState(const ComputePipelineStateDesc& desc, PipelineState** outState) +{ + VkPipelineCache pipelineCache = VK_NULL_HANDLE; + + auto programImpl = (ShaderProgramImpl*) desc.program; + auto pipelineLayoutImpl = (PipelineLayoutImpl*) desc.pipelineLayout; + + VkComputePipelineCreateInfo computePipelineInfo = { VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO }; + computePipelineInfo.stage = programImpl->m_compute; + computePipelineInfo.layout = pipelineLayoutImpl->m_pipelineLayout; + + VkPipeline pipeline = VK_NULL_HANDLE; + SLANG_VK_CHECK(m_api.vkCreateComputePipelines(m_device, pipelineCache, 1, &computePipelineInfo, nullptr, &pipeline)); + + RefPtr<PipelineStateImpl> pipelineStateImpl = new PipelineStateImpl(m_api); + pipelineStateImpl->m_pipeline = pipeline; + pipelineStateImpl->m_pipelineLayout = pipelineLayoutImpl; + pipelineStateImpl->m_shaderProgram = programImpl; + 
*outState = pipelineStateImpl.detach(); + return SLANG_OK; +} + + +#if 0 + else if (m_currentProgram->m_pipelineType == PipelineType::Graphics) + { + // Create the graphics pipeline + + const int width = m_swapChain.getWidth(); + const int height = m_swapChain.getHeight(); + + + + + + // + + + } + else + { + assert(!"Unhandled program type"); + return SLANG_FAIL; + } + + pipelineOut = pipeline; + return SLANG_OK; + + +#endif + +} // renderer_test diff --git a/tools/gfx/render-vk.h b/tools/gfx/render-vk.h new file mode 100644 index 000000000..14a8e403a --- /dev/null +++ b/tools/gfx/render-vk.h @@ -0,0 +1,10 @@ +// render-vk.h +#pragma once + +namespace gfx { + +class Renderer; + +Renderer* createVKRenderer(); + +} // gfx diff --git a/tools/gfx/render.cpp b/tools/gfx/render.cpp new file mode 100644 index 000000000..8f887b491 --- /dev/null +++ b/tools/gfx/render.cpp @@ -0,0 +1,391 @@ +// render.cpp +#include "render.h" + +#include "../../source/core/slang-math.h" + +namespace gfx { +using namespace Slang; + +/* static */const Resource::BindFlag::Enum Resource::s_requiredBinding[] = +{ + BindFlag::VertexBuffer, // VertexBuffer + BindFlag::IndexBuffer, // IndexBuffer + BindFlag::ConstantBuffer, // ConstantBuffer + BindFlag::StreamOutput, // StreamOutput + BindFlag::RenderTarget, // RenderTarget + BindFlag::DepthStencil, // DepthRead + BindFlag::DepthStencil, // DepthWrite + BindFlag::UnorderedAccess, // UnorderedAccess + BindFlag::PixelShaderResource, // PixelShaderResource + BindFlag::NonPixelShaderResource, // NonPixelShaderResource + BindFlag::Enum(BindFlag::PixelShaderResource | BindFlag::NonPixelShaderResource), // GenericRead +}; + + +/* static */void Resource::compileTimeAsserts() +{ + SLANG_COMPILE_TIME_ASSERT(SLANG_COUNT_OF(s_requiredBinding) == int(Usage::CountOf)); +} + +static const Resource::DescBase s_emptyDescBase = {}; + +const Resource::DescBase& Resource::getDescBase() const +{ + if (isBuffer()) + { + return static_cast<const BufferResource*>(this)->getDesc(); + } + else if 
(isTexture()) + { + return static_cast<const TextureResource*>(this)->getDesc(); + } + return s_emptyDescBase; +} + +/* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! RendererUtil !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! */ + +/* static */const uint8_t RendererUtil::s_formatSize[] = +{ + 0, // Unknown, + + uint8_t(sizeof(float) * 4), // RGBA_Float32, + uint8_t(sizeof(float) * 3), // RGB_Float32, + uint8_t(sizeof(float) * 2), // RG_Float32, + uint8_t(sizeof(float) * 1), // R_Float32, + + uint8_t(sizeof(uint32_t)), // RGBA_Unorm_UInt8, + + uint8_t(sizeof(uint32_t)), // R_UInt32, + + uint8_t(sizeof(float)), // D_Float32, + uint8_t(sizeof(uint32_t)), // D_Unorm24_S8, +}; + +/* static */const BindingStyle RendererUtil::s_rendererTypeToBindingStyle[] = +{ + BindingStyle::Unknown, // Unknown, + BindingStyle::DirectX, // DirectX11, + BindingStyle::DirectX, // DirectX12, + BindingStyle::OpenGl, // OpenGl, + BindingStyle::Vulkan, // Vulkan +}; + +/* static */void RendererUtil::compileTimeAsserts() +{ + SLANG_COMPILE_TIME_ASSERT(SLANG_COUNT_OF(s_formatSize) == int(Format::CountOf)); + SLANG_COMPILE_TIME_ASSERT(SLANG_COUNT_OF(s_rendererTypeToBindingStyle) == int(RendererType::CountOf)); +} + +/* !!!!!!!!!!!!!!!!!!!!!!!!!!! BindingState::Desc !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
*/ +#if 0 +void BindingState::Desc::addSampler(const SamplerDesc& desc, const RegisterRange& registerRange) +{ + int descIndex = int(m_samplerDescs.Count()); + m_samplerDescs.Add(desc); + + Binding binding; + binding.bindingType = BindingType::Sampler; + binding.resource = nullptr; + binding.registerRange = registerRange; + binding.descIndex = descIndex; + + m_bindings.Add(binding); +} + +void BindingState::Desc::addResource(BindingType bindingType, Resource* resource, const RegisterRange& registerRange) +{ + assert(resource); + + Binding binding; + binding.bindingType = bindingType; + binding.resource = resource; + binding.descIndex = -1; + binding.registerRange = registerRange; + m_bindings.Add(binding); +} + +void BindingState::Desc::addCombinedTextureSampler(TextureResource* resource, const SamplerDesc& samplerDesc, const RegisterRange& registerRange) +{ + assert(resource); + + int samplerDescIndex = int(m_samplerDescs.Count()); + m_samplerDescs.Add(samplerDesc); + + Binding binding; + binding.bindingType = BindingType::CombinedTextureSampler; + binding.resource = resource; + binding.descIndex = samplerDescIndex; + binding.registerRange = registerRange; + m_bindings.Add(binding); +} + +void BindingState::Desc::clear() +{ + m_bindings.Clear(); + m_samplerDescs.Clear(); + m_numRenderTargets = 1; +} + +int BindingState::Desc::findBindingIndex(Resource::BindFlag::Enum bindFlag, int registerIndex) const +{ + const int numBindings = int(m_bindings.Count()); + for (int i = 0; i < numBindings; ++i) + { + const Binding& binding = m_bindings[i]; + if (binding.resource && (binding.resource->getDescBase().bindFlags & bindFlag) != 0) + { + if (binding.registerRange.hasRegister(registerIndex)) + { + return i; + } + } + } + + return -1; +} +#endif + +/* !!!!!!!!!!!!!!!!!!!!!!!!!!! TextureResource::Size !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
*/ + +int TextureResource::Size::calcMaxDimension(Type type) const +{ + switch (type) + { + case Resource::Type::Texture1D: return this->width; + case Resource::Type::Texture3D: return std::max(std::max(this->width, this->height), this->depth); + case Resource::Type::TextureCube: // fallthru + case Resource::Type::Texture2D: + { + return std::max(this->width, this->height); + } + default: return 0; + } +} + +TextureResource::Size TextureResource::Size::calcMipSize(int mipLevel) const +{ + Size size; + size.width = TextureResource::calcMipSize(this->width, mipLevel); + size.height = TextureResource::calcMipSize(this->height, mipLevel); + size.depth = TextureResource::calcMipSize(this->depth, mipLevel); + return size; +} + +/* !!!!!!!!!!!!!!!!!!!!!!!!! BufferResource::Desc !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! */ + +void BufferResource::Desc::setDefaults(Usage initialUsage) +{ + if (this->bindFlags == 0) + { + this->bindFlags = Resource::s_requiredBinding[int(initialUsage)]; + } +} + +/* !!!!!!!!!!!!!!!!!!!!!!!!! TextureResource::Desc !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! */ + +int TextureResource::Desc::calcNumMipLevels() const +{ + const int maxDimensionSize = this->size.calcMaxDimension(type); + return (maxDimensionSize > 0) ? (Math::Log2Floor(maxDimensionSize) + 1) : 0; +} + +int TextureResource::Desc::calcNumSubResources() const +{ + const int numMipMaps = (this->numMipLevels > 0) ? this->numMipLevels : calcNumMipLevels(); + const int arrSize = (this->arraySize > 0) ? 
this->arraySize : 1; + + switch (type) + { + case Resource::Type::Texture1D: + case Resource::Type::Texture2D: + { + return numMipMaps * arrSize; + } + case Resource::Type::Texture3D: + { + // can't have arrays of 3d textures + assert(this->arraySize <= 1); + return numMipMaps * this->size.depth; + } + case Resource::Type::TextureCube: + { + // There are 6 faces to a cubemap + return numMipMaps * arrSize * 6; + } + default: return 0; + } +} + +void TextureResource::Desc::fixSize() +{ + switch (type) + { + case Resource::Type::Texture1D: + { + this->size.height = 1; + this->size.depth = 1; + break; + } + case Resource::Type::TextureCube: + case Resource::Type::Texture2D: + { + this->size.depth = 1; + break; + } + case Resource::Type::Texture3D: + { + // Can't have an array + this->arraySize = 0; + break; + } + default: break; + } +} + +void TextureResource::Desc::setDefaults(Usage initialUsage) +{ + fixSize(); + if (this->bindFlags == 0) + { + this->bindFlags = Resource::s_requiredBinding[int(initialUsage)]; + } + if (this->numMipLevels <= 0) + { + this->numMipLevels = calcNumMipLevels(); + } +} + +int TextureResource::Desc::calcEffectiveArraySize() const +{ + const int arrSize = (this->arraySize > 0) ? 
this->arraySize : 1; + + switch (type) + { + case Resource::Type::Texture1D: // fallthru + case Resource::Type::Texture2D: + { + return arrSize; + } + case Resource::Type::TextureCube: return arrSize * 6; + case Resource::Type::Texture3D: return 1; + default: return 0; + } +} + +void TextureResource::Desc::init(Type typeIn) +{ + this->type = typeIn; + this->size.init(); + + this->format = Format::Unknown; + this->arraySize = 0; + this->numMipLevels = 0; + this->sampleDesc.init(); + + this->bindFlags = 0; + this->cpuAccessFlags = 0; +} + +void TextureResource::Desc::init1D(Format formatIn, int widthIn, int numMipMapsIn) +{ + this->type = Type::Texture1D; + this->size.init(widthIn); + + this->format = formatIn; + this->arraySize = 0; + this->numMipLevels = numMipMapsIn; + this->sampleDesc.init(); + + this->bindFlags = 0; + this->cpuAccessFlags = 0; +} + +void TextureResource::Desc::init2D(Type typeIn, Format formatIn, int widthIn, int heightIn, int numMipMapsIn) +{ + assert(typeIn == Type::Texture2D || typeIn == Type::TextureCube); + + this->type = typeIn; + this->size.init(widthIn, heightIn); + + this->format = formatIn; + this->arraySize = 0; + this->numMipLevels = numMipMapsIn; + this->sampleDesc.init(); + + this->bindFlags = 0; + this->cpuAccessFlags = 0; +} + +void TextureResource::Desc::init3D(Format formatIn, int widthIn, int heightIn, int depthIn, int numMipMapsIn) +{ + this->type = Type::Texture3D; + this->size.init(widthIn, heightIn, depthIn); + + this->format = formatIn; + this->arraySize = 0; + this->numMipLevels = numMipMapsIn; + this->sampleDesc.init(); + + this->bindFlags = 0; + this->cpuAccessFlags = 0; +} + +/* !!!!!!!!!!!!!!!!!!!!!!!!! RendererUtil !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
*/ + +ProjectionStyle RendererUtil::getProjectionStyle(RendererType type) +{ + switch (type) + { + case RendererType::DirectX11: + case RendererType::DirectX12: + { + return ProjectionStyle::DirectX; + } + case RendererType::OpenGl: return ProjectionStyle::OpenGl; + case RendererType::Vulkan: return ProjectionStyle::Vulkan; + case RendererType::Unknown: return ProjectionStyle::Unknown; + default: + { + assert(!"Unhandled type"); + return ProjectionStyle::Unknown; + } + } +} + +/* static */void RendererUtil::getIdentityProjection(ProjectionStyle style, float projMatrix[16]) +{ + switch (style) + { + case ProjectionStyle::DirectX: + case ProjectionStyle::OpenGl: + { + static const float kIdentity[] = + { + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1 + }; + ::memcpy(projMatrix, kIdentity, sizeof(kIdentity)); + break; + } + case ProjectionStyle::Vulkan: + { + static const float kIdentity[] = + { + 1, 0, 0, 0, + 0, -1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1 + }; + ::memcpy(projMatrix, kIdentity, sizeof(kIdentity)); + break; + } + default: + { + assert(!"Not handled"); + } + } +} + +} // gfx diff --git a/tools/gfx/render.h b/tools/gfx/render.h new file mode 100644 index 000000000..b43152b68 --- /dev/null +++ b/tools/gfx/render.h @@ -0,0 +1,869 @@ +// render.h +#pragma once + +#include "window.h" + +//#include "shader-input-layout.h" + +#include "../../slang-com-helper.h" + +#include "../../source/core/smart-pointer.h" +#include "../../source/core/list.h" +#include "../../source/core/dictionary.h" + +namespace gfx { + +using Slang::RefObject; +using Slang::RefPtr; +using Slang::Dictionary; +using Slang::GetHashCode; +using Slang::combineHash; +using Slang::List; + +typedef SlangResult Result; + +// Had to move here, because Options needs types defined here +typedef intptr_t Int; +typedef uintptr_t UInt; + +// pre declare types +class Surface; + +// Declare opaque type +class InputLayout: public Slang::RefObject +{ + public: +}; + +enum class PipelineType +{ + 
Unknown, + Graphics, + Compute, + CountOf, +}; + +enum class StageType +{ + Unknown, + Vertex, + Hull, + Domain, + Geometry, + Fragment, + Compute, + CountOf, +}; + +enum class RendererType +{ + Unknown, + DirectX11, + DirectX12, + OpenGl, + Vulkan, + CountOf, +}; + +enum class ProjectionStyle +{ + Unknown, + OpenGl, + DirectX, + Vulkan, + CountOf, +}; + +/// The style of the binding +enum class BindingStyle +{ + Unknown, + DirectX, + OpenGl, + Vulkan, + CountOf, +}; + +class ShaderProgram: public Slang::RefObject +{ +public: + + struct KernelDesc + { + StageType stage; + void const* codeBegin; + void const* codeEnd; + + UInt getCodeSize() const { return (char const*)codeEnd - (char const*)codeBegin; } + }; + + struct Desc + { + PipelineType pipelineType; + KernelDesc const* kernels; + Int kernelCount; + + /// Find and return the kernel for `stage`, if present. + KernelDesc const* findKernel(StageType stage) const + { + for(Int ii = 0; ii < kernelCount; ++ii) + if(kernels[ii].stage == stage) + return &kernels[ii]; + return nullptr; + } + }; +}; + +struct ShaderCompileRequest +{ + struct SourceInfo + { + char const* path; + + // The data may either be source text (in which + // case it can be assumed to be nul-terminated with + // `dataEnd` pointing at the terminator), or + // raw binary data (in which case `dataEnd` points + // at the end of the buffer). + char const* dataBegin; + char const* dataEnd; + }; + + struct EntryPoint + { + char const* name = nullptr; + SourceInfo source; + }; + + SourceInfo source; + EntryPoint vertexShader; + EntryPoint fragmentShader; + EntryPoint computeShader; + Slang::List entryPointTypeArguments; +}; + +/// Different formats of things like pixels or elements of vertices +/// NOTE! 
Any change to this type (adding, removing, changing order) - must also be reflected in changes to RendererUtil +enum class Format +{ + Unknown, + + RGBA_Float32, + RGB_Float32, + RG_Float32, + R_Float32, + + RGBA_Unorm_UInt8, + + R_UInt32, + + D_Float32, + D_Unorm24_S8, + + CountOf, +}; + +struct InputElementDesc +{ + char const* semanticName; + UInt semanticIndex; + Format format; + UInt offset; +}; + +enum class MapFlavor +{ + Unknown, ///< Unknown mapping type + HostRead, + HostWrite, + WriteDiscard, +}; + +enum class PrimitiveTopology +{ + TriangleList, +}; + +class Resource: public Slang::RefObject +{ + public: + + /// The type of resource. + /// NOTE! The order needs to be such that all texture types are at or after Texture1D (otherwise isTexture won't work correctly) + enum class Type + { + Unknown, ///< Unknown + Buffer, ///< A buffer (like a constant/index/vertex buffer) + Texture1D, ///< A 1d texture + Texture2D, ///< A 2d texture + Texture3D, ///< A 3d texture + TextureCube, ///< A cubemap consists of 6 Texture2D like faces + CountOf, + }; + + /// Describes how a resource is to be used + enum class Usage + { + Unknown = -1, + VertexBuffer = 0, + IndexBuffer, + ConstantBuffer, + StreamOutput, + RenderTarget, + DepthRead, + DepthWrite, + UnorderedAccess, + PixelShaderResource, + NonPixelShaderResource, + GenericRead, + CountOf, + }; + + /// Binding flags describe all of the ways a resource can be bound - and therefore used + struct BindFlag + { + enum Enum + { + VertexBuffer = 0x001, + IndexBuffer = 0x002, + ConstantBuffer = 0x004, + StreamOutput = 0x008, + RenderTarget = 0x010, + DepthStencil = 0x020, + UnorderedAccess = 0x040, + PixelShaderResource = 0x080, + NonPixelShaderResource = 0x100, + }; + }; + + /// Combinations describe how a resource can be accessed (typically by the host/cpu) + struct AccessFlag + { + enum Enum + { + Read = 0x1, + Write = 0x2 + }; + }; + + /// Base class for Descs + struct DescBase + { + bool canBind(BindFlag::Enum bindFlag) 
const { return (bindFlags & bindFlag) != 0; } + bool hasCpuAccessFlag(AccessFlag::Enum accessFlag) { return (cpuAccessFlags & accessFlag) != 0; } + + Type type = Type::Unknown; + + int bindFlags = 0; ///< Combination of Resource::BindFlag or 0 (and will use initialUsage to set) + int cpuAccessFlags = 0; ///< Combination of Resource::AccessFlag + }; + + /// Get the type + SLANG_FORCE_INLINE Type getType() const { return m_type; } + /// True if it's a texture derived type + SLANG_FORCE_INLINE bool isTexture() const { return int(m_type) >= int(Type::Texture1D); } + /// True if it's a buffer derived type + SLANG_FORCE_INLINE bool isBuffer() const { return m_type == Type::Buffer; } + + /// Get the descBase + const DescBase& getDescBase() const; + /// Returns true if can bind with flag + bool canBind(BindFlag::Enum bindFlag) const { return getDescBase().canBind(bindFlag); } + + /// For a usage gives the required binding flags + static const BindFlag::Enum s_requiredBinding[]; /// Maps Usage to bind flags required + + protected: + Resource(Type type): + m_type(type) + {} + + static void compileTimeAsserts(); + + Type m_type; +}; + +class BufferResource: public Resource +{ + public: + typedef Resource Parent; + + struct Desc: public DescBase + { + void init(size_t sizeInBytesIn) + { + sizeInBytes = sizeInBytesIn; + elementSize = 0; + format = Format::Unknown; + } + /// Set up default parameters based on usage + void setDefaults(Usage initialUsage); + + size_t sizeInBytes; ///< Total size in bytes + int elementSize; ///< Get the element stride. 
If > 0, this is a structured buffer + Format format; + }; + + /// Get the buffer description + SLANG_FORCE_INLINE const Desc& getDesc() const { return m_desc; } + + /// Ctor + BufferResource(const Desc& desc): + Parent(Type::Buffer), + m_desc(desc) + { + } + + protected: + Desc m_desc; +}; + +class TextureResource: public Resource +{ + public: + typedef Resource Parent; + + struct SampleDesc + { + void init() + { + numSamples = 1; + quality = 0; + } + int numSamples; ///< Number of samples per pixel + int quality; ///< The quality measure for the samples + }; + + struct Size + { + void init() + { + width = height = depth = 1; + } + void init(int widthIn, int heightIn = 1, int depthIn = 1) + { + width = widthIn; + height = heightIn; + depth = depthIn; + } + /// Given the type works out the maximum dimension size + int calcMaxDimension(Type type) const; + /// Given a size, calculates the size at a mip level + Size calcMipSize(int mipLevel) const; + + int width; ///< Width in pixels + int height; ///< Height in pixels (if 2d or 3d) + int depth; ///< Depth (if 3d) + }; + + struct Desc: public DescBase + { + /// Initialize with default values + void init(Type typeIn); + /// Initialize different dimensions. For cubemap, use init2D + void init1D(Format format, int width, int numMipMaps = 0); + void init2D(Type typeIn, Format format, int width, int height, int numMipMaps = 0); + void init3D(Format format, int width, int height, int depth, int numMipMaps = 0); + + /// Given the type, calculates the number of mip maps. 0 on error + int calcNumMipLevels() const; + /// Calculate the total number of sub resources. 0 on error. + int calcNumSubResources() const; + + /// Calculate the effective array size - in essence the amount of mip map sets needed. 
+ /// In practice takes into account if the arraySize is 0 (it's not an array, but it will still have at least one mip set) + /// and if the type is a cubemap (multiplies the amount of mip sets by 6) + int calcEffectiveArraySize() const; + + /// Use type to fix the size values (and array size). + /// For example a 1d texture, should have height and depth set to 1. + void fixSize(); + + /// Set up default parameters based on type and usage + void setDefaults(Usage initialUsage); + + Size size; + + int arraySize; ///< Array size + + int numMipLevels; ///< Number of mip levels - if 0 will create all mip levels + Format format; ///< The resources format + SampleDesc sampleDesc; ///< How the resource is sampled + }; + + /// The ordering of the subResources is + /// forall (effectiveArraySize) + /// forall (mip levels) + /// forall (depth levels) + struct Data + { + ptrdiff_t* mipRowStrides; ///< The row stride for a mip map + int numMips; ///< The number of mip maps + const void*const* subResources; ///< Pointers to each full mip subResource + int numSubResources; ///< The total amount of subResources. Typically = numMips * depth * arraySize + }; + + /// Get the description of the texture + SLANG_FORCE_INLINE const Desc& getDesc() const { return m_desc; } + + /// Ctor + TextureResource(const Desc& desc): + Parent(desc.type), + m_desc(desc) + { + } + + SLANG_FORCE_INLINE static int calcMipSize(int width, int mipLevel) + { + width = width >> mipLevel; + return width > 0 ? 
width : 1; + } + + protected: + Desc m_desc; +}; + +enum class ComparisonFunc : uint8_t +{ + Never = 0, + Less = 0x01, + Equal = 0x02, + LessEqual = 0x03, + Greater = 0x04, + NotEqual = 0x05, + GreaterEqual = 0x06, + Always = 0x07, +}; + +enum class TextureFilteringMode +{ + Point, + Linear, +}; + +enum class TextureAddressingMode +{ + Wrap, + ClampToEdge, + ClampToBorder, + MirrorRepeat, + MirrorOnce, +}; + +enum class TextureReductionOp +{ + Average, + Comparison, + Minimum, + Maximum, +}; + +class SamplerState : public Slang::RefObject +{ +public: + struct Desc + { + TextureFilteringMode minFilter = TextureFilteringMode::Linear; + TextureFilteringMode magFilter = TextureFilteringMode::Linear; + TextureFilteringMode mipFilter = TextureFilteringMode::Linear; + TextureReductionOp reductionOp = TextureReductionOp::Average; + TextureAddressingMode addressU = TextureAddressingMode::Wrap; + TextureAddressingMode addressV = TextureAddressingMode::Wrap; + TextureAddressingMode addressW = TextureAddressingMode::Wrap; + float mipLODBias = 0.0f; + uint32_t maxAnisotropy = 1; + ComparisonFunc comparisonFunc = ComparisonFunc::Never; + float borderColor[4] = { 1.0f, 1.0f, 1.0f, 1.0f }; + float minLOD = -FLT_MAX; + float maxLOD = FLT_MAX; + }; +}; + +enum class DescriptorSlotType +{ + Unknown, + + Sampler, + CombinedImageSampler, + SampledImage, + StorageImage, + UniformTexelBuffer, + StorageTexelBuffer, + UniformBuffer, + StorageBuffer, + DynamicUniformBuffer, + DynamicStorageBuffer, + InputAttachment, +}; + +class DescriptorSetLayout : public Slang::RefObject +{ +public: + struct SlotRangeDesc + { + DescriptorSlotType type = DescriptorSlotType::Unknown; + UInt count = 1; + + SlotRangeDesc() + {} + + SlotRangeDesc( + DescriptorSlotType type, + UInt count = 1) + : type(type) + , count(count) + {} + }; + + struct Desc + { + UInt slotRangeCount = 0; + SlotRangeDesc const* slotRanges = nullptr; + }; +}; + +class PipelineLayout : public Slang::RefObject +{ +public: + struct 
DescriptorSetDesc + { + DescriptorSetLayout* layout = nullptr; + + DescriptorSetDesc() + {} + + DescriptorSetDesc( + DescriptorSetLayout* layout) + : layout(layout) + {} + }; + + struct Desc + { + UInt renderTargetCount = 0; + UInt descriptorSetCount = 0; + DescriptorSetDesc const* descriptorSets = nullptr; + }; +}; + +class ResourceView : public Slang::RefObject +{ +public: + enum class Type + { + Unknown, + + RenderTarget, + DepthStencil, + ShaderResource, + UnorderedAccess, + }; + + struct Desc + { + Type type; + Format format; + }; +}; + +class DescriptorSet : public Slang::RefObject +{ +public: + virtual void setConstantBuffer(UInt range, UInt index, BufferResource* buffer) = 0; + virtual void setResource(UInt range, UInt index, ResourceView* view) = 0; + virtual void setSampler(UInt range, UInt index, SamplerState* sampler) = 0; + virtual void setCombinedTextureSampler( + UInt range, + UInt index, + ResourceView* textureView, + SamplerState* sampler) = 0; +}; + +enum class StencilOp : uint8_t +{ + Keep, + Zero, + Replace, + IncrementSaturate, + DecrementSaturate, + Invert, + IncrementWrap, + DecrementWrap, +}; + +enum class FillMode : uint8_t +{ + Solid, + Wireframe, +}; + +enum class CullMode : uint8_t +{ + None, + Front, + Back, +}; + +enum class FrontFaceMode : uint8_t +{ + CounterClockwise, + Clockwise, +}; + +struct DepthStencilOpDesc +{ + StencilOp stencilFailOp = StencilOp::Keep; + StencilOp stencilDepthFailOp = StencilOp::Keep; + StencilOp stencilPassOp = StencilOp::Keep; + ComparisonFunc stencilFunc = ComparisonFunc::Always; +}; + +struct DepthStencilDesc +{ + bool depthTestEnable = true; + bool depthWriteEnable = true; + ComparisonFunc depthFunc = ComparisonFunc::Less; + + bool stencilEnable = false; + uint32_t stencilReadMask = 0xFFFFFFFF; + uint32_t stencilWriteMask = 0xFFFFFFFF; + DepthStencilOpDesc frontFace; + DepthStencilOpDesc backFace; + + uint32_t stencilRef = 0; +}; + +struct RasterizerDesc +{ + FillMode fillMode = FillMode::Solid; + 
CullMode cullMode = CullMode::Back; + FrontFaceMode frontFace = FrontFaceMode::CounterClockwise; + int32_t depthBias = 0; + float depthBiasClamp = 0.0f; + float slopeScaledDepthBias = 0.0f; + bool depthClipEnable = true; + bool scissorEnable = false; + bool multisampleEnable = false; + bool antialiasedLineEnable = false; +}; + +struct GraphicsPipelineStateDesc +{ + ShaderProgram* program; + PipelineLayout* pipelineLayout; + InputLayout* inputLayout; + UInt framebufferWidth; + UInt framebufferHeight; + UInt renderTargetCount; + DepthStencilDesc depthStencil; + RasterizerDesc rasterizer; +}; + +struct ComputePipelineStateDesc +{ + ShaderProgram* program; + PipelineLayout* pipelineLayout; +}; + +class PipelineState : public Slang::RefObject +{ +public: +}; + +class Renderer: public Slang::RefObject +{ +public: + + struct Desc + { + int width; ///< Width in pixels + int height; ///< height in pixels + }; + + virtual SlangResult initialize(const Desc& desc, void* inWindowHandle) = 0; + + virtual void setClearColor(const float color[4]) = 0; + virtual void clearFrame() = 0; + + virtual void presentFrame() = 0; + + virtual TextureResource::Desc getSwapChainTextureDesc() = 0; + + /// Create a texture resource. initData holds the initialize data to set the contents of the texture when constructed. + virtual Result createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData, TextureResource** outResource) = 0; + + /// Create a texture resource. initData holds the initialize data to set the contents of the texture when constructed. 
+ inline RefPtr<TextureResource> createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData = nullptr) + { + RefPtr<TextureResource> resource; + SLANG_RETURN_NULL_ON_FAIL(createTextureResource(initialUsage, desc, initData, resource.writeRef())); + return resource; + } + + /// Create a buffer resource + virtual Result createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& desc, const void* initData, BufferResource** outResource) = 0; + + inline RefPtr<BufferResource> createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& desc, const void* initData = nullptr) + { + RefPtr<BufferResource> resource; + SLANG_RETURN_NULL_ON_FAIL(createBufferResource(initialUsage, desc, initData, resource.writeRef())); + return resource; + } + + virtual Result createSamplerState(SamplerState::Desc const& desc, SamplerState** outSampler) = 0; + + inline RefPtr<SamplerState> createSamplerState(SamplerState::Desc const& desc) + { + RefPtr<SamplerState> sampler; + SLANG_RETURN_NULL_ON_FAIL(createSamplerState(desc, sampler.writeRef())); + return sampler; + } + + virtual Result createTextureView(TextureResource* texture, ResourceView::Desc const& desc, ResourceView** outView) = 0; + + inline RefPtr<ResourceView> createTextureView(TextureResource* texture, ResourceView::Desc const& desc) + { + RefPtr<ResourceView> view; + SLANG_RETURN_NULL_ON_FAIL(createTextureView(texture, desc, view.writeRef())); + return view; + } + + virtual Result createBufferView(BufferResource* buffer, ResourceView::Desc const& desc, ResourceView** outView) = 0; + + inline RefPtr<ResourceView> createBufferView(BufferResource* buffer, ResourceView::Desc const& desc) + { + RefPtr<ResourceView> view; + SLANG_RETURN_NULL_ON_FAIL(createBufferView(buffer, desc, view.writeRef())); + return view; + } + + virtual Result createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount, InputLayout** outLayout) = 0; + + inline RefPtr<InputLayout> createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount) + { + RefPtr<InputLayout> layout; + 
SLANG_RETURN_NULL_ON_FAIL(createInputLayout(inputElements, inputElementCount, layout.writeRef())); + return layout; + } + + virtual Result createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc, DescriptorSetLayout** outLayout) = 0; + + inline RefPtr<DescriptorSetLayout> createDescriptorSetLayout(const DescriptorSetLayout::Desc& desc) + { + RefPtr<DescriptorSetLayout> layout; + SLANG_RETURN_NULL_ON_FAIL(createDescriptorSetLayout(desc, layout.writeRef())); + return layout; + } + + virtual Result createPipelineLayout(const PipelineLayout::Desc& desc, PipelineLayout** outLayout) = 0; + + inline RefPtr<PipelineLayout> createPipelineLayout(const PipelineLayout::Desc& desc) + { + RefPtr<PipelineLayout> layout; + SLANG_RETURN_NULL_ON_FAIL(createPipelineLayout(desc, layout.writeRef())); + return layout; + } + + virtual Result createDescriptorSet(DescriptorSetLayout* layout, DescriptorSet** outDescriptorSet) = 0; + + inline RefPtr<DescriptorSet> createDescriptorSet(DescriptorSetLayout* layout) + { + RefPtr<DescriptorSet> descriptorSet; + SLANG_RETURN_NULL_ON_FAIL(createDescriptorSet(layout, descriptorSet.writeRef())); + return descriptorSet; + } + + virtual Result createProgram(const ShaderProgram::Desc& desc, ShaderProgram** outProgram) = 0; + + inline RefPtr<ShaderProgram> createProgram(const ShaderProgram::Desc& desc) + { + RefPtr<ShaderProgram> program; + SLANG_RETURN_NULL_ON_FAIL(createProgram(desc, program.writeRef())); + return program; + } + + virtual Result createGraphicsPipelineState( + const GraphicsPipelineStateDesc& desc, + PipelineState** outState) = 0; + + inline RefPtr<PipelineState> createGraphicsPipelineState( + const GraphicsPipelineStateDesc& desc) + { + RefPtr<PipelineState> state; + SLANG_RETURN_NULL_ON_FAIL(createGraphicsPipelineState(desc, state.writeRef())); + return state; + } + + virtual Result createComputePipelineState( + const ComputePipelineStateDesc& desc, + PipelineState** outState) = 0; + + inline RefPtr<PipelineState> createComputePipelineState( + const ComputePipelineStateDesc& desc) + { + RefPtr<PipelineState> state; + SLANG_RETURN_NULL_ON_FAIL(createComputePipelineState(desc, state.writeRef())); + return state; + } + 
+ /// Captures the back buffer and stores the result in surfaceOut. If the surface contains data, it will either be overwritten (if same size and format), or freed and re-allocated. + virtual SlangResult captureScreenSurface(Surface& surfaceOut) = 0; + + virtual void* map(BufferResource* buffer, MapFlavor flavor) = 0; + virtual void unmap(BufferResource* buffer) = 0; + + virtual void setPrimitiveTopology(PrimitiveTopology topology) = 0; + + virtual void setDescriptorSet(PipelineType pipelineType, PipelineLayout* layout, UInt index, DescriptorSet* descriptorSet) = 0; + + virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) = 0; + inline void setVertexBuffer(UInt slot, BufferResource* buffer, UInt stride, UInt offset = 0); + + virtual void setIndexBuffer(BufferResource* buffer, Format indexFormat, UInt offset = 0) = 0; + + virtual void setDepthStencilTarget(ResourceView* depthStencilView) = 0; + + virtual void setPipelineState(PipelineType pipelineType, PipelineState* state) = 0; + + virtual void draw(UInt vertexCount, UInt startVertex = 0) = 0; + virtual void drawIndexed(UInt indexCount, UInt startIndex = 0, UInt baseVertex = 0) = 0; + + virtual void dispatchCompute(int x, int y, int z) = 0; + + /// Commit any buffered state changes or draw calls. 
+ /// presentFrame will commitAll implicitly before doing a present + virtual void submitGpuWork() = 0; + /// Blocks until GPU work is complete + virtual void waitForGpu() = 0; + + /// Get the type of this renderer + virtual RendererType getRendererType() const = 0; +}; + +// ---------------------------------------------------------------------------------------- +inline void Renderer::setVertexBuffer(UInt slot, BufferResource* buffer, UInt stride, UInt offset) +{ + setVertexBuffers(slot, 1, &buffer, &stride, &offset); +} + +/// Utility functions for Renderer and its types +struct RendererUtil +{ + /// Gets the size in bytes of a Format type. Returns 0 if a size is not defined/invalid + SLANG_FORCE_INLINE static size_t getFormatSize(Format format) { return s_formatSize[int(format)]; } + /// Given a renderer type, gets a projection style + static ProjectionStyle getProjectionStyle(RendererType type); + + /// Given the projection style returns an 'identity' matrix, which ensures x,y mapping to pixels is the same on all targets + static void getIdentityProjection(ProjectionStyle style, float projMatrix[16]); + + /// Get the binding style from the type + static BindingStyle getBindingStyle(RendererType type) { return s_rendererTypeToBindingStyle[int(type)]; } + + private: + static void compileTimeAsserts(); + static const uint8_t s_formatSize[]; // Maps Format::XXX to a size in bytes; + static const BindingStyle s_rendererTypeToBindingStyle[]; ///< Maps a RendererType to a BindingStyle +}; + +} // renderer_test diff --git a/tools/gfx/resource-d3d12.cpp b/tools/gfx/resource-d3d12.cpp new file mode 100644 index 000000000..2e0f78371 --- /dev/null +++ b/tools/gfx/resource-d3d12.cpp @@ -0,0 +1,214 @@ +// resource-d3d12.cpp +#include "resource-d3d12.h" + +namespace gfx { +using namespace Slang; + +/* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! D3D12BarrierSubmitter !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
*/ + +void D3D12BarrierSubmitter::_flush() +{ + assert(m_numBarriers > 0); + + if (m_commandList) + { + m_commandList->ResourceBarrier(UINT(m_numBarriers), m_barriers); + } + m_numBarriers = 0; +} + +D3D12_RESOURCE_BARRIER& D3D12BarrierSubmitter::_expandOne() +{ + _flush(); + return m_barriers[m_numBarriers++]; +} + +void D3D12BarrierSubmitter::transition(ID3D12Resource* resource, D3D12_RESOURCE_STATES prevState, D3D12_RESOURCE_STATES nextState) +{ + if (nextState != prevState) + { + D3D12_RESOURCE_BARRIER& barrier = expandOne(); + + const UINT subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES; + const D3D12_RESOURCE_BARRIER_FLAGS flags = D3D12_RESOURCE_BARRIER_FLAG_NONE; + + ::memset(&barrier, 0, sizeof(barrier)); + barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION; + barrier.Flags = flags; + barrier.Transition.pResource = resource; + barrier.Transition.StateBefore = prevState; + barrier.Transition.StateAfter = nextState; + barrier.Transition.Subresource = subresource; + } + else + { + if (nextState == D3D12_RESOURCE_STATE_UNORDERED_ACCESS) + { + D3D12_RESOURCE_BARRIER& barrier = expandOne(); + + ::memset(&barrier, 0, sizeof(barrier)); + barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_UAV; + barrier.UAV.pResource = resource; + } + } +} + +/* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! D3D12ResourceBase !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! */ + +/* static */DXGI_FORMAT D3D12ResourceBase::calcFormat(D3DUtil::UsageType usage, ID3D12Resource* resource) +{ + return resource ? D3DUtil::calcFormat(usage, resource->GetDesc().Format) : DXGI_FORMAT_UNKNOWN; +} + +void D3D12ResourceBase::transition(D3D12_RESOURCE_STATES nextState, D3D12BarrierSubmitter& submitter) +{ + // Transition only if there is a resource + if (m_resource) + { + submitter.transition(m_resource, m_state, nextState); + m_state = nextState; + } +} + +/* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! D3D12CounterFence !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
*/ + +D3D12CounterFence::~D3D12CounterFence() +{ + if (m_event) + { + CloseHandle(m_event); + } +} + +Result D3D12CounterFence::init(ID3D12Device* device, uint64_t initialValue) +{ + m_currentValue = initialValue; + + SLANG_RETURN_ON_FAIL(device->CreateFence(m_currentValue, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(m_fence.writeRef()))); + // Create an event handle to use for frame synchronization. + m_event = ::CreateEvent(nullptr, FALSE, FALSE, nullptr); + if (m_event == nullptr) + { + Result res = HRESULT_FROM_WIN32(GetLastError()); + return SLANG_FAILED(res) ? res : SLANG_FAIL; + } + return SLANG_OK; +} + +UInt64 D3D12CounterFence::nextSignal(ID3D12CommandQueue* commandQueue) +{ + // Increment the fence value. Save on the frame - we'll know that frame is done when the fence value >= + m_currentValue++; + // Schedule a Signal command in the queue. + Result res = commandQueue->Signal(m_fence, m_currentValue); + if (SLANG_FAILED(res)) + { + assert(!"Signal failed"); + } + return m_currentValue; +} + +void D3D12CounterFence::waitUntilCompleted(uint64_t completedValue) +{ + // You can only wait for a value that is less than or equal to the current value + assert(completedValue <= m_currentValue); + + // Wait until the previous frame is finished. + while (m_fence->GetCompletedValue() < completedValue) + { + // Make it signal with the current value + SLANG_ASSERT_VOID_ON_FAIL(m_fence->SetEventOnCompletion(completedValue, m_event)); + WaitForSingleObject(m_event, INFINITE); + } +} + +void D3D12CounterFence::nextSignalAndWait(ID3D12CommandQueue* commandQueue) +{ + waitUntilCompleted(nextSignal(commandQueue)); +} + +/* !!!!!!!!!!!!!!!!!!!!!!!!! D3D12Resource !!!!!!!!!!!!!!!!!!!!!!!! 
*/ + +/* static */void D3D12Resource::setDebugName(ID3D12Resource* resource, const char* name) +{ + if (resource) + { + size_t len = ::strlen(name); + List<wchar_t> buf; + buf.SetSize(len + 1); + + D3DUtil::appendWideChars(name, buf); + resource->SetName(buf.begin()); + } +} + +void D3D12Resource::setDebugName(const char* name) +{ + setDebugName(m_resource, name); +} + +void D3D12Resource::setDebugName(const wchar_t* name) +{ + if (m_resource) + { + m_resource->SetName(name); + } +} + +void D3D12Resource::setResource(ID3D12Resource* resource, D3D12_RESOURCE_STATES initialState) +{ + if (resource != m_resource) + { + if (resource) + { + resource->AddRef(); + } + if (m_resource) + { + m_resource->Release(); + } + m_resource = resource; + } + m_prevState = initialState; + m_state = initialState; +} + +void D3D12Resource::setResourceNull() +{ + if (m_resource) + { + m_resource->Release(); + m_resource = nullptr; + } +} + +Result D3D12Resource::initCommitted(ID3D12Device* device, const D3D12_HEAP_PROPERTIES& heapProps, D3D12_HEAP_FLAGS heapFlags, const D3D12_RESOURCE_DESC& resourceDesc, D3D12_RESOURCE_STATES initState, const D3D12_CLEAR_VALUE * clearValue) +{ + setResourceNull(); + ComPtr<ID3D12Resource> resource; + SLANG_RETURN_ON_FAIL(device->CreateCommittedResource(&heapProps, heapFlags, &resourceDesc, initState, clearValue, IID_PPV_ARGS(resource.writeRef()))); + setResource(resource, initState); + return SLANG_OK; +} + +ID3D12Resource* D3D12Resource::detach() +{ + ID3D12Resource* resource = m_resource; + m_resource = nullptr; + return resource; +} + +void D3D12Resource::swap(ComPtr<ID3D12Resource>& resourceInOut) +{ + ID3D12Resource* tmp = m_resource; + m_resource = resourceInOut.detach(); + resourceInOut.attach(tmp); +} + +void D3D12Resource::setState(D3D12_RESOURCE_STATES state) +{ + m_prevState = state; + m_state = state; +} + +} // namespace gfx diff --git a/tools/gfx/resource-d3d12.h b/tools/gfx/resource-d3d12.h new file mode 100644 index 000000000..1764adf9d --- /dev/null +++ 
b/tools/gfx/resource-d3d12.h @@ -0,0 +1,178 @@ +// resource-d3d12.h +#pragma once + +#define WIN32_LEAN_AND_MEAN +#define NOMINMAX +#include +#undef WIN32_LEAN_AND_MEAN +#undef NOMINMAX + +#include +#include + +#include "../../slang-com-ptr.h" +#include "d3d-util.h" + +namespace gfx { + +// Enables more conservative barriers - restoring the state of resources after they are used. +// Should not need to be enabled in normal builds, as the barriers should correctly sync resources +// If enabling fixes an issue it implies regular barriers are not correctly used. +#define SLANG_ENABLE_CONSERVATIVE_RESOURCE_BARRIERS 0 + +struct D3D12BarrierSubmitter +{ + enum { MAX_BARRIERS = 8 }; + + /// Expand one space to hold a barrier + SLANG_FORCE_INLINE D3D12_RESOURCE_BARRIER& expandOne() { return (m_numBarriers < MAX_BARRIERS) ? m_barriers[m_numBarriers++] : _expandOne(); } + /// Flush barriers to command list + SLANG_FORCE_INLINE void flush() { if (m_numBarriers > 0) _flush(); } + + /// Transition resource from prevState to nextState + void transition(ID3D12Resource* resource, D3D12_RESOURCE_STATES prevState, D3D12_RESOURCE_STATES nextState); + + /// Ctor + SLANG_FORCE_INLINE D3D12BarrierSubmitter(ID3D12GraphicsCommandList* commandList) : m_numBarriers(0), m_commandList(commandList) { } + /// Dtor + SLANG_FORCE_INLINE ~D3D12BarrierSubmitter() { flush(); } + +protected: + D3D12_RESOURCE_BARRIER& _expandOne(); + void _flush(); + + ID3D12GraphicsCommandList* m_commandList; + int m_numBarriers; + D3D12_RESOURCE_BARRIER m_barriers[MAX_BARRIERS]; +}; + +/*! \brief A class to simplify using Dx12 fences. + +A fence is a mechanism to track GPU work. This is achieved by having a counter that the CPU holds +called the current value. Calling nextSignal will increase the CPU counter, and add a fence +with that value to the commandQueue. When the GPU has completed all the work before the fence it will +update the completed value. 
This is typically used when +the CPU needs to know that the GPU has completed some piece of work. To do this the CPU +can check the completed value, and when it is greater than or equal to the value returned by nextSignal the +CPU will know that all the work prior to when the nextSignal was added to the queue will have completed. + +NOTE! This cannot be used across threads because, amongst other reasons, SetEventOnCompletion +only works with a single value. + +Signal on the CommandQueue updates the fence on the GPU side. Signal on the fence object changes +the value on the CPU side (not used here). + +Useful article describing how Dx12 synchronization works: +https://msdn.microsoft.com/en-us/library/windows/desktop/dn899217%28v=vs.85%29.aspx +*/ +class D3D12CounterFence +{ +public: + /// Must be called before used + SlangResult init(ID3D12Device* device, uint64_t initialValue = 0); + /// Increases the counter, signals the queue and waits for the signal to be hit + void nextSignalAndWait(ID3D12CommandQueue* queue); + /// Signals with next counter value. Returns the value the signal was called on + uint64_t nextSignal(ID3D12CommandQueue* commandQueue); + /// Get the current value + SLANG_FORCE_INLINE uint64_t getCurrentValue() const { return m_currentValue; } + /// Get the completed value + SLANG_FORCE_INLINE uint64_t getCompletedValue() const { return m_fence->GetCompletedValue(); } + + /// Waits for the specified value + void waitUntilCompleted(uint64_t completedValue); + + /// Ctor + D3D12CounterFence() :m_event(nullptr), m_currentValue(0) {} + /// Dtor + ~D3D12CounterFence(); + +protected: + HANDLE m_event; + Slang::ComPtr<ID3D12Fence> m_fence; + UINT64 m_currentValue; +}; + +/** The base class for resource types allows for tracking of state. 
It does not allow for setting of the resource though, such that +an interface can return a D3D12ResourceBase, and a client can manipulate its state, but it cannot replace/change the actual resource */ +struct D3D12ResourceBase +{ + /// Add a transition if necessary to the list + void transition(D3D12_RESOURCE_STATES nextState, D3D12BarrierSubmitter& submitter); + /// Get the current state + SLANG_FORCE_INLINE D3D12_RESOURCE_STATES getState() const { return m_state; } + + /// Get the associated resource + SLANG_FORCE_INLINE ID3D12Resource* getResource() const { return m_resource; } + + /// True if a resource is set + SLANG_FORCE_INLINE bool isSet() const { return m_resource != nullptr; } + + /// Coercible into ID3D12Resource + SLANG_FORCE_INLINE operator ID3D12Resource*() const { return m_resource; } + + /// restore previous state +#if SLANG_ENABLE_CONSERVATIVE_RESOURCE_BARRIERS + SLANG_FORCE_INLINE void restore(D3D12BarrierSubmitter& submitter) { transition(m_prevState, submitter); } +#else + SLANG_FORCE_INLINE void restore(D3D12BarrierSubmitter& submitter) { SLANG_UNUSED(submitter) } +#endif + + /// Given the usage, flags, and format will return the most suitable format. 
Will return DXGI_FORMAT_UNKNOWN if combination is not possible + static DXGI_FORMAT calcFormat(D3DUtil::UsageType usage, ID3D12Resource* resource); + + /// Ctor + SLANG_FORCE_INLINE D3D12ResourceBase() : + m_state(D3D12_RESOURCE_STATE_COMMON), + m_prevState(D3D12_RESOURCE_STATE_COMMON), + m_resource(nullptr) + {} + +protected: + /// This is protected so that clients cannot slice the class and thereby lose state tracking + ~D3D12ResourceBase() {} + + ID3D12Resource* m_resource; ///< The resource (ref counted) + D3D12_RESOURCE_STATES m_state; ///< The current tracked expected state, if all associated transitions have completed on ID3D12CommandList + D3D12_RESOURCE_STATES m_prevState; ///< The previous state +}; + +struct D3D12Resource : public D3D12ResourceBase +{ + + /// Dtor + ~D3D12Resource() + { + if (m_resource) + { + m_resource->Release(); + } + } + + /// Initialize as committed resource + Slang::Result initCommitted(ID3D12Device* device, const D3D12_HEAP_PROPERTIES& heapProps, D3D12_HEAP_FLAGS heapFlags, const D3D12_RESOURCE_DESC& resourceDesc, D3D12_RESOURCE_STATES initState, const D3D12_CLEAR_VALUE * clearValue); + + /// Set a resource with an initial state + void setResource(ID3D12Resource* resource, D3D12_RESOURCE_STATES initialState); + /// Make the resource null + void setResourceNull(); + /// Returns the attached resource (with any ref counts) and sets to nullptr on this. + ID3D12Resource* detach(); + + /// Swaps the resource contents with the contents of the smart pointer + void swap(Slang::ComPtr<ID3D12Resource>& resourceInOut); + + /// Sets the current state of the resource (the current state is taken to be the future state once the command list has executed) + /// NOTE! This must be used with care, otherwise state tracking can be made incorrect. 
+ void setState(D3D12_RESOURCE_STATES state); + + /// Set the debug name on a resource + static void setDebugName(ID3D12Resource* resource, const char* name); + + /// Set the debug name on the resource + void setDebugName(const wchar_t* name); + /// Set the debug name + void setDebugName(const char* name); +}; + +} // namespace gfx diff --git a/tools/gfx/surface.cpp b/tools/gfx/surface.cpp new file mode 100644 index 000000000..4b53d278a --- /dev/null +++ b/tools/gfx/surface.cpp @@ -0,0 +1,222 @@ +// surface.cpp +#include "surface.h" + +#include +#include + +#include "../../source/core/list.h" + +namespace gfx { +using namespace Slang; + +class MallocSurfaceAllocator: public SurfaceAllocator +{ + public: + + virtual Slang::Result allocate(int width, int height, Format format, int alignment, Surface& surface) override; + virtual void deallocate(Surface& surface) override; +}; + +static MallocSurfaceAllocator s_mallocSurfaceAllocator; + +/// Get the malloc allocator +/* static */SurfaceAllocator* SurfaceAllocator::getMallocAllocator() +{ + return &s_mallocSurfaceAllocator; +} + +Slang::Result MallocSurfaceAllocator::allocate(int width, int height, Format format, int alignment, Surface& surface) +{ + assert(surface.m_data == nullptr); + + // Calculate row size + + const int rowSizeInBytes = Surface::calcRowSize(format, width); + const int numRows = Surface::calcNumRows(format, height); + + alignment = (alignment <= 0) ? 
int(sizeof(void*)) : alignment; + // It must be a power of 2 + assert( ((alignment - 1) & alignment) == 0); + + // Align rowSize + const int alignedRowSizeInBytes = (rowSizeInBytes + alignment - 1) & -alignment; + + size_t totalSize = numRows * alignedRowSizeInBytes; + + uint8_t* data = (uint8_t*)::malloc(totalSize); + if (!data) + { + return SLANG_E_OUT_OF_MEMORY; + } + + surface.m_data = data; + surface.m_width = width; + surface.m_height = height; + surface.m_format = format; + surface.m_numRows = numRows; + surface.m_rowStrideInBytes = alignedRowSizeInBytes; + + surface.m_allocator = this; + return SLANG_OK; +} + +void MallocSurfaceAllocator::deallocate(Surface& surface) +{ + assert(surface.m_data); + // Make sure the surface is not inverted (negative stride), because otherwise m_data is not the start address + assert(surface.m_rowStrideInBytes > 0); + ::free(surface.m_data); +} + +// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Surface !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! + +/* static */int Surface::calcRowSize(Format format, int width) +{ + size_t pixelSize = RendererUtil::getFormatSize(format); + if (pixelSize == 0) + { + return 0; + } + return int(pixelSize * width); +} + +/* static */int Surface::calcNumRows(Format format, int height) +{ + // Don't have any compressed types, so number of rows is same as the height + return height; +} + +void Surface::init() +{ + m_width = 0; + m_height = 0; + m_format = Format::Unknown; + m_data = nullptr; + m_numRows = 0; + m_rowStrideInBytes = 0; + // NOTE! Does not clear the allocator. + // If called while an allocation is held, that memory will leak! +} + +Surface::~Surface() +{ + if (m_data && m_allocator) + { + m_allocator->deallocate(*this); + } +} + +void Surface::deallocate() +{ + if (m_data && m_allocator) + { + m_allocator->deallocate(*this); + init(); + } +} + +Result Surface::allocate(int width, int height, Format format, int alignment, SurfaceAllocator* allocator) +{ + deallocate(); + allocator = allocator ? 
allocator : m_allocator; + if (!allocator) + { + // An allocator needs to be set on the surface, or one passed in. + return SLANG_FAIL; + } + return allocator->allocate(width, height, format, alignment, *this); +} + +void Surface::setUnowned(int width, int height, Format format, int strideInBytes, void* data) +{ + deallocate(); + + // This is unowned + m_allocator = nullptr; + + m_width = width; + m_height = height; + m_format = format; + m_rowStrideInBytes = strideInBytes; + m_data = (uint8_t*)data; + + m_numRows = Surface::calcNumRows(format, height); + + const int rowSizeInBytes = Surface::calcRowSize(format, width); + assert((strideInBytes > 0 && rowSizeInBytes <= strideInBytes) || (strideInBytes < 0 && rowSizeInBytes <= -strideInBytes)); +} + +void Surface::zeroContents() +{ + const int rowSizeInBytes = Surface::calcRowSize(m_format, m_width); + + const int stride = m_rowStrideInBytes; + uint8_t* dst = m_data; + + for (int i = 0; i < m_numRows; i++, dst += stride) + { + ::memset(dst, 0, rowSizeInBytes); + } +} + +void Surface::flipInplaceVertically() +{ + // Can only flip when m_height matches number of rows + assert(m_numRows == m_height); + + const int rowSizeInBytes = Surface::calcRowSize(m_format, m_width); + if (rowSizeInBytes <= 0 || m_numRows <= 1) + { + return; + } + + uint8_t* top = m_data; + uint8_t* bottom = m_data + (m_numRows - 1) * m_rowStrideInBytes; + + List<uint8_t> bufferList; + bufferList.SetSize(rowSizeInBytes); + uint8_t* buffer = bufferList.Buffer(); + + const int stride = m_rowStrideInBytes; + + const int num = m_height >> 1; + for (int i = 0; i < num; ++i, top += stride, bottom -= stride) + { + ::memcpy(buffer, top, rowSizeInBytes); + ::memcpy(top, bottom, rowSizeInBytes); + ::memcpy(bottom, buffer, rowSizeInBytes); + } +} + +SlangResult Surface::set(int width, int height, Format format, int srcRowStride, const void* data, SurfaceAllocator* allocator) +{ + if (hasContents() && m_width == width && m_height == height && m_format == format) + { + 
// I can just overwrite the contents that is there + } + else + { + SLANG_RETURN_ON_FAIL(allocate(width, height, format, 0, allocator)); + } + + // Okay just need to set the contents + + { + const size_t rowSize = calcRowSize(format, width); + + const uint8_t* srcRow = (const uint8_t*)data; + uint8_t* dstRow = (uint8_t*)m_data; + + for (int i = 0; i < m_numRows; i++) + { + ::memcpy(dstRow, srcRow, rowSize); + + srcRow += srcRowStride; + dstRow += m_rowStrideInBytes; + } + } + + return SLANG_OK; +} + +} // renderer_test diff --git a/tools/gfx/surface.h b/tools/gfx/surface.h new file mode 100644 index 000000000..3e0f6f0aa --- /dev/null +++ b/tools/gfx/surface.h @@ -0,0 +1,86 @@ +// surface.h +#pragma once + +#include "render.h" + +namespace gfx { + +class Surface; + +class SurfaceAllocator +{ + public: + virtual Slang::Result allocate(int width, int height, Format format, int alignment, Surface& surface) = 0; + virtual void deallocate(Surface& surface) = 0; + + /// Get the malloc allocator + static SurfaceAllocator* getMallocAllocator(); +}; + +class Surface +{ + public: + + enum + { + kDefaultAlignment = sizeof(void*) + }; + + /// Allocate + Slang::Result allocate(int width, int height, Format format, int alignment = kDefaultAlignment, SurfaceAllocator* allocator = nullptr); + + /// Deallocate contents + void deallocate(); + /// Initialize contents (zero sized, no data). 
Note that the allocator pointer is left as is + void init(); + + /// Set unowned + void setUnowned(int width, int height, Format format, int strideInBytes, void* data); + + /// Set the contents - the memory will be owned by this surface (i.e. it will be freed by the allocator when the surface goes out of scope or is deallocated) + Slang::Result set(int width, int height, Format format, int strideInBytes, const void* data, SurfaceAllocator* allocator); + + template <typename T> + T* calcNextRow(T* ptr) const { return (T*)calcNextRow((void*)ptr); } + template <typename T> + const T* calcNextRow(const T* ptr) const { return (const T*)calcNextRow((const void*)ptr); } + + void* calcNextRow(void* ptr) const { return (void*)(((uint8_t*)ptr) + m_rowStrideInBytes); } + const void* calcNextRow(const void* ptr) const { return (const void*)(((const uint8_t*)ptr) + m_rowStrideInBytes); } + + /// Writes zero to all of the contents + void zeroContents(); + + /// Flips the contents vertically in place + void flipInplaceVertically(); + + /// True if has some contents + bool hasContents() const { return m_data != nullptr; } + + /// Ctor + Surface() : + m_allocator(nullptr) + { + init(); + } + /// Dtor + ~Surface(); + + /// Get the size of the row in bytes + static int calcRowSize(Format format, int width); + /// Calculates the number of rows + static int calcNumRows(Format format, int height); + + int m_width; + int m_height; + Format m_format; + + uint8_t* m_data; ///< The data that makes up the image. If nullptr, has no data. Pointer to first 'row' of the image. 
+ + int m_numRows; ///< Total number of rows (typically same as height, but in compressed formats may be less) + int m_rowStrideInBytes; ///< The number of bytes between rows + + SurfaceAllocator* m_allocator; ///< The allocator that owns m_data. Null if the contents are 'unowned' +}; + +} // gfx diff --git a/tools/gfx/vector-math.h b/tools/gfx/vector-math.h new file mode 100644 index 000000000..88cb0c1d9 --- /dev/null +++ b/tools/gfx/vector-math.h @@ -0,0 +1,14 @@ +// vector-math.h +#pragma once + +// We will use the GLM library for our vector math types, just for simplicity. + +#include "../../external/glm/glm/glm.hpp" +#include "../../external/glm/glm/gtc/matrix_transform.hpp" +#include "../../external/glm/glm/gtc/constants.hpp" + +namespace gfx { + +using namespace glm; + +} // gfx diff --git a/tools/gfx/vk-api.cpp b/tools/gfx/vk-api.cpp new file mode 100644 index 000000000..4030e43ba --- /dev/null +++ b/tools/gfx/vk-api.cpp @@ -0,0 +1,138 @@ +// vk-api.cpp +#include "vk-api.h" + +#include "../../source/core/list.h" + +namespace gfx { +using namespace Slang; + +// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! VulkanApi !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
+ +#define VK_API_CHECK_FUNCTION(x) && (x != nullptr) +#define VK_API_CHECK_FUNCTIONS(FUNCTION_LIST) true FUNCTION_LIST(VK_API_CHECK_FUNCTION) + +bool VulkanApi::areDefined(ProcType type) const +{ + switch (type) + { + case ProcType::Global: return VK_API_CHECK_FUNCTIONS(VK_API_ALL_GLOBAL_PROCS); + case ProcType::Instance: return VK_API_CHECK_FUNCTIONS(VK_API_ALL_INSTANCE_PROCS); + case ProcType::Device: return VK_API_CHECK_FUNCTIONS(VK_API_ALL_DEVICE_PROCS); + default: + { + assert(!"Unhandled type"); + return false; + } + } +} + +Slang::Result VulkanApi::initGlobalProcs(const VulkanModule& module) +{ +#define VK_API_GET_GLOBAL_PROC(x) x = (PFN_##x)module.getFunction(#x); + + // Initialize all the global functions + VK_API_ALL_GLOBAL_PROCS(VK_API_GET_GLOBAL_PROC) + + if (!areDefined(ProcType::Global)) + { + return SLANG_FAIL; + } + m_module = &module; + return SLANG_OK; +} + +Slang::Result VulkanApi::initInstanceProcs(VkInstance instance) +{ + assert(instance && vkGetInstanceProcAddr != nullptr); + +#define VK_API_GET_INSTANCE_PROC(x) x = (PFN_##x)vkGetInstanceProcAddr(instance, #x); + + VK_API_ALL_INSTANCE_PROCS(VK_API_GET_INSTANCE_PROC) + + if (!areDefined(ProcType::Instance)) + { + return SLANG_FAIL; + } + + m_instance = instance; + return SLANG_OK; +} + +Slang::Result VulkanApi::initPhysicalDevice(VkPhysicalDevice physicalDevice) +{ + assert(m_physicalDevice == VK_NULL_HANDLE); + m_physicalDevice = physicalDevice; + + vkGetPhysicalDeviceProperties(m_physicalDevice, &m_deviceProperties); + vkGetPhysicalDeviceFeatures(m_physicalDevice, &m_deviceFeatures); + vkGetPhysicalDeviceMemoryProperties(m_physicalDevice, &m_deviceMemoryProperties); + + return SLANG_OK; +} + +Slang::Result VulkanApi::initDeviceProcs(VkDevice device) +{ + assert(m_instance && device && vkGetDeviceProcAddr != nullptr); + +#define VK_API_GET_DEVICE_PROC(x) x = (PFN_##x)vkGetDeviceProcAddr(device, #x); + + VK_API_ALL_DEVICE_PROCS(VK_API_GET_DEVICE_PROC) + + if (!areDefined(ProcType::Device)) + 
{ + return SLANG_FAIL; + } + + m_device = device; + return SLANG_OK; +} + +int VulkanApi::findMemoryTypeIndex(uint32_t typeBits, VkMemoryPropertyFlags properties) const +{ + assert(typeBits); + + const int numMemoryTypes = int(m_deviceMemoryProperties.memoryTypeCount); + + // bit holds the current test bit for typeBits, i.e. bit == 1 << i + + uint32_t bit = 1; + for (int i = 0; i < numMemoryTypes; ++i, bit += bit) + { + auto const& memoryType = m_deviceMemoryProperties.memoryTypes[i]; + if ((typeBits & bit) && (memoryType.propertyFlags & properties) == properties) + { + return i; + } + } + + //assert(!"failed to find a usable memory type"); + return -1; +} + +int VulkanApi::findQueue(VkQueueFlags reqFlags) const +{ + assert(m_physicalDevice != VK_NULL_HANDLE); + + uint32_t numQueueFamilies = 0; + vkGetPhysicalDeviceQueueFamilyProperties(m_physicalDevice, &numQueueFamilies, nullptr); + + Slang::List<VkQueueFamilyProperties> queueFamilies; + queueFamilies.SetSize(numQueueFamilies); + vkGetPhysicalDeviceQueueFamilyProperties(m_physicalDevice, &numQueueFamilies, queueFamilies.Buffer()); + + // Find a queue that can service our needs + //VkQueueFlags reqQueueFlags = VK_QUEUE_GRAPHICS_BIT | VK_QUEUE_COMPUTE_BIT; + + int queueFamilyIndex = -1; + for (int i = 0; i < int(numQueueFamilies); ++i) + { + if ((queueFamilies[i].queueFlags & reqFlags) == reqFlags) + { + return i; + } + } + + return -1; +} + +} // gfx diff --git a/tools/gfx/vk-api.h b/tools/gfx/vk-api.h new file mode 100644 index 000000000..5ec28ef6e --- /dev/null +++ b/tools/gfx/vk-api.h @@ -0,0 +1,196 @@ +// vk-api.h +#pragma once + +#include "vk-module.h" + +namespace gfx { + +#define VK_API_GLOBAL_PROCS(x) \ + x(vkGetInstanceProcAddr) \ + x(vkCreateInstance) \ + /* */ + +#define VK_API_INSTANCE_PROCS(x) \ + x(vkCreateDevice) \ + x(vkCreateDebugReportCallbackEXT) \ + x(vkDestroyDebugReportCallbackEXT) \ + x(vkDebugReportMessageEXT) \ + x(vkEnumeratePhysicalDevices) \ + x(vkGetPhysicalDeviceProperties) \ + 
x(vkGetPhysicalDeviceFeatures) \ + x(vkGetPhysicalDeviceMemoryProperties) \ + x(vkGetPhysicalDeviceQueueFamilyProperties) \ + x(vkGetPhysicalDeviceFormatProperties) \ + x(vkGetDeviceProcAddr) \ + /* */ + +#define VK_API_DEVICE_PROCS(x) \ + x(vkCreateDescriptorPool) \ + x(vkDestroyDescriptorPool) \ + x(vkGetDeviceQueue) \ + x(vkQueueSubmit) \ + x(vkQueueWaitIdle) \ + x(vkCreateBuffer) \ + x(vkAllocateMemory) \ + x(vkMapMemory) \ + x(vkUnmapMemory) \ + x(vkCmdCopyBuffer) \ + x(vkDestroyBuffer) \ + x(vkFreeMemory) \ + x(vkCreateDescriptorSetLayout) \ + x(vkDestroyDescriptorSetLayout) \ + x(vkAllocateDescriptorSets) \ + x(vkUpdateDescriptorSets) \ + x(vkCreatePipelineLayout) \ + x(vkDestroyPipelineLayout) \ + x(vkCreateComputePipelines) \ + x(vkCreateGraphicsPipelines) \ + x(vkDestroyPipeline) \ + x(vkCreateShaderModule) \ + x(vkDestroyShaderModule) \ + x(vkCreateFramebuffer) \ + x(vkDestroyFramebuffer) \ + x(vkCreateImage) \ + x(vkDestroyImage) \ + x(vkCreateImageView) \ + x(vkDestroyImageView) \ + x(vkCreateRenderPass) \ + x(vkDestroyRenderPass) \ + x(vkCreateCommandPool) \ + x(vkDestroyCommandPool) \ + x(vkCreateSampler) \ + x(vkDestroySampler) \ + x(vkCreateBufferView) \ + x(vkDestroyBufferView) \ + \ + x(vkGetBufferMemoryRequirements) \ + x(vkGetImageMemoryRequirements) \ + \ + x(vkCmdBindPipeline) \ + x(vkCmdBindDescriptorSets) \ + x(vkCmdDispatch) \ + x(vkCmdDraw) \ + x(vkCmdSetScissor) \ + x(vkCmdSetViewport) \ + x(vkCmdBindVertexBuffers) \ + x(vkCmdBindIndexBuffer) \ + x(vkCmdBeginRenderPass) \ + x(vkCmdEndRenderPass) \ + x(vkCmdPipelineBarrier) \ + x(vkCmdCopyBufferToImage)\ + \ + x(vkCreateFence) \ + x(vkDestroyFence) \ + x(vkResetFences) \ + x(vkGetFenceStatus) \ + x(vkWaitForFences) \ + \ + x(vkCreateSemaphore) \ + x(vkDestroySemaphore) \ + \ + x(vkCreateEvent) \ + x(vkDestroyEvent) \ + x(vkGetEventStatus) \ + x(vkSetEvent) \ + x(vkResetEvent) \ + \ + x(vkFreeCommandBuffers) \ + x(vkAllocateCommandBuffers) \ + x(vkBeginCommandBuffer) \ + 
x(vkEndCommandBuffer) \ + x(vkResetCommandBuffer) \ + \ + x(vkBindImageMemory) \ + x(vkBindBufferMemory) \ + /* */ + +#if SLANG_WINDOWS_FAMILY +# define VK_API_INSTANCE_PLATFORM_KHR_PROCS(x) \ + x(vkCreateWin32SurfaceKHR) \ + /* */ +#else +# define VK_API_INSTANCE_PLATFORM_KHR_PROCS(x) \ + x(vkCreateXlibSurfaceKHR) \ + /* */ +#endif + +#define VK_API_INSTANCE_KHR_PROCS(x) \ + VK_API_INSTANCE_PLATFORM_KHR_PROCS(x) \ + x(vkGetPhysicalDeviceSurfaceSupportKHR) \ + x(vkGetPhysicalDeviceSurfaceFormatsKHR) \ + x(vkGetPhysicalDeviceSurfacePresentModesKHR) \ + x(vkGetPhysicalDeviceSurfaceCapabilitiesKHR) \ + x(vkDestroySurfaceKHR) \ + /* */ + +#define VK_API_DEVICE_KHR_PROCS(x) \ + x(vkQueuePresentKHR) \ + x(vkCreateSwapchainKHR) \ + x(vkGetSwapchainImagesKHR) \ + x(vkDestroySwapchainKHR) \ + x(vkAcquireNextImageKHR) \ + /* */ + +#define VK_API_ALL_GLOBAL_PROCS(x) \ + VK_API_GLOBAL_PROCS(x) + +#define VK_API_ALL_INSTANCE_PROCS(x) \ + VK_API_INSTANCE_PROCS(x) \ + VK_API_INSTANCE_KHR_PROCS(x) + +#define VK_API_ALL_DEVICE_PROCS(x) \ + VK_API_DEVICE_PROCS(x) \ + VK_API_DEVICE_KHR_PROCS(x) + +#define VK_API_ALL_PROCS(x) \ + VK_API_ALL_GLOBAL_PROCS(x) \ + VK_API_ALL_INSTANCE_PROCS(x) \ + VK_API_ALL_DEVICE_PROCS(x) \ + /* */ + +#define VK_API_DECLARE_PROC(NAME) PFN_##NAME NAME = nullptr; + +struct VulkanApi +{ + VK_API_ALL_PROCS(VK_API_DECLARE_PROC) + + enum class ProcType + { + Global, + Instance, + Device, + }; + + /// Returns true if all the functions in the class are defined + bool areDefined(ProcType type) const; + + /// Sets up global parameters + Slang::Result initGlobalProcs(const VulkanModule& module); + /// Initialize the instance functions + Slang::Result initInstanceProcs(VkInstance instance); + + /// Called before initDevice + Slang::Result initPhysicalDevice(VkPhysicalDevice physicalDevice); + + /// Initialize the device functions + Slang::Result initDeviceProcs(VkDevice device); + + /// Type bits control which indices are tested against bit 0 for testing at index 0 
+ /// properties - a memory type must have all the bits set as passed in + /// Returns -1 if couldn't find an appropriate memory type index + int findMemoryTypeIndex(uint32_t typeBits, VkMemoryPropertyFlags properties) const; + + /// Given queue required flags, finds a queue + int findQueue(VkQueueFlags reqFlags) const; + + const VulkanModule* m_module = nullptr; ///< Module this was all loaded from + VkInstance m_instance = VK_NULL_HANDLE; + VkDevice m_device = VK_NULL_HANDLE; + VkPhysicalDevice m_physicalDevice = VK_NULL_HANDLE; + + VkPhysicalDeviceProperties m_deviceProperties; + VkPhysicalDeviceFeatures m_deviceFeatures; + VkPhysicalDeviceMemoryProperties m_deviceMemoryProperties; +}; + +} // renderer_test diff --git a/tools/gfx/vk-device-queue.cpp b/tools/gfx/vk-device-queue.cpp new file mode 100644 index 000000000..10a3d0e3b --- /dev/null +++ b/tools/gfx/vk-device-queue.cpp @@ -0,0 +1,199 @@ +// vk-device-queue.cpp +#include "vk-device-queue.h" + +#include +#include +#include + +namespace gfx { +using namespace Slang; + +VulkanDeviceQueue::~VulkanDeviceQueue() +{ + for (int i = 0; i < int(EventType::CountOf); ++i) + { + m_api->vkDestroySemaphore(m_api->m_device, m_semaphores[i], nullptr); + } + + for (int i = 0; i < m_numCommandBuffers; i++) + { + m_api->vkFreeCommandBuffers(m_api->m_device, m_commandPool, 1, &m_commandBuffers[i]); + m_api->vkDestroyFence(m_api->m_device, m_fences[i].fence, nullptr); + } + m_api->vkDestroyCommandPool(m_api->m_device, m_commandPool, nullptr); +} + +SlangResult VulkanDeviceQueue::init(const VulkanApi& api, VkQueue queue, int queueIndex) +{ + assert(m_api == nullptr); + m_api = &api; + + for (int i = 0; i < int(EventType::CountOf); ++i) + { + m_semaphores[i] = VK_NULL_HANDLE; + m_currentSemaphores[i] = VK_NULL_HANDLE; + } + + m_numCommandBuffers = kMaxCommandBuffers; + m_queueIndex = queueIndex; + + m_queue = queue; + + VkCommandPoolCreateInfo poolCreateInfo = {}; + poolCreateInfo.sType = 
VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO; + poolCreateInfo.flags = VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT; + + poolCreateInfo.queueFamilyIndex = queueIndex; + + api.vkCreateCommandPool(api.m_device, &poolCreateInfo, nullptr, &m_commandPool); + + VkCommandBufferAllocateInfo commandInfo = {}; + commandInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO; + commandInfo.commandPool = m_commandPool; + commandInfo.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY; + commandInfo.commandBufferCount = 1; + + VkFenceCreateInfo fenceCreateInfo = {}; + fenceCreateInfo.sType = VK_STRUCTURE_TYPE_FENCE_CREATE_INFO; + fenceCreateInfo.flags = 0; // VK_FENCE_CREATE_SIGNALED_BIT; + + for (int i = 0; i < m_numCommandBuffers; i++) + { + Fence& fence = m_fences[i]; + + api.vkAllocateCommandBuffers(api.m_device, &commandInfo, &m_commandBuffers[i]); + + api.vkCreateFence(api.m_device, &fenceCreateInfo, nullptr, &fence.fence); + fence.active = false; + fence.value = 0; + } + + VkSemaphoreCreateInfo semaphoreCreateInfo = {}; + semaphoreCreateInfo.sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO; + + for (int i = 0; i < int(EventType::CountOf); ++i) + { + api.vkCreateSemaphore(api.m_device, &semaphoreCreateInfo, nullptr, &m_semaphores[i]); + } + + // Second step of flush to prime command buffer + flushStepB(); + + return SLANG_OK; +} + +void VulkanDeviceQueue::flushStepA() +{ + m_api->vkEndCommandBuffer(m_commandBuffer); + + VkPipelineStageFlags stageFlags = VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT; + + VkSubmitInfo submitInfo = {}; + submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO; + + // Wait semaphores + if (isCurrent(EventType::BeginFrame)) + { + submitInfo.waitSemaphoreCount = 1; + submitInfo.pWaitSemaphores = &m_currentSemaphores[int(EventType::BeginFrame)]; + } + + submitInfo.pWaitDstStageMask = &stageFlags; + submitInfo.commandBufferCount = 1; + submitInfo.pCommandBuffers = &m_commandBuffer; + + // Signal semaphores + if (isCurrent(EventType::EndFrame)) + { + 
submitInfo.signalSemaphoreCount = 1; + submitInfo.pSignalSemaphores = &m_currentSemaphores[int(EventType::EndFrame)]; + } + + Fence& fence = m_fences[m_commandBufferIndex]; + + m_api->vkQueueSubmit(m_queue, 1, &submitInfo, fence.fence); + + // mark signaled fence value + fence.value = m_nextFenceValue; + fence.active = true; + + // increment fence value + m_nextFenceValue++; + + // No longer waiting on this semaphore + makeCompleted(EventType::BeginFrame); +} + +void VulkanDeviceQueue::_updateFenceAtIndex( int fenceIndex, bool blocking) +{ + Fence& fence = m_fences[fenceIndex]; + + if (fence.active) + { + uint64_t timeout = blocking ? ~uint64_t(0) : 0; + + if (VK_SUCCESS == m_api->vkWaitForFences(m_api->m_device, 1, &fence.fence, VK_TRUE, timeout)) + { + m_api->vkResetFences(m_api->m_device, 1, &fence.fence); + + fence.active = false; + + if (fence.value > m_lastFenceCompleted) + { + m_lastFenceCompleted = fence.value; + } + } + } +} + +void VulkanDeviceQueue::flushStepB() +{ + m_commandBufferIndex = (m_commandBufferIndex + 1) % m_numCommandBuffers; + m_commandBuffer = m_commandBuffers[m_commandBufferIndex]; + + // non-blocking update of fence values + for (int i = 0; i < m_numCommandBuffers; ++i) + { + _updateFenceAtIndex(i, false); + } + + // blocking update of fence values + _updateFenceAtIndex(m_commandBufferIndex, true); + + m_api->vkResetCommandBuffer(m_commandBuffer, 0); + + //m_api.vkResetCommandPool(m_api->m_device, m_commandPool, 0); + + VkCommandBufferBeginInfo beginInfo = {}; + beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO; + beginInfo.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT; + + m_api->vkBeginCommandBuffer(m_commandBuffer, &beginInfo); +} + +void VulkanDeviceQueue::flush() +{ + flushStepA(); + flushStepB(); +} + +void VulkanDeviceQueue::flushAndWait() +{ + flush(); + waitForIdle(); +} + +VkSemaphore VulkanDeviceQueue::makeCurrent(EventType eventType) +{ + assert(!isCurrent(eventType)); + VkSemaphore semaphore = 
m_semaphores[int(eventType)]; + m_currentSemaphores[int(eventType)] = semaphore; + return semaphore; +} + +void VulkanDeviceQueue::makeCompleted(EventType eventType) +{ + m_currentSemaphores[int(eventType)] = VK_NULL_HANDLE; +} + +} // gfx diff --git a/tools/gfx/vk-device-queue.h b/tools/gfx/vk-device-queue.h new file mode 100644 index 000000000..d57483ec0 --- /dev/null +++ b/tools/gfx/vk-device-queue.h @@ -0,0 +1,94 @@ +// vk-device-queue.h +#pragma once + +#include "vk-api.h" + +namespace gfx { + +struct VulkanDeviceQueue +{ + enum + { + kMaxCommandBuffers = 8, + }; + + enum class EventType + { + BeginFrame, + EndFrame, + CountOf, + }; + + /// Initialize - must be called before anything else can be done + SlangResult init(const VulkanApi& api, VkQueue queue, int queueIndex); + + /// Flushes the current command list, and steps to next (internally this is equivalent to a stepA followed by stepB) + void flush(); + /// Performs a full flush, and then waits for idle. + void flushAndWait(); + + /// Blocks until all work submitted to GPU has completed + void waitForIdle() { m_api->vkQueueWaitIdle(m_queue); } + + /// Get the graphics queue index (as set on init) + int getQueueIndex() const { return m_queueIndex; } + + /// Make the specified event 'current' - meaning its semaphore must be waited on + VkSemaphore makeCurrent(EventType eventType); + /// Makes the event no longer required to be waited on + void makeCompleted(EventType eventType); + /// Returns true if the event is already current + SLANG_FORCE_INLINE bool isCurrent(EventType eventType) const { return m_currentSemaphores[int(eventType)] != VK_NULL_HANDLE; } + + /// Get the command buffer + VkCommandBuffer getCommandBuffer() const { return m_commandBuffer; } + + /// Get the queue + VkQueue getQueue() const { return m_queue; } + + /// Get the API + const VulkanApi* getApi() const { return m_api; } + + /// Flushes the current command list + void flushStepA(); + /// Steps to next command buffer and 
opens. May block if command buffer is still in use + void flushStepB(); + + /// Dtor + ~VulkanDeviceQueue(); + + protected: + + struct Fence + { + VkFence fence; + bool active; + uint64_t value; + }; + + void _updateFenceAtIndex(int fenceIndex, bool blocking); + + VkQueue m_queue = VK_NULL_HANDLE; + + VkCommandPool m_commandPool = VK_NULL_HANDLE; + int m_numCommandBuffers = 0; + int m_commandBufferIndex = 0; + // There are the same number of command buffers as fences + VkCommandBuffer m_commandBuffers[kMaxCommandBuffers] = { VK_NULL_HANDLE }; + + Fence m_fences[kMaxCommandBuffers] = { {VK_NULL_HANDLE, 0, 0u} }; + + VkCommandBuffer m_commandBuffer = VK_NULL_HANDLE; + + VkSemaphore m_semaphores[int(EventType::CountOf)]; + VkSemaphore m_currentSemaphores[int(EventType::CountOf)]; + + uint64_t m_lastFenceCompleted = 1; + uint64_t m_nextFenceValue = 2; + + int m_queueIndex = 0; + + const VulkanApi* m_api = nullptr; +}; + +} // gfx diff --git a/tools/gfx/vk-module.cpp b/tools/gfx/vk-module.cpp new file mode 100644 index 000000000..4e92a3d2c --- /dev/null +++ b/tools/gfx/vk-module.cpp @@ -0,0 +1,76 @@ +// vk-module.cpp +#include "vk-module.h" + +#include +#include +#include + +#if SLANG_WINDOWS_FAMILY +# include +#else +# include +#endif + +namespace gfx { +using namespace Slang; + +// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! VulkanModule !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
+ +Slang::Result VulkanModule::init() +{ + if (isInitialized()) + { + // Already loaded, so there is nothing more to do + return SLANG_OK; + } + + const char* dynamicLibraryName = "Unknown"; + +#if SLANG_WINDOWS_FAMILY + dynamicLibraryName = "vulkan-1.dll"; + HMODULE module = ::LoadLibraryA(dynamicLibraryName); + m_module = (void*)module; +#else + dynamicLibraryName = "libvulkan.so.1"; + m_module = dlopen(dynamicLibraryName, RTLD_NOW); +#endif + + if (!m_module) + { + fprintf(stderr, "error: failed to load '%s'\n", dynamicLibraryName); + return SLANG_FAIL; + } + + return SLANG_OK; +} + +PFN_vkVoidFunction VulkanModule::getFunction(const char* name) const +{ + assert(m_module); + if (!m_module) + { + return nullptr; + } +#if SLANG_WINDOWS_FAMILY + return (PFN_vkVoidFunction)::GetProcAddress((HMODULE)m_module, name); +#else + return (PFN_vkVoidFunction)dlsym(m_module, name); +#endif +} + +void VulkanModule::destroy() +{ + if (!isInitialized()) + { + return; + } + +#if SLANG_WINDOWS_FAMILY + ::FreeLibrary((HMODULE)m_module); +#else + dlclose(m_module); +#endif + m_module = nullptr; +} + +} // gfx diff --git a/tools/gfx/vk-module.h b/tools/gfx/vk-module.h new file mode 100644 index 000000000..55e26f335 --- /dev/null +++ b/tools/gfx/vk-module.h @@ -0,0 +1,39 @@ +// vk-module.h +#pragma once + +#include "../../slang.h" + +#include "../../slang-com-helper.h" + +#if SLANG_WINDOWS_FAMILY +# define VK_USE_PLATFORM_WIN32_KHR 1 +#else +# define VK_USE_PLATFORM_XLIB_KHR 1 +#endif + +#define VK_NO_PROTOTYPES +#include + +namespace gfx { + +struct VulkanModule +{ + /// true if has been initialized + SLANG_FORCE_INLINE bool isInitialized() const { return m_module != nullptr; } + + /// Get a function by name + PFN_vkVoidFunction getFunction(const char* name) const; + + /// Initialize + Slang::Result init(); + /// Destroy + void destroy(); + + /// Dtor + ~VulkanModule() { destroy(); } + + protected: + void* m_module = nullptr; +}; + +} // gfx diff --git a/tools/gfx/vk-swap-chain.cpp 
b/tools/gfx/vk-swap-chain.cpp new file mode 100644 index 000000000..6e89c946c --- /dev/null +++ b/tools/gfx/vk-swap-chain.cpp @@ -0,0 +1,421 @@ +// vk-swap-chain.cpp +#include "vk-swap-chain.h" + +#include "vk-util.h" + +#include "../../source/core/list.h" + +#include +#include + +namespace gfx { +using namespace Slang; + +static int _indexOf(List<VkSurfaceFormatKHR>& formatsIn, VkFormat format) +{ + const int numFormats = int(formatsIn.Count()); + const VkSurfaceFormatKHR* formats = formatsIn.Buffer(); + + for (int i = 0; i < numFormats; ++i) + { + if (formats[i].format == format) + { + return i; + } + } + return -1; +} + +SlangResult VulkanSwapChain::init(VulkanDeviceQueue* deviceQueue, const Desc& descIn, const PlatformDesc* platformDescIn) +{ + assert(platformDescIn); + + m_deviceQueue = deviceQueue; + m_api = deviceQueue->getApi(); + + // Make sure it's not set initially + m_format = VK_FORMAT_UNDEFINED; + + Desc desc(descIn); + +#if SLANG_WINDOWS_FAMILY + const WinPlatformDesc* platformDesc = static_cast<const WinPlatformDesc*>(platformDescIn); + _setPlatformDesc(*platformDesc); + + VkWin32SurfaceCreateInfoKHR surfaceCreateInfo = {}; + surfaceCreateInfo.sType = VK_STRUCTURE_TYPE_WIN32_SURFACE_CREATE_INFO_KHR; + surfaceCreateInfo.hinstance = platformDesc->m_hinstance; + surfaceCreateInfo.hwnd = platformDesc->m_hwnd; + + SLANG_VK_RETURN_ON_FAIL(m_api->vkCreateWin32SurfaceKHR(m_api->m_instance, &surfaceCreateInfo, nullptr, &m_surface)); +#else + const XPlatformDesc* platformDesc = static_cast<const XPlatformDesc*>(platformDescIn); + _setPlatformDesc(*platformDesc); + + VkXlibSurfaceCreateInfoKHR surfaceCreateInfo = {}; + surfaceCreateInfo.sType = VK_STRUCTURE_TYPE_XLIB_SURFACE_CREATE_INFO_KHR; + surfaceCreateInfo.dpy = platformDesc->m_display; + surfaceCreateInfo.window = platformDesc->m_window; + + SLANG_VK_RETURN_ON_FAIL(m_api->vkCreateXlibSurfaceKHR(m_api->m_instance, &surfaceCreateInfo, nullptr, &m_surface)); +#endif + + VkBool32 supported = false; + 
m_api->vkGetPhysicalDeviceSurfaceSupportKHR(m_api->m_physicalDevice, 
deviceQueue->getQueueIndex(), m_surface, &supported); + + uint32_t numSurfaceFormats = 0; + List<VkSurfaceFormatKHR> surfaceFormats; + m_api->vkGetPhysicalDeviceSurfaceFormatsKHR(m_api->m_physicalDevice, m_surface, &numSurfaceFormats, nullptr); + surfaceFormats.SetSize(int(numSurfaceFormats)); + m_api->vkGetPhysicalDeviceSurfaceFormatsKHR(m_api->m_physicalDevice, m_surface, &numSurfaceFormats, surfaceFormats.Buffer()); + + // Look for a suitable format + List<VkFormat> formats; + formats.Add(VulkanUtil::getVkFormat(desc.m_format)); + // HACK! To check for a different format if couldn't be found + if (descIn.m_format == Format::RGBA_Unorm_UInt8) + { + formats.Add(VK_FORMAT_B8G8R8A8_UNORM); + } + + for(int i = 0; i < int(formats.Count()); ++i) + { + VkFormat format = formats[i]; + if (_indexOf(surfaceFormats, format) >= 0) + { + m_format = format; + } + } + + if (m_format == VK_FORMAT_UNDEFINED) + { + return SLANG_FAIL; + } + + // Save the desc + m_desc = desc; + + SLANG_RETURN_ON_FAIL(_createSwapChain()); + + m_desc = desc; + return SLANG_OK; +} + +void VulkanSwapChain::getWindowSize(int* widthOut, int* heightOut) const +{ +#if SLANG_WINDOWS_FAMILY + auto platformDesc = _getPlatformDesc(); + + RECT rc; + ::GetClientRect(platformDesc->m_hwnd, &rc); + *widthOut = rc.right - rc.left; + *heightOut = rc.bottom - rc.top; +#else + auto platformDesc = _getPlatformDesc(); + + XWindowAttributes winAttr = {}; + XGetWindowAttributes(platformDesc->m_display, platformDesc->m_window, &winAttr); + + *widthOut = winAttr.width; + *heightOut = winAttr.height; +#endif +} + +SlangResult VulkanSwapChain::_createFrameBuffers(VkRenderPass renderPass) +{ + assert(renderPass != VK_NULL_HANDLE); + + for (int i = 0; i < int(m_images.Count()); ++i) + { + Image& image = m_images[i]; + VkImageView attachments[] = + { + image.m_imageView + }; + + VkFramebufferCreateInfo framebufferInfo = {}; + framebufferInfo.sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO; + framebufferInfo.renderPass = renderPass; + 
framebufferInfo.attachmentCount = 1; + framebufferInfo.pAttachments = attachments; + framebufferInfo.width = m_width; + framebufferInfo.height = m_height; + framebufferInfo.layers = 1; + + SLANG_VK_RETURN_ON_FAIL(m_api->vkCreateFramebuffer(m_api->m_device, &framebufferInfo, nullptr, &image.m_frameBuffer)); + } + + return SLANG_OK; +} + +void VulkanSwapChain::_destroyFrameBuffers() +{ + for (int i = 0; i < int(m_images.Count()); ++i) + { + Image& image = m_images[i]; + if (image.m_frameBuffer != VK_NULL_HANDLE) + { + m_api->vkDestroyFramebuffer(m_api->m_device, image.m_frameBuffer, nullptr); + image.m_frameBuffer = VK_NULL_HANDLE; + } + } +} + +SlangResult VulkanSwapChain::createFrameBuffers(VkRenderPass renderPass) +{ + if (m_renderPass != VK_NULL_HANDLE) + { + _destroyFrameBuffers(); + m_renderPass = VK_NULL_HANDLE; + } + if (renderPass != VK_NULL_HANDLE) + { + SLANG_RETURN_ON_FAIL(_createFrameBuffers(renderPass)); + } + m_renderPass = renderPass; + return SLANG_OK; +} + +SlangResult VulkanSwapChain::_createSwapChain() +{ + if (hasValidSwapChain()) + { + return SLANG_OK; + } + + int width, height; + getWindowSize(&width, &height); + + VkExtent2D imageExtent = {}; + imageExtent.width = width; + imageExtent.height = height; + + m_width = width; + m_height = height; + + // catch this before throwing error + if (m_width == 0 || m_height == 0) + { + return SLANG_FAIL; + } + + // It is necessary to query the caps -> otherwise the LunarG verification layer will issue an error + { + VkSurfaceCapabilitiesKHR surfaceCaps; + + SLANG_VK_RETURN_ON_FAIL(m_api->vkGetPhysicalDeviceSurfaceCapabilitiesKHR(m_api->m_physicalDevice, m_surface, &surfaceCaps)); + } + + List<VkPresentModeKHR> presentModes; + uint32_t numPresentModes = 0; + m_api->vkGetPhysicalDeviceSurfacePresentModesKHR(m_api->m_physicalDevice, m_surface, &numPresentModes, nullptr); + presentModes.SetSize(numPresentModes); + m_api->vkGetPhysicalDeviceSurfacePresentModesKHR(m_api->m_physicalDevice, m_surface, &numPresentModes, 
presentModes.Buffer()); + + { + int numCheckPresentOptions = 3; + VkPresentModeKHR presentOptions[] = { VK_PRESENT_MODE_IMMEDIATE_KHR, VK_PRESENT_MODE_MAILBOX_KHR, VK_PRESENT_MODE_FIFO_KHR }; + if (m_vsync) + { + presentOptions[0] = VK_PRESENT_MODE_FIFO_KHR; + presentOptions[1] = VK_PRESENT_MODE_IMMEDIATE_KHR; + presentOptions[2] = VK_PRESENT_MODE_MAILBOX_KHR; + } + + m_presentMode = VK_PRESENT_MODE_MAX_ENUM_KHR; // Invalid + + // Find the first option that's available on the device + for (int j = 0; j < numCheckPresentOptions; j++) + { + if (presentModes.IndexOf(presentOptions[j]) != UInt(-1)) + { + m_presentMode = presentOptions[j]; + break; + } + } + + if (m_presentMode == VK_PRESENT_MODE_MAX_ENUM_KHR) + { + return SLANG_FAIL; + } + } + + VkSwapchainKHR oldSwapchain = VK_NULL_HANDLE; + + VkSwapchainCreateInfoKHR swapchainDesc = {}; + swapchainDesc.sType = VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR; + swapchainDesc.surface = m_surface; + swapchainDesc.minImageCount = 3; + swapchainDesc.imageFormat = m_format; + swapchainDesc.imageColorSpace = VK_COLOR_SPACE_SRGB_NONLINEAR_KHR; + swapchainDesc.imageExtent = imageExtent; + swapchainDesc.imageArrayLayers = 1; + swapchainDesc.imageUsage = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT; + swapchainDesc.imageSharingMode = VK_SHARING_MODE_EXCLUSIVE; + swapchainDesc.preTransform = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR; + swapchainDesc.compositeAlpha = VK_COMPOSITE_ALPHA_OPAQUE_BIT_KHR; + swapchainDesc.presentMode = m_presentMode; + swapchainDesc.clipped = VK_TRUE; + swapchainDesc.oldSwapchain = oldSwapchain; + + SLANG_VK_RETURN_ON_FAIL(m_api->vkCreateSwapchainKHR(m_api->m_device, &swapchainDesc, nullptr, &m_swapChain)); + + uint32_t numSwapChainImages = 0; + m_api->vkGetSwapchainImagesKHR(m_api->m_device, m_swapChain, &numSwapChainImages, nullptr); + + { + List<VkImage> images; + images.SetSize(numSwapChainImages); + + m_api->vkGetSwapchainImagesKHR(m_api->m_device, m_swapChain, &numSwapChainImages, 
images.Buffer()); + + m_images.SetSize(numSwapChainImages); + for (int i = 0; i < int(numSwapChainImages); ++i) + { + Image& dstImage = m_images[i]; + dstImage.m_image = images[i]; + + } + } + + { + VkImageViewCreateInfo createInfo = {}; + createInfo.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO; + + createInfo.viewType = VK_IMAGE_VIEW_TYPE_2D; + createInfo.format = m_format; + + createInfo.components.r = VK_COMPONENT_SWIZZLE_IDENTITY; + createInfo.components.g = VK_COMPONENT_SWIZZLE_IDENTITY; + createInfo.components.b = VK_COMPONENT_SWIZZLE_IDENTITY; + createInfo.components.a = VK_COMPONENT_SWIZZLE_IDENTITY; + + createInfo.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT; + createInfo.subresourceRange.baseMipLevel = 0; + createInfo.subresourceRange.levelCount = 1; + createInfo.subresourceRange.baseArrayLayer = 0; + createInfo.subresourceRange.layerCount = 1; + + for (int i = 0; i < int(numSwapChainImages); ++i) + { + Image& image = m_images[i]; + + createInfo.image = image.m_image; + + SLANG_VK_RETURN_ON_FAIL(m_api->vkCreateImageView(m_api->m_device, &createInfo, nullptr, &image.m_imageView)); + } + } + + if (m_renderPass != VK_NULL_HANDLE) + { + _createFrameBuffers(m_renderPass); + } + + return SLANG_OK; +} + +void VulkanSwapChain::_destroySwapChain() +{ + if (!hasValidSwapChain()) + { + return; + } + + m_deviceQueue->waitForIdle(); + + if (m_renderPass != VK_NULL_HANDLE) + { + _destroyFrameBuffers(); + } + + for (int i = 0; i < int(m_images.Count()); ++i) + { + Image& image = m_images[i]; + + if (image.m_imageView != VK_NULL_HANDLE) + { + m_api->vkDestroyImageView(m_api->m_device, image.m_imageView, nullptr); + } + } + + if (m_swapChain != VK_NULL_HANDLE) + { + m_api->vkDestroySwapchainKHR(m_api->m_device, m_swapChain, nullptr); + m_swapChain = VK_NULL_HANDLE; + } + + // Mark that it is no longer used + m_images.Clear(); +} + +VulkanSwapChain::~VulkanSwapChain() +{ + _destroySwapChain(); + + if (m_surface) + { + 
m_api->vkDestroySurfaceKHR(m_api->m_instance, m_surface, nullptr); + m_surface = VK_NULL_HANDLE; + } +} + +int VulkanSwapChain::nextFrontImageIndex() +{ + if (!hasValidSwapChain()) + { + if (SLANG_FAILED(_createSwapChain())) + { + return -1; + } + } + + VkSemaphore beginFrameSemaphore = m_deviceQueue->makeCurrent(VulkanDeviceQueue::EventType::BeginFrame); + + uint32_t swapChainIndex = 0; + VkResult result = m_api->vkAcquireNextImageKHR(m_api->m_device, m_swapChain, UINT64_MAX, beginFrameSemaphore, VK_NULL_HANDLE, &swapChainIndex); + + if (result != VK_SUCCESS) + { + _destroySwapChain(); + return -1; + } + m_currentSwapChainIndex = int(swapChainIndex); + return swapChainIndex; +} + +void VulkanSwapChain::present(bool vsync) +{ + if (!hasValidSwapChain()) + { + m_deviceQueue->flush(); + return; + } + + VkSemaphore endFrameSemaphore = m_deviceQueue->makeCurrent(VulkanDeviceQueue::EventType::EndFrame); + + m_deviceQueue->flushStepA(); + + uint32_t swapChainIndices[] = { uint32_t(m_currentSwapChainIndex) }; + + VkPresentInfoKHR presentInfo = {}; + presentInfo.sType = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR; + presentInfo.swapchainCount = 1; + presentInfo.pSwapchains = &m_swapChain; + presentInfo.pImageIndices = swapChainIndices; + presentInfo.waitSemaphoreCount = 1; + presentInfo.pWaitSemaphores = &endFrameSemaphore; + + VkResult result = m_api->vkQueuePresentKHR(m_deviceQueue->getQueue(), &presentInfo); + + m_deviceQueue->makeCompleted(VulkanDeviceQueue::EventType::EndFrame); + + m_deviceQueue->flushStepB(); + + if (result != VK_SUCCESS || m_vsync != vsync) + { + m_vsync = vsync; + _destroySwapChain(); + } +} + +} // renderer_test diff --git a/tools/gfx/vk-swap-chain.h b/tools/gfx/vk-swap-chain.h new file mode 100644 index 000000000..12feb0ed5 --- /dev/null +++ b/tools/gfx/vk-swap-chain.h @@ -0,0 +1,141 @@ +// vk-swap-chain.h +#pragma once + +#include "vk-api.h" +#include "vk-device-queue.h" + +#include "render.h" + +#include "../../source/core/list.h" + +namespace gfx { + 
+struct VulkanSwapChain +{ + /* enum + { + kMaxImages = 8, + }; */ + + /// Base class for platform specific information + struct PlatformDesc + { + }; + +#if SLANG_WINDOWS_FAMILY + struct WinPlatformDesc: public PlatformDesc + { + HINSTANCE m_hinstance; + HWND m_hwnd; + }; +#else + struct XPlatformDesc : public PlatformDesc + { + Display* m_display; + Window m_window; + }; +#endif + + struct Desc + { + void init() + { + m_format = Format::Unknown; + m_depthFormatTypeless = Format::Unknown; + m_depthFormat = Format::Unknown; + m_textureDepthFormat = Format::Unknown; + } + + Format m_format; + //bool m_enableFormat; + Format m_depthFormatTypeless; + Format m_depthFormat; + Format m_textureDepthFormat; + }; + + struct Image + { + VkImage m_image = VK_NULL_HANDLE; + VkImageView m_imageView = VK_NULL_HANDLE; + VkFramebuffer m_frameBuffer = VK_NULL_HANDLE; + }; + + + /// Must be called before the swap chain can be used + SlangResult init(VulkanDeviceQueue* deviceQueue, const Desc& desc, const PlatformDesc* platformDesc); + + /// Create the frame buffers (they must be compatible with the supplied renderPass) + SlangResult createFrameBuffers(VkRenderPass renderPass); + + /// Returned the desc used to construct the swap chain. + /// Is invalid if init hasn't returned with successful result. 
+ const Desc& getDesc() const { return m_desc; } + + /// True if the swap chain is available + bool hasValidSwapChain() const { return m_images.Count() > 0; } + + /// Present to the display + void present(bool vsync); + + /// Get the current size of the window (in pixels written to widthOut, heightOut) + void getWindowSize(int* widthOut, int* heightOut) const; + + /// Get the VkFormat for the back buffer + VkFormat getVkFormat() const { return m_format; } + + /// Get width of the back buffers + int getWidth() const { return m_width; } + /// Get the height of the back buffer + int getHeight() const { return m_height; } + + /// Get the detail about the images + const Slang::List& getImages() const { return m_images; } + + /// Get the next front render image index. Returns -1, if image couldn't be found + int nextFrontImageIndex(); + + /// Dtor + ~VulkanSwapChain(); + + protected: + + + template + void _setPlatformDesc(const T& desc) + { + const PlatformDesc* check = &desc; + int size = (sizeof(T) + sizeof(void*) - 1) / sizeof(void*); + m_platformDescBuffer.SetSize(size); + *(T*)m_platformDescBuffer.Buffer() = desc; + } + template + const T* _getPlatformDesc() const { return static_cast((const PlatformDesc*)m_platformDescBuffer.Buffer()); } + SlangResult _createSwapChain(); + void _destroySwapChain(); + SlangResult _createFrameBuffers(VkRenderPass renderPass); + void _destroyFrameBuffers(); + + bool m_vsync = true; + int m_width = 0; + int m_height = 0; + + VkPresentModeKHR m_presentMode = VK_PRESENT_MODE_IMMEDIATE_KHR; + VkFormat m_format = VK_FORMAT_UNDEFINED; ///< The format used for backbuffer. Valid after successful init. 
+ + VkSurfaceKHR m_surface = VK_NULL_HANDLE; + VkSwapchainKHR m_swapChain = VK_NULL_HANDLE; + + VkRenderPass m_renderPass = VK_NULL_HANDLE; //< Not owned + + int m_currentSwapChainIndex = 0; + + Slang::List m_images; + + VulkanDeviceQueue* m_deviceQueue = nullptr; + const VulkanApi* m_api = nullptr; + + Desc m_desc; ///< The desc used to init this swap chain + Slang::List m_platformDescBuffer; ///< Buffer to hold the platform specific description parameters (as passed in platformDesc) +}; + +} // renderer_test diff --git a/tools/gfx/vk-util.cpp b/tools/gfx/vk-util.cpp new file mode 100644 index 000000000..e8940d1b2 --- /dev/null +++ b/tools/gfx/vk-util.cpp @@ -0,0 +1,59 @@ +// vk-util.cpp +#include "vk-util.h" + +#include +#include + +namespace gfx { + +/* static */VkFormat VulkanUtil::getVkFormat(Format format) +{ + switch (format) + { + case Format::RGBA_Float32: return VK_FORMAT_R32G32B32A32_SFLOAT; + case Format::RGB_Float32: return VK_FORMAT_R32G32B32_SFLOAT; + case Format::RG_Float32: return VK_FORMAT_R32G32_SFLOAT; + case Format::R_Float32: return VK_FORMAT_R32_SFLOAT; + case Format::RGBA_Unorm_UInt8: return VK_FORMAT_R8G8B8A8_UNORM; + case Format::R_UInt32: return VK_FORMAT_R32_UINT; + + case Format::D_Float32: return VK_FORMAT_D32_SFLOAT; + case Format::D_Unorm24_S8: return VK_FORMAT_D24_UNORM_S8_UINT; + + default: return VK_FORMAT_UNDEFINED; + } +} + +/* static */SlangResult VulkanUtil::toSlangResult(VkResult res) +{ + return (res == VK_SUCCESS) ? 
SLANG_OK : SLANG_FAIL; +} + +/* static */Slang::Result VulkanUtil::handleFail(VkResult res) +{ + if (res != VK_SUCCESS) + { + assert(!"Vulkan returned a failure"); + } + return toSlangResult(res); +} + +/* static */void VulkanUtil::checkFail(VkResult res) +{ + assert(res != VK_SUCCESS); + assert(!"Vulkan check failed"); + +} + +/* static */VkPrimitiveTopology VulkanUtil::getVkPrimitiveTopology(PrimitiveTopology topology) +{ + switch (topology) + { + case PrimitiveTopology::TriangleList: return VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST; + default: break; + } + assert(!"Unknown topology"); + return VK_PRIMITIVE_TOPOLOGY_MAX_ENUM; +} + +} // gfx diff --git a/tools/gfx/vk-util.h b/tools/gfx/vk-util.h new file mode 100644 index 000000000..edba3a7d2 --- /dev/null +++ b/tools/gfx/vk-util.h @@ -0,0 +1,41 @@ +// vk-util.h +#pragma once + +#include "vk-api.h" +#include "render.h" + +// Macros to make testing vulkan return codes simpler + +/// SLANG_VK_RETURN_ON_FAIL can be used in a similar way to SLANG_RETURN_ON_FAIL macro, except it will turn a vulkan failure into Slang::Result in the process +/// Calls handleFail which on debug builds asserts +#define SLANG_VK_RETURN_ON_FAIL(x) { VkResult _res = x; if (_res != VK_SUCCESS) { return VulkanUtil::handleFail(_res); } } + +#define SLANG_VK_RETURN_NULL_ON_FAIL(x) { VkResult _res = x; if (_res != VK_SUCCESS) { VulkanUtil::handleFail(_res); return nullptr; } } + +/// Is similar to SLANG_VK_RETURN_ON_FAIL, but does not return. Will call checkFail on failure - which asserts on debug builds. +#define SLANG_VK_CHECK(x) { VkResult _res = x; if (_res != VK_SUCCESS) { VulkanUtil::checkFail(_res); } } + +namespace gfx { + +// Utility functions for Vulkan +struct VulkanUtil +{ + /// Get the equivalent VkFormat from the format + /// Returns VK_FORMAT_UNDEFINED if a match is not found + static VkFormat getVkFormat(Format format); + + /// Called by SLANG_VK_RETURN_ON_FAIL if res is a failure. 
+ /// On debug builds this will cause an assertion on failure. + static Slang::Result handleFail(VkResult res); + /// Called when a failure has occurred with SLANG_VK_CHECK - will typically assert. + static void checkFail(VkResult res); + + /// Get the VkPrimitiveTopology for the given topology. + /// Returns VK_PRIMITIVE_TOPOLOGY_MAX_ENUM on failure + static VkPrimitiveTopology getVkPrimitiveTopology(PrimitiveTopology topology); + + /// Returns Slang::Result equivalent of a VkResult + static Slang::Result toSlangResult(VkResult res); +}; + +} // renderer_test diff --git a/tools/gfx/window.cpp b/tools/gfx/window.cpp new file mode 100644 index 000000000..ee9f50813 --- /dev/null +++ b/tools/gfx/window.cpp @@ -0,0 +1,289 @@ +// window.cpp +#include "window.h" +#pragma once + +#include + +#ifdef _MSC_VER +#include +#if (_MSC_VER < 1900) +#define snprintf sprintf_s +#endif +#endif + +#include + + +#if _WIN32 +#include +#else +#error "The slang-graphics library currently only supports Windows platforms" +#endif + +namespace gfx { + +#if _WIN32 + +struct OSString +{ + OSString(char const* begin, char const* end) + { + _initialize(begin, end - begin); + } + + OSString(char const* begin) + { + _initialize(begin, strlen(begin)); + } + + ~OSString() + { + free(mBegin); + } + + operator WCHAR const*() + { + return mBegin; + } + +private: + WCHAR* mBegin; + WCHAR* mEnd; + + void _initialize(char const* input, size_t inputSize) + { + const DWORD dwFlags = 0; + int outputCodeUnitCount = ::MultiByteToWideChar(CP_UTF8, dwFlags, input, int(inputSize), nullptr, 0); + + WCHAR* buffer = (WCHAR*)malloc(sizeof(WCHAR) * (outputCodeUnitCount + 1)); + + ::MultiByteToWideChar(CP_UTF8, dwFlags, input, int(inputSize), buffer, outputCodeUnitCount); + buffer[outputCodeUnitCount] = 0; + + mBegin = buffer; + mEnd = buffer + outputCodeUnitCount; + } +}; + +struct ApplicationContext +{ + HINSTANCE instance; + int showCommand = SW_SHOWDEFAULT; + int resultCode = 0; +}; + +static uint64_t 
gTimerFrequency; + + +static void initApplication(ApplicationContext* context) +{ + LARGE_INTEGER timerFrequency; + QueryPerformanceFrequency(&timerFrequency); + gTimerFrequency = timerFrequency.QuadPart; +} + +/// Run an application given the specified callback and command-line arguments. +int runApplication( + ApplicationFunc func, + int argc, + char const* const* argv) +{ + ApplicationContext context; + context.instance = (HINSTANCE) GetModuleHandle(0); + initApplication(&context); + func(&context); + return context.resultCode; +} + +int runWindowsApplication( + ApplicationFunc func, + void* instance, + int showCommand) +{ + ApplicationContext context; + context.instance = (HINSTANCE) instance; + context.showCommand = showCommand; + initApplication(&context); + func(&context); + return context.resultCode; +} + +struct Window +{ + HWND handle; +}; + +static LRESULT CALLBACK windowProc( + HWND windowHandle, + UINT message, + WPARAM wParam, + LPARAM lParam) +{ + // TODO: Actually implement some reasonable logic here. 
+ switch (message) + { + case WM_CLOSE: + PostQuitMessage(0); + return 0; + } + + return DefWindowProcW(windowHandle, message, wParam, lParam); +} + + +static ATOM createWindowClassAtom() +{ + WNDCLASSEXW windowClassDesc; + windowClassDesc.cbSize = sizeof(windowClassDesc); + windowClassDesc.style = CS_OWNDC | CS_HREDRAW | CS_VREDRAW; + windowClassDesc.lpfnWndProc = &windowProc; + windowClassDesc.cbClsExtra = 0; + windowClassDesc.cbWndExtra = 0; + windowClassDesc.hInstance = (HINSTANCE) GetModuleHandle(0); + windowClassDesc.hIcon = 0; + windowClassDesc.hCursor = 0; + windowClassDesc.hbrBackground = 0; + windowClassDesc.lpszMenuName = 0; + windowClassDesc.lpszClassName = L"SlangGraphicsWindow"; + windowClassDesc.hIconSm = 0; + ATOM windowClassAtom = RegisterClassExW(&windowClassDesc); + return windowClassAtom; +} + +static ATOM getWindowClassAtom() +{ + static ATOM windowClassAtom = createWindowClassAtom(); + return windowClassAtom; +} + +Window* createWindow(WindowDesc const& desc) +{ + Window* window = new Window(); + + OSString windowTitle(desc.title); + + DWORD windowExtendedStyle = 0; + DWORD windowStyle = 0; + + HINSTANCE instance = (HINSTANCE) GetModuleHandle(0); + + HWND windowHandle = CreateWindowExW( + windowExtendedStyle, + (LPWSTR) getWindowClassAtom(), + windowTitle, + windowStyle, + 0, 0, // x, y + desc.width, desc.height, + NULL, // parent + NULL, // menu + instance, + window); + + if(!windowHandle) + { + delete window; + return nullptr; + } + + window->handle = windowHandle; + return window; +} + +void showWindow(Window* window) +{ + ShowWindow(window->handle, SW_SHOW); +} + +void* getPlatformWindowHandle(Window* window) +{ + return window->handle; +} + +bool dispatchEvents(ApplicationContext* context) +{ + for(;;) + { + MSG message; + + int result = PeekMessageW(&message, NULL, 0, 0, PM_REMOVE); + if (result != 0) + { + if (message.message == WM_QUIT) + { + context->resultCode = (int)message.wParam; + return false; + } + + TranslateMessage(&message); 
+ DispatchMessageW(&message); + } + else + { + return true; + } + } + +} + +void exitApplication(ApplicationContext* context, int resultCode) +{ + ExitProcess(resultCode); +} + +void log(char const* message, ...) +{ + va_list args; + va_start(args, message); + + static const int kBufferSize = 1024; + char messageBuffer[kBufferSize]; + vsnprintf(messageBuffer, kBufferSize - 1, message, args); + messageBuffer[kBufferSize - 1] = 0; + + va_end(args); + + fputs(messageBuffer, stderr); + + OSString wideMessageBuffer(messageBuffer); + OutputDebugStringW(wideMessageBuffer); +} + +int reportError(char const* message, ...) +{ + va_list args; + va_start(args, message); + + static const int kBufferSize = 1024; + char messageBuffer[kBufferSize]; + vsnprintf(messageBuffer, kBufferSize - 1, message, args); + messageBuffer[kBufferSize - 1] = 0; + + va_end(args); + + fputs(messageBuffer, stderr); + + OSString wideMessageBuffer(messageBuffer); + OutputDebugStringW(wideMessageBuffer); + + return 1; +} + +uint64_t getCurrentTime() +{ + LARGE_INTEGER counter; + QueryPerformanceCounter(&counter); + return counter.QuadPart; +} + +uint64_t getTimerFrequency() +{ + return gTimerFrequency; +} + +#else + +// TODO: put an SDL version here + +#endif + +} // gfx diff --git a/tools/gfx/window.h b/tools/gfx/window.h new file mode 100644 index 000000000..6e557d26c --- /dev/null +++ b/tools/gfx/window.h @@ -0,0 +1,78 @@ +// window.h +#pragma once + +#include + +namespace gfx { + +struct WindowDesc +{ + char const* title; + int width; + int height; +}; + +typedef struct Window Window; + +Window* createWindow(WindowDesc const& desc); +void showWindow(Window* window); +void* getPlatformWindowHandle(Window* window); + +/// Opaque state provided by platform for a running application. +typedef struct ApplicationContext ApplicationContext; + +/// User-defined application entry-point function. +typedef void(*ApplicationFunc)(ApplicationContext* context); + +/// Dispatch any pending events for application. 
+/// + /// @returns `true` if application should keep running. +bool dispatchEvents(ApplicationContext* context); + +/// Exit the application with a given result code +void exitApplication(ApplicationContext* context, int resultCode); + +/// Log a message to an appropriate logging destination. +void log(char const* message, ...); + +/// Report an error to an appropriate logging destination. +int reportError(char const* message, ...); + +uint64_t getCurrentTime(); + +uint64_t getTimerFrequency(); + +/// Run an application given the specified callback and command-line arguments. +int runApplication( + ApplicationFunc func, + int argc, + char const* const* argv); + +#define GFX_CONSOLE_MAIN(APPLICATION_ENTRY) \ + int main(int argc, char** argv) { \ + return gfx::runApplication(&(APPLICATION_ENTRY), argc, argv); \ + } + +#ifdef _WIN32 + +int runWindowsApplication( + ApplicationFunc func, + void* instance, + int showCommand); + +#define GFX_UI_MAIN(APPLICATION_ENTRY) \ + int __stdcall WinMain( \ + void* instance, \ + void* /* prevInstance */, \ + void* /* commandLine */, \ + int showCommand) { \ + return gfx::runWindowsApplication(&(APPLICATION_ENTRY), instance, showCommand); \ + } + +#else + +#define GFX_UI_MAIN(APPLICATION_ENTRY) GFX_CONSOLE_MAIN(APPLICATION_ENTRY) + +#endif + +} // gfx diff --git a/tools/render-test/main.cpp b/tools/render-test/main.cpp index 935b9bc98..4734c2c8f 100644 --- a/tools/render-test/main.cpp +++ b/tools/render-test/main.cpp @@ -29,6 +29,8 @@ namespace renderer_test { +using Slang::Result; + int gWindowWidth = 1024; int gWindowHeight = 768; @@ -45,7 +47,7 @@ struct Vertex float uv[2]; }; -static const Vertex kVertexData[] = +static const Vertex kVertexData[] = { { { 0, 0, 0.5 }, {1, 0, 0} , {0, 0} }, { { 0, 1, 0.5 }, {0, 0, 1} , {1, 0} }, @@ -61,15 +63,15 @@ class RenderTestApp // At initialization time, we are going to load and compile our Slang shader // code, and then create the API objects we need for rendering. 
- Result initialize(Renderer* renderer, ShaderCompiler* shaderCompiler); + Result initialize(Renderer* renderer, ShaderCompiler* shaderCompiler); void runCompute(); void renderFrame(); void finalize(); - BindingState* getBindingState() const { return m_bindingState; } + BindingStateImpl* getBindingState() const { return m_bindingState; } Result writeBindingOutput(const char* fileName); - + Result writeScreen(const char* filename); protected: @@ -85,7 +87,8 @@ class RenderTestApp RefPtr m_inputLayout; RefPtr m_vertexBuffer; RefPtr m_shaderProgram; - RefPtr m_bindingState; + RefPtr m_pipelineState; + RefPtr m_bindingState; ShaderInputLayout m_shaderInputLayout; ///< The binding layout int m_numAddedConstantBuffers; ///< Constant buffers can be added to the binding directly. Will be added at the end. @@ -117,22 +120,26 @@ SlangResult RenderTestApp::initialize(Renderer* renderer, ShaderCompiler* shader } { - BindingState::Desc bindingStateDesc; - SLANG_RETURN_ON_FAIL(ShaderRendererUtil::createBindingStateDesc(m_shaderInputLayout, m_renderer, bindingStateDesc)); - - //! Hack -> if bindings are specified, just set up the constant buffer binding - // Should probably be more sophisticated than this - with 'dynamic' constant buffer/s binding always being specified - // in the test file - - if ((gOptions.shaderType == Options::ShaderProgramType::Graphics || gOptions.shaderType == Options::ShaderProgramType::GraphicsCompute) - && bindingStateDesc.findBindingIndex(Resource::BindFlag::ConstantBuffer, 0) < 0) + //! 
Hack -> if doing a graphics test, add an extra binding for our dynamic constant buffer + // + // TODO: Should probably be more sophisticated than this - with 'dynamic' constant buffer/s binding always being specified + // in the test file + RefPtr addedConstantBuffer; + switch(gOptions.shaderType) { - bindingStateDesc.addResource(BindingType::Buffer, m_constantBuffer, BindingState::RegisterRange::makeSingle(0) ); + default: + break; + case Options::ShaderProgramType::Graphics: + case Options::ShaderProgramType::GraphicsCompute: + addedConstantBuffer = m_constantBuffer; m_numAddedConstantBuffers++; + break; } - m_bindingState = m_renderer->createBindingState(bindingStateDesc); + BindingStateImpl* bindingState = nullptr; + SLANG_RETURN_ON_FAIL(ShaderRendererUtil::createBindingState(m_shaderInputLayout, m_renderer, addedConstantBuffer, &bindingState)); + m_bindingState = bindingState; } // Do other initialization that doesn't depend on the source language. @@ -156,6 +163,38 @@ SlangResult RenderTestApp::initialize(Renderer* renderer, ShaderCompiler* shader if(!m_vertexBuffer) return SLANG_FAIL; + { + switch(gOptions.shaderType) + { + default: + assert(!"unexpected test shader type"); + return SLANG_FAIL; + + case Options::ShaderProgramType::Compute: + { + ComputePipelineStateDesc desc; + desc.pipelineLayout = m_bindingState->pipelineLayout; + desc.program = m_shaderProgram; + + m_pipelineState = renderer->createComputePipelineState(desc); + } + break; + + case Options::ShaderProgramType::Graphics: + case Options::ShaderProgramType::GraphicsCompute: + { + GraphicsPipelineStateDesc desc; + desc.pipelineLayout = m_bindingState->pipelineLayout; + desc.program = m_shaderProgram; + desc.inputLayout = m_inputLayout; + desc.renderTargetCount = m_bindingState->m_numRenderTargets; + + m_pipelineState = renderer->createGraphicsPipelineState(desc); + } + break; + } + } + return SLANG_OK; } @@ -182,6 +221,16 @@ Result RenderTestApp::initializeShaders(ShaderCompiler* 
shaderCompiler) fclose(sourceFile); sourceText[sourceSize] = 0; + switch( gOptions.shaderType ) + { + default: + m_shaderInputLayout.numRenderTargets = 1; + break; + + case Options::ShaderProgramType::Compute: + m_shaderInputLayout.numRenderTargets = 0; + break; + } m_shaderInputLayout.Parse(sourceText); ShaderCompileRequest::SourceInfo sourceInfo; @@ -220,31 +269,27 @@ void RenderTestApp::renderFrame() { const ProjectionStyle projectionStyle = RendererUtil::getProjectionStyle(m_renderer->getRendererType()); RendererUtil::getIdentityProjection(projectionStyle, (float*)mappedData); - + m_renderer->unmap(m_constantBuffer); } - // Input Assembler (IA) + auto pipelineType = PipelineType::Graphics; - m_renderer->setInputLayout(m_inputLayout); - m_renderer->setPrimitiveTopology(PrimitiveTopology::TriangleList); + m_renderer->setPipelineState(pipelineType, m_pipelineState); + m_renderer->setPrimitiveTopology(PrimitiveTopology::TriangleList); m_renderer->setVertexBuffer(0, m_vertexBuffer, sizeof(Vertex)); - // Vertex Shader (VS) - // Pixel Shader (PS) - - m_renderer->setShaderProgram(m_shaderProgram); - m_renderer->setBindingState(m_bindingState); - // + m_bindingState->apply(m_renderer, pipelineType); m_renderer->draw(3); } void RenderTestApp::runCompute() { - m_renderer->setShaderProgram(m_shaderProgram); - m_renderer->setBindingState(m_bindingState); + auto pipelineType = PipelineType::Compute; + m_renderer->setPipelineState(pipelineType, m_pipelineState); + m_bindingState->apply(m_renderer, pipelineType); m_renderer->dispatchCompute(1, 1, 1); } @@ -265,18 +310,12 @@ Result RenderTestApp::writeBindingOutput(const char* fileName) return SLANG_FAIL; } - const BindingState::Desc& bindingStateDesc = m_bindingState->getDesc(); - // Must be the same amount of entries - assert(bindingStateDesc.m_bindings.Count() == m_shaderInputLayout.entries.Count() + m_numAddedConstantBuffers); - - const int numBindings = int(m_shaderInputLayout.entries.Count()); - - for (int i = 0; i < 
numBindings; ++i) + for(auto binding : m_bindingState->outputBindings) { + auto i = binding.entryIndex; const auto& layoutBinding = m_shaderInputLayout.entries[i]; - const auto& binding = bindingStateDesc.m_bindings[i]; - if (layoutBinding.isOutput) + assert(layoutBinding.isOutput); { if (binding.resource && binding.resource->isBuffer()) { @@ -524,11 +563,11 @@ SlangResult innerMain(int argc, char** argv) else { Result res = app.writeScreen(gOptions.outputPath); - + if (SLANG_FAILED(res)) { fprintf(stderr, "ERROR: failed to write screen capture to file\n"); - return res; + return res; } } return SLANG_OK; diff --git a/tools/render-test/options.h b/tools/render-test/options.h index 82c018f66..78f673796 100644 --- a/tools/render-test/options.h +++ b/tools/render-test/options.h @@ -9,7 +9,7 @@ namespace renderer_test { -using namespace slang_graphics; +using namespace gfx; struct Options { diff --git a/tools/render-test/png-serialize-util.h b/tools/render-test/png-serialize-util.h index dad17ae74..1ec5204f7 100644 --- a/tools/render-test/png-serialize-util.h +++ b/tools/render-test/png-serialize-util.h @@ -5,7 +5,7 @@ namespace renderer_test { -using namespace slang_graphics; +using namespace gfx; struct PngSerializeUtil { diff --git a/tools/render-test/render-test.vcxproj b/tools/render-test/render-test.vcxproj index 66ad9e7ed..91c8bd997 100644 --- a/tools/render-test/render-test.vcxproj +++ b/tools/render-test/render-test.vcxproj @@ -99,7 +99,7 @@ NotUsing Level3 _DEBUG;%(PreprocessorDefinitions) - ..\..;..\..\external;..\..\source;..\slang-graphics;%(AdditionalIncludeDirectories) + ..\..;..\..\external;..\..\source;..\gfx;%(AdditionalIncludeDirectories) EditAndContinue Disabled MultiThreadedDebug @@ -117,7 +117,7 @@ NotUsing Level3 _DEBUG;%(PreprocessorDefinitions) - ..\..;..\..\external;..\..\source;..\slang-graphics;%(AdditionalIncludeDirectories) + ..\..;..\..\external;..\..\source;..\gfx;%(AdditionalIncludeDirectories) EditAndContinue Disabled 
MultiThreadedDebug @@ -135,7 +135,7 @@ NotUsing Level3 NDEBUG;%(PreprocessorDefinitions) - ..\..;..\..\external;..\..\source;..\slang-graphics;%(AdditionalIncludeDirectories) + ..\..;..\..\external;..\..\source;..\gfx;%(AdditionalIncludeDirectories) Full true true @@ -157,7 +157,7 @@ NotUsing Level3 NDEBUG;%(PreprocessorDefinitions) - ..\..;..\..\external;..\..\source;..\slang-graphics;%(AdditionalIncludeDirectories) + ..\..;..\..\external;..\..\source;..\gfx;%(AdditionalIncludeDirectories) Full true true @@ -196,7 +196,7 @@ {DB00DA62-0533-4AFD-B59F-A67D5B3A0808} - + {222F7498-B40C-4F3F-A704-DDEB91A4484A} diff --git a/tools/render-test/shader-input-layout.h b/tools/render-test/shader-input-layout.h index 19a7e59d0..92dd516a7 100644 --- a/tools/render-test/shader-input-layout.h +++ b/tools/render-test/shader-input-layout.h @@ -7,7 +7,7 @@ namespace renderer_test { -using namespace slang_graphics; +using namespace gfx; enum class ShaderInputType { diff --git a/tools/render-test/shader-renderer-util.cpp b/tools/render-test/shader-renderer-util.cpp index e46c725bc..f6c0366bb 100644 --- a/tools/render-test/shader-renderer-util.cpp +++ b/tools/render-test/shader-renderer-util.cpp @@ -5,6 +5,16 @@ namespace renderer_test { using namespace Slang; +using Slang::Result; + +void BindingStateImpl::apply(Renderer* renderer, PipelineType pipelineType) +{ + renderer->setDescriptorSet( + pipelineType, + pipelineLayout, + 0, + descriptorSet); +} /* static */Result ShaderRendererUtil::generateTextureResource(const InputTextureDesc& inputDesc, int bindFlags, Renderer* renderer, RefPtr& textureOut) { @@ -125,16 +135,27 @@ using namespace Slang; return SLANG_OK; } -static BindingState::SamplerDesc _calcSamplerDesc(const InputSamplerDesc& srcDesc) +static SamplerState::Desc _calcSamplerDesc(const InputSamplerDesc& srcDesc) { - BindingState::SamplerDesc dstDesc; - dstDesc.isCompareSampler = srcDesc.isCompareSampler; + SamplerState::Desc dstDesc; + if (srcDesc.isCompareSampler) + { + 
dstDesc.reductionOp = TextureReductionOp::Comparison; + dstDesc.comparisonFunc = ComparisonFunc::Less; + } return dstDesc; } -/* static */BindingState::RegisterRange ShaderRendererUtil::calcRegisterRange(Renderer* renderer, const ShaderInputLayoutEntry& entry) +static RefPtr _createSamplerState( + Renderer* renderer, + const InputSamplerDesc& srcDesc) { - typedef BindingState::RegisterRange RegisterRange; + return renderer->createSamplerState(_calcSamplerDesc(srcDesc)); +} + +/* static */BindingStateImpl::RegisterRange ShaderRendererUtil::calcRegisterRange(Renderer* renderer, const ShaderInputLayoutEntry& entry) +{ + typedef BindingStateImpl::RegisterRange RegisterRange; BindingStyle bindingStyle = RendererUtil::getBindingStyle(renderer->getRendererType()); @@ -179,71 +200,227 @@ static BindingState::SamplerDesc _calcSamplerDesc(const InputSamplerDesc& srcDes return RegisterRange::makeInvalid(); } -/* static */Result ShaderRendererUtil::createBindingStateDesc(ShaderInputLayoutEntry* srcEntries, int numEntries, Renderer* renderer, BindingState::Desc& descOut) +/* static */Result ShaderRendererUtil::createBindingState(const ShaderInputLayout& layout, Renderer* renderer, BufferResource* addedConstantBuffer, BindingStateImpl** outBindingState) { + auto srcEntries = layout.entries.Buffer(); + auto numEntries = int(layout.entries.Count()); + const int textureBindFlags = Resource::BindFlag::NonPixelShaderResource | Resource::BindFlag::PixelShaderResource; - descOut.clear(); + List slotRangeDescs; + + if(addedConstantBuffer) + { + DescriptorSetLayout::SlotRangeDesc slotRangeDesc; + slotRangeDesc.type = DescriptorSlotType::UniformBuffer; + + slotRangeDescs.Add(slotRangeDesc); + } + for (int i = 0; i < numEntries; i++) { const ShaderInputLayoutEntry& srcEntry = srcEntries[i]; - const BindingState::RegisterRange registerSet = calcRegisterRange(renderer, srcEntry); + const BindingStateImpl::RegisterRange registerSet = calcRegisterRange(renderer, srcEntry); if 
(!registerSet.isValid()) { assert(!"Couldn't find a binding"); return SLANG_FAIL; } + DescriptorSetLayout::SlotRangeDesc slotRangeDesc; + switch (srcEntry.type) { case ShaderInputType::Buffer: - { - const InputBufferDesc& srcBuffer = srcEntry.bufferDesc; + { + const InputBufferDesc& srcBuffer = srcEntry.bufferDesc; + + switch (srcBuffer.type) + { + case InputBufferType::ConstantBuffer: + slotRangeDesc.type = DescriptorSlotType::UniformBuffer; + break; + + case InputBufferType::StorageBuffer: + slotRangeDesc.type = DescriptorSlotType::StorageBuffer; + break; + } + } + break; - const size_t bufferSize = srcEntry.bufferData.Count() * sizeof(uint32_t); + case ShaderInputType::CombinedTextureSampler: + { + slotRangeDesc.type = DescriptorSlotType::CombinedImageSampler; + } + break; - RefPtr bufferResource; - SLANG_RETURN_ON_FAIL(createBufferResource(srcEntry.bufferDesc, srcEntry.isOutput, bufferSize, srcEntry.bufferData.Buffer(), renderer, bufferResource)); + case ShaderInputType::Texture: + { + if (srcEntry.textureDesc.isRWTexture) + { + slotRangeDesc.type = DescriptorSlotType::StorageImage; + } + else + { + slotRangeDesc.type = DescriptorSlotType::SampledImage; + } + } + break; - descOut.addBufferResource(bufferResource, registerSet); + case ShaderInputType::Sampler: + slotRangeDesc.type = DescriptorSlotType::Sampler; break; - } + + default: + assert(!"Unhandled type"); + return SLANG_FAIL; + } + slotRangeDescs.Add(slotRangeDesc); + } + + DescriptorSetLayout::Desc descriptorSetLayoutDesc; + descriptorSetLayoutDesc.slotRangeCount = slotRangeDescs.Count(); + descriptorSetLayoutDesc.slotRanges = slotRangeDescs.Buffer(); + + auto descriptorSetLayout = renderer->createDescriptorSetLayout(descriptorSetLayoutDesc); + if(!descriptorSetLayout) return SLANG_FAIL; + + List pipelineDescriptorSets; + pipelineDescriptorSets.Add(PipelineLayout::DescriptorSetDesc(descriptorSetLayout)); + + PipelineLayout::Desc pipelineLayoutDesc; + pipelineLayoutDesc.renderTargetCount = 
layout.numRenderTargets; + pipelineLayoutDesc.descriptorSetCount = pipelineDescriptorSets.Count(); + pipelineLayoutDesc.descriptorSets = pipelineDescriptorSets.Buffer(); + + auto pipelineLayout = renderer->createPipelineLayout(pipelineLayoutDesc); + if(!pipelineLayout) return SLANG_FAIL; + + auto descriptorSet = renderer->createDescriptorSet(descriptorSetLayout); + if(!descriptorSet) return SLANG_FAIL; + + List outputBindings; + + if(addedConstantBuffer) + { + descriptorSet->setConstantBuffer(0, 0, addedConstantBuffer); + } + for (int i = 0; i < numEntries; i++) + { + const ShaderInputLayoutEntry& srcEntry = srcEntries[i]; + + auto rangeIndex = i + (addedConstantBuffer ? 1 : 0); + + switch (srcEntry.type) + { + case ShaderInputType::Buffer: + { + const InputBufferDesc& srcBuffer = srcEntry.bufferDesc; + const size_t bufferSize = srcEntry.bufferData.Count() * sizeof(uint32_t); + + RefPtr bufferResource; + SLANG_RETURN_ON_FAIL(createBufferResource(srcEntry.bufferDesc, srcEntry.isOutput, bufferSize, srcEntry.bufferData.Buffer(), renderer, bufferResource)); + + switch(srcBuffer.type) + { + case InputBufferType::ConstantBuffer: + descriptorSet->setConstantBuffer(rangeIndex, 0, bufferResource); + break; + + case InputBufferType::StorageBuffer: + { + ResourceView::Desc viewDesc; + viewDesc.type = ResourceView::Type::UnorderedAccess; + viewDesc.format = srcBuffer.format; + auto bufferView = renderer->createBufferView( + bufferResource, + viewDesc); + descriptorSet->setResource(rangeIndex, 0, bufferView); + } + break; + } + + if(srcEntry.isOutput) + { + BindingStateImpl::OutputBinding binding; + binding.entryIndex = i; + binding.resource = bufferResource; + outputBindings.Add(binding); + } + } + break; + case ShaderInputType::CombinedTextureSampler: - { - RefPtr texture; - SLANG_RETURN_ON_FAIL(generateTextureResource(srcEntry.textureDesc, textureBindFlags, renderer, texture)); - descOut.addCombinedTextureSampler(texture, _calcSamplerDesc(srcEntry.samplerDesc), registerSet); 
+ { + RefPtr texture; + SLANG_RETURN_ON_FAIL(generateTextureResource(srcEntry.textureDesc, textureBindFlags, renderer, texture)); + + auto sampler = _createSamplerState(renderer, srcEntry.samplerDesc); + + ResourceView::Desc viewDesc; + viewDesc.type = ResourceView::Type::ShaderResource; + auto textureView = renderer->createTextureView( + texture, + viewDesc); + + descriptorSet->setCombinedTextureSampler(rangeIndex, 0, textureView, sampler); + + if(srcEntry.isOutput) + { + BindingStateImpl::OutputBinding binding; + binding.entryIndex = i; + binding.resource = texture; + outputBindings.Add(binding); + } + } break; - } - case ShaderInputType::Texture: - { - RefPtr texture; - SLANG_RETURN_ON_FAIL(generateTextureResource(srcEntry.textureDesc, textureBindFlags, renderer, texture)); - descOut.addTextureResource(texture, registerSet); + case ShaderInputType::Texture: + { + RefPtr texture; + SLANG_RETURN_ON_FAIL(generateTextureResource(srcEntry.textureDesc, textureBindFlags, renderer, texture)); + + // TODO: support UAV textures... 
+ + ResourceView::Desc viewDesc; + viewDesc.type = ResourceView::Type::ShaderResource; + auto textureView = renderer->createTextureView( + texture, + viewDesc); + + descriptorSet->setResource(rangeIndex, 0, textureView); + + if(srcEntry.isOutput) + { + BindingStateImpl::OutputBinding binding; + binding.entryIndex = i; + binding.resource = texture; + outputBindings.Add(binding); + } + } break; - } + case ShaderInputType::Sampler: - { - descOut.addSampler(_calcSamplerDesc(srcEntry.samplerDesc), registerSet); + { + auto sampler = _createSamplerState(renderer, srcEntry.samplerDesc); + descriptorSet->setSampler(rangeIndex, 0, sampler); + } break; - } + default: - { assert(!"Unhandled type"); return SLANG_FAIL; - } } } - return SLANG_OK; -} + BindingStateImpl* bindingState = new BindingStateImpl(); + bindingState->descriptorSet = descriptorSet; + bindingState->pipelineLayout = pipelineLayout; + bindingState->outputBindings = outputBindings; + bindingState->m_numRenderTargets = layout.numRenderTargets; -/* static */Result ShaderRendererUtil::createBindingStateDesc(const ShaderInputLayout& layout, Renderer* renderer, BindingState::Desc& descOut) -{ - SLANG_RETURN_ON_FAIL(createBindingStateDesc(layout.entries.Buffer(), int(layout.entries.Count()), renderer, descOut)); - descOut.m_numRenderTargets = layout.numRenderTargets; + *outBindingState = bindingState; return SLANG_OK; } diff --git a/tools/render-test/shader-renderer-util.h b/tools/render-test/shader-renderer-util.h index 849e68754..bbdea2af6 100644 --- a/tools/render-test/shader-renderer-util.h +++ b/tools/render-test/shader-renderer-util.h @@ -6,26 +6,68 @@ namespace renderer_test { -/// Utility class containing functions that construct items on the renderer using the ShaderInputLayout representation -struct ShaderRendererUtil +using namespace Slang; + +struct BindingStateImpl : public Slang::RefObject +{ + /// A register set consists of one or more contiguous indices. 
+ /// To be valid index >= 0 and size >= 1 + struct RegisterRange + { + /// True if contains valid contents + bool isValid() const { return size > 0; } + /// True if valid single value + bool isSingle() const { return size == 1; } + /// Get as a single index (must be at least one index) + int getSingleIndex() const { return (size == 1) ? index : -1; } + /// Return the first index + int getFirstIndex() const { return (size > 0) ? index : -1; } + /// True if contains register index + bool hasRegister(int registerIndex) const { return registerIndex >= index && registerIndex < index + size; } + + static RegisterRange makeInvalid() { return RegisterRange{ -1, 0 }; } + static RegisterRange makeSingle(int index) { return RegisterRange{ int16_t(index), 1 }; } + static RegisterRange makeRange(int index, int size) { return RegisterRange{ int16_t(index), uint16_t(size) }; } + + int16_t index; ///< The base index + uint16_t size; ///< The amount of register indices + }; + + void apply(Renderer* renderer, PipelineType pipelineType); + + struct OutputBinding + { + RefPtr resource; + Slang::UInt entryIndex; + }; + List outputBindings; + + RefPtr pipelineLayout; + RefPtr descriptorSet; + int m_numRenderTargets = 1; +}; + +/// Utility class containing functions that construct items on the renderer using the ShaderInputLayout representation +struct ShaderRendererUtil { /// Generate a texture using the InputTextureDesc and construct a TextureResource using the Renderer with the contents static Slang::Result generateTextureResource(const InputTextureDesc& inputDesc, int bindFlags, Renderer* renderer, Slang::RefPtr& textureOut); /// Create texture resource using inputDesc, and texData to describe format, and contents static Slang::Result createTextureResource(const InputTextureDesc& inputDesc, const TextureData& texData, int bindFlags, Renderer* renderer, Slang::RefPtr& textureOut); - + /// Create the BufferResource using the renderer from the contents of inputDesc static Slang::Result 
createBufferResource(const InputBufferDesc& inputDesc, bool isOutput, size_t bufferSize, const void* initData, Renderer* renderer, Slang::RefPtr<BufferResource>& bufferOut); /// Create BindingState::Desc from the contents of layout - static Slang::Result createBindingStateDesc(const ShaderInputLayout& layout, Renderer* renderer, BindingState::Desc& descOut); - /// Create BindingState::Desc from a list of ShaderInputLayout entries - static Slang::Result createBindingStateDesc(ShaderInputLayoutEntry* srcEntries, int numEntries, Renderer* renderer, BindingState::Desc& descOut); + static Slang::Result createBindingState(const ShaderInputLayout& layout, Renderer* renderer, BufferResource* addedConstantBuffer, BindingStateImpl** outBindingState); /// Get the binding register associated with this binding (or -1 if none defined) - static BindingState::RegisterRange calcRegisterRange(Renderer* renderer, const ShaderInputLayoutEntry& entry); + static BindingStateImpl::RegisterRange calcRegisterRange(Renderer* renderer, const ShaderInputLayoutEntry& entry); +private: + /// Create BindingState::Desc from a list of ShaderInputLayout entries + static Slang::Result _createBindingState(ShaderInputLayoutEntry* srcEntries, int numEntries, Renderer* renderer, BufferResource* addedConstantBuffer, BindingStateImpl** outBindingState); }; } // renderer_test diff --git a/tools/render-test/slang-support.cpp b/tools/render-test/slang-support.cpp index a6c252843..26e856295 100644 --- a/tools/render-test/slang-support.cpp +++ b/tools/render-test/slang-support.cpp @@ -11,7 +11,7 @@ namespace renderer_test { -ShaderProgram* ShaderCompiler::compileProgram( +RefPtr<ShaderProgram> ShaderCompiler::compileProgram( ShaderCompileRequest const& request) { SlangSession* slangSession = spCreateSession(NULL); @@ -92,7 +92,7 @@ ShaderProgram* ShaderCompiler::compileProgram( } - ShaderProgram * shaderProgram = nullptr; + RefPtr<ShaderProgram> shaderProgram; Slang::List<const char*> rawTypeNames; for (auto typeName : request.entryPointTypeArguments)
rawTypeNames.Add(typeName.Buffer()); diff --git a/tools/render-test/slang-support.h b/tools/render-test/slang-support.h index 8697abcb8..03de062d1 100644 --- a/tools/render-test/slang-support.h +++ b/tools/render-test/slang-support.h @@ -11,13 +11,13 @@ namespace renderer_test { struct ShaderCompiler { - Renderer* renderer; + RefPtr<Renderer> renderer; SlangCompileTarget target; SlangSourceLanguage sourceLanguage; SlangPassThrough passThrough; char const* profile; - ShaderProgram* compileProgram( + RefPtr<ShaderProgram> compileProgram( ShaderCompileRequest const& request); }; diff --git a/tools/slang-graphics/circular-resource-heap-d3d12.cpp b/tools/slang-graphics/circular-resource-heap-d3d12.cpp deleted file mode 100644 index 8b63819a5..000000000 --- a/tools/slang-graphics/circular-resource-heap-d3d12.cpp +++ /dev/null @@ -1,222 +0,0 @@ -#include "circular-resource-heap-d3d12.h" - -namespace slang_graphics { -using namespace Slang; - -D3D12CircularResourceHeap::D3D12CircularResourceHeap(): - m_fence(nullptr), - m_device(nullptr), - m_blockFreeList(sizeof(Block), SLANG_ALIGN_OF(Block), 16), - m_blocks(nullptr) -{ - m_back.m_block = nullptr; - m_back.m_position = nullptr; - m_front.m_block = nullptr; - m_front.m_position = nullptr; -} - -D3D12CircularResourceHeap::~D3D12CircularResourceHeap() -{ - _freeBlockListResources(m_blocks); -} - -void D3D12CircularResourceHeap::_freeBlockListResources(const Block* start) -{ - if (start) - { - const Block* block = start; - do - { - ID3D12Resource* resource = block->m_resource; - - resource->Unmap(0, nullptr); - resource->Release(); - - // Next in list - block = block->m_next; - - } while (block != start); - } -} - -Result D3D12CircularResourceHeap::init(ID3D12Device* device, const Desc& desc, D3D12CounterFence* fence) -{ - assert(m_blocks == nullptr); - assert(desc.m_blockSize > 0); - - m_fence = fence; - m_desc = desc; - m_device = device; - - return SLANG_OK; -} - -void D3D12CircularResourceHeap::addSync(uint64_t signalValue) -{ -
assert(signalValue == m_fence->getCurrentValue()); - PendingEntry entry; - entry.m_completedValue = signalValue; - entry.m_cursor = m_front; - m_pendingQueue.Add(entry); -} - -void D3D12CircularResourceHeap::updateCompleted() -{ - const uint64_t completedValue = m_fence->getCompletedValue(); - -#if 0 - while (m_pendingQueue.Count() != 0) - { - const PendingEntry& entry = m_pendingQueue[0]; - if (entry.m_completedValue <= completedValue) - { - m_back = entry.m_cursor; - m_pendingQueue.RemoveAt(0); - } - else - { - break; - } - } -#else - // A more efficient implementation is m_pendingQueue is implemented as a vector like type - const int size = int(m_pendingQueue.Count()); - int end = 0; - while (end < size && m_pendingQueue[end].m_completedValue <= completedValue) - { - end++; - } - - if (end > 0) - { - // Set the back position - m_back = m_pendingQueue[end - 1].m_cursor; - if (end == size) - { - m_pendingQueue.Clear(); - } - else - { - m_pendingQueue.RemoveRange(0, size); - } - } -#endif -} - -D3D12CircularResourceHeap::Block* D3D12CircularResourceHeap::_newBlock() -{ - D3D12_RESOURCE_DESC desc; - - desc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER; - desc.Alignment = 0; - desc.Width = m_desc.m_blockSize; - desc.Height = 1; - desc.DepthOrArraySize = 1; - desc.MipLevels = 1; - desc.Format = DXGI_FORMAT_UNKNOWN; - desc.SampleDesc.Count = 1; - desc.SampleDesc.Quality = 0; - desc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR; - desc.Flags = D3D12_RESOURCE_FLAG_NONE; - - ComPtr resource; - Result res = m_device->CreateCommittedResource(&m_desc.m_heapProperties, m_desc.m_heapFlags, &desc, m_desc.m_initialState, nullptr, IID_PPV_ARGS(resource.writeRef())); - if (SLANG_FAILED(res)) - { - assert(!"Resource allocation failed"); - return nullptr; - } - - uint8_t* data = nullptr; - if (m_desc.m_heapProperties.Type == D3D12_HEAP_TYPE_READBACK) - { - } - else - { - // Map it, and keep it mapped - resource->Map(0, nullptr, (void**)&data); - } - - // We have no blocks -> so lets 
allocate the first - Block* block = (Block*)m_blockFreeList.allocate(); - block->m_next = nullptr; - - block->m_resource = resource.detach(); - block->m_start = data; - return block; -} - -D3D12CircularResourceHeap::Cursor D3D12CircularResourceHeap::allocate(size_t size, size_t alignment) -{ - const size_t blockSize = getBlockSize(); - - assert(size <= blockSize); - - // If nothing is allocated add the first block - if (m_blocks == nullptr) - { - Block* block = _newBlock(); - if (!block) - { - Cursor cursor = {}; - return cursor; - } - m_blocks = block; - // Make circular - block->m_next = block; - - // Point front and back to same position, as currently it is all free - m_back = { block, block->m_start }; - m_front = m_back; - } - - // If front and back are in the same block then front MUST be ahead of back (as that defined as - // an invariant and is required for block insertion to be possible - Block* block = m_front.m_block; - - // Check the invariant - assert(block != m_back.m_block || m_front.m_position >= m_back.m_position); - - { - uint8_t* cur = (uint8_t*)((size_t(m_front.m_position) + alignment - 1) & ~(alignment - 1)); - // Does the the allocation fit? - if (cur + size <= block->m_start + blockSize) - { - // It fits - // Move the front forward - m_front.m_position = cur + size; - Cursor cursor = { block, cur }; - return cursor; - } - } - - // Okay I can't fit into current block... - - // If the next block contains front, we need to add a block, else we can use that block - if (block->m_next == m_back.m_block) - { - Block* newBlock = _newBlock(); - // Insert into the list - newBlock->m_next = block->m_next; - block->m_next = newBlock; - } - - // Use the block we are going to add to - block = block->m_next; - uint8_t* cur = (uint8_t*)((size_t(block->m_start) + alignment - 1) & ~(alignment - 1)); - // Does the the allocation fit? - if (cur + size > block->m_start + blockSize) - { - assert(!"Couldn't fit into a free block(!) 
Alignment breaks it?"); - Cursor cursor = {}; - return cursor; - } - // It fits - // Move the front forward - m_front.m_block = block; - m_front.m_position = cur + size; - Cursor cursor = { block, cur }; - return cursor; -} - -} // namespace slang_graphics diff --git a/tools/slang-graphics/circular-resource-heap-d3d12.h b/tools/slang-graphics/circular-resource-heap-d3d12.h deleted file mode 100644 index a200d3bbc..000000000 --- a/tools/slang-graphics/circular-resource-heap-d3d12.h +++ /dev/null @@ -1,206 +0,0 @@ -#pragma once - -#include "../../slang-com-ptr.h" -#include "../../source/core/list.h" -#include "../../source/core/slang-free-list.h" - -#include "resource-d3d12.h" - -namespace slang_graphics { - -/*! \brief The D3D12CircularResourceHeap is a heap that is suited for size constrained real-time resources allocation that -is transitory in nature. It is designed to allocate resources which are used and discarded, often used where in -previous versions of DirectX the 'DISCARD' flag was used. - -The idea is to have a heap which chunks of resource can be allocated, and used for GPU execution, -and that the heap is able through the addSync/updateCompleted idiom is able to track when the usage of the resources is -completed allowing them to be reused. The heap is arranged as circularly, with new allocations made from the front, and the back -being updated as the GPU updating the back when it is informed anything using prior parts of the heap have completed. In this -arrangement all the heap between the back and the front can be thought of as in use or potentially in use by the GPU. All the heap -from the front back around to the back, is free and can be allocated from. It is the responsibility of the user of the Heap to make -sure the invariant holds, but in most normal usage it does so simply. - -Another feature of the heap is that it does not require upfront knowledge of how big a heap is needed. 
The backing resources will be expanded -dynamically with requests as needed. The only requirement is that know single request can be larger than m_blockSize specified in the Desc -used to initialize the heap. This is because all the backing resources are allocated to a single size. This limitation means the D3D12CircularResourceHeap -may not be the best use for example for uploading a texture - because it's design is really around transitory uploads or write backs, and so more suited -to constant buffers, vertex buffer, index buffers and the like. - -To upload a texture at program startup it is most likely better to use a D3D12ResourceScopeManager. - -\code{.cpp} - -typedef D3D12CircularResourceHeap Heap; - -Heap::Cursor cursor = heap.allocateVertexBuffer(sizeof(Vertex) * numVerts); -Memory:copy(cursor.m_position, verts, sizeof(Vertex) * numVerts); - -// Do a command using the GPU handle -m_commandList->... -// Do another command using the GPU handle - -m_commandList->... - -// Execute the command list on the command queue -{ - ID3D12CommandList* lists[] = { m_commandList }; - m_commandQueue->ExecuteCommandLists(SLANG_COUNT_OF(lists), lists); -} - -// Add a sync point -const uint64_t signalValue = m_fence.nextSignal(m_commandQueue); -heap.addSync(signalValue) - -// The cursors cannot be used anymore - -// At some later point call updateCompleted. This will see where the GPU is at, and make resources available that the GPU no longer accesses. -heap.updateCompleted(); - -\endcode - -### Implementation - -Front and back can be in the same block, but ONLY if back is behind front, because we have to always be able to insert -new blocks in front of front. So it must be possible to do an block insertion between the two of them. - -|--B---F-----| |----------| - -When B and F are on top of one another it means there is nothing in the list. NOTE this also means that a move of front can never place it -top of the back. 
- -https://msdn.microsoft.com/en-us/library/windows/desktop/dn899125%28v=vs.85%29.aspx -https://msdn.microsoft.com/en-us/library/windows/desktop/mt426646%28v=vs.85%29.aspx -*/ - -class D3D12CircularResourceHeap -{ - protected: - struct Block; - public: - typedef D3D12CircularResourceHeap ThisType; - - /// The alignment used for VERTEX_BUFFER allocations - /// Strictly speaking it seems the hardware can handle 4 byte alignment, but since often in use - /// data will be copied from CPU memory to the allocation, using 16 byte alignment is superior as allows - /// significantly faster memcpy. - /// The sample that shows sizeof(float) - 4 bytes is appropriate is at the link below. - /// https://msdn.microsoft.com/en-us/library/windows/desktop/mt426646%28v=vs.85%29.aspx - enum - { - VERTEX_BUFFER_ALIGNMENT = 16, - }; - - struct Desc - { - void init() - { - { - D3D12_HEAP_PROPERTIES& props = m_heapProperties; - - props.Type = D3D12_HEAP_TYPE_UPLOAD; - props.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; - props.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; - props.CreationNodeMask = 1; - props.VisibleNodeMask = 1; - } - m_heapFlags = D3D12_HEAP_FLAG_NONE; - m_initialState = D3D12_RESOURCE_STATE_GENERIC_READ; - m_blockSize = 0; - } - - D3D12_HEAP_PROPERTIES m_heapProperties; - D3D12_HEAP_FLAGS m_heapFlags; - D3D12_RESOURCE_STATES m_initialState; - size_t m_blockSize; - }; - - /// Cursor position - struct Cursor - { - /// Get GpuHandle - SLANG_FORCE_INLINE D3D12_GPU_VIRTUAL_ADDRESS getGpuHandle() const { return m_block->m_resource->GetGPUVirtualAddress() + size_t(m_position - m_block->m_start); } - /// Must have a block and position - SLANG_FORCE_INLINE bool isValid() const { return m_block != nullptr; } - /// Calculate the offset into the underlying resource - SLANG_FORCE_INLINE size_t getOffset() const { return size_t(m_position - m_block->m_start); } - /// Get the underlying resource - SLANG_FORCE_INLINE ID3D12Resource* getResource() const { return 
m_block->m_resource; } - - Block* m_block; ///< The block index - uint8_t* m_position; ///< The current position - }; - - /// Get the desc used to initialize the heap - SLANG_FORCE_INLINE const Desc& getDesc() const { return m_desc; } - - /// Must be called before used - /// Block size must be at least as large as the _largest_ thing allocated - /// Also note depending on alignment of a resource allocation, the block size might also need to take into account the - /// maximum alignment use. It is a REQUIREMENT that a newly allocated resource block is large enough to hold any - /// allocation taking into account the alignment used. - Slang::Result init(ID3D12Device* device, const Desc& desc, D3D12CounterFence* fence); - - /// Get the block size - SLANG_FORCE_INLINE size_t getBlockSize() const { return m_desc.m_blockSize; } - - /// Allocate constant buffer of specified size - Cursor allocate(size_t size, size_t alignment); - - /// Allocate a constant buffer - SLANG_FORCE_INLINE Cursor allocateConstantBuffer(size_t size) { return allocate(size, D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT); } - /// Allocate a vertex buffer - SLANG_FORCE_INLINE Cursor allocateVertexBuffer(size_t size) { return allocate(size, VERTEX_BUFFER_ALIGNMENT); } - - /// Create filled in constant buffer - SLANG_FORCE_INLINE Cursor newConstantBuffer(const void* data, size_t size) { Cursor cursor = allocateConstantBuffer(size); ::memcpy(cursor.m_position, data, size); return cursor; } - /// Create in filled in constant buffer - template - SLANG_FORCE_INLINE Cursor newConstantBuffer(const T& in) { return newConstantBuffer(&in, sizeof(T)); } - - /// Look where the GPU has got to and release anything not currently used - void updateCompleted(); - /// Add a sync point - meaning that when this point is hit in the queue - /// all of the resources up to this point will no longer be used. 
- void addSync(uint64_t signalValue); - - /// Get the gpu address of this cursor - D3D12_GPU_VIRTUAL_ADDRESS getGpuHandle(const Cursor& cursor) const { return cursor.m_block->m_resource->GetGPUVirtualAddress() + size_t(cursor.m_position - cursor.m_block->m_start); } - - /// Ctor - D3D12CircularResourceHeap(); - /// Dtor - ~D3D12CircularResourceHeap(); - - protected: - - struct Block - { - ID3D12Resource* m_resource; ///< The mapped resource - uint8_t* m_start; ///< Once created the resource is mapped to here - Block* m_next; ///< Points to next block in the list - }; - struct PendingEntry - { - uint64_t m_completedValue; ///< The value when this is completed - Cursor m_cursor; ///< the cursor at that point - }; - void _freeBlockListResources(const Block* block); - /// Create a new block (with associated resource), do not add the block list - Block* _newBlock(); - - Block* m_blocks; ///< Circular singly linked list of block. nullptr initially - Slang::FreeList m_blockFreeList; ///< Free list of actual allocations of blocks - Slang::List m_pendingQueue; ///< Holds the list of pending positions. When the fence value is greater than the value on the queue entry, the entry is done. - - // Allocation is made from the front, and freed from the back. - Cursor m_back; ///< Current back position. - Cursor m_front; ///< Current front position. - - Desc m_desc; ///< Describes the heap - - D3D12CounterFence* m_fence; ///< The fence to use - ID3D12Device* m_device; ///< The device that resources will be constructed on -}; - -} // namespace slang_graphics - diff --git a/tools/slang-graphics/d3d-util.cpp b/tools/slang-graphics/d3d-util.cpp deleted file mode 100644 index b2c3f87ee..000000000 --- a/tools/slang-graphics/d3d-util.cpp +++ /dev/null @@ -1,306 +0,0 @@ -// d3d-util.cpp -#include "d3d-util.h" - -#include - -// We will use the C standard library just for printing error messages. 
-#include - -namespace slang_graphics { -using namespace Slang; - -/* static */D3D_PRIMITIVE_TOPOLOGY D3DUtil::getPrimitiveTopology(PrimitiveTopology topology) -{ - switch (topology) - { - case PrimitiveTopology::TriangleList: - { - return D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST; - } - default: break; - } - return D3D11_PRIMITIVE_TOPOLOGY_UNDEFINED; -} - -/* static */DXGI_FORMAT D3DUtil::getMapFormat(Format format) -{ - switch (format) - { - case Format::RGBA_Float32: return DXGI_FORMAT_R32G32B32A32_FLOAT; - case Format::RGB_Float32: return DXGI_FORMAT_R32G32B32_FLOAT; - case Format::RG_Float32: return DXGI_FORMAT_R32G32_FLOAT; - case Format::R_Float32: return DXGI_FORMAT_R32_FLOAT; - case Format::RGBA_Unorm_UInt8: return DXGI_FORMAT_R8G8B8A8_UNORM; - case Format::R_UInt32: return DXGI_FORMAT_R32_UINT; - - case Format::D_Float32: return DXGI_FORMAT_D32_FLOAT; - case Format::D_Unorm24_S8: return DXGI_FORMAT_D24_UNORM_S8_UINT; - - default: return DXGI_FORMAT_UNKNOWN; - } -} - -/* static */DXGI_FORMAT D3DUtil::calcResourceFormat(UsageType usage, Int usageFlags, DXGI_FORMAT format) -{ - SLANG_UNUSED(usage); - if (usageFlags) - { - switch (format) - { - case DXGI_FORMAT_R32_FLOAT: /* fallthru */ - case DXGI_FORMAT_R32_UINT: - case DXGI_FORMAT_D32_FLOAT: - { - return DXGI_FORMAT_R32_TYPELESS; - } - case DXGI_FORMAT_D24_UNORM_S8_UINT: return DXGI_FORMAT_R24G8_TYPELESS; - default: break; - } - return format; - } - return format; -} - -/* static */DXGI_FORMAT D3DUtil::calcFormat(UsageType usage, DXGI_FORMAT format) -{ - switch (usage) - { - case USAGE_COUNT_OF: - case USAGE_UNKNOWN: - { - return DXGI_FORMAT_UNKNOWN; - } - case USAGE_DEPTH_STENCIL: - { - switch (format) - { - case DXGI_FORMAT_D32_FLOAT: /* fallthru */ - case DXGI_FORMAT_R32_TYPELESS: - { - return DXGI_FORMAT_D32_FLOAT; - } - case DXGI_FORMAT_R24_UNORM_X8_TYPELESS: return DXGI_FORMAT_D24_UNORM_S8_UINT; - case DXGI_FORMAT_R24G8_TYPELESS: return DXGI_FORMAT_D24_UNORM_S8_UINT; - default: break; - } - return 
format; - } - case USAGE_TARGET: - { - switch (format) - { - case DXGI_FORMAT_D32_FLOAT: /* fallthru */ - case DXGI_FORMAT_D24_UNORM_S8_UINT: - { - return DXGI_FORMAT_UNKNOWN; - } - case DXGI_FORMAT_R32_TYPELESS: return DXGI_FORMAT_R32_FLOAT; - default: break; - } - return format; - } - case USAGE_SRV: - { - switch (format) - { - case DXGI_FORMAT_D32_FLOAT: /* fallthru */ - case DXGI_FORMAT_R32_TYPELESS: - { - return DXGI_FORMAT_R32_FLOAT; - } - case DXGI_FORMAT_R24_UNORM_X8_TYPELESS: return DXGI_FORMAT_R24_UNORM_X8_TYPELESS; - default: break; - } - - return format; - } - } - - assert(!"Not reachable"); - return DXGI_FORMAT_UNKNOWN; -} - -bool D3DUtil::isTypeless(DXGI_FORMAT format) -{ - switch (format) - { - case DXGI_FORMAT_R32G32B32A32_TYPELESS: - case DXGI_FORMAT_R32G32B32_TYPELESS: - case DXGI_FORMAT_R16G16B16A16_TYPELESS: - case DXGI_FORMAT_R32G32_TYPELESS: - case DXGI_FORMAT_R32G8X24_TYPELESS: - case DXGI_FORMAT_R32_FLOAT_X8X24_TYPELESS: - case DXGI_FORMAT_R10G10B10A2_TYPELESS: - case DXGI_FORMAT_R8G8B8A8_TYPELESS: - case DXGI_FORMAT_R16G16_TYPELESS: - case DXGI_FORMAT_R32_TYPELESS: - case DXGI_FORMAT_R24_UNORM_X8_TYPELESS: - case DXGI_FORMAT_R24G8_TYPELESS: - case DXGI_FORMAT_R8G8_TYPELESS: - case DXGI_FORMAT_R16_TYPELESS: - case DXGI_FORMAT_R8_TYPELESS: - case DXGI_FORMAT_BC1_TYPELESS: - case DXGI_FORMAT_BC2_TYPELESS: - case DXGI_FORMAT_BC3_TYPELESS: - case DXGI_FORMAT_BC4_TYPELESS: - case DXGI_FORMAT_BC5_TYPELESS: - case DXGI_FORMAT_B8G8R8A8_TYPELESS: - case DXGI_FORMAT_BC6H_TYPELESS: - case DXGI_FORMAT_BC7_TYPELESS: - { - return true; - } - default: break; - } - return false; -} - -/* static */Int D3DUtil::getNumColorChannelBits(DXGI_FORMAT fmt) -{ - switch (fmt) - { - case DXGI_FORMAT_R32G32B32A32_TYPELESS: - case DXGI_FORMAT_R32G32B32A32_FLOAT: - case DXGI_FORMAT_R32G32B32A32_UINT: - case DXGI_FORMAT_R32G32B32A32_SINT: - case DXGI_FORMAT_R32G32B32_TYPELESS: - case DXGI_FORMAT_R32G32B32_FLOAT: - case DXGI_FORMAT_R32G32B32_UINT: - case 
DXGI_FORMAT_R32G32B32_SINT: - { - return 32; - } - case DXGI_FORMAT_R16G16B16A16_TYPELESS: - case DXGI_FORMAT_R16G16B16A16_FLOAT: - case DXGI_FORMAT_R16G16B16A16_UNORM: - case DXGI_FORMAT_R16G16B16A16_UINT: - case DXGI_FORMAT_R16G16B16A16_SNORM: - case DXGI_FORMAT_R16G16B16A16_SINT: - { - return 16; - } - case DXGI_FORMAT_R10G10B10A2_TYPELESS: - case DXGI_FORMAT_R10G10B10A2_UNORM: - case DXGI_FORMAT_R10G10B10A2_UINT: - case DXGI_FORMAT_R10G10B10_XR_BIAS_A2_UNORM: - { - return 10; - } - case DXGI_FORMAT_R8G8B8A8_TYPELESS: - case DXGI_FORMAT_R8G8B8A8_UNORM: - case DXGI_FORMAT_R8G8B8A8_UNORM_SRGB: - case DXGI_FORMAT_R8G8B8A8_UINT: - case DXGI_FORMAT_R8G8B8A8_SNORM: - case DXGI_FORMAT_R8G8B8A8_SINT: - case DXGI_FORMAT_B8G8R8A8_UNORM: - case DXGI_FORMAT_B8G8R8X8_UNORM: - case DXGI_FORMAT_B8G8R8A8_TYPELESS: - case DXGI_FORMAT_B8G8R8A8_UNORM_SRGB: - case DXGI_FORMAT_B8G8R8X8_TYPELESS: - case DXGI_FORMAT_B8G8R8X8_UNORM_SRGB: - { - return 8; - } - case DXGI_FORMAT_B5G6R5_UNORM: - case DXGI_FORMAT_B5G5R5A1_UNORM: - { - return 5; - } - case DXGI_FORMAT_B4G4R4A4_UNORM: - return 4; - - default: - return 0; - } -} - -// Note: this subroutine is now only used by D3D11 for generating bytecode to go into input layouts. -// -// TODO: we can probably remove that code completely by switching to a PSO-like model across all APIs. -// -/* static */Result D3DUtil::compileHLSLShader(char const* sourcePath, char const* source, char const* entryPointName, char const* dxProfileName, ComPtr& shaderBlobOut) -{ - // Rather than statically link against the `d3dcompile` library, we - // dynamically load it. 
- // - // Note: A more realistic application would compile from HLSL text to D3D - // shader bytecode as part of an offline process, rather than doing it - // on-the-fly like this - // - static pD3DCompile compileFunc = nullptr; - if (!compileFunc) - { - // TODO(tfoley): maybe want to search for one of a few versions of the DLL - HMODULE compilerModule = LoadLibraryA("d3dcompiler_47.dll"); - if (!compilerModule) - { - fprintf(stderr, "error: failed load 'd3dcompiler_47.dll'\n"); - return SLANG_FAIL; - } - - compileFunc = (pD3DCompile)GetProcAddress(compilerModule, "D3DCompile"); - if (!compileFunc) - { - fprintf(stderr, "error: failed load symbol 'D3DCompile'\n"); - return SLANG_FAIL; - } - } - - // For this example, we turn on debug output, and turn off all - // optimization. A real application would only use these flags - // when shader debugging is needed. - UINT flags = 0; - flags |= D3DCOMPILE_DEBUG; - flags |= D3DCOMPILE_OPTIMIZATION_LEVEL0 | D3DCOMPILE_SKIP_OPTIMIZATION; - - // We will always define `__HLSL__` when compiling here, so that - // input code can react differently to being compiled as pure HLSL. - D3D_SHADER_MACRO defines[] = { - { "__HLSL__", "1" }, - { nullptr, nullptr }, - }; - - // The `D3DCompile` entry point takes a bunch of parameters, but we - // don't really need most of them for Slang-generated code. - ComPtr shaderBlob; - ComPtr errorBlob; - - HRESULT hr = compileFunc(source, strlen(source), sourcePath, &defines[0], nullptr, entryPointName, dxProfileName, flags, 0, - shaderBlob.writeRef(), errorBlob.writeRef()); - - // If the HLSL-to-bytecode compilation produced any diagnostic messages - // then we will print them out (whether or not the compilation failed). 
- if (errorBlob) - { - ::fputs((const char*)errorBlob->GetBufferPointer(), stderr); - ::fflush(stderr); - ::OutputDebugStringA((const char*)errorBlob->GetBufferPointer()); - } - - SLANG_RETURN_ON_FAIL(hr); - shaderBlobOut.swap(shaderBlob); - return SLANG_OK; -} - -/* static */void D3DUtil::appendWideChars(const char* in, List& out) -{ - size_t len = ::strlen(in); - - const DWORD dwFlags = 0; - int outSize = ::MultiByteToWideChar(CP_UTF8, dwFlags, in, int(len), nullptr, 0); - - if (outSize > 0) - { - const UInt prevSize = out.Count(); - out.SetSize(prevSize + len + 1); - - WCHAR* dst = out.Buffer() + prevSize; - ::MultiByteToWideChar(CP_UTF8, dwFlags, in, int(len), dst, outSize); - // Make null terminated - dst[outSize] = 0; - // Remove terminating 0 from array - out.UnsafeShrinkToSize(prevSize + outSize); - } -} - -} // renderer_test diff --git a/tools/slang-graphics/d3d-util.h b/tools/slang-graphics/d3d-util.h deleted file mode 100644 index b5f154e6e..000000000 --- a/tools/slang-graphics/d3d-util.h +++ /dev/null @@ -1,61 +0,0 @@ -// d3d-util.h -#pragma once - -#include - -#include "../../slang-com-helper.h" - -#include "../../slang-com-ptr.h" -#include "../../source/core/list.h" - -#include "render.h" - -#include -#include - -namespace slang_graphics { - -class D3DUtil -{ - public: - enum UsageType - { - USAGE_UNKNOWN, ///< Generally used to mark an error - USAGE_TARGET, ///< Format should be used when written as target - USAGE_DEPTH_STENCIL, ///< Format should be used when written as depth stencil - USAGE_SRV, ///< Format if being read as srv - USAGE_COUNT_OF, - }; - enum UsageFlag - { - USAGE_FLAG_MULTI_SAMPLE = 0x1, ///< If set will be used form multi sampling (such as MSAA) - USAGE_FLAG_SRV = 0x2, ///< If set means will be used as a shader resource view (SRV) - }; - - /// Get primitive topology as D3D primitive topology - static D3D_PRIMITIVE_TOPOLOGY getPrimitiveTopology(PrimitiveTopology prim); - - /// Calculate size taking into account alignment. 
Alignment must be a power of 2 - static UInt calcAligned(UInt size, UInt alignment) { return (size + alignment - 1) & ~(alignment - 1); } - - /// Compile HLSL code to DXBC - static Slang::Result compileHLSLShader(char const* sourcePath, char const* source, char const* entryPointName, char const* dxProfileName, Slang::ComPtr& shaderBlobOut); - - /// Given a slang pixel format returns the equivalent DXGI_ pixel format. If the format is not known, will return DXGI_FORMAT_UNKNOWN - static DXGI_FORMAT getMapFormat(Format format); - - /// Given the usage, flags, and format will return the most suitable format. Will return DXGI_UNKNOWN if combination is not possible - static DXGI_FORMAT calcFormat(UsageType usage, DXGI_FORMAT format); - /// Calculate appropriate format for creating a buffer for usage and flags - static DXGI_FORMAT calcResourceFormat(UsageType usage, Int usageFlags, DXGI_FORMAT format); - /// True if the type is 'typeless' - static bool isTypeless(DXGI_FORMAT format); - - /// Returns number of bits used for color channel for format (for channels with multiple sizes, returns smallest ie RGB565 -> 5) - static Int getNumColorChannelBits(DXGI_FORMAT fmt); - - /// Append text in in, into wide char array - static void appendWideChars(const char* in, Slang::List& out); -}; - -} // renderer_test diff --git a/tools/slang-graphics/descriptor-heap-d3d12.cpp b/tools/slang-graphics/descriptor-heap-d3d12.cpp deleted file mode 100644 index 23c56d46d..000000000 --- a/tools/slang-graphics/descriptor-heap-d3d12.cpp +++ /dev/null @@ -1,47 +0,0 @@ - -#include "descriptor-heap-d3d12.h" - -namespace slang_graphics { -using namespace Slang; - -D3D12DescriptorHeap::D3D12DescriptorHeap(): - m_totalSize(0), - m_currentIndex(0), - m_descriptorSize(0) -{ -} - -Result D3D12DescriptorHeap::init(ID3D12Device* device, int size, D3D12_DESCRIPTOR_HEAP_TYPE type, D3D12_DESCRIPTOR_HEAP_FLAGS flags) -{ - D3D12_DESCRIPTOR_HEAP_DESC srvHeapDesc = {}; - srvHeapDesc.NumDescriptors = size; - 
srvHeapDesc.Flags = flags; - srvHeapDesc.Type = type; - SLANG_RETURN_ON_FAIL(device->CreateDescriptorHeap(&srvHeapDesc, IID_PPV_ARGS(m_heap.writeRef()))); - - m_descriptorSize = device->GetDescriptorHandleIncrementSize(type); - m_totalSize = size; - - return SLANG_OK; -} - -Result D3D12DescriptorHeap::init(ID3D12Device* device, const D3D12_CPU_DESCRIPTOR_HANDLE* handles, int numHandles, D3D12_DESCRIPTOR_HEAP_TYPE type, D3D12_DESCRIPTOR_HEAP_FLAGS flags) -{ - SLANG_RETURN_ON_FAIL(init(device, numHandles, type, flags)); - D3D12_CPU_DESCRIPTOR_HANDLE dst = m_heap->GetCPUDescriptorHandleForHeapStart(); - - // Copy them all - for (int i = 0; i < numHandles; i++, dst.ptr += m_descriptorSize) - { - D3D12_CPU_DESCRIPTOR_HANDLE src = handles[i]; - if (src.ptr != 0) - { - device->CopyDescriptorsSimple(1, dst, src, type); - } - } - - return SLANG_OK; -} - -} // namespace slang_graphics - diff --git a/tools/slang-graphics/descriptor-heap-d3d12.h b/tools/slang-graphics/descriptor-heap-d3d12.h deleted file mode 100644 index 6ddb583dc..000000000 --- a/tools/slang-graphics/descriptor-heap-d3d12.h +++ /dev/null @@ -1,115 +0,0 @@ -#pragma once - - -#include -#include - -#include "../../slang-com-ptr.h" - -namespace slang_graphics { - -/*! \brief A simple class to manage an underlying Dx12 Descriptor Heap. Allocations are made linearly in order. It is not possible to free -individual allocations, but all allocations can be deallocated with 'deallocateAll'. 
*/ -class D3D12DescriptorHeap -{ - public: - typedef D3D12DescriptorHeap ThisType; - - /// Initialize - Slang::Result init(ID3D12Device* device, int size, D3D12_DESCRIPTOR_HEAP_TYPE type, D3D12_DESCRIPTOR_HEAP_FLAGS flags); - /// Initialize with an array of handles copying over the representation - Slang::Result init(ID3D12Device* device, const D3D12_CPU_DESCRIPTOR_HANDLE* handles, int numHandles, D3D12_DESCRIPTOR_HEAP_TYPE type, D3D12_DESCRIPTOR_HEAP_FLAGS flags); - - /// Returns the number of slots that have been used - SLANG_FORCE_INLINE int getUsedSize() const { return m_currentIndex; } - - /// Get the total amount of descriptors possible on the heap - SLANG_FORCE_INLINE int getTotalSize() const { return m_totalSize; } - /// Allocate a descriptor. Returns the index, or -1 if none left. - SLANG_FORCE_INLINE int allocate(); - /// Allocate a number of descriptors. Returns the start index (or -1 if not possible) - SLANG_FORCE_INLINE int allocate(int numDescriptors); - - /// - SLANG_FORCE_INLINE int placeAt(int index); - - /// Deallocates all allocations, and starts allocation from the start of the underlying heap again - SLANG_FORCE_INLINE void deallocateAll() { m_currentIndex = 0; } - - /// Get the size of each - SLANG_FORCE_INLINE int getDescriptorSize() const { return m_descriptorSize; } - - /// Get the GPU heap start - SLANG_FORCE_INLINE D3D12_GPU_DESCRIPTOR_HANDLE getGpuStart() const { return m_heap->GetGPUDescriptorHandleForHeapStart(); } - /// Get the CPU heap start - SLANG_FORCE_INLINE D3D12_CPU_DESCRIPTOR_HANDLE getCpuStart() const { return m_heap->GetCPUDescriptorHandleForHeapStart(); } - - /// Get the GPU handle at the specified index - SLANG_FORCE_INLINE D3D12_GPU_DESCRIPTOR_HANDLE getGpuHandle(int index) const; - /// Get the CPU handle at the specified index - SLANG_FORCE_INLINE D3D12_CPU_DESCRIPTOR_HANDLE getCpuHandle(int index) const; - - /// Get the underlying heap - SLANG_FORCE_INLINE ID3D12DescriptorHeap* getHeap() const { return m_heap; } - - /// 
Ctor - D3D12DescriptorHeap(); - -protected: - Slang::ComPtr m_heap; ///< The underlying heap being allocated from - int m_totalSize; ///< Total amount of allocations available on the heap - int m_currentIndex; ///< The current descriptor - int m_descriptorSize; ///< The size of each descriptor -}; - -// --------------------------------------------------------------------------- -int D3D12DescriptorHeap::allocate() -{ - assert(m_currentIndex < m_totalSize); - if (m_currentIndex < m_totalSize) - { - return m_currentIndex++; - } - return -1; -} -// --------------------------------------------------------------------------- -int D3D12DescriptorHeap::allocate(int numDescriptors) -{ - assert(m_currentIndex + numDescriptors <= m_totalSize); - if (m_currentIndex + numDescriptors <= m_totalSize) - { - const int index = m_currentIndex; - m_currentIndex += numDescriptors; - return index; - } - return -1; -} -// --------------------------------------------------------------------------- -SLANG_FORCE_INLINE int D3D12DescriptorHeap::placeAt(int index) -{ - assert(index >= 0 && index < m_totalSize); - m_currentIndex = index + 1; - return index; -} - -// --------------------------------------------------------------------------- -SLANG_FORCE_INLINE D3D12_CPU_DESCRIPTOR_HANDLE D3D12DescriptorHeap::getCpuHandle(int index) const -{ - assert(index >= 0 && index < m_totalSize); - D3D12_CPU_DESCRIPTOR_HANDLE start = m_heap->GetCPUDescriptorHandleForHeapStart(); - D3D12_CPU_DESCRIPTOR_HANDLE dst; - dst.ptr = start.ptr + m_descriptorSize * index; - return dst; -} -// --------------------------------------------------------------------------- -SLANG_FORCE_INLINE D3D12_GPU_DESCRIPTOR_HANDLE D3D12DescriptorHeap::getGpuHandle(int index) const -{ - assert(index >= 0 && index < m_totalSize); - D3D12_GPU_DESCRIPTOR_HANDLE start = m_heap->GetGPUDescriptorHandleForHeapStart(); - D3D12_GPU_DESCRIPTOR_HANDLE dst; - dst.ptr = start.ptr + m_descriptorSize * index; - return dst; -} - -} // namespace 
slang_graphics - diff --git a/tools/slang-graphics/render-d3d11.cpp b/tools/slang-graphics/render-d3d11.cpp deleted file mode 100644 index 4f9749e39..000000000 --- a/tools/slang-graphics/render-d3d11.cpp +++ /dev/null @@ -1,1101 +0,0 @@ -// render-d3d11.cpp - -#define _CRT_SECURE_NO_WARNINGS - -#include "render-d3d11.h" - -//WORKING: #include "options.h" -#include "render.h" -#include "d3d-util.h" - -#include "surface.h" - -// In order to use the Slang API, we need to include its header - -//#include - -#include "../../slang-com-ptr.h" - -// We will be rendering with Direct3D 11, so we need to include -// the Windows and D3D11 headers - -#define WIN32_LEAN_AND_MEAN -#define NOMINMAX -#include -#undef WIN32_LEAN_AND_MEAN -#undef NOMINMAX - -#include -#include - -// We will use the C standard library just for printing error messages. -#include - -#ifdef _MSC_VER -#include -#if (_MSC_VER < 1900) -#define snprintf sprintf_s -#endif -#endif -// -using namespace Slang; - -namespace slang_graphics { - -class D3D11Renderer : public Renderer -{ -public: - // Renderer implementation - virtual SlangResult initialize(const Desc& desc, void* inWindowHandle) override; - virtual void setClearColor(const float color[4]) override; - virtual void clearFrame() override; - virtual void presentFrame() override; - virtual TextureResource* createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData) override; - virtual BufferResource* createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& bufferDesc, const void* initData) override; - virtual SlangResult captureScreenSurface(Surface& surfaceOut) override; - virtual InputLayout* createInputLayout( const InputElementDesc* inputElements, UInt inputElementCount) override; - virtual BindingState* createBindingState(const BindingState::Desc& desc) override; - virtual ShaderProgram* createProgram(const ShaderProgram::Desc& desc) override; - virtual void* 
map(BufferResource* buffer, MapFlavor flavor) override; - virtual void unmap(BufferResource* buffer) override; - virtual void setInputLayout(InputLayout* inputLayout) override; - virtual void setPrimitiveTopology(PrimitiveTopology topology) override; - virtual void setBindingState(BindingState * state); - virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) override; - virtual void setShaderProgram(ShaderProgram* inProgram) override; - virtual void draw(UInt vertexCount, UInt startVertex) override; - virtual void dispatchCompute(int x, int y, int z) override; - virtual void submitGpuWork() override {} - virtual void waitForGpu() override {} - virtual RendererType getRendererType() const override { return RendererType::DirectX11; } - - protected: - - struct BindingDetail - { - ComPtr m_srv; - ComPtr m_uav; - ComPtr m_samplerState; - }; - - class BindingStateImpl: public BindingState - { - public: - typedef BindingState Parent; - - /// Ctor - BindingStateImpl(const Desc& desc): - Parent(desc) - {} - - List m_bindingDetails; - }; - - class ShaderProgramImpl: public ShaderProgram - { - public: - ComPtr m_vertexShader; - ComPtr m_pixelShader; - ComPtr m_computeShader; - }; - - class BufferResourceImpl: public BufferResource - { - public: - typedef BufferResource Parent; - - BufferResourceImpl(const Desc& desc, Usage initialUsage): - Parent(desc), - m_initialUsage(initialUsage) - { - } - - MapFlavor m_mapFlavor; - Usage m_initialUsage; - ComPtr m_buffer; - ComPtr m_staging; - }; - class TextureResourceImpl : public TextureResource - { - public: - typedef TextureResource Parent; - - TextureResourceImpl(const Desc& desc, Usage initialUsage) : - Parent(desc), - m_initialUsage(initialUsage) - { - } - Usage m_initialUsage; - ComPtr m_resource; - }; - - class InputLayoutImpl: public InputLayout - { - public: - ComPtr m_layout; - }; - - /// Capture a texture to a file - static HRESULT 
captureTextureToSurface(ID3D11Device* device, ID3D11DeviceContext* context, ID3D11Texture2D* texture, Surface& surfaceOut); - - void _applyBindingState(bool isCompute); - - ComPtr m_swapChain; - ComPtr m_device; - ComPtr m_immediateContext; - ComPtr m_backBufferTexture; - - List > m_renderTargetViews; - List > m_renderTargetTextures; - - RefPtr m_currentBindings; - - Desc m_desc; - - float m_clearColor[4] = { 0, 0, 0, 0 }; -}; - -Renderer* createD3D11Renderer() -{ - return new D3D11Renderer(); -} - -/* static */HRESULT D3D11Renderer::captureTextureToSurface(ID3D11Device* device, ID3D11DeviceContext* context, ID3D11Texture2D* texture, Surface& surfaceOut) -{ - if (!context) return E_INVALIDARG; - if (!texture) return E_INVALIDARG; - - D3D11_TEXTURE2D_DESC textureDesc; - texture->GetDesc(&textureDesc); - - // Don't bother supporting MSAA for right now - if (textureDesc.SampleDesc.Count > 1) - { - fprintf(stderr, "ERROR: cannot capture multi-sample texture\n"); - return E_INVALIDARG; - } - - HRESULT hr = S_OK; - ComPtr stagingTexture; - - if (textureDesc.Usage == D3D11_USAGE_STAGING && (textureDesc.CPUAccessFlags & D3D11_CPU_ACCESS_READ)) - { - stagingTexture = texture; - } - else - { - // Modify the descriptor to give us a staging texture - textureDesc.BindFlags = 0; - textureDesc.MiscFlags &= ~D3D11_RESOURCE_MISC_TEXTURECUBE; - textureDesc.CPUAccessFlags = D3D11_CPU_ACCESS_READ; - textureDesc.Usage = D3D11_USAGE_STAGING; - - hr = device->CreateTexture2D(&textureDesc, 0, stagingTexture.writeRef()); - if (FAILED(hr)) - { - fprintf(stderr, "ERROR: failed to create staging texture\n"); - return hr; - } - - context->CopyResource(stagingTexture, texture); - } - - // Now just read back texels from the staging textures - { - D3D11_MAPPED_SUBRESOURCE mappedResource; - SLANG_RETURN_ON_FAIL(context->Map(stagingTexture, 0, D3D11_MAP_READ, 0, &mappedResource)); - - Result res = surfaceOut.set(textureDesc.Width, textureDesc.Height, Format::RGBA_Unorm_UInt8, 
mappedResource.RowPitch, mappedResource.pData, SurfaceAllocator::getMallocAllocator()); - - // Make sure to unmap - context->Unmap(stagingTexture, 0); - return res; - } -} - -// !!!!!!!!!!!!!!!!!!!!!!!!!!!! Renderer interface !!!!!!!!!!!!!!!!!!!!!!!!!! - -SlangResult D3D11Renderer::initialize(const Desc& desc, void* inWindowHandle) -{ - auto windowHandle = (HWND)inWindowHandle; - m_desc = desc; - - // Rather than statically link against D3D, we load it dynamically. - HMODULE d3dModule = LoadLibraryA("d3d11.dll"); - if (!d3dModule) - { - fprintf(stderr, "error: failed load 'd3d11.dll'\n"); - return SLANG_FAIL; - } - - PFN_D3D11_CREATE_DEVICE_AND_SWAP_CHAIN D3D11CreateDeviceAndSwapChain_ = - (PFN_D3D11_CREATE_DEVICE_AND_SWAP_CHAIN)GetProcAddress(d3dModule, "D3D11CreateDeviceAndSwapChain"); - if (!D3D11CreateDeviceAndSwapChain_) - { - fprintf(stderr, - "error: failed load symbol 'D3D11CreateDeviceAndSwapChain'\n"); - return SLANG_FAIL; - } - - // We create our device in debug mode, just so that we can check that the - // example doesn't trigger warnings. - UINT deviceFlags = 0; - deviceFlags |= D3D11_CREATE_DEVICE_DEBUG; - - // Our swap chain uses RGBA8 with sRGB, with double buffering. - DXGI_SWAP_CHAIN_DESC swapChainDesc = { 0 }; - swapChainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT; - - // Note(tfoley): Disabling sRGB for DX back buffer for now, so that we - // can get consistent output with OpenGL, where setting up sRGB will - // probably be more involved. - // swapChainDesc.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM_SRGB; - swapChainDesc.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM; - - swapChainDesc.SampleDesc.Count = 1; - swapChainDesc.SampleDesc.Quality = 0; - swapChainDesc.BufferCount = 2; - swapChainDesc.OutputWindow = windowHandle; - swapChainDesc.Windowed = TRUE; - swapChainDesc.SwapEffect = DXGI_SWAP_EFFECT_DISCARD; - swapChainDesc.Flags = 0; - - // We will ask for the highest feature level that can be supported. 
- const D3D_FEATURE_LEVEL featureLevels[] = { - D3D_FEATURE_LEVEL_11_1, - D3D_FEATURE_LEVEL_11_0, - D3D_FEATURE_LEVEL_10_1, - D3D_FEATURE_LEVEL_10_0, - D3D_FEATURE_LEVEL_9_3, - D3D_FEATURE_LEVEL_9_2, - D3D_FEATURE_LEVEL_9_1, - }; - D3D_FEATURE_LEVEL featureLevel = D3D_FEATURE_LEVEL_9_1; - const int totalNumFeatureLevels = SLANG_COUNT_OF(featureLevels); - - // On a machine that does not have an up-to-date version of D3D installed, - // the `D3D11CreateDeviceAndSwapChain` call will fail with `E_INVALIDARG` - // if you ask for featuer level 11_1. The workaround is to call - // `D3D11CreateDeviceAndSwapChain` up to twice: the first time with 11_1 - // at the start of the list of requested feature levels, and the second - // time without it. - - for (int ii = 0; ii < 2; ++ii) - { - const HRESULT hr = D3D11CreateDeviceAndSwapChain_( - nullptr, // adapter (use default) - D3D_DRIVER_TYPE_REFERENCE, - //D3D_DRIVER_TYPE_HARDWARE, - nullptr, // software - deviceFlags, - &featureLevels[ii], - totalNumFeatureLevels - ii, - D3D11_SDK_VERSION, - &swapChainDesc, - m_swapChain.writeRef(), - m_device.writeRef(), - &featureLevel, - m_immediateContext.writeRef()); - - // Failures with `E_INVALIDARG` might be due to feature level 11_1 - // not being supported. - if (hr == E_INVALIDARG) - { - continue; - } - - // Other failures are real, though. - SLANG_RETURN_ON_FAIL(hr); - // We must have a swap chain - break; - } - - // After we've created the swap chain, we can request a pointer to the - // back buffer as a D3D11 texture, and create a render-target view from it. 
- - static const IID kIID_ID3D11Texture2D = { - 0x6f15aaf2, 0xd208, 0x4e89, 0x9a, 0xb4, 0x48, - 0x95, 0x35, 0xd3, 0x4f, 0x9c }; - - SLANG_RETURN_ON_FAIL(m_swapChain->GetBuffer(0, kIID_ID3D11Texture2D, (void**)m_backBufferTexture.writeRef())); - - for (int i = 0; i < 8; i++) - { - ComPtr texture; - D3D11_TEXTURE2D_DESC textureDesc; - m_backBufferTexture->GetDesc(&textureDesc); - SLANG_RETURN_ON_FAIL(m_device->CreateTexture2D(&textureDesc, nullptr, texture.writeRef())); - - ComPtr rtv; - D3D11_RENDER_TARGET_VIEW_DESC rtvDesc; - rtvDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM; - rtvDesc.Texture2D.MipSlice = 0; - rtvDesc.ViewDimension = D3D11_RTV_DIMENSION_TEXTURE2D; - SLANG_RETURN_ON_FAIL(m_device->CreateRenderTargetView(texture, &rtvDesc, rtv.writeRef())); - - m_renderTargetViews.Add(rtv); - m_renderTargetTextures.Add(texture); - } - - m_immediateContext->OMSetRenderTargets((UINT)m_renderTargetViews.Count(), m_renderTargetViews.Buffer()->readRef(), nullptr); - - // Similarly, we are going to set up a viewport once, and then never - // switch, since this is a simple test app. 
- D3D11_VIEWPORT viewport; - viewport.TopLeftX = 0; - viewport.TopLeftY = 0; - viewport.Width = (float)desc.width; - viewport.Height = (float)desc.height; - viewport.MaxDepth = 1; // TODO(tfoley): use reversed depth - viewport.MinDepth = 0; - m_immediateContext->RSSetViewports(1, &viewport); - - return SLANG_OK; -} - -void D3D11Renderer::setClearColor(const float color[4]) -{ - memcpy(m_clearColor, color, sizeof(m_clearColor)); -} - -void D3D11Renderer::clearFrame() -{ - for (auto i = 0u; i < m_renderTargetViews.Count(); i++) - { - m_immediateContext->ClearRenderTargetView(m_renderTargetViews[i], m_clearColor); - } -} - -void D3D11Renderer::presentFrame() -{ - m_immediateContext->CopyResource(m_backBufferTexture, m_renderTargetTextures[0]); - m_swapChain->Present(0, 0); -} - -SlangResult D3D11Renderer::captureScreenSurface(Surface& surfaceOut) -{ - return captureTextureToSurface(m_device, m_immediateContext, m_renderTargetTextures[0], surfaceOut); -} - -static D3D11_BIND_FLAG _calcResourceFlag(Resource::BindFlag::Enum bindFlag) -{ - typedef Resource::BindFlag BindFlag; - switch (bindFlag) - { - case BindFlag::VertexBuffer: return D3D11_BIND_VERTEX_BUFFER; - case BindFlag::IndexBuffer: return D3D11_BIND_INDEX_BUFFER; - case BindFlag::ConstantBuffer: return D3D11_BIND_CONSTANT_BUFFER; - case BindFlag::StreamOutput: return D3D11_BIND_STREAM_OUTPUT; - case BindFlag::RenderTarget: return D3D11_BIND_RENDER_TARGET; - case BindFlag::DepthStencil: return D3D11_BIND_DEPTH_STENCIL; - case BindFlag::UnorderedAccess: return D3D11_BIND_UNORDERED_ACCESS; - case BindFlag::PixelShaderResource: return D3D11_BIND_SHADER_RESOURCE; - case BindFlag::NonPixelShaderResource: return D3D11_BIND_SHADER_RESOURCE; - default: return D3D11_BIND_FLAG(0); - } -} - -static int _calcResourceBindFlags(int bindFlags) -{ - int dstFlags = 0; - while (bindFlags) - { - int lsb = bindFlags & -bindFlags; - - dstFlags |= _calcResourceFlag(Resource::BindFlag::Enum(lsb)); - bindFlags &= ~lsb; - } - return 
dstFlags; -} - -static int _calcResourceAccessFlags(int accessFlags) -{ - switch (accessFlags) - { - case 0: return 0; - case Resource::AccessFlag::Read: return D3D11_CPU_ACCESS_READ; - case Resource::AccessFlag::Write: return D3D11_CPU_ACCESS_WRITE; - case Resource::AccessFlag::Read | - Resource::AccessFlag::Write: return D3D11_CPU_ACCESS_READ | D3D11_CPU_ACCESS_WRITE; - default: assert(!"Invalid flags"); return 0; - } -} - -TextureResource* D3D11Renderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData) -{ - TextureResource::Desc srcDesc(descIn); - srcDesc.setDefaults(initialUsage); - - const int effectiveArraySize = srcDesc.calcEffectiveArraySize(); - - assert(initData); - assert(initData->numSubResources == srcDesc.numMipLevels * effectiveArraySize * srcDesc.size.depth); - - const DXGI_FORMAT format = D3DUtil::getMapFormat(srcDesc.format); - if (format == DXGI_FORMAT_UNKNOWN) - { - return nullptr; - } - - const int bindFlags = _calcResourceBindFlags(srcDesc.bindFlags); - - // Set up the initialize data - List subRes; - subRes.SetSize(srcDesc.numMipLevels * effectiveArraySize); - { - int subResourceIndex = 0; - for (int i = 0; i < effectiveArraySize; i++) - { - for (int j = 0; j < srcDesc.numMipLevels; j++) - { - const int mipHeight = TextureResource::calcMipSize(srcDesc.size.height, j); - - D3D11_SUBRESOURCE_DATA& data = subRes[subResourceIndex]; - - data.pSysMem = initData->subResources[subResourceIndex]; - - data.SysMemPitch = UINT(initData->mipRowStrides[j]); - data.SysMemSlicePitch = UINT(initData->mipRowStrides[j] * mipHeight); - - subResourceIndex++; - } - } - } - - const int accessFlags = _calcResourceAccessFlags(srcDesc.cpuAccessFlags); - - RefPtr texture(new TextureResourceImpl(srcDesc, initialUsage)); - - switch (srcDesc.type) - { - case Resource::Type::Texture1D: - { - D3D11_TEXTURE1D_DESC desc = { 0 }; - desc.BindFlags = bindFlags; - desc.CPUAccessFlags = accessFlags; - 
desc.Format = format; - desc.MiscFlags = 0; - desc.MipLevels = srcDesc.numMipLevels; - desc.ArraySize = effectiveArraySize; - desc.Width = srcDesc.size.width; - desc.Usage = D3D11_USAGE_DEFAULT; - - ComPtr texture1D; - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateTexture1D(&desc, subRes.Buffer(), texture1D.writeRef())); - - texture->m_resource = texture1D; - break; - } - case Resource::Type::TextureCube: - case Resource::Type::Texture2D: - { - D3D11_TEXTURE2D_DESC desc = { 0 }; - desc.BindFlags = bindFlags; - desc.CPUAccessFlags = accessFlags; - desc.Format = format; - desc.MiscFlags = 0; - desc.MipLevels = srcDesc.numMipLevels; - desc.ArraySize = effectiveArraySize; - - desc.Width = srcDesc.size.width; - desc.Height = srcDesc.size.height; - desc.Usage = D3D11_USAGE_DEFAULT; - desc.SampleDesc.Count = srcDesc.sampleDesc.numSamples; - desc.SampleDesc.Quality = srcDesc.sampleDesc.quality; - - if (srcDesc.type == Resource::Type::TextureCube) - { - desc.MiscFlags |= D3D11_RESOURCE_MISC_TEXTURECUBE; - } - - ComPtr texture2D; - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateTexture2D(&desc, subRes.Buffer(), texture2D.writeRef())); - - texture->m_resource = texture2D; - break; - } - case Resource::Type::Texture3D: - { - D3D11_TEXTURE3D_DESC desc = { 0 }; - desc.BindFlags = bindFlags; - desc.CPUAccessFlags = accessFlags; - desc.Format = format; - desc.MiscFlags = 0; - desc.MipLevels = srcDesc.numMipLevels; - desc.Width = srcDesc.size.width; - desc.Height = srcDesc.size.height; - desc.Depth = srcDesc.size.depth; - desc.Usage = D3D11_USAGE_DEFAULT; - - ComPtr texture3D; - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateTexture3D(&desc, subRes.Buffer(), texture3D.writeRef())); - - texture->m_resource = texture3D; - break; - } - default: return nullptr; - } - - return texture.detach(); -} - -BufferResource* D3D11Renderer::createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData) -{ - BufferResource::Desc srcDesc(descIn); - 
srcDesc.setDefaults(initialUsage); - - // Make aligned to 256 bytes... not sure why, but if you remove this the tests do fail. - const size_t alignedSizeInBytes = D3DUtil::calcAligned(srcDesc.sizeInBytes, 256); - - // Hack to make the initialization never read from out of bounds memory, by copying into a buffer - List initDataBuffer; - if (initData && alignedSizeInBytes > srcDesc.sizeInBytes) - { - initDataBuffer.SetSize(alignedSizeInBytes); - ::memcpy(initDataBuffer.Buffer(), initData, srcDesc.sizeInBytes); - initData = initDataBuffer.Buffer(); - } - - D3D11_BUFFER_DESC bufferDesc = { 0 }; - bufferDesc.ByteWidth = UINT(alignedSizeInBytes); - bufferDesc.BindFlags = _calcResourceBindFlags(srcDesc.bindFlags); - // For read we'll need to do some staging - bufferDesc.CPUAccessFlags = _calcResourceAccessFlags(descIn.cpuAccessFlags & Resource::AccessFlag::Write); - bufferDesc.Usage = D3D11_USAGE_DEFAULT; - - // If written by CPU, make it dynamic - if (descIn.cpuAccessFlags & Resource::AccessFlag::Write) - { - bufferDesc.Usage = D3D11_USAGE_DYNAMIC; - } - - switch (initialUsage) - { - case Resource::Usage::ConstantBuffer: - { - // We'll just assume ConstantBuffers are dynamic for now - bufferDesc.Usage = D3D11_USAGE_DYNAMIC; - break; - } - default: break; - } - - if (bufferDesc.BindFlags & (D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE)) - { - //desc.BindFlags = D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE; - if (srcDesc.elementSize != 0) - { - bufferDesc.StructureByteStride = srcDesc.elementSize; - bufferDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED; - } - else - { - bufferDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS; - } - } - - D3D11_SUBRESOURCE_DATA subResourceData = { 0 }; - subResourceData.pSysMem = initData; - - RefPtr buffer(new BufferResourceImpl(srcDesc, initialUsage)); - - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateBuffer(&bufferDesc, initData ? 
&subResourceData : nullptr, buffer->m_buffer.writeRef())); - - if (srcDesc.cpuAccessFlags & Resource::AccessFlag::Read) - { - D3D11_BUFFER_DESC bufDesc = {}; - bufDesc.BindFlags = 0; - bufDesc.ByteWidth = (UINT)alignedSizeInBytes; - bufDesc.CPUAccessFlags = D3D11_CPU_ACCESS_READ; - bufDesc.Usage = D3D11_USAGE_STAGING; - - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateBuffer(&bufDesc, nullptr, buffer->m_staging.writeRef())); - } - - return buffer.detach(); -} - -InputLayout* D3D11Renderer::createInputLayout(const InputElementDesc* inputElementsIn, UInt inputElementCount) -{ - D3D11_INPUT_ELEMENT_DESC inputElements[16] = {}; - - char hlslBuffer[1024]; - char* hlslCursor = &hlslBuffer[0]; - - hlslCursor += sprintf(hlslCursor, "float4 main(\n"); - - for (UInt ii = 0; ii < inputElementCount; ++ii) - { - inputElements[ii].SemanticName = inputElementsIn[ii].semanticName; - inputElements[ii].SemanticIndex = (UINT)inputElementsIn[ii].semanticIndex; - inputElements[ii].Format = D3DUtil::getMapFormat(inputElementsIn[ii].format); - inputElements[ii].InputSlot = 0; - inputElements[ii].AlignedByteOffset = (UINT)inputElementsIn[ii].offset; - inputElements[ii].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA; - inputElements[ii].InstanceDataStepRate = 0; - - if (ii != 0) - { - hlslCursor += sprintf(hlslCursor, ",\n"); - } - - char const* typeName = "Unknown"; - switch (inputElementsIn[ii].format) - { - case Format::RGBA_Float32: - typeName = "float4"; - break; - case Format::RGB_Float32: - typeName = "float3"; - break; - case Format::RG_Float32: - typeName = "float2"; - break; - case Format::R_Float32: - typeName = "float"; - break; - default: - return nullptr; - } - - hlslCursor += sprintf(hlslCursor, "%s a%d : %s%d", - typeName, - (int)ii, - inputElementsIn[ii].semanticName, - (int)inputElementsIn[ii].semanticIndex); - } - - hlslCursor += sprintf(hlslCursor, "\n) : SV_Position { return 0; }"); - - ComPtr vertexShaderBlob; - 
SLANG_RETURN_NULL_ON_FAIL(D3DUtil::compileHLSLShader("inputLayout", hlslBuffer, "main", "vs_5_0", vertexShaderBlob)); - - ComPtr inputLayout; - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateInputLayout(&inputElements[0], (UINT)inputElementCount, vertexShaderBlob->GetBufferPointer(), vertexShaderBlob->GetBufferSize(), - inputLayout.writeRef())); - - InputLayoutImpl* impl = new InputLayoutImpl; - impl->m_layout.swap(inputLayout); - - return impl; -} - -void* D3D11Renderer::map(BufferResource* bufferIn, MapFlavor flavor) -{ - BufferResourceImpl* bufferResource = static_cast(bufferIn); - - D3D11_MAP mapType; - ID3D11Buffer* buffer = bufferResource->m_buffer; - - switch (flavor) - { - case MapFlavor::WriteDiscard: - mapType = D3D11_MAP_WRITE_DISCARD; - break; - case MapFlavor::HostWrite: - mapType = D3D11_MAP_WRITE; - break; - case MapFlavor::HostRead: - mapType = D3D11_MAP_READ; - - buffer = bufferResource->m_staging; - if (!buffer) - { - return nullptr; - } - - // Okay copy the data over - m_immediateContext->CopyResource(buffer, bufferResource->m_buffer); - - break; - default: - return nullptr; - } - - // We update our constant buffer per-frame, just for the purposes - // of the example, but we don't actually load different data - // per-frame (we always use an identity projection). - D3D11_MAPPED_SUBRESOURCE mappedSub; - SLANG_RETURN_NULL_ON_FAIL(m_immediateContext->Map(buffer, 0, mapType, 0, &mappedSub)); - - bufferResource->m_mapFlavor = flavor; - - return mappedSub.pData; -} - -void D3D11Renderer::unmap(BufferResource* bufferIn) -{ - BufferResourceImpl* bufferResource = static_cast(bufferIn); - ID3D11Buffer* buffer = (bufferResource->m_mapFlavor == MapFlavor::HostRead) ? 
bufferResource->m_staging : bufferResource->m_buffer; - m_immediateContext->Unmap(buffer, 0); -} - -void D3D11Renderer::setInputLayout(InputLayout* inputLayoutIn) -{ - auto inputLayout = static_cast(inputLayoutIn); - m_immediateContext->IASetInputLayout(inputLayout->m_layout); -} - -void D3D11Renderer::setPrimitiveTopology(PrimitiveTopology topology) -{ - m_immediateContext->IASetPrimitiveTopology(D3DUtil::getPrimitiveTopology(topology)); -} - -void D3D11Renderer::setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffersIn, const UInt* stridesIn, const UInt* offsetsIn) -{ - static const int kMaxVertexBuffers = 16; - assert(slotCount <= kMaxVertexBuffers); - - UINT vertexStrides[kMaxVertexBuffers]; - UINT vertexOffsets[kMaxVertexBuffers]; - ID3D11Buffer* dxBuffers[kMaxVertexBuffers]; - - auto buffers = (BufferResourceImpl*const*)buffersIn; - - for (UInt ii = 0; ii < slotCount; ++ii) - { - vertexStrides[ii] = (UINT)stridesIn[ii]; - vertexOffsets[ii] = (UINT)offsetsIn[ii]; - dxBuffers[ii] = buffers[ii]->m_buffer; - } - - m_immediateContext->IASetVertexBuffers((UINT)startSlot, (UINT)slotCount, dxBuffers, &vertexStrides[0], &vertexOffsets[0]); -} - -void D3D11Renderer::setShaderProgram(ShaderProgram* programIn) -{ - auto program = (ShaderProgramImpl*)programIn; - m_immediateContext->CSSetShader(program->m_computeShader, nullptr, 0); - m_immediateContext->VSSetShader(program->m_vertexShader, nullptr, 0); - m_immediateContext->PSSetShader(program->m_pixelShader, nullptr, 0); -} - -void D3D11Renderer::draw(UInt vertexCount, UInt startVertex) -{ - _applyBindingState(false); - m_immediateContext->Draw((UINT)vertexCount, (UINT)startVertex); -} - -ShaderProgram* D3D11Renderer::createProgram(const ShaderProgram::Desc& desc) -{ - if (desc.pipelineType == PipelineType::Compute) - { - auto computeKernel = desc.findKernel(StageType::Compute); - - ComPtr computeShader; - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateComputeShader(computeKernel->codeBegin, 
computeKernel->getCodeSize(), nullptr, computeShader.writeRef())); - - ShaderProgramImpl* shaderProgram = new ShaderProgramImpl(); - shaderProgram->m_computeShader.swap(computeShader); - return shaderProgram; - } - else - { - auto vertexKernel = desc.findKernel(StageType::Vertex); - auto fragmentKernel = desc.findKernel(StageType::Fragment); - - ComPtr vertexShader; - ComPtr pixelShader; - - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateVertexShader(vertexKernel->codeBegin, vertexKernel->getCodeSize(), nullptr, vertexShader.writeRef())); - SLANG_RETURN_NULL_ON_FAIL(m_device->CreatePixelShader(fragmentKernel->codeBegin, fragmentKernel->getCodeSize(), nullptr, pixelShader.writeRef())); - - ShaderProgramImpl* shaderProgram = new ShaderProgramImpl(); - shaderProgram->m_vertexShader.swap(vertexShader); - shaderProgram->m_pixelShader.swap(pixelShader); - return shaderProgram; - } -} - -void D3D11Renderer::dispatchCompute(int x, int y, int z) -{ - _applyBindingState(true); - m_immediateContext->Dispatch(x, y, z); -} - -BindingState* D3D11Renderer::createBindingState(const BindingState::Desc& bindingStateDesc) -{ - RefPtr bindingState(new BindingStateImpl(bindingStateDesc)); - - const auto& srcBindings = bindingStateDesc.m_bindings; - const int numBindings = int(srcBindings.Count()); - - auto& dstDetails = bindingState->m_bindingDetails; - dstDetails.SetSize(numBindings); - - for (int i = 0; i < numBindings; ++i) - { - auto& dstDetail = dstDetails[i]; - const auto& srcBinding = srcBindings[i]; - - assert(srcBinding.registerRange.isSingle()); - - switch (srcBinding.bindingType) - { - case BindingType::Buffer: - { - assert(srcBinding.resource && srcBinding.resource->isBuffer()); - - BufferResourceImpl* buffer = static_cast(srcBinding.resource.Ptr()); - const BufferResource::Desc& bufferDesc = buffer->getDesc(); - - const int elemSize = bufferDesc.elementSize <= 0 ? 
1 : bufferDesc.elementSize; - - if (bufferDesc.bindFlags & Resource::BindFlag::UnorderedAccess) - { - D3D11_UNORDERED_ACCESS_VIEW_DESC viewDesc; - memset(&viewDesc, 0, sizeof(viewDesc)); - viewDesc.Buffer.FirstElement = 0; - viewDesc.Buffer.NumElements = (UINT)(bufferDesc.sizeInBytes / elemSize); - viewDesc.Buffer.Flags = 0; - viewDesc.ViewDimension = D3D11_UAV_DIMENSION_BUFFER; - viewDesc.Format = D3DUtil::getMapFormat(bufferDesc.format); - - if (bufferDesc.elementSize == 0 && bufferDesc.format == Format::Unknown) - { - viewDesc.Buffer.Flags |= D3D11_BUFFER_UAV_FLAG_RAW; - viewDesc.Format = DXGI_FORMAT_R32_TYPELESS; - } - - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateUnorderedAccessView(buffer->m_buffer, &viewDesc, dstDetail.m_uav.writeRef())); - } - if (bufferDesc.bindFlags & (Resource::BindFlag::NonPixelShaderResource | Resource::BindFlag::PixelShaderResource)) - { - D3D11_SHADER_RESOURCE_VIEW_DESC viewDesc; - memset(&viewDesc, 0, sizeof(viewDesc)); - viewDesc.Buffer.FirstElement = 0; - viewDesc.Buffer.ElementWidth = elemSize; - viewDesc.Buffer.NumElements = (UINT)(bufferDesc.sizeInBytes / elemSize); - viewDesc.Buffer.ElementOffset = 0; - viewDesc.ViewDimension = D3D11_SRV_DIMENSION_BUFFER; - viewDesc.Format = DXGI_FORMAT_UNKNOWN; - - if (bufferDesc.elementSize == 0) - { - viewDesc.Format = DXGI_FORMAT_R32_FLOAT; - } - - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateShaderResourceView(buffer->m_buffer, &viewDesc, dstDetail.m_srv.writeRef())); - } - break; - } - case BindingType::Texture: - case BindingType::CombinedTextureSampler: - { - assert(srcBinding.resource && srcBinding.resource->isTexture()); - - TextureResourceImpl* texture = static_cast(srcBinding.resource.Ptr()); - - const TextureResource::Desc& textureDesc = texture->getDesc(); - - D3D11_SHADER_RESOURCE_VIEW_DESC viewDesc; - viewDesc.Format = D3DUtil::getMapFormat(textureDesc.format); - - switch (texture->getType()) - { - case Resource::Type::Texture1D: - { - if (textureDesc.arraySize <= 0) - { - 
viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE1D; - viewDesc.Texture1D.MipLevels = textureDesc.numMipLevels; - viewDesc.Texture1D.MostDetailedMip = 0; - } - else - { - viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE1DARRAY; - viewDesc.Texture1DArray.ArraySize = textureDesc.arraySize; - viewDesc.Texture1DArray.FirstArraySlice = 0; - viewDesc.Texture1DArray.MipLevels = textureDesc.numMipLevels; - viewDesc.Texture1DArray.MostDetailedMip = 0; - } - break; - } - case Resource::Type::Texture2D: - { - if (textureDesc.arraySize <= 0) - { - viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D; - viewDesc.Texture2D.MipLevels = textureDesc.numMipLevels; - viewDesc.Texture2D.MostDetailedMip = 0; - } - else - { - viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2DARRAY; - viewDesc.Texture2DArray.ArraySize = textureDesc.arraySize; - viewDesc.Texture2DArray.FirstArraySlice = 0; - viewDesc.Texture2DArray.MipLevels = textureDesc.numMipLevels; - viewDesc.Texture2DArray.MostDetailedMip = 0; - } - break; - } - case Resource::Type::TextureCube: - { - if (textureDesc.arraySize <= 0) - { - viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURECUBE; - viewDesc.TextureCube.MipLevels = textureDesc.numMipLevels; - viewDesc.TextureCube.MostDetailedMip = 0; - } - else - { - viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURECUBEARRAY; - viewDesc.TextureCubeArray.MipLevels = textureDesc.numMipLevels; - viewDesc.TextureCubeArray.MostDetailedMip = 0; - viewDesc.TextureCubeArray.First2DArrayFace = 0; - viewDesc.TextureCubeArray.NumCubes = textureDesc.arraySize; - } - break; - } - case Resource::Type::Texture3D: - { - viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE3D; - viewDesc.Texture3D.MipLevels = textureDesc.numMipLevels; // Old code fixed as one - viewDesc.Texture3D.MostDetailedMip = 0; - break; - } - default: - { - assert(!"Unhandled type"); - return nullptr; - } - } - - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateShaderResourceView(texture->m_resource, &viewDesc, 
dstDetail.m_srv.writeRef())); - break; - } - case BindingType::Sampler: - { - const BindingState::SamplerDesc& samplerDesc = bindingStateDesc.m_samplerDescs[srcBinding.descIndex]; - - D3D11_SAMPLER_DESC desc = {}; - desc.AddressU = desc.AddressV = desc.AddressW = D3D11_TEXTURE_ADDRESS_WRAP; - - if (samplerDesc.isCompareSampler) - { - desc.ComparisonFunc = D3D11_COMPARISON_LESS_EQUAL; - desc.Filter = D3D11_FILTER_MIN_LINEAR_MAG_MIP_POINT; - desc.MinLOD = desc.MaxLOD = 0.0f; - } - else - { - desc.Filter = D3D11_FILTER_ANISOTROPIC; - desc.MaxAnisotropy = 8; - desc.MinLOD = 0.0f; - desc.MaxLOD = 100.0f; - } - SLANG_RETURN_NULL_ON_FAIL(m_device->CreateSamplerState(&desc, dstDetail.m_samplerState.writeRef())); - break; - } - default: - { - assert(!"Unhandled type"); - return nullptr; - } - } - } - - // Done - return bindingState.detach(); -} - -void D3D11Renderer::_applyBindingState(bool isCompute) -{ - auto context = m_immediateContext.get(); - - const auto& details = m_currentBindings->m_bindingDetails; - const auto& bindings = m_currentBindings->getDesc().m_bindings; - - const int numBindings = int(bindings.Count()); - - for (int i = 0; i < numBindings; ++i) - { - const auto& binding = bindings[i]; - const auto& detail = details[i]; - - const int bindingIndex = binding.registerRange.getSingleIndex(); - - switch (binding.bindingType) - { - case BindingType::Buffer: - { - assert(binding.resource && binding.resource->isBuffer()); - if (binding.resource->canBind(Resource::BindFlag::ConstantBuffer)) - { - ID3D11Buffer* buffer = static_cast(binding.resource.Ptr())->m_buffer; - if (isCompute) - context->CSSetConstantBuffers(bindingIndex, 1, &buffer); - else - { - context->VSSetConstantBuffers(bindingIndex, 1, &buffer); - context->PSSetConstantBuffers(bindingIndex, 1, &buffer); - } - } - else if (detail.m_uav) - { - if (isCompute) - context->CSSetUnorderedAccessViews(bindingIndex, 1, detail.m_uav.readRef(), nullptr); - else - 
context->OMSetRenderTargetsAndUnorderedAccessViews(m_currentBindings->getDesc().m_numRenderTargets, - m_renderTargetViews.Buffer()->readRef(), nullptr, bindingIndex, 1, detail.m_uav.readRef(), nullptr); - } - else - { - if (isCompute) - context->CSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); - else - { - context->PSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); - context->VSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); - } - } - break; - } - case BindingType::Texture: - { - if (detail.m_uav) - { - if (isCompute) - context->CSSetUnorderedAccessViews(bindingIndex, 1, detail.m_uav.readRef(), nullptr); - else - context->OMSetRenderTargetsAndUnorderedAccessViews(D3D11_KEEP_RENDER_TARGETS_AND_DEPTH_STENCIL, - nullptr, nullptr, bindingIndex, 1, detail.m_uav.readRef(), nullptr); - } - else - { - if (isCompute) - context->CSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); - else - { - context->PSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); - context->VSSetShaderResources(bindingIndex, 1, detail.m_srv.readRef()); - } - } - break; - } - case BindingType::Sampler: - { - if (isCompute) - context->CSSetSamplers(bindingIndex, 1, detail.m_samplerState.readRef()); - else - { - context->PSSetSamplers(bindingIndex, 1, detail.m_samplerState.readRef()); - context->VSSetSamplers(bindingIndex, 1, detail.m_samplerState.readRef()); - } - break; - } - default: - { - assert(!"Not implemented"); - return; - } - } - } -} - -void D3D11Renderer::setBindingState(BindingState* state) -{ - m_currentBindings = static_cast(state); -} - -} // renderer_test diff --git a/tools/slang-graphics/render-d3d11.h b/tools/slang-graphics/render-d3d11.h deleted file mode 100644 index 7b3d25e9f..000000000 --- a/tools/slang-graphics/render-d3d11.h +++ /dev/null @@ -1,10 +0,0 @@ -// render-d3d11.h -#pragma once - -namespace slang_graphics { - -class Renderer; - -Renderer* createD3D11Renderer(); - -} // slang_graphics diff --git 
a/tools/slang-graphics/render-d3d12.cpp b/tools/slang-graphics/render-d3d12.cpp deleted file mode 100644 index 24c9ecacb..000000000 --- a/tools/slang-graphics/render-d3d12.cpp +++ /dev/null @@ -1,2467 +0,0 @@ -// render-d3d12.cpp -#define _CRT_SECURE_NO_WARNINGS - -#include "render-d3d12.h" - -//WORKING:#include "options.h" -#include "render.h" - -#include "surface.h" - -// In order to use the Slang API, we need to include its header - -//WORKING:#include - -// We will be rendering with Direct3D 12, so we need to include -// the Windows and D3D12 headers - -#define WIN32_LEAN_AND_MEAN -#define NOMINMAX -#include -#undef WIN32_LEAN_AND_MEAN -#undef NOMINMAX - -#include -#include -#include - -#include "../../slang-com-ptr.h" - -#include "resource-d3d12.h" -#include "descriptor-heap-d3d12.h" -#include "circular-resource-heap-d3d12.h" - -#include "d3d-util.h" - -// We will use the C standard library just for printing error messages. -#include - -#ifdef _MSC_VER -#include -#if (_MSC_VER < 1900) -#define snprintf sprintf_s -#endif -#endif -// - -#define ENABLE_DEBUG_LAYER 1 - -namespace slang_graphics { -using namespace Slang; - -class D3D12Renderer : public Renderer -{ -public: - // Renderer implementation - virtual SlangResult initialize(const Desc& desc, void* inWindowHandle) override; - virtual void setClearColor(const float color[4]) override; - virtual void clearFrame() override; - virtual void presentFrame() override; - virtual TextureResource* createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData) override; - virtual BufferResource* createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& bufferDesc, const void* initData) override; - virtual SlangResult captureScreenSurface(Surface& surfaceOut) override; - virtual InputLayout* createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount) override; - virtual BindingState* createBindingState(const 
BindingState::Desc& bindingStateDesc) override; - virtual ShaderProgram* createProgram(const ShaderProgram::Desc& desc) override; - virtual void* map(BufferResource* buffer, MapFlavor flavor) override; - virtual void unmap(BufferResource* buffer) override; - virtual void setInputLayout(InputLayout* inputLayout) override; - virtual void setPrimitiveTopology(PrimitiveTopology topology) override; - virtual void setBindingState(BindingState* state); - virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) override; - virtual void setShaderProgram(ShaderProgram* inProgram) override; - virtual void draw(UInt vertexCount, UInt startVertex) override; - virtual void dispatchCompute(int x, int y, int z) override; - virtual void submitGpuWork() override; - virtual void waitForGpu() override; - virtual RendererType getRendererType() const override { return RendererType::DirectX12; } - - ~D3D12Renderer(); - -protected: - static const Int kMaxNumRenderFrames = 4; - static const Int kMaxNumRenderTargets = 3; - - struct Submitter - { - virtual void setRootConstantBufferView(int index, D3D12_GPU_VIRTUAL_ADDRESS gpuBufferLocation) = 0; - virtual void setRootDescriptorTable(int index, D3D12_GPU_DESCRIPTOR_HANDLE BaseDescriptor) = 0; - virtual void setRootSignature(ID3D12RootSignature* rootSignature) = 0; - }; - - struct FrameInfo - { - FrameInfo() :m_fenceValue(0) {} - void reset() - { - m_commandAllocator.setNull(); - } - ComPtr m_commandAllocator; ///< The command allocator for this frame - UINT64 m_fenceValue; ///< The fence value when rendering this Frame is complete - }; - - class ShaderProgramImpl: public ShaderProgram - { - public: - PipelineType m_pipelineType; - List m_vertexShader; - List m_pixelShader; - List m_computeShader; - }; - - class BufferResourceImpl: public BufferResource - { - public: - typedef BufferResource Parent; - - enum class BackingStyle - { - Unknown, - ResourceBacked, ///< 
The contents is only held within the resource - MemoryBacked, ///< The current contents is held in m_memory and copied to GPU every time it's used (typically used for constant buffers) - }; - - void bindConstantBufferView(D3D12CircularResourceHeap& circularHeap, int index, Submitter* submitter) const - { - switch (m_backingStyle) - { - case BackingStyle::MemoryBacked: - { - const size_t bufferSize = m_memory.Count(); - D3D12CircularResourceHeap::Cursor cursor = circularHeap.allocateConstantBuffer(bufferSize); - ::memcpy(cursor.m_position, m_memory.Buffer(), bufferSize); - // Set the constant buffer - submitter->setRootConstantBufferView(index, circularHeap.getGpuHandle(cursor)); - break; - } - case BackingStyle::ResourceBacked: - { - // Set the constant buffer - submitter->setRootConstantBufferView(index, m_resource.getResource()->GetGPUVirtualAddress()); - break; - } - default: break; - } - } - - BufferResourceImpl(Resource::Usage initialUsage, const Desc& desc): - Parent(desc), - m_mapFlavor(MapFlavor::HostRead), - m_initialUsage(initialUsage) - { - } - - static BackingStyle _calcResourceBackingStyle(Usage usage) - { - switch (usage) - { - case Usage::ConstantBuffer: return BackingStyle::MemoryBacked; - default: return BackingStyle::ResourceBacked; - } - } - - BackingStyle m_backingStyle; ///< How the resource is 'backed' - either as a resource or cpu memory. Cpu memory is typically used for constant buffers. 
- D3D12Resource m_resource; ///< The resource typically in gpu memory - D3D12Resource m_uploadResource; ///< If the resource can be written to, and is in gpu memory (ie not Memory backed), will have upload resource - - Usage m_initialUsage; - - List m_memory; ///< Cpu memory buffer, used if the m_backingStyle is MemoryBacked - MapFlavor m_mapFlavor; ///< If the resource is mapped holds the current mapping flavor - }; - - class TextureResourceImpl: public TextureResource - { - public: - typedef TextureResource Parent; - - TextureResourceImpl(const Desc& desc): - Parent(desc) - { - } - - D3D12Resource m_resource; - }; - - class InputLayoutImpl: public InputLayout - { - public: - List m_elements; - List m_text; ///< Holds all strings to keep in scope - }; - - struct BindingDetail - { - int m_srvIndex = -1; - int m_uavIndex = -1; - int m_samplerIndex = -1; - }; - - class BindingStateImpl: public BindingState - { - public: - typedef BindingState Parent; - - Result init(ID3D12Device* device) - { - // Set up descriptor heaps - SLANG_RETURN_ON_FAIL(m_viewHeap.init(device, 256, D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV, D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE)); - SLANG_RETURN_ON_FAIL(m_samplerHeap.init(device, 16, D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER, D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE)); - return SLANG_OK; - } - - /// Ctor - BindingStateImpl(const Desc& desc) : - Parent(desc) - {} - - List m_bindingDetails; ///< These match 1-1 to the bindings in the m_desc - - D3D12DescriptorHeap m_viewHeap; ///< Cbv, Srv, Uav - D3D12DescriptorHeap m_samplerHeap; ///< Heap for samplers - }; - - class RenderState: public RefObject - { - public: - D3D12_PRIMITIVE_TOPOLOGY_TYPE m_primitiveTopologyType; - RefPtr m_bindingState; - RefPtr m_inputLayout; - RefPtr m_shaderProgram; - - ComPtr m_rootSignature; - ComPtr m_pipelineState; - }; - - struct BoundVertexBuffer - { - RefPtr m_buffer; - int m_stride; - int m_offset; - }; - - struct BindParameters - { - enum - { - kMaxRanges = 16, - 
kMaxParameters = 32 - }; - - D3D12_DESCRIPTOR_RANGE& nextRange() { return m_ranges[m_rangeIndex++]; } - D3D12_ROOT_PARAMETER& nextParameter() { return m_parameters[m_paramIndex++]; } - - BindParameters(): - m_rangeIndex(0), - m_paramIndex(0) - {} - - D3D12_DESCRIPTOR_RANGE m_ranges[kMaxRanges]; - int m_rangeIndex; - D3D12_ROOT_PARAMETER m_parameters[kMaxParameters]; - int m_paramIndex; - }; - - struct GraphicsSubmitter : public Submitter - { - virtual void setRootConstantBufferView(int index, D3D12_GPU_VIRTUAL_ADDRESS gpuBufferLocation) override - { - m_commandList->SetGraphicsRootConstantBufferView(index, gpuBufferLocation); - } - virtual void setRootDescriptorTable(int index, D3D12_GPU_DESCRIPTOR_HANDLE baseDescriptor) override - { - m_commandList->SetGraphicsRootDescriptorTable(index, baseDescriptor); - } - void setRootSignature(ID3D12RootSignature* rootSignature) - { - m_commandList->SetGraphicsRootSignature(rootSignature); - } - - GraphicsSubmitter(ID3D12GraphicsCommandList* commandList): - m_commandList(commandList) - { - } - - ID3D12GraphicsCommandList* m_commandList; - }; - - struct ComputeSubmitter : public Submitter - { - virtual void setRootConstantBufferView(int index, D3D12_GPU_VIRTUAL_ADDRESS gpuBufferLocation) override - { - m_commandList->SetComputeRootConstantBufferView(index, gpuBufferLocation); - } - virtual void setRootDescriptorTable(int index, D3D12_GPU_DESCRIPTOR_HANDLE baseDescriptor) override - { - m_commandList->SetComputeRootDescriptorTable(index, baseDescriptor); - } - void setRootSignature(ID3D12RootSignature* rootSignature) - { - m_commandList->SetComputeRootSignature(rootSignature); - } - - ComputeSubmitter(ID3D12GraphicsCommandList* commandList) : - m_commandList(commandList) - { - } - - ID3D12GraphicsCommandList* m_commandList; - }; - - static PROC loadProc(HMODULE module, char const* name); - Result createFrameResources(); - /// Blocks until gpu has completed all work - void releaseFrameResources(); - - Result createBuffer(const 
D3D12_RESOURCE_DESC& resourceDesc, const void* srcData, D3D12Resource& uploadResource, D3D12_RESOURCE_STATES finalState, D3D12Resource& resourceOut); - - void beginRender(); - - void endRender(); - - void submitGpuWorkAndWait(); - void _resetCommandList(); - - Result captureTextureToSurface(D3D12Resource& resource, Surface& surfaceOut); - - FrameInfo& getFrame() { return m_frameInfos[m_frameIndex]; } - const FrameInfo& getFrame() const { return m_frameInfos[m_frameIndex]; } - - ID3D12GraphicsCommandList* getCommandList() const { return m_commandList; } - - RenderState* calcRenderState(); - /// From current bindings calculate the root signature and pipeline state - Result calcGraphicsPipelineState(ComPtr& sigOut, ComPtr& pipelineStateOut); - Result calcComputePipelineState(ComPtr& signatureOut, ComPtr& pipelineStateOut); - - Result _bindRenderState(RenderState* renderState, ID3D12GraphicsCommandList* commandList, Submitter* submitter); - - Result _calcBindParameters(BindParameters& params); - RenderState* findRenderState(PipelineType pipelineType); - - PFN_D3D12_SERIALIZE_ROOT_SIGNATURE m_D3D12SerializeRootSignature = nullptr; - - D3D12CircularResourceHeap m_circularResourceHeap; - - int m_commandListOpenCount = 0; ///< If >0 the command list should be open - - List m_boundVertexBuffers; - - RefPtr m_boundShaderProgram; - RefPtr m_boundInputLayout; - RefPtr m_boundBindingState; - - DXGI_FORMAT m_targetFormat = DXGI_FORMAT_R8G8B8A8_UNORM; - DXGI_FORMAT m_depthStencilFormat = DXGI_FORMAT_D24_UNORM_S8_UINT; - bool m_hasVsync = true; - bool m_isFullSpeed = false; - bool m_allowFullScreen = false; - bool m_isMultiSampled = false; - int m_numTargetSamples = 1; ///< The number of multi sample samples - int m_targetSampleQuality = 0; ///< The multi sample quality - - Desc m_desc; - - bool m_isInitialized = false; - - D3D12_PRIMITIVE_TOPOLOGY_TYPE m_primitiveTopologyType = D3D12_PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE; - D3D12_PRIMITIVE_TOPOLOGY m_primitiveTopology = 
D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST; - - float m_clearColor[4] = { 0, 0, 0, 0 }; - - D3D12_VIEWPORT m_viewport = {}; - - ComPtr m_dxDebug; - - ComPtr m_device; - ComPtr m_swapChain; - ComPtr m_commandQueue; - ComPtr m_rtvHeap; - ComPtr m_commandList; - - D3D12_RECT m_scissorRect = {}; - - List > m_renderStates; ///< Holds list of all render state combinations - RenderState* m_currentRenderState = nullptr; ///< The current combination - - UINT m_rtvDescriptorSize = 0; - - ComPtr m_dsvHeap; - UINT m_dsvDescriptorSize = 0; - - // Synchronization objects. - D3D12CounterFence m_fence; - - HANDLE m_swapChainWaitableObject; - - // Frame specific data - int m_numRenderFrames = 0; - UINT m_frameIndex = 0; - FrameInfo m_frameInfos[kMaxNumRenderFrames]; - - int m_numRenderTargets = 2; - int m_renderTargetIndex = 0; - - D3D12Resource* m_backBuffers[kMaxNumRenderTargets]; - D3D12Resource* m_renderTargets[kMaxNumRenderTargets]; - - D3D12Resource m_backBufferResources[kMaxNumRenderTargets]; - D3D12Resource m_renderTargetResources[kMaxNumRenderTargets]; - - D3D12Resource m_depthStencil; - D3D12_CPU_DESCRIPTOR_HANDLE m_depthStencilView = {}; - - int32_t m_depthStencilUsageFlags = 0; ///< D3DUtil::UsageFlag combination for depth stencil - int32_t m_targetUsageFlags = 0; ///< D3DUtil::UsageFlag combination for target - - HWND m_hwnd = nullptr; -}; - -Renderer* createD3D12Renderer() -{ - return new D3D12Renderer; -} - -/* static */PROC D3D12Renderer::loadProc(HMODULE module, char const* name) -{ - PROC proc = ::GetProcAddress(module, name); - if (!proc) - { - fprintf(stderr, "error: failed load symbol '%s'\n", name); - return nullptr; - } - return proc; -} - -void D3D12Renderer::releaseFrameResources() -{ - // https://msdn.microsoft.com/en-us/library/windows/desktop/bb174577%28v=vs.85%29.aspx - - // Release the resources holding references to the swap chain (requirement of - // IDXGISwapChain::ResizeBuffers) and reset the frame fence values to the - // current fence value. 
- for (int i = 0; i < m_numRenderFrames; i++) - { - FrameInfo& info = m_frameInfos[i]; - info.reset(); - info.m_fenceValue = m_fence.getCurrentValue(); - } - for (int i = 0; i < m_numRenderTargets; i++) - { - m_backBuffers[i]->setResourceNull(); - m_renderTargets[i]->setResourceNull(); - } -} - -void D3D12Renderer::waitForGpu() -{ - m_fence.nextSignalAndWait(m_commandQueue); -} - -D3D12Renderer::~D3D12Renderer() -{ - if (m_isInitialized) - { - // Ensure that the GPU is no longer referencing resources that are about to be - // cleaned up by the destructor. - waitForGpu(); - } -} - -static void _initSrvDesc(Resource::Type resourceType, const TextureResource::Desc& textureDesc, const D3D12_RESOURCE_DESC& desc, DXGI_FORMAT pixelFormat, D3D12_SHADER_RESOURCE_VIEW_DESC& descOut) -{ - // create SRV - descOut = D3D12_SHADER_RESOURCE_VIEW_DESC(); - - descOut.Format = (pixelFormat == DXGI_FORMAT_UNKNOWN) ? D3DUtil::calcFormat(D3DUtil::USAGE_SRV, desc.Format) : pixelFormat; - descOut.Shader4ComponentMapping = D3D12_DEFAULT_SHADER_4_COMPONENT_MAPPING; - if (desc.DepthOrArraySize == 1) - { - switch (desc.Dimension) - { - case D3D12_RESOURCE_DIMENSION_TEXTURE1D: descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE1D; break; - case D3D12_RESOURCE_DIMENSION_TEXTURE2D: descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE2D; break; - case D3D12_RESOURCE_DIMENSION_TEXTURE3D: descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE3D; break; - default: assert(!"Unknown dimension"); - } - - descOut.Texture2D.MipLevels = desc.MipLevels; - descOut.Texture2D.MostDetailedMip = 0; - descOut.Texture2D.PlaneSlice = 0; - descOut.Texture2D.ResourceMinLODClamp = 0.0f; - } - else if (resourceType == Resource::Type::TextureCube) - { - if (textureDesc.arraySize > 1) - { - descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURECUBEARRAY; - - descOut.TextureCubeArray.NumCubes = textureDesc.arraySize; - descOut.TextureCubeArray.First2DArrayFace = 0; - descOut.TextureCubeArray.MipLevels = desc.MipLevels; - 
descOut.TextureCubeArray.MostDetailedMip = 0; - descOut.TextureCubeArray.ResourceMinLODClamp = 0; - } - else - { - descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURECUBE; - - descOut.TextureCube.MipLevels = desc.MipLevels; - descOut.TextureCube.MostDetailedMip = 0; - descOut.TextureCube.ResourceMinLODClamp = 0; - } - } - else - { - assert(desc.DepthOrArraySize > 1); - - switch (desc.Dimension) - { - case D3D12_RESOURCE_DIMENSION_TEXTURE1D: descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE1DARRAY; break; - case D3D12_RESOURCE_DIMENSION_TEXTURE2D: descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE2DARRAY; break; - case D3D12_RESOURCE_DIMENSION_TEXTURE3D: descOut.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE3D; break; - - default: assert(!"Unknown dimension"); - } - - descOut.Texture2DArray.ArraySize = desc.DepthOrArraySize; - descOut.Texture2DArray.MostDetailedMip = 0; - descOut.Texture2DArray.MipLevels = desc.MipLevels; - descOut.Texture2DArray.FirstArraySlice = 0; - descOut.Texture2DArray.PlaneSlice = 0; - descOut.Texture2DArray.ResourceMinLODClamp = 0; - } -} - -static void _initBufferResourceDesc(size_t bufferSize, D3D12_RESOURCE_DESC& out) -{ - out = {}; - - out.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER; - out.Alignment = 0; - out.Width = bufferSize; - out.Height = 1; - out.DepthOrArraySize = 1; - out.MipLevels = 1; - out.Format = DXGI_FORMAT_UNKNOWN; - out.SampleDesc.Count = 1; - out.SampleDesc.Quality = 0; - out.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR; - out.Flags = D3D12_RESOURCE_FLAG_NONE; -} - -Result D3D12Renderer::createBuffer(const D3D12_RESOURCE_DESC& resourceDesc, const void* srcData, D3D12Resource& uploadResource, D3D12_RESOURCE_STATES finalState, D3D12Resource& resourceOut) -{ - const size_t bufferSize = size_t(resourceDesc.Width); - - { - D3D12_HEAP_PROPERTIES heapProps; - heapProps.Type = D3D12_HEAP_TYPE_DEFAULT; - heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; - heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; - 
heapProps.CreationNodeMask = 1; - heapProps.VisibleNodeMask = 1; - - const D3D12_RESOURCE_STATES initialState = srcData ? D3D12_RESOURCE_STATE_COPY_DEST : finalState; - - SLANG_RETURN_ON_FAIL(resourceOut.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, resourceDesc, initialState, nullptr)); - } - - { - D3D12_HEAP_PROPERTIES heapProps; - heapProps.Type = D3D12_HEAP_TYPE_UPLOAD; - heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; - heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; - heapProps.CreationNodeMask = 1; - heapProps.VisibleNodeMask = 1; - - D3D12_RESOURCE_DESC uploadResourceDesc(resourceDesc); - uploadResourceDesc.Flags = D3D12_RESOURCE_FLAG_NONE; - - SLANG_RETURN_ON_FAIL(uploadResource.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, uploadResourceDesc, D3D12_RESOURCE_STATE_GENERIC_READ, nullptr)); - } - - if (srcData) - { - // Copy data to the intermediate upload heap and then schedule a copy - // from the upload heap to the vertex buffer. - UINT8* dstData; - D3D12_RANGE readRange = {}; // We do not intend to read from this resource on the CPU. 
- - ID3D12Resource* dxUploadResource = uploadResource.getResource(); - - SLANG_RETURN_ON_FAIL(dxUploadResource->Map(0, &readRange, reinterpret_cast<void**>(&dstData))); - ::memcpy(dstData, srcData, bufferSize); - dxUploadResource->Unmap(0, nullptr); - - m_commandList->CopyBufferRegion(resourceOut, 0, uploadResource, 0, bufferSize); - - // Make sure it's in the right state - { - D3D12BarrierSubmitter submitter(m_commandList); - resourceOut.transition(finalState, submitter); - } - - submitGpuWorkAndWait(); - } - - return SLANG_OK; -} - -void D3D12Renderer::_resetCommandList() -{ - const FrameInfo& frame = getFrame(); - - ID3D12GraphicsCommandList* commandList = getCommandList(); - commandList->Reset(frame.m_commandAllocator, nullptr); - - D3D12_CPU_DESCRIPTOR_HANDLE rtvHandle = { m_rtvHeap->GetCPUDescriptorHandleForHeapStart().ptr + m_renderTargetIndex * m_rtvDescriptorSize }; - if (m_depthStencil) - { - commandList->OMSetRenderTargets(1, &rtvHandle, FALSE, &m_depthStencilView); - } - else - { - commandList->OMSetRenderTargets(1, &rtvHandle, FALSE, nullptr); - } - - // Set necessary state. - commandList->RSSetViewports(1, &m_viewport); - commandList->RSSetScissorRects(1, &m_scissorRect); -} - -void D3D12Renderer::beginRender() -{ - // Should currently not be open! 
- assert(m_commandListOpenCount == 0); - - m_circularResourceHeap.updateCompleted(); - - getFrame().m_commandAllocator->Reset(); - - _resetCommandList(); - - // Indicate that the render target needs to be writable - { - D3D12BarrierSubmitter submitter(m_commandList); - m_renderTargets[m_renderTargetIndex]->transition(D3D12_RESOURCE_STATE_RENDER_TARGET, submitter); - } - - m_commandListOpenCount = 1; -} - -void D3D12Renderer::endRender() -{ - assert(m_commandListOpenCount == 1); - - { - const UInt64 signalValue = m_fence.nextSignal(m_commandQueue); - m_circularResourceHeap.addSync(signalValue); - } - - D3D12Resource& backBuffer = *m_backBuffers[m_renderTargetIndex]; - if (m_isMultiSampled) - { - // MSAA resolve - D3D12Resource& renderTarget = *m_renderTargets[m_renderTargetIndex]; - assert(&renderTarget != &backBuffer); - // Barriers to wait for the render target, and the backbuffer to be in correct state - { - D3D12BarrierSubmitter submitter(m_commandList); - renderTarget.transition(D3D12_RESOURCE_STATE_RESOLVE_SOURCE, submitter); - backBuffer.transition(D3D12_RESOURCE_STATE_RESOLVE_DEST, submitter); - } - - // Do the resolve... - m_commandList->ResolveSubresource(backBuffer, 0, renderTarget, 0, m_targetFormat); - } - - // Make the back buffer presentable - { - D3D12BarrierSubmitter submitter(m_commandList); - backBuffer.transition(D3D12_RESOURCE_STATE_PRESENT, submitter); - } - - SLANG_ASSERT_VOID_ON_FAIL(m_commandList->Close()); - - { - // Execute the command list. - ID3D12CommandList* commandLists[] = { m_commandList }; - m_commandQueue->ExecuteCommandLists(SLANG_COUNT_OF(commandLists), commandLists); - } - - assert(m_commandListOpenCount == 1); - // Must be 0 - m_commandListOpenCount = 0; -} - -void D3D12Renderer::submitGpuWork() -{ - assert(m_commandListOpenCount); - ID3D12GraphicsCommandList* commandList = getCommandList(); - - SLANG_ASSERT_VOID_ON_FAIL(commandList->Close()); - { - // Execute the command list. 
- ID3D12CommandList* commandLists[] = { commandList }; - m_commandQueue->ExecuteCommandLists(SLANG_COUNT_OF(commandLists), commandLists); - } - - // Reset the render target - _resetCommandList(); -} - -void D3D12Renderer::submitGpuWorkAndWait() -{ - submitGpuWork(); - waitForGpu(); -} - -Result D3D12Renderer::captureTextureToSurface(D3D12Resource& resource, Surface& surfaceOut) -{ - const D3D12_RESOURCE_STATES initialState = resource.getState(); - - const D3D12_RESOURCE_DESC desc = resource.getResource()->GetDesc(); - - // Don't bother supporting MSAA for right now - if (desc.SampleDesc.Count > 1) - { - fprintf(stderr, "ERROR: cannot capture multi-sample texture\n"); - return SLANG_FAIL; - } - - size_t bytesPerPixel = sizeof(uint32_t); - size_t rowPitch = int(desc.Width) * bytesPerPixel; - size_t bufferSize = rowPitch * int(desc.Height); - - D3D12Resource stagingResource; - { - D3D12_RESOURCE_DESC stagingDesc; - _initBufferResourceDesc(bufferSize, stagingDesc); - - D3D12_HEAP_PROPERTIES heapProps; - heapProps.Type = D3D12_HEAP_TYPE_READBACK; - heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; - heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; - heapProps.CreationNodeMask = 1; - heapProps.VisibleNodeMask = 1; - - SLANG_RETURN_ON_FAIL(stagingResource.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, stagingDesc, D3D12_RESOURCE_STATE_COPY_DEST, nullptr)); - } - - { - D3D12BarrierSubmitter submitter(m_commandList); - resource.transition(D3D12_RESOURCE_STATE_COPY_SOURCE, submitter); - } - - // Do the copy - { - D3D12_TEXTURE_COPY_LOCATION srcLoc; - srcLoc.pResource = resource; - srcLoc.Type = D3D12_TEXTURE_COPY_TYPE_SUBRESOURCE_INDEX; - srcLoc.SubresourceIndex = 0; - - D3D12_TEXTURE_COPY_LOCATION dstLoc; - dstLoc.pResource = stagingResource; - dstLoc.Type = D3D12_TEXTURE_COPY_TYPE_PLACED_FOOTPRINT; - dstLoc.PlacedFootprint.Offset = 0; - dstLoc.PlacedFootprint.Footprint.Format = desc.Format; - dstLoc.PlacedFootprint.Footprint.Width = 
UINT(desc.Width); - dstLoc.PlacedFootprint.Footprint.Height = UINT(desc.Height); - dstLoc.PlacedFootprint.Footprint.Depth = 1; - dstLoc.PlacedFootprint.Footprint.RowPitch = UINT(rowPitch); - - m_commandList->CopyTextureRegion(&dstLoc, 0, 0, 0, &srcLoc, nullptr); - } - - { - D3D12BarrierSubmitter submitter(m_commandList); - resource.transition(initialState, submitter); - } - - // Submit the copy, and wait for copy to complete - submitGpuWorkAndWait(); - - { - ID3D12Resource* dxResource = stagingResource; - - UINT8* data; - D3D12_RANGE readRange = {0, bufferSize}; - - SLANG_RETURN_ON_FAIL(dxResource->Map(0, &readRange, reinterpret_cast<void**>(&data))); - - Result res = surfaceOut.set(int(desc.Width), int(desc.Height), Format::RGBA_Unorm_UInt8, int(rowPitch), data, SurfaceAllocator::getMallocAllocator()); - - dxResource->Unmap(0, nullptr); - return res; - } -} - -Result D3D12Renderer::calcComputePipelineState(ComPtr<ID3D12RootSignature>& signatureOut, ComPtr<ID3D12PipelineState>& pipelineStateOut) -{ - BindParameters bindParameters; - _calcBindParameters(bindParameters); - - ComPtr<ID3D12RootSignature> rootSignature; - ComPtr<ID3D12PipelineState> pipelineState; - - { - D3D12_ROOT_SIGNATURE_DESC rootSignatureDesc; - rootSignatureDesc.NumParameters = bindParameters.m_paramIndex; - rootSignatureDesc.pParameters = bindParameters.m_parameters; - rootSignatureDesc.NumStaticSamplers = 0; - rootSignatureDesc.pStaticSamplers = nullptr; - rootSignatureDesc.Flags = D3D12_ROOT_SIGNATURE_FLAG_NONE; - - ComPtr<ID3DBlob> signature; - ComPtr<ID3DBlob> error; - SLANG_RETURN_ON_FAIL(m_D3D12SerializeRootSignature(&rootSignatureDesc, D3D_ROOT_SIGNATURE_VERSION_1, signature.writeRef(), error.writeRef())); - SLANG_RETURN_ON_FAIL(m_device->CreateRootSignature(0, signature->GetBufferPointer(), signature->GetBufferSize(), IID_PPV_ARGS(rootSignature.writeRef()))); - } - - { - // Describe and create the compute pipeline state object - D3D12_COMPUTE_PIPELINE_STATE_DESC computeDesc = {}; - computeDesc.pRootSignature = rootSignature; - computeDesc.CS = { m_boundShaderProgram->m_computeShader.Buffer(), 
m_boundShaderProgram->m_computeShader.Count() }; - SLANG_RETURN_ON_FAIL(m_device->CreateComputePipelineState(&computeDesc, IID_PPV_ARGS(pipelineState.writeRef()))); - } - - signatureOut.swap(rootSignature); - pipelineStateOut.swap(pipelineState); - - return SLANG_OK; -} - -Result D3D12Renderer::calcGraphicsPipelineState(ComPtr& signatureOut, ComPtr& pipelineStateOut) -{ - BindParameters bindParameters; - _calcBindParameters(bindParameters); - - ComPtr rootSignature; - ComPtr pipelineState; - - { - // Deny unnecessary access to certain pipeline stages - D3D12_ROOT_SIGNATURE_DESC rootSignatureDesc; - rootSignatureDesc.NumParameters = bindParameters.m_paramIndex; - rootSignatureDesc.pParameters = bindParameters.m_parameters; - rootSignatureDesc.NumStaticSamplers = 0; - rootSignatureDesc.pStaticSamplers = nullptr; - rootSignatureDesc.Flags = m_boundInputLayout ? D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT : D3D12_ROOT_SIGNATURE_FLAG_NONE; - - ComPtr signature; - ComPtr error; - SLANG_RETURN_ON_FAIL(m_D3D12SerializeRootSignature(&rootSignatureDesc, D3D_ROOT_SIGNATURE_VERSION_1, signature.writeRef(), error.writeRef())); - SLANG_RETURN_ON_FAIL(m_device->CreateRootSignature(0, signature->GetBufferPointer(), signature->GetBufferSize(), IID_PPV_ARGS(rootSignature.writeRef()))); - } - - { - // Describe and create the graphics pipeline state object (PSO) - D3D12_GRAPHICS_PIPELINE_STATE_DESC psoDesc = {}; - - psoDesc.pRootSignature = rootSignature; - - psoDesc.VS = { m_boundShaderProgram->m_vertexShader.Buffer(), m_boundShaderProgram->m_vertexShader.Count() }; - psoDesc.PS = { m_boundShaderProgram->m_pixelShader.Buffer(), m_boundShaderProgram->m_pixelShader.Count() }; - - { - psoDesc.InputLayout = { m_boundInputLayout->m_elements.Buffer(), UINT(m_boundInputLayout->m_elements.Count()) }; - psoDesc.PrimitiveTopologyType = m_primitiveTopologyType; - - { - const int numRenderTargets = m_boundBindingState ? 
m_boundBindingState->getDesc().m_numRenderTargets : 1; - - psoDesc.DSVFormat = m_depthStencilFormat; - psoDesc.NumRenderTargets = numRenderTargets; - for (Int i = 0; i < numRenderTargets; i++) - { - psoDesc.RTVFormats[i] = m_targetFormat; - } - - psoDesc.SampleDesc.Count = 1; - psoDesc.SampleDesc.Quality = 0; - - psoDesc.SampleMask = UINT_MAX; - } - - { - auto& rs = psoDesc.RasterizerState; - rs.FillMode = D3D12_FILL_MODE_SOLID; - rs.CullMode = D3D12_CULL_MODE_NONE; - rs.FrontCounterClockwise = FALSE; - rs.DepthBias = D3D12_DEFAULT_DEPTH_BIAS; - rs.DepthBiasClamp = D3D12_DEFAULT_DEPTH_BIAS_CLAMP; - rs.SlopeScaledDepthBias = D3D12_DEFAULT_SLOPE_SCALED_DEPTH_BIAS; - rs.DepthClipEnable = TRUE; - rs.MultisampleEnable = FALSE; - rs.AntialiasedLineEnable = FALSE; - rs.ForcedSampleCount = 0; - rs.ConservativeRaster = D3D12_CONSERVATIVE_RASTERIZATION_MODE_OFF; - } - - { - D3D12_BLEND_DESC& blend = psoDesc.BlendState; - - blend.AlphaToCoverageEnable = FALSE; - blend.IndependentBlendEnable = FALSE; - const D3D12_RENDER_TARGET_BLEND_DESC defaultRenderTargetBlendDesc = - { - FALSE,FALSE, - D3D12_BLEND_ONE, D3D12_BLEND_ZERO, D3D12_BLEND_OP_ADD, - D3D12_BLEND_ONE, D3D12_BLEND_ZERO, D3D12_BLEND_OP_ADD, - D3D12_LOGIC_OP_NOOP, - D3D12_COLOR_WRITE_ENABLE_ALL, - }; - for (UINT i = 0; i < D3D12_SIMULTANEOUS_RENDER_TARGET_COUNT; ++i) - { - blend.RenderTarget[i] = defaultRenderTargetBlendDesc; - } - } - - { - auto& ds = psoDesc.DepthStencilState; - - ds.DepthEnable = FALSE; - ds.DepthWriteMask = D3D12_DEPTH_WRITE_MASK_ALL; - ds.DepthFunc = D3D12_COMPARISON_FUNC_ALWAYS; - //ds.DepthFunc = D3D12_COMPARISON_FUNC_LESS; - ds.StencilEnable = FALSE; - ds.StencilReadMask = D3D12_DEFAULT_STENCIL_READ_MASK; - ds.StencilWriteMask = D3D12_DEFAULT_STENCIL_WRITE_MASK; - const D3D12_DEPTH_STENCILOP_DESC defaultStencilOp = - { - D3D12_STENCIL_OP_KEEP, D3D12_STENCIL_OP_KEEP, D3D12_STENCIL_OP_KEEP, D3D12_COMPARISON_FUNC_ALWAYS - }; - ds.FrontFace = defaultStencilOp; - ds.BackFace = defaultStencilOp; - } 
- } - - psoDesc.PrimitiveTopologyType = m_primitiveTopologyType; - - SLANG_RETURN_ON_FAIL(m_device->CreateGraphicsPipelineState(&psoDesc, IID_PPV_ARGS(pipelineState.writeRef()))); - } - - signatureOut.swap(rootSignature); - pipelineStateOut.swap(pipelineState); - - return SLANG_OK; -} - -D3D12Renderer::RenderState* D3D12Renderer::findRenderState(PipelineType pipelineType) -{ - switch (pipelineType) - { - case PipelineType::Compute: - { - // Check if current state is a match - if (m_currentRenderState) - { - if (m_currentRenderState->m_bindingState == m_boundBindingState && - m_currentRenderState->m_shaderProgram == m_boundShaderProgram) - { - return m_currentRenderState; - } - } - - const int num = int(m_renderStates.Count()); - for (int i = 0; i < num; i++) - { - RenderState* renderState = m_renderStates[i]; - if (renderState->m_bindingState == m_boundBindingState && - renderState->m_shaderProgram == m_boundShaderProgram) - { - return renderState; - } - } - break; - } - case PipelineType::Graphics: - { - if (m_currentRenderState) - { - if (m_currentRenderState->m_bindingState == m_boundBindingState && - m_currentRenderState->m_inputLayout == m_boundInputLayout && - m_currentRenderState->m_shaderProgram == m_boundShaderProgram && - m_currentRenderState->m_primitiveTopologyType == m_primitiveTopologyType) - { - return m_currentRenderState; - } - } - // See if matches one in the list - { - const int num = int(m_renderStates.Count()); - for (int i = 0; i < num; i++) - { - RenderState* renderState = m_renderStates[i]; - if (renderState->m_bindingState == m_boundBindingState && - renderState->m_inputLayout == m_boundInputLayout && - renderState->m_shaderProgram == m_boundShaderProgram && - renderState->m_primitiveTopologyType == m_primitiveTopologyType) - { - // Okay we have a match - return renderState; - } - } - } - break; - } - default: break; - } - return nullptr; -} - -D3D12Renderer::RenderState* D3D12Renderer::calcRenderState() -{ - if (!m_boundShaderProgram) - { 
- return nullptr; - } - m_currentRenderState = findRenderState(m_boundShaderProgram->m_pipelineType); - if (m_currentRenderState) - { - return m_currentRenderState; - } - - ComPtr rootSignature; - ComPtr pipelineState; - - switch (m_boundShaderProgram->m_pipelineType) - { - case PipelineType::Compute: - { - if (SLANG_FAILED(calcComputePipelineState(rootSignature, pipelineState))) - { - return nullptr; - } - break; - } - case PipelineType::Graphics: - { - if (SLANG_FAILED(calcGraphicsPipelineState(rootSignature, pipelineState))) - { - return nullptr; - } - break; - } - default: return nullptr; - } - - RenderState* renderState = new RenderState; - - renderState->m_primitiveTopologyType = m_primitiveTopologyType; - renderState->m_bindingState = m_boundBindingState; - renderState->m_inputLayout = m_boundInputLayout; - renderState->m_shaderProgram = m_boundShaderProgram; - - renderState->m_rootSignature.swap(rootSignature); - renderState->m_pipelineState.swap(pipelineState); - - m_renderStates.Add(renderState); - - m_currentRenderState = renderState; - - return renderState; -} - -Result D3D12Renderer::_calcBindParameters(BindParameters& params) -{ - int numConstantBuffers = 0; - { - if (m_boundBindingState) - { - const int numBoundConstantBuffers = numConstantBuffers; - - const BindingState::Desc& bindingStateDesc = m_boundBindingState->getDesc(); - - const auto& bindings = bindingStateDesc.m_bindings; - const auto& details = m_boundBindingState->m_bindingDetails; - - const int numBindings = int(bindings.Count()); - - for (int i = 0; i < numBindings; i++) - { - const auto& binding = bindings[i]; - const auto& detail = details[i]; - - const int bindingIndex = binding.registerRange.getSingleIndex(); - - if (binding.bindingType == BindingType::Buffer) - { - assert(binding.resource && binding.resource->isBuffer()); - if (binding.resource->canBind(Resource::BindFlag::ConstantBuffer)) - { - // Make sure it's not overlapping the ones we just statically defined - 
//assert(binding.m_binding < numBoundConstantBuffers); - - D3D12_ROOT_PARAMETER& param = params.nextParameter(); - param.ParameterType = D3D12_ROOT_PARAMETER_TYPE_CBV; - param.ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL; - - D3D12_ROOT_DESCRIPTOR& descriptor = param.Descriptor; - descriptor.ShaderRegister = bindingIndex; - descriptor.RegisterSpace = 0; - - numConstantBuffers++; - } - } - - if (detail.m_srvIndex >= 0) - { - D3D12_DESCRIPTOR_RANGE& range = params.nextRange(); - - range.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SRV; - range.NumDescriptors = 1; - range.BaseShaderRegister = bindingIndex; - range.RegisterSpace = 0; - range.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; - - D3D12_ROOT_PARAMETER& param = params.nextParameter(); - - param.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; - param.ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL; - - D3D12_ROOT_DESCRIPTOR_TABLE& table = param.DescriptorTable; - table.NumDescriptorRanges = 1; - table.pDescriptorRanges = &range; - } - - if (detail.m_uavIndex >= 0) - { - D3D12_DESCRIPTOR_RANGE& range = params.nextRange(); - - range.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_UAV; - range.NumDescriptors = 1; - range.BaseShaderRegister = bindingIndex; - range.RegisterSpace = 0; - range.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; - - D3D12_ROOT_PARAMETER& param = params.nextParameter(); - - param.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; - param.ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL; - - D3D12_ROOT_DESCRIPTOR_TABLE& table = param.DescriptorTable; - table.NumDescriptorRanges = 1; - table.pDescriptorRanges = &range; - } - } - } - } - - // All the samplers are in one continuous section of the sampler heap - if (m_boundBindingState && m_boundBindingState->m_samplerHeap.getUsedSize() > 0) - { - D3D12_DESCRIPTOR_RANGE& range = params.nextRange(); - - range.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SAMPLER; - range.NumDescriptors = 
m_boundBindingState->m_samplerHeap.getUsedSize(); - range.BaseShaderRegister = 0; - range.RegisterSpace = 0; - range.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND; - - D3D12_ROOT_PARAMETER& param = params.nextParameter(); - - param.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; - param.ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL; - - D3D12_ROOT_DESCRIPTOR_TABLE& table = param.DescriptorTable; - table.NumDescriptorRanges = 1; - table.pDescriptorRanges = &range; - } - return SLANG_OK; -} - -Result D3D12Renderer::_bindRenderState(RenderState* renderState, ID3D12GraphicsCommandList* commandList, Submitter* submitter) -{ - BindingStateImpl* bindingState = m_boundBindingState; - - submitter->setRootSignature(renderState->m_rootSignature); - commandList->SetPipelineState(renderState->m_pipelineState); - - if (bindingState) - { - ID3D12DescriptorHeap* heaps[] = - { - bindingState->m_viewHeap.getHeap(), - bindingState->m_samplerHeap.getHeap(), - }; - commandList->SetDescriptorHeaps(SLANG_COUNT_OF(heaps), heaps); - } - else - { - commandList->SetDescriptorHeaps(0, nullptr); - } - - { - int index = 0; - - int numConstantBuffers = 0; - { - if (bindingState) - { - D3D12DescriptorHeap& heap = bindingState->m_viewHeap; - const auto& details = bindingState->m_bindingDetails; - const auto& bindings = bindingState->getDesc().m_bindings; - const int numBindings = int(details.Count()); - - for (int i = 0; i < numBindings; i++) - { - const auto& detail = details[i]; - const auto& binding = bindings[i]; - - if (binding.bindingType == BindingType::Buffer) - { - assert(binding.resource && binding.resource->isBuffer()); - if (binding.resource->canBind(Resource::BindFlag::ConstantBuffer)) - { - BufferResourceImpl* buffer = static_cast(binding.resource.Ptr()); - buffer->bindConstantBufferView(m_circularResourceHeap, index++, submitter); - numConstantBuffers++; - } - } - - if (detail.m_srvIndex >= 0) - { - submitter->setRootDescriptorTable(index++, 
heap.getGpuHandle(detail.m_srvIndex)); - } - - if (detail.m_uavIndex >= 0) - { - submitter->setRootDescriptorTable(index++, heap.getGpuHandle(detail.m_uavIndex)); - } - } - } - } - - if (bindingState && bindingState->m_samplerHeap.getUsedSize() > 0) - { - submitter->setRootDescriptorTable(index, bindingState->m_samplerHeap.getGpuStart()); - } - } - - return SLANG_OK; -} - -// !!!!!!!!!!!!!!!!!!!!!!!!!!!! Renderer interface !!!!!!!!!!!!!!!!!!!!!!!!!! - -Result D3D12Renderer::initialize(const Desc& desc, void* inWindowHandle) -{ - m_hwnd = (HWND)inWindowHandle; - // Rather than statically link against D3D, we load it dynamically. - - HMODULE d3dModule = LoadLibraryA("d3d12.dll"); - if (!d3dModule) - { - fprintf(stderr, "error: failed load 'd3d12.dll'\n"); - return SLANG_FAIL; - } - - HMODULE dxgiModule = LoadLibraryA("Dxgi.dll"); - if (!dxgiModule) - { - fprintf(stderr, "error: failed load 'dxgi.dll'\n"); - return SLANG_FAIL; - } - - -#define LOAD_D3D_PROC(TYPE, NAME) \ - TYPE NAME##_ = (TYPE) loadProc(d3dModule, #NAME); -#define LOAD_DXGI_PROC(TYPE, NAME) \ - TYPE NAME##_ = (TYPE) loadProc(dxgiModule, #NAME); - - UINT dxgiFactoryFlags = 0; - -#if ENABLE_DEBUG_LAYER - { - LOAD_D3D_PROC(PFN_D3D12_GET_DEBUG_INTERFACE, D3D12GetDebugInterface); - if (D3D12GetDebugInterface_) - { - if (SUCCEEDED(D3D12GetDebugInterface_(IID_PPV_ARGS(m_dxDebug.writeRef())))) - { - m_dxDebug->EnableDebugLayer(); - dxgiFactoryFlags |= DXGI_CREATE_FACTORY_DEBUG; - } - } - } -#endif - - m_D3D12SerializeRootSignature = (PFN_D3D12_SERIALIZE_ROOT_SIGNATURE)loadProc(d3dModule, "D3D12SerializeRootSignature"); - if (!m_D3D12SerializeRootSignature) - { - return SLANG_FAIL; - } - - // Try and create DXGIFactory - ComPtr dxgiFactory; - { - typedef HRESULT(WINAPI *PFN_DXGI_CREATE_FACTORY_2)(UINT Flags, REFIID riid, _COM_Outptr_ void **ppFactory); - LOAD_DXGI_PROC(PFN_DXGI_CREATE_FACTORY_2, CreateDXGIFactory2); - if (!CreateDXGIFactory2_) - { - return SLANG_FAIL; - } - 
SLANG_RETURN_ON_FAIL(CreateDXGIFactory2_(dxgiFactoryFlags, IID_PPV_ARGS(dxgiFactory.writeRef()))); - } - - D3D_FEATURE_LEVEL featureLevel = D3D_FEATURE_LEVEL_11_0; - - // Search for an adapter that meets our requirements - ComPtr adapter; - - LOAD_D3D_PROC(PFN_D3D12_CREATE_DEVICE, D3D12CreateDevice); - if (!D3D12CreateDevice_) - { - return SLANG_FAIL; - } - - const bool useWarp = false; - - if (useWarp) - { - SLANG_RETURN_ON_FAIL(dxgiFactory->EnumWarpAdapter(IID_PPV_ARGS(adapter.writeRef()))); - SLANG_RETURN_ON_FAIL(D3D12CreateDevice_(adapter, featureLevel, IID_PPV_ARGS(m_device.writeRef()))); - } - else - { - UINT adapterCounter = 0; - for (;;) - { - UINT adapterIndex = adapterCounter++; - - ComPtr candidateAdapter; - if (dxgiFactory->EnumAdapters1(adapterIndex, candidateAdapter.writeRef()) == DXGI_ERROR_NOT_FOUND) - break; - - DXGI_ADAPTER_DESC1 desc; - candidateAdapter->GetDesc1(&desc); - - if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE) - { - // TODO: may want to allow software driver as fallback - } - else if (SUCCEEDED(D3D12CreateDevice_(candidateAdapter, featureLevel, IID_PPV_ARGS(m_device.writeRef())))) - { - // We found one! - adapter = candidateAdapter; - break; - } - } - } - - if (!adapter) - { - // Couldn't find an adapter - return SLANG_FAIL; - } - - m_numRenderFrames = 3; - m_numRenderTargets = 2; - - m_desc = desc; - - // set viewport - { - m_viewport.Width = float(m_desc.width); - m_viewport.Height = float(m_desc.height); - m_viewport.MinDepth = 0; - m_viewport.MaxDepth = 1; - m_viewport.TopLeftX = 0; - m_viewport.TopLeftY = 0; - } - - { - m_scissorRect.left = 0; - m_scissorRect.top = 0; - m_scissorRect.right = m_desc.width; - m_scissorRect.bottom = m_desc.height; - } - - // Describe and create the command queue. 
- D3D12_COMMAND_QUEUE_DESC queueDesc = {}; - queueDesc.Flags = D3D12_COMMAND_QUEUE_FLAG_NONE; - queueDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT; - - SLANG_RETURN_ON_FAIL(m_device->CreateCommandQueue(&queueDesc, IID_PPV_ARGS(m_commandQueue.writeRef()))); - - // Describe the swap chain. - DXGI_SWAP_CHAIN_DESC swapChainDesc = {}; - swapChainDesc.BufferCount = m_numRenderTargets; - swapChainDesc.BufferDesc.Width = m_desc.width; - swapChainDesc.BufferDesc.Height = m_desc.height; - swapChainDesc.BufferDesc.Format = m_targetFormat; - swapChainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT; - swapChainDesc.SwapEffect = DXGI_SWAP_EFFECT_FLIP_DISCARD; - swapChainDesc.OutputWindow = m_hwnd; - swapChainDesc.SampleDesc.Count = 1; - swapChainDesc.Windowed = TRUE; - - if (m_isFullSpeed) - { - m_hasVsync = false; - m_allowFullScreen = false; - } - - if (!m_hasVsync) - { - swapChainDesc.Flags |= DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT; - } - - // Swap chain needs the queue so that it can force a flush on it. - ComPtr swapChain; - SLANG_RETURN_ON_FAIL(dxgiFactory->CreateSwapChain(m_commandQueue, &swapChainDesc, swapChain.writeRef())); - SLANG_RETURN_ON_FAIL(swapChain->QueryInterface(m_swapChain.writeRef())); - - if (!m_hasVsync) - { - m_swapChainWaitableObject = m_swapChain->GetFrameLatencyWaitableObject(); - - int maxLatency = m_numRenderTargets - 2; - - // Make sure the maximum latency is in the range required by dx12 runtime - maxLatency = (maxLatency < 1) ? 1 : maxLatency; - maxLatency = (maxLatency > DXGI_MAX_SWAP_CHAIN_BUFFERS) ? DXGI_MAX_SWAP_CHAIN_BUFFERS : maxLatency; - - m_swapChain->SetMaximumFrameLatency(maxLatency); - } - - // This sample does not support fullscreen transitions. - SLANG_RETURN_ON_FAIL(dxgiFactory->MakeWindowAssociation(m_hwnd, DXGI_MWA_NO_ALT_ENTER)); - - m_renderTargetIndex = m_swapChain->GetCurrentBackBufferIndex(); - - // Create descriptor heaps. - { - // Describe and create a render target view (RTV) descriptor heap. 
- D3D12_DESCRIPTOR_HEAP_DESC rtvHeapDesc = {}; - - rtvHeapDesc.NumDescriptors = m_numRenderTargets; - rtvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV; - rtvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE; - SLANG_RETURN_ON_FAIL(m_device->CreateDescriptorHeap(&rtvHeapDesc, IID_PPV_ARGS(m_rtvHeap.writeRef()))); - m_rtvDescriptorSize = m_device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_RTV); - } - - { - // Describe and create a depth stencil view (DSV) descriptor heap. - D3D12_DESCRIPTOR_HEAP_DESC dsvHeapDesc = {}; - dsvHeapDesc.NumDescriptors = 1; - dsvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_DSV; - dsvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE; - SLANG_RETURN_ON_FAIL(m_device->CreateDescriptorHeap(&dsvHeapDesc, IID_PPV_ARGS(m_dsvHeap.writeRef()))); - - m_dsvDescriptorSize = m_device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_DSV); - } - - // Setup frame resources - { - SLANG_RETURN_ON_FAIL(createFrameResources()); - } - - // Setup fence, and close the command list (as default state without begin/endRender is closed) - { - SLANG_RETURN_ON_FAIL(m_fence.init(m_device)); - // Create the command list. When command lists are created they are open, so close it. 
- FrameInfo& frame = m_frameInfos[m_frameIndex]; - SLANG_RETURN_ON_FAIL(m_device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT, frame.m_commandAllocator, nullptr, IID_PPV_ARGS(m_commandList.writeRef()))); - m_commandList->Close(); - } - - { - D3D12CircularResourceHeap::Desc desc; - desc.init(); - // Define size - desc.m_blockSize = 65536; - // Set up the heap - m_circularResourceHeap.init(m_device, desc, &m_fence); - } - - // Setup for rendering - beginRender(); - - m_isInitialized = true; - return SLANG_OK; -} - -Result D3D12Renderer::createFrameResources() -{ - // Create back buffers - { - D3D12_CPU_DESCRIPTOR_HANDLE rtvStart(m_rtvHeap->GetCPUDescriptorHandleForHeapStart()); - - // Work out target format - D3D12_RESOURCE_DESC resourceDesc; - { - ComPtr backBuffer; - SLANG_RETURN_ON_FAIL(m_swapChain->GetBuffer(0, IID_PPV_ARGS(backBuffer.writeRef()))); - resourceDesc = backBuffer->GetDesc(); - } - const DXGI_FORMAT resourceFormat = D3DUtil::calcResourceFormat(D3DUtil::USAGE_TARGET, m_targetUsageFlags, resourceDesc.Format); - const DXGI_FORMAT targetFormat = D3DUtil::calcFormat(D3DUtil::USAGE_TARGET, resourceFormat); - - // Set the target format - m_targetFormat = targetFormat; - - // Create a RTV, and a command allocator for each frame. - for (int i = 0; i < m_numRenderTargets; i++) - { - // Get the back buffer - ComPtr backBuffer; - SLANG_RETURN_ON_FAIL(m_swapChain->GetBuffer(UINT(i), IID_PPV_ARGS(backBuffer.writeRef()))); - - // Set up resource for back buffer - m_backBufferResources[i].setResource(backBuffer, D3D12_RESOURCE_STATE_COMMON); - m_backBuffers[i] = &m_backBufferResources[i]; - // Assume they are the same thing for now... 
- m_renderTargets[i] = &m_backBufferResources[i]; - - // If we are multi-sampling - create a render target separate from the back buffer - if (m_isMultiSampled) - { - D3D12_HEAP_PROPERTIES heapProps; - heapProps.Type = D3D12_HEAP_TYPE_DEFAULT; - heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; - heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; - heapProps.CreationNodeMask = 1; - heapProps.VisibleNodeMask = 1; - D3D12_CLEAR_VALUE clearValue = {}; - clearValue.Format = m_targetFormat; - - // Don't know targets alignment, so just memory copy - ::memcpy(clearValue.Color, m_clearColor, sizeof(m_clearColor)); - - D3D12_RESOURCE_DESC desc(resourceDesc); - - desc.Format = resourceFormat; - desc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D; - desc.SampleDesc.Count = m_numTargetSamples; - desc.SampleDesc.Quality = m_targetSampleQuality; - desc.Alignment = 0; - - SLANG_RETURN_ON_FAIL(m_renderTargetResources[i].initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, desc, D3D12_RESOURCE_STATE_RENDER_TARGET, &clearValue)); - m_renderTargets[i] = &m_renderTargetResources[i]; - } - - D3D12_CPU_DESCRIPTOR_HANDLE rtvHandle = { rtvStart.ptr + i * m_rtvDescriptorSize }; - m_device->CreateRenderTargetView(*m_renderTargets[i], nullptr, rtvHandle); - } - } - - // Set up frames - for (int i = 0; i < m_numRenderFrames; i++) - { - FrameInfo& frame = m_frameInfos[i]; - SLANG_RETURN_ON_FAIL(m_device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT, IID_PPV_ARGS(frame.m_commandAllocator.writeRef()))); - } - - { - D3D12_RESOURCE_DESC desc = m_backBuffers[0]->getResource()->GetDesc(); - assert(desc.Width == UINT64(m_desc.width) && desc.Height == UINT64(m_desc.height)); - } - - // Create the depth stencil view. 
- { - D3D12_HEAP_PROPERTIES heapProps; - heapProps.Type = D3D12_HEAP_TYPE_DEFAULT; - heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; - heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; - heapProps.CreationNodeMask = 1; - heapProps.VisibleNodeMask = 1; - - DXGI_FORMAT resourceFormat = D3DUtil::calcResourceFormat(D3DUtil::USAGE_DEPTH_STENCIL, m_depthStencilUsageFlags, m_depthStencilFormat); - DXGI_FORMAT depthStencilFormat = D3DUtil::calcFormat(D3DUtil::USAGE_DEPTH_STENCIL, resourceFormat); - - // Set the depth stencil format - m_depthStencilFormat = depthStencilFormat; - - // Setup default clear - D3D12_CLEAR_VALUE clearValue = {}; - clearValue.Format = depthStencilFormat; - clearValue.DepthStencil.Depth = 1.0f; - clearValue.DepthStencil.Stencil = 0; - - D3D12_RESOURCE_DESC resourceDesc = {}; - resourceDesc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D; - resourceDesc.Format = resourceFormat; - resourceDesc.Width = m_desc.width; - resourceDesc.Height = m_desc.height; - resourceDesc.DepthOrArraySize = 1; - resourceDesc.MipLevels = 1; - resourceDesc.SampleDesc.Count = m_numTargetSamples; - resourceDesc.SampleDesc.Quality = m_targetSampleQuality; - resourceDesc.Layout = D3D12_TEXTURE_LAYOUT_UNKNOWN; - resourceDesc.Flags = D3D12_RESOURCE_FLAG_ALLOW_DEPTH_STENCIL; - resourceDesc.Alignment = 0; - - SLANG_RETURN_ON_FAIL(m_depthStencil.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, resourceDesc, D3D12_RESOURCE_STATE_DEPTH_WRITE, &clearValue)); - - // Set the depth stencil - D3D12_DEPTH_STENCIL_VIEW_DESC depthStencilDesc = {}; - depthStencilDesc.Format = depthStencilFormat; - depthStencilDesc.ViewDimension = m_isMultiSampled ? 
D3D12_DSV_DIMENSION_TEXTURE2DMS : D3D12_DSV_DIMENSION_TEXTURE2D; - depthStencilDesc.Flags = D3D12_DSV_FLAG_NONE; - - // Set up as the depth stencil view - m_device->CreateDepthStencilView(m_depthStencil, &depthStencilDesc, m_dsvHeap->GetCPUDescriptorHandleForHeapStart()); - m_depthStencilView = m_dsvHeap->GetCPUDescriptorHandleForHeapStart(); - } - - m_viewport.Width = static_cast(m_desc.width); - m_viewport.Height = static_cast(m_desc.height); - m_viewport.MaxDepth = 1.0f; - - m_scissorRect.right = static_cast(m_desc.width); - m_scissorRect.bottom = static_cast(m_desc.height); - - return SLANG_OK; -} - -void D3D12Renderer::setClearColor(const float color[4]) -{ - memcpy(m_clearColor, color, sizeof(m_clearColor)); -} - -void D3D12Renderer::clearFrame() -{ - // Record commands - D3D12_CPU_DESCRIPTOR_HANDLE rtvHandle = { m_rtvHeap->GetCPUDescriptorHandleForHeapStart().ptr + m_renderTargetIndex * m_rtvDescriptorSize }; - m_commandList->ClearRenderTargetView(rtvHandle, m_clearColor, 0, nullptr); - if (m_depthStencil) - { - m_commandList->ClearDepthStencilView(m_depthStencilView, D3D12_CLEAR_FLAG_DEPTH, 1.0f, 0, 0, nullptr); - } -} - -void D3D12Renderer::presentFrame() -{ - endRender(); - - if (m_swapChainWaitableObject) - { - // check if now is good time to present - // This doesn't wait - because the wait time is 0. If it returns WAIT_TIMEOUT it means that no frame is waiting to be be displayed - // so there is no point doing a present. - const bool shouldPresent = (WaitForSingleObjectEx(m_swapChainWaitableObject, 0, TRUE) != WAIT_TIMEOUT); - if (shouldPresent) - { - m_swapChain->Present(0, 0); - } - } - else - { - if (SLANG_FAILED(m_swapChain->Present(1, 0))) - { - assert(!"Problem presenting"); - beginRender(); - return; - } - } - - // Increment the fence value. 
Save on the frame - we'll know that frame is done when the fence value >= - m_frameInfos[m_frameIndex].m_fenceValue = m_fence.nextSignal(m_commandQueue); - - // increment frame index after signal - m_frameIndex = (m_frameIndex + 1) % m_numRenderFrames; - // Update the render target index. - m_renderTargetIndex = m_swapChain->GetCurrentBackBufferIndex(); - - // On the current frame wait until it is completed - { - FrameInfo& frame = m_frameInfos[m_frameIndex]; - // If the next frame is not ready to be rendered yet, wait until it is ready. - m_fence.waitUntilCompleted(frame.m_fenceValue); - } - - // Setup such that rendering can restart - beginRender(); -} - -SlangResult D3D12Renderer::captureScreenSurface(Surface& surfaceOut) -{ - return captureTextureToSurface(*m_renderTargets[m_renderTargetIndex], surfaceOut); -} - -static D3D12_RESOURCE_STATES _calcResourceState(Resource::Usage usage) -{ - typedef Resource::Usage Usage; - switch (usage) - { - case Usage::VertexBuffer: return D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER; - case Usage::IndexBuffer: return D3D12_RESOURCE_STATE_INDEX_BUFFER; - case Usage::ConstantBuffer: return D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER; - case Usage::StreamOutput: return D3D12_RESOURCE_STATE_STREAM_OUT; - case Usage::RenderTarget: return D3D12_RESOURCE_STATE_RENDER_TARGET; - case Usage::DepthWrite: return D3D12_RESOURCE_STATE_DEPTH_WRITE; - case Usage::DepthRead: return D3D12_RESOURCE_STATE_DEPTH_READ; - case Usage::UnorderedAccess: return D3D12_RESOURCE_STATE_UNORDERED_ACCESS; - case Usage::PixelShaderResource: return D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE; - case Usage::NonPixelShaderResource: return D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE; - case Usage::GenericRead: return D3D12_RESOURCE_STATE_GENERIC_READ; - default: return D3D12_RESOURCE_STATES(0); - } -} - -static D3D12_RESOURCE_FLAGS _calcResourceFlag(Resource::BindFlag::Enum bindFlag) -{ - typedef Resource::BindFlag BindFlag; - switch (bindFlag) - { - case 
BindFlag::RenderTarget: return D3D12_RESOURCE_FLAG_ALLOW_RENDER_TARGET; - case BindFlag::DepthStencil: return D3D12_RESOURCE_FLAG_ALLOW_DEPTH_STENCIL; - case BindFlag::UnorderedAccess: return D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS; - default: return D3D12_RESOURCE_FLAG_NONE; - } -} - -static D3D12_RESOURCE_FLAGS _calcResourceBindFlags(Resource::Usage initialUsage, int bindFlags) -{ - int dstFlags = 0; - while (bindFlags) - { - int lsb = bindFlags & -bindFlags; - - dstFlags |= _calcResourceFlag(Resource::BindFlag::Enum(lsb)); - bindFlags &= ~lsb; - } - return D3D12_RESOURCE_FLAGS(dstFlags); -} - -static D3D12_RESOURCE_DIMENSION _calcResourceDimension(Resource::Type type) -{ - switch (type) - { - case Resource::Type::Buffer: return D3D12_RESOURCE_DIMENSION_BUFFER; - case Resource::Type::Texture1D: return D3D12_RESOURCE_DIMENSION_TEXTURE1D; - case Resource::Type::TextureCube: - case Resource::Type::Texture2D: - { - return D3D12_RESOURCE_DIMENSION_TEXTURE2D; - } - case Resource::Type::Texture3D: return D3D12_RESOURCE_DIMENSION_TEXTURE3D; - default: return D3D12_RESOURCE_DIMENSION_UNKNOWN; - } -} - -TextureResource* D3D12Renderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData) -{ - // Description of uploading on Dx12 - // https://msdn.microsoft.com/en-us/library/windows/desktop/dn899215%28v=vs.85%29.aspx - - TextureResource::Desc srcDesc(descIn); - srcDesc.setDefaults(initialUsage); - - const DXGI_FORMAT pixelFormat = D3DUtil::getMapFormat(srcDesc.format); - if (pixelFormat == DXGI_FORMAT_UNKNOWN) - { - return nullptr; - } - - const int arraySize = srcDesc.calcEffectiveArraySize(); - - const D3D12_RESOURCE_DIMENSION dimension = _calcResourceDimension(srcDesc.type); - if (dimension == D3D12_RESOURCE_DIMENSION_UNKNOWN) - { - return nullptr; - } - - const int numMipMaps = srcDesc.numMipLevels; - - // Setup desc - D3D12_RESOURCE_DESC resourceDesc; - - resourceDesc.Dimension = dimension; - 
resourceDesc.Format = pixelFormat; - resourceDesc.Width = srcDesc.size.width; - resourceDesc.Height = srcDesc.size.height; - resourceDesc.DepthOrArraySize = (srcDesc.size.depth > 1) ? srcDesc.size.depth : arraySize; - - resourceDesc.MipLevels = numMipMaps; - resourceDesc.SampleDesc.Count = srcDesc.sampleDesc.numSamples; - resourceDesc.SampleDesc.Quality = srcDesc.sampleDesc.quality; - - resourceDesc.Flags = D3D12_RESOURCE_FLAG_NONE; - resourceDesc.Layout = D3D12_TEXTURE_LAYOUT_UNKNOWN; - resourceDesc.Alignment = 0; - - RefPtr texture(new TextureResourceImpl(srcDesc)); - - // Create the target resource - { - D3D12_HEAP_PROPERTIES heapProps; - - heapProps.Type = D3D12_HEAP_TYPE_DEFAULT; - heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; - heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; - heapProps.CreationNodeMask = 1; - heapProps.VisibleNodeMask = 1; - - SLANG_RETURN_NULL_ON_FAIL(texture->m_resource.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, resourceDesc, D3D12_RESOURCE_STATE_COPY_DEST, nullptr)); - - texture->m_resource.setDebugName(L"Texture"); - } - - // Calculate the layout - List layouts; - layouts.SetSize(numMipMaps); - List mipRowSizeInBytes; - mipRowSizeInBytes.SetSize(numMipMaps); - List mipNumRows; - mipNumRows.SetSize(numMipMaps); - - // Since textures are effectively immutable currently initData must be set - assert(initData); - // We should have this many sub resources - assert(initData->numSubResources == numMipMaps * srcDesc.size.depth * arraySize); - - // This is just the size for one array upload -> not for the whole texure - UInt64 requiredSize = 0; - m_device->GetCopyableFootprints(&resourceDesc, 0, numMipMaps, 0, layouts.begin(), mipNumRows.begin(), mipRowSizeInBytes.begin(), &requiredSize); - - // Sub resource indexing - // https://msdn.microsoft.com/en-us/library/windows/desktop/dn705766(v=vs.85).aspx#subresource_indexing - - int subResourceIndex = 0; - for (int i = 0; i < arraySize; i++) - { - // Create 
the upload texture - D3D12Resource uploadTexture; - { - D3D12_HEAP_PROPERTIES heapProps; - - heapProps.Type = D3D12_HEAP_TYPE_UPLOAD; - heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; - heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; - heapProps.CreationNodeMask = 1; - heapProps.VisibleNodeMask = 1; - - D3D12_RESOURCE_DESC uploadResourceDesc; - - uploadResourceDesc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER; - uploadResourceDesc.Format = DXGI_FORMAT_UNKNOWN; - uploadResourceDesc.Width = requiredSize; - uploadResourceDesc.Height = 1; - uploadResourceDesc.DepthOrArraySize = 1; - uploadResourceDesc.MipLevels = 1; - uploadResourceDesc.SampleDesc.Count = 1; - uploadResourceDesc.SampleDesc.Quality = 0; - uploadResourceDesc.Flags = D3D12_RESOURCE_FLAG_NONE; - uploadResourceDesc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR; - uploadResourceDesc.Alignment = 0; - - SLANG_RETURN_NULL_ON_FAIL(uploadTexture.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, uploadResourceDesc, D3D12_RESOURCE_STATE_GENERIC_READ, nullptr)); - - uploadTexture.setDebugName(L"TextureUpload"); - } - - ID3D12Resource* uploadResource = uploadTexture; - - uint8_t* p; - uploadResource->Map(0, nullptr, reinterpret_cast(&p)); - - for (int j = 0; j < numMipMaps; ++j) - { - const D3D12_PLACED_SUBRESOURCE_FOOTPRINT& layout = layouts[j]; - const D3D12_SUBRESOURCE_FOOTPRINT& footprint = layout.Footprint; - - const TextureResource::Size mipSize = srcDesc.size.calcMipSize(j); - - assert(footprint.Width == mipSize.width && footprint.Height == mipSize.height && footprint.Depth == mipSize.depth); - - const ptrdiff_t dstMipRowPitch = ptrdiff_t(layouts[j].Footprint.RowPitch); - const ptrdiff_t srcMipRowPitch = ptrdiff_t(initData->mipRowStrides[j]); - - assert(dstMipRowPitch >= srcMipRowPitch); - - const uint8_t* srcRow = (const uint8_t*)initData->subResources[subResourceIndex]; - uint8_t* dstRow = p + layouts[j].Offset; - - // Copy the depth each mip - for (int l = 0; l < mipSize.depth; l++) 
- { - // Copy rows - for (int k = 0; k < mipSize.height; ++k) - { - ::memcpy(dstRow, srcRow, srcMipRowPitch); - - srcRow += srcMipRowPitch; - dstRow += dstMipRowPitch; - } - } - - //assert(srcRow == (const uint8_t*)(srcMip.Buffer() + srcMip.Count())); - } - uploadResource->Unmap(0, nullptr); - - for (int mipIndex = 0; mipIndex < numMipMaps; ++mipIndex) - { - // https://msdn.microsoft.com/en-us/library/windows/desktop/dn903862(v=vs.85).aspx - - D3D12_TEXTURE_COPY_LOCATION src; - src.pResource = uploadTexture; - src.Type = D3D12_TEXTURE_COPY_TYPE_PLACED_FOOTPRINT; - src.PlacedFootprint = layouts[mipIndex]; - - D3D12_TEXTURE_COPY_LOCATION dst; - dst.pResource = texture->m_resource; - dst.Type = D3D12_TEXTURE_COPY_TYPE_SUBRESOURCE_INDEX; - dst.SubresourceIndex = subResourceIndex; - m_commandList->CopyTextureRegion(&dst, 0, 0, 0, &src, nullptr); - - subResourceIndex++; - } - - { - // const D3D12_RESOURCE_STATES finalState = D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE; - const D3D12_RESOURCE_STATES finalState = _calcResourceState(initialUsage); - - D3D12BarrierSubmitter submitter(m_commandList); - texture->m_resource.transition(finalState, submitter); - } - - // Block - waiting for copy to complete (so can drop upload texture) - submitGpuWorkAndWait(); - } - - return texture.detach(); -} - -BufferResource* D3D12Renderer::createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData) -{ - typedef BufferResourceImpl::BackingStyle Style; - - BufferResource::Desc srcDesc(descIn); - srcDesc.setDefaults(initialUsage); - - RefPtr buffer(new BufferResourceImpl(initialUsage, srcDesc)); - - // Save the style - buffer->m_backingStyle = BufferResourceImpl::_calcResourceBackingStyle(initialUsage); - - D3D12_RESOURCE_DESC bufferDesc; - _initBufferResourceDesc(srcDesc.sizeInBytes, bufferDesc); - - bufferDesc.Flags = _calcResourceBindFlags(initialUsage, srcDesc.bindFlags); - - switch (buffer->m_backingStyle) - { - case 
Style::MemoryBacked: - { - // Assume the constant buffer will change every frame. We'll just keep a copy of the contents - // in regular memory until it needed - buffer->m_memory.SetSize(UInt(srcDesc.sizeInBytes)); - // Initialize - if (initData) - { - ::memcpy(buffer->m_memory.Buffer(), initData, srcDesc.sizeInBytes); - } - break; - } - case Style::ResourceBacked: - { - const D3D12_RESOURCE_STATES initialState = _calcResourceState(initialUsage); - SLANG_RETURN_NULL_ON_FAIL(createBuffer(bufferDesc, initData, buffer->m_uploadResource, initialState, buffer->m_resource)); - break; - } - default: return nullptr; - } - - return buffer.detach(); -} - -InputLayout* D3D12Renderer::createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount) -{ - RefPtr<InputLayoutImpl> layout(new InputLayoutImpl); - - // Work out a buffer size to hold all text - size_t textSize = 0; - for (int i = 0; i < Int(inputElementCount); ++i) - { - const char* text = inputElements[i].semanticName; - textSize += text ? 
(::strlen(text) + 1) : 0; - } - layout->m_text.SetSize(textSize); - char* textPos = layout->m_text.Buffer(); - - // - List<D3D12_INPUT_ELEMENT_DESC>& elements = layout->m_elements; - elements.SetSize(inputElementCount); - - - for (UInt i = 0; i < inputElementCount; ++i) - { - const InputElementDesc& srcEle = inputElements[i]; - D3D12_INPUT_ELEMENT_DESC& dstEle = elements[i]; - - // Add text to the buffer - const char* semanticName = srcEle.semanticName; - if (semanticName) - { - const int len = int(::strlen(semanticName)); - ::memcpy(textPos, semanticName, len + 1); - semanticName = textPos; - textPos += len + 1; - } - - dstEle.SemanticName = semanticName; - dstEle.SemanticIndex = (UINT)srcEle.semanticIndex; - dstEle.Format = D3DUtil::getMapFormat(srcEle.format); - dstEle.InputSlot = 0; - dstEle.AlignedByteOffset = (UINT)srcEle.offset; - dstEle.InputSlotClass = D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA; - dstEle.InstanceDataStepRate = 0; - } - - return layout.detach(); -} - -void* D3D12Renderer::map(BufferResource* bufferIn, MapFlavor flavor) -{ - typedef BufferResourceImpl::BackingStyle Style; - - BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(bufferIn); - buffer->m_mapFlavor = flavor; - - const size_t bufferSize = buffer->getDesc().sizeInBytes; - - switch (buffer->m_backingStyle) - { - case Style::ResourceBacked: - { - // We need this in a state so we can upload - switch (flavor) - { - case MapFlavor::HostWrite: - case MapFlavor::WriteDiscard: - { - D3D12BarrierSubmitter submitter(m_commandList); - buffer->m_uploadResource.transition(D3D12_RESOURCE_STATE_GENERIC_READ, submitter); - buffer->m_resource.transition(D3D12_RESOURCE_STATE_COPY_DEST, submitter); - - const D3D12_RANGE readRange = {}; - - void* uploadData; - SLANG_RETURN_NULL_ON_FAIL(buffer->m_uploadResource.getResource()->Map(0, &readRange, reinterpret_cast<void**>(&uploadData))); - return uploadData; - - break; - } - case MapFlavor::HostRead: - { - // This will be slow!!! 
- it blocks CPU on GPU completion - D3D12Resource& resource = buffer->m_resource; - - // Readback heap - D3D12_HEAP_PROPERTIES heapProps; - heapProps.Type = D3D12_HEAP_TYPE_READBACK; - heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN; - heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN; - heapProps.CreationNodeMask = 1; - heapProps.VisibleNodeMask = 1; - - // Resource to readback to - D3D12_RESOURCE_DESC stagingDesc; - _initBufferResourceDesc(bufferSize, stagingDesc); - - D3D12Resource stageBuf; - SLANG_RETURN_NULL_ON_FAIL(stageBuf.initCommitted(m_device, heapProps, D3D12_HEAP_FLAG_NONE, stagingDesc, D3D12_RESOURCE_STATE_COPY_DEST, nullptr)); - - const D3D12_RESOURCE_STATES initialState = resource.getState(); - - // Make it a source - { - D3D12BarrierSubmitter submitter(m_commandList); - resource.transition(D3D12_RESOURCE_STATE_COPY_SOURCE, submitter); - } - // Do the copy - m_commandList->CopyBufferRegion(stageBuf, 0, resource, 0, bufferSize); - // Switch it back - { - D3D12BarrierSubmitter submitter(m_commandList); - resource.transition(initialState, submitter); - } - - // Wait until complete - submitGpuWorkAndWait(); - - // Map and copy - { - UINT8* data; - D3D12_RANGE readRange = { 0, bufferSize }; - - SLANG_RETURN_NULL_ON_FAIL(stageBuf.getResource()->Map(0, &readRange, reinterpret_cast<void**>(&data))); - - // Copy to memory buffer - buffer->m_memory.SetSize(bufferSize); - ::memcpy(buffer->m_memory.Buffer(), data, bufferSize); - - stageBuf.getResource()->Unmap(0, nullptr); - } - - return buffer->m_memory.Buffer(); - } - } - break; - } - case Style::MemoryBacked: - { - return buffer->m_memory.Buffer(); - } - default: return nullptr; - } - - return nullptr; -} - -void D3D12Renderer::unmap(BufferResource* bufferIn) -{ - typedef BufferResourceImpl::BackingStyle Style; - BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(bufferIn); - - switch (buffer->m_backingStyle) - { - case Style::MemoryBacked: - { - // Don't need to do anything, as will be uploaded 
automatically when used - break; - } - case Style::ResourceBacked: - { - // We need this in a state so we can upload - switch (buffer->m_mapFlavor) - { - case MapFlavor::HostWrite: - case MapFlavor::WriteDiscard: - { - // Unmap - ID3D12Resource* uploadResource = buffer->m_uploadResource; - ID3D12Resource* resource = buffer->m_resource; - - uploadResource->Unmap(0, nullptr); - - const D3D12_RESOURCE_STATES initialState = buffer->m_resource.getState(); - - { - D3D12BarrierSubmitter submitter(m_commandList); - buffer->m_uploadResource.transition(D3D12_RESOURCE_STATE_GENERIC_READ, submitter); - buffer->m_resource.transition(D3D12_RESOURCE_STATE_COPY_DEST, submitter); - } - - m_commandList->CopyBufferRegion(resource, 0, uploadResource, 0, buffer->getDesc().sizeInBytes); - - { - D3D12BarrierSubmitter submitter(m_commandList); - buffer->m_resource.transition(initialState, submitter); - } - break; - } - case MapFlavor::HostRead: - { - break; - } - } - } - } -} - -void D3D12Renderer::setInputLayout(InputLayout* inputLayout) -{ - m_boundInputLayout = static_cast<InputLayoutImpl*>(inputLayout); -} - -void D3D12Renderer::setPrimitiveTopology(PrimitiveTopology topology) -{ - switch (topology) - { - case PrimitiveTopology::TriangleList: - { - m_primitiveTopologyType = D3D12_PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE; - m_primitiveTopology = D3DUtil::getPrimitiveTopology(topology); - break; - } - default: - { - assert(!"Unhandled type"); - } - } -} - -void D3D12Renderer::setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) -{ - { - const UInt num = startSlot + slotCount; - if (num > m_boundVertexBuffers.Count()) - { - m_boundVertexBuffers.SetSize(num); - } - } - - for (UInt i = 0; i < slotCount; i++) - { - BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(buffers[i]); - if (buffer) - { - assert(buffer->m_initialUsage == Resource::Usage::VertexBuffer); - } - - BoundVertexBuffer& boundBuffer = m_boundVertexBuffers[startSlot + i]; - 
boundBuffer.m_buffer = buffer; - boundBuffer.m_stride = int(strides[i]); - boundBuffer.m_offset = int(offsets[i]); - } -} - -void D3D12Renderer::setShaderProgram(ShaderProgram* inProgram) -{ - m_boundShaderProgram = static_cast<ShaderProgramImpl*>(inProgram); -} - -void D3D12Renderer::draw(UInt vertexCount, UInt startVertex) -{ - ID3D12GraphicsCommandList* commandList = m_commandList; - - RenderState* renderState = calcRenderState(); - if (!renderState) - { - assert(!"Couldn't create render state"); - return; - } - - BindingStateImpl* bindingState = m_boundBindingState; - - // Submit - setting for graphics - { - GraphicsSubmitter submitter(commandList); - _bindRenderState(renderState, commandList, &submitter); - } - - commandList->IASetPrimitiveTopology(m_primitiveTopology); - - // Set up vertex buffer views - { - int numVertexViews = 0; - D3D12_VERTEX_BUFFER_VIEW vertexViews[16]; - for (int i = 0; i < int(m_boundVertexBuffers.Count()); i++) - { - const BoundVertexBuffer& boundVertexBuffer = m_boundVertexBuffers[i]; - BufferResourceImpl* buffer = boundVertexBuffer.m_buffer; - if (buffer) - { - D3D12_VERTEX_BUFFER_VIEW& vertexView = vertexViews[numVertexViews++]; - vertexView.BufferLocation = buffer->m_resource.getResource()->GetGPUVirtualAddress(); - vertexView.SizeInBytes = int(buffer->getDesc().sizeInBytes); - vertexView.StrideInBytes = boundVertexBuffer.m_stride; - } - } - commandList->IASetVertexBuffers(0, numVertexViews, vertexViews); - } - - commandList->DrawInstanced(UINT(vertexCount), 1, UINT(startVertex), 0); -} - -void D3D12Renderer::dispatchCompute(int x, int y, int z) -{ - ID3D12GraphicsCommandList* commandList = m_commandList; - RenderState* renderState = calcRenderState(); - - // Submit binding for compute - { - ComputeSubmitter submitter(commandList); - _bindRenderState(renderState, commandList, &submitter); - } - - commandList->Dispatch(x, y, z); -} - -BindingState* D3D12Renderer::createBindingState(const BindingState::Desc& bindingStateDesc) -{ - RefPtr<BindingStateImpl> 
bindingState(new BindingStateImpl(bindingStateDesc)); - - SLANG_RETURN_NULL_ON_FAIL(bindingState->init(m_device)); - - const auto& srcBindings = bindingStateDesc.m_bindings; - const int numBindings = int(srcBindings.Count()); - - auto& dstDetails = bindingState->m_bindingDetails; - dstDetails.SetSize(numBindings); - - for (int i = 0; i < numBindings; ++i) - { - const auto& srcEntry = srcBindings[i]; - auto& dstDetail = dstDetails[i]; - - const int bindingIndex = srcEntry.registerRange.getSingleIndex(); - - switch (srcEntry.bindingType) - { - case BindingType::Buffer: - { - assert(srcEntry.resource && srcEntry.resource->isBuffer()); - BufferResourceImpl* bufferResource = static_cast<BufferResourceImpl*>(srcEntry.resource.Ptr()); - const BufferResource::Desc& bufferDesc = bufferResource->getDesc(); - - const size_t bufferSize = bufferDesc.sizeInBytes; - const int elemSize = bufferDesc.elementSize <= 0 ? sizeof(uint32_t) : bufferDesc.elementSize; - - const bool createSrv = false; - - // NOTE! In this arrangement the buffer can either be a ConstantBuffer or a 'StorageBuffer'. - // If it's a storage buffer then it has a 'uav'. - // In neither circumstance is there an associated srv - // This departs a little from dx11 code - in that it will create srv and uav for a storage buffer. 
- if (bufferDesc.bindFlags & Resource::BindFlag::UnorderedAccess) - { - dstDetail.m_uavIndex = bindingState->m_viewHeap.allocate(); - if (dstDetail.m_uavIndex < 0) - { - return nullptr; - } - - D3D12_UNORDERED_ACCESS_VIEW_DESC uavDesc = {}; - - uavDesc.ViewDimension = D3D12_UAV_DIMENSION_BUFFER; - uavDesc.Format = D3DUtil::getMapFormat(bufferDesc.format); - - uavDesc.Buffer.StructureByteStride = elemSize; - - uavDesc.Buffer.FirstElement = 0; - uavDesc.Buffer.NumElements = (UINT)(bufferSize / elemSize); - uavDesc.Buffer.Flags = D3D12_BUFFER_UAV_FLAG_NONE; - - if (bufferDesc.elementSize == 0 && bufferDesc.format == Format::Unknown) - { - uavDesc.Buffer.Flags |= D3D12_BUFFER_UAV_FLAG_RAW; - uavDesc.Format = DXGI_FORMAT_R32_TYPELESS; - - uavDesc.Buffer.StructureByteStride = 0; - } - else if( bufferDesc.format != Format::Unknown ) - { - uavDesc.Buffer.StructureByteStride = 0; - } - - m_device->CreateUnorderedAccessView(bufferResource->m_resource, nullptr, &uavDesc, bindingState->m_viewHeap.getCpuHandle(dstDetail.m_uavIndex)); - } - if (createSrv && (bufferDesc.bindFlags & (Resource::BindFlag::NonPixelShaderResource | Resource::BindFlag::PixelShaderResource))) - { - dstDetail.m_srvIndex = bindingState->m_viewHeap.allocate(); - if (dstDetail.m_srvIndex < 0) - { - return nullptr; - } - - D3D12_SHADER_RESOURCE_VIEW_DESC srvDesc; - - srvDesc.ViewDimension = D3D12_SRV_DIMENSION_BUFFER; - srvDesc.Format = DXGI_FORMAT_UNKNOWN; - srvDesc.Shader4ComponentMapping = D3D12_DEFAULT_SHADER_4_COMPONENT_MAPPING; - - srvDesc.Buffer.FirstElement = 0; - srvDesc.Buffer.NumElements = (UINT)(bufferSize / elemSize); - srvDesc.Buffer.StructureByteStride = elemSize; - srvDesc.Buffer.Flags = D3D12_BUFFER_SRV_FLAG_NONE; - - if (bufferDesc.elementSize == 0) - { - srvDesc.Format = DXGI_FORMAT_R32_FLOAT; - } - - m_device->CreateShaderResourceView(bufferResource->m_resource, &srvDesc, bindingState->m_viewHeap.getCpuHandle(dstDetail.m_srvIndex)); - } - - break; - } - case BindingType::Texture: - { - 
assert(srcEntry.resource && srcEntry.resource->isTexture()); - - TextureResourceImpl* textureResource = static_cast<TextureResourceImpl*>(srcEntry.resource.Ptr()); - - dstDetail.m_srvIndex = bindingState->m_viewHeap.allocate(); - if (dstDetail.m_srvIndex < 0) - { - return nullptr; - } - - { - const D3D12_RESOURCE_DESC resourceDesc = textureResource->m_resource.getResource()->GetDesc(); - const DXGI_FORMAT pixelFormat = resourceDesc.Format; - - D3D12_SHADER_RESOURCE_VIEW_DESC srvDesc; - _initSrvDesc(textureResource->getType(), textureResource->getDesc(), resourceDesc, pixelFormat, srvDesc); - - // Create descriptor - m_device->CreateShaderResourceView(textureResource->m_resource, &srvDesc, bindingState->m_viewHeap.getCpuHandle(dstDetail.m_srvIndex)); - } - - break; - } - case BindingType::Sampler: - { - const BindingState::SamplerDesc& samplerDesc = bindingStateDesc.m_samplerDescs[srcEntry.descIndex]; - - const int samplerIndex = bindingIndex; - dstDetail.m_samplerIndex = samplerIndex; - bindingState->m_samplerHeap.placeAt(samplerIndex); - - D3D12_SAMPLER_DESC desc = {}; - desc.AddressU = desc.AddressV = desc.AddressW = D3D12_TEXTURE_ADDRESS_MODE_WRAP; - desc.ComparisonFunc = D3D12_COMPARISON_FUNC_ALWAYS; - - if (samplerDesc.isCompareSampler) - { - desc.ComparisonFunc = D3D12_COMPARISON_FUNC_LESS_EQUAL; - desc.Filter = D3D12_FILTER_MIN_LINEAR_MAG_MIP_POINT; - } - else - { - desc.Filter = D3D12_FILTER_ANISOTROPIC; - desc.MaxAnisotropy = 8; - desc.MinLOD = 0.0f; - desc.MaxLOD = 100.0f; - } - - m_device->CreateSampler(&desc, bindingState->m_samplerHeap.getCpuHandle(samplerIndex)); - - break; - } - case BindingType::CombinedTextureSampler: - { - assert(!"Not implemented"); - return nullptr; - } - } - } - - return bindingState.detach(); -} - -void D3D12Renderer::setBindingState(BindingState* state) -{ - m_boundBindingState = static_cast<BindingStateImpl*>(state); -} - -ShaderProgram* D3D12Renderer::createProgram(const ShaderProgram::Desc& desc) -{ - RefPtr<ShaderProgramImpl> program(new ShaderProgramImpl()); - 
program->m_pipelineType = desc.pipelineType; - - if (desc.pipelineType == PipelineType::Compute) - { - auto computeKernel = desc.findKernel(StageType::Compute); - program->m_computeShader.InsertRange(0, (const uint8_t*) computeKernel->codeBegin, computeKernel->getCodeSize()); - } - else - { - auto vertexKernel = desc.findKernel(StageType::Vertex); - auto fragmentKernel = desc.findKernel(StageType::Fragment); - - program->m_vertexShader.InsertRange(0, (const uint8_t*) vertexKernel->codeBegin, vertexKernel->getCodeSize()); - program->m_pixelShader.InsertRange(0, (const uint8_t*) fragmentKernel->codeBegin, fragmentKernel->getCodeSize()); - } - - return program.detach(); -} - - -} // renderer_test diff --git a/tools/slang-graphics/render-d3d12.h b/tools/slang-graphics/render-d3d12.h deleted file mode 100644 index 5f0eea4d2..000000000 --- a/tools/slang-graphics/render-d3d12.h +++ /dev/null @@ -1,10 +0,0 @@ -// render-d3d12.h -#pragma once - -namespace slang_graphics { - -class Renderer; - -Renderer* createD3D12Renderer(); - -} // slang_graphics diff --git a/tools/slang-graphics/render-gl.cpp b/tools/slang-graphics/render-gl.cpp deleted file mode 100644 index f85a81ca4..000000000 --- a/tools/slang-graphics/render-gl.cpp +++ /dev/null @@ -1,1049 +0,0 @@ -// render-gl.cpp -#include "render-gl.h" - -//WORKING:#include "options.h" -#include "render.h" - -#include -#include -#include -#include "core/basic.h" -#include "core/secure-crt.h" -#include "external/stb/stb_image_write.h" - -#include "surface.h" - -// TODO(tfoley): eventually we should be able to run these -// tests on non-Windows targets to confirm that cross-compilation -// at least *works* on those platforms... 
-#define WIN32_LEAN_AND_MEAN -#define NOMINMAX -#include -#undef WIN32_LEAN_AND_MEAN -#undef NOMINMAX - -#ifdef _MSC_VER -#include -#if (_MSC_VER < 1900) -#define snprintf sprintf_s -#endif -#endif - -#pragma comment(lib, "opengl32") - -#include -#include "external/glext.h" - -// We define an "X-macro" for mapping over loadable OpenGL -// extension entry point that we will use, so that we can -// easily write generic code to iterate over them. -#define MAP_GL_EXTENSION_FUNCS(F) \ - F(glCreateProgram, PFNGLCREATEPROGRAMPROC) \ - F(glCreateShader, PFNGLCREATESHADERPROC) \ - F(glShaderSource, PFNGLSHADERSOURCEPROC) \ - F(glCompileShader, PFNGLCOMPILESHADERPROC) \ - F(glGetShaderiv, PFNGLGETSHADERIVPROC) \ - F(glDeleteShader, PFNGLDELETESHADERPROC) \ - F(glAttachShader, PFNGLATTACHSHADERPROC) \ - F(glLinkProgram, PFNGLLINKPROGRAMPROC) \ - F(glGetProgramiv, PFNGLGETPROGRAMIVPROC) \ - F(glGetProgramInfoLog, PFNGLGETPROGRAMINFOLOGPROC) \ - F(glDeleteProgram, PFNGLDELETEPROGRAMPROC) \ - F(glGetShaderInfoLog, PFNGLGETSHADERINFOLOGPROC) \ - F(glGenBuffers, PFNGLGENBUFFERSPROC) \ - F(glBindBuffer, PFNGLBINDBUFFERPROC) \ - F(glBufferData, PFNGLBUFFERDATAPROC) \ - F(glDeleteBuffers, PFNGLDELETEBUFFERSPROC) \ - F(glMapBuffer, PFNGLMAPBUFFERPROC) \ - F(glUnmapBuffer, PFNGLUNMAPBUFFERPROC) \ - F(glUseProgram, PFNGLUSEPROGRAMPROC) \ - F(glBindBufferBase, PFNGLBINDBUFFERBASEPROC) \ - F(glVertexAttribPointer, PFNGLVERTEXATTRIBPOINTERPROC) \ - F(glEnableVertexAttribArray, PFNGLENABLEVERTEXATTRIBARRAYPROC) \ - F(glDisableVertexAttribArray, PFNGLDISABLEVERTEXATTRIBARRAYPROC) \ - F(glDebugMessageCallback, PFNGLDEBUGMESSAGECALLBACKPROC) \ - F(glDispatchCompute, PFNGLDISPATCHCOMPUTEPROC) \ - F(glActiveTexture, PFNGLACTIVETEXTUREPROC) \ - F(glCreateSamplers, PFNGLCREATESAMPLERSPROC) \ - F(glDeleteSamplers, PFNGLDELETESAMPLERSPROC) \ - F(glBindSampler, PFNGLBINDSAMPLERPROC) \ - F(glTexImage3D, PFNGLTEXIMAGE3DPROC) \ - F(glSamplerParameteri, PFNGLSAMPLERPARAMETERIPROC) \ - /* end */ - -using 
namespace Slang; - -namespace slang_graphics { - -class GLRenderer : public Renderer -{ -public: - - // Renderer implementation - virtual SlangResult initialize(const Desc& desc, void* inWindowHandle) override; - virtual void setClearColor(const float color[4]) override; - virtual void clearFrame() override; - virtual void presentFrame() override; - virtual TextureResource* createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData) override; - virtual BufferResource* createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData) override; - virtual SlangResult captureScreenSurface(Surface& surfaceOut) override; - virtual InputLayout* createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount) override; - virtual BindingState* createBindingState(const BindingState::Desc& bindingStateDesc) override; - virtual ShaderProgram* createProgram(const ShaderProgram::Desc& desc) override; - virtual void* map(BufferResource* buffer, MapFlavor flavor) override; - virtual void unmap(BufferResource* buffer) override; - virtual void setInputLayout(InputLayout* inputLayout) override; - virtual void setPrimitiveTopology(PrimitiveTopology topology) override; - virtual void setBindingState(BindingState* state); - virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) override; - virtual void setShaderProgram(ShaderProgram* inProgram) override; - virtual void draw(UInt vertexCount, UInt startVertex) override; - virtual void dispatchCompute(int x, int y, int z) override; - virtual void submitGpuWork() override {} - virtual void waitForGpu() override {} - virtual RendererType getRendererType() const override { return RendererType::OpenGl; } - - protected: - enum - { - kMaxVertexStreams = 16, - }; - - struct VertexAttributeFormat - { - GLint componentCount; - GLenum 
componentType; - GLboolean normalized; - }; - - struct VertexAttributeDesc - { - VertexAttributeFormat format; - GLuint streamIndex; - GLsizei offset; - }; - - class InputLayoutImpl: public InputLayout - { - public: - VertexAttributeDesc m_attributes[kMaxVertexStreams]; - UInt m_attributeCount = 0; - }; - - class BufferResourceImpl: public BufferResource - { - public: - typedef BufferResource Parent; - - BufferResourceImpl(Usage initialUsage, const Desc& desc, GLRenderer* renderer, GLuint id, GLenum target): - Parent(desc), - m_renderer(renderer), - m_handle(id), - m_initialUsage(initialUsage), - m_target(target) - {} - ~BufferResourceImpl() - { - if (m_renderer) - { - m_renderer->glDeleteBuffers(1, &m_handle); - } - } - - Usage m_initialUsage; - GLRenderer* m_renderer; - GLuint m_handle; - GLenum m_target; - }; - - class TextureResourceImpl: public TextureResource - { - public: - typedef TextureResource Parent; - - TextureResourceImpl(Usage initialUsage, const Desc& desc, GLRenderer* renderer): - Parent(desc), - m_initialUsage(initialUsage), - m_renderer(renderer) - { - m_target = 0; - m_handle = 0; - } - - ~TextureResourceImpl() - { - if (m_handle) - { - glDeleteTextures(1, &m_handle); - } - } - - Usage m_initialUsage; - GLRenderer* m_renderer; - GLenum m_target; - GLuint m_handle; - }; - - struct BindingDetail - { - GLuint m_samplerHandle = 0; - }; - - class BindingStateImpl: public BindingState - { - public: - typedef BindingState Parent; - - /// Ctor - BindingStateImpl(const Desc& desc, GLRenderer* renderer): - Parent(desc), - m_renderer(renderer) - { - } - - ~BindingStateImpl() - { - if (m_renderer) - { - m_renderer->destroyBindingEntries(getDesc(), m_bindingDetails.Buffer()); - } - } - - GLRenderer* m_renderer; - List<BindingDetail> m_bindingDetails; - }; - - class ShaderProgramImpl : public ShaderProgram - { - public: - ShaderProgramImpl(GLRenderer* renderer, GLuint id): - m_renderer(renderer), - m_id(id) - { - } - ~ShaderProgramImpl() - { - if (m_renderer) - { - 
m_renderer->glDeleteProgram(m_id); - } - } - - GLuint m_id; - GLRenderer* m_renderer; - }; - - enum class GlPixelFormat - { - Unknown, - RGBA_Unorm_UInt8, - CountOf, - }; - - struct GlPixelFormatInfo - { - GLint internalFormat; // such as GL_RGBA8 - GLenum format; // such as GL_RGBA - GLenum formatType; // such as GL_UNSIGNED_BYTE - }; - - void destroyBindingEntries(const BindingState::Desc& desc, const BindingDetail* details); - - void bindBufferImpl(int target, UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* offsets); - void flushStateForDraw(); - GLuint loadShader(GLenum stage, char const* source); - void debugCallback(GLenum source, GLenum type, GLuint id, GLenum severity, GLsizei length, const GLchar* message); - - /// Returns GlPixelFormat::Unknown if not an equivalent - static GlPixelFormat _getGlPixelFormat(Format format); - - static void APIENTRY staticDebugCallback(GLenum source, GLenum type, GLuint id, GLenum severity, GLsizei length, const GLchar* message, const void* userParam); - static VertexAttributeFormat getVertexAttributeFormat(Format format); - - static void compileTimeAsserts(); - - HDC m_hdc; - HGLRC m_glContext; - float m_clearColor[4] = { 0, 0, 0, 0 }; - - RefPtr<ShaderProgramImpl> m_boundShaderProgram; - RefPtr<InputLayoutImpl> m_boundInputLayout; - - GLenum m_boundPrimitiveTopology = GL_TRIANGLES; - GLuint m_boundVertexStreamBuffers[kMaxVertexStreams]; - UInt m_boundVertexStreamStrides[kMaxVertexStreams]; - UInt m_boundVertexStreamOffsets[kMaxVertexStreams]; - - Desc m_desc; - - // Declare a function pointer for each OpenGL - // extension function we need to load -#define DECLARE_GL_EXTENSION_FUNC(NAME, TYPE) TYPE NAME; - MAP_GL_EXTENSION_FUNCS(DECLARE_GL_EXTENSION_FUNC) -#undef DECLARE_GL_EXTENSION_FUNC - - static const GlPixelFormatInfo s_pixelFormatInfos[]; /// Maps GlPixelFormat to a format info -}; - -/* static */GLRenderer::GlPixelFormat GLRenderer::_getGlPixelFormat(Format format) -{ - switch (format) - { - case Format::RGBA_Unorm_UInt8: 
return GlPixelFormat::RGBA_Unorm_UInt8; - default: return GlPixelFormat::Unknown; - } -} - -/* static */ const GLRenderer::GlPixelFormatInfo GLRenderer::s_pixelFormatInfos[] = -{ - // internalType, format, formatType - { 0, 0, 0}, // GlPixelFormat::Unknown - { GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE }, // GlPixelFormat::RGBA_Unorm_UInt8 -}; - -/* static */void GLRenderer::compileTimeAsserts() -{ - SLANG_COMPILE_TIME_ASSERT(SLANG_COUNT_OF(s_pixelFormatInfos) == int(GlPixelFormat::CountOf)); -} - -Renderer* createGLRenderer() -{ - return new GLRenderer(); -} - -void GLRenderer::debugCallback(GLenum source, GLenum type, GLuint id, GLenum severity, GLsizei length, const GLchar* message) -{ - ::OutputDebugStringA("GL: "); - ::OutputDebugStringA(message); - ::OutputDebugStringA("\n"); - - switch (type) - { - case GL_DEBUG_TYPE_ERROR: - break; - default: - break; - } -} - -/* static */void APIENTRY GLRenderer::staticDebugCallback(GLenum source, GLenum type, GLuint id, GLenum severity, GLsizei length, const GLchar* message, const void* userParam) -{ - ((GLRenderer*)userParam)->debugCallback(source, type, id, severity, length, message); -} - -/* static */GLRenderer::VertexAttributeFormat GLRenderer::getVertexAttributeFormat(Format format) -{ - switch (format) - { - default: assert(!"unexpected"); return VertexAttributeFormat(); - -#define CASE(NAME, COUNT, TYPE, NORMALIZED) \ - case Format::NAME: do { VertexAttributeFormat result = {COUNT, TYPE, NORMALIZED}; return result; } while (0) - - CASE(RGBA_Float32, 4, GL_FLOAT, GL_FALSE); - CASE(RGB_Float32, 3, GL_FLOAT, GL_FALSE); - CASE(RG_Float32, 2, GL_FLOAT, GL_FALSE); - CASE(R_Float32, 1, GL_FLOAT, GL_FALSE); -#undef CASE - } -} - -void GLRenderer::bindBufferImpl(int target, UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* offsets) -{ - for (UInt ii = 0; ii < slotCount; ++ii) - { - UInt slot = startSlot + ii; - - BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(buffers[ii]); - GLuint bufferID = buffer ? 
buffer->m_handle : 0; - - assert(!offsets || !offsets[ii]); - - glBindBufferBase(target, (GLuint)slot, bufferID); - } -} - -void GLRenderer::flushStateForDraw() -{ - auto layout = m_boundInputLayout.Ptr(); - auto attrCount = layout->m_attributeCount; - for (UInt ii = 0; ii < attrCount; ++ii) - { - auto& attr = layout->m_attributes[ii]; - - auto streamIndex = attr.streamIndex; - - glBindBuffer(GL_ARRAY_BUFFER, m_boundVertexStreamBuffers[streamIndex]); - - glVertexAttribPointer( - (GLuint)ii, - attr.format.componentCount, - attr.format.componentType, - attr.format.normalized, - (GLsizei)m_boundVertexStreamStrides[streamIndex], - (GLvoid*)(attr.offset + m_boundVertexStreamOffsets[streamIndex])); - - glEnableVertexAttribArray((GLuint)ii); - } - for (UInt ii = attrCount; ii < kMaxVertexStreams; ++ii) - { - glDisableVertexAttribArray((GLuint)ii); - } -} - -GLuint GLRenderer::loadShader(GLenum stage, const char* source) -{ - // GLSL is monumentally stupid. It officially requires the `#version` directive - // to be the first thing in the file, which wouldn't be so bad but the API - // doesn't provide a way to pass a `#define` into your shader other than by - // prepending it to the whole thing. - // - // We are going to solve this problem by doing some surgery on the source - // that was passed in. - - const char* sourceBegin = source; - const char* sourceEnd = source + strlen(source); - - // Look for a version directive in the user-provided source. - const char* versionBegin = strstr(source, "#version"); - const char* versionEnd = nullptr; - if (versionBegin) - { - // If we found a directive, then scan for the end-of-line - // after it, and use that to specify the slice. 
- versionEnd = strchr(versionBegin, '\n'); - if (!versionEnd) - { - versionEnd = sourceEnd; - } - else - { - versionEnd = versionEnd + 1; - } - } - else - { - // If we didn't find a directive, then treat it as being - // a zero-byte slice at the start of the string - versionBegin = sourceBegin; - versionEnd = sourceBegin; - } - - enum { kMaxSourceStringCount = 16 }; - const GLchar* sourceStrings[kMaxSourceStringCount]; - GLint sourceStringLengths[kMaxSourceStringCount]; - - int sourceStringCount = 0; - - const char* stagePrelude = "\n"; - switch (stage) - { -#define CASE(NAME) case GL_##NAME##_SHADER: stagePrelude = "#define __GLSL_" #NAME "__ 1\n"; break - - CASE(VERTEX); - CASE(TESS_CONTROL); - CASE(TESS_EVALUATION); - CASE(GEOMETRY); - CASE(FRAGMENT); - CASE(COMPUTE); - -#undef CASE - } - - const char* prelude = - "#define __GLSL__ 1\n" - ; - -#define ADD_SOURCE_STRING_SPAN(BEGIN, END) \ - sourceStrings[sourceStringCount] = BEGIN; \ - sourceStringLengths[sourceStringCount++] = GLint(END - BEGIN) \ - /* end */ - -#define ADD_SOURCE_STRING(BEGIN) \ - sourceStrings[sourceStringCount] = BEGIN; \ - sourceStringLengths[sourceStringCount++] = GLint(strlen(BEGIN)) \ - /* end */ - - ADD_SOURCE_STRING_SPAN(versionBegin, versionEnd); - ADD_SOURCE_STRING(stagePrelude); - ADD_SOURCE_STRING(prelude); - ADD_SOURCE_STRING_SPAN(sourceBegin, versionBegin); - ADD_SOURCE_STRING_SPAN(versionEnd, sourceEnd); - - auto shaderID = glCreateShader(stage); - glShaderSource( - shaderID, - sourceStringCount, - &sourceStrings[0], - &sourceStringLengths[0]); - glCompileShader(shaderID); - - GLint success = GL_FALSE; - glGetShaderiv(shaderID, GL_COMPILE_STATUS, &success); - if (!success) - { - int maxSize = 0; - glGetShaderiv(shaderID, GL_INFO_LOG_LENGTH, &maxSize); - - auto infoBuffer = (char*)malloc(maxSize); - - int infoSize = 0; - glGetShaderInfoLog(shaderID, maxSize, &infoSize, infoBuffer); - if (infoSize > 0) - { - fprintf(stderr, "%s", infoBuffer); - ::OutputDebugStringA(infoBuffer); - } 
- - glDeleteShader(shaderID); - return 0; - } - - return shaderID; -} - -void GLRenderer::destroyBindingEntries(const BindingState::Desc& desc, const BindingDetail* details) -{ - const auto& bindings = desc.m_bindings; - const int numBindings = int(bindings.Count()); - for (int i = 0; i < numBindings; ++i) - { - const auto& binding = bindings[i]; - const auto& detail = details[i]; - - if (binding.bindingType == BindingType::Sampler && detail.m_samplerHandle != 0) - { - glDeleteSamplers(1, &detail.m_samplerHandle); - } - } -} - -// !!!!!!!!!!!!!!!!!!!!!!!!!!!! Renderer interface !!!!!!!!!!!!!!!!!!!!!!!!!! - -SlangResult GLRenderer::initialize(const Desc& desc, void* inWindowHandle) -{ - auto windowHandle = (HWND)inWindowHandle; - m_desc = desc; - - m_hdc = ::GetDC(windowHandle); - - PIXELFORMATDESCRIPTOR pixelFormatDesc = { sizeof(PIXELFORMATDESCRIPTOR) }; - pixelFormatDesc.nVersion = 1; - pixelFormatDesc.dwFlags = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL | PFD_DOUBLEBUFFER; - pixelFormatDesc.iPixelType = PFD_TYPE_RGBA; - pixelFormatDesc.cColorBits = 32; - pixelFormatDesc.cDepthBits = 24; - pixelFormatDesc.cStencilBits = 8; - pixelFormatDesc.iLayerType = PFD_MAIN_PLANE; - - int pixelFormatIndex = ChoosePixelFormat(m_hdc, &pixelFormatDesc); - SetPixelFormat(m_hdc, pixelFormatIndex, &pixelFormatDesc); - - m_glContext = wglCreateContext(m_hdc); - wglMakeCurrent(m_hdc, m_glContext); - - auto renderer = glGetString(GL_RENDERER); - auto extensions = glGetString(GL_EXTENSIONS); - - // Load each of our extension functions by name - -#define LOAD_GL_EXTENSION_FUNC(NAME, TYPE) NAME = (TYPE) wglGetProcAddress(#NAME); - MAP_GL_EXTENSION_FUNCS(LOAD_GL_EXTENSION_FUNC) -#undef LOAD_GL_EXTENSION_FUNC - - glDisable(GL_DEPTH_TEST); - glDisable(GL_CULL_FACE); - - glViewport(0, 0, desc.width, desc.height); - - if (glDebugMessageCallback) - { - glEnable(GL_DEBUG_OUTPUT); - glDebugMessageCallback(staticDebugCallback, this); - } - - return SLANG_OK; -} - -void 
GLRenderer::setClearColor(const float color[4]) -{ - glClearColor(color[0], color[1], color[2], color[3]); -} - -void GLRenderer::clearFrame() -{ - glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT); -} - -void GLRenderer::presentFrame() -{ - glFlush(); - ::SwapBuffers(m_hdc); -} - -SlangResult GLRenderer::captureScreenSurface(Surface& surfaceOut) -{ - SLANG_RETURN_ON_FAIL(surfaceOut.allocate(m_desc.width, m_desc.height, Format::RGBA_Unorm_UInt8, 1, SurfaceAllocator::getMallocAllocator())); - glReadPixels(0, 0, m_desc.width, m_desc.height, GL_RGBA, GL_UNSIGNED_BYTE, surfaceOut.m_data); - surfaceOut.flipInplaceVertically(); - return SLANG_OK; -} - -TextureResource* GLRenderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* initData) -{ - TextureResource::Desc srcDesc(descIn); - srcDesc.setDefaults(initialUsage); - - GlPixelFormat pixelFormat = _getGlPixelFormat(srcDesc.format); - if (pixelFormat == GlPixelFormat::Unknown) - { - return nullptr; - } - - const GlPixelFormatInfo& info = s_pixelFormatInfos[int(pixelFormat)]; - - const GLint internalFormat = info.internalFormat; - const GLenum format = info.format; - const GLenum formatType = info.formatType; - - RefPtr<TextureResourceImpl> texture(new TextureResourceImpl(initialUsage, srcDesc, this)); - - GLenum target = 0; - GLuint handle = 0; - glGenTextures(1, &handle); - - const int effectiveArraySize = srcDesc.calcEffectiveArraySize(); - - assert(initData); - assert(initData->numSubResources == srcDesc.numMipLevels * srcDesc.size.depth * effectiveArraySize); - - // Set on texture so will be freed if failure - texture->m_handle = handle; - const void*const*const data = initData->subResources; - - switch (srcDesc.type) - { - case Resource::Type::Texture1D: - { - if (srcDesc.arraySize > 0) - { - target = GL_TEXTURE_1D_ARRAY; - glBindTexture(target, handle); - - int slice = 0; - for (int i = 0; i < effectiveArraySize; i++) - { - for (int j = 0; j 
< srcDesc.numMipLevels; j++) - { - glTexImage2D(target, j, internalFormat, srcDesc.size.width, i, 0, format, formatType, data[slice++]); - } - } - } - else - { - target = GL_TEXTURE_1D; - glBindTexture(target, handle); - for (int i = 0; i < srcDesc.numMipLevels; i++) - { - glTexImage1D(target, i, internalFormat, srcDesc.size.width, 0, format, formatType, data[i]); - } - } - break; - } - case Resource::Type::TextureCube: - case Resource::Type::Texture2D: - { - if (srcDesc.arraySize > 0) - { - if (srcDesc.type == Resource::Type::TextureCube) - { - target = GL_TEXTURE_CUBE_MAP_ARRAY; - } - else - { - target = GL_TEXTURE_2D_ARRAY; - } - - glBindTexture(target, handle); - - int slice = 0; - for (int i = 0; i < effectiveArraySize; i++) - { - for (int j = 0; j < srcDesc.numMipLevels; j++) - { - glTexImage3D(target, j, internalFormat, srcDesc.size.width, srcDesc.size.height, slice, 0, format, formatType, data[slice++]); - } - } - } - else - { - if (srcDesc.type == Resource::Type::TextureCube) - { - target = GL_TEXTURE_CUBE_MAP; - glBindTexture(target, handle); - - int slice = 0; - for (int j = 0; j < 6; j++) - { - for (int i = 0; i < srcDesc.numMipLevels; i++) - { - glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X + j, i, internalFormat, srcDesc.size.width, srcDesc.size.height, 0, format, formatType, data[slice++]); - } - } - } - else - { - target = GL_TEXTURE_2D; - glBindTexture(target, handle); - for (int i = 0; i < srcDesc.numMipLevels; i++) - { - glTexImage2D(target, i, internalFormat, srcDesc.size.width, srcDesc.size.height, 0, format, formatType, data[i]); - } - } - } - break; - } - case Resource::Type::Texture3D: - { - target = GL_TEXTURE_3D; - glBindTexture(target, handle); - for (int i = 0; i < srcDesc.numMipLevels; i++) - { - glTexImage3D(target, i, internalFormat, srcDesc.size.width, srcDesc.size.height, srcDesc.size.depth, 0, format, formatType, data[i]); - } - break; - } - default: return nullptr; - } - - glTexParameteri(target, GL_TEXTURE_WRAP_S, GL_REPEAT); - 
glTexParameteri(target, GL_TEXTURE_WRAP_T, GL_REPEAT); - glTexParameteri(target, GL_TEXTURE_WRAP_R, GL_REPEAT); - - // Assume regular sampling (might be superseded - if a combined sampler wanted) - glTexParameteri(target, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR); - glTexParameteri(target, GL_TEXTURE_MAG_FILTER, GL_LINEAR); - glTexParameterf(target, GL_TEXTURE_MAX_ANISOTROPY_EXT, 8.0f); - - texture->m_target = target; - - return texture.detach(); -} - -static GLenum _calcUsage(Resource::Usage usage) -{ - typedef Resource::Usage Usage; - switch (usage) - { - case Usage::ConstantBuffer: return GL_DYNAMIC_DRAW; - default: return GL_STATIC_READ; - } -} - -static GLenum _calcTarget(Resource::Usage usage) -{ - typedef Resource::Usage Usage; - switch (usage) - { - case Usage::ConstantBuffer: return GL_UNIFORM_BUFFER; - default: return GL_SHADER_STORAGE_BUFFER; - } -} - -BufferResource* GLRenderer::createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData) -{ - BufferResource::Desc desc(descIn); - desc.setDefaults(initialUsage); - - const GLenum target = _calcTarget(initialUsage); - // TODO: should derive from desc... 
- const GLenum usage = _calcUsage(initialUsage); - - GLuint bufferID = 0; - glGenBuffers(1, &bufferID); - glBindBuffer(target, bufferID); - - glBufferData(target, descIn.sizeInBytes, initData, usage); - - return new BufferResourceImpl(initialUsage, desc, this, bufferID, target); -} - -InputLayout* GLRenderer::createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount) -{ - InputLayoutImpl* inputLayout = new InputLayoutImpl; - - inputLayout->m_attributeCount = inputElementCount; - for (UInt ii = 0; ii < inputElementCount; ++ii) - { - auto& inputAttr = inputElements[ii]; - auto& glAttr = inputLayout->m_attributes[ii]; - - glAttr.streamIndex = 0; - glAttr.format = getVertexAttributeFormat(inputAttr.format); - glAttr.offset = (GLsizei)inputAttr.offset; - } - - return (InputLayout*)inputLayout; -} - -void* GLRenderer::map(BufferResource* bufferIn, MapFlavor flavor) -{ - BufferResourceImpl* buffer = static_cast(bufferIn); - - //GLenum target = GL_UNIFORM_BUFFER; - - GLuint access = 0; - switch (flavor) - { - case MapFlavor::WriteDiscard: - case MapFlavor::HostWrite: - access = GL_WRITE_ONLY; - break; - case MapFlavor::HostRead: - access = GL_READ_ONLY; - break; - } - - glBindBuffer(buffer->m_target, buffer->m_handle); - - return glMapBuffer(buffer->m_target, access); -} - -void GLRenderer::unmap(BufferResource* bufferIn) -{ - BufferResourceImpl* buffer = static_cast(bufferIn); - glUnmapBuffer(buffer->m_target); -} - -void GLRenderer::setInputLayout(InputLayout* inputLayout) -{ - m_boundInputLayout = static_cast(inputLayout); -} - -void GLRenderer::setPrimitiveTopology(PrimitiveTopology topology) -{ - GLenum glTopology = 0; - switch (topology) - { -#define CASE(NAME, VALUE) case PrimitiveTopology::NAME: glTopology = VALUE; break - - CASE(TriangleList, GL_TRIANGLES); - -#undef CASE - } - m_boundPrimitiveTopology = glTopology; -} - -void GLRenderer::setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, 
const UInt* offsets) -{ - for (UInt ii = 0; ii < slotCount; ++ii) - { - UInt slot = startSlot + ii; - - BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(buffers[ii]); - GLuint bufferID = buffer ? buffer->m_handle : 0; - - m_boundVertexStreamBuffers[slot] = bufferID; - m_boundVertexStreamStrides[slot] = strides[ii]; - m_boundVertexStreamOffsets[slot] = offsets[ii]; - } -} - -void GLRenderer::setShaderProgram(ShaderProgram* programIn) -{ - ShaderProgramImpl* program = static_cast<ShaderProgramImpl*>(programIn); - m_boundShaderProgram = program; - GLuint programID = program ? program->m_id : 0; - glUseProgram(programID); -} - -void GLRenderer::draw(UInt vertexCount, UInt startVertex = 0) -{ - flushStateForDraw(); - - glDrawArrays(m_boundPrimitiveTopology, (GLint)startVertex, (GLsizei)vertexCount); -} - -void GLRenderer::dispatchCompute(int x, int y, int z) -{ - glDispatchCompute(x, y, z); -} - -BindingState* GLRenderer::createBindingState(const BindingState::Desc& bindingStateDesc) -{ - RefPtr<BindingStateImpl> bindingState(new BindingStateImpl(bindingStateDesc, this)); - - const auto& srcBindings = bindingStateDesc.m_bindings; - const int numBindings = int(srcBindings.Count()); - - auto& dstDetails = bindingState->m_bindingDetails; - dstDetails.SetSize(numBindings); - - for (int i = 0; i < numBindings; ++i) - { - auto& dstDetail = dstDetails[i]; - const auto& srcBinding = srcBindings[i]; - - - switch (srcBinding.bindingType) - { - case BindingType::Texture: - case BindingType::Buffer: - { - break; - } - case BindingType::CombinedTextureSampler: - { - assert(srcBinding.resource && srcBinding.resource->isTexture()); - TextureResourceImpl* texture = static_cast<TextureResourceImpl*>(srcBinding.resource.Ptr()); - const BindingState::SamplerDesc& samplerDesc = bindingStateDesc.m_samplerDescs[srcBinding.descIndex]; - - if (samplerDesc.isCompareSampler) - { - auto target = texture->m_target; - - glTexParameteri(target, GL_TEXTURE_MIN_FILTER, GL_LINEAR); - glTexParameteri(target, GL_TEXTURE_MAG_FILTER, GL_LINEAR); - glTexParameteri(target, 
GL_TEXTURE_COMPARE_MODE, GL_COMPARE_REF_TO_TEXTURE); - glTexParameteri(target, GL_TEXTURE_COMPARE_FUNC, GL_LEQUAL); - } - break; - } - case BindingType::Sampler: - { - const BindingState::SamplerDesc& samplerDesc = bindingStateDesc.m_samplerDescs[srcBinding.descIndex]; - - GLuint handle; - - glCreateSamplers(1, &handle); - glSamplerParameteri(handle, GL_TEXTURE_WRAP_S, GL_REPEAT); - glSamplerParameteri(handle, GL_TEXTURE_WRAP_T, GL_REPEAT); - glSamplerParameteri(handle, GL_TEXTURE_WRAP_R, GL_REPEAT); - - if (samplerDesc.isCompareSampler) - { - glSamplerParameteri(handle, GL_TEXTURE_MIN_FILTER, GL_LINEAR); - glSamplerParameteri(handle, GL_TEXTURE_MAG_FILTER, GL_LINEAR); - glSamplerParameteri(handle, GL_TEXTURE_COMPARE_MODE, GL_COMPARE_REF_TO_TEXTURE); - glSamplerParameteri(handle, GL_TEXTURE_COMPARE_FUNC, GL_LEQUAL); - } - else - { - glSamplerParameteri(handle, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR); - glSamplerParameteri(handle, GL_TEXTURE_MAG_FILTER, GL_LINEAR); - glSamplerParameteri(handle, GL_TEXTURE_MAX_ANISOTROPY_EXT, 8); - } - - dstDetail.m_samplerHandle = handle; - break; - } - } - } - - return bindingState.detach(); -} - -void GLRenderer::setBindingState(BindingState* stateIn) -{ - BindingStateImpl* state = static_cast(stateIn); - - const auto& bindingDesc = state->getDesc(); - - const auto& details = state->m_bindingDetails; - const auto& bindings = bindingDesc.m_bindings; - const int numBindings = int(bindings.Count()); - - for (int i = 0; i < numBindings; ++i) - { - const auto& binding = bindings[i]; - const auto& detail = details[i]; - - switch (binding.bindingType) - { - case BindingType::Buffer: - { - const int bindingIndex = binding.registerRange.getSingleIndex(); - - BufferResourceImpl* buffer = static_cast(binding.resource.Ptr()); - glBindBufferBase(buffer->m_target, bindingIndex, buffer->m_handle); - break; - } - case BindingType::Sampler: - { - for (int index = binding.registerRange.index; index < binding.registerRange.index + 
binding.registerRange.size; ++index) - { - glBindSampler(index, detail.m_samplerHandle); - } - break; - } - case BindingType::Texture: - case BindingType::CombinedTextureSampler: - { - BufferResourceImpl* buffer = static_cast(binding.resource.Ptr()); - - const int bindingIndex = binding.registerRange.getSingleIndex(); - - glActiveTexture(GL_TEXTURE0 + bindingIndex); - glBindTexture(buffer->m_target, buffer->m_handle); - break; - } - } - } -} - -ShaderProgram* GLRenderer::createProgram(const ShaderProgram::Desc& desc) -{ - auto programID = glCreateProgram(); - if(desc.pipelineType == PipelineType::Compute ) - { - auto computeKernel = desc.findKernel(StageType::Compute); - auto computeShaderID = loadShader(GL_COMPUTE_SHADER, (char const*) computeKernel->codeBegin); - glAttachShader(programID, computeShaderID); - glLinkProgram(programID); - glDeleteShader(computeShaderID); - } - else - { - auto vertexKernel = desc.findKernel(StageType::Vertex); - auto fragmentKernel = desc.findKernel(StageType::Fragment); - - auto vertexShaderID = loadShader(GL_VERTEX_SHADER, (char const*) vertexKernel->codeBegin); - auto fragmentShaderID = loadShader(GL_FRAGMENT_SHADER, (char const*) fragmentKernel->codeBegin); - - glAttachShader(programID, vertexShaderID); - glAttachShader(programID, fragmentShaderID); - - - glLinkProgram(programID); - - glDeleteShader(vertexShaderID); - glDeleteShader(fragmentShaderID); - } - GLint success = GL_FALSE; - glGetProgramiv(programID, GL_LINK_STATUS, &success); - if (!success) - { - int maxSize = 0; - glGetProgramiv(programID, GL_INFO_LOG_LENGTH, &maxSize); - - auto infoBuffer = (char*)::malloc(maxSize); - - int infoSize = 0; - glGetProgramInfoLog(programID, maxSize, &infoSize, infoBuffer); - if (infoSize > 0) - { - fprintf(stderr, "%s", infoBuffer); - OutputDebugStringA(infoBuffer); - } - - ::free(infoBuffer); - - glDeleteProgram(programID); - return nullptr; - } - - return new ShaderProgramImpl(this, programID); -} - - -} // renderer_test diff --git 
a/tools/slang-graphics/render-gl.h b/tools/slang-graphics/render-gl.h deleted file mode 100644 index 2b211cda4..000000000 --- a/tools/slang-graphics/render-gl.h +++ /dev/null @@ -1,10 +0,0 @@ -// render-d3d11.h -#pragma once - -namespace slang_graphics { - -class Renderer; - -Renderer* createGLRenderer(); - -} // slang_graphics diff --git a/tools/slang-graphics/render-vk.cpp b/tools/slang-graphics/render-vk.cpp deleted file mode 100644 index d7cd93e67..000000000 --- a/tools/slang-graphics/render-vk.cpp +++ /dev/null @@ -1,2019 +0,0 @@ -// render-vk.cpp -#include "render-vk.h" - -//WORKING:#include "options.h" -#include "render.h" - -#include "../../source/core/smart-pointer.h" - -#include "vk-api.h" -#include "vk-util.h" -#include "vk-device-queue.h" -#include "vk-swap-chain.h" - -#include "surface.h" - -// Vulkan has a different coordinate system to ogl -// http://anki3d.org/vulkan-coordinate-system/ - -#define ENABLE_VALIDATION_LAYER 1 - -#ifdef _MSC_VER -# include -# pragma warning(disable: 4996) -# if (_MSC_VER < 1900) -# define snprintf sprintf_s -# endif -#endif - -namespace slang_graphics { -using namespace Slang; - -class VKRenderer : public Renderer -{ -public: - enum { kMaxRenderTargets = 8, kMaxAttachments = kMaxRenderTargets + 1 }; - - // Renderer implementation - virtual SlangResult initialize(const Desc& desc, void* inWindowHandle) override; - virtual void setClearColor(const float color[4]) override; - virtual void clearFrame() override; - virtual void presentFrame() override; - virtual TextureResource* createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData) override; - virtual BufferResource* createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& bufferDesc, const void* initData) override; - virtual SlangResult captureScreenSurface(Surface& surface) override; - virtual InputLayout* createInputLayout(const InputElementDesc* inputElements, UInt 
inputElementCount) override; - virtual BindingState* createBindingState(const BindingState::Desc& bindingStateDesc) override; - virtual ShaderProgram* createProgram(const ShaderProgram::Desc& desc) override; - virtual void* map(BufferResource* buffer, MapFlavor flavor) override; - virtual void unmap(BufferResource* buffer) override; - virtual void setInputLayout(InputLayout* inputLayout) override; - virtual void setPrimitiveTopology(PrimitiveTopology topology) override; - virtual void setBindingState(BindingState* state); - virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) override; - virtual void setShaderProgram(ShaderProgram* inProgram) override; - virtual void draw(UInt vertexCount, UInt startVertex) override; - virtual void dispatchCompute(int x, int y, int z) override; - virtual void submitGpuWork() override; - virtual void waitForGpu() override; - virtual RendererType getRendererType() const override { return RendererType::Vulkan; } - - /// Dtor - ~VKRenderer(); - - protected: - - class Buffer - { - public: - /// Initialize a buffer with specified size, and memory props - Result init(const VulkanApi& api, size_t bufferSize, VkBufferUsageFlags usage, VkMemoryPropertyFlags reqMemoryProperties); - - /// Returns true if has been initialized - bool isInitialized() const { return m_api != nullptr; } - - // Default Ctor - Buffer(): - m_api(nullptr) - {} - - /// Dtor - ~Buffer() - { - if (m_api) - { - m_api->vkDestroyBuffer(m_api->m_device, m_buffer, nullptr); - m_api->vkFreeMemory(m_api->m_device, m_memory, nullptr); - } - } - - VkBuffer m_buffer; - VkDeviceMemory m_memory; - const VulkanApi* m_api; - }; - - class InputLayoutImpl : public InputLayout - { - public: - List m_vertexDescs; - int m_vertexSize; - }; - - class BufferResourceImpl: public BufferResource - { - public: - typedef BufferResource Parent; - - BufferResourceImpl(Resource::Usage initialUsage, const 
BufferResource::Desc& desc, VKRenderer* renderer): - Parent(desc), - m_renderer(renderer), - m_initialUsage(initialUsage) - { - assert(renderer); - } - - Resource::Usage m_initialUsage; - VKRenderer* m_renderer; - Buffer m_buffer; - Buffer m_uploadBuffer; - List m_readBuffer; ///< Stores the contents when a map read is performed - - MapFlavor m_mapFlavor = MapFlavor::Unknown; ///< If resource is mapped, records what kind of mapping else Unknown (if not mapped) - }; - - class TextureResourceImpl : public TextureResource - { - public: - typedef TextureResource Parent; - - TextureResourceImpl(const Desc& desc, Usage initialUsage, const VulkanApi* api) : - Parent(desc), - m_initialUsage(initialUsage), - m_api(api) - { - } - ~TextureResourceImpl() - { - if (m_api) - { - if (m_imageMemory != VK_NULL_HANDLE) - { - m_api->vkFreeMemory(m_api->m_device, m_imageMemory, nullptr); - } - if (m_image != VK_NULL_HANDLE) - { - m_api->vkDestroyImage(m_api->m_device, m_image, nullptr); - } - } - } - - Usage m_initialUsage; - - VkImage m_image = VK_NULL_HANDLE; - VkDeviceMemory m_imageMemory = VK_NULL_HANDLE; - - const VulkanApi* m_api; - }; - - class ShaderProgramImpl: public ShaderProgram - { - public: - - ShaderProgramImpl(PipelineType pipelineType): - m_pipelineType(pipelineType) - {} - - PipelineType m_pipelineType; - - VkPipelineShaderStageCreateInfo m_compute; - VkPipelineShaderStageCreateInfo m_vertex; - VkPipelineShaderStageCreateInfo m_fragment; - - List m_buffers[2]; //< To keep storage of code in scope - }; - - struct BindingDetail - { - VkImageView m_srv = VK_NULL_HANDLE; - VkBufferView m_uav = VK_NULL_HANDLE; - VkSampler m_sampler = VK_NULL_HANDLE; - }; - - class BindingStateImpl: public BindingState - { - public: - typedef BindingState Parent; - - BindingStateImpl(const Desc& desc, const VulkanApi* api): - Parent(desc), - m_api(api) - { - } - ~BindingStateImpl() - { - for (int i = 0; i < int(m_bindingDetails.Count()); ++i) - { - BindingDetail& detail = 
m_bindingDetails[i]; - if (detail.m_sampler != VK_NULL_HANDLE) - { - m_api->vkDestroySampler(m_api->m_device, detail.m_sampler, nullptr); - } - if (detail.m_srv != VK_NULL_HANDLE) - { - m_api->vkDestroyImageView(m_api->m_device, detail.m_srv, nullptr); - } - if (detail.m_uav != VK_NULL_HANDLE) - { - m_api->vkDestroyBufferView(m_api->m_device, detail.m_uav, nullptr); - } - } - } - - const VulkanApi* m_api; - List m_bindingDetails; - }; - - struct BoundVertexBuffer - { - RefPtr m_buffer; - int m_stride; - int m_offset; - }; - - class Pipeline : public RefObject - { - public: - Pipeline(const VulkanApi& api): - m_api(&api) - { - } - ~Pipeline() - { - if (m_pipeline != VK_NULL_HANDLE) - { - m_api->vkDestroyPipeline(m_api->m_device, m_pipeline, nullptr); - } - if (m_descriptorPool != VK_NULL_HANDLE) - { - m_api->vkDestroyDescriptorPool(m_api->m_device, m_descriptorPool, nullptr); - } - if (m_pipelineLayout != VK_NULL_HANDLE) - { - m_api->vkDestroyPipelineLayout(m_api->m_device, m_pipelineLayout, nullptr); - } - if(m_descriptorSetLayout != VK_NULL_HANDLE) - { - m_api->vkDestroyDescriptorSetLayout(m_api->m_device, m_descriptorSetLayout, nullptr); - } - } - - const VulkanApi* m_api; - - VkPrimitiveTopology m_primitiveTopology; - RefPtr m_bindingState; - RefPtr m_inputLayout; - RefPtr m_shaderProgram; - - VkDescriptorSetLayout m_descriptorSetLayout = VK_NULL_HANDLE; - VkPipelineLayout m_pipelineLayout = VK_NULL_HANDLE; - VkDescriptorPool m_descriptorPool = VK_NULL_HANDLE; - VkDescriptorSet m_descriptorSet = VK_NULL_HANDLE; - VkPipeline m_pipeline = VK_NULL_HANDLE; - }; - - VkBool32 handleDebugMessage(VkDebugReportFlagsEXT flags, VkDebugReportObjectTypeEXT objType, uint64_t srcObject, - size_t location, int32_t msgCode, const char* pLayerPrefix, const char* pMsg); - - VkPipelineShaderStageCreateInfo compileEntryPoint( - ShaderProgram::KernelDesc const& kernelDesc, - VkShaderStageFlagBits stage, - List& bufferOut); - - static VKAPI_ATTR VkBool32 VKAPI_CALL 
debugMessageCallback(VkDebugReportFlagsEXT flags, VkDebugReportObjectTypeEXT objType, uint64_t srcObject, - size_t location, int32_t msgCode, const char* pLayerPrefix, const char* pMsg, void* pUserData); - - /// Returns true if m_currentPipeline matches the current configuration - Pipeline* _getPipeline(); - bool _isEqual(const Pipeline& pipeline) const; - Slang::Result _createPipeline(RefPtr& pipelineOut); - void _beginRender(); - void _endRender(); - - Slang::Result _beginPass(); - void _endPass(); - void _transitionImageLayout(VkImage image, VkFormat format, const TextureResource::Desc& desc, VkImageLayout oldLayout, VkImageLayout newLayout); - - VkDebugReportCallbackEXT m_debugReportCallback; - - RefPtr m_currentInputLayout; - RefPtr m_currentBindingState; - RefPtr m_currentProgram; - - List > m_pipelineCache; - Pipeline* m_currentPipeline = nullptr; - - List m_boundVertexBuffers; - - VkPrimitiveTopology m_primitiveTopology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST; - - VkDevice m_device = VK_NULL_HANDLE; - - VulkanModule m_module; - VulkanApi m_api; - - VulkanDeviceQueue m_deviceQueue; - VulkanSwapChain m_swapChain; - - VkRenderPass m_renderPass = VK_NULL_HANDLE; - - int m_swapChainImageIndex = -1; - - float m_clearColor[4] = { 0, 0, 0, 0 }; - - Desc m_desc; -}; - -/* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! VkRenderer::Buffer !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
*/ - -Result VKRenderer::Buffer::init(const VulkanApi& api, size_t bufferSize, VkBufferUsageFlags usage, VkMemoryPropertyFlags reqMemoryProperties) -{ - assert(!isInitialized()); - - m_api = &api; - m_memory = VK_NULL_HANDLE; - m_buffer = VK_NULL_HANDLE; - - VkBufferCreateInfo bufferCreateInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO }; - bufferCreateInfo.size = bufferSize; - bufferCreateInfo.usage = usage; - - SLANG_VK_CHECK(api.vkCreateBuffer(api.m_device, &bufferCreateInfo, nullptr, &m_buffer)); - - VkMemoryRequirements memoryReqs = {}; - api.vkGetBufferMemoryRequirements(api.m_device, m_buffer, &memoryReqs); - - int memoryTypeIndex = api.findMemoryTypeIndex(memoryReqs.memoryTypeBits, reqMemoryProperties); - assert(memoryTypeIndex >= 0); - - VkMemoryPropertyFlags actualMemoryProperites = api.m_deviceMemoryProperties.memoryTypes[memoryTypeIndex].propertyFlags; - - VkMemoryAllocateInfo allocateInfo = { VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO }; - allocateInfo.allocationSize = memoryReqs.size; - allocateInfo.memoryTypeIndex = memoryTypeIndex; - - SLANG_VK_CHECK(api.vkAllocateMemory(api.m_device, &allocateInfo, nullptr, &m_memory)); - SLANG_VK_CHECK(api.vkBindBufferMemory(api.m_device, m_buffer, m_memory, 0)); - - return SLANG_OK; -} - -/* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! VkRenderer !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
*/ - -bool VKRenderer::_isEqual(const Pipeline& pipeline) const -{ - return - pipeline.m_bindingState == m_currentBindingState && - pipeline.m_primitiveTopology == m_primitiveTopology && - pipeline.m_inputLayout == m_currentInputLayout && - pipeline.m_shaderProgram == m_currentProgram; -} - -VKRenderer::Pipeline* VKRenderer::_getPipeline() -{ - if (m_currentPipeline && _isEqual(*m_currentPipeline)) - { - return m_currentPipeline; - } - - // Look for a match in the cache - for (int i = 0; i < int(m_pipelineCache.Count()); ++i) - { - Pipeline* pipeline = m_pipelineCache[i]; - if (_isEqual(*pipeline)) - { - m_currentPipeline = pipeline; - return pipeline; - } - } - - RefPtr<Pipeline> pipeline; - SLANG_RETURN_NULL_ON_FAIL(_createPipeline(pipeline)); - m_pipelineCache.Add(pipeline); - m_currentPipeline = pipeline; - return pipeline; -} - -Slang::Result VKRenderer::_createPipeline(RefPtr<Pipeline>& pipelineOut) -{ - RefPtr<Pipeline> pipeline(new Pipeline(m_api)); - - // Initialize the state - pipeline->m_primitiveTopology = m_primitiveTopology; - pipeline->m_bindingState = m_currentBindingState; - pipeline->m_shaderProgram = m_currentProgram; - pipeline->m_inputLayout = m_currentInputLayout; - - // Must be equal at this point if all the items are correctly set in pipeline - assert(_isEqual(*pipeline)); - - // First create a pipeline layout based on what is bound - - const auto& srcDetails = m_currentBindingState->m_bindingDetails; - const auto& srcBindings = m_currentBindingState->getDesc().m_bindings; - - const int numBindings = int(srcBindings.Count()); - - int numBuffers = 0; - int numImages = 0; - - int numDescriptorByType[VK_DESCRIPTOR_TYPE_RANGE_SIZE] = { 0, }; - - Slang::List<VkDescriptorSetLayoutBinding> dstBindings; - for (int i = 0; i < numBindings; ++i) - { - const auto& srcDetail = srcDetails[i]; - const auto& srcBinding = srcBindings[i]; - - VkDescriptorSetLayoutBinding dstBinding = {}; - - dstBinding.descriptorCount = 1; - - switch (srcBinding.bindingType) - { - case BindingType::Buffer: - { - BufferResourceImpl* 
bufferResource = static_cast(srcBinding.resource.Ptr()); - const BufferResource::Desc& bufferResourceDesc = bufferResource->getDesc(); - - if (bufferResourceDesc.bindFlags & Resource::BindFlag::UnorderedAccess) - { - dstBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; - dstBinding.stageFlags = VK_SHADER_STAGE_ALL; - dstBindings.Add(dstBinding); - - numDescriptorByType[dstBinding.descriptorType] ++; - numBuffers++; - } - else if (bufferResourceDesc.bindFlags & Resource::BindFlag::ConstantBuffer) - { - dstBinding.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; - dstBinding.stageFlags = VK_SHADER_STAGE_ALL; - dstBindings.Add(dstBinding); - - numDescriptorByType[dstBinding.descriptorType] ++; - numBuffers++; - } - break; - } - case BindingType::Texture: - { - dstBinding.descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE; - dstBinding.stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT; - dstBindings.Add(dstBinding); - - numDescriptorByType[dstBinding.descriptorType] ++; - numImages++; - break; - } - case BindingType::Sampler: - { - dstBinding.descriptorType = VK_DESCRIPTOR_TYPE_SAMPLER; - dstBinding.stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT; - dstBindings.Add(dstBinding); - - numDescriptorByType[dstBinding.descriptorType] ++; - numImages++; - break; - } - - case BindingType::CombinedTextureSampler: - { - dstBinding.descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER; - dstBinding.stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT; - dstBindings.Add(dstBinding); - - numDescriptorByType[dstBinding.descriptorType] ++; - numImages++; - break; - } - default: - { - assert(!"Unhandled type"); - return SLANG_FAIL; - } - } - } - - // Create a descriptor pool for allocating sets - { -#if 0 - VkDescriptorPoolSize poolSizes[] = - { - { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, 128 }, - { VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, 128 }, - { VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE, 128 }, - }; -#endif - - List poolSizes; - for (int i = 0; i < SLANG_COUNT_OF(numDescriptorByType); ++i) - { - int 
numDescriptors = numDescriptorByType[i]; - if (numDescriptors > 0) - { - const VkDescriptorPoolSize poolSize = { VkDescriptorType(i), uint32_t(numDescriptors) }; - poolSizes.Add(poolSize); - } - } - VkDescriptorPoolCreateInfo descriptorPoolInfo = { VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO }; - - descriptorPoolInfo.maxSets = 128; // TODO: actually pick a size. - descriptorPoolInfo.poolSizeCount = uint32_t(poolSizes.Count()); - descriptorPoolInfo.pPoolSizes = poolSizes.Buffer(); - - SLANG_VK_CHECK(m_api.vkCreateDescriptorPool(m_device, &descriptorPoolInfo, nullptr, &pipeline->m_descriptorPool)); - } - - // Create the layout - { - VkDescriptorSetLayoutCreateInfo descriptorSetLayoutInfo = { VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO }; - descriptorSetLayoutInfo.bindingCount = uint32_t(dstBindings.Count()); - descriptorSetLayoutInfo.pBindings = dstBindings.Buffer(); - - SLANG_VK_CHECK(m_api.vkCreateDescriptorSetLayout(m_device, &descriptorSetLayoutInfo, nullptr, &pipeline->m_descriptorSetLayout)); - } - - // Create a descriptor set based on our layout - { - VkDescriptorSetAllocateInfo descriptorSetAllocInfo = { VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO }; - descriptorSetAllocInfo.descriptorPool = pipeline->m_descriptorPool; - descriptorSetAllocInfo.descriptorSetCount = 1; - descriptorSetAllocInfo.pSetLayouts = &pipeline->m_descriptorSetLayout; - - SLANG_VK_CHECK(m_api.vkAllocateDescriptorSets(m_device, &descriptorSetAllocInfo, &pipeline->m_descriptorSet)); - } - - // Fill in the descriptor set, using our binding information - - List imageInfos; - List bufferInfos; - List writes; - - // Make sure there is enough space... 
- imageInfos.Reserve(numImages); - bufferInfos.Reserve(numBuffers); - - int elementIndex = 0; - - for (int i = 0; i < numBindings; ++i) - { - const auto& srcDetail = srcDetails[i]; - const auto& srcBinding = srcBindings[i]; - - const int bindingIndex = srcBinding.registerRange.getSingleIndex(); - - VkWriteDescriptorSet writeInfo = { VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET }; - writeInfo.descriptorCount = 1; - writeInfo.dstSet = pipeline->m_descriptorSet; - writeInfo.dstBinding = bindingIndex; - writeInfo.dstArrayElement = 0; - - switch (srcBinding.bindingType) - { - case BindingType::Buffer: - { - assert(srcBinding.resource && srcBinding.resource->isBuffer()); - BufferResourceImpl* bufferResource = static_cast(srcBinding.resource.Ptr()); - const BufferResource::Desc& bufferResourceDesc = bufferResource->getDesc(); - - { - VkDescriptorBufferInfo bufferInfo; - bufferInfo.buffer = bufferResource->m_buffer.m_buffer; - bufferInfo.offset = 0; - bufferInfo.range = bufferResourceDesc.sizeInBytes; - - bufferInfos.Add(bufferInfo); - } - - writeInfo.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; - if (bufferResource->m_initialUsage == Resource::Usage::UnorderedAccess) - { - writeInfo.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; - } - else if (bufferResource->m_initialUsage == Resource::Usage::ConstantBuffer) - { - writeInfo.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; - } - - writeInfo.pBufferInfo = &bufferInfos.Last(); - - writes.Add(writeInfo); - break; - } - case BindingType::Texture: - { - assert(srcBinding.resource && srcBinding.resource->isTexture()); - - TextureResourceImpl* textureResource = static_cast(srcBinding.resource.Ptr()); - const TextureResource::Desc& textureResourceDesc = textureResource->getDesc(); - - { - VkDescriptorImageInfo imageInfo = {}; - imageInfo.imageView = srcDetail.m_srv; - imageInfo.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL; - imageInfos.Add(imageInfo); - } - - writeInfo.descriptorType = 
VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE; - writeInfo.pImageInfo = &imageInfos.Last(); - - writes.Add(writeInfo); - break; - } - case BindingType::Sampler: - { - { - VkDescriptorImageInfo imageInfo = {}; - imageInfo.sampler = srcDetail.m_sampler; - //imageInfo.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL; - imageInfos.Add(imageInfo); - } - - writeInfo.descriptorType = VK_DESCRIPTOR_TYPE_SAMPLER; - writeInfo.pImageInfo = &imageInfos.Last(); - - writes.Add(writeInfo); - break; - } - default: - { - assert(!"Binding not currently handled"); - return SLANG_FAIL; - } - } - } - - assert(imageInfos.Count() == numImages); - assert(bufferInfos.Count() == numBuffers); - - // Write into the descriptor set - { - m_api.vkUpdateDescriptorSets(m_device, uint32_t(writes.Count()), writes.Buffer(), 0, nullptr); - } - - // Create a pipeline layout based on our descriptor set layout(s) - - VkPipelineLayoutCreateInfo pipelineLayoutInfo = { VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO }; - pipelineLayoutInfo.setLayoutCount = 1; - pipelineLayoutInfo.pSetLayouts = &pipeline->m_descriptorSetLayout; - - SLANG_VK_CHECK(m_api.vkCreatePipelineLayout(m_device, &pipelineLayoutInfo, nullptr, &pipeline->m_pipelineLayout)); - - VkPipelineCache pipelineCache = VK_NULL_HANDLE; - - if (m_currentProgram->m_pipelineType == PipelineType::Compute) - { - // Then create a pipeline to use that layout - - VkComputePipelineCreateInfo computePipelineInfo = { VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO }; - computePipelineInfo.stage = m_currentProgram->m_compute; - computePipelineInfo.layout = pipeline->m_pipelineLayout; - - SLANG_VK_CHECK(m_api.vkCreateComputePipelines(m_device, pipelineCache, 1, &computePipelineInfo, nullptr, &pipeline->m_pipeline)); - } - else if (m_currentProgram->m_pipelineType == PipelineType::Graphics) - { - // Create the graphics pipeline - - const int width = m_swapChain.getWidth(); - const int height = m_swapChain.getHeight(); - - VkPipelineShaderStageCreateInfo shaderStages[] = { 
m_currentProgram->m_vertex, m_currentProgram->m_fragment }; - - // VertexBuffer/s - // Currently only handles one - - VkPipelineVertexInputStateCreateInfo vertexInputInfo = { VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO }; - vertexInputInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO; - vertexInputInfo.vertexBindingDescriptionCount = 0; - vertexInputInfo.vertexAttributeDescriptionCount = 0; - - VkVertexInputBindingDescription vertexInputBindingDescription; - - if (m_currentInputLayout) - { - vertexInputBindingDescription.binding = 0; - vertexInputBindingDescription.stride = m_currentInputLayout->m_vertexSize; - vertexInputBindingDescription.inputRate = VK_VERTEX_INPUT_RATE_VERTEX; - - const auto& srcAttributeDescs = m_currentInputLayout->m_vertexDescs; - - vertexInputInfo.vertexBindingDescriptionCount = 1; - vertexInputInfo.pVertexBindingDescriptions = &vertexInputBindingDescription; - - vertexInputInfo.vertexAttributeDescriptionCount = static_cast(srcAttributeDescs.Count()); - vertexInputInfo.pVertexAttributeDescriptions = srcAttributeDescs.Buffer(); - } - - // - - VkPipelineInputAssemblyStateCreateInfo inputAssembly = {}; - inputAssembly.sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO; - inputAssembly.topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST; - inputAssembly.primitiveRestartEnable = VK_FALSE; - - VkViewport viewport = {}; - viewport.x = 0.0f; - viewport.y = 0.0f; - viewport.width = (float)width; - viewport.height = (float)height; - viewport.minDepth = 0.0f; - viewport.maxDepth = 1.0f; - - VkRect2D scissor = {}; - scissor.offset = { 0, 0 }; - scissor.extent = { uint32_t(width), uint32_t(height) }; - - VkPipelineViewportStateCreateInfo viewportState = {}; - viewportState.sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO; - viewportState.viewportCount = 1; - viewportState.pViewports = &viewport; - viewportState.scissorCount = 1; - viewportState.pScissors = &scissor; - - 
VkPipelineRasterizationStateCreateInfo rasterizer = {}; - rasterizer.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO; - rasterizer.depthClampEnable = VK_FALSE; - rasterizer.rasterizerDiscardEnable = VK_FALSE; - rasterizer.polygonMode = VK_POLYGON_MODE_FILL; - rasterizer.lineWidth = 1.0f; - rasterizer.cullMode = VK_CULL_MODE_NONE; - rasterizer.frontFace = VK_FRONT_FACE_CLOCKWISE; - rasterizer.depthBiasEnable = VK_FALSE; - - VkPipelineMultisampleStateCreateInfo multisampling = {}; - multisampling.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO; - multisampling.sampleShadingEnable = VK_FALSE; - multisampling.rasterizationSamples = VK_SAMPLE_COUNT_1_BIT; - - VkPipelineColorBlendAttachmentState colorBlendAttachment = {}; - colorBlendAttachment.colorWriteMask = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT | VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT; - colorBlendAttachment.blendEnable = VK_FALSE; - - VkPipelineColorBlendStateCreateInfo colorBlending = {}; - colorBlending.sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO; - colorBlending.logicOpEnable = VK_FALSE; - colorBlending.logicOp = VK_LOGIC_OP_COPY; - colorBlending.attachmentCount = 1; - colorBlending.pAttachments = &colorBlendAttachment; - colorBlending.blendConstants[0] = 0.0f; - colorBlending.blendConstants[1] = 0.0f; - colorBlending.blendConstants[2] = 0.0f; - colorBlending.blendConstants[3] = 0.0f; - - VkGraphicsPipelineCreateInfo pipelineInfo = { VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO }; - - pipelineInfo.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO; - pipelineInfo.stageCount = 2; - pipelineInfo.pStages = shaderStages; - pipelineInfo.pVertexInputState = &vertexInputInfo; - pipelineInfo.pInputAssemblyState = &inputAssembly; - pipelineInfo.pViewportState = &viewportState; - pipelineInfo.pRasterizationState = &rasterizer; - pipelineInfo.pMultisampleState = &multisampling; - pipelineInfo.pColorBlendState = &colorBlending; - 
pipelineInfo.layout = pipeline->m_pipelineLayout; - pipelineInfo.renderPass = m_renderPass; - pipelineInfo.subpass = 0; - pipelineInfo.basePipelineHandle = VK_NULL_HANDLE; - - SLANG_VK_CHECK(m_api.vkCreateGraphicsPipelines(m_device, pipelineCache, 1, &pipelineInfo, nullptr, &pipeline->m_pipeline)); - } - else - { - assert(!"Unhandled program type"); - return SLANG_FAIL; - } - - pipelineOut = pipeline; - return SLANG_OK; -} - -Result VKRenderer::_beginPass() -{ - if (m_swapChainImageIndex < 0) - { - return SLANG_FAIL; - } - - const int numRenderTargets = 1; - - const VulkanSwapChain::Image& image = m_swapChain.getImages()[m_swapChainImageIndex]; - - int numAttachments = 0; - - // Start render pass - VkClearValue clearValues[kMaxAttachments]; - clearValues[numAttachments++] = VkClearValue{ m_clearColor[0], m_clearColor[1], m_clearColor[2], m_clearColor[3] }; - - bool hasDepthBuffer = false; - if (hasDepthBuffer) - { - VkClearValue& clearValue = clearValues[numAttachments++]; - - clearValue.depthStencil.depth = 1.0f; - clearValue.depthStencil.stencil = 0; - } - - const int width = m_swapChain.getWidth(); - const int height = m_swapChain.getHeight(); - - VkCommandBuffer cmdBuffer = m_deviceQueue.getCommandBuffer(); - - VkRenderPassBeginInfo renderPassBegin = {}; - renderPassBegin.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO; - renderPassBegin.renderPass = m_renderPass; - renderPassBegin.framebuffer = image.m_frameBuffer; - renderPassBegin.renderArea.offset.x = 0; - renderPassBegin.renderArea.offset.y = 0; - renderPassBegin.renderArea.extent.width = width; - renderPassBegin.renderArea.extent.height = height; - renderPassBegin.clearValueCount = numAttachments; - renderPassBegin.pClearValues = clearValues; - - m_api.vkCmdBeginRenderPass(cmdBuffer, &renderPassBegin, VK_SUBPASS_CONTENTS_INLINE); - - // Set up scissor and viewport - { - VkRect2D rects[kMaxRenderTargets] = {}; - VkViewport viewports[kMaxRenderTargets] = {}; - for (int i = 0; i < numRenderTargets; ++i) - { 
- rects[i] = VkRect2D{ 0, 0, uint32_t(width), uint32_t(height) }; - - VkViewport& dstViewport = viewports[i]; - - dstViewport.x = 0.0f; - dstViewport.y = 0.0f; - dstViewport.width = float(width); - dstViewport.height = float(height); - dstViewport.minDepth = 0.0f; - dstViewport.maxDepth = 1.0f; - } - - m_api.vkCmdSetScissor(cmdBuffer, 0, numRenderTargets, rects); - m_api.vkCmdSetViewport(cmdBuffer, 0, numRenderTargets, viewports); - } - - return SLANG_OK; -} - -void VKRenderer::_endPass() -{ - VkCommandBuffer cmdBuffer = m_deviceQueue.getCommandBuffer(); - m_api.vkCmdEndRenderPass(cmdBuffer); -} - -void VKRenderer::_beginRender() -{ - m_swapChainImageIndex = m_swapChain.nextFrontImageIndex(); - - if (m_swapChainImageIndex < 0) - { - return; - } -} - -void VKRenderer::_endRender() -{ - m_deviceQueue.flush(); -} - -Renderer* createVKRenderer() -{ - return new VKRenderer; -} - -VKRenderer::~VKRenderer() -{ - if (m_renderPass != VK_NULL_HANDLE) - { - m_api.vkDestroyRenderPass(m_device, m_renderPass, nullptr); - m_renderPass = VK_NULL_HANDLE; - } -} - - -VkBool32 VKRenderer::handleDebugMessage(VkDebugReportFlagsEXT flags, VkDebugReportObjectTypeEXT objType, uint64_t srcObject, - size_t location, int32_t msgCode, const char* pLayerPrefix, const char* pMsg) -{ - char const* severity = "message"; - if (flags & VK_DEBUG_REPORT_WARNING_BIT_EXT) - severity = "warning"; - if (flags & VK_DEBUG_REPORT_ERROR_BIT_EXT) - severity = "error"; - - // pMsg can be really big (it can be assembler dump for example) - // Use a dynamic buffer to store - size_t bufferSize = strlen(pMsg) + 1 + 1024; - List bufferArray; - bufferArray.SetSize(bufferSize); - char* buffer = bufferArray.Buffer(); - - sprintf_s(buffer, - bufferSize, - "%s: %s %d: %s\n", - pLayerPrefix, - severity, - msgCode, - pMsg); - - fprintf(stderr, "%s", buffer); - fflush(stderr); - - OutputDebugStringA(buffer); - - return VK_FALSE; -} - -/* static */VKAPI_ATTR VkBool32 VKAPI_CALL 
VKRenderer::debugMessageCallback(VkDebugReportFlagsEXT flags, VkDebugReportObjectTypeEXT objType, uint64_t srcObject, - size_t location, int32_t msgCode, const char* pLayerPrefix, const char* pMsg, void* pUserData) -{ - return ((VKRenderer*)pUserData)->handleDebugMessage(flags, objType, srcObject, location, msgCode, pLayerPrefix, pMsg); -} - -VkPipelineShaderStageCreateInfo VKRenderer::compileEntryPoint( - ShaderProgram::KernelDesc const& kernelDesc, - VkShaderStageFlagBits stage, - List& bufferOut) -{ - char const* dataBegin = (char const*) kernelDesc.codeBegin; - char const* dataEnd = (char const*) kernelDesc.codeEnd; - - // We need to make a copy of the code, since the Slang compiler - // will free the memory after a compile request is closed. - size_t codeSize = dataEnd - dataBegin; - - bufferOut.InsertRange(0, dataBegin, codeSize); - - char* codeBegin = bufferOut.Buffer(); - - VkShaderModuleCreateInfo moduleCreateInfo = { VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO }; - moduleCreateInfo.pCode = (uint32_t*)codeBegin; - moduleCreateInfo.codeSize = codeSize; - - VkShaderModule module; - SLANG_VK_CHECK(m_api.vkCreateShaderModule(m_device, &moduleCreateInfo, nullptr, &module)); - - VkPipelineShaderStageCreateInfo shaderStageCreateInfo = { VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO }; - shaderStageCreateInfo.stage = stage; - - shaderStageCreateInfo.module = module; - shaderStageCreateInfo.pName = "main"; - - return shaderStageCreateInfo; -} - -// !!!!!!!!!!!!!!!!!!!!!!!!!!!! Renderer interface !!!!!!!!!!!!!!!!!!!!!!!!!! 
- -SlangResult VKRenderer::initialize(const Desc& desc, void* inWindowHandle) -{ - SLANG_RETURN_ON_FAIL(m_module.init()); - SLANG_RETURN_ON_FAIL(m_api.initGlobalProcs(m_module)); - - m_desc = desc; - - VkApplicationInfo applicationInfo = { VK_STRUCTURE_TYPE_APPLICATION_INFO }; - applicationInfo.pApplicationName = "slang-render-test"; - applicationInfo.pEngineName = "slang-render-test"; - applicationInfo.apiVersion = VK_API_VERSION_1_0; - - char const* instanceExtensions[] = - { - VK_KHR_SURFACE_EXTENSION_NAME, - -#if SLANG_WINDOWS_FAMILY - VK_KHR_WIN32_SURFACE_EXTENSION_NAME, -#else - VK_KHR_XLIB_SURFACE_EXTENSION_NAME -#endif - -#if ENABLE_VALIDATION_LAYER - VK_EXT_DEBUG_REPORT_EXTENSION_NAME, -#endif - }; - - VkInstance instance = VK_NULL_HANDLE; - - VkInstanceCreateInfo instanceCreateInfo = { VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO }; - instanceCreateInfo.pApplicationInfo = &applicationInfo; - - instanceCreateInfo.enabledExtensionCount = SLANG_COUNT_OF(instanceExtensions); - instanceCreateInfo.ppEnabledExtensionNames = &instanceExtensions[0]; - -#if ENABLE_VALIDATION_LAYER - const char* layerNames[] = { "VK_LAYER_LUNARG_standard_validation" }; - instanceCreateInfo.enabledLayerCount = SLANG_COUNT_OF(layerNames); - instanceCreateInfo.ppEnabledLayerNames = layerNames; -#endif - - SLANG_VK_RETURN_ON_FAIL(m_api.vkCreateInstance(&instanceCreateInfo, nullptr, &instance)); - SLANG_RETURN_ON_FAIL(m_api.initInstanceProcs(instance)); - -#if ENABLE_VALIDATION_LAYER - VkDebugReportFlagsEXT debugFlags = VK_DEBUG_REPORT_ERROR_BIT_EXT | VK_DEBUG_REPORT_WARNING_BIT_EXT; - - VkDebugReportCallbackCreateInfoEXT debugCreateInfo = { VK_STRUCTURE_TYPE_DEBUG_REPORT_CREATE_INFO_EXT }; - debugCreateInfo.pfnCallback = &debugMessageCallback; - debugCreateInfo.pUserData = this; - debugCreateInfo.flags = debugFlags; - - SLANG_VK_RETURN_ON_FAIL(m_api.vkCreateDebugReportCallbackEXT(instance, &debugCreateInfo, nullptr, &m_debugReportCallback)); -#endif - - uint32_t numPhysicalDevices = 0; - 
SLANG_VK_RETURN_ON_FAIL(m_api.vkEnumeratePhysicalDevices(instance, &numPhysicalDevices, nullptr)); - - List physicalDevices; - physicalDevices.SetSize(numPhysicalDevices); - SLANG_VK_RETURN_ON_FAIL(m_api.vkEnumeratePhysicalDevices(instance, &numPhysicalDevices, physicalDevices.Buffer())); - - // TODO: allow override of selected device - uint32_t selectedDeviceIndex = 0; - - SLANG_RETURN_ON_FAIL(m_api.initPhysicalDevice(physicalDevices[selectedDeviceIndex])); - - int queueFamilyIndex = m_api.findQueue(VK_QUEUE_GRAPHICS_BIT | VK_QUEUE_COMPUTE_BIT); - assert(queueFamilyIndex >= 0); - - float queuePriority = 0.0f; - VkDeviceQueueCreateInfo queueCreateInfo = { VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO }; - queueCreateInfo.queueFamilyIndex = queueFamilyIndex; - queueCreateInfo.queueCount = 1; - queueCreateInfo.pQueuePriorities = &queuePriority; - - char const* const deviceExtensions[] = - { - VK_KHR_SWAPCHAIN_EXTENSION_NAME, - }; - - VkDeviceCreateInfo deviceCreateInfo = { VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO }; - deviceCreateInfo.queueCreateInfoCount = 1; - deviceCreateInfo.pQueueCreateInfos = &queueCreateInfo; - deviceCreateInfo.pEnabledFeatures = &m_api.m_deviceFeatures; - - deviceCreateInfo.enabledExtensionCount = SLANG_COUNT_OF(deviceExtensions); - deviceCreateInfo.ppEnabledExtensionNames = deviceExtensions; - - SLANG_VK_RETURN_ON_FAIL(m_api.vkCreateDevice(m_api.m_physicalDevice, &deviceCreateInfo, nullptr, &m_device)); - SLANG_RETURN_ON_FAIL(m_api.initDeviceProcs(m_device)); - - { - VkQueue queue; - m_api.vkGetDeviceQueue(m_device, queueFamilyIndex, 0, &queue); - SLANG_RETURN_ON_FAIL(m_deviceQueue.init(m_api, queue, queueFamilyIndex)); - } - - // set up swap chain - - { - VulkanSwapChain::Desc desc; - VulkanSwapChain::PlatformDesc* platformDesc = nullptr; - - desc.init(); - desc.m_format = Format::RGBA_Unorm_UInt8; - -#if SLANG_WINDOWS_FAMILY - VulkanSwapChain::WinPlatformDesc winPlatformDesc; - winPlatformDesc.m_hinstance = ::GetModuleHandle(nullptr); - 
winPlatformDesc.m_hwnd = (HWND)inWindowHandle; - platformDesc = &winPlatformDesc; -#endif - - SLANG_RETURN_ON_FAIL(m_swapChain.init(&m_deviceQueue, desc, platformDesc)); - } - - // depth/stencil? - - // render pass? - - { - const int numRenderTargets = 1; - bool shouldClear = true; - bool shouldClearDepth = false; - bool shouldClearStencil = false; - bool hasDepthBuffer = false; - - Format depthFormat = Format::Unknown; - VkFormat colorFormat = m_swapChain.getVkFormat(); - - int numAttachments = 0; - // We need extra space if we have depth buffer - VkAttachmentDescription attachmentDesc[kMaxRenderTargets + 1] = {}; - for (int i = 0; i < numRenderTargets; ++i) - { - VkAttachmentDescription& dst = attachmentDesc[numAttachments ++]; - - dst.flags = 0; - dst.format = colorFormat; - dst.samples = VK_SAMPLE_COUNT_1_BIT; - dst.loadOp = shouldClear ? VK_ATTACHMENT_LOAD_OP_CLEAR : VK_ATTACHMENT_LOAD_OP_LOAD; - dst.storeOp = VK_ATTACHMENT_STORE_OP_STORE; - dst.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE; - dst.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE; - dst.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED; // VK_IMAGE_LAYOUT_PRESENT_SRC_KHR; - dst.finalLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR; - } - if (hasDepthBuffer) - { - VkAttachmentDescription& dst = attachmentDesc[numAttachments++]; - - dst.flags = 0; - dst.format = VulkanUtil::getVkFormat(depthFormat); - dst.samples = VK_SAMPLE_COUNT_1_BIT; - dst.loadOp = shouldClearDepth ? VK_ATTACHMENT_LOAD_OP_CLEAR : VK_ATTACHMENT_LOAD_OP_LOAD; - dst.storeOp = VK_ATTACHMENT_STORE_OP_STORE; - dst.stencilLoadOp = shouldClearStencil ? 
VK_ATTACHMENT_LOAD_OP_CLEAR : VK_ATTACHMENT_LOAD_OP_LOAD; - dst.stencilStoreOp = VK_ATTACHMENT_STORE_OP_STORE; - dst.initialLayout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL; - dst.finalLayout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL; - } - - VkAttachmentReference colorAttachments[kMaxRenderTargets] = {}; - for (int i = 0; i < numRenderTargets; ++i) - { - VkAttachmentReference& dst = colorAttachments[i]; - dst.attachment = i; - dst.layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL; - } - - VkAttachmentReference depthAttachment = {}; - depthAttachment.attachment = numRenderTargets; - depthAttachment.layout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL; - - VkSubpassDescription subpassDesc = {}; - subpassDesc.flags = 0; - subpassDesc.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS; - subpassDesc.inputAttachmentCount = 0u; - subpassDesc.pInputAttachments = nullptr; - subpassDesc.colorAttachmentCount = numRenderTargets; - subpassDesc.pColorAttachments = colorAttachments; - subpassDesc.pResolveAttachments = nullptr; - subpassDesc.pDepthStencilAttachment = hasDepthBuffer ? 
&depthAttachment : nullptr; - subpassDesc.preserveAttachmentCount = 0u; - subpassDesc.pPreserveAttachments = nullptr; - - VkRenderPassCreateInfo renderPassCreateInfo = {}; - renderPassCreateInfo.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO; - renderPassCreateInfo.attachmentCount = numAttachments; - renderPassCreateInfo.pAttachments = attachmentDesc; - renderPassCreateInfo.subpassCount = 1; - renderPassCreateInfo.pSubpasses = &subpassDesc; - SLANG_VK_RETURN_ON_FAIL(m_api.vkCreateRenderPass(m_device, &renderPassCreateInfo, nullptr, &m_renderPass)); - } - - // frame buffer - SLANG_RETURN_ON_FAIL(m_swapChain.createFrameBuffers(m_renderPass)); - - _beginRender(); - - return SLANG_OK; -} - -void VKRenderer::submitGpuWork() -{ - m_deviceQueue.flush(); -} - -void VKRenderer::waitForGpu() -{ - m_deviceQueue.flushAndWait(); -} - -void VKRenderer::setClearColor(const float color[4]) -{ - for (int ii = 0; ii < 4; ++ii) - m_clearColor[ii] = color[ii]; -} - -void VKRenderer::clearFrame() -{ -} - -void VKRenderer::presentFrame() -{ - _endRender(); - - const bool vsync = true; - m_swapChain.present(vsync); - - _beginRender(); -} - -SlangResult VKRenderer::captureScreenSurface(Surface& surfaceOut) -{ - return SLANG_FAIL; -} - -static VkBufferUsageFlagBits _calcBufferUsageFlags(Resource::BindFlag::Enum bind) -{ - typedef Resource::BindFlag BindFlag; - - switch (bind) - { - case BindFlag::VertexBuffer: return VK_BUFFER_USAGE_VERTEX_BUFFER_BIT; - case BindFlag::IndexBuffer: return VK_BUFFER_USAGE_INDEX_BUFFER_BIT; - case BindFlag::ConstantBuffer: return VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT; - case BindFlag::StreamOutput: - case BindFlag::RenderTarget: - case BindFlag::DepthStencil: - { - assert(!"Not supported yet"); - return VkBufferUsageFlagBits(0); - } - case BindFlag::UnorderedAccess: return VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT; - case BindFlag::PixelShaderResource: return VK_BUFFER_USAGE_STORAGE_BUFFER_BIT; - case BindFlag::NonPixelShaderResource: return 
VK_BUFFER_USAGE_STORAGE_BUFFER_BIT; - default: return VkBufferUsageFlagBits(0); - } -} - -static VkBufferUsageFlagBits _calcBufferUsageFlags(int bindFlags) -{ - int dstFlags = 0; - while (bindFlags) - { - int lsb = bindFlags & -bindFlags; - dstFlags |= _calcBufferUsageFlags(Resource::BindFlag::Enum(lsb)); - bindFlags &= ~lsb; - } - return VkBufferUsageFlagBits(dstFlags); -} - -static VkBufferUsageFlags _calcBufferUsageFlags(int bindFlags, int cpuAccessFlags, const void* initData) -{ - VkBufferUsageFlags usage = _calcBufferUsageFlags(bindFlags); - - if (cpuAccessFlags & Resource::AccessFlag::Read) - { - // If it can be read from, set this - usage |= VK_BUFFER_USAGE_TRANSFER_SRC_BIT; - } - if ((cpuAccessFlags & Resource::AccessFlag::Write) || initData) - { - usage |= VK_BUFFER_USAGE_TRANSFER_DST_BIT; - } - - return usage; -} - -static VkImageUsageFlagBits _calcImageUsageFlags(Resource::BindFlag::Enum bind) -{ - typedef Resource::BindFlag BindFlag; - - switch (bind) - { - case BindFlag::RenderTarget: return VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT; - case BindFlag::DepthStencil: return VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT; - case BindFlag::NonPixelShaderResource: - case BindFlag::PixelShaderResource: - { - // Ignore - return VkImageUsageFlagBits(0); - } - default: - { - assert(!"Unsupported"); - return VkImageUsageFlagBits(0); - } - } -} - -static VkImageUsageFlagBits _calcImageUsageFlags(int bindFlags) -{ - int dstFlags = 0; - while (bindFlags) - { - int lsb = bindFlags & -bindFlags; - dstFlags |= _calcImageUsageFlags(Resource::BindFlag::Enum(lsb)); - bindFlags &= ~lsb; - } - return VkImageUsageFlagBits(dstFlags); -} - -static VkImageUsageFlags _calcImageUsageFlags(int bindFlags, int cpuAccessFlags, const void* initData) -{ - VkImageUsageFlags usage = _calcImageUsageFlags(bindFlags); - - usage |= VK_IMAGE_USAGE_SAMPLED_BIT; - - if (cpuAccessFlags & Resource::AccessFlag::Read) - { - // If it can be read from, set this - usage |= VK_IMAGE_USAGE_TRANSFER_SRC_BIT; - } 
- if ((cpuAccessFlags & Resource::AccessFlag::Write) || initData) - { - usage |= VK_IMAGE_USAGE_TRANSFER_DST_BIT; - } - - return usage; -} - -void VKRenderer::_transitionImageLayout(VkImage image, VkFormat format, const TextureResource::Desc& desc, VkImageLayout oldLayout, VkImageLayout newLayout) -{ - VkImageMemoryBarrier barrier = {}; - barrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER; - barrier.oldLayout = oldLayout; - barrier.newLayout = newLayout; - barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED; - barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED; - barrier.image = image; - barrier.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT; - barrier.subresourceRange.baseMipLevel = 0; - barrier.subresourceRange.levelCount = desc.numMipLevels; - barrier.subresourceRange.baseArrayLayer = 0; - barrier.subresourceRange.layerCount = 1; - - VkPipelineStageFlags sourceStage; - VkPipelineStageFlags destinationStage; - - if (oldLayout == VK_IMAGE_LAYOUT_UNDEFINED && newLayout == VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL) - { - barrier.srcAccessMask = 0; - barrier.dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT; - - sourceStage = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT; - destinationStage = VK_PIPELINE_STAGE_TRANSFER_BIT; - } - else if (oldLayout == VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL && newLayout == VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL) - { - barrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT; - barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT; - - sourceStage = VK_PIPELINE_STAGE_TRANSFER_BIT; - destinationStage = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT; - } - else - { - assert(!"unsupported layout transition!"); - return; - } - - VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); - - m_api.vkCmdPipelineBarrier(commandBuffer, sourceStage, destinationStage, 0, 0, nullptr, 0, nullptr, 1, &barrier); -} - -TextureResource* VKRenderer::createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& descIn, const TextureResource::Data* 
initData) -{ - TextureResource::Desc desc(descIn); - desc.setDefaults(initialUsage); - - const VkFormat format = VulkanUtil::getVkFormat(desc.format); - if (format == VK_FORMAT_UNDEFINED) - { - assert(!"Unhandled image format"); - return nullptr; - } - - const int arraySize = desc.calcEffectiveArraySize(); - - RefPtr texture(new TextureResourceImpl(desc, initialUsage, &m_api)); - - // Create the image - { - VkImageCreateInfo imageInfo = {VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO}; - - switch (desc.type) - { - case Resource::Type::Texture1D: - { - imageInfo.imageType = VK_IMAGE_TYPE_1D; - imageInfo.extent = VkExtent3D{ uint32_t(descIn.size.width), 1, 1 }; - break; - } - case Resource::Type::Texture2D: - { - imageInfo.imageType = VK_IMAGE_TYPE_2D; - imageInfo.extent = VkExtent3D{ uint32_t(descIn.size.width), uint32_t(descIn.size.height), 1 }; - break; - } - case Resource::Type::TextureCube: - { - imageInfo.imageType = VK_IMAGE_TYPE_2D; - imageInfo.extent = VkExtent3D{ uint32_t(descIn.size.width), uint32_t(descIn.size.height), 1 }; - break; - } - case Resource::Type::Texture3D: - { - // Can't have an array and 3d texture - assert(desc.arraySize <= 1); - - imageInfo.imageType = VK_IMAGE_TYPE_3D; - imageInfo.extent = VkExtent3D{ uint32_t(descIn.size.width), uint32_t(descIn.size.height), uint32_t(descIn.size.depth) }; - break; - } - default: - { - assert(!"Unhandled type"); - return nullptr; - } - } - - imageInfo.mipLevels = desc.numMipLevels; - imageInfo.arrayLayers = arraySize; - - imageInfo.format = format; - - imageInfo.tiling = VK_IMAGE_TILING_OPTIMAL; - imageInfo.usage = _calcImageUsageFlags(desc.bindFlags, desc.cpuAccessFlags, initData); - imageInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE; - - imageInfo.samples = VK_SAMPLE_COUNT_1_BIT; - imageInfo.flags = 0; // Optional - - SLANG_VK_RETURN_NULL_ON_FAIL(m_api.vkCreateImage(m_device, &imageInfo, nullptr, &texture->m_image)); - } - - VkMemoryRequirements memRequirements; - m_api.vkGetImageMemoryRequirements(m_device, 
texture->m_image, &memRequirements); - - // Allocate the memory - { - VkMemoryPropertyFlags reqMemoryProperties = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT; - - VkMemoryAllocateInfo allocInfo = {VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO}; - - int memoryTypeIndex = m_api.findMemoryTypeIndex(memRequirements.memoryTypeBits, reqMemoryProperties); - assert(memoryTypeIndex >= 0); - - VkMemoryPropertyFlags actualMemoryProperites = m_api.m_deviceMemoryProperties.memoryTypes[memoryTypeIndex].propertyFlags; - - allocInfo.allocationSize = memRequirements.size; - allocInfo.memoryTypeIndex = memoryTypeIndex; - - SLANG_VK_RETURN_NULL_ON_FAIL(m_api.vkAllocateMemory(m_device, &allocInfo, nullptr, &texture->m_imageMemory)); - } - - // Bind the memory to the image - m_api.vkBindImageMemory(m_device, texture->m_image, texture->m_imageMemory, 0); - - if (initData) - { - List mipSizes; - - VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); - - const int numMipMaps = desc.numMipLevels; - assert(initData->numMips == numMipMaps); - - // Calculate how large the buffer has to be - size_t bufferSize = 0; - // Calculate how large an array entry is - for (int j = 0; j < numMipMaps; ++j) - { - const TextureResource::Size mipSize = desc.size.calcMipSize(j); - - const int rowSizeInBytes = Surface::calcRowSize(desc.format, mipSize.width); - const int numRows = Surface::calcNumRows(desc.format, mipSize.height); - - mipSizes.Add(mipSize); - - bufferSize += (rowSizeInBytes * numRows) * mipSize.depth; - } - - - // Calculate the total size taking into account the array - bufferSize *= arraySize; - - Buffer uploadBuffer; - SLANG_RETURN_NULL_ON_FAIL(uploadBuffer.init(m_api, bufferSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)); - - assert(mipSizes.Count() == numMipMaps); - - // Copy into upload buffer - { - int subResourceIndex = 0; - - uint8_t* dstData; - m_api.vkMapMemory(m_device, uploadBuffer.m_memory, 0, bufferSize, 0, 
(void**)&dstData); - - for (int i = 0; i < arraySize; ++i) - { - for (int j = 0; j < int(mipSizes.Count()); ++j) - { - const auto& mipSize = mipSizes[j]; - - const ptrdiff_t srcRowStride = initData->mipRowStrides[j]; - const int dstRowSizeInBytes = Surface::calcRowSize(desc.format, mipSize.width); - const int numRows = Surface::calcNumRows(desc.format, mipSize.height); - - for (int k = 0; k < mipSize.depth; k++) - { - const uint8_t* srcData = (const uint8_t*)(initData->subResources[subResourceIndex]); - - for (int l = 0; l < numRows; l++) - { - ::memcpy(dstData, srcData, dstRowSizeInBytes); - - dstData += dstRowSizeInBytes; - srcData += srcRowStride; - } - - subResourceIndex++; - } - } - } - - m_api.vkUnmapMemory(m_device, uploadBuffer.m_memory); - } - - _transitionImageLayout(texture->m_image, format, texture->getDesc(), VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL); - - { - size_t srcOffset = 0; - for (int i = 0; i < arraySize; ++i) - { - for (int j = 0; j < int(mipSizes.Count()); ++j) - { - const auto& mipSize = mipSizes[j]; - - const int rowSizeInBytes = Surface::calcRowSize(desc.format, mipSize.width); - const int numRows = Surface::calcNumRows(desc.format, mipSize.height); - - // https://www.khronos.org/registry/vulkan/specs/1.1-extensions/man/html/VkBufferImageCopy.html - // bufferRowLength and bufferImageHeight specify the data in buffer memory as a subregion of a larger two- or three-dimensional image, - // and control the addressing calculations of data in buffer memory. If either of these values is zero, that aspect of the buffer memory - // is considered to be tightly packed according to the imageExtent. 
- - VkBufferImageCopy region = {}; - - region.bufferOffset = srcOffset; - region.bufferRowLength = 0; //rowSizeInBytes; - region.bufferImageHeight = 0; - - region.imageSubresource.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT; - region.imageSubresource.mipLevel = j; - region.imageSubresource.baseArrayLayer = i; - region.imageSubresource.layerCount = 1; - region.imageOffset = { 0, 0, 0 }; - region.imageExtent = { uint32_t(mipSize.width), uint32_t(mipSize.height), uint32_t(mipSize.depth) }; - - // Do the copy (do all depths in a single go) - m_api.vkCmdCopyBufferToImage(commandBuffer, uploadBuffer.m_buffer, texture->m_image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &region); - - // Next - srcOffset += rowSizeInBytes * numRows * mipSize.depth; - } - } - } - - _transitionImageLayout(texture->m_image, format, texture->getDesc(), VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL); - - m_deviceQueue.flushAndWait(); - } - - return texture.detach(); -} - -BufferResource* VKRenderer::createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& descIn, const void* initData) -{ - BufferResource::Desc desc(descIn); - desc.setDefaults(initialUsage); - - const size_t bufferSize = desc.sizeInBytes; - - VkMemoryPropertyFlags reqMemoryProperties = 0; - - VkBufferUsageFlags usage = _calcBufferUsageFlags(desc.bindFlags, desc.cpuAccessFlags, initData); - - switch (initialUsage) - { - case Resource::Usage::ConstantBuffer: - { - reqMemoryProperties = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT; - break; - } - default: break; - } - - RefPtr<BufferResourceImpl> buffer(new BufferResourceImpl(initialUsage, desc, this)); - SLANG_RETURN_NULL_ON_FAIL(buffer->m_buffer.init(m_api, desc.sizeInBytes, usage, reqMemoryProperties)); - - if ((desc.cpuAccessFlags & Resource::AccessFlag::Write) || initData) - { - SLANG_RETURN_NULL_ON_FAIL(buffer->m_uploadBuffer.init(m_api, bufferSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | 
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)); - } - - if (initData) - { - // TODO: only create staging buffer if the memory type - // used for the buffer doesn't let us fill things in - // directly. - // Copy into staging buffer - void* mappedData = nullptr; - SLANG_VK_CHECK(m_api.vkMapMemory(m_device, buffer->m_uploadBuffer.m_memory, 0, bufferSize, 0, &mappedData)); - ::memcpy(mappedData, initData, bufferSize); - m_api.vkUnmapMemory(m_device, buffer->m_uploadBuffer.m_memory); - - // Copy from staging buffer to real buffer - VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); - - VkBufferCopy copyInfo = {}; - copyInfo.size = bufferSize; - m_api.vkCmdCopyBuffer(commandBuffer, buffer->m_uploadBuffer.m_buffer, buffer->m_buffer.m_buffer, 1, &copyInfo); - - //flushCommandBuffer(commandBuffer); - } - - return buffer.detach(); -} - -InputLayout* VKRenderer::createInputLayout(const InputElementDesc* elements, UInt numElements) -{ - RefPtr<InputLayoutImpl> layout(new InputLayoutImpl); - - List<VkVertexInputAttributeDescription>& dstVertexDescs = layout->m_vertexDescs; - - size_t vertexSize = 0; - dstVertexDescs.SetSize(numElements); - - for (UInt i = 0; i < numElements; ++i) - { - const InputElementDesc& srcDesc = elements[i]; - VkVertexInputAttributeDescription& dstDesc = dstVertexDescs[i]; - - dstDesc.location = uint32_t(i); - dstDesc.binding = 0; - dstDesc.format = VulkanUtil::getVkFormat(srcDesc.format); - if (dstDesc.format == VK_FORMAT_UNDEFINED) - { - return nullptr; - } - - dstDesc.offset = uint32_t(srcDesc.offset); - - const size_t elementSize = RendererUtil::getFormatSize(srcDesc.format); - assert(elementSize > 0); - const size_t endElement = srcDesc.offset + elementSize; - - vertexSize = (vertexSize < endElement) ? 
endElement : vertexSize; - } - - // Work out the overall size - layout->m_vertexSize = int(vertexSize); - return layout.detach(); -} - -void* VKRenderer::map(BufferResource* bufferIn, MapFlavor flavor) -{ - BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(bufferIn); - assert(buffer->m_mapFlavor == MapFlavor::Unknown); - - // Make sure everything has completed before reading... - m_deviceQueue.flushAndWait(); - - const size_t bufferSize = buffer->getDesc().sizeInBytes; - - switch (flavor) - { - case MapFlavor::WriteDiscard: - case MapFlavor::HostWrite: - { - if (!buffer->m_uploadBuffer.isInitialized()) - { - return nullptr; - } - - void* mappedData = nullptr; - SLANG_VK_CHECK(m_api.vkMapMemory(m_device, buffer->m_uploadBuffer.m_memory, 0, bufferSize, 0, &mappedData)); - buffer->m_mapFlavor = flavor; - return mappedData; - } - case MapFlavor::HostRead: - { - // Make sure there is space in the read buffer - buffer->m_readBuffer.SetSize(bufferSize); - - // create staging buffer - Buffer staging; - - SLANG_RETURN_NULL_ON_FAIL(staging.init(m_api, bufferSize, VK_BUFFER_USAGE_TRANSFER_DST_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)); - - // Copy from real buffer to staging buffer - VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); - - VkBufferCopy copyInfo = {}; - copyInfo.size = bufferSize; - m_api.vkCmdCopyBuffer(commandBuffer, buffer->m_buffer.m_buffer, staging.m_buffer, 1, &copyInfo); - - m_deviceQueue.flushAndWait(); - - // Write out the data from the buffer - void* mappedData = nullptr; - SLANG_VK_CHECK(m_api.vkMapMemory(m_device, staging.m_memory, 0, bufferSize, 0, &mappedData)); - - ::memcpy(buffer->m_readBuffer.Buffer(), mappedData, bufferSize); - m_api.vkUnmapMemory(m_device, staging.m_memory); - - buffer->m_mapFlavor = flavor; - - return buffer->m_readBuffer.Buffer(); - } - default: - return nullptr; - } -} - -void VKRenderer::unmap(BufferResource* bufferIn) -{ - BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(bufferIn); - 
assert(buffer->m_mapFlavor != MapFlavor::Unknown); - - const size_t bufferSize = buffer->getDesc().sizeInBytes; - - switch (buffer->m_mapFlavor) - { - case MapFlavor::WriteDiscard: - case MapFlavor::HostWrite: - { - m_api.vkUnmapMemory(m_device, buffer->m_uploadBuffer.m_memory); - - // Copy from staging buffer to real buffer - VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); - - VkBufferCopy copyInfo = {}; - copyInfo.size = bufferSize; - m_api.vkCmdCopyBuffer(commandBuffer, buffer->m_uploadBuffer.m_buffer, buffer->m_buffer.m_buffer, 1, &copyInfo); - - // TODO: is this necessary? - //m_deviceQueue.flushAndWait(); - break; - } - default: break; - } - - // Mark as no longer mapped - buffer->m_mapFlavor = MapFlavor::Unknown; -} - -void VKRenderer::setInputLayout(InputLayout* inputLayout) -{ - m_currentInputLayout = static_cast<InputLayoutImpl*>(inputLayout); -} - -void VKRenderer::setPrimitiveTopology(PrimitiveTopology topology) -{ - m_primitiveTopology = VulkanUtil::getVkPrimitiveTopology(topology); -} - -void VKRenderer::setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) -{ - { - const UInt num = startSlot + slotCount; - if (num > m_boundVertexBuffers.Count()) - { - m_boundVertexBuffers.SetSize(num); - } - } - - for (UInt i = 0; i < slotCount; i++) - { - BufferResourceImpl* buffer = static_cast<BufferResourceImpl*>(buffers[i]); - if (buffer) - { - assert(buffer->m_initialUsage == Resource::Usage::VertexBuffer); - } - - BoundVertexBuffer& boundBuffer = m_boundVertexBuffers[startSlot + i]; - boundBuffer.m_buffer = buffer; - boundBuffer.m_stride = int(strides[i]); - boundBuffer.m_offset = int(offsets[i]); - } -} - -void VKRenderer::setShaderProgram(ShaderProgram* program) -{ - m_currentProgram = (ShaderProgramImpl*)program; -} - -void VKRenderer::draw(UInt vertexCount, UInt startVertex = 0) -{ - Pipeline* pipeline = _getPipeline(); - if (!pipeline || pipeline->m_shaderProgram->m_pipelineType != PipelineType::Graphics) - { - 
assert(!"Invalid render pipeline"); - return; - } - - SLANG_RETURN_VOID_ON_FAIL(_beginPass()); - - // Also create descriptor sets based on the given pipeline layout - VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); - - m_api.vkCmdBindPipeline(commandBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline->m_pipeline); - m_api.vkCmdBindDescriptorSets(commandBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline->m_pipelineLayout, - 0, 1, &pipeline->m_descriptorSet, 0, nullptr); - - // Bind the vertex buffer - if (m_boundVertexBuffers.Count() > 0 && m_boundVertexBuffers[0].m_buffer) - { - const BoundVertexBuffer& boundVertexBuffer = m_boundVertexBuffers[0]; - - VkBuffer vertexBuffers[] = { boundVertexBuffer.m_buffer->m_buffer.m_buffer }; - VkDeviceSize offsets[] = { VkDeviceSize(boundVertexBuffer.m_offset) }; - - m_api.vkCmdBindVertexBuffers(commandBuffer, 0, 1, vertexBuffers, offsets); - } - - m_api.vkCmdDraw(commandBuffer, static_cast<uint32_t>(vertexCount), 1, 0, 0); - - _endPass(); -} - -void VKRenderer::dispatchCompute(int x, int y, int z) -{ - Pipeline* pipeline = _getPipeline(); - if (!pipeline || pipeline->m_shaderProgram->m_pipelineType != PipelineType::Compute) - { - assert(!"Invalid render pipeline"); - return; - } - - // Also create descriptor sets based on the given pipeline layout - VkCommandBuffer commandBuffer = m_deviceQueue.getCommandBuffer(); - - m_api.vkCmdBindPipeline(commandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, pipeline->m_pipeline); - - m_api.vkCmdBindDescriptorSets(commandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, pipeline->m_pipelineLayout, - 0, 1, &pipeline->m_descriptorSet, 0, nullptr); - - m_api.vkCmdDispatch(commandBuffer, x, y, z); -} - -static VkImageViewType _calcImageViewType(TextureResource::Type type, const TextureResource::Desc& desc) -{ - switch (type) - { - case Resource::Type::Texture1D: return desc.arraySize > 1 ? VK_IMAGE_VIEW_TYPE_1D_ARRAY : VK_IMAGE_VIEW_TYPE_1D; - case Resource::Type::Texture2D: return desc.arraySize > 1 ? 
VK_IMAGE_VIEW_TYPE_2D_ARRAY : VK_IMAGE_VIEW_TYPE_2D; - case Resource::Type::TextureCube: return desc.arraySize > 1 ? VK_IMAGE_VIEW_TYPE_CUBE_ARRAY : VK_IMAGE_VIEW_TYPE_CUBE; - case Resource::Type::Texture3D: - { - // Can't have an array and 3d texture - assert(desc.arraySize <= 1); - if (desc.arraySize <= 1) - { - return VK_IMAGE_VIEW_TYPE_3D; - } - break; - } - default: break; - } - - return VK_IMAGE_VIEW_TYPE_MAX_ENUM; -} - - -BindingState* VKRenderer::createBindingState(const BindingState::Desc& bindingStateDesc) -{ - RefPtr<BindingStateImpl> bindingState(new BindingStateImpl(bindingStateDesc, &m_api)); - - const auto& srcBindings = bindingStateDesc.m_bindings; - const int numBindings = int(srcBindings.Count()); - - auto& dstDetails = bindingState->m_bindingDetails; - dstDetails.SetSize(numBindings); - - for (int i = 0; i < numBindings; ++i) - { - auto& dstDetail = dstDetails[i]; - const auto& srcBinding = srcBindings[i]; - - switch (srcBinding.bindingType) - { - case BindingType::Buffer: - { - if (!srcBinding.resource || !srcBinding.resource->isBuffer()) - { - assert(!"Needs to have a buffer resource set"); - return nullptr; - } - - BufferResourceImpl* bufferResource = static_cast<BufferResourceImpl*>(srcBinding.resource.Ptr()); - const BufferResource::Desc& bufferResourceDesc = bufferResource->getDesc(); - - if (bufferResourceDesc.bindFlags & Resource::BindFlag::UnorderedAccess) - { - // VkBufferView uav - - VkBufferViewCreateInfo info = { VK_STRUCTURE_TYPE_BUFFER_VIEW_CREATE_INFO }; - - info.format = VK_FORMAT_R32_SFLOAT; - // TODO: - // Not sure how to handle typeless? - if (bufferResourceDesc.elementSize == 0) - { - info.format = VK_FORMAT_R32_SFLOAT; // DXGI_FORMAT_R32_TYPELESS ? - } - - info.buffer = bufferResource->m_buffer.m_buffer; - info.offset = 0; - info.range = bufferResourceDesc.sizeInBytes; - - SLANG_VK_RETURN_NULL_ON_FAIL(m_api.vkCreateBufferView(m_device, &info, nullptr, &dstDetail.m_uav)); - } - - // TODO: Setup views. 
- // VkImageView srv - - - break; - } - case BindingType::Sampler: - { - VkSamplerCreateInfo samplerInfo = { VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO }; - - samplerInfo.magFilter = VK_FILTER_LINEAR; - samplerInfo.minFilter = VK_FILTER_LINEAR; - - samplerInfo.addressModeU = VK_SAMPLER_ADDRESS_MODE_REPEAT; - samplerInfo.addressModeV = VK_SAMPLER_ADDRESS_MODE_REPEAT; - samplerInfo.addressModeW = VK_SAMPLER_ADDRESS_MODE_REPEAT; - - samplerInfo.anisotropyEnable = VK_FALSE; - samplerInfo.maxAnisotropy = 1; - - samplerInfo.borderColor = VK_BORDER_COLOR_INT_OPAQUE_BLACK; - samplerInfo.unnormalizedCoordinates = VK_FALSE; - samplerInfo.compareEnable = VK_FALSE; - samplerInfo.compareOp = VK_COMPARE_OP_ALWAYS; - samplerInfo.mipmapMode = VK_SAMPLER_MIPMAP_MODE_LINEAR; - - SLANG_VK_RETURN_NULL_ON_FAIL(m_api.vkCreateSampler(m_device, &samplerInfo, nullptr, &dstDetail.m_sampler)); - - break; - } - case BindingType::Texture: - { - if (!srcBinding.resource || !srcBinding.resource->isTexture()) - { - assert(!"Needs to have a texture resource set"); - return nullptr; - } - - TextureResourceImpl* textureResource = static_cast<TextureResourceImpl*>(srcBinding.resource.Ptr()); - const TextureResource::Desc& texDesc = textureResource->getDesc(); - - VkImageViewType imageViewType = _calcImageViewType(textureResource->getType(), texDesc); - if (imageViewType == VK_IMAGE_VIEW_TYPE_MAX_ENUM) - { - assert(!"Invalid view type"); - return nullptr; - } - const VkFormat format = VulkanUtil::getVkFormat(texDesc.format); - if (format == VK_FORMAT_UNDEFINED) - { - assert(!"Unhandled image format"); - return nullptr; - } - - // Create the image view - - VkImageViewCreateInfo viewInfo = {}; - viewInfo.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO; - viewInfo.image = textureResource->m_image; - viewInfo.viewType = imageViewType; - viewInfo.format = format; - viewInfo.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT; - viewInfo.subresourceRange.baseMipLevel = 0; - viewInfo.subresourceRange.levelCount = 1; - 
viewInfo.subresourceRange.baseArrayLayer = 0; - viewInfo.subresourceRange.layerCount = 1; - - viewInfo.components.r = VK_COMPONENT_SWIZZLE_IDENTITY; - viewInfo.components.g = VK_COMPONENT_SWIZZLE_IDENTITY; - viewInfo.components.b = VK_COMPONENT_SWIZZLE_IDENTITY; - viewInfo.components.a = VK_COMPONENT_SWIZZLE_IDENTITY; - - SLANG_VK_RETURN_NULL_ON_FAIL(m_api.vkCreateImageView(m_device, &viewInfo, nullptr, &dstDetail.m_srv)); - - break; - } - case BindingType::CombinedTextureSampler: - { - assert(!"not implemented"); - return nullptr; - } - } - } - - return bindingState.detach();; -} - -void VKRenderer::setBindingState(BindingState* state) -{ - m_currentBindingState = static_cast<BindingStateImpl*>(state); -} - -ShaderProgram* VKRenderer::createProgram(const ShaderProgram::Desc& desc) -{ - ShaderProgramImpl* impl = new ShaderProgramImpl(desc.pipelineType); - if( desc.pipelineType == PipelineType::Compute) - { - auto computeKernel = desc.findKernel(StageType::Compute); - impl->m_compute = compileEntryPoint(*computeKernel, VK_SHADER_STAGE_COMPUTE_BIT, impl->m_buffers[0]); - } - else - { - auto vertexKernel = desc.findKernel(StageType::Vertex); - auto fragmentKernel = desc.findKernel(StageType::Fragment); - - impl->m_vertex = compileEntryPoint(*vertexKernel, VK_SHADER_STAGE_VERTEX_BIT, impl->m_buffers[0]); - impl->m_fragment = compileEntryPoint(*fragmentKernel, VK_SHADER_STAGE_FRAGMENT_BIT, impl->m_buffers[1]); - } - return impl; -} - -} // renderer_test diff --git a/tools/slang-graphics/render-vk.h b/tools/slang-graphics/render-vk.h deleted file mode 100644 index 720f35a2c..000000000 --- a/tools/slang-graphics/render-vk.h +++ /dev/null @@ -1,10 +0,0 @@ -// render-vk.h -#pragma once - -namespace slang_graphics { - -class Renderer; - -Renderer* createVKRenderer(); - -} // slang_graphics diff --git a/tools/slang-graphics/render.cpp b/tools/slang-graphics/render.cpp deleted file mode 100644 index 3595f73c1..000000000 --- a/tools/slang-graphics/render.cpp +++ /dev/null @@ -1,390 +0,0 @@ -// 
render.cpp -#include "render.h" - -#include "../../source/core/slang-math.h" - -namespace slang_graphics { -using namespace Slang; - -/* static */const Resource::BindFlag::Enum Resource::s_requiredBinding[] = -{ - BindFlag::VertexBuffer, // VertexBuffer - BindFlag::IndexBuffer, // IndexBuffer - BindFlag::ConstantBuffer, // ConstantBuffer - BindFlag::StreamOutput, // StreamOut - BindFlag::RenderTarget, // RenderTager - BindFlag::DepthStencil, // DepthRead - BindFlag::DepthStencil, // DepthWrite - BindFlag::UnorderedAccess, // UnorderedAccess - BindFlag::PixelShaderResource, // PixelShaderResource - BindFlag::NonPixelShaderResource, // NonPixelShaderResource - BindFlag::Enum(BindFlag::PixelShaderResource | BindFlag::NonPixelShaderResource), // GenericRead -}; - - -/* static */void Resource::compileTimeAsserts() -{ - SLANG_COMPILE_TIME_ASSERT(SLANG_COUNT_OF(s_requiredBinding) == int(Usage::CountOf)); -} - -static const Resource::DescBase s_emptyDescBase = {}; - -const Resource::DescBase& Resource::getDescBase() const -{ - if (isBuffer()) - { - return static_cast<const BufferResource*>(this)->getDesc(); - } - else if (isTexture()) - { - return static_cast<const TextureResource*>(this)->getDesc(); - } - return s_emptyDescBase; -} - -/* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! RendererUtil !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
*/ - -/* static */const uint8_t RendererUtil::s_formatSize[] = -{ - 0, // Unknown, - - uint8_t(sizeof(float) * 4), // RGBA_Float32, - uint8_t(sizeof(float) * 3), // RGB_Float32, - uint8_t(sizeof(float) * 2), // RG_Float32, - uint8_t(sizeof(float) * 1), // R_Float32, - - uint8_t(sizeof(uint32_t)), // RGBA_Unorm_UInt8, - - uint8_t(sizeof(uint32_t)), // R_UInt32, - - uint8_t(sizeof(float)), // D_Float32, - uint8_t(sizeof(uint32_t)), // D_Unorm24_S8, -}; - -/* static */const BindingStyle RendererUtil::s_rendererTypeToBindingStyle[] = -{ - BindingStyle::Unknown, // Unknown, - BindingStyle::DirectX, // DirectX11, - BindingStyle::DirectX, // DirectX12, - BindingStyle::OpenGl, // OpenGl, - BindingStyle::Vulkan, // Vulkan -}; - -/* static */void RendererUtil::compileTimeAsserts() -{ - SLANG_COMPILE_TIME_ASSERT(SLANG_COUNT_OF(s_formatSize) == int(Format::CountOf)); - SLANG_COMPILE_TIME_ASSERT(SLANG_COUNT_OF(s_rendererTypeToBindingStyle) == int(RendererType::CountOf)); -} - -/* !!!!!!!!!!!!!!!!!!!!!!!!!!! BindingState::Desc !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
*/ - -void BindingState::Desc::addSampler(const SamplerDesc& desc, const RegisterRange& registerRange) -{ - int descIndex = int(m_samplerDescs.Count()); - m_samplerDescs.Add(desc); - - Binding binding; - binding.bindingType = BindingType::Sampler; - binding.resource = nullptr; - binding.registerRange = registerRange; - binding.descIndex = descIndex; - - m_bindings.Add(binding); -} - -void BindingState::Desc::addResource(BindingType bindingType, Resource* resource, const RegisterRange& registerRange) -{ - assert(resource); - - Binding binding; - binding.bindingType = bindingType; - binding.resource = resource; - binding.descIndex = -1; - binding.registerRange = registerRange; - m_bindings.Add(binding); -} - -void BindingState::Desc::addCombinedTextureSampler(TextureResource* resource, const SamplerDesc& samplerDesc, const RegisterRange& registerRange) -{ - assert(resource); - - int samplerDescIndex = int(m_samplerDescs.Count()); - m_samplerDescs.Add(samplerDesc); - - Binding binding; - binding.bindingType = BindingType::CombinedTextureSampler; - binding.resource = resource; - binding.descIndex = samplerDescIndex; - binding.registerRange = registerRange; - m_bindings.Add(binding); -} - -void BindingState::Desc::clear() -{ - m_bindings.Clear(); - m_samplerDescs.Clear(); - m_numRenderTargets = 1; -} - -int BindingState::Desc::findBindingIndex(Resource::BindFlag::Enum bindFlag, int registerIndex) const -{ - const int numBindings = int(m_bindings.Count()); - for (int i = 0; i < numBindings; ++i) - { - const Binding& binding = m_bindings[i]; - if (binding.resource && (binding.resource->getDescBase().bindFlags & bindFlag) != 0) - { - if (binding.registerRange.hasRegister(registerIndex)) - { - return i; - } - } - } - - return -1; -} - -/* !!!!!!!!!!!!!!!!!!!!!!!!!!! TextureResource::Size !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
*/ - -int TextureResource::Size::calcMaxDimension(Type type) const -{ - switch (type) - { - case Resource::Type::Texture1D: return this->width; - case Resource::Type::Texture3D: return std::max(std::max(this->width, this->height), this->depth); - case Resource::Type::TextureCube: // fallthru - case Resource::Type::Texture2D: - { - return std::max(this->width, this->height); - } - default: return 0; - } -} - -TextureResource::Size TextureResource::Size::calcMipSize(int mipLevel) const -{ - Size size; - size.width = TextureResource::calcMipSize(this->width, mipLevel); - size.height = TextureResource::calcMipSize(this->height, mipLevel); - size.depth = TextureResource::calcMipSize(this->depth, mipLevel); - return size; -} - -/* !!!!!!!!!!!!!!!!!!!!!!!!! BufferResource::Desc !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! */ - -void BufferResource::Desc::setDefaults(Usage initialUsage) -{ - if (this->bindFlags == 0) - { - this->bindFlags = Resource::s_requiredBinding[int(initialUsage)]; - } -} - -/* !!!!!!!!!!!!!!!!!!!!!!!!! TextureResource::Desc !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! */ - -int TextureResource::Desc::calcNumMipLevels() const -{ - const int maxDimensionSize = this->size.calcMaxDimension(type); - return (maxDimensionSize > 0) ? (Math::Log2Floor(maxDimensionSize) + 1) : 0; -} - -int TextureResource::Desc::calcNumSubResources() const -{ - const int numMipMaps = (this->numMipLevels > 0) ? this->numMipLevels : calcNumMipLevels(); - const int arrSize = (this->arraySize > 0) ? 
this->arraySize : 1; - - switch (type) - { - case Resource::Type::Texture1D: - case Resource::Type::Texture2D: - { - return numMipMaps * arrSize; - } - case Resource::Type::Texture3D: - { - // can't have arrays of 3d textures - assert(this->arraySize <= 1); - return numMipMaps * this->size.depth; - } - case Resource::Type::TextureCube: - { - // There are 6 faces to a cubemap - return numMipMaps * arrSize * 6; - } - default: return 0; - } -} - -void TextureResource::Desc::fixSize() -{ - switch (type) - { - case Resource::Type::Texture1D: - { - this->size.height = 1; - this->size.depth = 1; - break; - } - case Resource::Type::TextureCube: - case Resource::Type::Texture2D: - { - this->size.depth = 1; - break; - } - case Resource::Type::Texture3D: - { - // Can't have an array - this->arraySize = 0; - break; - } - default: break; - } -} - -void TextureResource::Desc::setDefaults(Usage initialUsage) -{ - fixSize(); - if (this->bindFlags == 0) - { - this->bindFlags = Resource::s_requiredBinding[int(initialUsage)]; - } - if (this->numMipLevels <= 0) - { - this->numMipLevels = calcNumMipLevels(); - } -} - -int TextureResource::Desc::calcEffectiveArraySize() const -{ - const int arrSize = (this->arraySize > 0) ? 
this->arraySize : 1; - - switch (type) - { - case Resource::Type::Texture1D: // fallthru - case Resource::Type::Texture2D: - { - return arrSize; - } - case Resource::Type::TextureCube: return arrSize * 6; - case Resource::Type::Texture3D: return 1; - default: return 0; - } -} - -void TextureResource::Desc::init(Type typeIn) -{ - this->type = typeIn; - this->size.init(); - - this->format = Format::Unknown; - this->arraySize = 0; - this->numMipLevels = 0; - this->sampleDesc.init(); - - this->bindFlags = 0; - this->cpuAccessFlags = 0; -} - -void TextureResource::Desc::init1D(Format formatIn, int widthIn, int numMipMapsIn) -{ - this->type = Type::Texture1D; - this->size.init(widthIn); - - this->format = format; - this->arraySize = 0; - this->numMipLevels = numMipMapsIn; - this->sampleDesc.init(); - - this->bindFlags = 0; - this->cpuAccessFlags = 0; -} - -void TextureResource::Desc::init2D(Type typeIn, Format formatIn, int widthIn, int heightIn, int numMipMapsIn) -{ - assert(typeIn == Type::Texture2D || typeIn == Type::TextureCube); - - this->type = type; - this->size.init(widthIn, heightIn); - - this->format = format; - this->arraySize = 0; - this->numMipLevels = numMipMapsIn; - this->sampleDesc.init(); - - this->bindFlags = 0; - this->cpuAccessFlags = 0; -} - -void TextureResource::Desc::init3D(Format formatIn, int widthIn, int heightIn, int depthIn, int numMipMapsIn) -{ - this->type = Type::Texture3D; - this->size.init(widthIn, heightIn, depthIn); - - this->format = format; - this->arraySize = 0; - this->numMipLevels = numMipMapsIn; - this->sampleDesc.init(); - - this->bindFlags = 0; - this->cpuAccessFlags = 0; -} - -/* !!!!!!!!!!!!!!!!!!!!!!!!! RennderUtil !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
*/ - -ProjectionStyle RendererUtil::getProjectionStyle(RendererType type) -{ - switch (type) - { - case RendererType::DirectX11: - case RendererType::DirectX12: - { - return ProjectionStyle::DirectX; - } - case RendererType::OpenGl: return ProjectionStyle::OpenGl; - case RendererType::Vulkan: return ProjectionStyle::Vulkan; - case RendererType::Unknown: return ProjectionStyle::Unknown; - default: - { - assert(!"Unhandled type"); - return ProjectionStyle::Unknown; - } - } -} - -/* static */void RendererUtil::getIdentityProjection(ProjectionStyle style, float projMatrix[16]) -{ - switch (style) - { - case ProjectionStyle::DirectX: - case ProjectionStyle::OpenGl: - { - static const float kIdentity[] = - { - 1, 0, 0, 0, - 0, 1, 0, 0, - 0, 0, 1, 0, - 0, 0, 0, 1 - }; - ::memcpy(projMatrix, kIdentity, sizeof(kIdentity)); - break; - } - case ProjectionStyle::Vulkan: - { - static const float kIdentity[] = - { - 1, 0, 0, 0, - 0, -1, 0, 0, - 0, 0, 1, 0, - 0, 0, 0, 1 - }; - ::memcpy(projMatrix, kIdentity, sizeof(kIdentity)); - break; - } - default: - { - assert(!"Not handled"); - } - } -} - -} // renderer_test diff --git a/tools/slang-graphics/render.h b/tools/slang-graphics/render.h deleted file mode 100644 index 92e9c0930..000000000 --- a/tools/slang-graphics/render.h +++ /dev/null @@ -1,583 +0,0 @@ -// render.h -#pragma once - -#include "window.h" - -//#include "shader-input-layout.h" - -#include "../../slang-com-helper.h" - -#include "../../source/core/smart-pointer.h" -#include "../../source/core/list.h" - -namespace slang_graphics { - -// Had to move here, because Options needs types defined here -typedef intptr_t Int; -typedef uintptr_t UInt; - -// pre declare types -class Surface; - -// Declare opaque type -class InputLayout: public Slang::RefObject -{ - public: -}; - -enum class PipelineType -{ - Unknown, - Graphics, - Compute, - CountOf, -}; - -enum class StageType -{ - Unknown, - Vertex, - Hull, - Domain, - Geometry, - Fragment, - Compute, - CountOf, -}; - -enum 
class RendererType -{ - Unknown, - DirectX11, - DirectX12, - OpenGl, - Vulkan, - CountOf, -}; - -enum class ProjectionStyle -{ - Unknown, - OpenGl, - DirectX, - Vulkan, - CountOf, -}; - -/// The style of the binding -enum class BindingStyle -{ - Unknown, - DirectX, - OpenGl, - Vulkan, - CountOf, -}; - -class ShaderProgram: public Slang::RefObject -{ -public: - - struct KernelDesc - { - StageType stage; - void const* codeBegin; - void const* codeEnd; - - UInt getCodeSize() const { return (char const*)codeEnd - (char const*)codeBegin; } - }; - - struct Desc - { - PipelineType pipelineType; - KernelDesc const* kernels; - Int kernelCount; - - /// Find and return the kernel for `stage`, if present. - KernelDesc const* findKernel(StageType stage) const - { - for(Int ii = 0; ii < kernelCount; ++ii) - if(kernels[ii].stage == stage) - return &kernels[ii]; - return nullptr; - } - }; -}; - -struct ShaderCompileRequest -{ - struct SourceInfo - { - char const* path; - - // The data may either be source text (in which - // case it can be assumed to be nul-terminated with - // `dataEnd` pointing at the terminator), or - // raw binary data (in which case `dataEnd` points - // at the end of the buffer). - char const* dataBegin; - char const* dataEnd; - }; - - struct EntryPoint - { - char const* name = nullptr; - SourceInfo source; - }; - - SourceInfo source; - EntryPoint vertexShader; - EntryPoint fragmentShader; - EntryPoint computeShader; - Slang::List entryPointTypeArguments; -}; - -/// Different formats of things like pixels or elements of vertices -/// NOTE! 
Any change to this type (adding, removing, changing order) - must also be reflected in changes to RendererUtil -enum class Format -{ - Unknown, - - RGBA_Float32, - RGB_Float32, - RG_Float32, - R_Float32, - - RGBA_Unorm_UInt8, - - R_UInt32, - - D_Float32, - D_Unorm24_S8, - - CountOf, -}; - -struct InputElementDesc -{ - char const* semanticName; - UInt semanticIndex; - Format format; - UInt offset; -}; - -enum class MapFlavor -{ - Unknown, ///< Unknown mapping type - HostRead, - HostWrite, - WriteDiscard, -}; - -enum class PrimitiveTopology -{ - TriangleList, -}; - -class Resource: public Slang::RefObject -{ - public: - - /// The type of resource. - /// NOTE! The order needs to be such that all texture types are at or after Texture1D (otherwise isTexture won't work correctly) - enum class Type - { - Unknown, ///< Unknown - Buffer, ///< A buffer (like a constant/index/vertex buffer) - Texture1D, ///< A 1d texture - Texture2D, ///< A 2d texture - Texture3D, ///< A 3d texture - TextureCube, ///< A cubemap consists of 6 Texture2D like faces - CountOf, - }; - - /// Describes how a resource is to be used - enum class Usage - { - Unknown = -1, - VertexBuffer = 0, - IndexBuffer, - ConstantBuffer, - StreamOutput, - RenderTarget, - DepthRead, - DepthWrite, - UnorderedAccess, - PixelShaderResource, - NonPixelShaderResource, - GenericRead, - CountOf, - }; - - /// Binding flags describe all of the ways a resource can be bound - and therefore used - struct BindFlag - { - enum Enum - { - VertexBuffer = 0x001, - IndexBuffer = 0x002, - ConstantBuffer = 0x004, - StreamOutput = 0x008, - RenderTarget = 0x010, - DepthStencil = 0x020, - UnorderedAccess = 0x040, - PixelShaderResource = 0x080, - NonPixelShaderResource = 0x100, - }; - }; - - /// Combinations describe how a resource can be accessed (typically by the host/cpu) - struct AccessFlag - { - enum Enum - { - Read = 0x1, - Write = 0x2 - }; - }; - - /// Base class for Descs - struct DescBase - { - bool canBind(BindFlag::Enum bindFlag) 
const { return (bindFlags & bindFlag) != 0; } - bool hasCpuAccessFlag(AccessFlag::Enum accessFlag) { return (cpuAccessFlags & accessFlag) != 0; } - - Type type = Type::Unknown; - - int bindFlags = 0; ///< Combination of Resource::BindFlag or 0 (and will use initialUsage to set) - int cpuAccessFlags = 0; ///< Combination of Resource::AccessFlag - }; - - /// Get the type - SLANG_FORCE_INLINE Type getType() const { return m_type; } - /// True if it's a texture derived type - SLANG_FORCE_INLINE bool isTexture() const { return int(m_type) >= int(Type::Texture1D); } - /// True if it's a buffer derived type - SLANG_FORCE_INLINE bool isBuffer() const { return m_type == Type::Buffer; } - - /// Get the descBase - const DescBase& getDescBase() const; - /// Returns true if can bind with flag - bool canBind(BindFlag::Enum bindFlag) const { return getDescBase().canBind(bindFlag); } - - /// For a usage gives the required binding flags - static const BindFlag::Enum s_requiredBinding[]; /// Maps Usage to bind flags required - - protected: - Resource(Type type): - m_type(type) - {} - - static void compileTimeAsserts(); - - Type m_type; -}; - -class BufferResource: public Resource -{ - public: - typedef Resource Parent; - - struct Desc: public DescBase - { - void init(size_t sizeInBytesIn) - { - sizeInBytes = sizeInBytesIn; - elementSize = 0; - format = Format::Unknown; - } - /// Set up default parameters based on usage - void setDefaults(Usage initialUsage); - - size_t sizeInBytes; ///< Total size in bytes - int elementSize; ///< Get the element stride. 
If > 0, this is a structured buffer - Format format; - }; - - /// Get the buffer description - SLANG_FORCE_INLINE const Desc& getDesc() const { return m_desc; } - - /// Ctor - BufferResource(const Desc& desc): - Parent(Type::Buffer), - m_desc(desc) - { - } - - protected: - Desc m_desc; -}; - -class TextureResource: public Resource -{ - public: - typedef Resource Parent; - - struct SampleDesc - { - void init() - { - numSamples = 1; - quality = 0; - } - int numSamples; ///< Number of samples per pixel - int quality; ///< The quality measure for the samples - }; - - struct Size - { - void init() - { - width = height = depth = 1; - } - void init(int widthIn, int heightIn = 1, int depthIn = 1) - { - width = widthIn; - height = heightIn; - depth = depthIn; - } - /// Given the type works out the maximum dimension size - int calcMaxDimension(Type type) const; - /// Given a size, calculates the size at a mip level - Size calcMipSize(int mipLevel) const; - - int width; ///< Width in pixels - int height; ///< Height in pixels (if 2d or 3d) - int depth; ///< Depth (if 3d) - }; - - struct Desc: public DescBase - { - /// Initialize with default values - void init(Type typeIn); - /// Initialize different dimensions. For cubemap, use init2D - void init1D(Format format, int width, int numMipMaps = 0); - void init2D(Type typeIn, Format format, int width, int height, int numMipMaps = 0); - void init3D(Format format, int width, int height, int depth, int numMipMaps = 0); - - /// Given the type, calculates the number of mip maps. 0 on error - int calcNumMipLevels() const; - /// Calculate the total number of sub resources. 0 on error. - int calcNumSubResources() const; - - /// Calculate the effective array size - in essence the amount if mip map sets needed. 
- /// In practice takes into account if the arraySize is 0 (it's not an array, but it will still have at least one mip set) - /// and if the type is a cubemap (multiplies the amount of mip sets by 6) - int calcEffectiveArraySize() const; - - /// Use type to fix the size values (and array size). - /// For example a 1d texture, should have height and depth set to 1. - void fixSize(); - - /// Set up default parameters based on type and usage - void setDefaults(Usage initialUsage); - - Size size; - - int arraySize; ///< Array size - - int numMipLevels; ///< Number of mip levels - if 0 will create all mip levels - Format format; ///< The resources format - SampleDesc sampleDesc; ///< How the resource is sampled - }; - - /// The ordering of the subResources is - /// forall (effectiveArraySize) - /// forall (mip levels) - /// forall (depth levels) - struct Data - { - ptrdiff_t* mipRowStrides; ///< The row stride for a mip map - int numMips; ///< The number of mip maps - const void*const* subResources; ///< Pointers to each full mip subResource - int numSubResources; ///< The total amount of subResources. Typically = numMips * depth * arraySize - }; - - /// Get the description of the texture - SLANG_FORCE_INLINE const Desc& getDesc() const { return m_desc; } - - /// Ctor - TextureResource(const Desc& desc): - Parent(desc.type), - m_desc(desc) - { - } - - SLANG_FORCE_INLINE static int calcMipSize(int width, int mipLevel) - { - width = width >> mipLevel; - return width > 0 ? width : 1; - } - - protected: - Desc m_desc; -}; - -enum class BindingType -{ - Unknown, - Sampler, - Buffer, - Texture, - CombinedTextureSampler, - CountOf, -}; - -class BindingState : public Slang::RefObject -{ -public: - /// A register set consists of one or more contiguous indices. 
- /// To be valid index >= 0 and size >= 1 - struct RegisterRange - { - /// True if contains valid contents - bool isValid() const { return size > 0; } - /// True if valid single value - bool isSingle() const { return size == 1; } - /// Get as a single index (must be at least one index) - int getSingleIndex() const { return (size == 1) ? index : -1; } - /// Return the first index - int getFirstIndex() const { return (size > 0) ? index : -1; } - /// True if contains register index - bool hasRegister(int registerIndex) const { return registerIndex >= index && registerIndex < index + size; } - - static RegisterRange makeInvalid() { return RegisterRange{ -1, 0 }; } - static RegisterRange makeSingle(int index) { return RegisterRange{ int16_t(index), 1 }; } - static RegisterRange makeRange(int index, int size) { return RegisterRange{ int16_t(index), uint16_t(size) }; } - - int16_t index; ///< The base index - uint16_t size; ///< The amount of register indices - }; - - struct SamplerDesc - { - bool isCompareSampler; - }; - - struct Binding - { - BindingType bindingType; ///< Type of binding - int descIndex; ///< The description index associated with type. -1 if not used. For example if bindingType is Sampler, the descIndex is into m_samplerDescs. - Slang::RefPtr<Resource> resource; ///< Associated resource. 
nullptr if not used - RegisterRange registerRange; /// Defines the registers for binding - }; - - struct Desc - { - /// Add a resource - assumed that the binding will match the Desc of the resource - void addResource(BindingType bindingType, Resource* resource, const RegisterRange& registerRange); - /// Add a sampler - void addSampler(const SamplerDesc& desc, const RegisterRange& registerRange); - /// Add a BufferResource - void addBufferResource(BufferResource* resource, const RegisterRange& registerRange) { addResource(BindingType::Buffer, resource, registerRange); } - /// Add a texture - void addTextureResource(TextureResource* resource, const RegisterRange& registerRange) { addResource(BindingType::Texture, resource, registerRange); } - /// Add combined texture a - void addCombinedTextureSampler(TextureResource* resource, const SamplerDesc& samplerDesc, const RegisterRange& registerRange); - - /// Returns the bind index, that has the bind flag, and indexes the specified register - int findBindingIndex(Resource::BindFlag::Enum bindFlag, int registerIndex) const; - - /// Clear the contents - void clear(); - - Slang::List<Binding> m_bindings; ///< All of the bindings in order - Slang::List<SamplerDesc> m_samplerDescs; ///< Holds the SamplerDesc for the binding - indexed by the descIndex member of Binding - - int m_numRenderTargets = 1; - }; - - /// Get the Desc used to create this binding - SLANG_FORCE_INLINE const Desc& getDesc() const { return m_desc; } - - protected: - BindingState(const Desc& desc): - m_desc(desc) - { - } - - Desc m_desc; -}; - -class Renderer: public Slang::RefObject -{ -public: - - struct Desc - { - int width; ///< Width in pixels - int height; ///< height in pixels - }; - - virtual SlangResult initialize(const Desc& desc, void* inWindowHandle) = 0; - - virtual void setClearColor(const float color[4]) = 0; - virtual void clearFrame() = 0; - - virtual void presentFrame() = 0; - - /// Create a texture resource. 
initData holds the initialize data to set the contents of the texture when constructed. - virtual TextureResource* createTextureResource(Resource::Usage initialUsage, const TextureResource::Desc& desc, const TextureResource::Data* initData = nullptr) { return nullptr; } - /// Create a buffer resource - virtual BufferResource* createBufferResource(Resource::Usage initialUsage, const BufferResource::Desc& desc, const void* initData = nullptr) { return nullptr; } - - /// Captures the back buffer and stores the result in surfaceOut. If the surface contains data - it will either be overwritten (if same size and format), or freed and a re-allocated. - virtual SlangResult captureScreenSurface(Surface& surfaceOut) = 0; - - virtual InputLayout* createInputLayout(const InputElementDesc* inputElements, UInt inputElementCount) = 0; - virtual BindingState* createBindingState(const BindingState::Desc& desc) { return nullptr; } - - virtual ShaderProgram* createProgram(const ShaderProgram::Desc& desc) = 0; - - virtual void* map(BufferResource* buffer, MapFlavor flavor) = 0; - virtual void unmap(BufferResource* buffer) = 0; - - virtual void setInputLayout(InputLayout* inputLayout) = 0; - virtual void setPrimitiveTopology(PrimitiveTopology topology) = 0; - virtual void setBindingState(BindingState* state) = 0; - virtual void setVertexBuffers(UInt startSlot, UInt slotCount, BufferResource*const* buffers, const UInt* strides, const UInt* offsets) = 0; - - inline void setVertexBuffer(UInt slot, BufferResource* buffer, UInt stride, UInt offset = 0); - - virtual void setShaderProgram(ShaderProgram* program) = 0; - - virtual void draw(UInt vertexCount, UInt startVertex = 0) = 0; - virtual void dispatchCompute(int x, int y, int z) = 0; - - /// Commit any buffered state changes or draw calls. 
- /// presentFrame will commitAll implicitly before doing a present - virtual void submitGpuWork() = 0; - /// Blocks until Gpu work is complete - virtual void waitForGpu() = 0; - - /// Get the type of this renderer - virtual RendererType getRendererType() const = 0; -}; - -// ---------------------------------------------------------------------------------------- -inline void Renderer::setVertexBuffer(UInt slot, BufferResource* buffer, UInt stride, UInt offset) -{ - setVertexBuffers(slot, 1, &buffer, &stride, &offset); -} - -/// Functions that are around Renderer and it's types -struct RendererUtil -{ - /// Gets the size in bytes of a Format type. Returns 0 if a size is not defined/invalid - SLANG_FORCE_INLINE static size_t getFormatSize(Format format) { return s_formatSize[int(format)]; } - /// Given a renderer type, gets a projection style - static ProjectionStyle getProjectionStyle(RendererType type); - - /// Given the projection style returns an 'identity' matrix, which ensures x,y mapping to pixels is the same on all targets - static void getIdentityProjection(ProjectionStyle style, float projMatrix[16]); - - /// Get the binding style from the type - static BindingStyle getBindingStyle(RendererType type) { return s_rendererTypeToBindingStyle[int(type)]; } - - private: - static void compileTimeAsserts(); - static const uint8_t s_formatSize[]; // Maps Format::XXX to a size in bytes; - static const BindingStyle s_rendererTypeToBindingStyle[]; ///< Maps a RendererType to a BindingStyle -}; - -} // renderer_test diff --git a/tools/slang-graphics/resource-d3d12.cpp b/tools/slang-graphics/resource-d3d12.cpp deleted file mode 100644 index bb39d2529..000000000 --- a/tools/slang-graphics/resource-d3d12.cpp +++ /dev/null @@ -1,214 +0,0 @@ -// resource-d3d12.cpp -#include "resource-d3d12.h" - -namespace slang_graphics { -using namespace Slang; - -/* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! D3D12BarrierSubmitter !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
*/ - -void D3D12BarrierSubmitter::_flush() -{ - assert(m_numBarriers > 0); - - if (m_commandList) - { - m_commandList->ResourceBarrier(UINT(m_numBarriers), m_barriers); - } - m_numBarriers = 0; -} - -D3D12_RESOURCE_BARRIER& D3D12BarrierSubmitter::_expandOne() -{ - _flush(); - return m_barriers[m_numBarriers++]; -} - -void D3D12BarrierSubmitter::transition(ID3D12Resource* resource, D3D12_RESOURCE_STATES prevState, D3D12_RESOURCE_STATES nextState) -{ - if (nextState != prevState) - { - D3D12_RESOURCE_BARRIER& barrier = expandOne(); - - const UINT subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES; - const D3D12_RESOURCE_BARRIER_FLAGS flags = D3D12_RESOURCE_BARRIER_FLAG_NONE; - - ::memset(&barrier, 0, sizeof(barrier)); - barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION; - barrier.Flags = flags; - barrier.Transition.pResource = resource; - barrier.Transition.StateBefore = prevState; - barrier.Transition.StateAfter = nextState; - barrier.Transition.Subresource = subresource; - } - else - { - if (nextState == D3D12_RESOURCE_STATE_UNORDERED_ACCESS) - { - D3D12_RESOURCE_BARRIER& barrier = expandOne(); - - ::memset(&barrier, 0, sizeof(barrier)); - barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_UAV; - barrier.UAV.pResource = resource; - } - } -} - -/* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! D3D12ResourceBase !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! */ - -/* static */DXGI_FORMAT D3D12ResourceBase::calcFormat(D3DUtil::UsageType usage, ID3D12Resource* resource) -{ - return resource ? D3DUtil::calcFormat(usage, resource->GetDesc().Format) : DXGI_FORMAT_UNKNOWN; -} - -void D3D12ResourceBase::transition(D3D12_RESOURCE_STATES nextState, D3D12BarrierSubmitter& submitter) -{ - // Transition only if there is a resource - if (m_resource) - { - submitter.transition(m_resource, m_state, nextState); - m_state = nextState; - } -} - -/* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! D3D12CounterFence !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
*/ - -D3D12CounterFence::~D3D12CounterFence() -{ - if (m_event) - { - CloseHandle(m_event); - } -} - -Result D3D12CounterFence::init(ID3D12Device* device, uint64_t initialValue) -{ - m_currentValue = initialValue; - - SLANG_RETURN_ON_FAIL(device->CreateFence(m_currentValue, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(m_fence.writeRef()))); - // Create an event handle to use for frame synchronization. - m_event = ::CreateEvent(nullptr, FALSE, FALSE, nullptr); - if (m_event == nullptr) - { - Result res = HRESULT_FROM_WIN32(GetLastError()); - return SLANG_FAILED(res) ? res : SLANG_FAIL; - } - return SLANG_OK; -} - -UInt64 D3D12CounterFence::nextSignal(ID3D12CommandQueue* commandQueue) -{ - // Increment the fence value. Save on the frame - we'll know that frame is done when the fence value >= - m_currentValue++; - // Schedule a Signal command in the queue. - Result res = commandQueue->Signal(m_fence, m_currentValue); - if (SLANG_FAILED(res)) - { - assert(!"Signal failed"); - } - return m_currentValue; -} - -void D3D12CounterFence::waitUntilCompleted(uint64_t completedValue) -{ - // You can only wait for a value that is less than or equal to the current value - assert(completedValue <= m_currentValue); - - // Wait until the previous frame is finished. - while (m_fence->GetCompletedValue() < completedValue) - { - // Make it signal with the current value - SLANG_ASSERT_VOID_ON_FAIL(m_fence->SetEventOnCompletion(completedValue, m_event)); - WaitForSingleObject(m_event, INFINITE); - } -} - -void D3D12CounterFence::nextSignalAndWait(ID3D12CommandQueue* commandQueue) -{ - waitUntilCompleted(nextSignal(commandQueue)); -} - -/* !!!!!!!!!!!!!!!!!!!!!!!!! D3D12Resource !!!!!!!!!!!!!!!!!!!!!!!! 
*/ - -/* static */void D3D12Resource::setDebugName(ID3D12Resource* resource, const char* name) -{ - if (resource) - { - size_t len = ::strlen(name); - List buf; - buf.SetSize(len + 1); - - D3DUtil::appendWideChars(name, buf); - resource->SetName(buf.begin()); - } -} - -void D3D12Resource::setDebugName(const char* name) -{ - setDebugName(m_resource, name); -} - -void D3D12Resource::setDebugName(const wchar_t* name) -{ - if (m_resource) - { - m_resource->SetName(name); - } -} - -void D3D12Resource::setResource(ID3D12Resource* resource, D3D12_RESOURCE_STATES initialState) -{ - if (resource != m_resource) - { - if (resource) - { - resource->AddRef(); - } - if (m_resource) - { - m_resource->Release(); - } - m_resource = resource; - } - m_prevState = initialState; - m_state = initialState; -} - -void D3D12Resource::setResourceNull() -{ - if (m_resource) - { - m_resource->Release(); - m_resource = nullptr; - } -} - -Result D3D12Resource::initCommitted(ID3D12Device* device, const D3D12_HEAP_PROPERTIES& heapProps, D3D12_HEAP_FLAGS heapFlags, const D3D12_RESOURCE_DESC& resourceDesc, D3D12_RESOURCE_STATES initState, const D3D12_CLEAR_VALUE * clearValue) -{ - setResourceNull(); - ComPtr resource; - SLANG_RETURN_ON_FAIL(device->CreateCommittedResource(&heapProps, heapFlags, &resourceDesc, initState, clearValue, IID_PPV_ARGS(resource.writeRef()))); - setResource(resource, initState); - return SLANG_OK; -} - -ID3D12Resource* D3D12Resource::detach() -{ - ID3D12Resource* resource = m_resource; - m_resource = nullptr; - return resource; -} - -void D3D12Resource::swap(ComPtr& resourceInOut) -{ - ID3D12Resource* tmp = m_resource; - m_resource = resourceInOut.detach(); - resourceInOut.attach(tmp); -} - -void D3D12Resource::setState(D3D12_RESOURCE_STATES state) -{ - m_prevState = state; - m_state = state; -} - -} // renderer_test diff --git a/tools/slang-graphics/resource-d3d12.h b/tools/slang-graphics/resource-d3d12.h deleted file mode 100644 index 6040291cc..000000000 --- 
a/tools/slang-graphics/resource-d3d12.h +++ /dev/null @@ -1,178 +0,0 @@ -// resource-d3d12.h -#pragma once - -#define WIN32_LEAN_AND_MEAN -#define NOMINMAX -#include -#undef WIN32_LEAN_AND_MEAN -#undef NOMINMAX - -#include -#include - -#include "../../slang-com-ptr.h" -#include "d3d-util.h" - -namespace slang_graphics { - -// Enables more conservative barriers - restoring the state of resources after they are used. -// Should not need to be enabled in normal builds, as the barriers should correctly sync resources -// If enabling fixes an issue it implies regular barriers are not correctly used. -#define SLANG_ENABLE_CONSERVATIVE_RESOURCE_BARRIERS 0 - -struct D3D12BarrierSubmitter -{ - enum { MAX_BARRIERS = 8 }; - - /// Expand one space to hold a barrier - SLANG_FORCE_INLINE D3D12_RESOURCE_BARRIER& expandOne() { return (m_numBarriers < MAX_BARRIERS) ? m_barriers[m_numBarriers++] : _expandOne(); } - /// Flush barriers to command list - SLANG_FORCE_INLINE void flush() { if (m_numBarriers > 0) _flush(); } - - /// Transition resource from prevState to nextState - void transition(ID3D12Resource* resource, D3D12_RESOURCE_STATES prevState, D3D12_RESOURCE_STATES nextState); - - /// Ctor - SLANG_FORCE_INLINE D3D12BarrierSubmitter(ID3D12GraphicsCommandList* commandList) : m_numBarriers(0), m_commandList(commandList) { } - /// Dtor - SLANG_FORCE_INLINE ~D3D12BarrierSubmitter() { flush(); } - -protected: - D3D12_RESOURCE_BARRIER& _expandOne(); - void _flush(); - - ID3D12GraphicsCommandList* m_commandList; - int m_numBarriers; - D3D12_RESOURCE_BARRIER m_barriers[MAX_BARRIERS]; -}; - -/*! \brief A class to simplify using Dx12 fences. - -A fence is a mechanism to track GPU work. This is achieved by having a counter that the CPU holds -called the current value. Calling nextSignal will increase the CPU counter, and add a fence -with that value to the commandQueue. When the GPU has completed all the work before the fence it will -update the completed value. 
This is typically used when -the CPU needs to know the GPU has finished some piece of work. To do this the CPU -can check the completed value, and when it is greater than or equal to the value returned by nextSignal the -CPU will know that all the work prior to when the nextSignal was added to the queue will have completed. - -NOTE! This cannot be used across threads because, among other reasons, SetEventOnCompletion -only works with a single value. - -Signal on the CommandQueue updates the fence on the GPU side. Signal on the fence object changes -the value on the CPU side (not used here). - -Useful article describing how Dx12 synchronization works: -https://msdn.microsoft.com/en-us/library/windows/desktop/dn899217%28v=vs.85%29.aspx -*/ -class D3D12CounterFence -{ -public: - /// Must be called before used - SlangResult init(ID3D12Device* device, uint64_t initialValue = 0); - /// Increases the counter, signals the queue and waits for the signal to be hit - void nextSignalAndWait(ID3D12CommandQueue* queue); - /// Signals with next counter value. Returns the value the signal was called on - uint64_t nextSignal(ID3D12CommandQueue* commandQueue); - /// Get the current value - SLANG_FORCE_INLINE uint64_t getCurrentValue() const { return m_currentValue; } - /// Get the completed value - SLANG_FORCE_INLINE uint64_t getCompletedValue() const { return m_fence->GetCompletedValue(); } - - /// Waits for the specified value - void waitUntilCompleted(uint64_t completedValue); - - /// Ctor - D3D12CounterFence() :m_event(nullptr), m_currentValue(0) {} - /// Dtor - ~D3D12CounterFence(); - -protected: - HANDLE m_event; - Slang::ComPtr m_fence; - UINT64 m_currentValue; -}; - -/** The base class for resource types allows for tracking of state.
It does not allow for setting of the resource though, such that -an interface can return a D3D12ResourceBase, and a client can manipulate its state, but it cannot replace/change the actual resource */ -struct D3D12ResourceBase -{ - /// Add a transition if necessary to the list - void transition(D3D12_RESOURCE_STATES nextState, D3D12BarrierSubmitter& submitter); - /// Get the current state - SLANG_FORCE_INLINE D3D12_RESOURCE_STATES getState() const { return m_state; } - - /// Get the associated resource - SLANG_FORCE_INLINE ID3D12Resource* getResource() const { return m_resource; } - - /// True if a resource is set - SLANG_FORCE_INLINE bool isSet() const { return m_resource != nullptr; } - - /// Coercible into ID3D12Resource - SLANG_FORCE_INLINE operator ID3D12Resource*() const { return m_resource; } - - /// restore previous state -#if SLANG_ENABLE_CONSERVATIVE_RESOURCE_BARRIERS - SLANG_FORCE_INLINE void restore(D3D12BarrierSubmitter& submitter) { transition(m_prevState, submitter); } -#else - SLANG_FORCE_INLINE void restore(D3D12BarrierSubmitter& submitter) { SLANG_UNUSED(submitter) } -#endif - - /// Given the usage, flags, and format will return the most suitable format.
Will return DXGI_FORMAT_UNKNOWN if combination is not possible - static DXGI_FORMAT calcFormat(D3DUtil::UsageType usage, ID3D12Resource* resource); - - /// Ctor - SLANG_FORCE_INLINE D3D12ResourceBase() : - m_state(D3D12_RESOURCE_STATE_COMMON), - m_prevState(D3D12_RESOURCE_STATE_COMMON), - m_resource(nullptr) - {} - -protected: - /// This is protected so that clients cannot slice the class and lose state tracking - ~D3D12ResourceBase() {} - - ID3D12Resource* m_resource; ///< The resource (ref counted) - D3D12_RESOURCE_STATES m_state; ///< The current tracked expected state, if all associated transitions have completed on ID3D12CommandList - D3D12_RESOURCE_STATES m_prevState; ///< The previous state -}; - -struct D3D12Resource : public D3D12ResourceBase -{ - - /// Dtor - ~D3D12Resource() - { - if (m_resource) - { - m_resource->Release(); - } - } - - /// Initialize as committed resource - Slang::Result initCommitted(ID3D12Device* device, const D3D12_HEAP_PROPERTIES& heapProps, D3D12_HEAP_FLAGS heapFlags, const D3D12_RESOURCE_DESC& resourceDesc, D3D12_RESOURCE_STATES initState, const D3D12_CLEAR_VALUE * clearValue); - - /// Set a resource with an initial state - void setResource(ID3D12Resource* resource, D3D12_RESOURCE_STATES initialState); - /// Make the resource null - void setResourceNull(); - /// Returns the attached resource (with any ref counts) and sets to nullptr on this. - ID3D12Resource* detach(); - - /// Swaps the resource contents with the contents of the smart pointer - void swap(Slang::ComPtr& resourceInOut); - - /// Sets the current state of the resource (the current state is taken to be the future state once the command list has executed) - /// NOTE! This must be used with care, otherwise state tracking can be made incorrect.
- void setState(D3D12_RESOURCE_STATES state); - - /// Set the debug name on a resource - static void setDebugName(ID3D12Resource* resource, const char* name); - - /// Set the the debug name on the resource - void setDebugName(const wchar_t* name); - /// Set the debug name - void setDebugName(const char* name); -}; - -} // renderer_test diff --git a/tools/slang-graphics/slang-graphics.vcxproj b/tools/slang-graphics/slang-graphics.vcxproj deleted file mode 100644 index ce7502326..000000000 --- a/tools/slang-graphics/slang-graphics.vcxproj +++ /dev/null @@ -1,212 +0,0 @@ - - - - - Debug - Win32 - - - Debug - x64 - - - Release - Win32 - - - Release - x64 - - - - {222F7498-B40C-4F3F-A704-DDEB91A4484A} - true - Win32Proj - slang-graphics - 10.0.14393.0 - - - - StaticLibrary - true - Unicode - v140 - - - StaticLibrary - true - Unicode - v140 - - - StaticLibrary - false - Unicode - v140 - - - StaticLibrary - false - Unicode - v140 - - - - - - - - - - - - - - - - - - - ..\..\bin\windows-x86\debug\ - ..\..\intermediate\windows-x86\debug\slang-graphics\ - slang-graphics - .lib - - - ..\..\bin\windows-x64\debug\ - ..\..\intermediate\windows-x64\debug\slang-graphics\ - slang-graphics - .lib - - - ..\..\bin\windows-x86\release\ - ..\..\intermediate\windows-x86\release\slang-graphics\ - slang-graphics - .lib - - - ..\..\bin\windows-x64\release\ - ..\..\intermediate\windows-x64\release\slang-graphics\ - slang-graphics - .lib - - - - NotUsing - Level3 - _DEBUG;%(PreprocessorDefinitions) - ..\..;..\..\external;..\..\source;%(AdditionalIncludeDirectories) - EditAndContinue - Disabled - MultiThreadedDebug - - - Windows - true - - - "$(SolutionDir)tools\copy-hlsl-libs.bat" "$(WindowsSdkDir)Redist/D3D/x86/" "../../bin/windows-x86/debug/" - - - - - NotUsing - Level3 - _DEBUG;%(PreprocessorDefinitions) - ..\..;..\..\external;..\..\source;%(AdditionalIncludeDirectories) - EditAndContinue - Disabled - MultiThreadedDebug - - - Windows - true - - - "$(SolutionDir)tools\copy-hlsl-libs.bat" 
"$(WindowsSdkDir)Redist/D3D/x64/" "../../bin/windows-x64/debug/" - - - - - NotUsing - Level3 - NDEBUG;%(PreprocessorDefinitions) - ..\..;..\..\external;..\..\source;%(AdditionalIncludeDirectories) - Full - true - true - false - true - MultiThreaded - - - Windows - true - true - - - "$(SolutionDir)tools\copy-hlsl-libs.bat" "$(WindowsSdkDir)Redist/D3D/x86/" "../../bin/windows-x86/release/" - - - - - NotUsing - Level3 - NDEBUG;%(PreprocessorDefinitions) - ..\..;..\..\external;..\..\source;%(AdditionalIncludeDirectories) - Full - true - true - false - true - MultiThreaded - - - Windows - true - true - - - "$(SolutionDir)tools\copy-hlsl-libs.bat" "$(WindowsSdkDir)Redist/D3D/x64/" "../../bin/windows-x64/release/" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - \ No newline at end of file diff --git a/tools/slang-graphics/slang-graphics.vcxproj.filters b/tools/slang-graphics/slang-graphics.vcxproj.filters deleted file mode 100644 index b1e4c42a3..000000000 --- a/tools/slang-graphics/slang-graphics.vcxproj.filters +++ /dev/null @@ -1,111 +0,0 @@ - - - - - {21EB8090-0D4E-1035-B6D3-48EBA215DCB7} - - - {E9C7FDCE-D52A-8D73-7EB0-C5296AF258F6} - - - - - Header Files - - - Header Files - - - Header Files - - - Header Files - - - Header Files - - - Header Files - - - Header Files - - - Header Files - - - Header Files - - - Header Files - - - Header Files - - - Header Files - - - Header Files - - - Header Files - - - Header Files - - - Header Files - - - - - Source Files - - - Source Files - - - Source Files - - - Source Files - - - Source Files - - - Source Files - - - Source Files - - - Source Files - - - Source Files - - - Source Files - - - Source Files - - - Source Files - - - Source Files - - - Source Files - - - Source Files - - - Source Files - - - \ No newline at end of file diff --git a/tools/slang-graphics/surface.cpp b/tools/slang-graphics/surface.cpp deleted file mode 100644 index 9d91f8778..000000000 --- 
a/tools/slang-graphics/surface.cpp +++ /dev/null @@ -1,222 +0,0 @@ -// surface.cpp -#include "surface.h" - -#include -#include - -#include "../../source/core/list.h" - -namespace slang_graphics { -using namespace Slang; - -class MallocSurfaceAllocator: public SurfaceAllocator -{ - public: - - virtual Slang::Result allocate(int width, int height, Format format, int alignment, Surface& surface) override; - virtual void deallocate(Surface& surface) override; -}; - -static MallocSurfaceAllocator s_mallocSurfaceAllocator; - -/// Get the malloc allocator -/* static */SurfaceAllocator* SurfaceAllocator::getMallocAllocator() -{ - return &s_mallocSurfaceAllocator; -} - -Slang::Result MallocSurfaceAllocator::allocate(int width, int height, Format format, int alignment, Surface& surface) -{ - assert(surface.m_data == nullptr); - - // Calculate row size - - const int rowSizeInBytes = Surface::calcRowSize(format, width); - const int numRows = Surface::calcNumRows(format, height); - - alignment = (alignment <= 0) ? int(sizeof(void*)) : alignment; - // It must be a power of 2 - assert( ((alignment - 1) & alignment) == 0); - - // Align rowSize - const int alignedRowSizeInBytes = (rowSizeInBytes + alignment - 1) & -alignment; - - size_t totalSize = numRows * alignedRowSizeInBytes; - - uint8_t* data = (uint8_t*)::malloc(totalSize); - if (!data) - { - return SLANG_E_OUT_OF_MEMORY; - } - - surface.m_data = data; - surface.m_width = width; - surface.m_height = height; - surface.m_format = format; - surface.m_numRows = numRows; - surface.m_rowStrideInBytes = alignedRowSizeInBytes; - - surface.m_allocator = this; - return SLANG_OK; -} - -void MallocSurfaceAllocator::deallocate(Surface& surface) -{ - assert(surface.m_data); - // Make sure it's not an inverted, cos otherwise m_data is not the start address - assert(surface.m_rowStrideInBytes > 0); - ::free(surface.m_data); -} - -// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Surface !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
- -/* static */int Surface::calcRowSize(Format format, int width) -{ - size_t pixelSize = RendererUtil::getFormatSize(format); - if (pixelSize == 0) - { - return 0; - } - return int(pixelSize * width); -} - -/* static */int Surface::calcNumRows(Format format, int height) -{ - // Don't have any compressed types, so number of rows is same as the height - return height; -} - -void Surface::init() -{ - m_width = 0; - m_height = 0; - m_format = Format::Unknown; - m_data = nullptr; - m_numRows = 0; - m_rowStrideInBytes = 0; - // NOTE! does not clear the allocator. - // If called with an allocation memory will leak! -} - -Surface::~Surface() -{ - if (m_data && m_allocator) - { - m_allocator->deallocate(*this); - } -} - -void Surface::deallocate() -{ - if (m_data && m_allocator) - { - m_allocator->deallocate(*this); - init(); - } -} - -Result Surface::allocate(int width, int height, Format format, int alignment, SurfaceAllocator* allocator) -{ - deallocate(); - allocator = allocator ? allocator : m_allocator; - if (!allocator) - { - // An allocator needs to be set on the surface, or one passed in. 
- return SLANG_FAIL; - } - return allocator->allocate(width, height, format, alignment, *this); -} - -void Surface::setUnowned(int width, int height, Format format, int strideInBytes, void* data) -{ - deallocate(); - - // This is unowned - m_allocator = nullptr; - - m_width = width; - m_height = height; - m_format = format; - m_rowStrideInBytes = strideInBytes; - m_data = (uint8_t*)data; - - m_numRows = Surface::calcNumRows(format, height); - - const int rowSizeInBytes = Surface::calcRowSize(format, width); - assert((strideInBytes > 0 && rowSizeInBytes <= strideInBytes) || (strideInBytes < 0 && rowSizeInBytes <= -strideInBytes)); -} - -void Surface::zeroContents() -{ - const int rowSizeInBytes = Surface::calcRowSize(m_format, m_width); - - const int stride = m_rowStrideInBytes; - uint8_t* dst = m_data; - - for (int i = 0; i < m_numRows; i++, dst += stride) - { - ::memset(dst, 0, rowSizeInBytes); - } -} - -void Surface::flipInplaceVertically() -{ - // Can only flip when m_height matches number of rows - assert(m_numRows == m_height); - - const int rowSizeInBytes = Surface::calcRowSize(m_format, m_width); - if (rowSizeInBytes <= 0 || m_numRows <= 1) - { - return; - } - - uint8_t* top = m_data; - uint8_t* bottom = m_data + (m_numRows - 1) * m_rowStrideInBytes; - - List bufferList; - bufferList.SetSize(rowSizeInBytes); - uint8_t* buffer = bufferList.Buffer(); - - const int stride = m_rowStrideInBytes; - - const int num = m_height >> 1; - for (int i = 0; i < num; ++i, top += stride, bottom -= stride) - { - ::memcpy(buffer, top, rowSizeInBytes); - ::memcpy(top, bottom, rowSizeInBytes); - ::memcpy(bottom, buffer, rowSizeInBytes); - } -} - -SlangResult Surface::set(int width, int height, Format format, int srcRowStride, const void* data, SurfaceAllocator* allocator) -{ - if (hasContents() && m_width == width && m_height == height && m_format == format) - { - // I can just overwrite the contents that is there - } - else - { - SLANG_RETURN_ON_FAIL(allocate(width, height, 
format, 0, allocator)); - } - - // Okay just need to set the contents - - { - const size_t rowSize = calcRowSize(format, width); - - const uint8_t* srcRow = (const uint8_t*)data; - uint8_t* dstRow = (uint8_t*)m_data; - - for (int i = 0; i < m_numRows; i++) - { - ::memcpy(dstRow, srcRow, rowSize); - - srcRow += srcRowStride; - dstRow += m_rowStrideInBytes; - } - } - - return SLANG_OK; -} - -} // renderer_test diff --git a/tools/slang-graphics/surface.h b/tools/slang-graphics/surface.h deleted file mode 100644 index 026ba25ed..000000000 --- a/tools/slang-graphics/surface.h +++ /dev/null @@ -1,86 +0,0 @@ -// surface.h -#pragma once - -#include "render.h" - -namespace slang_graphics { - -class Surface; - -class SurfaceAllocator -{ - public: - virtual Slang::Result allocate(int width, int height, Format format, int alignment, Surface& surface) = 0; - virtual void deallocate(Surface& surface) = 0; - - /// Get the malloc allocator - static SurfaceAllocator* getMallocAllocator(); -}; - -class Surface -{ - public: - - enum - { - kDefaultAlignment = sizeof(void*) - }; - - /// Allocate - Slang::Result allocate(int width, int height, Format format, int alignment = kDefaultAlignment, SurfaceAllocator* allocator = nullptr); - - /// Deallocate contents - void deallocate(); - /// Initialize contents (zero sized, no data). 
Note that the allocator pointer is left as is - void init(); - - /// Set unowned - void setUnowned(int width, int height, Format format, int strideInBytes, void* data); - - /// Set the contents - the memory will be owned by this surface (ie will be freed by the allocator when goes out of scope or is deallocated) - Slang::Result set(int width, int height, Format format, int strideInBytes, const void* data, SurfaceAllocator* allocator); - - template - T* calcNextRow(T* ptr) const { return (T*)calcNextRow((void*)ptr); } - template - const T* calcNextRow(const T* ptr) const { return (const T*)calcNextRow((const void*)ptr); } - - void* calcNextRow(void* ptr) const { return (void*)(((uint8_t*)ptr) + m_rowStrideInBytes); } - const void* calcNextRow(const void* ptr) const { return (const void*)(((const uint8_t*)ptr) + m_rowStrideInBytes); } - - /// Writes zero to all of the contents - void zeroContents(); - - /// Flips the contents vertically in place - void flipInplaceVertically(); - - /// True if has some contents - bool hasContents() const { return m_data != nullptr; } - - /// Ctor - Surface() : - m_allocator(nullptr) - { - init(); - } - /// Dtor - ~Surface(); - - /// Get the size of the row in bytes - static int calcRowSize(Format format, int width); - /// Calculates the number of rows - static int calcNumRows(Format format, int height); - - int m_width; - int m_height; - Format m_format; - - uint8_t* m_data; /// The data that makes up the image. If nullptr, has no data. Pointer to first 'row' of the image. 
- - int m_numRows; ///< Total amount of rows (typically same as height, but in compressed formats may be less) - int m_rowStrideInBytes; ///< The number of bytes between rows - - SurfaceAllocator* m_allocator; ///< Can be null if so contents is 'unowned', if set -}; - -} // renderer_test diff --git a/tools/slang-graphics/vk-api.cpp b/tools/slang-graphics/vk-api.cpp deleted file mode 100644 index 0ffbf46eb..000000000 --- a/tools/slang-graphics/vk-api.cpp +++ /dev/null @@ -1,138 +0,0 @@ -// vk-api.cpp -#include "vk-api.h" - -#include "../../source/core/list.h" - -namespace slang_graphics { -using namespace Slang; - -// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! VulkanApi !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - -#define VK_API_CHECK_FUNCTION(x) && (x != nullptr) -#define VK_API_CHECK_FUNCTIONS(FUNCTION_LIST) true FUNCTION_LIST(VK_API_CHECK_FUNCTION) - -bool VulkanApi::areDefined(ProcType type) const -{ - switch (type) - { - case ProcType::Global: return VK_API_CHECK_FUNCTIONS(VK_API_ALL_GLOBAL_PROCS); - case ProcType::Instance: return VK_API_CHECK_FUNCTIONS(VK_API_ALL_INSTANCE_PROCS); - case ProcType::Device: return VK_API_CHECK_FUNCTIONS(VK_API_ALL_DEVICE_PROCS); - default: - { - assert(!"Unhandled type"); - return false; - } - } -} - -Slang::Result VulkanApi::initGlobalProcs(const VulkanModule& module) -{ -#define VK_API_GET_GLOBAL_PROC(x) x = (PFN_##x)module.getFunction(#x); - - // Initialize all the global functions - VK_API_ALL_GLOBAL_PROCS(VK_API_GET_GLOBAL_PROC) - - if (!areDefined(ProcType::Global)) - { - return SLANG_FAIL; - } - m_module = &module; - return SLANG_OK; -} - -Slang::Result VulkanApi::initInstanceProcs(VkInstance instance) -{ - assert(instance && vkGetInstanceProcAddr != nullptr); - -#define VK_API_GET_INSTANCE_PROC(x) x = (PFN_##x)vkGetInstanceProcAddr(instance, #x); - - VK_API_ALL_INSTANCE_PROCS(VK_API_GET_INSTANCE_PROC) - - if (!areDefined(ProcType::Instance)) - { - return SLANG_FAIL; - } - - m_instance = instance; - return SLANG_OK; -} - 
-Slang::Result VulkanApi::initPhysicalDevice(VkPhysicalDevice physicalDevice) -{ - assert(m_physicalDevice == VK_NULL_HANDLE); - m_physicalDevice = physicalDevice; - - vkGetPhysicalDeviceProperties(m_physicalDevice, &m_deviceProperties); - vkGetPhysicalDeviceFeatures(m_physicalDevice, &m_deviceFeatures); - vkGetPhysicalDeviceMemoryProperties(m_physicalDevice, &m_deviceMemoryProperties); - - return SLANG_OK; -} - -Slang::Result VulkanApi::initDeviceProcs(VkDevice device) -{ - assert(m_instance && device && vkGetDeviceProcAddr != nullptr); - -#define VK_API_GET_DEVICE_PROC(x) x = (PFN_##x)vkGetDeviceProcAddr(device, #x); - - VK_API_ALL_DEVICE_PROCS(VK_API_GET_DEVICE_PROC) - - if (!areDefined(ProcType::Device)) - { - return SLANG_FAIL; - } - - m_device = device; - return SLANG_OK; -} - -int VulkanApi::findMemoryTypeIndex(uint32_t typeBits, VkMemoryPropertyFlags properties) const -{ - assert(typeBits); - - const int numMemoryTypes = int(m_deviceMemoryProperties.memoryTypeCount); - - // bit holds current test bit against typeBits. 
i.e. bit == 1 << i - - uint32_t bit = 1; - for (int i = 0; i < numMemoryTypes; ++i, bit += bit) - { - auto const& memoryType = m_deviceMemoryProperties.memoryTypes[i]; - if ((typeBits & bit) && (memoryType.propertyFlags & properties) == properties) - { - return i; - } - } - - //assert(!"failed to find a usable memory type"); - return -1; -} - -int VulkanApi::findQueue(VkQueueFlags reqFlags) const -{ - assert(m_physicalDevice != VK_NULL_HANDLE); - - uint32_t numQueueFamilies = 0; - vkGetPhysicalDeviceQueueFamilyProperties(m_physicalDevice, &numQueueFamilies, nullptr); - - Slang::List queueFamilies; - queueFamilies.SetSize(numQueueFamilies); - vkGetPhysicalDeviceQueueFamilyProperties(m_physicalDevice, &numQueueFamilies, queueFamilies.Buffer()); - - // Find a queue that can service our needs - //VkQueueFlags reqQueueFlags = VK_QUEUE_GRAPHICS_BIT | VK_QUEUE_COMPUTE_BIT; - - int queueFamilyIndex = -1; - for (int i = 0; i < int(numQueueFamilies); ++i) - { - if ((queueFamilies[i].queueFlags & reqFlags) == reqFlags) - { - return i; - } - } - - return -1; -} - -} // renderer_test diff --git a/tools/slang-graphics/vk-api.h b/tools/slang-graphics/vk-api.h deleted file mode 100644 index 0cbd3faf7..000000000 --- a/tools/slang-graphics/vk-api.h +++ /dev/null @@ -1,196 +0,0 @@ -// vk-api.h -#pragma once - -#include "vk-module.h" - -namespace slang_graphics { - -#define VK_API_GLOBAL_PROCS(x) \ - x(vkGetInstanceProcAddr) \ - x(vkCreateInstance) \ - /* */ - -#define VK_API_INSTANCE_PROCS(x) \ - x(vkCreateDevice) \ - x(vkCreateDebugReportCallbackEXT) \ - x(vkDestroyDebugReportCallbackEXT) \ - x(vkDebugReportMessageEXT) \ - x(vkEnumeratePhysicalDevices) \ - x(vkGetPhysicalDeviceProperties) \ - x(vkGetPhysicalDeviceFeatures) \ - x(vkGetPhysicalDeviceMemoryProperties) \ - x(vkGetPhysicalDeviceQueueFamilyProperties) \ - x(vkGetPhysicalDeviceFormatProperties) \ - x(vkGetDeviceProcAddr) \ - /* */ - -#define VK_API_DEVICE_PROCS(x) \ - x(vkCreateDescriptorPool) \ -
x(vkDestroyDescriptorPool) \ - x(vkGetDeviceQueue) \ - x(vkQueueSubmit) \ - x(vkQueueWaitIdle) \ - x(vkCreateBuffer) \ - x(vkAllocateMemory) \ - x(vkMapMemory) \ - x(vkUnmapMemory) \ - x(vkCmdCopyBuffer) \ - x(vkDestroyBuffer) \ - x(vkFreeMemory) \ - x(vkCreateDescriptorSetLayout) \ - x(vkDestroyDescriptorSetLayout) \ - x(vkAllocateDescriptorSets) \ - x(vkUpdateDescriptorSets) \ - x(vkCreatePipelineLayout) \ - x(vkDestroyPipelineLayout) \ - x(vkCreateComputePipelines) \ - x(vkCreateGraphicsPipelines) \ - x(vkDestroyPipeline) \ - x(vkCreateShaderModule) \ - x(vkDestroyShaderModule) \ - x(vkCreateFramebuffer) \ - x(vkDestroyFramebuffer) \ - x(vkCreateImage) \ - x(vkDestroyImage) \ - x(vkCreateImageView) \ - x(vkDestroyImageView) \ - x(vkCreateRenderPass) \ - x(vkDestroyRenderPass) \ - x(vkCreateCommandPool) \ - x(vkDestroyCommandPool) \ - x(vkCreateSampler) \ - x(vkDestroySampler) \ - x(vkCreateBufferView) \ - x(vkDestroyBufferView) \ - \ - x(vkGetBufferMemoryRequirements) \ - x(vkGetImageMemoryRequirements) \ - \ - x(vkCmdBindPipeline) \ - x(vkCmdBindDescriptorSets) \ - x(vkCmdDispatch) \ - x(vkCmdDraw) \ - x(vkCmdSetScissor) \ - x(vkCmdSetViewport) \ - x(vkCmdBindVertexBuffers) \ - x(vkCmdBindIndexBuffer) \ - x(vkCmdBeginRenderPass) \ - x(vkCmdEndRenderPass) \ - x(vkCmdPipelineBarrier) \ - x(vkCmdCopyBufferToImage)\ - \ - x(vkCreateFence) \ - x(vkDestroyFence) \ - x(vkResetFences) \ - x(vkGetFenceStatus) \ - x(vkWaitForFences) \ - \ - x(vkCreateSemaphore) \ - x(vkDestroySemaphore) \ - \ - x(vkCreateEvent) \ - x(vkDestroyEvent) \ - x(vkGetEventStatus) \ - x(vkSetEvent) \ - x(vkResetEvent) \ - \ - x(vkFreeCommandBuffers) \ - x(vkAllocateCommandBuffers) \ - x(vkBeginCommandBuffer) \ - x(vkEndCommandBuffer) \ - x(vkResetCommandBuffer) \ - \ - x(vkBindImageMemory) \ - x(vkBindBufferMemory) \ - /* */ - -#if SLANG_WINDOWS_FAMILY -# define VK_API_INSTANCE_PLATFORM_KHR_PROCS(x) \ - x(vkCreateWin32SurfaceKHR) \ - /* */ -#else -# define VK_API_INSTANCE_PLATFORM_KHR_PROCS(x) \ 
- x(vkCreateXlibSurfaceKHR) \ - /* */ -#endif - -#define VK_API_INSTANCE_KHR_PROCS(x) \ - VK_API_INSTANCE_PLATFORM_KHR_PROCS(x) \ - x(vkGetPhysicalDeviceSurfaceSupportKHR) \ - x(vkGetPhysicalDeviceSurfaceFormatsKHR) \ - x(vkGetPhysicalDeviceSurfacePresentModesKHR) \ - x(vkGetPhysicalDeviceSurfaceCapabilitiesKHR) \ - x(vkDestroySurfaceKHR) \ - /* */ - -#define VK_API_DEVICE_KHR_PROCS(x) \ - x(vkQueuePresentKHR) \ - x(vkCreateSwapchainKHR) \ - x(vkGetSwapchainImagesKHR) \ - x(vkDestroySwapchainKHR) \ - x(vkAcquireNextImageKHR) \ - /* */ - -#define VK_API_ALL_GLOBAL_PROCS(x) \ - VK_API_GLOBAL_PROCS(x) - -#define VK_API_ALL_INSTANCE_PROCS(x) \ - VK_API_INSTANCE_PROCS(x) \ - VK_API_INSTANCE_KHR_PROCS(x) - -#define VK_API_ALL_DEVICE_PROCS(x) \ - VK_API_DEVICE_PROCS(x) \ - VK_API_DEVICE_KHR_PROCS(x) - -#define VK_API_ALL_PROCS(x) \ - VK_API_ALL_GLOBAL_PROCS(x) \ - VK_API_ALL_INSTANCE_PROCS(x) \ - VK_API_ALL_DEVICE_PROCS(x) \ - /* */ - -#define VK_API_DECLARE_PROC(NAME) PFN_##NAME NAME = nullptr; - -struct VulkanApi -{ - VK_API_ALL_PROCS(VK_API_DECLARE_PROC) - - enum class ProcType - { - Global, - Instance, - Device, - }; - - /// Returns true if all the functions in the class are defined - bool areDefined(ProcType type) const; - - /// Sets up global parameters - Slang::Result initGlobalProcs(const VulkanModule& module); - /// Initialize the instance functions - Slang::Result initInstanceProcs(VkInstance instance); - - /// Called before initDevice - Slang::Result initPhysicalDevice(VkPhysicalDevice physicalDevice); - - /// Initialize the device functions - Slang::Result initDeviceProcs(VkDevice device); - - /// Type bits control which indices are tested against bit 0 for testing at index 0 - /// properties - a memory type must have all the bits set as passed in - /// Returns -1 if couldn't find an appropriate memory type index - int findMemoryTypeIndex(uint32_t typeBits, VkMemoryPropertyFlags properties) const; - - /// Given queue required flags, finds a queue - int 
findQueue(VkQueueFlags reqFlags) const; - - const VulkanModule* m_module = nullptr; ///< Module this was all loaded from - VkInstance m_instance = VK_NULL_HANDLE; - VkDevice m_device = VK_NULL_HANDLE; - VkPhysicalDevice m_physicalDevice = VK_NULL_HANDLE; - - VkPhysicalDeviceProperties m_deviceProperties; - VkPhysicalDeviceFeatures m_deviceFeatures; - VkPhysicalDeviceMemoryProperties m_deviceMemoryProperties; -}; - -} // renderer_test diff --git a/tools/slang-graphics/vk-device-queue.cpp b/tools/slang-graphics/vk-device-queue.cpp deleted file mode 100644 index 9e978117f..000000000 --- a/tools/slang-graphics/vk-device-queue.cpp +++ /dev/null @@ -1,199 +0,0 @@ -// vk-device-queue.cpp -#include "vk-device-queue.h" - -#include -#include -#include - -namespace slang_graphics { -using namespace Slang; - -VulkanDeviceQueue::~VulkanDeviceQueue() -{ - for (int i = 0; i < int(EventType::CountOf); ++i) - { - m_api->vkDestroySemaphore(m_api->m_device, m_semaphores[i], nullptr); - } - - for (int i = 0; i < m_numCommandBuffers; i++) - { - m_api->vkFreeCommandBuffers(m_api->m_device, m_commandPool, 1, &m_commandBuffers[i]); - m_api->vkDestroyFence(m_api->m_device, m_fences[i].fence, nullptr); - } - m_api->vkDestroyCommandPool(m_api->m_device, m_commandPool, nullptr); -} - -SlangResult VulkanDeviceQueue::init(const VulkanApi& api, VkQueue queue, int queueIndex) -{ - assert(m_api == nullptr); - m_api = &api; - - for (int i = 0; i < int(EventType::CountOf); ++i) - { - m_semaphores[i] = VK_NULL_HANDLE; - m_currentSemaphores[i] = VK_NULL_HANDLE; - } - - m_numCommandBuffers = kMaxCommandBuffers; - m_queueIndex = queueIndex; - - m_queue = queue; - - VkCommandPoolCreateInfo poolCreateInfo = {}; - poolCreateInfo.sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO; - poolCreateInfo.flags = VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT; - - poolCreateInfo.queueFamilyIndex = queueIndex; - - api.vkCreateCommandPool(api.m_device, &poolCreateInfo, nullptr, &m_commandPool); - - 
VkCommandBufferAllocateInfo commandInfo = {}; - commandInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO; - commandInfo.commandPool = m_commandPool; - commandInfo.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY; - commandInfo.commandBufferCount = 1; - - VkFenceCreateInfo fenceCreateInfo = {}; - fenceCreateInfo.sType = VK_STRUCTURE_TYPE_FENCE_CREATE_INFO; - fenceCreateInfo.flags = 0; // VK_FENCE_CREATE_SIGNALED_BIT; - - for (int i = 0; i < m_numCommandBuffers; i++) - { - Fence& fence = m_fences[i]; - - api.vkAllocateCommandBuffers(api.m_device, &commandInfo, &m_commandBuffers[i]); - - api.vkCreateFence(api.m_device, &fenceCreateInfo, nullptr, &fence.fence); - fence.active = false; - fence.value = 0; - } - - VkSemaphoreCreateInfo semaphoreCreateInfo = {}; - semaphoreCreateInfo.sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO; - - for (int i = 0; i < int(EventType::CountOf); ++i) - { - api.vkCreateSemaphore(api.m_device, &semaphoreCreateInfo, nullptr, &m_semaphores[i]); - } - - // Second step of flush to prime command buffer - flushStepB(); - - return SLANG_OK; -} - -void VulkanDeviceQueue::flushStepA() -{ - m_api->vkEndCommandBuffer(m_commandBuffer); - - VkPipelineStageFlags stageFlags = VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT; - - VkSubmitInfo submitInfo = {}; - submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO; - - // Wait semaphores - if (isCurrent(EventType::BeginFrame)) - { - submitInfo.waitSemaphoreCount = 1; - submitInfo.pWaitSemaphores = &m_currentSemaphores[int(EventType::BeginFrame)]; - } - - submitInfo.pWaitDstStageMask = &stageFlags; - submitInfo.commandBufferCount = 1; - submitInfo.pCommandBuffers = &m_commandBuffer; - - // Signal semaphores - if (isCurrent(EventType::EndFrame)) - { - submitInfo.signalSemaphoreCount = 1; - submitInfo.pSignalSemaphores = &m_currentSemaphores[int(EventType::EndFrame)]; - } - - Fence& fence = m_fences[m_commandBufferIndex]; - - m_api->vkQueueSubmit(m_queue, 1, &submitInfo, fence.fence); - - // mark signaled fence value - 
fence.value = m_nextFenceValue; - fence.active = true; - - // increment fence value - m_nextFenceValue++; - - // No longer waiting on this semaphore - makeCompleted(EventType::BeginFrame); -} - -void VulkanDeviceQueue::_updateFenceAtIndex( int fenceIndex, bool blocking) -{ - Fence& fence = m_fences[fenceIndex]; - - if (fence.active) - { - uint64_t timeout = blocking ? ~uint64_t(0) : 0; - - if (VK_SUCCESS == m_api->vkWaitForFences(m_api->m_device, 1, &fence.fence, VK_TRUE, timeout)) - { - m_api->vkResetFences(m_api->m_device, 1, &fence.fence); - - fence.active = false; - - if (fence.value > m_lastFenceCompleted) - { - m_lastFenceCompleted = fence.value; - } - } - } -} - -void VulkanDeviceQueue::flushStepB() -{ - m_commandBufferIndex = (m_commandBufferIndex + 1) % m_numCommandBuffers; - m_commandBuffer = m_commandBuffers[m_commandBufferIndex]; - - // non-blocking update of fence values - for (int i = 0; i < m_numCommandBuffers; ++i) - { - _updateFenceAtIndex(i, false); - } - - // blocking update of fence values - _updateFenceAtIndex(m_commandBufferIndex, true); - - m_api->vkResetCommandBuffer(m_commandBuffer, 0); - - //m_api.vkResetCommandPool(m_api->m_device, m_commandPool, 0); - - VkCommandBufferBeginInfo beginInfo = {}; - beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO; - beginInfo.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT; - - m_api->vkBeginCommandBuffer(m_commandBuffer, &beginInfo); -} - -void VulkanDeviceQueue::flush() -{ - flushStepA(); - flushStepB(); -} - -void VulkanDeviceQueue::flushAndWait() -{ - flush(); - waitForIdle(); -} - -VkSemaphore VulkanDeviceQueue::makeCurrent(EventType eventType) -{ - assert(!isCurrent(eventType)); - VkSemaphore semaphore = m_semaphores[int(eventType)]; - m_currentSemaphores[int(eventType)] = semaphore; - return semaphore; -} - -void VulkanDeviceQueue::makeCompleted(EventType eventType) -{ - m_currentSemaphores[int(eventType)] = VK_NULL_HANDLE; -} - -} // renderer_test diff --git 
a/tools/slang-graphics/vk-device-queue.h b/tools/slang-graphics/vk-device-queue.h deleted file mode 100644 index 01ed16f5d..000000000 --- a/tools/slang-graphics/vk-device-queue.h +++ /dev/null @@ -1,94 +0,0 @@ -// vk-swap-chain.h -#pragma once - -#include "vk-api.h" - -namespace slang_graphics { - -struct VulkanDeviceQueue -{ - enum - { - kMaxCommandBuffers = 8, - }; - - enum class EventType - { - BeginFrame, - EndFrame, - CountOf, - }; - - /// Initialize - must be called before anything else can be done - SlangResult init(const VulkanApi& api, VkQueue queue, int queueIndex); - - /// Flushes the current command list, and steps to next (internally this is equivalent to a stepA followed by stepB) - void flush(); - /// Performs a full flush, and then waits for idle. - void flushAndWait(); - - /// Blocks until all work submitted to GPU has completed - void waitForIdle() { m_api->vkQueueWaitIdle(m_queue); } - - /// Get the graphics queue index (as set on init) - int getQueueIndex() const { return m_queueIndex; } - - /// Make the specified event 'current' - meaning it's semaphore must be waited on - VkSemaphore makeCurrent(EventType eventType); - /// Makes the event no longer required to be waited on - void makeCompleted(EventType eventType); - /// Returns true if the event is already current - SLANG_FORCE_INLINE bool isCurrent(EventType eventType) const { return m_currentSemaphores[int(eventType)] != VK_NULL_HANDLE; } - - /// Get the command buffer - VkCommandBuffer getCommandBuffer() const { return m_commandBuffer; } - - /// Get the queue - VkQueue getQueue() const { return m_queue; } - - /// Get the API - const VulkanApi* getApi() const { return m_api; } - - /// Flushes the current command list - void flushStepA(); - /// Steps to next command buffer and opens. 
May block if command buffer is still in use - void flushStepB(); - - /// Dtor - ~VulkanDeviceQueue(); - - protected: - - struct Fence - { - VkFence fence; - bool active; - uint64_t value; - }; - - void _updateFenceAtIndex(int fenceIndex, bool blocking); - - VkQueue m_queue = VK_NULL_HANDLE; - - VkCommandPool m_commandPool = VK_NULL_HANDLE; - int m_numCommandBuffers = 0; - int m_commandBufferIndex = 0; - // There are the same amount of command buffers as fences - VkCommandBuffer m_commandBuffers[kMaxCommandBuffers] = { VK_NULL_HANDLE }; - - Fence m_fences[kMaxCommandBuffers] = { {VK_NULL_HANDLE, 0, 0u} }; - - VkCommandBuffer m_commandBuffer = VK_NULL_HANDLE; - - VkSemaphore m_semaphores[int(EventType::CountOf)]; - VkSemaphore m_currentSemaphores[int(EventType::CountOf)]; - - uint64_t m_lastFenceCompleted = 1; - uint64_t m_nextFenceValue = 2; - - int m_queueIndex = 0; - - const VulkanApi* m_api = nullptr; -}; - -} // renderer_test diff --git a/tools/slang-graphics/vk-module.cpp b/tools/slang-graphics/vk-module.cpp deleted file mode 100644 index 460e6550c..000000000 --- a/tools/slang-graphics/vk-module.cpp +++ /dev/null @@ -1,76 +0,0 @@ -// module.cpp -#include "vk-module.h" - -#include -#include -#include - -#if SLANG_WINDOWS_FAMILY -# include -#else -# include -#endif - -namespace slang_graphics { -using namespace Slang; - -// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! VulkanModule !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
- -Slang::Result VulkanModule::init() -{ - if (isInitialized()) - { - destroy(); - return SLANG_OK; - } - - const char* dynamicLibraryName = "Unknown"; - -#if SLANG_WINDOWS_FAMILY - dynamicLibraryName = "vulkan-1.dll"; - HMODULE module = ::LoadLibraryA(dynamicLibraryName); - m_module = (void*)module; -#else - dynamicLibraryName = "libvulkan.so.1"; - m_module = dlopen(dynamicLibraryName, RTLD_NOW); -#endif - - if (!m_module) - { - fprintf(stderr, "error: failed load '%s'\n", dynamicLibraryName); - return SLANG_FAIL; - } - - return SLANG_OK; -} - -PFN_vkVoidFunction VulkanModule::getFunction(const char* name) const -{ - assert(m_module); - if (!m_module) - { - return nullptr; - } -#if SLANG_WINDOWS_FAMILY - return (PFN_vkVoidFunction)::GetProcAddress((HMODULE)m_module, name); -#else - return (PFN_vkVoidFunction)dlsym(m_module, name); -#endif -} - -void VulkanModule::destroy() -{ - if (!isInitialized()) - { - return; - } - -#if SLANG_WINDOWS_FAMILY - ::FreeLibrary((HMODULE)m_module); -#else - dlclose(m_module); -#endif - m_module = nullptr; -} - -} // renderer_test diff --git a/tools/slang-graphics/vk-module.h b/tools/slang-graphics/vk-module.h deleted file mode 100644 index c72334db5..000000000 --- a/tools/slang-graphics/vk-module.h +++ /dev/null @@ -1,39 +0,0 @@ -// vk-module.h -#pragma once - -#include "../../slang.h" - -#include "../../slang-com-helper.h" - -#if SLANG_WINDOWS_FAMILY -# define VK_USE_PLATFORM_WIN32_KHR 1 -#else -# define VK_USE_PLATFORM_XLIB_KHR 1 -#endif - -#define VK_NO_PROTOTYPES -#include - -namespace slang_graphics { - -struct VulkanModule -{ - /// true if has been initialized - SLANG_FORCE_INLINE bool isInitialized() const { return m_module != nullptr; } - - /// Get a function by name - PFN_vkVoidFunction getFunction(const char* name) const; - - /// Initialize - Slang::Result init(); - /// Destroy - void destroy(); - - /// Dtor - ~VulkanModule() { destroy(); } - - protected: - void* m_module = nullptr; -}; - -} // renderer_test diff --git 
a/tools/slang-graphics/vk-swap-chain.cpp b/tools/slang-graphics/vk-swap-chain.cpp deleted file mode 100644 index a6704e9d7..000000000 --- a/tools/slang-graphics/vk-swap-chain.cpp +++ /dev/null @@ -1,421 +0,0 @@ -// vk-swap-chain.cpp -#include "vk-swap-chain.h" - -#include "vk-util.h" - -#include "../../source/core/list.h" - -#include -#include - -namespace slang_graphics { -using namespace Slang; - -static int _indexOf(List& formatsIn, VkFormat format) -{ - const int numFormats = int(formatsIn.Count()); - const VkSurfaceFormatKHR* formats = formatsIn.Buffer(); - - for (int i = 0; i < numFormats; ++i) - { - if (formats[i].format == format) - { - return i; - } - } - return -1; -} - -SlangResult VulkanSwapChain::init(VulkanDeviceQueue* deviceQueue, const Desc& descIn, const PlatformDesc* platformDescIn) -{ - assert(platformDescIn); - - m_deviceQueue = deviceQueue; - m_api = deviceQueue->getApi(); - - // Make sure it's not set initially - m_format = VK_FORMAT_UNDEFINED; - - Desc desc(descIn); - -#if SLANG_WINDOWS_FAMILY - const WinPlatformDesc* platformDesc = static_cast(platformDescIn); - _setPlatformDesc(*platformDesc); - - VkWin32SurfaceCreateInfoKHR surfaceCreateInfo = {}; - surfaceCreateInfo.sType = VK_STRUCTURE_TYPE_WIN32_SURFACE_CREATE_INFO_KHR; - surfaceCreateInfo.hinstance = platformDesc->m_hinstance; - surfaceCreateInfo.hwnd = platformDesc->m_hwnd; - - SLANG_VK_RETURN_ON_FAIL(m_api->vkCreateWin32SurfaceKHR(m_api->m_instance, &surfaceCreateInfo, nullptr, &m_surface)); -#else - const XPlatformDesc* platformDesc = static_cast(platformDescIn); - _setPlatformDesc(*platformDesc); - - VkXlibSurfaceCreateInfoKHR surfaceCreateInfo = {}; - surfaceCreateInfo.sType = VK_STRUCTURE_TYPE_XLIB_SURFACE_CREATE_INFO_KHR; - surfaceCreateInfo.dpy = platformDesc->m_display; - surfaceCreateInfo.window = platformDesc->m_window; - - SLANG_VK_RETURN_ON_FAIL(m_api->vkCreateXlibSurfaceKHR(m_api->m_instance, &surfaceCreateInfo, nullptr, &m_surface)); -#endif - - VkBool32 supported = 
false; - m_api->vkGetPhysicalDeviceSurfaceSupportKHR(m_api->m_physicalDevice, deviceQueue->getQueueIndex(), m_surface, &supported); - - uint32_t numSurfaceFormats = 0; - List surfaceFormats; - m_api->vkGetPhysicalDeviceSurfaceFormatsKHR(m_api->m_physicalDevice, m_surface, &numSurfaceFormats, nullptr); - surfaceFormats.SetSize(int(numSurfaceFormats)); - m_api->vkGetPhysicalDeviceSurfaceFormatsKHR(m_api->m_physicalDevice, m_surface, &numSurfaceFormats, surfaceFormats.Buffer()); - - // Look for a suitable format - List formats; - formats.Add(VulkanUtil::getVkFormat(desc.m_format)); - // HACK! To check for a different format if couldn't be found - if (descIn.m_format == Format::RGBA_Unorm_UInt8) - { - formats.Add(VK_FORMAT_B8G8R8A8_UNORM); - } - - for(int i = 0; i < int(formats.Count()); ++i) - { - VkFormat format = formats[i]; - if (_indexOf(surfaceFormats, format) >= 0) - { - m_format = format; - } - } - - if (m_format == VK_FORMAT_UNDEFINED) - { - return SLANG_FAIL; - } - - // Save the desc - m_desc = desc; - - SLANG_RETURN_ON_FAIL(_createSwapChain()); - - m_desc = desc; - return SLANG_OK; -} - -void VulkanSwapChain::getWindowSize(int* widthOut, int* heightOut) const -{ -#if SLANG_WINDOWS_FAMILY - auto platformDesc = _getPlatformDesc(); - - RECT rc; - ::GetClientRect(platformDesc->m_hwnd, &rc); - *widthOut = rc.right - rc.left; - *heightOut = rc.bottom - rc.top; -#else - auto platformDesc = _getPlatformDesc(); - - XWindowAttributes winAttr = {}; - XGetWindowAttributes(platformDesc->m_display, platformDesc->m_window, &winAttr); - - *widthOut = winAttr.width; - *heightOut = winAttr.height; -#endif -} - -SlangResult VulkanSwapChain::_createFrameBuffers(VkRenderPass renderPass) -{ - assert(renderPass != VK_NULL_HANDLE); - - for (int i = 0; i < int(m_images.Count()); ++i) - { - Image& image = m_images[i]; - VkImageView attachments[] = - { - image.m_imageView - }; - - VkFramebufferCreateInfo framebufferInfo = {}; - framebufferInfo.sType = 
VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO; - framebufferInfo.renderPass = renderPass; - framebufferInfo.attachmentCount = 1; - framebufferInfo.pAttachments = attachments; - framebufferInfo.width = m_width; - framebufferInfo.height = m_height; - framebufferInfo.layers = 1; - - SLANG_VK_RETURN_ON_FAIL(m_api->vkCreateFramebuffer(m_api->m_device, &framebufferInfo, nullptr, &image.m_frameBuffer)); - } - - return SLANG_OK; -} - -void VulkanSwapChain::_destroyFrameBuffers() -{ - for (int i = 0; i < int(m_images.Count()); ++i) - { - Image& image = m_images[i]; - if (image.m_frameBuffer != VK_NULL_HANDLE) - { - m_api->vkDestroyFramebuffer(m_api->m_device, image.m_frameBuffer, nullptr); - image.m_frameBuffer = VK_NULL_HANDLE; - } - } -} - -SlangResult VulkanSwapChain::createFrameBuffers(VkRenderPass renderPass) -{ - if (m_renderPass != VK_NULL_HANDLE) - { - _destroyFrameBuffers(); - m_renderPass = VK_NULL_HANDLE; - } - if (renderPass != VK_NULL_HANDLE) - { - SLANG_RETURN_ON_FAIL(_createFrameBuffers(renderPass)); - } - m_renderPass = renderPass; - return SLANG_OK; -} - -SlangResult VulkanSwapChain::_createSwapChain() -{ - if (hasValidSwapChain()) - { - return SLANG_OK; - } - - int width, height; - getWindowSize(&width, &height); - - VkExtent2D imageExtent = {}; - imageExtent.width = width; - imageExtent.height = height; - - m_width = width; - m_height = height; - - // catch this before throwing error - if (m_width == 0 || m_height == 0) - { - return SLANG_FAIL; - } - - // It is necessary to query the caps -> otherwise the LunarG verification layer will issue an error - { - VkSurfaceCapabilitiesKHR surfaceCaps; - - SLANG_VK_RETURN_ON_FAIL(m_api->vkGetPhysicalDeviceSurfaceCapabilitiesKHR(m_api->m_physicalDevice, m_surface, &surfaceCaps)); - } - - List presentModes; - uint32_t numPresentModes = 0; - m_api->vkGetPhysicalDeviceSurfacePresentModesKHR(m_api->m_physicalDevice, m_surface, &numPresentModes, nullptr); - presentModes.SetSize(numPresentModes); - 
m_api->vkGetPhysicalDeviceSurfacePresentModesKHR(m_api->m_physicalDevice, m_surface, &numPresentModes, presentModes.Buffer()); - - { - int numCheckPresentOptions = 3; - VkPresentModeKHR presentOptions[] = { VK_PRESENT_MODE_IMMEDIATE_KHR, VK_PRESENT_MODE_MAILBOX_KHR, VK_PRESENT_MODE_FIFO_KHR }; - if (m_vsync) - { - presentOptions[0] = VK_PRESENT_MODE_FIFO_KHR; - presentOptions[1] = VK_PRESENT_MODE_IMMEDIATE_KHR; - presentOptions[2] = VK_PRESENT_MODE_MAILBOX_KHR; - } - - m_presentMode = VK_PRESENT_MODE_MAX_ENUM_KHR; // Invalid - - // Find the first option that's available on the device - for (int j = 0; j < numCheckPresentOptions; j++) - { - if (presentModes.IndexOf(presentOptions[j]) != UInt(-1)) - { - m_presentMode = presentOptions[j]; - break; - } - } - - if (m_presentMode == VK_PRESENT_MODE_MAX_ENUM_KHR) - { - return SLANG_FAIL; - } - } - - VkSwapchainKHR oldSwapchain = VK_NULL_HANDLE; - - VkSwapchainCreateInfoKHR swapchainDesc = {}; - swapchainDesc.sType = VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR; - swapchainDesc.surface = m_surface; - swapchainDesc.minImageCount = 3; - swapchainDesc.imageFormat = m_format; - swapchainDesc.imageColorSpace = VK_COLOR_SPACE_SRGB_NONLINEAR_KHR; - swapchainDesc.imageExtent = imageExtent; - swapchainDesc.imageArrayLayers = 1; - swapchainDesc.imageUsage = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT; - swapchainDesc.imageSharingMode = VK_SHARING_MODE_EXCLUSIVE; - swapchainDesc.preTransform = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR; - swapchainDesc.compositeAlpha = VK_COMPOSITE_ALPHA_OPAQUE_BIT_KHR; - swapchainDesc.presentMode = m_presentMode; - swapchainDesc.clipped = VK_TRUE; - swapchainDesc.oldSwapchain = oldSwapchain; - - SLANG_VK_RETURN_ON_FAIL(m_api->vkCreateSwapchainKHR(m_api->m_device, &swapchainDesc, nullptr, &m_swapChain)); - - uint32_t numSwapChainImages = 0; - m_api->vkGetSwapchainImagesKHR(m_api->m_device, m_swapChain, &numSwapChainImages, nullptr); - - { - List images; - 
images.SetSize(numSwapChainImages); - - m_api->vkGetSwapchainImagesKHR(m_api->m_device, m_swapChain, &numSwapChainImages, images.Buffer()); - - m_images.SetSize(numSwapChainImages); - for (int i = 0; i < int(numSwapChainImages); ++i) - { - Image& dstImage = m_images[i]; - dstImage.m_image = images[i]; - - } - } - - { - VkImageViewCreateInfo createInfo = {}; - createInfo.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO; - - createInfo.viewType = VK_IMAGE_VIEW_TYPE_2D; - createInfo.format = m_format; - - createInfo.components.r = VK_COMPONENT_SWIZZLE_IDENTITY; - createInfo.components.g = VK_COMPONENT_SWIZZLE_IDENTITY; - createInfo.components.b = VK_COMPONENT_SWIZZLE_IDENTITY; - createInfo.components.a = VK_COMPONENT_SWIZZLE_IDENTITY; - - createInfo.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT; - createInfo.subresourceRange.baseMipLevel = 0; - createInfo.subresourceRange.levelCount = 1; - createInfo.subresourceRange.baseArrayLayer = 0; - createInfo.subresourceRange.layerCount = 1; - - for (int i = 0; i < int(numSwapChainImages); ++i) - { - Image& image = m_images[i]; - - createInfo.image = image.m_image; - - SLANG_VK_RETURN_ON_FAIL(m_api->vkCreateImageView(m_api->m_device, &createInfo, nullptr, &image.m_imageView)); - } - } - - if (m_renderPass != VK_NULL_HANDLE) - { - _createFrameBuffers(m_renderPass); - } - - return SLANG_OK; -} - -void VulkanSwapChain::_destroySwapChain() -{ - if (!hasValidSwapChain()) - { - return; - } - - m_deviceQueue->waitForIdle(); - - if (m_renderPass != VK_NULL_HANDLE) - { - _destroyFrameBuffers(); - } - - for (int i = 0; i < int(m_images.Count()); ++i) - { - Image& image = m_images[i]; - - if (image.m_imageView != VK_NULL_HANDLE) - { - m_api->vkDestroyImageView(m_api->m_device, image.m_imageView, nullptr); - } - } - - if (m_swapChain != VK_NULL_HANDLE) - { - m_api->vkDestroySwapchainKHR(m_api->m_device, m_swapChain, nullptr); - m_swapChain = VK_NULL_HANDLE; - } - - // Mark that it is no longer used - m_images.Clear(); -} - 
-VulkanSwapChain::~VulkanSwapChain() -{ - _destroySwapChain(); - - if (m_surface) - { - m_api->vkDestroySurfaceKHR(m_api->m_instance, m_surface, nullptr); - m_surface = VK_NULL_HANDLE; - } -} - -int VulkanSwapChain::nextFrontImageIndex() -{ - if (!hasValidSwapChain()) - { - if (SLANG_FAILED(_createSwapChain())) - { - return -1; - } - } - - VkSemaphore beginFrameSemaphore = m_deviceQueue->makeCurrent(VulkanDeviceQueue::EventType::BeginFrame); - - uint32_t swapChainIndex = 0; - VkResult result = m_api->vkAcquireNextImageKHR(m_api->m_device, m_swapChain, UINT64_MAX, beginFrameSemaphore, VK_NULL_HANDLE, &swapChainIndex); - - if (result != VK_SUCCESS) - { - _destroySwapChain(); - return -1; - } - m_currentSwapChainIndex = int(swapChainIndex); - return swapChainIndex; -} - -void VulkanSwapChain::present(bool vsync) -{ - if (!hasValidSwapChain()) - { - m_deviceQueue->flush(); - return; - } - - VkSemaphore endFrameSemaphore = m_deviceQueue->makeCurrent(VulkanDeviceQueue::EventType::EndFrame); - - m_deviceQueue->flushStepA(); - - uint32_t swapChainIndices[] = { uint32_t(m_currentSwapChainIndex) }; - - VkPresentInfoKHR presentInfo = {}; - presentInfo.sType = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR; - presentInfo.swapchainCount = 1; - presentInfo.pSwapchains = &m_swapChain; - presentInfo.pImageIndices = swapChainIndices; - presentInfo.waitSemaphoreCount = 1; - presentInfo.pWaitSemaphores = &endFrameSemaphore; - - VkResult result = m_api->vkQueuePresentKHR(m_deviceQueue->getQueue(), &presentInfo); - - m_deviceQueue->makeCompleted(VulkanDeviceQueue::EventType::EndFrame); - - m_deviceQueue->flushStepB(); - - if (result != VK_SUCCESS || m_vsync != vsync) - { - m_vsync = vsync; - _destroySwapChain(); - } -} - -} // renderer_test diff --git a/tools/slang-graphics/vk-swap-chain.h b/tools/slang-graphics/vk-swap-chain.h deleted file mode 100644 index 7c04af70c..000000000 --- a/tools/slang-graphics/vk-swap-chain.h +++ /dev/null @@ -1,141 +0,0 @@ -// vk-swap-chain.h -#pragma once - -#include 
"vk-api.h" -#include "vk-device-queue.h" - -#include "render.h" - -#include "../../source/core/list.h" - -namespace slang_graphics { - -struct VulkanSwapChain -{ - /* enum - { - kMaxImages = 8, - }; */ - - /// Base class for platform specific information - struct PlatformDesc - { - }; - -#if SLANG_WINDOWS_FAMILY - struct WinPlatformDesc: public PlatformDesc - { - HINSTANCE m_hinstance; - HWND m_hwnd; - }; -#else - struct XPlatformDesc : public PlatformDesc - { - Display* m_display; - Window m_window; - }; -#endif - - struct Desc - { - void init() - { - m_format = Format::Unknown; - m_depthFormatTypeless = Format::Unknown; - m_depthFormat = Format::Unknown; - m_textureDepthFormat = Format::Unknown; - } - - Format m_format; - //bool m_enableFormat; - Format m_depthFormatTypeless; - Format m_depthFormat; - Format m_textureDepthFormat; - }; - - struct Image - { - VkImage m_image = VK_NULL_HANDLE; - VkImageView m_imageView = VK_NULL_HANDLE; - VkFramebuffer m_frameBuffer = VK_NULL_HANDLE; - }; - - - /// Must be called before the swap chain can be used - SlangResult init(VulkanDeviceQueue* deviceQueue, const Desc& desc, const PlatformDesc* platformDesc); - - /// Create the frame buffers (they must be compatible with the supplied renderPass) - SlangResult createFrameBuffers(VkRenderPass renderPass); - - /// Returned the desc used to construct the swap chain. - /// Is invalid if init hasn't returned with successful result. 
- const Desc& getDesc() const { return m_desc; } - - /// True if the swap chain is available - bool hasValidSwapChain() const { return m_images.Count() > 0; } - - /// Present to the display - void present(bool vsync); - - /// Get the current size of the window (in pixels written to widthOut, heightOut) - void getWindowSize(int* widthOut, int* heightOut) const; - - /// Get the VkFormat for the back buffer - VkFormat getVkFormat() const { return m_format; } - - /// Get width of the back buffers - int getWidth() const { return m_width; } - /// Get the height of the back buffer - int getHeight() const { return m_height; } - - /// Get the detail about the images - const Slang::List& getImages() const { return m_images; } - - /// Get the next front render image index. Returns -1, if image couldn't be found - int nextFrontImageIndex(); - - /// Dtor - ~VulkanSwapChain(); - - protected: - - - template - void _setPlatformDesc(const T& desc) - { - const PlatformDesc* check = &desc; - int size = (sizeof(T) + sizeof(void*) - 1) / sizeof(void*); - m_platformDescBuffer.SetSize(size); - *(T*)m_platformDescBuffer.Buffer() = desc; - } - template - const T* _getPlatformDesc() const { return static_cast((const PlatformDesc*)m_platformDescBuffer.Buffer()); } - SlangResult _createSwapChain(); - void _destroySwapChain(); - SlangResult _createFrameBuffers(VkRenderPass renderPass); - void _destroyFrameBuffers(); - - bool m_vsync = true; - int m_width = 0; - int m_height = 0; - - VkPresentModeKHR m_presentMode = VK_PRESENT_MODE_IMMEDIATE_KHR; - VkFormat m_format = VK_FORMAT_UNDEFINED; ///< The format used for backbuffer. Valid after successful init. 
- - VkSurfaceKHR m_surface = VK_NULL_HANDLE; - VkSwapchainKHR m_swapChain = VK_NULL_HANDLE; - - VkRenderPass m_renderPass = VK_NULL_HANDLE; //< Not owned - - int m_currentSwapChainIndex = 0; - - Slang::List m_images; - - VulkanDeviceQueue* m_deviceQueue = nullptr; - const VulkanApi* m_api = nullptr; - - Desc m_desc; ///< The desc used to init this swap chain - Slang::List m_platformDescBuffer; ///< Buffer to hold the platform specific description parameters (as passed in platformDesc) -}; - -} // renderer_test diff --git a/tools/slang-graphics/vk-util.cpp b/tools/slang-graphics/vk-util.cpp deleted file mode 100644 index 374a3876d..000000000 --- a/tools/slang-graphics/vk-util.cpp +++ /dev/null @@ -1,59 +0,0 @@ -// vk-util.cpp -#include "vk-util.h" - -#include -#include - -namespace slang_graphics { - -/* static */VkFormat VulkanUtil::getVkFormat(Format format) -{ - switch (format) - { - case Format::RGBA_Float32: return VK_FORMAT_R32G32B32A32_SFLOAT; - case Format::RGB_Float32: return VK_FORMAT_R32G32B32_SFLOAT; - case Format::RG_Float32: return VK_FORMAT_R32G32_SFLOAT; - case Format::R_Float32: return VK_FORMAT_R32_SFLOAT; - case Format::RGBA_Unorm_UInt8: return VK_FORMAT_R8G8B8A8_UNORM; - case Format::R_UInt32: return VK_FORMAT_R32_UINT; - - case Format::D_Float32: return VK_FORMAT_D32_SFLOAT; - case Format::D_Unorm24_S8: return VK_FORMAT_D24_UNORM_S8_UINT; - - default: return VK_FORMAT_UNDEFINED; - } -} - -/* static */SlangResult VulkanUtil::toSlangResult(VkResult res) -{ - return (res == VK_SUCCESS) ? 
SLANG_OK : SLANG_FAIL; -} - -/* static */Slang::Result VulkanUtil::handleFail(VkResult res) -{ - if (res != VK_SUCCESS) - { - assert(!"Vulkan returned a failure"); - } - return toSlangResult(res); -} - -/* static */void VulkanUtil::checkFail(VkResult res) -{ - assert(res != VK_SUCCESS); - assert(!"Vulkan check failed"); - -} - -/* static */VkPrimitiveTopology VulkanUtil::getVkPrimitiveTopology(PrimitiveTopology topology) -{ - switch (topology) - { - case PrimitiveTopology::TriangleList: return VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST; - default: break; - } - assert(!"Unknown topology"); - return VK_PRIMITIVE_TOPOLOGY_MAX_ENUM; -} - -} // renderer_test diff --git a/tools/slang-graphics/vk-util.h b/tools/slang-graphics/vk-util.h deleted file mode 100644 index 420c0a57a..000000000 --- a/tools/slang-graphics/vk-util.h +++ /dev/null @@ -1,41 +0,0 @@ -// vk-util.h -#pragma once - -#include "vk-api.h" -#include "render.h" - -// Macros to make testing vulkan return codes simpler - -/// SLANG_VK_RETURN_ON_FAIL can be used in a similar way to SLANG_RETURN_ON_FAIL macro, except it will turn a vulkan failure into Slang::Result in the process -/// Calls handleFail which on debug builds asserts -#define SLANG_VK_RETURN_ON_FAIL(x) { VkResult _res = x; if (_res != VK_SUCCESS) { return VulkanUtil::handleFail(_res); } } - -#define SLANG_VK_RETURN_NULL_ON_FAIL(x) { VkResult _res = x; if (_res != VK_SUCCESS) { VulkanUtil::handleFail(_res); return nullptr; } } - -/// Is similar to SLANG_VK_RETURN_ON_FAIL, but does not return. Will call checkFail on failure - which asserts on debug builds. -#define SLANG_VK_CHECK(x) { VkResult _res = x; if (_res != VK_SUCCESS) { VulkanUtil::checkFail(_res); } } - -namespace slang_graphics { - -// Utility functions for Vulkan -struct VulkanUtil -{ - /// Get the equivalent VkFormat from the format - /// Returns VK_FORMAT_UNDEFINED if a match is not found - static VkFormat getVkFormat(Format format); - - /// Called by SLANG_VK_RETURN_FAIL if a res is a failure. 
- /// On debug builds this will cause an assertion on failure. - static Slang::Result handleFail(VkResult res); - /// Called when a failure has occurred with SLANG_VK_CHECK - will typically assert. - static void checkFail(VkResult res); - - /// Get the VkPrimitiveTopology for the given topology. - /// Returns VK_PRIMITIVE_TOPOLOGY_MAX_ENUM on failure - static VkPrimitiveTopology getVkPrimitiveTopology(PrimitiveTopology topology); - - /// Returns Slang::Result equivalent of a VkResult - static Slang::Result toSlangResult(VkResult res); -}; - -} // renderer_test diff --git a/tools/slang-graphics/window.cpp b/tools/slang-graphics/window.cpp deleted file mode 100644 index 7aef88c12..000000000 --- a/tools/slang-graphics/window.cpp +++ /dev/null @@ -1,245 +0,0 @@ -// window.cpp -#include "window.h" -#pragma once - -#include - -#ifdef _MSC_VER -#include -#if (_MSC_VER < 1900) -#define snprintf sprintf_s -#endif -#endif - - -#if _WIN32 -#include -#else -#error "The slang-graphics library currently only supports Windows platforms" -#endif - -namespace slang_graphics { - -#if _WIN32 - -struct OSString -{ - OSString(char const* begin, char const* end) - { - _initialize(begin, end - begin); - } - - OSString(char const* begin) - { - _initialize(begin, strlen(begin)); - } - - ~OSString() - { - free(mBegin); - } - - operator WCHAR const*() - { - return mBegin; - } - -private: - WCHAR* mBegin; - WCHAR* mEnd; - - void _initialize(char const* input, size_t inputSize) - { - const DWORD dwFlags = 0; - int outputCodeUnitCount = ::MultiByteToWideChar(CP_UTF8, dwFlags, input, int(inputSize), nullptr, 0); - - WCHAR* buffer = (WCHAR*)malloc(sizeof(WCHAR) * (outputCodeUnitCount + 1)); - - ::MultiByteToWideChar(CP_UTF8, dwFlags, input, int(inputSize), buffer, outputCodeUnitCount); - buffer[outputCodeUnitCount] = 0; - - mBegin = buffer; - mEnd = buffer + outputCodeUnitCount; - } -}; - -struct ApplicationContext -{ - HINSTANCE instance; - int showCommand = SW_SHOWDEFAULT; - int resultCode = 0; 
-};
-
-/// Run an application given the specified callback and command-line arguments.
-int runApplication(
-    ApplicationFunc func,
-    int argc,
-    char const* const* argv)
-{
-    ApplicationContext context;
-    context.instance = (HINSTANCE) GetModuleHandle(0);
-    func(&context);
-    return context.resultCode;
-}
-
-int runWindowsApplication(
-    ApplicationFunc func,
-    void* instance,
-    int showCommand)
-{
-    ApplicationContext context;
-    context.instance = (HINSTANCE) instance;
-    context.showCommand = showCommand;
-    func(&context);
-    return context.resultCode;
-}
-
-struct Window
-{
-    HWND handle;
-};
-
-static LRESULT CALLBACK windowProc(
-    HWND windowHandle,
-    UINT message,
-    WPARAM wParam,
-    LPARAM lParam)
-{
-    // TODO: Actually implement some reasonable logic here.
-    switch (message)
-    {
-    case WM_CLOSE:
-        PostQuitMessage(0);
-        return 0;
-    }
-
-    return DefWindowProcW(windowHandle, message, wParam, lParam);
-}
-
-
-static ATOM createWindowClassAtom()
-{
-    WNDCLASSEXW windowClassDesc;
-    windowClassDesc.cbSize = sizeof(windowClassDesc);
-    windowClassDesc.style = CS_OWNDC | CS_HREDRAW | CS_VREDRAW;
-    windowClassDesc.lpfnWndProc = &windowProc;
-    windowClassDesc.cbClsExtra = 0;
-    windowClassDesc.cbWndExtra = 0;
-    windowClassDesc.hInstance = (HINSTANCE) GetModuleHandle(0);
-    windowClassDesc.hIcon = 0;
-    windowClassDesc.hCursor = 0;
-    windowClassDesc.hbrBackground = 0;
-    windowClassDesc.lpszMenuName = 0;
-    windowClassDesc.lpszClassName = L"SlangGraphicsWindow";
-    windowClassDesc.hIconSm = 0;
-    ATOM windowClassAtom = RegisterClassExW(&windowClassDesc);
-    return windowClassAtom;
-}
-
-static ATOM getWindowClassAtom()
-{
-    static ATOM windowClassAtom = createWindowClassAtom();
-    return windowClassAtom;
-}
-
-Window* createWindow(WindowDesc const& desc)
-{
-    Window* window = new Window();
-
-    OSString windowTitle(desc.title);
-
-    DWORD windowExtendedStyle = 0;
-    DWORD windowStyle = 0;
-
-    HINSTANCE instance = (HINSTANCE) GetModuleHandle(0);
-
-    HWND windowHandle = CreateWindowExW(
-        windowExtendedStyle,
-        (LPWSTR) getWindowClassAtom(),
-        windowTitle,
-        windowStyle,
-        0, 0, // x, y
-        desc.width, desc.height,
-        NULL, // parent
-        NULL, // menu
-        instance,
-        window);
-
-    if(!windowHandle)
-    {
-        delete window;
-        return nullptr;
-    }
-
-    window->handle = windowHandle;
-    return window;
-}
-
-void showWindow(Window* window)
-{
-    ShowWindow(window->handle, SW_SHOW);
-}
-
-void* getPlatformWindowHandle(Window* window)
-{
-    return window->handle;
-}
-
-bool dispatchEvents(ApplicationContext* context)
-{
-    for(;;)
-    {
-        MSG message;
-
-        int result = PeekMessageW(&message, NULL, 0, 0, PM_REMOVE);
-        if (result != 0)
-        {
-            if (message.message == WM_QUIT)
-            {
-                context->resultCode = (int)message.wParam;
-                return false;
-            }
-
-            TranslateMessage(&message);
-            DispatchMessageW(&message);
-        }
-        else
-        {
-            return true;
-        }
-    }
-
-}
-
-void exitApplication(ApplicationContext* context, int resultCode)
-{
-    ExitProcess(resultCode);
-}
-
-int reportError(char const* message, ...)
-{
-    va_list args;
-    va_start(args, message);
-
-    static const int kBufferSize = 1024;
-    char messageBuffer[kBufferSize];
-    vsnprintf(messageBuffer, kBufferSize - 1, message, args);
-    messageBuffer[kBufferSize - 1] = 0;
-
-    va_end(args);
-
-    fputs(messageBuffer, stderr);
-
-    OSString wideMessageBuffer(messageBuffer);
-    OutputDebugStringW(wideMessageBuffer);
-
-    return 1;
-}
-
-#else
-
-// TODO: put an SDL version here
-
-#endif
-
-} // slang_graphics
diff --git a/tools/slang-graphics/window.h b/tools/slang-graphics/window.h
deleted file mode 100644
index 91c8286d5..000000000
--- a/tools/slang-graphics/window.h
+++ /dev/null
@@ -1,69 +0,0 @@
-// window.h
-#pragma once
-
-namespace slang_graphics {
-
-struct WindowDesc
-{
-    char const* title;
-    int width;
-    int height;
-};
-
-typedef struct Window Window;
-
-Window* createWindow(WindowDesc const& desc);
-void showWindow(Window* window);
-void* getPlatformWindowHandle(Window* window);
-
-/// Opaque state provided by platform for a running application.
-typedef struct ApplicationContext ApplicationContext;
-
-/// User-defined application entry-point function.
-typedef void(*ApplicationFunc)(ApplicationContext* context);
-
-/// Dispatch any pending events for application.
-///
-/// @returns `true` if application should keep running.
-bool dispatchEvents(ApplicationContext* context);
-
-/// Exit the application with a given result code
-void exitApplication(ApplicationContext* context, int resultCode);
-
-/// Report an error to an appropriate logging destination.
-int reportError(char const* message, ...);
-
-/// Run an application given the specified callback and command-line arguments.
-int runApplication(
-    ApplicationFunc func,
-    int argc,
-    char const* const* argv);
-
-#define SG_CONSOLE_MAIN(APPLICATION_ENTRY) \
-    int main(int argc, char** argv) { \
-        return slang_graphics::runApplication(&(APPLIATION_ENTRY), argc, argv); \
-    }
-
-#ifdef _WIN32
-
-int runWindowsApplication(
-    ApplicationFunc func,
-    void* instance,
-    int showCommand);
-
-#define SG_UI_MAIN(APPLICATION_ENTRY) \
-    int __stdcall WinMain( \
-        void* instance, \
-        void* /* prevInstance */, \
-        void* /* commandLine */, \
-        int showCommand) { \
-        return slang_graphics::runWindowsApplication(&(APPLICATION_ENTRY), instance, showCommand); \
-    }
-
-#else
-
-#define SG_UI_MAIN(APPLICATION_ENTRY) SG_CONSOLE_MAIN(APPLICATION_ENTRY)
-
-#endif
-
-} // slang_graphics
-- 
cgit v1.2.3