| Commit message (Collapse) | Author | Age |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Remove old code paths from render-test
Historically, the `render-test` tool was using three different code paths:
* One based on `gfx` and manual (non-reflection-based) parameter setting, used for OpenGL, D3D11, D3D12, and Vulkan
* One for CPU that used reflection-based parameter setting but shared no code with the first
* One for CUDA that used reflection-based parameter setting and shared some, but not all, code with the CPU path
Recently we've updated `render-test` to include a fourth option:
* Using `gfx` and the "shader object" system it exposes for a unified reflection-based parameter-setting system taht works across OpenGL, D3D11, D3D12, Vulkan, CUDA, and CPU
This change removes the first three options and leaves only the single unified path. A sa result, a bunch of code in `render-test` is no longer needed, and the codebase no longer relies on things like the `IDescriptorSet`-related APIs in `gfx`.
Several existing tests had to be disabled to make this change possible. Those tests will need to be audited and either re-enabled once we fix issues in the shader object system, or permanently removed if they don't test stuff we intend to support in the long run (e.g., global-scope type parameters, which aren't a clear necessity).
* fixup: CUDA detection logic
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add a CPU renderer implementation
This change adds a CPU back-end to `gfx` and ensures that most of our existing CPU tests pass when using it.
Detailed notes:
* Most of the CPU renderer implementation is copy-pasted from the CUDA case, so they share a lot of similar logic
* The main addition to the CPU renderer is a semi-complete implementation of host-memory textures. The logic here handles all the main shapes (Buffer, 1D, 2D, 3D, Cube) and all the currently-supported `Format`s that are sample-able as-is (no D24S8). The implementation is not intended to be fast, and it currently only does nearest-neighbor sampling, but otherwise it tries to avoid cutting too many corners and should be ar reasonable starting point for a more complete (but not performance-oriented) implementation.
* Refactored the CPU prelude `IRWTexture` interface to inherit from `ITexture`, since in most cases a single type will end up implementing both. It might be worth it to collapse it all down to a single interface later.
* Changed the CPU prelude `ITexture`/`IRWTexture` interface so that it takes both a pointer *and* a size for output arguments. This change seems necessary to allow a shader variable declared as a `Texture2D<float>` to fetch a single `float` when the underlying texture might be using RGBA32F.
* Added to the `IComponentType` public API so that we can query a "host callable" for an entry point and not just a binary.
* Turned off the `-shaderobj` flag on two tests that weren't yet compatible with shader objects but still had the flag left in on the path (since previously the CPU path always used the non-`gfx` non-shader-object logic anyway)
* Disabled one test (`dynamic-dispatch-11`) that relied on the `ConstantBuffer<IInterface>` idiom that we know we are planning to chagne soon anyway.
* Made a few changes to the CUDA path to bring it into line with what I added for the CPU path. These were mostly bug fixes around indexing logic for sub-objects and resources.
* fixup
|
| |
|
|
|
|
|
|
|
| |
* Add shader object parameter binding to renderer_test.
* remove multiple-definitions.hlsl
* Fix cuda implementation.
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Use integer RTTI/witness handles in existential tuples.
* Fix clang error.
* Fix IR serialization to use 16bits for opcode.
* Undo accidental comment change.
* Use variable length encoding for opcode.
* Fix compile error.
* Fixing issues
* Fix code review issues.
|
| | |
|
| |
|
|
|
|
|
|
|
| |
* Allow unspecialized existential shader parameters (dynamic dispatch).
* Fixes.
* Fixes
* disable cuda test
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Support dynamic existential shader parameters in render-test
* Fix linux build error.
* Fixes.
* Fix code review issues.
* Fix gcc error.
* More fixes.
* More fixes.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Introduced heterogeneous example. Example includes C++ source and
header files, and does not currently make use of the associated slang
file when building. The intent of this commit is to introduce the
example as a baseline for later updates as the heterogeneous model is
expanded.
* Changing namespace
* Renamed and rewrote README
* Updated example to account for compiler updates
* Updated path
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
* * Remove UniformState and UniformEntryPointParams types
* Put all output C++ source in an anonymous namespace
* If SLANG_PRELUDE_NAMESPACE is set, make what it defines available in generated file.
* Fix signature issue in performance-profile.slang
* Context -> KernelContext to avoid ambiguity.
* Fix issues around dynamic dispatch and anonymous namespace.
* Fix typo.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* render feature for CUDA compute model.
* Use SemanticVersion type.
* Enable CUDA wave tests that require CUDA SM 7.0.
Provide mechanism for DownstreamCompiler to specify version numbers.
* Enabled wave-equality.slang
* Make CUDA SM version major version not just a single digit.
* Fix assert.
* DownstreamCompiler::Version -> CapabilityVersion
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Added CPU support for GetDimensions on C++/CPU target.
Added texture-get-dimension.slang test
* Fix some typos.
* Update CUDA docs.
* Fix output of GetDimensions on glsl when has an array.
Disabled VK - because VK renderer doesn't support createTextureView
* Fix typo.
* Fix typo.
* Fix bad-operator-call diagnostics output.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Added FloatTextureData as a mechanism to enable CPU based Texture writes.
* Add [] RWTexture access for CPU.
* Fixed rw-texture-simple.slang.expected.txt
* WIP: CUDA stdlib has support for [] surface access.
* Made IRWTexture class able to take different locations.
Doing a Texture2d access on CUDA works.
* Fix bug in outputing UniformState - was missing out padding.
Support RWTexture with array. Support RWTexture3D.
* Use * for locations for read only textures, so only need a ITexture interface.
* Fix problem around application of set/get for CUDA on subscript Texture types.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* CUDA support for array of resources.
* * Add support for Texture2DArray on CPU
* Expand texture-simple.slang to test Texture2DArray
* Reorganise CUDAComputeUtil to split out createTextureResource.
* Add TextureCubeArray support for CPU/CUDA targets.
* Pulled out CUDAResource
Renamed derived classes to reflect that change.
* Creation of SurfObject type.
* Functions to return read/write access for simplifying future additions.
* WIP for RWTexture access on CPU/CUDA.
* CUsurfObject cannot have mips.
* Ability to set number of mips on test data.
Preliminary support for CUsurfObj and RWTexture1D on CUDA.
CUDA docs improvements.
* Fix typo.
|
| |
|
|
|
|
|
|
|
|
|
| |
* CUDA support for array of resources.
* * Add support for Texture2DArray on CPU
* Expand texture-simple.slang to test Texture2DArray
* Reorganise CUDAComputeUtil to split out createTextureResource.
* Add TextureCubeArray support for CPU/CUDA targets.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add cubemap support.
* Add CUDA fence instrinsics.
* Added Gather for CUDA.
* Use the CUDA driver API as much as possible.
* * Support 1D texture on CPU
* WIP on 1D texture on CUDA
* Added simplified texture test
* Fix test.
* Improve texture-simple tests.
* * Add CPU support for 3d textures
* Add support for mip maps to CUDA
* Disable warnings in nvrtc
* Update CUDA docs
* WIP on 3d texture support.
* Add support for 3d textures for CPU and CUDA.
* CPU and CUDA support for cube maps.
* Add CPU support for Texture1DArray.
* Support CUDA Layered/Array type in meta library.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add cubemap support.
* Add CUDA fence instrinsics.
* Added Gather for CUDA.
* Use the CUDA driver API as much as possible.
* * Support 1D texture on CPU
* WIP on 1D texture on CUDA
* Added simplified texture test
* Fix test.
* Improve texture-simple tests.
* * Add CPU support for 3d textures
* Add support for mip maps to CUDA
* Disable warnings in nvrtc
* Update CUDA docs
* WIP on 3d texture support.
* Add support for 3d textures for CPU and CUDA.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add cubemap support.
* Add CUDA fence instrinsics.
* Added Gather for CUDA.
* Use the CUDA driver API as much as possible.
* * Support 1D texture on CPU
* WIP on 1D texture on CUDA
* Added simplified texture test
* Fix test.
* Improve texture-simple tests.
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* First pass at BindLocation.
* Added BindSet::init - for initializing with two input constant buffers. Needs better name, and perhaps should be another class.
* Fix handling of constant buffer stripping.
Improved initialization.
* Trying to generalize BindLocation a little more.
Split out CPULikeBindRoot.
* More work to make BindLocation et al work with non uniform bindings.
* Added parsing to a location.
* WIP: Trying to get CPU working with BindLocation.
* Describe problem of knowing the type of the reference point in the binding table.
* More ideas on getBindings fix.
* Remove BindSet as member of BindLocation.
* Added BindLocation::Invalid
* Made BindLocation able to be key in hash
* Use BindLocation for bindings on BindingSet.
* Added cuda and nvrtc categories to test infrastructure.
Disabled CUDA synthetic tests by default.
Fixed such that all tests now produce something in BindLocation style.
* Use m_userIndex instead of m_userData on Resource.
Move the binding setup out of cpu-compute-util (as no longer CPU specific)
* Removed CPUBinding - used BindLocation/BindSet instead.
Fixed some bugs around indexOf around uniform indirection.
* Renamed BindSet::Resource -> BindSet::Value.
* Document BindLocation.
* Fixes for Clang/GCC
Improve invariant requirement handling when constructing from BindPoints.
* WIP: First attempt to run CUDA kernel.
* Fix some issues around doing CUDA kernel launch.
* Fix issues around use of cudaMemCpy .
* Better cuda runtime error checking mechanism.
* Fixed bug in passing parameters to cuda kernel launch.
Simplified initialisation of context.
* WIP: Fix CUDA runtime issues.
* Add explicit CUDA synchronize so failures don't appear on implicit ones.
* Fix problem emitting non shared variable on CUDA.
* Fix some typos in CUDA layout.
Use just a pointer for now for CUDA StucturedBuffer.
* Arg order for CUDA launch was wrong.
* First compute kernel runs on CUDA.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* First pass at BindLocation.
* Added BindSet::init - for initializing with two input constant buffers. Needs better name, and perhaps should be another class.
* Fix handling of constant buffer stripping.
Improved initialization.
* Trying to generalize BindLocation a little more.
Split out CPULikeBindRoot.
* More work to make BindLocation et al work with non uniform bindings.
* Added parsing to a location.
* WIP: Trying to get CPU working with BindLocation.
* Describe problem of knowing the type of the reference point in the binding table.
* More ideas on getBindings fix.
* Remove BindSet as member of BindLocation.
* Added BindLocation::Invalid
* Made BindLocation able to be key in hash
* Use BindLocation for bindings on BindingSet.
* Added cuda and nvrtc categories to test infrastructure.
Disabled CUDA synthetic tests by default.
Fixed such that all tests now produce something in BindLocation style.
* Use m_userIndex instead of m_userData on Resource.
Move the binding setup out of cpu-compute-util (as no longer CPU specific)
* Removed CPUBinding - used BindLocation/BindSet instead.
Fixed some bugs around indexOf around uniform indirection.
* Renamed BindSet::Resource -> BindSet::Value.
* Document BindLocation.
* Fixes for Clang/GCC
Improve invariant requirement handling when constructing from BindPoints.
|
| |
|
|
|
|
|
|
| |
* WIP: Unsized arrays on CPU.
* unbounded-array-of-array working on CPU.
* Remove some left over comments.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* First pass support for performance profiling
* Test across all elements
* Fix bug - sourceContents is not used, should use rawSource.
* * Add ability to get prelude from API.
* Allow specifying source language for render-test
* Made it possible to compile a test input file as C++
* Special handling for reflection
* Added C++ impl to performance-profile.slang
* Remove some clang warnings.
* Output profile timings on appveyor and other TC.
* Remove passing around of StdWriters (can use global).
Small comment improvements.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* WIP: Improving CPU performance/ABI
* Optionally output code on CPU for groupThreadID and groupID.
* Added ability to set compute dispatch size on command line for render-test.
Dispatch compute tests taking into account dispatch size.
Added test for semantics are working.
* Test using GroupRange.
* Fix problem with adding \n for externa diagnostic - to do it if there isn't a \n at the end. Change the ouput order (put result before) so last value is diagnostic string.
* Made GroupRange the default exposed CPU ABI entry point style.
Removed CPU_EXECUTE test style -as tested via the now cross platform render-test
* Split out execution from setup for execution to improve perf.
* For better code coverage/testing test all styles of CPU compute entry point.
* Improve documentation for ABI changes for CPU code.
Add 'expecting' to error message from review.
* Fix small typos.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
* WIP: Improving CPU performance/ABI
* Optionally output code on CPU for groupThreadID and groupID.
* Added ability to set compute dispatch size on command line for render-test.
Dispatch compute tests taking into account dispatch size.
Added test for semantics are working.
* Test using GroupRange.
* Fix problem with adding \n for externa diagnostic - to do it if there isn't a \n at the end. Change the ouput order (put result before) so last value is diagnostic string.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* First pass of render-test refactor.
* Make window construction a function that can choose an implementation.
* Remove OpenGL as currently has windows dependency.
* Disable Vulkan as Renderer impl has dependency on windows.
* Pass Window in as parameter of 'update'.
* Add win-window.cpp as was missing.
* Fix warning on windows about signs during comparison.
* * Added mechanism to add random arrays as buffer inputs and select type
* Improved RenderGenerator to generate more types, and to be more careful around int32 ranges.
* Added support for security checks (for Visual Studio C++)
* Disable Execption handling being on by default when compiling kernels
* Added a 'Group' version of the entry point that will evaluate all threads in a group in a single call. In test code use this method if available.
* Added -compile-arg to be able to pass arguments to the compile within render-test
* Add documention for the _Group execution feature.
* Fix some typos in cpu-target.md
|
|
|
* WIP: Refactor of CPUCompute and stand alone cpu-render-test
* Fix compilation on CygWin.
* Make CPU compute tests run on non windows targets.
* Check that C/C++ compiler is available for CPU compute.
* Fix some tabbing issues.
* Add -fPIC on gfx
* Use dxcompiler_47.dll from slang-binaries on windows.
* make https for git module slang-binaries
* Fix comment in premake5.lua around d3dcompiler_47.dll
* Add resources to the CPUComputeUtil::Context to keep in scope.
* Fixes problem compiling on cygwin where dx12 is included in build of gfx lib.
|