summaryrefslogtreecommitdiffstats
path: root/tools/gfx/cuda
Commit message (Collapse)AuthorAge
* Update cuda context creation to support cuda 13 (#8181)jarcherNV2025-08-15
| | | Update cuda context creation to support both cuda 12 and cuda 13.
* Make CUDA version capabilities reach NVRTC (#7074)Theresa Foley2025-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes #7049 The root cause of the problem in #7049 is simply that newer NVRTC versions produce a warning when asked to generate code for older CUDA SM versions, and the default that Slang was requesting compilation for was old enough to trigger that warning, and thus trip up the test case (which only looks at the first diagnostic produced by the downstream compiler). Superficially, the fix was easy: change the test case in question (`tests/diagnostics/local-line.slang`) to request `-capability cuda_sm_8_0`, the minimum version supported by current NVRTC. Unfortunately, the simple fix required some other fixes in order to actually work. The capability system includes capability names of the form `cuda_sm_*_*`, but specifying such a capability had *no* impact on the CUDA SM version passed in when invoking NVRTC. Instead, only the CUDA SM versions requested in the implementation of intrinsics in the core module were affecting the version number passed down. This change adds logic to `slang-compiler.cpp` to take explicitly requested capabilities into account when inferring the CUDA SM version to be passed downstream. A more complete fix would also add similar logic for all the other targets. Unfortunately... yet again... that fix wasn't enough to make things work as expect. Now I had the problem that requesting `-capability cuda_sm_8_0` was actually causing the NVRTC invocation to request CUDA SM version **9.0**! The underlying problem *there* was that the `slang-capabilities.capdef` file has defined certain capability names in a way that implies atomic capabilities much higher than one would expect. E.g., the `cuda_sm_8_0` alias was including HLSL `sm_5_0`, but then `sm_5_0` in turn included `_cuda_sm_9_0`. The fix, for now, is to change the definitions in `slang-capabilities.capdef` to not have the counter-intuitive definitions for `cuda_sm_*_*`. With this set of fixes, the test failure in the original bug report no longer occurs. The work that went into this change suggests several larger-scope fixes that would be good to pursue: * Ideally the capability definitions would have some sort of validation checking to make sure that counter-intuitive results like `cuda_sm_8_0` requesting CUDA SM 9.0 do not occur. * The translation of capabilities over to version numbers for a downstream compiler should be expanded to cover other targets, and not just CUDA. It might be better/simpler to just pass the capabilities themselves to the downstream compiler, since it is possible that a downstream compiler could have more fine-grained enable/disable options than a simple version number. * The entire approach to computing version numbers required for downstream compilation should be cleaned up so that we don't have this duplication between the capabilities that represent those versions and separate syntactic constructs that are used to "request" those versions as part of code generation. * We are very much at the point where we should consider dropping the current behavior where a profile name or capability like `sm_5_0`, that is specific to a single target or a subset of targets, also implies a set of comparable capabilities for other targets.
* Move switch statement bodies to their own lines (#5493)Ellie Hermaszewska2024-11-05
| | | | | | | | | * Move switch statement bodies to their own lines * format --------- Co-authored-by: Yong He <yonghe@outlook.com>
* formatEllie Hermaszewska2024-10-29
| | | | | | | * format * Minor test fixes * enable checking cpp format in ci
* [gfx] use CUDA driver API (#3776)skallweitNV2024-03-15
|
* [SPIRV] Add NonSemanticDebugInfo for step-through debugging. (#3644)Yong He2024-02-28
| | | | | | | * [SPIRV] Add NonSemanticDebugInfo for step-through debugging. * Fix. * Fix.
* Refactor compiler option representations. (#3598)Yong He2024-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | * Refactor compiler option representation. * Fix binary compatibility. * Add a test for specifying compiler options at link time. * Fix binary compatibility. * Fix binary compatibility. * Fix backward compatibility on matrix layout. * Fix. * Fix. * Fix. * Fix gfx. * Fix gfx. * Fix dynamic dispatch. * Polish.
* Parameter binding and gfx fixes. (#3302)Yong He2023-11-01
| | | | | | | | | | | * Parameter binding and gfx fixes. * Add diagnostics on entry point parameters. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Pointer layout support (#2930)jsmall-nvidia2023-06-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * WIP looking at reflection with pointers. * Added GetPointerLayout. * Initial test via reflection with layout of ptr type. * WIP handles ptrs to types that have layout that hasn't been completed. * Move tests to ptr. * WIP try to take into account lowering correctly between AggTypeDecl and Type, but doesn't quite work. * WIP a different path to handling recursive lowering problem with Ptr. * Fix issues with reflection output. * Small tidy. * Fix for infinite recursion issue. * Lower IRPointerTypeLayout * Working with generics. Has a hack to work around Layout around Ptr in IR. The reflection around the generic - the name isn't much use, it should probably have the generic parameters, but that would require getName to do something more sophisticated. * Fix issue around calling finishOuterGenerics to early. * Remove feature/ptr test. * Fix type legalization being an infinite loop with Ptr self referencing. * Disable the pointer self reference test because produces an infintie loop on emit. * Fixed comment based on review. * Fix for issue with emit and pointers causing infinite recursion.
* Fix most of the disabled warnings on gcc/clang (#2839)Ellie Hermaszewska2023-04-26
|
* Set sharedMem argument to 0 when launching cuda kernel. (#2799)Yong He2023-04-13
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Preliminary support for realtime clock (#2772)jsmall-nvidia2023-04-04
| | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Initial support for realtime clock. * Add realtime-clock render feature where seems appropriate. * Fixes to make NVAPI compile properly. Change realtime-clock.slang check to use maths that can't overflow.
* Squash some warnings (#2765)Ellie Hermaszewska2023-04-02
| | | Co-authored-by: Yong He <yonghe@outlook.com>
* Enable CUDA render api on unix (#2757)Ellie Hermaszewska2023-03-30
| | | | | | | | | | | * Remove extra qualification in cuda device impl Only MSVC accepts this illegal code * Enable CUDA render api on unix --------- Co-authored-by: Yong He <yonghe@outlook.com>
* GFX: make dispatch commands return error code. (#2625)Yong He2023-02-06
| | | | | | | | | * GFX: make dispatch commands return error code. * Fix cuda. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Initial version of DeviceLimits implemented in d3d12, d3d11, vulkan and cuda ↵skallweitNV2022-11-07
| | | | (#2496)
* Add AdapterLUID to identify GPU adapters (#2492)skallweitNV2022-11-04
| | | | | * Add AdapterLUID to identify GPU adapters * Remove adapter option in render-test
* Add gfxGetAdapters function (currently implemented for D3D12/Vulkan) (#2486)skallweitNV2022-11-03
| | | | | | | | | | | * Add gfxGetAdapters function (currently implemented for D3D12/Vulkan) * Extend to handle DirectX11 and CUDA * Use blob to return adapter list and add AdapterList helper * Replace strncpy with memcpy Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
* Shader caching (#2432)lucy96chen2022-10-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Changed all getEntryPointCode calls to use RendererBase::getEntryPointCodeFromShaderCache * Hashing hooked up, tests pass but need to add more to fully test functionality * checkpoint * Checkpoint: File system creation seems functional, saving is broken * checkpoint: Fixed filename generation from MD5 hash, shader blob might be going missing ahead of pipeline state creation * Fixed a lot of bugs related to hash code generation, shader cache is likely working but needs further testing * Added workaround for module loading by re-creating the test device, shader cache test functional * Vulkan shader caching bug fixed, checkpoint commit before more refinement * pre-ToT merge checkpoint * checkpoint commit, improving cache keys * Significantly expanded items included in the dependency hash for Module; Added dependency hash functions to SpecializedComponentType and RenamedEntryPointComponentType * Temporarily disable shader cache test * Mid cleanup changes, solution successfully builds * Added several helper update functions to slang-md5 to help simplify usage; Added a function under ISession to compute a hash for all linkage-related items; Function renames and cleaned up some comments * Ran premake.bat; Renamed getASTBasedHashCode to computeASTBasedHash * Added slang unit tests for Checksum and MD5; Extended gfx shader cache test to test with multiple shader files and one shader file with multiple entry points * Solution builds and shader cache tests pass, but at least a couple other tests now failing * ran premake.bat * More cleanup changes * Added shaderCachePath field to IDevice desc in gfx.slang, gfx-smoke.slang should be functional * ran premake * cleanup changes; Adding test printf to getEntryPointCodeFromShaderCache to see if output can be seen in CI * Removed debugging printfs; Added handling for getEntryPointCode() failing * Cleanup changes; Jonathan's fixes to SerialWriter to zero initialize otherwise uninitialized memory; Change to SwizzleExpr creation to zero initialize elementCount * Changed enable_if_t to enable_if * Fixed enable_if * Added test for import vs include and changes to included and imported files; Fixed build errors in CUDA; Renamed shader cache statistics fields * cleanup changes * Readd removed file * Restructured computeDependencyBasedHash calls, added computeDependencyBasedHashImpl to all classes dervied from ComponentType * Applied same restructuring to the AST hash functions * Cleanup changes; Moved HashBuilder out to slang-digest.h and added some helper functions to streamline the process of adding items to a hash * Cleanup; Fixed incorrect expected results for shader import and include test
* Call `gfx` in slang program. (#2370)Yong He2022-08-20
|
* Add gfx interface definition in Slang. (#2364)Yong He2022-08-16
| | | | | | | | | | | | | | | | | | * Add gfx interface definition in Slang. - add gfx interface definitons in Slang. - fix slang compiler to correctly type-check `out` interface argument. - modify gfx interface to be fully COM compatible - add convenient ShaderProgram creation methods to gfx. * Fix compile errors and warnings. * Update project files * Fix cuda. * Properly implement queryInterface in command encoder impls. Co-authored-by: Yong He <yhe@nvidia.com>
* Artifact and ICastable (#2351)jsmall-nvidia2022-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP with hierarchical enums. * Some small fixes and improvements around artifact desc related types. * Improvements around hierarchical enum. * Fixes to get Artifact types refactor to be able to execute tests. * Attempt to better categorize PTX. * Work around for potentially unused function warning. * Typo fix. * Simplify Artifact header. * Small improvements around Artifact kind/payload/style. * Added IDestroyable/ICastable * Add IArtifactList. * First impl of IArtifactUtil. * Use the ICastable interface for IArtifactRepresentation. * Added IArtifactRepresentation & IArtifactAssociated. * Add SLANG_OVERRIDE to avoid gcc/clang warning. * Fix calling convention issue on win32. * Fix missing SLANG_OVERRIDE. * First attempt at file abstraction around Artifact. * Added creation of lock file. * Move functionality for determining file paths to the IArtifactUtil. Add casting to ICastable. * Added some casting/finding mechanisms. * Simplify IArtifact interface, and use Items for file reps. * Fix problem with libraries on DXIL. * Split out ArtifactRepresentation. * Move ArtifactDesc functionality to ArtifactDescUtil. ArtifactInfoUtil becomes ArtifactDescUtil. * Split implementations from the interfaces for Artifact. * Use TypeTextUtil for target name outputting. * Add artifact impls. * Add ICastableList * Added UnknownCastableAdapter * Make ISlangSharedLibrary derive from ICastable, and remain backwards compatible with slang-llvm. * Refactor Representation on Artifact. * Make our ISlangBlobs also derive from ICastable. Make ISlangBlob atomic ref counted. * Fix typo.
* Split render-cuda.cpp into smaller files (#2334)lucy96chen2022-07-25
| | | | | | | | | | | | | * render-cuda split, compile errors galore due to missing includes etc. * render-cuda split and fully compiles * Ran premake.bat to disable cuda; Added all new files * Removed render-cuda files * CI fixes * Rerun CI
* GFX renaming work part 2: slang-gfx.h renames (#2194)lucy96chen2022-04-21
| | | | | | | | | | | | | * Fixed all build errors and type conversion warnings from renames in slang-gfx.h * Made necessary build fixes to the CUDA implementation * Renamed ITextureResource::Size to ITextureResource::Extents * More rename changes based on CI errors * More renames to fix CI build errors * Rerun tests
* Expose API-specific row alignment values (#2151)lucy96chen2022-03-08
| | | | | | | * Added function to IDevice that retrieves the row alignment for the particular API; Added rowDstStride argument to copyTextureToBuffer and changed D3D12 footprint row pitch to check that the user-supplied stride is correctly aligned before assigning to the footprint's row pitch * Changed alignment from Uint to size_t Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
* gfx: d3d12 performance optimizations. (#2140)Yong He2022-02-23
| | | | | | | | | | | * gfx: d3d12 performance optimizations. * Fix. * Fix unit test bug. * Add gfx interface for directly allocating GPU descriptor tables. Co-authored-by: Yong He <yhe@nvidia.com>
* Optimize d3d12 mutable shader object implementation. (#2138)Yong He2022-02-19
| | | | | | | | | | | | | * Optimize d3d12 mutable shader object implementation. * Disable mismatched clear value warning message from d3d sdk. * Fix. * Fix. * gfx: Avoid redundant d3d12 QueryInterface call. Co-authored-by: Yong He <yhe@nvidia.com>
* Various gfx fixes. (#2132)Yong He2022-02-16
| | | | | | | | | | | | | | | * Various gfx fixes. * Fix test case. * Fix crash. * Trigger build * Trigger build 2 * Fix vulkan unit tests. Co-authored-by: Yong He <yhe@nvidia.com>
* gfx: Add `resolveQuery` command. (#2125)Yong He2022-02-10
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Various fixes to gfx. (#2120)Yong He2022-02-09
| | | | | | | | | | | * Various fixes to gfx. * Fix. * Fixes. * Fix. Co-authored-by: Yong He <yhe@nvidia.com>
* Add gfx interop to allow more direct D3D12 usage scenarios. (#2117)Yong He2022-02-03
| | | | | | | | | | | | | * Add gfx interop to allow more direct D3D12 usage scenarios. * Fix compile error in win32. * gfx: Implement IFence::getNativeHandle() on d3d12. * More GFX-D3D interop interface. * Fix cuda. Co-authored-by: Yong He <yhe@nvidia.com>
* Add implementations for resolveResource() to D3D12 and Vulkan backends (#2094)lucy96chen2022-01-25
| | | | | | | | | * Added implementations for resolveResource to both D3D and Vulkan backends * Simple test for resolving a multisampled resource written and confirmed to run successfully for D3D12 * Fixed a bug preventing MSAA from working in Vulkan * Changed test format to RGBA32 Float (because swiftshader)
* Vulkan implementations for copyTexture, copyTextureToBuffer, and ↵lucy96chen2022-01-19
| | | | | | | | | | | | | | | | | | | | | | | | textureSubresourceBarrier (#2080) * Added preliminary implementations for Vulkan's copyTexture, copyTextureToBuffer, and textureSubresourceBarrier * Simple copyTexture test working * Expanded test to use textureSubresourceBarrier() to change resource states before copying out to a buffer; Changed copyTextureToBuffer() to assert that only a single mip level is being copied; Test passes on Vulkan only * Fixed an incorrect loop condition in D3D12's textureSubresourceBarrier and changed the size of the results buffer to pre-account for padding; Test runs in D3D12 but does not pass * D3D12 test working, compareComputeResult for buffers now takes an offset into the results * Refactored texture copying tests * Second test written but does not copy correctly * Fixed texture creation in D3D12 to take into account the subresource index when copying texture data so it actually copies all slices instead of just the ones in the first array layer; Second test working on both D3D12 and Vulkan * Added a note for future tests to be added for texture copying; Fixed build errors in CUDA Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Theresa Foley <tfoleyNV@users.noreply.github.com>
* Various fixes to GFX, nested parameter block test for d3d12. (#2081)Yong He2022-01-14
| | | | | | | | | | | | | * Various fixes. * Add nested parameter block test. * Remove slang-llvm licence info * Ingore slang-llvm/ directory. * Fixup. Co-authored-by: Yong He <yhe@nvidia.com>
* Various fixes to gfx. (#2074)Yong He2022-01-10
| | | | | | | * Various gfx fixes. * Fixup. Co-authored-by: Yong He <yhe@nvidia.com>
* Draw call tests for Vulkan (#2073)lucy96chen2022-01-10
| | | | | | | | | | | | | | | | | | | | | | | | | * Added instancing support to Vulkan, drawInstanced() test image is upside-down * Fixed inverted drawInstanced() test output by changing Vulkan viewport convention * Replaced vkCmdDraw with vkCmdDrawIndexed in all non-indirect indexed draws, drawIndexedIndirect test now working for Vulkan * Moved index and vertex buffer binding into setIndexBuffer and setVertexBuffers; Defaulted countBuffer to nullptr and countOffset to 0 and added a check for non-null countBuffer to drawIndirect and drawIndexedIndirect in Vulkan; All Vulkan draw tests working (but D3D12 tests broken) * Added support for drawInstanced and drawIndexedInstanced to D3D11 and added tests for both, however D3D11 tests are currently disabled due to readTextureResource assuming a fixed pixel size (among other possible problems); Fixed issues causing D3D12 tests to fail after major back-end fixes to get Vulkan up and running * Removed testing function for dumping images and some other commented out code * Removed some commented out code * Fix initializer list causing builds to fail (attempt 1) * Removed initializer list for VertexStreamDesc in createInputLayout() and fill in struct fields normally (build fix attempt 2) * Removed default values from VertexStreamDesc and changed all initializer lists to reflect this change * Moved applyBinding before setVertexBuffer in RenderTestApp::renderFrame() to ensure the pipeline has already been bound before vertex buffers are * Changed D3D11's readTextureResource to calculate pixel size using format-specific size information; Removed wrapper around D3D11 instanced and indexed instanced draw tests
* Fix cuda texture creation bug (#2076)Yong He2022-01-06
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Fixes to `uploadTextureData`. (#2058)Yong He2021-12-13
| | | | | | | * Fixes to `uploadTextureData`. * Fix, Co-authored-by: Yong He <yhe@nvidia.com>
* gfx Fence implementation improvements. (#2049)Yong He2021-12-08
|
* D3D12 and Vulkan to CUDA Texture Sharing (#2038)lucy96chen2021-12-08
|
* gfx: D3D12 and VK Fence implementation. (#2048)Yong He2021-12-07
| | | | | | | | | | | | | * gfx: D3D12 and VK Fence implementation. * Fix. * Update project files. * Revert project file changes. * Remove project files Co-authored-by: Yong He <yhe@nvidia.com>
* gfx: add coverage for more resource commands. (#2020)Yong He2021-11-18
| | | | | | | | | | | | | | | * gfx: specify SubresourceRange for resource view creation. * Add gfx method for `resolveResource`. * Fix compile error. * `copyTextureToBuffer` and `textureSubresourceBarrier`. * Fix vulkan bug. * Fix test cras;h. Co-authored-by: Yong He <yhe@nvidia.com>
* gfx ShaderObject interface update, getTextureAllocationInfo() (#2019)Yong He2021-11-17
| | | | | | | * gfx ShaderObject interface update, getTextureAllocationInfo() * Fix render-vk compiler warnings and errors. Co-authored-by: Yong He <yhe@nvidia.com>
* gfx: setSamplePositions and clearResourceView. (#2018)Yong He2021-11-16
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Update gfx interface. (#2015)Yong He2021-11-15
|
* Add support for buffer sharing from Vulkan to CUDA (#2008)lucy96chen2021-11-12
| | | | | | | | | | | | | | | | | | | * Added getSharedHandle() and additional code to handle shareable buffer creation to Buffer::init() and initVulkanInstanceAndDevice() for Vulkan; Modified createBufferFromSharedHandle() in CUDA to assign externalMemoryHandleDesc.type based on the type of handle being provided; Added an additional test case to get-shared-handle.cpp testing Vulkan to CUDA * Added createBufferFromNativeHandle() to Vulkan and enabled corresponding test * disable cuda * Fixed getSharedHandle() for D3D12 buffers assigning Win32 as the handle's source * Removed a dangling comment inside Buffer::init() * Added a missing override; Added code to check that a physical device supports the necessary external memory extensions before adding them to the deviceExtensions list; Added #if SLANG_WINDOWS_FAMILY guards around all Windows-specific code and sharedHandleVulkanToCUDA test (which uses said platform-specific code) * Added Windows check around vkGetMemoryWin32HandleKHR in vk-api.h * Added missing Windows check around BufferResourceImpl destructor * Added a temporary hack to ensure synchronization between devices, which solves an issue with buffer sharing resulting in incorrect values being read back; Added #if SLANG_WIN64 around all CUDA tests as the backend currently only supports running CUDA on 64-bit (despite devices being created successfully in a 32-bit config)
* Allow buffers to be shared between D3D12 and CUDA (#2005)lucy96chen2021-11-09
| | | | | | | | | | | * Added both the SharedHandle struct containing a handle and the API the handle originated from and the getSharedHandle() method to IResource, which returns a Windows system handle for the resource that can then be shared between multiple APIs (currently only fully implemented for D3D12); Added createTextureFromNativeHandle() and createBufferFromNativeHandle() to IDevice, which creates a buffer or texture resource using the provided handle (currently only fully implemented for D3D12); Added createBufferFromSharedHandle() to IDevice, which creates a BufferResource using the provided system handle (currently only fully implemented for the D3D12 to CUDA interface); Provided a proper implementation for CUDADevice::getNativeHandle(); Added several new tests testing the aforementioned implementations; Moved NativeHandle and getNativeHandle() for IBufferResource and ITextureResource up a layer into IResource and renamed to NativeResourceHandle; Modified NativeResourceHandle to be a struct containing the handle and the API it originated from and propagated these changes where appropriate * Combined all native and shared handle representations into a unified InteropHandle struct which tracks the handle's value and source API; Modified all getNativeHandle() and getSharedHandle() variants to operate on InteropHandle and modified all affected files * D3D12 buffers and textures are now responsible for closing their shared handles if they exist; Renamed IDevice::getNativeHandle() to getNativeDeviceHandles() * Fixed getNativeDeviceHandles() in render-cuda to match updated method elsewhere * Temporarily disabling existingDeviceHandleCUDA and sharedHandleD3D12ToCUDA due to currently unreproducable test failures on TC
* Add interface for new gfx features. (#2003)Yong He2021-11-04
| | | | | | | | | | | * Add interface for new gfx features. * Add cuda implementation. * Code review fixes. * Fix. Co-authored-by: Yong He <yhe@nvidia.com>
* Expanded gfx::Format to include additional formats (#1982)lucy96chen2021-10-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Format list updated with additional formats supported by both D3D and Vulkan; D3DUtil::getMapFormat() and VkUtil::getVkFormat() updated to include additional formats; GFX_FORMAT() updated with all additional formats (BC compression unfinished) * Finished updating GFX_FORMAT with newly added formats and sizes; Pixel size is now tracked using the FormatPixelSize struct containing the values for bytes per block and pixels per block to accomodate BC formats; Updated gfxGetFormatSize and associated sub-calls to return FormatPixelSize instead of uint8_t; Most calls to gfxGetFormatSize() updated to reflect changes, a couple calls still unupdated * Changes to accommodate new formats finished, debugging slang-literal unit test * First format unit test working * One test added for BC1Unorm and RGBA8Unorm_SRGB, both passing * Refactored format testing code to merge BC1Unorm and RGBA8Unorm SRGB into a single file * All unit tests added for BC and Srgb formats * Most tests added and working; Added five additional formats (still need tests) and made the appropriate changes to support these; createTextureView() modified for D3D11, D3D12, and Vulkan to take into account the format specified in the texture view desc when the texture's format is typeless * Format enums renamed to more closely match their D3D counterparts; Added a universal float and uint buffer and buffer view for use across all Format tests * Remaining tests added; D3D12 tests pass, but Vulkan crashes in BC1_UNORM and D3D11 spits out a bunch of D3D11 Errors (but supposedly passes) * re-run premake * Added Sint versions of test shaders; Vulkan and D3D11 tests also pass * Size struct for format unit tests no longer use initializer lists * Fixed a Size struct missed in the previous pass * Fixed minor bugs causing tests to fail * Added documentation detailing all currently unsupported formats * Skip tests causing unsupported format warnings due to swiftshader * updated several test using old Format enum names * Revert change to compareComputeResult() that was added for debugging purposes * DEBUGGING: Added prints to identify which formats are failing on CI * Reverted attempted debugging changes; Fixed texture2d-gather.hlsl to use updated Format enums * Fixed incorrect array sizes in d3d11 _initSrvDesc() * Commented out further tests that produce unexpected results when tested for Vulkan with swiftshader * Revert "Merge branch 'expanded-format-support' of https://github.com/lucy96chen/slang into expanded-format-support" This reverts commit 20008f0d3ecc3b1405ecac8c138edaa3cd37ed6b, reversing changes made to 6081e95827315fee50e18409394d5abd62fac787. * Added a fuzzy comparison function for use with floats * submodule update * Revert messed up changes caused by previous revert after automatically merging on github
* GFX: implement mutable shader objects. (#1963)Yong He2021-10-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * GFX: implement mutable shader objects. * Revert unnecessary changes * Revert more changes. * Fix clang errors. * Fix clang/gcc errors. * Fix clang errors. * Remove CPU test. * Fix after merge. * Fix after merge. * Remove gl test * Code review fixes. * Fixing all vk validation errors. * Flush test output more often. * Fix a crash in `specializeDynamicAssociatedTypeLookup`. * temporarily disable std-lib-serialize test to see what happens * Fix crashes. * Make sure cpu gfx unit tests are properly disabled on TeamCity. * Disable cpu test. * Fix. * Fix cuda. * Disable nv-ray-tracing-motion-blur Co-authored-by: Yong He <yhe@nvidia.com>