summaryrefslogtreecommitdiff
path: root/tools/gfx/cuda/cuda-device.cpp
AgeCommit message (Collapse)Author
2024-11-05Move switch statement bodies to their own lines (#5493)Ellie Hermaszewska
* Move switch statement bodies to their own lines * format --------- Co-authored-by: Yong He <yonghe@outlook.com>
2024-10-29formatEllie Hermaszewska
* format * Minor test fixes * enable checking cpp format in ci
2024-03-15[gfx] use CUDA driver API (#3776)skallweitNV
2024-02-28[SPIRV] Add NonSemanticDebugInfo for step-through debugging. (#3644)Yong He
* [SPIRV] Add NonSemanticDebugInfo for step-through debugging. * Fix. * Fix.
2024-02-20Refactor compiler option representations. (#3598)Yong He
* Refactor compiler option representation. * Fix binary compatibility. * Add a test for specifying compiler options at link time. * Fix binary compatibility. * Fix binary compatibility. * Fix backward compatibility on matrix layout. * Fix. * Fix. * Fix. * Fix gfx. * Fix gfx. * Fix dynamic dispatch. * Polish.
2023-06-27Pointer layout support (#2930)jsmall-nvidia
* WIP looking at reflection with pointers. * Added GetPointerLayout. * Initial test via reflection with layout of ptr type. * WIP handles ptrs to types that have layout that hasn't been completed. * Move tests to ptr. * WIP try to take into account lowering correctly between AggTypeDecl and Type, but doesn't quite work. * WIP a different path to handling recursive lowering problem with Ptr. * Fix issues with reflection output. * Small tidy. * Fix for infinite recursion issue. * Lower IRPointerTypeLayout * Working with generics. Has a hack to work around Layout around Ptr in IR. The reflection around the generic - the name isn't much use, it should probably have the generic parameters, but that would require getName to do something more sophisticated. * Fix issue around calling finishOuterGenerics to early. * Remove feature/ptr test. * Fix type legalization being an infinite loop with Ptr self referencing. * Disable the pointer self reference test because produces an infintie loop on emit. * Fixed comment based on review. * Fix for issue with emit and pointers causing infinite recursion.
2023-04-26Fix most of the disabled warnings on gcc/clang (#2839)Ellie Hermaszewska
2023-04-04Preliminary support for realtime clock (#2772)jsmall-nvidia
* #include an absolute path didn't work - because paths were taken to always be relative. * Initial support for realtime clock. * Add realtime-clock render feature where seems appropriate. * Fixes to make NVAPI compile properly. Change realtime-clock.slang check to use maths that can't overflow.
2022-11-07Initial version of DeviceLimits implemented in d3d12, d3d11, vulkan and cuda ↵skallweitNV
(#2496)
2022-11-04Add AdapterLUID to identify GPU adapters (#2492)skallweitNV
* Add AdapterLUID to identify GPU adapters * Remove adapter option in render-test
2022-10-12Shader caching (#2432)lucy96chen
* Changed all getEntryPointCode calls to use RendererBase::getEntryPointCodeFromShaderCache * Hashing hooked up, tests pass but need to add more to fully test functionality * checkpoint * Checkpoint: File system creation seems functional, saving is broken * checkpoint: Fixed filename generation from MD5 hash, shader blob might be going missing ahead of pipeline state creation * Fixed a lot of bugs related to hash code generation, shader cache is likely working but needs further testing * Added workaround for module loading by re-creating the test device, shader cache test functional * Vulkan shader caching bug fixed, checkpoint commit before more refinement * pre-ToT merge checkpoint * checkpoint commit, improving cache keys * Significantly expanded items included in the dependency hash for Module; Added dependency hash functions to SpecializedComponentType and RenamedEntryPointComponentType * Temporarily disable shader cache test * Mid cleanup changes, solution successfully builds * Added several helper update functions to slang-md5 to help simplify usage; Added a function under ISession to compute a hash for all linkage-related items; Function renames and cleaned up some comments * Ran premake.bat; Renamed getASTBasedHashCode to computeASTBasedHash * Added slang unit tests for Checksum and MD5; Extended gfx shader cache test to test with multiple shader files and one shader file with multiple entry points * Solution builds and shader cache tests pass, but at least a couple other tests now failing * ran premake.bat * More cleanup changes * Added shaderCachePath field to IDevice desc in gfx.slang, gfx-smoke.slang should be functional * ran premake * cleanup changes; Adding test printf to getEntryPointCodeFromShaderCache to see if output can be seen in CI * Removed debugging printfs; Added handling for getEntryPointCode() failing * Cleanup changes; Jonathan's fixes to SerialWriter to zero initialize otherwise uninitialized memory; Change to SwizzleExpr creation to zero initialize elementCount * Changed enable_if_t to enable_if * Fixed enable_if * Added test for import vs include and changes to included and imported files; Fixed build errors in CUDA; Renamed shader cache statistics fields * cleanup changes * Readd removed file * Restructured computeDependencyBasedHash calls, added computeDependencyBasedHashImpl to all classes dervied from ComponentType * Applied same restructuring to the AST hash functions * Cleanup changes; Moved HashBuilder out to slang-digest.h and added some helper functions to streamline the process of adding items to a hash * Cleanup; Fixed incorrect expected results for shader import and include test
2022-08-10Artifact and ICastable (#2351)jsmall-nvidia
* #include an absolute path didn't work - because paths were taken to always be relative. * WIP with hierarchical enums. * Some small fixes and improvements around artifact desc related types. * Improvements around hierarchical enum. * Fixes to get Artifact types refactor to be able to execute tests. * Attempt to better categorize PTX. * Work around for potentially unused function warning. * Typo fix. * Simplify Artifact header. * Small improvements around Artifact kind/payload/style. * Added IDestroyable/ICastable * Add IArtifactList. * First impl of IArtifactUtil. * Use the ICastable interface for IArtifactRepresentation. * Added IArtifactRepresentation & IArtifactAssociated. * Add SLANG_OVERRIDE to avoid gcc/clang warning. * Fix calling convention issue on win32. * Fix missing SLANG_OVERRIDE. * First attempt at file abstraction around Artifact. * Added creation of lock file. * Move functionality for determining file paths to the IArtifactUtil. Add casting to ICastable. * Added some casting/finding mechanisms. * Simplify IArtifact interface, and use Items for file reps. * Fix problem with libraries on DXIL. * Split out ArtifactRepresentation. * Move ArtifactDesc functionality to ArtifactDescUtil. ArtifactInfoUtil becomes ArtifactDescUtil. * Split implementations from the interfaces for Artifact. * Use TypeTextUtil for target name outputting. * Add artifact impls. * Add ICastableList * Added UnknownCastableAdapter * Make ISlangSharedLibrary derive from ICastable, and remain backwards compatible with slang-llvm. * Refactor Representation on Artifact. * Make our ISlangBlobs also derive from ICastable. Make ISlangBlob atomic ref counted. * Fix typo.
2022-07-25Split render-cuda.cpp into smaller files (#2334)lucy96chen
* render-cuda split, compile errors galore due to missing includes etc. * render-cuda split and fully compiles * Ran premake.bat to disable cuda; Added all new files * Removed render-cuda files * CI fixes * Rerun CI