| author | jsmall-nvidia <jsmall@nvidia.com> | 2022-06-08 19:51:49 -0400 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2022-06-08 19:51:49 -0400 |
| commit | 4db6bd3cd6da1871fdac520c280bd9f933e48489 (patch) | |
| tree | e4e1bf347a1ceac708ce598af7d4ca4bab71e013 /source/slang/hlsl.meta.slang | |
| parent | 1146920bc9ed9bef2b5bb91b3cdec4700eb09881 (diff) | |
Improved bounds checking for C++/CUDA (#2263)
* #include of an absolute path didn't work, because paths were always taken to be relative.
* Use TerminatedUnownedStringSlice for literals in output C++.
* Remove Escape/Unescape functions used in slang-token-reader.cpp
Add target type of 'host-cpp' etc to map to the target types.
* Fix some corner cases around string encoding.
* Added unit test for string escaping.
Fixed some assorted escaping bugs.
* Updated test output.
* Added decode test.
* Stop using hex output, to get around 'greedy' aspect. Use octal instead.
* Added HostHostCallable
Small changes to use ArtifactDesc/Info instead of large switches.
* Fix C++ emit to handle arbitrary function export.
* Add options handling for callable without an output being specified.
* Can compile with COM interface. Added example using com interface.
* Use the IR Ptr type instead of hack in C++ emit for interfaces.
* Fix issue with outputting the COM call when ptr is used.
* Fix crash issue on compilation failure.
* Add support for __global.
* Added `ActualGlobalRate`
Added special handling around globals and COM interfaces.
Tested out in cpu-com-example.
* Fix typo in NodeBase.
* Support for accessing globals by name working.
* Bounds checking for C++
Improved bounds checks for CUDA.
* Check that actual global initialization is working.
* Fix typo.
* Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar.
* Fix typo in CUDA prelude.
* Remove context.
Only create replacement if needed.
* Split out COM host-callable into a unit-test.
* host-callable com testing on C++ and llvm.
* Comment around the COM ptr replacement.
* WIP Zero bound test.
* Disable com test on vs 32 bit.
Fix C++ prelude
* Disable 32 bit targets testing com host-callable.
* For now disable zero index test.
* Enable bounds checking for CPU/CUDA.
* Small fixes.
Disable CUDA zero index bound fix.
* Add test result for bound check.
* Work around for index wrapping issue.
* Added Fixed array test.
* Only enable prelude asserts via SLANG_PRELUDE_ENABLE_ASSERT (unless defined by the user)
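The "greedy" hex issue mentioned in the message above is a property of C++ string literals: a `\x` escape consumes every hex digit that follows it, while an octal escape stops after at most three digits. The sketch below illustrates why emitting three-digit octal is safe. The helper name `escapeOctal` is hypothetical and is not taken from the Slang sources; it only shows the general technique.

```cpp
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper (not Slang's implementation): produce the body of a
// C++ string literal, writing non-printable bytes as three-digit octal.
std::string escapeOctal(std::string_view in)
{
    std::string out;
    for (unsigned char c : in)
    {
        if (c == '"' || c == '\\')
        {
            // Quote and backslash need a backslash prefix.
            out += '\\';
            out += static_cast<char>(c);
        }
        else if (c >= 0x20 && c < 0x7f)
        {
            // Printable ASCII is emitted unchanged.
            out += static_cast<char>(c);
        }
        else
        {
            // Always three octal digits, so a following digit or letter
            // can never be absorbed into the escape.
            char buf[5];
            std::snprintf(buf, sizeof(buf), "\\%03o", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// The byte 0x01 followed by 'B':
// with hex, "\x01B" is parsed as the single character 0x1B ...
static_assert(sizeof("\x01B") == 2, "hex escape swallowed the 'B'");
// ... while the octal form keeps both characters.
static_assert(sizeof("\001B") == 3, "octal escape stops after three digits");
```

Under these assumptions, `escapeOctal` applied to the two bytes 0x01 and 'B' yields the text `\001B`, which a C++ compiler reads back as the same two bytes.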
Diffstat (limited to 'source/slang/hlsl.meta.slang')
| -rw-r--r-- | source/slang/hlsl.meta.slang | 22 |
1 files changed, 11 insertions, 11 deletions
diff --git a/source/slang/hlsl.meta.slang b/source/slang/hlsl.meta.slang
index 7d107888a..b2f6fa06b 100644
--- a/source/slang/hlsl.meta.slang
+++ b/source/slang/hlsl.meta.slang
@@ -331,7 +331,7 @@ ${{{{
     __target_intrinsic(hlsl, "($3 = NvInterlockedAddFp32($0, $1, $2))")
     __cuda_sm_version(2.0)
-    __target_intrinsic(cuda, "(*$3 = atomicAdd((float*)$0._getPtrAt($1), $2))")
+    __target_intrinsic(cuda, "(*$3 = atomicAdd($0._getPtrAt<float>($1), $2))")
     [__requiresNVAPI]
     void InterlockedAddF32(uint byteAddress, float valueToAdd, out float originalValue);
@@ -347,7 +347,7 @@ ${{{{
     __target_intrinsic(hlsl, "(NvInterlockedAddFp32($0, $1, $2))")
     [__requiresNVAPI]
     __cuda_sm_version(2.0)
-    __target_intrinsic(cuda, "atomicAdd((float*)$0._getPtrAt($1), $2)")
+    __target_intrinsic(cuda, "atomicAdd($0._getPtrAt<float>($1), $2)")
     void InterlockedAddF32(uint byteAddress, float valueToAdd);
     __specialized_for_target(glsl)
@@ -359,7 +359,7 @@ ${{{{
     // Int64 Add
     __cuda_sm_version(6.0)
-    __target_intrinsic(cuda, "(*$3 = atomicAdd((uint64_t*)$0._getPtrAt($1), $2))")
+    __target_intrinsic(cuda, "(*$3 = atomicAdd($0._getPtrAt<uint64_t>($1), $2))")
     void InterlockedAddI64(uint byteAddress, int64_t valueToAdd, out int64_t originalValue);
     __specialized_for_target(hlsl)
@@ -377,7 +377,7 @@ ${{{{
     // Without returning original value
     __cuda_sm_version(6.0)
-    __target_intrinsic(cuda, "atomicAdd((uint64_t*)$0._getPtrAt($1), $2)")
+    __target_intrinsic(cuda, "atomicAdd($0._getPtrAt<uint64_t>($1), $2)")
     void InterlockedAddI64(uint byteAddress, int64_t valueToAdd);
     __specialized_for_target(hlsl)
@@ -395,7 +395,7 @@ ${{{{
     // Cas uint64_t
-    __target_intrinsic(cuda, "(*$4 = atomicCAS((uint64_t*)$0._getPtrAt($1), $2, $3))")
+    __target_intrinsic(cuda, "(*$4 = atomicCAS($0._getPtrAt<uint64_t>($1), $2, $3))")
     void InterlockedCompareExchangeU64(uint byteAddress, uint64_t compareValue, uint64_t value, out uint64_t outOriginalValue);
     __specialized_for_target(hlsl)
@@ -414,7 +414,7 @@ ${{{{
     // Max
     __cuda_sm_version(3.5)
-    __target_intrinsic(cuda, "atomicMax((uint64_t*)$0._getPtrAt($1), $2)")
+    __target_intrinsic(cuda, "atomicMax($0._getPtrAt<uint64_t>($1), $2)")
     uint64_t InterlockedMaxU64(uint byteAddress, uint64_t value);
     __specialized_for_target(hlsl)
@@ -430,7 +430,7 @@ ${{{{
     // Min
     __cuda_sm_version(3.5)
-    __target_intrinsic(cuda, "atomicMin((uint64_t*)$0._getPtrAt($1), $2)")
+    __target_intrinsic(cuda, "atomicMin($0._getPtrAt<uint64_t>($1), $2)")
     uint64_t InterlockedMinU64(uint byteAddress, uint64_t value);
     __specialized_for_target(hlsl)
@@ -445,7 +445,7 @@ ${{{{
     // And
-    __target_intrinsic(cuda, "atomicAnd((uint64_t*)$0._getPtrAt($1), $2)")
+    __target_intrinsic(cuda, "atomicAnd($0._getPtrAt<uint64_t>($1), $2)")
     uint64_t InterlockedAndU64(uint byteAddress, uint64_t value);
     __specialized_for_target(hlsl)
@@ -460,7 +460,7 @@ ${{{{
     // Or
-    __target_intrinsic(cuda, "atomicOr((uint64_t*)$0._getPtrAt($1), $2)")
+    __target_intrinsic(cuda, "atomicOr($0._getPtrAt<uint64_t>($1), $2)")
     uint64_t InterlockedOrU64(uint byteAddress, uint64_t value);
     __specialized_for_target(hlsl)
@@ -475,7 +475,7 @@ ${{{{
     // Xor
-    __target_intrinsic(cuda, "atomicXor((uint64_t*)$0._getPtrAt($1), $2)")
+    __target_intrinsic(cuda, "atomicXor($0._getPtrAt<uint64_t>($1), $2)")
     uint64_t InterlockedXorU64(uint byteAddress, uint64_t value);
     __specialized_for_target(hlsl)
@@ -490,7 +490,7 @@ ${{{{
     // Exchange
-    __target_intrinsic(cuda, "atomicExch((uint64_t*)$0._getPtrAt($1), $2)")
+    __target_intrinsic(cuda, "atomicExch($0._getPtrAt<uint64_t>($1), $2)")
     uint64_t InterlockedExchangeU64(uint byteAddress, uint64_t value);
     __specialized_for_target(hlsl)
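Every hunk in this diff replaces a C-style cast of an untyped pointer with a call to `_getPtrAt<T>`. One plausible motivation, given the commit title, is that a typed accessor knows `sizeof(T)` and can therefore check that the whole access fits inside the buffer, not only its first byte. The following is a minimal host-side C++ sketch of that idea. The struct `ByteBufferSketch` and its members are hypothetical; this is not the code from the Slang CUDA prelude.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a byte-address buffer; names are illustrative only.
struct ByteBufferSketch
{
    std::uint8_t* data;
    std::size_t sizeInBytes;

    // True when bytes [byteOffset, byteOffset + sizeof(T)) lie inside the buffer.
    // Written to avoid overflow in byteOffset + sizeof(T).
    template <typename T>
    bool inRange(std::size_t byteOffset) const
    {
        return byteOffset <= sizeInBytes && sizeof(T) <= sizeInBytes - byteOffset;
    }

    // Typed accessor: because T is known here, the bounds check can cover
    // the full element. An untyped accessor followed by a cast cannot do that.
    template <typename T>
    T* getPtrAt(std::size_t byteOffset) const
    {
        assert(inRange<T>(byteOffset));
        return reinterpret_cast<T*>(data + byteOffset);
    }
};
```

With a 16-byte buffer, for example, a `std::uint64_t` access at byte offset 8 is in range, while the same access at offset 12 would start inside the buffer but run past its end, which only the typed check can detect.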
