<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/tests/hlsl-intrinsic/atomic, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2025-09-30T18:21:27+00:00</updated>
<entry>
<title>Enable metal tests (#8446)</title>
<updated>2025-09-30T18:21:27+00:00</updated>
<author>
<name>James Helferty (NVIDIA)</name>
<email>jhelferty@nvidia.com</email>
</author>
<published>2025-09-30T18:21:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=8086adc90b69f3199767c0617e2c429ce6b27f67'/>
<id>urn:sha1:8086adc90b69f3199767c0617e2c429ce6b27f67</id>
<content type='text'>
Enables all tests/metal/ tests that can be easily enabled.

These tests were not originally designed as render tests; they are
generally being enabled for pipecleaning purposes, and will not be
rigorously testing the corresponding functionality.

Where they cannot be enabled as render tests, and a metallib test wasn't
already enabled, a metallib test was enabled instead (where possible).

Fixes #7892</content>
</entry>
<entry>
<title>Enable CUDA support for additional HLSL intrinsic tests (#8293)</title>
<updated>2025-09-04T05:28:02+00:00</updated>
<author>
<name>Harsh Aggarwal (NVIDIA)</name>
<email>haaggarwal@nvidia.com</email>
</author>
<published>2025-09-04T05:28:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=5ec41675d817f82a7ce3c4d79c68548db0bd4227'/>
<id>urn:sha1:5ec41675d817f82a7ce3c4d79c68548db0bd4227</id>
<content type='text'>
Enable CUDA support for additional HLSL intrinsic tests by implementing
missing functionality and fixing compiler bugs affecting CUDA targets.

- Fix critical bug in InterlockedCompareStore64 where division used /4
instead of /8 for 64-bit types, causing incorrect memory addressing for
all signed int64_t atomics
- Add signed int64_t atomic wrappers (atomicExch, atomicCAS) to CUDA
prelude that properly cast to/from unsigned types as required by CUDA's
atomic API
- Enable tests: atomic-intrinsics-64bit.slang
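
A minimal sketch of the /4 versus /8 addressing issue described above,
written in Python for illustration rather than taken from the CUDA prelude.
The helper name element_index and the sample byte offset are assumptions,
not Slang code:

```python
def element_index(byte_offset, element_size):
    """Convert a byte offset into an index into an array of fixed-size elements."""
    if byte_offset % element_size != 0:
        raise ValueError("byte offset is not aligned to the element size")
    return byte_offset // element_size


# Hypothetical byte offset into a buffer viewed as 64-bit (8-byte) elements.
byte_offset = 16

correct_index = element_index(byte_offset, 8)  # divide by 8 for 64-bit types
buggy_index = element_index(byte_offset, 4)    # dividing by 4 assumes 32-bit elements

# Byte address each index actually touches when the array holds 8-byte values.
correct_address = correct_index * 8
buggy_address = buggy_index * 8
```

With these sample values the buggy index lands at twice the intended byte
address, which is the kind of incorrect addressing the fix removes.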

- Implement CUDA support for QuadAny and QuadAll operations using warp
shuffle primitives (__shfl_sync with quad-level lane masking)
- Add CUDA to quad_control capability definition in
slang-capabilities.capdef
- Add _slang_quadAny/_slang_quadAll helper functions to CUDA prelude
- Enable tests: quad-control-comp-functionality.slang,
subgroup-quad.slang

---------

Co-authored-by: szihs &lt;675653+szihs@users.noreply.github.com&gt;</content>
</entry>
<entry>
<title>render-test: Change D3D12 default to sm_6_5 (#8320)</title>
<updated>2025-09-02T23:43:48+00:00</updated>
<author>
<name>James Helferty (NVIDIA)</name>
<email>jhelferty@nvidia.com</email>
</author>
<published>2025-09-02T23:43:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=f02b08490aa905f42a8d90381db84b1f8e409c0c'/>
<id>urn:sha1:f02b08490aa905f42a8d90381db84b1f8e409c0c</id>
<content type='text'>
Changes default for render-test to sm_6_5.
Since sm_6_5 is the new default, remove the -use-dxil option and add a
-use-dxcb option.
Remove -use-dxil option from all test cases.
Add -use-dxcb to two tests that needed it.

Fixes #7611</content>
</entry>
<entry>
<title>Enable tests for CUDA (#7593)</title>
<updated>2025-07-03T12:30:38+00:00</updated>
<author>
<name>Mukund Keshava</name>
<email>mkeshava@nvidia.com</email>
</author>
<published>2025-07-03T12:30:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=141eac9eb4400cf94c0a076f339e1d43ed652306'/>
<id>urn:sha1:141eac9eb4400cf94c0a076f339e1d43ed652306</id>
<content type='text'>
Enable intrinsic tests for cuda. Most of these tests were either
disabled or just not enabled for cuda.

Fixes #7592

Co-authored-by: Ellie Hermaszewska &lt;ellieh@nvidia.com&gt;</content>
</entry>
<entry>
<title>Do not zero-initialize groupshared and rayquery variables (#4838)</title>
<updated>2024-08-14T17:05:57+00:00</updated>
<author>
<name>ArielG-NV</name>
<email>159081215+ArielG-NV@users.noreply.github.com</email>
</author>
<published>2024-08-14T17:05:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=45b76418f9da2248b069f2058c6a1d52b05a8c74'/>
<id>urn:sha1:45b76418f9da2248b069f2058c6a1d52b05a8c74</id>
<content type='text'>
* Do not zero-initialize groupshared and rayquery variables

Fixes: #4824

`-zero-initialize` option will explicitly not:
1. Set any groupshared values to defaults
2. Set any rayQuery object to a default state (currently invalid code generation)

* grammar

* disallow groupshared initializers

disallow groupshared initializers &amp; adjust tests accordingly

* remove disallowed groupshared-init expression

* do not default init if non-copyable

---------

Co-authored-by: Yong He &lt;yonghe@outlook.com&gt;</content>
</entry>
<entry>
<title>SCCP instead of CFG since SCCP removes code of unused branches, not CFG (#4640)</title>
<updated>2024-07-16T15:35:46+00:00</updated>
<author>
<name>ArielG-NV</name>
<email>159081215+ArielG-NV@users.noreply.github.com</email>
</author>
<published>2024-07-16T15:35:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=12ecc43ae07035d951beb531058ba27bdfb9c0de'/>
<id>urn:sha1:12ecc43ae07035d951beb531058ba27bdfb9c0de</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Implement non member function atomic texture support (#4544)</title>
<updated>2024-07-10T20:25:51+00:00</updated>
<author>
<name>ArielG-NV</name>
<email>159081215+ArielG-NV@users.noreply.github.com</email>
</author>
<published>2024-07-10T20:25:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=ff9437e6c926c1e7c6a0ebe66592b46dbb3fb36b'/>
<id>urn:sha1:ff9437e6c926c1e7c6a0ebe66592b46dbb3fb36b</id>
<content type='text'>
* Implement non member function atomic texture support texture_buffer and texture1d

Fixes: #4538
Related to: #4291, fixes `tests/compute/atomics-buffer.slang`

Texture objects cannot use `__getMetalAtomicRef` to cast objects into atomic value type. [Texture objects mandate use of member functions](https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf#Texture%20Functions). The implementation is as follows:
* We can detect texture object usage through checking for an `IRImageSubscript` Operation. `__isTextureAccess()` was added to evaluate if we have an `IRImageSubscript` operation at compile time (before `static_assert`). `__isTextureAccess()` only checks if we are targeting Metal.
* We have all parameter data needed to call a texture atomic function embedded inside `IRImageSubscript`. `__extractTextureFromTextureAccess()` and `__extractCoordFromTextureAccess()` were added to extract this data for use with Metal atomics.

Note:
* Metal documentation has various incorrect details (function names)
* Since we currently hardcode metal versions for compiling, the Metal compiler version was changed to target `Metal 3.1` (`slang-gcc-compiler-util.cpp`)
* textures do not permit atomic float operations

* add fallthrough attribute + fix bug with 'exchange instead of xor' + fix warning bug

* incorrect function name fix

* missing filecheck

* disable atomics-buffer.slang compute test since a GFX issue is causing it to fail

* Array support for metal interlockedAtomic and proper verification of texture with interlockedAtomic functions

* Array support for metal interlockedAtomic
* proper verification of texture with interlockedAtomic functions
note: had to separate many functions to allow forceInlining to run

* missing getOperand(0)

* push atomic fix for metal

* fix atomic syntax for metal and hlsl emitting extra brackets (breaks tests)

* test changes and meta changes

1. Metal allows at most 8 RW textures. Split up tests to not hit this limit
2. added back `[0]`...,`T` to test since this legalizes metal atomic intrinsic

* macro'ify some of the atomic code

1. addresses review
2. makes code easier to modify in the future (rather than sifting through 1000 lines we can just look at ~10-30)

* fix test 'check'

* missing float support due to macro

* add functions macro generates, `InternalAtomicOperationInfo`

---------

Co-authored-by: Yong He &lt;yonghe@outlook.com&gt;</content>
</entry>
<entry>
<title>Support 64bit HLSL atomic functions (#3957)</title>
<updated>2024-04-16T02:47:23+00:00</updated>
<author>
<name>Jay Kwak</name>
<email>82421531+jkwak-work@users.noreply.github.com</email>
</author>
<published>2024-04-16T02:47:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=030d7f45726187b5b23a3cfb9743166aa60fae30'/>
<id>urn:sha1:030d7f45726187b5b23a3cfb9743166aa60fae30</id>
<content type='text'>
Resolves #3951

This adds a few atomic functions for SM6.6.
The spec can be found here:
https://microsoft.github.io/DirectX-Specs/d3d/HLSL_SM_6_6_Int64_and_Float_Atomics.html

The new functions are:
void InterlockedAdd(inout XXX dest, in int64_t value, out int64_t original_value);
void InterlockedAdd(inout XXX dest, in uint64_t value, out uint64_t original_value);
void InterlockedAnd(inout XXX dest, in uint64_t value, out uint64_t original_value);
void InterlockedOr(inout XXX dest, in uint64_t value, out uint64_t original_value);
void InterlockedXor(inout XXX dest, in uint64_t value, out uint64_t original_value);
void InterlockedMin(inout XXX dest, in int64_t value, out int64_t original_value);
void InterlockedMin(inout XXX dest, in uint64_t value, out uint64_t original_value);
void InterlockedMax(inout XXX dest, in int64_t value, out int64_t original_value);
void InterlockedMax(inout XXX dest, in uint64_t value, out uint64_t original_value);
void InterlockedExchange(inout XXX dest, in float value, out float original_value);
void InterlockedExchange(inout XXX dest, in int64_t value, out int64_t original_value);
void InterlockedExchange(inout XXX dest, in uint64_t value, out uint64_t original_value);
void InterlockedCompareStore(inout XXX dest, in int64_t compare_value, in int64_t value);
void InterlockedCompareStore(inout XXX dest, in uint64_t compare_value, in uint64_t value);
void InterlockedCompareStoreFloatBitwise(inout XXX dest, in float compare_value, in float value);
void InterlockedCompareExchange(inout XXX dest, in int64_t compare_value, in int64_t value, out int64_t original_value);
void InterlockedCompareExchange(inout XXX dest, in uint64_t compare_value, in uint64_t value, out uint64_t original_value);
void InterlockedCompareExchangeFloatBitwise(inout XXX dest, in float compare_value, in float value, out float original_value);

void RWByteAddressBuffer::InterlockedAnd64(in uint dest_offset, in uint64_t value, out uint64_t original_value);
void RWByteAddressBuffer::InterlockedOr64(in uint dest_offset, in uint64_t value, out uint64_t original_value);
void RWByteAddressBuffer::InterlockedXor64(in uint dest_offset, in uint64_t value, out uint64_t original_value);
void RWByteAddressBuffer::InterlockedMin64(in uint dest_offset, in int64_t value, out int64_t original_value);
void RWByteAddressBuffer::InterlockedMin64(in uint dest_offset, in uint64_t value, out uint64_t original_value);
void RWByteAddressBuffer::InterlockedMax64(in uint dest_offset, in int64_t value, out int64_t original_value);
void RWByteAddressBuffer::InterlockedMax64(in uint dest_offset, in uint64_t value, out uint64_t original_value);
void RWByteAddressBuffer::InterlockedExchangeFloat(in uint dest_offset, in float value, out float original_value);
void RWByteAddressBuffer::InterlockedExchange64(in uint dest_offset, in int64_t value, out int64_t original_value);
void RWByteAddressBuffer::InterlockedExchange64(in uint dest_offset, in uint64_t value, out uint64_t original_value);
void RWByteAddressBuffer::InterlockedCompareStore64(in uint dest_offset, in int64_t compare_value, in int64_t value);
void RWByteAddressBuffer::InterlockedCompareStore64(in uint dest_offset, in uint64_t compare_value, in uint64_t value);
void RWByteAddressBuffer::InterlockedCompareStoreFloatBitwise(in uint dest_offset, in float compare_value, in float value);
void RWByteAddressBuffer::InterlockedCompareExchangeFloatBitwise(in uint dest_offset, in float compare_value, in float value, out float original_value);</content>
</entry>
<entry>
<title>atomic intrinsic test (#3623)</title>
<updated>2024-02-27T00:09:44+00:00</updated>
<author>
<name>tgrimesnv</name>
<email>158093149+tgrimesnv@users.noreply.github.com</email>
</author>
<published>2024-02-27T00:09:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=3286076f3a0ff0eb72cf66caeadcde50a9540105'/>
<id>urn:sha1:3286076f3a0ff0eb72cf66caeadcde50a9540105</id>
<content type='text'>
* Add first draft of atomic intrinsic test

* Disable CUDA in atomic intrinsic test

---------

Co-authored-by: Yong He &lt;yonghe@outlook.com&gt;</content>
</entry>
</feed>
