| author | jsmall-nvidia <jsmall@nvidia.com> | 2020-03-27 18:35:06 -0400 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2020-03-27 22:35:06 +0000 |
| commit | 6f43b2698a99cc4f4bb4e905749fb87f24bf391b (patch) | |
| tree | 567927f4e36ee42481c200ca4caa8a7ea47e3150 /prelude | |
| parent | e267ce24e37b9b7f98921f75abc150c1463b1d6d (diff) | |
WaveBroadcastAt/WaveShuffle (#1299)
* Support for WaveReadLaneAt with a dynamic (but uniform across the Wave) laneId on Vk by enabling VK1.4.
Fixed wave-lane-at.slang test to test with laneId that is uniform across the Wave.
* Added WaveShuffle intrinsic.
Test for WaveShuffle intrinsic.
* Added some documentation on WaveShuffle
* Fix: the version required for subgroupBroadcast to be non-constexpr is actually 1.5
* Added WaveBroadcastLaneAt
Documented WaveShuffle/BroadcastLaneAt/ReadLaneAt
* Update docs around WaveBroadcast/Read/Shuffle.
Use `_waveShuffle` as name in CUDA prelude to better describe its more flexible behavior.
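The difference between a read with a wave-uniform lane index and a shuffle with a per-lane index can be sketched on the host. The following is a plain C++ model for illustration only, not code from the prelude; `kWaveSize`, `Wave`, `modelReadLaneAt`, and `modelShuffle` are hypothetical names, and the wave width of 8 is an assumption chosen to keep the example small.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical wave width for this host-side model (assumption, not from the commit).
constexpr std::size_t kWaveSize = 8;

// One value per lane of the modelled wave.
template <typename T>
using Wave = std::array<T, kWaveSize>;

// Model of a read with a uniform lane index: the whole wave shares one
// source lane, so every lane ends up with the same value.
template <typename T>
Wave<T> modelReadLaneAt(const Wave<T>& values, std::size_t lane)
{
    assert(lane < kWaveSize);
    Wave<T> result{};
    result.fill(values[lane]);
    return result;
}

// Model of a shuffle: each lane supplies its own source lane index,
// so different lanes may read from different lanes.
template <typename T>
Wave<T> modelShuffle(const Wave<T>& values, const Wave<std::size_t>& lanes)
{
    Wave<T> result{};
    for (std::size_t i = 0; i < kWaveSize; ++i)
    {
        assert(lanes[i] < kWaveSize);
        result[i] = values[lanes[i]];
    }
    return result;
}
```

In this model a shuffle whose index array holds the same value in every lane behaves like the uniform read, which is one way to read the commit's remark that the shuffle name describes the more flexible behavior.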
Diffstat (limited to 'prelude')
| -rw-r--r-- | prelude/slang-cuda-prelude.h | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/prelude/slang-cuda-prelude.h b/prelude/slang-cuda-prelude.h
index dcc585b9c..b5d8b3788 100644
--- a/prelude/slang-cuda-prelude.h
+++ b/prelude/slang-cuda-prelude.h
@@ -851,7 +851,7 @@ __inline__ __device__ T _waveReadFirstMultiple(T inVal)
 }
 
 template <typename T>
-__inline__ __device__ T _waveReadLaneAtMultiple(T inVal, int lane)
+__inline__ __device__ T _waveShuffleMultiple(T inVal, int lane)
 {
     typedef typename ElementTypeTrait<T>::Type ElemType;
     const size_t count = sizeof(T) / sizeof(ElemType);
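The hunk shows only the first lines of the renamed helper, where a composite type is measured in units of its element type. As a rough illustration of how a per-element shuffle over such a type could work, here is a host-side C++ model. It is a sketch under stated assumptions and does not reproduce the elided body of the prelude function: `kLanes`, `Float3`, `ElemTrait`, `shuffleScalar`, and `shuffleMultiple` are all hypothetical names, and `ElemTrait` is only a stand-in for the prelude's `ElementTypeTrait`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical wave width for this host-side model (assumption).
constexpr std::size_t kLanes = 4;

// Hypothetical composite type made of three float elements.
struct Float3 { float x, y, z; };

// Stand-in for the prelude's ElementTypeTrait; this specialization is an assumption.
template <typename T> struct ElemTrait;
template <> struct ElemTrait<Float3> { typedef float Type; };

// Scalar shuffle model: lane i reads the scalar held by lane lanes[i].
template <typename E>
std::array<E, kLanes> shuffleScalar(const std::array<E, kLanes>& in,
                                    const std::array<std::size_t, kLanes>& lanes)
{
    std::array<E, kLanes> out{};
    for (std::size_t i = 0; i < kLanes; ++i)
        out[i] = in[lanes[i]];
    return out;
}

// Composite shuffle model: split T into `count` elements and shuffle each
// element position across the wave with the scalar shuffle.
template <typename T>
std::array<T, kLanes> shuffleMultiple(const std::array<T, kLanes>& in,
                                      const std::array<std::size_t, kLanes>& lanes)
{
    typedef typename ElemTrait<T>::Type ElemType;
    const std::size_t count = sizeof(T) / sizeof(ElemType);

    std::array<T, kLanes> out{};
    for (std::size_t e = 0; e < count; ++e)
    {
        // Gather element e from every lane (memcpy avoids aliasing issues).
        std::array<ElemType, kLanes> column{};
        for (std::size_t lane = 0; lane < kLanes; ++lane)
            std::memcpy(&column[lane],
                        reinterpret_cast<const char*>(&in[lane]) + e * sizeof(ElemType),
                        sizeof(ElemType));

        const std::array<ElemType, kLanes> shuffled = shuffleScalar(column, lanes);

        // Scatter the shuffled element e back into each lane's result.
        for (std::size_t lane = 0; lane < kLanes; ++lane)
            std::memcpy(reinterpret_cast<char*>(&out[lane]) + e * sizeof(ElemType),
                        &shuffled[lane],
                        sizeof(ElemType));
    }
    return out;
}
```

The model assumes `T` is a tightly packed sequence of `ElemType` values, so that `sizeof(T) / sizeof(ElemType)` gives the element count, as in the context lines of the hunk.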
