From ff9437e6c926c1e7c6a0ebe66592b46dbb3fb36b Mon Sep 17 00:00:00 2001
From: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
Date: Wed, 10 Jul 2024 16:25:51 -0400
Subject: Implement non member function atomic texture support (#4544)

* Implement non member function atomic texture support

texture_buffer and texture1d

Fixes: #4538
Related to: #4291, fixes `tests/compute/atomics-buffer.slang`

Texture objects cannot use `__getMetalAtomicRef` to cast objects into an atomic value type. [Texture objects mandate use of member functions](https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf#Texture%20Functions).

The implementation is as follows:
* We can detect texture object usage by checking for an `IRImageSubscript` operation. `__isTextureAccess()` was added to evaluate at compile time (before `static_assert`) whether we have an `IRImageSubscript` operation. `__isTextureAccess()` only performs this check when we are targeting Metal.
* All parameter data needed to call a texture atomic function is embedded inside `IRImageSubscript`. `__extractTextureFromTextureAccess()` and `__extractCoordFromTextureAccess()` were added to extract this data for use with Metal atomics.
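
As a rough illustration of the two steps above (not the code in this change), the dispatch can be modelled on a toy IR. Everything here is hypothetical: `ToyOp`, `ToyInst` and the `toy*` helpers are stand-ins rather than Slang's real IR classes, and the operand order (texture first, coordinate second) is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for IR instructions. These are NOT Slang's real IR classes;
// every name below is hypothetical.
enum class ToyOp { ImageSubscript, FieldAddress };

struct ToyInst {
    ToyOp op;
    std::vector<std::string> operands; // assumed layout for ImageSubscript: {texture, coord}
};

// Plays the role of `__isTextureAccess()`: true only for an image subscript,
// and only when the target is Metal.
bool toyIsTextureAccess(const ToyInst& dest, bool targetIsMetal) {
    return targetIsMetal && dest.op == ToyOp::ImageSubscript;
}

// Plays the role of `__extractTextureFromTextureAccess()`.
const std::string& toyExtractTexture(const ToyInst& access) {
    assert(access.op == ToyOp::ImageSubscript && access.operands.size() >= 2);
    return access.operands[0];
}

// Plays the role of `__extractCoordFromTextureAccess()`.
const std::string& toyExtractCoord(const ToyInst& access) {
    assert(access.op == ToyOp::ImageSubscript && access.operands.size() >= 2);
    return access.operands[1];
}

// Describes which path an atomic operation on `dest` would take.
std::string toyDescribeAtomic(const ToyInst& dest, bool targetIsMetal) {
    if (toyIsTextureAccess(dest, targetIsMetal)) {
        // Textures must go through a member function on the texture object.
        return "member call on " + toyExtractTexture(dest) + " at " + toyExtractCoord(dest);
    }
    // Everything else can keep using the atomic reference cast.
    return "atomic reference cast";
}
```

In this sketch, an instruction built as `ToyInst{ToyOp::ImageSubscript, {"tex", "idx"}}` takes the member-call path only when `targetIsMetal` is true; any other destination keeps the atomic reference path.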

Note:
* Metal documentation has various incorrect details (function names)
* Since we currently hardcode Metal versions for compiling, the Metal compiler version was changed to target `Metal 3.1` (`slang-gcc-compiler-util.cpp`)
* textures do not permit atomic float operations

* add fallthrough attribute + fix bug with 'exchange instead of xor' + fix warning bug

* incorrect function name fix

* missing filecheck

* disable atomics-buffer.slang compute test since a GFX issue is causing it to fail

* Array support for metal interlockedAtomic and proper verification of texture with interlockedAtomic functions

* Array support for metal interlockedAtomic
* proper verification of texture with interlockedAtomic functions

note: had to separate many functions to allow forceInlining to run

* missing getOperand(0)

* push atomic fix for metal

* fix atomic syntax for metal and hlsl emitting extra brackets (breaks tests)

* test changes and meta changes

1. max is 8 rw textures with metal because Metal has this limit. Split up tests to not hit this limit
2. added back `[0]`...,`T` to test since this legalizes metal atomic intrinsic

* macro'ify some of the atomic code

1. addresses review
2. makes code easier to modify in the future (rather than sifting through 1000 lines we can just look at ~10-30)

* fix test 'check'

* missing float support due to macro

* add functions macro generates, `InternalAtomicOperationInfo`

---------

Co-authored-by: Yong He
---
 source/slang/slang-emit.cpp | 9 +++++++++
 1 file changed, 9 insertions(+)

(limited to 'source/slang/slang-emit.cpp')

diff --git a/source/slang/slang-emit.cpp b/source/slang/slang-emit.cpp
index d8f0686d5..243dd65e8 100644
--- a/source/slang/slang-emit.cpp
+++ b/source/slang/slang-emit.cpp
@@ -50,7 +50,9 @@
 #include "slang-ir-lower-l-value-cast.h"
 #include "slang-ir-lower-reinterpret.h"
 #include "slang-ir-loop-unroll.h"
+#include "slang-ir-legalize-extract-from-texture-access.h"
 #include "slang-ir-legalize-image-subscript.h"
+#include "slang-ir-legalize-is-texture-access.h"
 #include "slang-ir-legalize-vector-types.h"
 #include "slang-ir-metadata.h"
 #include "slang-ir-optix-entry-point-uniforms.h"
@@ -907,6 +909,9 @@ Result linkAndOptimizeIR(
 
     legalizeVectorTypes(irModule, sink);
 
+    // Legalize `__isTextureAccess` and related.
+    legalizeIsTextureAccess(irModule);
+
     // Once specialization and type legalization have been performed,
     // we should perform some of our basic optimization steps again,
     // to see if we can clean up any temporaries created by legalization.
@@ -1154,9 +1159,13 @@ Result linkAndOptimizeIR(
     if(isD3DTarget(targetRequest))
         legalizeNonStructParameterToStructForHLSL(irModule);
 
+    legalizeExtractFromTextureAccess(irModule);
+
     // Legalize `ImageSubscript` loads.
     switch (target)
     {
+    case CodeGenTarget::MetalLibAssembly:
+    case CodeGenTarget::MetalLib:
     case CodeGenTarget::Metal:
     case CodeGenTarget::GLSL:
     case CodeGenTarget::SPIRV:
-- 
cgit v1.2.3