From 384df864fdd2c518924d32295a13894f16295d43 Mon Sep 17 00:00:00 2001 From: Tim Foley Date: Wed, 2 May 2018 14:44:13 -0700 Subject: Add support for "swizzled stores" (#544) This was a known issue in our IR representation, which was now biting a user. The basic problem is that in code like the following: ```hlsl RWStructureBuffer buffer; ... buffer[index].xz = value; ``` we ideally want to be able to reproduce the original HLSL code exactly, but that requires directly encoding the way that this code writes to two elements of a vector, but not the others. The currently lowering strategy we had produced IR something like: ```hlsl float4 tmp = buffer[index]; tmp.xz = value; buffer[index] = tmp; ``` That transformation might seem valid, but it has some big problems: * It generates UAV reads that are not needed, which could impact performance * It performs read-modify-write operations on memory that the programmer didn't explicitly write, which could create data races The fix here is somewhat obvious: if the "base" of a swizzle operation on a left-hand side resolves to a pointer in our IR, then we can output a "swizzled store" instead of the read-modify-write dance. We currently keep the read-modify-write around since it is potentially needed as a fallback in the general case. Along the way I also tried to make sure that we handle the case where we have a swizzle of a swizzle on the left-hand side: ```hlsl buffer[index].xz.y = value; ``` That code should behave the same as `buffer[index].z = value`. I am currently detecting and cleaning up this logic in the lowering path for `SwizzleExpr`, because that is the only place in the lowering logic that "swizzled l-values" currently get created. --- source/slang/ir.cpp | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) (limited to 'source/slang/ir.cpp') diff --git a/source/slang/ir.cpp b/source/slang/ir.cpp index a4a118250..53050b6b9 100644 --- a/source/slang/ir.cpp +++ b/source/slang/ir.cpp @@ -2033,6 +2033,45 @@ namespace Slang return emitSwizzleSet(type, base, source, elementCount, irElementIndices); } + IRInst* IRBuilder::emitSwizzledStore( + IRInst* dest, + IRInst* source, + UInt elementCount, + IRInst* const* elementIndices) + { + IRInst* fixedArgs[] = { dest, source }; + UInt fixedArgCount = sizeof(fixedArgs) / sizeof(fixedArgs[0]); + + auto inst = createInstImpl( + this, + kIROp_SwizzledStore, + nullptr, + fixedArgCount, + fixedArgs, + elementCount, + elementIndices); + + addInst(inst); + return inst; + } + + IRInst* IRBuilder::emitSwizzledStore( + IRInst* dest, + IRInst* source, + UInt elementCount, + UInt const* elementIndices) + { + auto intType = getBasicType(BaseType::Int); + + IRInst* irElementIndices[4]; + for (UInt ii = 0; ii < elementCount; ++ii) + { + irElementIndices[ii] = getIntValue(intType, elementIndices[ii]); + } + + return emitSwizzledStore(dest, source, elementCount, irElementIndices); + } + IRInst* IRBuilder::emitReturn( IRInst* val) { -- cgit v1.2.3