summaryrefslogtreecommitdiff
path: root/source/slang/slang-ir-legalize-empty-array.cpp
diff options
context:
space:
mode:
authorTheresa Foley <10618364+tangent-vector@users.noreply.github.com>2025-10-14 18:00:47 -0700
committerGitHub <noreply@github.com>2025-10-15 01:00:47 +0000
commit907410f6a52cf4e7538870ebf5aeb88858f97973 (patch)
tree2c053154c23d828f5220a127f92ea2c98f706ce7 /source/slang/slang-ir-legalize-empty-array.cpp
parent05cae938cbe1ac8807f4913044d9bee46dc3a0e1 (diff)
Clean up Slang IR representation of undefined values (#8708)
Prior to this change, the Slang IR used a single opcode (`kIROp_Undefined`) to encode all cases of undefined values. The particular motivation for this change was a need to distinguish those undefined values that represent a load from an uninitialized memory location versus other sorts of undefined values. If transforming a variable into SSA form results in `undefined` values in cases where the a `load` was executed without a prior `store`, that represents an error on the programmer's part, and should be diagnosed. However, other cases of undefined values can arise during program transformation and optimization, and should not typically result in diagnostics being emitted. While it was not the original motivation for this change, it is also worth noting that the LLVM project has transitioned from initially using only a single `undef` instruction to having a more nuanced model, and the same factors that motivated their shift also apply to the Slang IR. Counter-intuitively, the semantics of undefined values actually need to be carefully defined. Concretely, this change splits the pre-existing `undefined` opcode into two sub-cases: - `kIROp_LoadFromUninitializedMemory`, to represent the case of loading from a memory location (such as a local variable) that has not been initialized. - `kIROp_Poison`, corresponding to the LLVM `poison` value. Our poison instruction is intended to have semantics comparable to LLVM's equivalent. Conceptually, any operation that is invoked with a poison value as input will (with a few exceptions) produce a poison value as output. One can think of the behavior of `poison` as similar to how not-a-number values propagate in floating-point computations: by default they "infect" the result of any computation they are involved in. This semantic choice helps to ensure that many optimizations end up being correct in the presence of undefined values, even if they did not specifically account for them. The `kIROp_LoadFromUninitializedMemory` case is comparable to the combination of `freeze` and `undef` in LLVM. An LLVM `undef` value has semantics that allow *each* use of that value to be replaced with a *different* arbitrary value; these semantics cause many optimizations to only be correct in the absence of undefined values. An LLVM `freeze` instruction can take an undefined value as input, and produces a single value that is still arbitrary, but must be consistent across all uses. The latter semantics are what we want, since a given `load` from an uninitialized memory location will yield an arbitrary-but-fixed value. Note that we intentionally do not have a direct analogue to LLVM's `undef` instruction, because of the way that `undef` causes so many complications when trying to write optimizations. We also do not add a `kIROp_Freeze` instruction in this change, but that is simply because we currently have no need for it. Existing code that was creating `IRUndefined` values has been updated to create either `IRPoison` or `IRLoadFromUninitializedMemory` values, as appropriate to the use case. Code that was checking for the `kIROp_Undefined` opcode has been updated to either check for both of the new opcodes (in the case of `switch` statements), or to use `as<IRUndefined>` to perform a dynamic cast to the common base type of the two new instructions. Note that this change does not alter the way that instructions representing undefined values are typically emitted as ordinary instructions in the block that produces an undefined value. While emitting `IRLoadFromUninitializedMemory` as an ordinary instruction is exactly what we want, the `IRPoison` case would actually be better represented in Slang IR as a "hoistable" instruction, so that there would only be a singular `poison` value of each type. Changing `IRPoison` to be hoistable would be a good follow-up change, but might run into more challenges depending on what assumptions (if any) the codebase is making about where undefined values get emitted. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Diffstat (limited to 'source/slang/slang-ir-legalize-empty-array.cpp')
-rw-r--r--source/slang/slang-ir-legalize-empty-array.cpp55
1 files changed, 38 insertions, 17 deletions
diff --git a/source/slang/slang-ir-legalize-empty-array.cpp b/source/slang/slang-ir-legalize-empty-array.cpp
index 1e6798a2e..76e1874b7 100644
--- a/source/slang/slang-ir-legalize-empty-array.cpp
+++ b/source/slang/slang-ir-legalize-empty-array.cpp
@@ -50,9 +50,30 @@ struct EmptyArrayLoweringContext
return ptr && isEmptyArray(ptr->getValueType());
}
- // Visit each instruction and replace it with legalized instructiosn if necessary.
+ // Visit each instruction and replace it with legalized instructions if necessary.
void processInst(IRInst* inst)
{
+ //
+ // This function is handling several different replacements at once:
+ //
+ // * Instructions that would create an empty array value are changed
+ // to create a unit value (`void`) instead.
+ //
+ // * Instruction that would try to read an element from an empty array,
+ // or form a pointer to such an element, instead yield a `poison`
+ // (undefined) value. Note that they do not yield `LoadFromUninitializedMemory`
+ // because there is no guarantee that there would be anything in the
+ // memory corresponding to an index into an empty array.
+ //
+ // * Instructions that would try to read from an undefined pointer,
+ // buffer, image, etc. yield `poison`.
+ //
+ // * Instructions that would try to write to an undefined pointer,
+ // buffer, image, etc. are skipped (the behavior is undefined, so
+ // "do nothing" is a valid choice for the behavior, which has the
+ // nice side effect of potentially rendering other code redundant).
+ //
+
IRInst* replacement = nullptr;
IRBuilder builder(module);
@@ -65,64 +86,64 @@ struct EmptyArrayLoweringContext
[&](IRGetElement* getElement)
{
const auto base = getElement->getBase();
- return hasEmptyArrayType(base) ? builder.emitUndefined(getElement->getDataType())
+ return hasEmptyArrayType(base) ? builder.emitPoison(getElement->getDataType())
: nullptr;
},
[&](IRGetElementPtr* gep)
{
const auto base = gep->getBase();
return hasEmptyArrayPtrType(gep) || hasEmptyArrayPtrType(base) ||
- base->getOp() == kIROp_Undefined
- ? builder.emitUndefined(gep->getDataType())
+ as<IRUndefined>(base)
+ ? builder.emitPoison(gep->getDataType())
: nullptr;
},
[&](IRFieldAddress* gep)
{
const auto base = gep->getBase();
- return hasEmptyArrayPtrType(gep) || base->getOp() == kIROp_Undefined
- ? builder.emitUndefined(gep->getDataType())
+ return hasEmptyArrayPtrType(gep) || as<IRUndefined>(base)
+ ? builder.emitPoison(gep->getDataType())
: nullptr;
},
[&](IRLoad* load)
{
- return load->getOperand(0)->getOp() == kIROp_Undefined
- ? builder.emitUndefined(load->getDataType())
+ return as<IRUndefined>(load->getOperand(0))
+ ? builder.emitPoison(load->getDataType())
: nullptr;
},
[&](IRImageLoad* load)
{
- return load->getOperand(0)->getOp() == kIROp_Undefined
- ? builder.emitUndefined(load->getDataType())
+ return as<IRUndefined>(load->getOperand(0))
+ ? builder.emitPoison(load->getDataType())
: nullptr;
},
[&](IRStore* store)
{
- if (store->getPtr()->getOp() == kIROp_Undefined)
+ if (as<IRUndefined>(store->getPtr()))
store->removeAndDeallocate();
return nullptr;
},
[&](IRAtomicStore* store)
{
- if (store->getPtr()->getOp() == kIROp_Undefined)
+ if (as<IRUndefined>(store->getPtr()))
store->removeAndDeallocate();
return nullptr;
},
[&](IRImageStore* store)
{
- if (store->getImage()->getOp() == kIROp_Undefined)
+ if (as<IRUndefined>(store->getImage()))
store->removeAndDeallocate();
return nullptr;
},
[&](IRImageSubscript* subscript)
{
- return subscript->getImage()->getOp() == kIROp_Undefined
- ? builder.emitUndefined(subscript->getDataType())
+ return as<IRUndefined>(subscript->getImage())
+ ? builder.emitPoison(subscript->getDataType())
: nullptr;
},
[&](IRAtomicOperation* atomic)
{
- return atomic->getOperand(0)->getOp() == kIROp_Undefined
- ? builder.emitUndefined(atomic->getDataType())
+ return as<IRUndefined>(atomic->getOperand(0))
+ ? builder.emitPoison(atomic->getDataType())
: nullptr;
},
// The following should match any instruction which can construct a 0-sized array.