diff options
| author | Theresa Foley <10618364+tangent-vector@users.noreply.github.com> | 2025-10-14 18:00:47 -0700 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-10-15 01:00:47 +0000 |
| commit | 907410f6a52cf4e7538870ebf5aeb88858f97973 (patch) | |
| tree | 2c053154c23d828f5220a127f92ea2c98f706ce7 /source/slang/slang-ir-legalize-empty-array.cpp | |
| parent | 05cae938cbe1ac8807f4913044d9bee46dc3a0e1 (diff) | |
Clean up Slang IR representation of undefined values (#8708)
Prior to this change, the Slang IR used a single opcode
(`kIROp_Undefined`) to encode all cases of undefined values. The
particular motivation for this change was a need to distinguish those
undefined values that represent a load from an uninitialized memory
location versus other sorts of undefined values. If transforming a
variable into SSA form results in `undefined` values in cases where the
a `load` was executed without a prior `store`, that represents an error
on the programmer's part, and should be diagnosed. However, other cases
of undefined values can arise during program transformation and
optimization, and should not typically result in diagnostics being
emitted.
While it was not the original motivation for this change, it is also
worth noting that the LLVM project has transitioned from initially using
only a single `undef` instruction to having a more nuanced model, and
the same factors that motivated their shift also apply to the Slang IR.
Counter-intuitively, the semantics of undefined values actually need to
be carefully defined.
Concretely, this change splits the pre-existing `undefined` opcode into
two sub-cases:
- `kIROp_LoadFromUninitializedMemory`, to represent the case of loading
from a memory location (such as a local variable) that has not been
initialized.
- `kIROp_Poison`, corresponding to the LLVM `poison` value.
Our poison instruction is intended to have semantics comparable to
LLVM's equivalent. Conceptually, any operation that is invoked with a
poison value as input will (with a few exceptions) produce a poison
value as output. One can think of the behavior of `poison` as similar to
how not-a-number values propagate in floating-point computations: by
default they "infect" the result of any computation they are involved
in. This semantic choice helps to ensure that many optimizations end up
being correct in the presence of undefined values, even if they did not
specifically account for them.
The `kIROp_LoadFromUninitializedMemory` case is comparable to the
combination of `freeze` and `undef` in LLVM. An LLVM `undef` value has
semantics that allow *each* use of that value to be replaced with a
*different* arbitrary value; these semantics cause many optimizations to
only be correct in the absence of undefined values. An LLVM `freeze`
instruction can take an undefined value as input, and produces a single
value that is still arbitrary, but must be consistent across all uses.
The latter semantics are what we want, since a given `load` from an
uninitialized memory location will yield an arbitrary-but-fixed value.
Note that we intentionally do not have a direct analogue to LLVM's
`undef` instruction, because of the way that `undef` causes so many
complications when trying to write optimizations.
We also do not add a `kIROp_Freeze` instruction in this change, but that
is simply because we currently have no need for it.
Existing code that was creating `IRUndefined` values has been updated to
create either `IRPoison` or `IRLoadFromUninitializedMemory` values, as
appropriate to the use case. Code that was checking for the
`kIROp_Undefined` opcode has been updated to either check for both of
the new opcodes (in the case of `switch` statements), or to use
`as<IRUndefined>` to perform a dynamic cast to the common base type of
the two new instructions.
Note that this change does not alter the way that instructions
representing undefined values are typically emitted as ordinary
instructions in the block that produces an undefined value. While
emitting `IRLoadFromUninitializedMemory` as an ordinary instruction is
exactly what we want, the `IRPoison` case would actually be better
represented in Slang IR as a "hoistable" instruction, so that there
would only be a singular `poison` value of each type. Changing
`IRPoison` to be hoistable would be a good follow-up change, but might
run into more challenges depending on what assumptions (if any) the
codebase is making about where undefined values get emitted.
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Diffstat (limited to 'source/slang/slang-ir-legalize-empty-array.cpp')
| -rw-r--r-- | source/slang/slang-ir-legalize-empty-array.cpp | 55 |
1 files changed, 38 insertions, 17 deletions
diff --git a/source/slang/slang-ir-legalize-empty-array.cpp b/source/slang/slang-ir-legalize-empty-array.cpp index 1e6798a2e..76e1874b7 100644 --- a/source/slang/slang-ir-legalize-empty-array.cpp +++ b/source/slang/slang-ir-legalize-empty-array.cpp @@ -50,9 +50,30 @@ struct EmptyArrayLoweringContext return ptr && isEmptyArray(ptr->getValueType()); } - // Visit each instruction and replace it with legalized instructiosn if necessary. + // Visit each instruction and replace it with legalized instructions if necessary. void processInst(IRInst* inst) { + // + // This function is handling several different replacements at once: + // + // * Instructions that would create an empty array value are changed + // to create a unit value (`void`) instead. + // + // * Instruction that would try to read an element from an empty array, + // or form a pointer to such an element, instead yield a `poison` + // (undefined) value. Note that they do not yield `LoadFromUninitializedMemory` + // because there is no guarantee that there would be anything in the + // memory corresponding to an index into an empty array. + // + // * Instructions that would try to read from an undefined pointer, + // buffer, image, etc. yield `poison`. + // + // * Instructions that would try to write to an undefined pointer, + // buffer, image, etc. are skipped (the behavior is undefined, so + // "do nothing" is a valid choice for the behavior, which has the + // nice side effect of potentially rendering other code redundant). + // + IRInst* replacement = nullptr; IRBuilder builder(module); @@ -65,64 +86,64 @@ struct EmptyArrayLoweringContext [&](IRGetElement* getElement) { const auto base = getElement->getBase(); - return hasEmptyArrayType(base) ? builder.emitUndefined(getElement->getDataType()) + return hasEmptyArrayType(base) ? builder.emitPoison(getElement->getDataType()) : nullptr; }, [&](IRGetElementPtr* gep) { const auto base = gep->getBase(); return hasEmptyArrayPtrType(gep) || hasEmptyArrayPtrType(base) || - base->getOp() == kIROp_Undefined - ? builder.emitUndefined(gep->getDataType()) + as<IRUndefined>(base) + ? builder.emitPoison(gep->getDataType()) : nullptr; }, [&](IRFieldAddress* gep) { const auto base = gep->getBase(); - return hasEmptyArrayPtrType(gep) || base->getOp() == kIROp_Undefined - ? builder.emitUndefined(gep->getDataType()) + return hasEmptyArrayPtrType(gep) || as<IRUndefined>(base) + ? builder.emitPoison(gep->getDataType()) : nullptr; }, [&](IRLoad* load) { - return load->getOperand(0)->getOp() == kIROp_Undefined - ? builder.emitUndefined(load->getDataType()) + return as<IRUndefined>(load->getOperand(0)) + ? builder.emitPoison(load->getDataType()) : nullptr; }, [&](IRImageLoad* load) { - return load->getOperand(0)->getOp() == kIROp_Undefined - ? builder.emitUndefined(load->getDataType()) + return as<IRUndefined>(load->getOperand(0)) + ? builder.emitPoison(load->getDataType()) : nullptr; }, [&](IRStore* store) { - if (store->getPtr()->getOp() == kIROp_Undefined) + if (as<IRUndefined>(store->getPtr())) store->removeAndDeallocate(); return nullptr; }, [&](IRAtomicStore* store) { - if (store->getPtr()->getOp() == kIROp_Undefined) + if (as<IRUndefined>(store->getPtr())) store->removeAndDeallocate(); return nullptr; }, [&](IRImageStore* store) { - if (store->getImage()->getOp() == kIROp_Undefined) + if (as<IRUndefined>(store->getImage())) store->removeAndDeallocate(); return nullptr; }, [&](IRImageSubscript* subscript) { - return subscript->getImage()->getOp() == kIROp_Undefined - ? builder.emitUndefined(subscript->getDataType()) + return as<IRUndefined>(subscript->getImage()) + ? builder.emitPoison(subscript->getDataType()) : nullptr; }, [&](IRAtomicOperation* atomic) { - return atomic->getOperand(0)->getOp() == kIROp_Undefined - ? builder.emitUndefined(atomic->getDataType()) + return as<IRUndefined>(atomic->getOperand(0)) + ? builder.emitPoison(atomic->getDataType()) : nullptr; }, // The following should match any instruction which can construct a 0-sized array. |
