| author | Sriram Murali <85252063+sriramm-nv@users.noreply.github.com> | 2024-05-13 23:57:57 -0700 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2024-05-13 23:57:57 -0700 |
| commit | 487ae034e2b03ddd67945132c8fecbd937952705 (patch) | |
| tree | 036d318a64385151ad9d5e7275c2e387fdca6cee /source/slang/slang-ir-peephole.cpp | |
| parent | 9f23046138629f78995d54a7722ad6749bd84db9 (diff) | |
Add LoadAligned and StoreAligned methods to ByteAddressBuffers (#4066)
Fixes #4062
This change enables wide load/stores for byte-address-buffer backed
resources, when the data is accessed at an offset that is aligned.
**Goals**
- Improve performance by issuing wider instructions instead of a sequence
of scalar instructions for loads and stores on byte-address buffers.
- Reduce the code size and improve the readability of the generated shaders.
- Help both novice users and expert programmers generate optimal code.
**Non Goals**
- Help with Structured buffers, or other resources.
- Target compilation time improvements.
**Key changes**
Adds 2 new overloads for Load and Store operations on ByteAddress Buffers.
1. Load / Store with an extra alignment parameter
```
resource.Load<T>(offset, alignment);
resource.Store<T>(offset, value, alignment);
```
2. LoadAligned / StoreAligned with no extra parameter,
with the same signature as the original Load / Store.
```
resource.LoadAligned<T>(offset);
resource.StoreAligned<T>(offset, value);
```
- This overload derives the alignment value implicitly
from the base type T of the elementary unit of the resource.
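The difference between the scalar and wide access patterns can be sketched on the host side in C++. This is only an analogy, not the Slang stdlib implementation: `ByteBuffer`, `loadScalarwise`, `loadWithAlignment`, `loadAligned`, and `makeDemoBuffer` are hypothetical names, and deriving the implicit alignment from `T::value_type` is an assumption standing in for "the base type T of the elementary unit".

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Float4 = std::array<float, 4>;

// Hypothetical stand-in for a byte-address buffer: raw bytes indexed by byte offset.
struct ByteBuffer {
    std::vector<std::uint8_t> bytes;
};

// No alignment guarantee: one 4-byte access per component.
Float4 loadScalarwise(const ByteBuffer& buf, std::size_t offset) {
    Float4 v{};
    for (std::size_t i = 0; i < v.size(); ++i)
        std::memcpy(&v[i], buf.bytes.data() + offset + i * sizeof(float), sizeof(float));
    return v;
}

// Explicit form: the caller promises that `offset` is a multiple of `alignment`,
// so the whole value can be fetched with a single wide access.
template <typename T>
T loadWithAlignment(const ByteBuffer& buf, std::size_t offset, std::size_t alignment) {
    assert(alignment != 0 && offset % alignment == 0);
    assert(offset + sizeof(T) <= buf.bytes.size());
    T v{};
    std::memcpy(&v, buf.bytes.data() + offset, sizeof(T));
    return v;
}

// Implicit form: same signature as a plain load; the alignment comes from the
// element type of T (assumption: T exposes value_type, as std::array does).
template <typename T>
T loadAligned(const ByteBuffer& buf, std::size_t offset) {
    return loadWithAlignment<T>(buf, offset, alignof(typename T::value_type));
}

// Fills a buffer with the floats 0.0f .. 7.0f for demonstration.
ByteBuffer makeDemoBuffer() {
    ByteBuffer buf;
    buf.bytes.resize(8 * sizeof(float));
    for (std::size_t i = 0; i < 8; ++i) {
        const float f = static_cast<float>(i);
        std::memcpy(buf.bytes.data() + i * sizeof(float), &f, sizeof(float));
    }
    return buf;
}
```

Both paths return the same value for an aligned offset; the point of the new overloads is that the compiler may pick the wide path once the alignment is known.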
**Supported resources**
1. Vectors
This can be up to 4 elements, i.e. float through float4.
2. Arrays
There is no hard limit on the number of elements, but as a
conservative estimate we can limit it to a few hundred.
3. Structures
This is used to group members of a single type.
```
struct {
float4 x;
}
```
**Code updates**
- Modified byte-address-ir legalize to handle struct, array and vector
kinds of load or store access
- Added custom hlsl stdlib functions to implement all the overloads for Load,
Store etc.
- Added C-like emitter, SPIR-V emitter for handling ByteAddressBuffers.
- Added a new core stdlib function intrinsic to wrap around alignOf<T>().
- Added a new peephole optimization entry to identify the equivalent
IntLiteral value from the alignOf<T>() inst.
- Added tests to check explicit, and implicit aligned Load and Store
operations.
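The peephole entry described in the list above can be illustrated with a toy IR in standalone C++. This is a minimal sketch, not Slang's IR: `Op`, `TypeLayout`, `Inst`, `stride`, and `foldAlignOf` are hypothetical names, the instruction is rewritten in place instead of replacing its uses, and rounding the size up to the alignment is an assumption about what a stride computation does.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

enum class Op { IntLit, AlignOf };

// Size and alignment of a type, in bytes.
struct TypeLayout {
    std::uint64_t size;
    std::uint64_t alignment;
};

// Toy instruction: either an integer literal or an alignOf query whose
// layout may or may not be known for the current target.
struct Inst {
    Op op;
    std::int64_t value = 0;            // meaningful for IntLit only
    std::optional<TypeLayout> layout;  // meaningful for AlignOf only
};

// Assumption: the stride is the size rounded up to a multiple of the alignment.
std::uint64_t stride(const TypeLayout& l) {
    assert(l.alignment != 0);
    return (l.size + l.alignment - 1) / l.alignment * l.alignment;
}

// Replaces every AlignOf whose layout is known and non-empty with an integer
// literal. Returns true if anything was folded, so a caller can iterate to a
// fixed point.
bool foldAlignOf(std::vector<Inst>& insts) {
    bool changed = false;
    for (Inst& inst : insts) {
        if (inst.op != Op::AlignOf)
            continue;
        // Bail out when the layout cannot be computed or is degenerate.
        if (!inst.layout || inst.layout->size == 0 || inst.layout->alignment == 0)
            continue;
        inst.value = static_cast<std::int64_t>(stride(*inst.layout));
        inst.op = Op::IntLit;
        inst.layout.reset();
        changed = true;
    }
    return changed;
}
```

The early exits correspond to the guards in the actual change: no fold happens without target information, when the layout query fails, or when the size is zero.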
Diffstat (limited to 'source/slang/slang-ir-peephole.cpp')
| -rw-r--r-- | source/slang/slang-ir-peephole.cpp | 24 |
1 files changed, 24 insertions, 0 deletions
diff --git a/source/slang/slang-ir-peephole.cpp b/source/slang/slang-ir-peephole.cpp
index 88b26fbd3..16e440b32 100644
--- a/source/slang/slang-ir-peephole.cpp
+++ b/source/slang/slang-ir-peephole.cpp
@@ -250,6 +250,30 @@ struct PeepholeContext : InstPassBase
         switch (inst->getOp())
         {
+        case kIROp_AlignOf:
+            // Fold all calls to alignOf<T>() that returns a simple integer value.
+            if (inst->getDataType()->getOp() == kIROp_IntType)
+            {
+                if (!targetProgram)
+                    break;
+
+                // Save the alignment information and exit early if it is invalid
+                IRSizeAndAlignment sizeAlignment;
+                auto alignOfInst = as<IRAlignOf>(inst);
+                auto baseType = alignOfInst->getBaseOp()->getDataType();
+                if (SLANG_FAILED(getNaturalSizeAndAlignment(targetProgram->getOptionSet(), baseType, &sizeAlignment)))
+                    break;
+                if (sizeAlignment.size == 0)
+                    break;
+
+                IRBuilder builder(module);
+                builder.setInsertBefore(inst);
+                auto stride = builder.getIntValue(inst->getDataType(), sizeAlignment.getStride());
+                inst->replaceUsesWith(stride);
+                maybeRemoveOldInst(inst);
+                changed = true;
+            }
+            break;
         case kIROp_GetResultError:
             if (inst->getOperand(0)->getOp() == kIROp_MakeResultError)
             {
