diff options
| author | Tim Foley <tfoleyNV@users.noreply.github.com> | 2019-04-29 09:31:25 -0700 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2019-04-29 09:31:25 -0700 |
| commit | ded340beb4b5197b559626acc39920abb2d39e77 (patch) | |
| tree | e8908f4549991e4f71c52b0681a8033a73fae5ba /source | |
| parent | b7c60910367f2af2c359d76783975de0a4659c68 (diff) | |
Initial support for the `precise` keyword (#958)
Fixes #858
The `precise` keyword exists in both HLSL and GLSL and when applied to a variable declaration is supposed to indicate that all computations that contribute to the value of that variable should not be altered based on "fast-math" optimizations. The main examples are that separate multiply and add operations should not be turned into fused multiply-add (fma) operations, and that operations cannot ignore the possibility of infinity or not-a-number values (e.g., by assuming that `x * 0.0f` is always `0.0f`).
(Aside: it is possible that my understanding of what the semantics of `precise` are in HLSL and GLSL is imperfect so that either the GLSL variant isn't sufficient to provide the semantics of the HLSL keyword, or that the definition of "all computations that contribute" to a value isn't actually correct. We may need to revise this implementation based on subsequent learnings.)
The basic idea here is to turn the AST `precise` keyword into a `[precise]` decoration in the IR and then emit that as a `precise` keyword again in the output.
The main catch is that whereas most of our existing IR decorations apply to things like global shader parameters or `struct` members that usually stick around for the duration of compilation, `[precise]` will get slapped on local variables that will often get optimized away by our SSA pass. There are two ways a variable can get eliminated/replaced during the SSA pass:
1. A use of the variable can be replaced with an ordinary instruction that computes its value.
2. A use of the variable can be replaced with a reference to a "phi node" that will take on the appropriate value based on control flow.
These two cases already had logic to copy a "name hint" decoration from the variable over to an instruction that will replace it, and I simply extended them to also propagate over a `[precise]` decoration.
The test case added with this change intentionally constructs a case where `[precise]` needs to be propagated over to an SSA "phi node" in order to generate correct output code.
The other gotcha is that we can emit variable declarations in various places in `emit.cpp`, and all of these needed to handle `[precise]`. Not only do we have actually local variables (`IRVar`), but we also have SSA phi nodes (`IRParam`), and then there are cases where an intermediate computation (an ordinary instruction) should be `[precise]` and thus we need to emit it as a temporary (not folding it into its use sites) and make sure that the temporary itself gets the `precise` keyword.
I have manually confirmed that in the output SPIR-V, this change results in the `NoContraction` SPIR-V decoration being added to the relevant operations, and the output DXBC contains a multiply and an add in place of a multiply-add. The output DXIL does not show any obvious changes due to `precise`, although the exact order and operands of the math instructions emitted does differ when `precise` is added/removed. In all cases the output is equivalent to hand-written HLSL/GLSL with a `precise`-qualified local variable.
Diffstat (limited to 'source')
| -rw-r--r-- | source/slang/emit.cpp | 26 | ||||
| -rw-r--r-- | source/slang/ir-clone.cpp | 15 | ||||
| -rw-r--r-- | source/slang/ir-clone.h | 13 | ||||
| -rw-r--r-- | source/slang/ir-inst-defs.h | 1 | ||||
| -rw-r--r-- | source/slang/ir-insts.h | 1 | ||||
| -rw-r--r-- | source/slang/ir-ssa.cpp | 40 | ||||
| -rw-r--r-- | source/slang/lower-to-ir.cpp | 4 | ||||
| -rw-r--r-- | source/slang/modifier-defs.h | 2 | ||||
| -rw-r--r-- | source/slang/parser.cpp | 2 |
9 files changed, 88 insertions, 16 deletions
diff --git a/source/slang/emit.cpp b/source/slang/emit.cpp index e69ee2a93..8bb2a244b 100644 --- a/source/slang/emit.cpp +++ b/source/slang/emit.cpp @@ -2554,6 +2554,14 @@ struct EmitVisitor if(inst->mightHaveSideEffects()) return false; + // Don't fold instructions that are marked `[precise]`. + // This could in principle be extended to any other + // decorations that affect the semantics of an instruction + // in ways that require a temporary to be introduced. + // + if(inst->findDecoration<IRPreciseDecoration>()) + return false; + // Okay, at this point we know our instruction must have a single use. auto use = inst->firstUse; SLANG_ASSERT(use); @@ -2707,6 +2715,8 @@ struct EmitVisitor if (as<IRVoidType>(type)) return; + emitIRTempModifiers(ctx, inst); + emitIRRateQualifiers(ctx, inst); emitIRType(ctx, type, getIRName(inst)); @@ -5141,6 +5151,7 @@ struct EmitVisitor { for (auto pp = bb->getFirstParam(); pp; pp = pp->getNextParam()) { + emitIRTempModifiers(ctx, pp); emitIRType(ctx, pp->getFullType(), getIRName(pp)); emit(";\n"); } @@ -5849,6 +5860,19 @@ struct EmitVisitor } } + /// Emit modifiers that should apply even for a declaration of an SSA temporary. + void emitIRTempModifiers( + EmitContext* ctx, + IRInst* temp) + { + SLANG_UNUSED(ctx); + + if(temp->findDecoration<IRPreciseDecoration>()) + { + emit("precise "); + } + } + void emitIRVarModifiers( EmitContext* ctx, VarLayout* layout, @@ -5897,6 +5921,8 @@ struct EmitVisitor } } + emitIRTempModifiers(ctx, varDecl); + if (!layout) return; diff --git a/source/slang/ir-clone.cpp b/source/slang/ir-clone.cpp index b648efec6..bd5f9bcaa 100644 --- a/source/slang/ir-clone.cpp +++ b/source/slang/ir-clone.cpp @@ -241,10 +241,11 @@ IRInst* cloneInst( void cloneDecoration( IRDecoration* oldDecoration, - IRInst* newParent) + IRInst* newParent, + IRModule* module) { SharedIRBuilder sharedBuilder; - sharedBuilder.module = newParent->getModule(); + sharedBuilder.module = module; IRBuilder builder; builder.sharedBuilder = &sharedBuilder; @@ -258,6 +259,16 @@ void cloneDecoration( cloneInst(&env, &builder, oldDecoration); } +void cloneDecoration( + IRDecoration* oldDecoration, + IRInst* newParent) +{ + cloneDecoration( + oldDecoration, + newParent, + newParent->getModule()); +} + bool IRSimpleSpecializationKey::operator==(IRSimpleSpecializationKey const& other) const { auto valCount = vals.Count(); diff --git a/source/slang/ir-clone.h b/source/slang/ir-clone.h index ce5ebed42..bafbfa69d 100644 --- a/source/slang/ir-clone.h +++ b/source/slang/ir-clone.h @@ -127,6 +127,19 @@ IRInst* cloneInst( /// Clone `oldDecoration` and attach the clone to `newParent`. /// + /// Uses `module` to allocate any new instructions. + /// +void cloneDecoration( + IRDecoration* oldDecoration, + IRInst* newParent, + IRModule* module); + + /// Clone `oldDecoration` and attach the clone to `newParent`. + /// + /// Uses the module of `newParent` to allocate any new instructions, + /// so that `newParent` must already be installed somewhere + /// in the ownership hierarchy of an existing module. + /// void cloneDecoration( IRDecoration* oldDecoration, IRInst* newParent); diff --git a/source/slang/ir-inst-defs.h b/source/slang/ir-inst-defs.h index 5ac7bd24e..f2393b2b3 100644 --- a/source/slang/ir-inst-defs.h +++ b/source/slang/ir-inst-defs.h @@ -400,6 +400,7 @@ INST(HighLevelDeclDecoration, highLevelDecl, 1, 0) INST(VulkanCallablePayloadDecoration, vulkanCallablePayload, 0, 0) INST(EarlyDepthStencilDecoration, earlyDepthStencil, 0, 0) INST(GloballyCoherentDecoration, globallyCoherent, 0, 0) + INST(PreciseDecoration, precise, 0, 0) INST(PatchConstantFuncDecoration, patchConstantFunc, 1, 0) /// An `[entryPoint]` decoration marks a function that represents a shader entry point. diff --git a/source/slang/ir-insts.h b/source/slang/ir-insts.h index 26a5b32be..ab333f2be 100644 --- a/source/slang/ir-insts.h +++ b/source/slang/ir-insts.h @@ -226,6 +226,7 @@ struct IRRequireGLSLExtensionDecoration : IRDecoration IR_SIMPLE_DECORATION(ReadNoneDecoration) IR_SIMPLE_DECORATION(EarlyDepthStencilDecoration) IR_SIMPLE_DECORATION(GloballyCoherentDecoration) +IR_SIMPLE_DECORATION(PreciseDecoration) /// A decoration that marks a value as having linkage. /// diff --git a/source/slang/ir-ssa.cpp b/source/slang/ir-ssa.cpp index 8c5db68fe..624454a43 100644 --- a/source/slang/ir-ssa.cpp +++ b/source/slang/ir-ssa.cpp @@ -2,6 +2,7 @@ #include "ir-ssa.h" #include "ir.h" +#include "ir-clone.h" #include "ir-insts.h" namespace Slang { @@ -350,21 +351,36 @@ IRInst* readVar( SSABlockInfo* blockInfo, IRVar* var); -/// Try to take any name hint on `var` and apply it to `val`. -/// -/// Doesn't do anything if `val` already has a name hint, -/// or if `var` doesn't have one to transfer over. -/// -void maybeApplyNameHint( - ConstructSSAContext* context, + /// Try to copy any relevant decorations from `var` over to `val`. + /// +static void cloneRelevantDecorations( IRVar* var, IRInst* val) { - if( auto nameHint = var->findDecoration<IRNameHintDecoration>() ) + // Copy selected decorations over from the original + // variable to the SSA variable, when doing so is + // required for semantics. + // + for( auto decoration : var->getDecorations() ) { - if( !val->findDecoration<IRNameHintDecoration>() ) + switch(decoration->op) { - context->getBuilder()->addNameHintDecoration(val, nameHint->getName()); + default: + // Ignore most decorations. + // + // TODO: Should we include or exclude by default? + break; + + case kIROp_PreciseDecoration: + case kIROp_NameHintDecoration: + // Copy these decorations if the target doesn't already have them, + // but don't make duplicate decorations on the target. + // + if( !val->findDecorationImpl(decoration->op) ) + { + cloneDecoration(decoration, val, var->getModule()); + } + break; } } } @@ -383,7 +399,7 @@ PhiInfo* addPhi( valueType = context->getBuilder()->getRateQualifiedType(rate, valueType); } IRParam* phi = builder->createParam(valueType); - maybeApplyNameHint(context, var, phi); + cloneRelevantDecorations(var, phi); RefPtr<PhiInfo> phiInfo = new PhiInfo(); context->phiInfos.Add(phi, phiInfo); @@ -808,7 +824,7 @@ void processBlock( // block. auto val = readVar(context, blockInfo, var); - maybeApplyNameHint(context, var, val); + cloneRelevantDecorations(var, val); val = applyAccessChain(context, &blockInfo->builder, ptrArg, val); diff --git a/source/slang/lower-to-ir.cpp b/source/slang/lower-to-ir.cpp index 0cc5b2a5c..0864d4bfe 100644 --- a/source/slang/lower-to-ir.cpp +++ b/source/slang/lower-to-ir.cpp @@ -1680,6 +1680,10 @@ void addVarDecorations( { builder->addSimpleDecoration<IRGloballyCoherentDecoration>(inst); } + else if(as<PreciseModifier>(mod)) + { + builder->addSimpleDecoration<IRPreciseDecoration>(inst); + } else if(auto formatAttr = as<FormatAttribute>(mod)) { builder->addFormatDecoration(inst, formatAttr->format); diff --git a/source/slang/modifier-defs.h b/source/slang/modifier-defs.h index 1fa3a1f8e..01ed792a9 100644 --- a/source/slang/modifier-defs.h +++ b/source/slang/modifier-defs.h @@ -275,7 +275,7 @@ SIMPLE_SYNTAX_CLASS(HLSLSampleModifier, InterpolationModeModifier) SIMPLE_SYNTAX_CLASS(HLSLCentroidModifier, InterpolationModeModifier) // HLSL `precise` modifier -SIMPLE_SYNTAX_CLASS(HLSLPreciseModifier, Modifier) +SIMPLE_SYNTAX_CLASS(PreciseModifier, Modifier) // HLSL `shared` modifier (which is used by the effect system, // and shouldn't be confused with `groupshared`) diff --git a/source/slang/parser.cpp b/source/slang/parser.cpp index 5cb712489..a6063a2a9 100644 --- a/source/slang/parser.cpp +++ b/source/slang/parser.cpp @@ -4663,7 +4663,7 @@ namespace Slang MODIFIER(linear, HLSLLinearModifier); MODIFIER(sample, HLSLSampleModifier); MODIFIER(centroid, HLSLCentroidModifier); - MODIFIER(precise, HLSLPreciseModifier); + MODIFIER(precise, PreciseModifier); MODIFIER(shared, HLSLEffectSharedModifier); MODIFIER(groupshared, HLSLGroupSharedModifier); MODIFIER(static, HLSLStaticModifier); |
