diff options
| author | Theresa Foley <10618364+tangent-vector@users.noreply.github.com> | 2023-01-14 15:31:31 -0800 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2023-01-14 15:31:31 -0800 |
| commit | 14fab67c5edd8eb697ffb10dbcc0467678521eef (patch) | |
| tree | 88c82d8fd29d496a5a44ff51bc9a0f4f2ab2905f /source | |
| parent | 4adc64f2a033ec141df6a16e65131612b30fb23b (diff) | |
Fixes for crash when inlining at global scope (#2593)
* Fixes for crash when inlining at global scope
Recent changes to the way inlining is implemented in the Slang compiler
have broken certain scenarios involving `static const` declarations.
The basic problem is that the initial-value expression for a `static const`
gets lowered into IR code at the global scope of a module, and if
that code includes `call`s to stdlib operations marked `forceInlineEarly`,
then we end up trying to apply inlining to code at module scope.
The current inlining operation assumes that all `call`s are in basic
blocks, and that the correct way to do inlining involves splitting
those blocks.
This change adds logic to detect when the callee at a call site to
be inlined consists of a single basic block ending in a `return`,
and in that case it invokes specialized inlining logic that doesn't
split basic blocks and doesn't need to care if the original `call`
is in a basic block.
Thus we are able to inline calls to single-basic-block `forceInlineEarly`
functions called as part of the initialization for global-scope
`static const` variables.
This logic does *not* solve the problem of calls to multi-block
`forceInlineEarly` functions from the global scope. Such calls cannot
really be inlined.
A secondary problem that arises when inlining such calls is that the
callee might include local temporaries (`var` instructions) that are
read and written (`load`s and `store`s), and none of those instructions
should be allowed at the global scope.
In the case of the functions being inlined here, the `load`/`store`
operations are superfluous, and should be cleaned up by our SSA pass.
The only reason that they seem to *not* be getting cleaned up in the
case that was been triggering crashes is that the callee is a generic.
The current logic for the SSA pass was skipping the bodies of generic
functions, so they would not be cleaned up. This change enables the SSA
pass to apply to the bodies of generic functions, and also ensures that
SSA cleanups are applied *before* any `forceInlineEarly` functions get
inlined.
* fixup: liveness test outputs
Diffstat (limited to 'source')
| -rw-r--r-- | source/slang/slang-ir-inline.cpp | 109 | ||||
| -rw-r--r-- | source/slang/slang-ir-ssa.cpp | 20 | ||||
| -rw-r--r-- | source/slang/slang-lower-to-ir.cpp | 15 |
3 files changed, 141 insertions, 3 deletions
diff --git a/source/slang/slang-ir-inline.cpp b/source/slang/slang-ir-inline.cpp index ef01de47e..7fc977170 100644 --- a/source/slang/slang-ir-inline.cpp +++ b/source/slang/slang-ir-inline.cpp @@ -393,6 +393,75 @@ struct InliningPassBase return clonedInst; } + /// Inline the body of the callee for `callSite`, for a callee that has only + /// a single basic block. + /// + void inlineSingleBlockFuncBody( + CallSiteInfo const& callSite, IRCloneEnv* env, IRBuilder* builder) + { + auto call = callSite.call; + auto callee = callSite.callee; + + // The callee had better have only a single basic block. + // + auto firstBlock = callee->getFirstBlock(); + SLANG_ASSERT(!firstBlock->getNextBlock()); + + // We will loop over the instructions in the block and clone + // them into the same basic block as the `call`. + // + builder->setInsertBefore(call); + + // Along the way, we will detect any `return` instruction, + // and remember the (clone of the) returned value. + // + IRInst* returnVal = nullptr; + + for (auto inst : firstBlock->getChildren()) + { + switch (inst->getOp()) + { + default: + // In the common case we just clone the instruction as-is + _cloneInstWithSourceLoc(callSite, env, builder, inst); + break; + + case kIROp_Param: + // Parameters of the first block are the parameters of + // the function itself, so we skip them rather than + // clone them. + // + break; + + case kIROp_Return: + // We expect to see only a single `return` instruction, + // and when we see it we note the value being returned. + // + SLANG_ASSERT(!returnVal); + returnVal = findCloneForOperand(env, inst->getOperand(0)); + break; + } + } + + // We are going to remove the original `call` now that the callee + // has been inlined, but before we do that we need to replace + // all uses of the `call` with whatever value was produced by the + // inlined body of the callee. + // + if (returnVal) + { + call->replaceUsesWith(returnVal); + } + else + { + call->replaceUsesWith(builder->getVoidValue()); + } + + // Once the `call` has no uses, we can safely remove it. + // + call->removeAndDeallocate(); + } + /// Inline the body of the callee for `callSite`. void inlineFuncBody( CallSiteInfo const& callSite, IRCloneEnv* env, IRBuilder* builder) @@ -400,11 +469,45 @@ struct InliningPassBase auto call = callSite.call; auto callee = callSite.callee; - // Break the basic block containing the call inst into two basic blocks. + // If the callee consists of a single basic block *and* that block + // ends with a `return` instruction, then we can apply a simple approach + // to inlining that is compatible with any call site (including those + // at the global scope). + // + auto firstBlock = callee->getFirstBlock(); + SLANG_ASSERT(firstBlock); + if(!firstBlock->getNextBlock() && as<IRReturn>(firstBlock->getTerminator())) + { + inlineSingleBlockFuncBody(callSite, env, builder); + return; + } + + // If the callee has any non-trivial control flow (multiple basic blocks + // and terminators other than `return`), we will need to split the control + // flow of the caller at the block that contains `call`. + // + // For any of this to work, we have to assume that the `call` appears + // in a basic block inside of a function (not, e.g., at the global scope). + // auto callerBlock = callSite.call->getParent(); - builder->setInsertInto(callerBlock->getParent()); + SLANG_ASSERT(as<IRBlock>(callerBlock)); + auto callerFunc = callerBlock->getParent(); + SLANG_ASSERT(callerFunc); + + // As a fail-safe for release builds, if the above expectations are somehow + // *not* met, we will fall back to not inlining the call at all. + // + if (!callerFunc) + { + return; + } + + // We will create a new basic block block in the parent function that + // will contain all the instructions that come *after* the `call`. + // + builder->setInsertInto(callerFunc); auto afterBlock = builder->createBlock(); - + // Many operations (e.g. `cloneInst`) has define-before-use assumptions on the IR. // It is important to make sure we keep the ordering of blocks by inserting the // second half of the basic block right after `callerBlock`. diff --git a/source/slang/slang-ir-ssa.cpp b/source/slang/slang-ir-ssa.cpp index 2415f1388..0bd5c6e9f 100644 --- a/source/slang/slang-ir-ssa.cpp +++ b/source/slang/slang-ir-ssa.cpp @@ -1221,6 +1221,26 @@ bool constructSSA(IRModule* module, IRInst* globalVal) case kIROp_GlobalVar: return constructSSA(module, (IRGlobalValueWithCode*)globalVal); + case kIROp_Generic: + { + // The above cases handle the actual code-bearing declarations + // that can contian basic blocks with local variables, but + // we would also like to perform SSA simplifications on + // *generic* functions, and so we will also process any + // instruction that is produced by an `IRGeneric`. + // + // TODO: At some point we may simply want to apply this pass + // recursively to *all* instructions, in order to make it + // robust to the presence of nested functions in general. + + auto generic = cast<IRGeneric>(globalVal); + auto returnVal = findInnerMostGenericReturnVal(generic); + if(!returnVal) + return false; + + return constructSSA(module, returnVal); + } + default: break; } diff --git a/source/slang/slang-lower-to-ir.cpp b/source/slang/slang-lower-to-ir.cpp index 9378a69e8..383067363 100644 --- a/source/slang/slang-lower-to-ir.cpp +++ b/source/slang/slang-lower-to-ir.cpp @@ -9102,9 +9102,24 @@ RefPtr<IRModule> generateIRForTranslationUnit( // normal `call` + `ifElse`, etc. lowerErrorHandling(module, compileRequest->getSink()); + // Next, attempt to promote local variables to SSA + // temporaries and do basic simplifications. + // + constructSSA(module); + simplifyCFG(module); + applySparseConditionalConstantPropagation(module); + // Next, inline calls to any functions that have been // marked for mandatory "early" inlining. // + // Note: We performed certain critical simplifications + // above, before this step, so that the body of functions + // subject to mandatory inlining can be simplified ahead + // of time. By simplifying the body before inlining it, + // we can make sure that things like superfluous temporaries + // are eliminated from the callee, and not copied into + // call sites. + // performMandatoryEarlyInlining(module); // Next, attempt to promote local variables to SSA |
