The Slang compiler was bitten by a known issue when translating from SSA form back to straight-line code. Given code like the following:
int x = 0;
int y = 1;
while(...)
{
    ...
    int t = x;
    x = y;
    y = t;
}
...
The SSA construction pass will eliminate the temporary `t` and yield code something like:
br(b, 0, 1);
block b(param x : Int, param y : Int):
    ...
    br(b, y, x);
The loop-dependent variables have become parameters of the loop block, and the branches to the top of the loop pass the appropriate values for the next iteration (e.g., the jump that starts the loop sends in `0` and `1`).
The problem comes up when translating the back-edge that continues the loop out of SSA form. Our generated code will re-introduce variables for `x` and `y`:
int x;
int y;
// jump into loop becomes:
x = 0;
y = 1;
for(;;)
{
    ...
    // back-edge becomes
    x = y;
    y = x;
    continue;
}
The problem is that we've naively translated a branch like `br(b, <a>, <b>)` into `x = <a>; y = <b>;`, which doesn't work correctly when `<b>` is `x`, because we will have already clobbered the value of `x` with `<a>`.
The simplest fix is to introduce a temporary (just like the input code had), and generate:
// back-edge becomes
int t = x;
x = y;
y = t;
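To see the difference concretely, here is a small standalone C++ sketch (not compiler code) that models one trip through the back-edge `br(b, y, x)` both ways. The function names `naiveBackEdge` and `backEdgeWithTemp` are made up for illustration.

```cpp
#include <utility>

// Naive lowering of br(b, y, x): assign each block parameter in order.
// The second assignment reads the already-overwritten x.
std::pair<int, int> naiveBackEdge(int x, int y)
{
    x = y;
    y = x; // reads the new x, so y keeps its old value
    return {x, y};
}

// Lowering with a temporary, which preserves the "all arguments are read
// before any parameter is written" behavior of the branch.
std::pair<int, int> backEdgeWithTemp(int x, int y)
{
    int t = x;
    x = y;
    y = t;
    return {x, y};
}
```

Starting from `x = 0, y = 1`, the naive version ends with both variables equal to `1`, while the version with the temporary swaps them as the original loop body did.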
This change modifies the `emitPhiVarAssignments()` function so that it detects bad cases like the above and emits temporaries to work around the problem. A new test case is included that produced incorrect output before the change, and now produces the expected results.
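The detection can be sketched roughly as follows. This is a simplified illustration, not the actual body of `emitPhiVarAssignments()`: the `BranchArg` struct, the `emitParamAssignments` name, the string-based output, and the hard-coded `int` type are all assumptions made to keep the example self-contained. The idea is that an argument is hazardous when it directly reads a block parameter that an earlier assignment in the sequence will already have overwritten.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a branch argument: either a direct reference
// to one of the target block's parameters (by index), or some other
// expression given as text.
struct BranchArg
{
    int paramIndex;   // >= 0 if the argument is a target-block parameter, else -1
    std::string text; // expression text, used only when paramIndex < 0
};

// Emits the assignments for br(target, args...) as lines of text.
// Assumes one argument per parameter and that every parameter is an int.
std::vector<std::string> emitParamAssignments(
    const std::vector<std::string>& paramNames,
    const std::vector<BranchArg>& args)
{
    assert(paramNames.size() == args.size());

    std::vector<std::string> lines;
    std::vector<std::string> sources(args.size());
    std::vector<bool> hasTemp(paramNames.size(), false);

    // Pass 1: pick the source for each assignment. Parameter p is written
    // at step p, so argument i reads a clobbered value exactly when p < i.
    for (std::size_t i = 0; i < args.size(); ++i)
    {
        int p = args[i].paramIndex;
        if (p < 0)
        {
            sources[i] = args[i].text;
        }
        else if (static_cast<std::size_t>(p) < i)
        {
            std::string temp = "tmp_" + paramNames[p];
            if (!hasTemp[p])
            {
                // Save the old value before any assignment is emitted.
                lines.push_back("int " + temp + " = " + paramNames[p] + ";");
                hasTemp[p] = true;
            }
            sources[i] = temp;
        }
        else
        {
            sources[i] = paramNames[p];
        }
    }

    // Pass 2: emit the assignments in order, skipping self-assignments.
    for (std::size_t i = 0; i < args.size(); ++i)
    {
        if (args[i].paramIndex == static_cast<int>(i))
            continue;
        lines.push_back(paramNames[i] + " = " + sources[i] + ";");
    }
    return lines;
}
```

For `br(b, y, x)` this sketch produces `int tmp_x = x;`, `x = y;`, `y = tmp_x;`, while for `br(b, 0, 1)` it emits the two plain assignments with no temporary. It is deliberately conservative and only handles arguments that are block parameters directly, which is also the limit of the fix described here.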
A secondary change is folded in here that tries to guard against a more subtle version of the problem:
for(...)
{
    ...
    int x1 = x + 1;
    int y1 = y + 1;
    x = y1;
    y = x1;
}
In this more complicated case, each of `x` and `y` is assigned a value derived from the other, but neither is set directly from a block parameter, so the changes to `emitPhiVarAssignments()` do not apply.
The problem in this case would be if the `shouldFoldInstIntoUseSites()` logic decided to fold the computation of `x1` or `y1` into the branch instruction, resulting in:
x = y + 1;
y = x + 1;
which would again violate the semantics of the original code, because now there is an assignment to `x` before the computation of `x + 1`.
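The same kind of standalone C++ model shows how the folded form diverges from the original. The function names `stepUnfolded` and `stepFolded` are hypothetical, and each function models a single loop iteration.

```cpp
#include <utility>

// Original form: both sums are computed from the old values of x and y
// before either variable is written.
std::pair<int, int> stepUnfolded(int x, int y)
{
    int x1 = x + 1;
    int y1 = y + 1;
    x = y1;
    y = x1;
    return {x, y};
}

// Folded form: the sums are computed at the point of assignment, so the
// second one reads the x that was just overwritten.
std::pair<int, int> stepFolded(int x, int y)
{
    x = y + 1;
    y = x + 1; // uses the new x, not the value x had at loop entry
    return {x, y};
}
```

With `x = 0, y = 1` on entry, the unfolded step yields `x = 2, y = 1`, while the folded step yields `x = 2, y = 3`.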
Right now it seems impossible to force this case to arise in practice, due to implementation details in how we generate IR code for loops. In particular, the block that computes the `x+1` and `y+1` values is currently always distinct from the block that branches back to the top of the loop, and we do not allow "folding" of sub-expressions from different blocks. It is possible, however, that future changes to the compiler could change the form of the IR we generate and make it possible for this problem to arise.
The right fix for this issue would be to say that we should introduce a temporary for any branch argument that "involves" a block parameter (whether directly using it or using it as a sub-expression). Unfortunately, the ad hoc approach we use for folding sub-expressions today means that testing if an operand "involves" something would be both expensive and unwieldy.
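As a rough idea of what an "involves" test would look like, here is a minimal sketch over a toy expression tree. The `Expr` type and the `involvesParam` function are hypothetical and do not correspond to Slang's IR classes. A real version would also have to follow only the operands that the folding logic actually decides to fold, which is the part that makes it unwieldy in practice.

```cpp
#include <memory>
#include <vector>

// Toy expression node: either a reference to a block parameter
// (paramIndex >= 0) or an operation over zero or more operands.
struct Expr
{
    int paramIndex = -1;
    std::vector<std::shared_ptr<Expr>> operands;
};

// Returns true if the expression is a block parameter or uses one
// anywhere in its operand tree. There is no memoization, so shared
// sub-expressions are revisited on every path that reaches them.
bool involvesParam(const Expr& e)
{
    if (e.paramIndex >= 0)
        return true;
    for (const auto& operand : e.operands)
    {
        if (operand && involvesParam(*operand))
            return true;
    }
    return false;
}
```

Under this sketch, an expression such as `y + 1` would be flagged because one of its operands is the parameter `y`, so a branch passing it would get a temporary.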
A more expedient fix is to disallow *all* folding of sub-expressions into unconditional branch instructions (the ones that can pass arguments to the target block), which is what I ended up implementing in this change. Making that defensive change alters the GLSL we output for some of our cross-compilation tests, in a way that required me to update the baseline/gold GLSL.
A better long-term fix for this whole space of issues would be to have the "de-SSA" operation be something we do explicitly on the IR. Such an IR pass would still need to be careful about the first issue addressed in this change, but the second one should (in principle) be a non-issue given that our emit/folding logic already handles code with explicit mutable local variables correctly.