slang.git/source/slang/ir-constexpr.cpp, branch master

Use slang- prefix on slang compiler and core source (#973)

2019-05-31T21:20:37+00:00

* Prefixing source files in source/slang with slang- * Prefix source in source/slang with slang- prefix. * Rename core source files with slang- prefix. * Update project files. * Fix problems from automatic merge.

String/List closer to conventions, and use Index type (#959)

2019-04-29T21:03:46+00:00

* List made members m_ Tweaked types to closer match conventions. * Use asserts for checking conditions on List. Other small improvements. * List.Count() -> getSize() * List Add -> add First -> getFirst Last -> getLast RemoveLast -> removeLast ReleaseBuffer -> detachBuffer GetArrayView -> getArrayView * List:: AddRange -> addRange Capacity -> getCapacity Insert -> insert InsertRange -> insertRange AddRange -> addRange RemoveRange -> removeRange RemoveAt -> removeAt Remove -> remove Reverse -> reverse FastRemove -> fastRemove FastRemoveAt -> fastRemoveAt Clear -> clear * List FreeBuffer -> _deallocateBuffer Free -> clearAndDeallocate SwapWith -> swapWith * List SetSize -> setSize Reserve -> reserve GrowToSize growToSize * UnsafeShrinkToSize -> unsafeShrinkToSize Compress -> compress FindLast -> findLastIndex FindLast -> findLastIndex Simplify Contains * List Removed m_allocator (wasn't used) Swap -> swapElements Sort -> sort Contains -> contains ForEach -> forEach QuickSort -> quickSort InsertionSort -> insertionSort BinarySearch -> binarySearch Max -> calcMax Min -> calcMin * Initializer::Initialize -> initialize List:: Allocate -> _allocate Init -> _init IndexOf -> indexOf * * Put #include in common.h, and remove unneeded inclusions * Small refactor of ArrayView - remove stride as not used * getSize -> getCount setSize -> setCount unsafeShrinkToSize->unsafeShrinkToCount growToSize -> growToCount m_size -> m_count * Some tidy up around Allocator. * Use Index type on List. * Refactor of IntSet. First tentative look at using Index. * Made Index an Int Did preliminary fixes. Made String use Index. * Partial refactor of String. * String::Buffer -> getBuffer ToWString -> toWString * Small improvements to String. String:: Buffer() -> getBuffer() Equals() -> equals * Try to use Index where appropriate. * Fix warnings on windows x86 builds.

Improve handling of {} initializer list expressions (#778)

2019-01-16T19:50:14+00:00

Fixes #775 It was reported (in #775) that Slang doesn't handle initializer-list syntax when initializing matrix variables. When starting on a fix for that it became apparent that the time was right to fix two broad issues in the compiler's current handling of `{}`-enclosed initializer lists. The first issue was that the front-end checking of initializer lists wasn't handling the C-style behavior where an initializer list can either contain nested `{}`-enclosed lists for sub-arrays/-structures, or directly contain "leaf" values for initializing those aggregates. For example, the following two variable declarations ought to be equivalent: ```hlsl int4 a[] = { {1, 2, 3, 4}, {5, 6, 7, 8} }; int4 b[] = { 1, 2, 3, 4, 5, 6, 7, 8 }; ``` Getting this distinction right is important because we want to support initializing a matrix either from a list of vectors for its rows, or a list of scalars for its elements (in row-major order). The front-end semantic checking logic for initializer lists was revamped so that it conceptually tries to "read" an expression of a desired type from the initializer list, and decides at each step whether to consume a single expression by coercing it to the desired type, or to recursively read multiple sub-values to construct the type as an aggregate. The logic for deciding between direct vs aggregate initialization could potentially use some tweaking, but luckily it should always handle the case where users introduce explicit `{}`-enclosed sub-lists to make their intention clear, so that existing Slang code should continue to work as before. The second issue was that initializers without the expected number of elements weren't implemented in code generation, so they would lead to internal compiler errors. This change revamps the codegen logic for initializer lists so that it can synthesize default values for fields/elements that were left out during initialization. This includes an attempt to support default initialization of `struct` fields based on explicitly written initialization expressions.

Move mangled name out of IRGlobalValue (#752)

2018-12-13T20:13:58+00:00

* Move mangled name out of IRGlobalValue Previously the `IRGlobalValue` type was used as a root for all IR instructions that can have "linkage," in the sense that a definition in one module can satisfy a use in another module. The mangled symbol name was stored in state directly on each `IRGlobalValue`, which created some complications, and also forced IR instructions that wanted to support linkage to wedge into the hierarchy at that specific point. This change moves the mangled name out into a decoration: either an `IRImportDecoration` or an `IRExportDecoration`, both of which inherit from `IRLinkageDecoration` which exposes the mangled name. This change has a few benefits: * We can now have any kind of instruction be exported/imported, without having to inherit from `IRGlobalValue`. This could potentially let `IRStructType` and `IRWitnessTable` be simplified to just have operand lists instead of dummy chldren as they do today. * We can now easily have "global values" like functions that explicitly *don't* get linkage, instead of using a null or empty mangled name as a marker. * We can use the exact opcode on a linkage decoration to distinguish imports from exports, which could be used to more accurately resolve symbols during the linking step. Other than adding the decorations and making sure that AST->IR lowering adds them, the main changes here are around any code that used `IRGlobalValue`. Variables and parameters of type `IRGlobalValue*` were changed to `IRInst*` easily, so the main challenge was around code that *casts* to `IRGlobalValue*. In cases where a cast to `IRGlobalValue` also performed a test for the mangled name being non-null/non-empty, we simply switched the code to check for the presence of an `IRLinkageDecoration`, since that is the new way of indicating a value with linakge. Most of the serious complications arose in `ir.cpp` around the "linking"/target-specialization and generic specialization steps. The "linking" logic was checking for `IRGlobalValue` to opt into some more complicated cloning logic, and just checking for a linkage decoration here wasn't sufficient since the front-end *does* produce global values without linkage in some cases (e.g., for a function-`static` variable we produce a global variable without linkage). This logic was updated to just check for the cases that used to amount to `IRGlobalValue`s directly by opcode. It might be simpler in the short term to have kept `IRGlobalValue` around to make the existing casts Just Work, but I'm confident that this logic could actually be rewritten for much greater clarity and simplicity and that is the better way forward. The generic specialization logic was using some really messy code to generate a new mangled name to represent the specialized symbol, and then searching for an existing match for that name. The original idea there was that an IR module could include "pre-specialized" versions of certain generics to speed up back-end compilation by eliminating the need to specialize in some cases, but this feature has never been implemented so the overhead here is just a waste. Instead, I moved generic specialization to use a simpler dictionary to map the operands to a `specialize` instruction over to the resulting specialized value. This allows for some simplifications in the name mangling logic, because it no longer needs to figure out how to produce mangled names from IR instructions representing types/values. As part of this change I also overhauled the IR emit logic to produce cleaner output by default, borrowing some of the ideas from the logic in `emit.cpp`. IR values are now automatically given names based on their "name hint" decoration, if any, to make the code easier to follow, and I also made it so that types and literals get collapsed into their use sites in a new "simplified" IR dump mode (which is currently the default, with no way to opt into the other mode without tweaking the code). The resulting IR dumps are much nicer to look at, but as a result the one test that involves IR dumping (`ir/string-literal`) doesn't really test what it used to. One weird issue that came up during testing is that the `transitive-interface` test had previously been producing output that made no sense (that is, the expected output file wasn't really sensible), and somehow these changes were altering its behavior. Changing the test to use `int` values instead of `float` was enough to make the output be what I'd expect, and hand inspection of generating DXBC has me convinced we were compiling the `float` case correctly too. There appears to be some issue around tests with floating-point outputs that we should investigate. * fixup: C++ declaration order

Add a warning on missing return, and initial SCCP pass (#671)

2018-10-12T19:23:14+00:00

* Add a warning on missing return, and initial SCCP pass The user-visible feature added here is a diagnostic for functions with non-`void` return type where control flow might fall off the end. This *sounds* like a trivial diagnostic to add as part of the front-end AST checking, but that can run afoul of really basic stuff like: ```hlsl int thisFunctionisOkay(int a) { while(true) { if(a > 10) return a; a = a*2 + 1; } // no return here! } ``` This function "obviously" doesn't need to have a `return` statement at the end there, but realizing this fact relies on the compiler to understand that the `while(true)` loop can't exit normally, and doesn't contain any `break` statement. One can write "obvious" examples that need more and more complex analysis to rule out. The answer Slang uses for stuff like this is to do the analysis at the IR level right after initial code generation (this would be before serialization, BTW, so that attached `IRHighLevelDeclDecoration`s can be used). When lowering the AST to the IR, we always emit a `missingReturn` instruction (a subtype of `IRUnreachable`) at the end of its body if it isn't already terminated. The IR analysis pass to detect missing `return` statements is then as simple as just walking through all the functions in the module and making sure they don't contain `missingReturn` instructions. For that simple pass to work, we first need to make some effort to remove dead blocks that control flow can never reach. This change adds a very basic initial implementation of Spare Conditional Constant Propagation (SCCP), which is a well-known SSA optimization that combines constant propagation over SSA form with dead code elimination over a CFG to achieve optimizations that are not possible with either optimization along. For the moment, we don't actually implement any constant *folding* as part of the SCCP pass, so we can eliminate the dead block in a case like the function above (and those in the test case added in this change), but will not catch things like a `while(0 < 1)` loop. Handling more "obvious" cases like that is left for future work. * fixup: warning on unreachable code * Handle case where user of an inst isn't in same function/code The code as assuming any instruction in the SSA work list has to come from the function/code being processed, but this misses the case where an instruction in a generic has a use inside the function that the generic produces. This change adds code to guard against that case.

Introduce an IR-level type system (#481)

2018-04-11T23:18:29+00:00

* Introduce an IR-level type system Up to this point, the Slang IR has used the front-end type system to represent types in the IR. As a result (but ultimately more importantly) the IR representation of generics and specialization has used AST-level concepts embedded in the IR. For example, to express the specialization of `vector` to a concrete type `float` for `T`, we needed an IR operation that could represent the specialization, with operands that somehow represented the type argument `float`. The whole thing was very complicated. The big idea of this change is to introduce a new representation in which types in the IR are just ordinary instructions, so that using them as operands makes sense. The hierarchy of IR types closely mirrors the AST-side hierarchy for now, and that will probably be something we should maintain going forward. In order to make these changes work, though, I also had to do major overhauls of things like the way substitutions are performed, how we check interface conformances, the way lookup through interface types is done, etc. etc. This is a big change, and unfortunately any attempt to summarize it in the commit message wouldn't do it justice. * Fix 64-bit build warning * Fix up some clang warnings/errors

IR: next phase of "everything is an instruction" (#433)

2018-03-03T15:16:08+00:00

The main practical change here is that things that used to be `IRValue`s, like literals, are now being expressed as instructions in the global scope. In order to validate that things are actually being handled correctly, this change introduces an explicit "validation" pass that can be run on the IR to check for different invariants (although it doesn't check many of the important ones right now). I've left the validation pass turned off by default, but with a command-line flag to enable it. We may want to make it be on by default in debug builds, just to keep us honest. The main invariant for the moment is that when on IR instruction is used as an operand to another, it had better come from the same IR module. Some of the existing passes were violating this rule, in particular when it came to cloning of witness tables related to global generic parameter substitution. Those features can in theory be handled better now by allowing `specialize` instructions at other scopes, but I didn't want to over-complicate this change, so I make just enough fixes to ensure that these steps always clone witness tables they get from the "symbols" on an IR specialization context. In order for this to work when recursively specializing, I had to ensure that the logic for generic specialization had a notion of a "parent" specialization context that it would fall back to to perform cloning when necessary. This change keeps the logic that was caching and re-using the instructions for literal values within a module, but adds some logic that isn't really being tested right now for picking the right parent instruction to insert a constant instruction into. This logic doesn't trigger right now because all of the cases we are using it on have zero operands (and so they always get "hoisted" to the global scope), but eventually for things like types we want to be able to support instructions with operands (e.g., `vector`) and handle the case where some of those operands come from different scopes (e.g., when nested inside a generic). The final change here is mostly cosmetic: the `IRBuilder` is now more abstract about where insertion occurs: it tracks a single `IRParentInst` to insert into, and then an optional `IRInst` to insert before. In the common case, that parent is an `IRBlock`, but it could conceivably also be the global scope, or a witness table, etc. Use sites where we used to change those fields directly now use distinct methods `setInsertInto(parent)` and `setInsertBefore(inst)` which capture the two cases we care about. Accessors are also defined to extract the current block (if the current parent is a block), and the current "function" (global value with code, if the current parent is a global value with code, or a block inside one). With this work in place, it should be possible for a follow-on change to start putting `specialize` instructions at the global scope and thus clean up some of the on-the-fly specialization work. This work should also help with some of the requirements around a distinct IR-level type system and more explicit generics.

IR: "everything is an instruction" (#432)

2018-03-01T21:11:24+00:00

* IR: "everything is an instruction" This change tries to streamline the representation of the IR in the following ways: * Every IR value is an instruction (there is no `IRValue` type any more) * All IR values that can contain other values share a single base (`IRParentInstruction`) * Dynamic casts to specific IR instruction types can be accomplished with a new `as(inst)` operation, that uses the IR opcode to implement casts. The biggest change in terms of number of lines is getting rid of `IRValue`. The diff here could probably be smaller if I'd just done `typedef IRInst IRValue;`. Along the way I also renamed the `getArg`/`getArgs`/`getArgCount` combination over to `getOperand`/`getOperands`/`getOperandCount` to avoid being confusing when we have something like a `call` instruction where the "arguments" of the call don't line up with the operands of the instruction. I also tried to clean up the representation of lists of child instructions to try to make it easier to iterate over them with C++ range-based `for` loops. Developers still need to be careful about mutating the contents of a block while iterating over it in this fashion (e.g. if you remove the "current" element, the iteration will end prematurely). Probably the thorniest change here is that parameters are now just represented as the first N instructions in a block, which means: * We need to perform a linear search to find the end of the parameter list. This is probably not often a problem, because usually you would be iterating over the parameters anyway, and that will be linear in the number of parameters. * Algorithms that iterate over a block either need to ignore parameters, treat parameters just like other instructions, or somehow cleave the list into the range of parameters, and the range of "ordinary" instructions (which involves the same linear search above). * When inserting into a block, we need to be careful not to insert instructions at invalid locations (e.g., insert a temporary before the parameters, or insert a parameter in the middle of the code). I can't pretend that I've handled the details of that here. (This is no different than having to make the same adjustments for phi nodes in a typical SSA representation) * One possible future-proof approach is to implement a pass that sorts the instructions in a block so that parameters always come first. That would let us implement passes without caring about this detail, and then clean up right before any pass that cares about the relative order of parameters and other instructions. The current change is missing any work to make literals and other instructions that used to be `IRValue`s properly nest inside of their parent module. Right now these instructions are just left unparented, and may actually end up being shared between distinct modules. Fixing that will need a follow-up change. The biggest challenge there is that it introduces instructions at the global scope that aren't `IRGlobalValue`s. This change doesn't try to take advantage of any of the new flexibility (e.g., by nesting `specialize` instructions inside of witness tables). The goal is to do exactly what we were doing before, just with a different representation. * Warning fix

Initial work on validating "constexpr"-ness in IR (#420)

2018-02-22T19:44:57+00:00

* Initial work on validating "constexpr"-ness in IR The underlying issue here is that certain operations in the target shading languages constrain their operands to be compile-time constants. A notable example is the optional texel offset parameter to the `Texture2D.Sample` operation. When calling these operations in GLSL, the user is required to pass a "constant expression," and any variables in that expression must therefore be marked with the `const` qualifier (and themselves be initialized with constant expressions). Any GLSL output we generate must of course respect these rules. When calling these operations in HLSL, the user is not so constrained. Instead, they can pass an arbitrary expression, which may involve ordinary variables with no particular markup, and then the compiler is responsible for determining if the actual value after simplification works out to be a constant. In some cases, the requirement that a value be constant might actually trigger things like loop unrolling. Also, it is okay to use a function parameter to determine such a constant expression, as long as the argument turns out to be a constant at all call sites. The way we have decided to tackle these challenges in Slang is that we we propagate a notion of `constexpr`-ness through the IR. This is currently being tackled in `ir-constexpr.cpp` with a combination of forward and backward iterative dataflow: * When the operands to an instruction are all `constexpr`, and the opcode is one we believe can be constant-folded, then we infer that the instruction *can* be evaluated as `constexpr` * When instruction is required to be `constexpr`, then we infer that all of its operands are also required to be `constexpr`. If this process ever infers that a function parameter is required to be `constexpr`, then we might have to continue propagation at all the call sites to that function. If after all the propagation is done, there are any cases where an instruction is *required* to be `constexpr`, but it *can't* be `constexpr` (we weren't able to infer `constexpr`-ness for its operands), then we issue an error. This implementation encodes the idea of `constexpr`-ness in the IR as part of the type system, using a simplified notion of rates. This change adds a `RateQualifiedType` that can represent `@R T`, and then introduces a `ConstExprRate` that can be used for `R`. Many accessors for the type information on IR nodes were updated to distinguish when one wants the "full" type of an IR value (which might include rate information) vs. just the "data" type. A `constexpr` qualifier was added in the front-end, and is being used to decorate the texel offset parameter for `Texture2D.Sample`. Lowering from AST to IR looks for this qalifier and infers when a function parameter must be typed as `@ConstExpr T` instead of just `T`. There are lots of limitations and gotchas in the implementation so far: * The `@ConstExpr` rate is the only one added in this change, but it seems clear that the conceptual `ThreadGroup` rate that was added to represent `groupshared` should probably get folded into the representation. * I'm not 100% pleased with how many places in the IR I have to special-case for rate-qualified types. At the same type, pulling out rate as a distinct field on `IRValue` would probably require that we pay attention to rate everywhere. * I've added a test case to show that we can issue errors when users fail to provide a constant expression for the texel offset, but the actual error message isn't great because it doesn't indicate *why* a constant expression was required. Realistically the "initial IR" should contain a few more decorations we can use to relate error conditions back to the original code (even if this is in a side-band structure). * I've added a test case that is supposed to show that we can back-propagate `constexpr`-ness to local variables, and I've manually confirmed that it works for Vulkan/SPIR-V output, but the level of Vulkan support in `render_test` today means I can't enable the test for check-in. * While I'm attempting to propagate `@ConstExpr` information from callees to callers, I haven't implemented any logic to specialize callee functions based on values at call sites. * In a similar vein, there is no handling of control-flow dependence in the current code. If we infer that a phi (block parameter) needs to be `@ConstExpr`, then it isn't actually enough to require that the inputs to the phi (arguments from predecessor blocks) are all `@ConstExpr` because we also need any control-flow decisions that pick which incoming edge we take to be `@ConstExpr` as well. * As a practical matter, implicit propagation of `@ConstExpr` from a function body to a function parameter should only be allowed for functions that are "local" to a module. Any function that might be accessed from outside of a module should really have had its `@ConstExpr` parameter marked manually, and our pass should validate that they follow their own rules. Right now we have no kind of visibility (`public` vs `private`) system, so I'm kind of ignoring this issue. While that is a lot of gaps, this is also just enough code to get the Falcor MultiPassPostProcess example working, so I'm inclined to get it checked in. * Fixup: missing expected output for test * Fixup: disable test that relies on [unroll] for now