| Age | Commit message (Collapse) | Author |
|
|
|
The main practical change here is that things that used to be `IRValue`s, like literals, are now being expressed as instructions in the global scope.
In order to validate that things are actually being handled correctly, this change introduces an explicit "validation" pass that can be run on the IR to check for different invariants (although it doesn't check many of the important ones right now). I've left the validation pass turned off by default, but with a command-line flag to enable it. We may want to make it be on by default in debug builds, just to keep us honest. The main invariant for the moment is that when on IR instruction is used as an operand to another, it had better come from the same IR module.
Some of the existing passes were violating this rule, in particular when it came to cloning of witness tables related to global generic parameter substitution. Those features can in theory be handled better now by allowing `specialize` instructions at other scopes, but I didn't want to over-complicate this change, so I make just enough fixes to ensure that these steps always clone witness tables they get from the "symbols" on an IR specialization context. In order for this to work when recursively specializing, I had to ensure that the logic for generic specialization had a notion of a "parent" specialization context that it would fall back to to perform cloning when necessary.
This change keeps the logic that was caching and re-using the instructions for literal values within a module, but adds some logic that isn't really being tested right now for picking the right parent instruction to insert a constant instruction into. This logic doesn't trigger right now because all of the cases we are using it on have zero operands (and so they always get "hoisted" to the global scope), but eventually for things like types we want to be able to support instructions with operands (e.g., `vector<float, 4>`) and handle the case where some of those operands come from different scopes (e.g., when nested inside a generic).
The final change here is mostly cosmetic: the `IRBuilder` is now more abstract about where insertion occurs: it tracks a single `IRParentInst` to insert into, and then an optional `IRInst` to insert before. In the common case, that parent is an `IRBlock`, but it could conceivably also be the global scope, or a witness table, etc. Use sites where we used to change those fields directly now use distinct methods `setInsertInto(parent)` and `setInsertBefore(inst)` which capture the two cases we care about. Accessors are also defined to extract the current block (if the current parent is a block), and the current "function" (global value with code, if the current parent is a global value with code, or a block inside one).
With this work in place, it should be possible for a follow-on change to start putting `specialize` instructions at the global scope and thus clean up some of the on-the-fly specialization work. This work should also help with some of the requirements around a distinct IR-level type system and more explicit generics.
|
|
* IR: "everything is an instruction"
This change tries to streamline the representation of the IR in the following ways:
* Every IR value is an instruction (there is no `IRValue` type any more)
* All IR values that can contain other values share a single base (`IRParentInstruction`)
* Dynamic casts to specific IR instruction types can be accomplished with a new `as<Type>(inst)` operation, that uses the IR opcode to implement casts.
The biggest change in terms of number of lines is getting rid of `IRValue`. The diff here could probably be smaller if I'd just done `typedef IRInst IRValue;`.
Along the way I also renamed the `getArg`/`getArgs`/`getArgCount` combination over to `getOperand`/`getOperands`/`getOperandCount` to avoid being confusing when we have something like a `call` instruction where the "arguments" of the call don't line up with the operands of the instruction.
I also tried to clean up the representation of lists of child instructions to try to make it easier to iterate over them with C++ range-based `for` loops. Developers still need to be careful about mutating the contents of a block while iterating over it in this fashion (e.g. if you remove the "current" element, the iteration will end prematurely).
Probably the thorniest change here is that parameters are now just represented as the first N instructions in a block, which means:
* We need to perform a linear search to find the end of the parameter list. This is probably not often a problem, because usually you would be iterating over the parameters anyway, and that will be linear in the number of parameters.
* Algorithms that iterate over a block either need to ignore parameters, treat parameters just like other instructions, or somehow cleave the list into the range of parameters, and the range of "ordinary" instructions (which involves the same linear search above).
* When inserting into a block, we need to be careful not to insert instructions at invalid locations (e.g., insert a temporary before the parameters, or insert a parameter in the middle of the code). I can't pretend that I've handled the details of that here. (This is no different than having to make the same adjustments for phi nodes in a typical SSA representation)
* One possible future-proof approach is to implement a pass that sorts the instructions in a block so that parameters always come first. That would let us implement passes without caring about this detail, and then clean up right before any pass that cares about the relative order of parameters and other instructions.
The current change is missing any work to make literals and other instructions that used to be `IRValue`s properly nest inside of their parent module. Right now these instructions are just left unparented, and may actually end up being shared between distinct modules. Fixing that will need a follow-up change. The biggest challenge there is that it introduces instructions at the global scope that aren't `IRGlobalValue`s.
This change doesn't try to take advantage of any of the new flexibility (e.g., by nesting `specialize` instructions inside of witness tables). The goal is to do exactly what we were doing before, just with a different representation.
* Warning fix
|
|
The following translations are added:
* HLSL `countbits` becomes GLSL `bitCount`
* HLSL `firstbitlow` becomes GLSL `findLSB`
* HLSL `firstbithight` becomes GLSL `findMSB`
* HLSL `rerverseBits` becomes GLSL `bitfieldReverse`
There are currently no HLSL equivalents for the bitfield insert/extract operations in GLSL. In the future we could expose those as intrinsics under their GLSL names, with HLSL translations, if desired.
|
|
* Fix bug when subscripting a type that must be split (#396)
The logic was creating a `PairPseudoExpr` as part of a subscript (`operator[]`) operation, but neglecting to fill in its `pairInfo` field, which led to a null-pointer crash further along.
* Allow writes to UAV textures (#416)
Work on #415
This issue is already fixed in the `v0.10.*` line, but I'm back-porting the fix to `v0.9.*`.
The issue here was that the stdlib declarations for texture types were only including the `get` accessor for subscript operations, even if the texture was write-able.
I've also included the fixes for other subscript accessors in the stdlib (notably that `OutputPatch<T>` is readable, but not writable, despite what the name seems to imply).
* Fix infinite loop in semantic parsing (#424)
The code for parsing semantics was looking for a fixed set of tokens to terminate a semantic list, rather than assuming that whenever you don't see a `:` ahead, you probably are done with semantics. This meant that you could get into an infinite loop just with simple mistakes like leaving out a `;`.
This change fixes the parser to note infinite loop in this case, and adds a test case to verify the fix.
|
|
The code in `DeclRefType::Create` was treating all of the point/line/triangle output stream types as `HLSLPointStreamType`, which meant we always output GLSL geometry shaders with `layout(points) out;`.
|
|
Pull BaseType, TextureFlavor and SamplerStateFlavor enums and helper functions into a shared file "type-system-shared.h".
|
|
These changes are related to getting a first Slang geometry shader to translate to GLSL. There are some unrelated cross-compilation fixes in here as well.
* Add direct support to shader parameter layout for GS output streams, so that they are reflected as a container type
* Fix the declarations of the `SampleCmp` methods; they should always return `float`, independent of the nominal element type of the texture.
* Fix up our handling of `__target_intrinsic` modifiers, so that we are a little bit more careful in how we detect something as being just a simple name replacement (e.g., `__target_intrinsic(glsl, "foo")` should make us output `foo(original, args, here)`) vs. a custom expression (e.g., `__target_intrinsic(glsl, "bar+1")` should output `bar+1` and not use any arguments, even without any `$` substitutions).
* Don't emit the `[unroll]` modifier when outputting GLSL. Eventually we need to fully unroll loops for GLSL output anyway.
* Inspect th entry point parameter list (from the layout information) when emitting a GS, so that we can write out the correct `layout` modifiers for input primitive type and output primitive topology.
* Add a new case to `ScalarizedVal` to handle cases where an HLSL system value needs to map to a GLSL built-in variable with a slightly different type (e.g., `SV_RenderTargetArrayIndex` is a `uint` while `gl_Layer` is an `int`). For now this is only hanlding trivial cases (where a direct cast can achieve the result we want), but eventually it might need to handle things like conversion between arrays and vectors.
* This is mostly just the infrastructure for the feature, and the actual enumeration of the correct types for all the system values is still to be done.
* Handle a few more cases in assignment between `ScalarizedVal`. In particular, deal with cases where `materializeValue` is called on a tuple that has an array type, so that we need to construct the individual array elements.
* Add translation for GS output stream `Append()` and `RestartStrip()`
* Note that the translation of `Append()` seems to ignore its argument; this is because we desugar the operation during legalization for GLSL (see next item)
* When legalizing for GLSL, detect an entry point parameter that is a GS stream, and translate it into `out` variables for its element type, and then rewrite any calls to `Append()` in the body of the entry point to be preceded by assignment to those variables. This works in tandem with the above translation of HLSL `Append()` calls into GLSL `EmitVertex()` calls.
* We are detecting calls to `Append()` in a slightly hacky way, by looking at decorations on the callee to make sure that it is a function that is determined to translate to `EmitVertex()`.
* Right now we aren't handling calls to `Append()` in other functions. It wouldn't be hard in principle to walk all the functions in the module and apply the translation (assuming we don't want to start supporting multiple output streams), but this wouldn't handle the passing of the GS output stream between functions. (This points out that there is a need for an additional type legalization pass that desugars away parameters of types that aren't actually meaningful on the target).
|
|
This allows us to get rid of `IRGlobalValue::dispose()`.
|
|
* Initial work on validating "constexpr"-ness in IR
The underlying issue here is that certain operations in the target shading languages constrain their operands to be compile-time constants. A notable example is the optional texel offset parameter to the `Texture2D.Sample` operation.
When calling these operations in GLSL, the user is required to pass a "constant expression," and any variables in that expression must therefore be marked with the `const` qualifier (and themselves be initialized with constant expressions). Any GLSL output we generate must of course respect these rules.
When calling these operations in HLSL, the user is not so constrained. Instead, they can pass an arbitrary expression, which may involve ordinary variables with no particular markup, and then the compiler is responsible for determining if the actual value after simplification works out to be a constant. In some cases, the requirement that a value be constant might actually trigger things like loop unrolling. Also, it is okay to use a function parameter to determine such a constant expression, as long as the argument turns out to be a constant at all call sites.
The way we have decided to tackle these challenges in Slang is that we we propagate a notion of `constexpr`-ness through the IR. This is currently being tackled in `ir-constexpr.cpp` with a combination of forward and backward iterative dataflow:
* When the operands to an instruction are all `constexpr`, and the opcode is one we believe can be constant-folded, then we infer that the instruction *can* be evaluated as `constexpr`
* When instruction is required to be `constexpr`, then we infer that all of its operands are also required to be `constexpr`.
If this process ever infers that a function parameter is required to be `constexpr`, then we might have to continue propagation at all the call sites to that function.
If after all the propagation is done, there are any cases where an instruction is *required* to be `constexpr`, but it *can't* be `constexpr` (we weren't able to infer `constexpr`-ness for its operands), then we issue an error.
This implementation encodes the idea of `constexpr`-ness in the IR as part of the type system, using a simplified notion of rates. This change adds a `RateQualifiedType` that can represent `@R T`, and then introduces a `ConstExprRate` that can be used for `R`. Many accessors for the type information on IR nodes were updated to distinguish when one wants the "full" type of an IR value (which might include rate information) vs. just the "data" type.
A `constexpr` qualifier was added in the front-end, and is being used to decorate the texel offset parameter for `Texture2D.Sample`. Lowering from AST to IR looks for this qalifier and infers when a function parameter must be typed as `@ConstExpr T` instead of just `T`.
There are lots of limitations and gotchas in the implementation so far:
* The `@ConstExpr` rate is the only one added in this change, but it seems clear that the conceptual `ThreadGroup` rate that was added to represent `groupshared` should probably get folded into the representation.
* I'm not 100% pleased with how many places in the IR I have to special-case for rate-qualified types. At the same type, pulling out rate as a distinct field on `IRValue` would probably require that we pay attention to rate everywhere.
* I've added a test case to show that we can issue errors when users fail to provide a constant expression for the texel offset, but the actual error message isn't great because it doesn't indicate *why* a constant expression was required. Realistically the "initial IR" should contain a few more decorations we can use to relate error conditions back to the original code (even if this is in a side-band structure).
* I've added a test case that is supposed to show that we can back-propagate `constexpr`-ness to local variables, and I've manually confirmed that it works for Vulkan/SPIR-V output, but the level of Vulkan support in `render_test` today means I can't enable the test for check-in.
* While I'm attempting to propagate `@ConstExpr` information from callees to callers, I haven't implemented any logic to specialize callee functions based on values at call sites.
* In a similar vein, there is no handling of control-flow dependence in the current code. If we infer that a phi (block parameter) needs to be `@ConstExpr`, then it isn't actually enough to require that the inputs to the phi (arguments from predecessor blocks) are all `@ConstExpr` because we also need any control-flow decisions that pick which incoming edge we take to be `@ConstExpr` as well.
* As a practical matter, implicit propagation of `@ConstExpr` from a function body to a function parameter should only be allowed for functions that are "local" to a module. Any function that might be accessed from outside of a module should really have had its `@ConstExpr` parameter marked manually, and our pass should validate that they follow their own rules. Right now we have no kind of visibility (`public` vs `private`) system, so I'm kind of ignoring this issue.
While that is a lot of gaps, this is also just enough code to get the Falcor MultiPassPostProcess example working, so I'm inclined to get it checked in.
* Fixup: missing expected output for test
* Fixup: disable test that relies on [unroll] for now
|
|
getSpecailizedMangledName to work properly.
|
|
This is to workaround with the issue that the Types returned in ProgramLayout may reference to IRWitnessTables via GlobalGenericParamSubstitution.
|
|
1. reorder destruction order of several key classes to avoid using deleted IR objects when destroying Types
2. remove Session::canonicalTypes and make each Type own a RefPtr to the canonicalType, to allow types to be destroyed along with each IRModule it belongs to.
|
|
1, make IRModule class own a memory pool for all IR object allocations
2. For now, we allow IR objects to own other (externally) heap allocated objects, such as String, List and RefPtrs by tracking all IR objects that has been allocated for the IRModule in a list named `IRModule::irObjectsToFree`. and call destructor for all these objects upon the destruction of the IRModule. In the long term, we should eliminate the use of all these externally allocated types in IR system and get rid of this tracking and explicit destructor calls.
3. remove non-generic `createValueImpl` functions and retain only generic versions in IRBulider so we can properly call the constructor of the IR types to set up virtual tables correctly for destructor dispatching.
4. add `MemoryPool` class for allocation of the IR objects.
5. Make sure we are disposing IRSpecContexts when we are done with the specialized IR module.
6. Add `_CrtDumpMemoryLeaks()` calls to check memory leaks upon destruction of a Slang session. If we are to support multiple sessions at a time, this call should probably be replaced with the more advanced MemoryState versions of the memory leak checker.
|
|
* stdlib fixes for Vulkan
- Make sure to emit `image*` instead of `texture*` for `RWTexture*` types
- Change `GetDimensions` to call `imageSize` instead of `textureSize` when we use images
- Always output a `layout(rgba32f)` for variables that translate to `image` types
- TODO: we should emit an appropriate format based on the type, or let the user specify one
- Fix GLSL translation for `any()` function (required boolean inputs)
- Add GLSL translation for `GroupMemoryBarrierWithGroupSync()`
- Map HLSL `groupshared` to GLSL `shared`
These together are enough to get the Falor `ComputeShader` example to work.
* fixup for warning
|
|
The approach here isn't ideal. We already have a pass that transforms HLSL varying input/output types into GLSL global variables. This change makes it so that when those inputs/outputs have system-value semantics, we generate a global variable declaration with the appropriate `gl_*` name (leaving the type the same for now). Later, when emitting code, we just skip emitting declarations for declarations with mangled names that start with `gl_*`.
A more complete implementation will be needed later on, which handles cases where the translation requires types to be changed (so that conversion code needs to be inserted).
|
|
* Fix bugs around IR legalization of GLSL input/output
- Add case to handle assignment of one `ScalarizedVal::Flavor::address` to another (still need to make sure we are handling all the possible cases there)
- Revamp logic for creating global variable declarations for varying inputs/outputs.
- Actually handle creating array declarations (not sure if binding locations will be correct)
- Properly deal with offsetting of locations for nested fields
- Only create varying input/output layout information as needed for the separate `in` and `out` variables we create to represent a single HLSL `inout` varying
* During SSA generation, recursively remove trivial phis
This is actually written up in the original paper I used as a reference, but I hadn't implemented the case yet.
When you eliminate one phi as trivial (because its only operands were itself and at most one other value), you might find that another phi becomes trivial (because it had this phi as an operand, but now it will have the other value...).
The one thing that made any of this tricky is that our "phi" nodes are really block parameters, and thus they don't technically have operands (`IRUse`s). The `IRUse`s for each phi were being tracked in a separate array, and had their `user` field set to null.
With this change, I set their `user` to be the corresponding `IRParam` for the phi (and that means I changed `IRParam` to inherit from `IRUser` even though it shouldn't really be required).
* Re-build SSA form after specialization/legalization
The main reason to do this is that legalization might scalarize types, and thus might allow us to clean up resource-type local variables that we were not able to clean up when they were part of an aggregate.
Note: we shouldn't really need to do this, because the front-end should actually be guaranteeing that types that include resources are used in "safe" ways, but we currently don't have the analyses required to support that.
* Give an error message if we get GLSL input
The API and command-line interface still recognize and nominally support GLSL input files, because they need to be supported in the "pass-through" mode.
This change just adds an error message if we encounter a GLSL input file in anything other than "pass-through" mode.
|
|
The basic problem here is that when unlinking an `IRUse` from the linked list of uses, there were several cases where I was failing to set the `prevLink` field of the next node to match the `prevLink` field of the node being removed. That doesn't show up when walking the linked list of uses forward, but it breaks it whenever you have subsequent unlinking operations.
This change fixes the bugs of that kind I could find, and also adds a debug validation method to try to avoid breaking it again. I also made more access to `IRUse` go through accessor methods rather than using fields directly, to try to avoid this kind of error. I stopped short of making anything `private`, because I tend to find that it creates more hassles than it avoids.
A few other fixes along the way:
- Made the `List<T>` type default-initialize elements when you resize it. I hadn't realized we weren't doing that.
- Add a standalone `dumpIR(IRGlobalValue*)` so help when debugging issues.
|
|
* Issue error when shader parameter type doesn't match between translation units
We currently allow users to compile different translation units at the same time for the vertex and fragment shader, and those different translation units might contain distinct declarations for the "same" parameter. Furthermore, those declarations might use distinct declaratins of the "same" type.
We currently bind those as if they were one parameter, which means we assume they have types that are actually equivalent. Unfortunately, things break when that isn't the case.
This change adds error messages when parameter declarations that are determined to be the "same" don't have types that are either equiavlent or match "structurally" (same names, same fields, and field types match recursively).
Ideally most users won't see these errors, because they will only submit single files to the compiler. Eventually we may actually decide to require that case and simplify our logic.
* Support types with layout coming from a "matching" type
Because of how the front-end performs parameter binding, we can end up in cases where a variable is given a `TypeLayout` based on a "matching" type from another translation unit (because the "same" shader parmaeter got declared in multiple TUs).
This change tries to support that use case by avoiding absolute field decl-refs in the representation used for type legalization, so that we don't end up in a situation where we look up field layout based on information that doesn't match.
|
|
* Basic IR support for `static const` globals
Our strategy for lowering global *variables* can fall back to putting their initialization into a function, but that isn't really appropriate for global constants (it also isn't appropriate for arrays, but we'll need to deal with that seaprately).
This change adds a distinct case for global constants (rather than treating them as variables), and forces the emission logic to always emit them as a single expression.
Doing this makes assumptions about how the IR for these constants gets emitted (and what optimziations might do to it).
In order to make things work, I had to switch the handling of initializer-list expressions to not be lowered via temporaries and mutation (since that isn't a good fit for reverting to a single expression).
I've added a single test case to ensure that this works in the simplest scenario. My next priority will be to see if this unblocks my work in Falcor.
* Fixup: bug fixes
|
|
* Re-define deprecated compile flags
By including these flags in the header file, with a value of zero, we can allow some existing code to compile even after the major changes to the implementation.
* The `SLANG_COMPILE_FLAG_NO_CHECKING` option will effectively be ignored, since checking is always enabled.
* The `SLANG_COMPILE_FLAG_SPLIT_MIXED_TYPES` option will now act as if it is always enabled (and indeed some of the code has been relying on this flag being set always).
* Make subscript operators writable for writable textures
This even had a `TODO` comment saying that we needed to fix it, and now I'm seeing semantic checking failures because we didn't define these and so we find assignment to non l-values.
* Fix definitions of any() and all() intrinsics
These should always return a scalar `bool` value, but they were being defined wrong in two ways:
1. They were using their generic type parameter `T` in the return type
2. They were returning a vector in the vector case, and a matrix in the matrix case.
This change just alters the return type to be `bool` in all cases.
* Fix bug in SSA construction
When eliminating a trivial phi node, it is possible that the phi is still recorded as the "latest" value for a local variable in its block.
When later code queries that value from the block (which can happen whenever another block looks up a variable in its predecessors), it would get the old phi and not the replacement value.
I simply added a loop that checks if the value we look up is a phi that got replaced, and then continues with the replacement value (which might itself be a phi...). A more advanced solution might try to get clever and have the map itself hold `IRUse` values so that we can replace them seamlessly.
* Simplify IR control flow representation
This change gets rid of various special-case operations for conditional and unconditional branches, and instead requires emit logic to recognize when a direct branch is targetting a `break` or `continue` label.
The new approach here isn't perfect, but it seems beter than what we had before, because it can actually work in the presence of control-flow optimizations (including our current critical-edge-splitting step).
* Load from groupshared isn't groupshared
When loading from a `groupshared` variable, the resulting temporary shouldn't have the `groupshared` qualifier on it.
This might eventually need to generalize to a better understanding of storage modifiers in the IR, but I don't really want to deal with that right now.
* Don't emit references to typedefs in output code
Now that we are using the IR for all codegen, we shouldn't be dealing with surface-level things like `typedef` declarations in the output code; just use the type that was being referred to in the first place.
* Fix floating-point literal printing for IR
The IR was calling `emit()` instead of `Emit()` (we really need to normalize our convention here), and was implicitly invoking a default constructor on `String` that takes a `double` (that constructor should really be marked `explicit`), and which doesn't meet our requirements for printing floating-point values.
* Fix error when importing module that doesn't parse
We already added a case to bail out if semantic checking fails, but neglected to add a case if there is an error during parsing of a module to be imported.
Note: this logic doesn't correctly register the module as being loaded (but still in error), so users could see multiple error messages if there are multiple `import`s for the same module.
* Improve error message for overload resolution failure
- Drop debugging info from the candidate printing
- Add cases to print `double` and `half` types properly
* Fixup: switch loopTest to ifElse in expected IR output
|
|
* Generate SSA form for IR functions
The basic idea here is simple: in the front-end after we have lowered the AST to initial IR we will apply a set of "mandatory" optimization passes. The first of these is to attempt to translate the all functions into SSA form so that they are amenable to subsequent dataflow optimizations. Eventually, the mandatory optimization passes would include diagnostic passes that make sure variables aren't used when undefined, etc.
Just doing basic SSA generation already cleans up a lot of the messiness in our IR today, because constructs that used to involve many local variables can now be handled via SSA temporaries.
The implementation of SSA generation is in `ir-ssa.cpp`, and it follows the approach of Braun et al.'s "Simple and Efficient Construction of Static Single Assignment Form." I used this instead of the more well-known Cytron et al. algorithm because Braun's algorith mis very simple to code, and does not require auxiliary analyses to generate the dominance frontier.
The main wrinkle in our SSA representation right now is that instead of using ordinary phi nodes, we instead allow basic blocks to have parameters, where predecessor blocks pass in different parameter values. This encodes information equivalent to traditional phi nodes, but has two (small) benefits:
1. There is no fixed relationship between the order of phi operands and predecessor blocks, so we don't have to worry about breaking the phis when we alter the order in which predecessors are stored. This is important for us because predecessors are being stored implicitly.
2. It is easy to operationalize a "branch with arguments" either when lowering to other languages, or when interpreting the IR. A branch with arguments is implemented as a sequence of stores from the arguments to the parameters of the target block (very similar to a call), followed by a jump to the block.
Relevant to the above, this change also adds an interface for enumerating the predecessors or successors of a block in our CFG. Rather than use an auxliary structure, we directly use the information already encoded in the IR:
* The sucessors of a block are the target label operands of its terminator instruction. In our IR this is a contiguous range of `IRUse`s, possible with a stride (to account for the way `switch` interleaves values and blocks).
* The predecessors of a block are a subset of the uses of the block's value. Specifically, they are any uses that are on a terminator instruction, and within the range of values that represent the successor list of that instruction.
One important limitation of the "blocks with arguments" model for handling phis is that it is really only convenient to stash extra arguments on an unconditional terminator instruction. This change works around this prob lem by breaking any "critical edges" - edges between a block with multiple successors and one with multiple predecessors. We assume that "phi" nodes will only ever be needed on a block with multiple predecessors, and because critical edges are broken, each of these predecessors will then have only a single successor, so its branch instruction can handle the extra arguments.
This change introduces a notion of an "undefined" instruction in the IR. This is handled as an instruction rather than a value because I anticipate that we will want to distinguish different undefined values when it comes time to start issuing error messages (those messages will need to point to the variable that was used when undefined).
* Fix expected test output.
Another change was merged that enabled the `glsl-parameter-blocks` test, and its output is affected by our IR optimization work.
|
|
The standard library already has a bunch of these decorations, since they were added to support Slang->Vulkan codegen on the AST-to-AST path. This change makes the IR code generator able to exploit the modifiers so that we pick up a bunch of Vulkan support "for free" in the short term.
The basic change is in `lower-to-ir.cpp` where we copy over any `TargetIntrinsicModifier`s to become `IRTargetIntrinsicDecoration`s with the same information. We then need a bit of logic in `ir.cpp` to make sure we clone them as needed.
The core work of using the modifiers is in `emit.cpp`, where I basically just copy-pasted the existing logic that applied in the AST path (all the AST-related code there is dead, and we should clean it up soon).
The big change that comes with this logic is that when dealing with a member function, the numbering of the argument used in the intrinsic definition string changes, so that `$0` refers to the base object (whereas before the base object was looked up via the base expression of a `MemberExpr` used for the function). This requires a bunch of the definitions in the library to be updated; hopefully I caught them all.
For kicks, I've re-enabled a cross-compilation test just to confirm that we are generating valid SPIR-V for code that performs texture-fetch operations. I don't expect us to keep that test enabled as-is in the long term, though, because it would be much better to instead use render-test to do the same thing. Alas, beefing up the Vulkan support in render-test is an outstanding work item, and I didn't want to pollute this change with more work along those lines.
|
|
The basic change is simple: remove support for all code generation paths other than the IR.
There is a lot of vestigial code left, but the main logic in `ast-legalize.*` is gone.
Doing this breaks a *lot* of tests, for various reasons:
- We can no longer guarantee exactly matching DXBC or SPIR-V output after things pass through out IR
- Many builtins don't have matching versions defined for GLSL output via IR (even when they had versions defined via the earlier approach that worked with the AST)
- A lot of code creates intermediate values of opaque types in the IR, which turn into opaque-type temporaries that aren't allowed (this breaks many GLSL tests, but also some HLSL)
I implemented some small fixes for issues that I could get working in the time I had, but most of the above are larger than made sense to fix in this commit.
For now I'm disabling the tests that cause problems, but we will need to make a concerted effort to get things working on this new substrate if we are going to make good on our goals.
|
|
* Remove support for the -no-checking flag
Fixes #381
Fixes #383
Work on #382
- No longer expose flag through API (`SLANG_COMPILE_FLAG_NO_CHECKING`) and command-line (`-no-checking`) options
- Remove all logic in `check.cpp` that was withholding diagnostics (including errors) when the no-checking mode was enabled
- Remove `HiddenImplicitCastExpr`, which was only created to support no-checking mode (it represented an implicit cast that our checking through was needed, but couldn't emit because it might be wrong)
- Remove logic for storing function bodies as raw token lists when checking is turned off. I'm leaving in the `UnparsedStmt` AST node in case we ever need/want to lazily parse and check function bodies down the line.
- Remove a few of the code-generation paths we had to contend with, but keep the comment about them in place.
- Remove GLSL-based tests that can't meaningfully work with the new approach.
- Fix other tests that used a GLSL baseline so that their GLSL compiles with `-pass-through glslang` instead of invoking `slang` with the `-no-checking` flag.
- Remove tests that were explicitly added to test the "rewriter + IR" path, since that is no longer supported.
There is more cleanup that can be done here, now that we know that AST-based rewrite and IR will never co-exist, but it is probably easier to deal with that as part of removing the AST-based rewrite path.
We've lost some test coverage here, but actually not too much if we consider that we are dropping GLSL input anyway.
* Fixup: test runner was mis-counting ignored tests
* Fixup: turn on dumping on test failure under Travis
* Fixup: enable extensions in Linux build of glslang
|
|
* Basic fixes to gets some Vulkan GLSL out of the IR path
We haven't been paying much attention to the Vulkan output from the IR path, but that needs to change ASAP. This commit really just implements quick fixes, without concern for whether they are a good fit in the long term.
- Add some more mappings from D3D `SV_*` semantics to built-in GLSL variables, and stop redeclaring those built-in variables in our output GLSL.
- Add custom output logic for HLSL `*StructuredBuffer<T>` types, so that they emit as `buffer` declarations with an unsized array inside. This has some real limitations:
- What if the user passes the type into a function? The parameter should be typed as an (unsized) array, and not a buffer.
- What happens if we have an array of structured buffers? We need to declare an array of blocks (which GLSL allows), but this changes the GLSL we should emit when indexing.
- Customize the way that we emit entry point attributes (e.g., `[numthread(...)]`) to also support outputting equivalent GLSL `layout` qualifiers.
In many of these cases, a better fix might involve doing more of this work in the IR as part of legalization (e.g., we already have a pass that deals with varying input/output for GLSL, so that should probalby be responsible for swapping the `SV_*` to `gl_*`, especially in cases where the types don't match perfectly across langauges).
* Start adding Vulkan support to render-test
- Add both Vulkan and D3D12 as nominally supported back-ends
- Add a git submodule to pull in the Vulkan SDK dependencies
- I don't want our users to have to install it manually, since the SDK is huge
- Checking in the binaries to our main repository seems like a bad idea, but my hope is that we can prune the bloat using a subodule with the `shallow` cloning option
- Implement enough logic for the Vulkan back-end to get a single test passing on Vulkan
* Fix warning
* Fixup: disable new compute tests for Linux
* Fixup: ignore Vulkan tests on AppVeyor
* Dynamically load Vulkan implementation
Rather than statically link to the Vulkan library, we will dynamically load all of the required functions.
This removes the need to have the stub libs involved at all.
* Remove vulkan submodule
I had set up a `vulkan` submodule to pull in the headers and stub libs, but now that we are going to dynamically load all the symbols anyway, the stub lib binaries aren't needed and we can just commit the headers.
* Add Vulkan headers to external/
|
|
The recent change that removed `#import` accidentally introduced a regression that made *any* code that imports the same module in more than one place fail.
I'm just fixing the bug for now to unblock users, but this should really get a regression test.
|
|
* Fix render-test to handle raw buffers
I don't know if this fix will work for UAVs that are neither structured nor raw, but it fixes the code that currently only really works if every UAV is structured (since it doesn't set a format).
* Make type legalization consider raw buffer types
The type layout logic was already handling these, but the type splitting logic in legalization was failing to split structure types that contain, e.g., `RWByteAddressBuffer`.
A compute test case has been added to confirm the fix.
|
|
Fixes #380
The `#import` directive was a stopgap measure to allow a macro-heavy shader library to incrementally adopt `import`, but it has turned out to cause as many problems as it fixes (not least because users have never been able to form a good mental model around which kind of import to do when).
This change yanks support for the feature.
|
|
* Fix handling of errors in imported modules
- If a semantic error is detected in an imported module, then don't try to generate IR code for it
- Also, if a module (transitively) imports itself, then report that as an error
- The way I'm checking for this is a bit hacky (I'm adding the module to the map of loaded modules, but in an "unfinished" state, and then using that unfinished state to detect the import of a module already being imported).
This isn't a 100% complete solution for any of the related problems, but it improves the user experience for the common case.
* Remove #import test.
The feature is slated to be removed, so it isn't worth fixing up this test case.
|
|
The basic problem here arises when a local variable is used either before its own declaration:
```hlsl
int a = b;
...
int b = 0;
```
or when a local variable is used *in* its own decalration:
```hlsl
int b = b;
```
In each case, Slang considers the scope of the `{}`-enclosed function body (or nested statement) as a whole, and so the lookup can "see" the declaration even if it is later in the same function.
This behavior isn't really correct for HLSL semantics, so the right long-term fix is to change our scoping rules, but for now users really just want the compiler to not crash on code like this, and give an error message that points at the issue.
This change makes both of the above examples print an error message saying that variable `b` was used before its declaration, which is accurate to the way that Slang is interpreting those code examples.
This is currently treated as a fatal error, so that compilation aborts right away, to avoid all of the downstream crashes that these cases were causing.
|
|
If we don't find the generic we expect in the first pass during IR specialization, then we check for the special case where we are trying to specialize something from a generic extension, using the type being extended.
We assume that the generic parameter lists match (that part is the huge hack), and collect the arguments as if they were for the extension instead of the type.
This will break when/if we ever have generic extensions with parameter lists that don't match the type being extended.
|
|
- Don't drop specializations on a method when adding it to requirement dictionary
- Handle extension declarations under a generic when emitting to IR
|
|
Previously, all legalizations of a generic type would use the name of the original decl for the "ordinary" part of things, and this would lead to collisions because the names didn't include the mangled generic arguments.
This is now fixed by storing the mangled name of the original inside of `struct` declarations created for legalization, and using those names instead.
Also adds support for `getElementPtr` instructions when doing IR type legalization.
Also tries to make a `DeclRefType` convert to a string using the underlying `DeclRef`. This doesn't help because `DeclRef::toString` doesn't actually include generic arguments either.
|
|
`lookup_witness_table` instruction. (#376)
|
|
1. allow spReflection_FindTypeByName to accept arbitrary type expression string
2. allow const int generic value to be used as expression value, and as array size
3. various bug fixes in witness table specialization / function cloning during specializeIRForEntryPoint to avoid creating duplicate global values, not copying the right definition of a function from the other module, not cloning witness tables that are required by specializeGenerics etc.
|
|
fixes #373
fixes bug that misses current translation unit's scope when resolving entry-point global type argument expression.
|
|
reflection data
|
|
|
|
|
|
- support overloaded generic function. this involves adding a new expression type, `OverloadedExpr2` to hold the candidate expressions for the generic function decl being referenced.
- make BitNot a normal IROp instead of an IRPseudoOp
- make sure we clone the decorations of parameters when cloning ir functions
- propagate geometry shader entry point attributes (`[maxvertexcount]` and `[instance]`) through HLSL emit
- IR emit: handle geometry shader entry-point parameter decorations, such as 'triangle'.
- IR emit: treat geometry shader stream output typed ir value as `should fold into use`.
|
|
1. prevent cyclic lookups when an interface inherits transitively from itself.
2. in `createGlobalGenericParamSubstitution`, create a default substitution for the base type declref before using it to lookup the witness table.
|
|
This completes item 5 in issue #361.
The interesting change is that when checking for interface conformance, we include the requirements (include transitive interfaces) defined in extensions as well. (check.cpp line 1946)
All the other changes are for one thing: reoder the semantic checkings to two explicit stages: check header and check body. In check header phase, we check everything except function bodies, register all extensions with their target decls, then check interface conformances for all concrete types. In body checking phase, we look inside the function bodies and check concrete statements/expressions. This change ensures that we take extension into consideration in all places where it should be.
|
|
|
|
This commit is a bunch of quick hacks to get transitive interfaces to work. The idea is for each concrete type we create one giant witness table that contains entries for all the transitively reachable interface requirements, and then create one copy of that witness table for each interface it implements.
`DoLocalLookupImpl` now also looks up in inherited interface decles when looking up for a symbol in an interface decl.
When visiting `InheritanceDecl` in `lower-to-ir`, create copies of the giant witness table for each transitively inherited interface, so that these witness tables can be found later when the IR is specialized.
Re-enable the `copy all witness tables` hack in `specializeIRForEntryPoint` to ensure those transitive witness tables are copied over.
|
|
|
|
|
|
|
|
This fixes item 2 in #361
Modifies existing extension-multi-interface.slang test case to cover the additional scenario.
|
|
Also support the scenario that the extension declares conformance to interface I, and a method M in I is already supported by the base implementation.
|