| Age | Commit message (Collapse) | Author |
|
|
|
This commit addresses issue #275
This commit includes following changes:
1. legalize function parameter IRParam instructions
2. legalize function parameter types in IRFuncType
3. legalize call sites (IRCall) with proper arguments
4. legalize local vars that has a mixed resource type.
|
|
This commit addresses issue #275
This commit includes following changes:
1. legalize function parameter IRParam instructions
2. legalize function parameter types in IRFuncType
3. legalize call sites (IRCall) with proper arguments
4. legalize local vars that has a mixed resource type.
|
|
* Don't auto-enable IR use for compute tests
The `COMPARE_COMPUTE` and `COMPARE_RENDER_COMPUTE` test fixtures were set up to always enable the `-use-ir` flag on Slang, which precludes having any tests that confirm functionality on the old non-IR path (which is still required by our main customer).
This change adds the `-xslang -use-ir` flags explicitly to any compute test cases that left them out, and makes the fixture no longer add it by default.
* Continue building out parameter block support
The initial front-end logic for parameter blocks was already added, but they are still missing a bunch of functionality. This change addresses some of the known issues:
- Bug fix: don't try to emit HLSL `register` bindings for variables that consume whole register spaces/sets
- Overhaul type layout logic so that it can make decisions based on a given code generation target (currently passed in as a `TargetRequest`), which allows us to decide whether or not a parameter block should get its own register set on a per-target basis.
- Always use a register space/set for Vulkan
- Never use a register space/set for HLSL SM 5.0 and lower
- By default, don't use register spaces/sets for HLSL output
- Add a command-line flag and some "target flags" to enable register-space usage for D3D targets
- Hackily add initial support for parameter blocks in the AST-to-AST path
- This just blindly lowers `ParameterBlock<T>` to `T`, which shouldn't quite work
- A more complete overhaul will probably need to wait until the AST-to-AST legalization is changed to use the `LegalType`s from the IR legalization pass.
- Add a compute-based test case to actually run code using parameter blocks
- This file runs test cases both with and without the IR
|
|
* IR: Add support for break and continue statements
The front-end is already doing the work of connecting this statements to their "parent" statement, so we just needed to build a map from the `Stmt*` to the corresponding `IRBlock*`s to use for break/continue when outputting any loop statement, and then look up in the map for the branch target when outputting a break/continue.
When we get around to adding `switch` statements, the same pattern should work just fine.
I also added support for `do/while` statements in IR codegen, and made sure to exercise those in one of the test cases I added.
There is also an unrelated IR codegen fix for when there is a "bound subscript" on the RHS of an assignment.
* IR: fix handling of do/while and continue
Thanks to @csyonghe for pointing out my mistake in the earlier commit.
I implemented `continue` for `do/while` loops incorrectly, branching to the head of the loop instead of the loop test.
I'll try to blame this mistake on the fact that I never use `do/while` loops because I think they are awful. :)
The fix for that issue wasn't too bad (see `lower-to-ir.cpp`) but it surfaces a much more serious issue: I wasn't actually implementing `continue` correctly *at all* when it comes to generating HLSL/GLSL from the IR (I can't easily make an excuse for that one).
The basic issue at the heart of this is that given an input statement like:
```
for(int ii = 0; ii < N; ii = doSomething(ii))
{ ... }
```
The continue clause (`ii = doSomething(ii)`) could expand into many instructions (across multiple blocks, if we inline), and there is in general no guarantee when we are done that we can package up that code as an expression and spit out a new `for` loop (the same basic argument applies to a `do { ... } while(someComplexExpression())`.
So, if we assume that in general we have to generate a full *statement* for the `continue` clause, what can we emit?
- We could try to "outline" the continue code into its own function, so that we can call it from an expression. That could work, but has high implementation complexity.
- We could introduce additional `bool` variables for control flow, outputting something like:
```
bool useContinueBlock = false;
for(;;) {
if(useContinueBlock) { <CONTINUE CODE>; }
useContinueBlock = true;
<LOOP TEST>
<LOOP BODY>
}
```
This works but user might balk at the extra variable we introduce.
- We could duplicate the code at each continue site. That is, we emit the loop as:
```
for(;;) {
<LOOP TEST>
<LOOP BODY>
<CONTINUE CODE>
}
```
but then whenever we'd like to emit `continue;` we instead emit `{ <CONTINUE CODE>; continue; }`.
This doesn't introduce any extra variables, but it causes code duplication (limited, if we don't have too many `continue` sites, and the continue clause is small - which are the common cases).
When I was initially working on the IR codegen I picked that last option just because it is what `fxc` seems to do, but I neglected to actually *implement* the special-case codegen for a `continue` instruction. This change addresses that (see `emit.cpp`).
Finally, once things were fixed the `continue` test case produced the results Yong told me to expect, but it also produced a warning from the downstream HLSL compiler ("hey, your loop doesn't ever actually *loop*!"), so I reworked the test back to one that actually loops (but still tests `continue`).
As a final aside in this essay of a commit message: the current IR representation of control flow uses special-case instructions for various cases of unconditional branch (and two variations on `if`), but these are not strictly necessary, and a future change will hopefully clean it up. The biggest catch in doing that is that it will require the IR->source codegen to carefully track which blocks represent which kinds of branch targets in context (e.g., you can't assume that a `continue` that nees the special handling above will appear as a distinct kind of instruction).
|
|
Don't update generated .h file if its not changed.
|
|
|
|
Cleanup of "support generic interface method".
|
|
Now using DeclaredSubtypeWitness::declRef to determine the proper argument index in a GenericSubstitution.
|
|
Add a GenericValueParamDecl case in doesGenericSignatureMatchRequirement()
Return a substituted DeclaredSubtypeWitness in DeclaredSubtypeWitness::SubstituteImpl() instead of return this.
|
|
This changes logic of slang-generate to detect whether the newly generated .h file is different from the existing file, and update the existing file only when the actual content has changed. This helps prevent visual studio from repetitively rebuilding the slang project due to the header file being updated on every build.
|
|
|
|
- Add definition of `discard` instruction
- A `discard` is a terminator instruction, just like `returnVoid`
- Lower `DiscardStmt` in AST to a `discard` instruction in the IR
- Emit `discard` instruction as a `discard;` statement when emitting HLSL/GLSL
- Add a test case using the "graphics compute" mode that tests discard. The test writes to one entry in a UAV before doing a conditional (always true at runtime) discard, and then writes to another entry; we expect to see the results of the first write, but not the second.
|
|
* improve diagnostic messages and prevent fatal errors from crashing the compiler.
* fix top level exception catching.
* spelling fix
* change wording of invalidSwizzleExpr diagnostic
* add speculative GenericsApp expr parsing
* add new test case of cascading generics call.
* Fixing bugs in compiling cascaded generic function calls.
Add implementation of DeclaredSubTypeWitness::SubstituteImpl()
This is not needed by the type checker, but needed by IR specialization. When input source contains cascading generic function call, the arguments to `specialize` instruction is currently represented as a substitution. The arg values of this subsittution can be a `DeclaredSubTypeWitness` when a generic function uses one of its generic parameter to specialize another generic function. When the top level generics function is being specialized, this substitution argument, which is a `DeclaredSubTypeWitness`, needs to be substituted with the witness that used to specialize the top level function in the specialized specialize instruction as well.
* add a test case for cascading generic function call.
* parser bug fix
* fixes #255
* add test case for issue #255
* Generate missing `specialize` instruction when calling a generic method from an interface constraint.
When calling a generic method via an interface, we should be generating the following ir:
...
f = lookup_interface_method(...)
f_s = specailize(f, declRef)
...
This commit fixes this `emitFuncRef` function to emit the needed `specialize` instruction.
* fixes #260
This fix follows the second apporach in the disucssion. It generated mangled name for specialized functions by appending new substitution type names to the original mangled name.
* Disabling removing and re-inserting specailized functions in getSpecalizeFunc()
I am not sure why it is needed, it seems HLSL and GLSL backends are generating forward declarations anyways, so the order of functions in IRModule shouldn't matter.
* cleanup and complete test cases.
* fix warnings
|
|
This is currently only useful for `struct` types.
I implemented a special-case exception so that the auto-generated `struct` types used for `cbuffer` members don't show their internal name.
I did *not* implement any logic to avoid returning the name `vector` for a vector type, etc., since they are all `DeclRefType`s and it seemed easiest to just let the user access information they can't really use.
|
|
Falcor integration work
|
|
|
|
- During IR emit, treat a "select" expression (`?:` operator) like any other `InvokeExpr`, since it will have an `__intrinsic_op` modifier attached to turn it into a `select` instruction.
- During HLSL/GLSL emit from IR, turn a `select` instruction into a `?:` expression
- Also add support for the `neg` instruction during HLSL/GLSL emit
Note that right now we are assuming HLSL semantics for `?:` where it does not short-circuit. Correctly handling the GLSL case would require going back to special-case codegen for `SelectExpr`, but we can cross that bridge when we come to it.
|
|
The IR encodes `out` and `in out` function parameters as pointer types, so the emit logic needs to handle it. We had code to handle translation of pointers types into `out` declarations for function *declarations* but weren't handling it for function *definitions*.
This change unifies the logic so that it is shared by function definitions and decalrations.
This change does *not* deal with the following issues that need to be addressed sometime soon-ish:
- We currently always translate pointers into `out`, even if they should be `in out`. This is obviously wrong.
- If/when we eventually have targets that support true pointers (e.g., CUDA, NVIDIA OpenGL, etc.) we'll need a way to tell the difference between an `in` pointer parameter, and an `out` parameter.
Both of these issues are meant to be addressed by having a few special cases of pointer types, for the `out` and `in out` cases, and only translating those (not all pointers). We need to plumb those through the IR more completely, but I'm not dealing with that here.
|
|
The test case had previously been calling `GroupBarrierWithGroupSync` as if it was a special-cased instruction, but now it is just calling it as an ordinary (intrinsic) function.
I haven't removed the now-useless instruction, but it would be a good cleanup to go through and eliminate all the instruction cases we aren't using in the near future.
|
|
The old approach was relying on an `__intrinsic_op` modifier to tell us we need to do something special with an `InvokeExpr`, but a previous change removed a bunch of those modifiers.
Instead, we will now check for calls to subscript declarations as part of the normal flow of emitting *any* call, similar to what is done for constructor calls already.
Eventually we should be able to eliminate the special case in the `__intrinsic_op` path, but I'm holding off on that because the AST emit logic can probably be cleaned up a *lot* once it doesn't have to be used for cross-compilation as well.
|
|
This code isn't especially useful right now since most of the important subscripts are still special-cased with `__intrinsic_op`, but the idea is that if we de-mangle an intrinsic operation's name and see it is called `operator[]` then we are probably calling a subscript, and should emit an appropriate expression.
Aside: this change has pointed out to me that our current name mangling isn't properly handling non-alphanumeric characters, so we'll be in trouble as soon as we have non-intrinsic subscripts, operators, etc.
|
|
This is to allow me to compare for particular names in my de-mangling logic in `emit.cpp`.
|
|
The source of a lot of these changes is that our current strategy for dealing with "builtin" operations when emitting HLSL from the IR is to de-mangle the mangled name of an operation, and then emit HLSL code for a function call to an operation with that de-mangled name.
This change introduces a few fixups for that work:
- It adds support for parsing the mangled names of generics (specialized and unspecialized)
- It adds logic for detecting when the operation being invoked is a member function
- This is currently a bit ugly, since we compare the number of actual arguments we have in the IR against the number of parameters declared for the callee, and if they don't match we assume we have an extra `this` argument.
On the mangling side, we add (hacky) support for mangling a function name when its types involve generic parameters, e.g.:
```
__generic<T, let N : int> T length(vector<T,N> v);
```
In this case the mangled name of the function needs to include a mangling for the type `vector<T,N>` which means it also needs to include a mangling for `N`.
The reason I describe this support as "hacky" is because we really shouldn't be reproducing the names `T` or `N` in the mangled symbol name. By doing so we make it so that a user changing the name of a generic parameter would break (IR) binary compatibility with existing code that was separately compiled.
I've included comments in the code about a better way to handle this, but it isn't a priorit right now since binary compatibility isn't something meaningful until we start emitting usable bytecode modules.
|
|
Subscript declarations can have nested "accessor" declarations for the get/set behavior:
```
__subscript(int index) -> float
{
get { ... }
set { ... }
}
```
The AST type checks an expression like `a[i]` into a call to an appropriate `__subscript` declaration, and reads the return type off of that, but doesn't drill down to the individual getters/setters.
During IR code generation, we need to resolve a call to the subscript operation down to the actual getter or setter, since those are what will have the executable code (or be intrinsics). If we have a non-intrinsic accessor, then we end up asking for its "return type" and get NULL, which crashes the compiler.
The fix in this case is to add a bit more semantic checking for accessors, mostly just so that we can have them copy the return type from their parent declaration. While we are at it, this change goes ahead and has an accessor validate that the parent declaration is one that should be allowed, and emit a diagnostic if it is nested in an improper place.
|
|
The original code is handling the issue where a call site might be specializing a generic function, so it has a `DeclRef` that represents what it wants to specialize, but the callee is actually a different overload of the same generic function (e.g., a target-specific overload) and so we need to construct a set of substitutions that are equivalent (same arguments), but point to different `GenericDecl`s.
That code was making some bad assumptions, though:
1. It assumed that the substitutions list would always start with a generic substitution (no longer true with `ThisTypeSubstitution`.
2. It assumed that only the top-most substitution would need to be translated. This assumption is probably safe for now, but it could break down if we ever introduced an ability for a type to be re-opened to introduce new (target-specific) overloads of its members.
The new approach goes ahead and does a deep copy of the substitition list (but a shallow copy of the arguments), and only copies the generic substititions for now.
|
|
This attribute used to be how we marked ops for special handling in emission, but now it is being used to mark ops that map to single instructions. Either way, we have a bunch of intrinsic functions that need to get lowered in a more traditional fashion for HLSL, and the intrinsics are getting in the way.
Subsequent changes will fix up issues created by this removal.
A few cases were left unchanged, either because the ops really do map to single instructions, or because there is some special-case support attached to those operations that would be tricky to replace right now.
|
|
* Rename existing ParameterBlock to ParameterGroup
We are planning to add a new `ParameterBlock<T>` type, which maps to the notion of a "parameter block" as used in the Spire research work.
Unfortunately, the compiler codebase already uses the term `ParameterBlock` as catch-all to encompass all of HLSL `cbuffer`/`tbuffer` and GLSL `uniform`/`buffer`/`in`/`out` blocks (all of which are lexical `{}`-enclosed blocks that define parameters...).
This change instead renames all of the existing concepts over to `ParameterGroup`, which isn't an ideal name, but at least doesn't directly overlap the new terminology or any existing terminology.
The new `ParameterBlockType` case will probably be a subclass of `ParameterGroupType`, since it is a logical extension of the underlying concept.
* Add Shader Model 5.1 profiles
The HLSL `register(..., space0)` syntax is only allowed on "SM5.1" and later profiles (which is supported by the newer version of `d3dcompiler_47.dll` that comes with the Win10 SDK, but not the older version of `d3dcompiler_47.dll` - good luck figuring out which you have!).
This change adds those profiles to our master list of profiles, and nothing else.
* First pass at support for `ParameterBlock<T>`
- Add the type declaration in stdlib
- Add a special case of `ParameterGroupType` for parameter blocks
- Handle parameter blocks in type layout (currently handling them identically to constant buffers for now, which isn't going to be right in the long term)
- Add an IR pass that basically replaces `ParameterBlock<T>` with `T`
- Eventually this should replace it with either `T` or `ConstantBuffer<T>`, depending on whether the layout that was computed required a constant buffer to hold any "free" uniforms
- Add first stab at an IR pass to "scalarize" global variables using aggregate types with resources inside.
- This currently only applies to global variables, so it won't handle things passed through functions, or used as local variables
- It also only supports cases where the references to the original variable are always references to its fields, and not the whole value itself
- Add a single test case that technically passes with this level of support, but probably isn't very representative of what we need from the feature
* Fold parameter-block desugaring into a more complete "type legalization" pass
The basic problem that was arising is that once you desugar `ParameterBlock<T>` into `T`, you then need todeal with splitting `T` into its constituent fields if it contains any resource types.
Handling those transformations by following the usual use-def chains wasn't really helping, because you might need systematic rewriting that can really only be handled bottom-up.
This change adds a new pass that is intended to perform multiple kinds of type "legalization" at once:
- It will turn `ParameterBlock<T>` into `T`
- It may at some point also convert `ConstantBuffer<T>` into `T` as well
- It will turn an value of an aggregate type that contains resources into N different values (one per field)
- As a result of this, it will also deal with AOS-to-SOA conversion of these types
Legalization is applied to *every* function/instruction/value, so that it can make large-scale changes that would be tough to manage with a work list.
This pass needs to be run *after* generics have been fully specialized, so that we know we are always dealing with fully concrete types, so that their legalization for a given target is completely known.
This is still work in progress; there's more to be done to get this working with all our test cases, and finish the remaining `ParameterBlock<T>` work.
* Improve binding/layout information when using parameter blocks
- When doing type layout for a parameter block, don't include the resources consumed by the element type in the resource usage for the parameter block
- Note that this is pretty much identical to how a `ConstantBuffer<T>` does not report any `LayoutResourceKind::Uniform` usage, except that `ParameterBlock<T>` is *also* going to hide underlying texture/sampler reigster usage
- The one exception here is that any nested items that use up entire `space`s or `set`s those need to be exposed in the resource usage of the parent (I don't have a test for this)
- When type legalization needs to scalarize things, it must propagate layout information down to the new leaf variables. In general, the register/index for a new leaf parameter should be the sum of the offsets for all of the parent variables along the "chain" from the original variable down to the leaf (we aren't dealing with arrays here just yet).
- When type legalization decides to eliminate a pointer(-like) type (e.g., desugar `ParameterBlock<T>` over to `T`), actually deal with that in terms of the `LegalVal`s created, so that we can know to turn a `load` into a no-op when applied to a value that got indirection removed.
- Hack up the "complex" parameter-block test so that it actually passes (the big hack here is that the HLSL baseline is using names that are generated by the IR, and are unlikely to be stable as we add/remove transformations).
- Note: I can't make these be compute tests right now, because regsiter spaces/sets are a feature of D3D12/Vulkan, and our test runner isn't using those APIs.
|
|
Adding associated types
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|