| Age | Commit message (Collapse) | Author |
|
Fixes #380
The `#import` directive was a stopgap measure to allow a macro-heavy shader library to incrementally adopt `import`, but it has turned out to cause as many problems as it fixes (not least because users have never been able to form a good mental model around which kind of import to do when).
This change yanks support for the feature.
|
|
1. allow spReflection_FindTypeByName to accept arbitrary type expression string
2. allow const int generic value to be used as expression value, and as array size
3. various bug fixes in witness table specialization / function cloning during specializeIRForEntryPoint to avoid creating duplicate global values, not copying the right definition of a function from the other module, not cloning witness tables that are required by specializeGenerics etc.
|
|
|
|
Global type argument lookup should be done in both loaded modules and current trnaslation units. This is the same as the logic of spReflection_FindTypeByName, so it is extracted into `CompileRequest::lookupGlobalDecl(Name*)` method and reused in places.
|
|
Added two API functions:
1. `spReflection_FindTypeByName`, which returns a DeclRefType to the struct type with the given name. The function finds from all loaded modules in a `CompileRequest` for a decl with the given name, construct a `Type` object and cache it in `CompileRequest::types` dictionary. The subsequent calls to `spReflection_FindTypeByName` with the same name will simply returned the cached Type objects.
2. `spReflection_GetTypeLayout`, which returns a `TypeLayout` for a given `Type`. This function creates and caches the `TypeLayout` in the `TargetRequest` object that is used to create the `ProgramLayout`.
|
|
The basic idea here is that for each module that gets loaded via `import`, we should also generate the initial IR for the declarations in that module at the time it gets loaded.
Furthermore, when we generate initial IR for a module, we will only generate IR *declarations* (not *definitions*) for any functions/variables in modules it imports.
Later, when cloning IR to begin code generation for an entry point, we will effectively "link" all of the loadedm modules together, so that a given global value can get its definition from any of the IR modules present.
- Change the `loadedModulesList` and related data structures to hold a new `LoadedModule` type, instead of just the AST (and then have a `LoadedModule` own both the AST and the IR module)
- Share some logic between the `import` and `#import` cases, so that we always try to generate IR for modules we load.
- Make sure that IR generation always gets skipped if the command-line flags tell us not to use the IR.
- A few small fixups for cases that didn't arise in IR lowering so far, but come up when we try to actually generate IR for things like the stdlib.
There are some notable gaps in this work right now:
- The stdlib modules are exempted from this behavior; we always generate IR for stdlib functions in any user module that calls them. This is just a workaround for the fact that the stdlib modules don't show up in the list of imported modules right now.
- We don't currently have logic that does the "linking" step for global variables like we do for functions. We really need to look up the symbols with the same mangled name, and favor any one of them that has a definition (if there is one)
- Similarly, the handling of witness tables is incomplete. During initial IR generation, we should probably be generating empty witness tables for any conformances that were declared in other modules (but are being used locally in this module), and then the "linking" step should favor non-empty witness tables over empty ones.
Still, all the test cases pass with the code like this, and this seems like an important step in the right direction.
|
|
These were already being handled a little bit, by lowering an `out T` or `inout T` function parameter in the AST to a function parameter with type `T*` in the IR, and then emiting explicit loads/stores.
The HLSL emit logic, however, couldn't tell the difference between an `out` parameter, an `inout`, or a true pointer (if we ever needed to support them...).
The intention (not fully implemented) was that we'd use a hierarchy of types rooted at `PtrTypeBase`:
- `PtrTypeBase`
- `Ptr`: "real" pointers in the C/C++ sense
- `OutTypeBase`: pointers used to represent by-reference parameter passing
- `OutType`: IR level type for an `out` parameter
- `InOutType`: IR level type for an `inout` or `in out` parameter
Actually implementing this involved:
- Adding a bit more flexibility to the `Session::getPtrType` logic to allow for creating any of the concrete types above
- Making the `lower-to-ir` logic create the right type for function parameters (instead of just using `PtrType`)
- Making the HLSL emit logic check for the `OutType` and `InOutType` cases rather than just `PtrType`
- Changing a bunch of small places in the code so that they use `PtrTypeBase` instead of `PtrType` when they should handle any of the above cases, and also make a few places check for `OutTypeBase` instead of `PtrType` or `PtrTypeBase`, when they are really trying to capture by-reference parameters
- Add a test case that uses all of the different cases we care about (without these fixes, this test case generates errors from fxc because of variables being used before being initialized, becaues parameters get declared `out` that should be `inout`).
A minor point here is that we are playing a bit fast and loose right now because the IR does not actually enforce any type checks. From the standpoint of the front end, `Ptr<T>`, `Out<T>`, and `InOut<T>` are all unrelated types (each is just a `struct` declared in `core.meta.slang`), but this doesn't really matter because none of these are types our current users are explicitly using.
In the IR it makes perfect sense to allow `Out<T>` or `InOut<T>` as the operand of a `load` or `store` instruction (and ditto for `getFieldAddr`, etc.) - there instructions just apply to any `PtrTypeBase`. The place where this potentially gets tricky is whether an `Out<T>` can be used where a `Ptr<T>` is expected, or vice vers (e.g., can I just pass my local variable's pointer directly to an `Out<T>` function parameter?
I'm going to ignore these issues for now, since the code currently works for our test case.
|
|
* Add support for global generic parameters
(In-progress work)
This commit include:
1. Update Slang API to allow specification of generic type arguments in an `EntryPointRequest`
2. Add parsing of `__generic_param` construct, which becomes a GlobalGenericParamDecl, contains members of `GenericTypeConstraintDecl`.
3. Semantics checking will check whether the provided type arguments conform to the interfaces as defined by the generic parameter, and store SubtypeWitness values in the EntryPointRequest, which will be used by `specializeIRForEntryPoint` when generating final IR.
4. Add a new type of substitution - `GlobalGenericParamSubstitution` for subsittuting references to `__generic_param` decls or to its member `GenericTypeConsraintDecl` with the actual type argument or witness tables.
5. Update `IRSpecContext` to apply `GlobalGenericParamSubstitution` when specializing the IR for an EntryPointRequest.
6. Update `render-test` to take additional `type` inputs, which specifies the type arguments to substitute into the global `__generic_param` types.
This commit does not include ProgramLayout specialization.
* IR: pass through `[unroll]` attribute (#284)
The initial lowering was adding an `IRLoopControlDecoration` to the instruction at the head of a loop, but this was getting dropped when the IR gets cloned for a particular entry point.
The fix was simply to add a case for loop-control decorations to `cloneDecoration`.
* fix warnings
* IR: support `CompileTimeForStmt` (#286)
This statement type is a bit of a hack, to support loops that *must* be unrolled.
The AST-to-AST pass handles them by cloning the AST for the loop body N times, and it was easy enough to do the same thing for the IR: emit the instructions for the body N times.
The only thing that requires a bit of care is that now we might see the same variable declarations multiple times, so we need to play it safe and overwrite existing entries in our map from declarations to their IR values.
Of course a better answer long-term would be to do the actual unrolling in the IR. This is especially true because we might some day want to support compile-time/must-unroll loops in functions, where the loop counter comes in as a parameter (but must still be compile-time-constant at every call site).
* Add support for global generic parameters
(In-progress work)
This commit include:
1. Update Slang API to allow specification of generic type arguments in an `EntryPointRequest`
2. Add parsing of `__generic_param` construct, which becomes a GlobalGenericParamDecl, contains members of `GenericTypeConstraintDecl`.
3. Semantics checking will check whether the provided type arguments conform to the interfaces as defined by the generic parameter, and store SubtypeWitness values in the EntryPointRequest, which will be used by `specializeIRForEntryPoint` when generating final IR.
4. Add a new type of substitution - `GlobalGenericParamSubstitution` for subsittuting references to `__generic_param` decls or to its member `GenericTypeConsraintDecl` with the actual type argument or witness tables.
5. Update `IRSpecContext` to apply `GlobalGenericParamSubstitution` when specializing the IR for an EntryPointRequest.
6. Update `render-test` to take additional `type` inputs, which specifies the type arguments to substitute into the global `__generic_param` types.
progress on parameter binding
* Add a more contrived test case for specializing parameter bindings
* update render-test to align buffers to 256 bytes (to get rid of D3D complains on minimal buffer size).
* adding one more test case for parameter binding specialization.
* Cleanup according to @tfoleyNV 's suggestions.
* fix a bug introduced in the cleanup
|
|
* Don't auto-enable IR use for compute tests
The `COMPARE_COMPUTE` and `COMPARE_RENDER_COMPUTE` test fixtures were set up to always enable the `-use-ir` flag on Slang, which precludes having any tests that confirm functionality on the old non-IR path (which is still required by our main customer).
This change adds the `-xslang -use-ir` flags explicitly to any compute test cases that left them out, and makes the fixture no longer add it by default.
* Continue building out parameter block support
The initial front-end logic for parameter blocks was already added, but they are still missing a bunch of functionality. This change addresses some of the known issues:
- Bug fix: don't try to emit HLSL `register` bindings for variables that consume whole register spaces/sets
- Overhaul type layout logic so that it can make decisions based on a given code generation target (currently passed in as a `TargetRequest`), which allows us to decide whether or not a parameter block should get its own register set on a per-target basis.
- Always use a register space/set for Vulkan
- Never use a register space/set for HLSL SM 5.0 and lower
- By default, don't use register spaces/sets for HLSL output
- Add a command-line flag and some "target flags" to enable register-space usage for D3D targets
- Hackily add initial support for parameter blocks in the AST-to-AST path
- This just blindly lowers `ParameterBlock<T>` to `T`, which shouldn't quite work
- A more complete overhaul will probably need to wait until the AST-to-AST legalization is changed to use the `LegalType`s from the IR legalization pass.
- Add a compute-based test case to actually run code using parameter blocks
- This file runs test cases both with and without the IR
|
|
* Rename existing ParameterBlock to ParameterGroup
We are planning to add a new `ParameterBlock<T>` type, which maps to the notion of a "parameter block" as used in the Spire research work.
Unfortunately, the compiler codebase already uses the term `ParameterBlock` as catch-all to encompass all of HLSL `cbuffer`/`tbuffer` and GLSL `uniform`/`buffer`/`in`/`out` blocks (all of which are lexical `{}`-enclosed blocks that define parameters...).
This change instead renames all of the existing concepts over to `ParameterGroup`, which isn't an ideal name, but at least doesn't directly overlap the new terminology or any existing terminology.
The new `ParameterBlockType` case will probably be a subclass of `ParameterGroupType`, since it is a logical extension of the underlying concept.
* Add Shader Model 5.1 profiles
The HLSL `register(..., space0)` syntax is only allowed on "SM5.1" and later profiles (which is supported by the newer version of `d3dcompiler_47.dll` that comes with the Win10 SDK, but not the older version of `d3dcompiler_47.dll` - good luck figuring out which you have!).
This change adds those profiles to our master list of profiles, and nothing else.
* First pass at support for `ParameterBlock<T>`
- Add the type declaration in stdlib
- Add a special case of `ParameterGroupType` for parameter blocks
- Handle parameter blocks in type layout (currently handling them identically to constant buffers for now, which isn't going to be right in the long term)
- Add an IR pass that basically replaces `ParameterBlock<T>` with `T`
- Eventually this should replace it with either `T` or `ConstantBuffer<T>`, depending on whether the layout that was computed required a constant buffer to hold any "free" uniforms
- Add first stab at an IR pass to "scalarize" global variables using aggregate types with resources inside.
- This currently only applies to global variables, so it won't handle things passed through functions, or used as local variables
- It also only supports cases where the references to the original variable are always references to its fields, and not the whole value itself
- Add a single test case that technically passes with this level of support, but probably isn't very representative of what we need from the feature
* Fold parameter-block desugaring into a more complete "type legalization" pass
The basic problem that was arising is that once you desugar `ParameterBlock<T>` into `T`, you then need todeal with splitting `T` into its constituent fields if it contains any resource types.
Handling those transformations by following the usual use-def chains wasn't really helping, because you might need systematic rewriting that can really only be handled bottom-up.
This change adds a new pass that is intended to perform multiple kinds of type "legalization" at once:
- It will turn `ParameterBlock<T>` into `T`
- It may at some point also convert `ConstantBuffer<T>` into `T` as well
- It will turn an value of an aggregate type that contains resources into N different values (one per field)
- As a result of this, it will also deal with AOS-to-SOA conversion of these types
Legalization is applied to *every* function/instruction/value, so that it can make large-scale changes that would be tough to manage with a work list.
This pass needs to be run *after* generics have been fully specialized, so that we know we are always dealing with fully concrete types, so that their legalization for a given target is completely known.
This is still work in progress; there's more to be done to get this working with all our test cases, and finish the remaining `ParameterBlock<T>` work.
* Improve binding/layout information when using parameter blocks
- When doing type layout for a parameter block, don't include the resources consumed by the element type in the resource usage for the parameter block
- Note that this is pretty much identical to how a `ConstantBuffer<T>` does not report any `LayoutResourceKind::Uniform` usage, except that `ParameterBlock<T>` is *also* going to hide underlying texture/sampler reigster usage
- The one exception here is that any nested items that use up entire `space`s or `set`s those need to be exposed in the resource usage of the parent (I don't have a test for this)
- When type legalization needs to scalarize things, it must propagate layout information down to the new leaf variables. In general, the register/index for a new leaf parameter should be the sum of the offsets for all of the parent variables along the "chain" from the original variable down to the leaf (we aren't dealing with arrays here just yet).
- When type legalization decides to eliminate a pointer(-like) type (e.g., desugar `ParameterBlock<T>` over to `T`), actually deal with that in terms of the `LegalVal`s created, so that we can know to turn a `load` into a no-op when applied to a value that got indirection removed.
- Hack up the "complex" parameter-block test so that it actually passes (the big hack here is that the HLSL baseline is using names that are generated by the IR, and are unlikely to be stable as we add/remove transformations).
- Note: I can't make these be compute tests right now, because regsiter spaces/sets are a feature of D3D12/Vulkan, and our test runner isn't using those APIs.
|
|
- Add shader model 6.0, 6.1, and 6.2 targets
- Add DXIL and DXIL assembly as output formats
- Add header for DXC API to `external/`
- Add `dxc-support.cpp` that wraps usage of the API
- Add `-pass-through dxc` option, equivalent to what we have for `fxc`
Notes:
* This does *not* include any logic to add `dxcompiler.dll` to our build process; that is way out of scope for the build complexity I'm ready to deal with
* For right now, the use of `dxcompiler.dll` is hard-coded, and it must be discoverable in the current executable's search path; options to customize can come later
* The `-pass-through` option is kind of silly because the code doesn't actually pay attention to the value (just whether it is set). If you set it to `fxc` but ask for DXIL, we pass through `dxc` anyway.
|
|
The big addition here is that the Slang "bytecode" is no longer treated as just a "code generation target" (`CodeGenTarget`) akin to DX bytecode (DXBC) or SPIR-V, but instead is a `ContainerFormat` that can be used to emit all the results of a compile request (well, currently just the IR-as-BC, but the intention is there).
Getting to this goal involved some prior checkins that eliminated bogus "targets" that weren't really akin to SPIR-V or DXBC: `-target slang-ir-asm` and `-target reflection-json`. Those targets were really in place to support testing, and so they've been made more explicit testing/debug options.
This change eliminates `-target slang-ir` and instead tries to allow the user to specify `-o foo.slang-module` as an output file name, that indicates the intention to output a "container" file that will wrap up all the generated code.
I've also gone ahead and generalized the existing `-target` option so that we are actually building up a *list* of code generation targets. This is largely just a cleanup, since it forces code to be more aware of when it is doing something target-specific vs. target independent. For example, reflection layout information lives on a requested target, and not on the compile request as a whole, and similarly output code is per-target, per-entry-point.
As a cleanup, I eliminated support for per-translation-unit output. This was vestigial code from back when I used to try and do HLSL generation for a whole translation unit instead of per-entry-point (which turned out to be a lot of complexity for little gain), and it was only being used in the `hello` example and the `render-test` test fixture - in both cases fixing it up was easy enough. I've stubbed out the old `spGetTranslationUnitSource` API, but haven't removed it yet.
|
|
* Get rid of the `-slang-ir-asm` target
This is really only useful for debugging, so I've replaced the functionality with a `-dump-ir` command line option (which dump's the IR for an entry point before doing codegen).
* fixup: use HLSL target, not DXBC, so test can run on Linux
|
|
Move reflection JSON generation into separate test fixture
|
|
* IR: overhaul IR design/implementation
Closes #192
Closes #188
This is a major overhaul of how the IR is implemented, with the primary goal of just using the AST-level type representation as the IR's type representation, rather than inventing an entire shadow set of types (as captured in issue #192).
One consequence of this choice is that types in the IR are no longer explicit "instructions" and are not represented as ordinary operands (so a bunch of `+ 1` cases end up going away when enumerating ordinary operands).
Along the way I also got rid of the embedded IDs in the IR (issue #188) because this wasn't too hard to deal with at the same time.
Another related change was to split the `IRValue` and `IRInst` cases, so that there are values that are not also instructions. Non-instruction values are now used to represent literals, references to declarations, and would eventually be used for an `undef` value if we need one. IR functions, global variables, and basic blocks are all values (because they can appear as operands), but not instructions.
The main benefit of this approach is that the top-level structure of a bytecode (BC) module is much simpler to understand and walk, and BC-level types are represented much more directly (such that we could conceivably use them for reflection soon).
* fixup: 64-bit build fix
* fixup: try to silence clang's pedantic dependent-type errors
* fixup: bug in VM loading of constants
|
|
* IR: handle control flow constructs
This change includes a bunch of fixes and additions to the IR path:
- `slang-ir-assembly` is now a valid output target (so we can use it for testing)
- This uses what used to be the IR "dumping" logic, revamped to support much prettier output.
- A future change will need to add back support for less prettified output to use when actually debugging
- IR generation for `for` loops and `if` statements is supported
- HLSL output from the above control flow constructs is implemented
- Revamped the handling of l-values, and in particular work on compound ops like `+=`
- Add basic IR support for `groupshared` variables
- Add basic IR support for storing compute thread-group size
- Output semantics on entry point parameters
- This uses the AST structures to find semantics, so its still needs work
- Pass through loop unroll flags
- This is required to match `fxc` output, at least until we implement
unrolling ourselves.
* Fixup: 64-bit build issues.
* fixup for merge
|
|
Fixes #23
Up to this point, the compiler has used the ordinary `String` type to represent declaration names, which means a bunch of lookup structures throughout the compiler were string-to-whatever maps, which can reduce efficiency.
It also means that things like the `Token` type end up carying a `String` by value and paying for things like reference-counting.
This change adds a `Name` type that is used to represent names of variables, types, macros, etc.
Names are cached and unique'd globally for a session, and the string-to-name mapping gets done during lexing.
From that point on, most mapping is from pointers, which should make all the various table lookups faster.
More importantly (possibly), this brings us one step closer to being able to pool-allocate the AST nodes.
|
|
Just like the previous change did for declaration keywords, this change uses the lexical environment to drive the lookup and dispatch of modifier parsing.
This allows us to easily add modifiers to Slang, even when they might conflict with identifiers used in user code (because the modifier names are no longer special keywords, but ordinary identifiers).
There was already some support for ideas like this with `__modifier` declarations (`ModifierDecl`) used to introduce some GLSL-specific keywords (so that they wouldn't pollute the namespace of HLSL files).
The new approach changes these to be actual `syntax` declarations (`SyntaxDecl`) with the same representation as those used to introduce declaration keywords.
Because many modifiers just introduce a single keyword that maps to a simple AST node (no further tokens/data), I modified the handling of syntax declarations so that they can take a user-data parameter, and this allows the common case ("just create an AST node of this type...") to be handled with minimal complications.
This also adds in a general-purpose string-based lookup path for AST node classes, that should support programmatic creation in more cases.
Statements are now the main case of keywords that need to be made table driven.
|
|
The existing parser code was doing string-based matching on the lookahead token to figure out how to parse a declaration, e.g.:
```
if(lookAhead == "struct") { /* do struct thing */ }
else if(lookAhead == "interface") { /* do interface thing * }
...
```
That approach has some annoying down-sides:
- It is slower than it needs to be
- It is annoying to deal with cases where the available declaration keywords might differ by language
- Most importantly, it is not possible for us to introduce "extended" keywords that the user can make use of, but which can be ignored by the user and treated as an ordinary identifier.
That last part is important. Suppose the user wanted to have a local variable named `import`, but we also had a Slang extension that added an `import` keyword. Then a line of code like `import += 1` would lead to a failure because we'd try to parse an import declaration, even when it is obvious that the user meant their local variable. This would mean that Slang can't parse existing user code that might clash with syntax extensions. This issue is the reason why we currently have keywords like `__import`.
A traditional solution in a compiler is to map keywords to distinct token codes as part of lexing, which eliminates the first conern (performance) because now we can dispatch with `switch`. It can also aleviate the second concern if we add/remove names from the string->code mapping based on language (the rest of the parsing logic doesn't have to know about keywords being added/removed).
The solution we go for here is more aggressive.
Instead of mapping keyword names to special token codes during lexing, we instead introduce logical "syntax declarations" into the AST, which are looked up using the ordinary scoping rules of the language.
Depending on what code is imported into the scope where parsing is going on, different keywords may then be visible.
This solves our last concern, since a user-defined variable that just happens to use the same name as a keyword is now allowed to shadow the imported declaration for syntax (this is akin to, e.g., Scheme where there really aren't any "keywords").
This also opens the door to the possibility of eventually allowing user to define their own syntax (again, like Scheme).
For now I'm only using this for the declaration keywords.
With this change it should be pretty easy to also add statement keywords in the same fashion.
|
|
Fixes #24
So far the code has used a representation for source locations that is heavy-weight, but typical of research or hobby compilers: a `struct` type containing a line number and a (heap-allocated) string.
This is actually very convenient for debugging, but it means that any data structure that might contain a source location needs careful memory management (because of those strings) and has a tendency to bloat.
The new represnetation is that a source location is just a pointer-sized integer.
In the simplest mental model, you can think of this as just counting every byte of source text that is passed in, and using those to name locations.
Finding the path and line number that corresponds to a location involves a lookup step, but we can arrange to store all the files in an array sorted by their start locations, and do a binary search.
Finding line numbers inside a file is similarly fast (one you pay a one-time cost to build an array of starting offsets for lines).
More advanced compilers like clang actually go further and create a unique range of source locations to represent a file each time it gets included, so that they can track the include stack and reproduce it in diagnostic messages.
I'm not doing anything that clever here.
|
|
- `ExpressionSyntaxNode` becomes `Expr`
- `StatementSyntaxNode` becomes `Stmt`
- `StructSyntaxNode` becomes `StructDecl`
- `ProgramSyntaxNode` becomes `ModuleDecl`
- `ExpressionType` becomes `Type`
- Existing fields names `Type` become `type`
- There might be some collateral damage here if there were, e.g., `enum`s named `Type`, but I can live with that for now and fix those up as a I see them
|
|
There were two main places where global variables were used in the Slang implementation:
1. The "standard library" code was generated as a string at run-time, and stored in a global variable so that it could be amortized across compiles.
2. The representation of types uses some globals (well, class `static` members) to store common types (e.g., `void`) and to deal with memory lifetime for things like canonicalized types.
In each case the "simple" fix is to move the relevant state into the `Session` type that controlled their lifetime already (the `Session` destructor was already cleaning up these globals to avoid leaks).
For the standard library stuff this really was easy, but for the types it required threading through the `Session` a bit carefully.
One more case that I found: there was a function-`static` variable used to generate a unique ID for files output when dumping of intermediates is enabled (this is almost strictly a debugging option).
Rather than make this counter per-session (which would lead to different sessions on different threads clobbering the same few files), I went ahead and used an atomic in this case.
Note that the remaining case I had been worried about was any function-`static` counter that might be used in generating unique names.
It turns out that right now the parser doesn't use such a counter (even in cases where it probably should), and the lowering pass already uses a counter local to the pass (again, whether or not this is a good idea).
This change should be a major step toward allowing an application to use Slang in multiple threads, so long as each thread uses a distinct `SlangSession`. The case of using a single session across multiple threads is harder to support, and will require more careful implementation work.
|
|
Fixes #11
- This adds a `-o` command-line option for specifying an output file.
- The code tries to be a bit smart, to glean an output format from a file extension, and also to associate multiple `-o` options with multiple `-entry` options if needed.
- There is a restriction that all the output files need to agree on the code generation target. This is reasonable for now, but might be something to lift eventualy
- There is a restriction that only one output file is allowed per entry point
- Together with the previous item this means you can't output both a `.spv` and a `.spv.asm` in one pass, even though both should be possible
- There is currently a restriction that output paths only apply to entry points
- This means there is no way to output reflection JSON to a file with `-o` (but that is mostly just a debugging feature for now)
- This also means we don't support any "container" formats that can encapsulate multiple compiled entry points
|
|
- API users can use this to get "clean" output to aid with debugging Slang issues
- Also changes the prefix on intermediate files that Slang dumps, to make them easier to ignore with a regexp
|
|
Calling:
spSetDumpIntermedites(compileRequest, true);
will set up a mode where Slang tries to dump every intermediate HLSL, GLSL, DXBC, SPIR-V, etc. file it generates. If SPIR-V or DXBC is requested then we also dump assembly of those.
Right now the files are all named as `slang-<counter>.<ext>`, and get dropped in whatever the working directory is, but I'm open to ideas on how to improve that.
Note: this change introduces a new binary interface to `glslang`, so pulling it requires an updated `glslang.dll`.
|
|
EntryPointResult/TranslationUnitResult, added helper functionality; Ensure null termination when printing raw data
|
|
|
|
- Allow a code-generation target of `NONE` in order to suppress ordinary output in test cases where we don't care about the actual output (just pass/fail result)
- Add explicit `location` layout qualifiers to intermediate vertex-to-fragment variables in GLSL test cases for rendering, to work around apparent Intel driver bugs.
|
|
The tricky bit here was that the `reflection-json` output format isn't really a code generation target like the others, and we need to be able to have multiple "targets" active to make sense of it. This needs cleaning-up.
|
|
- The big change here is that all the definitions for syntax-node classes have been macro-ized, to that we can do light metaprogramming over them
- The use of macros for this has big down-sides, but I'm not quite ready to do anything more heavy-weight right now
- The macro-ized definitions can be included multiple times, to generate different declarations/code as needed
- The first example of using this meta-programming facility is a new visitor system
- The actual visitor base classes and the dispatch logic are all generated from the meta-files
- There was only one visitor left in the code: the semantics checker, so that was ported to the new system.
- All current test cases pass, so *of course* that means all is well.
|
|
Previously the code checked for a duplicate `#import` using a data structure attached to the compile request, but this would fail for nested imports.
It also wouldn't work for a combination of `#import` and `__import`.
This change makes it so that we instead track a set of already-imported modules in the semantic checking visitor, which is instantiated once per translation unit.
We also key this set on the actual module (AST) imported, rather than on path/name/whatever, so hopefully it will be robust to the same thing getting imported multiple ways.
|
|
With this change, there is now a meaningful semantic difference between `__import` and `#import`.
An `__import` compiles the target file in a fresh environment, only providing it any macro definitions passed via command line or API. Any macros defined in the imported file are not made visible at the import site. One can think of an `__import` as a bit like `using namespace` in C++.
A `#import` will tokenize the input in the same preprocessor environment as the importing file, and any macros defined along the way will be visible in the parent file.
It is a *bit* like a `#include` with two big differences:
- The imported code is always parsed as Slang, and as its own module with default flags, etc. (so semantic checks are on even if we are in "rewriter" mode). It is pulled into the outer namespace just as for `__import`.
- A given file will only get `#import`ed once for a translation unit, so it behaves a bit like there is an implicit `#pragma once` in the target file
|
|
Right now `#import` only differs from `#include` in that it takes a string literal for a file name instead of a raw identifier (to which `.slang` gets appended).
The next step is to make `#import` respect preprocessor state, while `__import` doesn't.
|
|
- The basic idea is simple: be sure to enumerate code in `__import`ed modules when generating reflection info
- Note that we don't currently allow an entry point to appear in an imported module, so we only consider globlal-scope parameters
- Although there isn't currently a real implementation of namespacing, I went ahead and ensured that parameters in imported modules are treated as distinct from parameters in the user's code, even if they have the same name.
|
|
The main user-visible change here is that instead of `spAddTranslationUnitEntryPoint` we have `spAddEntryPoint`, to reflect that the list of entry points is "global" to a compile request.
As a result, `spGetEntryPointSource` now only needs the entry point index, and not the translation unit index.
There are a bunch more behind-the-scenes changes, though, reflecting a streamlining of the concepts related to compilation into a smaller number of classes.
Now there is:
- `Session` (unchanged) to manage the lifetimes of shared stuff like the stdlib
- `CompileRequest` (merges in `CompileOptions`) to handle all the lifetime related to a single invocation of the compiler
- `TranslationUnitRequest` (merges `TranslationUnitOptions`, `CompileUnit`) to represent a single translation unit ("module") that the user is trying to compile. This is a single file for HLSL/GLSL, but can be multiple files for Slang.
- `EntryPointRequest` (merges `EntryPointOption` and a bit of `EntryPointResult`) to track a single entry point that the user is asking to compile (that entry point always comes from a single translation unit)
A lot of functions used to take some combination of these and end up with really long signatures.
I've given most of the objects "parent" pointers so that they can get back to all the context they need, so most functions don't need as many parameters.
It may eventually be important to tease these apart again, in particular:
- The code-generation side of things (the `*Result` types) might need to be pulled out in case we want to codegen multiple times from the same AST
- Similarly, the layout stuff may also need to be pulled out, in case we want to lay things out multiple times with different rules.
|
|
That is, even if hte user specified the `-no-checking` option (or the equivalent via API), we still want/need to apply full semantic checks to Slang code, so that cross-compilation will be possible.
|
|
The basic idea of this change is that user code can just write:
#include "foo.h"
and then if `foo.h` gets found in a list of registered directories for "auto-import," then it actually gets interpreted as if the user had writte, more or less:
__import foo;
That is, the code in `foo.h` will be treated as Slang, and will be fully parsed and checked (no matter what the source language had been), and the scoping rules will be those of `__import` instead of `#include`.
This is a really big hammer, and I could imagine it smashing fingers if used poorly.
I'm not sure this feature will pan out, but we need to try things to know.
One big piece of that that I'll likely keep in either case is an overhaul of command-line options parsing for `slangc`. In particular, this logic has been moved into the core `slang` library (so that users can just pass options in via the API), and it is all done on UTF-8 strings rather than wide strings (which was always going to be Windows-specific).
|
|
It is always easier to add back code when you need it, than it is to maintain code you aren't using.
|
|
Getting rid of more namespace complexity and stripping things down to the basics.
This also gets rid of some dead code in the "core" library.
|
|
This gets rid of one unecessary namespace.
|
|
This is a large change that contains many pieces:
- Update the `cross-compile0` test to actually make use of cross compilation.
Now the `cross-compile0.hlsl` file contains both HLSL and GLSL source code, and then imports code from `cross-compile0.slang`, which provides a "library" (one function) that can be shared between both the HLSL and GLSL version of things.
- Fixed a bug in the support for backslash-escaped newlines.
- Added a new `__import` declaration type (replaces the `using` directive that was still around in a vestigial form)
An `__import` causes the compiler to look for a Slang source file (currently using the ordinary `#include` lookup logic), and then parse/check the found file as an additional module ("translation unit"), before making its declarations visible in the current scope.
- Refactored the main compilation flow to be simpler. There were the `ShaderCompiler` and `ShaderCompilerImpl` classes that weren't relaly doing anything, but added complexity to the whole workflow.
- The `render-test` application has been heavily modified to better support testing cross-compilation workflows. At the most basic level we are starting to distinguish pass-through vs. rewriter workflows, and are passing various `#define`s down to the compiler(s) to let the source code be customized as needed for each case.
Several annoying corner cases are caused here by having to support the GLSL compilation model, which really wants each entry point in its own specific translation unit, whereas we really want to keep things nicely contained in single files.
- Added support for `__intrinsic` operations to have target-specific behavior.
This allows a function to be given a different name for some specific target (so a call gets emitted as a call to that other operation).
More generally, the library writer can put together an arbitrary format string that will be used in place of expressions that call the given function, e.g.:
__intrinsic(hlsl, "$1 - $0") __intrinsic int foo(int a, int b);
Given this declaration, a call like `foo(x,y)` will code generate as `x - y` for HLSL, and as `foo(x,y)` for all other targets.
Annoying things still to be dealt with:
- The way that I'm filtering the user-provided options when passing things down to the compilation of dynamically loaded modules is a bit ad hoc. It would be good to have a systematic notion of which options will be inherited and which won't. There is also more code duplication than I'd like, so we risk having the compiler behave differently when compiling a file at the top level, vs. because of `__import`.
- Adding target-specific behavior to intrinsics is all well and good, but the current approach means we can only add this to the original declaration, which limits the ability to easily extend the set of targets.
A better approach long-term would be to add a more robust notion of target-based overload resolution (which would happen after semantic checking). Then one mechanism would be used to find the right target-specific overload to use for an operation, and then each (target-specific) definition could use a simpler attribute to intercept code-generation behavior.
Note that we might eventually need a similar notion to deal with stage- or profile-specific functions and the overloading behavior around them, so using this for intrinsics doesn't seem like a bad idea.
|
|
|