| Age | Commit message (Collapse) | Author |
|
|
|
1. simplify RoundUpToAlignment()
2. add new a render-compute test case to cover the situation where the entry-point interface (parameter/return types of an entry-point function) is dependent on the global generic type.
3. initial fixes to get this test case to compile (but is not producing correct HLSL output yet)
|
|
* Add support for global generic parameters
(In-progress work)
This commit include:
1. Update Slang API to allow specification of generic type arguments in an `EntryPointRequest`
2. Add parsing of `__generic_param` construct, which becomes a GlobalGenericParamDecl, contains members of `GenericTypeConstraintDecl`.
3. Semantics checking will check whether the provided type arguments conform to the interfaces as defined by the generic parameter, and store SubtypeWitness values in the EntryPointRequest, which will be used by `specializeIRForEntryPoint` when generating final IR.
4. Add a new type of substitution - `GlobalGenericParamSubstitution` for subsittuting references to `__generic_param` decls or to its member `GenericTypeConsraintDecl` with the actual type argument or witness tables.
5. Update `IRSpecContext` to apply `GlobalGenericParamSubstitution` when specializing the IR for an EntryPointRequest.
6. Update `render-test` to take additional `type` inputs, which specifies the type arguments to substitute into the global `__generic_param` types.
This commit does not include ProgramLayout specialization.
* IR: pass through `[unroll]` attribute (#284)
The initial lowering was adding an `IRLoopControlDecoration` to the instruction at the head of a loop, but this was getting dropped when the IR gets cloned for a particular entry point.
The fix was simply to add a case for loop-control decorations to `cloneDecoration`.
* fix warnings
* IR: support `CompileTimeForStmt` (#286)
This statement type is a bit of a hack, to support loops that *must* be unrolled.
The AST-to-AST pass handles them by cloning the AST for the loop body N times, and it was easy enough to do the same thing for the IR: emit the instructions for the body N times.
The only thing that requires a bit of care is that now we might see the same variable declarations multiple times, so we need to play it safe and overwrite existing entries in our map from declarations to their IR values.
Of course a better answer long-term would be to do the actual unrolling in the IR. This is especially true because we might some day want to support compile-time/must-unroll loops in functions, where the loop counter comes in as a parameter (but must still be compile-time-constant at every call site).
* Add support for global generic parameters
(In-progress work)
This commit include:
1. Update Slang API to allow specification of generic type arguments in an `EntryPointRequest`
2. Add parsing of `__generic_param` construct, which becomes a GlobalGenericParamDecl, contains members of `GenericTypeConstraintDecl`.
3. Semantics checking will check whether the provided type arguments conform to the interfaces as defined by the generic parameter, and store SubtypeWitness values in the EntryPointRequest, which will be used by `specializeIRForEntryPoint` when generating final IR.
4. Add a new type of substitution - `GlobalGenericParamSubstitution` for subsittuting references to `__generic_param` decls or to its member `GenericTypeConsraintDecl` with the actual type argument or witness tables.
5. Update `IRSpecContext` to apply `GlobalGenericParamSubstitution` when specializing the IR for an EntryPointRequest.
6. Update `render-test` to take additional `type` inputs, which specifies the type arguments to substitute into the global `__generic_param` types.
progress on parameter binding
* Add a more contrived test case for specializing parameter bindings
* update render-test to align buffers to 256 bytes (to get rid of D3D complains on minimal buffer size).
* adding one more test case for parameter binding specialization.
* Cleanup according to @tfoleyNV 's suggestions.
* fix a bug introduced in the cleanup
|
|
testing framework.
|
|
The big addition here is that the Slang "bytecode" is no longer treated as just a "code generation target" (`CodeGenTarget`) akin to DX bytecode (DXBC) or SPIR-V, but instead is a `ContainerFormat` that can be used to emit all the results of a compile request (well, currently just the IR-as-BC, but the intention is there).
Getting to this goal involved some prior checkins that eliminated bogus "targets" that weren't really akin to SPIR-V or DXBC: `-target slang-ir-asm` and `-target reflection-json`. Those targets were really in place to support testing, and so they've been made more explicit testing/debug options.
This change eliminates `-target slang-ir` and instead tries to allow the user to specify `-o foo.slang-module` as an output file name, that indicates the intention to output a "container" file that will wrap up all the generated code.
I've also gone ahead and generalized the existing `-target` option so that we are actually building up a *list* of code generation targets. This is largely just a cleanup, since it forces code to be more aware of when it is doing something target-specific vs. target independent. For example, reflection layout information lives on a requested target, and not on the compile request as a whole, and similarly output code is per-target, per-entry-point.
As a cleanup, I eliminated support for per-translation-unit output. This was vestigial code from back when I used to try and do HLSL generation for a whole translation unit instead of per-entry-point (which turned out to be a lot of complexity for little gain), and it was only being used in the `hello` example and the `render-test` test fixture - in both cases fixing it up was easy enough. I've stubbed out the old `spGetTranslationUnitSource` API, but haven't removed it yet.
|
|
When using the lumped/"unity" build approach for Slang, the resulting `.obj` files run into number-of-sections limits in the VS linker. For now I'm using the `/bigobj` command-line flag to work around this for the `hello` example, just so I can be sure the lumped build still works, but longer term it seems like we need to just drop that approach anyway.
The `render-test` application was switched to link against `slang.dll` since there is no reason to have multiple apps use the lumped approach.
|
|
The main user-visible change here is that instead of `spAddTranslationUnitEntryPoint` we have `spAddEntryPoint`, to reflect that the list of entry points is "global" to a compile request.
As a result, `spGetEntryPointSource` now only needs the entry point index, and not the translation unit index.
There are a bunch more behind-the-scenes changes, though, reflecting a streamlining of the concepts related to compilation into a smaller number of classes.
Now there is:
- `Session` (unchanged) to manage the lifetimes of shared stuff like the stdlib
- `CompileRequest` (merges in `CompileOptions`) to handle all the lifetime related to a single invocation of the compiler
- `TranslationUnitRequest` (merges `TranslationUnitOptions`, `CompileUnit`) to represent a single translation unit ("module") that the user is trying to compile. This is a single file for HLSL/GLSL, but can be multiple files for Slang.
- `EntryPointRequest` (merges `EntryPointOption` and a bit of `EntryPointResult`) to track a single entry point that the user is asking to compile (that entry point always comes from a single translation unit)
A lot of functions used to take some combination of these and end up with really long signatures.
I've given most of the objects "parent" pointers so that they can get back to all the context they need, so most functions don't need as many parameters.
It may eventually be important to tease these apart again, in particular:
- The code-generation side of things (the `*Result` types) might need to be pulled out in case we want to codegen multiple times from the same AST
- Similarly, the layout stuff may also need to be pulled out, in case we want to lay things out multiple times with different rules.
|
|
The basic idea of this change is that user code can just write:
#include "foo.h"
and then if `foo.h` gets found in a list of registered directories for "auto-import," then it actually gets interpreted as if the user had writte, more or less:
__import foo;
That is, the code in `foo.h` will be treated as Slang, and will be fully parsed and checked (no matter what the source language had been), and the scoping rules will be those of `__import` instead of `#include`.
This is a really big hammer, and I could imagine it smashing fingers if used poorly.
I'm not sure this feature will pan out, but we need to try things to know.
One big piece of that that I'll likely keep in either case is an overhaul of command-line options parsing for `slangc`. In particular, this logic has been moved into the core `slang` library (so that users can just pass options in via the API), and it is all done on UTF-8 strings rather than wide strings (which was always going to be Windows-specific).
|
|
This is a large change that contains many pieces:
- Update the `cross-compile0` test to actually make use of cross compilation.
Now the `cross-compile0.hlsl` file contains both HLSL and GLSL source code, and then imports code from `cross-compile0.slang`, which provides a "library" (one function) that can be shared between both the HLSL and GLSL version of things.
- Fixed a bug in the support for backslash-escaped newlines.
- Added a new `__import` declaration type (replaces the `using` directive that was still around in a vestigial form)
An `__import` causes the compiler to look for a Slang source file (currently using the ordinary `#include` lookup logic), and then parse/check the found file as an additional module ("translation unit"), before making its declarations visible in the current scope.
- Refactored the main compilation flow to be simpler. There were the `ShaderCompiler` and `ShaderCompilerImpl` classes that weren't relaly doing anything, but added complexity to the whole workflow.
- The `render-test` application has been heavily modified to better support testing cross-compilation workflows. At the most basic level we are starting to distinguish pass-through vs. rewriter workflows, and are passing various `#define`s down to the compiler(s) to let the source code be customized as needed for each case.
Several annoying corner cases are caused here by having to support the GLSL compilation model, which really wants each entry point in its own specific translation unit, whereas we really want to keep things nicely contained in single files.
- Added support for `__intrinsic` operations to have target-specific behavior.
This allows a function to be given a different name for some specific target (so a call gets emitted as a call to that other operation).
More generally, the library writer can put together an arbitrary format string that will be used in place of expressions that call the given function, e.g.:
__intrinsic(hlsl, "$1 - $0") __intrinsic int foo(int a, int b);
Given this declaration, a call like `foo(x,y)` will code generate as `x - y` for HLSL, and as `foo(x,y)` for all other targets.
Annoying things still to be dealt with:
- The way that I'm filtering the user-provided options when passing things down to the compilation of dynamically loaded modules is a bit ad hoc. It would be good to have a systematic notion of which options will be inherited and which won't. There is also more code duplication than I'd like, so we risk having the compiler behave differently when compiling a file at the top level, vs. because of `__import`.
- Adding target-specific behavior to intrinsics is all well and good, but the current approach means we can only add this to the original declaration, which limits the ability to easily extend the set of targets.
A better approach long-term would be to add a more robust notion of target-based overload resolution (which would happen after semantic checking). Then one mechanism would be used to find the right target-specific overload to use for an operation, and then each (target-specific) definition could use a simpler attribute to intercept code-generation behavior.
Note that we might eventually need a similar notion to deal with stage- or profile-specific functions and the overloading behavior around them, so using this for intrinsics doesn't seem like a bad idea.
|
|
The test case that is there right now is nominally a cross-compilation test, but for right now it uses the preprocessor to present completely different code for HLSL and GLSL compilation.
This change is really just fleshing out the OpenGL side of `render-test` enough that it can produce images using OpenGL to enable further testing.
|
|
|