| Age | Commit message (Collapse) | Author |
|
* Fixed some small typos in api-users-guide.md
* Fix some small typos in slang-test/main.cpp, render-test/render-d3d11.cpp
* Remove exit() calls from test code. Added Slang::Result, which works in the same way as COM HRESULT.
* FIx bug introduced when moving to Slang::Result - handling E_INVALIDARG on Dx11.
* Fix the testing of feature levels on Dx11 renderer.
* First attempt at README.md for slang-test.
* Tidied up the slang-test README.md file.
* Fix some small typos in tools/slang-test/main.cpp
* Fix spaces -> tabs problems.
Fix some small types.
|
|
* Remove support for the -no-checking flag
Fixes #381
Fixes #383
Work on #382
- No longer expose flag through API (`SLANG_COMPILE_FLAG_NO_CHECKING`) and command-line (`-no-checking`) options
- Remove all logic in `check.cpp` that was withholding diagnostics (including errors) when the no-checking mode was enabled
- Remove `HiddenImplicitCastExpr`, which was only created to support no-checking mode (it represented an implicit cast that our checking through was needed, but couldn't emit because it might be wrong)
- Remove logic for storing function bodies as raw token lists when checking is turned off. I'm leaving in the `UnparsedStmt` AST node in case we ever need/want to lazily parse and check function bodies down the line.
- Remove a few of the code-generation paths we had to contend with, but keep the comment about them in place.
- Remove GLSL-based tests that can't meaningfully work with the new approach.
- Fix other tests that used a GLSL baseline so that their GLSL compiles with `-pass-through glslang` instead of invoking `slang` with the `-no-checking` flag.
- Remove tests that were explicitly added to test the "rewriter + IR" path, since that is no longer supported.
There is more cleanup that can be done here, now that we know that AST-based rewrite and IR will never co-exist, but it is probably easier to deal with that as part of removing the AST-based rewrite path.
We've lost some test coverage here, but actually not too much if we consider that we are dropping GLSL input anyway.
* Fixup: test runner was mis-counting ignored tests
* Fixup: turn on dumping on test failure under Travis
* Fixup: enable extensions in Linux build of glslang
|
|
* Basic fixes to gets some Vulkan GLSL out of the IR path
We haven't been paying much attention to the Vulkan output from the IR path, but that needs to change ASAP. This commit really just implements quick fixes, without concern for whether they are a good fit in the long term.
- Add some more mappings from D3D `SV_*` semantics to built-in GLSL variables, and stop redeclaring those built-in variables in our output GLSL.
- Add custom output logic for HLSL `*StructuredBuffer<T>` types, so that they emit as `buffer` declarations with an unsized array inside. This has some real limitations:
- What if the user passes the type into a function? The parameter should be typed as an (unsized) array, and not a buffer.
- What happens if we have an array of structured buffers? We need to declare an array of blocks (which GLSL allows), but this changes the GLSL we should emit when indexing.
- Customize the way that we emit entry point attributes (e.g., `[numthread(...)]`) to also support outputting equivalent GLSL `layout` qualifiers.
In many of these cases, a better fix might involve doing more of this work in the IR as part of legalization (e.g., we already have a pass that deals with varying input/output for GLSL, so that should probalby be responsible for swapping the `SV_*` to `gl_*`, especially in cases where the types don't match perfectly across langauges).
* Start adding Vulkan support to render-test
- Add both Vulkan and D3D12 as nominally supported back-ends
- Add a git submodule to pull in the Vulkan SDK dependencies
- I don't want our users to have to install it manually, since the SDK is huge
- Checking in the binaries to our main repository seems like a bad idea, but my hope is that we can prune the bloat using a subodule with the `shallow` cloning option
- Implement enough logic for the Vulkan back-end to get a single test passing on Vulkan
* Fix warning
* Fixup: disable new compute tests for Linux
* Fixup: ignore Vulkan tests on AppVeyor
* Dynamically load Vulkan implementation
Rather than statically link to the Vulkan library, we will dynamically load all of the required functions.
This removes the need to have the stub libs involved at all.
* Remove vulkan submodule
I had set up a `vulkan` submodule to pull in the headers and stub libs, but now that we are going to dynamically load all the symbols anyway, the stub lib binaries aren't needed and we can just commit the headers.
* Add Vulkan headers to external/
|
|
The big picture here is that the AST-to-AST pass in `ast-legalize` will now detect when a declaration being referenced comes from an `import`ed module, and (if IR codegen is enabled), it will trigger cloning of the IR for the chosen symbol into an IR module that will sit alongside the legalized AST.
Then, during HLSL/GLSL code emit, we emit all the IR-based code first, and then the AST-based code. Whenever the AST code references a symbol that was lowered via IR (we keep track of these) we emit the mangled name of the IR symbol.
Notes/details:
- A lot of the logic for cloning IR symbols referenced by the AST matches the same logic that would clone them for completely IR-based codegen, so I tried to hoist out the common logic and share it (e.g., so that we apply the same guaranteed transformations in both cases). This required basically rewriting the logic in `emit.cpp` that decomposed the various cases.
- There is a new compute test case added to test this functionality. `tests/compute/rewriter.hlsl` confirms that we can use the `-no-checking` mode for the HLSL code, but still make use of a library of Slang code that employs generics, etc.
- Adding this test case required adding a new compute test mode that invokes `render-test` with the `-hlsl-rewrite` flag.
- It turns out that the existing `tests/render/cross-compile0.hlsl` test should have been using this functionality already. It was opting into the use of the IR via `-use-ir`, and the `render-test` application already tries to set `-no-checking` for non-Slang input languages by default. Fixing the code path this test triggers means that it is now a second test of rewriter+IR codegen.
- The `translateDeclRef` logic in `ast-legalize.cpp` seemed sloppy in places, and would potentially clone declarations, when declaration references were desired. I tried to clean a bit of this up, so some call sites are now changed.
- This change tries to clean up some work around cloning of global values
- All global value kinds (not just functions) now go through the logic of trying to pick a "best" definition, so that they can be used when we are linking multiple modules
- The logic for registering cloned values has been unified a bit, so that clients always pass in an `IROriginalValuesForClone` that either wraps a single value (maybe just null), or an `IRSpecSymbol*` that gives a list of values to regsiter the new value as a clone for.
- I made one piece of code that was cloning witness tables as part of generic specializations *not* register a clone. I think this is correct because we may specialize the same generic multiple ways, so registering any values we clone is not the right idea, but I might be missing something...
- I also reorganized this logic so that it would be easier to clone a global value when we only know its mangled name (which is the case when it is the AST that triggers cloning)
- I made sure that when loading a module via `import`, the translation unit for the new module copies the `-use-ir` flag from the overall compile request, if it is present (otherwise we wouldn't generate IR for loaded modules at all... oops).
- Note that `getSpecializedGlobalValueForDeclRef()`, which is the main routine used by the AST legalization to trigger cloning of an IR value does *not* currently handle declaration references that require specialization.
- This change does *not* deal with trying to unify the type legalization logic between the AST-to-AST rewriter and the IR-based codegen, so if you call an imported function with types that require legalization, Bad Things are expected to happen right now.
|
|
* Don't auto-enable IR use for compute tests
The `COMPARE_COMPUTE` and `COMPARE_RENDER_COMPUTE` test fixtures were set up to always enable the `-use-ir` flag on Slang, which precludes having any tests that confirm functionality on the old non-IR path (which is still required by our main customer).
This change adds the `-xslang -use-ir` flags explicitly to any compute test cases that left them out, and makes the fixture no longer add it by default.
* Continue building out parameter block support
The initial front-end logic for parameter blocks was already added, but they are still missing a bunch of functionality. This change addresses some of the known issues:
- Bug fix: don't try to emit HLSL `register` bindings for variables that consume whole register spaces/sets
- Overhaul type layout logic so that it can make decisions based on a given code generation target (currently passed in as a `TargetRequest`), which allows us to decide whether or not a parameter block should get its own register set on a per-target basis.
- Always use a register space/set for Vulkan
- Never use a register space/set for HLSL SM 5.0 and lower
- By default, don't use register spaces/sets for HLSL output
- Add a command-line flag and some "target flags" to enable register-space usage for D3D targets
- Hackily add initial support for parameter blocks in the AST-to-AST path
- This just blindly lowers `ParameterBlock<T>` to `T`, which shouldn't quite work
- A more complete overhaul will probably need to wait until the AST-to-AST legalization is changed to use the `LegalType`s from the IR legalization pass.
- Add a compute-based test case to actually run code using parameter blocks
- This file runs test cases both with and without the IR
|
|
|
|
|
|
* Fix up test runner output for compute.
We want compute-based tests to produce a `.actual` file when compilation fails, so we can easily diagnose the issue. I thought I'd added this capability previous, but it seemst to not be present any more.
* Compute result types for constructor decls
Fixes #246
When the parser sees an `init()` declaration, it can't easily know what type is is supposed to return, so it leaves the type as NULL. This was causing some downstream crashes.
Rather than special-case every site that cares about the result type of a callable, we will instead ensure that we install an actual result type on an initializer/constructor as part of its semantic checking.
This code needs to handle both the case where the initializer is declared inside a type, as well as the case where it is declared inside an `extension`.
|
|
|
|
|
|
|
|
|
|
vertex/fragment shader pair, but instead of comparing the resulting framebuffer, it expects the test shader to write results into a UAV, and compares the pixel shader UAV output to the reference output.
|
|
|
|
|
|
|
|
inputs for running test shaders with arbitrary parameter definitions.
This commit contains the parser of the resource input definition.
|
|
* Fix up emission of shader parameter semantics when using IR
- Make sure to propagate entry point parameter layouts down to IR parameters when doing the initial cloning to form target-specific IR
- When layout information is present on an IR node, prefer to use that over the original high-level declaration for outputting semantics in final HLSL
- Fix up test runner to generate `.actual` files when running compute tests, in cases where the `render-test` application errors out (e.g., because of a Slang compilation error)
- Add a first test of generics functionality, to show that they generate valid code through the IR
- Right now this test is *not* using any "interesting" operations on the type parameter, so this is not a test that can confirm that interface constraints work
* fixup: skip compute tests when running on Linux
|
|
testing framework.
|
|
There are two big changes here:
- Add logic during the initial IR cloning pass for an entry point + target that tries to pick the best possible version of any target-overloaded function. This allows us to pick the intrinsic version of `saturate()` when compiling for HLSL output, but then pick the non-intrinsic version (that is implemented in terms of `clamp()`) when targetting GLSL.
- Add an initial specialization pass that tries to deal with generics. This required some fixing work to IR generation, so that we correctly generate explicit operations to specialize a generic for specific types (this is currently implemented as a `specialize` instruction that takes the generic to specialize plus a declaration-reference that represents the specialized form). With that work in place, we can scan for `specialize` instructions inside of non-generic functions, and use them to trigger generation of specialized code. We rely on the name-mangling scheme to help us find pre-existing specializations when possible.
There are also a bunch of cleanups encountered along the way:
- Don't use the explicit `layout(offset=...)` for uniforms, because it isn't supported by all current drivers. For now we will just assume that our layout rules compute the same values that the driver would for un-marked-up code. We can come back later and try to implement a workaround in the cases where this doesn't apply (e.g., by re-running the layout logic as part of emission, and dropping layout modifiers from variables that don't need explicit layout).
- Fix some issues in IR dump printing so that we print function declarations more nicely.
- Testing: print out failing pixel when image-diff fails
|
|
Move reflection JSON generation into separate test fixture
|
|
When outputting GLSL from a Slang or HLSL entry point, we need to translate any parameters or results of an entry-point function into global declarations of `in` or `out` parameters, as needed by GLSL.
This change adds these transformations at the IR level, so that they don't need to complicate the emit logic.
More detailed changes:
- Make `render0` test use IR. It passes out of the box.
- Fix test runner to not always dump diffs on failures
I accidentally initialized an option to `true` instead of `false` when working on debugging the Travis CI failures.
- Special-case output for component-wise multiplication to handle GLSL `matrixCompMul()`
- Handle GLSL vs. HLSL output for calls to `mul()`
- Output proper `layout(std140)` on GLSL constant buffer declarations
- Require appropriate GLSL extension when emitting explicit `layout(offset = ...)` on constant buffer members
- TODO: Need to avoid requiring this extension in cases where the offsets are what would be computed anyway.
Realistically, should probably be emitting code with explicit padding, etc. to guarantee layouts.
- Add an IR-based pass to translate entry point functions by eliminating their input/output parameters and replacing them with global variables.
- Demangle names when calling target intrinsics
The lowering to the IR will turn a call like `sin(foo)` into a call to a function declaration with a mangled name like `_S3sin...`. This works fine when the user is calling their own functions, since the name mangling will apply to both the definition and use sites, but for builtin functions it obviously isn't what we want.
This change makes it so that we demangle the name of an instrinsic function just enough so that we can extract the original simple name, and make a call using that.
These changes do nor provide 100% of what we need when translating to GLSL, so the `cross-compile-entry-point` test *still* hasn't been flipped over to use the IR (even though that is the test case I've been using to develop these changes).
|
|
* Attempt to fix subprocess handling for Linux
Our CI builds are sometimes hanging on Travis, and I suspect it might be something to do with how we are waiting for subprocesses to complete. I'm trying to following the manpage for the `wait()` and `waitpid()` calls a bit better here.
* fixup: try to use poll() instead of select()
* fixup: missing header
* fixup
* fixup
* fixup: try to emit test output when tests fail on Travis
|
|
* Get tests running/passing under Linux
- Fix up `dlopen` abstraction
- Fix up some test cases to request hlsl (rather than default to dxbc) so they can run on non-Windows targets
- Fix up test runner ignore tests that can't run on current platform (and not count those as failure)
- Fix file handle leeak in process spawner absttraction
- Get additional test-related applications building
- More tweaks to Travis script; in theory deployment is set up now (yeah, right)
* fixup
* fixup: Travis environment variable syntax
* fixup: Buffer->begin
* fixup: actually run full tests on one config
* fixup: add build status badge for Travis
|
|
At a high level, this commit adds two things:
1. A "bytecode" format for serializing Slang IR instructions and related structure (functions, "registers")
2. A virtual machine that can load and then execute code in that bytecode format.
The reason for kicking off this work right now is that we *need* a way to run tests on Slang code generation that doesn't rely on having a GPU present (given that our CI runs on VM instances without GPUs), nor on textual comparison to the output of other compilers. With these features I've implemented a slapdash `slang-eval-test` test fixture that can run a (trivial) compute shader to very our compilation flow through to bytecode.
Some key design constraints/challenges:
- The bytecode format should be "position independent" so that a user can just load a blob of data and then inspect it without having to deserialize into another format, allocate memory, etc. Eventually the bytecode format might be a replacement for out current reflection API (we used to base reflection off a similar format, but the cost/benefit wasn't there at the time and we switched to just using the AST).
- The VM should be able to execute bytecode functions without doing any per-operation translation, JIT, etc. (translation of more coarse-grained symbols is okay). For now the VM is just being used to run tests, but eventually I'd like it to be viable for:
- Running Slang-based code in the context of the compiler itself. This starts with stuff like constant-folding in the front-end, but could expand to more general metaprogramming features.
- Running Slang-based ocde within a runtime application (e.g., a game engine) that wants to be able to run things like "parameter shader" code, or even just evaluate compute-like code on CPU (e.g., when supporting particles on both CPU and GPU).
- Finally, the bytecode format should ideally be able to round-trip back to the IR without unacceptable loss of information. This requirement and the previous one play off of each other, because things like a traditional SSA phi operation is ugly when you have to actually *execute* it. This doesn't matter right now when we don't have SSA yet, but it might be part of the decision-making here.
The actual implementation is centralized in `bytecode.{h,cpp}` and `vm.{h.cpp}`.
Big picture notes:
- The space of opcodes is shared between IR and bytecode (BC), with the hope that this makes translation of operations between the two easy.
- The actual bytecode instruction stream relies on a variable-length encoding for integer values, including opcodes and operand numbers, so that the common case is single-byte encoding.
- In the long term I intend to have a rule that if you use a single-byte encoding for an opcode, then all operands are required to use single-byte encodings too. Operations that need multi-byte operands would then be forced to use a multi-byte encoding of the op, and would be sent down a slower path in the interpeter.
- The "bytecode"'s outer structure is based on ordinary data structures linked with pointers, but they are "relative pointers" so the actual structure is position-independent.
- There are two main kinds of operands: registers and "constants." An operand is a signed integer where non-negatie values indicate registers (with `index == operandVal`) and negative values indicate constants (with `index == ~operandVal`).
- Registers are stored in the "stack frame" for a VM function call, and each has a fixed offset based on the size of the type and those that come before it. Conceptually, registers are allowed to overlap if they aren't live at the same time, and we manage this with a simple stack model: every register is supposed to identify the register that comes directly before it (this isn't implemented yet).
- "Constants" are more realistically a representation of "captured" values, but they are currently also how constants come in. Basically we can use a compact range of indices in the bytecode for a function, and each of these indices indirectly refers to some value in the next outer scope.
- The actual encoding of bytecode instructions right now is largely ad-hoc and very wasteful (we encode the type on everything, and we also encode everything as if it had varargs).
- In some cases, an instruction needs to know the types of the values involved (e.g., because it needs to load an array element, which means copying a number of bytes based on the size). The way the VM works we have types attached to our registers, so we currently get sneaky and look at those types in some ops. Longer term is makes sense to encode the required type info directly in the BC.
- There's a whole lot of hand-waving going on with how the actual top-level bytecode module gets loaded, because of the way we currently treat the top-level module as an instruction stream in the IR. This means that we try to represent the loaded module as a "stack frame" for a call to the module as a function, but that approach as serious problems, and isn't realistically what we want to do.
|
|
|
|
Fixes #104
- Map HLSL `nointerpolation` to GLSL `flat`
- When lowering a `struct` type varying input/output, look for interpolation modifiers along the "chain" from the leaf field up to the original shader input variable (and take the first one found)
- Not sure if this is strictly needed, but it seems like a reasonable policy
- Add `flat` to varying input of integer type, with no other interpolation modifier
- Note: I do *not* do anything to ignore a manually imposed interpolation modifier that might be incorrect
|
|
- `RefPtr` no longer tries to have distinct cases for interal-vs-external reference counts. Instead we always require an internal reference count.
- Types the used `RefPtr` but weren't `RefObject` were made to inherit `RefObject`
- The `ReferenceCounted` base class was removed, so that only `RefObject` remains
- Implicit conversion from `RefPtr<T>` to `T*` added
- This created some complicates for other types that relied on implicit conversions, so this isn't a net cleanup right now
- The main type that got messed up by the above was `String`, which previously held a `RefPtr<char, ...>`. This change thus *also* includes a major overhaul of `String`:
- `String` now holds all its data via indirection, using a `StringRepresentation` that is a `RefObject`. This object holds a length, capacity, and directly stores the character data in its allocation. This means that `sizeof(String)==sizeof(void*)`
- It is now possible to directly mutate a `String` by appending to its representation (we just need to ensure it has a reference count of `1`, possibly by cloning it). This means that `StringBuilder` is now basically just an idomatic use of `String`
- A couple operations that just return sub-ranges of a `String` now return `StringSlice` to avoid allocation when it isn't needed. This required more work.
- Indices into strings changed from `int` to `UInt` (which is pointer-sized). This had a bunch of follow-on changes because the value `-1` sometimes needs to be special-cased in code that uses indices. Further cleanups are probably needed here.
|
|
|
|
|
|
Getting rid of more namespace complexity and stripping things down to the basics.
This also gets rid of some dead code in the "core" library.
|
|
This includes a bunch of related changes:
- `slang-test`
- Add a notion of an "output mode" that specifies whether we output to console (the default), or invoke the apprpriate AppVeyor command to update test status
- Add a notion of test categories, so that tests can be tagged with categories, and then we can invoke only those tets in a given category, or choose to *exclude* tests with specific categories
- Allow the `OSProcessSpawner` to look up an executable by "path" (meaning a full path is expected) or by "name" (meaning it should be allowed to look in the current directory, `PATH` environment variable, etc.). This was important to make sure that I can run `appveyor` without having to know its absolute path.
- AppVeyor configuration
- Change badge to reflect new build account for organization (rather than a single-user account)
- Remove attempt to set AppVeyor build version in a clever way, since it breaks links from GitHub to AppVeyor
- Change order or configurations in the build matrix to front-load the Release build (which has the main tests)
- Turn on `fast_finish` flag so we don't have to wait as long for failed builds
- Turn on `parallel` builds
- Set `verbosity: minimal` to avoid getting build spew about Xamarin stuff I'm not using
- Add custom `test_script` to invoke `test.bat`
- Sets the test category based on teh build configuration, so we don't run the full test suite on every input.
- `test.bat`
- Allow for `-platform` and `-configuration` arguments
- Rewrute a platform of `Win32` over to `x86` to match how the output directories are named
- Futz around with how the directories are being passed along to work around annoying `.bat` file quoting behavior (I still don't get how batch files work)
- Tests
- Mark a bunch of tests as `smoke` tests
- Mark the relevant tests as `render` tests
(these get filtered out for AppVeyor builds)
|
|
Now if there are multiple tests listed for an input file like `foo.slang`, they will be output as:
passed test: 'foo.slang'
passed test: 'foo.slang.1'
passed test: 'foo.slang.2'
This isn't a perfect solution, but it should work well enough for now.
This change doesn't make any tests actually use the new capability.
|
|
This is a large change that contains many pieces:
- Update the `cross-compile0` test to actually make use of cross compilation.
Now the `cross-compile0.hlsl` file contains both HLSL and GLSL source code, and then imports code from `cross-compile0.slang`, which provides a "library" (one function) that can be shared between both the HLSL and GLSL version of things.
- Fixed a bug in the support for backslash-escaped newlines.
- Added a new `__import` declaration type (replaces the `using` directive that was still around in a vestigial form)
An `__import` causes the compiler to look for a Slang source file (currently using the ordinary `#include` lookup logic), and then parse/check the found file as an additional module ("translation unit"), before making its declarations visible in the current scope.
- Refactored the main compilation flow to be simpler. There were the `ShaderCompiler` and `ShaderCompilerImpl` classes that weren't relaly doing anything, but added complexity to the whole workflow.
- The `render-test` application has been heavily modified to better support testing cross-compilation workflows. At the most basic level we are starting to distinguish pass-through vs. rewriter workflows, and are passing various `#define`s down to the compiler(s) to let the source code be customized as needed for each case.
Several annoying corner cases are caused here by having to support the GLSL compilation model, which really wants each entry point in its own specific translation unit, whereas we really want to keep things nicely contained in single files.
- Added support for `__intrinsic` operations to have target-specific behavior.
This allows a function to be given a different name for some specific target (so a call gets emitted as a call to that other operation).
More generally, the library writer can put together an arbitrary format string that will be used in place of expressions that call the given function, e.g.:
__intrinsic(hlsl, "$1 - $0") __intrinsic int foo(int a, int b);
Given this declaration, a call like `foo(x,y)` will code generate as `x - y` for HLSL, and as `foo(x,y)` for all other targets.
Annoying things still to be dealt with:
- The way that I'm filtering the user-provided options when passing things down to the compilation of dynamically loaded modules is a bit ad hoc. It would be good to have a systematic notion of which options will be inherited and which won't. There is also more code duplication than I'd like, so we risk having the compiler behave differently when compiling a file at the top level, vs. because of `__import`.
- Adding target-specific behavior to intrinsics is all well and good, but the current approach means we can only add this to the original declaration, which limits the ability to easily extend the set of targets.
A better approach long-term would be to add a more robust notion of target-based overload resolution (which would happen after semantic checking). Then one mechanism would be used to find the right target-specific overload to use for an operation, and then each (target-specific) definition could use a simpler attribute to intercept code-generation behavior.
Note that we might eventually need a similar notion to deal with stage- or profile-specific functions and the overloading behavior around them, so using this for intrinsics doesn't seem like a bad idea.
|
|
The test case that is there right now is nominally a cross-compilation test, but for right now it uses the preprocessor to present completely different code for HLSL and GLSL compilation.
This change is really just fleshing out the OpenGL side of `render-test` enough that it can produce images using OpenGL to enable further testing.
|
|
All of this is just related to cruft left over from the old project setup.
|
|
|