slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2017-10-25	ignore RENDER_COMPUTE test cases in linux build.	YONGH\yongh

2017-10-25	render-test code cleanup	YONGH\yongh

2017-10-25	fix d3d11 usage	YONGH\yongh

2017-10-25	test	YONGH\yongh

2017-10-25	test	YONGH\yongh

2017-10-25	test	YONGH\yongh

2017-10-25	test	YONGH\yongh

2017-10-25	test	YONGH\yongh

2017-10-25	test	YONGH\yongh

2017-10-25	add new test mode: COMPARE_RENDER_COMPUTE, which runs a input ↵	YONGH\yongh
	vertex/fragment shader pair, but instead of comparing the resulting framebuffer, it expects the test shader to write results into a UAV, and compares the pixel shader UAV output to the reference output.
2017-10-25	finish up opengl renderer implementation for input resource binding.	YONGH\yongh

2017-10-23	bug fix	Yong He

2017-10-23	test 7	Yong He

2017-10-23	test 6	Yong He

2017-10-23	test 5	Yong He

2017-10-23	test 4	Yong He

2017-10-23	try fix 3	Yong He

2017-10-23	try fix 2	Yong He

2017-10-23	test	Yong He

2017-10-23	attempt to fix	Yong He

2017-10-23	test	Yong He

2017-10-23	fix compute shader test result comparison	Yong He

2017-10-23	Merge https://github.com/shader-slang/slang	Yong He

2017-10-23	Work in-progress: simple compute test passed. (d3d renderer)	Yong He

2017-10-20	work in progress	YONGH\yongh

2017-10-20	in-progress work: allow render-test to generate and bind various resource ↵	YONGH\yongh
	inputs for running test shaders with arbitrary parameter definitions. This commit contains the parser of the resource input definition.
2017-10-20	Fix up emission of shader parameter semantics when using IR (#226)	Tim Foley
	* Fix up emission of shader parameter semantics when using IR - Make sure to propagate entry point parameter layouts down to IR parameters when doing the initial cloning to form target-specific IR - When layout information is present on an IR node, prefer to use that over the original high-level declaration for outputting semantics in final HLSL - Fix up test runner to generate `.actual` files when running compute tests, in cases where the `render-test` application errors out (e.g., because of a Slang compilation error) - Add a first test of generics functionality, to show that they generate valid code through the IR - Right now this test is not using any "interesting" operations on the type parameter, so this is not a test that can confirm that interface constraints work * fixup: skip compute tests when running on Linux
2017-10-19	Support running and comparing execution results of compute shaders in ↵	YONGH\yongh
	testing framework.
2017-10-19	Reflection: allow querying of semantics on varying input/output (#224)	Tim Foley
	This is functionality required to support a Falcor bug fix. Most of the code to compute the right semantic name/index for a parameter was already present. This change adds: - Storage for semantic name/index on every `VarLayout` - Note: this is wasteful and should be optimized later - A public API to query the semantic name/index - The contract is that this API returns `NULL` if the parameter had no semantic - A bit of work in `parameter-binding.cpp` to attach semantics to varying input/output when traversing varying parameters. - Note: this is intentionally set up so that it associates semantics even with non-leaf parameters, so that an API user can query the semantic of a `struct` parameter and know that its members will be assigned sequential semantic indices from its starting value. - Support for dumping this information in reflection tests One notable thing that I did not change here is that the reflection test fixture doesn't report information on the output of an entry point, even though it really should. That should be fixed in a separate change, though, because it would affect many of the expected outputs.
2017-10-18	Work on IR-based cross-compilation (#222)	Tim Foley
	There are two big changes here: - Add logic during the initial IR cloning pass for an entry point + target that tries to pick the best possible version of any target-overloaded function. This allows us to pick the intrinsic version of `saturate()` when compiling for HLSL output, but then pick the non-intrinsic version (that is implemented in terms of `clamp()`) when targetting GLSL. - Add an initial specialization pass that tries to deal with generics. This required some fixing work to IR generation, so that we correctly generate explicit operations to specialize a generic for specific types (this is currently implemented as a `specialize` instruction that takes the generic to specialize plus a declaration-reference that represents the specialized form). With that work in place, we can scan for `specialize` instructions inside of non-generic functions, and use them to trigger generation of specialized code. We rely on the name-mangling scheme to help us find pre-existing specializations when possible. There are also a bunch of cleanups encountered along the way: - Don't use the explicit `layout(offset=...)` for uniforms, because it isn't supported by all current drivers. For now we will just assume that our layout rules compute the same values that the driver would for un-marked-up code. We can come back later and try to implement a workaround in the cases where this doesn't apply (e.g., by re-running the layout logic as part of emission, and dropping layout modifiers from variables that don't need explicit layout). - Fix some issues in IR dump printing so that we print function declarations more nicely. - Testing: print out failing pixel when image-diff fails
2017-10-16	Implement notion of a "container format" (#213)	Tim Foley
	The big addition here is that the Slang "bytecode" is no longer treated as just a "code generation target" (`CodeGenTarget`) akin to DX bytecode (DXBC) or SPIR-V, but instead is a `ContainerFormat` that can be used to emit all the results of a compile request (well, currently just the IR-as-BC, but the intention is there). Getting to this goal involved some prior checkins that eliminated bogus "targets" that weren't really akin to SPIR-V or DXBC: `-target slang-ir-asm` and `-target reflection-json`. Those targets were really in place to support testing, and so they've been made more explicit testing/debug options. This change eliminates `-target slang-ir` and instead tries to allow the user to specify `-o foo.slang-module` as an output file name, that indicates the intention to output a "container" file that will wrap up all the generated code. I've also gone ahead and generalized the existing `-target` option so that we are actually building up a list of code generation targets. This is largely just a cleanup, since it forces code to be more aware of when it is doing something target-specific vs. target independent. For example, reflection layout information lives on a requested target, and not on the compile request as a whole, and similarly output code is per-target, per-entry-point. As a cleanup, I eliminated support for per-translation-unit output. This was vestigial code from back when I used to try and do HLSL generation for a whole translation unit instead of per-entry-point (which turned out to be a lot of complexity for little gain), and it was only being used in the `hello` example and the `render-test` test fixture - in both cases fixing it up was easy enough. I've stubbed out the old `spGetTranslationUnitSource` API, but haven't removed it yet.
2017-10-13	Move reflection JSON generation into separate text fixture (#211)	Tim Foley
	Move reflection JSON generation into separate test fixture
2017-10-06	Perform some transformations on IR to legalize for GLSL (#200)	Tim Foley
	When outputting GLSL from a Slang or HLSL entry point, we need to translate any parameters or results of an entry-point function into global declarations of `in` or `out` parameters, as needed by GLSL. This change adds these transformations at the IR level, so that they don't need to complicate the emit logic. More detailed changes: - Make `render0` test use IR. It passes out of the box. - Fix test runner to not always dump diffs on failures I accidentally initialized an option to `true` instead of `false` when working on debugging the Travis CI failures. - Special-case output for component-wise multiplication to handle GLSL `matrixCompMul()` - Handle GLSL vs. HLSL output for calls to `mul()` - Output proper `layout(std140)` on GLSL constant buffer declarations - Require appropriate GLSL extension when emitting explicit `layout(offset = ...)` on constant buffer members - TODO: Need to avoid requiring this extension in cases where the offsets are what would be computed anyway. Realistically, should probably be emitting code with explicit padding, etc. to guarantee layouts. - Add an IR-based pass to translate entry point functions by eliminating their input/output parameters and replacing them with global variables. - Demangle names when calling target intrinsics The lowering to the IR will turn a call like `sin(foo)` into a call to a function declaration with a mangled name like `_S3sin...`. This works fine when the user is calling their own functions, since the name mangling will apply to both the definition and use sites, but for builtin functions it obviously isn't what we want. This change makes it so that we demangle the name of an instrinsic function just enough so that we can extract the original simple name, and make a call using that. These changes do nor provide 100% of what we need when translating to GLSL, so the `cross-compile-entry-point` test still hasn't been flipped over to use the IR (even though that is the test case I've been using to develop these changes).
2017-10-06	Attempt to fix subprocess handling for Linux (#197)	Tim Foley
	* Attempt to fix subprocess handling for Linux Our CI builds are sometimes hanging on Travis, and I suspect it might be something to do with how we are waiting for subprocesses to complete. I'm trying to following the manpage for the `wait()` and `waitpid()` calls a bit better here. * fixup: try to use poll() instead of select() * fixup: missing header * fixup * fixup * fixup: try to emit test output when tests fail on Travis
2017-10-04	IR: overhaul IR design/implementation (#195)	Tim Foley
	* IR: overhaul IR design/implementation Closes #192 Closes #188 This is a major overhaul of how the IR is implemented, with the primary goal of just using the AST-level type representation as the IR's type representation, rather than inventing an entire shadow set of types (as captured in issue #192). One consequence of this choice is that types in the IR are no longer explicit "instructions" and are not represented as ordinary operands (so a bunch of `+ 1` cases end up going away when enumerating ordinary operands). Along the way I also got rid of the embedded IDs in the IR (issue #188) because this wasn't too hard to deal with at the same time. Another related change was to split the `IRValue` and `IRInst` cases, so that there are values that are not also instructions. Non-instruction values are now used to represent literals, references to declarations, and would eventually be used for an `undef` value if we need one. IR functions, global variables, and basic blocks are all values (because they can appear as operands), but not instructions. The main benefit of this approach is that the top-level structure of a bytecode (BC) module is much simpler to understand and walk, and BC-level types are represented much more directly (such that we could conceivably use them for reflection soon). * fixup: 64-bit build fix * fixup: try to silence clang's pedantic dependent-type errors * fixup: bug in VM loading of constants
2017-09-29	Get tests running/passing under Linux (#194)	Tim Foley
	* Get tests running/passing under Linux - Fix up `dlopen` abstraction - Fix up some test cases to request hlsl (rather than default to dxbc) so they can run on non-Windows targets - Fix up test runner ignore tests that can't run on current platform (and not count those as failure) - Fix file handle leeak in process spawner absttraction - Get additional test-related applications building - More tweaks to Travis script; in theory deployment is set up now (yeah, right) * fixup * fixup: Travis environment variable syntax * fixup: Buffer->begin * fixup: actually run full tests on one config * fixup: add build status badge for Travis
2017-09-27	First attempt at a Linux build (#193)	Tim Foley
	* First attempt at a Linux build - Fix up places where C++ idioms were written assuming lenient behavior of Microsoft's compiler - Add a few more alternatives for platform-specific behavior where Windows was the only platform accounted for. - Add a basic Makefile that can at least invoke our build, even if it isn't going good dependency tracking, etc. - Build `libslang.so` and `slangc` that depends on it, using a relative `RPATH` to make the binary portable (I hope) - Add an initial `.travis.yml` to see if we can trigger their build process. * Fixup: const bug in `List::Sort` I'm not clear why this gets picked up by the gcc and clang that Travis uses, but not the (newer) gcc I'm using on Ubuntu here, but I'm hoping it is just some missing `const` qualifiers. * Fixup: reorder specialization of "class info" Clang complains about things being specialized after being instantiated (implicilty), and I hope it is just the fact that I generate the class info for the roots of the hierarchy after the other cases. We'll see. * Fixup: add `platform.cpp` to unified/lumped build * Fixup: Windows uses `FreeLibrary` and not `UnloadLibrary` * Fixup: fix Windows project file to include new source file This obviously points to the fact that we are going to need to be generating these files sooner or later.
2017-09-25	Fixup: deal with hitting `.obj` size limits for VS	Tim Foley
	When using the lumped/"unity" build approach for Slang, the resulting `.obj` files run into number-of-sections limits in the VS linker. For now I'm using the `/bigobj` command-line flag to work around this for the `hello` example, just so I can be sure the lumped build still works, but longer term it seems like we need to just drop that approach anyway. The `render-test` application was switched to link against `slang.dll` since there is no reason to have multiple apps use the lumped approach.
2017-09-21	Use WARP for D3D rendering tests.	Tim Foley
	This should in principle allow for the D3D-only tests to run on AppVeyor so that we can validate running things for CI purposes (and is probably a whole lot easier than trying to plug the VM up to the rendering tests). I've switched up one of the tests so that it should run even on AppVeyor, so fingers crossed that it will actually run.
2017-09-21	Initial work on a "VM" for Slang code (#189)	Tim Foley
	At a high level, this commit adds two things: 1. A "bytecode" format for serializing Slang IR instructions and related structure (functions, "registers") 2. A virtual machine that can load and then execute code in that bytecode format. The reason for kicking off this work right now is that we need a way to run tests on Slang code generation that doesn't rely on having a GPU present (given that our CI runs on VM instances without GPUs), nor on textual comparison to the output of other compilers. With these features I've implemented a slapdash `slang-eval-test` test fixture that can run a (trivial) compute shader to very our compilation flow through to bytecode. Some key design constraints/challenges: - The bytecode format should be "position independent" so that a user can just load a blob of data and then inspect it without having to deserialize into another format, allocate memory, etc. Eventually the bytecode format might be a replacement for out current reflection API (we used to base reflection off a similar format, but the cost/benefit wasn't there at the time and we switched to just using the AST). - The VM should be able to execute bytecode functions without doing any per-operation translation, JIT, etc. (translation of more coarse-grained symbols is okay). For now the VM is just being used to run tests, but eventually I'd like it to be viable for: - Running Slang-based code in the context of the compiler itself. This starts with stuff like constant-folding in the front-end, but could expand to more general metaprogramming features. - Running Slang-based ocde within a runtime application (e.g., a game engine) that wants to be able to run things like "parameter shader" code, or even just evaluate compute-like code on CPU (e.g., when supporting particles on both CPU and GPU). - Finally, the bytecode format should ideally be able to round-trip back to the IR without unacceptable loss of information. This requirement and the previous one play off of each other, because things like a traditional SSA phi operation is ugly when you have to actually execute it. This doesn't matter right now when we don't have SSA yet, but it might be part of the decision-making here. The actual implementation is centralized in `bytecode.{h,cpp}` and `vm.{h.cpp}`. Big picture notes: - The space of opcodes is shared between IR and bytecode (BC), with the hope that this makes translation of operations between the two easy. - The actual bytecode instruction stream relies on a variable-length encoding for integer values, including opcodes and operand numbers, so that the common case is single-byte encoding. - In the long term I intend to have a rule that if you use a single-byte encoding for an opcode, then all operands are required to use single-byte encodings too. Operations that need multi-byte operands would then be forced to use a multi-byte encoding of the op, and would be sent down a slower path in the interpeter. - The "bytecode"'s outer structure is based on ordinary data structures linked with pointers, but they are "relative pointers" so the actual structure is position-independent. - There are two main kinds of operands: registers and "constants." An operand is a signed integer where non-negatie values indicate registers (with `index == operandVal`) and negative values indicate constants (with `index == ~operandVal`). - Registers are stored in the "stack frame" for a VM function call, and each has a fixed offset based on the size of the type and those that come before it. Conceptually, registers are allowed to overlap if they aren't live at the same time, and we manage this with a simple stack model: every register is supposed to identify the register that comes directly before it (this isn't implemented yet). - "Constants" are more realistically a representation of "captured" values, but they are currently also how constants come in. Basically we can use a compact range of indices in the bytecode for a function, and each of these indices indirectly refers to some value in the next outer scope. - The actual encoding of bytecode instructions right now is largely ad-hoc and very wasteful (we encode the type on everything, and we also encode everything as if it had varargs). - In some cases, an instruction needs to know the types of the values involved (e.g., because it needs to load an array element, which means copying a number of bytes based on the size). The way the VM works we have types attached to our registers, so we currently get sneaky and look at those types in some ops. Longer term is makes sense to encode the required type info directly in the BC. - There's a whole lot of hand-waving going on with how the actual top-level bytecode module gets loaded, because of the way we currently treat the top-level module as an instruction stream in the IR. This means that we try to represent the loaded module as a "stack frame" for a call to the module as a function, but that approach as serious problems, and isn't realistically what we want to do.
2017-09-11	Initial work on boilerplate code generator	Tim Foley
	The goal here is to get the Slang "standard library" code out of string literals and into something a bit more like an actual code file. This is handled by having a `slang-generate` tool that can translate a "template" file that mixes raw Slang code (or any language we want to generate...) with generation logic that is implemented in C++ (currently). This work isn't final by any stretch of the imagination, but it moves a lot of code and not merging it ASAP will complicate other changes. My expectation is that the generator tool will be beefed up on an as-needed basis, to get our stdlib code working. Similarly, the stdlib code does not really take advantage of the new approach as much as it could. That is something we can clean up along the way as we do modifications of the stdlib.
2017-08-25	Try to print failure from AppVeyor tests	Tim Foley

2017-07-19	Build a dynamic library for Slang	Tim Foley
	- Change the `slang` project from a static library to a dynamic one - Add some details around `slang.h` to make sure DLL export stuff is working - Make the `slangc` executable use the dynamic library - Rename the `glslang` sub-project to `slang-glslang` and move it into the main source hierarchy - This reflects the fact that it isn't a stand-alone tool, and isn't in any way a standard binary of glslang, but rather just an artifact of how Slang uses glslang
2017-07-18	Add a compile-time loop construct to Slang	Tim Foley
	The basic syntax is: $for(i in Range(0,99)) { /* stuff goes here / } Note that the exact form is very restrictive. All that you are allowed to change is `i`, `0`, `99` or `/ stuff goes here /`. As a tiny bit of syntax sugar, the following should work: $for(i in Range(99)) { / stuff goes here / } Note that the range given is half-open (C++ iterator `[begin,end)` style). Both the beginning and end of the range must be compile-time constant expressions that Slang knows how to constant-fold. The implementation will basically generate code for `/ stuff goes here */` N times, once for each value in the half-open range. Each time, the variable `i` will be replaced with a different compile-time-constant expression. While I was working on a test case for this, I also found that our build of glslang had an issue with resource limits, so I fixed that. Clients will need to build a new glslang to use the fix.
2017-07-17	Handle `flat` interpolation cases in cross compilation	Tim Foley
	Fixes #104 - Map HLSL `nointerpolation` to GLSL `flat` - When lowering a `struct` type varying input/output, look for interpolation modifiers along the "chain" from the leaf field up to the original shader input variable (and take the first one found) - Not sure if this is strictly needed, but it seems like a reasonable policy - Add `flat` to varying input of integer type, with no other interpolation modifier - Note: I do not do anything to ignore a manually imposed interpolation modifier that might be incorrect
2017-07-14	Don't use "auto locations" mode in glslang	Tim Foley
	Fixes #88 The code was using `glslang::TShader::setAutoMapLocations` as a workaround for an old glslang issue, but apparently this mode has a bug where it ends up applying a `location` layout to the implicitly-created `gl_PerVertex` definition (which shouldn't be allowed). This change drops the call to `setAutoMapLocations` (and `setAutoMapBindings`) since it should no longer be required.
2017-07-13	Add support for dumping intermediates for debugging.	Tim Foley
	Calling: spSetDumpIntermedites(compileRequest, true); will set up a mode where Slang tries to dump every intermediate HLSL, GLSL, DXBC, SPIR-V, etc. file it generates. If SPIR-V or DXBC is requested then we also dump assembly of those. Right now the files are all named as `slang-<counter>.<ext>`, and get dropped in whatever the working directory is, but I'm open to ideas on how to improve that. Note: this change introduces a new binary interface to `glslang`, so pulling it requires an updated `glslang.dll`.
2017-07-10	Removed spGetTranslationUnitCode; Unified ↵	Kai-Hwa Yao
	EntryPointResult/TranslationUnitResult, added helper functionality; Ensure null termination when printing raw data
2017-07-10	Don't assume vector contains contents	Kai-Hwa Yao

2017-07-10	Allow glslang wrapper to output regular SPIRV before disassembly	Kai-Hwa Yao