slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2018-05-23	Fix Slang->GLSL translation for entry point with multiple `out` parameters ↵	Tim Foley
	(#573) Fixes #568 The problem occurs when an entry point declares multiple `out` parameters: ```hlsl void myVS( out float4 a : A, out float4 b : B ) { ... a = whatever; b = somethingElse; ... if(done) { return; // explicit return } ... // implicit return } ``` Slang translates code like this by introducing a GLSL global `out` parameter for each of `a` and `b`, rewriting the logic inside the entry point to use a local temporary instead of the real parameters, and then assigning from the locals to the globals at every `return` site: ```glsl out vec4 g_a; out vec4 g_b; void main() { // insertion location (1) vec4 t_a; vec4 t_b; ... t_a = whatever; t_b = somethingElse; ... if(done) { // insertion location(2) g_a = t_a; g_b = t_b; return; // explicit return } ... // insertion location (3) g_a = t_a; g_b = t_b; // implicit return } ``` Note that there are three different places (for this example) where code gets inserted to make the translation work. We insert declarations of local variables at the top of the function, and then insert the copy from the temporariesto the globals at each `return` site (implicit or explicit). The bug in this case was that the pass was setting the insertion location to (1) outside of the loop for parameters, so that when it was done with `a` and moved on to `b`, it would end up inseting the temporary `t_b` at the last location used (location (3) in this example), and this would result in invalid code, because `t_b` gets used before it is declared. This bug has been around for a while, but it has largely been masked by the fact that so few shaders use multiple `out` parameters, and also because Slang's SSA-ification pass would often be able to eliminate the local variable anyway, so that the bug never bites the user. The reason it surfaced now for a user shader was because we introduced `swizzledStore`, which currently inhibits SSA-ification, so that some temporaries that used to get eliminated are now retained so that they can break things. The fix in this case is small: we use the existing `IRBuilder` only for insertions at location (1) and construct a new builder on the fly for all the insertions at `return` sites. I have not included a test case yet, because our end-to-end Vulkan testing is not yet ready, so this may regress again in the future.
2018-05-23	When outputing a vector type with a size of 1 in GLSL, it needs to be output ↵	jsmall-nvidia
	as the underlying type. For example vector<float,1> should be output as float in GLSL. (#572)
2018-05-21	Handle structure initializers in IR type legalization (#567)	Tim Foley
	Fixes #566 The basic problem here is that the front-end translates a structure initializer-list expression into a `makeStruct` instruction (with one argument per field), but the IR type legalization logic wasn't handling the case where a `makeStruct` is used to construct a struct value that needs to get split by legalization. The implementation is relatively straightforward, and like the other cases of instruction legalization for compound types, it follows the shape of the `LegalType`/`LegalVal` cases. The one interesting bit is that we need to be a bit careful and filter the single argument list for `makeStruct` into two in the case where we generate a "pair" type for something that has both "ordinary" and "special" (resource) fields. Luckily the `PairInfo` data that was generated by type legalization has exactly the information we need (by design). This change does not address several issues that could be handled in follow-on changes: * The `makeArray` instruction will face similar issues if it is applied to a type that requires legalization: we'd need to turn an array of `LegalVal`s into a bunch of distinct arrays. * The error message when we hit the unimplemented case here isn't great. Ideally we should provide the line number of the instruction that fails in an error message when legalization fails. This change tries to focus narrowly on the bug at hand, and leave these issues for later changes.
2018-05-11	Generate Visual Studio projects using Premake (#557)	Tim Foley
	* Generate Visual Studio projects using Premake This change adds a `premake5.lua` file that allows us to generate our Visual Studio solution using Premake 5 (https://premake.github.io/). The existing Visual Studio solution/projects are now replaced with the Premake-generated ones, and project contributors will be expected to update these by running premake after adding/removing files. I have not changed the Linux `Makefile` build at all, because that file is also used for things like running our tests, so that clobbering it with a premake-generated `Makefile` would break our continuous testing. Hopefully future changes can switch to a generated `Makefile` and perhaps even add an XCode project as well. Notes: * The `build/slang-build.props` file is no longer needed/used, so it has been removed. * The `slang-eval-test` test fixture wasn't following our naming conventions for its directory path, so it was updated to streamline the Premake build configuration work. This required changes to the `Makefile` as well * Some seemingly unncessary preprocessor definitions that were specified for `core` and `slang-glslang` have been dropped. We will see if anything breaks from that. * Possible fixup for Premake vpath issue Premake's `vpath` feature seems to be nondeterministic about the order it applies filters (because Lua isn't deterministic about the order of entries in a key/value table), and as a result we can end up in a weird case where it decides that a `foo.cpp.h` file matches the `.cpp` filter (I'm not sure why) before it tests against the `.h` filter. This change uses an (undocumented) Premake facility to set `vpath` using a list of singleton tables, which seems to fix the order in which things get tested. * Remove support for "single-file" build of Slang The `hello` example was the only bit of code that uses the "single-file" way of building Slang, and this had already run up against limitations of the Visual Studio compilers in its Debug\|x64 build. Rather than mess with Premake to make it pass through the `/bigobj` linker flag that is needed to work around the issue, it makes more sense just to stop using/supporting the feature since we wouldn't want users to depend on it anyway (our documentation no longer refers to it). While I was at it I went ahead and made sure that the `SLANG_DYNAMIC` flag doesn't need to be set manually, so that instead there is a non-default `SLANG_STATIC` option (not that we have a static-library build of Slang at the moment).
2018-05-11	Cleanups around behavior when the compiler fails (#553)	Tim Foley
	* Cleanups around behavior when the compiler fails * Add another case where we try to `noteInternalErrorLoc()` if an exception in thrown. This one is the in the logic for emitting an IR instruciton. This could be improved by adding another layer at the function level (as a catch-all for instructions with no location), but something is better than nothing. * Change a bunch of `assert()`s over to `SLANG_ASSERT()`s, so that we can theoretically take more control over them (e.g., make release builds with asserts enabled) * Some other small cleanups around the assertions we perform. In the survey I made, I didn't really see many obvious "smoking gun" cases where we could produce a significantly better error message for some of the unimplemented/unexpected paths, other than to actually implement the missing functionality. * fixup
2018-05-11	Fixes #559 (#560)	Yong He

2018-05-10	Workaround for cases where we emit illegal-but-unused types	Tim Foley
	This is a quick workaround to deal with cases where we try to emit an unreferenced IR type that contains references to pre-legalization types (which might have been removed from the IR even thought they are still referenced). The basic fix is to not add types to our global order of instructions to emit by default, and only add them on demand as they are referenced by other instructions. This is not a real fix for the underlying issue, which is that type legalization is only being applied to a subset of global instructions instead of all of them. A more detailed fix for that problem will need to be devised next. This fix also doesn't address the question of why an unreferenced `struct` type came to be present in the IR code passed to the back-end in the first place. It would be good to understand how this scenario is arising.
2018-05-04	Re-enable emission of #line directives and clean up output (#554)	Tim Foley
	This was based on feedback from Falcor users, who felt like changing the default to have no line directives didn't work out well. Since I'd only made them disabled by default based on what I perceived to be Falcor's needs, I'm happy to turn this back on by default. I also added a few changes to clean up the output: * Don't emit a directive for a sub-expression, since that breaks up the code too much. The only directives inside a function body will be on top-level instructions that didn't get folded into a use site. * Add logic to emit a directive for top-level declarations (globals, functions, structs), and clean up their printing so that they put any extra space after the declaration rather than before (so the line numbers can be accurate) * Don't emit the file path part of a directive if it would be the same as the previous directive. This makes the output less noisy, at the cost of having to work your way backward to find the file if you are looking directly at the output. There are certainly more cleanups possible, but these make the output decent enough to be useful for working backwards from a downstream compiler error to the offending code.
2018-05-04	Allow more complex compound expressions when emitting from IR (#552)	Tim Foley
	The emit logic already had an idea of when an instruction should be "folded" it its use site(s), and this change just expands on that logic to try to be more aggressive. The basic idea is that instead of outputting this: ```hlsl float4 _S3 = a_0 + b_0; float4 _S4 = c_0 * _S3; d_0 = _S4; ``` we can hopefully output something like this: ```hlsl d_0 = c_0 * (a_0 + b_0); ``` The way this works is that after dealing with the various special cases that decide an instruction `I` must/cannot be folded in, we look and see if it has the following properites: * `I` has no side effects * `I` has a single user, `U` * `I` and `U` are in the same block (and `I` comes before `U` in that block) * for every instruction `X` between `I` and `U` (exclusive), `X` has no side effects If all of these conditions are true, then `I` can be folded in as a sub-expression when we emit `U`. This change doesn't affect most of our test output, but there is still a single test with SPIR-V output that we compare against a GLSL baseline, and so that baseline had to be modified to match the GLSL we now generate. Similar to #547, this change is not meant to provide a complete solution, but rather to take a concrete but low-risk step toward improving our output. Opportunities to improve the results further include: * We can/should ensure that when outputting sub-expressions we keep extra parentheses to a minimum. The old logic for emitting from an AST had support for "unparsing" expressions with minimal parentheses, and we should try to do the same. This can be error-prone, because omitting parentheses can lead to silent failures, so it must be done carefully. * We could try to be more aggressive about detecting what operations might have side effects. The most interesting case is function calls, where we should try to check if the callee is a function known to be side-effect-free. We could start by annotating most builtin functions with an attribute/decoration that indicates freedom from side effects. Deriving this attribute for user functions could be interesting, but we'd have to be careful since "nontermination" is technically a side effect. * We could try to be more aggressive about determining what side effects in instructions `X` are "safe" for the instruction `I` to move across. For example, if `I` is a load from variable `a` and `X` is a store to variable `b`, then that would seem to be safe. This starts to get into issues of instruction scheduling, though, and that is probably beyond what we want Slang to be doing.
2018-05-03	Add a pass for computing dominator trees (#541)	Tim Foley
	This code is currently not used by anything, but I wanted to check in a first pass at an implementation of dominator tree construction so that we don't have to keep avoiding implementing algorithms that rely on having dominator information available. The algorithm used to construct the dominator tree is taken from "A Simple, Fast Dominance Algorithm" by Keith D. Cooper, Timothy J. Harvey, and Ken Kennedy. This is not the "best" algorithm in terms of asymptotic performance, but it is among the simplest algorithms for computing a dominator tree that still outperforms naive iterative set-based methods. The actual data structure and API for the dominator tree has a bit of "cleverness" in it to try to make the common queries reasonably fast (e.g., you can check whether A dominates B in constant time). My hope is that even if we implement a more advanced algorithm for constructing the dominator tree, we can retain compatibility with passes that might make use of this API. Because no code is currently using this logic, I have done only minimal testing by stepping through this code and validating the results on paper for some very small CFGs. More serious testing/debugging may need to wait until we have an optimization pass that needs the dominator tree we compute here. One open question I have is how best to introduce traditional unit testing into Slang, since this is an example of code that would benefit greatly from being unit tested.
2018-05-03	Pass through original names for most declarations (#547)	Tim Foley
	The basic idea here is that when lowering to the IR, the front-end will attach a "name hint" to the IR instruction(s) that represent a given declaration, and then the passes that work on the IR will try to preserve and propagate those names, and then finally the emit logic will use them in place of mangled or unique names when available. This change does not try to deal with the issues that arise when we try to use those variable names in the output without any modification (e.g., handling cases where they might clash with keywords or builtins in the target language). Instead, it tries to establish baseline behavior for propagating through names, so that a later change can concentrate on the issue of using those names exactly when it is legal to do so. In order to avoid issues around the name "hints" causing problems we take two main steps: 1. We "scrub" each name to reduce it down to the allowed set of identifier characters in C-like languages, and then ensure that it doesn't do things that would be illegal in some downstream languages (e.g., consecutive underscores are not allowed in GLSL) or could clash with Slang's mangled names. This process isn't guaranteed to give distinct results for distinct inputs (it isn't a mangling scheme, after all). 2. We generate a unique ID for each occurence of a given name and always use that as a suffix. This means that even if a name happens to overlap with a keyword (if you somehow have a variable named `do`), we will still add a suffix that makes it not a problem (we'd output `do_0` which is fine). The logic for generating these names is mostly straightforward. For simple variables, we use their given name directly, while for other declarations we try to form a name that includes their parent declaration (e.g. `SomeType.someMethod`). Various IR passes need to propagate or preserve this information. The most interesting is type legalization, when we take a variable with an aggregate type and split some of the fields out into their own variables. In that case we generate "dotted" names like `someVar.someTexture` and rely on the emit logic to turn that into `someVar_someTexture`. During SSA generation, if we are promoting a variable to SSA temporaries, we will try to propagate the name of the variable over to the temporaries (unless they already have a name from some other place). The same applies to block parameters ("phi nodes"). Many of the test changes need their expected output to be updated for this change. Luckily in most cases the output has gotten easier to understand.
2018-05-02	Merge branch 'master' into master	Yong He

2018-05-02	Speedup type checking using cached overload resolution results.	Yong He
	This change adds caches to built-in operator overload resolution and type coersion to avoid running these time-consuming operations every time. - Adds `TypeCheckingCache` type, which is defined in check.cpp, that contains two dictionaries for the cached results of `ResolveInvoke` and `CanCoerce` calls. - Add `destroyTypeCheckingCache` and `getTypeCheckingCache` methods to `Session` class to reuse these cached results over the entire session.
2018-05-02	Add support for "swizzled stores" (#544)	Tim Foley
	This was a known issue in our IR representation, which was now biting a user. The basic problem is that in code like the following: ```hlsl RWStructureBuffer<float4> buffer; ... buffer[index].xz = value; ``` we ideally want to be able to reproduce the original HLSL code exactly, but that requires directly encoding the way that this code writes to two elements of a vector, but not the others. The currently lowering strategy we had produced IR something like: ```hlsl float4 tmp = buffer[index]; tmp.xz = value; buffer[index] = tmp; ``` That transformation might seem valid, but it has some big problems: * It generates UAV reads that are not needed, which could impact performance * It performs read-modify-write operations on memory that the programmer didn't explicitly write, which could create data races The fix here is somewhat obvious: if the "base" of a swizzle operation on a left-hand side resolves to a pointer in our IR, then we can output a "swizzled store" instead of the read-modify-write dance. We currently keep the read-modify-write around since it is potentially needed as a fallback in the general case. Along the way I also tried to make sure that we handle the case where we have a swizzle of a swizzle on the left-hand side: ```hlsl buffer[index].xz.y = value; ``` That code should behave the same as `buffer[index].z = value`. I am currently detecting and cleaning up this logic in the lowering path for `SwizzleExpr`, because that is the only place in the lowering logic that "swizzled l-values" currently get created.
2018-05-02	Add support for explicit register space bindings (#542)	Tim Foley
	This change adds support for specifying explicit register spaces, like: ```hlsl // Bind to texture register #2 in space #1 Texture2D t : register(t2, space1); ``` I added a test case to confirm that the register space is properly propagated through the Slang reflection API. This change also adds proper error messages for some error/unsupported cases that weren't being diagnosed: * Specifying a completely bogus register "class" (e.g., `register(bad99)`) * Failing to specify a register index (`register(u)`) * Specifying a component mask (`register(t0.x)`) * Using `packoffset` bindings I added test cases to cover all of these, as well as the new errors around support for register `space` bindings. In order to get the existing tests to pass, I had to remove explicit `packoffset` bindings from some DXSDK test shaders. None of these `packoffset` bindings were semantically significant (they matched what the compiler would do anyway, for both Slang and the standard HLSL compiler). Removing them is required for Slang now that we give an explicit error about our lack of `packoffset` support. In a future change we might add logic to either detect semantically insignificant `packoffset`s, or to just go ahead and support them properly (as a general feature on `struct` types).
2018-05-02	Fix emit logic when "terminators" occur in the middle of a block (#540)	Tim Foley
	Fixes #527 There were a few problem cases for the IR emit logic. The most obvious, which came up in #527 is that a function body with multiple `return` statements would generate invalid code: ```hlsl int foo() { return 1; int x = 2; return x; } ``` In that case the IR for `foo` would have a single block that has two `return` instructions, which is invalid. Another case that seems to be arising more often, but that had less obvious consequences was when one arm of an `if` statement ends in a `return`: ```hlsl if(a) { return b; } else { int c = 0; } int d = 0; ``` In that case, the `return` instruction for `return b` would be followed by a branch to the end of the `if` (the `int d = 0;` line), because that would be the normal control flow without the early `return`. The fix implemented here is to have the IR lowering logic be a bit more careful on two fronts: 1. When emitting a branch, check if the block we are emitting into has already been terminated, and if so just don't emit the branch (since we are logically at an unreachable point in the CFG. 2. Whenever we are about to emit code for a (non-empty) statement, ensure that the current block being build is unterminated. If the current block is terminated, then start a new one. Case (2) will only matter when there is unreachable code (e.g., in the function `foo()`, the declaration of `x` and the second `return` can never be reached), so I added a warning in that case, and included a test case that triggers the new warning (with a function like `foo()` above).
2018-05-01	Diagnose attempts to write to fields in methods (#530)	Tim Foley
	* Diagnose attempts to write to fields in methods Work on #529 This helps to avoid the case where a Slang user writes a struct with helpful `setter` methods, and finds that it doesn't work as expected because the `this` parameter is currently handled like an `in` parameter (passed by value, but mutable in the callee). Fixing this issue actually involved making a more broad fix to how l-value-ness is propagated. The existing checking logic was assuming that l-value-ness is just a property of a particular member declaration (e.g., a field is either mutable or not), and didn't take into account whether the "base expression" was mutable. This change fixes that oversight, which might lead to additional errors being issued if we aren't correctly making things mutable when we should. A `ThisExpr` was already immutable by default, so that part didn't actually need to change. Just propagating its immutability through was enough. As an additional assistance to users, I have added an extra diagnostic that triggers when a "destination of assignment is not an l-value" error occurs and the left-hand-side expression seems to be based on `this` (whether implicitly or explicitly). This will ideally help users to understand that the "setter" idiom is not yet supported. * Fixed setRadius typo
2018-05-01	Cleanups (#539)	Tim Foley
	* Cleanup: remove unused files from project * Cleanup: move IRModule forward declaration into correct namespace
2018-04-28	Remove unused local variable in vm.cpp (#533)	Jeremie St-Amand
	Unused local variable prevents compiling when warnings are treated as errors
2018-04-25	Fix for global generic parameter substitution (#512)	Tim Foley
	The problem here arises when multiple entry points are compiled in one pass. Each entry point has its own arguments for global generic parameters, and leads to us emitting a `bindGlobalGenericParameter(p, val)`. But once the first entry point's substitutions are applied, the second entry point's code gives `bindGlobalGenericParameter(val, val)` and the compiler crashes (in debug builds) because `val` is not a global generic parameter. This change just applies a quick fix. If we see `bindGlobalGenericParameter(x,y)` during specialization, and `x` is not a global generic parameter, then we skip it. The right long-term fix is to change the compiler's representation of global generic arguments so that they live on a `CompileRequest` instead of an `EntryPointRequest`. That is a more significant change (with impact on the public API), so I'm inclined to leave it as a cleanup for another day (given that no customers are using global generic parameters today).
2018-04-23	Improve SSA promotion for arrays and structs (#521)	Tim Foley
	* Improve SSA promotion for arrays and structs Fixes #518 The existing SSA pass would only handle `load(v)` and `store(v,...)` where `v` is the variable instruction, and would bail out if `v` was used as an operand in any other fashion. The new pass adds support for `load(ac)` where `ac` is an "access chain" with a gramar like: ac :: v \| getElementPtr(ac, ...) \| getFieldAddress(ac, ...) What this means in practical terms is that we can promote a local variable of array or structure type to an SSA temporary even if there are loads of individual elements/fields, as along as any assignment to the variable assigns the whole thing. I've added a test case to confirm that this change fixes passing of arrays as function parameters for Vulkan. * Fixup: disable test on Vulkan because render-test isn't ready This is a fix for Vulkan, but I don't think our testing setup is ready for it. * Fixup: error in unreachable return case, caught by clang * Fixups based on testing These are fixes found when testing the original changes against the user code that originated the bug report. * `emit.cpp`: Make sure to handle array-of-texture types when deciding whether to declare a temporary as a local variable in GLSL output * `ir-legalize-types.cpp`: Make a not of a source of validation failures that we need to clean up sooner or later (just not in scope for this bug fix change). * `ir-ssa.cpp`: * When checking if something is an access chain with a promotable var at the end, make sure the recursive case recurses into the "access chain" logic instead of the leaf case * Add some assertions to guard the assumption that any access chain we apply has been scheduled for removal * Correctly emit an element extract instead of getting an element address when promoting an element access into an array being promoted * Eliminate a wrapper routine that was setting up an `IRBuilder` and use the one from the block being processed in the SSA pass (since it was set up for stuff just like this) * `ir-validate.cpp` * Add a hack to avoid validation failures when running IR validation on the stdlib code. This case triggers for an initializer (`__init`) declaration inside an interface, since the logical "return type" is the interface type itself, which has no representation at the IR level and thus yields a null result type in a `FuncType` instruction.
2018-04-23	Fix successor computation for `switch` instruction (#520)	Tim Foley
	Fixes #519 The code was leaving out the `default` label from the successor list, which would break any passes that require an accurate CFG (with the big one right now being the SSA-formation pass).
2018-04-20	Better diagnostics when compilation is aborted (#517)	Tim Foley
	* Improve messages when compilation is aborted. Make sure to include the information from any `Slang::Exception` that was thrown, so that the poor user can at least point us at our own message string from an assertion failure. This doesn't provide them line-number information in their code or the Slang codebase, so there is still work to be done in making the compiler more friendly about this stuff. * When aborting compilation, try to note what source location we were working on This is handled by having exception handlers on the stack at key bottleneck points in semantic checking and IR generation, which can then emit a diagnostic to note what we were working on when things failed. This is not intended to be an indiciation to the user that their code is at fault for a compiler crash (it is always our fault), but might give them a chance to work around whatever bug is blocking them.
2018-04-20	Diagnose use of an implicit cast as an argument for an `out` parameter (#516)	Tim Foley
	Work on #499 Two big fixes here: * The logic for checking constraints on `out` arguments wasn't actually triggering because it relied on function parameters being given an `OutType` if they are marked `out`, but the code wasn't actually doing that. Fixing the computation of types for functions resolved that issue. * Next, I added a specific diagnostic to follow up the "expected an l-value" error to let the user know that their argument was implicitly converted, and that is why it doesn't count as an l-value in Slang's rules. I've added a test case to ensure that we retain this diagnostic until we can do a true fix for the issue. The right long-term fix is to have an AST representation of all the implicit casts involved (e.g., in both directions for an `inout` parameter), and then have the IR generate explicit code for the conversions in each direction (the `LoweredVal` representation can handle this sort of thing).
2018-04-19	Fix GS cross-compilation after IR type system change (#507)	Tim Foley
	The cross-compilation logic for geometry shaders would look through the user's entry point for calls like `someStream.Emit<X>(val)` and turn that into `outputGlobals = val; EmitVertex();`. It was recognizing the `Emit()` calls by looking at the callee in all `call` instructions and seeing if it was registered to lower to GLSL as `EmitVertex()`. The logic was try to look "through" `specialize` instructions (to deal with the `<X>` bit in the call above), but this wasn't updated for the new IR encoding where the first operand to a `specialize` is the generic being specialized, and not the function nested inside it. The fix here is to properly look through both `specialize` instructions and generics. This is kind of a gross operation and we've done things like it in a few places, so it might be something we try to extract into a utility function in the future.
2018-04-19	Add type legalization support for "field extract" op (#501)	Tim Foley
	The code was handling the "get field address" opcode (which takes a pointer to a struct and returns a pointer to a field), but didn't have a case for values. This was just an oversight.
2018-04-19	Fix up DXR type emission from IR type system (#498)	Tim Foley
	* There was a simple typo where we were emitting `RaytracingAccelerationStructureType` instead of `RaytracingAccelerationStructure` * The IR lowering logic was failing to handle types with an `__intrinsic_type` modifier (which maps them to a single IR opcode) that weren't in one of the various special cases. I added a catch-all case to the handling of `DeclRefType`. This notably affected the `RayDesc` type. * Even if we lower `RayDesc` to an intrinsic type, we still need to lower its fields too, and these were getting emitted with mangled names (as would happen for any user-defined fields). The solution I implemented was to allow for fields to have `__target_intrinsic` modifiers in the stdlib, to specify the un-mangled name they should use on each target. I'm not 100% happy with this solution, because it seems odd to have `RayDesc` be an intrinsic type, but then to also have field keys used in `getField` instructions as if it were an ordinary `struct`. It seems like a better solution would be to have it lower to an IR `struct`, just with an appropriate modifier.
2018-04-18	Fix output of `groupshared` with IR type system (#492)	Tim Foley
	The basic problem was that the lowering logic was constructing (more or less) `Ptr<@GroupShared X>` instead of `@GroupShared Ptr<X>`. There were also problems with passes not propagating through rates that should have been (e.g., legalization). I've added a test case to actually validate `groupshared` support.
2018-04-18	Fix up name mangling/unmangling for extensions (#493)	Tim Foley
	* Fix up name mangling/unmangling for extensions This is required for the unmangling we do on some builtin function names. The work here is mostly just a band-aid, and a more comprehensive pass over the name mangling/unmangling code is required to make any of this robust. * fixup: UNREACHABLE_RETURN argument
2018-04-18	Fix some logic around legalization of sampler types (#496)	Tim Foley
	The main error here was checking for `IRSamplerType` instead of `IRSamplerTypeBase`, which means the relevant logic only triggered for the `SamplerState` type and not the `SamplerComparisonState` type. The two affected places were type legalization (so that comparison samplers in `struct` types weren't being hoisted out) and the emit logic when deciding whether to introduce local temporaries (so we were emitting temporaries for comparison samplers, leading to GLSL errors).
2018-04-13	Propagate diagnostics when imported module has errors (#485)	Tim Foley
	A previous fix avoided crashes when an `import`ed module has errors by making the "failed to import" error a fatal one. Unfortunately, the code path that handles fatal errors was failing to copy diagnostic output from the sink over to the member variable on the `CompileRequest` that exposes the output through the API. This meant that API users lost all context on error messages in `import`ed code. This change fixes the immediate issue by plumbing through the error output, but doesn't fix the more fundamental issue: the front-end should not crash when an `import` fails, by any means.
2018-04-12	Preprocessor cleanups (#484)	Tim Foley
	* For a `#error` or `#warning`, read the rest of the line as raw text to include in the error message * When skipping tokens (e.g., in an `#ifdef`d out block), don't emit errors on invalid characters * TODO: we could clearly get more efficient and skip whole raw lines in the future * Fix an issue when a macro invocation that expands to nothing (zero tokens) is the last thing before a directive. The preprocessor was returning the `#` as an ordinary token, because it has already gone past its test for directives.
2018-04-11	Introduce an IR-level type system (#481)	Tim Foley
	* Introduce an IR-level type system Up to this point, the Slang IR has used the front-end type system to represent types in the IR. As a result (but ultimately more importantly) the IR representation of generics and specialization has used AST-level concepts embedded in the IR. For example, to express the specialization of `vector<T,N>` to a concrete type `float` for `T`, we needed an IR operation that could represent the specialization, with operands that somehow represented the type argument `float`. The whole thing was very complicated. The big idea of this change is to introduce a new representation in which types in the IR are just ordinary instructions, so that using them as operands makes sense. The hierarchy of IR types closely mirrors the AST-side hierarchy for now, and that will probably be something we should maintain going forward. In order to make these changes work, though, I also had to do major overhauls of things like the way substitutions are performed, how we check interface conformances, the way lookup through interface types is done, etc. etc. This is a big change, and unfortunately any attempt to summarize it in the commit message wouldn't do it justice. * Fix 64-bit build warning * Fix up some clang warnings/errors
2018-04-10	Feature/dx12 compute (#482)	jsmall-nvidia
	* Dx12 rendering works in test framework. * Turn on dx12 render tests. * Getting simpler dx12 compute tests to work. * With expected data in test - check for specialized and then for the default, so that multiple test can share the same expected data, but specialized cases can still be set. * Fixed construction and binding on dx12 textures. * Control which render apis used in test from command line. * Small aesthetic fixes in render-test/main.cpp. * Fix binding problem for uavs/srvs dx12. Previously tried to create srv/uav for StorageBuffers (like dx11 does), but the binding breaks as you can end up with two srvs using the same register. First pass at fixing problems with Texture creation for dx12 - assertions were hit with 3d or array textures. * Fixes to improve Dx12 setup shader resource views for cubemaps/arrays. * Fixed d3d12 textureSamplingTest - problem was that cubemap/array textures were not being uploaded correctly. * Changed the order of how binding of constant buffers (as just set on the Renderer) indexes. Previously they were given the lowest indices, but they clashed with the indices from the 'Binding'. Changing this means all tests run on d3d12. * Add code to allow use of warp (although not command line switchable yet). Fix problem setting up raw UAV - as identified by warp. * Added RenderApiUtil - which can detect if a render api is potentially available. * Moved render flag testing/parsing into RenderApiUtil. * Fix signed/unsigned warning. * Fixes around enums prefixed with k on the review of feature/dx12 compute branch.
2018-04-05	Falcor fixes (#479)	Tim Foley
	* Don't emit interpolation modifiers on GLSL fields The previous change that started passing through interpolation modifiers didn't account for the fact that the GLSL grammar doesn't allow interpolation modifiers on `struct` fields, so we shouldn't emit them in that case (and our legalization strategy for GLSL guarantees that varying input will never use a `struct` type anyway). * Try to handle SV_Position semantic on GS input HLSL allows `SV` semantics to be used for ordinary inter-stage dataflow in some cases (e.g., a VS can output `SV_Position` and it can then be read from a GS). GLSL allows similar things with `gl_Position`, but there are some wrinkles. One fix here is to correctly identify that `gl_FragCoord` should only be used as the translation of `SV_Position` for a fragment shader input (and not in the general case of any input). The other "fix" here is a kludge to handle the fact that the right translation for a GS input is not to an array called `gl_Position`, but to the syntax `gl_in[index].gl_Position` (array-of-structs style). I am doing this by attaching a custom decoration to the global variable we create for `gl_Position` and then intercepting it during code emit. I'm not proud of this, and would like to do something better given time. * Fix GLSL output for matrix-scalar multiplication The output logic was assuming that any use of `operator` in the input code that yields a value of matrix type must be translated to a `matrixCompMult()` call in GLSL, but this should really only apply if both of the operands* are matrices (not just based on the result type). As a result matrix-times-scalar operations were emitting a call to `matrixCompMult()` and GLSL was complaining because it won't implicitly promote scalars to matrices.
2018-04-04	Pass AST interpolation modifiers through to codegen. (#475)	Tim Foley
	This is a short-term fix, because we (1) don't have an IR-level representation of interpolation qualifiers, and (2) can't introduce one until after the IR-level type system is introduced (to be able to handle `struct` fields). The approach here is to find the AST-level declaration, either from layout information (in the case of an ordinary variable or function parameter), or from struct field information (because structs are being output from the AST form anyway). I've included a single end-to-end rendering test to confirm that we handle the `nointerpolation` modifier the same as HLSL. I also added the `noperspective` modifier, which seemed to be missing from our implementation.
2018-04-03	Fixes based on review of feature/dx12 PR. (#473)	jsmall-nvidia

2018-04-02	Implement "operator comma" in IR codegen (#472)	Tim Foley
	Fixes #471
2018-04-02	Feature/dx12 (#469)	jsmall-nvidia
	* Fix signed/unsigned comparison warning. * Split out d3d functions that will work across dx11 and 12. * Improve slang-test/README.md around command line options. * Make Guid comparison honor alignment for comparisons, such that mechanism work on architectures that can only do aligned accesses. * Initial setup of D3D12 Renderer, with presentFrame and clearFrame. * More support for D3D12 * Added FreeList * Added D3D12CircularResourceHeap * First attempt at createBuffer * First pass at map/unmap. * First pass binding vertex/constant buffers, and setting up InputLayout. Note that memory is not kept in scope on binding yet. * First pass of D3DDescriptorHeap * Small tidy up in render-d3d11. Added D3DDescriptorHeap to project. * First pass at D3D12 bind state. * Fix typos in D3D12Resource * Tidy up Dx11 render binding a little to match more with Dx12 style. * First pass at Dx12 BindingState * Handling of the command list d3d12. Support for submitGpuWork and waitForGpu. * First attempt at Dx12 capture of backbuffer to file. * First attempt at D3D12 binding for graphics. * D3D12 setup viewport etc - does now render triangle in render0.hlsl. * First pass at support for compute on D3D12Renderer * Use spaces over tabs in D3DUtil * Tabs to spaces in D3D12DescriptorHeap * Convert tab->spaces on render-d3d12.cpp
2018-04-02	Update to top-of-tree glslang (2018-04-02). (#470)	Tim Foley
	This is an attempt to alleviate some driver crashes caused by invalid SPIR-V. Because Slang drives `glslang` with GLSL source code, any invalid output is likely due to `glslang` bugs. I chose top-of-tree for `glslang` because it wasn't clear what their release process is. Hopefully we can go another year without having to update this dependency. The build setup we use for `glslang` had to change to account for one more `#define` that the code expects to have passed in externally.
2018-03-30	Fix several issues discovered by Falcor (#467)	Tim Foley
	Fixes #466 Most of these are Vulkan-related regressions. * Kludge the definition of `GroupMemoryBarrierWithGroupSync()` for GLSL so that it works around parentheses that the emit logic now introduces. * Don't emit `static` for global constants when targetting GLSL * Emit the `flat` modifier for varying input/output with integer type, when targetting GLSL * Avoid checking parameter default-value expressions more than once, because this can crash when the checking introduces syntax that is not expected to appear in the input AST
2018-03-29	Avoid crash when bad argument given to [instance(...)] attribute (#464)	Tim Foley
	Fixes #463 Some of the attributes were failing to check for a `null` result from `checkConstantIntVal`, and so they crashed when a bad expression was used in an attribute. The particular way this had been triggered was that a user put an HLSL geometry shader in the same file with other code, using an entry point like: ```hlsl [instance(COUNT)] void myGeometryShader(...) {...} ``` They then defined `COUNT` as a preprocessor macro when compiling using the GS, but left it undefined otherwise. The result was that the argument to the `instance` attribute would fail to type check, and thus wouldn't count as a constant integer value, so that `checkConstantIntVal` returns `null` and results in the crash. The workaround for the user is to always define `COUNT`, even when not compiling the GS. The fix in the compiler is to guard against `null` in these cases and bail out of attribute checking. I also implemented logic so that `CheckIntegerConstantExpression` (which is invoked by `checkConstantIntVal`) will not produce an additional error message if the underlying expression failed to type check. In this casem the user will get an `undefined identifier: COUNT` error message, and we don't need to waste their time by also telling them that this isn't a compile-time constant expression.
2018-03-29	Change uses of "spire" to "slang" (#461)	Tim Foley
	Fixes #350 When the Slang project forked off from the Spire research effort, we renamed things as we went, but many cases seem to have slipped through the cracks. The two biggest diffs here are: - The `hello` example program was incorrectly talking about what was in the shader file (Slang no longer supports the "module" or "pipeline" constructs from Spire), and so it wasn't just a simple rename. - The files under `tests/bindings` were mistakenly using `__SPIRE__` as a preprocessor guard, which means that they weren't actually testing what they meant to. Luckily, it looks like the relevant functionality didn't regress while these tests were unintentionally deactivated.
2018-03-29	Make IR-based output code cleaner (#458)	Tim Foley
	The big change here is that rather than trying to reproduce the exact line number and indentation of names as they appeared in the original code (which had been appropriate for the AST-to-AST translation strategy), we now emit code from the IR using a simple "pretty-printing" strategy, where indentation is determined by nesting. Along the way I deleted a bunch of dead code in `emit.cpp` that was handling emit from AST declarations/statements/expressions. I probably should have pulled that out into its own change, but doing that now would be tricky. This change also makes it so that we do not emit `#line` directives by default. Asking for `#line` directives in the output will probably become part of a Slang "debug mode" that tries to make the output code suitable for step-through debugging.
2018-03-29	Add support for default parameter values in IR codegen (#459)	Tim Foley
	Fixes #61 When lowering from AST to IR, if a call site doesn't supply an argument expression for each of the parameters to the callee, then use the default value expressions (stored as the "initializer" of the parameter decl) for each omitted parameter. This relies on the front-end to have already checked the call site for validity. Along the way I also cleaned up some of the checking of parameter declarations so that it is more like the checking of ordinary variable declarations (although the code is not yet shared). I also cleaned out some dead cases in the lowering logic for when we don't actually have a declaration available for a callee (these would only matter if we supported functions as first-class values). I added a simple test case to confirm that call sites both with and without the optional parameter work as expected. The strategy in this change is extremely simplistic, and might only be appropriate for default parameter value expressions that are compile-time constants (which should be the 99% case). This may require a major overhaul if we decide to handle default parameter values differently (e.g., by generating extra functions to ensure that the separate compilation story is what we want). Another issue that could change a lot of this logic would be if we start to support by-name parameters at call sites, since we could no longer assume that the argument and parameter lists align one-to-one (with the argument list possibly being shorter). Any work to add more flexible argument passing conventions would need to build a suitable structure to map from arguments to parameters, or vice-versa.
2018-03-28	Merge from v0.9.15 (#460)	Tim Foley
	* Fix bug when subscripting a type that must be split (#396) The logic was creating a `PairPseudoExpr` as part of a subscript (`operator[]`) operation, but neglecting to fill in its `pairInfo` field, which led to a null-pointer crash further along. * Allow writes to UAV textures (#416) Work on #415 This issue is already fixed in the `v0.10.` line, but I'm back-porting the fix to `v0.9.`. The issue here was that the stdlib declarations for texture types were only including the `get` accessor for subscript operations, even if the texture was write-able. I've also included the fixes for other subscript accessors in the stdlib (notably that `OutputPatch<T>` is readable, but not writable, despite what the name seems to imply). * Fix infinite loop in semantic parsing (#424) The code for parsing semantics was looking for a fixed set of tokens to terminate a semantic list, rather than assuming that whenever you don't see a `:` ahead, you probably are done with semantics. This meant that you could get into an infinite loop just with simple mistakes like leaving out a `;`. This change fixes the parser to note infinite loop in this case, and adds a test case to verify the fix. * Expose HLSL `shared` modifier through reflection. (#436) This is a request from Falcor, because the `shared` modifier can be used as a hint to optimize the grouping of parameters for binding. The intention is that `shared` marks shader parameters (including parameter blocks) that will us the same values across many draw calls (e.g., per-frame data, as opposed to per-model or per-instance). The mechanism I'm using here is to provide a general reflection API for exposing the `Modifier`s already attached to declarations. While the only modifier exposed is `shared`, and the only modifier information being exposed is presence/absence, this interface could be extended down the line.
2018-03-26	Unify all generic parameters, even if some mismatch (#454)	Tim Foley
	* Fix decl-ref printing to handling NULL pointers If the underlying decl, or its name is NULL, then use an empty string for the declaration name. This issue was found when debugging, but could bite non-debug cases too, if we ever try to print something like a generic type constraint, which has no name. * Unify all generic parameters, even if some mismatch Fixes #449 The front end tries to infer the right generic arguments to use at a call site using a sloppily implemented "unification" approach. The basic idea is that if you pass a `vector<float,3>` into a function that operates ona `vector<T,N>` where `T` and `N` are generic paameters, then the unification will try to unify `vector<float,3>` with `vector<T,N>` which will lead to it recursively unifying `float` with `T` and `3` with `N`, at which point we have viable values to substitute in for those parameters. Where the existing approach is maybe not quite right is in how it handles obvious unification failures. So if we ask the code to unify, say, `float` with `uint`, it will bail out immediately because those can't be unified. This sounds right superficially, but in some cases with might be calling a function that takes a `vector<float,N>` and passing a `vector<uint,3>` and we'd like to at least get far enough along with unification to see that `N` should be `3` so that the front end can maybe decide to call the function anyway, with some amount of implicit conversion. Over time I've had to modify a lot of the "unification" logic so that it doesn't treat the obvious failures as a hard stop, and instead just returns the failure as a boolean status, but keeps on trying to unify things even after such a failure. When doing unification as part of inference for generic arguments, there will usually be subsequent steps (e.g., type conversions for function aguments) that will catch the type errors that arise. This specific change is to make is so that when unifying the substitutions for a generic decl-ref, we try to unify all the pair-wise arguments, and don't bail out on the first mismatch (so that the `float`-vs-`uint` failure above doesn't lead to us skipping the `3` and `N` pairing). The one case we need to watch out for in all of this is when unification is used to check if an `extension` declaration (which might be generic) is actually application to a concrete type. In that case we obviously don't want an extension for `vector<float,N>` to apply to `vector<uint,3>`, so it is important that the extension case check the return status from the unification logic (or in the future, it could just confirm that the substituted type is equivalent to the original as a post-process...). I've added a test case that reproduces the original failure that surfaced the bug. * fixup: add expected test output
2018-03-26	Renderer resource mangement for render-test (#453)	jsmall-nvidia
	* First pass at resource based renderer using RefObject. * Correct handling of array of buffer pointers to Dx11. * Fix bug with setting viewOut incorrectly in createInputTexture. * More support for allowing com like interfaces. * Added and tidied Slang::Result - adding interface specific results * Guid added comparison support, and made base interface IComUnknown - with lowerCamel methods
2018-03-22	Tidy up of Renderer (#452)	jsmall-nvidia
	* Fixed some small typos in api-users-guide.md * Fix some small typos in slang-test/main.cpp, render-test/render-d3d11.cpp * Remove exit() calls from test code. Added Slang::Result, which works in the same way as COM HRESULT. * FIx bug introduced when moving to Slang::Result - handling E_INVALIDARG on Dx11. * Fix the testing of feature levels on Dx11 renderer. * First attempt at README.md for slang-test. * Tidied up the slang-test README.md file. * Fix some small typos in tools/slang-test/main.cpp * Fix spaces -> tabs problems. Fix some small types. * Refactor Renderer implementations such that: * Class definition does not contain long implementation/s * Removed unused globals * Ordered implementation after class definition * Made renderer specific classes child classes, and use Impl postfix to differentiate * Converted tabs into spaces * First pass at Slang::ComPtr. Added slang-defines.h which sets up some fairly commonly used defines such as SLANG_FORCE_INLINE, compiler detection, os detection, and some other cross platform features. * * Fixed bug in vk renderer - where features structure not initialized on hkCreateDevice * Make member variables in Renderer implementations use prefix * Updated test README.md to document that free parameter can control what test is run * * changed setClearColor to take an array of 4 floats to make API clearer on usage * mix of type usage style - defaulted to more conventional style * * Fixed swapWith * Use SLANG_FORCE_INLINE * Don't bother initializing List data when type is POD * Added convenience macro for Result handling SLANG_RETURN_NULL_ON_ERROR
2018-03-22	Add support for DirectX Raytracing (DXR) (#451)	Tim Foley
	* Add support for DirectX Raytracing (DXR) This is an initial pass to add support to Slang for the shader stages introduced by DirectX Raytracing (DXR). * Add declarations for DXR intrinsic types and functions to the Slang standard library. The way our compilation works, these will then get propagated through the IR as intrinsics and get spit back out again as-is during HLSL code emission. * Declare the DXR-related stages. This is the main work that affects the compiler's C++ implementation rather than being something we can add via the standard library today. * Switch around the encoding of the `Profile` type so that the stage is in the low bits, allowing API users to pass an ordinary `SlangStage` to operations that expect a `SlangProfileID`. - This represents a direction I'd like to push in long term, where the user specifies stage and "feature level" separately rather than using composite profiles like `vs_6_0`. The introduction of these new stages seems like a good point to try and make a clean break here and not introduce, e.g., `rgs_6_1` for ray generatin shaders. * Upgrade "effective profile" computation so that it advances the required version based on the specified stage (e.g., DXR stages seem to require at least shader model 6.1). - This is a bit of a kludge overall, but ideally we don't want a typical user to have to think about "feature level" stuff much at all. The ideal workflow is that they just hand us a source file and we work out entry points and their required feature levels in the compiler (and let the user query it when we are done). Until we implement that for real, stopgaps like this are required. Overall these are relatively small changes for supporting some major new API behavior. Slang's design helps out here, by allowing a lot of things to be specified in the stdlib (including generic intrinsic functions), but some of this is also owed to the DXIL-influenced design of DXR - e.g., the use of global functions in place of `SV_` semantics. fixup: typos * Fixup: use `pixel` instead of `fragment` as primary stage name This is to match HLSL conventions when generating output code, even if the Slang project officially favors the more correct term "fragment shader."