slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2018-05-11	Fixes #559 (#560)	Yong He

2018-05-10	Workaround for cases where we emit illegal-but-unused types	Tim Foley
	This is a quick workaround to deal with cases where we try to emit an unreferenced IR type that contains references to pre-legalization types (which might have been removed from the IR even thought they are still referenced). The basic fix is to not add types to our global order of instructions to emit by default, and only add them on demand as they are referenced by other instructions. This is not a real fix for the underlying issue, which is that type legalization is only being applied to a subset of global instructions instead of all of them. A more detailed fix for that problem will need to be devised next. This fix also doesn't address the question of why an unreferenced `struct` type came to be present in the IR code passed to the back-end in the first place. It would be good to understand how this scenario is arising.
2018-05-04	Re-enable emission of #line directives and clean up output (#554)	Tim Foley
	This was based on feedback from Falcor users, who felt like changing the default to have no line directives didn't work out well. Since I'd only made them disabled by default based on what I perceived to be Falcor's needs, I'm happy to turn this back on by default. I also added a few changes to clean up the output: * Don't emit a directive for a sub-expression, since that breaks up the code too much. The only directives inside a function body will be on top-level instructions that didn't get folded into a use site. * Add logic to emit a directive for top-level declarations (globals, functions, structs), and clean up their printing so that they put any extra space after the declaration rather than before (so the line numbers can be accurate) * Don't emit the file path part of a directive if it would be the same as the previous directive. This makes the output less noisy, at the cost of having to work your way backward to find the file if you are looking directly at the output. There are certainly more cleanups possible, but these make the output decent enough to be useful for working backwards from a downstream compiler error to the offending code.
2018-05-04	Allow more complex compound expressions when emitting from IR (#552)	Tim Foley
	The emit logic already had an idea of when an instruction should be "folded" it its use site(s), and this change just expands on that logic to try to be more aggressive. The basic idea is that instead of outputting this: ```hlsl float4 _S3 = a_0 + b_0; float4 _S4 = c_0 * _S3; d_0 = _S4; ``` we can hopefully output something like this: ```hlsl d_0 = c_0 * (a_0 + b_0); ``` The way this works is that after dealing with the various special cases that decide an instruction `I` must/cannot be folded in, we look and see if it has the following properites: * `I` has no side effects * `I` has a single user, `U` * `I` and `U` are in the same block (and `I` comes before `U` in that block) * for every instruction `X` between `I` and `U` (exclusive), `X` has no side effects If all of these conditions are true, then `I` can be folded in as a sub-expression when we emit `U`. This change doesn't affect most of our test output, but there is still a single test with SPIR-V output that we compare against a GLSL baseline, and so that baseline had to be modified to match the GLSL we now generate. Similar to #547, this change is not meant to provide a complete solution, but rather to take a concrete but low-risk step toward improving our output. Opportunities to improve the results further include: * We can/should ensure that when outputting sub-expressions we keep extra parentheses to a minimum. The old logic for emitting from an AST had support for "unparsing" expressions with minimal parentheses, and we should try to do the same. This can be error-prone, because omitting parentheses can lead to silent failures, so it must be done carefully. * We could try to be more aggressive about detecting what operations might have side effects. The most interesting case is function calls, where we should try to check if the callee is a function known to be side-effect-free. We could start by annotating most builtin functions with an attribute/decoration that indicates freedom from side effects. Deriving this attribute for user functions could be interesting, but we'd have to be careful since "nontermination" is technically a side effect. * We could try to be more aggressive about determining what side effects in instructions `X` are "safe" for the instruction `I` to move across. For example, if `I` is a load from variable `a` and `X` is a store to variable `b`, then that would seem to be safe. This starts to get into issues of instruction scheduling, though, and that is probably beyond what we want Slang to be doing.
2018-05-03	Add a pass for computing dominator trees (#541)	Tim Foley
	This code is currently not used by anything, but I wanted to check in a first pass at an implementation of dominator tree construction so that we don't have to keep avoiding implementing algorithms that rely on having dominator information available. The algorithm used to construct the dominator tree is taken from "A Simple, Fast Dominance Algorithm" by Keith D. Cooper, Timothy J. Harvey, and Ken Kennedy. This is not the "best" algorithm in terms of asymptotic performance, but it is among the simplest algorithms for computing a dominator tree that still outperforms naive iterative set-based methods. The actual data structure and API for the dominator tree has a bit of "cleverness" in it to try to make the common queries reasonably fast (e.g., you can check whether A dominates B in constant time). My hope is that even if we implement a more advanced algorithm for constructing the dominator tree, we can retain compatibility with passes that might make use of this API. Because no code is currently using this logic, I have done only minimal testing by stepping through this code and validating the results on paper for some very small CFGs. More serious testing/debugging may need to wait until we have an optimization pass that needs the dominator tree we compute here. One open question I have is how best to introduce traditional unit testing into Slang, since this is an example of code that would benefit greatly from being unit tested.
2018-05-03	Pass through original names for most declarations (#547)	Tim Foley
	The basic idea here is that when lowering to the IR, the front-end will attach a "name hint" to the IR instruction(s) that represent a given declaration, and then the passes that work on the IR will try to preserve and propagate those names, and then finally the emit logic will use them in place of mangled or unique names when available. This change does not try to deal with the issues that arise when we try to use those variable names in the output without any modification (e.g., handling cases where they might clash with keywords or builtins in the target language). Instead, it tries to establish baseline behavior for propagating through names, so that a later change can concentrate on the issue of using those names exactly when it is legal to do so. In order to avoid issues around the name "hints" causing problems we take two main steps: 1. We "scrub" each name to reduce it down to the allowed set of identifier characters in C-like languages, and then ensure that it doesn't do things that would be illegal in some downstream languages (e.g., consecutive underscores are not allowed in GLSL) or could clash with Slang's mangled names. This process isn't guaranteed to give distinct results for distinct inputs (it isn't a mangling scheme, after all). 2. We generate a unique ID for each occurence of a given name and always use that as a suffix. This means that even if a name happens to overlap with a keyword (if you somehow have a variable named `do`), we will still add a suffix that makes it not a problem (we'd output `do_0` which is fine). The logic for generating these names is mostly straightforward. For simple variables, we use their given name directly, while for other declarations we try to form a name that includes their parent declaration (e.g. `SomeType.someMethod`). Various IR passes need to propagate or preserve this information. The most interesting is type legalization, when we take a variable with an aggregate type and split some of the fields out into their own variables. In that case we generate "dotted" names like `someVar.someTexture` and rely on the emit logic to turn that into `someVar_someTexture`. During SSA generation, if we are promoting a variable to SSA temporaries, we will try to propagate the name of the variable over to the temporaries (unless they already have a name from some other place). The same applies to block parameters ("phi nodes"). Many of the test changes need their expected output to be updated for this change. Luckily in most cases the output has gotten easier to understand.
2018-05-02	Merge branch 'master' into master	Yong He

2018-05-02	Speedup type checking using cached overload resolution results.	Yong He
	This change adds caches to built-in operator overload resolution and type coersion to avoid running these time-consuming operations every time. - Adds `TypeCheckingCache` type, which is defined in check.cpp, that contains two dictionaries for the cached results of `ResolveInvoke` and `CanCoerce` calls. - Add `destroyTypeCheckingCache` and `getTypeCheckingCache` methods to `Session` class to reuse these cached results over the entire session.
2018-05-02	Add support for "swizzled stores" (#544)	Tim Foley
	This was a known issue in our IR representation, which was now biting a user. The basic problem is that in code like the following: ```hlsl RWStructureBuffer<float4> buffer; ... buffer[index].xz = value; ``` we ideally want to be able to reproduce the original HLSL code exactly, but that requires directly encoding the way that this code writes to two elements of a vector, but not the others. The currently lowering strategy we had produced IR something like: ```hlsl float4 tmp = buffer[index]; tmp.xz = value; buffer[index] = tmp; ``` That transformation might seem valid, but it has some big problems: * It generates UAV reads that are not needed, which could impact performance * It performs read-modify-write operations on memory that the programmer didn't explicitly write, which could create data races The fix here is somewhat obvious: if the "base" of a swizzle operation on a left-hand side resolves to a pointer in our IR, then we can output a "swizzled store" instead of the read-modify-write dance. We currently keep the read-modify-write around since it is potentially needed as a fallback in the general case. Along the way I also tried to make sure that we handle the case where we have a swizzle of a swizzle on the left-hand side: ```hlsl buffer[index].xz.y = value; ``` That code should behave the same as `buffer[index].z = value`. I am currently detecting and cleaning up this logic in the lowering path for `SwizzleExpr`, because that is the only place in the lowering logic that "swizzled l-values" currently get created.
2018-05-02	Add support for explicit register space bindings (#542)	Tim Foley
	This change adds support for specifying explicit register spaces, like: ```hlsl // Bind to texture register #2 in space #1 Texture2D t : register(t2, space1); ``` I added a test case to confirm that the register space is properly propagated through the Slang reflection API. This change also adds proper error messages for some error/unsupported cases that weren't being diagnosed: * Specifying a completely bogus register "class" (e.g., `register(bad99)`) * Failing to specify a register index (`register(u)`) * Specifying a component mask (`register(t0.x)`) * Using `packoffset` bindings I added test cases to cover all of these, as well as the new errors around support for register `space` bindings. In order to get the existing tests to pass, I had to remove explicit `packoffset` bindings from some DXSDK test shaders. None of these `packoffset` bindings were semantically significant (they matched what the compiler would do anyway, for both Slang and the standard HLSL compiler). Removing them is required for Slang now that we give an explicit error about our lack of `packoffset` support. In a future change we might add logic to either detect semantically insignificant `packoffset`s, or to just go ahead and support them properly (as a general feature on `struct` types).
2018-05-02	Fix emit logic when "terminators" occur in the middle of a block (#540)	Tim Foley
	Fixes #527 There were a few problem cases for the IR emit logic. The most obvious, which came up in #527 is that a function body with multiple `return` statements would generate invalid code: ```hlsl int foo() { return 1; int x = 2; return x; } ``` In that case the IR for `foo` would have a single block that has two `return` instructions, which is invalid. Another case that seems to be arising more often, but that had less obvious consequences was when one arm of an `if` statement ends in a `return`: ```hlsl if(a) { return b; } else { int c = 0; } int d = 0; ``` In that case, the `return` instruction for `return b` would be followed by a branch to the end of the `if` (the `int d = 0;` line), because that would be the normal control flow without the early `return`. The fix implemented here is to have the IR lowering logic be a bit more careful on two fronts: 1. When emitting a branch, check if the block we are emitting into has already been terminated, and if so just don't emit the branch (since we are logically at an unreachable point in the CFG. 2. Whenever we are about to emit code for a (non-empty) statement, ensure that the current block being build is unterminated. If the current block is terminated, then start a new one. Case (2) will only matter when there is unreachable code (e.g., in the function `foo()`, the declaration of `x` and the second `return` can never be reached), so I added a warning in that case, and included a test case that triggers the new warning (with a function like `foo()` above).
2018-05-01	Diagnose attempts to write to fields in methods (#530)	Tim Foley
	* Diagnose attempts to write to fields in methods Work on #529 This helps to avoid the case where a Slang user writes a struct with helpful `setter` methods, and finds that it doesn't work as expected because the `this` parameter is currently handled like an `in` parameter (passed by value, but mutable in the callee). Fixing this issue actually involved making a more broad fix to how l-value-ness is propagated. The existing checking logic was assuming that l-value-ness is just a property of a particular member declaration (e.g., a field is either mutable or not), and didn't take into account whether the "base expression" was mutable. This change fixes that oversight, which might lead to additional errors being issued if we aren't correctly making things mutable when we should. A `ThisExpr` was already immutable by default, so that part didn't actually need to change. Just propagating its immutability through was enough. As an additional assistance to users, I have added an extra diagnostic that triggers when a "destination of assignment is not an l-value" error occurs and the left-hand-side expression seems to be based on `this` (whether implicitly or explicitly). This will ideally help users to understand that the "setter" idiom is not yet supported. * Fixed setRadius typo
2018-05-01	Cleanups (#539)	Tim Foley
	* Cleanup: remove unused files from project * Cleanup: move IRModule forward declaration into correct namespace
2018-04-28	Remove unused local variable in vm.cpp (#533)	Jeremie St-Amand
	Unused local variable prevents compiling when warnings are treated as errors
2018-04-25	Fix for global generic parameter substitution (#512)	Tim Foley
	The problem here arises when multiple entry points are compiled in one pass. Each entry point has its own arguments for global generic parameters, and leads to us emitting a `bindGlobalGenericParameter(p, val)`. But once the first entry point's substitutions are applied, the second entry point's code gives `bindGlobalGenericParameter(val, val)` and the compiler crashes (in debug builds) because `val` is not a global generic parameter. This change just applies a quick fix. If we see `bindGlobalGenericParameter(x,y)` during specialization, and `x` is not a global generic parameter, then we skip it. The right long-term fix is to change the compiler's representation of global generic arguments so that they live on a `CompileRequest` instead of an `EntryPointRequest`. That is a more significant change (with impact on the public API), so I'm inclined to leave it as a cleanup for another day (given that no customers are using global generic parameters today).
2018-04-23	Improve SSA promotion for arrays and structs (#521)	Tim Foley
	* Improve SSA promotion for arrays and structs Fixes #518 The existing SSA pass would only handle `load(v)` and `store(v,...)` where `v` is the variable instruction, and would bail out if `v` was used as an operand in any other fashion. The new pass adds support for `load(ac)` where `ac` is an "access chain" with a gramar like: ac :: v \| getElementPtr(ac, ...) \| getFieldAddress(ac, ...) What this means in practical terms is that we can promote a local variable of array or structure type to an SSA temporary even if there are loads of individual elements/fields, as along as any assignment to the variable assigns the whole thing. I've added a test case to confirm that this change fixes passing of arrays as function parameters for Vulkan. * Fixup: disable test on Vulkan because render-test isn't ready This is a fix for Vulkan, but I don't think our testing setup is ready for it. * Fixup: error in unreachable return case, caught by clang * Fixups based on testing These are fixes found when testing the original changes against the user code that originated the bug report. * `emit.cpp`: Make sure to handle array-of-texture types when deciding whether to declare a temporary as a local variable in GLSL output * `ir-legalize-types.cpp`: Make a not of a source of validation failures that we need to clean up sooner or later (just not in scope for this bug fix change). * `ir-ssa.cpp`: * When checking if something is an access chain with a promotable var at the end, make sure the recursive case recurses into the "access chain" logic instead of the leaf case * Add some assertions to guard the assumption that any access chain we apply has been scheduled for removal * Correctly emit an element extract instead of getting an element address when promoting an element access into an array being promoted * Eliminate a wrapper routine that was setting up an `IRBuilder` and use the one from the block being processed in the SSA pass (since it was set up for stuff just like this) * `ir-validate.cpp` * Add a hack to avoid validation failures when running IR validation on the stdlib code. This case triggers for an initializer (`__init`) declaration inside an interface, since the logical "return type" is the interface type itself, which has no representation at the IR level and thus yields a null result type in a `FuncType` instruction.
2018-04-23	Fix successor computation for `switch` instruction (#520)	Tim Foley
	Fixes #519 The code was leaving out the `default` label from the successor list, which would break any passes that require an accurate CFG (with the big one right now being the SSA-formation pass).
2018-04-20	Better diagnostics when compilation is aborted (#517)	Tim Foley
	* Improve messages when compilation is aborted. Make sure to include the information from any `Slang::Exception` that was thrown, so that the poor user can at least point us at our own message string from an assertion failure. This doesn't provide them line-number information in their code or the Slang codebase, so there is still work to be done in making the compiler more friendly about this stuff. * When aborting compilation, try to note what source location we were working on This is handled by having exception handlers on the stack at key bottleneck points in semantic checking and IR generation, which can then emit a diagnostic to note what we were working on when things failed. This is not intended to be an indiciation to the user that their code is at fault for a compiler crash (it is always our fault), but might give them a chance to work around whatever bug is blocking them.
2018-04-20	Diagnose use of an implicit cast as an argument for an `out` parameter (#516)	Tim Foley
	Work on #499 Two big fixes here: * The logic for checking constraints on `out` arguments wasn't actually triggering because it relied on function parameters being given an `OutType` if they are marked `out`, but the code wasn't actually doing that. Fixing the computation of types for functions resolved that issue. * Next, I added a specific diagnostic to follow up the "expected an l-value" error to let the user know that their argument was implicitly converted, and that is why it doesn't count as an l-value in Slang's rules. I've added a test case to ensure that we retain this diagnostic until we can do a true fix for the issue. The right long-term fix is to have an AST representation of all the implicit casts involved (e.g., in both directions for an `inout` parameter), and then have the IR generate explicit code for the conversions in each direction (the `LoweredVal` representation can handle this sort of thing).
2018-04-19	Fix GS cross-compilation after IR type system change (#507)	Tim Foley
	The cross-compilation logic for geometry shaders would look through the user's entry point for calls like `someStream.Emit<X>(val)` and turn that into `outputGlobals = val; EmitVertex();`. It was recognizing the `Emit()` calls by looking at the callee in all `call` instructions and seeing if it was registered to lower to GLSL as `EmitVertex()`. The logic was try to look "through" `specialize` instructions (to deal with the `<X>` bit in the call above), but this wasn't updated for the new IR encoding where the first operand to a `specialize` is the generic being specialized, and not the function nested inside it. The fix here is to properly look through both `specialize` instructions and generics. This is kind of a gross operation and we've done things like it in a few places, so it might be something we try to extract into a utility function in the future.
2018-04-19	Add type legalization support for "field extract" op (#501)	Tim Foley
	The code was handling the "get field address" opcode (which takes a pointer to a struct and returns a pointer to a field), but didn't have a case for values. This was just an oversight.
2018-04-19	Fix up DXR type emission from IR type system (#498)	Tim Foley
	* There was a simple typo where we were emitting `RaytracingAccelerationStructureType` instead of `RaytracingAccelerationStructure` * The IR lowering logic was failing to handle types with an `__intrinsic_type` modifier (which maps them to a single IR opcode) that weren't in one of the various special cases. I added a catch-all case to the handling of `DeclRefType`. This notably affected the `RayDesc` type. * Even if we lower `RayDesc` to an intrinsic type, we still need to lower its fields too, and these were getting emitted with mangled names (as would happen for any user-defined fields). The solution I implemented was to allow for fields to have `__target_intrinsic` modifiers in the stdlib, to specify the un-mangled name they should use on each target. I'm not 100% happy with this solution, because it seems odd to have `RayDesc` be an intrinsic type, but then to also have field keys used in `getField` instructions as if it were an ordinary `struct`. It seems like a better solution would be to have it lower to an IR `struct`, just with an appropriate modifier.
2018-04-18	Fix output of `groupshared` with IR type system (#492)	Tim Foley
	The basic problem was that the lowering logic was constructing (more or less) `Ptr<@GroupShared X>` instead of `@GroupShared Ptr<X>`. There were also problems with passes not propagating through rates that should have been (e.g., legalization). I've added a test case to actually validate `groupshared` support.
2018-04-18	Fix up name mangling/unmangling for extensions (#493)	Tim Foley
	* Fix up name mangling/unmangling for extensions This is required for the unmangling we do on some builtin function names. The work here is mostly just a band-aid, and a more comprehensive pass over the name mangling/unmangling code is required to make any of this robust. * fixup: UNREACHABLE_RETURN argument
2018-04-18	Fix some logic around legalization of sampler types (#496)	Tim Foley
	The main error here was checking for `IRSamplerType` instead of `IRSamplerTypeBase`, which means the relevant logic only triggered for the `SamplerState` type and not the `SamplerComparisonState` type. The two affected places were type legalization (so that comparison samplers in `struct` types weren't being hoisted out) and the emit logic when deciding whether to introduce local temporaries (so we were emitting temporaries for comparison samplers, leading to GLSL errors).
2018-04-13	Propagate diagnostics when imported module has errors (#485)	Tim Foley
	A previous fix avoided crashes when an `import`ed module has errors by making the "failed to import" error a fatal one. Unfortunately, the code path that handles fatal errors was failing to copy diagnostic output from the sink over to the member variable on the `CompileRequest` that exposes the output through the API. This meant that API users lost all context on error messages in `import`ed code. This change fixes the immediate issue by plumbing through the error output, but doesn't fix the more fundamental issue: the front-end should not crash when an `import` fails, by any means.
2018-04-12	Preprocessor cleanups (#484)	Tim Foley
	* For a `#error` or `#warning`, read the rest of the line as raw text to include in the error message * When skipping tokens (e.g., in an `#ifdef`d out block), don't emit errors on invalid characters * TODO: we could clearly get more efficient and skip whole raw lines in the future * Fix an issue when a macro invocation that expands to nothing (zero tokens) is the last thing before a directive. The preprocessor was returning the `#` as an ordinary token, because it has already gone past its test for directives.
2018-04-11	Introduce an IR-level type system (#481)	Tim Foley
	* Introduce an IR-level type system Up to this point, the Slang IR has used the front-end type system to represent types in the IR. As a result (but ultimately more importantly) the IR representation of generics and specialization has used AST-level concepts embedded in the IR. For example, to express the specialization of `vector<T,N>` to a concrete type `float` for `T`, we needed an IR operation that could represent the specialization, with operands that somehow represented the type argument `float`. The whole thing was very complicated. The big idea of this change is to introduce a new representation in which types in the IR are just ordinary instructions, so that using them as operands makes sense. The hierarchy of IR types closely mirrors the AST-side hierarchy for now, and that will probably be something we should maintain going forward. In order to make these changes work, though, I also had to do major overhauls of things like the way substitutions are performed, how we check interface conformances, the way lookup through interface types is done, etc. etc. This is a big change, and unfortunately any attempt to summarize it in the commit message wouldn't do it justice. * Fix 64-bit build warning * Fix up some clang warnings/errors
2018-04-10	Feature/dx12 compute (#482)	jsmall-nvidia
	* Dx12 rendering works in test framework. * Turn on dx12 render tests. * Getting simpler dx12 compute tests to work. * With expected data in test - check for specialized and then for the default, so that multiple test can share the same expected data, but specialized cases can still be set. * Fixed construction and binding on dx12 textures. * Control which render apis used in test from command line. * Small aesthetic fixes in render-test/main.cpp. * Fix binding problem for uavs/srvs dx12. Previously tried to create srv/uav for StorageBuffers (like dx11 does), but the binding breaks as you can end up with two srvs using the same register. First pass at fixing problems with Texture creation for dx12 - assertions were hit with 3d or array textures. * Fixes to improve Dx12 setup shader resource views for cubemaps/arrays. * Fixed d3d12 textureSamplingTest - problem was that cubemap/array textures were not being uploaded correctly. * Changed the order of how binding of constant buffers (as just set on the Renderer) indexes. Previously they were given the lowest indices, but they clashed with the indices from the 'Binding'. Changing this means all tests run on d3d12. * Add code to allow use of warp (although not command line switchable yet). Fix problem setting up raw UAV - as identified by warp. * Added RenderApiUtil - which can detect if a render api is potentially available. * Moved render flag testing/parsing into RenderApiUtil. * Fix signed/unsigned warning. * Fixes around enums prefixed with k on the review of feature/dx12 compute branch.
2018-04-05	Falcor fixes (#479)	Tim Foley
	* Don't emit interpolation modifiers on GLSL fields The previous change that started passing through interpolation modifiers didn't account for the fact that the GLSL grammar doesn't allow interpolation modifiers on `struct` fields, so we shouldn't emit them in that case (and our legalization strategy for GLSL guarantees that varying input will never use a `struct` type anyway). * Try to handle SV_Position semantic on GS input HLSL allows `SV` semantics to be used for ordinary inter-stage dataflow in some cases (e.g., a VS can output `SV_Position` and it can then be read from a GS). GLSL allows similar things with `gl_Position`, but there are some wrinkles. One fix here is to correctly identify that `gl_FragCoord` should only be used as the translation of `SV_Position` for a fragment shader input (and not in the general case of any input). The other "fix" here is a kludge to handle the fact that the right translation for a GS input is not to an array called `gl_Position`, but to the syntax `gl_in[index].gl_Position` (array-of-structs style). I am doing this by attaching a custom decoration to the global variable we create for `gl_Position` and then intercepting it during code emit. I'm not proud of this, and would like to do something better given time. * Fix GLSL output for matrix-scalar multiplication The output logic was assuming that any use of `operator` in the input code that yields a value of matrix type must be translated to a `matrixCompMult()` call in GLSL, but this should really only apply if both of the operands* are matrices (not just based on the result type). As a result matrix-times-scalar operations were emitting a call to `matrixCompMult()` and GLSL was complaining because it won't implicitly promote scalars to matrices.
2018-04-04	Pass AST interpolation modifiers through to codegen. (#475)	Tim Foley
	This is a short-term fix, because we (1) don't have an IR-level representation of interpolation qualifiers, and (2) can't introduce one until after the IR-level type system is introduced (to be able to handle `struct` fields). The approach here is to find the AST-level declaration, either from layout information (in the case of an ordinary variable or function parameter), or from struct field information (because structs are being output from the AST form anyway). I've included a single end-to-end rendering test to confirm that we handle the `nointerpolation` modifier the same as HLSL. I also added the `noperspective` modifier, which seemed to be missing from our implementation.
2018-04-03	Fixes based on review of feature/dx12 PR. (#473)	jsmall-nvidia

2018-04-02	Implement "operator comma" in IR codegen (#472)	Tim Foley
	Fixes #471
2018-04-02	Feature/dx12 (#469)	jsmall-nvidia
	* Fix signed/unsigned comparison warning. * Split out d3d functions that will work across dx11 and 12. * Improve slang-test/README.md around command line options. * Make Guid comparison honor alignment for comparisons, such that mechanism work on architectures that can only do aligned accesses. * Initial setup of D3D12 Renderer, with presentFrame and clearFrame. * More support for D3D12 * Added FreeList * Added D3D12CircularResourceHeap * First attempt at createBuffer * First pass at map/unmap. * First pass binding vertex/constant buffers, and setting up InputLayout. Note that memory is not kept in scope on binding yet. * First pass of D3DDescriptorHeap * Small tidy up in render-d3d11. Added D3DDescriptorHeap to project. * First pass at D3D12 bind state. * Fix typos in D3D12Resource * Tidy up Dx11 render binding a little to match more with Dx12 style. * First pass at Dx12 BindingState * Handling of the command list d3d12. Support for submitGpuWork and waitForGpu. * First attempt at Dx12 capture of backbuffer to file. * First attempt at D3D12 binding for graphics. * D3D12 setup viewport etc - does now render triangle in render0.hlsl. * First pass at support for compute on D3D12Renderer * Use spaces over tabs in D3DUtil * Tabs to spaces in D3D12DescriptorHeap * Convert tab->spaces on render-d3d12.cpp
2018-04-02	Update to top-of-tree glslang (2018-04-02). (#470)	Tim Foley
	This is an attempt to alleviate some driver crashes caused by invalid SPIR-V. Because Slang drives `glslang` with GLSL source code, any invalid output is likely due to `glslang` bugs. I chose top-of-tree for `glslang` because it wasn't clear what their release process is. Hopefully we can go another year without having to update this dependency. The build setup we use for `glslang` had to change to account for one more `#define` that the code expects to have passed in externally.
2018-03-30	Fix several issues discovered by Falcor (#467)	Tim Foley
	Fixes #466 Most of these are Vulkan-related regressions. * Kludge the definition of `GroupMemoryBarrierWithGroupSync()` for GLSL so that it works around parentheses that the emit logic now introduces. * Don't emit `static` for global constants when targetting GLSL * Emit the `flat` modifier for varying input/output with integer type, when targetting GLSL * Avoid checking parameter default-value expressions more than once, because this can crash when the checking introduces syntax that is not expected to appear in the input AST
2018-03-29	Avoid crash when bad argument given to [instance(...)] attribute (#464)	Tim Foley
	Fixes #463 Some of the attributes were failing to check for a `null` result from `checkConstantIntVal`, and so they crashed when a bad expression was used in an attribute. The particular way this had been triggered was that a user put an HLSL geometry shader in the same file with other code, using an entry point like: ```hlsl [instance(COUNT)] void myGeometryShader(...) {...} ``` They then defined `COUNT` as a preprocessor macro when compiling using the GS, but left it undefined otherwise. The result was that the argument to the `instance` attribute would fail to type check, and thus wouldn't count as a constant integer value, so that `checkConstantIntVal` returns `null` and results in the crash. The workaround for the user is to always define `COUNT`, even when not compiling the GS. The fix in the compiler is to guard against `null` in these cases and bail out of attribute checking. I also implemented logic so that `CheckIntegerConstantExpression` (which is invoked by `checkConstantIntVal`) will not produce an additional error message if the underlying expression failed to type check. In this casem the user will get an `undefined identifier: COUNT` error message, and we don't need to waste their time by also telling them that this isn't a compile-time constant expression.
2018-03-29	Change uses of "spire" to "slang" (#461)	Tim Foley
	Fixes #350 When the Slang project forked off from the Spire research effort, we renamed things as we went, but many cases seem to have slipped through the cracks. The two biggest diffs here are: - The `hello` example program was incorrectly talking about what was in the shader file (Slang no longer supports the "module" or "pipeline" constructs from Spire), and so it wasn't just a simple rename. - The files under `tests/bindings` were mistakenly using `__SPIRE__` as a preprocessor guard, which means that they weren't actually testing what they meant to. Luckily, it looks like the relevant functionality didn't regress while these tests were unintentionally deactivated.
2018-03-29	Make IR-based output code cleaner (#458)	Tim Foley
	The big change here is that rather than trying to reproduce the exact line number and indentation of names as they appeared in the original code (which had been appropriate for the AST-to-AST translation strategy), we now emit code from the IR using a simple "pretty-printing" strategy, where indentation is determined by nesting. Along the way I deleted a bunch of dead code in `emit.cpp` that was handling emit from AST declarations/statements/expressions. I probably should have pulled that out into its own change, but doing that now would be tricky. This change also makes it so that we do not emit `#line` directives by default. Asking for `#line` directives in the output will probably become part of a Slang "debug mode" that tries to make the output code suitable for step-through debugging.
2018-03-29	Add support for default parameter values in IR codegen (#459)	Tim Foley
	Fixes #61 When lowering from AST to IR, if a call site doesn't supply an argument expression for each of the parameters to the callee, then use the default value expressions (stored as the "initializer" of the parameter decl) for each omitted parameter. This relies on the front-end to have already checked the call site for validity. Along the way I also cleaned up some of the checking of parameter declarations so that it is more like the checking of ordinary variable declarations (although the code is not yet shared). I also cleaned out some dead cases in the lowering logic for when we don't actually have a declaration available for a callee (these would only matter if we supported functions as first-class values). I added a simple test case to confirm that call sites both with and without the optional parameter work as expected. The strategy in this change is extremely simplistic, and might only be appropriate for default parameter value expressions that are compile-time constants (which should be the 99% case). This may require a major overhaul if we decide to handle default parameter values differently (e.g., by generating extra functions to ensure that the separate compilation story is what we want). Another issue that could change a lot of this logic would be if we start to support by-name parameters at call sites, since we could no longer assume that the argument and parameter lists align one-to-one (with the argument list possibly being shorter). Any work to add more flexible argument passing conventions would need to build a suitable structure to map from arguments to parameters, or vice-versa.
2018-03-28	Merge from v0.9.15 (#460)	Tim Foley
	* Fix bug when subscripting a type that must be split (#396) The logic was creating a `PairPseudoExpr` as part of a subscript (`operator[]`) operation, but neglecting to fill in its `pairInfo` field, which led to a null-pointer crash further along. * Allow writes to UAV textures (#416) Work on #415 This issue is already fixed in the `v0.10.` line, but I'm back-porting the fix to `v0.9.`. The issue here was that the stdlib declarations for texture types were only including the `get` accessor for subscript operations, even if the texture was write-able. I've also included the fixes for other subscript accessors in the stdlib (notably that `OutputPatch<T>` is readable, but not writable, despite what the name seems to imply). * Fix infinite loop in semantic parsing (#424) The code for parsing semantics was looking for a fixed set of tokens to terminate a semantic list, rather than assuming that whenever you don't see a `:` ahead, you probably are done with semantics. This meant that you could get into an infinite loop just with simple mistakes like leaving out a `;`. This change fixes the parser to note infinite loop in this case, and adds a test case to verify the fix. * Expose HLSL `shared` modifier through reflection. (#436) This is a request from Falcor, because the `shared` modifier can be used as a hint to optimize the grouping of parameters for binding. The intention is that `shared` marks shader parameters (including parameter blocks) that will us the same values across many draw calls (e.g., per-frame data, as opposed to per-model or per-instance). The mechanism I'm using here is to provide a general reflection API for exposing the `Modifier`s already attached to declarations. While the only modifier exposed is `shared`, and the only modifier information being exposed is presence/absence, this interface could be extended down the line.
2018-03-26	Unify all generic parameters, even if some mismatch (#454)	Tim Foley
	* Fix decl-ref printing to handling NULL pointers If the underlying decl, or its name is NULL, then use an empty string for the declaration name. This issue was found when debugging, but could bite non-debug cases too, if we ever try to print something like a generic type constraint, which has no name. * Unify all generic parameters, even if some mismatch Fixes #449 The front end tries to infer the right generic arguments to use at a call site using a sloppily implemented "unification" approach. The basic idea is that if you pass a `vector<float,3>` into a function that operates ona `vector<T,N>` where `T` and `N` are generic paameters, then the unification will try to unify `vector<float,3>` with `vector<T,N>` which will lead to it recursively unifying `float` with `T` and `3` with `N`, at which point we have viable values to substitute in for those parameters. Where the existing approach is maybe not quite right is in how it handles obvious unification failures. So if we ask the code to unify, say, `float` with `uint`, it will bail out immediately because those can't be unified. This sounds right superficially, but in some cases with might be calling a function that takes a `vector<float,N>` and passing a `vector<uint,3>` and we'd like to at least get far enough along with unification to see that `N` should be `3` so that the front end can maybe decide to call the function anyway, with some amount of implicit conversion. Over time I've had to modify a lot of the "unification" logic so that it doesn't treat the obvious failures as a hard stop, and instead just returns the failure as a boolean status, but keeps on trying to unify things even after such a failure. When doing unification as part of inference for generic arguments, there will usually be subsequent steps (e.g., type conversions for function aguments) that will catch the type errors that arise. This specific change is to make is so that when unifying the substitutions for a generic decl-ref, we try to unify all the pair-wise arguments, and don't bail out on the first mismatch (so that the `float`-vs-`uint` failure above doesn't lead to us skipping the `3` and `N` pairing). The one case we need to watch out for in all of this is when unification is used to check if an `extension` declaration (which might be generic) is actually application to a concrete type. In that case we obviously don't want an extension for `vector<float,N>` to apply to `vector<uint,3>`, so it is important that the extension case check the return status from the unification logic (or in the future, it could just confirm that the substituted type is equivalent to the original as a post-process...). I've added a test case that reproduces the original failure that surfaced the bug. * fixup: add expected test output
2018-03-26	Renderer resource mangement for render-test (#453)	jsmall-nvidia
	* First pass at resource based renderer using RefObject. * Correct handling of array of buffer pointers to Dx11. * Fix bug with setting viewOut incorrectly in createInputTexture. * More support for allowing com like interfaces. * Added and tidied Slang::Result - adding interface specific results * Guid added comparison support, and made base interface IComUnknown - with lowerCamel methods
2018-03-22	Tidy up of Renderer (#452)	jsmall-nvidia
	* Fixed some small typos in api-users-guide.md * Fix some small typos in slang-test/main.cpp, render-test/render-d3d11.cpp * Remove exit() calls from test code. Added Slang::Result, which works in the same way as COM HRESULT. * FIx bug introduced when moving to Slang::Result - handling E_INVALIDARG on Dx11. * Fix the testing of feature levels on Dx11 renderer. * First attempt at README.md for slang-test. * Tidied up the slang-test README.md file. * Fix some small typos in tools/slang-test/main.cpp * Fix spaces -> tabs problems. Fix some small types. * Refactor Renderer implementations such that: * Class definition does not contain long implementation/s * Removed unused globals * Ordered implementation after class definition * Made renderer specific classes child classes, and use Impl postfix to differentiate * Converted tabs into spaces * First pass at Slang::ComPtr. Added slang-defines.h which sets up some fairly commonly used defines such as SLANG_FORCE_INLINE, compiler detection, os detection, and some other cross platform features. * * Fixed bug in vk renderer - where features structure not initialized on hkCreateDevice * Make member variables in Renderer implementations use prefix * Updated test README.md to document that free parameter can control what test is run * * changed setClearColor to take an array of 4 floats to make API clearer on usage * mix of type usage style - defaulted to more conventional style * * Fixed swapWith * Use SLANG_FORCE_INLINE * Don't bother initializing List data when type is POD * Added convenience macro for Result handling SLANG_RETURN_NULL_ON_ERROR
2018-03-22	Add support for DirectX Raytracing (DXR) (#451)	Tim Foley
	* Add support for DirectX Raytracing (DXR) This is an initial pass to add support to Slang for the shader stages introduced by DirectX Raytracing (DXR). * Add declarations for DXR intrinsic types and functions to the Slang standard library. The way our compilation works, these will then get propagated through the IR as intrinsics and get spit back out again as-is during HLSL code emission. * Declare the DXR-related stages. This is the main work that affects the compiler's C++ implementation rather than being something we can add via the standard library today. * Switch around the encoding of the `Profile` type so that the stage is in the low bits, allowing API users to pass an ordinary `SlangStage` to operations that expect a `SlangProfileID`. - This represents a direction I'd like to push in long term, where the user specifies stage and "feature level" separately rather than using composite profiles like `vs_6_0`. The introduction of these new stages seems like a good point to try and make a clean break here and not introduce, e.g., `rgs_6_1` for ray generatin shaders. * Upgrade "effective profile" computation so that it advances the required version based on the specified stage (e.g., DXR stages seem to require at least shader model 6.1). - This is a bit of a kludge overall, but ideally we don't want a typical user to have to think about "feature level" stuff much at all. The ideal workflow is that they just hand us a source file and we work out entry points and their required feature levels in the compiler (and let the user query it when we are done). Until we implement that for real, stopgaps like this are required. Overall these are relatively small changes for supporting some major new API behavior. Slang's design helps out here, by allowing a lot of things to be specified in the stdlib (including generic intrinsic functions), but some of this is also owed to the DXIL-influenced design of DXR - e.g., the use of global functions in place of `SV_` semantics. fixup: typos * Fixup: use `pixel` instead of `fragment` as primary stage name This is to match HLSL conventions when generating output code, even if the Slang project officially favors the more correct term "fragment shader."
2018-03-21	First pass impls on ComPtr and reorganise Renderer (#450)	jsmall-nvidia
	* Fixed some small typos in api-users-guide.md * Fix some small typos in slang-test/main.cpp, render-test/render-d3d11.cpp * Remove exit() calls from test code. Added Slang::Result, which works in the same way as COM HRESULT. * FIx bug introduced when moving to Slang::Result - handling E_INVALIDARG on Dx11. * Fix the testing of feature levels on Dx11 renderer. * First attempt at README.md for slang-test. * Tidied up the slang-test README.md file. * Fix some small typos in tools/slang-test/main.cpp * Fix spaces -> tabs problems. Fix some small types. * Refactor Renderer implementations such that: * Class definition does not contain long implementation/s * Removed unused globals * Ordered implementation after class definition * Made renderer specific classes child classes, and use Impl postfix to differentiate * Converted tabs into spaces * First pass at Slang::ComPtr. Added slang-defines.h which sets up some fairly commonly used defines such as SLANG_FORCE_INLINE, compiler detection, os detection, and some other cross platform features.
2018-03-20	SlangResult and small bug/typos fixes (#448)	jsmall-nvidia
	* Fixed some small typos in api-users-guide.md * Fix some small typos in slang-test/main.cpp, render-test/render-d3d11.cpp * Remove exit() calls from test code. Added Slang::Result, which works in the same way as COM HRESULT. * FIx bug introduced when moving to Slang::Result - handling E_INVALIDARG on Dx11. * Fix the testing of feature levels on Dx11 renderer. * First attempt at README.md for slang-test. * Tidied up the slang-test README.md file. * Fix some small typos in tools/slang-test/main.cpp * Fix spaces -> tabs problems. Fix some small types.
2018-03-19	Entry point attribute (#447)	Tim Foley
	* Typo * Add [shader(...)] and clean up some literal handling * Add supporting for validating the `[shader(...)]` attribute, by checking that its argument is a string literal that names a known shader stage. * Split the `ConstantExpr` class into distinct subclasses rooted at `LiteralExpr`, so we have `BoolLiteralExpr`, `IntegerLiteralExpr`, `FloatingPointLiteralExpr`, and `StringLiteralExpr` * Add a `String` type to the stdlib, to be used as the type of a string literal. This change allows code using `[shader(...)]` to be accepted by the front-end again, but it does nothing about emitting it in final HLSL. * Allow entry points to be specified via [shader(...)] Before this change, the compiler would track a list of `EntryPointRequest` objects, based on what the suer specified via API and/or command-line options. Each entry point request would get matched up with an AST `FuncDecl` as part of semantic checking, and then the back end steps (layout, codegen, etc.) would work from that information. This change makes the compiler modal, in that it can either continue to use an explicit list of entry point requests (this is the mode when the list is non-empty), or it can rely on user-supplied attributes on entry point functions to drive codegen (this is the mode when the list is empty). User-specified `[shader(...)]` attributes are processed at the same place where the association from `EntryPointRequest`s to `FuncDecl`s would otherwise be made, and basically does the same thing in the opposite direction: looks for `FuncDecl`s with the appropriate attribute and synthesizes an `EntryPointRequest` for them. Subsequent processing should ideally not know where a given `EntryPointRequest` came from, and should handle both methods of specifying the entry points equivalently. One design choice that might not make immediate sense is that we do not process a function as an entry point (applying further validation, etc.) just because it has a `[shader(...)]` modifier, unless we are in the appropriate mode (which in this case is the mode where the user didn't specify their own entry points via API or command line). This is to handle cases where the user wants to explicitly compile only one entry point, so that they (1) don't want us to spend time validating code they don't care about, (2) don't want do get output they don't expect, and (3) might actually be presenting us with code that violates the language rules due to a combination of `#define`s in effect (e.g., they might have a `[shader("vertex")]` function that transitively executes a `discard` because of how the preprocessor was configured, but they don't care because they are compiling a fragment entry point). This decision might be something we revisit over time. As part of this work, I had to add some logic to pick a "profile version" to use for a combination of a target and stage (because when you specify `[shader("vertex")]` the compiler can't tell if you want `vs_5_0`, `vs_5_1`, etc.). This isn't really complete right now, because something like `-target dxbc` also doesn't determine a profile, so there is a bit of a kludge at present. We need to figure out a good long-term plan here, which might involve keeping target format, feature level/version, and pipeline stage as truly orthogonal concepts, rather than conflating them. That would involve more work in the API and command-line layers to de-compose things when the user specifies, e.g., `vs_5_1`, but might make downstream logic easier to manage. * Emit [shader(...)] attribute on entry point for SM 6.1 and later This should help ensure that the output from Slang can be compiled with dxc `lib_` profiles. Fix warning
2018-03-16	Small bug fixes. (#445)	Yong He
	This commit contains two small bug fixes: 1. In `specializeProgramLayout`, we cannot assume the resourceInfo entries in a varLayout and its corresponding type layout has the same order. Should use `FindResourceLayout`. 2. When generating ir for a switch statement, make sure to remove the breakLabel from the shared context when done. For some reason if a switch statement is being lowered twice, the Dictionary::Add method will complain the statement key already exists.
2018-03-16	Overhaul implementation of [attributes] (#443)	Tim Foley
	The existing code parsed all of the square-bracket `[attributes]` into `HLSLUncheckedAttribute`, and then went on to hand-convert some of them to specialized subclasses of `HLSLAttribute`. When attributes didn't check, they were left as-is, and no error message was issued, because at the time the compiler was focused on accepting arbitrary input. This change greatly overhauls the handling of `[attributes]`. Attributes are now declared in the stdlib, with declarations like: ```hlsl __attributeTarget(LoopStmt) attribute_syntax [unroll(count: int = 0)] : UnrollAttribute; ``` In this syntax, the `unroll` part is giving the attribute name (the `[]` are just for flavor, to make the declaration look like a use site; we could drop it if we don't like the clutter), the `count` is a parameter of the attribute, which we expect to be of type `int`, and which has a default value of `0` if unspecified. The `: UnrollAttribute` part specifies the meta-level C++ class that will implement this attribute (and corresponds to a class in `modifier-defs.h`). This syntax is similar to our current `syntax` declarations. I'm starting to think we should change it to something like a `__meta_class(UnrollAttribute)` modifier, and then use that uniformly across all cases (e.g., also replacing the curreent `__magic_type(Foo)` syntax). The `__attributeTarget(LoopStmt)` is a modifier that specifies the meta-level C++ class for syntax that this attribute is allowed to attach to. It is legal to have more than one of these. Attributes continue to be parsed in an unchecked form, so that we don't tie up semantic analysis and parsing more than necessary. During checking, we look up the attribute name in the current scope, and then replace the unchecked attribute with a more specific one if the checking passes. Checking proceeds in generic and attribute-specific phases. The generic phase includes checking the number of arguments against those specified in the attribute declaration (I don't currently check types, or handle default arguments), and then checking that at least one `__attributeTarget(...)` modifier applies to the syntax node being modified. The attribute-specific phase then applies to the specialized C++ subclass of `Attribute`, and does the actual checking right now (e.g., that step is responsible for actually type-checking things at present). This can obviously be improved over time. With this support I went ahead and added declarations for all the HLSL attributes I could find documented on MSDN. I also added a provisional declaration for the `[shader(...)]` attribute that has been added to dxc, but which is not yet documented. One important detail here is that lookup of attribute names needs to be done carefully, so that we don't let, e.g., local variables shadow an attribute declaration: ```hlsl int unroll = 5; // This attribute should not get confused by the local variable `unroll` [unroll] for(...) { .. } ``` The lookup logic already has a notion of a `LookupMask` that can be used to filter declarations out of the result. In this change I surfaced that mask through the main lookup API (rather than requiring a second pass to "refine" lookup results), and made is so that the default lookup mask does not include attributes, while an explicit mask can be used to look up only attributes. (An alternatie design we discussed was to follow the approach of C# and have the declaration of an attribute like `[unroll]` actually be `unrollAttribute`, with a suffix. I decided not to follow that approach for now because it seemed like printing good error messages in that case could require us to carefully trim the `Attribute` suffix off of names at times, and using the existing mask behavior seemed simpler.) To verify that the shadowing behavior is indeed correct, I modified the `loop-unroll.slang` test case. Smaller notes: * Removed the `HLSL` prefix from several of the C++ attribute classes * Made sure to actually validate the modifiers on statements * Special-cased checking for `ParamDecl` with a null type, because I'm re-using `ParamDecl` for attribute parameters, but can't give a concrete type to some of them right now * Deleting some old, dead emit-from-AST logic around attributes, rather than try to "fix" code that doesn't run (a more complete scrub of that code is still needed) * Fixed AST inheritance hierarchy so that a `Modifier` is a `SyntaxNode` rather than a `SyntaxNodeBase`. I have no idea why we have both of those, and we need to clean that up soon.