slang.git - Making it easier to work with shaders

	Commit message	Author	Age
...
*	Parser changes to improve handling of static method calls (#1290)	Tim Foley	2020-03-24
Static Method Calls
-------------------

The main fix here is for parsing of calls to static methods. Given a type like:

    struct S { void doThing(); }

the parser currently gets tripped up on a statement like:

    S::doThing();

The problem here is that the `Parser::ParseStatement` routine was using the first token of lookahead to decide what to do, and in the case where it saw a type name it assumed that must mean the statement would be a declaration.

It turns out that `Parser::ParseStatement` already had a more intelligent bit of disambiguation later on when handling the general case of an identifier (for which it couldn't determine the type-vs-value status at parse time), and simply commenting out the special-case handling of a type name and relying on the more general identifier case fixes the issue.

That catch-all case still has some issues of its own, and this change expands on the comments to make some of those issues clear so we can try to address them later.

Empty Declarators
-----------------

One reason why the static method call problem was hard to discover was that it was masked by the parser allowing empty declarators. That is, given input like:

    S::doThing();

This can be parsed as a variable declaration with a parenthesized empty declarator `()`.

Practically, there is no reason to support empty declarators anywhere except on parameters, and allowing them in other contexts could make parser errors harder to understand.

This change makes the choice of whether or not empty declarators are allowed something that can be decided at each point where we parse a declarator, and makes it so that only parsing of parameters opts in to allowing them.
By disabling support for empty declarators in contexts where they don't make sense, we make code like the above a parse error when it appears at global scope, rather than a weird semantic error.

A more complete future version of this change might also make support for parenthesized declarators an optional feature, or remove that support entirely. Slang doesn't actually support pointers (yet) so there is no real reason to allow parenthesized declarators right now.

One note for future generations is that using an empty declarator on a parameter of array type can actually create an ambiguity. If the user writes:

    void f(int[2][3]);

did they mean for it to be interpreted as:

    void f(int a[2][3]);

or as:

    void f(int[2][3] a);

or even as:

    void f(int[2] a[3]);

The first case there yields a different type for `a` than the other two, but is also what we pretty much have to support for backwards compatibility with HLSL. Requiring all function declarations to include parameter names would eliminate this potentially confusing case.

Layout Modifiers
----------------

One of the above two syntax changes led to a regression in the output for a diagnostic test for `layout` modifiers (which are a deprecated but still functional feature from back when `slangc` supported GLSL input).

The original output of the test case seemed odd, and when I looked at the parsing logic I saw that an early-exit error case was leading to spurious error messages because it failed to consume all the tokens inside the `layout(...)`.

Fixing the logic to not use an early-exit (and instead rely on the built-in recovery behavior of `Parser`) produced more desirable diagnostic output. I changed the input file to put the `binding` and `set` specifiers on different lines so that the error output could show that the compiler properly tags both of the syntax errors.
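The statement-level disambiguation described in this commit can be illustrated with a toy classifier. This is only a minimal sketch under stated assumptions, not the logic in `Parser::ParseStatement`: the `Tok` and `StmtKind` enums and the `classifyStatement` function are hypothetical names, and the real parser handles many more cases.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical token kinds; a real lexer has many more.
enum class Tok { TypeName, Identifier, Scope, LParen, RParen, Semicolon };

enum class StmtKind { Declaration, Expression };

// Decide whether a statement is a declaration or an expression statement.
// Seeing a type name first is not enough: skip any `::name` suffixes and
// look at the token that follows them.
StmtKind classifyStatement(const std::vector<Tok>& tokens)
{
    if (tokens.empty() || tokens[0] != Tok::TypeName)
        return StmtKind::Expression;

    std::size_t i = 1;
    bool sawScope = false;
    while (i + 1 < tokens.size() && tokens[i] == Tok::Scope &&
           (tokens[i + 1] == Tok::Identifier || tokens[i + 1] == Tok::TypeName))
    {
        sawScope = true;
        i += 2;
    }

    // `S::doThing(` is a call, while `S x` is a declaration.
    if (sawScope && i < tokens.size() && tokens[i] == Tok::LParen)
        return StmtKind::Expression;
    if (i < tokens.size() && tokens[i] == Tok::Identifier)
        return StmtKind::Declaration;
    return StmtKind::Expression;
}
```

With this sketch, the token sequence for `S::doThing();` classifies as an expression, while `S x;` classifies as a declaration.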
*	Fix some bad behavior around static methods (#1289)	Tim Foley	2020-03-24
These are steps towards a fix for the problem of not being able to call a static method as follows:

    SomeType::someMethod();

One problem in the above is that the parser gets confused and parses an (anonymous!) function declaration. This change doesn't address that problem, but does fix the problem that when checking fails to coerce `SomeType::someMethod` into a type it was triggering an unimplemented-feature exception rather than a real error message.

Another problem was that if the above is re-written to try to avoid the parser bug:

    (SomeType::someMethod)();

we end up with a call where the base expression (the callee) is a `ParenExpr` and the code for handling calls wasn't expecting that. Instead, it sent the overload resolution logic into an unimplemented case that was bailing by throwing an arbitrary C++ exception instead of emitting a diagnostic.

This latter issue was fixed in two ways. First, the code path that failed to emit a diagnostic now emits a reasonable one for the unimplemented feature (this still ends up being a fatal compiler error). Second, we properly handle the case of trying to call a `ParenExpr` by unwrapping it and using the base expression instead, so that `(<func>)(<args>)` is always treated the same as `<func>(<args>)`.
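The callee-unwrapping step can be sketched as below. This is an illustration with hypothetical, heavily simplified node types; only the name `ParenExpr` comes from the commit message, and the real AST classes and call-checking code are not shown here.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical, much-simplified expression nodes.
struct Expr
{
    virtual ~Expr() = default;
};

struct NameExpr : Expr
{
    std::string name;
    explicit NameExpr(std::string n) : name(std::move(n)) {}
};

struct ParenExpr : Expr
{
    std::shared_ptr<Expr> base;
    explicit ParenExpr(std::shared_ptr<Expr> b) : base(std::move(b)) {}
};

// Strip any number of enclosing parentheses from a callee so that
// `(f)(args)` and `((f))(args)` are handled exactly like `f(args)`.
std::shared_ptr<Expr> unwrapParens(std::shared_ptr<Expr> expr)
{
    while (auto paren = std::dynamic_pointer_cast<ParenExpr>(expr))
        expr = paren->base;
    return expr;
}
```

Overload resolution would then run on the result of `unwrapParens(callee)` instead of on the callee itself.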
*	Handling of switch with empty body (#1284)	jsmall-nvidia	2020-03-20
* Added handling for empty switch body. Added test for empty switch.
* Fix testing for case in switch.
*	Yet more definitions moved into the stdlib (#1263)	Tim Foley	2020-03-09
The only big catch that I ran into with this batch was that I found the `float.getPi()` function was being emitted to the output GLSL even when that function wasn't being used. This seems to have been a latent problem in the earlier PR, but was only surfaced in the tests once a Slang->GLSL test started using another intrinsic that led to the `float : __BuiltinFloatingPointType` witness table being live in the IR.

The fix for the gotcha here was to add a late IR pass that basically empties out all witness tables in the IR, so that functions that are only referenced by witness tables can then be removed as dead code. This pass is something we should not apply if/when we start supporting real dynamic dispatch through witness tables, but that is a problem to be solved on another day.

The remaining tricky pieces of this change were:

* Needed to remember to mark functions as target intrinsics on HLSL and/or GLSL as appropriate (hopefully I caught all the cases) so they don't get emitted as source there.
* The `msad4` function in HLSL is very poorly documented, so filling in its definition was tricky. I made my best effort based on how it is described on MSDN, but it is likely that if anybody wants to rely on this function they will need us to vet our results with some tests.
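The interaction between witness tables and dead-code elimination can be modeled with a toy module. This is a rough sketch under assumptions: the `Module` struct, `emptyWitnessTables`, and `liveFunctions` are hypothetical names, and the real IR passes operate on instructions rather than strings.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy module: each function lists the functions it calls, and
// each witness table lists the functions that satisfy its requirements.
struct Module
{
    std::map<std::string, std::vector<std::string>> calls;
    std::map<std::string, std::vector<std::string>> witnessTables;
};

// Late pass: drop every witness-table entry so that tables stop keeping
// functions alive.
void emptyWitnessTables(Module& module)
{
    for (auto& table : module.witnessTables)
        table.second.clear();
}

// Functions reachable from the entry point or from a witness-table entry.
std::set<std::string> liveFunctions(const Module& module, const std::string& entry)
{
    std::vector<std::string> work{entry};
    for (const auto& table : module.witnessTables)
        work.insert(work.end(), table.second.begin(), table.second.end());

    std::set<std::string> live;
    while (!work.empty())
    {
        std::string fn = work.back();
        work.pop_back();
        if (!live.insert(fn).second)
            continue; // already visited
        auto it = module.calls.find(fn);
        if (it != module.calls.end())
            work.insert(work.end(), it->second.begin(), it->second.end());
    }
    return live;
}
```

In this model a function such as `float.getPi` that is referenced only from a witness table stays live until `emptyWitnessTables` runs, after which a reachability-based DCE no longer sees it.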
*	Fix a crash when a generic value argument isn't constant (#1241)	Tim Foley	2020-02-25
This arose when a user tried to specialize the DXR 1.1 `RayQuery` type to a local variable:

```hlsl
RAY_FLAG rayFlags = RAY_FLAG_CULL_FRONT_FACING_TRIANGLES | RAY_FLAG_CULL_NON_OPAQUE;
RayQuery<rayFlags> query;
```

In this case, we issued an error around `rayFlags` not being a constant as expected, but then we also crashed later on in checking because the `DeclRef` that was being used for the type had a null pointer for the generic argument corresponding to `rayFlags`.

The main fix here was thus to add an `ErrorIntVal` case that can be used to represent something that should be an `IntVal` but where there was some kind of error in the input code so that the actual value isn't known to the compiler.

A secondary fix here is that we were issuing error messages about expecting a constant for a parameter like `rayFlags` there twice, and one of those times was during the `JustChecking` part of overload resolution (when we are not supposed to emit any diagnostics). I fixed that up by allowing the `DiagnosticSink` to use to be passed down explicitly (and allowing it to be null), while also leaving behind overloaded functions with the old signatures so that all the existing logic can continue to work unmodified.
*	* Fix for unary - on glsl (#1222)	jsmall-nvidia	2020-02-13
* Test to check fix
*	Improve behavior when undefined identifier is a contextual keyword (#1200)	Tim Foley	2020-02-05
The HLSL language has keywords with very common names like `triangle`, and Slang doesn't want to preclude users from using such names for their variables/functions/etc. In addition, Slang adds new keywords on top of HLSL (like `extension`) and we don't want those to prevent us from compiling existing code. As a result, almost all keywords in Slang are contextual keywords, and they can be shadowed by user variables.

The down-side to making all keywords contextual is that in a case like this:

```
int test() { return triangle; }
```

The identifier `triangle` is not undefined as far as lookup is concerned (it is defined as a modifier keyword), so the existing "undefined identifier" logic gets bypassed, and instead we ran into an internal compiler error trying to construct an expression that refers to a modifier keyword.

Fortunately, the internal compiler error in that case was overkill, and the compiler already had defensive logic to produce an expression with an error type if it couldn't figure out what the type of a declaration reference should be.

The main fix here is thus to emit an "undefined identifier" error instead of an internal compiler error at the point where we see an attempt to reference a declaration that shouldn't be available in an expression context. In order to improve the quality of the diagnostic, the code for constructing declaration references was updated to pass along a source location to be used in error messages.
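The behavior described here can be sketched with a toy scope lookup. This is an illustration only: `DeclKind`, `LookupResult`, and `resolveInExpr` are hypothetical names, and shadowing is modeled simply by overwriting the map entry rather than by nested scopes.

```cpp
#include <map>
#include <string>

// Hypothetical declaration categories for a toy scope.
enum class DeclKind { Variable, Function, ModifierKeyword };

enum class LookupResult { Value, UndefinedIdentifier };

// Resolve a name used in expression position. A name that is only known
// as a contextual keyword is reported as undefined, the same as a name
// that is not found at all, instead of triggering an internal error.
LookupResult resolveInExpr(const std::map<std::string, DeclKind>& scope,
                           const std::string& name)
{
    auto it = scope.find(name);
    if (it == scope.end())
        return LookupResult::UndefinedIdentifier;
    if (it->second == DeclKind::ModifierKeyword)
        return LookupResult::UndefinedIdentifier;
    return LookupResult::Value;
}
```

In this model, `triangle` resolves to an undefined-identifier result until user code declares a variable with that name.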
*	CUDA/C++ backend improvements (#1198)	jsmall-nvidia	2020-02-04
* WIP with vector float test.
* vector-float test working.
* Fixed remaining tests broken with init changes.
* Improve 64bit-type-support.md
* Disable tests broken on CI system for Dx.
* WIP: Make type available for comparison.
* Moved type conversion into TypeTextUtil.
* Add text/type conversions from DownstreamCompiler to TypeTextUtil.
* Allow comparison taking into account type.
* Removed quantize in vector-float.slang test.
*	Feature/test for double behavior (#1186)	jsmall-nvidia	2020-01-29
* Split out binding writing.
* Pass in the entry type.
* Take into account output type with -output-using-type. Added GPULikeBindRoot. Added dxbc-double-problem test.
* Add the dxbc-double-problem test.
*	Remove support for explicit register/binding syntax on TEST_INPUT (#1132)	Tim Foley	2019-11-21
The `TEST_INPUT` facility allows textual Slang test cases to provide two kinds of information to the `render-test` tool:

1. Information on what shader inputs exist
2. Information on what values/objects to bind into those shader inputs

Under the first category of information, there exists support for attaching a `dxbinding(...)` annotation to a `TEST_INPUT` which seemingly indicates what HLSL `register` the input uses. There is a similar `glbinding(...)` annotation, used for OpenGL and Vulkan.

It turns out that these annotations were, in practice, completely ignored and had no bearing on how `render-test` allocates or binds graphics API objects. There was some amount of code attempting to validate that explicit registers/bindings were being set appropriately, but the actual values were being ignored.

The visible consequence of the `dxbinding` and `glbinding` annotations being ignored is issue #1036: the order of `TEST_INPUT` lines was de facto determining the registers/bindings that were being used by `render-test`.

This change simply removes the placebo features and strips things down to what is implemented in practice: the `TEST_INPUT` lines do not need target-API-specific binding/register numbers, because their order in the file implicitly defines them. I added logic to the parsing of `TEST_INPUT` lines to make sure I got an error message on any leftover annotations, and went ahead and systematically deleted all of the placebo annotations from our test cases.

If we decide to make `TEST_INPUT` lines not depend on order of declaration in the future, we can build it up as a new and better considered feature.
The main alternative I considered was to keep the annotations in place, and change `render-test` and the `gfx` abstraction layer to properly respect them, but that path actually creates much more opportunity for breakage (since every single test case would suddenly be specifying its root signature / pipeline layout via a different path using data that has never been tested). The approach in this change has the benefit of giving me high confidence that all the test cases continue to work just as they had before.
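One way to picture "order in the file implicitly defines the bindings" is the sketch below. It is an assumption-laden illustration, not the `render-test` implementation: the per-kind counter scheme, the kind names, and the `assignBindings` function are all hypothetical.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Assign each input the next free index within its resource kind, in
// declaration order. The kind names ("texture", "sampler", ...) are
// hypothetical stand-ins for whatever categories the tool distinguishes.
std::vector<std::pair<std::string, int>>
assignBindings(const std::vector<std::string>& inputKinds)
{
    std::map<std::string, int> nextIndex;
    std::vector<std::pair<std::string, int>> result;
    for (const auto& kind : inputKinds)
        result.emplace_back(kind, nextIndex[kind]++);
    return result;
}
```

Under this model, reordering two `TEST_INPUT` lines of the same kind swaps the indices they receive, which matches the order dependence the commit message describes.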
*	Fix problem when getting default value for a bool, was producing 0, which on glsl could not be coerced. (#1117)	jsmall-nvidia	2019-11-08
*	Strip IR after front-end steps are done (#1092)	Tim Foley	2019-10-24
* Strip IR after front-end steps are done

The main feature of this change is to unconditionally strip out the `IRHighLevelDeclDecoration`s in an IR module once the "mandatory" IR passes in the front end have run. This ensures that later IR passes (e.g., code emission) cannot rely on AST-level information to get their job done.

Since I was already writing a pass to remove some instructions at the end of the front-end passes, I went ahead and also made the `-obfuscate` flag apply to the front-end IR generation by causing it to strip `IRNameHintDecoration`s while it is doing the other stripping. With this, the main identifying information left in IR modules (other than semantics and entry-point names) is mangled name strings for imported/exported symbols.

A few other things got changed along the way:

* Removed the `.expected` file for one of the tests, where that file seemingly shouldn't have been checked in at all.
* Updated the signature of the DCE pass both so that it doesn't require a back-end compile request (it wasn't using it anyway), and so that it takes some options to decide whether to keep symbols marked `[export(...)]` alive (the front-end wants to keep these, while back-end passes currently need to be able to eliminate them).
* Moved the `obfuscateCode` flag from the back-end compile request to the base class shared between front- and back-end requests, and updated the options and repro logic to set both as needed. An obvious improvement in the future would be to have the front- and back-end requests share these settings by referencing a single common object in the end-to-end case, rather than each having their own copy.
* Removed logic that was keeping layout instructions alive in DCE, even if they weren't used. This seems to have been a vestige of an intermediate step between AST and IR layout.

* fixup: add the new files
*	Fix a typo in core.meta.slang which was causing an assert when (#1024)	Robert Stepinski	2019-08-16
compiling shaders that used texture2DMS Load() operations
*	Fix issue with outputting "static" in GLSL (#1006)	Tim Foley	2019-07-29
This appears to be a regression introduced in #1001, and missed because none of our existing tests covered `static const` arrays on the GLSL/SPIR-V targets.

The basic problem is that we cannot output a `static const` definition in GLSL because `static` is a reserved word and not a keyword. Instead for GLSL we just want a `const` array.

This change makes the emission of `static` for global-scope constants key on the target language for code generation, and only emit it for HLSL, C, and C++.

This change also adds a test case specifically for running Slang input that has a `static const` array on the Vulkan target.
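The target-keyed choice can be sketched as follows. This is a minimal illustration, assuming a hypothetical `Target` enum and `globalConstPrefix` helper; the actual emit code is structured differently.

```cpp
#include <string>

// Hypothetical set of code generation targets.
enum class Target { HLSL, GLSL, C, CPP };

// Prefix for a global-scope constant. `static` is only emitted for
// targets where it is valid, since GLSL reserves the word.
std::string globalConstPrefix(Target target)
{
    switch (target)
    {
    case Target::HLSL:
    case Target::C:
    case Target::CPP:
        return "static const ";
    case Target::GLSL:
        return "const ";
    }
    return "const "; // conservative fallback for unknown targets
}
```

A global array would then be emitted as `globalConstPrefix(target)` followed by its type, name, and initializer.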
*	Fix bitwise And & Or for scalar bool (#960)	Robert Stepinski	2019-05-01
* Convert bitwise Or & And to logical operations on scalar bools
* Test bitwise operations on scalar bools
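The operator conversion named in the first bullet can be sketched like this. It is an assumption-based illustration: the `selectBinaryOp` function and its boolean flag are hypothetical, and the commit message does not say where in the compiler the conversion happens.

```cpp
#include <string>

// Pick the operator to emit for `&` or `|`. When both operands are scalar
// `bool`, the logical form is used instead of the bitwise one.
std::string selectBinaryOp(const std::string& op, bool operandsAreScalarBool)
{
    if (operandsAreScalarBool)
    {
        if (op == "&") return "&&";
        if (op == "|") return "||";
    }
    return op;
}
```

Integer operands keep the bitwise operator unchanged in this sketch.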
*	Enable appropriate GLSL extension for unbounded-size resource arrays (#957)	Tim Foley	2019-04-29
Fixes #941

The GLSL we were emitting for unbounded-size arrays was the obvious:

```hlsl
// This HLSL:
Texture2D t[];
```

```glsl
// ... becomes this GLSL:
texture2D t[];
```

Unfortunately, the legacy GLSL behavior for an array without a declared size is what is called an "implicitly-sized" array, which means that it is assumed to actually have a fixed size, which is determined by the maximum integer constant value used to index into it (and only integer constants are allowed to be used when indexing into it).

Users hadn't noticed the issue for a while, because most of our users who rely on unbounded-size arrays were also using the HLSL `NonUniformResourceIndex` function:

```hlsl
float4 v = t[NonUniformResourceIndex(idx)].Sample(...);
```

When mapping such code to GLSL we use the `nonuniformEXT` qualifier added by the `GL_EXT_nonuniform_qualifier` extension, and it turns out that a secondary feature of that extension is that it changes the GLSL language semantics for arrays (of resources) with an unspecified size, so that they instead behave like we want. So users were happy and we were blissfully ignorant of the lurking issue.

The problem is that as soon as a user neglects to use `NonUniformResourceIndex` (perhaps because an index really is uniform):

```hlsl
cbuffer C { uint definitelyUniform; }
...
float4 v = t[definitelyUniform].Sample(...);
```

Now the code we emit doesn't need `nonuniformEXT` so it doesn't enable `GL_EXT_nonuniform_qualifier` and the declaration of `t` now falls under the "implicitly-sized" array rules, and thus the code fails because `definitelyUniform` is being used as an index but is not an integer constant.

The fix is pretty simple: when emitting a declaration of a global shader parameter to GLSL, we check if it is an unbounded-size array of resources and, if so, enable the `GL_EXT_nonuniform_qualifier` extension.
We don't need any clever handling to deal with resource parameters nested in `struct` types or in entry-point parameter lists, etc., because previous IR passes will have split up complex types and moved everything to the global scope already.
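The check described in the fix can be sketched as a scan over global parameters. This is a simplified illustration: the `GlobalParam` struct and `requiredExtensions` function are hypothetical, and only the extension name `GL_EXT_nonuniform_qualifier` comes from the commit message.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of a global shader parameter after earlier passes
// have moved everything to global scope.
struct GlobalParam
{
    std::string name;
    bool isResource;
    bool isUnboundedArray;
};

// Collect the GLSL extensions required by the parameter declarations
// alone, independent of how the parameters are indexed later.
std::set<std::string> requiredExtensions(const std::vector<GlobalParam>& params)
{
    std::set<std::string> extensions;
    for (const auto& param : params)
    {
        if (param.isResource && param.isUnboundedArray)
            extensions.insert("GL_EXT_nonuniform_qualifier");
    }
    return extensions;
}
```

Because the decision depends only on the declaration, the extension is enabled even when every index into the array is uniform.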
*	Allow plugging in types with resources for interface parameters (#913)	Tim Foley	2019-03-26
* Allow plugging in types with resources for interface parameters

The key feature enabled by this change is that you can take a shader declared with interface-type parameters:

```hlsl
ConstantBuffer<ILight> gLight;
float4 myShader(IMaterial material, ...) { ... }
```

and specialize its interface-type parameters to concrete types that can contain resources like textures, samplers, etc.

The hard part of doing this layout is that we need to support signatures that include a mix of interface and non-interface types. Imagine this contrived example:

```hlsl
float4 myShader(
    Texture2D diffuseMap,
    ILight light,
    Texture2D specularMap)
{ ... }
```

We end up wanting `diffuseMap` to get `register(t0)` and `specularMap` to get `register(t1)`, so that they have the same location no matter what we plug in for `light`. But if we plug in a concrete type for `light` that needs a texture register, we need to allocate it somewhere.

We handle this by having the `TypeLayout` for `light` come back with a "primary" type layout that doesn't have any texture registers, but with a "pending" type layout that includes the texture register requirements of whatever concrete type we plug in.

This split between "primary" and "pending" layout then needs to work its way up the hierarchy, so that an aggregate `struct` type with a mix of interface and non-interface fields (recursively), needs to compute an aggregate "primary type layout" and an aggregate "pending type layout," and then each field needs to be able to compute its offset in the primary/pending layout of the aggregate.

A large chunk of the work in this PR is then just implementing the split between primary and pending data, and ensuring that layouts are computed appropriately.
The next catch is that when a "parameter group" (either a parameter block or constant buffer) contains one or more values of interface type, then we can allow the parameter group to "mask" some of the resource usage of the concrete types we plug in, but others "bleed through." For example, if we have:

```hlsl
struct MyStuff { float3 color; ILight light; }
ConstantBuffer<MyStuff> myStuff;

struct SpotLight { float3 position; Texture2D shadowMap; }
```

If we plug in the `SpotLight` type for `myStuff.light`, then the `float3` data for the light can be "masked" by the fact that we have a constant buffer (we can just allocate the `float3` `position` right after `color`), but the `Texture2D` needed for `shadowMap` needs to "bleed through" and become "pending" data for the `myStuff` shader parameter.

Adding support for that detail more or less required a full rewrite of the logic for allocating parameter group type layouts.

The next detail is that when we go to legalize a declaration like the `myStuff` buffer, we will end up with something like:

```hlsl
struct MyStuff_stripped { float3 color; }
struct Wrapped { MyStuff_stripped primary; SpotLight pending; }
ConstantBuffer<Wrapped> myStuff;
```

This "wrapped" version of the buffer type more accurately reflects the layout we need/want for the uniform/ordinary data, but in order to further legalize it and pull out the resource-type fields like `shadowMap` we need to have accurate layout information, and the problem is that layout information for the original buffer can't apply to this new "wrapped" buffer.

The last major piece of this change is logic that runs during existential type legalization to compute new layouts for "wrapped" buffers like these that embeds correct offset/binding/register information for any resources nested inside them.
A key challenge in that code is that existential legalization needs to erase any "pending" data from the program entirely, so that offset information that used to be relative to the "pending" part of a surrounding type now needs to be relative to the primary part.

The work here may not be 100% complete for all scenarios, but it does well enough on the new and existing tests that I want to checkpoint it. Note that a few other tests have had their output changed, but in all cases I've reviewed the diffs and determined that the change in observable behavior is consistent with what we intend Slang's behavior to be.

Note that there is still one major piece of support for interface-type parameters that is missing here, and which might force us to revisit some of the decisions in this code: we don't properly support user-defined `struct` types with interface-type fields.

* fixup: typos
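The primary/pending split for texture registers can be illustrated with a small model. This is a sketch under assumptions, not the `TypeLayout` implementation: the `Param` struct and `firstTextureRegister` function are hypothetical, and placing all pending registers after all primary ones is one simple policy that satisfies the requirement that primary locations stay fixed.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical parameter description. `pendingTextures` is how many
// texture registers the concrete type plugged into an interface-typed
// parameter needs (zero for ordinary parameters).
struct Param
{
    std::string name;
    int primaryTextures;
    int pendingTextures;
};

// Give every parameter its primary registers first, then place all
// pending registers after them, so primary locations never depend on
// which concrete types are plugged in.
std::map<std::string, int> firstTextureRegister(const std::vector<Param>& params)
{
    std::map<std::string, int> result;
    int next = 0;
    for (const auto& p : params)
    {
        if (p.primaryTextures > 0)
        {
            result[p.name] = next;
            next += p.primaryTextures;
        }
    }
    for (const auto& p : params)
    {
        if (p.pendingTextures > 0)
        {
            result[p.name] = next;
            next += p.pendingTextures;
        }
    }
    return result;
}
```

For the `myShader` example, `diffuseMap` and `specularMap` land on registers 0 and 1 in this model whether or not the type plugged in for `light` needs a texture.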
*	Add support for scalar rcp() intrinsic for GLSL (#918)	Robert Stepinski	2019-03-20
*	Hotfix/bool fix (#907)	jsmall-nvidia	2019-03-14
* Handle ! for bool vector in glsl
* Handle operators that have a boolean return value
* || or && take bool
* Add comment in bool-op.slang test about doing || or && on vector types not supported for GLSL targets
*	Fix rsqrt intrinsic for GLSL (#881)	Robert Stepinski	2019-03-06
* Add support for glsl inversesqrt intrinsic
* fixup for test failure
*	Fix dx12 root sig mismatch on texture2d-gather.hlsl test (#879)	jsmall-nvidia	2019-03-05
* Fix texture2d-gather test failure on dx12.
* Fix tab
*	Hotfix/crash invalid vk binding (#875)	jsmall-nvidia	2019-03-05
* Add diagnostic for vk::binding failure.
* Add test for vk::binding failure.
* Add the expected output for glsl-layout-define.hlsl
* Copy over initialize expr if available when validating unchecked
* Fix unloop - because now it always has one parameter (when before it could have none)
* Split vk::binding and layout tests with invalid parameters
* Removed the diagnostic for 2 ints expected
* Added vk::binding that doesn't specify set in vk-bindings.slang
* Fix typo
* Improve comments.
*	Hotfix/texture2d gather (#876)	jsmall-nvidia	2019-03-05
* First pass test to see if GatherRed works.
* Add support for generating R_Float32 textures.
* Set default texture format.
* Alter the texture2d-gather to work with a R_Float32 texture
* Add support for scalar Texture2d types with GatherXXX in stdlib
* Remove some left over commented out test code from texture2d-gather.hlsl
*	* Add cross compile test (#849)	jsmall-nvidia	2019-02-14
* Add intrinsic for StructuredBuffer.Load
*	Add a test for glslang errors when using StructuredBuffer Load() (#848)	Robert Stepinski	2019-02-13
*	Track stage for varying sub-fields (#842)	Tim Foley	2019-02-12
Fixes #841

This reverts a small change made in #815 that seemed innocent at the time: we stopped tracking an explicit `Stage` to go with every `VarLayout` that is part of an entry-point varying parameter, and instead only associated the stage with the top-level parameter. That change ended up breaking the logic to emit the `flat` modifier automatically for integer type fragment-shader inputs for GLSL, but we didn't have a regression test to catch that case.

This change adds a regression test to cover this case, and adds the small number of lines that were removed from `parameter-binding.cpp`. A few other test outputs had to be updated for the change (these are outputs that were changed in #815 for the same reason).
*	Hotfix/dispatch thread id improvements (#834)	jsmall-nvidia	2019-02-08
* Make vector comparisons output correct functions on glsl
* Test for vector comparisons
* Typo fixes
* Glsl vector comparisons use functions.
* Added a coercion test.
* Do checking for the SV_DispatchThreadId type to see if it appears valid.
* Fix typo
* Make glsl do type conversion for SV_DispatchThreadID parameter.
* Fix glsl to match func-resource-param-array with changes to how SV_DispatchThreadID changes.
*	Fix vector compares on GLSL targets (#833)	jsmall-nvidia	2019-02-08
* Make vector comparisons output correct functions on glsl
* Test for vector comparisons
* Typo fixes
* Glsl vector comparisons use functions.
* Added a coercion test.
*	Allow entry points to have explicit generic parameters (#826)	Tim Foley	2019-02-05
* Allow entry points to have explicit generic parameters

Prior to this change, the Slang implementation required users to use global `type_param` declarations in order to specialize a full shader. For example:

```hlsl
type_param L : ILight;
ParameterBlock<L> gLight;

[shader("fragment")]
float4 fs(...) { ... gLight.doSomething() ... }
```

With this change we can rewrite code like the above using explicit generics, plus the ability to have `uniform` entry-point parameters:

```hlsl
[shader("fragment")]
float4 fs<L : ILight>(
    uniform ParameterBlock<L> light,
    ...)
{ ... light.doSomething() ... }
```

Having this support in place should make it possible for us to eliminate global generic type parameters and the complications they cause (both at a conceptual and implementation level).

The most central and visible piece of the change is that `EntryPointRequest` now holds a `DeclRef<FuncDecl>` instead of just `RefPtr<FuncDecl>`, which allows it to refer to a specialization of a generic function. Various places in the code that refer to the `EntryPointRequest::decl` member now use a `getFuncDecl()` or `getFuncDeclRef()` method as appropriate (see `compiler.h`).

In order to fill in the new data, the `findAndValidateEntryPoint` function has been greatly overhauled. The changes to its operation include:

* The by-name lookup step for the entry point function has been adapted to accept either a function or a generic function.
* The generic argument strings provided by API or command line are no longer parsed all the way to `Type`s, but instead just to `Expr`s in the first pass.
* There are now two cases for checking the global generic arguments against their matching parameters.
  The first case is the new one, where we plug the generic argument `Expr`s into the explicit generic parameters of an entry point (that case re-uses existing semantic checking logic). The second case is the pre-existing code for dealing with global generic type arguments.

The `lower-to-ir.cpp` logic for handling entry points then had to be extended. Making it deal with a full `DeclRef` instead of just a `Decl` was the easy part (just call `emitDeclRef` instead of `ensureDecl`). The more interesting bits were:

* We need to carefully add the `IREntryPointDecoration` to the nested function and not the generic in the case where we have a generic entry point. There is a handy `getResolvedInstForDecorations` that can extract the return value for an IR generic so that we can decorate the right thing.
* We need to make sure that in the case where we emit a `specialize` instruction (which normally wouldn't get a linkage decoration), we attach an `[export(...)]` decoration to it with the mangled name of the decl-ref, so that it can be found during the linking step.

The IR linking step is then slightly more complicated because the mangled entry point name could either refer directly to an `IRFunc` or to a `specialize` instruction for a generic entry point. The logic was refactored to first clone the entry point symbol without concern for which case it is (the old code was specific to functions), and then if the result is a `specialize` instruction, we attempt to run generic specialization on-demand.

That on-demand specialization is a bit of a kludge, but it deals with the fact that all the downstream passes only expect to see an `IRFunc`. A future cleanup might try to split out that specialization step into its own pass, which ends up being a limited form of the specialization pass.

Since I was already having to touch a lot of the code around IR linking, I went ahead and refactored the signature of the operations.
I eliminated the need for the caller to create, pass in, and then destroy an `IRSpecializationState` (really an IR linking state), and replaced it with a structure local to the pass (that data structure was a remnant of an older approach in the compiler), and then also renamed the main operation to `linkIR` to reflect what it is doing in our conceptual flow.

Smaller changes made along the way include:

* Refactored `visitGenericAppExpr` to create a subroutine `checkGenericAppWithCheckedArgs` so that it can be used by the entry-point validation logic described above.
* Refactored the declarations around the IR passes in `emitEntryPoint()` (`emit.cpp`), to show that things are more self-contained than they used to be (e.g., that the `TypeLegalizationContext` is now only needed by one pass).
* Refactored the generic specialization code so that there is a stand-alone free function that can perform specialization on a `specialize` instruction without all the other context being required. This is only to support the limited specialization that needs to be done as part of linking.
* Updated the `global-type-param.slang` test to actually test entry-point generic parameters. In a later pass we can/should rework all the tests/examples for global type parameters over to use explicit entry-point generic parameters (at which point we should rename the tests as well). For now I am leaving things with just one test case, with the expectation that bugs will be found and ironed out as we expand to more tests.

* fixup

* Fixup: don't leave entry-point decorations on stuff we don't want to keep

The IR `[entryPoint]` decoration is effectively a "keep this alive" decoration, which means that attaching it to something we don't intend to keep around can lead to Bad Things.
The approach to generic entry points was attaching `[entryPoint]` to the underlying `IRFunc` because that seemed to make sense, but that meant that the `specialize` instruction at global scope could instantiate that generic and then keep it alive, even if the resulting function wouldn't be valid according to the language rules. As a quick fix, I'm attaching `[entryPoint]` to the `specialize` instruction instead in such cases, and then re-attaching it to the result of explicit specialization during linking. * Port most of remaining test and rename global type parameters This change ports as many as possible of the existing tests for global type parameters over to use entry-point generic parameters instead. For the most part this is a mechanical change. A few test cases remain using global generic parameters, as does the `model-viewer` example application. The reason for this is that the shaders have either or both of the following features: * A vertex and fragment shader that can/should agree on their parameters * A type declaration (e.g., a `struct`) that is dependent on one of the generic type parameters In these cases, it would really only make sense to switch to explicit parameters once we support shader entry points nested inside of a `struct` type, so that we can use an outer generic `struct` as a mechanism to scope the entry points and other type-dependent declarations. Since global-scope type parameters need to persist for at least a bit longer, I went ahead and renamed all the use sites over to use `type_param` for consistency.
*	Fixing IR-lowering not properly registering func decl	Yong He	2019-01-30
*	Fix IR emit logic for methods in `struct` types (#791)	Tim Foley	2019-01-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There was a bug in the logic for emitting initial IR, such that it was neglecting to emit "methods" (member functions) unless they were also referenced by a non-member (global) function, or were needed to satisfy an interface requirement. This would only matter for `import`ed modules, since for non-`import`ed code, anything relevant would be referenced by the entry point so that the problem would never surface. This change fixes the underlying problem by adding a step to the IR lowering pass called `ensureAllDeclsRec` that makes sure that not only global-scope declarations, but also anything nested under a `struct` type gets emitted to the initial IR module. There are also a few unrelated fixes in this PR, which are things I ran into while making the fix: * Deleted support for the (long gone) `IRDeclRef` type in our `slang.natvis` file * Added support for visualizing the value of IR string and integer literals when they appear in the debugger * Fixed IR dumping logic to not skip emitting `struct` and `interface` instructions. Switching those to inherit from `IRType` accidentally affected how they get printed in IR dumps by default. * Fixed up the IR linking logic so that it correctly takes `[export]` decorations into account, so that an exported definition will always be taken over any other (unless the latter is more specialized for the target). I initially implemented this in an attempt to fix the original issue, but found it wasn't a fix for the root cause. It is still a better approach than what was implemented previously, so I'm leaving it in place.
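The recursive walk described for `ensureAllDeclsRec` can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Decl` and `LoweringContext` types here are toy stand-ins invented for the example, and only the function name comes from the commit message.

```cpp
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for an AST declaration (hypothetical, not Slang's `Decl`).
struct Decl {
    std::string name;
    bool isContainer = false;                    // e.g. a `struct` type
    std::vector<std::shared_ptr<Decl>> members;  // nested declarations
};

// Hypothetical lowering context that just records what was emitted.
struct LoweringContext {
    std::vector<std::string> emitted;
    void ensureDecl(const Decl& decl) { emitted.push_back(decl.name); }
};

// Emit `decl`, then recurse into anything nested under it, so that
// methods reach the initial IR even if nothing at global scope
// references them.
void ensureAllDeclsRec(LoweringContext& context, const Decl& decl) {
    context.ensureDecl(decl);
    if (!decl.isContainer) return;
    for (const auto& member : decl.members)
        ensureAllDeclsRec(context, *member);
}
```

Calling `ensureAllDeclsRec` on each global-scope declaration of a module then covers members of `struct` types as well, which is the case that matters for `import`ed modules.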
*	Improve handling of {} initializer list expressions (#778)	Tim Foley	2019-01-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes #775 It was reported (in #775) that Slang doesn't handle initializer-list syntax when initializing matrix variables. When starting on a fix for that it became apparent that the time was right to fix two broad issues in the compiler's current handling of `{}`-enclosed initializer lists. The first issue was that the front-end checking of initializer lists wasn't handling the C-style behavior where an initializer list can either contain nested `{}`-enclosed lists for sub-arrays/-structures, or directly contain "leaf" values for initializing those aggregates. For example, the following two variable declarations ought to be equivalent: ```hlsl int4 a[] = { {1, 2, 3, 4}, {5, 6, 7, 8} }; int4 b[] = { 1, 2, 3, 4, 5, 6, 7, 8 }; ``` Getting this distinction right is important because we want to support initializing a matrix either from a list of vectors for its rows, or a list of scalars for its elements (in row-major order). The front-end semantic checking logic for initializer lists was revamped so that it conceptually tries to "read" an expression of a desired type from the initializer list, and decides at each step whether to consume a single expression by coercing it to the desired type, or to recursively read multiple sub-values to construct the type as an aggregate. The logic for deciding between direct vs aggregate initialization could potentially use some tweaking, but luckily it should always handle the case where users introduce explicit `{}`-enclosed sub-lists to make their intention clear, so that existing Slang code should continue to work as before. The second issue was that initializers without the expected number of elements weren't implemented in code generation, so they would lead to internal compiler errors. This change revamps the codegen logic for initializer lists so that it can synthesize default values for fields/elements that were left out during initialization. 
This includes an attempt to support default initialization of `struct` fields based on explicitly written initialization expressions.
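The "read a value of the desired type" idea can be sketched with toy types. This is only an illustration under simplifying assumptions: `Type`, `Init`, `makeLeaf`, `makeList`, `readValue`, and `readAggregate` are all hypothetical names, every leaf is an `int`, arrays have a fixed size, and missing elements default to zero.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Toy type: a scalar (element == nullptr) or an aggregate of `count` elements.
struct Type {
    const Type* element = nullptr;
    std::size_t count = 0;
};

// Toy initializer: a leaf value or a `{}`-enclosed list of initializers.
struct Init {
    bool isList = false;
    int value = 0;
    std::vector<Init> items;
};

Init makeLeaf(int v) { Init i; i.value = v; return i; }
Init makeList(std::vector<Init> items) {
    Init i; i.isList = true; i.items = std::move(items); return i;
}

void readValue(const Type& type, const std::vector<Init>& items,
               std::size_t& cursor, std::vector<int>& out);

// Fill an aggregate from one `{}` list, element by element.
void readAggregate(const Type& type, const std::vector<Init>& items,
                   std::vector<int>& out) {
    std::size_t cursor = 0;
    for (std::size_t i = 0; i < type.count; ++i)
        readValue(*type.element, items, cursor, out);
}

// Read one value of `type` starting at `cursor`, appending scalars to `out`.
void readValue(const Type& type, const std::vector<Init>& items,
               std::size_t& cursor, std::vector<int>& out) {
    if (!type.element) {
        // Scalar: consume one leaf, or synthesize a default if we ran out.
        out.push_back(cursor < items.size() ? items[cursor++].value : 0);
    } else if (cursor < items.size() && items[cursor].isList) {
        // Explicit sub-list: it initializes this aggregate on its own.
        readAggregate(type, items[cursor++].items, out);
    } else {
        // No sub-list: pull leaf values directly from the enclosing list.
        for (std::size_t i = 0; i < type.count; ++i)
            readValue(*type.element, items, cursor, out);
    }
}
```

With a two-element array of four-component vectors, both the nested and the flat spellings from the example above produce the same eight scalars, and a short list is padded with defaults instead of failing.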
*	Fix a bug in IR linking (#777)	Tim Foley	2019-01-16
	The IR linking logic was recently rewritten to use the (optional) `IRLinkageDecoration`s instead of assuming `IRGlobalVals` always have a mangled name field, and in that process a bug seems to have crept in where in the case that an instruction that would usually qualify as a "global value" does not have linkage, we were failing to register the instruction we create in the output module as a replacement for the original instruction. This problem affects `static` variables inside of functions, leading to them potentially getting emitted multiple times.
*	Fix up declaration checking order for enums (#774)	Tim Foley	2019-01-15
\| \| \| \| \| \| \| \| \|	The logic in `check.cpp` for declaration checking is very messy and needs to be re-written, but in the interim we need to be careful to avoid any cases where a declaration, or some piece of it, gets redundantly checked multiple times. The way the logic had been working, the different "cases" in an `enum` type were being checked twice, and that meant that any initialization expression for a case would be type-checked the first time (potentially leading to a new AST) and then the checked AST would be checked again. This created a problem if the first round of checking introduced any AST nodes that the checking logic would not expect to see (because the parser cannot possibly produce them). The fix here is to follow the style of the other declaration checking cases, where checking is separated into two distinct phases (the "header" phase makes the declaration usable by others, while the "body" phase checks its implementation details for internal consistency). This change includes a test case that produced an internal compiler error before, and compiles without error now.
*	Change how buffers are emitted (#741)	Tim Foley	2018-12-07
	* Change how buffers are emitted This is a change with a lot of pieces, which can't always be separated out cleanly. I'm going to walk through them in what I hope is a logical order. The main goal of this change was to allow arrays of structured buffers to translate to Vulkan. Consider two declarations of structured buffers in HLSL/Slang: ```hlsl StructuredBuffer<X> single; StructuredBuffer<Y> multiple[10]; ``` The current translation logic was handling `single` by translating it into an unnamed GLSL `buffer` block like: ```glsl layout(std430) buffer _S1 { X single[]; }; ``` That syntax allows an expression like `single[i]` in Slang to be translated simply as `single[i]` in GLSL. But that naive translation doesn't work for `multiple`, since we need to declare an array of blocks in GLSL, which requires giving the whole thing a name: ```glsl layout(std430) buffer _S2 { Y _data[]; } multiple[10]; ``` Now a reference to `multiple[i][j]` in Slang needs to become `multiple[i]._data[j]` in GLSL. To avoid having way too many special cases around single structured buffers vs. arrays, it makes sense to always emit things in the latter form, so that we instead lower `single` as: ```glsl layout(std430) buffer _S1 { X _data[]; } single; ``` So that now a reference to `single[i]` becomes `single._data[i]` in GLSL. 
Most of that can be handled in the standard library translation of the structured buffer indexing operations. The only wrinkle there is that there were some old special-case instructions in the IR intended to handle buffer load/store operations (these were added back when I was trying to keep the "VM" path working). These aren't really needed to have structured-buffer operations work; they can be handled as ordinary functions as far as the stdlib is concerned. I removed the old instructions. Along the way, it became clear that a few other cases follow the same pattern. Byte-addressed buffers are an obvious case. We were lowering HLSL/Slang: ```hlsl ByteAddressBuffer b; ... uint x = b.Load(0); ``` to GLSL like: ```glsl layout(std430) buffer _S1 { uint b[]; }; ... uint x = b[0]; ``` That logic would fail for arrays the same way that the structured buffer case was failing. The fix is the same: use named `buffer` blocks and then introduce an explicit `_data` field: ```glsl layout(std430) buffer _S1 { uint _data[]; } b; ... uint x = b._data[0]; ``` Just like with structured buffers, all of the VK translation for operations on byte-addressed buffers can be implemented directly in the stdlib, so once the emit logic was changed it was just a matter of adding `._data` to a bunch of VK translations. It turns out that arrays of constant buffers have more or less the same problem, and furthermore we have some problems with any code that directly uses the modern HLSL `ConstantBuffer<T>` type. Note: the emit logic around constant buffers sometimes refers to "parameter groups" because that is being used in the compiler as a catch-all term for constant buffers, texture buffers, and parameter blocks. The existing code was going out of its way to reproduce the way that constant buffer declarations are implicitly referenced in HLSL: ```hlsl cbuffer C { float f; } ... 
float tmp = f; // No reference to `C` here ``` This can be seen in the emit logic with the `isDerefBaseImplicit` function, which is used to take the internal IR representation for a reference to `f` (which is closer to the expression `(C).f` or `C->f`) and leave off any reference to `C` so that we emit just `f`. That kind of logic just flat out doesn't work in some important cases. Arrays of constant buffers are a clear one: ```hlsl ConstantBuffer<X> cbArray[3]; ... X x = cbArray[0]; ``` There is no way to translate that to an ordinary `cbuffer` declaration at all. The same problem can be created without arrays, though: ```hlsl ConstantBuffer<X> singleCB; ... X x = singleCB; ``` The current strategy for translating constant buffers was translating `singleCB` into a `cbuffer` declaration that reproduced the fields of `X` as its members, which just wouldn't work: ```hlsl cbuffer singleCB { float f; // field of `X` } ... X x = singleCB; // ERROR: there is nothing named `singleCB` in this HLSL ``` The new strategy is more consistent. We still generate a `cbuffer` declaration for a single constant buffer, but we always give it a single field of the chosen element type: ```hlsl cbuffer singleCB { X singleCB; } ... X x = singleCB; // this works fine! ``` And in the array case we generate code that uses the explicit `ConstantBuffer<T>` type: ```hlsl ConstantBuffer<X> cbArray[3]; ... X x = cbArray[0]; ``` The GLSL output is more complicated because unlike with HLSL there is no implicit conversion from a uniform block to its element type (there is no notion of an element type). The array case thus needs a `_data` field similar to what we do for structured buffers: ```glsl layout(std140) uniform _S3 { X _data; } cbArray[3]; ... X x = cbArray[0]._data; ``` And then the non-array case needs to have a similar `_data` field for consistency: ```glsl layout(std140) uniform _S1 { X _data; } singleCB; ... 
X x = singleCB._data; ``` This is handled by inserting the necessary reference to `_data` whenever we dereference a constant buffer, either as part of a load instruction (loading from the whole CB as a pointer), or an `IRFieldAddress` instruction which forms a pointer into the CB (e.g., `&(singleCB->f)` becomes `singleCB._data.f`). The current emit logic handles `ParameterBlock<X>` differently from `ConstantBuffer<X>`, but really only to allow parameter blocks to be explicitly named in the output, while constant buffers were left implicit by default. Thus the only difference was a legacy one (from back when trying to exactly reproduce the HLSL text we got as input was considered an important goal), and the new approach to emitting constant buffers would get rid of it. I removed the separate logic for emitting `ParameterBlock<X>` and just let the handling for constant buffers deal with it. Note that any resource types inside of a `ParameterBlock<X>` would have been moved out as part of legalization, so that a parameter block is 100% equivalent to a constant buffer when it comes time to emit code. Unsurprisingly, changing the way we generate HLSL and GLSL output for all these buffer types meant that any tests that were directly comparing the output of `slangc` against `fxc`, `dxc`, or `glslang` broke. The basic approach to fixing the breakage in GLSL tests was to update the GLSL baseline to reflect the new output strategy. In some cases I used macros to name the various `_S<digits>` temporaries so that future renaming will hopefully be easier (it would be great if we auto-generated temporary names with a bit more context). There was one GLSL test (`tests/bugs/vk-structured-buffer-binding`) that was using raw GLSL expected output, and this was changed to use a GLSL baseline to generate SPIR-V for comparison. 
For HLSL tests we were sometimes running the same input file through `slangc` and `fxc`/`dxc`, and in these cases I macro-ized the various `cbuffer` declarations to generate different declarations depending on the compiler. I completely dropped the tests coming from the D3D SDK because they aren't providing much coverage, and updating them would change them so far from the original code that the purported benefit (using a body of existing shaders) would be lost. I also dropped the explicit matrix layout qualifiers in the `matrix-layout` test because the new output strategy breaks those for GLSL (you can't put matrix layout qualifiers on `struct` fields, and now the body of every constant buffer is inside a `struct`). This isn't as big of a loss as it seems, because our handling of those qualifiers wasn't really right to begin with. Slang users should only be setting the matrix layout mode globally (and we should probably switch to error out on the explicit qualifiers for now). The other thing that got dropped is tests involving `packoffset` modifiers. Slang already warns that it doesn't support these, and the way they were used in the test cases is actually misleading. For the binding/layout-related tests, the goal was to show that Slang reproduces the same layout as fxc, in which case explicitly enforcing a layout via `packoffset` seems like cheating (are we sure we enforced the layout fxc would have produced?). The real reason was that Slang used to emit explicit `packoffset` on *every* field of a `cbuffer` it would output, because of an `fxc` bug where you couldn't use `register` on textures/samplers declared inside a `cbuffer` unless every field in the `cbuffer` used a `register` or `packoffset` modifier. Slang hasn't required that behavior in a while because it now splits textures and samplers, and the one test case where we needed `packoffset` to work around the `fxc` bug in the baseline HLSL has been macro-ified even more to work around the bug. 
The amount of churn in the test cases is unfortunate, but it continues to point at the weakness of any testing strategy that checks for exact equivalence between Slang's output and that of other compilers. We need to keep working to replace these tests with better alternatives. In `check.cpp` there is logic to perform implicit dereferencing, so that if you write `obj.f` where `obj` is a `ConstantBuffer<X>` (or some other "pointer-like" type) and `f` is a field in `X`, then this effectively translates as `(*obj).f`. That is, we dereference the value of type `ConstantBuffer<X>` to get a value of type `X`, and then refer to the field of the `X` value. There was a problem where the logic to insert that kind of implicit dereference operation was using a reference (`auto& type = ...`) for the type of the expression being dereferenced, and then clobbering it. This would mean that an expression of type `ConstantBuffer<X>` would have its type overwritten to be just `X` and then codegen would break later on. I'm not sure how we haven't run into that before. The `array-of-buffers` test case was added to confirm that we now support arrays of constant, structured, and byte-address buffers for both DXIL and SPIR-V output. Okay, so that was a lot of stuff, but hopefully it is clear how this all works to make the output of the compiler more consistent and explicit, while also supporting the required new functionality. fixup: review feedback
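The uniform `_data` strategy for structured buffers can be sketched as a pair of string-building helpers. This is a hedged illustration only: `emitGLSLStructuredBuffer` and `emitGLSLElementRef` are hypothetical names, and the real emit logic works on IR rather than on strings passed in by hand.

```cpp
#include <string>

// Emit a named GLSL `buffer` block that wraps its payload in a `_data` field.
// `arrayCount == 0` means a single buffer rather than an array of buffers.
std::string emitGLSLStructuredBuffer(const std::string& blockName,
                                     const std::string& elementType,
                                     const std::string& varName,
                                     unsigned arrayCount) {
    std::string result = "layout(std430) buffer " + blockName + " { " +
                         elementType + " _data[]; } " + varName;
    if (arrayCount != 0)
        result += "[" + std::to_string(arrayCount) + "]";
    return result + ";";
}

// Translate an element access: `buf[i]` in Slang becomes `buf._data[i]`.
std::string emitGLSLElementRef(const std::string& bufferExpr,
                               const std::string& indexExpr) {
    return bufferExpr + "._data[" + indexExpr + "]";
}
```

Because the single and array cases share one code path, the only difference between them is the optional `[N]` suffix on the block instance name.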
*	Bug fix - vk::binding on structured buffers (#720)	jsmall-nvidia	2018-11-16
	* Fix output of binding of structured buffer on GLSL. * Added test to check vk binding is coming through. * Fix closethit binding inconsistency.
*	Fix a precedence bug in code emit (#705)	Tim Foley	2018-10-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Fix a precedence bug in code emit Given code like the following: ```hlsl float a = ...; float3 b = pow(a, 2.0); float3 c = b.xyz; ``` There is an implicit cast from `float` to `float3` in the computation of `b`, that Slang will always make explicit in the output. Slang will also tend to pull the computation of `b` into the next expression if it has no other use sites in the same function. When it does, the compiler was failing to parenthesize the result correctly, and yielded (more or less): ```hlsl float a = ...; float3 c = (float3) pow(a,2.0).xyz; ``` As you can see, the swizzle ended up attached to the `pow()` call instead of the cast, and the downstream compiler luckily complained that we couldn't apply an `.xyz` swizzle to a scalar value. This change adds the missing parentheses-insertion logic for that case of emitting a cast expression, so that we instead get: ```hlsl float a = ...; float3 c = ((float3) pow(a,2.0)).xyz; ``` I added a test case to catch this specific issue, but there is of course no guarantee that we haven't missed other cases in the emit logic. This is why I held out so long on getting to the "why so many parentheses?" complaints... * remove commented-out code from test program
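The missing parentheses-insertion step can be illustrated with a small precedence-aware emitter. This is a sketch under assumptions: the `Prec` levels, `EmittedExpr`, and the `emit*` helpers are hypothetical and far simpler than the logic in `emit.cpp`.

```cpp
#include <string>

// Precedence levels; a lower value binds more loosely.
enum Prec { kPrecPrefix = 1, kPrecPostfix = 2 };

struct EmittedExpr {
    std::string text;
    Prec prec;  // precedence of the outermost operator in `text`
};

// Parenthesize `e` if it binds more loosely than its context requires.
std::string emitWithPrec(const EmittedExpr& e, Prec required) {
    return e.prec < required ? "(" + e.text + ")" : e.text;
}

// A cast is a prefix expression, so the result carries prefix precedence.
EmittedExpr emitCast(const std::string& type, const EmittedExpr& operand) {
    return { "(" + type + ") " + emitWithPrec(operand, kPrecPrefix),
             kPrecPrefix };
}

// A swizzle is a postfix expression, so its base must bind at least as
// tightly as postfix operators; a cast does not, and gets parentheses.
EmittedExpr emitSwizzle(const EmittedExpr& base, const std::string& components) {
    return { emitWithPrec(base, kPrecPostfix) + "." + components,
             kPrecPostfix };
}
```

Applying `emitSwizzle` to the result of `emitCast("float3", ...)` wraps the cast, while a swizzle directly on a call expression stays unparenthesized.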
*	Fix a crash on function-static variables with initializers (#703)	Tim Foley	2018-10-30
\| \| \| \| \|	This code path hadn't been used, and it had a crash due to not inserting the basic blocks it created (for initializing the variable) into the parent function. The fix adds a bit more smarts to the `IRBuilder` to help with inserting basic blocks into the flow of a function. The actual user issue was around `static const` declarations, and it is clear that the code is incorrectly treating a function local `static const` as if it were just `static`. That will need to be fixed in another change.
*	Rework command-line options handling for entry points and targets (#697)	Tim Foley	2018-10-29
	* Rework command-line options handling for entry points and targets Overview: * The biggest functionality change is that the implicit ordering constraints among multiple `-entry` options are reversed: any `-stage` option affects the `-entry` to its left instead of to its right as it used to. This is technically a breaking change, but I expect most users aren't using this feature. * The options parsing tries to handle profile versions and stages as distinct data (rather than using the combined `Profile` type all over), and treats a `-profile` option that specifies both a profile version and a stage (e.g., `-profile ps_5_0`) as if it were sugar for both a `-profile` and a `-stage` (e.g., `-profile sm_5_0 -stage fragment`). * We now technically handle multiple `-target` options in one invocation of `slangc`, but do not advertise that fact in the documentation because it might be confusing for users. Similar to the relationship between `-stage` and `-entry`, any `-profile` option affects the most recent `-target` option unless there is only one `-target`. * The logic for associating `-o` options with corresponding entry points and targets has been beefed up. The rule is that a `-o` option for a compiled kernel binds to the entry point to its left, unless there is only one entry point (just like for `-stage`). The associated target for a `-o` option is found via a search, however, because otherwise it would be impossible to specify `-o` options for both SPIR-V and DXIL in one pass. * The handling of output paths for entry points in the internal compiler structures was changed, because previously it could only handle one output path per entry point (even when there are multiple targets). The new logic builds up a per-target mapping from an entry point to its desired output path (if any). 
Details: * Support for formatting profile versions, stages, and compile targets (formats) was added to diagnostic printing, so that we can make better error messages. This is fairly ad hoc, and it would be nice to have all of the string<->enum stuff be more data-driven throughout the codebase. * Test cases were added for (almost) all of the error conditions in the current options validation. The main one that is missing is around specifying an `-entry` option before any source file when compiling multiple files. This is because the test runner is putting the source file name first on the command line automatically, so we can't reproduce that case. * Several reflection-related tests now reflect entry points where they didn't before, because the logic for detecting when to infer a default `main` entry point has been made looser * On the dxc path, beefed up the handling of mapping from Slang `Profile`s to the corresponding string to use when invoking dxc. * A bunch of test cases were in violation of the newly imposed rules, so those needed to be cleaned up. * There were also a bunch of test cases that had accidentally gotten "disabled" at some point because they were comparing output from `slangc` both with and without a `-pass-through` option, but that meant that any errors in command-line parsing produced the same error output in both the Slang and pass-through cases. This change updates `slang-test` to always expect a successful run for these tests, and then manually updates or disables the various test cases that are affected. * When merging the updated test for matrix layout mode, I found that the new command-line logic was failing to propagate a matrix layout mode passed to `render-test` into the compiler. This was because the `-matrix-layout` options were implemented as per-target, but the target was being set by API while the option came in via command line (passed through the API). 
It seems like we want matrix layout mode to be a global option anyway (rather than per-target), so I made that change here. Add missing expected output files * A 64-bit fix * Remove commented-out code noted in review
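The "binds to the entry point to its left" rule for `-stage` can be sketched as below. This is a simplified illustration, not the real options parser: `EntryPointOption` and `parseEntryPoints` are hypothetical, other options are ignored, and the handling of a `-stage` that precedes a lone `-entry` is my reading of the "unless there is only one entry point" exception.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct EntryPointOption {
    std::string name;
    std::string stage;  // empty until a `-stage` option binds to it
};

// Bind each `-stage` to the most recent `-entry` on its left. A `-stage`
// seen before any `-entry` is accepted only when exactly one entry point
// ends up being specified (assumed interpretation of the exception).
bool parseEntryPoints(const std::vector<std::string>& args,
                      std::vector<EntryPointOption>& entries) {
    std::string pendingStage;
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] == "-entry" && i + 1 < args.size()) {
            entries.push_back({args[++i], ""});
        } else if (args[i] == "-stage" && i + 1 < args.size()) {
            const std::string& stage = args[++i];
            if (entries.empty()) pendingStage = stage;
            else entries.back().stage = stage;
        }
    }
    if (!pendingStage.empty()) {
        // A leading `-stage` is ambiguous with more than one entry point.
        if (entries.size() != 1 || !entries[0].stage.empty()) return false;
        entries[0].stage = pendingStage;
    }
    return true;
}
```

The same left-binding pattern would apply to `-o` and entry points, and to `-profile` and `-target`, as described in the overview.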
*	Fix Vulkan codegen for image atomics (#690)	Tim Foley	2018-10-25
	The basic problem was that the front-end was generating code that used a `uint` vector for the coordinates, while GLSL requires an `int` vector. Without support for implicit type conversions, this leads to GLSL compilation failure. The fix here is to insert the type conversion as late as possible (during GLSL emit). This isn't a pretty solution, but it is the easiest one to implement in the current compiler. A more forward-looking approach would be to support "force inline" functions in the stdlib, so that we can implement the conversion logic in a stdlib implementation specialized for the Vulkan/GLSL target. At the moment, everything to do with image atomics is all sleight of hand anyway, so making it incrementally messier isn't a big hit.
*	Fix error when one constant is defined equal to another (#670)	Tim Foley	2018-10-11
	* Fix error when one constant is defined equal to another Fixes #666 When a user declares one constant (usually a `static const` variable) to be exactly equal to another by name: ```hlsl static const int a = 999; static const int b = a; ``` Then the IR-level representation of `b` is an `IRGlobalConstant` whose value expression is just a pointer to the definition of `a`. The logic in `emitIRGlobalConstantInitializer()` was trying to always call `emitIRInstExpr` to emit the value of the constant as an expression, but that function only handles complex/compound expressions and not the case of simple named values (e.g., constants like `a`). The intention is for code to call `emitIROperand()` instead, and let it decide whether to emit an expression or a named reference using its own decision-making. The `IRGlobalConstant` case really just wants to pass in the "mode" flag it uses to influence that decision-making, but shouldn't be working around it. This change just replaces the `emitIRInstExpr()` call with `emitIROperand()` and adds a test case to confirm that this fixes the reported problem. * Fixups for bugs in previous change The first problem was that certain instruction ops were being special-cased to opt out of "folding" into expressions before we make the universal check to always fold when inside an initializer for a global constant. The second problem is that the `emitIROperand()` logic was always putting parentheses around sub-expressions, which breaks parsing when the sub-expression is an initializer list (`{...}`). This fixup is pretty much a hack, but will be something we can remove once we don't emit unnecessary parentheses overall, which is a better fix.
*	Improve generic argument inference for builtins (#598)	Tim Foley	2018-06-14
	Fixes #487 The basic problem here is that the user writes something like: ```hlsl float invSqrt2 = 1 / sqrt(2); ``` In this case the user knows that `sqrt()` is only defined for floating-point types, so they expect this to compile something like: ```hlsl float invSqrt2 = float(1) / sqrt(float(2)); ``` The challenge this creates for the Slang compiler is that we use generics to streamline our declarations of all the builtins, so that the scalar `sqrt()` function is actually declared as: ```hlsl T sqrt<T:__BuiltinFloatingPointType>(T value); ``` The `__BuiltinFloatingPointType` is an `interface` defined as part of the standard library, such that only built-in floating-point types conform to it (that is, `half`, `float`, and `double`). When generic argument inference applies to a call like `sqrt(2)`, we see an argument of type `int`, and try to infer `T=int`, which leads to a failure because `int` does not conform to `__BuiltinFloatingPointType`. The point where this currently fails is in the logic to "join" two types for inference, which is supposed to pick the best type that can represent both of two input types. E.g., a join between `float` and `int3` would be `float3`, since both of those types can convert to it, and it is the "minimal" type with that property. So, the goal here is simple: we want a "join" between `int` and `__BuiltinFloatingPointType` to yield the `float` type. The way we handle that in this change is to special case the join of a basic scalar type and an interface, by enumerating all the basic scalar types, filtering them for ones that support the chosen interface and can be implicitly converted from the argument type, and then picking the "best" of them (the comments in the code explain what "best" means in this context). 
The technique used here could be generalized in the future to deal with user-defined types or more cases, but that would risk slowing down overload resolution even more, which is already the most expensive part of our semantic checking pass. A test case has been added for the specific case of `sqrt()` applied to an `int` argument.
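The enumerate, filter, and pick-best steps of the special-cased join can be sketched as follows. The commit does not spell out what "best" means, so this sketch assumes "lowest implicit-conversion cost" with made-up cost values; `ScalarType`, `conversionCost`, and `joinWithFloatingPointInterface` are hypothetical names.

```cpp
#include <string>
#include <vector>

// Toy description of a basic scalar type (not Slang's type representation).
struct ScalarType {
    std::string name;
    bool isFloatingPoint;  // stands in for "conforms to the interface"
};

// Hypothetical implicit-conversion cost; negative means "not convertible".
// The numbers are invented for illustration only.
int conversionCost(const std::string& from, const ScalarType& to) {
    if (from == to.name) return 0;
    if (from == "int") {
        if (to.name == "float") return 1;
        if (to.name == "double") return 2;
        if (to.name == "half") return 3;
    }
    return -1;
}

// Join a scalar argument type with a floating-point interface constraint:
// enumerate the basic scalar types, keep those that conform and are
// reachable by implicit conversion, then pick the cheapest one.
std::string joinWithFloatingPointInterface(const std::string& argType) {
    static const std::vector<ScalarType> candidates = {
        {"int", false}, {"uint", false},
        {"half", true}, {"float", true}, {"double", true}};
    std::string best;
    int bestCost = -1;
    for (const auto& candidate : candidates) {
        if (!candidate.isFloatingPoint) continue;
        int cost = conversionCost(argType, candidate);
        if (cost < 0) continue;
        if (bestCost < 0 || cost < bestCost) {
            best = candidate.name;
            bestCost = cost;
        }
    }
    return best;  // empty string means the join failed
}
```

Under these assumed costs, an `int` argument joins to `float`, which matches the outcome the commit wants for `sqrt(2)`.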
*	A bunch of work to resolve #569 (#576)	Tim Foley	2018-05-24
	* render-test should not fail on HLSL compiler warnings The logic in `render-test` that invokes `D3DCompile` was causing a test to fail if it produced any warnings (not just if compilation fails). Warning output can be dealt with by the test runner, since it will compare output between runs anyway, and it is useful to be able to run something through `render-test` that compiles with warnings. * Be more careful about deleting IR instructions There was an `IRInst::deallocate()` method that had a precondition that the instruction should already be removed from its parent and have all its operands cleared out before calling, but it wasn't checking this and the few call sites weren't doing things right either. I consolidated things on `IRInst::removeAndDeallocate()` which does all the things: removes from the parent, clears out operands, and then deallocates. I also made sure to clear out the type operand. This clears up some crashing issues where passes were removing instructions but those instructions would still show up as users of other instructions. * Don't emit bitwise not for non-Boolean types It seems like the logic in `emit.cpp` messed things up and decided that `Not` (the IR instruction that is equivalent to `!` in the AST) should emit as `!` for Boolean types and `~` for other types, but this makes no sense (e.g., `~(a & 1)` is very different from `!(a & 1)`, even when interpreted as a condition). It seems like this logic was intended for the `BitNot` case, where `~a` and `!a` are actually equivalent for Boolean values (but a target language might not like `~a` on `bool` values). 
Maybe the original plan was that the `Not` instruction should only apply to Boolean values in the first place, and that other values should be converted to `bool` (or a vector of `bool`) before applying `Not`, but even in that case the emit logic makes no sense. This caused an actual problem for one of my test cases, so it was important to fix it now. * Fix issue with cached resolution for overloaded operators The basic problem was that the lookup logic was forming a key based on the first definition it found for the overloaded operator, but that means that when processing a prefix `++a` call we might look up the postfix definition of `operator++` and decide to use its opcode as the key. This "fixes" the logic by looking for the first definition with a "compatible" definition (e.g., a `__prefix` function if we are checking a `PrefixExpr`), and then uses its opcode. A better fix in the long run would be to make the cache just be keyed on the operator name and the "fixity" of the expression (prefix, postfix, or infix). * Introduce an intermediate structured control-flow representation The code previously used a single function called `emitIRStmtsForBlocks` in `emit.cpp` that would take a logical sub-graph of the CFG and emit it as high-level statements. It would do this by recognizing operations like conditional branches that it could turn into high-level `if` statements, etc. The main problem with this function was that it mixed together the logic for how we restructure the program with the logic for how we emit high-level code from that structure. This change splits those two parts of the algorithm by introducing an intermediate data structure: a tree of `Region`s, which represent single-entry regions of the CFG. There are subclasses of `Region` corresponding to various structured control-flow constructs, and then a leaf case that wraps a single `IRBlock`. 
The new function `generateRegionsForIRBlocks()` (in `ir-restructure.cpp`) now handles the restructuring work, by building one or more `Region`s to represent a sub-graph, while `emitRegion()` handles emitting HLSL/GLSL source code from a region. Splitting things in this way opens up some opportunities for future changes: * We can expand the set of IR control-flow constructs allowed, so long as we can still generate structured `Region`s from them, without having to mess with the emit logic (e.g., we could start to support multi-level `break` by introducing temporaries as needed). In the limit we can generate our `Region`s using something like the "Relooper" algorithm. * We can emit to other representations while retaining the same control-flow restructuring support. E.g., if we drop the structured information from the IR, then emitting to SPIR-V for Vulkan would require us to use the structured control-flow information from these `Region`s. * We can do analysis that needs to understand `Region` structure. This is relevant to issue #569, which was what prompted me to start on this work. Now that we have a representation of the nesting of `Region`s, we can use it to reason about visibility of values between blocks. During development of this change I ran into a gotcha, in that I had been assuming each IR block would map to a single `Region`, forgetting that our current lowering of "continue clauses" in `for` loops leads to them being duplicated. The `Region` representation handles this by having a linked-list struct mapping IR blocks to the `SimpleRegion`s that represent them. I added a test case that includes a `for` loop with a continue clause that is reached along multiple paths just to make sure that we continue to support that case. The compiler output should not change as a result of this work; this is supposed to be a pure refactoring change. 
* Add a pass to resolve scoping issues in generated code

Fixes #569

The basic problem arises because the structured control flow that we output in high-level HLSL/GLSL doesn't match the "scoping" rules of an SSA IR. In particular, SSA says that a value can be used in any block that is dominated by the definition, but in the presence of `break` and `continue` statements it is easy to construct cases where a block dominates something that is not in its scope for structured control flow. Consider:

```hlsl
for(;;)
{
    int a = xyz;
    if(a)
    {
        int b = a;
        break;
    }
    int c = a;
}
int d = b;
```

This program is invalid as HLSL, because the variable `b` is referenced outside of its scope, but if we look at the CFG for this function, it is clear that the block that computes `b` dominates the block that computes `d`. IR optimizations can easily create code like this, so we need to be ready for it.

The previous change added an explicit `Region` structure to represent the structured control flow that we re-form out of the IR, and this change adds a pass that exploits the structuring information to detect cases like the above and introduce temporaries to fix the scoping issue. For example, the pass would change the earlier code block into something like:

```hlsl
int tmp;
for(;;)
{
    int a = xyz;
    if(a)
    {
        int b = a;
        tmp = b;
        break;
    }
    int c = a;
}
int d = tmp;
```

That is, we introduce a new `tmp` variable at a scope "above" both the definition and use of `b`, and then we copy `b` into that temporary right where it is computed, and then use the temporary instead of the original `b` at the use site.

A few details that came up during the implementation:

* Downstream compilers may get confused by code like the above, and complain that `tmp` may be used before it is initialized, even though the very definition of dominators in a CFG means we don't have to worry about it. Still, I introduced some one-off code to initialize the temporaries just to silence spurious warnings coming from fxc.
* We need to be careful not to apply this logic to "phi nodes" (the parameters of basic blocks) since they will already be turned into temporaries by the emit logic, and trying to introduce temporaries with this pass led to broken code (I still need to investigate why).

It may be that a future version of this pass should also take the code out of SSA form, so that we can introduce both kinds of temporaries in a single pass (and maybe eliminate some unnecessary variables by doing basic register allocation).

There is another transformation that could fix some issues of this kind, by moving code out of a structured control-flow construct and to the "join point" after it. For example, we could turn our loop from the start of this commit message into:

```hlsl
for(;;)
{
    int a = xyz;
    if(a)
    {
        break;
    }
    int c = a;
}
int b = a;
int d = b;
```

Moving the definition of `b` to after the loop is possible because there is no way to get out of the loop without executing that code anyway. Now the scoping issue for `d`'s use of `b` has gone away, but of course we've introduced a new scoping issue for `a`, when it gets used by `b`. Adding a pass to re-arrange control flow like this could reduce the cases where we have to apply the current pass, but it wouldn't eliminate them entirely. That means such a pass can be deferred to future work.

This change includes a test case that reproduces the original issue, so that we can confirm the fix works.
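The scoping check described in this commit message can be pictured with a small C++ sketch. It is only an illustration under stated assumptions: the `Region` struct here is a hypothetical, stripped-down stand-in (a parent pointer and nothing else), and `commonAncestor` and `needsTemporary` are invented names, not functions from the Slang sources. The idea is that a use can refer to a definition directly only when the definition's region encloses the use; otherwise a temporary would be declared at the innermost region enclosing both.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the `Region` tree described above.
// The real classes in Slang carry much more information.
struct Region {
    Region* parent = nullptr;
};

// Number of ancestors above `r`.
static int depthOf(const Region* r) {
    int depth = 0;
    for (; r->parent; r = r->parent) ++depth;
    return depth;
}

// Innermost region that encloses both `a` and `b` (nullptr if the two
// regions belong to unrelated trees).
static const Region* commonAncestor(const Region* a, const Region* b) {
    assert(a && b);
    int da = depthOf(a), db = depthOf(b);
    while (da > db) { a = a->parent; --da; }
    while (db > da) { b = b->parent; --db; }
    while (a != b) { a = a->parent; b = b->parent; }
    return a;
}

// A definition is visible at a use only if its region encloses the use.
// Otherwise a temporary is needed, declared at commonAncestor(def, use).
static bool needsTemporary(const Region* def, const Region* use) {
    return commonAncestor(def, use) != def;
}
```

In the loop example, `b` lives in the `if` body and `d` in the function body, so a temporary is needed at function scope; `a` lives in the loop body, which encloses the `if` body, so its use there needs none.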
*	Handle structure initializers in IR type legalization (#567)	Tim Foley	2018-05-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes #566 The basic problem here is that the front-end translates a structure initializer-list expression into a `makeStruct` instruction (with one argument per field), but the IR type legalization logic wasn't handling the case where a `makeStruct` is used to construct a struct value that needs to get split by legalization. The implementation is relatively straightforward, and like the other cases of instruction legalization for compound types, it follows the shape of the `LegalType`/`LegalVal` cases. The one interesting bit is that we need to be a bit careful and filter the single argument list for `makeStruct` into two in the case where we generate a "pair" type for something that has both "ordinary" and "special" (resource) fields. Luckily the `PairInfo` data that was generated by type legalization has exactly the information we need (by design). This change does not address several issues that could be handled in follow-on changes: * The `makeArray` instruction will face similar issues if it is applied to a type that requires legalization: we'd need to turn an array of `LegalVal`s into a bunch of distinct arrays. * The error message when we hit the unimplemented case here isn't great. Ideally we should provide the line number of the instruction that fails in an error message when legalization fails. This change tries to focus narrowly on the bug at hand, and leave these issues for later changes.
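The argument filtering for `makeStruct` mentioned above might look roughly like the following C++ sketch. Everything in it is an assumption made for illustration: `FieldKind` is a hypothetical per-field flag loosely inspired by the `PairInfo` idea, arguments are plain strings, and each field is treated as either wholly "ordinary" or wholly "special", which is simpler than what the real legalization has to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-field classification (not Slang's actual `PairInfo`).
enum class FieldKind { Ordinary, Special };

using Args = std::vector<std::string>;

// Split one `makeStruct` argument list into the arguments for the
// "ordinary" half and the "special" (resource) half of a pair type.
static std::pair<Args, Args> splitMakeStructArgs(
    const Args& args, const std::vector<FieldKind>& kinds) {
    assert(args.size() == kinds.size());  // one argument per field
    Args ordinary, special;
    for (std::size_t i = 0; i < args.size(); ++i) {
        Args& target = (kinds[i] == FieldKind::Special) ? special : ordinary;
        target.push_back(args[i]);
    }
    return {ordinary, special};
}
```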
*	Add test for associated type from global generic parameter (#561)	Tim Foley	2018-05-11
\| \| \| \| \|	Resolves #357 The example shader from that issue has been added as a test case, and works with the top-of-tree Slang compiler (most likely due to the changes introduced with the IR-level type system).
*	Pass through original names for most declarations (#547)	Tim Foley	2018-05-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The basic idea here is that when lowering to the IR, the front-end will attach a "name hint" to the IR instruction(s) that represent a given declaration, and then the passes that work on the IR will try to preserve and propagate those names, and then finally the emit logic will use them in place of mangled or unique names when available. This change does not try to deal with the issues that arise when we try to use those variable names in the output without any modification (e.g., handling cases where they might clash with keywords or builtins in the target language). Instead, it tries to establish baseline behavior for propagating names through, so that a later change can concentrate on the issue of using those names exactly when it is legal to do so. In order to avoid issues around the name "hints" causing problems we take two main steps: 1. We "scrub" each name to reduce it down to the allowed set of identifier characters in C-like languages, and then ensure that it doesn't do things that would be illegal in some downstream languages (e.g., consecutive underscores are not allowed in GLSL) or could clash with Slang's mangled names. This process isn't guaranteed to give distinct results for distinct inputs (it isn't a mangling scheme, after all). 2. We generate a unique ID for each occurrence of a given name and always use that as a suffix. This means that even if a name happens to overlap with a keyword (if you somehow have a variable named `do`), we will still add a suffix that makes it not a problem (we'd output `do_0` which is fine). The logic for generating these names is mostly straightforward. For simple variables, we use their given name directly, while for other declarations we try to form a name that includes their parent declaration (e.g. `SomeType.someMethod`). Various IR passes need to propagate or preserve this information. 
The most interesting is type legalization, when we take a variable with an aggregate type and split some of the fields out into their own variables. In that case we generate "dotted" names like `someVar.someTexture` and rely on the emit logic to turn that into `someVar_someTexture`. During SSA generation, if we are promoting a variable to SSA temporaries, we will try to propagate the name of the variable over to the temporaries (unless they already have a name from some other place). The same applies to block parameters ("phi nodes"). Many of the test changes need their expected output to be updated for this change. Luckily in most cases the output has gotten easier to understand.
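The two steps from this commit message (scrub the name hint, then append a per-name unique ID) can be sketched in C++ as below. This is a guess at the general shape, not the code from the change: `scrubName` and `NameGenerator` are hypothetical names, and the sketch does not model the check against Slang's mangled names.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Step 1 (sketch): reduce a name hint to C-like identifier characters and
// avoid consecutive underscores, which GLSL does not allow.
static std::string scrubName(const std::string& hint) {
    std::string out;
    for (unsigned char c : hint) {
        char next = std::isalnum(c) ? static_cast<char>(c) : '_';
        if (next == '_' && !out.empty() && out.back() == '_') continue;
        out.push_back(next);
    }
    // Drop trailing underscores so the "_<id>" suffix cannot double them.
    while (out.size() > 1 && out.back() == '_') out.pop_back();
    // An identifier cannot be empty or start with a digit.
    if (out.empty() || std::isdigit(static_cast<unsigned char>(out[0])))
        out = "_" + out;
    return out;
}

// Step 2 (sketch): always append a per-name counter as a suffix.
class NameGenerator {
public:
    std::string generate(const std::string& hint) {
        std::string base = scrubName(hint);
        int id = counts_[base]++;
        if (base.back() != '_') base += '_';
        return base + std::to_string(id);
    }

private:
    std::map<std::string, int> counts_;
};
```

With this sketch a hint of `do` comes out as `do_0`, and a "dotted" hint such as `someVar.someTexture` comes out as `someVar_someTexture_0`.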
*	Improve SSA promotion for arrays and structs (#521)	Tim Foley	2018-04-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Improve SSA promotion for arrays and structs Fixes #518 The existing SSA pass would only handle `load(v)` and `store(v,...)` where `v` is the variable instruction, and would bail out if `v` was used as an operand in any other fashion. The new pass adds support for `load(ac)` where `ac` is an "access chain" with a grammar like: ac :: v \| getElementPtr(ac, ...) \| getFieldAddress(ac, ...) What this means in practical terms is that we can promote a local variable of array or structure type to an SSA temporary even if there are loads of individual elements/fields, as long as any assignment to the variable assigns the whole thing. I've added a test case to confirm that this change fixes passing of arrays as function parameters for Vulkan. * Fixup: disable test on Vulkan because render-test isn't ready This is a fix for Vulkan, but I don't think our testing setup is ready for it. * Fixup: error in unreachable return case, caught by clang * Fixups based on testing These are fixes found when testing the original changes against the user code that originated the bug report. * `emit.cpp`: Make sure to handle array-of-texture types when deciding whether to declare a temporary as a local variable in GLSL output * `ir-legalize-types.cpp`: Make a note of a source of validation failures that we need to clean up sooner or later (just not in scope for this bug fix change). 
* `ir-ssa.cpp`: * When checking if something is an access chain with a promotable var at the end, make sure the recursive case recurses into the "access chain" logic instead of the leaf case * Add some assertions to guard the assumption that any access chain we apply has been scheduled for removal * Correctly emit an element extract instead of getting an element address when promoting an element access into an array being promoted * Eliminate a wrapper routine that was setting up an `IRBuilder` and use the one from the block being processed in the SSA pass (since it was set up for stuff just like this) * `ir-validate.cpp` * Add a hack to avoid validation failures when running IR validation on the stdlib code. This case triggers for an initializer (`__init`) declaration inside an interface, since the logical "return type" is the interface type itself, which has no representation at the IR level and thus yields a null result type in a `FuncType` instruction.
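The "access chain" grammar from this commit message lends itself to a short recursive (here, iterative) check. The C++ sketch below uses hypothetical `Op` and `Inst` types with a single `base` operand; it only shows how a chain of `getElementPtr`/`getFieldAddress` instructions can be walked back to the variable at its root, and is not the code in `ir-ssa.cpp`.

```cpp
#include <cassert>

// Hypothetical, minimal instruction model (not Slang's `IRInst`).
enum class Op { Var, GetElementPtr, GetFieldAddress, Other };

struct Inst {
    Op op;
    const Inst* base;  // address operand for the two address ops, else unused
};

// Follows  ac :: v | getElementPtr(ac, ...) | getFieldAddress(ac, ...)
// and returns the variable at the root of the chain, or nullptr when
// `inst` is not an access chain.
static const Inst* rootVarOfAccessChain(const Inst* inst) {
    while (inst) {
        switch (inst->op) {
        case Op::Var:
            return inst;
        case Op::GetElementPtr:
        case Op::GetFieldAddress:
            inst = inst->base;  // keep walking the chain, not the leaf case
            break;
        default:
            return nullptr;
        }
    }
    return nullptr;
}
```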
*	Fix successor computation for `switch` instruction (#520)	Tim Foley	2018-04-23
\| \| \| \| \| \| \|	Fixes #519 The code was leaving out the `default` label from the successor list, which would break any passes that require an accurate CFG (with the big one right now being the SSA-formation pass).
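As a rough C++ illustration of the fix, the sketch below lists the successors of a `switch` so that the `default` target is included alongside the case targets. `Block`, `SwitchInst`, and `successorsOf` are hypothetical names, and the sketch assumes every `switch` carries a default label; the actual IR classes in Slang differ.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-ins for IR blocks and the `switch` terminator.
struct Block { int id; };

struct SwitchInst {
    const Block* defaultLabel = nullptr;
    std::vector<std::pair<int, const Block*>> cases;  // (case value, target)
};

// Successor list for a `switch`: the `default` target plus every case
// target. Leaving out `defaultLabel` gives an incomplete CFG.
static std::vector<const Block*> successorsOf(const SwitchInst& sw) {
    assert(sw.defaultLabel != nullptr);  // assumption: default always present
    std::vector<const Block*> result;
    result.push_back(sw.defaultLabel);
    for (const auto& c : sw.cases) result.push_back(c.second);
    return result;
}
```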
*	Introduce an IR-level type system (#481)	Tim Foley	2018-04-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Introduce an IR-level type system Up to this point, the Slang IR has used the front-end type system to represent types in the IR. As a result (but ultimately more importantly) the IR representation of generics and specialization has used AST-level concepts embedded in the IR. For example, to express the specialization of `vector<T,N>` to a concrete type `float` for `T`, we needed an IR operation that could represent the specialization, with operands that somehow represented the type argument `float`. The whole thing was very complicated. The big idea of this change is to introduce a new representation in which types in the IR are just ordinary instructions, so that using them as operands makes sense. The hierarchy of IR types closely mirrors the AST-side hierarchy for now, and that will probably be something we should maintain going forward. In order to make these changes work, though, I also had to do major overhauls of things like the way substitutions are performed, how we check interface conformances, the way lookup through interface types is done, etc. etc. This is a big change, and unfortunately any attempt to summarize it in the commit message wouldn't do it justice. * Fix 64-bit build warning * Fix up some clang warnings/errors
*	Implement "operator comma" in IR codegen (#472)	Tim Foley	2018-04-02
\| \| \|	Fixes #471