slang.git - Making it easier to work with shaders

	Commit message (Collapse)	Author	Age
*	Feature/lex memory reduction (#762)	jsmall-nvidia	2018-12-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Only do scrubbing if needed. When allocating content try to limit size (with scrubbing each token takes up 1k), now it's 16 bytes min size. * Don't allocate for every call to write on the CallbackWriter - use the m_appendBuffer. * Don't allocate memory for CallbackWriter use m_appendBuffer. * Use UnownedStringSlice for suffix output for parsing float/int literals. Fix typo in invalidFloatingPointLiteralSuffix * Using memory arena to hold tokens that are not in SourceManager. * Improve comment on lexing. * Make UnownedStringSlice allocation simpler on SourceManager. * Fix error on gcc around UnownedStringSlice - because VC converted string + UnownedStringSlice automatically into a String. * Fix generateName needing concat string for gcc. * When constructing a Token in parseAttributeName - because it's a Identifier, we have to set the Name. * Remove translation through String on getIntrinsicOp * Make func-cbuffer-param disablable with -exclude compatibility-issue * Move memory leak in render-test. * From review - can just use "?:" instead of performing a concat.
*	Add support for Vulkan raytraicng "shader record" (#735)	Tim Foley	2018-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The syntax for this is a placeholder for now, since we will probably want to migrate to whatever gets decided on for dxc. To declare that some data should be part of the "shader record" use `layout(shaderRecordNV)` to mirror the GLSL raytracing extension: ```hlsl layout(shaderRecordNV) cbuffer MyShaderRecord { float4 someColor; uint someValue; } ``` The intention (not enforced) is that an application would map `MyShaderRecord` to "root constants" in the "local root signature" when compiling for DXR, while the output code in GLSL will always map to the shader record in Vulkan raytracing: ```glsl layout(shaderRecordNV) buffer MyShaderRecord { float4 someColor; uint someValue; }; ``` This change does not support declaring a global value of `struct` type with `layout(shaderRecordNV)` (or a `ParameterBlock` with the modifiers, although that would be a nice-to-have feature) and it does not support having the contents of the shader record be mutable (even if GLSL/Vulkan allows it). Those can/should be added in future changes. In terms of implementation, this closely mirrors the way that `layout(push_constant)` buffers were being handled, where the data inside the `ConstantBuffer<X>` (the value of type `X`) gets laid out using ordinary rules (and consuming ordinary `UNIFORM` storage, while the buffer itself is given a different layout resource to reflect that fact that it does not consume a VK `binding` any more, but a different conceptual resource. Note: an alternative design here (that might actually be preferrable) would be to have both push-constant and shader-record buffers be handled as alternative aliases for `ConstantBuffer` (or maybe `ParameterBlock`) so that you have, e.g.: ```hlsl PushConstantBuffer<X> myPushConstants; ShaderRecord<Y> myShaderRecord; ``` This alternative design avoids API-specific decorations on the declarations, and reflects the intent of the programmer very directly, even when they are compiling for a target like D3D that doesn't reflect these choices at the IL level (it could still be exposed through the Slang reflection API).
*	Allow parameter blocks to be explicitly bound to spaces (#736)	Tim Foley	2018-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Don't look at VK bindings when compiling for D3D and vice versa The compiler had been looking at all the modifiers on a declaration when piecing together binding information, whether or not those modifiers should apply on the chosen target API. This was working in practice because the "layout resource kinds" used by each API target were disjoint, for the most part. This change ensures that we don't even look at modifiers that don't apply on the chosen target, and furthermore adds a new warning that applies if the user is compiling a shader with explicit `register` bindings for Vulkan, if there are no corresponding `[[vk::binding(...)]]` attributes (under the assumption that if they want to be explicit in one case, they probably want to be explicit in all cases). * Allow explicit space/set bindings on parameter blocks The syntax for the D3D case is to specify a `space` in a `register` modifier, without any other register class: ```hlsl ParameterBlock<X> myBlock : regsiter(space999); ``` In the Vulkan case, the user must apply the `[[vk::binding(...)]]` attribute and is expected to use a `binding` of zero: ```hlsl [[vk::binding(0,999)]] ParameterBlock<X> myBlock; ``` This change includes a reflection test for the new capability (where we also confirm that it produces the expected output when compared with fxc), and a test for the diagnostic messages when the user messes up bindings for Vulkan. The implementation itself is fairly straightforward, since the compiler already treats registe spaces/sets as a resource that parameters can consume directly. Note: the test case for explicit parameter block space/set bindings includes some commented out code that lead to a compiler crash. I would like to fix the underlying issue, but it seemed sensible to keep the bug fix out of a change like this that is adding functionality.
*	Add support for unbounded arrays as shader parameters (#725)	Tim Foley	2018-11-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add support for unbounded arrays as shader parameters With this change, Slang shaders can use unbounded-size arrays as parameters, e.g.: ```hlsl Texture2D t[] : register(t3, space2); SamplerState s[]; ``` As shown in the above example, Slang supports both explicit `register` declarations on unbounded-size arrays and also implicit binding. When doing automatic parmaeter binding, Slang will allocate a full register space to an unbounded-size array of textures/smaplers, starting at register zero. Note that for the Vulkan target, an array of descriptors of any size (including unbounded size) consumes only a single `bindign`, so much of this logic is specific to D3D targets. Details on the changes made: * The single biggest change is a new `LayoutSize` type that is used to store a value that can either be a finite unsigned integer or a dedicated "infinite" value (which is stored as the all-bits-set `-1` value). This is used in places where a size could either be a finite value or an "unbounded" value, to both try to make standard math robust against the infinite case, and also to force code to deal with both the finite and infinite cases more explicitly when they care about the difference. * The public API was documented so that unbounded-size arrays report their size as `-1`. We should probably change this function to return a signed value instead of `size_t`, but that would technically be a source-breaking change, so we want to make sure we stage it appropriately. * The code that invokes fxc was updated so that it passes the appropriate flag to enable unbounded arrays of descriptors. I haven't looked yet at whether dxc needs such a flag, so there may need to be a follow-on change to add that. * The logic in the `UsedRanges::Add` method for tracking what registers have been claimed was rewritten because the previous version had some subtle bugs. The new version includes more detailed comments that attempt to explain why I think the new logic works. * The top-level logic for auto-assigning bindings to parameters has been overhauled to deal with the fact that a parameter that needs "infinite" amounts of a resource should be claiming a full register space for those resources instead. Whenever a parameter allocates any register spaces we want them all to be contiguous, so we have a loop that counts the requirements and allocates the spaces before we go along and dole them out. * When computing the layout for an array type, we need to carefully deal with unbounded-size arrays. In the case of an unbounded array of a "simple" resource type (e.g., `Texture2D[]`), we opt to expose the type layout as consuming an infinite number of the appropriate register, while in the case of a complex type (say, a `struct` with two texture fields), we need to instead allocate whole spaces for those fields. The logic here is more subtle than I would like, and interacts with the existing code that "adjusts" the element type of an array in order to make standard indexing math Just Work. * Similarly, when a `struct` type has unbounded-array fields, then we need to transform any field with infinite register requirements to instead consume a space in the resulting aggregate type. This case is comparatively easier than the array case. * The test case for unbounded arrays covers both explicit and implicit bindings, and also the case of an unbounded array over a `struct` type (it does not cover the case of a `struct` contianing unbounded arrays, so that will need to be added later). For this test we are both validation the output reflection data and that we produce the same code as fxc (with explicit bindings in the fxc case). * The reflection test app was modified to use the new API contract and detect when a parameter consumes `SLANG_UNBOUNDED_SIZE` resources. * Fixup: ensure unbounded size is defined at right bit width
*	Add callable shader support for Vulkan ray tracing (#718)	Tim Foley	2018-11-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add callable shader support for Vulkan ray tracing This change extends the previous work to update Vulkan ray tracing support for the finished `GL_NV_ray_tracing` spec. One of the features missing in the experimental extension that was added to the final spec is "callable shaders," which allow ray tracing shaders to call other shaders as general-purpose subroutines. Most of the implementation work here mirrors what was done for the `TraceRay()` function to map it to `traceNV()`. We map the generic `CallShader<P>` function to the non-generic `executeCallableNV`, with a payload identifier that indicates a specific global variable of type `P` (the global variable being generated from a `static` local in `CallShader`). A new modifier is added to identify the payload structure, and the parameter binding/layout logic introduces a new resource kind for callable-shader payload data (where previously the logic had assumed ray and callable payloads should use the same resource kind). Two test shaders are included: one for the callable shader (`callable.slang`) and one for a ray generation shader that calls it (`callable-caller.slang`). Just for kicks, the payload data type is defined in a shared file so that we can be sure the two agree (trying to emulate what might be good practice, and ensure that ray tracing support works together with other Slang mechanisms). * Typo fix: assocaited->associated One instance was found in review, but I went ahead and fixed a bunch since I seem to make this typo a lot. * Typo fix: defintiion->definition
*	Support cross-compilation of ray tracing shaders to Vulkan (#663)	Tim Foley	2018-10-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Move to newer glslang * Support cross-compilation of ray tracing shaders to Vulkan This change allows HLSL shaders authored for DirectX Raytracing (DXR) to be cross-compiled to run with the experimental `GL_NVX_raytracing` extension (aka "VKRay"). * The GLSL extension spec is marked as experimental, so that any shaders written using this support should be ready for breaking changes when the spec is finalized. * "Callable shaders" are not exposed throug the GLSL extension, so this feature of DXR will not be cross-compiled. * The experimental Vulkan raytracing extension does not have an equivalent to DXR's "local root signature" concept. This does not visibly impact shader translation (because the local/global root signature mapping is handled outside of the HLSL code), but in practice it means that applications which rely on local root signatures on their DXR path will not be able to use the translation in this change as-is; more work will be needed. The simplest part of the implementation was to go into the Slang standard library and start adding GLSL translations for the various DXR operations. In some cases, like mapping `IgnoreHit()` to `ignoreIntersectionNVX()` this is almost trivial. The various functions to query system-provided values (e.g., `RayTMin()`) were also easy, with the only gotcha being that they map to variables rather than function calls in GLSL, and our handling of `__target_intrinsic` assumes that a bare identifier represents a replacement function name, and not a full expression, so we have to wrap these definitions in parentheses. The tricky operations are then `TraceRay<P>()` and `ReportHit<A>()`, because these two are generics/templates in HLSL. GLSL doesn't support generics, even for "standard library" functions, so the raytracing extension implements a slightly complex workaround: the matching operations `traceNVX()` and `reportIntersectionNVX()` pass the payload/attributes argument data via a global variable. That is, shader code for the GLSL extensions writes to the global variable and then calls the intrinsic function. The linkage between the call site and the global is established by a modifier keyword (`rayPayloadNVX` and `hitAttributeNVX`, respectively) and in the case of ray payload also uses `location` number to identify which payload global to use (since a single shader can trace rays with multiple payload types). Our translation strategy in Slang tries to leverage standard language mechanisms instead of special-case logic. For example, to translate the `ReportHit<A>()` function, we provide both a default declaration that will work for HLSL (where the operation is built-in with the signature given), and a definition marked with the `__specialized_for_target(glsl)` modifier. The GLSL definition declares a function `static` variable that will fill the role of the required global, and then does what the GLSL spec requires: assigns to the global, and then calls the `reportIntersectionNVX` builtin (which we declare as a separate builtin). Our ordinary lowering process will turn that `static` variable into an ordinary global in the IR, and the `[__vulkanHitAttributes]` attribute on the variable will be emitted as `hitAttributeNVX` in the output. There is no additional cross-compilation logic in Slang specific to `ReportHit<A>()` - the target-specific definition in the standard library Just Works. The case for `TraceRay<P>()` is a bit more complicated, simply because the GLSL `traceNVX()` function needs to be passed the `location` for the payload global. We implement the payload global as a function-`static` variable, with the knowledge that every unique specialization of `TraceRay<P>()` will generate a unique global variable of type `P` to implement our function-`static` variable. We then add a slightly magical builtin function `__rayPayloadLocation()` that can map such a variable to its generated `location`; the logic for this is implemented in `emit.cpp` and described below. We also changed the `RayDesc` and `BuiltinTriangleIntersectionAttributes` types from "magic" intrinsic types over to ordinary types (because the GLSL output needs to declare them as ordinary `struct` types). This ends up removing some cases in the AST and IR type representations. By itself this change would break HLSL emit, because in that case the types really are intrinsic. We added a `__target_intrinsic` modifier to these types to make them intrinsic for HLSL, and then updated the downstream passes to handle the notion of target-intrinsic types. The logic for binding/layout of entry point inputs and outputs was updated so that raytracing stages don't follow the default logic for varying input/output parameters. This is because the input/output parameters of a raytracing entry point aren't really "varying" in the same sense as those in the rasterization pipeline. In particular, the SPIR-V model for raytracing input and output treats "ray payload" and "hit attributes" parameters as being in a distinct storage class from `in` or `out` parameters. We also detect cases where a ray tracing stage declares inputs/outputs that it shouldn't have. This logic could conceivably be extended to other stages (e.g., to give an error on a compute shader with user-defined varying input/output). The type layout logic added cases for handling raytracing payload and hit-attribute data, but this is currently just a stub implementation that follows the same logic as for varying `in` and `out` parameters (it cannot give meaningful byte sizes/offsets right now). To my knowledge the GLSL spec doesn't currently specify anything about layout, and I haven't read the DXR spec language carefully enough to know what it says about layout. A future change should update the layout logic to allow for byte-based layout of ray payloads, etc. so that we can query this information via reflection. The GLSL legalization logic in `ir.cpp` was updated to factor out the per-entry-point-parameter code into its own function, and then that function was updated to special-case the input/output of a ray-tracing shader. While for rasterization stages we typically want to take the user-declared input/output and "scalarize" it for use in GLSL (in part to deal with language limitations, and in part to tease system values apart from user-defined input/output), the GLSL spec for raytracing requires payload and hit attribute parameters to be declared as single variables. There is also the issue that even for an `in out` parameter, a ray payload parameter should only turn into a single global, whereas the handling for varying `in out` parameters generates both an `in` and an `out` global for the GLSL case. Other than the handling of entry point parameters, the GLSL legalization pass doesn't need to do anything special for ray tracing shaders. The trickiest change in the `emit.cpp` logic is that we now generate `location`s for ray payload arguments (the outgoing from a `TraceRay()` call) on demand during code generation. This is a bit hacky, and it would be nice to handle it as a separate pass on the IR rather than clutter up the emit logic, but this approach was expedient. Basically, any of the global variables that got generated from the `static` declarations in the standard library implementation of `TraceRay()` will trigger the logic to assign them a `location`. The logic for emitting intrinsic operations added a few new `$`-based escape sequences. The `$XP` case handles emitting the location of a generated ray payload variable; this is how we emit the matching location at the site where we call `traceNVX`. The `$XT` case emits the appropriate translation for `RayTCurrent()` in HLSL, because it maps to something different depending on the target stage. All of the test cases here consist of a pair of an HLSL/Slang shader written to the DXR spec, plus a matching GLSL shader for a baseline. The GLSL shaders are carefully designed so that when fed into glslang they will produce the same SPIR-V as our cross-compilation process. This kind of testing is quite fragile, but it seems to be the best we can do until our testing framework code supports both DXR and VKRay. A bunch of the core changes ended up being blocked on issues in the rest of the compiler, so some additional features go implemented or fixed along the way: The first big wall this work ran into was that the `__specialized_for_target` modifier hasn't actually been working correctly for a while. It turns out that for the one function that is using it, `saturate()`, we have been outputting the workaround GLSL function in all cases (including for HLSL output) rather than only on GLSL targets. The problem here is that for a generic function with a `__specialized_for_target` modifier or a `__target_intrinsic` modifier, the IR-level decoration will end up attached to the `IRFunc` instruction nested in the `IRGeneric`, but the logic for comparing IR declarations to see which is more specialized (via `getTargetSpecializationLevel()`) was looking only at decorations on the top-level value (the generic). The quick (hacky) fix here is to make `getTargetSpecializationLevel()` try to look at the return value of a generic rather than the generic itself, so that it can see the decorations that indicate target-specific functions. A more refined fix would be to attach target-specificity decorations to the outer-most generic (to simplify the "linking" logic). The only reason not to fold that into the current fix is that the `__target_intrinsic` modifier currently serves double-duty as a marker of target specialization and information to drive emit logic. The latter (the emit-related stuff) currently needs to live on the `IRFunc`, and moving it to the generic could easily break a lot of code. This needs more work in a follow-on fix, but for now target specialization should again be working. The other big gotcha that the simple "just use the standard library" strategy ran into was that function-`static` variables weren't actually implemented yet, and in particular function-`static` variables inside of generic functions required some careful coding. The logic in `lower-to-ir.cpp` has this `emitOuterGenerics()` function that is supposed to take a declaration that might be nested inside of zero or more levels of AST generics, and emit corresponding IR generics for all those levels. This is needed because two different AST functions nested inside a single generic `struct` declaration should turn into distinct `IRFunc`s nested in distinct `IRGeneric`s. The tricky bit to making that all work is that the same AST-level generic type parameter will then map to different IR-level instructions (the parameters of distinct `IRGeneric`s) when lowering each function. The existing logic handled this in an idiomatic way by making "sub-builders" and "sub-contexts." This change refactors some of the repeated logic into a `NestedContext` type to help simplify the pattern, and applies it consistently throughout the `lower-to-ir.cpp` file. Besides that cleanup, the major change is `lowerFunctionStaticVarDecl` which, unsurprisingly, handles lower of function-`static` variables to IR globals. The careful handling of nested contexts here is needed because if we are in the middle of lowering a generic function, then a `static` variable should turn into its own `IRGeneric` wrapping an `IRGlobalVar`. The body of the function should refer to the global variable by specializing the global variable's `IRGeneric` to the parameters of the functions `IRGeneric`. This tricky detail is handled by `defaultSpecializeOuterGenerics`. An additional subtlety not actually required for this raytracing work (and thus not properly tested right now) is handling function-`static` variables with initializers. These can't just be lowered to globals with initializers, because HLSL follows the C rule that function-`static` variables are initialized when the declaration statement is first executed (and this could be visible in the presence of side-effects). The lowering strategy here translates any `static` variable with an initializer into two globals: one for the actual storage, plus a second `bool` variable to track whether it has been initialized yet. There are some opportunities to optimize this case, especially for `static const` data, but that will need to wait for future changes. We've slowly been shifting away from the model where a user thinks of a "profile" as including both a stage and a feature level. Instead, the user should think about selecting a profile that only describes a feature level (e.g., `sm_6_1`, `glsl_450`, etc.), and then separately specifying a stage (`vertex`, `raygeneration, etc.) for each entry point. The challenge here is that the command-line processing still only had a single `-profile` switch, and no way to specify the stage. Adding the `-stage` option was relatively easy, but making it work with the existing validation logic for command-line arguments was tricky, because of the complex model that `slangc` supports for compiling multiple entry points in a single pass. * In `slang.h` add new reflection parameter categories for ray payloads and hit attributes, as part of entry point input/output signatures. * A previous change already updated our copy of glslang to one that supports the `GL_NVX_raytracing` extension, so in `slang-glslang.cpp` we just needed to map Slang's `enum` values for the raytracing stage names to their equivalents in the glslang code. * Moved the logic for looking up a stage by name (`findStageByName()`) out of `check.cpp` and into `compiler.cpp`, with a declaration in `profile.h` * Added a `$z` suffix to the GLSL translation of `Texture.SampleLevel()`, to handle cases where the texture element type is not a 4-component vector. Note that this fix should actually be applied to all* these texture-sampling operations, but I didn't want to add a bunch of changes that are (clearly) not being tested right now. * The layout logic for entry points was updated to correctly skip producing a `TypeLayout` for an entry point result of type `void`, which meant that the related emit logic now needs to guard against a null value for the result layout. * In `ir.cpp`, dump decorations on every instruction instead of just selected ones, so that our IR dump output is more complete. * Added a command-line `-line-directive-mode` option so that we can easily turn off `#line` directives in the output when debugging. Not all cases where plumbed through because the `none` case is realistically the most important. * Parser was fixed to properly initialize parent links for "scope" declarations used for statements, so that we can walk backwards from a function-scope variable (including a `static`) and see the outer function/generics/etc. * Added GLSL 460 profile, since it is required for ray tracing. Also updated the logic for computing the "effective" profile to use to recognize that GLSL raytracing stages require GLSL 460. * Added some conventional ray-tracing shader suffixes to the handling in `slang-test`. This code isn't actually used, but was relevant when I started by copy-pasting some existing VKRay shaders as the starting point for my testing. * Fixup: typos
*	Remove the "hack sampler" workaround (#648)	Tim Foley	2018-09-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Update glslang version * Fix build for new glslang The latest glslang required a few changes to our manual build for their code (because we are not taking a dependency on CMake). * Rebuild project files using premake, which picks up a few files added to glslang, but also a few diffs in Slang's own project files in cases where they were edited manually instead of using premake. * Fix up the declaration our our device limits (which are inentionally set to not limit what code passes through our glslang), because the underlying structure definition in glslang has changed. This is a kludgy bit of glslang's design, but it doesn't make sense for us to invest in a more serious workaround. * Remove the "hack sampler" workaround When the `GL_KHR_vulkan_glsl` spec was introduced to allow GLSL to be compiled for Vulkan SPIR-V, it made an annoying mistake by leaving a few builtins as taking `sampler2D`, etc. when the equivalent SPIR-V operations only require a `texture2D`, etc. The relevant builtins are: * `textureSize` * `textureQueryLevels` * `textureSamples` * `texelFetch` * `texelFetchOffset` This means that shader code that wanted to use those operations needed to conspire to have a `sampler` handy so they could write, e.g.: ```glsl vec4 val = texelFetch(sampler2D(myTexture, someRandomSampler), p, lod); ``` when what they really wanted was this: ```glsl vec4 val = texelFetch(myTexture, p, lod); ``` That is annoying but probably something each to work around for a GLSL programmer, but when cross-compiling from HLSL, you might have an operation like: ```hlsl float4 val = myTexure.Load(p); ``` in which case a cross-compiler needs to manufacture a sampler out of thin air. If the shader happened to use a sampler for something else you could snag that, but in the worse case you had to cross-compile to GLSL that declared a new sampler. Slang did this by declaring a sampler called `SLANG_hack_samplerForTexelFetch` (because `texelFetch` is the operation that first surfaced the issue). For complex reasons we always define this sampler, even if we turn out not to need it in a particular output kernel. This choice has a bunch of annoying consequences: * There is always a sampler defined in descriptor set zero, because that's where we put the hack sampler, so a user-defined parameter block always has a set number of 1 or greater (see #646). * The hack sampler shows up in reflection output because users need to size their descriptor sets appropriately to pass along this sampler that won't actually be used if they don't want to get debug spew from the validation layers. We filed an issue on glslang about this problem, and eventually some kind folks from the gamedev community (who also saw the same problem) defined an extension spec (`GL_EXT_samplerless_texture_functions`) to fix the underlying issue and contributed a patch to glslang to make it support that extension. This change just backs the hack out of Slang now that we have a glslang version that supports the extension to get past the defect in the original GLSL-for-Vulkan definition. Besides yanking out the code for the hack, we also change the relevant builtins to declare that they require this new GLSL extension (so that we properly request it from glslang when the builtins are used), and fix some reflection test cases that exposed the existence of the "hack sampler." * Fixup: syntax error in stdlib generator files * Remove more code for hack sampler There was logic to ensure we always have a "default" register space/set when cross-compiling, because the hack sampler would need it. This is no longer necessary once we remove the hack sampler. * Fix expected test output. Fixing the root cause of issue #646 means that one of our test cases that tickles that issue now produces different output (luckily it can now be used as a regression test for the issue).
*	Support for [[vk::push_constant]] (#629)	jsmall-nvidia	2018-08-22
\| \| \| \| \| \| \| \| \| \|	* Support for attributed [[vk::push_constant]] and [[push_constant]]. Can also use layout(push_constant). * Fix test so matches the expected output. * Add expected output to binding-push-constant-gl.hlsl * Trivial change to force travis rebuild to test the gcc linux build really has a problem.
*	Feature/attributed binding (#621)	jsmall-nvidia	2018-07-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Typo fix, and added dxc to command line documentation. * Fix small typos. Added support for Scope to lexer. Fix bug in Token ctor. * Add support for attribute names that are scoped. * Added GLSLBindingAttribute. Make binding work through core.met.slang. * Allow [[gl::binding(binding, set)]] [[vk::binding(binding,set)]]
*	Cleanups around behavior when the compiler fails (#553)	Tim Foley	2018-05-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Cleanups around behavior when the compiler fails * Add another case where we try to `noteInternalErrorLoc()` if an exception in thrown. This one is the in the logic for emitting an IR instruciton. This could be improved by adding another layer at the function level (as a catch-all for instructions with no location), but something is better than nothing. * Change a bunch of `assert()`s over to `SLANG_ASSERT()`s, so that we can theoretically take more control over them (e.g., make release builds with asserts enabled) * Some other small cleanups around the assertions we perform. In the survey I made, I didn't really see many obvious "smoking gun" cases where we could produce a significantly better error message for some of the unimplemented/unexpected paths, other than to actually implement the missing functionality. * fixup
*	Add support for explicit register space bindings (#542)	Tim Foley	2018-05-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change adds support for specifying explicit register spaces, like: ```hlsl // Bind to texture register #2 in space #1 Texture2D t : register(t2, space1); ``` I added a test case to confirm that the register space is properly propagated through the Slang reflection API. This change also adds proper error messages for some error/unsupported cases that weren't being diagnosed: * Specifying a completely bogus register "class" (e.g., `register(bad99)`) * Failing to specify a register index (`register(u)`) * Specifying a component mask (`register(t0.x)`) * Using `packoffset` bindings I added test cases to cover all of these, as well as the new errors around support for register `space` bindings. In order to get the existing tests to pass, I had to remove explicit `packoffset` bindings from some DXSDK test shaders. None of these `packoffset` bindings were semantically significant (they matched what the compiler would do anyway, for both Slang and the standard HLSL compiler). Removing them is required for Slang now that we give an explicit error about our lack of `packoffset` support. In a future change we might add logic to either detect semantically insignificant `packoffset`s, or to just go ahead and support them properly (as a general feature on `struct` types).
*	Introduce an IR-level type system (#481)	Tim Foley	2018-04-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Introduce an IR-level type system Up to this point, the Slang IR has used the front-end type system to represent types in the IR. As a result (but ultimately more importantly) the IR representation of generics and specialization has used AST-level concepts embedded in the IR. For example, to express the specialization of `vector<T,N>` to a concrete type `float` for `T`, we needed an IR operation that could represent the specialization, with operands that somehow represented the type argument `float`. The whole thing was very complicated. The big idea of this change is to introduce a new representation in which types in the IR are just ordinary instructions, so that using them as operands makes sense. The hierarchy of IR types closely mirrors the AST-side hierarchy for now, and that will probably be something we should maintain going forward. In order to make these changes work, though, I also had to do major overhauls of things like the way substitutions are performed, how we check interface conformances, the way lookup through interface types is done, etc. etc. This is a big change, and unfortunately any attempt to summarize it in the commit message wouldn't do it justice. * Fix 64-bit build warning * Fix up some clang warnings/errors
*	Entry point attribute (#447)	Tim Foley	2018-03-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Typo * Add [shader(...)] and clean up some literal handling * Add supporting for validating the `[shader(...)]` attribute, by checking that its argument is a string literal that names a known shader stage. * Split the `ConstantExpr` class into distinct subclasses rooted at `LiteralExpr`, so we have `BoolLiteralExpr`, `IntegerLiteralExpr`, `FloatingPointLiteralExpr`, and `StringLiteralExpr` * Add a `String` type to the stdlib, to be used as the type of a string literal. This change allows code using `[shader(...)]` to be accepted by the front-end again, but it does nothing about emitting it in final HLSL. * Allow entry points to be specified via [shader(...)] Before this change, the compiler would track a list of `EntryPointRequest` objects, based on what the suer specified via API and/or command-line options. Each entry point request would get matched up with an AST `FuncDecl` as part of semantic checking, and then the back end steps (layout, codegen, etc.) would work from that information. This change makes the compiler modal, in that it can either continue to use an explicit list of entry point requests (this is the mode when the list is non-empty), or it can rely on user-supplied attributes on entry point functions to drive codegen (this is the mode when the list is empty). User-specified `[shader(...)]` attributes are processed at the same place where the association from `EntryPointRequest`s to `FuncDecl`s would otherwise be made, and basically does the same thing in the opposite direction: looks for `FuncDecl`s with the appropriate attribute and synthesizes an `EntryPointRequest` for them. Subsequent processing should ideally not know where a given `EntryPointRequest` came from, and should handle both methods of specifying the entry points equivalently. One design choice that might not make immediate sense is that we do not process a function as an entry point (applying further validation, etc.) just because it has a `[shader(...)]` modifier, unless we are in the appropriate mode (which in this case is the mode where the user didn't specify their own entry points via API or command line). This is to handle cases where the user wants to explicitly compile only one entry point, so that they (1) don't want us to spend time validating code they don't care about, (2) don't want do get output they don't expect, and (3) might actually be presenting us with code that violates the language rules due to a combination of `#define`s in effect (e.g., they might have a `[shader("vertex")]` function that transitively executes a `discard` because of how the preprocessor was configured, but they don't care because they are compiling a fragment entry point). This decision might be something we revisit over time. As part of this work, I had to add some logic to pick a "profile version" to use for a combination of a target and stage (because when you specify `[shader("vertex")]` the compiler can't tell if you want `vs_5_0`, `vs_5_1`, etc.). This isn't really complete right now, because something like `-target dxbc` also doesn't determine a profile, so there is a bit of a kludge at present. We need to figure out a good long-term plan here, which might involve keeping target format, feature level/version, and pipeline stage as truly orthogonal concepts, rather than conflating them. That would involve more work in the API and command-line layers to de-compose things when the user specifies, e.g., `vs_5_1`, but might make downstream logic easier to manage. * Emit [shader(...)] attribute on entry point for SM 6.1 and later This should help ensure that the output from Slang can be compiled with dxc `lib_` profiles. Fix warning
*	Small bug fixes. (#445)	Yong He	2018-03-16
\| \| \| \| \| \| \|	This commit contains two small bug fixes: 1. In `specializeProgramLayout`, we cannot assume the resourceInfo entries in a varLayout and its corresponding type layout has the same order. Should use `FindResourceLayout`. 2. When generating ir for a switch statement, make sure to remove the breakLabel from the shared context when done. For some reason if a switch statement is being lowered twice, the Dictionary::Add method will complain the statement key already exists.
*	Initial support for cross-compilation of geometry shaders to GLSL (#423)	Tim Foley	2018-02-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These changes are related to getting a first Slang geometry shader to translate to GLSL. There are some unrelated cross-compilation fixes in here as well. * Add direct support to shader parameter layout for GS output streams, so that they are reflected as a container type * Fix the declarations of the `SampleCmp` methods; they should always return `float`, independent of the nominal element type of the texture. * Fix up our handling of `__target_intrinsic` modifiers, so that we are a little bit more careful in how we detect something as being just a simple name replacement (e.g., `__target_intrinsic(glsl, "foo")` should make us output `foo(original, args, here)`) vs. a custom expression (e.g., `__target_intrinsic(glsl, "bar+1")` should output `bar+1` and not use any arguments, even without any `$` substitutions). * Don't emit the `[unroll]` modifier when outputting GLSL. Eventually we need to fully unroll loops for GLSL output anyway. * Inspect th entry point parameter list (from the layout information) when emitting a GS, so that we can write out the correct `layout` modifiers for input primitive type and output primitive topology. * Add a new case to `ScalarizedVal` to handle cases where an HLSL system value needs to map to a GLSL built-in variable with a slightly different type (e.g., `SV_RenderTargetArrayIndex` is a `uint` while `gl_Layer` is an `int`). For now this is only hanlding trivial cases (where a direct cast can achieve the result we want), but eventually it might need to handle things like conversion between arrays and vectors. * This is mostly just the infrastructure for the feature, and the actual enumeration of the correct types for all the system values is still to be done. * Handle a few more cases in assignment between `ScalarizedVal`. In particular, deal with cases where `materializeValue` is called on a tuple that has an array type, so that we need to construct the individual array elements. * Add translation for GS output stream `Append()` and `RestartStrip()` * Note that the translation of `Append()` seems to ignore its argument; this is because we desugar the operation during legalization for GLSL (see next item) * When legalizing for GLSL, detect an entry point parameter that is a GS stream, and translate it into `out` variables for its element type, and then rewrite any calls to `Append()` in the body of the entry point to be preceded by assignment to those variables. This works in tandem with the above translation of HLSL `Append()` calls into GLSL `EmitVertex()` calls. * We are detecting calls to `Append()` in a slightly hacky way, by looking at decorations on the callee to make sure that it is a function that is determined to translate to `EmitVertex()`. * Right now we aren't handling calls to `Append()` in other functions. It wouldn't be hard in principle to walk all the functions in the module and apply the translation (assuming we don't want to start supporting multiple output streams), but this wouldn't handle the passing of the GS output stream between functions. (This points out that there is a need for an additional type legalization pass that desugars away parameters of types that aren't actually meaningful on the target).
*	Handling of duplicate global shader parameter declarations (#405)	Tim Foley	2018-02-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Issue error when shader parameter type doesn't match between translation units We currently allow users to compile different translation units at the same time for the vertex and fragment shader, and those different translation units might contain distinct declarations for the "same" parameter. Furthermore, those declarations might use distinct declaratins of the "same" type. We currently bind those as if they were one parameter, which means we assume they have types that are actually equivalent. Unfortunately, things break when that isn't the case. This change adds error messages when parameter declarations that are determined to be the "same" don't have types that are either equiavlent or match "structurally" (same names, same fields, and field types match recursively). Ideally most users won't see these errors, because they will only submit single files to the compiler. Eventually we may actually decide to require that case and simplify our logic. * Support types with layout coming from a "matching" type Because of how the front-end performs parameter binding, we can end up in cases where a variable is given a `TypeLayout` based on a "matching" type from another translation unit (because the "same" shader parmaeter got declared in multiple TUs). This change tries to support that use case by avoiding absolute field decl-refs in the representation used for type legalization, so that we don't end up in a situation where we look up field layout based on information that doesn't match.
*	Make specialization presserve global parameter enumeration order in ↵	Yong He	2018-01-19
\| \| \| \|	reflection data
*	Refactor substitution representation in DeclRefBase (#363)	Yong He	2018-01-12
\| \| \| \| \| \| \| \| \| \| \| \|	This commit changes the type of `DeclRefBase::substitutions` from `RefPtr<Substitutions>` to `SubstitutionSet`, which is a new type defined as following: ``` struct SubstitutionSet { RefPtr<GenericSubstitution> genericSubstitutions; RefPtr<ThisTypeSubstitution> thisTypeSubstitution; RefPtr<GlobalGenericParamSubstitution> globalGenParamSubstitutions; } ``` This change get rid of most helper functions to retreive the substitution of a certain type, as well as surgery operations to insert a `ThisTypeSubstitution` or `GlobalGenericTypeSubstittuion` at top or bottom of the substitution chain. It also simplies type comparison when certain type of substitution should not be considered as part of type definition.
*	Bug fixes for Slang integration (#356)	Yong He	2018-01-04
\| \| \| \| \| \| \| \| \| \| \| \|	* fix #353 * move validateEntryPoint to after all entrypoints has been checked * bug fix: DeclRefType::SubstituteImpl should change ioDiff * bug fix: generic resource usage should have count of 1 instead of 0. * update test case
*	Add API for querying TypeLayout from a Type	Yong He	2018-01-03
\| \| \| \| \| \|	Added two API functions: 1. `spReflection_FindTypeByName`, which returns a DeclRefType to the struct type with the given name. The function finds from all loaded modules in a `CompileRequest` for a decl with the given name, construct a `Type` object and cache it in `CompileRequest::types` dictionary. The subsequent calls to `spReflection_FindTypeByName` with the same name will simply returned the cached Type objects. 2. `spReflection_GetTypeLayout`, which returns a `TypeLayout` for a given `Type`. This function creates and caches the `TypeLayout` in the `TargetRequest` object that is used to create the `ProgramLayout`.
*	no-codegen compile flag and global generics reflection (#347)	Yong He	2018-01-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* no-codegen compile flag and global generics reflection 1. Add SLANG_COMPILE_FLAG_NO_CODEGEN (-no-codegen) compiler flag to skip code generation stage, so that a shader that uses global generic type parmameters can be parsed, checked and introspected without knowing the final specialization. 2. Add reflection API to query for global generic type parameters, global parameters of generic type, and the generic type parameter index related to a global generic parameter. 3. Add a reflection test case for global generic type parameters. * add expected result for global-type-params test case. * fix reflection json output. * fix branch condition errors * fix expected result for global-type-params.slang * fix expected test case output
*	Add sample-rate-input detection for HLSL. (#312)	Tim Foley	2017-12-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add sample-rate-input detection for HLSL. In the HLSL case, it is possible to do this detection entirely based on declared signatures (it doesn't have a dependency on code generation like the GLSL case does). I've added test cases for the two main ways that a shader can become sample rate: 1. Qualify a fragment input with `sample` 2. Accept an input with the `SV_SampleIndex` semantic In each case I nested the input inside a `struct` to try to match common HLSL idiom, and to make sure that we handle the nested case. This code is not robust against shaders that declare such an input and then never use it, but that is to be expected given the goals for Slang. * Fixup: add missing test output files
*	More fixups for parameter block binding generation (#311)	Tim Foley	2017-12-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* More fixups for parameter block binding generation The bug in this case arises when there is both a parameter block and global-scope resources, all of which are relying on automatic binding assignment. If the parameter block is the first global-scope parameter that gets encountered, then it is possible for it to allocate regsiter space/set zero for itself, which confuses the logic for handling other global-scope parameters (which assumes that they get space/set zero). I've also made some fixup to the reflection test harness and reflection API code: - Have the hardness handle register-space allocations when printing, and be sure to only show their `index` and not their `space` (since that would be redundant) - Have the reflection API only auto-redirect queries on a parameter group type layout to its container type layout if the container type layout has a non-zero number of resource allocations. The problem that arises here is a `ParameterBlock<X>` where `X` doesn't contain any uniforms, so that no container is needed. In that case the container ends up with no resource allocation(s). * Fixups for test failures. - The thread-group size tests failed because they had shader parameters with no resources to back them (built-in `SV_` inputs), and the printing of those changed. I fixed up the baseline, but also had to fix a few bugs in the reflection test fixture's printing logic. - The GLSL parameter block test revealed a corner case of the existing logic: because we always need to generate a binding for the "hack" sampler (even if code doesn't end up needing it), and that sampler should always go in the "default" set (should be set zero), the user's `ParameterBlock` will always end up as `set=1` or later, even if there are no other global-scope parameters. - This will be fixed once we don't have to rely on glslang's annoying behavior in this one case, either because glslang gets fixed, or because we implement our own SPIR-V codegen.
*	Add API to query stage of varying parameter (#302)	Tim Foley	2017-11-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes #301 The problem here is that if you have input GLSL code like: ```glsl // example.vs in vec3 pos; ``` and: ```glsl // example.fs in vec3 worldPos; ``` Then both `pos` and `worldPos` are reflected as global variables (parameters of the program), which both get bound to "varying input" resources, but there is no way to tell through the API that `pos` is a vertex parameter while `worldPos` is a fragment one. The original request in issue #301 was to expose parameters like this not as a global variables, but rather as parameters of the entry point in their specific file. That is, treat it as if the user had written, e.g.: ```glsl // example.vs void vsMain(in vec3 pos) { ... } ``` Doing that would unify the GLSL and HLSL/Slang cases a bit, but would require the Slang reflection API to lie about the structure of code the user wrote. At a more basic level, that would have been hard to implement because the current reflection API just exposes the underlying AST, and the AST needs to leave `pos` at the global scope so that when we go and spit GLSL back out we retain the original structure. This PR implements a more simplistic solution, where the user is allowed to query the stage that a varying parameter "belongs" to. For right now I'm only enabling this to work for varying parameters (but it doesn't care if they are entry-point or global-scope varyings). Despite what I said on #301, this should work for both the top-level parameter's variable layout, and any variable layouts for fields within its type reflection. In terms of implementation, I took the simple but wasteful route: every `VarLayout` now has a `stage` field that is by default initialized to `SLANG_STAGE_NONE`. When collecting varying parameters, I take advantage of the fact that everything bottlenecks through `processEntryPointParameter()` which takes an `EntryPointParameterState` so that I can set the `VarLayout::stage` field for any varying parameter in one place. While I was making this change, I also did a bit of cleanup so that the "official" names for the varying parameter categories are `VARYING_INPUT` and `VARYING_OUTPUT`, with `VERTEX_INPUT` and `FRAGMENT_OUTPUT` being "deprecated" in principle. I didn't do the bulk rename inside the codebase yet.
*	Generate IR per-module for loaded modules (#299)	Tim Foley	2017-11-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The basic idea here is that for each module that gets loaded via `import`, we should also generate the initial IR for the declarations in that module at the time it gets loaded. Furthermore, when we generate initial IR for a module, we will only generate IR declarations (not definitions) for any functions/variables in modules it imports. Later, when cloning IR to begin code generation for an entry point, we will effectively "link" all of the loadedm modules together, so that a given global value can get its definition from any of the IR modules present. - Change the `loadedModulesList` and related data structures to hold a new `LoadedModule` type, instead of just the AST (and then have a `LoadedModule` own both the AST and the IR module) - Share some logic between the `import` and `#import` cases, so that we always try to generate IR for modules we load. - Make sure that IR generation always gets skipped if the command-line flags tell us not to use the IR. - A few small fixups for cases that didn't arise in IR lowering so far, but come up when we try to actually generate IR for things like the stdlib. There are some notable gaps in this work right now: - The stdlib modules are exempted from this behavior; we always generate IR for stdlib functions in any user module that calls them. This is just a workaround for the fact that the stdlib modules don't show up in the list of imported modules right now. - We don't currently have logic that does the "linking" step for global variables like we do for functions. We really need to look up the symbols with the same mangled name, and favor any one of them that has a definition (if there is one) - Similarly, the handling of witness tables is incomplete. During initial IR generation, we should probably be generating empty witness tables for any conformances that were declared in other modules (but are being used locally in this module), and then the "linking" step should favor non-empty witness tables over empty ones. Still, all the test cases pass with the code like this, and this seems like an important step in the right direction.
*	fixup global generic parameters	Yong He	2017-11-20
\| \| \| \| \| \|	1. simplify RoundUpToAlignment() 2. add new a render-compute test case to cover the situation where the entry-point interface (parameter/return types of an entry-point function) is dependent on the global generic type. 3. initial fixes to get this test case to compile (but is not producing correct HLSL output yet)
*	Add support for global generic parameters (#285)	Yong He	2017-11-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add support for global generic parameters (In-progress work) This commit include: 1. Update Slang API to allow specification of generic type arguments in an `EntryPointRequest` 2. Add parsing of `__generic_param` construct, which becomes a GlobalGenericParamDecl, contains members of `GenericTypeConstraintDecl`. 3. Semantics checking will check whether the provided type arguments conform to the interfaces as defined by the generic parameter, and store SubtypeWitness values in the EntryPointRequest, which will be used by `specializeIRForEntryPoint` when generating final IR. 4. Add a new type of substitution - `GlobalGenericParamSubstitution` for subsittuting references to `__generic_param` decls or to its member `GenericTypeConsraintDecl` with the actual type argument or witness tables. 5. Update `IRSpecContext` to apply `GlobalGenericParamSubstitution` when specializing the IR for an EntryPointRequest. 6. Update `render-test` to take additional `type` inputs, which specifies the type arguments to substitute into the global `__generic_param` types. This commit does not include ProgramLayout specialization. * IR: pass through `[unroll]` attribute (#284) The initial lowering was adding an `IRLoopControlDecoration` to the instruction at the head of a loop, but this was getting dropped when the IR gets cloned for a particular entry point. The fix was simply to add a case for loop-control decorations to `cloneDecoration`. * fix warnings * IR: support `CompileTimeForStmt` (#286) This statement type is a bit of a hack, to support loops that must be unrolled. The AST-to-AST pass handles them by cloning the AST for the loop body N times, and it was easy enough to do the same thing for the IR: emit the instructions for the body N times. The only thing that requires a bit of care is that now we might see the same variable declarations multiple times, so we need to play it safe and overwrite existing entries in our map from declarations to their IR values. Of course a better answer long-term would be to do the actual unrolling in the IR. This is especially true because we might some day want to support compile-time/must-unroll loops in functions, where the loop counter comes in as a parameter (but must still be compile-time-constant at every call site). * Add support for global generic parameters (In-progress work) This commit include: 1. Update Slang API to allow specification of generic type arguments in an `EntryPointRequest` 2. Add parsing of `__generic_param` construct, which becomes a GlobalGenericParamDecl, contains members of `GenericTypeConstraintDecl`. 3. Semantics checking will check whether the provided type arguments conform to the interfaces as defined by the generic parameter, and store SubtypeWitness values in the EntryPointRequest, which will be used by `specializeIRForEntryPoint` when generating final IR. 4. Add a new type of substitution - `GlobalGenericParamSubstitution` for subsittuting references to `__generic_param` decls or to its member `GenericTypeConsraintDecl` with the actual type argument or witness tables. 5. Update `IRSpecContext` to apply `GlobalGenericParamSubstitution` when specializing the IR for an EntryPointRequest. 6. Update `render-test` to take additional `type` inputs, which specifies the type arguments to substitute into the global `__generic_param` types. progress on parameter binding * Add a more contrived test case for specializing parameter bindings * update render-test to align buffers to 256 bytes (to get rid of D3D complains on minimal buffer size). * adding one more test case for parameter binding specialization. * Cleanup according to @tfoleyNV 's suggestions. * fix a bug introduced in the cleanup
*	Parameter block work (#276)	Tim Foley	2017-11-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Don't auto-enable IR use for compute tests The `COMPARE_COMPUTE` and `COMPARE_RENDER_COMPUTE` test fixtures were set up to always enable the `-use-ir` flag on Slang, which precludes having any tests that confirm functionality on the old non-IR path (which is still required by our main customer). This change adds the `-xslang -use-ir` flags explicitly to any compute test cases that left them out, and makes the fixture no longer add it by default. * Continue building out parameter block support The initial front-end logic for parameter blocks was already added, but they are still missing a bunch of functionality. This change addresses some of the known issues: - Bug fix: don't try to emit HLSL `register` bindings for variables that consume whole register spaces/sets - Overhaul type layout logic so that it can make decisions based on a given code generation target (currently passed in as a `TargetRequest`), which allows us to decide whether or not a parameter block should get its own register set on a per-target basis. - Always use a register space/set for Vulkan - Never use a register space/set for HLSL SM 5.0 and lower - By default, don't use register spaces/sets for HLSL output - Add a command-line flag and some "target flags" to enable register-space usage for D3D targets - Hackily add initial support for parameter blocks in the AST-to-AST path - This just blindly lowers `ParameterBlock<T>` to `T`, which shouldn't quite work - A more complete overhaul will probably need to wait until the AST-to-AST legalization is changed to use the `LegalType`s from the IR legalization pass. - Add a compute-based test case to actually run code using parameter blocks - This file runs test cases both with and without the IR
*	Parameter blocks (#245)	Tim Foley	2017-11-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Rename existing ParameterBlock to ParameterGroup We are planning to add a new `ParameterBlock<T>` type, which maps to the notion of a "parameter block" as used in the Spire research work. Unfortunately, the compiler codebase already uses the term `ParameterBlock` as catch-all to encompass all of HLSL `cbuffer`/`tbuffer` and GLSL `uniform`/`buffer`/`in`/`out` blocks (all of which are lexical `{}`-enclosed blocks that define parameters...). This change instead renames all of the existing concepts over to `ParameterGroup`, which isn't an ideal name, but at least doesn't directly overlap the new terminology or any existing terminology. The new `ParameterBlockType` case will probably be a subclass of `ParameterGroupType`, since it is a logical extension of the underlying concept. * Add Shader Model 5.1 profiles The HLSL `register(..., space0)` syntax is only allowed on "SM5.1" and later profiles (which is supported by the newer version of `d3dcompiler_47.dll` that comes with the Win10 SDK, but not the older version of `d3dcompiler_47.dll` - good luck figuring out which you have!). This change adds those profiles to our master list of profiles, and nothing else. * First pass at support for `ParameterBlock<T>` - Add the type declaration in stdlib - Add a special case of `ParameterGroupType` for parameter blocks - Handle parameter blocks in type layout (currently handling them identically to constant buffers for now, which isn't going to be right in the long term) - Add an IR pass that basically replaces `ParameterBlock<T>` with `T` - Eventually this should replace it with either `T` or `ConstantBuffer<T>`, depending on whether the layout that was computed required a constant buffer to hold any "free" uniforms - Add first stab at an IR pass to "scalarize" global variables using aggregate types with resources inside. - This currently only applies to global variables, so it won't handle things passed through functions, or used as local variables - It also only supports cases where the references to the original variable are always references to its fields, and not the whole value itself - Add a single test case that technically passes with this level of support, but probably isn't very representative of what we need from the feature * Fold parameter-block desugaring into a more complete "type legalization" pass The basic problem that was arising is that once you desugar `ParameterBlock<T>` into `T`, you then need todeal with splitting `T` into its constituent fields if it contains any resource types. Handling those transformations by following the usual use-def chains wasn't really helping, because you might need systematic rewriting that can really only be handled bottom-up. This change adds a new pass that is intended to perform multiple kinds of type "legalization" at once: - It will turn `ParameterBlock<T>` into `T` - It may at some point also convert `ConstantBuffer<T>` into `T` as well - It will turn an value of an aggregate type that contains resources into N different values (one per field) - As a result of this, it will also deal with AOS-to-SOA conversion of these types Legalization is applied to every function/instruction/value, so that it can make large-scale changes that would be tough to manage with a work list. This pass needs to be run after generics have been fully specialized, so that we know we are always dealing with fully concrete types, so that their legalization for a given target is completely known. This is still work in progress; there's more to be done to get this working with all our test cases, and finish the remaining `ParameterBlock<T>` work. * Improve binding/layout information when using parameter blocks - When doing type layout for a parameter block, don't include the resources consumed by the element type in the resource usage for the parameter block - Note that this is pretty much identical to how a `ConstantBuffer<T>` does not report any `LayoutResourceKind::Uniform` usage, except that `ParameterBlock<T>` is also going to hide underlying texture/sampler reigster usage - The one exception here is that any nested items that use up entire `space`s or `set`s those need to be exposed in the resource usage of the parent (I don't have a test for this) - When type legalization needs to scalarize things, it must propagate layout information down to the new leaf variables. In general, the register/index for a new leaf parameter should be the sum of the offsets for all of the parent variables along the "chain" from the original variable down to the leaf (we aren't dealing with arrays here just yet). - When type legalization decides to eliminate a pointer(-like) type (e.g., desugar `ParameterBlock<T>` over to `T`), actually deal with that in terms of the `LegalVal`s created, so that we can know to turn a `load` into a no-op when applied to a value that got indirection removed. - Hack up the "complex" parameter-block test so that it actually passes (the big hack here is that the HLSL baseline is using names that are generated by the IR, and are unlikely to be stable as we add/remove transformations). - Note: I can't make these be compute tests right now, because regsiter spaces/sets are a feature of D3D12/Vulkan, and our test runner isn't using those APIs.
*	fix all unreachable code warnings	Yong He	2017-11-04
\|
*	Reflection: allow querying of semantics on varying input/output (#224)	Tim Foley	2017-10-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is functionality required to support a Falcor bug fix. Most of the code to compute the right semantic name/index for a parameter was already present. This change adds: - Storage for semantic name/index on every `VarLayout` - Note: this is wasteful and should be optimized later - A public API to query the semantic name/index - The contract is that this API returns `NULL` if the parameter had no semantic - A bit of work in `parameter-binding.cpp` to attach semantics to varying input/output when traversing varying parameters. - Note: this is intentionally set up so that it associates semantics even with non-leaf parameters, so that an API user can query the semantic of a `struct` parameter and know that its members will be assigned sequential semantic indices from its starting value. - Support for dumping this information in reflection tests One notable thing that I did not change here is that the reflection test fixture doesn't report information on the output of an entry point, even though it really should. That should be fixed in a separate change, though, because it would affect many of the expected outputs.
*	Implement notion of a "container format" (#213)	Tim Foley	2017-10-16
\| \| \| \| \| \| \| \| \| \| \|	The big addition here is that the Slang "bytecode" is no longer treated as just a "code generation target" (`CodeGenTarget`) akin to DX bytecode (DXBC) or SPIR-V, but instead is a `ContainerFormat` that can be used to emit all the results of a compile request (well, currently just the IR-as-BC, but the intention is there). Getting to this goal involved some prior checkins that eliminated bogus "targets" that weren't really akin to SPIR-V or DXBC: `-target slang-ir-asm` and `-target reflection-json`. Those targets were really in place to support testing, and so they've been made more explicit testing/debug options. This change eliminates `-target slang-ir` and instead tries to allow the user to specify `-o foo.slang-module` as an output file name, that indicates the intention to output a "container" file that will wrap up all the generated code. I've also gone ahead and generalized the existing `-target` option so that we are actually building up a list of code generation targets. This is largely just a cleanup, since it forces code to be more aware of when it is doing something target-specific vs. target independent. For example, reflection layout information lives on a requested target, and not on the compile request as a whole, and similarly output code is per-target, per-entry-point. As a cleanup, I eliminated support for per-translation-unit output. This was vestigial code from back when I used to try and do HLSL generation for a whole translation unit instead of per-entry-point (which turned out to be a lot of complexity for little gain), and it was only being used in the `hello` example and the `render-test` test fixture - in both cases fixing it up was easy enough. I've stubbed out the old `spGetTranslationUnitSource` API, but haven't removed it yet.
*	First attempt at a Linux build (#193)	Tim Foley	2017-09-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* First attempt at a Linux build - Fix up places where C++ idioms were written assuming lenient behavior of Microsoft's compiler - Add a few more alternatives for platform-specific behavior where Windows was the only platform accounted for. - Add a basic Makefile that can at least invoke our build, even if it isn't going good dependency tracking, etc. - Build `libslang.so` and `slangc` that depends on it, using a relative `RPATH` to make the binary portable (I hope) - Add an initial `.travis.yml` to see if we can trigger their build process. * Fixup: const bug in `List::Sort` I'm not clear why this gets picked up by the gcc and clang that Travis uses, but not the (newer) gcc I'm using on Ubuntu here, but I'm hoping it is just some missing `const` qualifiers. * Fixup: reorder specialization of "class info" Clang complains about things being specialized after being instantiated (implicilty), and I hope it is just the fact that I generate the class info for the roots of the hierarchy after the other cases. We'll see. * Fixup: add `platform.cpp` to unified/lumped build * Fixup: Windows uses `FreeLibrary` and not `UnloadLibrary` * Fixup: fix Windows project file to include new source file This obviously points to the fact that we are going to need to be generating these files sooner or later.
*	Improve diagnostics for overlapping/conflicting bindings	Tim Foley	2017-08-15
\| \| \| \| \| \| \| \| \| \| \| \| \|	Closes #38 - Change overlapping bindings case from error to warning (it is technically allowed in HLSL/GLSL) - Make diagnostic messages for these cases include a note to point at the "other" declaration in each case, so that user can more easily isolate the problem - Unrelated fix: make sure `slangc` sets up its diagnostic callback before parsing command-line options so that error messages output during options parsing will be visible - Unrelated fix: make sure that formatting for diagnostic messages doesn't print diagnostic ID for notes (all have IDs < 0). - Note: eventually I'd like to not print diagnostic IDs at all (I think they are cluttering up our output), but doing that requires touching all the test cases...
*	Handle possibility of bad types in varying input/output signature.	Tim Foley	2017-08-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes #160 If the front-end runs into a type it doesn't understand in the parameter list of an entry point, it will create an `ErrorType` for that parameter, but then the parameter binding/layout rules will fail to create a `TypeLayout` for the prameter (and return `NULL`). There were some places where the code was expecting that operation to succeed unconditionally, and so would crash when there was a bad type. The specific case in the bug report was when the return type of a shader entry point was bad: // `vec4` is not an HLSL type vec4 main(...) { ... } Note that the specific case in the buf report only manifests in "rewriter" mode (when the Slang compiler isn't allowed to issue error messages from the front-end), but the same basic thing would happen if the varying parameter/output had used a type that is invalid for varying input/output: Texture2D main(...) { ... } I'm not 100% happy with just adding more `NULL` checks for this, because there is no easy way to tell if they are exhaustive. A better solution in the longer term might be to construct a kind of `ErrorTypeLayout` to represent cases where we wanted a type layout, but none could be constructed.
*	Add an explicit `Name` type	Tim Foley	2017-08-14
\| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes #23 Up to this point, the compiler has used the ordinary `String` type to represent declaration names, which means a bunch of lookup structures throughout the compiler were string-to-whatever maps, which can reduce efficiency. It also means that things like the `Token` type end up carying a `String` by value and paying for things like reference-counting. This change adds a `Name` type that is used to represent names of variables, types, macros, etc. Names are cached and unique'd globally for a session, and the string-to-name mapping gets done during lexing. From that point on, most mapping is from pointers, which should make all the various table lookups faster. More importantly (possibly), this brings us one step closer to being able to pool-allocate the AST nodes.
*	Rename `Name` fields to `name`	Tim Foley	2017-08-14
\| \| \| \|	This is in preparation for using `Name` as a type name.
*	Major naming overhaul:	Tim Foley	2017-08-09
\| \| \| \| \| \| \| \| \| \|	- `ExpressionSyntaxNode` becomes `Expr` - `StatementSyntaxNode` becomes `Stmt` - `StructSyntaxNode` becomes `StructDecl` - `ProgramSyntaxNode` becomes `ModuleDecl` - `ExpressionType` becomes `Type` - Existing fields names `Type` become `type` - There might be some collateral damage here if there were, e.g., `enum`s named `Type`, but I can live with that for now and fix those up as a I see them
*	Remove uses of global variables	Tim Foley	2017-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There were two main places where global variables were used in the Slang implementation: 1. The "standard library" code was generated as a string at run-time, and stored in a global variable so that it could be amortized across compiles. 2. The representation of types uses some globals (well, class `static` members) to store common types (e.g., `void`) and to deal with memory lifetime for things like canonicalized types. In each case the "simple" fix is to move the relevant state into the `Session` type that controlled their lifetime already (the `Session` destructor was already cleaning up these globals to avoid leaks). For the standard library stuff this really was easy, but for the types it required threading through the `Session` a bit carefully. One more case that I found: there was a function-`static` variable used to generate a unique ID for files output when dumping of intermediates is enabled (this is almost strictly a debugging option). Rather than make this counter per-session (which would lead to different sessions on different threads clobbering the same few files), I went ahead and used an atomic in this case. Note that the remaining case I had been worried about was any function-`static` counter that might be used in generating unique names. It turns out that right now the parser doesn't use such a counter (even in cases where it probably should), and the lowering pass already uses a counter local to the pass (again, whether or not this is a good idea). This change should be a major step toward allowing an application to use Slang in multiple threads, so long as each thread uses a distinct `SlangSession`. The case of using a single session across multiple threads is harder to support, and will require more careful implementation work.
*	Make the "hack" sampler explicit for now	Tim Foley	2017-07-22
\| \| \| \| \| \| \| \| \| \| \|	- We use this to work around the fact that, e.g., `Texture2D.Load` doesn't take a sampler, but the equivalent GLSL operation `texelFetch` requires one - Previously we tried to hide the sampler from the user, hoping that glslang would drop it and we could just ignore it, but that doesn't work - For now we'll go ahead and explicitly show the sampler in the reflection info so that an app can react appropriately - We also generate a unique binding for the sampler, instead of the old behavior that fixed it with `binding = 0` - We still fix it with `set = 0`, so it might still surprise users
*	Translate NV single-pass stereo extension from Slang to GLSL	Tim Foley	2017-07-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- The easy part here is treating `NV_` prefixed semantics as another case of "system-value" semantics - Mapping the new semantics (`NV_X_RIGHT` and `NV_VIEWPORT_MASK`) to their GLSL equivalents is harder - Instead of a single "right-eye vertex" output, GLSL defines an array of per-view positions - Instead of a vector of masks, GLSL defines an array of per-view masks - Another point here is that a lot of semantics that appear as `uint` in HLSL are `int` in GLSL, which can lead to conversion issues. - The approach here is to have the lowering pass introduce a notion of assignment with "fixups," which will try to cast things as needed - When assigning to a simple value with the "wrong" type, introduce a cast - When assigning to an array from a vector, break out multiple assignments of individual vector/array elements - In order to facilitate the above, I needed to add actual types to the magic expressions I introduce to represent GLSL builtin variables. These were taken by scanning the online documentation for GL, so they might not be perfect. - Major issues with the approach in this change: - No attempt is being made here to check that the original declaration used a type appropriate to the semantic. The assumption is that this logic only ever triggers for Slang entry points, or GLSL entry points using a Slang `struct` type for input/output (and for right now Slang code is only ever written by "understanding" developers) - In the case of a Slang entry point, we always copy varying parameters in/out around the call to `main_`, so this approach should handle calls to functions with `out` or `in out` parameters okay, but it is not robust to cases where we don't want to copy in all the entry point parameters first thing (e.g., a GS), so that will have to change - In the GLSL case (or if we revise the approach to Slang entry points), there is going to be a problem if these converted varying parameters are ever passed as arguments to `out` or `in out` parameters. In these cases we need to do more sleight-of-hand to reify a temporary variable and do the necessary copy-in/copy-out. Being able to do that logic relies on having correct information about callees, which requires having robust semantic analysis of the function body. There is only so much we can do... - A better long-term approach would not rely on an ad-hoc "fixup" conversion during assignment, but would instead implement the GLSL builtin variables as, effectively, global "property" declarations that have both `get` and `set` accessors, and then tunnel a reference to such a property down through lowering, where it can lower to uses of the "getter" or "setter" as appropriate in context (and the result type of the getter/setter can be what we'd want/expect).
*	Try to improve handling of failures during compilation	Tim Foley	2017-07-19
\| \| \| \| \| \| \|	The change is mostly about trying to make sure the compiler "fails safe" when it encounters an internal assumption that isn't met. Most internal errors will now throw exceptions (yes, exceptions are evil, but this will work for now), and these get caught in `spCompile` so that they don't propagate to the user (they just see a message that compilation aborted due to an internal error). Subsequent changes are going to need to work on diagnosing as many of these situations as possible, so that users can at least know what construct in their code was unexpected or unhandled by the compiler.
*	Fixes for how parameter block names are set up.	Tim Foley	2017-07-19
\| \| \| \| \| \| \| \| \| \| \|	We generate implicit names for global-scope parameter blocks (including HLSL `cbuffer`s, since the "name" the user sees is really just for reflection purposes), but this had a few problems: - We used the generated names for parameter-binding purposes - Except for a GLSL block with an explicit name, in which case we'd use the internal name and not the reflection name for matching - The generated named didn't match between GLSL and HLSL/Slang declarations This change tries to fix both of these issues. I changed the name generation to try to make it identical between HLSL and GLSL (to the extent we can control it), just in case. But then I also went and changed the parameter-binding-generation logic to use the reflection name instead of the internal name when deciding if things are the "same" parameter.
*	Support scalarization of varying input/output for GLSL	Tim Foley	2017-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GLSL technically supports varying (`in`, `out`) parameters of `struct` type, but there are some annoying constraints (not allowed for VS input), and it doesn't work with how an HLSL user would usually put "system-value" inputs/outputs into a `struct` together with ordinary inputs/outputs. To work around this, this change adds support for using an imported Slang `struct` type for an `in` or `out` parameter, in which case it will (1) be scalarized and (2) will have system-value semantics mapped appropriately, just as for an entry-point parameter when cross-compiling an HLSL-style `main()`. Changes: - Add a notion of a `VaryingTupleExpr` and `VaryingTupleVarDecl`, similar to those for the resources-in-structs case - Trigger use of these when we have a global-scope varying in/out using an imported `struct` type - Also use these in the cross-compilation case for ordinary varying input/output (since this approach seems like it should be more general, and can hopefully handle stuff like GS input/output some day) - When generating parameter binding information, special case global-scope input/output, and treat it the same as entry-point-parameter input/output - Revamp how used resource ranges are computed so that we can eventually make this specific to an entry point - Actually implement first signs of life for `maybeMoveTemp` so that assignments to the tuple-ified outputs will work better - Add first test case that actually seems to work - Add diagnostics for conflicting explicit bindings on a parameter - Add diagnostic for different parameters with overlapping bindings - Make global-scope varying input/output use a tracking data structure specific to the translation unit for computing locations (so that they are independent of other TUs)
*	Don't allow varying parameters to be merged in reflection data	Tim Foley	2017-07-18
\| \| \| \| \|	All varying input/output parameters need to be specified to the entry point that declared them. In the case of HLSL/Slang this happens for free, but in the case of GLSL we need to be careful not to merge global-scope `in` or `out` parameters in ways that don't make sense.
*	Make sure to treat imported modules as Slang	Tim Foley	2017-07-17
\| \| \| \| \| \|	- When generating parameter binding/reflection info, treated imported modules as Slang code, instead of the source language of the outer translation unit - This fixes an issue where global-scope shader parameters in a `.slang` file were getting ignored for binding-generation purposes when imported by a GLSL file
*	Skip unknown types during parameter-binding/-reflection step	Tim Foley	2017-07-17
\| \| \| \| \| \|	Work on #105 If a semantic error occurs in the type of an entry-point parameter, we need to be able to skip over it when doing parameter binding and reflection-generation work.
*	Adjust type layout when parameter block constains member using the same resource	Tim Foley	2017-07-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we have something like to following in HLSL: cbuffer C { Texture2D t; ... } and we are compiling to GLSL, then both `C` and `C.t` consume the same kind of resource (a descriptor-table slot). The way reflection was working right now, querying the index of `C` would return its binding (let's say it is `4` just to be concrete) and then a query on `C::t` would give its offset, which was being computed as `0` because it is the first field in the logical `struct` type. That obviously leads to bad math and requires some subtle `+1`s in cases to get things right (e.g., when scalaring during lowering, I had to carefully add one in some cases). It is unreasonable to expect users to deal with this. This commit changes it so that the offset of field `C::t` is `1` so that hopefully more things Just Work. The special-case logic in lowering is now gone. One important catch here is that this pretty much only works in the case where the element type of a parameter block is a `struct` type (which is really all that makes sense right now). If we ever want to generalize this in the future, then it will probably be necessary to change the `TypeLayout` case for parameter blocks to store a `VarLayout` for the element, rather than just a `TypeLayout`.
*	Don't assign a `binding` to a `push_constant` buffer	Tim Foley	2017-07-14
\| \| \| \| \| \| \| \| \| \| \| \|	Fixes #12 - This was a latent issue, but the previous commit brought it to the front. - As indicated in #12, I don't allocate a descriptor-table slot to the block - Instead I allocate a `PushConstantBuffer` - Unlike what #12 asks for, I don't use a different resource type for the contents of the block - Pretty much all the logic is easiest if these continue to be just plain `Uniform` data
*	Start handling system-value semantics during lowering	Tim Foley	2017-07-10
\| \| \| \| \| \| \|	I hadn't been lowering `SV_Position` outputs to `gl_Position`, and had somehow been relying on hidden driver behavior that I guess made things Just Work. This change adds some infrastructure to handle `SV_` semantics during lowering of an entry point (currently only covering `SV_Position` and `SV_Target`, FWIW). As a byproduct, this also means that a `VarLayout` stores semantic info, which could conceivably be exposed through reflection data now.