slang.git - Making it easier to work with shaders

	Commit message (Collapse)	Author	Age
*	render-test: Change D3D12 default to sm_6_5 (#8320)	James Helferty (NVIDIA)	2025-09-02
\| \| \| \| \| \| \| \| \|	Changes default for render-test to sm_6_5. Since sm_6_5 is the new default, remove the -use-dxil option, add -use-dxcb option Remove -use-dxil option from all test cases. Add -use-dxcb to two tests that needed it. Fixes #7611
*	Refresh of disabled WGPU tests (#5614)	Anders Leino	2024-11-21
\| \| \| \| \| \| \| \| \|	Some tests are now passing and are enabled. Other tests are still failing, but are given comments categorizing the failures. Tests in the 'Not supported in WGSL' category are also removed from the expected failures list. (Though they are still kept disabled for WebGPU, of course.) This closes #5519.
*	Enable WebGPU tests in CI (#5239)	Anders Leino	2024-10-15
\|
*	Metal compute tests (#4292)	skallweitNV	2024-06-07
\|
*	Warning on lossy implicit casts. (#2367)	Yong He	2022-08-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Warning on bool to float conversion. * Fix test cases. * Improve. * LanguageServer: don't show constant value for non constant variables. * Fix tests. * Fix warnings in tests. Co-authored-by: Yong He <yhe@nvidia.com>
*	Update gfx back-ends to handle static specialization (#1826)	Tim Foley	2021-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Update gfx back-ends to handle static specialization The main goal here is to make the D3D11, D3D12 and Vulkan back-ends support static specialization of interface types in the case where the data for the type won't "fit" in the pre-allocated space for existential values. This includes all cases where the concrete type being specialized to has resources/samplers/etc., as well as any cases where its ordinary/uniform data exceeds the space available. (Note that the CPU and CUDA targets don't need this work since they can (in theory) support arbitrary-size data in the fixed-size existential payload by using pointer indirection. Actually supporting indirection in those cases should be a distinct change) The Slang compiler already performs layout for programs that have this kind of data that doesn't "fit," and it lays them out using an idea of "pending" type layouts. Basically, a type that contains some amount of specialized interface-type fields will produce both a "primary" type layout that just covers the data for the unspecialized case, as well as "pending" type layout that describes the layout for all the extra data needed by specialization. When laying out a `ConstantBuffer<X>` or `ParameterBlocK<X>` ("CB" or "PB"), the front-end will try to place as much of that "pending" data into the layout of the buffer/block itself as is possible. That means that both CBs and PBs will be able to allocate trailing bytes for any ordinary data in the "pending" layout. PBs will be able to allocate any trailing resources/samplers into their layout, but for CBs they will spill out to be part of the pending layout for the buffer itself. In order for the back-ends to properly handle pending data, they need to either assume the exact layout rules used by the front-end and try to reproduce them (e.g., by iterating over binding ranges and sub-objects in the exact same order that front-end layout would enumerate them), or they need to respect the reflection information produced by the front-end. This change takes the latter approach, trying to make only minimal assumptions about the layout rules being used. This choice is motivated by wanting to decouple the `gfx` implementation from the compiler front-end, especially insofar as this work has made me question whether the current layout rules are the best ones possible. A common theme across all the implementations is to have a fixed-size type that can represent "binding offsets" for the chosen back-end. The offset type has fields that depend on the API-specific way bindings are indexed; e.g., for D3D11 it has offsets for CBV, SRV, UAV, and sampler bindings. This fixed-size offset type can be filled in based on Slang reflecton information, and then used to compute derived offsets with just a few add operations. The simple offset type for each API is then extended to produce an offset type that includes both the offsets for "primary" data and also the offsets for "pending" data. Most logic that traffics in offsets doesn't have to know about this more complicated representation. Making consistent use of these offsets required that I pretty much rewrite the logic that actually applies shader objects to the API state. Doing so might be lowering the efficiency of the system in the near term, but the increase in clarity was important for getting the work done, and it seems like it will also be important if/when we start trying to perform special-case optimizations around root and entry-point parameter setting. While there are many API-specific differences, we can identify a repeated pattern where many steps, whether applying parameters to the pipeline stage or constructing signatures / layouts, can be broken down into three main operations on `ShaderObject`s or their layouts: * `AsValue()` is the core operation, and is the one used for the `ExistentialValue` case most of the time. It ignores the ordinary data in the object, and instead processes all nested binding ranges (for resources/smaplers) and sub-objects. `AsConstantBuffer()` handles the `ConstntBuffer<X>` case, by dealing with the implicit buffer for ordinary data (if it is needed) and then delegates to the `AsValue()` case. * `AsParameterBlock()` handles the `ParameterBlock<X>` case, by allocating/preparing/etc. any descriptor tables/sets that would be required for the current object/layout and then delegating to `AsConstantBuffer()` to do the rest The idea is that by having the parameter block case delegate to the constant buffer case, which delegates to the value/existential case, we can streamline a lot of the logic so that it doesn't seem quite as full of special cases. Note: When preparing this pull request I spent a reasonable amount of time trying to clean up the D3D11 and Vulkan implementations, so they are probably the easiest to read and understand when it comes to the new code. Doing the cleanup work also helped to work out some weird corner case bugs/issues. In contrast, the D3D12 path hasn't had as much attention given to cleanliness and comments, so it really needs some attention down the line to get things into a state that is easier to understand. * fixup: remove debugging code spotted in review
*	Refactor D3D12 renderer root signature creation (#1779)	Tim Foley	2021-04-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change originated as an attempt to re-enable a test case, but it has ended up disabling more tests (for good reasons) than it re-enables. The main change here is a significant overhaul of the way that the D3D12 render path extracts information from the Slang reflection API to produce a root signature. There were also some supporting fixes in the reflection information to make sure it returns what the D3D12 back-end needed. The big picture here is that the D3D12 path now uses the descriptor ranges stored in the reflection data more or less directly. It still needs to use register/space offset information queried via the "old" reflection API, but it only does so at the top level now, for the program and entry points themselves. All other layout information is derived directly from what Slang provides. Smaller changes: * The "flat" reflection API was expanded to include `getBindingRangeDescriptorRangeCount()` which was clearly missing. * The "flat" reflection results for a constant buffer or parameter block that didn't contain any uniform data and was mapped to a plain constant buffer needed to be fixed up. That logic is still way to subtle to be trusted. * Several additional tests were disabled that relied on static specialization, global/entry-point generi type parameters, structured buffers of interfaces or other features we don't officially support with shader objects right now. All of the affected tests were somehow passing by sheer luck and because they often passed in specialization arguments via explicit `TEST_INPUT` lines. * The `inteface-shader-param` test is re-enabled now that we can properly describe its input with the new `set` mode on `TEST_INPUT` * `ShaderCursor::getElement()` can now be used on structure types (in addition to arrays) to support by-index access to fields * The `TEST_INPUT` system was expanded to support both by-name and by-index setting of structure fields for aggregates * The `TEST_INPUT` system was expanded to allow an `out` prefix to mark parts of an expression as outputs on a `set` lines * The `TEST_INPUT` system was expanded so that anything that would be allowed on a `TEST_INPUT` line by itself (like `ubuffer(...)`) can now be used as a sub-expression on a `set` line Co-authored-by: Yong He <yonghe@outlook.com>
*	Remove old code paths from render-test (#1760)	Tim Foley	2021-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Remove old code paths from render-test Historically, the `render-test` tool was using three different code paths: * One based on `gfx` and manual (non-reflection-based) parameter setting, used for OpenGL, D3D11, D3D12, and Vulkan * One for CPU that used reflection-based parameter setting but shared no code with the first * One for CUDA that used reflection-based parameter setting and shared some, but not all, code with the CPU path Recently we've updated `render-test` to include a fourth option: * Using `gfx` and the "shader object" system it exposes for a unified reflection-based parameter-setting system taht works across OpenGL, D3D11, D3D12, Vulkan, CUDA, and CPU This change removes the first three options and leaves only the single unified path. A sa result, a bunch of code in `render-test` is no longer needed, and the codebase no longer relies on things like the `IDescriptorSet`-related APIs in `gfx`. Several existing tests had to be disabled to make this change possible. Those tests will need to be audited and either re-enabled once we fix issues in the shader object system, or permanently removed if they don't test stuff we intend to support in the long run (e.g., global-scope type parameters, which aren't a clear necessity). * fixup: CUDA detection logic
*	Re-enable `interface-shader-param` tests. (#1614)	Yong He	2020-11-30
\|
*	Unify handling of static and dynamic dispatch for interfaces (#1612)	Tim Foley	2020-11-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Overview ======== Prior to this change, we had two different code generation strategies for interface/existential types in Slang, that didn't always play nicely together: * The "legacy" static specialization approach could handle plugging in an arbitrary concrete type for an existential type parameter (including types with resources, etc.), but wouldn't work well with things like a `StructuredBuffer<>` of an interface type, and requires somewhat counter-intuitive layout rules to make work. * The new dynamic dispatch approach produces simpler, more easily understood layouts by assuming that values of interface type can fit into a fixed number of bytes. The tradeoff there is that it cannot handle types that include resources (only POD types). The goal of this change is to make it so that the two strategies can co-exist. In particular, in cases where a shader is amenable to both static specialization and dynamic dispatch, the type layouts should agree. In order to make the type layouts agree, we: * Declare that all values of existential type reserve storage according to the dynamic-dispatch rules (so 16 bytes for the RTTI and witness-table information, plus whatever bytes are needed to story "any value" of a conforming type). * Then we modify the "legacy" layout rules so that if a value of concrete type can fit in the reserved "any value" space for a given interface, then it is laid out there exactly like the dynamic dispatch rules would do. Otherwise, we fall back to the previous legacy rules (since we don't need to agree with the dynamic-dispatch layout on types that can't be used with dynamic dispatch). Details ======= * Renamed `ExistentialBox` to `BoundInterfaceType` to better clarify how it relates to `BindExistentialsType` * Unconditionally apply the `lowerGenerics` pass during emit, since it is now responsible for aspects of the lowering of existential types when specialization is used. * Made IR type layout take the target into account, so that the layout of resource types can vary by target (e.g., being POD on some targets, and invalid on others) * Cleaned up some issues around using global shader parameters as the "key" for their layout information in the global-scope layout (only comes up when there are global-scope `uniform` parameters) * Made there be a default any-value size (16) instead of making it be an error to leave out. This was the simplest option; we could try to go back to having an error, but we'd need to only issue it if we are sure a type/interface is being used with dynamic dispatch, since static dispatch doesn't have to obey the restrictions. * Changed lowering of existential types to tuples so that bound interfaces where the concrete type won't fit use a "pseudo-pointer" instead of an "any-value" to hold the payload * Changed IR type legalization to handle the "pseudo-pointer" case and apply layout information from an interface type over to the payload part when static specialization was used. * Changed some details of how witness tables were being lowered, so that we didn't have to create "proxy" witness tables for the constraints on associated types (just use the actual requirement entries we generate) * Changed witness tables so that they know the subtype doing the conforming * Added logic so that we don't generate pack/unpack logic and witness table wrapper functions for types that are incompatible with any-value/dynamic dispatch for a given interface. * Changed the core AST-level type layout logic to use the dynamic-dispatch layout in case things fit, and the legacy static specialization case when things don't (while also reserving space for the dynamic-dispatch fields) * Changed a bunch of test cases for static specialization to properly use the new layout (which introduces new buffers in some cases, and moves data around in others). Future Work =========== The experience of trying to reconcile our older way of handling interface-type specialization with our newer model (that supports dynamic dispatch) makes it clear that we really need to make similar changes to our handling of generic type parameters on entry points and at the global scope. A future change should make it so that a global type parameter is lowered with a type layout similar to a value parameter of interface type, including the RTTI and witness-table pieces, and just leaving out the "any value" piece. A similar translation strategy should apply to entry-point generic parameters (mirroring how we lower generic functions for dynamic dispatch already), and value specialization parameters. Co-authored-by: Yong He <yonghe@outlook.com>
*	Change the policy for entry-point uniform parameters on Vulkan (#1476)	Tim Foley	2020-08-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Entry point `uniform` parameters were a feature of the original Cg and HLSL, but have not been used much in production shader code. One of our goals on Slang is to reduce the (ab)use of the global scope, so bringing entry point `uniform` parameters up to a greater level of usability is an important goal. Some policy choices about how global vs. entry-point `uniform` parameters behave have already been made, that shape decisions looking forward: * For DXBC/DXIL, it makes the most sense to follow the lead of fxc/dxc, by treating entry point `uniform` parameters as a kind of syntax sugar for global shader parameters. Any parameters of "ordinary" types are bundles up into an implicit constant buffer, and all the resources (including the implicit constant buffer) are assigned `register`s just as for globals. It is up to the application to decide how to bind those parameters via a root signature (using root descriptors, root constants, descriptor tables, local vs. global root signature, etc.) * For CPU, it makes sense to pass global vs. entry-point parameters as two different pointers, although the details of what we do for CPU are the least constrained across all current targets. * For CUDA compute, it makes the most sense to map global shader parameters to `__constant__` global data, and entry-point `uniform` parameters to kernel parameters. This choice ensures that the signature of a kernel when translated from Slang->CUDA follows the Principle of Least Surprise, at the cost of making entry-point vs. global parameters be passed via different mechanisms. * For OptiX ray tracing, it makes sense to expand on the precedent from CUDA compute: pass global parameters via global `__constant__` data (as is already expected by OptiX for whole-launch parameters), and pass entry-point `uniform` parameters via the "shader record." This establishes a precedent that for ray-tracing shaders, global-scope parameters map to the "global root signature" concept from DXR, while entry-point `uniform` parameters map to a "local root signature" or "shader record." * For Vulkan ray tracing, the precedent from OptiX then argues that entry-point `uniform` parameters should map to the Vulkan "shader record" concept (and thus cannot support things like resource types). * The remaining interesting case is what to do for non-ray-tracing shaders on Vulkan. The dev team agrees that the most reasonable choice to make for non-ray-tracing Vulkan shaders is to map entry-point `uniform` parameters to "push constants." In particular, this makes it easy to express the case of a compute kernel with direct parameters of ordinary/value types in the way that will be implemented most efficiently. The big picture is then that a kernel like: ```hlsl void computeMain(uniform float someValue) { ... } ``` will map to output GLSL like: ```glsl layout(push_constant) uniform { float someValue; } U; void main() { ... } ``` If the user really wanted a constant-buffer binding to be created instead, they can easily change their input to make the buffer explicit: ```hlsl struct Params { float someValue; } void computeMain(uniform ConstantBuffer<Params> params) { ... } ``` (Forcing the user to be explicit about the desire for a buffer here creates a nice symmetry between Vulkan and CUDA; in the first case the user sets up the data in host memory and passes it to the GPU by copy, while in the second case the user must allocate and set up a device-memory buffer for the data. This symmetry extends to D3D if the application chooses to map entry-point `uniform` parameters to root constants.) This change implements logic in the "parameter binding" part of the Slang compiler to make sure that entry-point `uniform` parameters are wrapped up in a push-constant buffer rather than an ordinary constant buffer for non-ray-tracing shaders on Vulkan (and in a shader record "buffer" for the ray-tracing case). The majority of the actual work was in adding support for root/push constants to the test framework and the graphics API abstraction it uses. To be clear about that support: * Root constant ranges are (perhaps confusingly) treated as a new kind of "slot" that can appear on a descriptor set. This choice ensures that the implicit numbering of registers/spaces used by the back-ends can account for these ranges correctly. * The `TEST_INPUT` lines are extended to allow a `root_constants` case that behaves more or less like `cbuffer` * The CPU and CUDA paths can treat a `root_constants` input identically to a `cbuffer`. They already allocate the actual buffers based on reflection, and just use `cbuffer` as a directive that causes bytes to be copied in. * On D3D12 and Vulkan, a descriptor set allocates a `List<char>` to hold the bytes of root constant data assigned into it, and these bytes are flushed to the command list when the table is actually bound (usually right before rendering). * On D3D11, a descriptor set treats a root constant range more or less like a constant buffer range (with a single buffer), except that it also automatically allocates a buffer to hold the data. Assigning "root constant" data automatically copies it into that buffer. The small number of tests that used entry-point `uniform` parameters of ordinary types were updated to use the new `root_constant` input type, and the bugs that surfaced were fixed. A new test to confirm that entry-point `uniform` parameters map to the shader record for VK ray tracing was added. An important but technically unrelated change is the removal of the `DescriptorSetImpl::Binding` type and related function from the Vulkan implementation of `Renderer`. That type was created to ensure that objects that are bound into a descriptor set don't get released while the descriptor set is still alive, but the implementation relied on a complicated linear search to check for existing bindings, which could create a performance issue for descriptor sets that include large arrays of descriptors. The new implementation makes use of the approach already present in the various `Renderer` implementations (including the Vulkan one) for assigning ranges in a descriptor set a flat/linear index for where their pertinent data is to be bound. As a result, the Vulkan `DescriptorSetImpl` now uses a single flat array of `RefPtr`s to track bound objects, and has no need for linear search when binding. Co-authored-by: Yong He <yonghe@outlook.com>
*	Change rules for layout of buffers/blocks containing only interface types ↵	Tim Foley	2020-04-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(#1318) TL;DR: This is a tweak the rules for layout that only affects a corner case for people who actually use `interface`-type shader parameters (which for now is just our own test cases). The tweaked rules seem like they make it easier to write the application code for interfacing with Slang, but even if we change our minds later the risk here should be low (again: nobody is using this stuff right now). Slang already has a rule that a constant buffer that contains no ordinary/uniform data doesn't actually allocate a constant buffer `binding`/`register`: struct A { float4 x; Texture2D y; } // has uniform/ordinary data struct B { Texture2D u; SamplerState v; } // has none ConstantBuffer<A> gA; // gets a constant buffer register/binding ConstantBuffer<B> gB; // does not There is similar logic for `ParameterBlock`, where the feature makes more sense. A user would be somewhat surprised if they declared a parmaeter block with a texture and a sampler in it, but then the generating code reserved Vulkan `binding=0` for a constant buffer they never asked for. The behavior in the case of a plain `ConstantBuffer` is chosen to be consistent with the parameter block case. (Aside: all of this is a non-issue for targets with direct support for pointers, like CUDA and CPU. On those platforms a constant buffer or parameter block always translates to a pointer to the contained data.) Now, suppose the user declares a constant buffer with an interface type in it: interface IFoo { ... } ConstantBuffer<IFoo> gBuffer; When the layout logic sees the declaration of `gBuffer` it doesn't yet know what type will be plugged in as `IFoo` there. Will it contain uniform/ordinary data, such that a constant buffer is needed? The existing logic in the type layout step implemented a complicated rule that amounted to: * A `ConstantBuffer` or `cbuffer` that only contains `interface`/existential-type data will not be allocated a constant buffer `register`/`binding` during the initial layout process (on unspecialized code). That means that any resources declared after it will take the next consecutive `register`/`binding` without leaving any "gap" for the `ConstantBuffer` variable. * After specialization (e.g., when we know that `Thing` should be plugged in for `IFoo`), if we discover that there is uniform/ordinary data in `Thing` then we will allocate a constant buffer `register`/`binding` for the `ConstantBuffer`, but that register/binding will necessarily come after any `register`s/`binding`s that were allocated to parameters during the first pass. * Parameter blocks were intended to work the same when when it comes to whether or not they allocate a default `space`/`set`, but that logic appears to not have worked as intended. These rules make some logical sense: a `ConstantBuffer` declaration only pays for what the element type actually needs, and if that changes due to specialization then the new resource allocation comes after the unspecialized resources (so that the locations of unspecialized parameters are stable across specializations). The problem is that in practice it is almost impossible to write client application code that uses the Slang reflection API and makes reasonable choices in the presence of these rules. A general-purpose `ShaderObject` abstraction in application code ends up having to deal with multiple possible states that an object could be in: 1. An object where the element type `E` contains no uniform/ordinary data, and no interface/existential fields, so a constant buffer doesn't need to be allocated or bound. 2. An object where the element type `E` contains no uniform/ordinary data, but has interace/existential fields, with two sub-cases: a. When no values bound to interface/existential fields use uniform/ordinary dat, then the parent object must not bind a buffer b. When the type of value bound to an interface/existential field uses uniform/ordinary data, then the parent object needs to have a buffer allocated, and bind it. 3. When the element type `E` contains uniform/ordinary data, then a buffer should be allocated and bound (although its size/contents may change as interface/existential fields get re-bound) Needing to deal with a possible shift between cases (2a) and (2b) based on what gets bound at runtime is a mess, and it is important to note that even though both (2a) and (3) require a buffer to be bound, the rules about where the buffer gets bound aren't consistent (so that the application needs to undrestand the distinction between "primary" and "pending" data in a type layout). This change introduces a different rule, which seems to be more complicated to explain, but actually seems to simplify things for the application: * A `ConstantBuffer` or `cbuffer` that only contains `interface`/existential-type data always has a constant buffer `register`/`binding` allocated for it "just in case." * If after specialization there is any uniform/ordinary data, then that will use the buffer `register`/`binding` that was already allocated (that's easy enough). * If after speciazliation there isn't any uniform/ordinary data, then the generated HLSL/GLSL shader code won't declare a buffer, but the `register`/`binding` is still claimed. * A `ParameterBlock` behaves equivalently, so that if it contains any `interface`/existential fields, then it will always allocate a `space`/`set` "just in case" The effect of these rules is to streamline the cases that an application needs to deal with down to two: 1. If the element type `E` of a shader object contains no uniform/ordinary or interface/existential fields, then no buffer needs to be allocated or bound 2. If the element type `E` contains any uniform/ordinary or interface/existential fields, then it is always safe to allocate and bind a buffer (even in the cases where it might be ignored). Furthermore, the reflection data for the constant buffer `register`/`binding` becomes consistent in case (2), so that the application can always expect to find it in the same way.
*	Remove support for explicit register/binding syntax on TEST_INPUT (#1132)	Tim Foley	2019-11-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The `TEST_INPUT` facility allows textual Slang test cases to provide two kinds of information to the `render-test` tool: 1. Information on what shader inputs exist 2. Information on what values/objects to bind into those shader inputs Under the first category of information, there exists supporting for attaching a `dxbinding(...)` annotation to a `TEST_INPUT` which seemingly indicates what HLSL `register` the input uses. There is a similar `glbinding(...)` annotation, used for OpenGL and Vulkan. It turns out that these annotations were, in practice, completely ignored and had no bearing on how `render-test` allocates or bindings graphics API objects. There was some amount of code attempting to validate that explicit registers/bindings were being set appropriately, but the actual values were being ignored. The visible consequence of the `dxbinding` and `glbinding` annotations being ignored is issue #1036: the order of `TEST_INPUT` lines was de facto determining the registers/bindings that were being used by `render-test`. This change simply removes the placebo features and strips things down to what is implemented in practice: the `TEST_INPUT` lines do not need target-API-specific binding/register numbers, because their order in the file implicitly defines them. I added logic to the parsing of `TEST_INPUT` lines to make sure I got an error message on any leftover annotations, and went ahead and systematicaly deleted all of the placebo annotations from our test cases. If we decide to make `TEST_INPUT` lines not depend on order of declaration in the future, we can build it up as a new and better considered feature. The main alternative I considered was to keep the annotations in place, and change `render-test` and the `gfx` abstraction layer to properly respect them, but that path actually creates much more opportunity for breakage (since every single test case would suddenly be specifying its root signature / pipeline layout via a different path using data that has never been tested). The approach in this change has the benefit of giving me high confidence that all the test cases continue to work just as they had before.
*	Start exposing a new COM-lite API (#987)	Tim Foley	2019-06-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Start exposing a new COM-lite API This change is mostly about exposing a new API to the Slang compiler that allows more fine-grained control over the compilation flow. The basic concepts in the new API are: * An `IGlobalSession` is the granularity at which we load/parse the Slang stdlib, and therefore gives applications a way to amortize startup cost for the library across multiple compiles. This is a concept that might be able to go away in a future version of Slang. * An `ISession` owns all the code that gets loaded/compiled/generated. Any `import`ed modules are shared across everything in a session (we don't re-parse/-check the code when we see another `import` for the same module). Any generic- or interface-based code in the session can be specialized using types from the same session (but not necessarily across sessions). * An `IModule` is the unit of code loading and scoping. It doesn't expose any API in this change, but would be the right scope for looking up types or entry points by name. * An `IProgram` is a "linked" combination of modules and entry points from which code can be generated and reflection information queried. This change re-uses the existing reflection API types, rather than introduce a new API that duplicates that functionality. That will probably change in a future revision. There are two major pieces of functionality added here that aren't related to the new API: * We now have an API concept of "entry point groups" which are one or more entry points that are intended to be used together so that they need to have non-overlapping parameters. For now this is being used to handle "hit groups" and local root signatures for ray tracing, but I'm not sure this is a concept we will keep in the long run. * We have a very special-case (client-application-specific) flag that ascribes special meaning to the `shared` keyword, so that it can be attached to global parameters to indicate that they are actually to be part of the local root signature rather than the global one for DXR. None of the API design (including naming) here is finalized; the only reason to check in the changes at this point to avoid having a long-running branch that leads to merge pain. Clients should not try to depend on the new API just yet, since it is still a work in progress. * fixup: clang warning * fixup: try to detect clang C++11 support * fixup * fixup * fixup * fixup * fixup: review feedback
*	Allow interface types to be used inside of structs (#966)	Tim Foley	2019-05-21
	Previously, interface types were allowed to be used directly as function parameters, local variables, and global shader parameters. Using an interface type as a field of a `struct` type or a `cbuffer` declaration was not implemented. This change adds that support, and fixes several unrelated issues that caused problems in doing so. * The most important work here was adding a case for `IRStructType` to `maybeSpecializeBindExistentialsType` that creates a specialized variant of a `struct` type on-demand based on specialization operands. This logic loops over the fields of the original struct, and creates new fields by binding the existentials/interfaces in the type of each field. Caching is used to ensure that the same `struct` type specialized to the same operands should yield the same result. * To allow subsequent specialization to occur when a `struct` with interface-type fields is used, it was also necessary to specialize field-address and field-extract instructions in cases where the value that the field is being extracted from is a `wrapExistential`. * Similarly, we neede to make sure that the logic for specializing called functions based on the concrete types for interfaces in the argument list would also take into account `struct` types with existential-type fields inside of them. * Doing the above changes revealed some serious flaws in how the `ir-specialize.cpp` logic was tracking which instructions still needed to be processed. It had previously been assuming that it could assume any relevant instructions were on its work list, and when the work list went empty it could exit. This runs into two problems: (1) sometimes we create new instructions when specializing, and it may be impossible to ensure that all the new instructions (e.g., those created by utility routines in other files) get added to the work list, and (2) sometimes the instruction(s) that need to be re-visited when we specialize something aren't its direct users, but instead somethign that transitively depends on the instruction. These issues were fixed by two changes to the pass: (1) we now maintain a list of known "clean" instructions instead of implicitly using the work-list as a list of "dirty" instructions (so that implicitly any new instruction is dirty), and periodically iterating over all instructions to add the non-clean ones to the work list for processing, and (2) when an instruction is specialized/replaced we mark everything that transitively depends on it "dirty" (by removing it from the "clean" list). * Added some logic to "fix up" the type of an IR function after changes that might modify its parameter list. Failing to have this logic meant that certain types were still live (because they were referenced by a function type) that couldn't actually be emitted as legal HLSL/GLSL. * Added some special cases to IR instruction creation for `wrapExistential` and `BindExistentialsType` so that they act as no-ops when there are no "slots" providing specialization information. This helps avoid some special cases when specializing structure fields (since some fields specialization and others don't, so in general there are zero or more operands specific to each field). * Added a test case that uses an interface type in a `cbuffer`, as well as an interface type in a `struct` passed as an entry-point `uniform` parameter. * Fixed up some parts of the `.natvis` files to reflect naming changes from a previous PR and thus restore some of the useful Visual Studio debugging experience for Slang.