| Age | Commit message (Collapse) | Author |
|
Fixes #1990
The underlying problem here is in the `ExtractExistentialType` AST node class.
An "existential" in current Slang is typically a value of interface type. When such a value is used in an operation, the type-checker "opens" the extistential so that subsequent type-checking steps can work with the (statically unknown) specific type of the value stored inside. The `ExtractExistentialType` AST node represents the type of an existential that has been "opened" in this way.
When the front-end performs lookup "into" a value with one of these types, it nees to use a reference to the original interface declaration with a "this-type substitution" that refers to the "opened" type (a this-type substitution tells the compiler the concrete type it should use in place of `This` in signatures within the interface; it allows compiler to "see" the right associated type definitions to use in a context).
Prior to this change, the implementation would store the specialized reference to the original interface declaration in the `ExtractExistentialType` node as part of its state. The catch there is that the specialized interface reference indirectly refers to the `ExtractExistentialType` AST node itself, creating a circularity. As soon as the front-end performs any operation that tries to recurse over that structure, it would go into an infinite loop.
The fix here sounds kind of like a hack, but seems to be pretty nice in practice. Instead of always storing the specialized interface reference, we instead store the few values that are needed to construct it, and then create and cache the actual reference on-demand. The on-demand created fields are not considered part of the state of the AST node for any kind of recursion or serialization, so they avoid the original problem.
A single test case was added that represents the original bug, and confirms the fix.
|
|
* Format list updated with additional formats supported by both D3D and Vulkan; D3DUtil::getMapFormat() and VkUtil::getVkFormat() updated to include additional formats; GFX_FORMAT() updated with all additional formats (BC compression unfinished)
* Finished updating GFX_FORMAT with newly added formats and sizes; Pixel size is now tracked using the FormatPixelSize struct containing the values for bytes per block and pixels per block to accomodate BC formats; Updated gfxGetFormatSize and associated sub-calls to return FormatPixelSize instead of uint8_t; Most calls to gfxGetFormatSize() updated to reflect changes, a couple calls still unupdated
* Changes to accommodate new formats finished, debugging slang-literal unit test
* First format unit test working
* One test added for BC1Unorm and RGBA8Unorm_SRGB, both passing
* Refactored format testing code to merge BC1Unorm and RGBA8Unorm SRGB into a single file
* All unit tests added for BC and Srgb formats
* Most tests added and working; Added five additional formats (still need tests) and made the appropriate changes to support these; createTextureView() modified for D3D11, D3D12, and Vulkan to take into account the format specified in the texture view desc when the texture's format is typeless
* Format enums renamed to more closely match their D3D counterparts; Added a universal float and uint buffer and buffer view for use across all Format tests
* Remaining tests added; D3D12 tests pass, but Vulkan crashes in BC1_UNORM and D3D11 spits out a bunch of D3D11 Errors (but supposedly passes)
* re-run premake
* Added Sint versions of test shaders; Vulkan and D3D11 tests also pass
* Size struct for format unit tests no longer use initializer lists
* Fixed a Size struct missed in the previous pass
* Fixed minor bugs causing tests to fail
* Added documentation detailing all currently unsupported formats
* Skip tests causing unsupported format warnings due to swiftshader
* updated several test using old Format enum names
* Revert change to compareComputeResult() that was added for debugging purposes
* DEBUGGING: Added prints to identify which formats are failing on CI
* Reverted attempted debugging changes; Fixed texture2d-gather.hlsl to use updated Format enums
* Fixed incorrect array sizes in d3d11 _initSrvDesc()
* Commented out further tests that produce unexpected results when tested for Vulkan with swiftshader
* Revert "Merge branch 'expanded-format-support' of https://github.com/lucy96chen/slang into expanded-format-support"
This reverts commit 20008f0d3ecc3b1405ecac8c138edaa3cd37ed6b, reversing
changes made to 6081e95827315fee50e18409394d5abd62fac787.
* Added a fuzzy comparison function for use with floats
* submodule update
* Revert messed up changes caused by previous revert after automatically merging on github
|
|
* Work to mitigate SPIR-V bloat
SPIR-V is not an especially compact format, but some patterns in how Slang generates code and then runs it through `spirv-opt` lead to many redundant field-by-field copy operations being emitted. This change attempts to address some of the resulting bloat from the Slang side of things.
Note: experimentation shows that the bloat is less pronounced when running either *no* SPIR-V optimizations or *full* SPIR-V optimizations, so it is also likely that the bloat should be addressed by changing which `spirv-opt` passes the Slang compiler runs in default (`-O1`) builds. Such changes should come as a distinct pull request.
This change primarily does two things:
First, the code generation strategy for passing arguments to `out` and `inout` parameters has been changed. In the past, the compiler would *always* copy the argument value into a temporary, then pass the address of the temporary, and then write back the value after the call. The new code generation strategy attempts to identify when an argument value already has a simple address in memory and passes that address directly when possible. This eliminates many copy operations that occur before/after calls to functions with `out`/`inout` parameters.
Second, we introduce an IR optimization pass that detects call sites where the entire contents of a buffer (usually a constant buffer) is being passed to a callee function, such that many bytes are loaded and then passed even if only very few are used in the callee. The pass moves the load operations from the caller to a specialized version of the the callee where possible (e.g., when the constant buffer in question is a global shader parameter). Doing this eliminates another major category of copies.
Notes:
* The IR lowering logic is complicated by the fact that several kinds of l-values (values that are usable as the desitnation of assignment, or for `out`/`inout` arguments) are not actually addressable. An easy example is a non-contiguous swizzle like `v.xwz` on a `float4`, where the value occupies 12 bytes, but not 12 consecutive bytes with a single address. There are many more corner cases like that and the IR lowering pass carries a lot of complexity to deal with them. A more systematic overhaul is due some time soon.
* The IR representation of `out` and `inout` parameters deserves some careful scrutiny when making these kinds of changes. The official semantics of `inout` in HLSL has been "copy in copy out" (and `out` is just "copy out") which is observably different from any solution that passes in the address of an l-value directly. By making this change we are saying that Slang's semantics are not precisely those of legacy HLSL, and that our semantics for `inout` parameters are closer to those of `inout` in Swift or of a mutable borrow in Rust. In the Swift case the implementation can freely pass the underlying storage of an l-value or the address of a temporary, and valid programs may not observe the different. It is thus illegal to observe the value in a storage local while a mutation to that location is "in flight." All of this is way more detailed and technical than 99% of Slang users will ever care about, but importantly it gives us semantic cover to eliminate these copies in the IR, and also to emit output C++ code that implements `out` and `inout` as by-reference parameter passing.
* There was an exsting generic pass for specializing functions based on call sites that uses a "template method" style of pattern to customize its behavior. That pass needed to be generalized to handle this use case because it had previously operated on the assumption that the "desire" to specialize a callee function must be driven by the parameter declarations of that function, and not on the argument values passed in. The code has been slightly refactored to allow the policy for specialization to consider both parameters and arguments.
* Unsurprisingly, a bunch of the GLSL (and thus SPIR-V) generated has changed with this work, so several baseline `.slang.glsl` files needed to be updated.
* This change is incomplete in that it does not address broader cases of buffer loads, including both partial loads from constant buffers (just loading one field, but a field that uses a "large" structure type), and loads from multi-element buffers (a lot from a structured buffer where the element type is "large"). The main question in each of those cases is how to define how "large" a structure needs to be before we decide to try and sink loads into callee functions like this. In the worst case, sinking loads in this way may actually create *more* memory traffic (because the same values get loaded in multiple callee functions).
* fixup: run premake
* fixup: typo
|
|
|
|
* Include a "stack trace" with nested-import errors
When errors occur in nested `#include` files it is often helpful to have a "stack trace" / traceback of the `#include` chain that led from a root translation unit to the file with an error.
This change implements a similar feature for `import`s.
It is worth noting that `import`s don't really *require* this kind of compiler support the way `#include`s do because the intention is that the meaning of an `import`ed file does not depend on the order or nesting of `import`s. As such, when trying to *fix* an error in an `import`ed file, you usually don't care how it came to be `import`ed into your shaders.
The use case here is somebody adapting a large body of Slang code to use in a different codebase, such that they have certain `.slang` files they don't actually intend to have compile correctly, and they want to be able to diagnose how they came to include those files when/if they cause problems.
The actual feature implementation is pretty simple because we already track a stack of active `import`s so that we can detect and diagnose recursive `import`s. This change simply changes the disagnostics when there is an error in imported code so that instead of just noting the inner-most `import` site it lists all the `import` sites that were active at the time.
The change includes a test case to confirm that the behavior works (at least for the case of a parse error).
* fixup: test outputs
Co-authored-by: Yong He <yonghe@outlook.com>
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
|
* Overhaul the preprocessor
The old Slang preprocessor was based on a simple mental model that tried to unify two parts of macro expansion:
* Scanning for macro invocations in a sequence of tokens
* Producing the expanded tokens for a macro expansion by substituting arguments into its body
The basic was that substitution of macro arguments into a macro definition is superficially similar to top-level macro expansion, just with an environment where the macro arguments act like `#define`s for the corresponding parameter names. That approach was "clever" and could conceivably have been extended to include a lot of advanced preprocessor features (e.g., a preprocessor-level `lambda` would be easy to support!), but it was basically impossible to make it correctly handle all the corner cases of the full C/C++ preprocessor.
The fundamental problem with the old approach was that it conflated the two parts of expansion listed above into one implementation, while the various special cases of the C/C++ preprocessor rely on treating the two cases very differently. The new approach here (which is somewhere between a refactor and a full rewrite of the preprocessor) changes things up in a few key ways:
* The abstraction still cares a lot about streams of tokens, but it now treats the top level streams (`InputFile`s) as fairly different from the lower-level streams (`InputStream`s)
* Macro expansion is handled as a dedicated type of stream that wraps another stream. This allows macro expansion to be applied to anything, and supports cases where multiple rounds of macro expansion are required by the spec.
* Macro *invocations* and the substitution of their arguments are now handled by a completely new system.
* Macro arguments are no longer treated as if they were `#define`s
* The macro body/definition is analyzed at definition time to detect various kinds of issues, and to derive a list of "ops" that make it easier to "play back" the definition at substitution time
* Token pasting and stringizing are now only handled in macro definitions (rather than being allowed anywhere), and their use cases are restricted to only those that make sense (e.g., you can't stringize anythign except a macro parameter, because anything else wouldn't make sense)
The key new types here are the `ExpansionInputStream` which handles scanning for macro invocations, and the `MacroInvocation` type, which handles playing back the macro body with substitutions.
The `ExpansionInputStream` is the easier of the two to understand. By refactoring it to use a single token of lookahead, the one major detail it had to deal with before (abandoning expansion of a function-like macro if the macro name was not followed by `(`) is significantly easier to manage.
The more subtle part is the `MacroInvocation` type, and most of the complexity there is around handling of token pasting, and the fact that either or both of the operands to a token paste might be empty.
Many of the test cases that exposed the problems in the preprocessor have been moved from `current-bugs` to `preprocessor` since they now work correctly.
* debugging: enable extractor command line dump
* fixup
* fixup
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Pass LexerFlags to all lexing functionality that may output a diagnostic.
* Add test for lexing disabling issue.
* Improve tests.
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Fix bug causing infinite loop in parser.
* Add a faster path for LookAheadToken when the offset is 0.
* Fix typo introduced in last commit.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Made vector init combinations explicit.
* Add test for vector initialization.
* Small comment change - really just to retrigger TC.
|
|
This change originated as an attempt to re-enable a test case, but it has ended up disabling more tests (for good reasons) than it re-enables.
The main change here is a significant overhaul of the way that the D3D12 render path extracts information from the Slang reflection API to produce a root signature.
There were also some supporting fixes in the reflection information to make sure it returns what the D3D12 back-end needed.
The big picture here is that the D3D12 path now uses the descriptor ranges stored in the reflection data more or less directly.
It still needs to use register/space offset information queried via the "old" reflection API, but it only does so at the top level now, for the program and entry points themselves.
All other layout information is derived directly from what Slang provides.
Smaller changes:
* The "flat" reflection API was expanded to include `getBindingRangeDescriptorRangeCount()` which was clearly missing.
* The "flat" reflection results for a constant buffer or parameter block that didn't contain any uniform data and was mapped to a plain constant buffer needed to be fixed up. That logic is still way to subtle to be trusted.
* Several additional tests were disabled that relied on static specialization, global/entry-point generi type parameters, structured buffers of interfaces or other features we don't officially support with shader objects right now. All of the affected tests were somehow passing by sheer luck and because they often passed in specialization arguments via explicit `TEST_INPUT` lines.
* The `inteface-shader-param` test is re-enabled now that we can properly describe its input with the new `set` mode on `TEST_INPUT`
* `ShaderCursor::getElement()` can now be used on structure types (in addition to arrays) to support by-index access to fields
* The `TEST_INPUT` system was expanded to support both by-name and by-index setting of structure fields for aggregates
* The `TEST_INPUT` system was expanded to allow an `out` prefix to mark parts of an expression as outputs on a `set` lines
* The `TEST_INPUT` system was expanded so that anything that would be allowed on a `TEST_INPUT` line by itself (like `ubuffer(...)`) can now be used as a sub-expression on a `set` line
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Refactor `gfx` to surface `CommandBuffer` interface.
* Fixes.
* Fix code review issues, and make vulkan runnable on devices without VK_EXT_extended_dynamic_states.
* Update solution files
* Move out-of-date examples to examples/experimental
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* First pass at handling 'names' that are too long in GLSL output.
* Test to check functionality with very long func name.
* Add access a long names buffer.
* Fix typo in assert.
Fix issue with coercion error for 1.0f / 0x7fffffff
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP: First pass in supporting output of line error information.
* Add support for lexing to better be able to indicate SourceLocation information.
* Fix lexer usage in DiagnosticSink in C++ extractor.
* Update diagnostics tests to have line location info.
* Fixed test expected output that now have source location information in them.
* Better handling of tab.
* Fix test expected results for tabbing change.
* DiagnosticLexer -> DiagnosticSink::SourceLocationLexer
Added line continuation tests.
* Fix typo.
* Added String::appendRepeatedChar
* Change to rerun tests.
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Work around for issue with obfuscation (and lack of name hints) leading to names in output not being correctly uniquified.
* Improve appendChar
Remove unrequired memory juggling to scrub names.
* Remove test code.
* Small fixes in comments and method called.
* Remove linkage decoration on functions that are specialized.
* Obfuscation naming with specialization test.
* Fix instruction deletion.
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
* Fix existential specialization of mutable buffer loads.
* fix
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
This change converts a large number of our existing tests to use the `ShaderObject` support that was added to the `gfx` layer.
In many cases, tests were just updated to pass `-shaderobj` and the result Just Worked.
In other cases, a `name` attribute had to be added to one or more `TEST_INPUT` lines.
For tests that did not work with shader objects "out of the box," I spent a little bit of time trying to get them work, but fell back to letting those tests run in the older mode.
Future changes to the infrastructure will be needed to get those additional tests working in the new path.
Along with the changes to test files, the following implementation changes were made to get additional tests working:
* Because the shader object mode uses explicit register bindings (from reflection), the hacky logic that was offseting `u` registers for D3D12 based on the number of render targets gets disabled (by another hack).
* The "flat" reflection information coming from Slang was not correctly reporting "binding ranges" for things that consumed only uniform data (which would be everything on CUDA/CPU), so it was refactored to properly include binding ranges for anything where the type of the field/variable implied a binding range should be created (even if the `LayoutResourceKind` was `::Uniform`).
* A few fixes were made to the CUDA implementation of `Renderer`, in order to get additional tests up and running. Most of these changes had to do with texture bindings, which hadn't really been tested previously.
In addition, a few changes were made that were attempts at getting more tests working, but didn't actually help. These could be dropped if requested:
* As a quality-of-life feature (not being used) the `object` style of `TEST_INPUT` line is upgraded to support inferring the type to use from the type of the input being set.
* Any `object` shader input lines get ignored in non-shader-object mode.
|
|
* Use "capability" system to select VKRT extension
Slang currently supports translation of ray tracing shader code to Vulkan GLSL code that uses the `GL_NV_ray_tracing` extension. A multi-vendor equivalent of that extension has been released as `GL_EXT_ray_tracing` and we want Slang to support that extension as well.
At the simplest, making the change from one extension to the other is just a matter of changing a few strings, since it does not appear that anything of significance was changed at the GLSL level (or even in SPIR-V). Where this gets trickier is when we have users who want us to support *both* extensions, and to be able to switch between them.
The solution we've implemented here more or less amounts to:
* If you don't tell the compiler which extension to use, it will default to `GL_EXT_ray_tracing` (the newer multi-vendor one).
* If you explicitly want the older extension, you can opt into it using the `-profile` option or via a new API for explicitly adding capabilities to your target.
Making that work required a few different kinds of changes:
* The options parsing and public API needed ways to add optional capabilities to a target.
* During GLSL code emit, we can check the capabilities that were added to the target to see if the `GL_NV_ray_tracing` extension was explicitly enabled and, if not, default to using the `GL_EXT_ray_tracing` names for things. This step is needed because some of the modifiers/attributes involved in the extension have to be handled explicitly in the code generator rather than implicitly as part of mapping intrinsic functions.
* We add two different translations to the relevant operatiosn in the stdlib, one marked with each of the extensions. If profile/capability-based overload resolution can be relied on to pick the right one, this should Just Work.
* Next, a bunch of work had to go into making capability-based overloading Just Work for the purposes of this change. There's been a nearly complete reworking of the implementation of `CapabilitySet` here to make it more suitable for our needs.
* The tests that were using ray tracing translation for Vulkan needed to be updated. For some of them I updated their baselines to use `GL_EXT_ray_tracing` so that they can test the new path. For others, I updated the command line for the test case so that it explicitly opts into using `GL_NV_ray_tracing`. The result is that we have some coverage of each extension. I would have liked to have each test run in both modes, but our pass-through glslang support doesn't support `-D` options, so I couldn't take that step easily.
This change does *not* add support for `GL_EXT_ray_query`, the extension that supports "DXR 1.1" style queries under Vulkan. Adding support for that extension should hopefully be a smaller step because it doesn't have the same multiple-extensions issue.
This change does *not* address a lot of possible avenues for improvement or cleanup around the capability system. It focuses only on those changes that are necessary to make the ray tracing feature work and leaves the rest for future work.
* fixup: infinite loop
* Comment-only change to retrigger TC build
|
|
* "Shader Toy" example and related fixes
This change introduces a new `shader-toy` example program that is primarily designed to show how Slang's features for type-based encapsulation and modularity can be applied to modularity for effects along the lines of those from `shadertoy.com`.
The Example
-----------
The example is being checked in with an example "toy" effect that I hastily put together, so that it would not be encumbered with any IP concerns. I wrote the effect using the shadertoy.com editor, so I can be sure it is valid GLSL. During bringup of the application I used a pre-existing and larger effect for testing, so some of the support code that was added is not being used at present.
The big-picture idea here is to have an exmaple that shows how to modularize things using Slang interfaces and generics, and then to use the Slang compiler API to manage the compilation, composition, specialization, and linking steps. For better or worse this leads to the sequence of API calls involved being much longer than what was in something like the `hello-world` example.
Future Work (Example)
---------------------
There is a lot of room for improvement and expansion here, so this should be viewed as a checkpoint of work in progress rather than something I'm claiming as a finalized demonstration of all we'd like to achieve. Areas for future work include:
* We need to copy the integration of "Dear, IMGUI" that was already done for the `model-viewer` example so that this example can have a UI.
* Now that the compilation flow is broken into all these additional steps, it should be possible to have the application load multiple effects as distinct modules, and then provide a UI for switching between them. The chosen effect module would be used to specialize the top-level shader(s) before kernel generation.
* The checked-in logic includes a compute shader that can execute an effect, but that hasn't been tested nor has it been wired up to any kind of UI. We should have a way to switch between multiple execution methods, with a goal of eventually including CPU execution.
* The "GLSL compatibility" code needs a lot of improvements before it is likely to be usable for a nontrivial number of shaders. Some of that work is waiting on Slang compiler fixes, though.
* We should consider allowing the individual "toy" effects to define their own uniform parameters and expose those via a UI and reflection. The catch in this case is not that this would be difficult to do, but that it would be a semantic change to how shader toy effects currently work.
The Compiler Fixes
------------------
Doing this work exposed a few bugs in Slang, and this change includes fixes for the ones that were quick to address.
We already had logic in `slang-check-shader.cpp` that was validating the entry points in a compile request - either by checking the explicitly-listed entry points, or by scanning for `[shader("...")]` attributes. The problem is that the routine that did that checking was not being invoked on all compiles. The logic that handled entry points was only being run for manual compiles using `SlangCompileRequest`, while anything using `import` or `loadModule` would ignore entry points. I refactored the relevant code into a subroutine that will be invoked in all compilation scenarios.
There were already `TODO` comments in `SpecializedComponentType` which made the point about how a specialized entry point like `myShader<YourType>` would need to properly show that it has dependencies on both the module that defines `myShader` *and* the module that defines `YourType`, while only the former was being handled at present. I went ahead and implemented the logic to scan the generic arguments for a specialized compoment type in order to determine what module(s) the arguments depend on (both type arguments and witness tables). With that change, using `IComponentType::link` on a specialized component will properly pull in the module(s) that the generic arguments come from.
In `slang-ir-legalize-types.cpp` we could run into assertion failures in debug builds because of code trying to legalize layout `IRAttr`s for fields or parameters with types that need legalization. In practice it is safe to skip these layout attributes, because legalization of the fields/parameters they pertain to would result in creation of entirely new layout attributes, and the old ones would then be unreferenced.
Future Work (Fixes)
-------------------
There are other compiler bugs that this work exposed, but which this change does not address. These will need to be resolved as part of subsequent changes:
* Slang allows for default-initialization of variables of a generic type. That is, given `<T : ISomething>` a user is allowed to declare `T x = {};` and the Slang front-end does not complain. Instead, this leads to an internal compiler error during IR lowering.
* The Slang `__init()` feature probably needs to be upgraded to a properly supported feature, and we probably need a way to make implementing default-initialization an easy thing (e.g., any `struct` type that has initial-value expressions for all its fields should automatically and implicitly satsify an `init();` requirement declared in an interface)
* Iniside an `__init()` definition, code has mutable access to members of the enclosing type, but for some reason the front-end is incorrectly treating `this` as immutable in those contexts. As a result you can write to `someField` but not `this.someField`.
* User-defined operator overloads flat out don't work (which isn't surprising given that no clients have decided to use them yet, and we have no test coverage for them). This is actually due to the shadowing rules being used for lookup right now, so a fix for this issue is going to have far-reaching consequences around what overloads are visible where (and anything that impacts overload resolution is a big can of worms, including around performance).
* fixup: test case had missing main function
|
|
* Add shader object parameter binding to renderer_test.
* remove multiple-definitions.hlsl
* Fix cuda implementation.
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
Slang generates code that turns the implicit `this` parameter of a method into an explicit parameter. The logic that decides whether that parameter should be `inout` is a bit involved, and there was a bug where a generic method would lead to the use of an `in` modifier (the default) and override the `inout` modifier that was requested by the method itself.
This change fixes the logic to treat generic declarations in the parent chain of a leaf method as having no bearing on whether an implicit `this` parameter should be `inout` or not.
A test case is included that breaks with the old behavior, and demonstrates that a generic `[mutating]` method can now work correctly.
|
|
The basic problem here is that when a function has multiple declarations with matching signatures (e.g., a forward declaration and then a later definition with a body), the IR lowering logic would lower all declarations whenever the first one was encountered, but then would only register an IR value as the lowered version of the first declaration. Other matching declarations would then run the risk of being lowered again, and in the case where they included features like loops with break/continue labels, that would create the risk of keys getting inserted into certain dictionaries more than one, leading to exceptions.
This change ensures that when lowering a function that has multiple matching declarations to IR, we register an IR value for all of those declarations and not just the first.
I have added a test case that leads to a crash without this change, to ensure that we don't introduce a regression down the line.
|
|
* Enable all dynamic dispatch tests on CUDA.
* Fix expected cross-compile test results.
|
|
During semantic checking, the compiler used to link together `ExtensionDecl`s into a singly-linked list dangling off of the `AggTypeDecl` that they applied to. This approach made lookup relatively easy, because given a `DeclRef` to an `AggTypeDecl` one could easily find and walk the list of candidate extensions.
Unfortunately, the simple approach has two major strikes against it:
* First, as we recently ran into, it creates a lifetime/ownership problem, in cases where the `ExtensionDecl` is outlived by the `AggTypeDecl` it applies to. This creates the one and only place in the compiler today where an "old" AST node might point to a "new" AST node, and it resulted in use-after-free problems in client code.
* Second, the scoping of `extension`s ends up being completely wrong. All of the `extension` methods on a type end up being visible in all cases, instead of just in the context of modules where the `extension` itself is visible. The comparable feature in C# (static extension methods) is careful to not make scoping mistakes like this. The Swift langauge has loose scoping for `extension` more akin to what we have in Slang today, but the maintainers seem to consider it a misfeature.
This change attempts to clean up both issues by changing the way that extension declarations are stored. There are two main pieces:
1. The primary "source of truth" for extension lookup has been moved to the `ModuleDecl`, where a module is responsible for storing a cache of the extensions declared within that module (keyed by the declaration of the type being extended). This cache is updated at the same point where the old code would mutate the AST node being depended on.
2. A secondary aggregated cache is added to the `SharedSemanticsContext` used during semantic checking. This cache includes entries from across multiple modules, and is intended to be invalidated and rebuilt on demand if new modules are added during checking.
Access to the candidate extensions has now been put behind subroutines that require a semantics-checking context to be passed in (there was always one available in contexts that care about extensions).
In addition, the operation for looking up members including those from extensions was refactored heavily to involve internal rather than external iteration and, more importantly, was changed so that it actually tests whether the `ExtensionDecl`s it loops over apply to the type in question, rather than blindly letting extensions members be looked up in ways that don't make sense.
There are three test cases added here to confirm aspects of the fix:
* First, I added a test that reproduces the crash that was being seen, so that we have a regression test for the fix.
* Second, I added a basic semantic-checking test to confirm that an `extension` from an `import`ed module is still visible/usable, to confirm that I didn't break existing valid uses of extensions.
* Third, I added a diagnostic test that ensures that we correctly ignore extensions that should not be visible in a given context as a result of `import` declarations.
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
|
contains an array (#1448)
* This code is disabled, it was part of the optimization `Specialize function calls involving array arguments. (#1389)` on github.
It is disabled here because it causes a problem when a struct is passed to a function that contains a structured buffer *and* an array. It is specialized on the struct type, and so those types become parameters to the function. If the struct contains a structured buffer this is a problem on GLSL/VK based targets because currently structured buffers cannot be function parameters.
The fix for now is to just disable this optimization.
* Fix typo in name of test expected values.
|
|
If somebody defines two `struct` types with the same name:
```hlsl
struct A {}
// ...
struct A {}
```
and then tries to use that name when specializing a generic function:
```hlsl
void doThing<T>() { ... }
// ...
doThing<A>();
```
then the Slang front-end currently crashes, which leads to it not diagnosing the original problem (the conflicting declarations of `A`).
This change fixes up the checking of generic arguments so that it properly fills in dummy "error" arguments in place of missing or incorrect arguments, and thus guarantees that the generic substitution it creates will at least be usable for the next steps of checking (rather than leaving null pointers in the substitution).
This change also fixes up the error message for the case where a generic application like `F<A>` is formed where `F` is not a generic. We already had a more refined diagnostic defined for that case, but for some reason the site in the code where we ought to use it was still issuing an internal compiler error around an unimplemented feature.
This chagne includes a diagnostic test case to cover both of the above fixes.
|
|
The Slang compiler was bit by a known issue when translating from SSA form back to straight-line code. Give code like the following:
int x = 0;
int y = 1;
while(...)
{
...
int t = x;
x = y;
y = t;
}
...
The SSA construction pass will eliminate the temporary `t` and yield code something like:
br(b, 0, 1);
block b(param x : Int, param y : Int):
...
br(b, y, x);
The loop-dependent variables have become parameters of the loop block, and the branchs to the top of the loop pass the appropriate values for the next iteration (e.g., the jump that starts the loop sends in `0` and `1`).
The problem comes up when translating the back-edge the continues the loop out of SSA form. Our generated code will re-introduce temporaries for `x` and `y`:
int x;
int y;
// jump into loop becomes:
x = 0;
y = 1;
for(;;)
{
...
// back-edge becomes
x = y;
y = x;
continue;
}
The problem there is that we've naively translated a branch like `br(b, <a>, <b>)` into `x = <a>; y = <b>;` but that doesn't work correctly in the case where `<b>` is `x`, because we will have already clobbered the value of `x` with `<a>`.
The simplest fix is to introduce a temporary (just like the input code had), and generate:
// back-edge becomes
int t = x;
x = y;
y = t;
This change modifies the `emitPhiVarAssignments()` function so that it detects bad cases like the above and emits temporaries to work around the problem. A new test case is included that produced incorrect output before the change, and now produces the expected results.
A secondary change is folded in here that tries to guard against a more subtle version of the problem:
for(...)
{
...
int x1 = x + 1;
int y1 = y + 1;
x = y1;
y = x1;
}
In this more complicated case, each of `x` and `y` is being assigned to a value derived from the other, but neither is being set using a block parameter directly, so the changes to `emitPhiVarAssignments()` do not apply.
The problem in this case would be if the `shouldFoldInstIntoUseSites()` logic decided to fold the computation of `x1` or `y1` into the branch instruction, resulting in:
x = y + 1;
y = x + 1;
which would again violate the semantics of the original code, because now there is an assignment to `x` before the computation of `x + 1`.
Right now it seems impossible to force this case to arise in practice, due to implementation details in how we generate IR code for loops. In particular, the block that computes the `x+1` and `y+1` values is currently always distinct from the block that branches back to the top of the loop, and we do not allow "folding" of sub-expressions from different blocks. It is possible, however, that future changes to the compiler could change the form of the IR we generate and make it possible for this problem to arise.
The right fix for this issue would be to say that we should introduce a temporary for any branch argument that "involves" a block parameter (whether directly using it or using it as a sub-expression). Unfortunately, the ad hoc approach we use for folding sub-expressions today means that testing if an operand "involves" something would be both expensive and unwieldy.
A more expedient fix is to disallow *all* folding of sub-expressions into unconditional branch instructions (the ones that can pass arguments to the target block), which is what I ended up implementing in this change. Making that defensive change alters the GLSL we output for some of our cross-compilation tests, in a way that required me to update the baseline/gold GLSL.
A better long-term fix for this whole space of issues would be to have the "de-SSA" operation be something we do explicitly on the IR. Such an IR pass would still need to be careful about the first issue addressed in this change, but the second one should (in principle) be a non-issue given that our emit/folding logic already handles code with explicit mutable local variables correctly.
|
|
Static Method Calls
-------------------
The main fix here is for parsing of calls to static methods. Given a type like:
struct S
{
void doThing();
}
the parser currently gets tripped up on a statement like:
S::doThing();
The problem here is that the `Parser::ParseStatement` routine was using the first token of lookahead to decide what to do, and in the case where it saw a type name it assumed that must mean the statement would be a declaration.
It turns out that `Parser::ParseStatement` already had a more intelligent bit of disambiguation later on when handling the general case of an identifier (for which it couldn't determine the type-vs-value status at parse time), and simply commetning out the special-case handling of a type name and relying on the more general identifier case fixes the issue.
That catch-all case still has some issues of its own, and this change expands on the comments to make some of those issues clear so we can try to address them later.
Empty Declarators
-----------------
One reason why the static method call problem was hard to discover was that it was masked by the parser allowing for empty declarator. That is, given input like:
S::doThing();
This can be parsed as a variable declaration with a parenthesized empty declarator `()`.
Practically, there is no reason to support empty declarators anwhere except on parameters, and allowing them in other contexts could make parser errors harder to understand.
This change makes the choice of whether or not empty declarators are allowed something that can be decided at each point where we parse a declarator, and makes it so that only parsing of parameters opts in to allowing them.
By disabling support for empty declarators in contexts where they don't make sense, we make code like the above a parse error when it appears at global scope, rather than a weird semantic error.
A more complete future version of this change might *also* make support for parenthesized declarators an optional feature, or remove that support entirely. Slang doesn't actually support pointers (yet) so there is no real reason to allow parenthesized declarators right now.
One note for future generations is that using an emptye declarator on a parameter of array type can actually create an ambiguity. If the user writes:
void f(int[2][3]);
did they mean for it to be interpreted as:
void f(int a[2][3]);
or as:
void f(int[2][3] a);
or even as:
void f(int[2] a[3]);
The first case there yields a different type for `a` than the other two, but is also what we pretty much *have* to support for backwards compatibility with HLSL. Requiring all function declarations to include parameter names would eliminate this potentially confusing case.
Layout Modifiers
----------------
One of the above two syntax changes led to a regression in the output for a diagnostic test for `layout` modifiers (which are a deprecated but still functional feature from back when `slangc` supported GLSL input).
The original output of the test case seemed odd, and when I looked at the parsing logic I saw that an early-exit error case was leading to spurious error messages because it failed to consume all the tokens inside the `layout(...)`. Fixing the logic to not use an early-exit (and instead rely on the built-in recovery behavior of `Parser`) produced more desirable diagnosic output.
I changed the input file to put the `binding` and `set` specifiers on differnet lines so that the error output could show that the compiler properly tags both of the syntax errors.
|
|
These are steps towards a fix for the problem of not being able to call a static method as follows:
SomeType::someMethod();
One problem in the above is that the parser gets confused and parses an (anonynmous!) function declaration. This change doesn't address that problem, but *does* fix the problem that when checking fails to coerce `SomeType::someMethod` into a type it was triggering an unimplemented-feature exception rather than a real error message.
Another problem was that if the above is re-written to try to avoid the parser bug:
(SomeType::someMethod)();
we end up with a call where the base expression (the callee) is a `ParenExpr` and the code for handling calls wasn't expecting that. Instead, it sent the overload resolution logic into an unimplemented case that was bailing by throwing an arbitrary C++ exception instead of emitting a diagnostic.
This latter issue was fixed in two ways. First, the code path that failed to emit a diagnostic now emits a reasonable one for the unimplemented feature (this still ends up being a fatal compiler error). Second, we properly handle the case of trying to call a `ParenExpr` by unwrapping it and using the base expression instead, so that `(<func>)(<args>)` is always treated the same as `<func>(<args>)`.
|
|
* Added handling for empty switch body.
Added test for empty switch.
* Fix testing for case in switch.
|
|
The only big catch that I ran into with this batch was that I found the `float.getPi()` function was being emitted to the output GLSL even when that function wasn't being used. This seems to have been a latent problem in the earlier PR, but was only surfaced in the tests once a Slang->GLSL test started using another intrinsic that led to the `float : __BuiltinFloatingPointType` witness table being live in the IR.
The fix for the gotcha here was to add a late IR pass that basically empties out all witness tables in the IR, so that functions that are only referenced by witness tables can then be removed as dead code. This pass is something we should *not* apply if/when we start supporting real dynamic dispatch through witness tables, but that is a problem to be solved on another day.
The remaining tricky pieces of this change were:
* Needed to remember to mark functions as target intrinsics on HLSL and/or GLSL as appropriate (hopefully I caught all the cases) so they don't get emitted as source there.
* The `msad4` function in HLSL is very poorly documented, so filling in its definition was tricky. I made my best effort based on how it is described on MSDN, but it is likely that if anybody wants to rely on this function they will need us to vet our results with some tests.
|
|
This arose when a user tried to specialize the DXR 1.1 `RayQuery` type to a local variable:
```hlsl
RAY_FLAG rayFlags = RAY_FLAG_CULL_FRONT_FACING_TRIANGLES | RAY_FLAG_CULL_NON_OPAQUE;
RayQuery<rayFlags> query;
```
In this case, we issued an error around `rayFlags` not being a constant as expected, but then we also crashes later on in checking because the `DeclRef` that was being used for the type had a null pointer for the generic argument corresponding to `rayFlags`.
The main fix here was thus to add an `ErrorIntVal` case that can be used to represent something that should be an `IntVal` but where there was some kind of error in the input code so that the actual value isn't known to the compiler.
A secondary fix here is that we were issuing error messages about expecting a constant for a parameter like `rayFlags` there *twice*, and one of those times was during the `JustChecking` part of overload resolution (when we are not supposed to emit any diagnostics). I fixed that up by allowing the `DiagnosticSink` to be used to be passed down explicitly (and allowing it to be null), while also leaving behind overloaded functions with the old signatures so that all the existing logic can continue to work unmodified.
|
|
* Test to check fix
|
|
The HLSL language has keywords with very common names like `triangle`, and Slang doesn't want to preclude users from using such names for their variables/functions/etc.
In addition, Slang adds new keywords on top of HLSL (like `extension`) and we don't want those to prevent us from compiling existing code.
As a result, almost all keywords in Slang are contextual keywords, and they can be shadowed by user varaibles.
The down-side to making all keywords contextual is that in a case like this:
```
int test() { return triangle; }
```
The identifier `triangle` is *not* undefined as far as lookup (it is defined as a modifier keyword), so the existing "undefined identifier" logic gets bypassed, and instead we ran into an internal compiler error trying to construct an expression that refers to a modifier keyword.
Fortunately, the internal compiler error in that case was overkill, and the compiler already had defensive logic to produce an expression with an error type if it couldn't figure out what the type of a declaration reference should be.
The main fix here is thus to emit an "undefined identifier" error instead of an internal compiler error at the point where we see an attempt to reference a declaration that shouldn't be available in an expression context.
In order to improve the quality of the diagnostic, the code for constructing declaration references was updated to pass along a source location to be used in error messages.
|
|
* WIP with vector float test.
* vector-float test working.
* Fixed remaing tests broken with init changes.
* Improve 64bit-type-support.md
* Disable tests broken on CI system for Dx.
* WIP: Make type available for comparison.
* Moved type conversion into TypeTextUtil.
* Add text/type conversions from DownstreamCompiler to TypeTextUtil.
* Allow compaison taking into account type.
* Removed quantize in vector-float.slang test.
|
|
* Split out binding writing.
* Pass in the entry type.
* Take into account output type with -output-using-type
Added GPULikeBindRoot
Added dxbc-double-problem test.
* Add the dxbc-double-problem test.
|
|
The `TEST_INPUT` facility allows textual Slang test cases to provide two kinds of information to the `render-test` tool:
1. Information on what shader inputs exist
2. Information on what values/objects to bind into those shader inputs
Under the first category of information, there exists supporting for attaching a `dxbinding(...)` annotation to a `TEST_INPUT` which seemingly indicates what HLSL `register` the input uses. There is a similar `glbinding(...)` annotation, used for OpenGL and Vulkan.
It turns out that these annotations were, in practice, completely ignored and had no bearing on how `render-test` allocates or bindings graphics API objects. There was some amount of code attempting to validate that explicit registers/bindings were being set appropriately, but the actual values were being ignored.
The visible consequence of the `dxbinding` and `glbinding` annotations being ignored is issue #1036: the order of `TEST_INPUT` lines was *de facto* determining the registers/bindings that were being used by `render-test`.
This change simply removes the placebo features and strips things down to what is implemented in practice: the `TEST_INPUT` lines do not need target-API-specific binding/register numbers, because their order in the file implicitly defines them.
I added logic to the parsing of `TEST_INPUT` lines to make sure I got an error message on any leftover annotations, and went ahead and systematicaly deleted all of the placebo annotations from our test cases.
If we decide to make `TEST_INPUT` lines *not* depend on order of declaration in the future, we can build it up as a new and better considered feature.
The main alternative I considered was to keep the annotations in place, and change `render-test` and the `gfx` abstraction layer to properly respect them, but that path actually creates much more opportunity for breakage (since every single test case would suddenly be specifying its root signature / pipeline layout via a different path using data that has never been tested). The approach in this change has the benefit of giving me high confidence that all the test cases continue to work just as they had before.
|
|
glsl could not be coerced. (#1117)
|
|
* Strip IR after front-end steps are done
The main feature of this change is to unconditonally strip out the `IRHighLevelDeclDecoration`s in an IR module once the "mandatory" IR passes in the front end have run. This ensures that later IR passes (e.g., code emission) *cannot* rely on AST-level information to get their job done.
Since I was already writing a pass to remove some instructions at the end of the front-end passes, I went ahead and also made the `-obfuscate` flag apply to the front-end IR generation by causing it to strip `IRNameHintDecoration`s while it is doing the other stripping. With this, the main identifying information left in IR modules (other than semantics and entry-point names) is mangled name strings for imported/exported symbols.
A few other things got changes along the way:
* Removed the `.expected` file for one of the tests, where that file seemingly shouldn't have been checked in at all.
* Updated the signature of the DCE pass both so that it doesn't require a back-end compile request (it wasn't using it anyway), and so that it takes some options to decide whether to keep symbols marked `[export(...)]` alive (the front-end wants to keep these, while back-end passes currently need to be able to eliminate them).
* Moved the `obfuscateCode` flag from the back-end compile request to the base class shared between front- and back-end requests, and updated the options and repro logic to set both as needed. An obvious improvement in the future would be to have the front- and back-end requests share these settings by referencing a single common object in the end-to-end case, rather than each having their own copy.
* Removed logic that was keeping layout instructions alive in DCE, even if they weren't used. This seems to have been a vestige of an intermediate step between AST and IR layout.
* fixup: add the new files
|
|
compiling shaders that used texture2DMS Load() operations
|
|
This appears to be a regression introduced in #1001, and missed because none of our existing tests covered `static const` arrays on the GLSL/SPIR-V targets.
The basic problem is that we cannot output a `static const` definition in GLSL because `static` is a reserved word and not a keyword. Instead for GLSL we just want a `const` array. This change makes the emission of `static` for global-scope constants key on the target language for code generation, and only emit it for HLSL, C, and C++.
This change also adds a test case specifically for running Slang input that has a `static const` array on the Vulkan target.
|
|
* Convert bitwise Or & And to logical operations on scalar bools
* Test bitwise operations on scalar bools
|
|
Fixes #941
The GLSL we were emitting for unbounded-size arrays was the obvious:
```hlsl
// This HLSL:
Texture2D t[];
```
```glsl
// ... becomes this GLSL:
texture2D t[];
```
Unfortunately, the legacy GLSL behavior for an array without a declared size is what is called an "implicitly-sized" array, which means that it is assumed to actually have a fixed size, which is determined by the maximum integer constant value used to index into it (and only integer constants are allowed to be used when indexing into it).
Users hadn't noticed the issue for a while, because most of our users who rely on unbounded-size arrays were also using the HLSL `NonUniformResourceIndex` function:
```hlsl
float4 v = t[NonUniformResourceIndex(idx)].Sample(...);
```
When mapping such code to GLSL we use the `nonuniformEXT` qualifier added by the `GL_EXT_nonuniform_qualifier` extension, and it turns out that a secondary feature of that extension is that it changes the GLSL language semantics for arrays (of resources) with an unspecified size, so that they instead behave like we want. So users were happy and we were blissfully ignorant of the lurking issue.
The problem is that as soon as a user neglects to use `NonUniformResourceIndex` (perhaps because an index really is uniform):
```hlsl
cbuffer C { uint definitelyUniform; }
...
float4 v = t[definitelyUniform].Sample(...);
```
Now the code we emit doesn't need `nonuniformEXT` so it doesn't enable `GL_EXT_nonuniform_qualifier` and the declaration of `t` now falls under the "implicitly-sized" array rules, and thus the code fails because `definitelyUniform` is being used as an index but is *not* an integer constant.
The fix is pretty simple: when emitting a declaration of a global shader parameter to GLSL, we check if it is an unbounded-size array of resources and, if so, enable the `GL_EXT_nonuniform_qualifier` extension.
We don't need any clever handling to deal with resource parameters nested in `struct` types or in entry-point parameter lists, etc., because previous IR passes will have split up complex types and moved everything to the global scope already.
|
|
* Allow plugging in types with resources for interface parameters
The key feature enabled by this change is that you can take a shader declared with interface-type parameters:
```hlsl
ConstantBuffer<ILight> gLight;
float4 myShader(IMaterial material, ...)
{ ... }
```
and specialize its interface-type parameters to concrete type that can contain resources like textures, samplers, etc.
The hard part of doing this layout is that we need to support signatures that include a mix of interface and non-interface types. Imagine this contrived example:
```hlsl
float4 myShader(
Texture2D diffuseMap,
ILight light,
Texture2D specularMap)
{ ... }
```
We end up wanting `diffuseMap` to get `register(t0)` and `specularMap` to get `register(t1)`, so that they have the same location no matter what we plug in for `light`.
But if we plug in a concrete type for `light` that needs a texture register, we need to allocate it *somewhere*.
We handle this by having the `TypeLayout` for `light` come back with a "primary" type layout that doesn't have any texture registers, but with a "pending" type layout that includes the texture register requirements of whatever concrete type we plug in.
This split between "primary" and "pending" layout then needs to work its way up the hierarchy, so that an aggregate `struct` type with a mix of interface and non-interface fields (recursively), needs to compute an aggregate "primary type layout" and an aggregate "pending type layout," and then each field needs to be able to compute its offset in the primary/pending layout of the aggregate.
A large chunk of the work in this PR is then just implementing the split between primary and pending data, and ensuring that layouts are computed appropriately.
The next catch is that when a "parameter group" (either a parameter block or constant buffer) contains one or more values of interface type, then we can allow the parameter group to "mask" some of the resource usage of the concrete types we plug in, but others "bleed through."
For example, if we have:
```hlsl
struct MyStuff { float3 color; ILight light; }
ConstantBuffer<MyStuff> myStuff;
struct SpotLight { float3 position; Texture2D shadowMap; }
``
If we plug in the `SpotLight` type for `myStuff.light`, then the `float3` data for the light can be "masked" by the fact that we have a constant buffer (we can just allocate the `float3` `position` right after `color`), but the `Texture2D` needed for `shadowMap` needs to "bleed through" and become "pending" data for the `myStuff` shader parameter.
Adding support for that detail more or less required a full rewrite of the logic for allocating parameter group type layouts.
The next detail is that when we go to legalize a declaration like the `myStuff` buffer, we will end up with something like:
```hlsl
struct MyStuff_stripped { float3 color; }
struct Wrapped
{
MyStuff_stripped primary;
SpotLight pending;
}
ConstantBuffer<Wrapped> myStuff;
```
This "wrapped" version of the buffer type more accurately reflects the layout we need/want for the uniform/ordinary data, but in order to further legalize it and pull out the resource-type fields like `shadowMap` we need to have accurate layout information, and the problem is that layout information for the original buffer can't apply to this new "wrapped" buffer.
The last major piece of this change is logic that runs during existential type legalization to compute new layouts for "wrapped" buffers like these that embeds correct offset/binding/register information for any resources nested inside them. A key challenge in that code is that existential legalization needs to erase any "pending" data from the program entirely, so that offset information that used to be relatie to the "pending" part of a surrounding type now needs to be relative to the primary part.
The work here may not be 100% complete for all scenarios, but it does well enough on the new and existing tests that I want to checkpoint it. Note that a few other tests have had their output changed, but in all cases I've reviewed the diffs and determined that the change in observable behavior is consistent with what we intened Slang's behavior to be.
Note that there is still one major piece of support for interface-type parameters that is missing here, and which might force us to revisit some of the decisions in this code: we don't properly support user-defined `struct` types with interface-type fields.
* fixup: typos
|
|
|
|
* * Handle ! for bool vector in glsl
* Handle operators that have a boolean return value
* || or && take bool
* * Add comment in bool-op.slang test about doing || or && on vector types not supported for GLSL targets
|
|
* Add support for glsl inversesqrt intrinsic
* fixup for test failure
|
|
* Fix texture2d-gather test failure on dx12.
* Fix tab
|
|
* Add diagnostic for vk::binding failure.
* Add test for vk::binding failure.
* Add the expected output for glsl-layout-define.hlsl
* * Copy over initialize expr if available when validating unchecked
* Fix unloop - because now it always has one parameter (when before it could have none)
* Split vk::binding and layout tests with invalid parameters
* Removed the diagnostic for 2 ints expected
* Added vk::binding that doesn't specify set in vk-bindings.slang
* * Fix typo
* Improve comments.
|
|
* First pass test to see if GatherRed works.
* Add support for generating R_Float32 textures.
* Set default texture format.
* * Alter the texture2d-gather to work with a R_Float32 texture
* Add support for scalar Texture2d types with GatherXXX in stdlib
* Remove some left over commented out test code from texture2d-gather.hlsl
|
|
* Add intrinsic for StructuredBuffer.Load
|