slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2020-11-19	Fix constant folding in attributes (#1610)	Yong He
	* Fix constant folding in attributes * remove unnecessary change * remove unnecessary change * remove unnecessary change * Fixed circular checking issue. * cleanup * more cleanup * minimize diff * minimize diff * minimize diff
2020-11-18	Test for serializing out and reading back Stdlib (#1605)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Mangling/module name extraction for GenericDecl * Add comment on SerialFilter to explain re-enabling Stmt. * Support setting up SyntaxDecl when reconstructed after deserialization. * Improvements to setup SyntaxDecl. * Fix typo so can read compressed SourceLocs. * Fix issue with SourceManger. * Simple test for serializing out stdlib and reading back in. * Fix calling convention. * Add override to StdLib impls. * Fix typo. * Apply testing to an actual compute test when using load-stdlib Make -load/compile-stdlib processable by Slang Move out testing into util into TestToolUtil so can be shared. * Slightly more concise setup of session. * Fix some errors introduced with session handling. * Made setup for compile same across slangc and slangc-tool.
2020-11-11	Include hierarchy output (#1595)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Improve diagnostic for token pasting. * Token paste location test. * Output include hierarchy. * WIP on includes hierarchy. * Improved include hierarchy output - to handle source files without tokens. Improved test case. * Small comment improvements. Fixed a typo with not returning a reference. * Slight simplification of the ViewInitiatingHierarchy, by adding GetOrAddValue to Dictionary. * Remove the need for ViewInitiatingHierarchy type. * Improve output of path in diagnostic for includes hierarchy. * Remove comment in diagnostic for token-paste-location.slang * Update command line docs to include `-output-includes` Co-authored-by: Yong He <yonghe@outlook.com>
2020-11-06	Specialize witness table lookups. (#1596)	Yong He
	* Specialize witness table lookups. * Remove generated files from vcxproj * Fix call to generic interface methods.
2020-10-22	Generate `if` based dispatch logic on GPU targets. (#1585)	Yong He

2020-10-15	Fix a bug in IR lowering (#1578)	Tim Foley
	The basic problem here is that when a function has multiple declarations with matching signatures (e.g., a forward declaration and then a later definition with a body), the IR lowering logic would lower all declarations whenever the first one was encountered, but then would only register an IR value as the lowered version of the first declaration. Other matching declarations would then run the risk of being lowered again, and in the case where they included features like loops with break/continue labels, that would create the risk of keys getting inserted into certain dictionaries more than one, leading to exceptions. This change ensures that when lowering a function that has multiple matching declarations to IR, we register an IR value for all of those declarations and not just the first. I have added a test case that leads to a crash without this change, to ensure that we don't introduce a regression down the line.
2020-10-13	Repro test that loads repro (#1576)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Slang repro test that reloads and runs compiled code.
2020-10-09	Support CUDA bindless texture in dynamic dispatch code. (#1575)	Yong He

2020-10-06	InterlockedExchangeU64 support on RWByteAddressBuffer (#1572)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Added [__requiresNVAPI] to functions that need nvapi support. * Added support for InterlockedExchangeU64 Added exchange-int64-byte-address-buffer test Fixed typo in cas-int64-byte-address-buffer test * Improve comment around NVAPI usage in hlsl.meta.slang
2020-10-05	Update the type of a call inst during specialization. (#1569)	Yong He

2020-10-04	Handle partial existential parameter type specialization. (#1568)	Yong He
	* Specialize exsitentials parameters in struct fields. * Cleanup. * Handle partial existential parameter type specialization. Co-authored-by: Yong He <yhe@nvidia.com>
2020-10-02	Specialize exsitentials parameters in struct fields. (#1565)	Yong He
	* Specialize exsitentials parameters in struct fields. * Cleanup. Co-authored-by: Yong He <yhe@nvidia.com>
2020-09-23	Fix GLSL output for byte-address loads of vectors (#1558)	Tim Foley
	While working on #1557, it became clear that something was going wrong when using `*ByteAddressBuffer.Load<T>` to load a vector type on GLSL/SPIR-V targets. The root problem was that the IR-level layout logic (which computes the "natural" layout of a type) had not yet been extended to handle vectors. The fix is simple enough, but it highlights the fact that we probably need to go ahead and "complete" that layout logic sooner or later. This change includes a test case that covers the behavior added here, as well as the case that #1557 fixes. Unfortunately, due to CI system limitations, the HLSL/dxc part of the test is not yet enabled.
2020-09-21	Enable all dynamic dispatch tests on CUDA. (#1552)	Yong He
	* Enable all dynamic dispatch tests on CUDA. * Fix expected cross-compile test results.
2020-09-17	Initial attempt to enable CUDA dynamic dispatch codegen (#1549)	Yong He
	* Front-load cuda module loading to fill in RTTI pointers. * Enable dynamic dispatch codegen for CUDA.
2020-09-14	Support shader parameters that are an array of existential type. (#1542)	Yong He
	* Support shader parameters that are an array of existential type. * Rename to getFirstNonExistentialValueCategory Co-authored-by: Yong He <yhe@nvidia.com>
2020-09-11	Remove some "do what I mean" logic from reflection API (#1539)	Tim Foley
	The reflection API had a bit of DWIM (Do What I Mean) logic in that a client could query the resource usage/bindings of a `ParameterBlock<X>` and see not only the register `space` or descriptor `set` for the block itself, but also the constant buffer `register` or `binding` for its default constant buffer (if any). The reason for this behavior was that there was existing client code in Falcor that relied on that behavior for parameter blocks, and even after changing the way that parameter block layouts were computed and stored we sought to maintain backwards compatibility with that client code. The trouble is that the weird behavior then goes on to cause confusion for other clients of the Slang reflection API. This change removes the special-case logic, and fixes up our reflection tests to mirror the new (correct) information that we return. When this change is released, it will be a breaking change for any client code that still relies on the old behavior. We will need to coordinate with client application developers to fix their reflection logic. Note that all the same information can still be accessed, simply by using new reflection API that we have added.
2020-09-10	Allow existential types in `StructuredBuffer` element type. (#1536)	Yong He
	* Allow existential types in `StructuredBuffer` element type. * Handle StructuredBuffer.Load/.Consume methods * Clean up unnecessary changes * Code cleanup * Update test comment
2020-09-10	Add a pass to support resource return values (#1537)	Tim Foley
	A long-standing problem for the Slang implementation has been that some targets (notably GLSL/SPIR-V) do not support treating resources (textures, buffers, samplers, etc.) as first-class types. Resource types on such platforms are restricted so that they may not be used as the type of: 1. fields of aggregate types (`struct`s) 2. local variables 3. function results or `out`/`inout` parameters Issue (1) is handled by our "type legalization" pass today, by splitting aggregates that contain resources into separate fields/variables/parameters. Issue (2) is worked around by putting code into SSA form and promoting local variables to SSA temporaries when possible; the net result is that many local variables of texture type are eliminated (that pass is not perfect, though, and it is possible for users to get errors when it doesn't fully clean up local variables of texture type). Issue (3) is a much more complicated matter, and it is what this change is concerned with. A typical solution to issue (3) is to simply inline all of the code in a program, at which point function results and `out`/`inout` parameters will no longer exist to cause problems. We reject such solutions for two reasons. First, there are limitations on control-flow structure in HLSL/GLSL/SPIR-V that mean they cannot express certain programs after inlining has been performed. Second, and more importantly, the philosophy of the Slang compiler is to perform as little duplication of code as possible, so that we do not accidentally contribute to binary size bloat. Instead, this change tackles the problem of functions that output resource types by adding a new specialization pass. The pass detects functions that ought to be specialized (because they have resource-type outputs), and inspects their bodies to see if the values they output have a predicatable structure that can be replicated outside of the function body. The same logic that inspects the function body also rewrites (a copy of) the function to not have the offending outputs. Finally, all the call sites to a function that is rewritten in this way also get rewritten so that instead of using output values from the function itself, they reproduce the expected output value(s) in their own code. The pass as presented here is intentionally limited in the scope of what it can optimize away (and the test case only touches on that specific functionality). The goal is to get a basic version of this pass in place and evaluated, and then to expand on its functionality incrementally over time.
2020-09-04	Allow mixing unspecialized and specialized existential parameters. (#1533)	Yong He
	* Allow mixing unspecialized and specialized existential parameters. * Fixes.
2020-09-02	Allow unspecialized existential shader parameters (dynamic dispatch). (#1529)	Yong He
	* Allow unspecialized existential shader parameters (dynamic dispatch). * Fixes. * Fixes * disable cuda test
2020-09-02	Add support for (undocumented) HLSL 16-bit bit-cast ops (#1528)	Tim Foley
	As of SM 6.2, the dxc compiler added support for a set of 16-bit bit-cast operations to mirror the `asuint`, `asfloat`, and `asint` operations that were provided for 32-bit scalar types. These operations are not publicly documented, so we didn't think to add them. It should be noted that there was already a similar operation in HLSL, called `f32tof16`, that took as input a `float` and then packed a half-precision version of it into the low bits of a `uint`. The problem is that using that operation for `half`->`uint16_t` conversion required a round trip through a `float`, and downstream compilers seemingly can't optimize away that conversion. This change adds the new operations along with a test that tries to make use of them to ensure the results are what is expected. There are enough cases to cover that I had to write the test in a way where each thread only writes out a subset of the required output. There are two other changes here are that are not directly related to the main feature: First, it seems like the `[__forceInlineEarly]` attribute on some of these overloads interacts poorly with generics, and results in an `IRVectorType` appearing at local scope in the output code. That is semantically reasonable given our IR model, but it would ideally be something that gets eliminated as a result of deduplication of types. For now I've introduced a slight hack to make types always get inlined into their use sites during emission, which should handle the case of locally-defined types. I'm not 100% happy with that solution, but it seemed better than introducing a bunch of unrelated fixes into this PR. Second, the way that conversion operations were being declared for matrix types seems to have been incorrect: we had a single explicit initializer added to matrix types via an `extension` that allowed them to be initialized from other matrix types with the same size and any element type. In order to support implicit conversions of matrix types, I cribbed the code we were already using to introduce implicit conversion operations for vector types.
2020-08-28	Enable lower-generics pass universally. (#1518)	Yong He
	* Enable lower-generics pass universally. * Exclude builtin interfaces and functions from lower-generics pass. * Update stdlib. * Fixup. * Fixes handling of nested intrinsic generic functions. * Fixes. * Fixes.
2020-08-27	Enable simple extensions of interface types (#1521)	Tim Foley
	The big picture here is that an `extension` can now apply to an interface type and provide convenience methods for all types that implement that interface. Suppose you have an interface for counters: interface ICounter { [mutating] void add(int val); } and a type that implements it: struct SimpleCounter : ICounter { int _state = 0; ... } If a common operation in your codebase is to increment a counter by adding one, you would be faced with the problem of either: * Add the `increment()` operation to `ICounter`, and force every implementation to implement the new requirement * Add the `increment()` operation to concrete counter types as needed, and thus not be able to use it in generic code * Make `increment()` a global ("free") function, and force clients of counters to have to know which operations use member syntax (`c.add(...)`) and which use global function call syntax (`increment(c)`). The whole idea of `extension`s is to allow for another option that is better than all of the above: extension ICounter { [mutating] void increment() { this.add(1); } } The core of the implementation is relatively straightforward, and consists of two complementary pieces. The first piece is that when emitting a concrete IR entity (function/type/whatever) we treat any enclosing `interface` type (or `extension` thereof) a bit like an enclosing `GenericDecl`, and introduce an `IRGeneric` to wrap things. The generic `IRGeneric` has parameters representing the `This` type for the interface, along with the witness table that shows how `This` conforms to the interface itself. We thus end up with an IR version of `increment()` something like: void increment<This : ICounter>(This this) { this.add(1); } The second (complementary) fix is that when there is code that references this `increment()` operation, we don't treat it like an interface requirement (look up based on its key), and instead treat it like a generic (since that is how it is lowered now) and speciaize it to the information we can glean from the `ThisTypeSubstitution`. A related fix that is required here is that within the body of `increment`, when we perform `this.add`, we need to ensure that the lookup of `add` in the base interface properly takes into account the subtype relationship (`This : ICounter`) and encodes it into the lookup result, so that we get `((ICounter) this).add`, and properly generate code that looks up the `add` method in the witness table for `This`.
2020-08-27	Clean up the way that lookup "through" a base type is encoded (#1519)	Tim Foley
	* Clean up the way that lookup "through" a base type is encoded In order to undestand this change, it is important to undestand how lookup through base interfaces works prior to this change. In order to understand that it helps to be reminded of how inheritance relationships get encoded in the AST. Suppose the user writes: struct Base { int val; } struct Derived : Base { ... } ... Derived d = ...; int v = d.val; The question is how an expression like `d.val` gets semantically checked, and how it is encoded into the IR after semantic checking. You might assume it gets checked and encoded so that we end up with: int v = ((Base) d).val; and that seems like it should Just Work... so of course that isn't what Slang has been doing. Instead, we relied on the fact that the inheritance relationship `Derived : Base` is represented as an `InheritanceDecl` member of the `Derived` type, and we ended up checking the code into something like: int v = d.<anonymous>.val; where `<anonymous>` stands in for the name of the `InheritanceDecl` that represents inheritance from `Base`. This design choice makes a limited amount of sense when you consider how inheritance would typically be lowered to a C-like output language: // struct Derived : Base { ... } // => struct Derived { Base base; ... } The problem with that encoding is that it really doesn't make sense for almost any other scenario. In particular, if you have a generic type parameter `T` that was constrianed with `T : ISomething`, then the constraint isn't even technically a member of the type parameter `T`, so expressing thing as a member reference in the AST is completely incorrect. Unfortunately, by the time it was clear that we needed something better, a bunch of implementation work was done based on the existing representation. This change tries to clean things up so that lookup of a super-type member through a value of a sub-type does the obvious thing: cast the value to the super-type and then look up the member (as in `((Base) d).val`). The core of the change is that in lookup, instead of creating `Constraint` breadcrumbs whenever we are looking up in a super-type (with a reference to the `TypeConstraintDecl` being used) we instead use `SuperType` breadcrumbs (with a reference to a `SubtypeWitness`). Then when we create the expression from a `LookupResultItem`, we translate any `SuperType` breadcrumbs into `CastToSuperTypeExpr`s (an expression type that already existed). This change also adds support for lookup through the `This` type in the context of an interface, and in order for that to work we need a new kind of subtype witness to represent the knowledge that a `This` type is a subtype of the enclosing interface. Making that work forces us to change the representation of `TransitiveSubtypeWitness` so that it takes a pair of subtype witnesses (and not one subtype witness plus one `TypeConstraintDecl`). For the most part this is a small change, but it raises the possibility that some pieces of the code aren't going to be robust against all possible shapes of subtype witnesses. The IR lowering logic has relied on the weird `d.<anonymous>` representation in order to ensure that when looking up interface members we weren't always casting to the interface type (which would create a `makeExistential` instruction), and then calling using that. Basically, the IR lowering would ignore the `d.<anonymous>` part and just emit `d`, but we can't do that for `((Base) d)` or `((IThing) d)` because whehter or not we should actually perform the cast depends on context. For now we solve that problem by adding specific logic to ignore up-casts to interface types when they appear in member expressions or method calls. A more robust solution might be needed down the line, but this seems to work in practice. All of this work is cleanup that I found was needed in order to make `extension`s of `interface` types workable. * fixup: disable an incorrect test
2020-08-26	Added more Atomic support for int64 types on RWByteAddressBuffer (#1515)	jsmall-nvidia
	* Support for more 64 bit atomics on ByteAddressBuffer. * min max 64bit test. * Disable CUDA version of min max 64 bit test - as produces the wrong output. * Update target-compatibility.md with added 64 bit atomics. Co-authored-by: Yong He <yonghe@outlook.com>
2020-08-24	RWByteAddressBuffer::InterlockedCompareExchangeU64 (#1513)	jsmall-nvidia
	* First pass at incorporating nvapi into test harness. * D3d12 Atomic Float Add via NVAPI working * Dx12 atomic float appears to work. * Atomic float add on Dx12. * Added atomic64 feature addition to vk. Fix correct output for atomic-float-byte-address.slang * Disable atomic float failing tests. * Upgraded VK headers. * Detect atomic float availability on VK. * Try to get test working for in64 atomic. * Made HLSL prelude controlled via the render-test requirements. * Added -enable-nvapi to premake. * Fix D3D12Renderer when NVAPI is not available. * Small improvements to VKRenderer. * Improve atomic documentation in target-compatibility.md. * Fixed NVAPI working on D3D12. * Test for specific NVAPI features. * Remove requiredFeatures from Renderer::Desc as was ignored. Tried to document more around nvapiExtnSlot. * Readded requiredFeatures to Renderer::Desc * Improve comments in the tests. * Rename Fp32 -> F32 Added cas-int64-byte-address-buffer.slang test Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
2020-08-24	NVAPI improvements (#1512)	jsmall-nvidia
	* First pass at incorporating nvapi into test harness. * D3d12 Atomic Float Add via NVAPI working * Dx12 atomic float appears to work. * Atomic float add on Dx12. * Added atomic64 feature addition to vk. Fix correct output for atomic-float-byte-address.slang * Disable atomic float failing tests. * Upgraded VK headers. * Detect atomic float availability on VK. * Try to get test working for in64 atomic. * Made HLSL prelude controlled via the render-test requirements. * Added -enable-nvapi to premake. * Fix D3D12Renderer when NVAPI is not available. * Small improvements to VKRenderer. * Improve atomic documentation in target-compatibility.md. * Fixed NVAPI working on D3D12. * Test for specific NVAPI features. * Remove requiredFeatures from Renderer::Desc as was ignored. Tried to document more around nvapiExtnSlot. * Readded requiredFeatures to Renderer::Desc * Improve comments in the tests.
2020-08-21	Vulkan update/NVAPI support (#1511)	jsmall-nvidia
	* First pass at incorporating nvapi into test harness. * D3d12 Atomic Float Add via NVAPI working * Dx12 atomic float appears to work. * Atomic float add on Dx12. * Added atomic64 feature addition to vk. Fix correct output for atomic-float-byte-address.slang * Disable atomic float failing tests. * Upgraded VK headers. * Detect atomic float availability on VK. * Try to get test working for in64 atomic. * Made HLSL prelude controlled via the render-test requirements. * Added -enable-nvapi to premake. * Fix D3D12Renderer when NVAPI is not available. * Small improvements to VKRenderer. * Improve atomic documentation in target-compatibility.md.
2020-08-21	Fix stdlib declarations for texture Gather() (#1510)	Tim Foley
	Fixes #1507 These operations were failing to take into account the way that array textures require an extra coordinate to be passed in for the primary location (but not the additional offsets). Adding `isArray` to the component count is the existing solution used for similar intrinsics elsewhere in the stdlib, and it is adopted here. Because our test framework isn't really set up to do a lot of texture testing (including having no support for texture arrays), the test added here is just a cross-compilation test that compares output with fxc for comparable input.
2020-08-21	Another fix for overriding property decls (#1509)	Tim Foley
	* Another fix for overriding property decls The central problem we keep running into with `property` decls in `interface`s comes down to two choices: 1. When a member lookup `obj.someName` or a simple lookup for `someName` produces an overloaded result, we make no attempt to resolve the overloading right away, and instead postpone disambiguation until the point where that expression gets used, in case the context where it gets used can help in disambiguation (a notable case being when there is a call expression `obj.someName(...)` or `someName(...)`). 2. When looking up members in a the scope of a type (either for `obj.someName` or `someName` in the context of a method), we include all results from base types in the set of overloads returned, even in cases where the type has a direct member that "overrides" the inherited one. The combination of these factors means that when a `struct` type implements a `property` to satisfy a requirement of an inherited `interface`, then references to `obj.someProp` end up being ambiguous between the property in the concrete `struct` type and the property it inherits through the `interface`. There is no quick fix possible for issue (2). It might seem that we could just skip over members inherited through `interface`s when doing lookup in a type, but that solution wouldn't apply to inheritance from another `struct` type, or any future scenario where we support default implementations of methods in interfaces. The simple idea of saying that a derived-type member named `M` hides all inherited members named `M` is possible, but would lead to a bad user interface when a type wants to support both a core "bottleneck" method and a bunch of convenience overloads with the same name. That leaves us with issue (1), and trying to find a reasonable fix for it. The common case is that any expression `e` eventually gets used in a context where it will be be subject to disambiguation: * If we form a call expression `e(...)`, then the overload resolution logic will (obviously) work to disambiguate which `e` was meant. * If `e` is used as an argument to another call (`f(... e ...)` or `... + e`), then `e` will be coerced to the expected parameter type for its argument position, and that coercion will disambiguate it (this is the bit that was fixed in #1501) * If `e` is used in another context where a type is expected/known, it will also be coerced: `if(e)`, `int v = e`, etc. The problem case that is left behind is any scenario where `e` is not subject to one of the above resolution cases, which mostly amounts to cases where an expression is never coerced to a single fixed type. There are a few important cases where this occurs today: * When the expression is used as the left-hand side of an assignement (`e = ...`). * When an expression is used to initialize a variable with an implicit type (`let v = e`). * When inferring generic arguments from the value arguments at a call site (`f(e)` where `f` is defined as `f<T>(T v)`) The key connecting thread in each of these cases is that the front-end needs to determine the type of `e` to make progress. Our semantic checking logic already has functions that try to draw a distinction between the two cases: * The `CheckTerm()` operation is supposed to be used when we expect that we will eventually coerce or otherwise diambiguate the term, and also in cases where we don't yet know if a term should name a type or a value * The `CheckExpr()` operation is supposed to be used when we do not expect that we will apply coercion/disambiguation to a term, and need to have assurances that it has been coerced into a non-overloaded expression with a reasonable type The simple part of the fix made here is to make `CheckExpr()` actually do part of what it is suppsoed to (attempt to disambiguate overloaded terms), and then audit all the call sites to `CheckExpr()` to make sure they are actually ones that intend to opt into that logic. The messier part of the fix is dealing with generic argument inference, because we need to extract the type of the disambiguated expression for the purposes of inference, but we don't want to disturb the actual argument list at a call site (because type coercion of the arguments is supposed to handle the disambiguation). This part is done with a bit of special-casing in the overload-resolution context, by adding a method that gets the type or an argument after disambiguation (when possible). * fixup Co-authored-by: Yong He <yonghe@outlook.com>
2020-08-21	Allow calling a generic function with an existential value (dynamic ↵	Yong He
	dispatch) (#1508) * Allow calling a generic function with an existential value (dynamic dispatch). * Fixes per review comments. * Clean up implementation by having `openExistential` return `ExtractExistentialType` instead of a DeclRef to the interface with a `ThisTypeSubstitution`. * More cleanups Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com> Co-authored-by: Yong He <yhe@nvidia.com>
2020-08-20	Initial support for a using construct (#1506)	Tim Foley
	The basic idea is that if you have a namespace: namespace MyCoolNamespace { void f() { ... } ... } then you can bring the declarations from that namespace into scope with: using MyCoolNamespace; f(); The `using` construct is allowed in any scope where declarations are allowed. As an additional feature, the construct allows and then ignores the keyword `namespace` if it occurs right after `using`: using namespace MyCoolNamespace; Note that unlike in C++, `using` a namespace inside another namespace doesn't implicitly make the symbols available to clients of that namespace: namespace hidden { void secret() {...} ... } namespace api { using hidden; ... } api.secret(); // ERROR: `secret()` isn't a member of `api` The implementation of this feature was relatively simple, although it does leave out more advanced features that might be desirable in the future: * No support for `using MCN = MyCoolNamespace` sorts of tricks to define a short name * No support for `using` anything that isn't a namespace (e.g., to make the members of a type available without qualification) * No support for cases where multiple visible modules have a namespace of the same name (or dealing with overloaded namespaces in general)
2020-08-19	Int64 atomic add RWByteAddressBuffer support (#1504)	jsmall-nvidia
	* Fix premake5.lua so it uses the new path needed for OpenCLDebugInfo100.h * Keep including the includes directory. * Added the spirv-tools-generated files. * We don't need to include the spirv/unified1 path because the files needed are actually in the spirv-tools-generated folder. * Put the build_info.h glslang generated files in external/glslang-generated. Alter premake5.lua to pick up that header. * First pass at documenting how to build glslang and spirv-tools. * Improved glsl/spir-v tools README.md * Added revision.h * Change how gResources is calculated. Update about revision.h * Update docs a little. * Split out spirv-tools into a separate project for building glslang. This was not necessary on linux, but is necessary on windows, because there is a file disassemble.cpp in spirv-tools and in glslang, and this leads to VS choosing only one. With the separate library, the problem is resolved. * Fix direct-spirv-emit output. * Update to latest version of spirv headers and spirv-tools. * Upgrade submodule version of glslang in external. * Add fPIC to build options of slang-spirv-tools * WIP adding support for InterlockedAddFp32 * Upgrade slang-binaries to have new glslang. * Fix issues with Windows slang-glslang binaries, via update of slang-binaries used. * WIP - atomicAdd. This solution can't work as we can't do (float) in glsl. WIP on atomic float ops. * Added checking for multiple decls that takes into account __target_intrinsic and __specialized_for_target. First pass impl of atomic add on float for glsl. * Split __atomicAdd so extensions are applied appropriately. * Made Dxc/Fxc support includes. Use HLSL prelude to pass the path to nvapi Added -nv-api-path * Refactor around IncludeHandler and impl of IncludeSystem * slang-include-handler -> slang-include-system Have IncludeHandler/Impl defined in slang-preprocessor * Small comment improvements. * Document atomic float add addition in target-compatibility.md. * CUDA float atomic support on RWByteAddressBuffer. * Add atomic-float-byte-address-buffer-cross.slang * Removed inappropriate-once.slang - the test is no longer valid when a file is loaded and has a unique identity by default. A test could be made, but would require an API call to create the file (so no unique id). Improved handling of loadFile - uses uniqueId if has one. * Work around for testing target overlaps - to avoid exceptions on adding targets. Simplify PathInfo setup. Modify single-target-intrinsic.slang - it no longer failed because there were no longer multiple definitions for the same target. * Int64 atomic add RwByteAddressBuffer support. * Fix typo in stdlib for int atomic ByteAddressBuffer. * Small fixes to int64 atomic test. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
2020-08-18	Support initializing an existential value from a generic value. (#1503)	Yong He
	* Support initializing an existential value from a generic value. * Remove trailing spaces and clean up debugging code.
2020-08-18	Support for float atomics on RWByteAddressBuffer (#1502)	jsmall-nvidia
	* Fix premake5.lua so it uses the new path needed for OpenCLDebugInfo100.h * Keep including the includes directory. * Added the spirv-tools-generated files. * We don't need to include the spirv/unified1 path because the files needed are actually in the spirv-tools-generated folder. * Put the build_info.h glslang generated files in external/glslang-generated. Alter premake5.lua to pick up that header. * First pass at documenting how to build glslang and spirv-tools. * Improved glsl/spir-v tools README.md * Added revision.h * Change how gResources is calculated. Update about revision.h * Update docs a little. * Split out spirv-tools into a separate project for building glslang. This was not necessary on linux, but is necessary on windows, because there is a file disassemble.cpp in spirv-tools and in glslang, and this leads to VS choosing only one. With the separate library, the problem is resolved. * Fix direct-spirv-emit output. * Update to latest version of spirv headers and spirv-tools. * Upgrade submodule version of glslang in external. * Add fPIC to build options of slang-spirv-tools * WIP adding support for InterlockedAddFp32 * Upgrade slang-binaries to have new glslang. * Fix issues with Windows slang-glslang binaries, via update of slang-binaries used. * WIP - atomicAdd. This solution can't work as we can't do (float) in glsl. WIP on atomic float ops. * Added checking for multiple decls that takes into account __target_intrinsic and __specialized_for_target. First pass impl of atomic add on float for glsl. * Split __atomicAdd so extensions are applied appropriately. * Made Dxc/Fxc support includes. Use HLSL prelude to pass the path to nvapi Added -nv-api-path * Refactor around IncludeHandler and impl of IncludeSystem * slang-include-handler -> slang-include-system Have IncludeHandler/Impl defined in slang-preprocessor * Small comment improvements. * Document atomic float add addition in target-compatibility.md. * CUDA float atomic support on RWByteAddressBuffer. * Add atomic-float-byte-address-buffer-cross.slang * Removed inappropriate-once.slang - the test is no longer valid when a file is loaded and has a unique identity by default. A test could be made, but would require an API call to create the file (so no unique id). Improved handling of loadFile - uses uniqueId if has one. * Work around for testing target overlaps - to avoid exceptions on adding targets. Simplify PathInfo setup. Modify single-target-intrinsic.slang - it no longer failed because there were no longer multiple definitions for the same target. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
2020-08-17	Attempt to fix lookup for members that "override" (#1501)	Tim Foley
	Our current lookup process always finds all members of a type, which can include both an inherited member (e.g., from an `interface`) and one that logically overrides/implements it. If something downstream doesn't filter this result down and favor the derived member, then an ambiguity error will result. To date, this has mostly been a non-issue because we haven't emphasized inheritance, and the main case we did support (`struct` types implemented `interface` methods) gets disambiguated as part of overload resolution for function calls. Recent changes to support `property` declarations to `interface`s add the possibility for ambiguity between a "base" and "derived" declaration that can't rely on overload resolution for disambiguation. The approach in this PR is to add disambiguation logic to the other main place where the results of lookup get used. If a lookup result is being assigned to a variable, passed to a function, or otherwise used in a case where a value of a specific type is needed, it will be "coerced" to the desired type. This change makes it so that the first step in the coercion logic is to try to disambiguate the expression that is being coerced. In order to ensure that an overloaded expression can be detected and resolved even when just checking if coercion is possible, I needed to update the `canCoerce()` functions to also take the expression that is being tested for coercibility, and not just its type. There is only one case (that I saw) where coercion checks were being made without an expression value available, and that case didn't actually need/want to handle overloading. In order to test the fixes here, I added logic to the `property`-in-`interface` test to make sure that the critical cases work as expected (references to a derived member using "dot syntax" and "implicit `this`" syntax). Alternatives Considered ----------------------- The first attempt at this fix took a simpler approach: I added the disambiguation logic as a post-process on member lookup. That is, given `obj.foo` I would take the `LookupResult` for `foo` and immediately try to filter it to include only the most-derived members. This approach has the major benefit of catching even more use cases of values (and thus helping to ensure that we don't spend forever chasing down more of these ambiguity errors), but it also has two critical problems: 1. If we only trigger disambiguation when looking up `obj.foo`, then we can't do anything to help when `foo` is looked up as an ordinary identifier, but is actually equivalent to `this.foo`. A full fix would require doing this disambiguation on every* name lookup, which leads to the second issue: 2. It is important that for a method call like `obj.m(...)` we do not disambiguate when looking up `obj.m`, and instead let the overload resolution for the call resolve things. That choice is what makes it possible to call an inherited `m` declaration even when there is a derived `m` with a different signature. Issue (1) is covered by the test case that was added here, but we should probably have a test case for (2) to make sure we don't break that use case. Caveats ------- An important case that we don't solve in this PR is when the result of a lookup is captured in a variable without an explicit type: let f = obj.foo; That case also needs disambiguation, and should be addressed in a later change. A secondary issue is that our approach to prioritizing declarations during lookup is still quite naive. We really need a way for lookup to attach information about nesting of scopes to results (to be clear that results from inner scopes should be preferred over those from outer scopes), as well as have a robust mechanism for comparing the priority of members based on the inheritance graph of a type. This change doesn't do anything to make the situation better or worse.
2020-08-14	Lower existential types. (#1497)	Yong He
	Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
2020-08-14	Fix an issue with explicit enum tag types (#1495)	Tim Foley
	The basic problem here was that in a declaration like: ```hlsl enum Color : uint { Red, Orange, ... } ``` The `: uint` bit is represented as an `InheritanceDecl`, because that is what we use to represent the syntactic form of inheritance clauses like that. At the point where we parse the `InheritanceDecl` we don't yet know whether it represents a base interface or a "tag type" like `uint` in this case. The root problem that is then created is: an `enum` type is not a subtype of its "tag type," and treating it like a subtype can create problems. The main problem that arises is that looking in a type like `Color` will find both the members of color and the members of `uint`. In the case of things like `__init` declarations, that creates a problem where the `Color` type has two different `__init`s that take a `uint`: * The one it inherits from `uint` via that `InheritanceDecl` (even though it shouldn't) * The one it gets via an extension just for conforming to `__EnumType` (a non-user-exposed `interface` in the standard library) Because both of those `__init`s are inherited, neither is preferred over the other one and they create an ambiguity if somebody tries to write: ```hlsl uint u = ...; Colorc = Color(u); ``` The solution used in this PR is to add a compiler-internal modifier to the `InheritanceDecl` that introduces a "tag type" to an `enum`, in an early phase of checking (one of the ones that occurs before it is legal to enumerate the bases of a type). Then the lookup process is modified to ignore `InheritanceDecl`s with that modifier when doing lookup in super-types (since the declaration does not indicate a subtype/supertype relationship). This appears to get the basic feature working again, although it is possible that there are other parts of the compiler that use `InheritanceDecl`s and mistakently assume that all `InheritanceDecl`s introduce subtype/supertype relationships. We probably need to do a significant audit of the code to start being more clear about the nature of the relationships such declarations introduce. Such steps are left to future changes. Co-authored-by: Yong He <yonghe@outlook.com>
2020-08-13	Support property declarations in interfaces (#1494)	Tim Foley
	There are two main features in this change. First, we allow for `interface`s to declare `property` requirements, which can be satisfied by matching `property` declarations in a type that conforms to the interface: interface IRectangle { property float width { get; } property float height { get; } } struct Square : IRectangle { float size; property float width { get { return size; } } property float height { get { return size; } } } Second, we allow a type to satisfy a `property` requirement with an ordinary field of the same name: struct Rectangle : IRectangle { float width; float height; // no explicit `property` declarations needed } The implementation of these features is mostly in `slang-check-decl.cpp` in the logic for checking conformance of a type to an interface. The first feature simply requires adding logic to checking whether a candidate satisfying `property` declaration matches a required `property` declaration. To do so, it must have the same type, and an accessor to satisfy each of the required accessors. The second feature requires adding logic to synthesize an AST `property` declaration for a type, based on a required `property` declaration and its accessors. This means that, more or less, any type where `this.name` yields a storage location that does what is needed can satisfy a property requirement (there is no specific rule that says the storage needs to be a field, although that is the most likely case). The way that witnesses are stored for property declarations probably merits some description. During IR lowering, an abstract storage declaration like a subscript or `property` more or less desugars away, so that the actual interface requirements correspond to the accessors within it (the `get`, `set`, etc.). This means that a witness table should have entries/keys corresponding to the accessors and not the property itself. The process of finding/recording witnesses for `property` requirements thus installs entries for the individual accessors (with care taken to only install accessor witnesses once we are sure we have witnesses for all the requirements). Currently, the code also installs an entry for the property itself, although that is not strictly required, and might not be something we continue to do long-term. (Aside: it was somewhat surprising that an end-to-end test of `property` declarations in `interface`s Just Worked without any changes to IR lowering.) As we continue to write more code that synthesizes and checks AST expressions/statements, it becomes necessary to refactor the semantic checking logic so that it splits the recursive part (e.g., checking the operands of an assignment) from the validation part (e.g., checking that the assignment itself is valid). It is probably too big of a change to justify at this point, but it might be valuable in the future to have distinct hierarchies that represent unchecked and checked ASTs, with semantic checking mostly being a transformation from one to the other. The benefit of such a change is we could factor out a distinct "builder" API for constructing validated/checked AST nodes, with both semantic checking and AST synthesis being clients of that API.
2020-08-13	Added WavePrefixCountBits test. (#1493)	jsmall-nvidia
	Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
2020-08-13	Allow both traditional and modern property syntax (#1487)	Tim Foley
	The initial change to introduce `property` declarations tied them to a "modern" syntax: property width : float { ... } In practice, a great majority of users assume that properties in Slang will be declared like those in C#: property float height { ... } This change allows both options to parse correctly. The choice made here is to only parse as the "modern" syntax when it can be detected from lookahead (an identifier followed by a `:`), and fall back to the "traditional" syntax otherwise. That choice might not produce the best diagnostic messages around syntax errors in codebases that use the modern syntax, but it is the easiest trade-offs to make. We also add similar disambiguation logic for the `newValue` parameter of a `set` declaration (and other "modern"-style parameters). This strategy cannot be applied to all function parameters in general, because traditional parameter lists can still use `:` to introduce a semantic. Note: the same disambiguation strategy applied here could be used for `let` and `var` declarations: let a : int = 1; let int b = 2; This change does not try to introduce flexibility like that, because it seems unlikely for users to care.
2020-08-11	Bugfix: WaveActiveCountBits on glsl output. (#1488)	jsmall-nvidia
	* Fix WaveActiveCountBits on glsl output. * Fix warning `could not be inlined because the return instruction is not at the end of the function. This could be fixed by running merge-return before inlining.` from glslang - because we weren't including the CreateMergeReturnPasss on default optimization, and it's assumed in InlineExhaustivePass. * Keep WaveActiveCountBits use the default WaveMask impl. * Fix WaveCountBits calculation. Use WaveActiveBallot instead of the _WaveActiveBallot.
2020-08-11	Improvements to Casting (#1483)	jsmall-nvidia
	* Improve handling of cast detection when have a more complex cast than just a single identifier. * Improve comments around heuristic for casting * Added nested enum test. * Improve comments * Define function like - output change. * Use lookup for types in determining if cast or not. * Add _isCast function * Add heuristic test to nested-enum.slang that works if the type test fails. * Change hueristic based on review. Allow (..)( to always be an expression, because if it's a type it will be turned into a cast later. * Fix output of define-function-like.slang - which changes again with improved casting support. * Improve testing for type in cast - if we find a decl and it's not a type, then we know it's not a cast.
2020-08-07	AnyValue packing/unpacking pass. (#1480)	Yong He
	* AnyValue packing/unpacking pass. * Add diagnostic for types that does not fit in required AnyValueSize. * Add expected test result * Fix warnings.
2020-08-05	Change the policy for entry-point uniform parameters on Vulkan (#1476)	Tim Foley
	Entry point `uniform` parameters were a feature of the original Cg and HLSL, but have not been used much in production shader code. One of our goals on Slang is to reduce the (ab)use of the global scope, so bringing entry point `uniform` parameters up to a greater level of usability is an important goal. Some policy choices about how global vs. entry-point `uniform` parameters behave have already been made, that shape decisions looking forward: * For DXBC/DXIL, it makes the most sense to follow the lead of fxc/dxc, by treating entry point `uniform` parameters as a kind of syntax sugar for global shader parameters. Any parameters of "ordinary" types are bundles up into an implicit constant buffer, and all the resources (including the implicit constant buffer) are assigned `register`s just as for globals. It is up to the application to decide how to bind those parameters via a root signature (using root descriptors, root constants, descriptor tables, local vs. global root signature, etc.) * For CPU, it makes sense to pass global vs. entry-point parameters as two different pointers, although the details of what we do for CPU are the least constrained across all current targets. * For CUDA compute, it makes the most sense to map global shader parameters to `__constant__` global data, and entry-point `uniform` parameters to kernel parameters. This choice ensures that the signature of a kernel when translated from Slang->CUDA follows the Principle of Least Surprise, at the cost of making entry-point vs. global parameters be passed via different mechanisms. * For OptiX ray tracing, it makes sense to expand on the precedent from CUDA compute: pass global parameters via global `__constant__` data (as is already expected by OptiX for whole-launch parameters), and pass entry-point `uniform` parameters via the "shader record." This establishes a precedent that for ray-tracing shaders, global-scope parameters map to the "global root signature" concept from DXR, while entry-point `uniform` parameters map to a "local root signature" or "shader record." * For Vulkan ray tracing, the precedent from OptiX then argues that entry-point `uniform` parameters should map to the Vulkan "shader record" concept (and thus cannot support things like resource types). * The remaining interesting case is what to do for non-ray-tracing shaders on Vulkan. The dev team agrees that the most reasonable choice to make for non-ray-tracing Vulkan shaders is to map entry-point `uniform` parameters to "push constants." In particular, this makes it easy to express the case of a compute kernel with direct parameters of ordinary/value types in the way that will be implemented most efficiently. The big picture is then that a kernel like: ```hlsl void computeMain(uniform float someValue) { ... } ``` will map to output GLSL like: ```glsl layout(push_constant) uniform { float someValue; } U; void main() { ... } ``` If the user really wanted a constant-buffer binding to be created instead, they can easily change their input to make the buffer explicit: ```hlsl struct Params { float someValue; } void computeMain(uniform ConstantBuffer<Params> params) { ... } ``` (Forcing the user to be explicit about the desire for a buffer here creates a nice symmetry between Vulkan and CUDA; in the first case the user sets up the data in host memory and passes it to the GPU by copy, while in the second case the user must allocate and set up a device-memory buffer for the data. This symmetry extends to D3D if the application chooses to map entry-point `uniform` parameters to root constants.) This change implements logic in the "parameter binding" part of the Slang compiler to make sure that entry-point `uniform` parameters are wrapped up in a push-constant buffer rather than an ordinary constant buffer for non-ray-tracing shaders on Vulkan (and in a shader record "buffer" for the ray-tracing case). The majority of the actual work was in adding support for root/push constants to the test framework and the graphics API abstraction it uses. To be clear about that support: * Root constant ranges are (perhaps confusingly) treated as a new kind of "slot" that can appear on a descriptor set. This choice ensures that the implicit numbering of registers/spaces used by the back-ends can account for these ranges correctly. * The `TEST_INPUT` lines are extended to allow a `root_constants` case that behaves more or less like `cbuffer` * The CPU and CUDA paths can treat a `root_constants` input identically to a `cbuffer`. They already allocate the actual buffers based on reflection, and just use `cbuffer` as a directive that causes bytes to be copied in. * On D3D12 and Vulkan, a descriptor set allocates a `List<char>` to hold the bytes of root constant data assigned into it, and these bytes are flushed to the command list when the table is actually bound (usually right before rendering). * On D3D11, a descriptor set treats a root constant range more or less like a constant buffer range (with a single buffer), except that it also automatically allocates a buffer to hold the data. Assigning "root constant" data automatically copies it into that buffer. The small number of tests that used entry-point `uniform` parameters of ordinary types were updated to use the new `root_constant` input type, and the bugs that surfaced were fixed. A new test to confirm that entry-point `uniform` parameters map to the shader record for VK ray tracing was added. An important but technically unrelated change is the removal of the `DescriptorSetImpl::Binding` type and related function from the Vulkan implementation of `Renderer`. That type was created to ensure that objects that are bound into a descriptor set don't get released while the descriptor set is still alive, but the implementation relied on a complicated linear search to check for existing bindings, which could create a performance issue for descriptor sets that include large arrays of descriptors. The new implementation makes use of the approach already present in the various `Renderer` implementations (including the Vulkan one) for assigning ranges in a descriptor set a flat/linear index for where their pertinent data is to be bound. As a result, the Vulkan `DescriptorSetImpl` now uses a single flat array of `RefPtr`s to track bound objects, and has no need for linear search when binding. Co-authored-by: Yong He <yonghe@outlook.com>
2020-08-05	`AnyValue` based dynamic dispatch code gen (#1477)	Yong He
	* AnyValue based dynamic code gen * Fix aarch64 build error
2020-08-04	Sampler Feedback improvements (#1475)	jsmall-nvidia
	* Add the Feedback texture types. Depreciate SLANG_RESOURCE_EXT_SHAPE_MASK. * Starting point to test sampler feedback. * WIP on FeedbackSampler. * Use __target_intrinsic to override the output of sampler feedback types. * Use newer generic syntax for FeedbackTexture. * Reflects Feedback type. * SLANG_TYPE_KIND_TEXTURE_FEEDBACK -> SLANG_TYPE_KIND_FEEDBACK * Added reflection test. * Reneable issue with generics in sampler-feedback-basic.slang * Add methods to FeedbackTexture2D/Array. Make test cover test cases. * Sampler feedback produces DXC code. * Disabled Sampler feedback test - as requires newer version of DXC. * Fix bug in reflection tool output. * Fix problem with direct-spirv-emit.slang.expected due to update to glslang. * Fix direct-spirv-emit.slang * Use SLANG_RESOURCE_EXT_SHAPE_MASK again * Make Feedback be emitted as a textue type prefix. * Add support for GetDimensions to FeedbackTexture2D * WIP on CPU sampler feedback. Update of target compatibility. * Fix some bugs in C++ feedback sampler. Fix GetDimensions for FeedbackTextures. Run 'Compile' test for CPU compute feedback texture test. Update target-compatability.md * Fix GetDimensions call on feedback sampler. * Small documentation improvements. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
2020-08-03	First pass support for Sampler Feedback (#1470)	jsmall-nvidia
	* Add the Feedback texture types. Depreciate SLANG_RESOURCE_EXT_SHAPE_MASK. * Starting point to test sampler feedback. * WIP on FeedbackSampler. * Use __target_intrinsic to override the output of sampler feedback types. * Use newer generic syntax for FeedbackTexture. * Reflects Feedback type. * SLANG_TYPE_KIND_TEXTURE_FEEDBACK -> SLANG_TYPE_KIND_FEEDBACK * Added reflection test. * Reneable issue with generics in sampler-feedback-basic.slang * Add methods to FeedbackTexture2D/Array. Make test cover test cases. * Sampler feedback produces DXC code. * Disabled Sampler feedback test - as requires newer version of DXC. * Fix bug in reflection tool output. * Fix problem with direct-spirv-emit.slang.expected due to update to glslang. * Fix direct-spirv-emit.slang * Use SLANG_RESOURCE_EXT_SHAPE_MASK again * Make Feedback be emitted as a textue type prefix. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
2020-07-31	Upgrade to Glslang 11.0.0 (#1466)	jsmall-nvidia
	* Fix premake5.lua so it uses the new path needed for OpenCLDebugInfo100.h * Keep including the includes directory. * Added the spirv-tools-generated files. * We don't need to include the spirv/unified1 path because the files needed are actually in the spirv-tools-generated folder. * Put the build_info.h glslang generated files in external/glslang-generated. Alter premake5.lua to pick up that header. * First pass at documenting how to build glslang and spirv-tools. * Improved glsl/spir-v tools README.md * Added revision.h * Change how gResources is calculated. Update about revision.h * Update docs a little. * Split out spirv-tools into a separate project for building glslang. This was not necessary on linux, but is necessary on windows, because there is a file disassemble.cpp in spirv-tools and in glslang, and this leads to VS choosing only one. With the separate library, the problem is resolved. * Fix direct-spirv-emit output. * Update to latest version of spirv headers and spirv-tools. * Upgrade submodule version of glslang in external. * Add fPIC to build options of slang-spirv-tools * Upgrade slang-binaries to have new glslang. * Fix issues with Windows slang-glslang binaries, via update of slang-binaries used. * Small improvements to glslang building process documentation. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>