slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2020-06-05	ASTNodes use MemoryArena (#1376)	jsmall-nvidia
	* Add a ASTBuilder to a Module Only construct on valid ASTBuilder (was being called on nullptr on occassion) * Add nodes to ASTBuilder. * Compiles with RefPtr removed from AST node types. * Initialize all AST node pointer variables in headers to nullptr; * Initialize AST node variables as nullptr. Make ASTBuilder keep a ref on node types. Make SyntaxParseCallback returns a NodeBase * Don't release canonicalType on dtor (managed by ASTBuilder). * Give ASTBuilders a name and id, to help in debugging. For now destroy the session TypeCache, to stop it holding things released when the compile request destroys ASTBuilders. * Moved the TypeCheckingCache over to Linkage from Session. * NodeBase no longer derived from RefObject. * Only add/dtor nodes that need destruction. First pass compile on linux.
2020-06-05	Merge branch 'master' into loop_attrib	Tim Foley

2020-06-04	Emit [loop] attribute to output HLSL.	Yong He

2020-06-04	First steps toward inheritance for struct types (#1366)	Tim Foley
	* First steps toward inheritance for struct types This change adds the ability for a `struct` type to declare a base type that is another `struct`: ```hlsl struct Base { int baseMember; } struct Derived : Base { int derivedMember; } ``` The semantics of the feature are that code like the above desugars into code like: ```hlsl struct Base { int baseMember; } struct Derived { Base _base; int derivedMember; } ``` At points where a member from the base type is being projected out, or the value is being implicitly cast to the base type, the compiler transforms the code to reference the implicitly-generated `_base` member. That means code like this: ```hlsl void f(Base b); ... Derived d = ...; int x = d.baseMember; f(d); ``` gets transformed into a form like this: ```hlsl void f(Base b); ... Derived d = ...; int x = d._base.baseMember; f(d._base); ``` Note that as a result of this choice, the behavior when passing a `Derived` value to a function that expects a `Base` (including to inherited member functions) is that of "object shearing" from the C++ world: the called function can only "see" the `Base` part of the argument, and any operations performed on it will behave as if the value was indeed a `Base`. There is no polymorphism going on because Slang doesn't currently have `virtual` methods. In an attempt to work toward inheritance being a robust feature, this change adds a bunch of more detailed logic for checking the bases of various declarations: * An `interface` declaration is only allowed to inherit from other `interface`s * An `extension` declaration can only introduce inheritance from `interface`s * A `struct` declaration can only inherit from at most one other `struct`, and that `struct` must be the first entry in the list of bases This change also adds a mechanism to control whether a `struct` or `interface` in one module can inherit from a `struct` or `interface` declared in another module: * If the base declaration is marked `[open]`, then the inheritance is allowed * If the base declaration is marked `[sealed]`, then the inheritance is allowed * If it is not marked otherwise, a `struct` is implicitly `[sealed]` * If it is not marked otherwise, an `interface` is implicitly `[open]` These seem like reasonable defaults. In order to safeguard the standard library a bit, the interfaces for builtin types have been marked `[sealed]` to make sure that a user cannot declare a `struct` and then mark it as a `BuiltinFloatingPointType`. This step should bring us a bit closer to being able to document and expose these interfaces for built-in types so that users can write code that is generic over them. There are some big caveats with this work, such that it really only represents a stepping-stone toward a usable inheritance feature. The most important caveats are: * If a `Derived` type tries to conform to an interface, such that one or more interface requirements are satisfied with members inherited from the `Base` type, that is likely to cause a crash or incorrect code generation. * If a `Derived` type tries to inherit from a `Base` type that conforms to one or more interfaces, the witness table generated for the conformance of `Derived` to that interface is likely to lead to a crash or incorrect code generation. It is clear that solving both of those issues will be necessary before we can really promote `struct` inheritance as a feature for users to try out. * fixup: trying to appease clang error * fixups: review feedback
2020-06-02	Working matrix swizzle (#1354)	Dietrich Geisler
	* Working matrix swizzle. Supports one and zero indexing and multiple elements. Performs semantic checking of the swizzle. Matrix swizzles are transformed into a vector of indexing operations during lowering to the IR. This change does not handle matrix swizzle as lvalues. * Renaming * Added missing semicolon * Initialize variable for gcc * Added the expect file for diagnostics * Matrix swizzle updated per PR feedback * Stylistic fix * Formatting fixes * Fix compiling with AST change. Change indentation. Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
2020-05-29	Feature/ast syntax standard (#1360)	jsmall-nvidia
	* Small improvements to documentation and code around DiagnosticSink * Made methods/functions in slang-syntax.h be lowerCamel Removed some commented out source (was placed elsewhere in code) * Making AST related methods and function lowerCamel. Made IsLeftValue -> isLeftValue.
2020-05-28	WIP: ASTBuilder (#1358)	jsmall-nvidia
	* Compiles. * Small tidy up around session/ASTBuilder. * Tests are now passing. * Fix Visual Studio project. * Fix using new X to use builder when protectedness of Ctor is not enough. Substitute->substitute * Add some missing ast nodes created outside of ASTBuilder. * Compile time check that ASTBuilder is making an AST type. * Moced findClasInfo and findSyntaxClass (essentially the same thing) to SharedASTBuilder from Session.
2020-05-26	Improvements around hashing (#1355)	jsmall-nvidia
	* Fields from upper to lower case in slang-ast-decl.h * Lower camel field names in slang-ast-stmt.h * Fix fields in slang-ast-expr.h * slang-ast-type.h make fields lowerCamel. * slang-ast-base.h members functions lowerCamel. * Method names in slang-ast-type.h to lowerCamel. * GetCanonicalType -> getCanonicalType * Substitute -> substitute * Equals -> equals ToString -> toString * ParentDecl -> parentDecl Members -> members * * Make hash code types explicit * Use HashCode as return type of GetHashCode * Added conversion from double to int64_t * Split Stable from other hash functions * toHash32/64 to convert a HashCode to the other styles. GetHashCode32/64 -> getHashCode32/64 GetStableHashCode32/64 -> getStableHashCode32/64 * Other Get/Stable/HashCode32/64 fixes * GetHashCode -> getHashCode * Equals -> equals * CreateCanonicalType -> createCanonicalType * Catches of polymorphic types should be through references otherwise slicing can occur. * Fixes for newer verison of gcc. Fix hashing problem on gcc for Dictionary. * Another fix for GetHashPos * Fix signed issue around GetHashPos
2020-05-22	Tidy up around AST nodes (#1353)	jsmall-nvidia
	* Fields from upper to lower case in slang-ast-decl.h * Lower camel field names in slang-ast-stmt.h * Fix fields in slang-ast-expr.h * slang-ast-type.h make fields lowerCamel. * slang-ast-base.h members functions lowerCamel. * Method names in slang-ast-type.h to lowerCamel. * GetCanonicalType -> getCanonicalType * Substitute -> substitute * Equals -> equals ToString -> toString * ParentDecl -> parentDecl Members -> members
2020-05-19	Reduce the size of Token (#1349)	jsmall-nvidia
	* Token size on 64 bits is 24 bytes (from 40). On 32 bits is 16 bytes from 24. * Added hasContent method to Token. Some other small improvements around Token.
2020-04-08	Fixes for IR generics (#1311)	Tim Foley
	* Fixes for IR generics There are a few different fixes going on here (and a single test that covers all of them). 1. Fix optionality of trailing semicolon for `struct`s ====================================================== We have logic in the parser that tries to make a trailing `;` on a `struct` declaration optional. That logic is a bit subtle and couild potentially break non-idiomatic HLSL input, so we try to only trigger it for files written in Slang (and not HLSL). For command-line `slangc` this is based on the file extension (`.slang` vs. `.hlsl`), and for the API it is based on the user-specified language. The missing piece here was that the path for handling `import`ed code was not setting the source language of imported files at all, and so those files were not getting opted into the Slang-specific behavior. As a result, `import`ed code couldn't leave off the semicolon. 2. Fix generic code involving empty `interface`s ================================================ We have logic that tries to only specialize "definitions," but the definition-vs-declaration distinction at the IR level has historically been slippery. One corner case was that a witness table for an interface with no methods would always be considered a declaration, because it was empty. The notion of what is/isn't a definition has been made more nuanced so that it amounts to two main points: * If something is decorated as `[import(...)]`, it is not a definition * If something is a generic/func (a declaration that should have a body), and it has no body, it is a declaration Otherwise we consider anything a definition, which means that non-`[import(...)]` witness tables are now definitions whether or not they have anything in them. 3. Fix IR lowering for members of generic types =============================================== The IR lowering logic was trying to be a little careful in how it recurisvely emitted "all" `Decl`s to IR code. In particular, we don't want to recurse into things like function parameters, local variables, etc. since those can never be directly referenced by external code (they don't have linkage). The existing logic was basically emitting everything at global scope, and then only recursing into (non-generic) type declarations. This created a problem where a method declared inside a generic `struct` would not be emitted to the IR for its own module at all unless it happened to be called by other code in the same module. The fix here was to also recurse into the inner declaration of `GenericDecl`s. I also made the code recurse into any `AggTypeDeclBase` instead of just `AggTypeDecl`s, which means that members in `extension` declarations should not properly be emitted to the IR. Conclusion ========== These fixes should clear up some (but not all) cases where we might emit an `/* unhandled /` into output HLSL/GLSL. A future change will need to make that path a hard error and then clean up the remaining cases. fixup: tabs->spaces
2020-04-08	Remove static struct members from layout and reflection (#1310)	jsmall-nvidia
	* * Added MemberFilterStyle - controls action of FilteredMemberList and FilteredMemberRefList * Splt out template implementations * Use more standard method names dofr FilteredMemberRefList * Added reflect-static.slang test * Added isNotEmpty/isEmpty to filtered lists * Added ability to index into filtered list (so not require building of array) * Default MemberFilterStyle to All. * Remove explicit MemberFilterStyle::All
2020-04-02	Add basic support for namespaces (#1304)	Tim Foley
	This change adds logic for parsing `namespace` declarations, referencing them, and looking up their members. * The parser changes are a bit subtle, because that is where we deal with the issue of "re-opening" a namespace. We kludge things a bit by re-using an existing `NamespaceDecl` in the same parent if one is available, and thereby ensure that all the members in the same namespace can see on another. * In order to allow namespaces to be referenced by name they need to have a type so that a `DeclRefExpr` to them can be formed. For this purpose we introduce `NamespaceType` which is the (singleton) type of a reference to a given namespace. * The new `NamespaceType` case is detected in the `MemberExpr` checking logic and routed to the same logic that `StaticMemberExpr` uses, and the static lookup logic was extended with support for looking up in a namespace (a thin wrapper around one of the existing worker routines in `slang-lookup.cpp`. * I made `NamespaceDecl` have a shared base class with `ModuleDecl` in the hopes that this would allow us to allow references to modules by name in the future. That hasn't been tested as part of this change. * I cleaned up a bunch of logic around `ModuleDecl` holding a `Scope` pointer that was being used for some of the more ad hoc lookup routines in the public API. Those have been switched over to something that is a bit more sensible given the language rules and that doesn't rely on keeping state sititng around on the `ModuleDecl`. * I added a test case to make sure the new funcitonality works, which includes re-opening a namespace, and it also tests both `.` and `::` operations for lookup in a namespace. * The main missing feature here is the ability to do something like C++ `using`. It would probably be cleanest if we used `import` for this, since we already have that syntax (and having both `import` and `using` seems like a recipe for confusion). Most of the infrastructure is present to support `import`ing one namespace into another (in a way that wouldn't automatically pollute the namespace for clients), but some careful thought needs to be put into how import of namespaces vs. modules should work.
2020-03-30	CUDA version handling (#1301)	jsmall-nvidia
	* render feature for CUDA compute model. * Use SemanticVersion type. * Enable CUDA wave tests that require CUDA SM 7.0. Provide mechanism for DownstreamCompiler to specify version numbers. * Enabled wave-equality.slang * Make CUDA SM version major version not just a single digit. * Fix assert. * DownstreamCompiler::Version -> CapabilityVersion
2020-03-20	Handling of switch with empty body (#1284)	jsmall-nvidia
	* Added handling for empty switch body. Added test for empty switch. * Fix testing for case in switch.
2020-03-16	Define compound intrinsic ops in the standard library (#1273)	Tim Foley
	* Define compound intrinsic ops in the standard library The current stdlib code has a notion of "compound" intrinsic ops, which use the `__intrinsic_op` modifier but don't actually map to a single IR instruction. Instead, most* of these map to multiple IR instructions using hard-coded logic in `slang-ir-lower.cpp`. (* One special case is `kCompoundOp_Pos` that is used for unary `operator+` and that maps to zero IR instructions) All of the opcodes that used to use the `kCompoundOp_` enumeration values now have definitions directly in the stdlib and use the new `[__unsafeForceInlineEarly]` attribute to ensure that they get inlined into their use sites so that the output code is as close as possible to the original. For the most part, generating the stdlib definitions for the compound ops is straightforward, but here's some notes: * The unary `operator+` I just defined directly in Slang code, since it doesn't share much structure with other cases * The unary increment/decrement ops are generated as a cross product of increment/decrement and prefix/postfix. The logic is a bit messy but given that we have scalar, vector, and matrix versions to deal with it still saves code overall * Because all the compound/assignment cases got moved out, the existing code for generating unary/binary ops can be simplified a bit * All the no-op bit-cast operations like `asfloat(float)` are now inline identity functions * A few other small cleanups are made by not having to worry about the compound ops (which used to be called "pseudo ops") sometimes being encoded in to the same type of value as a real IR opcode. The one big detail here is a fix for how IR lowering works for `let` declarations: they were previously being `materialize()`d which only guarantees that they've been placed in a contiguous and addressable location, but doesn't actually convert them to an r-value. As a result a `let` declaration could accidentally capture a mutable location by reference, which is definitely not what we wanted it to do. Fixing this was needed to make the new postfix `++` definition work (several existing tests end up covering this). One important forward-looking note: One subtle (but significant) choice in this change is that we actually reduce the number of declarations in the stdlib, because instead of having the compound operators include both per-type and generic overloads (just listing scalar cases for now): float operator+=(in out float left, float right) { ... } int operator+=(in out int left, int right) { ... } ... T operator+= <T:__BuiltinBlahBlah>(in out T left, T right) { ... } We now have only the single generic version: T operator+= <T:__BuiltinBlahBlah>(in out T left, T right) { ... } In running our current tests, this change didn't lead to any regressions (perhaps surprisingly). Given that we were able to reduce the number of overloads for `operator+=` by a factor of N (where N is the number of built-in types), it seems worth considering whether we could also reduce the number of overloads of `operator+` by the same factor by only having generic rather than per-type versions. One concern that this forward-looking question raises is whether the quality of diagnostic messages around bad calls to the operators might suffer when there are only generic overloads instead of per-type overloads. In order to feel out this problem I added a test case that includes some bad operator calls both to `+=` (which is now only generic with this change) and `+` (which still has per-type overloads). Overall, I found the quality of the error messages (in terms of the candidates that get listed) isn't perfect for either, but personally I prefer the output in the generic case. As part of adding that test, I also added some fixups to how overload resolution messages get printed, to make sure the function name is printed in more cases, and also that the candidates print more consistently. These changes affected the expected output for one other diagnostic test. * fixup: disable bad operator test on non-Windows targets
2020-03-11	Add a basc inlining facility for use in the stdlib (#1271)	Tim Foley
	The main feature visible to the stdlib here is the `[__unsafeForceInlineEarly]` attribute, which can be attached to a function definition and forces calls to that function to be inlined immediately after initial IR lowering. The pass is implemented in `slang-ir-inline.{h,cpp}` and currently only handles the completely trivial case of a function with no control flow that ends with a single `return`. The lack of support for any other cases motivates the `__unsafe` prefix on the attribute. In order to test that the pass works, I modified the "comma operator" in the standard library to be defined directly (rather than relying on special-case handling in IR lowering), and then added a test that uses that operator to make sure it generates code as expected. The compute version of the test confirms that we generate semantically correct code for the operator, while the SPIR-V cross-compilation test confirms that our output matches GLSL where the comma operator has been inlined, rather than turned into a subroutine. Notes for the future: * With this change it should be possible (in principle) to redefine all the compound operators in the stdlib to instead be ordinary functions with the new attribute, removing the need for `slang-compound-intrinsics.h`. * Once the compound intrinsics are defined in the stdlib, it should be easier/possible to start making built-in operators like `+` be ordinary functions from the standpoint of the IR * The attribute and pass here could be extended to include an alternative inlining attribute that happens later in compilation (after linking) but otherwise works the same. This could in theory be used for functions where we don't want to inline the definition into generated IR, but still want to inline things berfore generating final HlSL/GLSL/whatever. * The inlining pass itself could be generalized to work for less trivial functions pretty easily; for the most part it would just mean "splitting" the block with the call site and then inserting clones of the blocks in the callee. Any `return` instructions in the clone would become unconditional branches (with arguments) to the block after the call (which would get a parameter to represent the returned value). * The "hard" part for such an inlining pass would be handling cases where the control flow that results from inlining can't be handled by our later restructuring passes. The long-term fix there is to implement something like the "relooper" algorithm to restructure control flow as required for specific targets.
2020-03-05	Feature/glslang spirv version (#1256)	jsmall-nvidia
	* WIP add support for __spirv_version . * Added IRRequireSPIRVVersionDecoration * SPIR-V version passed to glslang. Enable VK wave tests. Split ExtensionTracker out, so can be cast and used externally to emit. Added SourceResult. * Fix warning on Clang. * Missing hlsl.meta.h * Refactor communication/parsing of __spirv_version with glslang. * Fix some debug typos. Be more precise in handling of substring handling. * Make glslang forwards and backwards binary compatible. * Small comment improvements. * Added slang-spirv-target-info.h/cpp * Fix for major/minor on gcc. * Another fix for gcc/clang. * VS projects include slang-spirv-target-info.h/cpp * Removed SPIRVTargetInfo Added SemanticVersion. Don't bother with passing a target to glslang. Should be separate from 'version'. * Renamed slang-emit-glsl-extension-tracker.cpp/.h -> slang-glsl-extension-tracker.cpp/.h Fixed some VS project issues. * Fix a comment. * Added slang-semantic-version.cpp/.h * Added slang-glsl-extension-tracker.cpp/.h * Added split that can check for input has all been parsed. * Fix problem on x86 win build.
2020-03-03	__spirv_version Decoration (#1255)	jsmall-nvidia
	* WIP add support for __spirv_version . * Added IRRequireSPIRVVersionDecoration
2020-03-03	Move definitions of simple vector/matrix builtins to stdlib. (#1247)	Tim Foley
	Some of the functions declared in the Slang standard library are built in on some targets (almost always the case for HLSL) but aren't available on other targets (often the case for GLSL, CUDA, and CPU). To date, the CUDA and CPU targets have worked around this issue by synthesizing definitions of the missing functions on the fly as part of output code generation, at the cost of some amount of code complexity in the emit pass. This change adds definitions inside the stdlib itself for a large number of built-in HLSL functions that act element-wise over both vectors and matrices (e.g., `sin()`, `sqrt()`, etc.), and changes the CPU/CUDA codegen path to not synthesize C++ code for those functions (instead relying on code generated from the Slang definitions). The element-wise vector/matrix function bodies are being defined using macros in the stdlib, so that we can more easily swap out the definitions en masse if we find an implementation strategy we like better. This could involve defining special-case syntax just for vector/matrix "map" operations that can lower directly to the IR and theoretically generate cleaner code after specialization is complete. As a byproduct of this change, the matrix versions of these functions should in principle now be available to GLSL (GLSL only defines vector versions of functions like `sin()`, and leaves out matrix ones). No testing has been done to confirm this fix. In some cases builtins were being declared with multiple declarations to split out the HLSL and GLSL cases, and this change tries to unify these as much as possible into single declarations to keep the stdlib as small as possible. Two functions -- `sincos()` and `saturate()` -- were simple enough that their full definitions could be given in the stdlib so that even the scalar cases wouldn't need to be synthesized, so the corresponding enumerants were removed in `slang-hlsl-intrinsic-set.h`. In the case of `saturate()` the pre-existing definition used for GLSL codegen could have been used for CPU/CUDA all along. In some cases functions that can and should be defined in the future have had commented-out bodies added as an outline for what should be inserted in the future. Most of these functions cannot be implemented directly in the stdlib today because basic operations like `operator+` are currently not defined for `T : __BuiltinArithmeticType`, etc. Adding such declarations should be straightforward, but brings risks of creating unexpected breakage, so it seemed best to leave for a future change. This change does not try to address making vector or matrix versions of builtin functions that map to single `IROp`s, because the existing mechanisms for target-based specialization, etc., do not apply for such cases. In the future we will either have to make those operations into ordinary functions (eliminating many `IROp`s) so that stdlib definitions can apply, or add an explicit IR pass to deal with legalizing vector/matrix ops for targets that don't support them natively. The right path for this is not yet clear, so this change doesn't wade into it. This change does not touch the `Wave*` functions added in Shader Model 6, despite many of these having vector/matrix versions that could benefit from the same default mapping. It is expected that these functions will have GLSL/Vulkan translation added soon, and it probably makes sense to know what cases are directly supported on Vulkan before adding the hand-written definitions. Because of the limitations on what could be ported into the stdlib, it is not yet possible to remove any of the infrastructure for synthesizing builtin function definitions in the CPU and CUDA back-ends.
2020-03-02	Renamed UnownedStringSlice::size to getLength to make match String. (#1254)	jsmall-nvidia

2020-02-21	Add surface syntax for "this type" (#1236)	Tim Foley
	Within the context of an aggregate type (or an `extension` of one), the programmer can use `this` to refer to the "current" instance of the surrounding type, but there is no easy way to utter the name of the type itself. This is especially relevant inside of an `interface`, where the type of `this` isn't actually the `interface` type, but rather a placeholder for the as-yet-unknown concrete type that will implement the interface. This change adds a keyword `This` that works similarly to `this`, but names the current type instead of the current instance. It can be used to declare things like binary methods or factory functions in an interface: ``` interface IBasicMathType { This absoluteValue(); This sumWith(This left); } T doSomeMath<T:IBasicMathType>(T value) { return value.sumWith(value.absoluteValue()); } ``` The `This` type is consistent with the type named `Self` in Rust and Swift (where Rust/Swift use `self` instead of `this`). Other names could be considered (e.g., `ThisType`) if we find that users don't like the name in this change.
2020-02-20	Initial support for user-defined initializer/constructor declarations (#1233)	Tim Foley
	The basic idea is that the user can write: ```hlsl struct MyThing { int a; float b; __init(int x, float y) { a = x; b = y; } } ``` and after that point, they can create an intstance of their `MyThing` type as simply as `MyThing(123, 4.56f)`. There was already a large amount of infrastructure laying around that is shared between ininitializers and ordinary functions, so enabling this feature mostly amounted to tying up some loose ends: * In the parser, make sure to properly push/pop the scope for an `__init` (or `__subscript`) declaration, so parameters would be visible to the body * In semantic checking, make sure that declaration "header" checking properly bottlenecks all the function-like cases into a base routine * In semantic checking, make sure that the logic for checking function bodies applies to every `FunctionDeclBase` with a body, and not just `FuncDecl`s * Update semeantic checking for statements to allow for any `FunctionDeclBase` as the parent declaration, not just a `FuncDecl` * In lookup, treat the `this` parameter of an `__init` (well, not actually a parameter in this case) as being mutable, just like for a `[mutating]` method * In IR codegen, don't just assume that all `__init`s are intrinsics, and narrow the scope of that hack to just `__init`s without bodies * In IR codegen, detect when we are emitting an IR function for an `__init`, and in that case create a local variable to represent the `this` value, and implicitly return that value at the end of the body. From that point on the rest of the compiler Just Works and IR codegen doesn't have to think of an `__init` as being any different than if the user had declared a `static MyThing make(...)` function. Caveats: * C++ users might like to use that naming convention (so `MyThing` as the name instead of `__init`). We can consider that later. * Everybody else might prefer a keyword other than `__init` (e.g., just `init` as in Swift), but I'm keeping this as a "preview" feature for now, rather than something officially supported * Early `return`s from the body of an `__init` aren't going to work right now. * There is currently no provision for automatically synthesizing initializers for `struct` types based on their fields. This seems like a reasonable direction to take in the future. * There is no provision for routing `{}`-based initializer lists over to initializer calls. The two syntaxes probably need to be unified at some point so that doing `MyType x = { a, b, c }` and `let x = MyType(a, b, c)` are semantically equivalent. It is possible that as a byproduct of this change user-defined `__subscript`s might Just Work, but I am guessing there will still be loose ends on that front as well, so I will refrain from looking into that feature until we have a use case that calls for it.
2020-02-07	Change handling of strings for HLSL/GLSL targets (#1204)	Tim Foley
	* Change handling of strings for HLSL/GLSL targets This change switches our handling of string literals and `getStringHash` to something that is more streamlined at the cost of potentially being less general/flexible. * `String` is now allowed as a parameter type in user-defined functions * `getStringHash` is now allowed to apply to `String`-type values that aren't literals * The list of strings in an IR module is now generated during IR lowering as part of lowering a string literal expression, rather than being defined by recursively walking the IR of the module looking for `getStringHash` calls. The public API still refers to these as "hashed" strings, but they are realistically now "static strings." * When emitting code for HLSL/GLSL, the `String` type emits as `int`, and `getStringHash(x)` emits as `x`. In terms of implementation, the choice was whether to translate `String` over to `int` in an explicit IR pass, or to lump it into the emit pass. While adding the logic to emit clutters up an already complicated step, it is ultimately much easier to make the change there than to write a clean IR pass to eliminate all `String` use. Note that other targets that can handle a more full-featured `String` type are not addressed by this change and thus do not support `String` at all. It may be woth emitting `String` as `const char` on those targets, and emitting string literals directly, but the `getStringHash` function would need to be implemented in the "prelude" then, and we probably want to pick a well-known/-documented hash algorithm before we go that far. This change also brings along some some clean-ups to the `gpu-printing` example, since it can now take advantage of the new functionality of `String`. Fix up tests for new string handling * Add global string literal list to string-literal test (since we now list all static string literals and not just those passed to `getStringHash`) * Disable `getStringHash` test on CPU, since we don't have a working `String` on that platform right now (only HLSL/GLSL) Co-authored-by: Tim Foley <tim.foley.is@gmail.com>
2020-01-31	Some Slang API additions (#1195)	Tim Foley
	* Some Slang API additions These are additions to the public Slang API that came up while I was trying to write an example to demonstrate GPU printing. The additions aren't strictly necessary for the example, but I found these to be missing services when writing the code the way I wanted to. The main public changes are: * There is a new distinct `IEntryPoint` interface which inherit from `IComponentType` (much like `IModule` before). For now this doesn't expose any new functions on top of `IComponentType`, but I expect it to do so eventually. * It is now possible to get the `IModule` for a specific translation unit in a compile request with `spCompileRequest_getModule`. Even for a compile request that had only one translation unit this is not the same object as gets returned by `spCompileRequest_getProgram`. The latter should probably be called `_getLinkedProgram` because it returns a composite component type that links the module with everything it `import`s. An `IModule` can look up entry points declared in the module. Eventually a module should support looking up most of the declarations in the module (e.g., types) by name, but entry points are an obvious case. * A new `link()` operation is added to `IComponentType`. It is possible to have component types that have unsatisfied dependencies, such that trying to generate kernel code from them will fail. The `link()` operation tries to produce a new composite component type that combines a component with its dependencies, to enable code generation. The implementation of end-to-end compilation was using a function like this internally, but it hadn't been exposed to the API. Notes on the implementation: * The list of entry points declared in a given translation unit has moved from `TranslationUnitRequest` to the `Module` inside of it. * `EntryPoint` now has to do a song and dance much like `Module` to both inherit from `ComponentType` and support the `IEntryPoint` interface. * The `Session* m_session` member in `Linkage` (in terms of public API, this is the `slang::ISession` holding a pointer to the `slang::IGlobalSession`) has been changed to a `RefPtr`. Without this change an application can't just hold onto a `ComPtr<slang::ISession>`; they also need to retain the `IGlobalSession` or things will crash. The new behavior seems more correct, but I worry that it might introduce a leak. * The `asInternal` operation for `IComponentType` had to be updated to not just perform a cast. A type like `Module` has two `IComponentType` sub-objects, and only one of these is at the same address as the `ComponentType` base. * Similarly, the `Module::getInterface` logic was changed to fall back to `Super::getInterface` for all the cases other than `IID_IModule`, so that it would be guaranteed to return the `IComponentType` at the same address as `ComponentType` in response to a `queryInterface`. * Fixes for memory retain cycles As part of the earlier change, I made the `Linkage` type hold a `RefPtr` to the `Session`. The motivation there is it lets a user hang onto just a `slang::ISession` without having to also retain the `slang::IGlobalSession` for no immediately apparent reason. There are two problems that this surfaced, one pre-existing and one new. The new problem was that `Session` already held a `RefPtr<Linkage> m_builtinLinkage` for the linkage that holds the stdlib code. I solved that problem by splitting the parent pointer in `Linkage` into two pointers: a raw pointer that is used to actually locate the parent session, and a ref-counted one that can be used to optionally retain the parent session. The builtin linkage is then set up to explicitly not retain its parent, thus breaking the cycle. The second problem was a pre-existing one, where every `ComponentType` was holding a retained pointer to its parent `Linkage`, but in turn the `Linkage` was holding retained pointers to many `ComponentTypes` (and subclasses thereof). For this case I used the more expedient fix of making the parent pointer into a raw pointer, and figuring that it is a reasonable rule to expect user to retain the `Linkage` (aka `slang::ISession`) that owns a component type if they want to be able to use the component type. I might need/want to investigate a better fix for the latter issue, but for now this seems to clean up the issues I was seeing in the tests. Fingers crossed.
2019-12-19	Fix invocation of `[mutating]` methods (#1156)	Tim Foley
	The logic for invoking methods (member functions) in `slang-lower-to-ir.cpp` was failing to take into account whether the callee was `[mutating]` or not. Instead, it would always lower the `base` expression in something like `base.f(...)` as an r-value expression, consistent with a non-`[mutating]` method. The incorrect code generation strategy somehow turned out to work in many cases, but it broke in cases where a `[mutating]` method was called on an `inout` parameter. E.g., in this code: ```hlsl struct Stuff { [mutating] void doThing() { ... } } void broken(inout Stuff s) { s.doThing(); } ``` The `broken` function would fail to write back the value mutated by `doThing` to its `s` parameter before returning. The crux of the fix here is inside `visitInvokeExpr()`. Instead of directly calling `lowerRValueExpr` on the base expression of a method/member-function call, we instead compute the "direction" of the `this` parameter in the callee, and use that to emit the argument expression appropriately. In order to enable that change, there are several refactorings included: * The existing `ParameterDirection` and `getParameterDirection()` calls were lifted out from the declaration visitor to the global scope, so that they could be shared between lowering of functions and their call sites. * The logic for determining the "direction" of a `this` parameter was factored out of `collectParameterLists()` into its own `getThisParamDirection()` subroutine (again so that functions and call sites can share matching logic). * The logic for turning an AST expression used as a call argument into IR argument(s)* was pulled out into its own `addCallArgsForParam` and was refactored to rely on a `ParameterDirection` instead of directly inspecting the modifiers on a `ParamDecl`. This allows the function to be used for ordinary/direct arguments and the `this` argument, and also ensures that the caller and callee will agree on the direction of parameters. Fixing the way that `[mutating]` methods are called actually broke some test cases, specifically in the cases where a `[mutating]` method was being called on a value with an interface-constrained generic type: ```hlsl interface IThing { [mutating] void doStuff(); } void myFunc<T : IThing>(inout T thing) { thing.doStuff(); } ``` Our argument passing for `inout` parameters currently requires that we make a temp copy of `thing` into a local, and then pass that local as argument for the `inout` parameter, before copying back. The issue that arose was that a simple version of the logic uses the type of the `base` expression in `base.someMethod(...)` as the type of the local variable, but for an interface method call the base expression will have been cast to the interface type (we effectively have `((IThing) thing).doStuff()`. The fix here was to query the this type through the member function we are calling, and to share that logic between the function-call and function-declaration cases, to try and make sure they match, which meant even more logic got hoisted out of the declaration-emission logic and to the top level. Note: This change does not clean up any other clarity or performance concerns around `out` and `inout` parameters; it is only focused on correctness.
2019-12-06	Support conversion from int/uint to enum types (#1147)	Tim Foley
	* Support conversion from int/uint to enum types The basic feature here is tiny, and is summarized in the code added to the stdlib: ``` extension __EnumType { __init(int val); __init(uint val); } ``` The front-end already makes all `enum` types implicitly conform to `__EnumType` behind the scenes, and this `extension` makes it so that all such types inherit some initializers (`__init` declarations, aka. "constructors") that take `int` and `uint`. (Note: right now all `__init` declarations in Slang are assumed to be implemented as intrinsics using `kIROp_Construct`. This obviously needs to change some day, especially so that we can support user-defined initializers.) Actually making this work required a bit of fleshing out pieces of the compiler that had previously been a bit ad hoc to be a bit more "correct." Most of the rest of this description is focused on those details, since the main feature is not itself very exciting. When overload resolution sees an attempt to "call" a type (e.g., `MyType(3.0)`) it needs to add appropriate overload candidates for the initializers in that type, which may take different numbers and types of parameters. The existing code for handling this case was using an ad hoc approach to try to enumerate the initializer declarations to consider, which might be found via inheritance, `extension` declarations, etc. In practice, the ad hoc logic for looking up initializers was just doing a subset of the work that already goes into doing member lookup. Changing the code so that it effectively does lookup for `MyType.__init` allows us to look up initializers in a way that is consistent with any other case of member lookup. Generalizing this lookup step brings us one step closer to being able to go from an `enum` type `E` to an initializer defined on an `extension` of an `interface` that `E` conforms to. One casualty of using the ordinary lookup logic for initializers is that we used to pass the type being constructed down into the logic that enumerated the initializers, which made it easier to short-circuit the part of overload resolution that usually asks "what type does this candidate return." It might seem "obvious" that an initializer/constructor on type `Foo` should return a value of type `Foo`, but that isn't necessarily true. Consider the `__BuiltinFloatingPointType` interface, which requires all the built-in floating-point types (`float`, `double`, `half`) to have an initializer that can take a `float`. If we call that interface in a generic context for `T : __BuiltinFloatingPointType`, then we want to treat that initializer as returning `T` and not `__BuiltinFloatingPointType`. Without the ad hoc logic in initializer overload resolution, this is the exact problem that surfaced for the stdlib definition of `clamp`. The solution to the "what type does an initializer return" problem was to introduce a notion of a `ThisType`, which refers to the type of `this` in the body of an interface. More generally, we will eventually want to have the keyword `This` be the type-level equivalent of `this`, and be usable inside any type. The `calcThisType` function introduced here computes a reasonable `Type` to represent the value of `This` within a given declaration. Inside of concrete type it refers to the type itself, while in an `interface` it will always be a `ThisType`. The existing `ThisTypeSubstitution`s, previously only applied to associated types, now apply to `ThisType`s as well, in the same situations. The next roadblock for making the simple declarations for `__EnumType` work was that the lookup logic was only doing lookup through inheritance relationships when the type being looked up in was an `interface`. The logic in play was reasonable: if you are doing lookup in a type `T` that inherits from `IFoo`, then why bother looking for `IFoo::bar` when there must be a `T::bar` if `T` actually implements the interface? The catch in this case is that `IFoo::bar` might not be a requirement of `IFoo`, but rather a concrete method added via an `extension`, in which case `T` need not have its own concrete `bar`. The simple/obvious fix here was to make the lookup logic always include inherited members, even when looking up through a concrete type. Of course, if we allow lookup to see `IFoo::bar` when looking up on `T`, then we have the problem that both `T::bar` and `IFoo::bar` show up in the lookup results, and potentially lead to an "ambiguous overload" error. This problem arises for any interface rquirement (so both methods and associated types right now). In order to get around it, I added a somewhat grungy check for comparing overload candidates (during overload resolution) or `LookupResultItem`s (during resolution of simple overloaded identifiers) that considers a member of a concrete type as automatically "better" than a member of an interface. The Right Way to solve this problem in the long run requires some more subtlety, but for now this check should Just Work. One final wrinkle is that due to our IR lowering pass being a bit overzealous, we currently end up trying to emit IR for those new `__init` declarations, which ends up causing us to try and emit IR for a `ThisType`. That is a case that will require some subtlty to handle correctly down the line, for for now we do the expedient thing and emit the `ThisType` for `IFoo` as `IFoo` itself, which is not especially correct, but doesn't matter since the concrete initializer won't ever be called. * testing: add more debug output to Unix process launch function * testing: increase timeout when running command-line tests
2019-12-06	Remove legacy feature for merging global shader parameters (#1139)	Tim Foley
	* Remove legacy feature for merging global shader parameters There is a fair amount of special-case code in the Slang compiler today to deal with the scenario where a programmer declares the "same" shader parameter across two different translation units: ```hlsl // a.hlsl Texture2D a; cbuffer C { float4 c; } ``` ```hlsl // b.hlsl cbuffer C { float4 c; } Texture2D b; ``` An important note here is that the declaration of `C` may be in a header file that both `a.hlsl` and `b.hlsl` `#include`, because from the standpoint of the parser and later stages of the compiler, there is no difference between `C` being in an included file vs. it being copy-pasted across both `a.hlsl` and `b.hlsl`. When a user invokes `slangc a.hlsl b.hlsl` (or the equivalent via the API), then they may decide that it is "obvious" that the shader parameter `C` is the "same" in both `a.hlsl` and `b.hlsl`. Knowing that the parameter is the "same" may lead them to make certain assumptions: * They may assume that generated code for entry points in `a.hlsl` and `b.hlsl` will both agree on the exact `register`/`binding` occupied by `C`. * They may assume that reflection information for their program will only reflect `C` once, and it will reflect it in a way that is applicable to entry points in both `a.hlsl` and `b.hlsl` * They may assume that the compiler can and should handle this use case even when `C` contains fields with `struct` types that are declared in both `a.hlsl` and `b.hlsl` that have the "same" definition. * They may assume that in cases where `C` is declared inconsistently between `a.hlsl` and `b.hlsl` the compiler can and will diagnose an error. Making these assumptions work in practice required a lot of special-case code: * When composing/linking programs was `ComponentType`s we had to include a special case `LegacyProgram` type that could provide these "do what I mean" semantics, since they are not what one would want in the general case for a `CompositeComponentType`. * During enumeration of global shader parameter in a `LegacyProgram`, we had to detect parameters from distinct modules (translation units) with the same name, and then enforce that they must have the "same" type (via an ad hoc recursive structural type match). No other semantic checking logic needs or uses that kind of structural check. * During parameter binding generation, we need to handle the case where a single global shader parameter might have multiple declarations, and make sure to collect explicit bindings from all of them (checking for inconsistency) and also to apply generated bindings to all of them. * The `mapVarToLayout` member in `StructTypeLayout` is a concession to the fact that we might have multiple `VarDecl`s for each field of the struct that represents the global scope, we might need to look up a field and its layout using any of those declarations (much of the need for this field had gone away now that IR passes are largely using IR-based layout). All of these different special cases added more complex code in many places in the compiler, all to support a scenario that isn't especially common. Most users won't be affected by the original issue, because they will do one of several things that rule it out: * Anybody using `slangc` like a stand-in for `fxc` or `dxc` and compiling one translation unit at a time will not suffer from any problems. If/when such users want consistent bindings across translation units, they already use either explicit binding or rely on consistent ordering and implicit binding. * Anybody who puts all the entry points that get combined into a pass/pipeline in a single file will not have problems. They will automatically get consistent bindings because of Slang's guarantees, and there can't be duplicated declarations when there is only one translation unit. * Anybody using `import` to factor out common declarations while compiling multiple translation units at once will not be affected. Parameters declared in an `import`ed module are the "same" in a much deeper way that it is trivial for Slang to support. Only users of the Falcor framework are likely to be affected by this, and they have two easy migration paths: either put related entry points into the same file, or factor common parameters into an `import`ed module. (It is also worth noting that for command-line `slangc`, it is possible to have a single module with multiple `.slang` files in it, which can all see global declarations like parameters across all the files. Anybody who buys into doing things the Slang Way should have no problem avoiding duplicated declarations) With the rationale out of the way, the actual change mostly just amounts to deleting lots of code that is no longer needed. An astute reviewer might notice several `assert`-fail conditions where complex Slang features were never actually made to work correctly with this legacy behavior. A small number of test cases broke with the code changes, but these were tests that specifically exercised the behavior being removed. In the case of the tests around binding/reflection generating, I rewrote the tests to use one of the idomatic workarounds (putting the shared parameters into an `import`ed module), but doing so required me to add support for `#include` when doing pass-through compilation with `fxc`. That logic added a bit more cruft than I had originally hoped to this commit, but having `#include` support when doing pass-through compilation is probably a net win. * fixup: 64-bit warning
2019-12-04	Feature/string hash review (#1142)	jsmall-nvidia
	* * Added ConstArrayView * Made StringSlicePool have styles * Remove point about strings not having terminating 0 (they do), and restriction around "" * spCalcStringHash -> spComputeStringHash * Small code improvements. Closer to coding conventions. * Fix small bug with Empty adding c string. * Fix typo in assert. * Fix ArrayView compiling issue on gcc/clang. * Remove tabs. * Improve comments around StringSlicePool. Simplify getting the added slices.
2019-12-03	getStringHash on string literals (#1140)	jsmall-nvidia
	* WIP getStringHash * Have a use. * Add slang-string-hash.h/.cpp * Use StringSlicePool for holding strings for StringHash. Add outputBuffer to string-literal-hash.slang so value can be tested. Ignore the GlobalHashedStringLiterals instruction on emit. * Add all the hashed string literals to ProgramLayout. * Add reflection support for hashed string literals to reflection test. * Fix string literal hash test. * Small fixes to pass test suite. * Fix issue in serialization where IRUse is not correctly initialized. * Fix problem initializing IRUse for string hash pass. Remove hack from slang-ir-specialize - specially handling if user is not null. * * Use shared builder when replacing getStringHash * Comments for functions in slang-ir-string-hash * Do not allow zero length string literals. Could be allowed, but doing so would require StringSlicePool to have a special case (or some other mechanism)
2019-11-22	Clean up the concept of "pseudo ops" (#1136)	Tim Foley
	* Clean up the concept of "pseudo ops" Built-in functions in the Slang standard library can be marked with `__intrinsic_op(...)` to indicate that they should not lower to functions in the IR, and that instead call sites to those functions should be translated directly to the IR. There are two cases where `__intrinsic_op(...)` gets used: 1. In the case where the argument to `__intrinsic_op(...)` is an actual IR instruction opcode, the IR lowering logic directly translates a call into an instruction with the given opcode. The arguments to the call become the operands of the instruction. 2. In the case where the argument to `__intrinsic_op(...)` is one of a set of "pseudo" instruction opcodes, the IR lowering logic directly handles the lowering to IR with dedicated code. The operands to the call might be handled differently depending on the kind of operation. The compound operators like `+=` are the most important example of these "pseudo" instructions. It doesn't make sense to handle them as true function calls (although that would work semantically), nor does it make sense to have a single IR instruction with such complicated semantics. An earlier version of the compiler used the same enumeration for both the true IR instruction opcodes and these "pseudo" opcodes, with the simple constraint that the pseudo opcodes were all negative while the real opcodes were positive. That design got changed up over a few refactorings, and because there was never a good explanation in the code itself of what "pseudo" opcodes were, we eventually ended up in a place where the in-memory and serialized IR encodings included logic to try to deal with the possibility of these "pseudo" opcodes, even though the entire design of the lowering pass meant that they'd never appear in generated IR. This change tries to clean up the mess in a few ways: * The terminology is now that these are "compound" intrinsic ops, to differentiate them from the more common case of intrinsic ops that map one-to-one to IR instructions. * The declaration of the compound intrinsic ops is no longer in a file related to the IR, and doesn't use the `IR` naming prefix, so somebody looking at the IR opcodes cannot become confused and think the compound ops are allowed there. * The IR encoding in memory and when serialized is updated to not account for or worry about the possibility of "pseudo" ops. * The compound ops are declared in such a way that ensures their enumerant values are all negative, so that they are yet again trivially disjoint from the true IR opcodes. A more drastic change might have split `__intrinsic_op` into two different modifier types: one for the trivial single-instruction case and one for the compound case. Doing this would make the change more invasive, though, because there are places in the meta-code that generates the standard library that intentionally handle both single-instruction and compound ops (because built-in operators can translate to either case). * fixup: missing file * cleanups based on review feedback
2019-11-19	Initial work for "global generic value parameters" (#1127)	Tim Foley
	* Initial work for "global generic value parameters" The main new feature here is support for the `__generic_value_param` keyword, which introduces a global generic value parameter. For example: __generic_value_param kOffset : uint = 0; This declaration introduces a global generic value parameter `kOffset` of type `uint` that has a nominal default value of zero. The broad strokes of how this feature was added are as follows: * A new `GlobalGenericValueParamDecl` AST node type is introduces in `slang-decl-defs.h` * A new `parseGlobalGenericValueParamDecl` subroutine is added to `slang-parser.cpp`, and is added to the list of declaration cases as the callback for the `__generic_value_param` name. * Cases for `GlobalGenericValueParamDecl` are added to the declaration checking passes in `slang-check-decl.cpp`, mirroring what is done for other variable declaration cases. * A case for `GlobalGenericValueParamDecl` is aded to the `Module::_collectShaderParams` function, so that it is recognized as a kind of specialization parameter. This introduces a specialization parameter of flavor `SpecializationParam::Flavor::GenericValue` (which was already defined before this change, although it was unused). * A case for `SpecializationParam::Flavor::GenericValue` is added in `Module::_validateSpecializationArgsImpl` to check that a specialization argument represents a compile-time-constant value (not a type). * A case for `GlobalGenericValueParmDecl` is introduced in `slang-lower-to-ir.cpp` that introduces a global generic parameter in the IR * The `IRBuilder` is extended to support creating `IRGlobalGenericParam`s for the distinct cases of type, witness-table, and value parameters. The same IR instruction type/opcode is used for all cases, and only the type of the IR instruction differs. * The existing mechanisms for lowering specialization arguments to the IR, and doing specialization on the IR itself Just Work with global generic value parameters since they already support value parameters on explicit generic declarations. That's the santized version of things, but there were also a bunch of cleanups and tweaks required along the way: * The `SpecializationParam` type was extended to also track a `SourceLoc` to help in diagnostic messages, which meant some churn in the code that collects specialization parameters. * The `_extractSpecializationArgs` function is tweaked to support any kind of "term" as a specialization argument (either a type or a value). * To allow parsing specialization arguments that can't possibly be types (e.g., integer literals) we replace the existing `parseTypeString` routine with `parseTermString` and then in `parseTermFromSourceFile` call through to a general case of expression parsing (which can also parse types) rather than only parsing types directly. * Right before doing back-end code generation, we check if the program we are going to emit has remaining (unspecialized) parameters, in which case we emit a diagnostic message for the parameters that haven't been specialized rather than go on to emit code that will fail to compile downstream. * Within the `render-test` tool we collapse down the arrays that held both "generic" and "existential" specialization arguments, so that we just have global and entry-point specialization argument lists. This mirrors how Slang has worked internally for a while, but the difference hasn't been important to the test tool because no tests currently mix generic and existential specialization. The logic for parsing `TEST_INPUT` lines has been streamlined down to just the global and entry-point cases, but the pre-existing keywords are still allowed so that I don't have to tweak any test cases. There are several significant caveats for this feature, which mean that it isn't really ready for users to hammer on just yet: * There is no support for `Val`s of anything but integers, so there is no way to meaningfully have a generic value param with a type other than `int` or `uint`. * We allow for a default-value expression on global generic parameters, but do not actually make use of that value for anything (e.g., to allow a programmer to omit specialization arguments), nor check that it meets the constraints of being compile-time constant. * Global generic value parameters are not currently being treated the same as explicit generic parameters in terms of how they can be used for things like array sizes or other things that require constants. This will probably be relaxed at some point, but allowing a global generic to be used to size an array creates questions around layout. * The IR optimization passes in Slang currently won't eliminate entire blocks of code based on constant values, so using a global generic value parameter to enable/disable features will not currently lead to us outputting drastically different HLSL or GLSL. That said, we expect most downstream compilers to be able to handle an `if(0)` well. * Fix regression for tagged union types The change that made specialization arguments be parsed as "terms" first, and then coerced to types meant that any special-case logic that is specific to the parsing of types would be bypassed and thus not apply. Most of that special-case logic isn't wanted for specialization arguments, since it pertains to cases were we want to, e.g, declare a `struct` type while also declaring a variable of that type. The one special case that is useful is the `__TaggedUnion(...)` syntax, which is the only way to introduce a tagged union type right now. In order to get that case working again, all I had to do was register the existing logic for parsing `__TaggedUnion` as an expression keyword with the right callback, and the existing logic in expression parsing kicks in (that logic was already handling expression keywords like `this` and `true`). I left in the existing logic for handling `__TaggedUnion` directly where types get parsed, rather than try to unify things. A better long-term fix is to make the base case for type parsing route into `parseAtomicExpr` so that the two paths share the core logic. That change should probably come as its own refactoring/cleanup, because it creates the potential for some subtle breakage. * fixup: typo
2019-11-08	Fix problem when getting default value for a bool, was producing 0, which on ↵	jsmall-nvidia
	glsl could not be coerced. (#1117)
2019-11-07	* Removed strip pass from emit as no longer needed (#1114)	jsmall-nvidia
	* If obfuscate is enabled do strip on Layout * Add option to keep insts that have layout decoration (else DCE strips layout) * Add NameHint back in lowering - as strip now correctly removes. We may want NameHints in some stages even with obfuscation (for error messages in IR passes), as long as they are removed appropriately at the end
2019-11-06	Add basic support for entry points in `.slang-lib` files. (#1112)	Tim Foley
	* Add basic support for entry points in `.slang-lib` files. The basic idea here is that when writing out a `.slang-lib` file based on a compile request, we include new sections in the generated RIFF that represent the entry points that were requested. The entry-point information is serialized in an entirely ad hoc fashion (a future change might clean it up to use the `OffsetContainer` machinery), and contains the name, profile, and mangled symbol name of an entry point. When deserializing this information, we create a list of "extra" entry points that gets attached to the front-end compile requests. These "extra" entry points get turned into `EntryPoint` objects at the same place in the code that entry points specified on the command line or via API would be checked, but the extra entry points bypass the semantic checking and just create "dummy" `EntryPoint` objects. Aside: the ability for a compile request to end up with entry points that weren't originally specified via API or command-line is not new. We already had support for compiling a translation unit with entry points entirely specified via `[shader(...)]` attributes, and this new support tries to function similarly. Because the "dummy" entry points don't retain AST-level information, several parts of the code have been modified to defensively check for `EntryPoint` objects without a matching AST declaration, and skip over them. The main place where this creates a problem is paramete binding, where ignoring the dummy entry point is appropriate since we currently assume linked-in library code has been laid out manually. One small cleanup here is that the `-r` command-line flag and the `spAddLibraryReference` API functio now bottleneck through a common routine to do their work, so that they both gain the new behavior without needing copy-paste programming. In order to keep the existing test case for library linking with entry points working, I had to add a flag to the `render-test` tool so that it can skip specifying entry point names as part of the compile request it creates. In that case it must instead assume that the entry points will be added to the compile request via other means. This logic is a bit magical, and hints that we should be looking for other ways to expose the library linking functionality over time. * fixup: remove alignment assertion
2019-11-06	Support for [__extern] attribute (#1111)	jsmall-nvidia
	* Added RiffReadHelper * Move type to fourCC in Chunk simplifies some code. * Make MemoryArena able to track external blocks. Allow ownership of Data to vary. Changed IR serialization to use moved allocations to avoid copies. As it turns out all of the array writes could use unowned data, but doing so requires the IRData to stay in scope longer than IRSerialData, which it does at the moment - but perhaps needs better naming or a control for the feature. * Write out slang-module container. * WIP on -r option. Loading modules - with -r. * Making the serialized-module run (without using imported module). * Split compiling module from the test. * Separate module compilation with a function working. * Remove serialization test as not used. * Fix warning on gcc. * Updated test to have types across module boundary. * Allow entry point declaration. A test that tries to build with just an entry point declaration and a module. * Try to make link work with multiple modules. * Multi module linking first pass working. * Multi module test working with -module-name option * Added feature to repro manifest of approximation of command line that was used. * Use isDefinition - for determining to add decorations to entry point lowering. * Added support for repo-file-system.h More precise control of CacheFileSystem. Allow RelativeFileSystem to strip paths optionally. Use canonical paths in PathInfo cache. Fix bug in -D options for command line output of StateSerailizeUtil * Add missing slang-options.h * Fix bug in bit slang-state-serialize.cpp with bit removal. * Added documentation around -repro-file-system Added spLoadReproAsFileSystem function. * Fix warning. * spAddLibraryReference * * Add support for slang-lib extension * Container output when using -no-codegen option * Use the m_containerFormat to determine if the module container is constructed. Store the result in a blob. This allows for potential access via the API. Write the blob if a filename is set. Use m_ prefix for container variables. * Added spGetContainerCode. Made spGetCompileRequestCode work. * * Put obfuscateCode on linkage * Remove obfuscation from variable names - as can be achieved by either stripping and/or removing NameHintDecorations at lowering * Remove name hints being added during lowering * Add stripping of SourceLoc location in strip phase * Hashing of linkage import/export names. * Do final strip in emitEntryPoint, removes any remaining SourceLoc. * Support for [__extern] to mark struct/function that are defined elsewhere. * Allow adding extern to any decl. * Use ExternAtrtibute to apply import decoration, rather than use an ir extern decoration. * Added a test for [__extern] * Improved comment around [__extern]
2019-11-06	Feature/obfuscate improvements (#1107)	jsmall-nvidia
	* Added RiffReadHelper * Move type to fourCC in Chunk simplifies some code. * Make MemoryArena able to track external blocks. Allow ownership of Data to vary. Changed IR serialization to use moved allocations to avoid copies. As it turns out all of the array writes could use unowned data, but doing so requires the IRData to stay in scope longer than IRSerialData, which it does at the moment - but perhaps needs better naming or a control for the feature. * Write out slang-module container. * WIP on -r option. Loading modules - with -r. * Making the serialized-module run (without using imported module). * Split compiling module from the test. * Separate module compilation with a function working. * Remove serialization test as not used. * Fix warning on gcc. * Updated test to have types across module boundary. * Allow entry point declaration. A test that tries to build with just an entry point declaration and a module. * Try to make link work with multiple modules. * Multi module linking first pass working. * Multi module test working with -module-name option * Added feature to repro manifest of approximation of command line that was used. * Use isDefinition - for determining to add decorations to entry point lowering. * Added support for repo-file-system.h More precise control of CacheFileSystem. Allow RelativeFileSystem to strip paths optionally. Use canonical paths in PathInfo cache. Fix bug in -D options for command line output of StateSerailizeUtil * Add missing slang-options.h * Fix bug in bit slang-state-serialize.cpp with bit removal. * Added documentation around -repro-file-system Added spLoadReproAsFileSystem function. * Fix warning. * spAddLibraryReference * * Add support for slang-lib extension * Container output when using -no-codegen option * Use the m_containerFormat to determine if the module container is constructed. Store the result in a blob. This allows for potential access via the API. Write the blob if a filename is set. Use m_ prefix for container variables. * Added spGetContainerCode. Made spGetCompileRequestCode work. * * Put obfuscateCode on linkage * Remove obfuscation from variable names - as can be achieved by either stripping and/or removing NameHintDecorations at lowering * Remove name hints being added during lowering * Add stripping of SourceLoc location in strip phase * Hashing of linkage import/export names. * Do final strip in emitEntryPoint, removes any remaining SourceLoc.
2019-10-31	Reference IR modules with entry point (#1101)	jsmall-nvidia
	* Added RiffReadHelper * Move type to fourCC in Chunk simplifies some code. * Make MemoryArena able to track external blocks. Allow ownership of Data to vary. Changed IR serialization to use moved allocations to avoid copies. As it turns out all of the array writes could use unowned data, but doing so requires the IRData to stay in scope longer than IRSerialData, which it does at the moment - but perhaps needs better naming or a control for the feature. * Write out slang-module container. * WIP on -r option. Loading modules - with -r. * Making the serialized-module run (without using imported module). * Split compiling module from the test. * Separate module compilation with a function working. * Remove serialization test as not used. * Fix warning on gcc. * Updated test to have types across module boundary. * Allow entry point declaration. A test that tries to build with just an entry point declaration and a module. * Try to make link work with multiple modules. * Multi module linking first pass working. * Multi module test working with -module-name option * Use isDefinition - for determining to add decorations to entry point lowering.
2019-10-25	Don't use mangled names when emitting code (#1096)	Tim Foley
	* Small cleanup to how standard-library code gets marked Declarations in the standard library used to be individualy marked with `FromStdLibModifier` so that downstream passes (notably IR lowering) could treat them specially. At some point we simplified things by just looking for `FromStdLibModifier` in the ancestor list of a declaration, so that we could handle nested declarations without having to recursively attach the modifier to everything. This change simplifies things a bit further in that the `FromStdLibModifier` now only get attached to the `ModuleDecl`, since that will be on the ancestor chain of all the declarations anyway. The second change here is that we attach this marker modifier before we parse and do semantic checking on the module, rather than after. There is no code in this change that relies on that difference, but changing the timing of when we attach the modifier means that the semantic checking logic can now reliably detect if something represents a declaration in the standard library. * Change implementation of `sign()` for GLSL target Issue #602 related to the `sign()` function having a different return type between GLSL and HLSL/SLang. At the time, the issue was fixed by adding special-case logic in the emit pass that looked for the `sign()` function by name. This change cleans up that logic to work using the existing `__target_intrinsic` mechanism instead. * Make sure code emit doesn't depend on mangled names There was existing logic in the HLSL/GLSL emit path (that got copy-pasted into the C/C++ path) where if a builtin function didn't have an application `__target_intrinsic` modifier (`IRTargetIntrinsicDecoration` at the IR level), we used some fairly fragile logic to take the mangled symbol name for a function and unmangle it to recover the original name, as well as the number of parameters (to detect member functions). This approach had always been a kludge, and we have reached a point where it actively hinders forward progress on some valuable new features. The big idea of this change is fairly simple: in the IR lowering pass we detect when we have a stdlib function that ought to have an intrinsic definition but doesn't have a "catch-all" `__target_intrinsic` modifier applied. If we discover such a function, we go ahead and emit it to the IR with just such a catch-all modifier. (The main alternative here would be to require that all functions in the stdlib be manually marked `__target_intrinsic` and make this a front-end issue where we get errors if we write stdlib functions wrong, but this workaround is more expedient for now) Some of the logic around target intrinsics already handled the idea of having catch-all intrinsic modifiers/decorations with an empty target name, but that support wasn't 100% complete so this change strives to get it working. One detail that was only being handled along this mangled-name path was support for emitting calls using member-function syntax. I updated the generic handling of `__target_intrinsic`-based calls to support specifying a member function name using a prefix `.` on the definition string. With this work in place, it is possible to clean up a bunch of logic in the `emit` code that had to assume that any function without a body/definition must be an intrinsic. Instead, with this change we can now use the presence of a suitable `IRTargetIntrinsicDecoration` as the indicator that a function is an intrinsic. This should make the emit logic a bit more robust. One wrinkle that remains with this change is the special handling of functions that get the name `operator[]` (that is, the `get`/`set`/etc. accessors for `__subscript` declarations). The logic no longer depends on the mangled name, but it still feels a bit gross to have this kind of string-based special case (especially since it is our main/only special case like this). Another wrinkle is that the C/C++ back-end logic for handling intrinsic functions largely mirrors the HLSL/GLSL case, but can't just use the same exact code because it has to be intercepted by the logic that generates some of the required functions on-demand. There's still a net cleanup with this change, but there is probably an opportunity to remove even more duplication down the road.
2019-10-24	Strip IR after front-end steps are done (#1092)	Tim Foley
	* Strip IR after front-end steps are done The main feature of this change is to unconditonally strip out the `IRHighLevelDeclDecoration`s in an IR module once the "mandatory" IR passes in the front end have run. This ensures that later IR passes (e.g., code emission) cannot rely on AST-level information to get their job done. Since I was already writing a pass to remove some instructions at the end of the front-end passes, I went ahead and also made the `-obfuscate` flag apply to the front-end IR generation by causing it to strip `IRNameHintDecoration`s while it is doing the other stripping. With this, the main identifying information left in IR modules (other than semantics and entry-point names) is mangled name strings for imported/exported symbols. A few other things got changes along the way: * Removed the `.expected` file for one of the tests, where that file seemingly shouldn't have been checked in at all. * Updated the signature of the DCE pass both so that it doesn't require a back-end compile request (it wasn't using it anyway), and so that it takes some options to decide whether to keep symbols marked `[export(...)]` alive (the front-end wants to keep these, while back-end passes currently need to be able to eliminate them). * Moved the `obfuscateCode` flag from the back-end compile request to the base class shared between front- and back-end requests, and updated the options and repro logic to set both as needed. An obvious improvement in the future would be to have the front- and back-end requests share these settings by referencing a single common object in the end-to-end case, rather than each having their own copy. * Removed logic that was keeping layout instructions alive in DCE, even if they weren't used. This seems to have been a vestige of an intermediate step between AST and IR layout. * fixup: add the new files
2019-10-22	User IR-based layout for all IR steps (#1084)	Tim Foley
	This change builds on previous work that moves toward a more IR-based representation of layout. Those steps added some instructions for representing layout in the IR (initially just proxies for the AST layout objects), and an explicit lowering pass that could build a target-specific IR module that binds parameters and entry points to layout information. This change aims to complete that work, in the sense that the IR representation of layout is now self-contained and does not rely on having pointers back into the AST-level representation. Achieving this requires two main kinds of work: 1. Update any code that used layout information derived from the IR (most notably all the `slang-emit-` code) to use the new IR representation and its accessors. 2. Update any code that constructs* layouts using information derived from the IR to construct IR layouts instead. The biggest new infrastructure feature in this change is support for "attributes" in the IR (I'd welcome feedback on the naming). An attribute can either be thought of like key/value arguments that can be added to certain instructions to encode optional data, or alternatively like a decoration that is referenced as an operand instead of a child. The value of attributes over decorations is that they can affect the hash/identity of an instruction (which decorations can't), while the advantage of decorations is that they can easily be added/removed over the lifetime of an instruction (which attributes can't). We mostly use them here to represent operands that are logically optional. Once attributes are available, the encoding of layout information into the IR is mostly straightforward: * An `IRVarLayout` has a fixed operand for its type layout, and can accept a few different attributes * Zero or more `IRVarOffsetAttr`s that specify the offset of the variable for a given resource kind. These are equivalent to the `VarLayout::ResourceInfo`s at the AST level. * An optional `IRUserSemanticAttr` and `IRSystemValueSemanticAttr` to represent the (possibly derived) semantic of a varying input/output parameter. * An option `IRStageAttr` to represent the known stage for a parameter. * An `IREntryPointLayout` has a var layout for the entry point parameters (logically grouped in to a struct) and another var layout for the result parameter. * There is a small type hierarchy rooted at `IRTypeLayout` where each subtype can add fixed operands and attributes that are expected to appear. It also supports `IRTypeSizeAttr`s that serve a similar role to the `IRVarOffsetAttr`s. * Structure types maintain the mapping of fields to their var layouts using `IRStructFieldLayoutAttr`s. With the encoding in place, most of the changes in category (1) (code that just uses rather than creates layouts) was straightforward. The biggest different beyond name changes was that everything needs to be fetched using accessors instead of bare fields. It would have been possible to stage this commit and make the diffs smaller by first introducing mandatory acessors to the AST layout types. The changes in category (2) were more involved. There were a lot of places in the existing code where a `TypeLayout` or `VarLayout` would be created, and then initialized piecemeal over several lines of code (and sometimes even across functions). Because of the way that layouts need to support many optional properties, it did not seem practical to just have monolithic factory functions that took all the options as arguments, so I instead opted for a builder approach. The builders for `IRVarLayout` and `IREntryPointLayout` are both straightforward, and honestly there is no realy need for a builder for entry point layouts right now, but I was trying to future-proof in case we decidd to add some optional attributes to them. The builders for type layouts are more involved because of the inheritance hierarchy. Each concrete sub-type of type layout needs to define its own builder type that customizes the opcode, operands, and attributes of the final instruction. The refactoring that had to go into this change was a nice excuse to clean up a few ugly warts in the AST layout code that were largely there to support IR use cases. While this change adds a lot of new infrastructure code to the IR, most of the client code has stayed the same or gotten simpler. One annoying wart that remains with this change is the notion of an "offset element type layout" for parameter group types. That idea was added to deal with a legacy feature in the reflection API that we realized was a mistake, but unfortunately having that "offset" layout handy made writing a few other pieces of code simpler so that there are use cases of the feature even in the IR. Removing those uses is do-able, but requires careful refactoring so it is best left to a follow-on change. Another thing that could be considered for a follow-on change is how much information should be specified when constructing a `Builder` for an IR type layout, and how much should be allowed to be specified statefully/piecemeal. It would be nice to force all the required operands to be specified up front, but `IRParameterGroupTypeLayout::Builder` doesn't currently work that way because so much of the client code that needs it involved a lot of stateful setting and would need to be refactored heavily to provide the necessary information up front.
2019-10-17	Initial work on representing layout at IR level (#1079)	Tim Foley
	* Initial work on representing layout at IR level This change starts the process of making the back-end of the compiler independent of the AST-level layout information (`TypeLayout`, `VarLayout`, etc.) so that it instead only relies on layout information that is embedded into IR modules. This brings us incrementally closer to a world in which the back-end could be run without the AST-level structures even existing (e.g., for an application that just wants to ship IR without any AST information for IP protection, while still supporting some amount of linking and specialization). The main parts of the change are: * There is a bunch of incidental churn related to specifying entry points by index instead of the `EntryPoint` object for certain operations. This ends up being a better choice because we can use the index to look up side-band information about the entry point that might not be stored on the `EntryPoint` object itself. In particular... * We expand the `ComponentType` interface to support looking up the mangled name of an entry point by index. In common cases (no generic/interface specialization) this would be the same as asking the `EntryPoint` for its mangled name, but in cases where we have specialized a generic entry point, the mangled name would include speicalization arguments that are only available on the `SpecializedComponentType` that wraps the entry point. This part of the change isn't ideal and there might be a better solution waiting to be invented. Note that we store mangled entry point names as strings rather than using `DeclRef`s because that ensures that the information could be serialized and deserialized without a dependence on the AST. * The `TargetProgram` type (which represents binding a specific `ComponentType` for a shader program to a specific `TargetRequest` that represents the target platform) is expanded to include an `IRModule` that represents layout information, in addition to the AST-level `ProgramLayout` it already contained. We create both of these objects at the same time (on-demand) to simplify the overall flow (so that any code that triggers creation of the AST-level layout will also ensure that the IR-level layout exists). * A bunch of code in the emit passes that was passing down layout-related objects has been eliminated. It appears that most of those objects weren't actually being used, so this is just a cleanup, but it helps ensure that the back-end steps are "clean" and don't depend on the AST-level information. The one big exception here is that the emit logic needs to know the stage for the entry point being emitted (to deal with one wrinkle in translating DXR to VKRT). * A big change (actually introduced by @jsmall-nvidia in a branch that this change copied and then built from) is to introduce some more explicit IR instructions to represent layout information, notably an `IRTypeLayout` and an `IRVarLayout`. For now these objects still reference their AST equivalents, but the separation gives us an incremental path to move information from the AST-level objects over to the IR ones. This work includes logic in `IRBuilder` to construct the IR-level layout objects from the AST-level ones on-demand, so that the existing code paths that try to attach AST-level layout will continue to work for now. * Because layout information is now embedded in the IR, the `slang-ir-link.cpp` logic loses a lot of cases that used to deal with attaching AST-level layout objects to IR-level instructions during the linking process. Instead, the linker now assumes that one (or more) of the input IR modules will have layout information associated with it, and the linker makes sure to copy layout decorations (and the instructions they reference) from the input IR module(s) to the output using its more ordinary mechanisms. * Inside `slang-lower-to-ir.cpp`, we add logic to construct an IR module in a `TargetProgram` that simply references the global shader parameters, entry points, etc. and attaches IR layout decorations to them. This is akin to the existing pass in the same file that constructs IR to represent specialization information, and both of these passes share infrastructure with the main AST->IR lowering pass. Eventually, it is expected that this pass will encompass more of the logic for copying AST-level layout information over to IR-level equivalents. * One small wrinkle with this change was that the output for an HLSL generation test case changed some of its `#line` directives. The old code was actually more inaccurate than the new, so this change just updated the baseline. It also added some logic in the linker to make sure that when an IR instruction has multiple definitions, we try to pick up a source location from any of them, in case the "main" one somehow didn't get a location. * Another small fix was that the key/value map in `StructTypeLayout` for mapping fields/members to their layouts was keyed on `Decl` when it really should have been `VarDeclBase`. This change should in principle be a pure refactoring with no functionality changes, so no new tests were added. It is unfortunately also a change that has a high probability of breaking at least some client code, so we may want to be defensive and mark this with a new major version number (well, a new minor version number since we are pre-`1.0`) to give us some room for releasing hotfixes to the old version if needed. * fixup: infinite recursion bug detected by clang * fixup: remove commented-out code
2019-10-09	Feature/decor entry point name (#1073)	jsmall-nvidia
	* Use name hint on EntryPoint naming. * Placed the entry point name on the EntryPointDecoration.
2019-10-08	Fixed from Review of Entry Point decoration #1068 (#1072)	jsmall-nvidia
	* Remove typo around GeometryPrimitiveTypeDecoration * * GeometryPrimitiveTypeDecoration -> GeometryInputPrimitiveTypeDecoration (to try and closer match meaning and the Modifier name) * Remove a small problem around definition of IRGeometryPrimitiveTypeDecoration * Fix comment around IRStreamOutputTypeDecoration
2019-10-08	Feature/ir entry point profile (#1068)	jsmall-nvidia
	* Split out EntryPointParamDecoration. * Add profile to EntryPointDecoration. * WIP for GS handling for GLSL. * WIP for StreamOut GLSL * Fixed GLSL geometry output. * Clean up - remove unneeded/commented out code from the entry point change. * Use Op nums to identify GeometryTypeDecorations (as opposed to contained enum).
2019-10-04	IR types for subset of Attributes (#1067)	jsmall-nvidia
	* IROutputControlPointsDecoration * IROutputTopologyDecoration * IRPartitioningDecoration * IRDomainDecoration * Use IRPatchConstantDecoration alone for hlsl output. * IRMaxVertexCountDecoration * IRInstanceDecoration * Removed _emitHLSLAttributeSingleString and _emitHLSLAttributeSingleInt Removed GLSLBindingAttribute and just use NumThreadsAttribute * Added IRNumThreadsDecoration. * Added IRNumThreadsDecoration * Fix build problem on x86. Improve diagnostic text based on review.
2019-09-18	Clean up some behavior of operator% (#1060)	Tim Foley
	Work on #1059 The `%` operator in the Slang implementation had several issues, and this change tries to address some of them: * Renamed most occurences of "mod" describing this operator to be "rem" for "remainder" to better match its semantics in HLSL * Split the operator into distinct integer and floating-point variants (`IRem` and `FRem`) to simplify having different codegen for the two * Added floating-point variants of `operator%` and `operator%=` to the stdlib. * Added custom C++ codegen for `kIROp_FRem` such that it maps to the standard C/C++ `remainder()` function * Added custom GLSL codegen so that `kIROp_FRem` maps to the GLSL `mod()` function (which isn't correct...) * Added a test case to confirm that D3D11, D3D12, and CPU targets all agree on the definition of floating-point `%` * Fixed `render-test-tool` to allow a negative integer in a `data=...` specification. This didn't end up being used in the final test, but still seems like a good fix. * Added a customized baseline for the Vulkan flavor of that test to confirm that we are not compiling correctly to SPIR-V just yet Addressing the correctness of the output for GLSL/SPIR-V will have to come as a later change given that the operation we want is not exposed directly by unextended GLSL.
2019-08-08	Revise new COM-lite API (#1007)	Tim Foley
	* Revise new COM-lite API This change revises the "COM-lite" API that was recently introduced to try to streamline it and introduce some missing central/base concepts. The central new abstraction in the API is the notion of a "component type," which is a unit of shader code composition. A component type can have: * IR code for some number of functions/types/etc. * Zero or more global shader parameters * Zero or more "entry point" functions at which execution can start * Zero or more "specialization" parameters (types or values that must be filled in before kernel code can be generated) * Zero or more "requirements" (dependencies on other component types that must be satisfied before kernel code can be generated) Both individual compiled modules, and validated entry points are then examples of component types, and we additionally define a few services that apply to all component types: * We can take N component types and compose them to create a new component type that combines their code, shader parameters, entry points, and specialization parameters. A composed component type may also include requirements from the sub-component types, but it is also possible that by composing thing we satisfy requirements (if `A` requires `B`, and we compose `A` and `B`, then the requirement is now satisfied, and doesn't appear on the composite). * We can take a component type with N specialization parameters, and specialize it by giving N compatible specialization arguments. The result of specialization is a new component type with zero specialization parameters. Under the right circumstances the specialzed component type will be layout compatible with the unspecialized one. * One more example that isn't exposed in the public API today is that we can take a component with requirements and "complete" it by automatically composing it with component types that satisfy those requirements. This can be seen as a kind of linking step that pulls together the transitive closure of dependencies. * We can query the layout for the shader parameters and entry points of a component type, for a specific target. * We can query compiled kernel code for an entry point in a component type (for a specific target). This only works for component types with zero specialization parameters and zero requirements. The idea is that by giving users a fairly general algebra of operations on component types, they can compose final programs in ways that meet their requirements. For example, it becomes possible to incrementally "grow" a component type to represent the global root signature for ray tracing shaders as new entry points are added, in such a way that it always stays layout-compatible with kernels that have already been compiled. Much of the implementation work here is in implementing the unifying component type abstraction, and in particular re-writing code that used to assume a program consisted of a flat list of modules and entry points to work with a hierarchical representation that reflects the underlying algebra (e.g., with types to represent composite and specialized component types). There's also a hidden "legacy" case of a component type to deal with some legacy compiler behaviors that can't be directly modeled on top of the simple algebra with modules and entry points. This API is by no means feature-complete or fully developed. It is expected that we will flesh it out more when bringing up application code (e.g., Falcor) on top of the revamped API. One notable thing that went away in this change is explicit support for "entry point groups" and notions of local root signatures (especially the Falcor-specific handling of the `shared` keyword, which a previous change turned into an explicitly supported feature). With the new "building blocks" approach, it should be possible for a DXR application to deal with local root signatures as a matter of policy (on top of the API we provide). If/when we need to provide some kind of emulation of local root signatures for Vulkan (and/or if Vulkan is extended with an explicit notion of local root signatures), we might need to revisit this choice. * Fix debug build There was invalid code inside an `assert()`, so the release build didn't catch it. * fixup: warnings * fixup: more warnings-as-errors * fixup: review notes * fixup: use component type visitors in place of dynamic casting
2019-07-18	Add back a notion of IR global constants (#1002)	Tim Foley
	This change adds back a little bit of explicit support for global constants in the IR, after a previous change completely removed the existing `IRGlobalConstant` node type. The new `IRGlobalConstant` is not a parent instruction, and doesn't function at all like the old one. Instead it is effectively a simple instruction that takes zero or one operands: * The zero-operand case represents a constant with unknown value. This would usually come from another module, and thus would have an `[import(...)]` linkage decoration, so that after linking it resolves to a constant with a known value. * In the one-operand case, the single operand represents the value of the constant, so that the operation semantically behaves like an identity function. It exists just to give decorations something to "attach" to, so that a global constant with a value can have, e.g., an `[export(...)]` decoration to establish linkage. The IR lowering pass was updated to create the new node type to wrap any global constants. For now we do this both for global `static const` variables and function-scope `static const`, although the latter doesn't really need the extra indirection. The IR linking logic was extended to handle linking of global constants akin to how other global instructions are handled. The new logic is mostly boilerplate, and it is likely that a refactor of the linking logic would eliminate the need for this kind of per-instruction-opcode handling of IR instructions that can have linkage. A custom pass was added that is intended to be run right after linking (it could arguably be folded into `linkIR()`, but I thought it was safer to keep each pass as small as possible). This pass replaces any `IRGlobalConstant` that has a value (operand) with that value, so that global constants should be eliminated after the linking step. This ensures that downstream optimization/transformation passes don't have to deal with the possibility of global constants. Almost all the existing passes would Just Work if global constants were left in the IR. The two big exceptions are: * Anything that relies on testing `IRInst` identity as a way to test for things having the same value would break, since a global constant is a distinct `IRInst` from its value. * The type legalization pass doesn't handle `IRGlobalConstant` instructions with non-simple types. This could be added if we ever wanted it, but it seemed silly to write this code now if it would always be dead (and thus untested). I went ahead and updated the emit logic to handle an `IRGlobalConstant`s that still existing in the IR module at emit time, since the amount of code required was small so that being robust to that case seemed safest (e.g., in case we ever want to have a path that emits code directly while skipping some/all of our IR transformation passes). There should be no visible changes to the functionality of the compiler with this change, but it should help make IR dumps from the front-end more clear/explicit (since each constant will be a distinct instruction with its own name), and paves the way for supporting proper cross-module linkage of constants.
2019-07-17	Change how global-scope constants are handled (#1001)	Tim Foley
	Before this change, global and function-scope `static const` declarations were represented as instructions of type `IRGlobalConstant`, which was represented similarly to an `IRGlobalVar`: with a "body" block of instructions that compute/return the initial value. This representation inhibited optimizations (because a reference to a global constant would not in general be replaced with a reference to its value), and also caused problems for resource type legalization because the logic for type legalization did not (and still does not) handle initializers on globals (so global variables that contain resource types are still unsupported). The change here is simple at the high level: we get rid of `IRGlobalConstant` and instead handle global-scope constants as "ordinary" instructions at the global scope. E.g., if we have a declaration like: static const int a[] = { ... } that will be represented in the IR as a `makeArray` instruction at the global scope, referencing other global-scope instructions that represent the values in the array. This simple choice addresses both of the main limitations. A `static const` variable of integer/float/whatever type is now represented as just a reference to the given IR value and thus enables all the same optimizations. When a `static const` variable uses a type with resources, the existing legalization logic (which can handle most of the "ordinary" instructions already) applies. Another secondary benefit of this approach is that the hacky `IREmitMode` enumeration is no longer needed to help us special-case source code emit for `static const` variables. Beyond just removing `IRGlobalConstant`, and updating the lowering logic to use the initializer direclty, the main change here is to the emit logic to make it properly handle "ordinary" instructions that might appear at global scope. One open issue with this change, that could be addressed in a follow-up change, is that "extern" global constants that need to be imported from another module (but which might not have a known value when the current module is compiled) aren't supported - we don't have a way to put a linkage decoration on them. A future change might re-introduce global constants as a distinct IR instruction type that just references the value as an operand (if it is available). We would then need to replace references to an IR constant with references to its value right after linking.