slang.git - Making it easier to work with shaders

	Commit message (Collapse)	Author	Age
*	Use slang- prefix on slang compiler and core source (#973)	jsmall-nvidia	2019-05-31
\| \| \| \| \| \| \| \| \| \| \| \|	* Prefixing source files in source/slang with slang- * Prefix source in source/slang with slang- prefix. * Rename core source files with slang- prefix. * Update project files. * Fix problems from automatic merge.
*	String/List closer to conventions, and use Index type (#959)	jsmall-nvidia	2019-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* List made members m_ Tweaked types to closer match conventions. * Use asserts for checking conditions on List. Other small improvements. * List<T>.Count() -> getSize() * List<T> Add -> add First -> getFirst Last -> getLast RemoveLast -> removeLast ReleaseBuffer -> detachBuffer GetArrayView -> getArrayView * List<T>:: AddRange -> addRange Capacity -> getCapacity Insert -> insert InsertRange -> insertRange AddRange -> addRange RemoveRange -> removeRange RemoveAt -> removeAt Remove -> remove Reverse -> reverse FastRemove -> fastRemove FastRemoveAt -> fastRemoveAt Clear -> clear * List<T> FreeBuffer -> _deallocateBuffer Free -> clearAndDeallocate SwapWith -> swapWith * List<T> SetSize -> setSize Reserve -> reserve GrowToSize growToSize * UnsafeShrinkToSize -> unsafeShrinkToSize Compress -> compress FindLast -> findLastIndex FindLast -> findLastIndex Simplify Contains * List<T> Removed m_allocator (wasn't used) Swap -> swapElements Sort -> sort Contains -> contains ForEach -> forEach QuickSort -> quickSort InsertionSort -> insertionSort BinarySearch -> binarySearch Max -> calcMax Min -> calcMin * Initializer::Initialize -> initialize List<T>:: Allocate -> _allocate Init -> _init IndexOf -> indexOf * * Put #include <assert.h> in common.h, and remove unneeded inclusions * Small refactor of ArrayView - remove stride as not used * getSize -> getCount setSize -> setCount unsafeShrinkToSize->unsafeShrinkToCount growToSize -> growToCount m_size -> m_count * Some tidy up around Allocator. * Use Index type on List. * Refactor of IntSet. First tentative look at using Index. * Made Index an Int Did preliminary fixes. Made String use Index. * Partial refactor of String. * String::Buffer -> getBuffer ToWString -> toWString * Small improvements to String. String:: Buffer() -> getBuffer() Equals() -> equals * Try to use Index where appropriate. * Fix warnings on windows x86 builds.
*	Initial support for the `precise` keyword (#958)	Tim Foley	2019-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes #858 The `precise` keyword exists in both HLSL and GLSL and when applied to a variable declaration is supposed to indicate that all computations that contribute to the value of that variable should not be altered based on "fast-math" optimizations. The main examples are that separate multiply and add operations should not be turned into fused multiply-add (fma) operations, and that operations cannot ignore the possibility of infinity or not-a-number values (e.g., by assuming that `x * 0.0f` is always `0.0f`). (Aside: it is possible that my understanding of what the semantics of `precise` are in HLSL and GLSL is imperfect so that either the GLSL variant isn't sufficient to provide the semantics of the HLSL keyword, or that the definition of "all computations that contribute" to a value isn't actually correct. We may need to revise this implementation based on subsequent learnings.) The basic idea here is to turn the AST `precise` keyword into a `[precise]` decoration in the IR and then emit that as a `precise` keyword again in the output. The main catch is that whereas most of our existing IR decorations apply to things like global shader parameters or `struct` members that usually stick around for the duration of compilation, `[precise]` will get slapped on local variables that will often get optimized away by our SSA pass. There are two ways a variable can get eliminated/replaced during the SSA pass: 1. A use of the variable can be replaced with an ordinary instruction that computes its value. 2. A use of the variable can be replaced with a reference to a "phi node" that will take on the appropriate value based on control flow. These two cases already had logic to copy a "name hint" decoration from the variable over to an instruction that will replace it, and I simply extended them to also propagate over a `[precise]` decoration. The test case added with this change intentionally constructs a case where `[precise]` needs to be propagated over to an SSA "phi node" in order to generate correct output code. The other gotcha is that we can emit variable declarations in various places in `emit.cpp`, and all of these needed to handle `[precise]`. Not only do we have actually local variables (`IRVar`), but we also have SSA phi nodes (`IRParam`), and then there are cases where an intermediate computation (an ordinary instruction) should be `[precise]` and thus we need to emit it as a temporary (not folding it into its use sites) and make sure that the temporary itself gets the `precise` keyword. I have manually confirmed that in the output SPIR-V, this change results in the `NoContraction` SPIR-V decoration being added to the relevant operations, and the output DXBC contains a multiply and an add in place of a multiply-add. The output DXIL does not show any obvious changes due to `precise`, although the exact order and operands of the math instructions emitted does differ when `precise` is added/removed. In all cases the output is equivalent to hand-written HLSL/GLSL with a `precise`-qualified local variable.
*	* Add support for enum and type lookup via :: (scope operator) (#882)	jsmall-nvidia	2019-03-06
\| \| \|	* Added test for scope operator
*	Hotfix/crash invalid vk binding (#875)	jsmall-nvidia	2019-03-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add diagnostic for vk::binding failure. * Add test for vk::binding failure. * Add the expected output for glsl-layout-define.hlsl * * Copy over initialize expr if available when validating unchecked * Fix unloop - because now it always has one parameter (when before it could have none) * Split vk::binding and layout tests with invalid parameters * Removed the diagnostic for 2 ints expected * Added vk::binding that doesn't specify set in vk-bindings.slang * * Fix typo * Improve comments.
*	* Remove typo in TryRecover (#863)	jsmall-nvidia	2019-02-27
\| \| \| \|	* Remove GLSL specific check - as no longer support GLSL * Remove erroneous check for { on parseClass
*	Split front- and back-ends (#846)	Tim Foley	2019-02-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Split front- and back-ends This change is a major refactor of several of the types that provide the behind-the-scenes implementation of the public C API. The goal of this refactor is primarily to allow for future API services that let the user operate both the front- and back-ends of the compiler in a more complex fashion. For example, as user should be able to compile a bunch of source code into modules, look up types, functions, etc. in those modules, specialize generic types/functions to the types they've looked up, and then finally request target code to be gernerated for specialized entry points. The back-end code generation they trigger should re-use the front-end compilation work (parsing, semantic checking, IR generation) that was already performed. The most visible change is that `CompileRequest` has been split up into several smaller types that take responsibility for parts of what it did: * The `Linkage` type owns the storage for `import`ed modules, and well as the `TargetRequest`s that represent code-generation targets. The intention is that an application could use a single `Linkage` for the duration of its runtime (so long as it was okay with the memory usage), so that each `import`ed module only gets loaded once. For now, this type needs to manage the search paths, file system, and source manager, because of its responsibility for loading files. * A `FrontEndCompileRequest` owns the stuff related to parsing, semantic checking, and initial IR generation. This most notably includes the `TranslationUnitRequest`s and the `FrontEndEntryPointRequest`s (which used to be just `EntryPointRequest`s). It's main job is to produce AST and IR modules for each translation unit, and to find and validate the entry points. The front-end request does not interact with generic arguments for global or entry-point generic parameters. * The main output of both `import` operations and front-end translation units is the `Module` type, which is just a simple container for both the AST module (to service the reflection/layout APIs, and also for semantic checking of code that `import`s the module) and the IR module (for linking and code generation). This type captures the commonalities between the old `LoadedModule` (which is now just an alias for `Module`) and `TranslationUnitRequest` (which now owns a `Module`). * The secondary output of front-end compilation is a `Program`, which comprises a list of referenced `Module`s and validated `EntryPoint`s that will be used together. Layout and code generation both need a `Program` to tell them what modules and entry points will be used together (we don't want to just code-gen everythin that has ever been loaded into the linakge). The `Program`s created by the front-end do not include generic arguments, so they may provide incomplete layout information and/or be unsuitable for code generation. * A `BackEndCompileRequest` owns stuff related to turning a `Program` into output kernels for the targets of a `Linkage`. Most of the data it owns beyond the `Program` to be compiled is minor, so this is a good candidate for demotion from a heap-allocated object to just a `struct` of options that gets passed around. * The `CompileRequestBase` type is an attempt to wrap up the common functionality of both front-end and back-end compile requests. Most of it is just exposing the availability of a linkage and `DiagnosticSink`, so this type is a good candidate for subsequent removal. The main interesting thing it has is the flags related to dumping and validation of IR, so there is probably a good refactoring still to be made around deciding how options should be handled going forward. * Behind the scenes, the `Program` type is set up to handle some level of on-line compilation and layout work. The `Program` knows the `Linkage` it belongs to, and allows for a `TargetProgram` to be looked up based on a specific `TargetRequest`. A `TargetProgram` then allows layout information and compiled kernel code to be asked for on-demand, in order to support eventual "live" compilation scenarios. * The `EndToEndCompileRequest` type is a composition/coordination type that replaces the old `CompileRequest` in a way that uses the services of the various other types. It owns a few pieces of state that only make sense in the context of an end-to-end compile (e.g., there is really no way to "pass through" code when the front- and back-ends are run separately) or a command-line compile (everything to do with specifying output paths for files is really just for the benefit of `slangc`, and might even be moved there over time). * One important detail is that the `EndToEndCompilRequest` owns all of the string-based generic arguments for both global and entry-point generic parameters. The logic in `check.cpp` for dealing with those arguments has been heavily refactored to separate out the parsings steps that are specific to end-to-end compilation with string-based type arguments, and the semantic checking steps that result in a specialized `Program` (which can be exposed through new APIs that aren't tied to end-to-end compilation). It is perhaps not surprising that this change had a lot of consequences, so I'll briefly run over some of the main categories of changes required: * I changed the way that global generic arguments are passed via API (use `spSetGlobalGenericArgs` instead of the generic arguments for `spAddEntryPointEx`, which are not just for entry-point generics), which has been a change that we've needed for a long time. This is technically a breaking API change, although we should have very few client applications that care about it. * A bunch of places that used to take "big" objects like `CompileRequest` now just take the sub-pieces they care about (e.g., a function might have only needed a `Linkage` and a `DiagnosticSink`). This makes many subroutines or "context" struct types more generally useful, at the cost of taking more parameters. * In a few cases the conceptually clean separation of the layers breaks down (often for edge-case or compatibility features), and so we may pass along additional objects that are allowed to be null, but are used when present. A big example of this is how the back-end code generation routines accept an `EndToEndCompileRequest` that is optional, and only used to check whether "pass through" compilation is needed. We should probably look into cleaning this kind of logic up over time so that we don't need to violate the apparent separation of phases of compilation. * In cases where separation of layers was being broken for the sake of GLSL features, I went ahead and ripped them out, since all of that should be dead code anyway. * In many cases I increased the encapsulation of data in the core types to help track down use sites and make sure they are following invariants better. * In cases where code was doing, e.g., `context->shared->compileRequest->session->getThing()` I have tried to introduce convenience routines so that the usage site is just `context->getThing()` to improve encapsulation and allow changes to be made more easily going forward. * The `noteInternalErrorLoc` functionality was moved off of the compile request and into `DiagnosticSink`, since that is the one type you can rely on having around when you want to note an internal error. We may consider going forward if (and how) it should reset the counter used for noting locations on internal errors. * A few APIs now take `DiagnosticSink` arguments where they didn't before, and as a result some public APIs need to create `DiagnosticSink`s to pass in, before going ahead and ignoring the messages. In the future there should be variations of these APIs that accept an `ISlangBlob` parameter for the output. fixup: missing include for compilers with accurate template checking (non-VS) * fixup: review feedback
*	* Add cross compile test (#849)	jsmall-nvidia	2019-02-14
\| \| \|	* Add intrinsic for StructuredBuffer.Load
*	[[vk::shader_record]] (#836)	jsmall-nvidia	2019-02-11
\| \| \| \| \| \| \| \| \| \|	* * Replaced ShaderRecordNVLayoutModifier with ShaderRecordAttribute * Allowed attributed [[vk::shader_record] and [[shader_record]] * Checking there is at most 1 ShaderRecord active * Small typo fixes * Slightly improve diagnostic. Replace expected file.
*	Merge branch 'master' into gencloser	Yong He	2019-02-05
\|\
\| *	Allow entry points to have explicit generic parameters (#826)	Tim Foley	2019-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Allow entry points to have explicit generic parameters Prior to this change, the Slang implementation required users to use global `type_param` declarations in order to specialize a full shader. For example: ```hlsl type_param L : ILight; ParameterBlock<L> gLight; [shader("fragment")] float4 fs(...) { ... gLight.doSomething() ... } ``` With this change we can rewrite code like the above using explicit generics, plus the ability to have `uniform` entry-point parameters: ```hlsl [shader("fragment")] float4 fs<L : ILight>( uniform ParameterBlock<L> light, ...) { ... light.doSomething() ... } ``` Having this support in place should make it possible for us to eliminate global generic type parameters and the complications they cause (both at a conceptual and implementation level). The most central and visible piece of the change is that `EntryPointRequest` now holds a `DeclRef<FuncDecl>` instead of just ` RefPtr<FuncDecl>`, which allows it to refer to a specialization of a generic function. Various places in the code that refer to the `EntryPointRequest::decl` member now use a `getFuncDecl()` or `getFuncDeclRef()` method as appropriate (see `compiler.h`). In order to fill in the new data, the `findAndValidateEntryPoint` function has been greaterly overhauled. The changes to its operation include: * The by-name lookup step for the entry point function has been adapted to accept either a function or a generic function. * The generic argument strings provided by API or command line are no longer parsed all the way to `Type`s, but instead just to `Expr`s in the first pass. * There are now two cases for checking the global generic arguments against their matching parameters. The first case is the new one, where we plug the generic argument `Expr`s into the explicit generic parameters of an entry point (that case re-uses existing semantic checking logic). The second case is the pre-existing code for dealing with global generic type arguments. The `lower-to-ir.cpp` logic for hadling entry points then had to be extended. Making it deal with a full `DeclRef` instead of just a `Decl` was the easy part (just call `emitDeclRef` instead of `ensureDecl`). The more interesting bits were: * We need to carefully add the `IREntryPointDecoration` to the nested function and not the generic in the case where we have a generic entry point. There is a handy `getResolvedInstForDecorations` that can extract the return value for an IR generic so that we can decorate the right hting. * We need to make sure that in the case where we emit a `specialize` instruction (which normally wouldn't get a linkage decoration), we attach an `[export(...)]` decoration to it with the mangled name of the decl-ref, so that it can be found during the linking step. The IR linking step is then slightly more complicated because the mangled entry point name could either refer directly to an `IRFunc` or to a `specialize` instruction for a generic entry point. The logic was refactored to first clone the entry point symbol without concern for which case it is (the old code was specific to functions), and then if the result is a `specialize` instruction, we attempt to run generic specialization on-demand. That on-demand specialization is a bit of a kludge, but it deals with the fact that all the downstream passing only expect to see an `IRFunc`. A future cleanup might try to split out that specialization step into its own pass, which ends up being a limited form of the specialization pass. Since I was already having to touch a lot of the code around IR linking, I went ahead and refactored the signature of the operations. I eliminated the need for the caller to create, pass in, and then destroy an `IRSpecializationState` (really an IR linking state), and replaced it with a structure local to the pass (that data structure was a remnant of an older approach in the compiler), and then also renamed the main operation to `linkIR` to reflect what it is doing in our conceptual flow. Smaller changes made along the way include: * Refactored `visitGenericAppExpr` to create a subroutine `checkGenericAppWithCheckedArgs` so that it can be used by the entry-point validation logic described above). * Refactored the declarations around the IR passes in `emitEntryPoint()` (`emit.cpp`), to show that things are more self-contained than they used to be (e.g., that the `TypeLegalizationContext` is now only needed by one pass). * Refactored the generic specialization code so that there is a stand-along free function that can perform specialization on a `specialize` instruction without all the other context being required. This is only to support the limited specialization that needs to be done as part of linking. * Updated the `global-type-param.slang` test to actually test entry-point generic parameters. In a later pass we can/should rework all the tests/examples for global type parameters over to use explicit entry-point generic parameters (at which point we should rename the tests as well). For now I am leaving thigns with just one test case, with the expectation that bugs will be found and ironed out as we expand to more tests. * fixup * Fixup: don't leave entry-point decorations on stuff we don't want to keep The IR `[entryPoint]` decoration is effectively a "keep this alive" decoration, which means that attaching it to something we don't intend to keep around can lead to Bad Things. The approach to generic entry points was attaching `[entryPoint]` to the underlying `IRFunc` because that seemed to make sense, but that meant that the `specialize` instruction at global scope scould instantiate that generic and then keep it alive, even if the resulting function wouldn't be valid according to the language rules. As a quick fix, I'm attaching `[entryPoint]` to the `specialize` instruction instead in such cases, and then re-attaching it to the result of explicit specialization during linking. * Port most of remaining test and rename global type parameters This change ports as many as possible of the existing tests for global type parameters over to use entry-point generic parameters instead. For the most part this is a mechanical change. A few test cases remain using global generic parameters, as does the `model-viewer` example application. The reason for this is that the shaders have either or both the following features: * A vertex and fragment shader that can/shold agree on their parameters * A type declaration (e.g., a `struct`) that is dependent on one of the generic type parameters In these cases, it would really only make sense to switch to explicit parameters once we support shader entry points nested inside of a `struct` type, so that we can use an outer generic `struct` as a mechanism to scope the entry points and other type-dependent declrations. Since global-scope type parameters need to persist for at least a bit longer, I went ahead and renamed all the use sites over to use `type_param` for consistency.
* \|	Allow generics to close with >>	Yong He	2019-02-05
\|/
*	Feature/casting tidyup (#822)	jsmall-nvidia	2019-02-04
\| \| \| \| \| \| \| \| \| \|	* Use 'is' over 'as' where appropriate. * dynamic_cast -> dynamicCast * Replace 'dynamicCast' with 'as' where has no change in behavior/ambiguity. * Replace dynamicCast with as where doesn't change behavior/non ambiguous.
*	Feature/as refactor review (#821)	jsmall-nvidia	2019-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Replace dynamicCast with as where does not change behavior (ie not Type derived). Use free function where scoping is clear. * Replace uses of dynamicCast with as when there is no difference in behavior. * Remove the IsXXXX methods from Type. * Don't have separate smart pointer to store canonicalType on Type. * Simplify Slang.FilteredMemberRefList.Adjust, such does the cast directly. * Use free as where appropriate. * Use free function version of casts where appropriate. * Fix text in casting.md * Fix typos in decl-refs.md * Remove the uses of free function as on RefDecl. Add 'canAs' to RefDecl as a way to test if a cast is possible. Moved 'as' into RefDeclBase. * Use 'is' to test for as cast on smart pointers. Fix small scope issue. * * Cache stringType and enumTypeType on the Session * Make DeclRefType::Create return a RefPtr * Make casting of result use the method .as (cos using free function would mean objects being wrongly destroyed) * Make results from createInstance ref'd to avoid possible leaks. * Fix typo in template parameter for is on RefPtr.
*	Feature/as refactor (#817)	jsmall-nvidia	2019-01-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Made dynamicCast a free function. * Replace As with as or dynamicCast depending on if it is a type. * Fix problem with using non smart pointer cast. * Removed legacy asXXXX methods. * Remove As from Type. * Removed As from Qual type -> made coercable into Type, such that can just use free 'as'. Remove left over QualType::As() impl. * Remove As from SyntaxNodeBase. * Made as for instructions implemented by dynamicCast. * Replace As on DeclRef. Use the global as<> to do the cast. * Add const safe versions of dynamicCast and as for IRInst
*	Fix parsing of modifiers on generic declarations (#816)	Tim Foley	2019-01-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The way we handle generics in the AST is to have a `GenericDecl` AST node that can wrap an arbitrary "inner" `Decl`. This representation has proved valuable, but it also creates some gotchas because a `GenericDecl` is also a `Decl`. One example of where this can be tricky is for attributes and other modifiers attached to something generic: ```hlsl [special] void foo<T>(...) { ... } ``` In this case, the current parser logic was gobbling up a list of `Modifier`s (including the `[special]` attribute here), then parsing the declaration that follows, and then attaching those modifiers to the resulting declaration. In this case, however, the "resulting declaration" is not a `FuncDecl` as one might expect, but the `GenericDecl` that wraps it. The solution is pretty simple: when we are done parsing a declaration and are going to attach the modifiers we parsed to it, we should first "unwrap" any outer `GenericDecl` and apply the modifiers to the thing inside instead. This logic presumes that modifiers always want to apply to the inner declaration and not to the generic, an that seems reasonable for now. If we wanted we could add some special-case logic later in the compiler to implicitly "float" certain modifiers up to the generic, in cases where they are found to be inappropriate for the inner declaration.
*	Support "modern" declaration syntax as an option (#792)	Tim Foley	2019-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Support "modern" declaration syntax as an option Fixed #202 This change adds four new declaration keywords: The `let` and `var` keywords introduce immutable and mutable variables, respectively. They can only be used to declare a single variable at a time (unlike C declaration syntax), and they support inference of the variable's type from its initial-value expression. Examples: ``` let a : int = 1; // immutable with explicit type and initial-value expression let b = a + 1; // immutable, with type inferred var c : float; // mutable, with explicit type var d = b + c; // mutable, with type inferred ``` These declaration forms can be used wherever ordinary global, local, or member variable declarations appeared before. Right now they do not change rules about what is or is not considered a shader parameter. The `static` modifier should work on these forms as expected, but a `static let` variable is not the same as a `static const`, so an explicit `const` is still needed if you want that behavior. A `typealias` declaration introduces a named type alias, similar to `typedef`, but with more reasonable syntax. It inherits from the same AST class that `typedef` uses, so all of the code after parsing should be able to treat them as equivalent. To give a simple example: ``` // typedef int MyArray[3]; typealais MyArray = int[3]; ``` A `func` declaration introduces a function. Like `typealias` it re-uses the existing AST class, so there is no need for major changes after parsing. A `func` declaration uses a syntax similar to `let` variables for its parameters, and takes the (optional) result type in a trailing position. For example: ``` func myAdd(a: int, b: int) -> int { return a + b; } ``` If a `func` declaration leaves of the return type clause, the return type is assumed to be `void`. The main difference (beyond the trailing return type) is that the parameters of a `func`-declared function are immutable (unless they are `out`/`inout`). This change doesn't add support for declaring operator overloads with `func`, but that should be added later, and I'd like to make that the only way to declare such operations: ``` func +(left: MyType, right: MyType) -> MyType { ... } ``` The use of `:` for declaring parameter types here means that a function declared with modern syntax currently cannot include HLSL-style semantics on its parameters (or its result). We might consider introducing an `[attribute]`-based syntax for adding semantics to parameters if we think this is important, but for now it is fine to insist that users declare their entry points using traditional syntax. This change strives to avoid unecessary changes after parsing, but if the new syntax catches on with users there are some small ways we can take advantage of it for performance. In particular, since `let` declarations and parameters of modern-style functions are immutable, we do not need to generate read/write local temporaries for them during lowering to the IR (technically we can make the same optimization for `const` locals). In the process of implementing these new forms I also added a few subroutines to help share code better between existing cases in the parser. In particular, parsing of generic parameter lists on declarations that can be generic is now simplified and more unified. * Fixup: remove leftover debugging code * fixup: typos
*	Clean up variable declaration class hierarchy (#787)	Tim Foley	2019-01-22
\| \| \| \| \| \| \| \| \| \| \| \|	The AST class hierarchy for variable declarations had a few messy bits. First, there are more subclasses of `VarDeclBase` than seem strictly necessary; especially for stuff like `struct` member variable which use `StructField` even for `static` fields (which are effectively globals). Second, the AST node type for the "cases" within an `enum` was made a subclass of `VarDeclBase` for expediency, but this isn't really semantically accurate (and doesn't seem to be paying off much in deduplication of code). This change tries to address both of those problems. First, we replace the existing `Variable` and `StructField` cases with a single `VarDecl` case that covers globals, locals, and member variables. I haven't gone so far as to replace function parameters or generic value parameters, but that might be worth considering as a further clean-up. Second, we change `EnumCaseDecl` to inherit directly from `Decl` instead of `VarDeclBase` and add an explicit case for handling them where they were previously handled as if they were variable declarations (this was done by manually surveying all locations in the code that referenced `VarDeclBase`).
*	Initial support for dynamic dispatch using "tagged union" types (#772)	Tim Foley	2019-01-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Initial support for dynamic dispatch using "tagged union" types Suppose a user declares some generic shader code, like the following: ```hlsl interface IFrobnicator { ... } type_param T : IFrobincator; ParameterBlock<T : IFrobnicator> gFrobnicator; ... gFrobincator.frobnicate(value); ``` and then they have some concrete implementations of the required interface: ```hlsl struct A : IFrobnicator { ... } struct B : IFrobnicator { ... } ``` The current Slang compiler allows them to generate distinct compiled kernels for the case of `T=A` and the case of `T=B`. This means that the decision of which implementation to use must be made at or before the time when a shader gets bound in the application. This change adds a new ability where the Slang compiler can generate code to handle the case where `T` might be either `A` or `B`, and which case it is will be determined dynamically at runtime. This means a single compiled kernel can handle both cases, and the decision about which code path to run can be made any time before the shader executes. This new option is supported by defining a tagged union type. Via the API, the user specifies that `T` should be specialized to `__TaggedUnion(A,B)` (the double underscore indicates that this is an experimental and unsupported feature at present). We refer to the types `A` and `B` here as the "case" types of the tagged union. Conceptually, the compiler synthesizes a type something like: ```hlsl struct TU { union { A a; B b; } payload; uint tag; } ``` The user can then allocate a constant buffer to hold their tagged union type, and when they pick a concrete type to use (say `B`), they fill in the first `sizeof(B)` bytes of their buffer with data describing a `B` instance, and then set the `tag` field to the appopriate 0-based index of the case type they chose (in this case the `B` case gets the tag value `1`). Actually implementing tagged unions takes a few main steps: * Type parsing was extended to special-case `__TaggedUnion` as a contextual keyword. This is really only intended to be used when parsing types from the API or command-line, and Bad Things are likely to happen if a user ever puts it directly in their code. Eventually construction of tagged unions should be an API feature and not part of the language syntax. * Semantic checking was extended to recognize that a tagged union like `__TaggedUnion(A,B)` shoud support an interface like `IFrobnicator` whenever all of the case types suport it, as long as the interface is "safe" for use with tagged unions (which means it doesn't use a few of the advancd langauge features like associated types). * The IR was extended with instructions to represent tagged union types and to extract their tag and the payload for the different cases as needed. * IR generation was extended to synthesize implementations of interface methods for any interface that a tagged union needs to support. Right now the implementation is simplistic and only handles simple method requirements, which it does by emitting a `switch` instruction to pick between the different cases. * A new IR pass was introduced to "desugar" any tagged union types used in the code. The downstream HLSL and GLSL compilers don't support `union`s, so we have to instead emit a tagged union as a "bag of bits" and implement loading the data for particular cases from it manually. * Final code emit mostly Just Works after the above steps, but we had to introduce an explicit IR instruction for bit-casting to handle the output of the desugaring pass. There are a bunch of gaps and caveats in this implementation, but that seems reasonable for something that is an experimental feature. The various `TODO` comments and assertion failures in unimplemented cases are intended, so that this work can be checked in even if it isn't feature-complete. * fixup: missing files * fixup: typos
*	Feature/lex memory reduction (#762)	jsmall-nvidia	2018-12-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Only do scrubbing if needed. When allocating content try to limit size (with scrubbing each token takes up 1k), now it's 16 bytes min size. * Don't allocate for every call to write on the CallbackWriter - use the m_appendBuffer. * Don't allocate memory for CallbackWriter use m_appendBuffer. * Use UnownedStringSlice for suffix output for parsing float/int literals. Fix typo in invalidFloatingPointLiteralSuffix * Using memory arena to hold tokens that are not in SourceManager. * Improve comment on lexing. * Make UnownedStringSlice allocation simpler on SourceManager. * Fix error on gcc around UnownedStringSlice - because VC converted string + UnownedStringSlice automatically into a String. * Fix generateName needing concat string for gcc. * When constructing a Token in parseAttributeName - because it's a Identifier, we have to set the Name. * Remove translation through String on getIntrinsicOp * Make func-cbuffer-param disablable with -exclude compatibility-issue * Move memory leak in render-test. * From review - can just use "?:" instead of performing a concat.
*	Add support for Vulkan raytraicng "shader record" (#735)	Tim Foley	2018-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The syntax for this is a placeholder for now, since we will probably want to migrate to whatever gets decided on for dxc. To declare that some data should be part of the "shader record" use `layout(shaderRecordNV)` to mirror the GLSL raytracing extension: ```hlsl layout(shaderRecordNV) cbuffer MyShaderRecord { float4 someColor; uint someValue; } ``` The intention (not enforced) is that an application would map `MyShaderRecord` to "root constants" in the "local root signature" when compiling for DXR, while the output code in GLSL will always map to the shader record in Vulkan raytracing: ```glsl layout(shaderRecordNV) buffer MyShaderRecord { float4 someColor; uint someValue; }; ``` This change does not support declaring a global value of `struct` type with `layout(shaderRecordNV)` (or a `ParameterBlock` with the modifiers, although that would be a nice-to-have feature) and it does not support having the contents of the shader record be mutable (even if GLSL/Vulkan allows it). Those can/should be added in future changes. In terms of implementation, this closely mirrors the way that `layout(push_constant)` buffers were being handled, where the data inside the `ConstantBuffer<X>` (the value of type `X`) gets laid out using ordinary rules (and consuming ordinary `UNIFORM` storage, while the buffer itself is given a different layout resource to reflect that fact that it does not consume a VK `binding` any more, but a different conceptual resource. Note: an alternative design here (that might actually be preferrable) would be to have both push-constant and shader-record buffers be handled as alternative aliases for `ConstantBuffer` (or maybe `ParameterBlock`) so that you have, e.g.: ```hlsl PushConstantBuffer<X> myPushConstants; ShaderRecord<Y> myShaderRecord; ``` This alternative design avoids API-specific decorations on the declarations, and reflects the intent of the programmer very directly, even when they are compiling for a target like D3D that doesn't reflect these choices at the IL level (it could still be exposed through the Slang reflection API).
*	Feature/premake linux (#689)	jsmall-nvidia	2018-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Premake work in progress for linux. * Added dump function. * Remove examples on linux Small warning fix. * * Don't build render-test on linux * Removed work around virtual destructor warning, and just used virtual dtor for simplicity * Git ignore obj directories * Fix premake working on windows. * * Fix sprintf_s functions * Make generates arg parsing more robust * Added FloatIntUnion to avoid type punning/strong aliasing issues, and repeated union definitions. * Work around problems building on linux with getClass claiming a strict aliasing issue. * Fix for targetBlock appearing potentiall used unintialized to gcc. * Linux slang link options -fPIC to make dll. * Add -fPIC to build options on linux. * Add -ldl for linux on slang. * Fixes to try and get premake working with .so on linux. * Make core compile with -fPIC * Try to fix linux linking with --no-as-needed before -ldl * Add rpath back. * Remove render-gl from linux build. * Re-add location for linux. * Don't include <malloc.h> except on windows. * Remove unused line to fix warning on osx. * Remove ambiguity on OSX for operator <<. * Fixing ambiguity with operator overloading and Int types for OSX. * Fix ambiguity around UInt and operator * Fix ambiguity of UInt conversion for OSX. * Added UnambiguousInt and UnambiguousUInt to make it easier to work around OSX integer coercion for UInt/Int types.
*	Support cross-compilation of ray tracing shaders to Vulkan (#663)	Tim Foley	2018-10-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Move to newer glslang * Support cross-compilation of ray tracing shaders to Vulkan This change allows HLSL shaders authored for DirectX Raytracing (DXR) to be cross-compiled to run with the experimental `GL_NVX_raytracing` extension (aka "VKRay"). * The GLSL extension spec is marked as experimental, so that any shaders written using this support should be ready for breaking changes when the spec is finalized. * "Callable shaders" are not exposed throug the GLSL extension, so this feature of DXR will not be cross-compiled. * The experimental Vulkan raytracing extension does not have an equivalent to DXR's "local root signature" concept. This does not visibly impact shader translation (because the local/global root signature mapping is handled outside of the HLSL code), but in practice it means that applications which rely on local root signatures on their DXR path will not be able to use the translation in this change as-is; more work will be needed. The simplest part of the implementation was to go into the Slang standard library and start adding GLSL translations for the various DXR operations. In some cases, like mapping `IgnoreHit()` to `ignoreIntersectionNVX()` this is almost trivial. The various functions to query system-provided values (e.g., `RayTMin()`) were also easy, with the only gotcha being that they map to variables rather than function calls in GLSL, and our handling of `__target_intrinsic` assumes that a bare identifier represents a replacement function name, and not a full expression, so we have to wrap these definitions in parentheses. The tricky operations are then `TraceRay<P>()` and `ReportHit<A>()`, because these two are generics/templates in HLSL. GLSL doesn't support generics, even for "standard library" functions, so the raytracing extension implements a slightly complex workaround: the matching operations `traceNVX()` and `reportIntersectionNVX()` pass the payload/attributes argument data via a global variable. That is, shader code for the GLSL extensions writes to the global variable and then calls the intrinsic function. The linkage between the call site and the global is established by a modifier keyword (`rayPayloadNVX` and `hitAttributeNVX`, respectively) and in the case of ray payload also uses `location` number to identify which payload global to use (since a single shader can trace rays with multiple payload types). Our translation strategy in Slang tries to leverage standard language mechanisms instead of special-case logic. For example, to translate the `ReportHit<A>()` function, we provide both a default declaration that will work for HLSL (where the operation is built-in with the signature given), and a definition marked with the `__specialized_for_target(glsl)` modifier. The GLSL definition declares a function `static` variable that will fill the role of the required global, and then does what the GLSL spec requires: assigns to the global, and then calls the `reportIntersectionNVX` builtin (which we declare as a separate builtin). Our ordinary lowering process will turn that `static` variable into an ordinary global in the IR, and the `[__vulkanHitAttributes]` attribute on the variable will be emitted as `hitAttributeNVX` in the output. There is no additional cross-compilation logic in Slang specific to `ReportHit<A>()` - the target-specific definition in the standard library Just Works. The case for `TraceRay<P>()` is a bit more complicated, simply because the GLSL `traceNVX()` function needs to be passed the `location` for the payload global. We implement the payload global as a function-`static` variable, with the knowledge that every unique specialization of `TraceRay<P>()` will generate a unique global variable of type `P` to implement our function-`static` variable. We then add a slightly magical builtin function `__rayPayloadLocation()` that can map such a variable to its generated `location`; the logic for this is implemented in `emit.cpp` and described below. We also changed the `RayDesc` and `BuiltinTriangleIntersectionAttributes` types from "magic" intrinsic types over to ordinary types (because the GLSL output needs to declare them as ordinary `struct` types). This ends up removing some cases in the AST and IR type representations. By itself this change would break HLSL emit, because in that case the types really are intrinsic. We added a `__target_intrinsic` modifier to these types to make them intrinsic for HLSL, and then updated the downstream passes to handle the notion of target-intrinsic types. The logic for binding/layout of entry point inputs and outputs was updated so that raytracing stages don't follow the default logic for varying input/output parameters. This is because the input/output parameters of a raytracing entry point aren't really "varying" in the same sense as those in the rasterization pipeline. In particular, the SPIR-V model for raytracing input and output treats "ray payload" and "hit attributes" parameters as being in a distinct storage class from `in` or `out` parameters. We also detect cases where a ray tracing stage declares inputs/outputs that it shouldn't have. This logic could conceivably be extended to other stages (e.g., to give an error on a compute shader with user-defined varying input/output). The type layout logic added cases for handling raytracing payload and hit-attribute data, but this is currently just a stub implementation that follows the same logic as for varying `in` and `out` parameters (it cannot give meaningful byte sizes/offsets right now). To my knowledge the GLSL spec doesn't currently specify anything about layout, and I haven't read the DXR spec language carefully enough to know what it says about layout. A future change should update the layout logic to allow for byte-based layout of ray payloads, etc. so that we can query this information via reflection. The GLSL legalization logic in `ir.cpp` was updated to factor out the per-entry-point-parameter code into its own function, and then that function was updated to special-case the input/output of a ray-tracing shader. While for rasterization stages we typically want to take the user-declared input/output and "scalarize" it for use in GLSL (in part to deal with language limitations, and in part to tease system values apart from user-defined input/output), the GLSL spec for raytracing requires payload and hit attribute parameters to be declared as single variables. There is also the issue that even for an `in out` parameter, a ray payload parameter should only turn into a single global, whereas the handling for varying `in out` parameters generates both an `in` and an `out` global for the GLSL case. Other than the handling of entry point parameters, the GLSL legalization pass doesn't need to do anything special for ray tracing shaders. The trickiest change in the `emit.cpp` logic is that we now generate `location`s for ray payload arguments (the outgoing from a `TraceRay()` call) on demand during code generation. This is a bit hacky, and it would be nice to handle it as a separate pass on the IR rather than clutter up the emit logic, but this approach was expedient. Basically, any of the global variables that got generated from the `static` declarations in the standard library implementation of `TraceRay()` will trigger the logic to assign them a `location`. The logic for emitting intrinsic operations added a few new `$`-based escape sequences. The `$XP` case handles emitting the location of a generated ray payload variable; this is how we emit the matching location at the site where we call `traceNVX`. The `$XT` case emits the appropriate translation for `RayTCurrent()` in HLSL, because it maps to something different depending on the target stage. All of the test cases here consist of a pair of an HLSL/Slang shader written to the DXR spec, plus a matching GLSL shader for a baseline. The GLSL shaders are carefully designed so that when fed into glslang they will produce the same SPIR-V as our cross-compilation process. This kind of testing is quite fragile, but it seems to be the best we can do until our testing framework code supports both DXR and VKRay. A bunch of the core changes ended up being blocked on issues in the rest of the compiler, so some additional features go implemented or fixed along the way: The first big wall this work ran into was that the `__specialized_for_target` modifier hasn't actually been working correctly for a while. It turns out that for the one function that is using it, `saturate()`, we have been outputting the workaround GLSL function in all cases (including for HLSL output) rather than only on GLSL targets. The problem here is that for a generic function with a `__specialized_for_target` modifier or a `__target_intrinsic` modifier, the IR-level decoration will end up attached to the `IRFunc` instruction nested in the `IRGeneric`, but the logic for comparing IR declarations to see which is more specialized (via `getTargetSpecializationLevel()`) was looking only at decorations on the top-level value (the generic). The quick (hacky) fix here is to make `getTargetSpecializationLevel()` try to look at the return value of a generic rather than the generic itself, so that it can see the decorations that indicate target-specific functions. A more refined fix would be to attach target-specificity decorations to the outer-most generic (to simplify the "linking" logic). The only reason not to fold that into the current fix is that the `__target_intrinsic` modifier currently serves double-duty as a marker of target specialization and information to drive emit logic. The latter (the emit-related stuff) currently needs to live on the `IRFunc`, and moving it to the generic could easily break a lot of code. This needs more work in a follow-on fix, but for now target specialization should again be working. The other big gotcha that the simple "just use the standard library" strategy ran into was that function-`static` variables weren't actually implemented yet, and in particular function-`static` variables inside of generic functions required some careful coding. The logic in `lower-to-ir.cpp` has this `emitOuterGenerics()` function that is supposed to take a declaration that might be nested inside of zero or more levels of AST generics, and emit corresponding IR generics for all those levels. This is needed because two different AST functions nested inside a single generic `struct` declaration should turn into distinct `IRFunc`s nested in distinct `IRGeneric`s. The tricky bit to making that all work is that the same AST-level generic type parameter will then map to different IR-level instructions (the parameters of distinct `IRGeneric`s) when lowering each function. The existing logic handled this in an idiomatic way by making "sub-builders" and "sub-contexts." This change refactors some of the repeated logic into a `NestedContext` type to help simplify the pattern, and applies it consistently throughout the `lower-to-ir.cpp` file. Besides that cleanup, the major change is `lowerFunctionStaticVarDecl` which, unsurprisingly, handles lower of function-`static` variables to IR globals. The careful handling of nested contexts here is needed because if we are in the middle of lowering a generic function, then a `static` variable should turn into its own `IRGeneric` wrapping an `IRGlobalVar`. The body of the function should refer to the global variable by specializing the global variable's `IRGeneric` to the parameters of the functions `IRGeneric`. This tricky detail is handled by `defaultSpecializeOuterGenerics`. An additional subtlety not actually required for this raytracing work (and thus not properly tested right now) is handling function-`static` variables with initializers. These can't just be lowered to globals with initializers, because HLSL follows the C rule that function-`static` variables are initialized when the declaration statement is first executed (and this could be visible in the presence of side-effects). The lowering strategy here translates any `static` variable with an initializer into two globals: one for the actual storage, plus a second `bool` variable to track whether it has been initialized yet. There are some opportunities to optimize this case, especially for `static const` data, but that will need to wait for future changes. We've slowly been shifting away from the model where a user thinks of a "profile" as including both a stage and a feature level. Instead, the user should think about selecting a profile that only describes a feature level (e.g., `sm_6_1`, `glsl_450`, etc.), and then separately specifying a stage (`vertex`, `raygeneration, etc.) for each entry point. The challenge here is that the command-line processing still only had a single `-profile` switch, and no way to specify the stage. Adding the `-stage` option was relatively easy, but making it work with the existing validation logic for command-line arguments was tricky, because of the complex model that `slangc` supports for compiling multiple entry points in a single pass. * In `slang.h` add new reflection parameter categories for ray payloads and hit attributes, as part of entry point input/output signatures. * A previous change already updated our copy of glslang to one that supports the `GL_NVX_raytracing` extension, so in `slang-glslang.cpp` we just needed to map Slang's `enum` values for the raytracing stage names to their equivalents in the glslang code. * Moved the logic for looking up a stage by name (`findStageByName()`) out of `check.cpp` and into `compiler.cpp`, with a declaration in `profile.h` * Added a `$z` suffix to the GLSL translation of `Texture.SampleLevel()`, to handle cases where the texture element type is not a 4-component vector. Note that this fix should actually be applied to all* these texture-sampling operations, but I didn't want to add a bunch of changes that are (clearly) not being tested right now. * The layout logic for entry points was updated to correctly skip producing a `TypeLayout` for an entry point result of type `void`, which meant that the related emit logic now needs to guard against a null value for the result layout. * In `ir.cpp`, dump decorations on every instruction instead of just selected ones, so that our IR dump output is more complete. * Added a command-line `-line-directive-mode` option so that we can easily turn off `#line` directives in the output when debugging. Not all cases where plumbed through because the `none` case is realistically the most important. * Parser was fixed to properly initialize parent links for "scope" declarations used for statements, so that we can walk backwards from a function-scope variable (including a `static`) and see the outer function/generics/etc. * Added GLSL 460 profile, since it is required for ray tracing. Also updated the logic for computing the "effective" profile to use to recognize that GLSL raytracing stages require GLSL 460. * Added some conventional ray-tracing shader suffixes to the handling in `slang-test`. This code isn't actually used, but was relevant when I started by copy-pasting some existing VKRay shaders as the starting point for my testing. * Fixup: typos
*	Improve support for non-32-bit types. (#643)	Tim Foley	2018-09-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The main change here is to fill out the `BaseType` enumeration so that it covers the full range of 8/16/32/64-bit signed and unsigned integers, as well as 16/32/64-bit floating-point numbers, and then propagate that completion through various places in the code. More details: * The current `half`, `float`, `double`, `int`, and `uint` types are still the default names for their types, so things like `float16_t` and `int32_t` were added as `typedef`s. * We still need to generate the full gamut of vector/matrix `typedef`s for the new types, so that things like `float16_t4x3` will work (yes, I know that is ugly as sin, but that's the HLSL syntax...). * A few pieces of dead code from earlier in the compiler's life got removed, since I did a find-in-files for `BaseType::` and tried to either update or delete every site. * A few call sites that were enumerating integer base types in an ad-hoc fashion were changed to use a single `isIntegerBaseType()` function that I added in `check.cpp` * When compiling with dxc for shader model 6.2 and up, we enable the compiler's support for native 16-bit types via a flag. * The public API enumeration for reflection of scalar types added cases for 8- and 16-bit integers (it already exposed the other cases we need) * The lexer was updated to be extremely liberal in what kinds of suffixes it allows on literals. I also removed the logic that was treating, e.g., `0f` as a floating-point literal (it doesn't seem to be the right behavior). That would now be an integer literal with an invalid suffix. * The logic in the parser that applies types to literals was updated to handle a few more cases: `LL` and `ULL` for 64-bit integers, and `H` for 16-bit floats. * The mangling logic needed to be updated to handle the new cases, and I consolidated the handling of those types in their front-end and IR forms. * Removed the explicit `BasicExpressionType::ToString` logic, since all basic types are `DeclRefType`s in the front end, and we can just print them out as such. * As a bit of a gross hack, fudged the conversion costs so that `int` to `int64_t` conversion is a bit more costly. The problem there is that given an operation like `int(0) + uint(0)`, the best applicable candidates ended up being `+(uint,uint)` and `+(int64_t,int64_t)` because the cost of a single `int`-to-`uint` conversion was the same as the sum of the cost of an `int`-to-`int64_t` and a `uint`-to-`int64_t`. A better long-term fix here is to completely change our overload resolution strategy, but that is obviously way too big to squeeze into this change. * Type layout computation was updated to handle all the new types and give them their natural size/alignment. Note that this does not work for down-level HLSL where `half` is treated as a synonym for `float`. It also doesn't deal with the fact that many of these types aren't actually allowed in constant buffers for certain shader models. A future change should work to add error messages for unsupported stuff during type layout (or just make the types themselves require support for certain capabilities)
*	Support for [[vk::push_constant]] (#629)	jsmall-nvidia	2018-08-22
\| \| \| \| \| \| \| \| \| \|	* Support for attributed [[vk::push_constant]] and [[push_constant]]. Can also use layout(push_constant). * Fix test so matches the expected output. * Add expected output to binding-push-constant-gl.hlsl * Trivial change to force travis rebuild to test the gcc linux build really has a problem.
*	Feature/attributed binding (#621)	jsmall-nvidia	2018-07-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Typo fix, and added dxc to command line documentation. * Fix small typos. Added support for Scope to lexer. Fix bug in Token ctor. * Add support for attribute names that are scoped. * Added GLSLBindingAttribute. Make binding work through core.met.slang. * Allow [[gl::binding(binding, set)]] [[vk::binding(binding,set)]]
*	Initial support for enum declarations (#599)	Tim Foley	2018-06-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Slang `enum` declarations will always be scoped, e.g.: ```hlsl enum Color { Red, Green = 2, Blue, } Color c = Color.Red; // Not just `Red` ``` A user can write `enum class` as a placebo for now (to ease sharing of headers with C++). Slang does not currently support the `::` operator for static member lookup, so it must be `Color.Green` and not `Color::Green`. Support for `::` as an alternate syntax could be added later if there is strong user demand. An `enum` type can have a declared "tag type" using syntax like C++ `enum class`: ```hlsl enum MyThings : uint { First = 0, // ... } ``` The `enum` cases will store their values using that type. An `enum` that doesn't declare a tag type will use the type `int` by default. Enum cases are assigned values just like in C/C++: cases can have explicit values, but otherwise default to one more than the previous case, or zero for the first case. All `enum` types will automatically conform to a standard-library `interface` called `__EnumType`, which is used so that basic operators like equality testing can be defined generically for all `enum` types. This change only adds one operator at first (the `==` comparison), but other should be added later. An `enum` case needs to be explicitly converted to an integer where needed (e.g., `int(Color.Red)`). This is implemented by having the main integer types (`int` and `uint`) support built-in initializers that can work for any `enum` type (or rather, anything conforming to `__EnumType`). Eventually these will be restricted so that an `enum` type can only be converted to its associated tag type. IR code generation completely eliminates `enum` types and their cases. The `enum` type will be replaced with its tag type, and the cases will be replaced with the tag values. Currently this could leave some mess in the IR where cast operations are applied between values that actually have the same type.
*	Fix global atomic functions (#582)	Tim Foley	2018-05-29
\| \| \| \| \| \| \| \| \|	Fixes #581 This change adds a new parameter passing mode `__ref` to exist alongisde `in`, `out`, and `inout`. The `__ref` modifier indicates true by-reference parameter passing (whereas `inout` is copy-in-copy-out). This is not intended to be something that users interact with directly, but rather a low-level feature that lets us provide a correct signature for the `Interlocked*()` operations in the standard library. Most of the support for passing what are logically addresses around already exists in the IR, so the majority of the work here is just in introducing the new type `Ref<T>` and then using it appropriately when lowering `__ref` parameters/arguments to the IR.
*	Add support for explicit register space bindings (#542)	Tim Foley	2018-05-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change adds support for specifying explicit register spaces, like: ```hlsl // Bind to texture register #2 in space #1 Texture2D t : register(t2, space1); ``` I added a test case to confirm that the register space is properly propagated through the Slang reflection API. This change also adds proper error messages for some error/unsupported cases that weren't being diagnosed: * Specifying a completely bogus register "class" (e.g., `register(bad99)`) * Failing to specify a register index (`register(u)`) * Specifying a component mask (`register(t0.x)`) * Using `packoffset` bindings I added test cases to cover all of these, as well as the new errors around support for register `space` bindings. In order to get the existing tests to pass, I had to remove explicit `packoffset` bindings from some DXSDK test shaders. None of these `packoffset` bindings were semantically significant (they matched what the compiler would do anyway, for both Slang and the standard HLSL compiler). Removing them is required for Slang now that we give an explicit error about our lack of `packoffset` support. In a future change we might add logic to either detect semantically insignificant `packoffset`s, or to just go ahead and support them properly (as a general feature on `struct` types).
*	Introduce an IR-level type system (#481)	Tim Foley	2018-04-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Introduce an IR-level type system Up to this point, the Slang IR has used the front-end type system to represent types in the IR. As a result (but ultimately more importantly) the IR representation of generics and specialization has used AST-level concepts embedded in the IR. For example, to express the specialization of `vector<T,N>` to a concrete type `float` for `T`, we needed an IR operation that could represent the specialization, with operands that somehow represented the type argument `float`. The whole thing was very complicated. The big idea of this change is to introduce a new representation in which types in the IR are just ordinary instructions, so that using them as operands makes sense. The hierarchy of IR types closely mirrors the AST-side hierarchy for now, and that will probably be something we should maintain going forward. In order to make these changes work, though, I also had to do major overhauls of things like the way substitutions are performed, how we check interface conformances, the way lookup through interface types is done, etc. etc. This is a big change, and unfortunately any attempt to summarize it in the commit message wouldn't do it justice. * Fix 64-bit build warning * Fix up some clang warnings/errors
*	Pass AST interpolation modifiers through to codegen. (#475)	Tim Foley	2018-04-04
\| \| \| \| \| \| \| \| \|	This is a short-term fix, because we (1) don't have an IR-level representation of interpolation qualifiers, and (2) can't introduce one until after the IR-level type system is introduced (to be able to handle `struct` fields). The approach here is to find the AST-level declaration, either from layout information (in the case of an ordinary variable or function parameter), or from struct field information (because structs are being output from the AST form anyway). I've included a single end-to-end rendering test to confirm that we handle the `nointerpolation` modifier the same as HLSL. I also added the `noperspective` modifier, which seemed to be missing from our implementation.
*	Entry point attribute (#447)	Tim Foley	2018-03-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Typo * Add [shader(...)] and clean up some literal handling * Add supporting for validating the `[shader(...)]` attribute, by checking that its argument is a string literal that names a known shader stage. * Split the `ConstantExpr` class into distinct subclasses rooted at `LiteralExpr`, so we have `BoolLiteralExpr`, `IntegerLiteralExpr`, `FloatingPointLiteralExpr`, and `StringLiteralExpr` * Add a `String` type to the stdlib, to be used as the type of a string literal. This change allows code using `[shader(...)]` to be accepted by the front-end again, but it does nothing about emitting it in final HLSL. * Allow entry points to be specified via [shader(...)] Before this change, the compiler would track a list of `EntryPointRequest` objects, based on what the suer specified via API and/or command-line options. Each entry point request would get matched up with an AST `FuncDecl` as part of semantic checking, and then the back end steps (layout, codegen, etc.) would work from that information. This change makes the compiler modal, in that it can either continue to use an explicit list of entry point requests (this is the mode when the list is non-empty), or it can rely on user-supplied attributes on entry point functions to drive codegen (this is the mode when the list is empty). User-specified `[shader(...)]` attributes are processed at the same place where the association from `EntryPointRequest`s to `FuncDecl`s would otherwise be made, and basically does the same thing in the opposite direction: looks for `FuncDecl`s with the appropriate attribute and synthesizes an `EntryPointRequest` for them. Subsequent processing should ideally not know where a given `EntryPointRequest` came from, and should handle both methods of specifying the entry points equivalently. One design choice that might not make immediate sense is that we do not process a function as an entry point (applying further validation, etc.) just because it has a `[shader(...)]` modifier, unless we are in the appropriate mode (which in this case is the mode where the user didn't specify their own entry points via API or command line). This is to handle cases where the user wants to explicitly compile only one entry point, so that they (1) don't want us to spend time validating code they don't care about, (2) don't want do get output they don't expect, and (3) might actually be presenting us with code that violates the language rules due to a combination of `#define`s in effect (e.g., they might have a `[shader("vertex")]` function that transitively executes a `discard` because of how the preprocessor was configured, but they don't care because they are compiling a fragment entry point). This decision might be something we revisit over time. As part of this work, I had to add some logic to pick a "profile version" to use for a combination of a target and stage (because when you specify `[shader("vertex")]` the compiler can't tell if you want `vs_5_0`, `vs_5_1`, etc.). This isn't really complete right now, because something like `-target dxbc` also doesn't determine a profile, so there is a bit of a kludge at present. We need to figure out a good long-term plan here, which might involve keeping target format, feature level/version, and pipeline stage as truly orthogonal concepts, rather than conflating them. That would involve more work in the API and command-line layers to de-compose things when the user specifies, e.g., `vs_5_1`, but might make downstream logic easier to manage. * Emit [shader(...)] attribute on entry point for SM 6.1 and later This should help ensure that the output from Slang can be compiled with dxc `lib_` profiles. Fix warning
*	Overhaul implementation of [attributes] (#443)	Tim Foley	2018-03-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The existing code parsed all of the square-bracket `[attributes]` into `HLSLUncheckedAttribute`, and then went on to hand-convert some of them to specialized subclasses of `HLSLAttribute`. When attributes didn't check, they were left as-is, and no error message was issued, because at the time the compiler was focused on accepting arbitrary input. This change greatly overhauls the handling of `[attributes]`. Attributes are now declared in the stdlib, with declarations like: ```hlsl __attributeTarget(LoopStmt) attribute_syntax [unroll(count: int = 0)] : UnrollAttribute; ``` In this syntax, the `unroll` part is giving the attribute name (the `[]` are just for flavor, to make the declaration look like a use site; we could drop it if we don't like the clutter), the `count` is a parameter of the attribute, which we expect to be of type `int`, and which has a default value of `0` if unspecified. The `: UnrollAttribute` part specifies the meta-level C++ class that will implement this attribute (and corresponds to a class in `modifier-defs.h`). This syntax is similar to our current `syntax` declarations. I'm starting to think we should change it to something like a `__meta_class(UnrollAttribute)` modifier, and then use that uniformly across all cases (e.g., also replacing the curreent `__magic_type(Foo)` syntax). The `__attributeTarget(LoopStmt)` is a modifier that specifies the meta-level C++ class for syntax that this attribute is allowed to attach to. It is legal to have more than one of these. Attributes continue to be parsed in an unchecked form, so that we don't tie up semantic analysis and parsing more than necessary. During checking, we look up the attribute name in the current scope, and then replace the unchecked attribute with a more specific one if the checking passes. Checking proceeds in generic and attribute-specific phases. The generic phase includes checking the number of arguments against those specified in the attribute declaration (I don't currently check types, or handle default arguments), and then checking that at least one `__attributeTarget(...)` modifier applies to the syntax node being modified. The attribute-specific phase then applies to the specialized C++ subclass of `Attribute`, and does the actual checking right now (e.g., that step is responsible for actually type-checking things at present). This can obviously be improved over time. With this support I went ahead and added declarations for all the HLSL attributes I could find documented on MSDN. I also added a provisional declaration for the `[shader(...)]` attribute that has been added to dxc, but which is not yet documented. One important detail here is that lookup of attribute names needs to be done carefully, so that we don't let, e.g., local variables shadow an attribute declaration: ```hlsl int unroll = 5; // This attribute should not get confused by the local variable `unroll` [unroll] for(...) { .. } ``` The lookup logic already has a notion of a `LookupMask` that can be used to filter declarations out of the result. In this change I surfaced that mask through the main lookup API (rather than requiring a second pass to "refine" lookup results), and made is so that the default lookup mask does not include attributes, while an explicit mask can be used to look up only attributes. (An alternatie design we discussed was to follow the approach of C# and have the declaration of an attribute like `[unroll]` actually be `unrollAttribute`, with a suffix. I decided not to follow that approach for now because it seemed like printing good error messages in that case could require us to carefully trim the `Attribute` suffix off of names at times, and using the existing mask behavior seemed simpler.) To verify that the shadowing behavior is indeed correct, I modified the `loop-unroll.slang` test case. Smaller notes: * Removed the `HLSL` prefix from several of the C++ attribute classes * Made sure to actually validate the modifiers on statements * Special-cased checking for `ParamDecl` with a null type, because I'm re-using `ParamDecl` for attribute parameters, but can't give a concrete type to some of them right now * Deleting some old, dead emit-from-AST logic around attributes, rather than try to "fix" code that doesn't run (a more complete scrub of that code is still needed) * Fixed AST inheritance hierarchy so that a `Modifier` is a `SyntaxNode` rather than a `SyntaxNodeBase`. I have no idea why we have both of those, and we need to clean that up soon.
*	Merge from 0.9.x (#429)	Tim Foley	2018-02-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Fix bug when subscripting a type that must be split (#396) The logic was creating a `PairPseudoExpr` as part of a subscript (`operator[]`) operation, but neglecting to fill in its `pairInfo` field, which led to a null-pointer crash further along. * Allow writes to UAV textures (#416) Work on #415 This issue is already fixed in the `v0.10.` line, but I'm back-porting the fix to `v0.9.`. The issue here was that the stdlib declarations for texture types were only including the `get` accessor for subscript operations, even if the texture was write-able. I've also included the fixes for other subscript accessors in the stdlib (notably that `OutputPatch<T>` is readable, but not writable, despite what the name seems to imply). * Fix infinite loop in semantic parsing (#424) The code for parsing semantics was looking for a fixed set of tokens to terminate a semantic list, rather than assuming that whenever you don't see a `:` ahead, you probably are done with semantics. This meant that you could get into an infinite loop just with simple mistakes like leaving out a `;`. This change fixes the parser to note infinite loop in this case, and adds a test case to verify the fix.
*	Remove support for the -no-checking flag (#392)	Tim Foley	2018-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Remove support for the -no-checking flag Fixes #381 Fixes #383 Work on #382 - No longer expose flag through API (`SLANG_COMPILE_FLAG_NO_CHECKING`) and command-line (`-no-checking`) options - Remove all logic in `check.cpp` that was withholding diagnostics (including errors) when the no-checking mode was enabled - Remove `HiddenImplicitCastExpr`, which was only created to support no-checking mode (it represented an implicit cast that our checking through was needed, but couldn't emit because it might be wrong) - Remove logic for storing function bodies as raw token lists when checking is turned off. I'm leaving in the `UnparsedStmt` AST node in case we ever need/want to lazily parse and check function bodies down the line. - Remove a few of the code-generation paths we had to contend with, but keep the comment about them in place. - Remove GLSL-based tests that can't meaningfully work with the new approach. - Fix other tests that used a GLSL baseline so that their GLSL compiles with `-pass-through glslang` instead of invoking `slang` with the `-no-checking` flag. - Remove tests that were explicitly added to test the "rewriter + IR" path, since that is no longer supported. There is more cleanup that can be done here, now that we know that AST-based rewrite and IR will never co-exist, but it is probably easier to deal with that as part of removing the AST-based rewrite path. We've lost some test coverage here, but actually not too much if we consider that we are dropping GLSL input anyway. * Fixup: test runner was mis-counting ignored tests * Fixup: turn on dumping on test failure under Travis * Fixup: enable extensions in Linux build of glslang
*	Remove #import directive (#389)	Tim Foley	2018-01-29
\| \| \| \| \| \| \|	Fixes #380 The `#import` directive was a stopgap measure to allow a macro-heavy shader library to incrementally adopt `import`, but it has turned out to cause as many problems as it fixes (not least because users have never been able to form a good mental model around which kind of import to do when). This change yanks support for the feature.
*	Allow arbitrary type string as type argument in spAddEntryPointEx.	Yong He	2018-01-19
\|
*	add support for `extension` and `type_param` keywords	Yong He	2018-01-14
\|
*	Support nested generics	Yong He	2018-01-12
\| \| \| \|	fixes #362
*	bruteforce implementation of witness table resolution for associated (#358)	Yong He	2018-01-09
\|
*	Using a visitor to systematically replace lookup scopes of generic ↵	Yong He	2017-12-27
\| \| \| \| \| \| \| \|	function's return type expression. fixes #336 Add a `ReplaceScopeVisitor` to replace the scopes of the return type expression tree to use the generic decl's scope instead of module's scope after parser has determined the decl is a generic function header decl.
*	IR: fixes for subscript accessors (#322)	Tim Foley	2017-12-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* IR: fixes for subscript accessors Fixes #320 This is a bunch of fixes for handling of `__subscript` operations on builtin types (notably `RWStructuredBuffer` and `StructuredBuffer` at this point). - Automatically add a `GetterDecl` to any subscript decalratio was declithout any accessors. This avoids hitting a null- dereference in the emit logic. - Add a notion of a `RefAccessor` (declared with `ref`) as a peer to getters and setters. The idea is that a `ref` accessor returns a pointer to the element data, so that it can be used for both getting and setting values. This is closer to the behavior of `RWStructuredBuffer` element access in HLSL. - Fixes for dealing with "access chains" where there might be a combination of a subscript (where the is a `get` and `set` but no `ref`) and member access, so that we have to read the base value into a temp, modify it, and then write it back. - This logic is still a bit of a mess, so we will eventually want to take a more consistent pass over this to deal with how we "materialize" values for setters. - Update `RWStructuredBuffer` to have a `ref` accessor, and then fix up the IR tests to handle the new opcode that I added for it. - Note: I didn't handle this as an intrinsic simply because the `tests/ir/` tests aren't really set up to handle builtins with ugly mangled names. Fixup: type error in VM for buffer element ref I was using the result type of the op as the element type for computing the element address, but the result type is a pointer to the real element type. This caused test failures on 64-bit platforms, where the stride of the buffer in the `ir/factorial` test needs to be 4. The fix is to assume the result type is a pointer, and extract the pointed-to type out of that.
*	Support simple generics syntax (#319)	Yong He	2017-12-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Support simple generics syntax. This commit enables simpler generics syntax, e.g. T test<T>(T arg) {} or struct Gen<T>{T x;}; * Support simple generics syntax. This commit enables simpler generics syntax, e.g. T test<T>(T arg) {} or struct Gen<T>{T x;}; * add expected test result for compute/generics-syntax.slang
*	Cleanups (#298)	Tim Foley	2017-11-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Rename `lower.{h,cpp}` to `ast-legalize.{h,cpp}` This pass isn't really performing lowering akin to `lower-to-ir.{h,cpp}` so the file name is misleading. By renaming this pass we emphasize its role as an AST-related pass. Also update the comment at the top of `ast-legalize.h` to reflect the intended purpose of this pass in a world where we have the IR up and running. * Allow `import` as an alias for `__import` The use of double underscores to mark our new syntax has so far had two purposes: 1. It helps identify syntax that isn't meant to be exposed to users in its current form (e.g., `__generic` gets a double underscore because we want users to have a more pleasant surface syntax for generics that they write). This rationale doesn't apply to `__import`, which is a major language feature that users need to interact with. 2. It helps avoid the problem where the compiler treats something as a keyword that isn't supposed to be reserved in HLSL/GLSL and so causes existing user code to fail to parse (e.g., when the user tries to write a function called `import`). This no longer matters because we look up almost all of our keywords using the existing lexical scoping in the language (so the user can shadow almost any keyword with a local declaration). So, neither of the original two reasons applies to `__import`, and it makes sense to expose it as `import`. Doing so is a one-line change.
*	Add support for global generic parameters (#285)	Yong He	2017-11-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add support for global generic parameters (In-progress work) This commit include: 1. Update Slang API to allow specification of generic type arguments in an `EntryPointRequest` 2. Add parsing of `__generic_param` construct, which becomes a GlobalGenericParamDecl, contains members of `GenericTypeConstraintDecl`. 3. Semantics checking will check whether the provided type arguments conform to the interfaces as defined by the generic parameter, and store SubtypeWitness values in the EntryPointRequest, which will be used by `specializeIRForEntryPoint` when generating final IR. 4. Add a new type of substitution - `GlobalGenericParamSubstitution` for subsittuting references to `__generic_param` decls or to its member `GenericTypeConsraintDecl` with the actual type argument or witness tables. 5. Update `IRSpecContext` to apply `GlobalGenericParamSubstitution` when specializing the IR for an EntryPointRequest. 6. Update `render-test` to take additional `type` inputs, which specifies the type arguments to substitute into the global `__generic_param` types. This commit does not include ProgramLayout specialization. * IR: pass through `[unroll]` attribute (#284) The initial lowering was adding an `IRLoopControlDecoration` to the instruction at the head of a loop, but this was getting dropped when the IR gets cloned for a particular entry point. The fix was simply to add a case for loop-control decorations to `cloneDecoration`. * fix warnings * IR: support `CompileTimeForStmt` (#286) This statement type is a bit of a hack, to support loops that must be unrolled. The AST-to-AST pass handles them by cloning the AST for the loop body N times, and it was easy enough to do the same thing for the IR: emit the instructions for the body N times. The only thing that requires a bit of care is that now we might see the same variable declarations multiple times, so we need to play it safe and overwrite existing entries in our map from declarations to their IR values. Of course a better answer long-term would be to do the actual unrolling in the IR. This is especially true because we might some day want to support compile-time/must-unroll loops in functions, where the loop counter comes in as a parameter (but must still be compile-time-constant at every call site). * Add support for global generic parameters (In-progress work) This commit include: 1. Update Slang API to allow specification of generic type arguments in an `EntryPointRequest` 2. Add parsing of `__generic_param` construct, which becomes a GlobalGenericParamDecl, contains members of `GenericTypeConstraintDecl`. 3. Semantics checking will check whether the provided type arguments conform to the interfaces as defined by the generic parameter, and store SubtypeWitness values in the EntryPointRequest, which will be used by `specializeIRForEntryPoint` when generating final IR. 4. Add a new type of substitution - `GlobalGenericParamSubstitution` for subsittuting references to `__generic_param` decls or to its member `GenericTypeConsraintDecl` with the actual type argument or witness tables. 5. Update `IRSpecContext` to apply `GlobalGenericParamSubstitution` when specializing the IR for an EntryPointRequest. 6. Update `render-test` to take additional `type` inputs, which specifies the type arguments to substitute into the global `__generic_param` types. progress on parameter binding * Add a more contrived test case for specializing parameter bindings * update render-test to align buffers to 256 bytes (to get rid of D3D complains on minimal buffer size). * adding one more test case for parameter binding specialization. * Cleanup according to @tfoleyNV 's suggestions. * fix a bug introduced in the cleanup
*	IR: add support for `switch` statements (#278)	Tim Foley	2017-11-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* IR: add support for `switch` statements Fixes #273 This is just something we hadn't gotten to yet on the IR. The actual design of the instruction is unsurprising (once you take into consideration the requirement for structured control flow). A `switch` instruction takes the form: switch <condition> <breakLabel> <defaultLabel> [<caseVal> <caseLabel>]* Where `condition` is the value to switch on, `breakLabel` is the "join point" after the original `switch` statement, `defaultLabel` is where to go if the value doesn't match any case, and each pair of `caseVal` and `caseLabel` is what to do on a particular value. It is required that `caseVal` be a literal, but this isn't currently being enforced in the IR (the front-end should be making a check and constant-folding the case labels). For structured control flow, we also make the assumption that the cases are in order: cases with the same label must be grouped together, and any case that falls through to another must come right before it. Given this representation, the emit logic can reconstruct a `switch` statement with relative ease, given the machinery we already have. It makes sure to group together case values with the same label (again, assuming they are contiguous), and will insert the `default:` label in with whatever group it belongs to. Actually emitting code for a `switch` statement seems superficially simple, until you realize that a complete implementation needs to handle stuff like "Duff's Device." The current implementation makes the assumption that all `case` and `default` statements are directly nested under a `switch`, and that there is no way for control flow to enter a case except by the `switch` itself, or fall-through. In order to facilitate the grouping of cases in the IR-to-HLSL emit logic, the AST-to-IR lowering logic tries to detect cases where there are multiple `case`s in a row, and emit only a single label for them. One big/annoying gotcha is that we don't properly handle the case where a `default:` case has a non-trivial fall-throguh to another case. That seems fine for now since HLSL doesn't support fall-through anyway, but it probably needs to get detected somewhere in the Slang compiler (e.g., maybe we should add a diagnostic pass over the IR that detects target-specific problems like that and emits errors). * IR: Add support for empty statements. - Add empty statement in `lower-to-ir.cpp` - Go ahead and eliminate the statement catch-all and explicitly enumerate the cases we don't support - Fix up parser for block statements so that it doesn't leave a null statement as the body of a `{}` - Add an empty statement to one of the cases for the `switch` test, to ensure we are testing empty statements
*	Support generic interface methods (#251)	Yong He	2017-11-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* improve diagnostic messages and prevent fatal errors from crashing the compiler. * fix top level exception catching. * spelling fix * change wording of invalidSwizzleExpr diagnostic * add speculative GenericsApp expr parsing * add new test case of cascading generics call. * Fixing bugs in compiling cascaded generic function calls. Add implementation of DeclaredSubTypeWitness::SubstituteImpl() This is not needed by the type checker, but needed by IR specialization. When input source contains cascading generic function call, the arguments to `specialize` instruction is currently represented as a substitution. The arg values of this subsittution can be a `DeclaredSubTypeWitness` when a generic function uses one of its generic parameter to specialize another generic function. When the top level generics function is being specialized, this substitution argument, which is a `DeclaredSubTypeWitness`, needs to be substituted with the witness that used to specialize the top level function in the specialized specialize instruction as well. * add a test case for cascading generic function call. * parser bug fix * fixes #255 * add test case for issue #255 * Generate missing `specialize` instruction when calling a generic method from an interface constraint. When calling a generic method via an interface, we should be generating the following ir: ... f = lookup_interface_method(...) f_s = specailize(f, declRef) ... This commit fixes this `emitFuncRef` function to emit the needed `specialize` instruction. * fixes #260 This fix follows the second apporach in the disucssion. It generated mangled name for specialized functions by appending new substitution type names to the original mangled name. * Disabling removing and re-inserting specailized functions in getSpecalizeFunc() I am not sure why it is needed, it seems HLSL and GLSL backends are generating forward declarations anyways, so the order of functions in IRModule shouldn't matter. * cleanup and complete test cases. * fix warnings
*	Parameter blocks (#245)	Tim Foley	2017-11-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Rename existing ParameterBlock to ParameterGroup We are planning to add a new `ParameterBlock<T>` type, which maps to the notion of a "parameter block" as used in the Spire research work. Unfortunately, the compiler codebase already uses the term `ParameterBlock` as catch-all to encompass all of HLSL `cbuffer`/`tbuffer` and GLSL `uniform`/`buffer`/`in`/`out` blocks (all of which are lexical `{}`-enclosed blocks that define parameters...). This change instead renames all of the existing concepts over to `ParameterGroup`, which isn't an ideal name, but at least doesn't directly overlap the new terminology or any existing terminology. The new `ParameterBlockType` case will probably be a subclass of `ParameterGroupType`, since it is a logical extension of the underlying concept. * Add Shader Model 5.1 profiles The HLSL `register(..., space0)` syntax is only allowed on "SM5.1" and later profiles (which is supported by the newer version of `d3dcompiler_47.dll` that comes with the Win10 SDK, but not the older version of `d3dcompiler_47.dll` - good luck figuring out which you have!). This change adds those profiles to our master list of profiles, and nothing else. * First pass at support for `ParameterBlock<T>` - Add the type declaration in stdlib - Add a special case of `ParameterGroupType` for parameter blocks - Handle parameter blocks in type layout (currently handling them identically to constant buffers for now, which isn't going to be right in the long term) - Add an IR pass that basically replaces `ParameterBlock<T>` with `T` - Eventually this should replace it with either `T` or `ConstantBuffer<T>`, depending on whether the layout that was computed required a constant buffer to hold any "free" uniforms - Add first stab at an IR pass to "scalarize" global variables using aggregate types with resources inside. - This currently only applies to global variables, so it won't handle things passed through functions, or used as local variables - It also only supports cases where the references to the original variable are always references to its fields, and not the whole value itself - Add a single test case that technically passes with this level of support, but probably isn't very representative of what we need from the feature * Fold parameter-block desugaring into a more complete "type legalization" pass The basic problem that was arising is that once you desugar `ParameterBlock<T>` into `T`, you then need todeal with splitting `T` into its constituent fields if it contains any resource types. Handling those transformations by following the usual use-def chains wasn't really helping, because you might need systematic rewriting that can really only be handled bottom-up. This change adds a new pass that is intended to perform multiple kinds of type "legalization" at once: - It will turn `ParameterBlock<T>` into `T` - It may at some point also convert `ConstantBuffer<T>` into `T` as well - It will turn an value of an aggregate type that contains resources into N different values (one per field) - As a result of this, it will also deal with AOS-to-SOA conversion of these types Legalization is applied to every function/instruction/value, so that it can make large-scale changes that would be tough to manage with a work list. This pass needs to be run after generics have been fully specialized, so that we know we are always dealing with fully concrete types, so that their legalization for a given target is completely known. This is still work in progress; there's more to be done to get this working with all our test cases, and finish the remaining `ParameterBlock<T>` work. * Improve binding/layout information when using parameter blocks - When doing type layout for a parameter block, don't include the resources consumed by the element type in the resource usage for the parameter block - Note that this is pretty much identical to how a `ConstantBuffer<T>` does not report any `LayoutResourceKind::Uniform` usage, except that `ParameterBlock<T>` is also going to hide underlying texture/sampler reigster usage - The one exception here is that any nested items that use up entire `space`s or `set`s those need to be exposed in the resource usage of the parent (I don't have a test for this) - When type legalization needs to scalarize things, it must propagate layout information down to the new leaf variables. In general, the register/index for a new leaf parameter should be the sum of the offsets for all of the parent variables along the "chain" from the original variable down to the leaf (we aren't dealing with arrays here just yet). - When type legalization decides to eliminate a pointer(-like) type (e.g., desugar `ParameterBlock<T>` over to `T`), actually deal with that in terms of the `LegalVal`s created, so that we can know to turn a `load` into a no-op when applied to a value that got indirection removed. - Hack up the "complex" parameter-block test so that it actually passes (the big hack here is that the HLSL baseline is using names that are generated by the IR, and are unlikely to be stable as we add/remove transformations). - Note: I can't make these be compute tests right now, because regsiter spaces/sets are a feature of D3D12/Vulkan, and our test runner isn't using those APIs.
*	style fixes	Yong He	2017-11-04
\|
*	cleanup useless code	Yong He	2017-11-04
\|