slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2019-10-25	Refactor semantic checking code into more files (#1097)	Tim Foley
	The semantic checking logic was all inside `slang-check.cpp` and as a result this was a monster file that was extremely hard to follow. This change splits `slang-check.cpp` into several smaller files, although some of the resulting files are still quite large. This change attempts to be a copy-paste job as much as possible and does not perform any cleanup on naming, structure, duplication, etc. in the code it deal with. No function bodies or signatures have been touched.
2019-10-25	Don't use mangled names when emitting code (#1096)	Tim Foley
	* Small cleanup to how standard-library code gets marked Declarations in the standard library used to be individualy marked with `FromStdLibModifier` so that downstream passes (notably IR lowering) could treat them specially. At some point we simplified things by just looking for `FromStdLibModifier` in the ancestor list of a declaration, so that we could handle nested declarations without having to recursively attach the modifier to everything. This change simplifies things a bit further in that the `FromStdLibModifier` now only get attached to the `ModuleDecl`, since that will be on the ancestor chain of all the declarations anyway. The second change here is that we attach this marker modifier before we parse and do semantic checking on the module, rather than after. There is no code in this change that relies on that difference, but changing the timing of when we attach the modifier means that the semantic checking logic can now reliably detect if something represents a declaration in the standard library. * Change implementation of `sign()` for GLSL target Issue #602 related to the `sign()` function having a different return type between GLSL and HLSL/SLang. At the time, the issue was fixed by adding special-case logic in the emit pass that looked for the `sign()` function by name. This change cleans up that logic to work using the existing `__target_intrinsic` mechanism instead. * Make sure code emit doesn't depend on mangled names There was existing logic in the HLSL/GLSL emit path (that got copy-pasted into the C/C++ path) where if a builtin function didn't have an application `__target_intrinsic` modifier (`IRTargetIntrinsicDecoration` at the IR level), we used some fairly fragile logic to take the mangled symbol name for a function and unmangle it to recover the original name, as well as the number of parameters (to detect member functions). This approach had always been a kludge, and we have reached a point where it actively hinders forward progress on some valuable new features. The big idea of this change is fairly simple: in the IR lowering pass we detect when we have a stdlib function that ought to have an intrinsic definition but doesn't have a "catch-all" `__target_intrinsic` modifier applied. If we discover such a function, we go ahead and emit it to the IR with just such a catch-all modifier. (The main alternative here would be to require that all functions in the stdlib be manually marked `__target_intrinsic` and make this a front-end issue where we get errors if we write stdlib functions wrong, but this workaround is more expedient for now) Some of the logic around target intrinsics already handled the idea of having catch-all intrinsic modifiers/decorations with an empty target name, but that support wasn't 100% complete so this change strives to get it working. One detail that was only being handled along this mangled-name path was support for emitting calls using member-function syntax. I updated the generic handling of `__target_intrinsic`-based calls to support specifying a member function name using a prefix `.` on the definition string. With this work in place, it is possible to clean up a bunch of logic in the `emit` code that had to assume that any function without a body/definition must be an intrinsic. Instead, with this change we can now use the presence of a suitable `IRTargetIntrinsicDecoration` as the indicator that a function is an intrinsic. This should make the emit logic a bit more robust. One wrinkle that remains with this change is the special handling of functions that get the name `operator[]` (that is, the `get`/`set`/etc. accessors for `__subscript` declarations). The logic no longer depends on the mangled name, but it still feels a bit gross to have this kind of string-based special case (especially since it is our main/only special case like this). Another wrinkle is that the C/C++ back-end logic for handling intrinsic functions largely mirrors the HLSL/GLSL case, but can't just use the same exact code because it has to be intercepted by the logic that generates some of the required functions on-demand. There's still a net cleanup with this change, but there is probably an opportunity to remove even more duplication down the road.
2019-10-24	* Functionality to dump repo if there is a failure throught the ↵	jsmall-nvidia
	-dump-repro-on-failure option (#1095) * Small typo fix
2019-10-24	OffsetContainer serialization (#1093)	jsmall-nvidia
	* OffsetContainer with unit tests. * State serialization working with OffsetContainer. * Fixes to make work with OffsetContainer. * Added OffsetContainer documentation. * Remove RelativeContainer. * Fix problem with + on Offset32Ptr on windows x86 target. * * Made OffsetBase a base class of OffsetContainer. * Added MemoryOffsetBase to just handle being a chunk of memory. * * Use operator[] to access contents of OffsetContainer * Fix the type hash to work across different size_t sizes. * Fixed some Offset type related comments. * Fix bug around using asBase, because it returns a reference just using 'auto' will means it becomes a value type. Remove assignment and copy ctor from OffsetBase. * Evaluation order of assignment can lead to wrong behavior with Offset32Ptr/raw pointers. Document the fact, and fix in StateSerializeUtil.
2019-10-23	Expose more repro in API, support output params. (#1087)	jsmall-nvidia
	* Added spEnableReproCapture to the API. * Added MemoryStreamBase - which can be used to read from without copyin the data. Added the missing Repro API functions - spEnableReproCapture and spExtractRepro. Added support for serializing output filenames. * Improved naming around Stream. Brought Stream and sub types closer to code conventions. * Renamed content -> contents in Stream.
2019-10-21	`Repro` functionality (#1085)	jsmall-nvidia
	* WIP on serialize/save state. * Relative string encoding. * Added RelativeContainer unit test. Split out RelativeContainer into core. * Fix bug in RelativeString encoding. * More work around relative container. * Fix checks. * Use RelativeBase for safe access. Use malloc/free/realloc instead of List. * Add natvis support for relative types. * Setting up of state (not includes) writing of repro state. * Capture after spCompile. * Writing SourceFile and file system files. Added -dump-repo * First pass at loading state. * First pass at reading repro. * Small optimization around Safe32Ptr * Refactor how repro data is stored - to make saving off the files more simple, by having all all backed by 'files'. Make file loading always set up PathInfo so we get uniqueIdentifier info. * Generate unique file names. * Added RelativeFileSystem Added saveFile to ISlangFileSystemExt and implemented for interfaces Added mechanism to save of files (and manifest) * Added ability to replace files in repo with directory holding their contents. * Add support for entry points. * Fix problem compiling on linux. * Added SIMPLE_EX option, where everything on command line must be specified. * Fix typo in unit test for relative container. * Fix another typo in unit test for RelativeContainer. * Fix small bugs. * Fix release unused variable issue in slang-state-serialize.cpp * Fix checking for SIMPLE_EX in testing, else broke COMMAND_LINE_SIMPLE. * Fix warnings on 32 bit debug build. * Added import-subdir-search-path-repro.slang test. Although disabled for now as writes to root of slang project. * Remove wrong version of import-subdir-search-path-repro.slang * Added import-subdir-search-path-repro.slang
2019-10-17	Initial work on representing layout at IR level (#1079)	Tim Foley
	* Initial work on representing layout at IR level This change starts the process of making the back-end of the compiler independent of the AST-level layout information (`TypeLayout`, `VarLayout`, etc.) so that it instead only relies on layout information that is embedded into IR modules. This brings us incrementally closer to a world in which the back-end could be run without the AST-level structures even existing (e.g., for an application that just wants to ship IR without any AST information for IP protection, while still supporting some amount of linking and specialization). The main parts of the change are: * There is a bunch of incidental churn related to specifying entry points by index instead of the `EntryPoint` object for certain operations. This ends up being a better choice because we can use the index to look up side-band information about the entry point that might not be stored on the `EntryPoint` object itself. In particular... * We expand the `ComponentType` interface to support looking up the mangled name of an entry point by index. In common cases (no generic/interface specialization) this would be the same as asking the `EntryPoint` for its mangled name, but in cases where we have specialized a generic entry point, the mangled name would include speicalization arguments that are only available on the `SpecializedComponentType` that wraps the entry point. This part of the change isn't ideal and there might be a better solution waiting to be invented. Note that we store mangled entry point names as strings rather than using `DeclRef`s because that ensures that the information could be serialized and deserialized without a dependence on the AST. * The `TargetProgram` type (which represents binding a specific `ComponentType` for a shader program to a specific `TargetRequest` that represents the target platform) is expanded to include an `IRModule` that represents layout information, in addition to the AST-level `ProgramLayout` it already contained. We create both of these objects at the same time (on-demand) to simplify the overall flow (so that any code that triggers creation of the AST-level layout will also ensure that the IR-level layout exists). * A bunch of code in the emit passes that was passing down layout-related objects has been eliminated. It appears that most of those objects weren't actually being used, so this is just a cleanup, but it helps ensure that the back-end steps are "clean" and don't depend on the AST-level information. The one big exception here is that the emit logic needs to know the stage for the entry point being emitted (to deal with one wrinkle in translating DXR to VKRT). * A big change (actually introduced by @jsmall-nvidia in a branch that this change copied and then built from) is to introduce some more explicit IR instructions to represent layout information, notably an `IRTypeLayout` and an `IRVarLayout`. For now these objects still reference their AST equivalents, but the separation gives us an incremental path to move information from the AST-level objects over to the IR ones. This work includes logic in `IRBuilder` to construct the IR-level layout objects from the AST-level ones on-demand, so that the existing code paths that try to attach AST-level layout will continue to work for now. * Because layout information is now embedded in the IR, the `slang-ir-link.cpp` logic loses a lot of cases that used to deal with attaching AST-level layout objects to IR-level instructions during the linking process. Instead, the linker now assumes that one (or more) of the input IR modules will have layout information associated with it, and the linker makes sure to copy layout decorations (and the instructions they reference) from the input IR module(s) to the output using its more ordinary mechanisms. * Inside `slang-lower-to-ir.cpp`, we add logic to construct an IR module in a `TargetProgram` that simply references the global shader parameters, entry points, etc. and attaches IR layout decorations to them. This is akin to the existing pass in the same file that constructs IR to represent specialization information, and both of these passes share infrastructure with the main AST->IR lowering pass. Eventually, it is expected that this pass will encompass more of the logic for copying AST-level layout information over to IR-level equivalents. * One small wrinkle with this change was that the output for an HLSL generation test case changed some of its `#line` directives. The old code was actually more inaccurate than the new, so this change just updated the baseline. It also added some logic in the linker to make sure that when an IR instruction has multiple definitions, we try to pick up a source location from any of them, in case the "main" one somehow didn't get a location. * Another small fix was that the key/value map in `StructTypeLayout` for mapping fields/members to their layouts was keyed on `Decl` when it really should have been `VarDeclBase`. This change should in principle be a pure refactoring with no functionality changes, so no new tests were added. It is unfortunately also a change that has a high probability of breaking at least some client code, so we may want to be defensive and mark this with a new major version number (well, a new minor version number since we are pre-`1.0`) to give us some room for releasing hotfixes to the old version if needed. * fixup: infinite recursion bug detected by clang * fixup: remove commented-out code
2019-09-23	Simple test profiling (#1062)	jsmall-nvidia
	* First pass support for performance profiling * Test across all elements * Fix bug - sourceContents is not used, should use rawSource. * * Add ability to get prelude from API. * Allow specifying source language for render-test * Made it possible to compile a test input file as C++ * Special handling for reflection * Added C++ impl to performance-profile.slang * Remove some clang warnings. * Output profile timings on appveyor and other TC. * Remove passing around of StdWriters (can use global). Small comment improvements.
2019-09-13	Revisions to "new" Slang API based on use in Falcor (#1052)	Tim Foley
	* Revisions to "new" Slang API based on use in Falcor As I've been integrating the new/revised Slang API (using the "COM-lite" interfaces) I've run into some cases where the API was either missing features or didn't really work as originally implemented. This change fixes the gaps/problems that came up. There are two main things here: 1. Some of the routines that returned an `IComponentType` as a function result weren't actually doing anythign to retain the object they returned (e.g., putting it into a cache). Leaving aside the question of whether we need to add that caching layer, it made sense to instead have the return be through an output argument. Discussion after the initial iteration of the COM-lite API came around to the point that properly reference-counting objects that get returned would be useful if we ever decide we don't like having ever-expanding memory usage for caches of specialized/composed component types. 2. There was no way with the existing API to get at an `IComponentType` that represents an entry point produced during compilation, so that a user could include it in their own composition. This change alters `spCompileRequest_getProgram` to return the global program without* the entry points, and adds a separate `spCompileRequest_getEntryPoint`. This design lets an application compose whatever combination/layout they want, rather than being stuck with a pre-designed composition baked into the compiler. * fixup: review feedback
2019-08-28	Support for getting git version from IGlobalSession (#1040)	jsmall-nvidia
	* Added slang-tag-version.h and travis code to generate the file. * Generate slang-tag-version.h on appveyor. * Move where slang-tag-version.h is generated on appveyor. * Dump slang-tag-version.h to console on travis. * Cat slang-tag-version.h * Added method getBuildTagVersion to IGlobalSession. Added -v option.
2019-08-20	User defined downstream compiler prelude (#1028)	jsmall-nvidia
	* Added setDownstreamCompilerPrelude Renamed setPassThroughPath to setDownstreamCompilerPath. Fixed tests. Added prelude directory & code to TestToolUtil to setup default preludes for testing/command line apis. * Fix merge problem * Remove hacks to make prelude work by adding a search path as no longer needed with 'user prelude'. * Split up prelude into scalar intrinsics, and types. Use slang.h for main header. slang-cpp-prelude.h can now just include what it needs (relative to prelude directory) and define the few remaining things/work arounds. * Fix typo.
2019-08-15	A more convoluted #pragma once file identity test, using relative paths. (#1021)	jsmall-nvidia
	* A more convoluted #pragma once file identity test, using relative paths. * Fix bug with passing - to slang as a command line option causes a crash. Ability to set file-system to use on command line. #pragma once tests try with 'normal' and 'read-file' only versions * OSFileSystem -> OSFileSystemExt LoadFileOSFileSystem -> OSFileSystem Implemented OSFileSystem like OSFileSystemExt as as singleton. Fixes to comments.
2019-08-12	Ability to set Paths on Pass Through/Back End Compilers (#1010)	jsmall-nvidia
	* Expanded prelude for some other resource types. Disable C++ output for ParameterGroup. * WIP: Layout for CPU. * Fixes to CPU layout. * WIP: The uniform is output, but the variable definition is not. * WIP: Entry point parameters to global scope in C++. Handling of resource types (in so far as outputting) * Some discussion of ABI and different input types. * WIP: More C++ support around resource types. * WIP: Split up variables into different structures on emit. * WIP: Emitting C++ with wrapping up of 'Context' * WIP: C++ code has access to semantic values. Wrap in struct so can use method calls to pass shared state. Disable legalizeResourceTypes and legalizeExistentialTypeLayout * Fix structured buffer layout for CPU. * Remove testing/handling of global uniforms on CPU path. Typo fix. Changed CPU tests to use new CPU calling convention. * Check globals are working. Initalize context to zero globals. * Order the global parameters for C++ ouput by their layout. Note - that layout isn't quite working correctly because the StructuredBuffer<int> the int seems to be consuming uniform space. * Work around for reflection not having all data needed for layout ordering for C++ code. * Output constant buffers as pointers. * Entry point parameters accessed through pointer to struct. * WIP: Layout for CPU is reasonable for test case. * Only output 'f' after float literal if type marks as a float. * Cast construction works on C++. * Made IntrinsicOp::ConvertConstruct to make intent clearer. * C++ handling construction from scalar. Handle access of a scalar with .x. Check default initialization. * Comment about need for split of kIROp_construct. Release build works. * Added support from constructVectorFromScalar to C/C++ target. * Handling of in/out in C/C++. * First pass documentation CPU support. * Improvements to C++/C slang code generation documentation. * Small doc change to include need for mechansim to specify cpp compiler path. * WIP: Being able to set path for backends. * Better handling of swizzling - allow swizzling a scalar into a vector. * Fix missing/broken headers for path setting on session. * Fix for compiling using clang on Windows. * Remove Clang test code. * * Removed spSessionGetGlobalSession - no longer needed because SlangSession is slang::IGlobalSession alias. * Gave Session a ref count on spCreateSession, and have it checked on spDestroySession, so behaves correctly as ISlangUnknown Note that spDestroySession does a release (and checks the ref count on debug builds). It's behaviour could be the same as just release, but this seems closer to the original intention.
2019-08-12	Callable CPU code support (#1014)	jsmall-nvidia
	* First pass support for compiling to a loaded shared library. * Improve documentation for cpu target. * Removed the SLANG_COMPILE_FLAG_LOAD_SHARED_LIBRARY flag. Use the SLANG_HOST_CALLABLE code target Document mechanism. * Fix typo in cpp-resource.slang In test code if the target is 'callable' we don't need to compile (indeed there is no source file). * Small refactor using CommandLineCPPCompiler as base class to implement VisualStudioCPPCompiler and GCCCPPCompiler. * Improvements around CPPCompiler. Mechanism to know products produced. Cleaning up products after execution. * Fix multiple definition of 'SourceType'
2019-08-08	Make SlangSession an alias for slang::IGlobalSession (#1011)	jsmall-nvidia
	* Make SlangSession an alias for slang::IGlobalSession. Use asInternal/asExternal for casting in slang.cpp Special case handling of asInternal which argument dependent lookup doesn't find. * To improve implementation asInternal/asExternal in slang.cpp - tried making the internal impls forward to the Slang:: implementations. This caused ambiguities (for example when a function has using namespace Slang in it). To avoid these problems and make it clear where implementation is coming from use Slang::asInternal and removed the forwarding functions. Made the impls SLANG_FORCE_INLINE instead of inline. * Made asInternal/asExternal use SLANG_FORCE_INLINE uniformly.
2019-08-08	Revise new COM-lite API (#1007)	Tim Foley
	* Revise new COM-lite API This change revises the "COM-lite" API that was recently introduced to try to streamline it and introduce some missing central/base concepts. The central new abstraction in the API is the notion of a "component type," which is a unit of shader code composition. A component type can have: * IR code for some number of functions/types/etc. * Zero or more global shader parameters * Zero or more "entry point" functions at which execution can start * Zero or more "specialization" parameters (types or values that must be filled in before kernel code can be generated) * Zero or more "requirements" (dependencies on other component types that must be satisfied before kernel code can be generated) Both individual compiled modules, and validated entry points are then examples of component types, and we additionally define a few services that apply to all component types: * We can take N component types and compose them to create a new component type that combines their code, shader parameters, entry points, and specialization parameters. A composed component type may also include requirements from the sub-component types, but it is also possible that by composing thing we satisfy requirements (if `A` requires `B`, and we compose `A` and `B`, then the requirement is now satisfied, and doesn't appear on the composite). * We can take a component type with N specialization parameters, and specialize it by giving N compatible specialization arguments. The result of specialization is a new component type with zero specialization parameters. Under the right circumstances the specialzed component type will be layout compatible with the unspecialized one. * One more example that isn't exposed in the public API today is that we can take a component with requirements and "complete" it by automatically composing it with component types that satisfy those requirements. This can be seen as a kind of linking step that pulls together the transitive closure of dependencies. * We can query the layout for the shader parameters and entry points of a component type, for a specific target. * We can query compiled kernel code for an entry point in a component type (for a specific target). This only works for component types with zero specialization parameters and zero requirements. The idea is that by giving users a fairly general algebra of operations on component types, they can compose final programs in ways that meet their requirements. For example, it becomes possible to incrementally "grow" a component type to represent the global root signature for ray tracing shaders as new entry points are added, in such a way that it always stays layout-compatible with kernels that have already been compiled. Much of the implementation work here is in implementing the unifying component type abstraction, and in particular re-writing code that used to assume a program consisted of a flat list of modules and entry points to work with a hierarchical representation that reflects the underlying algebra (e.g., with types to represent composite and specialized component types). There's also a hidden "legacy" case of a component type to deal with some legacy compiler behaviors that can't be directly modeled on top of the simple algebra with modules and entry points. This API is by no means feature-complete or fully developed. It is expected that we will flesh it out more when bringing up application code (e.g., Falcor) on top of the revamped API. One notable thing that went away in this change is explicit support for "entry point groups" and notions of local root signatures (especially the Falcor-specific handling of the `shared` keyword, which a previous change turned into an explicitly supported feature). With the new "building blocks" approach, it should be possible for a DXR application to deal with local root signatures as a matter of policy (on top of the API we provide). If/when we need to provide some kind of emulation of local root signatures for Vulkan (and/or if Vulkan is extended with an explicit notion of local root signatures), we might need to revisit this choice. * Fix debug build There was invalid code inside an `assert()`, so the release build didn't catch it. * fixup: warnings * fixup: more warnings-as-errors * fixup: review notes * fixup: use component type visitors in place of dynamic casting
2019-06-19	Start exposing a new COM-lite API (#987)	Tim Foley
	* Start exposing a new COM-lite API This change is mostly about exposing a new API to the Slang compiler that allows more fine-grained control over the compilation flow. The basic concepts in the new API are: * An `IGlobalSession` is the granularity at which we load/parse the Slang stdlib, and therefore gives applications a way to amortize startup cost for the library across multiple compiles. This is a concept that might be able to go away in a future version of Slang. * An `ISession` owns all the code that gets loaded/compiled/generated. Any `import`ed modules are shared across everything in a session (we don't re-parse/-check the code when we see another `import` for the same module). Any generic- or interface-based code in the session can be specialized using types from the same session (but not necessarily across sessions). * An `IModule` is the unit of code loading and scoping. It doesn't expose any API in this change, but would be the right scope for looking up types or entry points by name. * An `IProgram` is a "linked" combination of modules and entry points from which code can be generated and reflection information queried. This change re-uses the existing reflection API types, rather than introduce a new API that duplicates that functionality. That will probably change in a future revision. There are two major pieces of functionality added here that aren't related to the new API: * We now have an API concept of "entry point groups" which are one or more entry points that are intended to be used together so that they need to have non-overlapping parameters. For now this is being used to handle "hit groups" and local root signatures for ray tracing, but I'm not sure this is a concept we will keep in the long run. * We have a very special-case (client-application-specific) flag that ascribes special meaning to the `shared` keyword, so that it can be attached to global parameters to indicate that they are actually to be part of the local root signature rather than the global one for DXR. None of the API design (including naming) here is finalized; the only reason to check in the changes at this point to avoid having a long-running branch that leads to merge pain. Clients should not try to depend on the new API just yet, since it is still a work in progress. * fixup: clang warning * fixup: try to detect clang C++11 support * fixup * fixup * fixup * fixup * fixup: review feedback
2019-05-31	Use slang- prefix on slang compiler and core source (#973)	jsmall-nvidia
	* Prefixing source files in source/slang with slang- * Prefix source in source/slang with slang- prefix. * Rename core source files with slang- prefix. * Update project files. * Fix problems from automatic merge.
2019-05-20	Changes required for application adoption of interface-type parameters (#963)	Tim Foley
	* A few changes required for application adoption of interface-type parameters There are a few small changes here that are all related in that they arose from trying to integrate support for specialization via global interface-type shader parameters into a real application. Allow querying the "pending" layout via reflection API ------------------------------------------------------ The naming here isn't ideal, and could probably use a round of "bikeshedding" to arrive at something better, but the basic idea is that when you have a type like: ``` struct MyStuff { int a; IFoo foo; int b; } ``` the fields `a` and `b` get allocated space directly in the "primary" layout for `MyStuff` (at offsets 0 and 4, with `sizeof(MyStuff) == 8`), but the `foo` field can't be allocated space until we know what concrete type will get plugged in there. If we have a concrete type in mind: ``` struct Bar : IFoo { int bar; } ``` then we can know how much space the `foo` field will take up, but we still can't allocate it space directly in `MyStuff`, because we already decided that `sizeof(MyStuff) == 8`. Now imagine we place some `MyStuff` values into constant buffers: ``` cbuffer X { MyStuff x; } cbuffer Y { MyStuff y; float4 z; } ``` In each case we know that we want to place the `MyStuff::foo` field at the end of the containing constant buffer so that it doesn't disrupt the layout of the existing fields. But that means that the offset of `MyStuff::foo` relative to the start of the `MyStuff` isn't fixed, because of unrelated fields like `z` that need to get in between. In our layout code, we handle this by having a notion of a "pending" layout. Once we know how `MyStuff::foo` will be specialized, we can compute both a "primary" and a "pending" layout for `MyStuff`, which basically treats it as if it were two distinct types: ``` struct MyStuff_Primary { int a; int b; } struct MyStuff_Pending { Bar foo; } ``` Layout for an aggregate type like the `X` or `Y` constant buffer then proceeds by computing an aggregate primary layout and an aggregate pending layout, and then finally a constant buffer or parameter block "flushes" all or part of the pending data by appending it to the primary data to get the final layout. What all this means is that a type like `MyStuff` will have two different layouts (a default one for the primary data and a "pending" one for any specialized interface-type fields), and a variable like `Y::y` will also have two variable layouts that specify offsets (one set of offsets for its primary part, and one set of offsets for its pending part). In order to handle interface-type fields with these layout rules, an application needs a way to query the "pending" part of a type or variable layout, which luckily gives it back just another type/variable layout. The API change here is minimal, although actually exploiting the new API correctly in application code could prove challenging. Allow creating of explicitly specialized types ---------------------------------------------- This feature isn't actually implemented all the way through the compiler (I just needed enough to make the API calls go through), but I've added support for specializing a type that has interface-type fields through the reflection API. This maps to an `ExistentialSpecializedType` in the AST, and I'm lowering it to the IR as a `BindExistentialsType`, although that isn't 100% correct for the future. This feature will require a future PR to actually flesh out the implementation work, but I'll wait until that is the sticking point on the application side before I do that. Introduce a tiny `Hasher` abstraction ------------------------------------- While implementing all the boilerplate for a new `Type` subclass (we really need to reduce that work...), I got fed up with how we do hash-code computation and introduced a small utility `Hasher` type that is intended to wrap up the idiom of combining hashes. For now this isn't a major change, but in the future I'd like to expand on the design a bit to clean up some of the warts around how we handle hashing: * The `Hasher` implementation can and should switch from maintaining a single `HashCode` as its state to something that contains a more complete state (larger than the hash code) and just hashes new bytes into that state as it goes. This should make it possible to implement a `Hasher` for more serious hash functions, whether MD5, CityHash, or whatever we decide is good default. * Things that are hashable shouldn't have a `getHashCode()` method, but instead should have something like a `hashInto(Hasher&)` method. This change would have the dual benefits that (1) a composite type can easily hash all the fields that contribute to its identity into the hasher with minimal fuss/boilerplate, and (2) the hashes for composite types will be of higher quality because they can exploit all the bits of the hasher's state to combine the fields, instead of restricting each sub-field to just the bits in a hash code. We should be able to incrementally improve the quality of our design there over future changes, but for now it probably isn't a critical priority. Fixes for legalization of existential types ------------------------------------------- There were some missing cases in the handling of type legalization, such that a global interface-type shader parameter that got specialized to a type that contains only resource-type fields would cause a crash in the legalization step. I added a test for this case, and then made `ir-legalize-types.cpp` account for this case (the code to handle it ias a bit of a kludge, and shows that the `declareVars()` routine there is getting to a level of complexity that is worrying. * fixup: review feedback
2019-04-29	String/List closer to conventions, and use Index type (#959)	jsmall-nvidia
	* List made members m_ Tweaked types to closer match conventions. * Use asserts for checking conditions on List. Other small improvements. * List<T>.Count() -> getSize() * List<T> Add -> add First -> getFirst Last -> getLast RemoveLast -> removeLast ReleaseBuffer -> detachBuffer GetArrayView -> getArrayView * List<T>:: AddRange -> addRange Capacity -> getCapacity Insert -> insert InsertRange -> insertRange AddRange -> addRange RemoveRange -> removeRange RemoveAt -> removeAt Remove -> remove Reverse -> reverse FastRemove -> fastRemove FastRemoveAt -> fastRemoveAt Clear -> clear * List<T> FreeBuffer -> _deallocateBuffer Free -> clearAndDeallocate SwapWith -> swapWith * List<T> SetSize -> setSize Reserve -> reserve GrowToSize growToSize * UnsafeShrinkToSize -> unsafeShrinkToSize Compress -> compress FindLast -> findLastIndex FindLast -> findLastIndex Simplify Contains * List<T> Removed m_allocator (wasn't used) Swap -> swapElements Sort -> sort Contains -> contains ForEach -> forEach QuickSort -> quickSort InsertionSort -> insertionSort BinarySearch -> binarySearch Max -> calcMax Min -> calcMin * Initializer::Initialize -> initialize List<T>:: Allocate -> _allocate Init -> _init IndexOf -> indexOf * * Put #include <assert.h> in common.h, and remove unneeded inclusions * Small refactor of ArrayView - remove stride as not used * getSize -> getCount setSize -> setCount unsafeShrinkToSize->unsafeShrinkToCount growToSize -> growToCount m_size -> m_count * Some tidy up around Allocator. * Use Index type on List. * Refactor of IntSet. First tentative look at using Index. * Made Index an Int Did preliminary fixes. Made String use Index. * Partial refactor of String. * String::Buffer -> getBuffer ToWString -> toWString * Small improvements to String. String:: Buffer() -> getBuffer() Equals() -> equals * Try to use Index where appropriate. * Fix warnings on windows x86 builds.
2019-03-12	Add options to control optimization and debug information (#897)	Tim Foley
	The short version for command-line users is: * Use `-g` to get debug info in the output, where supported * Use `-O0` to disable optimizations, in case that improves debugability * Use `-O2` for optimized/release builds where you can spend the extra compile time The command-line options are matched with new API functions `spSetDebugInfoLevel()` and `spSetOptimizationLevel()` that set the equivalent information. Right now these settings only affect how we invoke fxc and dxc. In the longer run I expect we will want to use them to control other things: * Once we are emitting our own SPIR-V, the `-g` option should control what source-level name information we include in it. * Whether or not `-g` is used could be used to decide whether we preserve the "name hints" in the IR, which in turn decide whether we output GLSL/HLSL source that uses names based on the original program. * We will eventually need/want to include some amount of optimization passes on the Slang IR, and the `-O` options should control which of those passes are enabled on a particular invocation. In this change I decided to expose the options at the level of the entire compile request for API users, and to store the actual information on the Linkage. We might want to revisit this decision and instead allow for the level of optimization to be chosen per-target as part of back-end state. Similarly, we might want to have more fine-grained control over the level of debug output per-target (although we'd still need a front-end setting to determine what debug info is generated into the Slang IR).
2019-03-10	Fix `spReflection_FindTypeByName` (#891)	Yong He

2019-03-08	Improve support for interfaces as shader parameters (#886)	Tim Foley
	* Improve support for interfaces as shader parameters This change adds two main things over the existing support: 1. It is now possible to plug in concrete types that actually contain (uniform/ordinary) fields for the existential type parameters introduced by interface-type shader parameters. The `interface-shader-param2.slang` test shows that this works. 2. There is a limited amount of support for doing correct layout computation and generating output code that matches that layout, so that interface and ordinary-type fields can be interleaved to a limited extent. The `interface-shader-param3.slang` test confirms this behavior. There are several moving pieces in the change. * When it comes to terminology, we try to draw a more clear distinction between existial type parameters/arguments and existential/object value parametes/arguments. A simple way to look at it is that an `IFoo[3]` shader parameter introduces a single existential type parameter (so that a concrete type argument like `SomeThing` can be plugged in for the `IFoo`) but introduces three existential object/value parameters (to represent the concrete values for the array elements). * At the IR level, we support a few new operations. A `BindExistentialsType` can take a type that is not itself an interface/existential type but which depends on interfaces/existentials (e.g., `ConstantBuffer<IFoo>`) and plug in the concrete types to be used for its existential type slots. * Then a `wrapExistentials` instruction can take a type with all the existentials plugged in (possibly by `BindExistentialsType`) and wrap it into a value of the existential-using type (e.g., turn `ConstantBuffer<SomeThing>` into a `ConstantBuffer<IFoo>`). * The IR passes for doing generic/existential specialization have been updated to be able to desugar uses of these new operations just enough so that a `ConstantBuffer<IFoo>` can be used. * When we specialize an IR parameter of an interface type like `IFoo` based on a concrete type `SomeThing`, we turn the parameter into an `ExistentialBox<SomeThing>` to reflect the fact that we are conceptually referring to `SomeThing` indirectly (it shouldn't be factored into the layout of its surrounding type). * Parameter binding was updated so that it passes along the bound existential type arguments in a `Program` or `EntryPoint` to type layout, so that we can take them into account. The type layout code needs to do a little work to pass the appropriate range of arguments along to sub-fields when computing layout for aggregate types. * Type layout was updated to have a notion of "pending" items, which represent the concrete types of data that are logically being referenced by existential value slots. The basic idea is that these values aren't included in the layout of a type by default, but then they get "flushed" to come after all the non-existential-related data in a constant buffer, parameter block, etc. * The logic for computing a parameter group (`ConstantBuffer` or `ParameterBlock`) layout was updated to always "flush" the pending items on the element type of the group, so that the resource usage of specialized existential slots would be taken into account. * The type legalization pass has been adapted so that we can derive two different passes from it. One does resource-type legalization (which is all that the original pass did). The new pass uses the same basic machinery to legalize `ExistentialBox<T>` types by moving them out of their containing type(s), and then turning them into ordinary variables/parameters of type `T`. Big things missing from this change include: - Nothing is making sure that "pending" items at the global or entry-point level will get proper registers/bindings allocated to them. For the uniform case, all that matters in the current compiler is that we declare them in the right order in the output HLSL/GLSL, but for resources to be supported we will need to compute this layout information and start associating it with the existential/interface-type fields. - Nothing is being done to support `BindExistentials<S, ...>` where `S` is a `struct` type that might have existential-type fields (or nested fields...). Eventually we need to desugar a type like this into a fresh `struct` type that has the same field keys as `S`, but with fields replaced by suitable `BindExistentials` as needed. (The hard part of this would seem to be computing which slots go to which fields). As a practial matter, this missing feature means that interface-type members of `cbuffer` declarations won't work. The current tests carefully avoid both of these problems. They don't declare any buffer/texture fields in the concrete types, and they don't make use of `cbuffer` declarations or `ConstantBuffer`s over structure types with interface-type fields. * fixup: add override to methods * fixup: typos
2019-03-05	* Fix issue with dependency including source path - even if source was ↵	jsmall-nvidia
	compiled from a string (#878) * Added FromString Type to PathInfo to identify paths that are not from 'files'
2019-03-02	#include not using search paths (#873)	jsmall-nvidia
	* Fix warnings from visual studio due to coercion losing data. * Removed searchDirectories from FrontEndCompileRequest and use the one in Linkage as that is the one that is changed via Slang API. * * Add searchPaths back to FrontEndRequest * Add comments to explain the issue * Add a test to check include paths
2019-02-27	Hotfix/fix stdlib error reporting (#866)	jsmall-nvidia
	* * Add 'identity' version of bit casts (asint, asuint, asfloat) for scalar and vector * Added identity bit casts for matrix (cos no op). We don't support matrix asint on glsl targets * Added tests in bit-cast.slang * Use kIRPseudoOp_Pos for identity asuint/asint/asfloat casts. * * Stop crash if error in stdlib * Use the buildin source manager when compiling stdlib - fixes that line numbers are displayed * Typo fix * Output line directives for 'meta' slang souce files into stdlib * Improve comments and function names.
2019-02-19	First steps toward supporting interface-type parameters on shaders (#852)	Tim Foley
	* First steps toward supporting interface-type parameters on shaders What's New ---------- From the perspective of a user, the main thing this change adds is the ability to declare top-level shader parameters (either at global scope, or in an entry-point parameter list) with interface types. For example, the following becomes possible: ```hlsl // Define an interface to modify values interface IModifier { float4 modify(float4 val); } // Define some concrete implementations struct Doubler : IModifier { float4 modify(float4 val) { return val + val; } } struct Squarer : IModifier { ... } // Define a global shader parameter of interface type IModifier gGlobalModifier; // Define an entry point with an interface-type `uniform` parameter void myShader( unifrom IModifier entryPointModifier, float4 inColor : COLOR, out float4 outColor : SV_Target) { // Use the interface-type parameters to compute things float4 color = inColor; color = gGlobalModifier.modify(color); color = entryPointModifier.modify(color); outColor = color; } ``` The user can specialize that shader by specifying the concrete types to use for global and entry-point parameters of interface types (e.g., plugging in `Doubler` for `gGlobalModifier` and `Squarer` for `entryPointModifier`). The "plugging in" process is done in terms of a concept of both global and local "existential slots" which are a new `LayoutResourceKind` that represents the holes where concrete types need to be plugged in for existential/interface types. In simple cases like the above, each interface-type parameter will yield a single existential slot in either the global or entry-point parameter layout. Users can query the start slot and number of slots for each shader parameter, just like they would for any other resource that a parameter can consume. Before generating specialized code, the user plugs in the name of the concrete type they would like to use for each slot using `spSetTypeNameForGlobalExistentialSlot` and/or `spSetTypeNameForEntryPointExistentialSlot`. There are some major limitations to the implementation in this first change: * Parameters must be of interface type (e.g., `IFoo`) and not an array (`IFoo[3]`), or buffer (`ConstantBuffer<IFoo>`) over an interface type. Similarly, `struct` types with interface-type fields still don't work. * The work on interface-type function parameters still doesn't include support for `out` or `inout` parameters, nor for functions that return interface types (that isn't technically related to this change, but affects its usefullness). * No work is being done to correctly lay out shader parameters once the concrete types for existential slots are known, so that this change really only works when the concrete type that gets plugged in is empty. These limitations are severe enough that this feature isn't really usable as implemented in this change, and this merely represents a stepping stone toward a more complete implementation. Implementation -------------- The API side of thing largely mirrors what was already done to support passing strings for the type names to use for global/entry-point generic arguments, so there should be no major surprises there. The logic in `check.cpp` computes the list of existential slots when creating unspecialized `Program`s and `EntryPoint`s (this is logically the "front end" of the compiler), and then checks the supplied argument types against what is expected in each slot when creating specialized `Program`s and `EntryPoint`s. This again mirrors how generic arguments are handled. Type layout was extended to compute the number of existential slots that a type consumes, and will thus automatically assign ranges of slots to top-level and entry-point shader parameters in the same way it already allocates `register`s and `binding`s. The big missing feature is the ability to specialize a layout to account for the concrete types plugged into the existential-type slots. IR generation for specialized programs and entry points was slightly extended so that it attaches information about the concrete types plugged into the existential slots, and the witness tables that show how they conform to the interface for that slot. The linking step needed some small tweaks to make sure that information gets copied over to the target-specific program when we start code generation. The meat of the IR-level work is in `ir-bind-existentials.cpp`, which takes the information that was placed in the IR module by the generation/linking steps and uses it to rewrite shader parameters. For example, if there is a shader parameter `p` of type `IModifier`, and the corresponding existential slot has the type `Doubler` in it, we will rewrite the parameter to have type `Doubler`, and rewrite any uses of `p` to instead use `makeExistential(p, /witness that Doubler conforms to IModifier/)`. Once the replacement is done on the parameters, the existing work for specializing existential-based code when the input type(s) are known kicks in and does the rest. Testing ------- A single compute test is added to validate that this feature works. It is narrowly tailored to not require any of the features not supported by the initial implementation (e.g., all of the concrete types used have no members). The test case does include use of an associated type through one of these existential-type parameters, which has exposed a subtle bug in how "opening" of existential values is implemented in the front-end. Rather than fix the underlying problem, I cleaned up the code in the front-end to special case when the existential value being opened is a variable bound with `let`, to directly use a reference to that variable rather than introduce a temporary. Similarly, in the IR generation step, I added an optimization to make variables declared with `let` skip introducing an IR-level variable and just use the SSA value of their initializer directly instead. * fixup: missing files * fixup: incorrect type for unreachable return * fixup: actually comment ir-bind-existentials.cpp
2019-02-15	Split front- and back-ends (#846)	Tim Foley
	* Split front- and back-ends This change is a major refactor of several of the types that provide the behind-the-scenes implementation of the public C API. The goal of this refactor is primarily to allow for future API services that let the user operate both the front- and back-ends of the compiler in a more complex fashion. For example, as user should be able to compile a bunch of source code into modules, look up types, functions, etc. in those modules, specialize generic types/functions to the types they've looked up, and then finally request target code to be gernerated for specialized entry points. The back-end code generation they trigger should re-use the front-end compilation work (parsing, semantic checking, IR generation) that was already performed. The most visible change is that `CompileRequest` has been split up into several smaller types that take responsibility for parts of what it did: * The `Linkage` type owns the storage for `import`ed modules, and well as the `TargetRequest`s that represent code-generation targets. The intention is that an application could use a single `Linkage` for the duration of its runtime (so long as it was okay with the memory usage), so that each `import`ed module only gets loaded once. For now, this type needs to manage the search paths, file system, and source manager, because of its responsibility for loading files. * A `FrontEndCompileRequest` owns the stuff related to parsing, semantic checking, and initial IR generation. This most notably includes the `TranslationUnitRequest`s and the `FrontEndEntryPointRequest`s (which used to be just `EntryPointRequest`s). It's main job is to produce AST and IR modules for each translation unit, and to find and validate the entry points. The front-end request does not interact with generic arguments for global or entry-point generic parameters. * The main output of both `import` operations and front-end translation units is the `Module` type, which is just a simple container for both the AST module (to service the reflection/layout APIs, and also for semantic checking of code that `import`s the module) and the IR module (for linking and code generation). This type captures the commonalities between the old `LoadedModule` (which is now just an alias for `Module`) and `TranslationUnitRequest` (which now owns a `Module`). * The secondary output of front-end compilation is a `Program`, which comprises a list of referenced `Module`s and validated `EntryPoint`s that will be used together. Layout and code generation both need a `Program` to tell them what modules and entry points will be used together (we don't want to just code-gen everythin that has ever been loaded into the linakge). The `Program`s created by the front-end do not include generic arguments, so they may provide incomplete layout information and/or be unsuitable for code generation. * A `BackEndCompileRequest` owns stuff related to turning a `Program` into output kernels for the targets of a `Linkage`. Most of the data it owns beyond the `Program` to be compiled is minor, so this is a good candidate for demotion from a heap-allocated object to just a `struct` of options that gets passed around. * The `CompileRequestBase` type is an attempt to wrap up the common functionality of both front-end and back-end compile requests. Most of it is just exposing the availability of a linkage and `DiagnosticSink`, so this type is a good candidate for subsequent removal. The main interesting thing it has is the flags related to dumping and validation of IR, so there is probably a good refactoring still to be made around deciding how options should be handled going forward. * Behind the scenes, the `Program` type is set up to handle some level of on-line compilation and layout work. The `Program` knows the `Linkage` it belongs to, and allows for a `TargetProgram` to be looked up based on a specific `TargetRequest`. A `TargetProgram` then allows layout information and compiled kernel code to be asked for on-demand, in order to support eventual "live" compilation scenarios. * The `EndToEndCompileRequest` type is a composition/coordination type that replaces the old `CompileRequest` in a way that uses the services of the various other types. It owns a few pieces of state that only make sense in the context of an end-to-end compile (e.g., there is really no way to "pass through" code when the front- and back-ends are run separately) or a command-line compile (everything to do with specifying output paths for files is really just for the benefit of `slangc`, and might even be moved there over time). * One important detail is that the `EndToEndCompilRequest` owns all of the string-based generic arguments for both global and entry-point generic parameters. The logic in `check.cpp` for dealing with those arguments has been heavily refactored to separate out the parsings steps that are specific to end-to-end compilation with string-based type arguments, and the semantic checking steps that result in a specialized `Program` (which can be exposed through new APIs that aren't tied to end-to-end compilation). It is perhaps not surprising that this change had a lot of consequences, so I'll briefly run over some of the main categories of changes required: * I changed the way that global generic arguments are passed via API (use `spSetGlobalGenericArgs` instead of the generic arguments for `spAddEntryPointEx`, which are not just for entry-point generics), which has been a change that we've needed for a long time. This is technically a breaking API change, although we should have very few client applications that care about it. * A bunch of places that used to take "big" objects like `CompileRequest` now just take the sub-pieces they care about (e.g., a function might have only needed a `Linkage` and a `DiagnosticSink`). This makes many subroutines or "context" struct types more generally useful, at the cost of taking more parameters. * In a few cases the conceptually clean separation of the layers breaks down (often for edge-case or compatibility features), and so we may pass along additional objects that are allowed to be null, but are used when present. A big example of this is how the back-end code generation routines accept an `EndToEndCompileRequest` that is optional, and only used to check whether "pass through" compilation is needed. We should probably look into cleaning this kind of logic up over time so that we don't need to violate the apparent separation of phases of compilation. * In cases where separation of layers was being broken for the sake of GLSL features, I went ahead and ripped them out, since all of that should be dead code anyway. * In many cases I increased the encapsulation of data in the core types to help track down use sites and make sure they are following invariants better. * In cases where code was doing, e.g., `context->shared->compileRequest->session->getThing()` I have tried to introduce convenience routines so that the usage site is just `context->getThing()` to improve encapsulation and allow changes to be made more easily going forward. * The `noteInternalErrorLoc` functionality was moved off of the compile request and into `DiagnosticSink`, since that is the one type you can rely on having around when you want to note an internal error. We may consider going forward if (and how) it should reset the counter used for noting locations on internal errors. * A few APIs now take `DiagnosticSink` arguments where they didn't before, and as a result some public APIs need to create `DiagnosticSink`s to pass in, before going ahead and ignoring the messages. In the future there should be variations of these APIs that accept an `ISlangBlob` parameter for the output. fixup: missing include for compilers with accurate template checking (non-VS) * fixup: review feedback
2019-02-04	Feature/view path (#824)	jsmall-nvidia
	* Use 'is' over 'as' where appropriate. * dynamic_cast -> dynamicCast * Replace 'dynamicCast' with 'as' where has no change in behavior/ambiguity. * Replace dynamicCast with as where doesn't change behavior/non ambiguous. * Keep a per view path to the file loaded - such that diagnostic messages always display the path to the requested file. * Add simplifyPath to ISlangFileSystemExt Simplify (if possible) paths that are set on SourceFile and SourcView - doing so makes reading paths simpler. * Fix small typo. * Improve documentation in source for getFileUniqueIdentity * Fix override warning.
2019-01-30	Fixes crashes at source error.	Yong He

2019-01-21	Feature/file unique identity (#789)	jsmall-nvidia
	* * Fix memory bug around expanding va_args - needed buffer to have space for terminating 0 * Fix problem with FileWriter defaults being globals, as memory they allocate, will only be freed after return from main - work around by making StdWriters RefObject derived, and kept in scope such the writers are destroyed before checks for leaks is found * Added SimplifyPathAndHash mode for CacheFileSystem - will simplify the path and see if simplified path is in cache before reading file (limiting amout of underlying file requests) * * Added calcReplaceChar * Renamed DefaultFileSystem to OSFileSystem * Made OSFileSystem convert windows \ to / on linux * Simplified logic for caching in CacheFileSystem. * Added pragma-once-c to add extra test, but also so there is an 'include' directory in preprocessor tests. * Small fixes in pragma once test. * Simplified cache handling path, so that paths/simplified paths area always added. * Improve naming of methods for different caches. * Removed references to 'canonicalPath' and made 'uniqueIdentity' * * Re-add support for canonicalPath to ISlangFileSystem -> not for uniqueIdentifier but as a way to display 'canonicalPath' * Added peliminary support for being able to display verbose paths in a diagnostic * Added 'clearCache' support * Added verbose path support to SourceManager (now needs a ISlangFileSystemExt to do this) * Added support for '-verbose-path' option to slangc and slang-test.
2019-01-21	Path simplification/hash mode, plus bug fixes (#788)	jsmall-nvidia
	* * Fix memory bug around expanding va_args - needed buffer to have space for terminating 0 * Fix problem with FileWriter defaults being globals, as memory they allocate, will only be freed after return from main - work around by making StdWriters RefObject derived, and kept in scope such the writers are destroyed before checks for leaks is found * Added SimplifyPathAndHash mode for CacheFileSystem - will simplify the path and see if simplified path is in cache before reading file (limiting amout of underlying file requests) * * Added calcReplaceChar * Renamed DefaultFileSystem to OSFileSystem * Made OSFileSystem convert windows \ to / on linux * Simplified logic for caching in CacheFileSystem. * Added pragma-once-c to add extra test, but also so there is an 'include' directory in preprocessor tests. * Small fixes in pragma once test. * Simplified cache handling path, so that paths/simplified paths area always added. * Improve naming of methods for different caches.
2019-01-17	Feature/hash for source identity (#786)	jsmall-nvidia
	* * Added COMMAND_LINE_SIMPLE test type * Made how spawning works controllable by paramter/type SpawnType * Made break-outside-loop and global-uniform run command line slangc * calcRelativePath -> calcCombinedPath * Add 64 bit version of GetHash. * Add support for Hash based mode for CacheFileSystem.
2019-01-16	Initial support for dynamic dispatch using "tagged union" types (#772)	Tim Foley
	* Initial support for dynamic dispatch using "tagged union" types Suppose a user declares some generic shader code, like the following: ```hlsl interface IFrobnicator { ... } type_param T : IFrobincator; ParameterBlock<T : IFrobnicator> gFrobnicator; ... gFrobincator.frobnicate(value); ``` and then they have some concrete implementations of the required interface: ```hlsl struct A : IFrobnicator { ... } struct B : IFrobnicator { ... } ``` The current Slang compiler allows them to generate distinct compiled kernels for the case of `T=A` and the case of `T=B`. This means that the decision of which implementation to use must be made at or before the time when a shader gets bound in the application. This change adds a new ability where the Slang compiler can generate code to handle the case where `T` might be either `A` or `B`, and which case it is will be determined dynamically at runtime. This means a single compiled kernel can handle both cases, and the decision about which code path to run can be made any time before the shader executes. This new option is supported by defining a tagged union type. Via the API, the user specifies that `T` should be specialized to `__TaggedUnion(A,B)` (the double underscore indicates that this is an experimental and unsupported feature at present). We refer to the types `A` and `B` here as the "case" types of the tagged union. Conceptually, the compiler synthesizes a type something like: ```hlsl struct TU { union { A a; B b; } payload; uint tag; } ``` The user can then allocate a constant buffer to hold their tagged union type, and when they pick a concrete type to use (say `B`), they fill in the first `sizeof(B)` bytes of their buffer with data describing a `B` instance, and then set the `tag` field to the appopriate 0-based index of the case type they chose (in this case the `B` case gets the tag value `1`). Actually implementing tagged unions takes a few main steps: * Type parsing was extended to special-case `__TaggedUnion` as a contextual keyword. This is really only intended to be used when parsing types from the API or command-line, and Bad Things are likely to happen if a user ever puts it directly in their code. Eventually construction of tagged unions should be an API feature and not part of the language syntax. * Semantic checking was extended to recognize that a tagged union like `__TaggedUnion(A,B)` shoud support an interface like `IFrobnicator` whenever all of the case types suport it, as long as the interface is "safe" for use with tagged unions (which means it doesn't use a few of the advancd langauge features like associated types). * The IR was extended with instructions to represent tagged union types and to extract their tag and the payload for the different cases as needed. * IR generation was extended to synthesize implementations of interface methods for any interface that a tagged union needs to support. Right now the implementation is simplistic and only handles simple method requirements, which it does by emitting a `switch` instruction to pick between the different cases. * A new IR pass was introduced to "desugar" any tagged union types used in the code. The downstream HLSL and GLSL compilers don't support `union`s, so we have to instead emit a tagged union as a "bag of bits" and implement loading the data for particular cases from it manually. * Final code emit mostly Just Works after the above steps, but we had to introduce an explicit IR instruction for bit-casting to handle the output of the desugaring pass. There are a bunch of gaps and caveats in this implementation, but that seems reasonable for something that is an experimental feature. The various `TODO` comments and assertion failures in unimplemented cases are intended, so that this work can be checked in even if it isn't feature-complete. * fixup: missing files * fixup: typos
2019-01-16	Feature/external compiler reporting (#776)	jsmall-nvidia
	* Added support for converting SlangResult to string in PlatformUtil. * * Added reportExternalCompilerError * Made external compilers use this * Made DiagnosticSink accept UnownedStringSlice * Made emitXXX compiler functions return SlangError * Use smart pointers to handle life of Com interfaces * * Make SlangResult compatible with HRESULT for some common cases. * Make PlatformUtil::appendResult return SlangResult * Compile check SLANG_RESULT. * Add tests for checking diagnostics from external compilers. * * Make external compiler tests only run on windows for now. * Added 'windows' and 'unix' categories * Added categories based on what backends are available. Will make more tests run on linux and handle case where dxcompiler is not available on appveyor. * * Added spSessionCheckPassThroughSupport * Use to determine whats available for categories for tests * Add support for outputting source filename/s when using pass through.
2019-01-10	Fixes problem when passing in nullptr for the channel - we need the ↵	jsmall-nvidia
	DiagnosticSink.writer to be nullptr so that we get the buffered result. Unfortunately here, it was set to the default for the channel, which in diagnostics case meant throwing away the diagnostic. (#770)
2019-01-10	Improvements around review of debug serialization info (#769)	jsmall-nvidia
	* * Make SourceView and SourceFile no longer derive from RefObject * Both have life time now managed by SourceManager * Tidied up a little around the serialization test code - just create the IRModule once * Simplified code around deleting SourceView/File. * Looked into generateIRForTranslationUnit - seems reasonable to just call it once, because it has side effects.
2019-01-07	Feature/serialization debug info (#767)	jsmall-nvidia
	* Remove AppContext. Use StdChannels to hold writers, and TestToolUtil to hold test tool specific functionality. * StdChannels -> StdWriters * getStdOut -> getOut, getStdError -> getError * Renamed main.cpp files of tools to try and stop visual studio getting confused between files - such that clicking on an error takes editor to the right location. * Work in progress on being able to serialize debug information. * * Added MemoryStream * First pass converting to IRSerialData * Able to read and write IRSerialData with debug data * Start at reconstruting IR serialized data. * First pass of generation debug SourceLocs from debug data. Works for test set for line nos. * Bug fixes. Moved testing of serialization into IRSerialUtil * Work around problem with irModule = generateIRForTranslationUnit(translationUnit); two times in a row produces different output(!). Fix by just creating once. * Remove problem with use of ternary op in slang.cpp on gcc/clang. * Added -verify-debug-serial-ir option that makes IR modules go through full serialization with debug information and verification. * Add a test that does serial debug verification that is run by default on linux.
2018-12-12	Running tests in slang-test process (#740)	jsmall-nvidia
	* First pass at having an interface to write text to that can be replaced. Simplifed and made more rigerous the interface used to write formatted strings. * Added AppContext to simplify setting up and parsing around of streams. * Added more simplified way to get the std error/out from AppContext. * Work in progress using dll for tools to speed up testing. * First pass at ISlangWriter interface. * Added support for writing VaArgs. Added NullWriter. * Use ISlangWriter for output. * Use ISlangWriter for output - replacing OutputCallback. Make IRDump go to ISlangWriter * SlangWriterTargetType -> SlangWriterChannel Improvements around AppContext * Shared library working with slang-reflection-test. * Dll testing working for render-test. * Include va_list definintion from header. * Fix errors from clang. * Fix typo for linux. * Added -usexes option * Fix typo. * Fix arguments problem on linux. * Fix typo for linux. * Add windows tool shared library projects. * Fix warning from x86 win build. Fix signed warning from slang-test/main.cpp * First attempt at getting premake to work on travis, and run tests. * Try moving build out into script. * Invoke bash scripts so they don't have to be executable. * Drive configuration/tests from env parameters set by travis * Try using source to run travis tests. * Remove the build.linux directory - but doing so will overwrite Makefile. * Made -fno-delete-null-pointer-checks gcc only. * Try to fix warning from -fno-delete-null-pointer-checks * Turn of warnings for unknown switches. * Try to make premake choose the correct tooling. * Disabled missing braces warning. * Disable -Wundefined-var-template on clang. * -Wunused-function disabled for clang. * Fix typo due to SlangBool. * Remove this nullptr tests. * "-Wno-unused-private-field" for clang. * Added "-Wno-undefined-bool-conversion" * Add DominatorList::end fix. * Split scripts into travis_build.sh travis_test.sh * Fix gcc/clang template pre-declaration issue around QualType. * Fix premake to build such that pthread correctly links with slang-glslang
2018-11-28	* Renamed spSessionHasCompileTargetSupport to ↵	jsmall-nvidia
	spSessionCheckCompileTargetSupport. (#728) * Improved return codes from spSessionCheckCompileTargetSupport
2018-11-21	Feature/early depth stencil (#727)	jsmall-nvidia
	* First pass support for early depth stencil. * Add a simple test to check if output has attributes. * Use cross compilation to test [earlydepthstencil] on glsl. * If target is dxil, use dxc to test against. Add hlsl to test earlydepthstencil against. * * Added spSessionHasCompileTargetSupport * Made slang-test use spSessionHasCompileTargetSupport to ignore tests that cannot run
2018-11-06	Feature/shared library refactor (#712)	jsmall-nvidia
	* * Added ISlangSharedLibraryLoader and ISlangSharedLibrary * Implemented default implementations * Added slang API function to get/set the ISlangSharedLibraryLoader on the session * Put function caching onto the Session - so that if the loader is chaged, its easy to reset the shared libraries, and functions * Run premake. * Fix problem with setting null, would cause an unnecessary function/shared lib flush. * * Unload SharedLibrary when DefaultSharedLibrary is deleted. * Make SharedLibrary handle unload safely if already unloaded. * Refactor SharedLibrary, such that it becomes a utility class - simplifying it's semantics. * Simplified ISlangSharedLibrary such that doesn't have unload and isLoaded so easier to implement. Use updated SharedLibrary impl. * Disable aarch64 on windows * Premake windows files without aarch64 build. * Moved slang-shared-library to core (so can be used in code outside of main slang) Fixed problem in premake5 where on windows projects were incorrectly constructed * Allowed RefObject to base class of com types Added ConfigurableSharedLibraryLoader Added -dxc-path -fxc-path -glslang-path Fix problem with dxc-path not honoring it's path when loading dxil * Added documentation for command line control of dll loading paths. * Remove some tabbing issues. * Change name of include guard.
2018-11-01	Add support for a "strict" floating-point mode (#709)	Tim Foley
	This change adds an API function and command line options for controlling the default floating-point behavior for a target, with options for "fast" and "precise" computation. The "precise" option gets mapped to the "IEEE strictness" mode in `fxc` and `dxc` (there is currently no equivalent option for glslang that I could find).
2018-10-30	Fix handling of DXR profiles (#704)	Tim Foley
	The logic in `getEffectiveProfile()` function was mapping these to use `Stage::Unknown` in an early attempt to handle the way that dxc requires the `lib_` profile for DXR shaders, instead of anything that mentions the stage name (in constrast to, e.g., `vs_5_1`). At the same time, the `GetHLSLProfileName()` function was updated to explicitly handle the DXR shaders and map anything it doesn't expect (including `Stage::Unknown`) to a profile named `unknown`, which dxc obviously doesn't like. This change tries to fix both issues by: Having `getEffectiveProfile()` no longer clobber the stage part of a profile for DXR shaders. * Having `GetHLSLProfileName()` map all unhandled cases to the `lib_*` profiles, since that seems likely to be how any future stages will need to be handled as well (based on the precedent with DXR) Along the way, I also fixed a bug where invoking command-line `slangc` with no `-stage` options and then relying on `[shader(...)]` attributes to pick up the entry points would lead to a crash since the array of per-entry-point output paths on each target would not be sized appropriately.
2018-10-29	Rework command-line options handling for entry points and targets (#697)	Tim Foley
	* Rework command-line options handling for entry points and targets Overview: * The biggest functionality change is that the implicit ordering constraints when multiple `-entry` options are reversed: any `-stage` option affects the `-entry` to its left instead of to its right as it used to. This is technically a breaking change, but I expect most users aren't using this feature. * The options parsing tries to handle profile versions and stages as distinct data (rather than using the combined `Profile` type all over), and treats a `-profile` option that specifies both a profile version and a stage (e.g., `-profile ps_5_0`) as if it were sugar for both a `-profile` and a `-stage` (e.g., `-profile sm_5_0 -stage fragment`). * We now technically handle multiple `-target` options in one invocation of `-slangc`, but do not advertise that fact in the documentation because it might be confusing for users. Similar to the relationship between `-stage` and `-entry`, any `-profile` option affects the most recent `-target` option unless there is only one `-target`. * The logic for associating `-o` options with corresponding entry points and targets has been beefed up. The rule is that a `-o` option for a compiled kernel binds to the entry point to its left, unless there is only one entry point (just like for `-stage`). The associated target for a `-o` option is found via a search, however, because otherwise it would be impossible to specify `-o` options for both SPIR-V and DXIL in one pass. * The handling of output paths for entry points in the internal compiler structures was changed, because previously it could only handle one output path per entry point (even when there are multiple targets). The new logic builds up a per-target mapping from an entry point to its desired output path (if any). Details: * Support for formatting profile versions, stages, and compile targets (formats) was added to diagnostic printing, so that we can make better error messages. This is fairly ad hoc, and it would be nice to have all of the string<->enum stuff be more data-driven throughout the codebase. * Test cases were added for (almost) all of the error conditions in the current options validation. The main one that is missing is around specifying an `-entry` option before any source file when compiling multiple files. This is because the test runner is putting the source file name first on the command line automatically, so we can't reproduce that case. * Several reflection-related tests now reflect entry points where they didn't before, because the logic for detecting when to infer a default `main` entry point have been made more loose * On the dxc path, beefed up the handling of mapping from Slang `Profile`s to the coresponding string to use when invoking dxc. * A bunch of tests cases were in violation of the newly imposed rules, so those needed to be cleaned up. * There were also a bunch of test cases that had accidentally gotten "disabled" at some point because there were comparing output from `slangc` both with and without a `-pass-through` option, but that meant that any errors in command-line parsing produced the same error output in both the Slang and pass-through cases. This change updates `slang-test` to always expect a successful run for these tests, and then manually updates or disables the various test cases that are affected. * When merging the updated test for matrix layout mode, I found that the new command-line logic was failing to propagate a matrix layout mode passed to `render-test` into the compiler. This was because the `-matrix-layout` options were implemented as per-target, but the target was being set by API while the option came in via command line (passed through the API). It seems like we want matrix layout mode to be a global option anyway (rather than per-target), so I made that change here. Add missing expected output files * A 64-bit fix * Remove commented-out code noted in review
2018-10-26	Feature/file system cache (#692)	jsmall-nvidia
	* First pass at caching file system. * default-file-system -> slang-file-system fix problem with location("build.linux") confusing windows build for now. * Added CompressedResult Fix problem in Result construction with it being unsigned * Add support for Path simplification. * Testing for Path::Simplify. * Refactored CacheFileSystem - automatically handles ISlangFileSystem or ISlangFileSystemExt appropriately. Removed WrapFileSystem - because wasn't possible to emulate some of the behavior if just loadFile is implemented. Split out StringBlob - so that no need to convert between ISlangBlob and String repeatidly. * Remove unwanted code in ~CompileRequest
2018-10-22	Fix problem with __import not working (#688)	jsmall-nvidia
	* Added getPathType to ISlangFileSystemExt. This is needed so that when searching for a file it's existance can be tested without loading the file. On some platforms a getCanonicalPath can do this - but depending on how getCanonicalPath is implemented, it may not do. This test is made after the relative path is produced before finding the canonical path. * Test for importing along search path. * Added comment to explain the issue around WrapFileSystem impl of getPathType. * Make search path use / not \
2018-10-17	IncludeFileSystem -> DefaultFileSystem (#677)	jsmall-nvidia
	Improvements in 'singleton'ness of DefaultFileSystem Made WrapFileSystem a stand alone type - to remove 'odd' aspects of deriving from DefaultFileSystem (such as inheriting getSingleton method/fixing ref counting) Simplified CompileRequest::loadFile - becauce fileSystemExt is always available.
2018-10-16	Feature/include refactor (#675)	jsmall-nvidia
	* Refactor of path handling. * Added PathInfo * Changed ISlangFileSystem - such that has separate concepts of reading a file, getting a relative path and getting a canonical path * Added support for getting a canonical path for windows/linux * Made maps/testing around canonicalPaths * User output remains around 'foundPath' - which is the same as before * Small improvements around PathInfo * Added a type and make constructors to make clear the different 'path' uses * Fixed bug in findViewRecursively * Checking and reporting for ignored #pragma once. * Removed SLANG_PATH_TYPE_NONE as doesn't serve any useful purpose. * Improve comments in slang.h aroung ISlangFileSystem * Remove the need for <windows.h> in slang-io.cpp * Ran premake5. * Improvements and fixes around PathInfo. * Fix typo on linix GetCanonical * Make the ISlangFileSystem the same as before, and ISlangFileSystem contain the new methods. Internally it always uses the ISlangFileSystemExt, and will wrap a ISlangFileSystem with WrapFileSystem, if it is determined (via queryInterface) that it doesn't implement the full interface.
2018-10-15	Feature/source loc improvements (#673)	jsmall-nvidia
	* Fix comment to better explain usage. * For getting the type string use a temporary SourceManager.