slang.git - Making it easier to work with shaders

	Commit message (Collapse)	Author	Age
*	Initial work to support OptiX output for ray tracing shaders (#1307)	Tim Foley	2020-04-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Initial work to support OptiX output for ray tracing shaders This change represents in-progress work toward allowing Slang/HLSL ray-tracing shaders to be cross-compiled for execution on top of OptiX. The work as it exists here is incomplete, but the changes are incremental and should not disturb existing supported use cases. One major unresolved issue in this work is that the OptiX SDK does not appear to set an environment variable Changes include: * Modified the premake script to support new options for adding OptiX to the build. Right now the default path to the OptiX SDK is hard-coded because the installer doesn't seem to set an environment variable. We will want to update that to have a reasonable default path for both Windows and Unix-y platforms in a later chance. * I ran the premake generator on the project since I added new options, which resulted in a bunch of diffs to the Visual Studio project files that are unrelated to this change. Many of the diffs come from previous edits that added files using only the Visual Studio IDE rather than by re-running premake, so it is arguably better to have the checked-in project files more accurately reflect the generated files used for CI builds. * The "downstream compiler" abstraction was extended to have an explicit notion of the kind of pipeline that shaders are being compiled for (e.g., compute vs. rasterization vs. ray tracing). This option is used to tell the NVRTC case when it needs to include the OptiX SDK headers in the search path for shader compilation (and also when it should add a `#define` to make the prelude pull in OptiX). This code again uses a hard-coded default path for the OptiX SDK; we will need to modify that to have a better discovery approach and also to support an API or command-line override. * One note for the future is that instead of passing down a "pipeline type" we could instead pass down the list/set of stages for the kernels being compiled, and the OptiX support could be enabled whenever there is any ray tracing entry point present in a module. That approach would allow mixing RT and compute kernels during downstream compilation. We will need to revisit these choices when we start supporting code generation for multiple entry points at a time. * The CUDA emit logic is currently mostly unchanged. The biggest difference is that when emitting a ray-tracing entry point we prefix the name of the generated `__global__` function with a marker for its stage type, as required by the OptiX runtime (e.g., a `__raygen__` prefix is required on all ray-generation entry points). * The `Renderer` abstraction had a bare minimum of changes made to be able to understand that ray-tracing pipelines exist, and also that some APIs will require the name of each entry point along with its binary data in order to create a program. * The `ShaderCompileRequest` type was updated so that only a single "source" is supported (rather than distinct source for each entry point), and also the entry points have been turned into a single list where each entry identifies its stage instead of a fixed list of fields for the supported entry-point types. * The CUDA compute path had a lot of code added to support execution for the new ray-tracing pipeline type. The logic is mostly derived from the `optixHello` example in the OptiX SDK, and at present only supports running a single ray-generation shader with no parameters. The code here is not intended to be ready for use, but represents a signficiant amount of learning-by-doing. * The `slang-support.cpp` file in `render-test` was updated so that instead of having separate compilation logic for compute vs. rasterization shaders (which would mean adding a third path for ray tracing), there is now a single flow to the code that works for all pipeline types and any kind of entry points. * Implicit in the new code is dropping support for the way GLSL was being compiled for pass-through render tests, which means pass-through GLSL render tests will no longer work. It seems like we didn't have any of those to begin with, though, so it is no great loss. * Also implicit are some new invariants about how shaders without known/default entry points need to be handled. For example, the ray tracing case intentionally does not fill in entry points on the `ShaderCompileRequest` and instead fully relies on the Slang compiler's support for discovering and enumerating entry points via reflection. As a consequence of those edits the `-no-default-entry-point` flag on `render-test` is probably not working, but it seems like we don't have any test cases that use that flag anyway. Given the seemingly breaking changes in those last two bullets, I was surprised to find that all our current tests seem to pass with this change. If there are things that I'm missing, I hope they will come up in review. * fixup: issues from review and CI * Some issues noted during the review process (e.g., a missing `break`) * Fix logic for render tests with `-no-default-entry-point`. I had somehow missed that we had tests reliant on that flag. This required a bit of refactoring to pass down the relevant flag (luckily the function in question was already being passed most of what was in `Options`, so that just passing that in directly actually simplifies the call sites a bit. * There was a missing line of code to actually add the default compute entry points to the compile request. I think this was a problem that slipped in as part of some pre-PR refactoring/cleanup changes that I failed to re-test.
*	Slang compiles CUDA source via NVRTC (#1151)	jsmall-nvidia	2019-12-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* CPPCompiler -> DownstreamCompiler * Added DownstreamCompileResult to start abstraction such that we don't need files. * * Split out slang-blob.cpp * Made CompileResult hold a DownstreamCompileResult - for access to binary or ISlangSharedLibrary * Keep temporary files in scope. * Add a hash to the hex dump stream. * Move all file tracking into DownstreamCompiler. * WIP support for nvrtc. * WIP: Adding support for nvrtc compiler. Adding enum types, wiring up the nvrtc into slang. * Fix remaining CPPCompiler references. * Fix order issue on target string matching. * Use ISlangSharedLibrary for nvrtc. * Use DownstreamCompiler for nvrtc. * WIP first pass at compilation win nvrtc. * Added testing if file is on file system into CommandLineDownstreamCompiler. Added sourceContentsPath. * Make test cuda-compile.cu work by just compiling not comparing output. * Fix warning on clang.
*	DownstreamCompiler abstraction (#1149)	jsmall-nvidia	2019-12-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* CPPCompiler -> DownstreamCompiler * Added DownstreamCompileResult to start abstraction such that we don't need files. * * Split out slang-blob.cpp * Made CompileResult hold a DownstreamCompileResult - for access to binary or ISlangSharedLibrary * Keep temporary files in scope. * Add a hash to the hex dump stream. * Move all file tracking into DownstreamCompiler.
*	Initial work for "global generic value parameters" (#1127)	Tim Foley	2019-11-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Initial work for "global generic value parameters" The main new feature here is support for the `__generic_value_param` keyword, which introduces a global generic value parameter. For example: __generic_value_param kOffset : uint = 0; This declaration introduces a global generic value parameter `kOffset` of type `uint` that has a nominal default value of zero. The broad strokes of how this feature was added are as follows: * A new `GlobalGenericValueParamDecl` AST node type is introduces in `slang-decl-defs.h` * A new `parseGlobalGenericValueParamDecl` subroutine is added to `slang-parser.cpp`, and is added to the list of declaration cases as the callback for the `__generic_value_param` name. * Cases for `GlobalGenericValueParamDecl` are added to the declaration checking passes in `slang-check-decl.cpp`, mirroring what is done for other variable declaration cases. * A case for `GlobalGenericValueParamDecl` is aded to the `Module::_collectShaderParams` function, so that it is recognized as a kind of specialization parameter. This introduces a specialization parameter of flavor `SpecializationParam::Flavor::GenericValue` (which was already defined before this change, although it was unused). * A case for `SpecializationParam::Flavor::GenericValue` is added in `Module::_validateSpecializationArgsImpl` to check that a specialization argument represents a compile-time-constant value (not a type). * A case for `GlobalGenericValueParmDecl` is introduced in `slang-lower-to-ir.cpp` that introduces a global generic parameter in the IR * The `IRBuilder` is extended to support creating `IRGlobalGenericParam`s for the distinct cases of type, witness-table, and value parameters. The same IR instruction type/opcode is used for all cases, and only the type of the IR instruction differs. * The existing mechanisms for lowering specialization arguments to the IR, and doing specialization on the IR itself Just Work with global generic value parameters since they already support value parameters on explicit generic declarations. That's the santized version of things, but there were also a bunch of cleanups and tweaks required along the way: * The `SpecializationParam` type was extended to also track a `SourceLoc` to help in diagnostic messages, which meant some churn in the code that collects specialization parameters. * The `_extractSpecializationArgs` function is tweaked to support any kind of "term" as a specialization argument (either a type or a value). * To allow parsing specialization arguments that can't possibly be types (e.g., integer literals) we replace the existing `parseTypeString` routine with `parseTermString` and then in `parseTermFromSourceFile` call through to a general case of expression parsing (which can also parse types) rather than only parsing types directly. * Right before doing back-end code generation, we check if the program we are going to emit has remaining (unspecialized) parameters, in which case we emit a diagnostic message for the parameters that haven't been specialized rather than go on to emit code that will fail to compile downstream. * Within the `render-test` tool we collapse down the arrays that held both "generic" and "existential" specialization arguments, so that we just have global and entry-point specialization argument lists. This mirrors how Slang has worked internally for a while, but the difference hasn't been important to the test tool because no tests currently mix generic and existential specialization. The logic for parsing `TEST_INPUT` lines has been streamlined down to just the global and entry-point cases, but the pre-existing keywords are still allowed so that I don't have to tweak any test cases. There are several significant caveats for this feature, which mean that it isn't really ready for users to hammer on just yet: * There is no support for `Val`s of anything but integers, so there is no way to meaningfully have a generic value param with a type other than `int` or `uint`. * We allow for a default-value expression on global generic parameters, but do not actually make use of that value for anything (e.g., to allow a programmer to omit specialization arguments), nor check that it meets the constraints of being compile-time constant. * Global generic value parameters are not currently being treated the same as explicit generic parameters in terms of how they can be used for things like array sizes or other things that require constants. This will probably be relaxed at some point, but allowing a global generic to be used to size an array creates questions around layout. * The IR optimization passes in Slang currently won't eliminate entire blocks of code based on constant values, so using a global generic value parameter to enable/disable features will not currently lead to us outputting drastically different HLSL or GLSL. That said, we expect most downstream compilers to be able to handle an `if(0)` well. * Fix regression for tagged union types The change that made specialization arguments be parsed as "terms" first, and then coerced to types meant that any special-case logic that is specific to the parsing of types would be bypassed and thus not apply. Most of that special-case logic isn't wanted for specialization arguments, since it pertains to cases were we want to, e.g, declare a `struct` type while also declaring a variable of that type. The one special case that is useful is the `__TaggedUnion(...)` syntax, which is the only way to introduce a tagged union type right now. In order to get that case working again, all I had to do was register the existing logic for parsing `__TaggedUnion` as an expression keyword with the right callback, and the existing logic in expression parsing kicks in (that logic was already handling expression keywords like `this` and `true`). I left in the existing logic for handling `__TaggedUnion` directly where types get parsed, rather than try to unify things. A better long-term fix is to make the base case for type parsing route into `parseAtomicExpr` so that the two paths share the core logic. That change should probably come as its own refactoring/cleanup, because it creates the potential for some subtle breakage. * fixup: typo
*	* Added getCStr(Name*) (#1121)	jsmall-nvidia	2019-11-13
\| \| \| \| \| \|	* Added the name to the EntryPointLayout so is always available * Made spReflectionEntryPoint_getName use name * Improved checking for entry point name in render-test * Improved COMPILE test type to allow failure and output of failure.
*	Add basic support for entry points in `.slang-lib` files. (#1112)	Tim Foley	2019-11-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add basic support for entry points in `.slang-lib` files. The basic idea here is that when writing out a `.slang-lib` file based on a compile request, we include new sections in the generated RIFF that represent the entry points that were requested. The entry-point information is serialized in an entirely ad hoc fashion (a future change might clean it up to use the `OffsetContainer` machinery), and contains the name, profile, and mangled symbol name of an entry point. When deserializing this information, we create a list of "extra" entry points that gets attached to the front-end compile requests. These "extra" entry points get turned into `EntryPoint` objects at the same place in the code that entry points specified on the command line or via API would be checked, but the extra entry points bypass the semantic checking and just create "dummy" `EntryPoint` objects. Aside: the ability for a compile request to end up with entry points that weren't originally specified via API or command-line is not new. We already had support for compiling a translation unit with entry points entirely specified via `[shader(...)]` attributes, and this new support tries to function similarly. Because the "dummy" entry points don't retain AST-level information, several parts of the code have been modified to defensively check for `EntryPoint` objects without a matching AST declaration, and skip over them. The main place where this creates a problem is paramete binding, where ignoring the dummy entry point is appropriate since we currently assume linked-in library code has been laid out manually. One small cleanup here is that the `-r` command-line flag and the `spAddLibraryReference` API functio now bottleneck through a common routine to do their work, so that they both gain the new behavior without needing copy-paste programming. In order to keep the existing test case for library linking with entry points working, I had to add a flag to the `render-test` tool so that it can skip specifying entry point names as part of the compile request it creates. In that case it must instead assume that the entry points will be added to the compile request via other means. This logic is a bit magical, and hints that we should be looking for other ways to expose the library linking functionality over time. * fixup: remove alignment assertion
*	Simple test profiling (#1062)	jsmall-nvidia	2019-09-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* First pass support for performance profiling * Test across all elements * Fix bug - sourceContents is not used, should use rawSource. * * Add ability to get prelude from API. * Allow specifying source language for render-test * Made it possible to compile a test input file as C++ * Special handling for reflection * Added C++ impl to performance-profile.slang * Remove some clang warnings. * Output profile timings on appveyor and other TC. * Remove passing around of StdWriters (can use global). Small comment improvements.
*	CPU Performance/Testing improvements (#1055)	jsmall-nvidia	2019-09-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* First pass of render-test refactor. * Make window construction a function that can choose an implementation. * Remove OpenGL as currently has windows dependency. * Disable Vulkan as Renderer impl has dependency on windows. * Pass Window in as parameter of 'update'. * Add win-window.cpp as was missing. * Fix warning on windows about signs during comparison. * * Added mechanism to add random arrays as buffer inputs and select type * Improved RenderGenerator to generate more types, and to be more careful around int32 ranges. * Added support for security checks (for Visual Studio C++) * Disable Execption handling being on by default when compiling kernels * Added a 'Group' version of the entry point that will evaluate all threads in a group in a single call. In test code use this method if available. * Added -compile-arg to be able to pass arguments to the compile within render-test * Add documention for the _Group execution feature. * Fix some typos in cpu-target.md
*	CPU compute testing on non windows targets (#1045)	jsmall-nvidia	2019-09-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* WIP: Refactor of CPUCompute and stand alone cpu-render-test * Fix compilation on CygWin. * Make CPU compute tests run on non windows targets. * Check that C/C++ compiler is available for CPU compute. * Fix some tabbing issues. * Add -fPIC on gfx * Use dxcompiler_47.dll from slang-binaries on windows. * make https for git module slang-binaries * Fix comment in premake5.lua around d3dcompiler_47.dll * Add resources to the CPUComputeUtil::Context to keep in scope. * Fixes problem compiling on cygwin where dx12 is included in build of gfx lib.
*	WIP: Compute test running on CPU (#1023)	jsmall-nvidia	2019-08-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* * Simplify some of test code around CPPCompiler * Test using 'callable' with pass-through * Small cpu doc improvements * Improvements to Clang output parsing. * Remove temporary file (base filename) . * Improve handling of external errors - handle severity. * On error dumping out to 'actual' file for runCPPCompilerCompile. * Small fixes. Set the source language type correctly for pass thru. * Remove warning for test for clang backend c * Preliminary work around making render-test compute potentiall work with CPU. Made ShaderCompiler -> a stateless ShaderCompilerUtil. Means we don't require a Renderer interface to do shader compilation. * Refactor such that CPU test can take place in without Window or Renderer. * Hack to look for prelude in source file directory. Fix bug returning the SharedLibrary for HostCallable. * Compute test running on CPU. * Need the prelude currently in same directly as test. * Hack to remove warning - that then produces an error on appveyor build. Disable running render CPU test on non-windows. * Improve handling of disabling CPU tests on linux. * Added bit-cast.slang working on CPU.
*	String/List closer to conventions, and use Index type (#959)	jsmall-nvidia	2019-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* List made members m_ Tweaked types to closer match conventions. * Use asserts for checking conditions on List. Other small improvements. * List<T>.Count() -> getSize() * List<T> Add -> add First -> getFirst Last -> getLast RemoveLast -> removeLast ReleaseBuffer -> detachBuffer GetArrayView -> getArrayView * List<T>:: AddRange -> addRange Capacity -> getCapacity Insert -> insert InsertRange -> insertRange AddRange -> addRange RemoveRange -> removeRange RemoveAt -> removeAt Remove -> remove Reverse -> reverse FastRemove -> fastRemove FastRemoveAt -> fastRemoveAt Clear -> clear * List<T> FreeBuffer -> _deallocateBuffer Free -> clearAndDeallocate SwapWith -> swapWith * List<T> SetSize -> setSize Reserve -> reserve GrowToSize growToSize * UnsafeShrinkToSize -> unsafeShrinkToSize Compress -> compress FindLast -> findLastIndex FindLast -> findLastIndex Simplify Contains * List<T> Removed m_allocator (wasn't used) Swap -> swapElements Sort -> sort Contains -> contains ForEach -> forEach QuickSort -> quickSort InsertionSort -> insertionSort BinarySearch -> binarySearch Max -> calcMax Min -> calcMin * Initializer::Initialize -> initialize List<T>:: Allocate -> _allocate Init -> _init IndexOf -> indexOf * * Put #include <assert.h> in common.h, and remove unneeded inclusions * Small refactor of ArrayView - remove stride as not used * getSize -> getCount setSize -> setCount unsafeShrinkToSize->unsafeShrinkToCount growToSize -> growToCount m_size -> m_count * Some tidy up around Allocator. * Use Index type on List. * Refactor of IntSet. First tentative look at using Index. * Made Index an Int Did preliminary fixes. Made String use Index. * Partial refactor of String. * String::Buffer -> getBuffer ToWString -> toWString * Small improvements to String. String:: Buffer() -> getBuffer() Equals() -> equals * Try to use Index where appropriate. * Fix warnings on windows x86 builds.
*	Improve support for interfaces as shader parameters (#886)	Tim Foley	2019-03-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Improve support for interfaces as shader parameters This change adds two main things over the existing support: 1. It is now possible to plug in concrete types that actually contain (uniform/ordinary) fields for the existential type parameters introduced by interface-type shader parameters. The `interface-shader-param2.slang` test shows that this works. 2. There is a limited amount of support for doing correct layout computation and generating output code that matches that layout, so that interface and ordinary-type fields can be interleaved to a limited extent. The `interface-shader-param3.slang` test confirms this behavior. There are several moving pieces in the change. * When it comes to terminology, we try to draw a more clear distinction between existial type parameters/arguments and existential/object value parametes/arguments. A simple way to look at it is that an `IFoo[3]` shader parameter introduces a single existential type parameter (so that a concrete type argument like `SomeThing` can be plugged in for the `IFoo`) but introduces three existential object/value parameters (to represent the concrete values for the array elements). * At the IR level, we support a few new operations. A `BindExistentialsType` can take a type that is not itself an interface/existential type but which depends on interfaces/existentials (e.g., `ConstantBuffer<IFoo>`) and plug in the concrete types to be used for its existential type slots. * Then a `wrapExistentials` instruction can take a type with all the existentials plugged in (possibly by `BindExistentialsType`) and wrap it into a value of the existential-using type (e.g., turn `ConstantBuffer<SomeThing>` into a `ConstantBuffer<IFoo>`). * The IR passes for doing generic/existential specialization have been updated to be able to desugar uses of these new operations just enough so that a `ConstantBuffer<IFoo>` can be used. * When we specialize an IR parameter of an interface type like `IFoo` based on a concrete type `SomeThing`, we turn the parameter into an `ExistentialBox<SomeThing>` to reflect the fact that we are conceptually referring to `SomeThing` indirectly (it shouldn't be factored into the layout of its surrounding type). * Parameter binding was updated so that it passes along the bound existential type arguments in a `Program` or `EntryPoint` to type layout, so that we can take them into account. The type layout code needs to do a little work to pass the appropriate range of arguments along to sub-fields when computing layout for aggregate types. * Type layout was updated to have a notion of "pending" items, which represent the concrete types of data that are logically being referenced by existential value slots. The basic idea is that these values aren't included in the layout of a type by default, but then they get "flushed" to come after all the non-existential-related data in a constant buffer, parameter block, etc. * The logic for computing a parameter group (`ConstantBuffer` or `ParameterBlock`) layout was updated to always "flush" the pending items on the element type of the group, so that the resource usage of specialized existential slots would be taken into account. * The type legalization pass has been adapted so that we can derive two different passes from it. One does resource-type legalization (which is all that the original pass did). The new pass uses the same basic machinery to legalize `ExistentialBox<T>` types by moving them out of their containing type(s), and then turning them into ordinary variables/parameters of type `T`. Big things missing from this change include: - Nothing is making sure that "pending" items at the global or entry-point level will get proper registers/bindings allocated to them. For the uniform case, all that matters in the current compiler is that we declare them in the right order in the output HLSL/GLSL, but for resources to be supported we will need to compute this layout information and start associating it with the existential/interface-type fields. - Nothing is being done to support `BindExistentials<S, ...>` where `S` is a `struct` type that might have existential-type fields (or nested fields...). Eventually we need to desugar a type like this into a fresh `struct` type that has the same field keys as `S`, but with fields replaced by suitable `BindExistentials` as needed. (The hard part of this would seem to be computing which slots go to which fields). As a practial matter, this missing feature means that interface-type members of `cbuffer` declarations won't work. The current tests carefully avoid both of these problems. They don't declare any buffer/texture fields in the concrete types, and they don't make use of `cbuffer` declarations or `ConstantBuffer`s over structure types with interface-type fields. * fixup: add override to methods * fixup: typos
*	#include not using search paths (#873)	jsmall-nvidia	2019-03-02
\| \| \| \| \| \| \| \| \| \|	* Fix warnings from visual studio due to coercion losing data. * Removed searchDirectories from FrontEndCompileRequest and use the one in Linkage as that is the one that is changed via Slang API. * * Add searchPaths back to FrontEndRequest * Add comments to explain the issue * Add a test to check include paths
*	First steps toward supporting interface-type parameters on shaders (#852)	Tim Foley	2019-02-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* First steps toward supporting interface-type parameters on shaders What's New ---------- From the perspective of a user, the main thing this change adds is the ability to declare top-level shader parameters (either at global scope, or in an entry-point parameter list) with interface types. For example, the following becomes possible: ```hlsl // Define an interface to modify values interface IModifier { float4 modify(float4 val); } // Define some concrete implementations struct Doubler : IModifier { float4 modify(float4 val) { return val + val; } } struct Squarer : IModifier { ... } // Define a global shader parameter of interface type IModifier gGlobalModifier; // Define an entry point with an interface-type `uniform` parameter void myShader( unifrom IModifier entryPointModifier, float4 inColor : COLOR, out float4 outColor : SV_Target) { // Use the interface-type parameters to compute things float4 color = inColor; color = gGlobalModifier.modify(color); color = entryPointModifier.modify(color); outColor = color; } ``` The user can specialize that shader by specifying the concrete types to use for global and entry-point parameters of interface types (e.g., plugging in `Doubler` for `gGlobalModifier` and `Squarer` for `entryPointModifier`). The "plugging in" process is done in terms of a concept of both global and local "existential slots" which are a new `LayoutResourceKind` that represents the holes where concrete types need to be plugged in for existential/interface types. In simple cases like the above, each interface-type parameter will yield a single existential slot in either the global or entry-point parameter layout. Users can query the start slot and number of slots for each shader parameter, just like they would for any other resource that a parameter can consume. Before generating specialized code, the user plugs in the name of the concrete type they would like to use for each slot using `spSetTypeNameForGlobalExistentialSlot` and/or `spSetTypeNameForEntryPointExistentialSlot`. There are some major limitations to the implementation in this first change: * Parameters must be of interface type (e.g., `IFoo`) and not an array (`IFoo[3]`), or buffer (`ConstantBuffer<IFoo>`) over an interface type. Similarly, `struct` types with interface-type fields still don't work. * The work on interface-type function parameters still doesn't include support for `out` or `inout` parameters, nor for functions that return interface types (that isn't technically related to this change, but affects its usefullness). * No work is being done to correctly lay out shader parameters once the concrete types for existential slots are known, so that this change really only works when the concrete type that gets plugged in is empty. These limitations are severe enough that this feature isn't really usable as implemented in this change, and this merely represents a stepping stone toward a more complete implementation. Implementation -------------- The API side of thing largely mirrors what was already done to support passing strings for the type names to use for global/entry-point generic arguments, so there should be no major surprises there. The logic in `check.cpp` computes the list of existential slots when creating unspecialized `Program`s and `EntryPoint`s (this is logically the "front end" of the compiler), and then checks the supplied argument types against what is expected in each slot when creating specialized `Program`s and `EntryPoint`s. This again mirrors how generic arguments are handled. Type layout was extended to compute the number of existential slots that a type consumes, and will thus automatically assign ranges of slots to top-level and entry-point shader parameters in the same way it already allocates `register`s and `binding`s. The big missing feature is the ability to specialize a layout to account for the concrete types plugged into the existential-type slots. IR generation for specialized programs and entry points was slightly extended so that it attaches information about the concrete types plugged into the existential slots, and the witness tables that show how they conform to the interface for that slot. The linking step needed some small tweaks to make sure that information gets copied over to the target-specific program when we start code generation. The meat of the IR-level work is in `ir-bind-existentials.cpp`, which takes the information that was placed in the IR module by the generation/linking steps and uses it to rewrite shader parameters. For example, if there is a shader parameter `p` of type `IModifier`, and the corresponding existential slot has the type `Doubler` in it, we will rewrite the parameter to have type `Doubler`, and rewrite any uses of `p` to instead use `makeExistential(p, /witness that Doubler conforms to IModifier/)`. Once the replacement is done on the parameters, the existing work for specializing existential-based code when the input type(s) are known kicks in and does the rest. Testing ------- A single compute test is added to validate that this feature works. It is narrowly tailored to not require any of the features not supported by the initial implementation (e.g., all of the concrete types used have no members). The test case does include use of an associated type through one of these existential-type parameters, which has exposed a subtle bug in how "opening" of existential values is implemented in the front-end. Rather than fix the underlying problem, I cleaned up the code in the front-end to special case when the existential value being opened is a variable bound with `let`, to directly use a reference to that variable rather than introduce a temporary. Similarly, in the IR generation step, I added an optimization to make variables declared with `let` skip introducing an IR-level variable and just use the SSA value of their initializer directly instead. * fixup: missing files * fixup: incorrect type for unreachable return * fixup: actually comment ir-bind-existentials.cpp
*	Split front- and back-ends (#846)	Tim Foley	2019-02-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Split front- and back-ends This change is a major refactor of several of the types that provide the behind-the-scenes implementation of the public C API. The goal of this refactor is primarily to allow for future API services that let the user operate both the front- and back-ends of the compiler in a more complex fashion. For example, as user should be able to compile a bunch of source code into modules, look up types, functions, etc. in those modules, specialize generic types/functions to the types they've looked up, and then finally request target code to be gernerated for specialized entry points. The back-end code generation they trigger should re-use the front-end compilation work (parsing, semantic checking, IR generation) that was already performed. The most visible change is that `CompileRequest` has been split up into several smaller types that take responsibility for parts of what it did: * The `Linkage` type owns the storage for `import`ed modules, and well as the `TargetRequest`s that represent code-generation targets. The intention is that an application could use a single `Linkage` for the duration of its runtime (so long as it was okay with the memory usage), so that each `import`ed module only gets loaded once. For now, this type needs to manage the search paths, file system, and source manager, because of its responsibility for loading files. * A `FrontEndCompileRequest` owns the stuff related to parsing, semantic checking, and initial IR generation. This most notably includes the `TranslationUnitRequest`s and the `FrontEndEntryPointRequest`s (which used to be just `EntryPointRequest`s). It's main job is to produce AST and IR modules for each translation unit, and to find and validate the entry points. The front-end request does not interact with generic arguments for global or entry-point generic parameters. * The main output of both `import` operations and front-end translation units is the `Module` type, which is just a simple container for both the AST module (to service the reflection/layout APIs, and also for semantic checking of code that `import`s the module) and the IR module (for linking and code generation). This type captures the commonalities between the old `LoadedModule` (which is now just an alias for `Module`) and `TranslationUnitRequest` (which now owns a `Module`). * The secondary output of front-end compilation is a `Program`, which comprises a list of referenced `Module`s and validated `EntryPoint`s that will be used together. Layout and code generation both need a `Program` to tell them what modules and entry points will be used together (we don't want to just code-gen everythin that has ever been loaded into the linakge). The `Program`s created by the front-end do not include generic arguments, so they may provide incomplete layout information and/or be unsuitable for code generation. * A `BackEndCompileRequest` owns stuff related to turning a `Program` into output kernels for the targets of a `Linkage`. Most of the data it owns beyond the `Program` to be compiled is minor, so this is a good candidate for demotion from a heap-allocated object to just a `struct` of options that gets passed around. * The `CompileRequestBase` type is an attempt to wrap up the common functionality of both front-end and back-end compile requests. Most of it is just exposing the availability of a linkage and `DiagnosticSink`, so this type is a good candidate for subsequent removal. The main interesting thing it has is the flags related to dumping and validation of IR, so there is probably a good refactoring still to be made around deciding how options should be handled going forward. * Behind the scenes, the `Program` type is set up to handle some level of on-line compilation and layout work. The `Program` knows the `Linkage` it belongs to, and allows for a `TargetProgram` to be looked up based on a specific `TargetRequest`. A `TargetProgram` then allows layout information and compiled kernel code to be asked for on-demand, in order to support eventual "live" compilation scenarios. * The `EndToEndCompileRequest` type is a composition/coordination type that replaces the old `CompileRequest` in a way that uses the services of the various other types. It owns a few pieces of state that only make sense in the context of an end-to-end compile (e.g., there is really no way to "pass through" code when the front- and back-ends are run separately) or a command-line compile (everything to do with specifying output paths for files is really just for the benefit of `slangc`, and might even be moved there over time). * One important detail is that the `EndToEndCompilRequest` owns all of the string-based generic arguments for both global and entry-point generic parameters. The logic in `check.cpp` for dealing with those arguments has been heavily refactored to separate out the parsings steps that are specific to end-to-end compilation with string-based type arguments, and the semantic checking steps that result in a specialized `Program` (which can be exposed through new APIs that aren't tied to end-to-end compilation). It is perhaps not surprising that this change had a lot of consequences, so I'll briefly run over some of the main categories of changes required: * I changed the way that global generic arguments are passed via API (use `spSetGlobalGenericArgs` instead of the generic arguments for `spAddEntryPointEx`, which are not just for entry-point generics), which has been a change that we've needed for a long time. This is technically a breaking API change, although we should have very few client applications that care about it. * A bunch of places that used to take "big" objects like `CompileRequest` now just take the sub-pieces they care about (e.g., a function might have only needed a `Linkage` and a `DiagnosticSink`). This makes many subroutines or "context" struct types more generally useful, at the cost of taking more parameters. * In a few cases the conceptually clean separation of the layers breaks down (often for edge-case or compatibility features), and so we may pass along additional objects that are allowed to be null, but are used when present. A big example of this is how the back-end code generation routines accept an `EndToEndCompileRequest` that is optional, and only used to check whether "pass through" compilation is needed. We should probably look into cleaning this kind of logic up over time so that we don't need to violate the apparent separation of phases of compilation. * In cases where separation of layers was being broken for the sake of GLSL features, I went ahead and ripped them out, since all of that should be dead code anyway. * In many cases I increased the encapsulation of data in the core types to help track down use sites and make sure they are following invariants better. * In cases where code was doing, e.g., `context->shared->compileRequest->session->getThing()` I have tried to introduce convenience routines so that the usage site is just `context->getThing()` to improve encapsulation and allow changes to be made more easily going forward. * The `noteInternalErrorLoc` functionality was moved off of the compile request and into `DiagnosticSink`, since that is the one type you can rely on having around when you want to note an internal error. We may consider going forward if (and how) it should reset the counter used for noting locations on internal errors. * A few APIs now take `DiagnosticSink` arguments where they didn't before, and as a result some public APIs need to create `DiagnosticSink`s to pass in, before going ahead and ignoring the messages. In the future there should be variations of these APIs that accept an `ISlangBlob` parameter for the output. fixup: missing include for compilers with accurate template checking (non-VS) * fixup: review feedback
*	Running tests in slang-test process (#740)	jsmall-nvidia	2018-12-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* First pass at having an interface to write text to that can be replaced. Simplifed and made more rigerous the interface used to write formatted strings. * Added AppContext to simplify setting up and parsing around of streams. * Added more simplified way to get the std error/out from AppContext. * Work in progress using dll for tools to speed up testing. * First pass at ISlangWriter interface. * Added support for writing VaArgs. Added NullWriter. * Use ISlangWriter for output. * Use ISlangWriter for output - replacing OutputCallback. Make IRDump go to ISlangWriter * SlangWriterTargetType -> SlangWriterChannel Improvements around AppContext * Shared library working with slang-reflection-test. * Dll testing working for render-test. * Include va_list definintion from header. * Fix errors from clang. * Fix typo for linux. * Added -usexes option * Fix typo. * Fix arguments problem on linux. * Fix typo for linux. * Add windows tool shared library projects. * Fix warning from x86 win build. Fix signed warning from slang-test/main.cpp * First attempt at getting premake to work on travis, and run tests. * Try moving build out into script. * Invoke bash scripts so they don't have to be executable. * Drive configuration/tests from env parameters set by travis * Try using source to run travis tests. * Remove the build.linux directory - but doing so will overwrite Makefile. * Made -fno-delete-null-pointer-checks gcc only. * Try to fix warning from -fno-delete-null-pointer-checks * Turn of warnings for unknown switches. * Try to make premake choose the correct tooling. * Disabled missing braces warning. * Disable -Wundefined-var-template on clang. * -Wunused-function disabled for clang. * Fix typo due to SlangBool. * Remove this nullptr tests. * "-Wno-unused-private-field" for clang. * Added "-Wno-undefined-bool-conversion" * Add DominatorList::end fix. * Split scripts into travis_build.sh travis_test.sh * Fix gcc/clang template pre-declaration issue around QualType. * Fix premake to build such that pthread correctly links with slang-glslang
*	Major overhaul of Renderer abstraction, to support a new example (#624)	Tim Foley	2018-08-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The original goal here was to bring up a second example program: `model-viewer`. While the existing `hello-world` example is enough to get somebody up to speed with the basics of the Slang API (as a drop-in replacement for `D3DCompile` or similar), it doesn't really show any of the big-picture stuff that Slang is meant to enable. There wasn't any use of D3D12/Vulkan descriptor tables/sets, and there wasn't any use of interfaces, generics, or `ParameterBlock`s in the shader code. The `model-viewer` example addresses these issues. Its shader code involves generics, interfaces, and multiple `ParameterBlock`s, and the host-side code demonstrates a few key things for working with Slang: * There is an application-level abstraction for parameter blocks, that combines the graphics-API descriptor set object with Slang type information * There is a shader cache layer used to look up an appropriate variant of a rendering effect by using parameter block types to "plug in" global type variables * There is a clear separation between the phases of compilation: a first phase that does semantic checking and enables reflection-based allocation of graphics API objects, followed by one or more code generation passes for specialized kernels. This example is certainly not perfect, and it will need to be revamped more going forward. In particular: * The output picture is ugly as sin. We need a plan for how to get this to load better content, perhaps even popping up an error message to note that the required input data isn't present in the basic repository. * The shader code is too simplistic. There isn't any real material variety, and the `IMaterial` abstraction is completely wrong. * The use of parameter blocks is facile because there are no resource parameters right now. Fixing that will likely expose issues around interfacing with Slang's reflection API. * The whole example exposes the issue that Slang's current APIs aren't really designed for the benefit of two-phase compilation (since our many client application has been stuck on one-phase compilation). * Global type parameters are actually a Bad Idea that we only did for compatibility with existing codebases. We should not be showing them off in an example of the Right Way to use Slang, but the language support for type parameters on entry points is still not complete. Of course, the majority of the changes here are not inside the example applications, and instead involve a major overhaul of the `Renderer` abstraction that is used for both tests and examples. The main thrust of the change is to make the abstraction layer be closer to the D3D12/Vulkan model than to a D3D11-style model. This is important for the `model-viewer` example, since it aspires to show how Slang can be incorporated into a renderer that targets a modern API. The most important bit is actually the use of descriptor sets and "pipeline layouts" a la Vulkan, since without these Slang's `ParameterBlock` abstraction won't make a lot of sense. Implementation of the abstraction for the various APIs has very much been on an as-needed basis. The current implementation is just enough for the two examples to work, plus enough to get all the tests to pass in both debug and release builds on Windows. A big missing feature in the API abstraction right now is memory lifetime management. The code had been trending toward something D3D11-like where a constant buffer could be mapped per-frame with the implementation doing behind-the-scenes allocation for targets like D3D12/Vulkan. I'd like to shift more toward a model of just exposing "transient" allocations that are only valid for one frame, because these are more representation of how an efficient renderer for next-generation APIs will work. That transition isn't actually complete, though, so there are problems with the existing examples where `hello-world` is actually scribbling into memory that the GPU might still be using, while `model-viewer` is doing full-on heavy-weight allocations on a per-frame basis with no real concern for the performance implications. All together, there are a lot of things here that need more work, but this branch has been way too long-lived already, and so I'd like to get this checked in as long as all the tests pass.
*	spCompile/spProcessCommandLineArguments return SlangResult (#610)	jsmall-nvidia	2018-07-06
\| \| \| \| \| \| \| \| \|	* * Make spCompile return SlangResult * Make spProcessCommandLineArguments return SlangResult (and not internally exit) * Remove calls to exit() * Fix typos * Make all output from spProcessCommandLineArguments get sent to diagnostic sink.
*	Make render-test use Slang for all shader compilation (#597)	Tim Foley	2018-06-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Make render-test use Slang for all shader compilation This streamlines the code for render-test by having all its shader compilation go through the Slang API, so that it doesn't have to deal with custom logic to compile HLSL->DXBC and HLSL->DXIL. We were already leaning on Slang to generate SPIR-V for Vulkan, so this makes all the paths more consistent. My original plan with this change was to make the D3D12 render path start using DXIL at this point, since the change would make that easy, but it turns out that some aspects of how we handle parameter binding are not compatible with that right now, so it would need to come as a later change. There's a lot of details here, so I will try to walk through the changes, including the incidental ones: * Add logic to `premake5.lua` so that we copy the necessary libraries for HLSL shader compilation to our target directory from the Windows SDK. This is necessary so that our tests can actually invoke `dxcompiler.dll` * Re-run Premake to generate new project files. This moves around a few files that I manually added in previous changes without re-running Premake. * When invoking `fxc` as a pass-through compiler, be sure to pass along any macros defines via API or command-line. This isn't a strictly required change with how things worked out, but it is a positive one anyway, because it makes `slangc -pass-through fxc` more useful. * Don't print output from a downstream `fxc` invocation if it produces warnings but no errors. The main reason for this is so that our tests don't fail because of `fxc` warnings on Slang's output (which then don't match the baselines), but it can also be rationalized as not wanting to confuse users with warnings that don't come from the "real" compiler they are using. This probably needs fine-tuning as a policy. * Add the HLSL `NonUniformResourceIndex` function. This was an oversight because it isn't documented as a builtin on MSDN, and only gets mentioned obliquely when they talk about resource indexing. * Add `glsl_<version>` profiles to match our `sm_<version>` profiles, so that it is easy for a user to use the profile mechanism to request a specific GLSL version without also specifying a stage name. * Update the render-test logic so that there is a single `ShaderCompiler` implementation that always uses Slang, and get rid of all of the renderer-specific `ShaderCompiler` implementations. * Update logic in render-test `main.cpp` to select the options to use for the eventual Slang compile based on the choice of renderer and input language. I didn't change the options that render-test exposes, even though they are getting increasingly silly (e.g., `-glsl-rewrite` doesn't use GLSL as its input...). * Note: the D3D12 renderer will still use fxc, DXBC, and SM 5.0 for now, since trying to update it to switch to dxc, DXIL, and SM 6.0 didn't work well at the time. * Add a bit of supporting D3D12 code to make sure that we don't allocate a structured buffer when a buffer has a format. * Make sure to also define the `__HLSL__` macro when compiling Slang code, because otherwise a bunch of tests don't work (I'm not clear on how it worked before...). * fixup: missing file
*	Feature/vulkan first render (#545)	jsmall-nvidia	2018-05-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* First pass at InputLayout for Vulkan Add support for RGBA_Float32 * Use VulkanModule and VulkanApi to handle accessing Vulkan types. * First pass at Vulkan swap chain/Device queue. * Added VulkanUtil for generic function functions. * Move more functionality to VulkanApi and VulkanUtil. Make Buffer able to initialize itself. * More tidy up around VulkanDeviceQueue * First pass use of VulkanDeviceQueue in VkRenderer * First pass use of VulkanSwapChain on VkRenderer * Added depth formats. Binding for constant and vertex buffers for Vulkan. * Setting up VkImageView on backbuffers. * First pass support for setting up vkRenderPass. * Fixes to work around Vulkan swap chain/verification issues. * Added support for Pipeline and a pipeline cache. * Working without waiting - because use of pipeline cache. * Added support for VkFramebuffer in Vulkan. * First pass at creating Vulkan graphics pipeline. * More efforts to get Vulkan to render. * Small improvement for checking of Binding flags. * Removed setConstantBuffers from the Renderer interface - so that all resource binding takes place through the BindingState. To make this work required a 'hack' in render-test main.cpp - so that the constant buffer binding that is needed in some tests is only added when it doesn't clash. * RendererID -> unified into RendererType. Added getRendererType to Renderer interface. Added ProjectionStyle, and function to get from RendererType. Added getIdentityProjection to RendererUtil - to get projection that is the 'identity' - but hits the same pixels for all projection styles. * Fix build problem on Win32 on Vulkan where should use VK_NULL_HANDLE. * Improve naming, comments. Remove dead code. * Remove unwanted comment.
*	Fixes/improvements based on feature/render-binding-resource (#511)	jsmall-nvidia	2018-04-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Dx12 rendering works in test framework. * Turn on dx12 render tests. * Split out functions for construction or Renderer types into ShaderRendererUtil. Removed the serialization of buffers code into test-render * Improvements in documentation and typename in BindingState types. RegisterSet -> CompactBindIndexSlice RegisterList -> BindIndexSlice RegisterDesc -> ShaderBindSet * Fix debug build break.
*	Separation of Binding/Resource construction on Renderer interface (#508)	jsmall-nvidia	2018-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Dx12 rendering works in test framework. * Turn on dx12 render tests. * First pass at Resource and TextureResource/BufferResource types. * Fix bug in Dx11 impl for BufferResource. * Dx12 supports TextureResource and binds using TextureResource type, and all tests pass. * Added TextureBuffer::Size type to make handling mips a little simpler. * Small improvements to Dx12 constant buffer binding Removed k prefix on an enum * First pass impl of dx11 createTextureResource Added setDefaults to TextureResource::Desc and BufferResource::Desc to simplify setup accessFlags -> cpuAccessFlags desc -> srcDesc * Split out generateTextureResource - can produce the texture using createTextureResource on the Renderer. * Added support for read mapping to Dx11 accessFlags -> cpuAccessFlags First pass at using TextureResource/BufferResource on Dx11 Some tests fail with this checkin * TextureResource working on all tests on dx11. * Construct ResourceBuffers on Dx11 and Dx12 using utility function createInputBufferResource. * First pass at OpenGl TextureResource * Small fixes to dx12 and dx11 setup. Gl working working using BufferResource and TextureResource * Tidy up around the compareSampler - looks like the previous test was incorrect. * Small documentation /naming improvements. * Fix some more small documentation issues. * First pass testing out construction of binding resources external to Renderer implementation. * Moved some BindingState::Desc types to BindingState to make easier to use. * First pass of binding using BindingState::Desc for Dx11. * First pass at binding with dx12. * Fixed issues around separating dx12 binding from ShaderInputLayout * First pass at OpenGl state binding. * BindingState::Desc::Binding::Type -> BindingType * Use Buffer to manage life of vk resources. Construction of buffers handled by createBufferResource (BindingState doesn't have specialized logic) * Remove InputLayout types from binding so can create a binding independent of it. * Added upload buffer to BufferResource - could be used for write mapping. * m_samplers -> m_samplerDescs. First pass at Vk binding with BindingState::Desc. Small tidy/doc improvements. * First pass with binding all taking place through BindingState::Desc. All tests pass. * Removed support for creating BindingState from ShaderInputLayout * Remove serializeOutput from Renderer interface and all implementations. Implement map/unmap on vulkan Implement serializeBindingOutput which uses map/unmap and BindingState::Desc to write result. * Make implementation of BindingState use the BindingState::Desc for much of state - only hold api specific in BindingDetail per implementation. * Use Glsl binding on vulkan (was using hlsl). * BindingState::Desc::Binding -> BindingState::Binding. Made possible by impls using 'BindingDetail' for their specific needs. * Fix compile problems on win32. * Fix a typo in name createBindingSetDesc -> createBindingStateDesc
*	Feature/renderer binding (#489)	jsmall-nvidia	2018-04-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Dx12 rendering works in test framework. * Turn on dx12 render tests. * First pass at Resource and TextureResource/BufferResource types. * Fix bug in Dx11 impl for BufferResource. * Dx12 supports TextureResource and binds using TextureResource type, and all tests pass. * Added TextureBuffer::Size type to make handling mips a little simpler. * Small improvements to Dx12 constant buffer binding Removed k prefix on an enum * First pass impl of dx11 createTextureResource Added setDefaults to TextureResource::Desc and BufferResource::Desc to simplify setup accessFlags -> cpuAccessFlags desc -> srcDesc * Split out generateTextureResource - can produce the texture using createTextureResource on the Renderer. * Added support for read mapping to Dx11 accessFlags -> cpuAccessFlags First pass at using TextureResource/BufferResource on Dx11 Some tests fail with this checkin * TextureResource working on all tests on dx11. * Construct ResourceBuffers on Dx11 and Dx12 using utility function createInputBufferResource. * First pass at OpenGl TextureResource * Small fixes to dx12 and dx11 setup. Gl working working using BufferResource and TextureResource * Tidy up around the compareSampler - looks like the previous test was incorrect. * Small documentation /naming improvements. * Fix some more small documentation issues.
*	SlangResult and small bug/typos fixes (#448)	jsmall-nvidia	2018-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Fixed some small typos in api-users-guide.md * Fix some small typos in slang-test/main.cpp, render-test/render-d3d11.cpp * Remove exit() calls from test code. Added Slang::Result, which works in the same way as COM HRESULT. * FIx bug introduced when moving to Slang::Result - handling E_INVALIDARG on Dx11. * Fix the testing of feature levels on Dx11 renderer. * First attempt at README.md for slang-test. * Tidied up the slang-test README.md file. * Fix some small typos in tools/slang-test/main.cpp * Fix spaces -> tabs problems. Fix some small types.
*	bug fix: witnessTable's subTypeDeclRef should have default substitution for ↵	Yong He	2018-02-20
\| \| \| \|	getSpecailizedMangledName to work properly.
*	Remove support for the -no-checking flag (#392)	Tim Foley	2018-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Remove support for the -no-checking flag Fixes #381 Fixes #383 Work on #382 - No longer expose flag through API (`SLANG_COMPILE_FLAG_NO_CHECKING`) and command-line (`-no-checking`) options - Remove all logic in `check.cpp` that was withholding diagnostics (including errors) when the no-checking mode was enabled - Remove `HiddenImplicitCastExpr`, which was only created to support no-checking mode (it represented an implicit cast that our checking through was needed, but couldn't emit because it might be wrong) - Remove logic for storing function bodies as raw token lists when checking is turned off. I'm leaving in the `UnparsedStmt` AST node in case we ever need/want to lazily parse and check function bodies down the line. - Remove a few of the code-generation paths we had to contend with, but keep the comment about them in place. - Remove GLSL-based tests that can't meaningfully work with the new approach. - Fix other tests that used a GLSL baseline so that their GLSL compiles with `-pass-through glslang` instead of invoking `slang` with the `-no-checking` flag. - Remove tests that were explicitly added to test the "rewriter + IR" path, since that is no longer supported. There is more cleanup that can be done here, now that we know that AST-based rewrite and IR will never co-exist, but it is probably easier to deal with that as part of removing the AST-based rewrite path. We've lost some test coverage here, but actually not too much if we consider that we are dropping GLSL input anyway. * Fixup: test runner was mis-counting ignored tests * Fixup: turn on dumping on test failure under Travis * Fixup: enable extensions in Linux build of glslang
*	Initial work on getting render-test to support vulkan (#391)	Tim Foley	2018-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Basic fixes to gets some Vulkan GLSL out of the IR path We haven't been paying much attention to the Vulkan output from the IR path, but that needs to change ASAP. This commit really just implements quick fixes, without concern for whether they are a good fit in the long term. - Add some more mappings from D3D `SV_` semantics to built-in GLSL variables, and stop redeclaring those built-in variables in our output GLSL. - Add custom output logic for HLSL `StructuredBuffer<T>` types, so that they emit as `buffer` declarations with an unsized array inside. This has some real limitations: - What if the user passes the type into a function? The parameter should be typed as an (unsized) array, and not a buffer. - What happens if we have an array of structured buffers? We need to declare an array of blocks (which GLSL allows), but this changes the GLSL we should emit when indexing. - Customize the way that we emit entry point attributes (e.g., `[numthread(...)]`) to also support outputting equivalent GLSL `layout` qualifiers. In many of these cases, a better fix might involve doing more of this work in the IR as part of legalization (e.g., we already have a pass that deals with varying input/output for GLSL, so that should probalby be responsible for swapping the `SV_` to `gl_`, especially in cases where the types don't match perfectly across langauges). * Start adding Vulkan support to render-test - Add both Vulkan and D3D12 as nominally supported back-ends - Add a git submodule to pull in the Vulkan SDK dependencies - I don't want our users to have to install it manually, since the SDK is huge - Checking in the binaries to our main repository seems like a bad idea, but my hope is that we can prune the bloat using a subodule with the `shallow` cloning option - Implement enough logic for the Vulkan back-end to get a single test passing on Vulkan * Fix warning * Fixup: disable new compute tests for Linux * Fixup: ignore Vulkan tests on AppVeyor * Dynamically load Vulkan implementation Rather than statically link to the Vulkan library, we will dynamically load all of the required functions. This removes the need to have the stub libs involved at all. * Remove vulkan submodule I had set up a `vulkan` submodule to pull in the headers and stub libs, but now that we are going to dynamically load all the symbols anyway, the stub lib binaries aren't needed and we can just commit the headers. * Add Vulkan headers to external/
*	Improvements and bug fixes for global type parameters	Yong He	2018-01-21
\| \| \| \| \| \|	1. allow spReflection_FindTypeByName to accept arbitrary type expression string 2. allow const int generic value to be used as expression value, and as array size 3. various bug fixes in witness table specialization / function cloning during specializeIRForEntryPoint to avoid creating duplicate global values, not copying the right definition of a function from the other module, not cloning witness tables that are required by specializeGenerics etc.
*	Enable HLSL/GLSL "rewrite" + IR-based Slang codegen (#300)	Tim Foley	2017-11-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The big picture here is that the AST-to-AST pass in `ast-legalize` will now detect when a declaration being referenced comes from an `import`ed module, and (if IR codegen is enabled), it will trigger cloning of the IR for the chosen symbol into an IR module that will sit alongside the legalized AST. Then, during HLSL/GLSL code emit, we emit all the IR-based code first, and then the AST-based code. Whenever the AST code references a symbol that was lowered via IR (we keep track of these) we emit the mangled name of the IR symbol. Notes/details: - A lot of the logic for cloning IR symbols referenced by the AST matches the same logic that would clone them for completely IR-based codegen, so I tried to hoist out the common logic and share it (e.g., so that we apply the same guaranteed transformations in both cases). This required basically rewriting the logic in `emit.cpp` that decomposed the various cases. - There is a new compute test case added to test this functionality. `tests/compute/rewriter.hlsl` confirms that we can use the `-no-checking` mode for the HLSL code, but still make use of a library of Slang code that employs generics, etc. - Adding this test case required adding a new compute test mode that invokes `render-test` with the `-hlsl-rewrite` flag. - It turns out that the existing `tests/render/cross-compile0.hlsl` test should have been using this functionality already. It was opting into the use of the IR via `-use-ir`, and the `render-test` application already tries to set `-no-checking` for non-Slang input languages by default. Fixing the code path this test triggers means that it is now a second test of rewriter+IR codegen. - The `translateDeclRef` logic in `ast-legalize.cpp` seemed sloppy in places, and would potentially clone declarations, when declaration references were desired. I tried to clean a bit of this up, so some call sites are now changed. - This change tries to clean up some work around cloning of global values - All global value kinds (not just functions) now go through the logic of trying to pick a "best" definition, so that they can be used when we are linking multiple modules - The logic for registering cloned values has been unified a bit, so that clients always pass in an `IROriginalValuesForClone` that either wraps a single value (maybe just null), or an `IRSpecSymbol` that gives a list of values to regsiter the new value as a clone for. - I made one piece of code that was cloning witness tables as part of generic specializations not* register a clone. I think this is correct because we may specialize the same generic multiple ways, so registering any values we clone is not the right idea, but I might be missing something... - I also reorganized this logic so that it would be easier to clone a global value when we only know its mangled name (which is the case when it is the AST that triggers cloning) - I made sure that when loading a module via `import`, the translation unit for the new module copies the `-use-ir` flag from the overall compile request, if it is present (otherwise we wouldn't generate IR for loaded modules at all... oops). - Note that `getSpecializedGlobalValueForDeclRef()`, which is the main routine used by the AST legalization to trigger cloning of an IR value does not currently handle declaration references that require specialization. - This change does not deal with trying to unify the type legalization logic between the AST-to-AST rewriter and the IR-based codegen, so if you call an imported function with types that require legalization, Bad Things are expected to happen right now.
*	fix warnings	Yong He	2017-11-20
\|
*	fixup global generic parameters	Yong He	2017-11-20
\| \| \| \| \| \|	1. simplify RoundUpToAlignment() 2. add new a render-compute test case to cover the situation where the entry-point interface (parameter/return types of an entry-point function) is dependent on the global generic type. 3. initial fixes to get this test case to compile (but is not producing correct HLSL output yet)
*	Add support for global generic parameters (#285)	Yong He	2017-11-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add support for global generic parameters (In-progress work) This commit include: 1. Update Slang API to allow specification of generic type arguments in an `EntryPointRequest` 2. Add parsing of `__generic_param` construct, which becomes a GlobalGenericParamDecl, contains members of `GenericTypeConstraintDecl`. 3. Semantics checking will check whether the provided type arguments conform to the interfaces as defined by the generic parameter, and store SubtypeWitness values in the EntryPointRequest, which will be used by `specializeIRForEntryPoint` when generating final IR. 4. Add a new type of substitution - `GlobalGenericParamSubstitution` for subsittuting references to `__generic_param` decls or to its member `GenericTypeConsraintDecl` with the actual type argument or witness tables. 5. Update `IRSpecContext` to apply `GlobalGenericParamSubstitution` when specializing the IR for an EntryPointRequest. 6. Update `render-test` to take additional `type` inputs, which specifies the type arguments to substitute into the global `__generic_param` types. This commit does not include ProgramLayout specialization. * IR: pass through `[unroll]` attribute (#284) The initial lowering was adding an `IRLoopControlDecoration` to the instruction at the head of a loop, but this was getting dropped when the IR gets cloned for a particular entry point. The fix was simply to add a case for loop-control decorations to `cloneDecoration`. * fix warnings * IR: support `CompileTimeForStmt` (#286) This statement type is a bit of a hack, to support loops that must be unrolled. The AST-to-AST pass handles them by cloning the AST for the loop body N times, and it was easy enough to do the same thing for the IR: emit the instructions for the body N times. The only thing that requires a bit of care is that now we might see the same variable declarations multiple times, so we need to play it safe and overwrite existing entries in our map from declarations to their IR values. Of course a better answer long-term would be to do the actual unrolling in the IR. This is especially true because we might some day want to support compile-time/must-unroll loops in functions, where the loop counter comes in as a parameter (but must still be compile-time-constant at every call site). * Add support for global generic parameters (In-progress work) This commit include: 1. Update Slang API to allow specification of generic type arguments in an `EntryPointRequest` 2. Add parsing of `__generic_param` construct, which becomes a GlobalGenericParamDecl, contains members of `GenericTypeConstraintDecl`. 3. Semantics checking will check whether the provided type arguments conform to the interfaces as defined by the generic parameter, and store SubtypeWitness values in the EntryPointRequest, which will be used by `specializeIRForEntryPoint` when generating final IR. 4. Add a new type of substitution - `GlobalGenericParamSubstitution` for subsittuting references to `__generic_param` decls or to its member `GenericTypeConsraintDecl` with the actual type argument or witness tables. 5. Update `IRSpecContext` to apply `GlobalGenericParamSubstitution` when specializing the IR for an EntryPointRequest. 6. Update `render-test` to take additional `type` inputs, which specifies the type arguments to substitute into the global `__generic_param` types. progress on parameter binding * Add a more contrived test case for specializing parameter bindings * update render-test to align buffers to 256 bytes (to get rid of D3D complains on minimal buffer size). * adding one more test case for parameter binding specialization. * Cleanup according to @tfoleyNV 's suggestions. * fix a bug introduced in the cleanup
*	Support running and comparing execution results of compute shaders in ↵	YONGH\yongh	2017-10-19
\| \| \| \|	testing framework.
*	Implement notion of a "container format" (#213)	Tim Foley	2017-10-16
\| \| \| \| \| \| \| \| \| \| \|	The big addition here is that the Slang "bytecode" is no longer treated as just a "code generation target" (`CodeGenTarget`) akin to DX bytecode (DXBC) or SPIR-V, but instead is a `ContainerFormat` that can be used to emit all the results of a compile request (well, currently just the IR-as-BC, but the intention is there). Getting to this goal involved some prior checkins that eliminated bogus "targets" that weren't really akin to SPIR-V or DXBC: `-target slang-ir-asm` and `-target reflection-json`. Those targets were really in place to support testing, and so they've been made more explicit testing/debug options. This change eliminates `-target slang-ir` and instead tries to allow the user to specify `-o foo.slang-module` as an output file name, that indicates the intention to output a "container" file that will wrap up all the generated code. I've also gone ahead and generalized the existing `-target` option so that we are actually building up a list of code generation targets. This is largely just a cleanup, since it forces code to be more aware of when it is doing something target-specific vs. target independent. For example, reflection layout information lives on a requested target, and not on the compile request as a whole, and similarly output code is per-target, per-entry-point. As a cleanup, I eliminated support for per-translation-unit output. This was vestigial code from back when I used to try and do HLSL generation for a whole translation unit instead of per-entry-point (which turned out to be a lot of complexity for little gain), and it was only being used in the `hello` example and the `render-test` test fixture - in both cases fixing it up was easy enough. I've stubbed out the old `spGetTranslationUnitSource` API, but haven't removed it yet.
*	Fixup: deal with hitting `.obj` size limits for VS	Tim Foley	2017-09-25
\| \| \| \| \| \|	When using the lumped/"unity" build approach for Slang, the resulting `.obj` files run into number-of-sections limits in the VS linker. For now I'm using the `/bigobj` command-line flag to work around this for the `hello` example, just so I can be sure the lumped build still works, but longer term it seems like we need to just drop that approach anyway. The `render-test` application was switched to link against `slang.dll` since there is no reason to have multiple apps use the lumped approach.
*	Overhaul handling of entry points and translation units.	Tim Foley	2017-06-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The main user-visible change here is that instead of `spAddTranslationUnitEntryPoint` we have `spAddEntryPoint`, to reflect that the list of entry points is "global" to a compile request. As a result, `spGetEntryPointSource` now only needs the entry point index, and not the translation unit index. There are a bunch more behind-the-scenes changes, though, reflecting a streamlining of the concepts related to compilation into a smaller number of classes. Now there is: - `Session` (unchanged) to manage the lifetimes of shared stuff like the stdlib - `CompileRequest` (merges in `CompileOptions`) to handle all the lifetime related to a single invocation of the compiler - `TranslationUnitRequest` (merges `TranslationUnitOptions`, `CompileUnit`) to represent a single translation unit ("module") that the user is trying to compile. This is a single file for HLSL/GLSL, but can be multiple files for Slang. - `EntryPointRequest` (merges `EntryPointOption` and a bit of `EntryPointResult`) to track a single entry point that the user is asking to compile (that entry point always comes from a single translation unit) A lot of functions used to take some combination of these and end up with really long signatures. I've given most of the objects "parent" pointers so that they can get back to all the context they need, so most functions don't need as many parameters. It may eventually be important to tease these apart again, in particular: - The code-generation side of things (the `*Result` types) might need to be pulled out in case we want to codegen multiple times from the same AST - Similarly, the layout stuff may also need to be pulled out, in case we want to lay things out multiple times with different rules.
*	Allow for automatic importing of Slang code	Tim Foley	2017-06-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The basic idea of this change is that user code can just write: #include "foo.h" and then if `foo.h` gets found in a list of registered directories for "auto-import," then it actually gets interpreted as if the user had writte, more or less: __import foo; That is, the code in `foo.h` will be treated as Slang, and will be fully parsed and checked (no matter what the source language had been), and the scoping rules will be those of `__import` instead of `#include`. This is a really big hammer, and I could imagine it smashing fingers if used poorly. I'm not sure this feature will pan out, but we need to try things to know. One big piece of that that I'll likely keep in either case is an overhaul of command-line options parsing for `slangc`. In particular, this logic has been moved into the core `slang` library (so that users can just pass options in via the API), and it is all done on UTF-8 strings rather than wide strings (which was always going to be Windows-specific).
*	First pass at support for cross-compilation	Tim Foley	2017-06-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a large change that contains many pieces: - Update the `cross-compile0` test to actually make use of cross compilation. Now the `cross-compile0.hlsl` file contains both HLSL and GLSL source code, and then imports code from `cross-compile0.slang`, which provides a "library" (one function) that can be shared between both the HLSL and GLSL version of things. - Fixed a bug in the support for backslash-escaped newlines. - Added a new `__import` declaration type (replaces the `using` directive that was still around in a vestigial form) An `__import` causes the compiler to look for a Slang source file (currently using the ordinary `#include` lookup logic), and then parse/check the found file as an additional module ("translation unit"), before making its declarations visible in the current scope. - Refactored the main compilation flow to be simpler. There were the `ShaderCompiler` and `ShaderCompilerImpl` classes that weren't relaly doing anything, but added complexity to the whole workflow. - The `render-test` application has been heavily modified to better support testing cross-compilation workflows. At the most basic level we are starting to distinguish pass-through vs. rewriter workflows, and are passing various `#define`s down to the compiler(s) to let the source code be customized as needed for each case. Several annoying corner cases are caused here by having to support the GLSL compilation model, which really wants each entry point in its own specific translation unit, whereas we really want to keep things nicely contained in single files. - Added support for `__intrinsic` operations to have target-specific behavior. This allows a function to be given a different name for some specific target (so a call gets emitted as a call to that other operation). More generally, the library writer can put together an arbitrary format string that will be used in place of expressions that call the given function, e.g.: __intrinsic(hlsl, "$1 - $0") __intrinsic int foo(int a, int b); Given this declaration, a call like `foo(x,y)` will code generate as `x - y` for HLSL, and as `foo(x,y)` for all other targets. Annoying things still to be dealt with: - The way that I'm filtering the user-provided options when passing things down to the compilation of dynamically loaded modules is a bit ad hoc. It would be good to have a systematic notion of which options will be inherited and which won't. There is also more code duplication than I'd like, so we risk having the compiler behave differently when compiling a file at the top level, vs. because of `__import`. - Adding target-specific behavior to intrinsics is all well and good, but the current approach means we can only add this to the original declaration, which limits the ability to easily extend the set of targets. A better approach long-term would be to add a more robust notion of target-based overload resolution (which would happen after semantic checking). Then one mechanism would be used to find the right target-specific overload to use for an operation, and then each (target-specific) definition could use a simpler attribute to intercept code-generation behavior. Note that we might eventually need a similar notion to deal with stage- or profile-specific functions and the overloading behavior around them, so using this for intrinsics doesn't seem like a bad idea.
*	GLSL: get GLSL limping in `render-test`	Tim Foley	2017-06-12
\| \| \| \| \| \|	The test case that is there right now is nominally a cross-compilation test, but for right now it uses the preprocessor to present completely different code for HLSL and GLSL compilation. This change is really just fleshing out the OpenGL side of `render-test` enough that it can produce images using OpenGL to enable further testing.
*	Initial import of code.	Tim Foley	2017-06-09