slang.git - Making it easier to work with shaders

	Commit message (Collapse)	Author	Age
*	Fixes to make all CPU compute shaders work on CUDA (#1211)	jsmall-nvidia	2020-02-08
\| \| \| \| \| \| \| \| \| \| \| \| \|	* Launch CUDA test taking into account dispatch size. * Enable isCPUOnly hack to work on CUDA. * Rename 'isCPUOnly' hack to 'onlyCPULikeBinding'. * Add $T special type. Support SampleLevel on CUDA. * Fix typo.
*	Feature/test for double behavior (#1186)	jsmall-nvidia	2020-01-29
\| \| \| \| \| \| \| \| \| \| \| \|	* Split out binding writing. * Pass in the entry type. * Take into account output type with -output-using-type Added GPULikeBindRoot Added dxbc-double-problem test. * Add the dxbc-double-problem test.
*	Synthesizing CUDA tests (#1183)	jsmall-nvidia	2020-01-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* When using setUniform clamp the amount of data written to the buffer size. * CUDA implement StructuredBuffer/ByteAddressBuffer as pointer/count as is on CPU. Allow bounds check to zero index. Update docs. * Synthesize tests. * Fix bug in CUDA output. * Fixing more tests to run on CUDA. * Added BaseType for layout of Vector and Matrix - as they are held as int32_t vector array types. * Enable unbound array support on CUDA. * Added unsized array support for CUDA documentation.
*	Slang -> CUDA kernel runs correctly in test infrastructure (#1167)	jsmall-nvidia	2020-01-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* First pass at BindLocation. * Added BindSet::init - for initializing with two input constant buffers. Needs better name, and perhaps should be another class. * Fix handling of constant buffer stripping. Improved initialization. * Trying to generalize BindLocation a little more. Split out CPULikeBindRoot. * More work to make BindLocation et al work with non uniform bindings. * Added parsing to a location. * WIP: Trying to get CPU working with BindLocation. * Describe problem of knowing the type of the reference point in the binding table. * More ideas on getBindings fix. * Remove BindSet as member of BindLocation. * Added BindLocation::Invalid * Made BindLocation able to be key in hash * Use BindLocation for bindings on BindingSet. * Added cuda and nvrtc categories to test infrastructure. Disabled CUDA synthetic tests by default. Fixed such that all tests now produce something in BindLocation style. * Use m_userIndex instead of m_userData on Resource. Move the binding setup out of cpu-compute-util (as no longer CPU specific) * Removed CPUBinding - used BindLocation/BindSet instead. Fixed some bugs around indexOf around uniform indirection. * Renamed BindSet::Resource -> BindSet::Value. * Document BindLocation. * Fixes for Clang/GCC Improve invariant requirement handling when constructing from BindPoints. * WIP: First attempt to run CUDA kernel. * Fix some issues around doing CUDA kernel launch. * Fix issues around use of cudaMemCpy . * Better cuda runtime error checking mechanism. * Fixed bug in passing parameters to cuda kernel launch. Simplified initialisation of context. * WIP: Fix CUDA runtime issues. * Add explicit CUDA synchronize so failures don't appear on implicit ones. * Fix problem emitting non shared variable on CUDA. * Fix some typos in CUDA layout. Use just a pointer for now for CUDA StucturedBuffer. * Arg order for CUDA launch was wrong. * First compute kernel runs on CUDA.
*	Bind Location (#1166)	jsmall-nvidia	2020-01-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* First pass at BindLocation. * Added BindSet::init - for initializing with two input constant buffers. Needs better name, and perhaps should be another class. * Fix handling of constant buffer stripping. Improved initialization. * Trying to generalize BindLocation a little more. Split out CPULikeBindRoot. * More work to make BindLocation et al work with non uniform bindings. * Added parsing to a location. * WIP: Trying to get CPU working with BindLocation. * Describe problem of knowing the type of the reference point in the binding table. * More ideas on getBindings fix. * Remove BindSet as member of BindLocation. * Added BindLocation::Invalid * Made BindLocation able to be key in hash * Use BindLocation for bindings on BindingSet. * Added cuda and nvrtc categories to test infrastructure. Disabled CUDA synthetic tests by default. Fixed such that all tests now produce something in BindLocation style. * Use m_userIndex instead of m_userData on Resource. Move the binding setup out of cpu-compute-util (as no longer CPU specific) * Removed CPUBinding - used BindLocation/BindSet instead. Fixed some bugs around indexOf around uniform indirection. * Renamed BindSet::Resource -> BindSet::Value. * Document BindLocation. * Fixes for Clang/GCC Improve invariant requirement handling when constructing from BindPoints.
*	Setup of runtime cuda device (#1162)	jsmall-nvidia	2020-01-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* CUDA generated first test compiles. * WIP on enabling CUDA in render-test. * Detect CUDA_PATH environmental variable to build build cuda support into render-test. Added WIP cuda-compute-util.cpp/h Added CUDA as a renderer type. * Fix libraries needed for cuda in premake. * Added -enable-cuda premake option. Defaults to false. * Creates CUDA device, loads PTX and finds entry point. * Fix some erroneous cruft from slang-cuda-prelude.h
*	Remove support for explicit register/binding syntax on TEST_INPUT (#1132)	Tim Foley	2019-11-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The `TEST_INPUT` facility allows textual Slang test cases to provide two kinds of information to the `render-test` tool: 1. Information on what shader inputs exist 2. Information on what values/objects to bind into those shader inputs Under the first category of information, there exists supporting for attaching a `dxbinding(...)` annotation to a `TEST_INPUT` which seemingly indicates what HLSL `register` the input uses. There is a similar `glbinding(...)` annotation, used for OpenGL and Vulkan. It turns out that these annotations were, in practice, completely ignored and had no bearing on how `render-test` allocates or bindings graphics API objects. There was some amount of code attempting to validate that explicit registers/bindings were being set appropriately, but the actual values were being ignored. The visible consequence of the `dxbinding` and `glbinding` annotations being ignored is issue #1036: the order of `TEST_INPUT` lines was de facto determining the registers/bindings that were being used by `render-test`. This change simply removes the placebo features and strips things down to what is implemented in practice: the `TEST_INPUT` lines do not need target-API-specific binding/register numbers, because their order in the file implicitly defines them. I added logic to the parsing of `TEST_INPUT` lines to make sure I got an error message on any leftover annotations, and went ahead and systematicaly deleted all of the placebo annotations from our test cases. If we decide to make `TEST_INPUT` lines not depend on order of declaration in the future, we can build it up as a new and better considered feature. The main alternative I considered was to keep the annotations in place, and change `render-test` and the `gfx` abstraction layer to properly respect them, but that path actually creates much more opportunity for breakage (since every single test case would suddenly be specifying its root signature / pipeline layout via a different path using data that has never been tested). The approach in this change has the benefit of giving me high confidence that all the test cases continue to work just as they had before.
*	Simple test profiling (#1062)	jsmall-nvidia	2019-09-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* First pass support for performance profiling * Test across all elements * Fix bug - sourceContents is not used, should use rawSource. * * Add ability to get prelude from API. * Allow specifying source language for render-test * Made it possible to compile a test input file as C++ * Special handling for reflection * Added C++ impl to performance-profile.slang * Remove some clang warnings. * Output profile timings on appveyor and other TC. * Remove passing around of StdWriters (can use global). Small comment improvements.
*	Improvements to testing and ABI for CPU (#1057)	jsmall-nvidia	2019-09-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* WIP: Improving CPU performance/ABI * Optionally output code on CPU for groupThreadID and groupID. * Added ability to set compute dispatch size on command line for render-test. Dispatch compute tests taking into account dispatch size. Added test for semantics are working. * Test using GroupRange. * Fix problem with adding \n for externa diagnostic - to do it if there isn't a \n at the end. Change the ouput order (put result before) so last value is diagnostic string. * Made GroupRange the default exposed CPU ABI entry point style. Removed CPU_EXECUTE test style -as tested via the now cross platform render-test * Split out execution from setup for execution to improve perf. * For better code coverage/testing test all styles of CPU compute entry point. * Improve documentation for ABI changes for CPU code. Add 'expecting' to error message from review. * Fix small typos.
*	CPU ABI improvements (#1056)	jsmall-nvidia	2019-09-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* WIP: Improving CPU performance/ABI * Optionally output code on CPU for groupThreadID and groupID. * Added ability to set compute dispatch size on command line for render-test. Dispatch compute tests taking into account dispatch size. Added test for semantics are working. * Test using GroupRange. * Fix problem with adding \n for externa diagnostic - to do it if there isn't a \n at the end. Change the ouput order (put result before) so last value is diagnostic string.
*	CPU Performance/Testing improvements (#1055)	jsmall-nvidia	2019-09-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* First pass of render-test refactor. * Make window construction a function that can choose an implementation. * Remove OpenGL as currently has windows dependency. * Disable Vulkan as Renderer impl has dependency on windows. * Pass Window in as parameter of 'update'. * Add win-window.cpp as was missing. * Fix warning on windows about signs during comparison. * * Added mechanism to add random arrays as buffer inputs and select type * Improved RenderGenerator to generate more types, and to be more careful around int32 ranges. * Added support for security checks (for Visual Studio C++) * Disable Execption handling being on by default when compiling kernels * Added a 'Group' version of the entry point that will evaluate all threads in a group in a single call. In test code use this method if available. * Added -compile-arg to be able to pass arguments to the compile within render-test * Add documention for the _Group execution feature. * Fix some typos in cpu-target.md
*	Refactor render-test to make cross platform (#1053)	jsmall-nvidia	2019-09-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* First pass of render-test refactor. * Make window construction a function that can choose an implementation. * Remove OpenGL as currently has windows dependency. * Disable Vulkan as Renderer impl has dependency on windows. * Pass Window in as parameter of 'update'. * Add win-window.cpp as was missing. * Fix warning on windows about signs during comparison.
*	CPU compute testing on non windows targets (#1045)	jsmall-nvidia	2019-09-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* WIP: Refactor of CPUCompute and stand alone cpu-render-test * Fix compilation on CygWin. * Make CPU compute tests run on non windows targets. * Check that C/C++ compiler is available for CPU compute. * Fix some tabbing issues. * Add -fPIC on gfx * Use dxcompiler_47.dll from slang-binaries on windows. * make https for git module slang-binaries * Fix comment in premake5.lua around d3dcompiler_47.dll * Add resources to the CPUComputeUtil::Context to keep in scope. * Fixes problem compiling on cygwin where dx12 is included in build of gfx lib.
*	CPU uniform entry point params (#1041)	jsmall-nvidia	2019-09-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* * Made entry point parameters a separate entry point * Made CPUMemoryBinding work with entry point parameters/initialize constant buffers * Added isCPUOnly to bindings, because entry point parameters do not layout like constant buffer * entry-point-uniform.slang works on CPU * EntryPointParams -> UniformEntryPointParams Updated CPU documentation. * Update cpu-target.md to removed completed issues. * Only allocate CPU buffers if the size is > 0. Small update to cpu-target doc.
*	Use getElementStride in toIndex. (#1039)	jsmall-nvidia	2019-08-28
\| \| \|	Make toIndex and toField methods of Location.
*	Two fixes to avoid random crash on destruction of GLRenderer (#1038)	jsmall-nvidia	2019-08-27
\| \| \| \| \| \| \| \|	* Two fixes to avoid random crash on destruction of GLRenderer * Use of a weak reference from objects created by GLRenderer, such that GLRenderer dtor can disable those objects assuming GLRenderer is live * Make sure window is not destroyed before the renderer * Used WeakSink for weak pointer.
*	WIP: CPU sample working with Texture2D (#1033)	jsmall-nvidia	2019-08-26
\| \| \| \| \| \| \| \| \| \|	* WIP: Memory binding. * WIP for binding. * Fix handling of writing to constant buffer. * Fix bug in handling indices.
*	WIP: CPU compute coverage (#1030)	jsmall-nvidia	2019-08-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add support for '=' when defining a name in test. * Add support for double intrinsics. * Add support for asdouble Add findOrAddInst - used instead of findOrEmitHoistableInst, for nominal instructions. Support cloning of string literals. C++ working on more compute tests. * Constant buffer support in reflection. Fixed debugging into source for generated C++. buffer-layout.slang works. * Added cpu test result. * Remove some commented out code. Comment on next fixes. * Improvements to reflection CPU code. * C++ working with ByteAddressBuffer. * Enabled more compute tests for CPU. * Enabled more compute tests on CPU. Added support for [] style access to a vector. * Enabled more CPU compute tests. * Handling of buffer-type-splitting.slang Named buffers can be paths to resources * Fix some warnings, remove some dead code. * Fix problem with verification of number of operands for asuint/asint as they can have 1 or 3 operands. asdouble takes 2. * Fix handling in MemoryArena around aligned allocations. That _allocateAlignedFromNewBlock assumed the block allocated has the aligment that was requested and so did not correct the start address.
*	User defined downstream compiler prelude (#1028)	jsmall-nvidia	2019-08-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Added setDownstreamCompilerPrelude Renamed setPassThroughPath to setDownstreamCompilerPath. Fixed tests. Added prelude directory & code to TestToolUtil to setup default preludes for testing/command line apis. * Fix merge problem * Remove hacks to make prelude work by adding a search path as no longer needed with 'user prelude'. * Split up prelude into scalar intrinsics, and types. Use slang.h for main header. slang-cpp-prelude.h can now just include what it needs (relative to prelude directory) and define the few remaining things/work arounds. * Fix typo.
*	WIP: Compute test running on CPU (#1023)	jsmall-nvidia	2019-08-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* * Simplify some of test code around CPPCompiler * Test using 'callable' with pass-through * Small cpu doc improvements * Improvements to Clang output parsing. * Remove temporary file (base filename) . * Improve handling of external errors - handle severity. * On error dumping out to 'actual' file for runCPPCompilerCompile. * Small fixes. Set the source language type correctly for pass thru. * Remove warning for test for clang backend c * Preliminary work around making render-test compute potentiall work with CPU. Made ShaderCompiler -> a stateless ShaderCompilerUtil. Means we don't require a Renderer interface to do shader compilation. * Refactor such that CPU test can take place in without Window or Renderer. * Hack to look for prelude in source file directory. Fix bug returning the SharedLibrary for HostCallable. * Compute test running on CPU. * Need the prelude currently in same directly as test. * Hack to remove warning - that then produces an error on appveyor build. Disable running render CPU test on non-windows. * Improve handling of disabling CPU tests on linux. * Added bit-cast.slang working on CPU.
*	String/List closer to conventions, and use Index type (#959)	jsmall-nvidia	2019-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* List made members m_ Tweaked types to closer match conventions. * Use asserts for checking conditions on List. Other small improvements. * List<T>.Count() -> getSize() * List<T> Add -> add First -> getFirst Last -> getLast RemoveLast -> removeLast ReleaseBuffer -> detachBuffer GetArrayView -> getArrayView * List<T>:: AddRange -> addRange Capacity -> getCapacity Insert -> insert InsertRange -> insertRange AddRange -> addRange RemoveRange -> removeRange RemoveAt -> removeAt Remove -> remove Reverse -> reverse FastRemove -> fastRemove FastRemoveAt -> fastRemoveAt Clear -> clear * List<T> FreeBuffer -> _deallocateBuffer Free -> clearAndDeallocate SwapWith -> swapWith * List<T> SetSize -> setSize Reserve -> reserve GrowToSize growToSize * UnsafeShrinkToSize -> unsafeShrinkToSize Compress -> compress FindLast -> findLastIndex FindLast -> findLastIndex Simplify Contains * List<T> Removed m_allocator (wasn't used) Swap -> swapElements Sort -> sort Contains -> contains ForEach -> forEach QuickSort -> quickSort InsertionSort -> insertionSort BinarySearch -> binarySearch Max -> calcMax Min -> calcMin * Initializer::Initialize -> initialize List<T>:: Allocate -> _allocate Init -> _init IndexOf -> indexOf * * Put #include <assert.h> in common.h, and remove unneeded inclusions * Small refactor of ArrayView - remove stride as not used * getSize -> getCount setSize -> setCount unsafeShrinkToSize->unsafeShrinkToCount growToSize -> growToCount m_size -> m_count * Some tidy up around Allocator. * Use Index type on List. * Refactor of IntSet. First tentative look at using Index. * Made Index an Int Did preliminary fixes. Made String use Index. * Partial refactor of String. * String::Buffer -> getBuffer ToWString -> toWString * Small improvements to String. String:: Buffer() -> getBuffer() Equals() -> equals * Try to use Index where appropriate. * Fix warnings on windows x86 builds.
*	Feature/test improvements (#934)	jsmall-nvidia	2019-04-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* First pass extract the test information by 'running tests'. * * Checking renderer availablilty * Using TestInfo to determine which tests are run and synthesized * Display if test is synthesized and what render api it's targetting * * Improved comments * Removed some dead code * Display ignored tests. * TestInfo -> TestRequirements * * Added DIAGNOSTIC_TEST type - test always runs (ie has no requirements). * Made diagnostic tests use DIAGNOSTIC_TEST * TestInfo -> TestRequirements * TestDetails holds TestRequirements and TestOptions * Fix debug typo.
*	Adapter selection for Renderer (#923)	jsmall-nvidia	2019-03-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* * Make adapter used selectable on the command line * Added 'adapter' to Renderer::Desc with dx11, dx12, vk honoring it * GL will check that the renderer matches, but cannot select a specific device * Share functionality on dx adapter selection in D3DUtil Note - that on tests that use OpenGL and the adapter doesn't match it will ignore the test (and display a message that the appropriate device couldn't be started) * Small function name improvement. * Variable rename to match type. * Fix typo in Dx12 device selection. * * Add checking if an adapter is warp * Improve some comments
*	First pass support for half on vk (#912)	jsmall-nvidia	2019-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Look at getting half to work on vk. * Alter half test so can always produce consistent test results. * First pass working half on vk. * Improve comments for vulkan extensions around half. * Upgraded vulkan headers to v1.1.103 https://github.com/KhronosGroup/Vulkan-Headers * * Add getFeatures on Render interface * Vulkan renderer determines at startup if it can support half * Parse render-features on render-test * Small changes to half-calc.slang test. * Structured buffer half access works as expected for Vk, but isn't for dx12, so disable for now. * Require the half feature for renderers for the half-structured-buffer.slang test. * * Added ToolReturnCode to be more rigerous about how a return code is passed back from a tool * Added support for a tool being able to pass back an 'ignored' result. * Used enum codes to indicate meanings * Made spawnAndWait return a ToolReturnCode * Ignore tests that don't have required render-feature * Fix macro line continuation usage. * Check dx12 has half support. * Checking for half on dx12 - if CheckFeatureSupport fails, don't fail renderer initialization. * Fix typo.
*	* Added ToolReturnCode to be more rigerous about how a return code is passed ↵	jsmall-nvidia	2019-03-18
\| \| \| \| \| \| \|	back from a tool (#911) * Added support for a tool being able to pass back an 'ignored' result. * Used enum codes to indicate meanings * Made spawnAndWait return a ToolReturnCode
*	Add -profile option to render-test to override a profile to use (#898)	jsmall-nvidia	2019-03-12
\|
*	First steps toward supporting interface-type parameters on shaders (#852)	Tim Foley	2019-02-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* First steps toward supporting interface-type parameters on shaders What's New ---------- From the perspective of a user, the main thing this change adds is the ability to declare top-level shader parameters (either at global scope, or in an entry-point parameter list) with interface types. For example, the following becomes possible: ```hlsl // Define an interface to modify values interface IModifier { float4 modify(float4 val); } // Define some concrete implementations struct Doubler : IModifier { float4 modify(float4 val) { return val + val; } } struct Squarer : IModifier { ... } // Define a global shader parameter of interface type IModifier gGlobalModifier; // Define an entry point with an interface-type `uniform` parameter void myShader( unifrom IModifier entryPointModifier, float4 inColor : COLOR, out float4 outColor : SV_Target) { // Use the interface-type parameters to compute things float4 color = inColor; color = gGlobalModifier.modify(color); color = entryPointModifier.modify(color); outColor = color; } ``` The user can specialize that shader by specifying the concrete types to use for global and entry-point parameters of interface types (e.g., plugging in `Doubler` for `gGlobalModifier` and `Squarer` for `entryPointModifier`). The "plugging in" process is done in terms of a concept of both global and local "existential slots" which are a new `LayoutResourceKind` that represents the holes where concrete types need to be plugged in for existential/interface types. In simple cases like the above, each interface-type parameter will yield a single existential slot in either the global or entry-point parameter layout. Users can query the start slot and number of slots for each shader parameter, just like they would for any other resource that a parameter can consume. Before generating specialized code, the user plugs in the name of the concrete type they would like to use for each slot using `spSetTypeNameForGlobalExistentialSlot` and/or `spSetTypeNameForEntryPointExistentialSlot`. There are some major limitations to the implementation in this first change: * Parameters must be of interface type (e.g., `IFoo`) and not an array (`IFoo[3]`), or buffer (`ConstantBuffer<IFoo>`) over an interface type. Similarly, `struct` types with interface-type fields still don't work. * The work on interface-type function parameters still doesn't include support for `out` or `inout` parameters, nor for functions that return interface types (that isn't technically related to this change, but affects its usefullness). * No work is being done to correctly lay out shader parameters once the concrete types for existential slots are known, so that this change really only works when the concrete type that gets plugged in is empty. These limitations are severe enough that this feature isn't really usable as implemented in this change, and this merely represents a stepping stone toward a more complete implementation. Implementation -------------- The API side of thing largely mirrors what was already done to support passing strings for the type names to use for global/entry-point generic arguments, so there should be no major surprises there. The logic in `check.cpp` computes the list of existential slots when creating unspecialized `Program`s and `EntryPoint`s (this is logically the "front end" of the compiler), and then checks the supplied argument types against what is expected in each slot when creating specialized `Program`s and `EntryPoint`s. This again mirrors how generic arguments are handled. Type layout was extended to compute the number of existential slots that a type consumes, and will thus automatically assign ranges of slots to top-level and entry-point shader parameters in the same way it already allocates `register`s and `binding`s. The big missing feature is the ability to specialize a layout to account for the concrete types plugged into the existential-type slots. IR generation for specialized programs and entry points was slightly extended so that it attaches information about the concrete types plugged into the existential slots, and the witness tables that show how they conform to the interface for that slot. The linking step needed some small tweaks to make sure that information gets copied over to the target-specific program when we start code generation. The meat of the IR-level work is in `ir-bind-existentials.cpp`, which takes the information that was placed in the IR module by the generation/linking steps and uses it to rewrite shader parameters. For example, if there is a shader parameter `p` of type `IModifier`, and the corresponding existential slot has the type `Doubler` in it, we will rewrite the parameter to have type `Doubler`, and rewrite any uses of `p` to instead use `makeExistential(p, /witness that Doubler conforms to IModifier/)`. Once the replacement is done on the parameters, the existing work for specializing existential-based code when the input type(s) are known kicks in and does the rest. Testing ------- A single compute test is added to validate that this feature works. It is narrowly tailored to not require any of the features not supported by the initial implementation (e.g., all of the concrete types used have no members). The test case does include use of an associated type through one of these existential-type parameters, which has exposed a subtle bug in how "opening" of existential values is implemented in the front-end. Rather than fix the underlying problem, I cleaned up the code in the front-end to special case when the existential value being opened is a variable bound with `let`, to directly use a reference to that variable rather than introduce a temporary. Similarly, in the IR generation step, I added an optimization to make variables declared with `let` skip introducing an IR-level variable and just use the SSA value of their initializer directly instead. * fixup: missing files * fixup: incorrect type for unreachable return * fixup: actually comment ir-bind-existentials.cpp
*	Split front- and back-ends (#846)	Tim Foley	2019-02-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Split front- and back-ends This change is a major refactor of several of the types that provide the behind-the-scenes implementation of the public C API. The goal of this refactor is primarily to allow for future API services that let the user operate both the front- and back-ends of the compiler in a more complex fashion. For example, as user should be able to compile a bunch of source code into modules, look up types, functions, etc. in those modules, specialize generic types/functions to the types they've looked up, and then finally request target code to be gernerated for specialized entry points. The back-end code generation they trigger should re-use the front-end compilation work (parsing, semantic checking, IR generation) that was already performed. The most visible change is that `CompileRequest` has been split up into several smaller types that take responsibility for parts of what it did: * The `Linkage` type owns the storage for `import`ed modules, and well as the `TargetRequest`s that represent code-generation targets. The intention is that an application could use a single `Linkage` for the duration of its runtime (so long as it was okay with the memory usage), so that each `import`ed module only gets loaded once. For now, this type needs to manage the search paths, file system, and source manager, because of its responsibility for loading files. * A `FrontEndCompileRequest` owns the stuff related to parsing, semantic checking, and initial IR generation. This most notably includes the `TranslationUnitRequest`s and the `FrontEndEntryPointRequest`s (which used to be just `EntryPointRequest`s). It's main job is to produce AST and IR modules for each translation unit, and to find and validate the entry points. The front-end request does not interact with generic arguments for global or entry-point generic parameters. * The main output of both `import` operations and front-end translation units is the `Module` type, which is just a simple container for both the AST module (to service the reflection/layout APIs, and also for semantic checking of code that `import`s the module) and the IR module (for linking and code generation). This type captures the commonalities between the old `LoadedModule` (which is now just an alias for `Module`) and `TranslationUnitRequest` (which now owns a `Module`). * The secondary output of front-end compilation is a `Program`, which comprises a list of referenced `Module`s and validated `EntryPoint`s that will be used together. Layout and code generation both need a `Program` to tell them what modules and entry points will be used together (we don't want to just code-gen everythin that has ever been loaded into the linakge). The `Program`s created by the front-end do not include generic arguments, so they may provide incomplete layout information and/or be unsuitable for code generation. * A `BackEndCompileRequest` owns stuff related to turning a `Program` into output kernels for the targets of a `Linkage`. Most of the data it owns beyond the `Program` to be compiled is minor, so this is a good candidate for demotion from a heap-allocated object to just a `struct` of options that gets passed around. * The `CompileRequestBase` type is an attempt to wrap up the common functionality of both front-end and back-end compile requests. Most of it is just exposing the availability of a linkage and `DiagnosticSink`, so this type is a good candidate for subsequent removal. The main interesting thing it has is the flags related to dumping and validation of IR, so there is probably a good refactoring still to be made around deciding how options should be handled going forward. * Behind the scenes, the `Program` type is set up to handle some level of on-line compilation and layout work. The `Program` knows the `Linkage` it belongs to, and allows for a `TargetProgram` to be looked up based on a specific `TargetRequest`. A `TargetProgram` then allows layout information and compiled kernel code to be asked for on-demand, in order to support eventual "live" compilation scenarios. * The `EndToEndCompileRequest` type is a composition/coordination type that replaces the old `CompileRequest` in a way that uses the services of the various other types. It owns a few pieces of state that only make sense in the context of an end-to-end compile (e.g., there is really no way to "pass through" code when the front- and back-ends are run separately) or a command-line compile (everything to do with specifying output paths for files is really just for the benefit of `slangc`, and might even be moved there over time). * One important detail is that the `EndToEndCompilRequest` owns all of the string-based generic arguments for both global and entry-point generic parameters. The logic in `check.cpp` for dealing with those arguments has been heavily refactored to separate out the parsings steps that are specific to end-to-end compilation with string-based type arguments, and the semantic checking steps that result in a specialized `Program` (which can be exposed through new APIs that aren't tied to end-to-end compilation). It is perhaps not surprising that this change had a lot of consequences, so I'll briefly run over some of the main categories of changes required: * I changed the way that global generic arguments are passed via API (use `spSetGlobalGenericArgs` instead of the generic arguments for `spAddEntryPointEx`, which are not just for entry-point generics), which has been a change that we've needed for a long time. This is technically a breaking API change, although we should have very few client applications that care about it. * A bunch of places that used to take "big" objects like `CompileRequest` now just take the sub-pieces they care about (e.g., a function might have only needed a `Linkage` and a `DiagnosticSink`). This makes many subroutines or "context" struct types more generally useful, at the cost of taking more parameters. * In a few cases the conceptually clean separation of the layers breaks down (often for edge-case or compatibility features), and so we may pass along additional objects that are allowed to be null, but are used when present. A big example of this is how the back-end code generation routines accept an `EndToEndCompileRequest` that is optional, and only used to check whether "pass through" compilation is needed. We should probably look into cleaning this kind of logic up over time so that we don't need to violate the apparent separation of phases of compilation. * In cases where separation of layers was being broken for the sake of GLSL features, I went ahead and ripped them out, since all of that should be dead code anyway. * In many cases I increased the encapsulation of data in the core types to help track down use sites and make sure they are following invariants better. * In cases where code was doing, e.g., `context->shared->compileRequest->session->getThing()` I have tried to introduce convenience routines so that the usage site is just `context->getThing()` to improve encapsulation and allow changes to be made more easily going forward. * The `noteInternalErrorLoc` functionality was moved off of the compile request and into `DiagnosticSink`, since that is the one type you can rely on having around when you want to note an internal error. We may consider going forward if (and how) it should reset the counter used for noting locations on internal errors. * A few APIs now take `DiagnosticSink` arguments where they didn't before, and as a result some public APIs need to create `DiagnosticSink`s to pass in, before going ahead and ignoring the messages. In the future there should be variations of these APIs that accept an `ISlangBlob` parameter for the output. fixup: missing include for compilers with accurate template checking (non-VS) * fixup: review feedback
*	Fix for Dx12 to stop crash when dxil cannot be handled by driver (#851)	jsmall-nvidia	2019-02-14
\| \| \| \| \| \|	* If there is a problem with initialize RenderTestApp::initialize constructing the pipeline, this was not being reported as an error causing a later crash. * Use same style to return error.
*	Path simplification/hash mode, plus bug fixes (#788)	jsmall-nvidia	2019-01-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* * Fix memory bug around expanding va_args - needed buffer to have space for terminating 0 * Fix problem with FileWriter defaults being globals, as memory they allocate, will only be freed after return from main - work around by making StdWriters RefObject derived, and kept in scope such the writers are destroyed before checks for leaks is found * Added SimplifyPathAndHash mode for CacheFileSystem - will simplify the path and see if simplified path is in cache before reading file (limiting amout of underlying file requests) * * Added calcReplaceChar * Renamed DefaultFileSystem to OSFileSystem * Made OSFileSystem convert windows \ to / on linux * Simplified logic for caching in CacheFileSystem. * Added pragma-once-c to add extra test, but also so there is an 'include' directory in preprocessor tests. * Small fixes in pragma once test. * Simplified cache handling path, so that paths/simplified paths area always added. * Improve naming of methods for different caches.
*	Feature/unique tool source names (#766)	jsmall-nvidia	2019-01-07
	* Remove AppContext. Use StdChannels to hold writers, and TestToolUtil to hold test tool specific functionality. * StdChannels -> StdWriters * getStdOut -> getOut, getStdError -> getError * Renamed main.cpp files of tools to try and stop visual studio getting confused between files - such that clicking on an error takes editor to the right location.