slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2018-10-26	Work around dxc matrix layout behavior (#694)	Tim Foley
	The Slang compiler allows the default matrix layout convention (row-major vs. column-major) to be specified via the command line or API. When generating output HLSL, Slang emits a `#pragma pack_matrix` directive for the chosen default convention, so that a user can generate plain HLSL output and still have it encode their desired defaults. The problem that has arisen is that many released versions of dxc (including those in the most recent Windows SDK at this time) ignore the `#pragma pack_matrix` directive (the feature has since been added to top-of-tree dxc). The main fix here is to instead pass the `-Zpr` option in to dxc when invoking it if the row-major (non-default) convention is requested. This will solve the problem for clients that use Slang to generate DXIL, but not for clients who use Slang to generate plain HLSL that they then pass into dxc (those clients are assumed to be able to work around the problem for themselves). In order to test the change, I added a test that fills a constant buffer with sequential integers, and then reads out the rows/columns of an `int3x4` matrix with both row- and column-major layout, as well as an integer placed after the matrix, so we can see the offset it was given. The `render-test` application did not yet support generating code via dxc/DXIL, so I added an option for that. This ends up assuming that anybody who is running the D3D12 tests will also have a version of dxc available.
2018-10-25	Feature/premake linux (#689)	jsmall-nvidia
	* Premake work in progress for linux. * Added dump function. * Remove examples on linux Small warning fix. * * Don't build render-test on linux * Removed work around virtual destructor warning, and just used virtual dtor for simplicity * Git ignore obj directories * Fix premake working on windows. * * Fix sprintf_s functions * Make generates arg parsing more robust * Added FloatIntUnion to avoid type punning/strong aliasing issues, and repeated union definitions. * Work around problems building on linux with getClass claiming a strict aliasing issue. * Fix for targetBlock appearing potentiall used unintialized to gcc. * Linux slang link options -fPIC to make dll. * Add -fPIC to build options on linux. * Add -ldl for linux on slang. * Fixes to try and get premake working with .so on linux. * Make core compile with -fPIC * Try to fix linux linking with --no-as-needed before -ldl * Add rpath back. * Remove render-gl from linux build. * Re-add location for linux. * Don't include <malloc.h> except on windows. * Remove unused line to fix warning on osx. * Remove ambiguity on OSX for operator <<. * Fixing ambiguity with operator overloading and Int types for OSX. * Fix ambiguity around UInt and operator * Fix ambiguity of UInt conversion for OSX. * Added UnambiguousInt and UnambiguousUInt to make it easier to work around OSX integer coercion for UInt/Int types.
2018-09-12	Feature/memory arena (#631)	jsmall-nvidia
	* First pass at MemoryArena. * First pass at RandomGenerator. * Extract TestContext into external source file. * Fix warning on printf. * Use enum classes for Test enums. OutputMode -> TestOutputMode. * First pass at FreeList unit test. * Auto registering tests. Improvements to RandomGenerator. * Remove the need for unitTest headers - cos can use registering. * Added unitTest for MemoryArena. * Do unit tests. * Fix typo.
2018-08-03	Major overhaul of Renderer abstraction, to support a new example (#624)	Tim Foley
	The original goal here was to bring up a second example program: `model-viewer`. While the existing `hello-world` example is enough to get somebody up to speed with the basics of the Slang API (as a drop-in replacement for `D3DCompile` or similar), it doesn't really show any of the big-picture stuff that Slang is meant to enable. There wasn't any use of D3D12/Vulkan descriptor tables/sets, and there wasn't any use of interfaces, generics, or `ParameterBlock`s in the shader code. The `model-viewer` example addresses these issues. Its shader code involves generics, interfaces, and multiple `ParameterBlock`s, and the host-side code demonstrates a few key things for working with Slang: * There is an application-level abstraction for parameter blocks, that combines the graphics-API descriptor set object with Slang type information * There is a shader cache layer used to look up an appropriate variant of a rendering effect by using parameter block types to "plug in" global type variables * There is a clear separation between the phases of compilation: a first phase that does semantic checking and enables reflection-based allocation of graphics API objects, followed by one or more code generation passes for specialized kernels. This example is certainly not perfect, and it will need to be revamped more going forward. In particular: * The output picture is ugly as sin. We need a plan for how to get this to load better content, perhaps even popping up an error message to note that the required input data isn't present in the basic repository. * The shader code is too simplistic. There isn't any real material variety, and the `IMaterial` abstraction is completely wrong. * The use of parameter blocks is facile because there are no resource parameters right now. Fixing that will likely expose issues around interfacing with Slang's reflection API. * The whole example exposes the issue that Slang's current APIs aren't really designed for the benefit of two-phase compilation (since our many client application has been stuck on one-phase compilation). * Global type parameters are actually a Bad Idea that we only did for compatibility with existing codebases. We should not be showing them off in an example of the Right Way to use Slang, but the language support for type parameters on entry points is still not complete. Of course, the majority of the changes here are not inside the example applications, and instead involve a major overhaul of the `Renderer` abstraction that is used for both tests and examples. The main thrust of the change is to make the abstraction layer be closer to the D3D12/Vulkan model than to a D3D11-style model. This is important for the `model-viewer` example, since it aspires to show how Slang can be incorporated into a renderer that targets a modern API. The most important bit is actually the use of descriptor sets and "pipeline layouts" a la Vulkan, since without these Slang's `ParameterBlock` abstraction won't make a lot of sense. Implementation of the abstraction for the various APIs has very much been on an as-needed basis. The current implementation is just enough for the two examples to work, plus enough to get all the tests to pass in both debug and release builds on Windows. A big missing feature in the API abstraction right now is memory lifetime management. The code had been trending toward something D3D11-like where a constant buffer could be mapped per-frame with the implementation doing behind-the-scenes allocation for targets like D3D12/Vulkan. I'd like to shift more toward a model of just exposing "transient" allocations that are only valid for one frame, because these are more representation of how an efficient renderer for next-generation APIs will work. That transition isn't actually complete, though, so there are problems with the existing examples where `hello-world` is actually scribbling into memory that the GPU might still be using, while `model-viewer` is doing full-on heavy-weight allocations on a per-frame basis with no real concern for the performance implications. All together, there are a lot of things here that need more work, but this branch has been way too long-lived already, and so I'd like to get this checked in as long as all the tests pass.
2018-07-06	spCompile/spProcessCommandLineArguments return SlangResult (#610)	jsmall-nvidia
	* * Make spCompile return SlangResult * Make spProcessCommandLineArguments return SlangResult (and not internally exit) * Remove calls to exit() * Fix typos * Make all output from spProcessCommandLineArguments get sent to diagnostic sink.
2018-06-28	Share graphics API layer between tests/examples (#603)	Tim Foley
	The `render-test` project has an in-progress graphics API abstraction layer, and it makes sense to share this code with our examples rather than write a bunch of redundant code between examples and tests. Most of this change is just moving files from `tools/render-test/` to a new library project at `tools/slang-graphics/`. The most complicated code change there is renaming from `render_test` to `slang_graphics`. The existing `hello` example was ported to use the graphics API layer instead of raw D3D11 API calls. It is still hard-coded to use the D3D11 back-end and the `SLANG_DXBC` target, so more work is needed if we want to actually support multiple APIs in the examples. I also went ahead and implemented an extremely rudimentary set of APIs to abstract over the Windows platform calls that were being made in the example, so that we could potentially run that same example on other platforms. I did not* port `render-test` to use those APIs, and I also did not implement them for anything but Windows (my assumption is that for most other platforms we would just use SDL2, and require people to ensure it is installed to their machine before building Slang examples).
2018-06-22	Expose macros/functionality for defining interfaces (#604)	jsmall-nvidia
	* Added Result definitions to the slang.h * Removed slang-result.h and added slang-com-helper.h * Move slang-com-ptr.h to be publically available. * Add SLANG_IUNKNOWN macros to simplify implementing interfaces. Use the SLANG_IUNKNOWN macros to in slang.c * Removed slang-defines.h added outstanding defines to slang.h
2018-06-13	Make render-test use Slang for all shader compilation (#597)	Tim Foley
	* Make render-test use Slang for all shader compilation This streamlines the code for render-test by having all its shader compilation go through the Slang API, so that it doesn't have to deal with custom logic to compile HLSL->DXBC and HLSL->DXIL. We were already leaning on Slang to generate SPIR-V for Vulkan, so this makes all the paths more consistent. My original plan with this change was to make the D3D12 render path start using DXIL at this point, since the change would make that easy, but it turns out that some aspects of how we handle parameter binding are not compatible with that right now, so it would need to come as a later change. There's a lot of details here, so I will try to walk through the changes, including the incidental ones: * Add logic to `premake5.lua` so that we copy the necessary libraries for HLSL shader compilation to our target directory from the Windows SDK. This is necessary so that our tests can actually invoke `dxcompiler.dll` * Re-run Premake to generate new project files. This moves around a few files that I manually added in previous changes without re-running Premake. * When invoking `fxc` as a pass-through compiler, be sure to pass along any macros defines via API or command-line. This isn't a strictly required change with how things worked out, but it is a positive one anyway, because it makes `slangc -pass-through fxc` more useful. * Don't print output from a downstream `fxc` invocation if it produces warnings but no errors. The main reason for this is so that our tests don't fail because of `fxc` warnings on Slang's output (which then don't match the baselines), but it can also be rationalized as not wanting to confuse users with warnings that don't come from the "real" compiler they are using. This probably needs fine-tuning as a policy. * Add the HLSL `NonUniformResourceIndex` function. This was an oversight because it isn't documented as a builtin on MSDN, and only gets mentioned obliquely when they talk about resource indexing. * Add `glsl_<version>` profiles to match our `sm_<version>` profiles, so that it is easy for a user to use the profile mechanism to request a specific GLSL version without also specifying a stage name. * Update the render-test logic so that there is a single `ShaderCompiler` implementation that always uses Slang, and get rid of all of the renderer-specific `ShaderCompiler` implementations. * Update logic in render-test `main.cpp` to select the options to use for the eventual Slang compile based on the choice of renderer and input language. I didn't change the options that render-test exposes, even though they are getting increasingly silly (e.g., `-glsl-rewrite` doesn't use GLSL as its input...). * Note: the D3D12 renderer will still use fxc, DXBC, and SM 5.0 for now, since trying to update it to switch to dxc, DXIL, and SM 6.0 didn't work well at the time. * Add a bit of supporting D3D12 code to make sure that we don't allocate a structured buffer when a buffer has a format. * Make sure to also define the `__HLSL__` macro when compiling Slang code, because otherwise a bunch of tests don't work (I'm not clear on how it worked before...). * fixup: missing file
2018-06-05	Fix atomic operations on RWBuffer (#593)	Tim Foley
	* Fix atomic operations on RWBuffer An earlier change added support for passing true pointers to `__ref` parameters to fix the global `Interlocked()` functions when applied to `groupshared` variables or `RWStructureBuffer<T>` elements. That change didn't apply to `RWBuffer<T>` or `RWTexture2D<T>`, etc. because those types had so far only declared `get` and `set` accessors, but not any `ref` accessors (which return a pointer). The main fixes here are: Add `ref` accessors to the subscript oeprations on the `RW` resource types Adjust the logic for emitting calls to subscript accessors so that we don't get quite as eager about invoking a `ref` accessor, and instead try to invoke just a `get` or `set` accessor when these will suffice. This is important for Vulkan cross-compilation, where we don't yet support the semantics of our `ref` accessors. * Add a test case for atomics on a `RWBuffer` * Fix up `render-test` so that we can specify a format for a buffer resource, which allows us to use things other than `StructuredBuffer` and `ByteAddressBuffer`. The work there is probably not complete; I just did what I could to get the test working. * A bunch of files got whitespace edits thanks to the fact that I'm using editorconfig and others on the project seemingly arent... * fixup: remove ifdefed-out code
2018-06-01	1st stage renderer binding refactor (#587)	jsmall-nvidia
	* First pass at support for textures in vulkan. * Binding state has first pass support for VkImageView VkSampler. * Split out _calcImageViewType * Fix bug in debug build around constant buffer being added but not part of the binding description for the test. * Offset recalculated for vk texture construction just store the texture size for each mip level. * When outputing a vector type with a size of 1 in GLSL, it needs to be output as the underlying type. For example vector<float,1> should be output as float in GLSL. * Vulkan render-test produces right output for the test tests/compute/textureSamplingTest.slang -slang -gcompute -o tests/compute/textureSamplingTest.slang.actual.txt -vk * Small improvement around xml encoding a string. * More generalized test synthesis. * Fix image usage flags for Vulkan. * Improvements to what gets synthesized vulkan tests. * Do transition on all mip levels. * Fixing problems appearing from vulkan debug layer. * Disable Vulkan synthesized tests for now. * Add Resource::Type member to Resource::DescBase. * Removed the CompactIndexSlice from binding. Just bind the indices needed. * BindingRegister -> RegisterSet * RegisterSet -> RegisterRange * Typo fix for debug build. * Remove comment that no longer applied.
2018-05-29	Feature/vulkan texture (#579)	jsmall-nvidia
	* First pass at support for textures in vulkan. * Binding state has first pass support for VkImageView VkSampler. * Split out _calcImageViewType * Fix bug in debug build around constant buffer being added but not part of the binding description for the test. * Offset recalculated for vk texture construction just store the texture size for each mip level. * When outputing a vector type with a size of 1 in GLSL, it needs to be output as the underlying type. For example vector<float,1> should be output as float in GLSL. * Vulkan render-test produces right output for the test tests/compute/textureSamplingTest.slang -slang -gcompute -o tests/compute/textureSamplingTest.slang.actual.txt -vk * Small improvement around xml encoding a string. * More generalized test synthesis. * Fix image usage flags for Vulkan. * Improvements to what gets synthesized vulkan tests. * Do transition on all mip levels. * Fixing problems appearing from vulkan debug layer. * Disable Vulkan synthesized tests for now.
2018-05-24	A bunch of work to resolve #569 (#576)	Tim Foley
	* render-test should not fail on HLSL compiler warnings The logic in `render-test` that invokes `D3DCompile` was causing a test to fail if it produced any warnings (not just if compilation fails). Warning output can be dealt with by the test runner, since it will compare output between runs anyway, and it is useful to be able to run something through `render-test` that compiles with warnings. * Be more careful about deleting IR instructions There was an `IRInst::deallocate()` method that had a precondition that the instruction should already be removed from its parent and clear out all its operands before calling, but it wasn't checking this and the few call sites weren't doing things right either. I consolidated things on `IRInst::removeAndDeallocate()` which does all the things: removes from the parent, clear out operands, and then deallocates. I also made sure to clear out the type operand. This clears up some crashing issues where passes were removing instructions but those instructions would still show up as users of other instructions. * Don't emit bitwise not for non-Boolean types It seems like the logic in `emit.cpp` messed things up and decided that `Not` (the IR instruction that is equivalent to `!` in the AST) should emit as `!` for Boolean types and `~` for other types, but this makes no sense (e.g., `~(a & 1)` is very different from `!(a & 1)`, even when interpreted as a condition). It seems like this logic was intended for the `BitNot` case, where `~a` and `!a` are actually equivalent for Boolean values (but a target language might not like `~a` on `bool` values). Maybe the original plan was that the `Not` instruction should only apply to Boolean values in the first place, and that other values should be converted to `bool` (or a vector of `bool`) before applying `Not`, but even in that case the emit logic makes no sense. This caused an actual problem for one of my test cases, so it was important to fix it now. * Fix issue with cached resolution for overoaded operators The basic problem was that the lookup logic was forming a key based on the first definition it found for the overloaded operator, but that means that when processing a prefix `++a` call we might look up the postfix definition of `operator++` and decide to use its opcode as the key. This "fixes" the logic by looking for the first definition with a "compatible" definition (e.g., a `__prefix` function if we are checking a `PrefixExpr`), and then uses its opcode. A better fix in the long run would be to make the cache just be keyed on the operator name and the "fixity" of the expression (prefix, postfix, or infix). * Introduce an intermediate structured control-flow representation The code previously used a single function called `emitIRStmtsForBlocks` in `emit.cpp` that would take a logical sub-graph of the CFG and emit it as high-level statements. It would do this by recognizing operations like coniditional branches that it could turn into high-level `if` statements, etc. The main problem with this function was that it mixed together the logic for how we restructure the program with the logic for how we emit high-level code from that structure. This change splits those two parts of the algorithm by introducing an intermediate data structure: a tree of `Region`s, which represent single-entry regions of the CFG. There are subclasses of `Region` corresponding to various structured control-flow constructs, and then a leaf case that wraps a single `IRBlock`. The new function `generateRegionsForIRBlocks()` (in `ir-restructure.cpp`) now handles the restructuring work, by building one or more `Region`s to represent a sub-graph, while `emitRegion()` handles emitting HLSL/GLSL source code from a region. Splitting things in this way opens up some opportunities for future changes: * We can expand the set of IR control-flow constructs allowed, so long as we can still generate structure `Region`s from them, without having to mess with the emit logic (e.g., we could start to support multi-level `break` by introducing temporaries as needed). In the limit we can generate our `Region`s using something like the "Relooper" algorithm. * We can emit to other representations while retaining the same control-flow restructuring support. E.g., if we drop the structured information from the IR, then emitting to SPIR-V for Vulkan would require us to use the strucured control-flow information from these `Region`s. * We can do analysis that needs to understand `Region` structure. This is relevant to issue #569, which was what prompted me to start on this work. Now that we have a representation of the nesting of `Region`s, we can use it to reason about visibility of values between blocks. During development of this change I ran into a gotcha, in that I had been assuming each IR block would map to a single `Region`, forgetting that our current lowering of "continue clauses" in `for` loops leads to them being duplicated. The `Region` representation handles this by having a linked-list struct mapping IR blocks to the `SimpleRegion`s that represent them. I added a test case that includes a `for` loop with a continue clause that is reached along multiple paths just to make sure that we continue to support that case. The compiler output should not change as a result of this work; this is supposed to be a pure refactoring change. * Add a pass to resolve scoping issues in generated code Fixes #569 The basic problem arises because the structured control flow that we output in high-level HLSL/GLSL doesn't match the "scoping" rules of an SSA IR. In particular, SSA says that a value can be used in any block that is dominated by the definition, but in the presence of `break` and `continue` statements it is easy to construct cases where a block dominates something that is not in its scope for structured control flow. Consider: ```hlsl for(;;) { int a = xyz; if(a) { int b = a; break; } int c = a; } int d = b; ``` This program is invalid as HLSL, because the variable `b` is referenced outside of its scope, but if we look at the CFG for this function, it is clear that the block that computes `b` dominated the block that computes `d`. IR optimizations can easily create code like this, so we need to be ready for it. The previous change added an explicit `Region` structure to represent the structured control flow that we re-form out of the IR, and this change adds a pass that exploits the structuring information to detect cases like the above and introduce temporaries to fix the scoping issue. For example, the pass would change the earlier code block into something like: ```hlsl int tmp; for(;;) { int a = xyz; if(a) { int b = a; tmp = b; break; } int c = a; } int d = tmp; ``` That is, we introduce a new `tmp` variable at a scope "above" both the definition and use of `b`, and then we copy `b` into that temporary right where it is computed, and then use the temporary instead of the original `b` at the use site. A few details that came up during the implementation: * Downstream compilers may get confused by code like the above, and complain that `tmp` may be used before it is initialized, even though the very definition of dominators in a CFG means we don't have to worry about it. Still, I introduced some one-off code to initialize the temporaries just to silence spurious warnings coming from fxc. * We need to be careful not to apply this logic to "phi nodes" (the parameters of basic blocks) since they will already be turned into temporaries by the emit logic, and trying to introduce temporaries with this pass led to broken code (I still need to investigate why). It may be that a future version of this pass should also take the code out of SSA form, so that we can introduce both kinds of temporaries in a single pass (and maybe eliminate some unnecessary variables by doing basic register allocation). There is another transformation that could fix some issues of this kind, by moving code out of a structured control-flow construct and to the "join point" after it. For example, we could turn our loop from the start of this commit message into: ```hlsl for(;;) { int a = xyz; if(a) { break; } int c = a; } int b = a; int d = b; ``` Moving the definition of `b` to after the loop is possible because there is no way to get out of the loop without executing that code anyway. Now the scoping issue for `d`'s use of `b` has gone away, but of course we've introduced a new scoping issue for `a`, when it gets used by `b`. Adding a pass to re-arrange control flow like this could reduce the cases where we have to apply the current pass, but it wouldn't eliminate them entirely. That means such a pass can be deferred to future work. This change includes a test case the reproduces the original issue, so that we can confirm the fix works.
2018-05-11	Generate Visual Studio projects using Premake (#557)	Tim Foley
	* Generate Visual Studio projects using Premake This change adds a `premake5.lua` file that allows us to generate our Visual Studio solution using Premake 5 (https://premake.github.io/). The existing Visual Studio solution/projects are now replaced with the Premake-generated ones, and project contributors will be expected to update these by running premake after adding/removing files. I have not changed the Linux `Makefile` build at all, because that file is also used for things like running our tests, so that clobbering it with a premake-generated `Makefile` would break our continuous testing. Hopefully future changes can switch to a generated `Makefile` and perhaps even add an XCode project as well. Notes: * The `build/slang-build.props` file is no longer needed/used, so it has been removed. * The `slang-eval-test` test fixture wasn't following our naming conventions for its directory path, so it was updated to streamline the Premake build configuration work. This required changes to the `Makefile` as well * Some seemingly unncessary preprocessor definitions that were specified for `core` and `slang-glslang` have been dropped. We will see if anything breaks from that. * Possible fixup for Premake vpath issue Premake's `vpath` feature seems to be nondeterministic about the order it applies filters (because Lua isn't deterministic about the order of entries in a key/value table), and as a result we can end up in a weird case where it decides that a `foo.cpp.h` file matches the `.cpp` filter (I'm not sure why) before it tests against the `.h` filter. This change uses an (undocumented) Premake facility to set `vpath` using a list of singleton tables, which seems to fix the order in which things get tested. * Remove support for "single-file" build of Slang The `hello` example was the only bit of code that uses the "single-file" way of building Slang, and this had already run up against limitations of the Visual Studio compilers in its Debug\|x64 build. Rather than mess with Premake to make it pass through the `/bigobj` linker flag that is needed to work around the issue, it makes more sense just to stop using/supporting the feature since we wouldn't want users to depend on it anyway (our documentation no longer refers to it). While I was at it I went ahead and made sure that the `SLANG_DYNAMIC` flag doesn't need to be set manually, so that instead there is a non-default `SLANG_STATIC` option (not that we have a static-library build of Slang at the moment).
2018-05-04	Use Surface for screen capture in Renderer interface (#551)	jsmall-nvidia
	* Remove serialization of screen captures from a renderer implementation, capture now writes to a Surface. Then client code can decide to serialize (or use as needed). * Improved comment for captureScreenSurface.
2018-05-03	Added Surface type - as a simple value type to hold a 2d collection of ↵	jsmall-nvidia
	pixels. (#548) Added PngSerializeUtil allows currently for just writing Surface of RGBA format. Removes dependency on stbi_image except for in PngSerializeUtil. Removed use of gWindowWidth/Height globals - pass the height into initialize or Renderer.
2018-05-03	Fixes based on review of vulkan-first-render PR #545 (#546)	jsmall-nvidia

2018-05-03	Feature/vulkan first render (#545)	jsmall-nvidia
	* First pass at InputLayout for Vulkan Add support for RGBA_Float32 * Use VulkanModule and VulkanApi to handle accessing Vulkan types. * First pass at Vulkan swap chain/Device queue. * Added VulkanUtil for generic function functions. * Move more functionality to VulkanApi and VulkanUtil. Make Buffer able to initialize itself. * More tidy up around VulkanDeviceQueue * First pass use of VulkanDeviceQueue in VkRenderer * First pass use of VulkanSwapChain on VkRenderer * Added depth formats. Binding for constant and vertex buffers for Vulkan. * Setting up VkImageView on backbuffers. * First pass support for setting up vkRenderPass. * Fixes to work around Vulkan swap chain/verification issues. * Added support for Pipeline and a pipeline cache. * Working without waiting - because use of pipeline cache. * Added support for VkFramebuffer in Vulkan. * First pass at creating Vulkan graphics pipeline. * More efforts to get Vulkan to render. * Small improvement for checking of Binding flags. * Removed setConstantBuffers from the Renderer interface - so that all resource binding takes place through the BindingState. To make this work required a 'hack' in render-test main.cpp - so that the constant buffer binding that is needed in some tests is only added when it doesn't clash. * RendererID -> unified into RendererType. Added getRendererType to Renderer interface. Added ProjectionStyle, and function to get from RendererType. Added getIdentityProjection to RendererUtil - to get projection that is the 'identity' - but hits the same pixels for all projection styles. * Fix build problem on Win32 on Vulkan where should use VK_NULL_HANDLE. * Improve naming, comments. Remove dead code. * Remove unwanted comment.
2018-04-20	Fixes/improvements based on feature/render-binding-resource (#511)	jsmall-nvidia
	* Dx12 rendering works in test framework. * Turn on dx12 render tests. * Split out functions for construction or Renderer types into ShaderRendererUtil. Removed the serialization of buffers code into test-render * Improvements in documentation and typename in BindingState types. RegisterSet -> CompactBindIndexSlice RegisterList -> BindIndexSlice RegisterDesc -> ShaderBindSet * Fix debug build break.
2018-04-19	Separation of Binding/Resource construction on Renderer interface (#508)	jsmall-nvidia
	* Dx12 rendering works in test framework. * Turn on dx12 render tests. * First pass at Resource and TextureResource/BufferResource types. * Fix bug in Dx11 impl for BufferResource. * Dx12 supports TextureResource and binds using TextureResource type, and all tests pass. * Added TextureBuffer::Size type to make handling mips a little simpler. * Small improvements to Dx12 constant buffer binding Removed k prefix on an enum * First pass impl of dx11 createTextureResource Added setDefaults to TextureResource::Desc and BufferResource::Desc to simplify setup accessFlags -> cpuAccessFlags desc -> srcDesc * Split out generateTextureResource - can produce the texture using createTextureResource on the Renderer. * Added support for read mapping to Dx11 accessFlags -> cpuAccessFlags First pass at using TextureResource/BufferResource on Dx11 Some tests fail with this checkin * TextureResource working on all tests on dx11. * Construct ResourceBuffers on Dx11 and Dx12 using utility function createInputBufferResource. * First pass at OpenGl TextureResource * Small fixes to dx12 and dx11 setup. Gl working working using BufferResource and TextureResource * Tidy up around the compareSampler - looks like the previous test was incorrect. * Small documentation /naming improvements. * Fix some more small documentation issues. * First pass testing out construction of binding resources external to Renderer implementation. * Moved some BindingState::Desc types to BindingState to make easier to use. * First pass of binding using BindingState::Desc for Dx11. * First pass at binding with dx12. * Fixed issues around separating dx12 binding from ShaderInputLayout * First pass at OpenGl state binding. * BindingState::Desc::Binding::Type -> BindingType * Use Buffer to manage life of vk resources. Construction of buffers handled by createBufferResource (BindingState doesn't have specialized logic) * Remove InputLayout types from binding so can create a binding independent of it. * Added upload buffer to BufferResource - could be used for write mapping. * m_samplers -> m_samplerDescs. First pass at Vk binding with BindingState::Desc. Small tidy/doc improvements. * First pass with binding all taking place through BindingState::Desc. All tests pass. * Removed support for creating BindingState from ShaderInputLayout * Remove serializeOutput from Renderer interface and all implementations. Implement map/unmap on vulkan Implement serializeBindingOutput which uses map/unmap and BindingState::Desc to write result. * Make implementation of BindingState use the BindingState::Desc for much of state - only hold api specific in BindingDetail per implementation. * Use Glsl binding on vulkan (was using hlsl). * BindingState::Desc::Binding -> BindingState::Binding. Made possible by impls using 'BindingDetail' for their specific needs. * Fix compile problems on win32. * Fix a typo in name createBindingSetDesc -> createBindingStateDesc
2018-04-17	Feature/renderer binding (#489)	jsmall-nvidia
	* Dx12 rendering works in test framework. * Turn on dx12 render tests. * First pass at Resource and TextureResource/BufferResource types. * Fix bug in Dx11 impl for BufferResource. * Dx12 supports TextureResource and binds using TextureResource type, and all tests pass. * Added TextureBuffer::Size type to make handling mips a little simpler. * Small improvements to Dx12 constant buffer binding Removed k prefix on an enum * First pass impl of dx11 createTextureResource Added setDefaults to TextureResource::Desc and BufferResource::Desc to simplify setup accessFlags -> cpuAccessFlags desc -> srcDesc * Split out generateTextureResource - can produce the texture using createTextureResource on the Renderer. * Added support for read mapping to Dx11 accessFlags -> cpuAccessFlags First pass at using TextureResource/BufferResource on Dx11 Some tests fail with this checkin * TextureResource working on all tests on dx11. * Construct ResourceBuffers on Dx11 and Dx12 using utility function createInputBufferResource. * First pass at OpenGl TextureResource * Small fixes to dx12 and dx11 setup. Gl working working using BufferResource and TextureResource * Tidy up around the compareSampler - looks like the previous test was incorrect. * Small documentation /naming improvements. * Fix some more small documentation issues.
2018-04-10	Feature/dx12 compute (#482)	jsmall-nvidia
	* Dx12 rendering works in test framework. * Turn on dx12 render tests. * Getting simpler dx12 compute tests to work. * With expected data in test - check for specialized and then for the default, so that multiple test can share the same expected data, but specialized cases can still be set. * Fixed construction and binding on dx12 textures. * Control which render apis used in test from command line. * Small aesthetic fixes in render-test/main.cpp. * Fix binding problem for uavs/srvs dx12. Previously tried to create srv/uav for StorageBuffers (like dx11 does), but the binding breaks as you can end up with two srvs using the same register. First pass at fixing problems with Texture creation for dx12 - assertions were hit with 3d or array textures. * Fixes to improve Dx12 setup shader resource views for cubemaps/arrays. * Fixed d3d12 textureSamplingTest - problem was that cubemap/array textures were not being uploaded correctly. * Changed the order of how binding of constant buffers (as just set on the Renderer) indexes. Previously they were given the lowest indices, but they clashed with the indices from the 'Binding'. Changing this means all tests run on d3d12. * Add code to allow use of warp (although not command line switchable yet). Fix problem setting up raw UAV - as identified by warp. * Added RenderApiUtil - which can detect if a render api is potentially available. * Moved render flag testing/parsing into RenderApiUtil. * Fix signed/unsigned warning. * Fixes around enums prefixed with k on the review of feature/dx12 compute branch.
2018-04-04	Dx12 rendering works in test framework. (#476)	jsmall-nvidia

2018-04-03	Fixes based on review of feature/dx12 PR. (#473)	jsmall-nvidia

2018-04-02	Feature/dx12 (#469)	jsmall-nvidia
	* Fix signed/unsigned comparison warning. * Split out d3d functions that will work across dx11 and 12. * Improve slang-test/README.md around command line options. * Make Guid comparison honor alignment for comparisons, such that mechanism work on architectures that can only do aligned accesses. * Initial setup of D3D12 Renderer, with presentFrame and clearFrame. * More support for D3D12 * Added FreeList * Added D3D12CircularResourceHeap * First attempt at createBuffer * First pass at map/unmap. * First pass binding vertex/constant buffers, and setting up InputLayout. Note that memory is not kept in scope on binding yet. * First pass of D3DDescriptorHeap * Small tidy up in render-d3d11. Added D3DDescriptorHeap to project. * First pass at D3D12 bind state. * Fix typos in D3D12Resource * Tidy up Dx11 render binding a little to match more with Dx12 style. * First pass at Dx12 BindingState * Handling of the command list d3d12. Support for submitGpuWork and waitForGpu. * First attempt at Dx12 capture of backbuffer to file. * First attempt at D3D12 binding for graphics. * D3D12 setup viewport etc - does now render triangle in render0.hlsl. * First pass at support for compute on D3D12Renderer * Use spaces over tabs in D3DUtil * Tabs to spaces in D3D12DescriptorHeap * Convert tab->spaces on render-d3d12.cpp
2018-03-29	Change uses of "spire" to "slang" (#461)	Tim Foley
	Fixes #350 When the Slang project forked off from the Spire research effort, we renamed things as we went, but many cases seem to have slipped through the cracks. The two biggest diffs here are: - The `hello` example program was incorrectly talking about what was in the shader file (Slang no longer supports the "module" or "pipeline" constructs from Spire), and so it wasn't just a simple rename. - The files under `tests/bindings` were mistakenly using `__SPIRE__` as a preprocessor guard, which means that they weren't actually testing what they meant to. Luckily, it looks like the relevant functionality didn't regress while these tests were unintentionally deactivated.
2018-03-26	Fix signed/unsigned comparison warning. (#455)	jsmall-nvidia

2018-03-26	Renderer resource mangement for render-test (#453)	jsmall-nvidia
	* First pass at resource based renderer using RefObject. * Correct handling of array of buffer pointers to Dx11. * Fix bug with setting viewOut incorrectly in createInputTexture. * More support for allowing com like interfaces. * Added and tidied Slang::Result - adding interface specific results * Guid added comparison support, and made base interface IComUnknown - with lowerCamel methods
2018-03-22	Tidy up of Renderer (#452)	jsmall-nvidia
	* Fixed some small typos in api-users-guide.md * Fix some small typos in slang-test/main.cpp, render-test/render-d3d11.cpp * Remove exit() calls from test code. Added Slang::Result, which works in the same way as COM HRESULT. * FIx bug introduced when moving to Slang::Result - handling E_INVALIDARG on Dx11. * Fix the testing of feature levels on Dx11 renderer. * First attempt at README.md for slang-test. * Tidied up the slang-test README.md file. * Fix some small typos in tools/slang-test/main.cpp * Fix spaces -> tabs problems. Fix some small types. * Refactor Renderer implementations such that: * Class definition does not contain long implementation/s * Removed unused globals * Ordered implementation after class definition * Made renderer specific classes child classes, and use Impl postfix to differentiate * Converted tabs into spaces * First pass at Slang::ComPtr. Added slang-defines.h which sets up some fairly commonly used defines such as SLANG_FORCE_INLINE, compiler detection, os detection, and some other cross platform features. * * Fixed bug in vk renderer - where features structure not initialized on hkCreateDevice * Make member variables in Renderer implementations use prefix * Updated test README.md to document that free parameter can control what test is run * * changed setClearColor to take an array of 4 floats to make API clearer on usage * mix of type usage style - defaulted to more conventional style * * Fixed swapWith * Use SLANG_FORCE_INLINE * Don't bother initializing List data when type is POD * Added convenience macro for Result handling SLANG_RETURN_NULL_ON_ERROR
2018-03-21	First pass impls on ComPtr and reorganise Renderer (#450)	jsmall-nvidia
	* Fixed some small typos in api-users-guide.md * Fix some small typos in slang-test/main.cpp, render-test/render-d3d11.cpp * Remove exit() calls from test code. Added Slang::Result, which works in the same way as COM HRESULT. * FIx bug introduced when moving to Slang::Result - handling E_INVALIDARG on Dx11. * Fix the testing of feature levels on Dx11 renderer. * First attempt at README.md for slang-test. * Tidied up the slang-test README.md file. * Fix some small typos in tools/slang-test/main.cpp * Fix spaces -> tabs problems. Fix some small types. * Refactor Renderer implementations such that: * Class definition does not contain long implementation/s * Removed unused globals * Ordered implementation after class definition * Made renderer specific classes child classes, and use Impl postfix to differentiate * Converted tabs into spaces * First pass at Slang::ComPtr. Added slang-defines.h which sets up some fairly commonly used defines such as SLANG_FORCE_INLINE, compiler detection, os detection, and some other cross platform features.
2018-03-20	SlangResult and small bug/typos fixes (#448)	jsmall-nvidia
	* Fixed some small typos in api-users-guide.md * Fix some small typos in slang-test/main.cpp, render-test/render-d3d11.cpp * Remove exit() calls from test code. Added Slang::Result, which works in the same way as COM HRESULT. * FIx bug introduced when moving to Slang::Result - handling E_INVALIDARG on Dx11. * Fix the testing of feature levels on Dx11 renderer. * First attempt at README.md for slang-test. * Tidied up the slang-test README.md file. * Fix some small typos in tools/slang-test/main.cpp * Fix spaces -> tabs problems. Fix some small types.
2018-02-20	bug fix: witnessTable's subTypeDeclRef should have default substitution for ↵	Yong He
	getSpecailizedMangledName to work properly.
2018-02-02	Remove support for the -no-checking flag (#392)	Tim Foley
	* Remove support for the -no-checking flag Fixes #381 Fixes #383 Work on #382 - No longer expose flag through API (`SLANG_COMPILE_FLAG_NO_CHECKING`) and command-line (`-no-checking`) options - Remove all logic in `check.cpp` that was withholding diagnostics (including errors) when the no-checking mode was enabled - Remove `HiddenImplicitCastExpr`, which was only created to support no-checking mode (it represented an implicit cast that our checking through was needed, but couldn't emit because it might be wrong) - Remove logic for storing function bodies as raw token lists when checking is turned off. I'm leaving in the `UnparsedStmt` AST node in case we ever need/want to lazily parse and check function bodies down the line. - Remove a few of the code-generation paths we had to contend with, but keep the comment about them in place. - Remove GLSL-based tests that can't meaningfully work with the new approach. - Fix other tests that used a GLSL baseline so that their GLSL compiles with `-pass-through glslang` instead of invoking `slang` with the `-no-checking` flag. - Remove tests that were explicitly added to test the "rewriter + IR" path, since that is no longer supported. There is more cleanup that can be done here, now that we know that AST-based rewrite and IR will never co-exist, but it is probably easier to deal with that as part of removing the AST-based rewrite path. We've lost some test coverage here, but actually not too much if we consider that we are dropping GLSL input anyway. * Fixup: test runner was mis-counting ignored tests * Fixup: turn on dumping on test failure under Travis * Fixup: enable extensions in Linux build of glslang
2018-02-02	Initial work on getting render-test to support vulkan (#391)	Tim Foley
	* Basic fixes to gets some Vulkan GLSL out of the IR path We haven't been paying much attention to the Vulkan output from the IR path, but that needs to change ASAP. This commit really just implements quick fixes, without concern for whether they are a good fit in the long term. - Add some more mappings from D3D `SV_` semantics to built-in GLSL variables, and stop redeclaring those built-in variables in our output GLSL. - Add custom output logic for HLSL `StructuredBuffer<T>` types, so that they emit as `buffer` declarations with an unsized array inside. This has some real limitations: - What if the user passes the type into a function? The parameter should be typed as an (unsized) array, and not a buffer. - What happens if we have an array of structured buffers? We need to declare an array of blocks (which GLSL allows), but this changes the GLSL we should emit when indexing. - Customize the way that we emit entry point attributes (e.g., `[numthread(...)]`) to also support outputting equivalent GLSL `layout` qualifiers. In many of these cases, a better fix might involve doing more of this work in the IR as part of legalization (e.g., we already have a pass that deals with varying input/output for GLSL, so that should probalby be responsible for swapping the `SV_` to `gl_`, especially in cases where the types don't match perfectly across langauges). * Start adding Vulkan support to render-test - Add both Vulkan and D3D12 as nominally supported back-ends - Add a git submodule to pull in the Vulkan SDK dependencies - I don't want our users to have to install it manually, since the SDK is huge - Checking in the binaries to our main repository seems like a bad idea, but my hope is that we can prune the bloat using a subodule with the `shallow` cloning option - Implement enough logic for the Vulkan back-end to get a single test passing on Vulkan * Fix warning * Fixup: disable new compute tests for Linux * Fixup: ignore Vulkan tests on AppVeyor * Dynamically load Vulkan implementation Rather than statically link to the Vulkan library, we will dynamically load all of the required functions. This removes the need to have the stub libs involved at all. * Remove vulkan submodule I had set up a `vulkan` submodule to pull in the headers and stub libs, but now that we are going to dynamically load all the symbols anyway, the stub lib binaries aren't needed and we can just commit the headers. * Add Vulkan headers to external/
2018-02-01	Implement type splitting for raw buffers (#393)	Tim Foley
	* Fix render-test to handle raw buffers I don't know if this fix will work for UAVs that are neither structured nor raw, but it fixes the code that currently only really works if every UAV is structured (since it doesn't set a format). * Make type legalization consider raw buffer types The type layout logic was already handling these, but the type splitting logic in legalization was failing to split structure types that contain, e.g., `RWByteAddressBuffer`. A compute test case has been added to confirm the fix.
2018-01-21	Improvements and bug fixes for global type parameters	Yong He
	1. allow spReflection_FindTypeByName to accept arbitrary type expression string 2. allow const int generic value to be used as expression value, and as array size 3. various bug fixes in witness table specialization / function cloning during specializeIRForEntryPoint to avoid creating duplicate global values, not copying the right definition of a function from the other module, not cloning witness tables that are required by specializeGenerics etc.
2018-01-19	Allow arbitrary type string as type argument in spAddEntryPointEx.	Yong He

2018-01-12	Support nested generics	Yong He
	fixes #362
2018-01-09	bruteforce implementation of witness table resolution for associated (#358)	Yong He

2018-01-04	Bug fixes for Slang integration (#356)	Yong He
	* fix #353 * move validateEntryPoint to after all entrypoints has been checked * bug fix: DeclRefType::SubstituteImpl should change ioDiff * bug fix: generic resource usage should have count of 1 instead of 0. * update test case
2017-11-28	Enable HLSL/GLSL "rewrite" + IR-based Slang codegen (#300)	Tim Foley
	The big picture here is that the AST-to-AST pass in `ast-legalize` will now detect when a declaration being referenced comes from an `import`ed module, and (if IR codegen is enabled), it will trigger cloning of the IR for the chosen symbol into an IR module that will sit alongside the legalized AST. Then, during HLSL/GLSL code emit, we emit all the IR-based code first, and then the AST-based code. Whenever the AST code references a symbol that was lowered via IR (we keep track of these) we emit the mangled name of the IR symbol. Notes/details: - A lot of the logic for cloning IR symbols referenced by the AST matches the same logic that would clone them for completely IR-based codegen, so I tried to hoist out the common logic and share it (e.g., so that we apply the same guaranteed transformations in both cases). This required basically rewriting the logic in `emit.cpp` that decomposed the various cases. - There is a new compute test case added to test this functionality. `tests/compute/rewriter.hlsl` confirms that we can use the `-no-checking` mode for the HLSL code, but still make use of a library of Slang code that employs generics, etc. - Adding this test case required adding a new compute test mode that invokes `render-test` with the `-hlsl-rewrite` flag. - It turns out that the existing `tests/render/cross-compile0.hlsl` test should have been using this functionality already. It was opting into the use of the IR via `-use-ir`, and the `render-test` application already tries to set `-no-checking` for non-Slang input languages by default. Fixing the code path this test triggers means that it is now a second test of rewriter+IR codegen. - The `translateDeclRef` logic in `ast-legalize.cpp` seemed sloppy in places, and would potentially clone declarations, when declaration references were desired. I tried to clean a bit of this up, so some call sites are now changed. - This change tries to clean up some work around cloning of global values - All global value kinds (not just functions) now go through the logic of trying to pick a "best" definition, so that they can be used when we are linking multiple modules - The logic for registering cloned values has been unified a bit, so that clients always pass in an `IROriginalValuesForClone` that either wraps a single value (maybe just null), or an `IRSpecSymbol` that gives a list of values to regsiter the new value as a clone for. - I made one piece of code that was cloning witness tables as part of generic specializations not* register a clone. I think this is correct because we may specialize the same generic multiple ways, so registering any values we clone is not the right idea, but I might be missing something... - I also reorganized this logic so that it would be easier to clone a global value when we only know its mangled name (which is the case when it is the AST that triggers cloning) - I made sure that when loading a module via `import`, the translation unit for the new module copies the `-use-ir` flag from the overall compile request, if it is present (otherwise we wouldn't generate IR for loaded modules at all... oops). - Note that `getSpecializedGlobalValueForDeclRef()`, which is the main routine used by the AST legalization to trigger cloning of an IR value does not currently handle declaration references that require specialization. - This change does not deal with trying to unify the type legalization logic between the AST-to-AST rewriter and the IR-based codegen, so if you call an imported function with types that require legalization, Bad Things are expected to happen right now.
2017-11-20	fix warnings	Yong He

2017-11-20	fixup global generic parameters	Yong He
	1. simplify RoundUpToAlignment() 2. add new a render-compute test case to cover the situation where the entry-point interface (parameter/return types of an entry-point function) is dependent on the global generic type. 3. initial fixes to get this test case to compile (but is not producing correct HLSL output yet)
2017-11-17	Add support for global generic parameters (#285)	Yong He
	* Add support for global generic parameters (In-progress work) This commit include: 1. Update Slang API to allow specification of generic type arguments in an `EntryPointRequest` 2. Add parsing of `__generic_param` construct, which becomes a GlobalGenericParamDecl, contains members of `GenericTypeConstraintDecl`. 3. Semantics checking will check whether the provided type arguments conform to the interfaces as defined by the generic parameter, and store SubtypeWitness values in the EntryPointRequest, which will be used by `specializeIRForEntryPoint` when generating final IR. 4. Add a new type of substitution - `GlobalGenericParamSubstitution` for subsittuting references to `__generic_param` decls or to its member `GenericTypeConsraintDecl` with the actual type argument or witness tables. 5. Update `IRSpecContext` to apply `GlobalGenericParamSubstitution` when specializing the IR for an EntryPointRequest. 6. Update `render-test` to take additional `type` inputs, which specifies the type arguments to substitute into the global `__generic_param` types. This commit does not include ProgramLayout specialization. * IR: pass through `[unroll]` attribute (#284) The initial lowering was adding an `IRLoopControlDecoration` to the instruction at the head of a loop, but this was getting dropped when the IR gets cloned for a particular entry point. The fix was simply to add a case for loop-control decorations to `cloneDecoration`. * fix warnings * IR: support `CompileTimeForStmt` (#286) This statement type is a bit of a hack, to support loops that must be unrolled. The AST-to-AST pass handles them by cloning the AST for the loop body N times, and it was easy enough to do the same thing for the IR: emit the instructions for the body N times. The only thing that requires a bit of care is that now we might see the same variable declarations multiple times, so we need to play it safe and overwrite existing entries in our map from declarations to their IR values. Of course a better answer long-term would be to do the actual unrolling in the IR. This is especially true because we might some day want to support compile-time/must-unroll loops in functions, where the loop counter comes in as a parameter (but must still be compile-time-constant at every call site). * Add support for global generic parameters (In-progress work) This commit include: 1. Update Slang API to allow specification of generic type arguments in an `EntryPointRequest` 2. Add parsing of `__generic_param` construct, which becomes a GlobalGenericParamDecl, contains members of `GenericTypeConstraintDecl`. 3. Semantics checking will check whether the provided type arguments conform to the interfaces as defined by the generic parameter, and store SubtypeWitness values in the EntryPointRequest, which will be used by `specializeIRForEntryPoint` when generating final IR. 4. Add a new type of substitution - `GlobalGenericParamSubstitution` for subsittuting references to `__generic_param` decls or to its member `GenericTypeConsraintDecl` with the actual type argument or witness tables. 5. Update `IRSpecContext` to apply `GlobalGenericParamSubstitution` when specializing the IR for an EntryPointRequest. 6. Update `render-test` to take additional `type` inputs, which specifies the type arguments to substitute into the global `__generic_param` types. progress on parameter binding * Add a more contrived test case for specializing parameter bindings * update render-test to align buffers to 256 bytes (to get rid of D3D complains on minimal buffer size). * adding one more test case for parameter binding specialization. * Cleanup according to @tfoleyNV 's suggestions. * fix a bug introduced in the cleanup
2017-11-16	Revise type legalization so it can handle constant buffers (#282)	Tim Foley
	* Revise type legalization so it can handle constant buffers The existing legalization approach with "tuples" can handle scalarizing a `struct` type with resource-type fields in it, but it had several big gaps. The most notable is that given a type that mixes uniform and resource fields, we can't just blindly scalarize things: ``` struct P { float4 a; float4 b; Texture2D t; }; cbuffer C { P gParam[8]; }; ``` The existing code was completely ignoring the declaration of `gParam` inside `C`, but even if we fixed that issue, we'd get something like: ``` cbuffer C { float4 gParam_a[8]; float4 gParam_b[8]; }; Texture2D gParam_t[8]; ``` In this case we've completely changed the layout of the uniform buffer, by switching from AOS to SOA. Even if we could get the type layout logic and the IR to agree on this, it would be a surprise to users, and "principle of least surprise" should be a big deal on a project with as many moving parts as ours. The right thing to do is to have legalization create a "stripped" version of the original type `P` and use that: ``` struct P_stripped { float4 a; float4 b; }; cbuffer C { P_stripped gParam[8]; }; Texture2D gParam_t[8]; ``` Then at a call site, this: ``` foo(gParam); ``` becomes: ``` foo(gParam, gParam_t); ``` This is exactly how the current AST-to-AST legalization handles mixed uniform and resource types, but the way it does it involves some annoying kludges: - That pass has a notion of a "tuple" similar to our legalization, but every tuple has an optional "primary" entry for all the uniform data, plus tuple elements for the resources, and a given field may be represented on one side, the other, or both. It makes the code for handling tuples very messy. - That pass does the "stripping" of types by actually marking up the AST declarations (this is okay because it is constructing a new AST as it goes), so that when they get emitted certain fields don't actually show up. That is, we fix the problem with type `P` by actually modifying the user's declaration of `P`. That seems out of bounds for the IR. This change fixes the problem in our IR type legalization while trying to avoid the problems of the AST-to-AST pass by using two new ideas: 1. We add a new case for `LegalType` (and `LegalVal`) that is a "pair" type, where a pair consists of both an "ordinary" type (for uniform data) and a "special" type (for resource data). E.g., after legalization, the type for `C` (which can be over-simplified to `ConstantBuffer<P>` for our purposes), will be a `LegalType::pair` where the ordinary side is `ConstantBuffer<P_stripped>` and the special side is a tuple containing the `Texture2D` field. 2. We add a new (and annoyingly hacky) AST-level type called `FilteredTupleType` which is semantically a sort of tuple type (it holds a list of elements, and the elements have their own types), but which remembers an "original type" that it was created from, and for each element remembers the field of the original type that it corresponds to. This is used to construct a type like `P_stripped` as an actual AST-level structural type. The core logic for legalizing an aggregate type had to get more complicated just because of the new pair case, so there is now a `TupleTypeBuilder` that asists with taking an aggregate type, processing its fields, and then picking the right `LegalType` representation for the result. Other smaller changes: - Made the legalization logic actually legalize `PtrType<T>`. E.g., if `T` legalizes to a tuple, we need to construct a tuple of pointer types. The same exact thing needs to be applied to arrays, and any other generic type that should "distribute over" pairs/tuples. - Made the legalization logic actually legalize `ConstantBuffer<T>` and similar. The basic idea there is if `T` maps to a pair, we wrap `ConstantBuffer<...>` around the ordinary side, and `implicitDeref` around the special side. - Removed a bunch of `#ifdef`ed-out code from the end of `ir-legalize-types.cpp`. That was code from my first attempt at legalization that failed miserably (trying to do it via local changes and a work list instead of a global rewrite pass), but it had some code I wanted to reference when writing the version that actually got checked in (should have deleted the code earlier, though). - Added a bunch of cases for `LegalType::none` (and the `LegalVal` equivalent) that helped simplify the logic fo the `pair` case by allowing me to always dispatch to both the "ordinary" and "special" sides, even if they might not actually be present. - Renamed `TupleType` and `TupleVal` over to `TuplePseudoType` and `TuplePseudoval` to recognize the fact that we might actually need/want real tuples in the type system, to go along with these fake ones (that need to be optimized away). The biggest doubt I have about this change is the whole `FilteredTupleType` thing; it seems like an obviously contrived type to add to the front-end type system that really only solves IR-level problems. A cleaner approach might have been to just add a plain old `TupleType` to the front-end type system (and initially I started with that), and then have yet another `LegalType`/`LegalVal` case that handles mapping from the fields of the original type to the numbered tuple elements. I expect we'll actually want to make that change in the future (especially if we ever add true tuples to the front-end), but for right now I let myself be swayed by the desire to have these stripped/filtered types get names that explain their provenance ("where they came from") to make our output code more debuggable. The way I've done it is probably overkill, though, and we need a much more complete effort on the readability and debuggability of our output before anything like that is worth worrying about. * Fixup: typo * Fixup: fix output of "non-mangled" names for test cases - Make sure to attach high-level decls to variables created as part of type legalization - Also, try to share more of the code between the different cases of variables - Fix up `parameter-blocks` test case that was passing `-no-mangle` but expecting mangled names in the output - Fix up `multiple-parameter-blocks` to not rely on `-no-mangle` for now, because it would lead to two global variables with the same name (need to fix that underlying issue eventually). - Also fix name generation logic so that we only use "original" names (from high-level decls) specifically when the `-no-mangle` flag is on, and otherwise use IR-level names. * Fix: handle constant buffers better in render-test - Don't request both CB and SRV usage for buffers, since that is illegal - Also, don't try to create an SRV when user requested a CB (since the required usage flag won't be there) - Record the input buffer type on the `D3DBinding` for a buffer, and use that to tell us when to bind a CB instead of SRV/UAV - Fix expected output for `cbuffer-legalize` test now that we are actually feeding it correct cbuffer dta.
2017-11-07	turn on 'treat warnings as errors' (#266)	Yong He

2017-11-04	Merge remote-tracking branch 'refs/remotes/official/master'	Yong He

2017-11-04	fixes x64 warnings	Yong He

2017-11-04	merge with fixWarnings branch	Yong He

2017-11-04	fixed all warnings	Yong He

2017-11-01	Adding support for associated types.	Yong He