| Age | Commit message | Author |
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Test for internal error with resource/dynamic dispatch.
* Fix typo.
Co-authored-by: Theresa Foley <tfoleyNV@users.noreply.github.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Moved to experiments.
Added some more tests.
* More tests around associated types.
* Return interface tests.
* More tests.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Some generic experiments.
* Add some more generic tests.
* More generic experiments/issues.
* Some more generic tests.
* Remove erroneous test.
* Small improvements.
* Disable test that was accidentally enabled.
* Add equality-2.slang.
* Some more generic tests.
* Issues around type inference.
* Some more generic tests.
* Tuple experiment.
* Generic interfaces don't seem to be supported.
* Add inheritance test.
* Alternative array type issue.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Update slang-llvm dependencies.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Fix bool handling in constant folding for generic parameters.
|
|
First, we have a CUDA-only test that simply needed a format name to be changed to match the new conventions in `gfx`.
Second, we have one of the "active mask" tests that seems to produce different results locally for developers (under Vulkan) than it does on CI. This is almost certainly down to differences in GPUs and/or drivers. The inconsistency ultimately proves the point that I was trying to make when I wrote those tests - the "active mask" concept is effectively meaningless as exposed in D3D and Vulkan because it has not been specified in a way that allows programmers to reason about its value, and drivers have implemented wildly different interpretations of its supposed semantics for so long that there is no real hope of turning `WaveGetActiveMask()` into something that returns a well-defined value in any but the most trivial cases.
TLDR: I disabled that test for Vulkan, which means it is completely disabled.
|
|
Fixes #1990
The underlying problem here is in the `ExtractExistentialType` AST node class.
An "existential" in current Slang is typically a value of interface type. When such a value is used in an operation, the type-checker "opens" the existential so that subsequent type-checking steps can work with the (statically unknown) specific type of the value stored inside. The `ExtractExistentialType` AST node represents the type of an existential that has been "opened" in this way.
When the front-end performs lookup "into" a value with one of these types, it needs to use a reference to the original interface declaration with a "this-type substitution" that refers to the "opened" type (a this-type substitution tells the compiler the concrete type it should use in place of `This` in signatures within the interface; it allows the compiler to "see" the right associated type definitions to use in a context).
Prior to this change, the implementation would store the specialized reference to the original interface declaration in the `ExtractExistentialType` node as part of its state. The catch there is that the specialized interface reference indirectly refers to the `ExtractExistentialType` AST node itself, creating a circularity. As soon as the front-end performs any operation that tries to recurse over that structure, it would go into an infinite loop.
The fix here sounds kind of like a hack, but seems to be pretty nice in practice. Instead of always storing the specialized interface reference, we instead store the few values that are needed to construct it, and then create and cache the actual reference on-demand. The on-demand created fields are not considered part of the state of the AST node for any kind of recursion or serialization, so they avoid the original problem.
A single test case was added that represents the original bug, and confirms the fix.
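The create-and-cache-on-demand idea described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the actual Slang code: `InterfaceDecl`, `SpecializedInterfaceRef`, `OpenedType`, and `visitState` are hypothetical names invented for the example.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration; these are not Slang's real classes.
struct InterfaceDecl { std::string name; };
struct OpenedType;

// The interface plus the concrete type to substitute for `This`.
struct SpecializedInterfaceRef {
    const InterfaceDecl* decl;
    const OpenedType* thisType; // refers back to the opened type: the cycle
};

struct OpenedType {
    explicit OpenedType(const InterfaceDecl* decl) : interfaceDecl(decl) {}

    // Real state: only what is needed to construct the reference.
    const InterfaceDecl* interfaceDecl;

    // Build the specialized reference on first use, then reuse it.
    const SpecializedInterfaceRef& getSpecializedInterfaceRef() const {
        if (!m_cachedRef) {
            m_cachedRef = std::make_unique<SpecializedInterfaceRef>(
                SpecializedInterfaceRef{interfaceDecl, this});
        }
        return *m_cachedRef;
    }

    // Traversals (recursion, serialization) visit state fields only, so
    // they never follow the cached back-reference.
    template <typename Visitor>
    void visitState(Visitor&& visit) const { visit(*interfaceDecl); }

private:
    mutable std::unique_ptr<SpecializedInterfaceRef> m_cachedRef;
};
```

Because the cached member is not reported by `visitState`, a generic walk over the node's state terminates even though the cached reference points back at the node.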
|
|
* Use detected shader model in gfx/d3d12.
* Enable all d3d12 tests on Github.
* Improve d3d12 software device detection.
* Disable d3d12 tests on github for now.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use updated slang-binaries that have SPIR-V diagnostics improvements.
* Re-enable nv-ray-tracing-motion-blur, because with SPIR-V diagnostic fixes in glslang - there shouldn't be spurious errors from glslang compilation.
* If optimization fails use the SPIR-V we have.
* Update slang binaries.
* Hack to disable gfx unit tests for now to try and get CI pass for this PR.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use updated slang-binaries that have SPIR-V diagnostics improvements.
* Re-enable nv-ray-tracing-motion-blur, because with SPIR-V diagnostic fixes in glslang - there shouldn't be spurious errors from glslang compilation.
* If optimization fails use the SPIR-V we have.
* Update SPIR-V headers and generated files.
Updated documentation.
* Update spirv-headers/tools.
Revert slang-binaries.
* Remove hack around spir-v optimization as no longer needed.
disable nv-ray-tracing-motion-blur.slang
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Support for test proxy.
* Turn on testing using proxy.
* Don't pass sink into check of downstream compiler.
* Small change to kick off build.
* Remove register specification on transcendental.
* Increase poll timeout.
Small improvements to proxy.
* Disable gfx unit tests.
* Put test runner in shared library mode by default.
* Change comment. Kick off another CI test.
* Small edit to kick off builds.
* Run unit tests on proxy.
* Turn on using proxy for now.
* Enable swift shader.
* Fix typo.
Add exception support.
* Make the default spawn type SharedLibrary
Use isolation for gfx unit tests.
* Update slang-binaries.
* Fix typo.
* Report unit test output information.
|
|
* Format list updated with additional formats supported by both D3D and Vulkan; D3DUtil::getMapFormat() and VkUtil::getVkFormat() updated to include additional formats; GFX_FORMAT() updated with all additional formats (BC compression unfinished)
* Finished updating GFX_FORMAT with newly added formats and sizes; Pixel size is now tracked using the FormatPixelSize struct containing the values for bytes per block and pixels per block to accommodate BC formats; Updated gfxGetFormatSize and associated sub-calls to return FormatPixelSize instead of uint8_t; Most calls to gfxGetFormatSize() updated to reflect changes, a couple of calls still not updated
* Changes to accommodate new formats finished, debugging slang-literal unit test
* First format unit test working
* One test added for BC1Unorm and RGBA8Unorm_SRGB, both passing
* Refactored format testing code to merge BC1Unorm and RGBA8Unorm SRGB into a single file
* All unit tests added for BC and Srgb formats
* Most tests added and working; Added five additional formats (still need tests) and made the appropriate changes to support these; createTextureView() modified for D3D11, D3D12, and Vulkan to take into account the format specified in the texture view desc when the texture's format is typeless
* Format enums renamed to more closely match their D3D counterparts; Added a universal float and uint buffer and buffer view for use across all Format tests
* Remaining tests added; D3D12 tests pass, but Vulkan crashes in BC1_UNORM and D3D11 spits out a bunch of D3D11 Errors (but supposedly passes)
* re-run premake
* Added Sint versions of test shaders; Vulkan and D3D11 tests also pass
* Size struct for format unit tests no longer uses initializer lists
* Fixed a Size struct missed in the previous pass
* Fixed minor bugs causing tests to fail
* Added documentation detailing all currently unsupported formats
* Skip tests causing unsupported format warnings due to swiftshader
* updated several tests using old Format enum names
* Revert change to compareComputeResult() that was added for debugging purposes
* DEBUGGING: Added prints to identify which formats are failing on CI
* Reverted attempted debugging changes; Fixed texture2d-gather.hlsl to use updated Format enums
* Fixed incorrect array sizes in d3d11 _initSrvDesc()
* Commented out further tests that produce unexpected results when tested for Vulkan with swiftshader
* Revert "Merge branch 'expanded-format-support' of https://github.com/lucy96chen/slang into expanded-format-support"
This reverts commit 20008f0d3ecc3b1405ecac8c138edaa3cd37ed6b, reversing
changes made to 6081e95827315fee50e18409394d5abd62fac787.
* Added a fuzzy comparison function for use with floats
* submodule update
* Revert messed up changes caused by previous revert after automatically merging on github
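As a rough illustration of why the size is tracked as bytes per block plus pixels per block rather than a single `uint8_t`, here is a minimal sketch. The field names and the `sizeInBytes` helper are assumptions made for this example and are not taken from the gfx headers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the struct described above; field names are assumed.
struct FormatPixelSize {
    std::uint32_t bytesPerBlock;
    std::uint32_t pixelsPerBlock;
};

// Bytes needed for `pixelCount` pixels, rounded up to whole blocks.
inline std::size_t sizeInBytes(FormatPixelSize format, std::size_t pixelCount) {
    std::size_t blocks =
        (pixelCount + format.pixelsPerBlock - 1) / format.pixelsPerBlock;
    return blocks * format.bytesPerBlock;
}
```

An uncompressed RGBA8 format would be `{4, 1}`, while BC1 stores a 4x4 block of pixels in 8 bytes, i.e. `{8, 16}`, which is less than one byte per pixel and cannot be expressed as a whole per-pixel byte count. A real texture size calculation would also round each dimension up to the block width and height, which this simplified helper ignores.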
|
|
`bool`. (#1987)
* Passing associated type arguments to existential parameters + packing for `bool`.
* fix typo
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Diagnostic for no type conformance + bug fix.
* Fixes.
* Fix.
* Include heterogeneous example only with --enable-experimental-projects premake flag
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
|
* Bring heterogeneous-hello-world back up to date.
* Reintroduced heterogeneous-hello-world into the premake
* No longer uses compiled bytecode for entry point, instead a loadModule
call is hardcoded with the slang file name.
* Entry point is, similarly, hardcoded for now.
* Added a bypass to slang-legalize-types for an unneeded GPUForeach check
* Run premake and change to relative path
* Removed experimental and added README
* Add prebuild command to premake for heterogeneous example
* Pass in entry point as parameter (also remove shader bytecode)
* Pass in module name as parameter
* Squashed commit of the following:
commit 5b13b57fe600724344c556fe4309a5d6bb3d39ab
Author: Kai Yao <kyao@nvidia.com>
Date: Thu Oct 7 23:38:50 2021 -0700
Return diagnostics data when encountering module load error by exception (#1966)
commit 112e1515c30fa972ff56f91514b70946153c718c
Author: jsmall-nvidia <jsmall@nvidia.com>
Date: Thu Oct 7 16:12:29 2021 -0400
Disable test crashing CI (#1965)
* #include an absolute path didn't work - because paths were taken to always be relative.
* Disable test that appears to be crashing.
commit da32069a0c1c8c723d7ef45100049a8f0dd5d9c4
Author: Kai Yao <kyao@nvidia.com>
Date: Mon Oct 4 13:58:51 2021 -0700
Modified barrier API to accept multiple resources per call (#1959)
Co-authored-by: Yong He <yonghe@outlook.com>
commit 97bb82ebcdf8f1391b9d93b5a8d7b1dfc4e88e52
Author: jsmall-nvidia <jsmall@nvidia.com>
Date: Mon Oct 4 14:15:51 2021 -0400
Removing exceptions from core/compiler-core (#1953)
* #include an absolute path didn't work - because paths were taken to always be relative.
* Refactor Stream. Working on all tests.
* Split out CharEncode.
* Make method names lower camel.
m_prefix in Writer/Reader
* Tidy up around CharEncode interface.
* Small improvements around encode/decode.
* Better use of types.
* Remove readLine from TextReader.
* Remove exceptions from Stream/Text handling.
* Fix some typos.
* Fix tabbing.
* Fix missing override.
* Remove remaining exception throw/catch via using signal mechanism.
* Remove exceptions that are not used anymore.
* Document the Stream interface.
* Remove index for decoding 'get byte' function.
* Fix CharReader -> ByteReader.
commit b3dfe383c6d31ff3dbd76dcfb32de8d536382f3e
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Mon Oct 4 09:46:33 2021 -0700
Get native handles for TextureResource and BufferResource (#1960)
* Added getNativeHandle() to TextureResource and BufferResource; Implemented getNativeHandle() in Vulkan and D3D12; Added new unit test files for the aforementioned implementation
* Added missing getNativeHandle() implementations to renderer-shared.cpp and CUDA
* Finished new getNativeHandle() unit tests for ITextureResource and IBufferResource; Modified ICommandQueue and ICommandBuffer unit tests to call QueryInterface to convert to IUnknown then back and compare resulting pointers for equality
* Unit tests updated and pass locally
* Cast m_buffer.m_buffer and m_image to uint64_t
commit 35bca4cc432613af3926da3bed217a6baa9cbd26
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Fri Oct 1 13:08:25 2021 -0700
Add getNativeHandle() to ICommandQueue and ICommandBuffer (#1952)
* Added support for getting command buffer and command queue handles to ICommandBuffer and ICommandQueue; D3D12Device, VkDevice, and DebugDevice modified to implement this new functionality; immediate-renderer-base.cpp also modified to implement the new functions
* Removed excess boilerplate
* Changed readRef() to get() in D3D12 getNativeHandle() implementation for ICommandBuffer and ICommandQueue
* Added unit tests for new getNativeHandle() implementations, unfinished
* Queue test added; Minor cleanup changes
* getBufferHandleTestImpl() now closes the command buffer before returning
* Added getNativeHandle() implementations to CUDADevice
* Added comment clarifying that the Vulkan check is checking for a null handle, which is defined to be 0
commit 6c6200f547c7387598743b23bb3c8f0d375d9494
Author: Kai Yao <kyao@nvidia.com>
Date: Thu Sep 30 20:25:34 2021 -0700
VK Resource Barrier (#1955)
* Resource barrier API and VK implementation
* Stub implementations
* Handle VK Acceleration Structure flag
* Add a couple more cases to pipeline barrier stages
commit 627fc976bac5c2381dbace9c7925cb6a68b8de12
Author: Yong He <yonghe@outlook.com>
Date: Thu Sep 30 19:48:47 2021 -0700
Fix aarch64 build on github (#1957)
Co-authored-by: Yong He <yhe@nvidia.com>
commit 122d701513e116856bd59c999221ce36a373d7db
Author: Yong He <yonghe@outlook.com>
Date: Thu Sep 30 17:51:56 2021 -0700
Fix GitHub release (#1956)
* Fix aarch64 release build config.
* Fix for WinAarch64 build.
* Update premake for embed-std-lib build on aarch64.
* `platform` fix for aarch64 build.
* Try revert back to use absolute output path for slang-stdlib-generated.h
* Fix
* fix
Co-authored-by: Yong He <yhe@nvidia.com>
commit aa8f7b899b7b562b3d3c6e25c3da41569505e70c
Author: Chad Engler <englercj@live.com>
Date: Wed Sep 29 13:02:47 2021 -0700
Fix ARM64 detection for MSVC (#1951)
commit 6736b0c1c5fa3e89bc561eb7965a1a0d17af3466
Author: Yong He <yonghe@outlook.com>
Date: Wed Sep 29 11:29:46 2021 -0700
Add ISession::loadModuleFromSource. (#1950)
Co-authored-by: Yong He <yhe@nvidia.com>
commit d8e452412e14a6a8ba137f2adcae13b398e5cecb
Author: Yong He <yonghe@outlook.com>
Date: Tue Sep 28 15:03:03 2021 -0700
Fix AbortCompilationException leaking through loadModule API. (#1949)
* Fix AbortCompilationException leaking through loadModule API.
* Update.
* Fix.
Co-authored-by: Yong He <yhe@nvidia.com>
commit cdf1b2c007fefdca128584d2a9f63dec3d350e16
Author: Yong He <yonghe@outlook.com>
Date: Tue Sep 28 11:54:24 2021 -0700
Improvements to the unit test framework. (#1948)
commit af788b62e18bbd55cd748ad60400a74cf1bc93ee
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Fri Sep 24 16:53:41 2021 -0700
Add existing device handle support unit test (#1946)
commit bec8e6aec85b6e3f875c58bdd59eb15613978358
Author: Yong He <yonghe@outlook.com>
Date: Fri Sep 24 11:33:44 2021 -0700
Move existing unit tests to a standalone dll. (#1945)
commit f2a3c933bc11a498c622fa18694c84beca8ca031
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Thu Sep 23 12:19:49 2021 -0700
Add method to retrieve native handles (#1944)
* Added a getNativeHandle() method that retrieves the natively created handles; Modified RendererBase, VKDevice, D3D12Device, and DebugDevice to implement this new method
* Moved ExistingDeviceHandles out of Desc directly inside IDevice and renamed to NativeHandles; Modified calls accessing the struct accordingly in RendererBase, DebugDevice, VKDevice, and D3D12Device
* Minor cleanup changes (renames, etc.)
commit b9b398d038b524f15a86ff27cd6888d54e8754e0
Author: Yong He <yonghe@outlook.com>
Date: Wed Sep 22 10:06:59 2021 -0700
Add gfx unit testing framework. (#1943)
* Add gfx unit testing framework.
* Fix compilation error.
* Reset gfxDebugCallback after render_test.
* Pass enabledApi flags through.
* Fix for code review suggestions.
Co-authored-by: Yong He <yhe@nvidia.com>
commit 6e9cee69b3588ddae09b08b9f580f59ad899983f
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Tue Sep 21 18:46:32 2021 -0700
Support for existing device/instance handles in Vulkan (#1942)
commit b1f04c8544c650de3947955ca68f679535d249aa
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Wed Sep 15 20:22:45 2021 -0700
Allow D3D12Device to use an existing device handle (#1940)
* Added a new field for an existing device handle to IDevice::Desc; Modified D3D12Device::initialize to set the device stored in desc if it already exists instead of creating a new one
* Turned existingDeviceHandle into a struct containing an array of two elements; Updated D3D12Device::initialize to match changes to existingDeviceHandle; Updated comments
* Fixed style error for ExistingDeviceHandles struct
commit 2f7b9f5ae8be21c6c1d75ae9caefbc7b3f8986a9
Author: Pablo Delgado <private@pablode.com>
Date: Thu Sep 16 01:17:57 2021 +0200
Fix incorrect WIN32 macros and missing Windows.h inclusion (#1939)
* Replace WIN32 preprocessor macros with _WIN32
* Add missing Windows.h include for InterlockedIncrement
commit 11d43642008905ac69a3832eb8a9b2ae7b785f86
Author: Yong He <yonghe@outlook.com>
Date: Tue Sep 14 11:36:44 2021 -0700
Avoid upcasting to f32 in 16bit float-uint bit cast. (#1938)
Co-authored-by: Yong He <yhe@nvidia.com>
commit 502aa3812a82cf0d091cff0c67804e4ee448ac78
Author: David Siher <32305650+dsiher@users.noreply.github.com>
Date: Tue Sep 14 12:59:55 2021 -0400
Bring heterogeneous-hello-world back up to date. (#1935)
* Bring heterogeneous-hello-world back up to date.
* Reintroduced heterogeneous-hello-world into the premake
* No longer uses compiled bytecode for entry point, instead a loadModule
call is hardcoded with the slang file name.
* Entry point is, similarly, hardcoded for now.
* Added a bypass to slang-legalize-types for an unneeded GPUForeach check
* Run premake and change to relative path
* Removed experimental and added README
Co-authored-by: Yong He <yonghe@outlook.com>
* Revert "Squashed commit of the following:"
This reverts commit 4f665858d65f7c332c616ef6db9fdafa1c5e0b9f.
* Run premake
* Remove prebuild command (only works on Windows?)
* Rerun premake
* Fix heterogeneous prebuild command
* Remove linux specific prebuild command
* Fix prebuild command (again)
* Change target from dxbc to hlsl to see if that fixes linux issues
* Use Path::getFileNameWithoutExt
* Change string-literal.slang.expected to have extra filename in decoration
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* GFX: implement mutable shader objects.
* Revert unnecessary changes
* Revert more changes.
* Fix clang errors.
* Fix clang/gcc errors.
* Fix clang errors.
* Remove CPU test.
* Fix after merge.
* Fix after merge.
* Remove gl test
* Code review fixes.
* Fixing all vk validation errors.
* Flush test output more often.
* Fix a crash in `specializeDynamicAssociatedTypeLookup`.
* temporarily disable std-lib-serialize test to see what happens
* Fix crashes.
* Make sure cpu gfx unit tests are properly disabled on TeamCity.
* Disable cpu test.
* Fix.
* Fix cuda.
* Disable nv-ray-tracing-motion-blur
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Upgrade to GLSLANG 11.16.0+
* Small edit to readme - really to kick another build.
* Upgrade slang-binaries to include new glslang binaries.
* Update slang-binaries to include linux-x86
* Upgrade slang-binaries.
* Support for GL_NV_ray_tracing_motion_blur extension.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Reenable erroneously disabled test.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Disable test that appears to be crashing.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Refactor Stream. Working on all tests.
* Split out CharEncode.
* Make method names lower camel.
m_prefix in Writer/Reader
* Tidy up around CharEncode interface.
* Small improvements around encode/decode.
* Better use of types.
* Remove readLine from TextReader.
* Remove exceptions from Stream/Text handling.
* Fix some typos.
* Fix tabbing.
* Fix missing override.
* Remove remaining exception throw/catch via using signal mechanism.
* Remove exceptions that are not used anymore.
* Document the Stream interface.
* Remove index for decoding 'get byte' function.
* Fix CharReader -> ByteReader.
|
|
+ Implement bit_cast between float16 and uint16 in GLSL.
+ Enable pack-any-value-16bit test on vk.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* First integration with 'slang-llvm'.
* Fix project.
* Fix test output.
* First pass assert support.
* Add inline impls for min and max.
* Add inline abs impl for llvm.
* Make abs not use ternary op
* Fix typo in slang-llvm.h
* Sundry fixes to make remaining tests using llvm backend pass.
|
|
* `reinterpret` and 16-bit value packing.
* Update `half-texture` cross-compile test reference result.
* Revert inadvertent reformatting of slang-ir-inst-defs.h
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix crash: dynamic dispatch of generic interface method.
* Fix memory error.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
This function takes a user-provided `typeID` and an arbitrarily typed value, and turns them into an existential value whose `witnessTableID` is `typeID` and whose `anyValue` is the user-provided value. This allows users to pack the runtime type ID information in an arbitrary way.
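A host-side sketch of the idea, for illustration only: the `Existential` struct layout, the 16-byte payload size, and the `makeExistential`/`unpackValue` helpers are assumptions made for this example, not the Slang function or its actual data layout.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Assumed layout: a type id followed by a fixed-size opaque payload.
struct Existential {
    std::uint32_t witnessTableID;
    std::array<std::uint8_t, 16> anyValue;
};

// Pack a caller-chosen type id and an arbitrary small value.
template <typename T>
Existential makeExistential(std::uint32_t typeID, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value, "value must be plain data");
    static_assert(sizeof(T) <= 16, "value must fit in the payload");
    Existential result{};
    result.witnessTableID = typeID;
    std::memcpy(result.anyValue.data(), &value, sizeof(T));
    return result;
}

// Read the payload back as the type the caller knows it to be.
template <typename T>
T unpackValue(const Existential& existential) {
    static_assert(sizeof(T) <= 16, "value must fit in the payload");
    T value;
    std::memcpy(&value, existential.anyValue.data(), sizeof(T));
    return value;
}
```

The point of the sketch is that the caller, not the compiler, decides which id accompanies the payload, so the id can be stored or transported however the application likes.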
|
|
* Add GLSL450 intrinsics to SPIRV direct emit.
* Fix.
* Fix compiler error.
* Fix.
* Fix compiler error.
* Make direct-spirv tests actually run.
|
|
* Further implementation of SPIRV direct emit.
This change implements:
- Struct, Vector, Matrix and Unsized Array types.
- Basic arithmetic opcodes, vector construct, swizzle etc.
- getElementPtr, getElement, fieldAddress, extractField.
- SPIRV target intrinsics with SPIRV asm code in stdlib.
- RWStructuredBuffer and StructuredBuffer.
- Pointer storage class propagation.
- Control flow.
* Fix.
|
|
* Work to mitigate SPIR-V bloat
SPIR-V is not an especially compact format, but some patterns in how Slang generates code and then runs it through `spirv-opt` lead to many redundant field-by-field copy operations being emitted. This change attempts to address some of the resulting bloat from the Slang side of things.
Note: experimentation shows that the bloat is less pronounced when running either *no* SPIR-V optimizations or *full* SPIR-V optimizations, so it is also likely that the bloat should be addressed by changing which `spirv-opt` passes the Slang compiler runs in default (`-O1`) builds. Such changes should come as a distinct pull request.
This change primarily does two things:
First, the code generation strategy for passing arguments to `out` and `inout` parameters has been changed. In the past, the compiler would *always* copy the argument value into a temporary, then pass the address of the temporary, and then write back the value after the call. The new code generation strategy attempts to identify when an argument value already has a simple address in memory and passes that address directly when possible. This eliminates many copy operations that occur before/after calls to functions with `out`/`inout` parameters.
Second, we introduce an IR optimization pass that detects call sites where the entire contents of a buffer (usually a constant buffer) is being passed to a callee function, such that many bytes are loaded and then passed even if only very few are used in the callee. The pass moves the load operations from the caller to a specialized version of the callee where possible (e.g., when the constant buffer in question is a global shader parameter). Doing this eliminates another major category of copies.
Notes:
* The IR lowering logic is complicated by the fact that several kinds of l-values (values that are usable as the destination of assignment, or for `out`/`inout` arguments) are not actually addressable. An easy example is a non-contiguous swizzle like `v.xwz` on a `float4`, where the value occupies 12 bytes, but not 12 consecutive bytes with a single address. There are many more corner cases like that and the IR lowering pass carries a lot of complexity to deal with them. A more systematic overhaul is due some time soon.
* The IR representation of `out` and `inout` parameters deserves some careful scrutiny when making these kinds of changes. The official semantics of `inout` in HLSL has been "copy in copy out" (and `out` is just "copy out") which is observably different from any solution that passes in the address of an l-value directly. By making this change we are saying that Slang's semantics are not precisely those of legacy HLSL, and that our semantics for `inout` parameters are closer to those of `inout` in Swift or of a mutable borrow in Rust. In the Swift case the implementation can freely pass the underlying storage of an l-value or the address of a temporary, and valid programs may not observe the difference. It is thus illegal to observe the value in a storage location while a mutation to that location is "in flight." All of this is way more detailed and technical than 99% of Slang users will ever care about, but importantly it gives us semantic cover to eliminate these copies in the IR, and also to emit output C++ code that implements `out` and `inout` as by-reference parameter passing.
* There was an existing generic pass for specializing functions based on call sites that uses a "template method" style of pattern to customize its behavior. That pass needed to be generalized to handle this use case because it had previously operated on the assumption that the "desire" to specialize a callee function must be driven by the parameter declarations of that function, and not on the argument values passed in. The code has been slightly refactored to allow the policy for specialization to consider both parameters and arguments.
* Unsurprisingly, a bunch of the GLSL (and thus SPIR-V) generated has changed with this work, so several baseline `.slang.glsl` files needed to be updated.
* This change is incomplete in that it does not address broader cases of buffer loads, including both partial loads from constant buffers (just loading one field, but a field that uses a "large" structure type), and loads from multi-element buffers (a load from a structured buffer where the element type is "large"). The main question in each of those cases is how to define how "large" a structure needs to be before we decide to try and sink loads into callee functions like this. In the worst case, sinking loads in this way may actually create *more* memory traffic (because the same values get loaded in multiple callee functions).
* fixup: run premake
* fixup: typo
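To make the "observably different" point about `inout` concrete, here is a small C++ sketch contrasting the two calling strategies. It only models the semantics; the function names are hypothetical and nothing here is generated by, or taken from, the Slang compiler.

```cpp
// Callee with an `inout`-style parameter that also peeks at the caller's
// storage while the mutation is "in flight".
inline int incrementAndObserve(int& param, const int& observedStorage) {
    param += 1;
    return observedStorage;
}

// Legacy HLSL style: copy in, call on a temporary, copy out.
inline int callCopyInCopyOut(int& storage) {
    int temp = storage;                            // copy in
    int seen = incrementAndObserve(temp, storage); // callee mutates the temporary
    storage = temp;                                // copy out
    return seen;
}

// New strategy: pass the address of the l-value directly.
inline int callByReference(int& storage) {
    return incrementAndObserve(storage, storage);
}
```

Both strategies leave the same final value in `storage`, but a callee that looks at the storage mid-call sees the old value under copy-in/copy-out and the new value under by-reference passing. Declaring such observation invalid is what allows either strategy to be used freely.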
|
|
|
|
* Update VS projects to 2019.
* Empty commit to trigger build
* Implement gfx inline ray tracing on D3D12.
* Allow render-test to run inline ray tracing tests.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fixes related to combined texture/sampler types
Work on #1891
Our intention has always been to support combined texture/sampler types in Slang, both for targets like OpenGL where that is the only option available and for targets like Vulkan where it can be beneficial to performance. Because Slang's current users mostly focus on D3D12+Vulkan codebases, they strongly prefer separate textures and samplers, and the relevant support code in Slang has "bit-rotted" over many releases until the functionality that was there isn't useful any more.
This change significantly overhauls the implementation of combined texture/sampler types and adds a test that uses them in the hopes of avoiding future regressions.
The new combined texture/sampler types use the prefix `Sampler`, so where there is an existing standalone `Texture2D` type the equivalent combined texture/sampler type will be `Sampler2D`. The intention is that this naming mirrors the GLSL conventions (where the type is `sampler2D`) while following HLSL naming precedent (to the extent it exists).
The operations available on a `Sampler2D` are intended to be those that are available on a `Texture2D`, and it is just that in cases where the `Texture2D` operation would take a separate sampler argument:
    Texture2D t = ...;
    SamplerState s = ...;
    float4 result = t.Sample(s, uv);
the equivalent `Sampler2D` operation just elides that argument:
    Sampler2D s = ...;
    float4 result = s.Sample(uv);
In terms of implementation, there are a lot of subtle details here:
* I've tried to use the same metaprogramming logic that generates all the stdlib declarations for `Texture*` to also generate `Sampler*` in the hopes that this helps keep them in sync.
* The big catch to the above is that it means that for certain operations the indices of parameters depend on whether or not an explicit sampler parameter is used. Rather than try to tweak the indices in the stdlib generation logic (which is already complicated) I went and added Yet Another Hack to the logic that handles intrinsic definition strings. Basically, the special-case handling of `$p` has been modified so that it *also* applies a negative offset to future parameter references in the same intrinsic string.
* Trying to actually bring this up in our test framework revealed that the "flat" reflection API was seemingly not reflecting combined texture/sampler types correctly at all (it was reflecting them as just plain textures). Other than that issue, the Vulkan path seems to work fine with combined texture/samplers.
* I also had to add logic to the `TEST_INPUT` parsing to re-introduce handling of the combined types (that was something I consciously left out to reduce the amount of code in the earlier refactor there). Luckily, the architecture is such that a combined texture/sampler can leverage most of the existing logic for the separate cases.
* fixup: review feedback
|
|
This change adds support for variadic macros in the C-style
preprocessor, e.g.:
    #define DEBUG(MSG, ...) print(__FILE__, __LINE__, MSG, __VA_ARGS__)
Similar to the gcc preprocessor, this feature supports both named
variadic macro parameters and unnamed ones (which then default to
`__VA_ARGS__`).
The implementation work is mostly straightforward, although there are a
few subtle design choices worth mentioning:
* A variadic macro is represented by it having a variadic *parameter*
that is part of the ordinary parameter list.
* Argument parsing does *not* detect whether the macro being invoked is
variadic and collect/combine arguments to form a single argument value
for the variadic parameter. This is motivated by the need for some
extensions to differentiate a variadic parameter receiving a single
empty argument vs. zero arguments.
* Because any reference to the variadic parameter needs to expand to the
comma-separated arguments that match it, the logic for turning a macro
parameter reference into a list of tokens has been factored out into a
subroutine that handles the details.
* The choice in the earlier refactor to have a macro invocation collect
all the argument tokens (including the intervening commas) into a
single token list seems to pay off here, because it means that the
tokens in the expansion of a variadic parameter reference were already
stored contiguously.
* The special-case logic for handling an empty argument list had to be
tweaked again to ensure that an empty argument list is treated as
having zero arguments for the variadic parameter. Note that
historically C did not define the behavior of this case, and always
required at least one argument for any variadic macro parameter.
* The logic for checking whether the number of arguments to a macro
invocation is valid needed to handle variadic and non-variadic macros
as distinct cases. There really isn't much overlap in how the checks
need to work, even if we tried to change the underlying representation.
The main missing feature here is any way to discard a comma in a macro
body that appears before a variadic parameter reference, e.g.:
#define DEBUG(...) print("debug:", __VA_ARGS__)
In this case, an invocation with an empty argument list, `DEBUG()`, will
expand to `print("debug:",)` - a call with a trailing comma in the argument
list.
If users end up needing a way to discard commas in cases like this, we
have many options we can consider. This change does not implement any of
them to keep the initial work as minimal as possible.
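As a rough illustration of the kind of macro this enables, the sketch below uses the same `DEBUG(MSG, ...)` shape in C++, whose preprocessor follows the same C rules. `formatDebug` is a hypothetical stand-in for `print` and is not part of Slang. Because nothing discards the comma before `__VA_ARGS__`, this sketch needs at least one variadic argument.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for `print`: joins the message and every extra
// argument into one space-separated string.
template <typename... Args>
std::string formatDebug(const char* msg, const Args&... args)
{
    std::ostringstream out;
    out << msg;
    // Expanding the pack inside an initializer list streams each argument
    // in left-to-right order.
    int expand[] = {0, ((out << ' ' << args), 0)...};
    (void)expand;
    return out.str();
}

// Variadic macro: everything after MSG is forwarded through __VA_ARGS__.
#define DEBUG(MSG, ...) formatDebug(MSG, __VA_ARGS__)
```

With these definitions, `DEBUG("count", 1, 2)` expands to `formatDebug("count", 1, 2)`.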
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Add support for sizeOf/alignOf/offsetOf to stdlib.
Add $G intrinsic expansion that works off the generic parameters, not the param type
* Test cuda layout.
* Fix CUDA layout issues.
Fix reflection to handle other built in types.
Fix __offsetOf
* Tests of reflection and layout as reported directly from CUDA.
* Comment about use of aligned size as size.
* Fix warning from VS.
* Check alignment is pow2.
* Small improvements to alignment calcs.
* Tab to spaces.
* Fix alignment pointer sizes on 32 bit OS for CUDA.
* Fix CUDA reflection on 32 bit.
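The alignment-related items above rest on two standard calculations: checking that an alignment is a power of two, and rounding a size up to that alignment (the "aligned size"). A minimal C++ sketch follows; the function names are hypothetical and not taken from the Slang source.

```cpp
#include <cstddef>

// True only for 1, 2, 4, 8, ...; zero is rejected.
inline bool isPowerOfTwo(std::size_t value)
{
    return value != 0 && (value & (value - 1)) == 0;
}

// Rounds `size` up to the next multiple of `alignment`.
// Assumes `alignment` is a power of two, which is why the check above matters.
inline std::size_t alignUp(std::size_t size, std::size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}
```

For example, a 12-byte type with 16-byte alignment has an aligned size of `alignUp(12, 16)`, which is 16.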
|
|
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Re-enable CUDA RWTexture tests.
Re-enable RWTexture1D test
Make sure tests have only single mip for RWTexture (required for CUDA)
* Fix issue with reading CUDA surface.
Re-enable working CUDA RWTextureTest.
Enable 1D case.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Fixes around Float16. Incorrect calculation of 'elementSize'.
|
|
* Include a "stack trace" with nested-import errors
When errors occur in nested `#include` files it is often helpful to have a "stack trace" / traceback of the `#include` chain that led from a root translation unit to the file with an error.
This change implements a similar feature for `import`s.
It is worth noting that `import`s don't really *require* this kind of compiler support the way `#include`s do because the intention is that the meaning of an `import`ed file does not depend on the order or nesting of `import`s. As such, when trying to *fix* an error in an `import`ed file, you usually don't care how it came to be `import`ed into your shaders.
The use case here is somebody adapting a large body of Slang code to use in a different codebase, such that they have certain `.slang` files they don't actually intend to have compile correctly, and they want to be able to diagnose how they came to include those files when/if they cause problems.
The actual feature implementation is pretty simple because we already track a stack of active `import`s so that we can detect and diagnose recursive `import`s. This change updates the diagnostics when there is an error in imported code so that instead of just noting the innermost `import` site it lists all the `import` sites that were active at the time.
The change includes a test case to confirm that the behavior works (at least for the case of a parse error).
* fixup: test outputs
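The stack-of-active-imports idea described above can be sketched as follows. This is not the Slang implementation: `ImportSite`, `ImportStack`, and the note text are hypothetical and only show how one stack can serve both recursion detection and the traceback.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record of one active `import`: where it appears and what it imports.
struct ImportSite
{
    std::string file;
    int line;
    std::string importedModule;
};

// Stack of imports that are currently being processed, outermost first.
class ImportStack
{
public:
    // Returns false (and pushes nothing) if the module is already being
    // imported, which is how a recursive import would be detected.
    bool push(const ImportSite& site)
    {
        for (const ImportSite& active : m_sites)
            if (active.importedModule == site.importedModule)
                return false;
        m_sites.push_back(site);
        return true;
    }

    void pop() { m_sites.pop_back(); }

    // Builds one note per active import site, innermost first, so an error in
    // imported code can list the whole chain instead of only the last import.
    std::vector<std::string> traceback() const
    {
        std::vector<std::string> notes;
        for (std::size_t i = m_sites.size(); i-- > 0;)
        {
            const ImportSite& s = m_sites[i];
            notes.push_back(s.file + "(" + std::to_string(s.line) +
                            "): note: see import of '" + s.importedModule + "'");
        }
        return notes;
    }

private:
    std::vector<ImportSite> m_sites;
};
```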
Co-authored-by: Yong He <yonghe@outlook.com>
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
|
* Fix a bug in preprocessor "busy" logic
This bug manifested in both incorrect preprocessor output for certain complicated cases, and also (more importantly) a use-after-free issue.
One of the "clever" design choices in the Slang preprocessor implementation is that the set of "busy" macros during expansion is implicitly defined by a linked list of those invocations that are actively being read from as part of the input stack. This logic works very well for checking whether a macro name is busy before triggering expansion, and for computing what macros should be considered busy during expansion of an object-like macro.
The problem case here was with function-like macros where the preprocessor was re-using the same list of busy macros that had been fetched when reading the macro name, but doing so *after* all the macro argument tokens had been read. Because additional tokens had been read from the input stream stack, there was no guarantee that the invocations that had been active before were still live.
The new logic computes the set of busy macros fresh before starting expansion of a function-like macro invocation. A test is included to ensure that the case that showed the use-after-free bug has been fixed.
In addition, the new logic is careful to compute the busy macros only based on where subsequent tokens will come from and not based on any macros that might have contributed to the argument tokens themselves. A test case has been added that relies on getting this detail correct.
* Update slang-preprocessor.cpp
Remove a test typo.
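A minimal sketch of the "busy" check described above, assuming a hypothetical `ActiveInvocation` node type; the real data structures in `slang-preprocessor.cpp` are not shown in this log.

```cpp
#include <string>

// Hypothetical node in the input stack: one macro invocation that is still
// being read from, linked to the invocation it was expanded inside of.
struct ActiveInvocation
{
    std::string macroName;
    const ActiveInvocation* parent; // nullptr at the bottom of the stack
};

// A macro is "busy" if it appears anywhere on the chain that starts at the
// invocation currently on top of the input stack.
inline bool isBusy(const ActiveInvocation* top, const std::string& name)
{
    for (const ActiveInvocation* inv = top; inv != nullptr; inv = inv->parent)
        if (inv->macroName == name)
            return true;
    return false;
}
```

For a function-like macro, the point of the fix is that `top` should be whatever is on top of the stack after the argument tokens have been read, not a pointer saved when the macro name was read, since reading the arguments may already have popped that older node.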
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
|
* Various Fixes to gfx, reflection and emit.
- Fix GLSL emit to properly output `*bitsTo*` functions for `IRBitCast` insts.
- Add line directive mode setting for `ISession`.
- Extend `TypeLayout::getElementStride` to handle `VectorType` case.
- Fix `IDevice::readBufferResource`'s D3D12 implementation to copy only the requested bytes out.
- Fix `render-test` to use the `ISession` from `gfx` instead of creating its own `ISession` to make sure `gfx` and `render-test` agree on WitnessTable and RTTI IDs.
- Extend `render-test` to support filling vector and matrix values in the new `set x = ...` TEST_INPUT syntax.
- Add a `dynamic-dispatch-15` test case to make sure packing / unpacking works correctly across all targets, and to make sure render-test's RTTI/WitnessTable ID filling logic is correct for non-trivial cases.
* Remove default-major test
* Fix cyclic reference in `ExtendedTypeLayout`.
* Move `lineDirectiveMode` setting to `TargetDesc`.
Add `structureSize` to `TargetDesc` and `SessionDesc` for future binary compatibility.
* Cleanup.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Fix issue with SLANG_ENABLE_GLSLANG_SUPPORT
* Update expected output from glslang-error.glsl
* Fix bug in glsl disassembly.
* Make ExtensionTracker available even if source is not emitted.
* Only explicitly set extension tracker based on capability bits, if we are in pass through.
* Small simplification of invoke sourceEmit.
|
|
If the user has a derived `struct` type:
```hlsl
struct Base { int b = 1; }
struct Derived : Base { int d = 2; }
```
Then it is still reasonable for them to want to use initializer lists when declaring variables using the `Derived` type:
```hlsl
Derived x = {};
Derived y = { 7, 8 };
```
This change implements two missing pieces of functionality in the Slang compiler to allow this case:
* First, when the front-end semantic checks are applied to an initializer list, if the type being initialized is a derived `struct` type it always expects to find initialization arguments for its base type before those for its fields.
* Second, when lowering an initializer-list expression from the AST to the IR, the compiler expects the first argument in the list to be the initial value for the base field (if any). This also applies to default-initialization of fields/variables.
This change slightly entangles front-end logic with the logic for how struct inheritance is lowered to the IR, but the behavior is unlikely to confuse users who expect C++-like layout.
It is worth noting that with this change it should be possible to initialize the base type using either a nested initializer list or flat arguments:
```hlsl
struct BigBase { int x; int y; int z; }
struct BigDerived : BigBase { int w; }
BigDerived a = { {1,2,3}, 4 };
BigDerived b = { 1, 2, 3, 4 };
```
This behavior should Just Work because of the existing C-like rules for initializer lists where an aggregate can be initialized by either a `{}`-enclosed block or distinct values for its leaf fields.
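The same two forms behave identically in plain C++ once the base is modeled as a leading field, which is roughly the lowered shape this change relies on. The sketch below is a C++ analogy rather than Slang output, and the field name `base` is a placeholder.

```cpp
// C++ analogy: the base type is modeled as the first field of the derived type.
struct BigBase { int x; int y; int z; };
struct BigDerivedLowered { BigBase base; int w; };

// Nested form: the inner braces initialize the base field.
inline BigDerivedLowered makeNested()
{
    BigDerivedLowered v = { {1, 2, 3}, 4 };
    return v;
}

// Flat form: brace elision lets the first three values fill the base's fields.
inline BigDerivedLowered makeFlat()
{
    BigDerivedLowered v = { 1, 2, 3, 4 };
    return v;
}
```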
|
|
During lowering from AST to IR, the Slang compiler translates code that uses `struct` inheritance:
```hlsl
struct Base { int a; }
struct Derived : Base {}
```
into code where the inheritance relationship is "witnessed" by a simple field:
```hlsl
struct Base { int a; }
struct Derived { Base __anonymous_field__; }
```
The underlying bug here is that the `__anonymous_field__` that the compiler generated during IR lowering was not being given any linkage decorations (no mangled name). As a result, if multiple separately-compiled modules all access that field they could disagree on its identity as an IR instruction. This could lead to output code being generated where the declaration of `__anonymous_field__` uses one IR instruction, but accesses use another.
This change includes a fix for the issue, and a test that serves as a reproducer for the original problem.
|
|
The basic bug here was that `enum` types with an explicit tag type:
enum Color : int32_t { ... }
would have an `InheritanceDecl` implying that `Color` inherits from
`int32_t`. The problem is that this is *not* actually an inheritance
relationship, since a `Color` needs to be explicitly cast to/from an
`int32_t`.
Various parts of the compiler currently treat this case like real
inheritance, and as a result the operations that would apply to an
`int32_t` end up applying to a `Color` as well. In particular, this leads
to an ambiguity when applying the `==` operator, because it has
overloads for both the `__EnumType` and `__Builtin{something}`
interfaces.
The fix here is to explicitly exclude the `InheritanceDecl` that
represents an enumeration tag type when considering declared subtype
relationships. A more complete version of this fix would need to go
through all places in the code where `InheritanceDecl`s are used and
make sure that any places using them for true inheritance relationships
ignore those that represent an enumeration tag type.
(An alternative option would be to use a distinct kind of `Decl` to
represent the tag-type relationship, perhaps even going so far as to
modify the type of the relevant AST node as part of semantic
checking)
This change includes a regression test for the way this bug surfaced in
user code.
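C++ scoped enums draw the same distinction: the type after the colon names the storage type, not a base type, so conversions have to be spelled out. The sketch below is a C++ analogy only; the enumerators and the `toTag`/`fromTag` helpers are made up for illustration.

```cpp
#include <cstdint>
#include <type_traits>

// The tag type fixes the representation; it does not make Color an int32_t.
enum class Color : std::int32_t { Red = 0, Green = 1, Blue = 2 };

// No implicit conversion exists in either direction.
static_assert(!std::is_convertible<Color, std::int32_t>::value, "Color is not an int32_t");
static_assert(!std::is_convertible<std::int32_t, Color>::value, "int32_t is not a Color");

// Conversions must be written as explicit casts.
inline std::int32_t toTag(Color c) { return static_cast<std::int32_t>(c); }
inline Color fromTag(std::int32_t v) { return static_cast<Color>(v); }
```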
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
|
|
|
* Allow overriding specialization args via `IShaderObject`.
* Fixes.
Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Added SourceLoc handling for command line parsing.
* Fix typo in debug.
* Fix issue around the DiagnosticSink used in options parsing not having a writer available - by having DiagnosticSink parenting.
* Small rename for clarity.
* WIP extracting command line args for downstream tools.
* Unit tests/bug fixes around extracting args.
* Use DownstreamArgs in the EndToEndCompileRequest
* Passing downstream compiler options downstream.
* Fix issue with endToEndReq being nullptr.
* Fix issue with diagnostics number change.
* Small improvements to how the source line is displayed if it's too long.
Default to 120, as suggested in previous review.
* Make render test use x-args parsing and CommandArgReader.
* Added missing diagnostics.
* Move DownstreamArgs to linkage so they can be seen by 'components'.
Added dxc-x-arg test.
* Used combination of name and args instead of two Lists, which whilst equivalent was perhaps a little confusing.
* Added documentation for -X support.
* Added test for x-args parsing diagnostic. Improved diagnostic with list of known names.
* Fix issues from merge.
* Fix lookup for -matrix-layout-column-major in render test.
* Remove commented out line.
|
|
Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Added SourceLoc handling for command line parsing.
* Fix typo in debug.
* Fix issue around the DiagnosticSink used in options parsing not having a writer available - by having DiagnosticSink parenting.
* Small rename for clarity.
* WIP extracting command line args for downstream tools.
* Unit tests/bug fixes around extracting args.
* Use DownstreamArgs in the EndToEndCompileRequest
* Passing downstream compiler options downstream.
* Fix issue with endToEndReq being nullptr.
* Fix issue with diagnostics number change.
* Small improvements to how the source line is displayed if it's too long.
Default to 120, as suggested in previous review.
Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>
|
|
* Overhaul the preprocessor
The old Slang preprocessor was based on a simple mental model that tried to unify two parts of macro expansion:
* Scanning for macro invocations in a sequence of tokens
* Producing the expanded tokens for a macro expansion by substituting arguments into its body
The basic idea was that substitution of macro arguments into a macro definition is superficially similar to top-level macro expansion, just with an environment where the macro arguments act like `#define`s for the corresponding parameter names. That approach was "clever" and could conceivably have been extended to include a lot of advanced preprocessor features (e.g., a preprocessor-level `lambda` would be easy to support!), but it was basically impossible to make it correctly handle all the corner cases of the full C/C++ preprocessor.
The fundamental problem with the old approach was that it conflated the two parts of expansion listed above into one implementation, while the various special cases of the C/C++ preprocessor rely on treating the two cases very differently. The new approach here (which is somewhere between a refactor and a full rewrite of the preprocessor) changes things up in a few key ways:
* The abstraction still cares a lot about streams of tokens, but it now treats the top level streams (`InputFile`s) as fairly different from the lower-level streams (`InputStream`s)
* Macro expansion is handled as a dedicated type of stream that wraps another stream. This allows macro expansion to be applied to anything, and supports cases where multiple rounds of macro expansion are required by the spec.
* Macro *invocations* and the substitution of their arguments are now handled by a completely new system.
* Macro arguments are no longer treated as if they were `#define`s
* The macro body/definition is analyzed at definition time to detect various kinds of issues, and to derive a list of "ops" that make it easier to "play back" the definition at substitution time
* Token pasting and stringizing are now only handled in macro definitions (rather than being allowed anywhere), and their use cases are restricted to only those that make sense (e.g., you can't stringize anything except a macro parameter, because anything else wouldn't make sense)
The key new types here are the `ExpansionInputStream` which handles scanning for macro invocations, and the `MacroInvocation` type, which handles playing back the macro body with substitutions.
The `ExpansionInputStream` is the easier of the two to understand. By refactoring it to use a single token of lookahead, the one major detail it had to deal with before (abandoning expansion of a function-like macro if the macro name was not followed by `(`) is significantly easier to manage.
The more subtle part is the `MacroInvocation` type, and most of the complexity there is around handling of token pasting, and the fact that either or both of the operands to a token paste might be empty.
Many of the test cases that exposed the problems in the preprocessor have been moved from `current-bugs` to `preprocessor` since they now work correctly.
* debugging: enable extractor command line dump
* fixup
* fixup
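The "ops" idea from the overhaul can be sketched in a few lines of C++. Everything here is hypothetical (the op kinds, `analyzeMacro`, `playBack`, tokens as plain strings), and the sketch leaves out token pasting, which is where the commit message says most of the real complexity lies.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical op kinds; the real preprocessor's op set is not shown in this log.
enum class OpKind { RawToken, Param, StringizeParam };

struct Op
{
    OpKind kind;
    std::string text; // used by RawToken only
    int paramIndex;   // used by Param and StringizeParam
};

struct MacroDef
{
    std::vector<std::string> params;
    std::vector<Op> ops;
};

static int findParam(const std::vector<std::string>& params, const std::string& name)
{
    for (std::size_t i = 0; i < params.size(); ++i)
        if (params[i] == name)
            return static_cast<int>(i);
    return -1;
}

// Definition time: turn the body tokens into ops. Returns false if `#` is
// not followed by a parameter name, since only parameters may be stringized.
bool analyzeMacro(const std::vector<std::string>& params,
                  const std::vector<std::string>& body,
                  MacroDef& outDef)
{
    outDef.params = params;
    outDef.ops.clear();
    for (std::size_t i = 0; i < body.size(); ++i)
    {
        if (body[i] == "#")
        {
            int index = (i + 1 < body.size()) ? findParam(params, body[i + 1]) : -1;
            if (index < 0)
                return false;
            outDef.ops.push_back({OpKind::StringizeParam, "", index});
            ++i; // skip the parameter name consumed by `#`
            continue;
        }
        int index = findParam(params, body[i]);
        if (index >= 0)
            outDef.ops.push_back({OpKind::Param, "", index});
        else
            outDef.ops.push_back({OpKind::RawToken, body[i], -1});
    }
    return true;
}

// Substitution time: play the ops back against the argument token lists.
std::vector<std::string> playBack(const MacroDef& def,
                                  const std::vector<std::vector<std::string>>& args)
{
    std::vector<std::string> out;
    for (const Op& op : def.ops)
    {
        if (op.kind == OpKind::RawToken)
        {
            out.push_back(op.text);
            continue;
        }
        const std::vector<std::string>& arg = args[op.paramIndex];
        if (op.kind == OpKind::Param)
        {
            out.insert(out.end(), arg.begin(), arg.end());
            continue;
        }
        // StringizeParam: join the argument tokens into one string-literal token.
        std::string literal = "\"";
        for (std::size_t j = 0; j < arg.size(); ++j)
        {
            if (j != 0)
                literal += ' ';
            literal += arg[j];
        }
        literal += "\"";
        out.push_back(literal);
    }
    return out;
}
```

Doing the analysis once at definition time means a malformed body (such as `# y` where `y` is not a parameter) is rejected before any invocation, and playback reduces to a loop over the ops.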
|