| Age | Commit message (Collapse) | Author |
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* OSX 10.15 produced error/warning for not using ref.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use updated slang-binaries that have SPIR-V diagnostics improvements.
* Re-enable nv-ray-tracing-motion-blur, because with SPIR-V diagnostic fixes in glslang - there shouldn't be spurious errors from glslang compilation.
* If optimization fails use the SPIR-V we have.
* Update SPIR-V headers and generated files.
Updated documentation.
* Update spirv-headers/tools.
Revert slang-binaries.
* Remove hack around spir-v optimization as no longer needed.
disable nv-ray-tracing-motion-blur.slang
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Support for test proxy.
* Turn on testing using proxy.
* Don't pass sink into check of downstream compiler.
* Small change to kick off build.
* Remove register specification on transcendental.
* Increase poll timeout.
Small improvements to proxy.
* Disable gfx unit tests.
* Put test runner in shared library mode by default.
* Change comment. Kick off another CI test.
* Small edit to kick off builds.
* Run unit tests on proxy.
* Turn on using proxy for now.
* Enable swift shader.
* Fix typo.
Add exception support.
* Make the default spwan type SharedLibrary
Use isolation for gfx unit tests.
* Update slang-binaries.
* Fix typo.
* Report unit test output information.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP fixing glslang error reporting.
* SPIR-V opt diagnostics.
* Small formatting tidy up.
* Disable nv-ray-tracing-motion-blur.slang for now, until merged and can rebuild slang-glslang.
* Formatting improvements.
* Some small improvements around optimizing in glslang.
* Make clear if calling optimize with 'non' in glslang no work is done (such as a pointless copy).
* Small formatting change - mainly to kick off a build.
* Upgrade slang-binaries to fix git fetch issue on TC.
* Small formatting edits. Kick build.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* First integration of slang-pack.
* Use .os
* Add optional dependency support.
* Update github actions/scripts to update deps. aarch64 needs special handling.
* Upgrade to latest slang-pack for ignore-deps support.
* Fix linux build issues.
* Copying slang-llvm from dependencies.
* Add support for LLVM for host callable.
Added CodeGenTransitionMap.
* Remove hack to enable host callable for LLVM.
* Small improvements around transitions/downstream compiler.
* Fix typo in method name.
* Fix comment.
* Update visual studio project.
* Updage slang-llvm to include initialization fix.
* Fix handling extraction of clang version number.
* Fix some formatting problems.
* hack - to see if there is a version problem on CI.
* Remove progress on github action linux.
* Allow version lines to have text before 'prefix'.
* Update slang-binaries to include centos-7 premake binaries.
* Upgrade slang-binaries.
* Upgrade slang-binaries.
* Update slang binaries to have certificates.
* Fix handling of dependency path.
* Update README to include LLVM
Update building to include --deps and --arch
* Include slang-llvm in packages.
* Update building docs.
|
|
`bool`. (#1987)
* Passing associated type arguments to existential parameters + packing for `bool`.
* fix typo
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Diagnostic for no type conformance + bug fix.
* Fixes.
* Fix.
* Include heterogeneous example only with --enable-experimental-projects premake flag
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Add support for LLVM for host callable.
Added CodeGenTransitionMap.
* Remove hack to enable host callable for LLVM.
* Small improvements around transitions/downstream compiler.
* Fix typo in method name.
* Fix comment.
|
|
* Bring heterogeneous-hello-world back up to date.
* Reintroduced heterogeneous-hello-world into the premake
* No longer uses compiled bytecode for entry point, instead a loadModule
call is hardocoded with the slang file name.
* Entry point is, similarly, hardcoded for now.
* Added a bypass to slang-legalize-types for an unneeded GPUForeach check
* Run premake and change to relative path
* Removed experimental and added README
* Add prebuild command to premake for heterogeneous example
* Pass in entry point as parameter (also remove shader bytecode)
* Pass in module name as parameter
* Squashed commit of the following:
commit 5b13b57fe600724344c556fe4309a5d6bb3d39ab
Author: Kai Yao <kyao@nvidia.com>
Date: Thu Oct 7 23:38:50 2021 -0700
Return diagnostics data when encountering module load error by exception (#1966)
commit 112e1515c30fa972ff56f91514b70946153c718c
Author: jsmall-nvidia <jsmall@nvidia.com>
Date: Thu Oct 7 16:12:29 2021 -0400
Disable test crashing CI (#1965)
* #include an absolute path didn't work - because paths were taken to always be relative.
* Disable test that appears to be crashing.
commit da32069a0c1c8c723d7ef45100049a8f0dd5d9c4
Author: Kai Yao <kyao@nvidia.com>
Date: Mon Oct 4 13:58:51 2021 -0700
Modified barrier API to accept multiple resources per call (#1959)
Co-authored-by: Yong He <yonghe@outlook.com>
commit 97bb82ebcdf8f1391b9d93b5a8d7b1dfc4e88e52
Author: jsmall-nvidia <jsmall@nvidia.com>
Date: Mon Oct 4 14:15:51 2021 -0400
Removing exceptions from core/compiler-core (#1953)
* #include an absolute path didn't work - because paths were taken to always be relative.
* Refactor Stream. Working on all tests.
* Split out CharEncode.
* Make method names lower camel.
m_prefix in Writer/Reader
* Tidy up around CharEncode interface.
* Small improvements around encode/decode.
* Better use of types.
* Remove readLine from TextReader.
* Remove exceptions from Stream/Text handling.
* Fix some typos.
* Fix tabbing.
* Fix missing override.
* Remove remaining exception throw/catch via using signal mechanism.
* Remove exceptions that are not used anymore.
* Document the Stream interface.
* Remove index for decoding 'get byte' function.
* Fix CharReader -> ByteReader.
commit b3dfe383c6d31ff3dbd76dcfb32de8d536382f3e
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Mon Oct 4 09:46:33 2021 -0700
Get native handles for TextureResource and BufferResource (#1960)
* Added getNativeHandle() to TextureResource and BufferResource; Implemented getNativeHandle() in Vulkan and D3D12; Added new unit test files for the aforementioned implementation
* Added missing getNativeHandle() implementations to renderer-shared.cpp and CUDA
* Finished new getNativeHandle() unit tests for ITextureResource and IBufferResource; Modified ICommandQueue and ICommandBuffer unit tests to call QueryInterface to convert to IUnknown then back and compare resulting pointers for equality
* Unit tests updated and pass locally
* Cast m_buffer.m_buffer and m_image to uint64_t
commit 35bca4cc432613af3926da3bed217a6baa9cbd26
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Fri Oct 1 13:08:25 2021 -0700
Add getNativeHandle() to ICommandQueue and ICommandBuffer (#1952)
* Added support for getting command buffer and command queue handles to ICommandBuffer and ICommandQueue; D3D12Device, VkDevice, and DebugDevice modifieid to implement this new functionality; immediate-renderer-base.cpp also modified to implement the new functions
* Removed excess boilerplate
* Changed readRef() to get() in D3D12 getNativeHandle() implementation for ICommandBuffer and ICommandQueue
* Added unit tests for new getNativeHandle() implementations, unfinished
* Queue test added; Minor cleanup changes
* getBufferHandleTestImpl() now closes the command buffer before returning
* Added getNativeHandle() implementations to CUDADevice
* Added comment clarifying that the Vulkan check is checking for a null handle, which is defined to be 0
commit 6c6200f547c7387598743b23bb3c8f0d375d9494
Author: Kai Yao <kyao@nvidia.com>
Date: Thu Sep 30 20:25:34 2021 -0700
VK Resource Barrier (#1955)
* Resource barrier API and VK implementation
* Stub implementations
* Handle VK Acceleration Structure flag
* Add a couple more cases to pipeline barrier stages
commit 627fc976bac5c2381dbace9c7925cb6a68b8de12
Author: Yong He <yonghe@outlook.com>
Date: Thu Sep 30 19:48:47 2021 -0700
Fix aarch64 build on github (#1957)
Co-authored-by: Yong He <yhe@nvidia.com>
commit 122d701513e116856bd59c999221ce36a373d7db
Author: Yong He <yonghe@outlook.com>
Date: Thu Sep 30 17:51:56 2021 -0700
Fix GitHub release (#1956)
* Fix aarch64 release build config.
* Fix for WinAarch64 build.
* Update premake for embed-std-lib build on aarch64.
* `platform` fix for aarach64 build.
* Try revert back to use absolute output path for slang-stdlib-generated.h
* Fix
* fix
Co-authored-by: Yong He <yhe@nvidia.com>
commit aa8f7b899b7b562b3d3c6e25c3da41569505e70c
Author: Chad Engler <englercj@live.com>
Date: Wed Sep 29 13:02:47 2021 -0700
Fix ARM64 detection for MSVC (#1951)
commit 6736b0c1c5fa3e89bc561eb7965a1a0d17af3466
Author: Yong He <yonghe@outlook.com>
Date: Wed Sep 29 11:29:46 2021 -0700
Add ISession::loadModuleFromSource. (#1950)
Co-authored-by: Yong He <yhe@nvidia.com>
commit d8e452412e14a6a8ba137f2adcae13b398e5cecb
Author: Yong He <yonghe@outlook.com>
Date: Tue Sep 28 15:03:03 2021 -0700
Fix AbortCompilationException leaking through loadModule API. (#1949)
* Fix AbortCompilationException leaking through loadModule API.
* Update.
* Fix.
Co-authored-by: Yong He <yhe@nvidia.com>
commit cdf1b2c007fefdca128584d2a9f63dec3d350e16
Author: Yong He <yonghe@outlook.com>
Date: Tue Sep 28 11:54:24 2021 -0700
Improvements to the unit test framework. (#1948)
commit af788b62e18bbd55cd748ad60400a74cf1bc93ee
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Fri Sep 24 16:53:41 2021 -0700
Add existing device handle support unit test (#1946)
commit bec8e6aec85b6e3f875c58bdd59eb15613978358
Author: Yong He <yonghe@outlook.com>
Date: Fri Sep 24 11:33:44 2021 -0700
Move existing unit tests to a standalone dll. (#1945)
commit f2a3c933bc11a498c622fa18694c84beca8ca031
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Thu Sep 23 12:19:49 2021 -0700
Add method to retrieve native handles (#1944)
* Added a getNativeHandle() method that retrieves the natively created handles; Modified RendererBase, VKDevice, D3D12Device, and DebugDevice to implement this new method
* Moved ExistingDeviceHandles out of Desc directly inside IDevice and renamed to NativeHandles; Modified calls accessing the struct accordingly in RendererBase, DebugDevice, VKDevice, and D3D12Device
* Minor cleanup changes (renames, etc.)
commit b9b398d038b524f15a86ff27cd6888d54e8754e0
Author: Yong He <yonghe@outlook.com>
Date: Wed Sep 22 10:06:59 2021 -0700
Add gfx unit testing framework. (#1943)
* Add gfx unit testing framework.
* Fix compilation error.
* Reset gfxDebugCallback after render_test.
* Pass enabledApi flags through.
* Fix for code review suggestions.
Co-authored-by: Yong He <yhe@nvidia.com>
commit 6e9cee69b3588ddae09b08b9f580f59ad899983f
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Tue Sep 21 18:46:32 2021 -0700
Support for existing device/instance handles in Vulkan (#1942)
commit b1f04c8544c650de3947955ca68f679535d249aa
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Wed Sep 15 20:22:45 2021 -0700
Allow D3D12Device to use an existing device handle (#1940)
* Added a new field for an existing device handle to IDevice::Desc; Modified D3D12Device::initialize to set the device stored in desc if it already exists instead of creating a new one
* Turned existingDeviceHandle into a struct containing an array of two elements; Updated D3D12Device::initialize to match changes to existingDeviceHandle; Updated comments
* Fixed style error for ExistingDeviceHandles struct
commit 2f7b9f5ae8be21c6c1d75ae9caefbc7b3f8986a9
Author: Pablo Delgado <private@pablode.com>
Date: Thu Sep 16 01:17:57 2021 +0200
Fix incorrect WIN32 macros and missing Windows.h inclusion (#1939)
* Replace WIN32 preprocessor macros with _WIN32
* Add missing Windows.h include for InterlockedIncrement
commit 11d43642008905ac69a3832eb8a9b2ae7b785f86
Author: Yong He <yonghe@outlook.com>
Date: Tue Sep 14 11:36:44 2021 -0700
Avoid upcasting to f32 in 16bit float-uint bit cast. (#1938)
Co-authored-by: Yong He <yhe@nvidia.com>
commit 502aa3812a82cf0d091cff0c67804e4ee448ac78
Author: David Siher <32305650+dsiher@users.noreply.github.com>
Date: Tue Sep 14 12:59:55 2021 -0400
Bring heterogeneous-hello-world back up to date. (#1935)
* Bring heterogeneous-hello-world back up to date.
* Reintroduced heterogeneous-hello-world into the premake
* No longer uses compiled bytecode for entry point, instead a loadModule
call is hardocoded with the slang file name.
* Entry point is, similarly, hardcoded for now.
* Added a bypass to slang-legalize-types for an unneeded GPUForeach check
* Run premake and change to relative path
* Removed experimental and added README
Co-authored-by: Yong He <yonghe@outlook.com>
* Revert "Squashed commit of the following:"
This reverts commit 4f665858d65f7c332c616ef6db9fdafa1c5e0b9f.
* Run premake
* Remove prebuild command (only works on Windows?)
* Rerun premake
* Fix heterogeneous prebuild command
* Remove linux specific prebuild command
* Fix prebuild command (again)
* Change target from dxbc to hlsl to see if that fixes linux issues
* Use Path::getFileNameWithoutExt
* Change string-literal.slang.expected to have extra filename in decoration
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Upgrade to GLSLANG 11.16.0+
* Small edit to readme - really to kick another build.
* Upgrade slang-binaries to include new glslang binaries.
* Update slang-binaries to include linux-x86
* Upgrade slang-binaries.
* Support for GL_NV_ray_tracing_motion_blur extension.
* Fix issues with doc output around spirv_direct
Updated docs.
* Remove spirv_direct from names of codegen targets.
* Improvements around spirv_direct in docs.
* Updated stdlib docs.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Upgrade to GLSLANG 11.16.0+
* Small edit to readme - really to kick another build.
* Upgrade slang-binaries to include new glslang binaries.
* Update slang-binaries to include linux-x86
* Upgrade slang-binaries.
* Support for GL_NV_ray_tracing_motion_blur extension.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Small improvements to premake for linux around linking.
* Link with pthread of slang-glslang.
|
|
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Refactor Stream. Working on all tests.
* Split out CharEncode.
* Make method names lower camel.
m_prefix in Writer/Reader
* Tidy up around CharEncode interface.
* Small improvements around encode/decode.
* Better use of types.
* Remove readLine from TextReader.
* Remove exceptions from Stream/Text handling.
* Fix some typos.
* Fix tabbing.
* Fix missing override.
* Remove remaining exception throw/catch via using signal mechanism.
* Remove exceptions that are not used anymore.
* Document the Stream interface.
* Remove index for decoding 'get byte' function.
* Fix CharReader -> ByteReader.
|
|
* Fix aarch64 release build config.
* Fix for WinAarch64 build.
* Update premake for embed-std-lib build on aarch64.
* `platform` fix for aarach64 build.
* Try revert back to use absolute output path for slang-stdlib-generated.h
* Fix
* fix
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix AbortCompilationException leaking through loadModule API.
* Update.
* Fix.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Replace WIN32 preprocessor macros with _WIN32
* Add missing Windows.h include for InterlockedIncrement
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Bring heterogeneous-hello-world back up to date.
* Reintroduced heterogeneous-hello-world into the premake
* No longer uses compiled bytecode for entry point, instead a loadModule
call is hardocoded with the slang file name.
* Entry point is, similarly, hardcoded for now.
* Added a bypass to slang-legalize-types for an unneeded GPUForeach check
* Run premake and change to relative path
* Removed experimental and added README
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
+ Implement bit_cast between float16 and uint16 in GLSL.
+ Enable pack-any-value-16bit test on vk.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* First integration with 'slang-llvm'.
* Fix project.
* Fix test output.
* First pass assert support.
* Add inline impls for min and max.
* Add abs inline abs impl for llvm.
* Make abs not use ternary op
* Fix typo in slang-llvm.h
* Sundary fixes to make remaining tests using llvm backend pass.
|
|
* `reinterpret` and 16-bit value packing.
* Update `half-texture` cross-compile test reference result.
* Revert inadvertent reformatting of slang-ir-inst-defs.h
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix crash: dynamic dispatch of generic interface method.
* Fix memory error.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix mangling logic for the case where a symbol name contains characters that aren't permitted (this usually occurs when a module name consists of the actual path to the module). There were multiple early-out `if` cases that accidentally fell through to the fallback path, so that symbol names would end up being excessively long.
* Fix type conversion cost lookup cache, by allowing single-element vectors (e.g., `vector<float,1>`) and single row/column matrix types to be distinguished from types of lower rank. Previously, `float` and `float1` and `float1x1` would share a single cache entry, even though each (currently) has very different conversion rules.
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Improve passes used for -O1 option with glslang.
* Set sourceManager when serializing container.
|
|
|
|
This function takes a user provided `typeID` and arbitrary typed value, and turns them into an existential value whose `witnessTableID` is `typeID` and whose `anyValue` is the user provided value. This allows the users to pack the runtime type id info in arbitrary way.
|
|
* optix SBT record data can now be accessed using uniform parameters on ray tracing entry points
* Update slang-emit.cpp
* fixing a bug where SBT instruction was missing a location at which to insert. Switching back to emitFieldExtract and accounting for changes in instruction emission location
|
|
* Add GLSL450 intrinsics to SPIRV direct emit.
* Fix.
* Fix compiler error.
* Fix.
* Fix compiler error.
* Make direct-spirv tests actually run.
|
|
* Further implementation of SPIRV direct emit.
This change implements:
- Struct, Vector, Matrix and Unsized Array types.
- Basic arithmetic opcodes, vector construct, swizzle etc.
- getElementPtr, getElement, fieldAddress, extractField.
- SPIRV target intrinsics with SPIRV asm code in stdlib.
- RWStructuredBuffer and StructuredBuffer.
- Pointer storage class propagation.
- Control flow.
* Fix.
|
|
This change is related to a case where a user saw a `/` character printed as part of a structure field name in output HLSL (which obviously broke downstream compilation).
The root cause in that case was that a module that was `import`ed was found via a file system directory path, and thus the module name that the Slang compiler formed included a `/`. Due to other factors the structure field was emitted based on its mangled name (missing name hint), and that meant the bare `/` escaped into the output.
That situation implies some other things maybe are questionable:
* Could we have avoided a `/` ending up in the module name to begin with? Maybe, but we kind of want to support non-ASCII characters in names in the long run.
* Could we have fixed whatever was causing the name hint to be dropped? Maybe, but we don't ever want name hints to be *required* for valid compilation (hence why they are "hints").
Even if we investigate these other issues, it is important that the name mangling scheme only ever produce output that uses a constrained subset of ASCII that we expect most compilation targets can support (GLSL being an annoying outlier because of its rules around `_`).
This change adds a path where when mangling a `Name` as part of a mangled symbol name we check for the presence of bytes that aren't allowed and then produce an escaped form instead if needed. Note that the code is still byte-oriented for now aothough in the long-run it might want to be oriented around Unicode code points or extended grapheme clusters.
|
|
* Fix a few issues around opaque types as outputs
Slang and HLSL support opaque types (textures, buffers, samplers, etc.) as members of `struct`s, mutable local variables, function results, and `out`/`inout` parameters. GLSL and SPIR-V do not.
In order to translate Slang code over to GLSL/SPIR-V we use a variety of passes that seek to eliminate all of the above use cases and produce code that only uses opaque types in the limited ways that GLSL/SPIR-V allow. This change relates to the passes that deal with function results and `out`/`inout` parameters.
There are two basic changes here:
1. The `specializeResourceOutputs` pass was only dealing with resource (texture/buffer) types. This change updates it to process sampler types as well.
2. The sequencing of the passes made it possible that an opaque-typed local variable might be left around after `specializeResourceOutputs`, which would mean the code is still invalid for GLSL/SPIR-V. This change adds an additional SSA-formation pass which would eliminate any opaque-type local variables whose lifetimes were made simple enough by the optimizations.
Together these changes fix a problem-case user shader that was failing to compile for Vulkan.
* Update slang-emit.cpp
Fix typo 'reuslt'
* Update slang-emit.cpp
Comment change to re-trigger CI build.
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
|
points (#1917)
* optix SBT record data can now be accessed using uniform parameters on ray tracing entry points
* Update slang-emit.cpp
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Improvements to how glslang optimizations are handled. Increasing the amount of ids to allow for optimizing more complex shaders, and compacting to try and keep ids within range for usage.
* Add other optimizations from current glslang.
* Improved -O1/default GLSLANG passes improves output SPIR-V size (Can be more that 1/3 of the size), for a modest increase in compilation time.
* Improve comments.
|
|
* Work to mitigate SPIR-V bloat
SPIR-V is not an especially compact format, but some patterns in how Slang generates code and then runs it through `spirv-opt` lead to many redundant field-by-field copy operations being emitted. This change attempts to address some of the resulting bloat from the Slang side of things.
Note: experimentation shows that the bloat is less pronounced when running either *no* SPIR-V optimizations or *full* SPIR-V optimizations, so it is also likely that the bloat should be addressed by changing which `spirv-opt` passes the Slang compiler runs in default (`-O1`) builds. Such changes should come as a distinct pull request.
This change primarily does two things:
First, the code generation strategy for passing arguments to `out` and `inout` parameters has been changed. In the past, the compiler would *always* copy the argument value into a temporary, then pass the address of the temporary, and then write back the value after the call. The new code generation strategy attempts to identify when an argument value already has a simple address in memory and passes that address directly when possible. This eliminates many copy operations that occur before/after calls to functions with `out`/`inout` parameters.
Second, we introduce an IR optimization pass that detects call sites where the entire contents of a buffer (usually a constant buffer) is being passed to a callee function, such that many bytes are loaded and then passed even if only very few are used in the callee. The pass moves the load operations from the caller to a specialized version of the the callee where possible (e.g., when the constant buffer in question is a global shader parameter). Doing this eliminates another major category of copies.
Notes:
* The IR lowering logic is complicated by the fact that several kinds of l-values (values that are usable as the desitnation of assignment, or for `out`/`inout` arguments) are not actually addressable. An easy example is a non-contiguous swizzle like `v.xwz` on a `float4`, where the value occupies 12 bytes, but not 12 consecutive bytes with a single address. There are many more corner cases like that and the IR lowering pass carries a lot of complexity to deal with them. A more systematic overhaul is due some time soon.
* The IR representation of `out` and `inout` parameters deserves some careful scrutiny when making these kinds of changes. The official semantics of `inout` in HLSL has been "copy in copy out" (and `out` is just "copy out") which is observably different from any solution that passes in the address of an l-value directly. By making this change we are saying that Slang's semantics are not precisely those of legacy HLSL, and that our semantics for `inout` parameters are closer to those of `inout` in Swift or of a mutable borrow in Rust. In the Swift case the implementation can freely pass the underlying storage of an l-value or the address of a temporary, and valid programs may not observe the different. It is thus illegal to observe the value in a storage local while a mutation to that location is "in flight." All of this is way more detailed and technical than 99% of Slang users will ever care about, but importantly it gives us semantic cover to eliminate these copies in the IR, and also to emit output C++ code that implements `out` and `inout` as by-reference parameter passing.
* There was an exsting generic pass for specializing functions based on call sites that uses a "template method" style of pattern to customize its behavior. That pass needed to be generalized to handle this use case because it had previously operated on the assumption that the "desire" to specialize a callee function must be driven by the parameter declarations of that function, and not on the argument values passed in. The code has been slightly refactored to allow the policy for specialization to consider both parameters and arguments.
* Unsurprisingly, a bunch of the GLSL (and thus SPIR-V) generated has changed with this work, so several baseline `.slang.glsl` files needed to be updated.
* This change is incomplete in that it does not address broader cases of buffer loads, including both partial loads from constant buffers (just loading one field, but a field that uses a "large" structure type), and loads from multi-element buffers (a lot from a structured buffer where the element type is "large"). The main question in each of those cases is how to define how "large" a structure needs to be before we decide to try and sink loads into callee functions like this. In the worst case, sinking loads in this way may actually create *more* memory traffic (because the same values get loaded in multiple callee functions).
* fixup: run premake
* fixup: typo
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Add repro-directory feature.
* Added writer support to repro-directory.
* Upgrade glslang to 11.5.0
* Add -load-repro-directory option
* Improve repro doc.
|
|
|
|
* Add debug symbols for release build.
* Hack to try and capture failing compilation.
* Typo fix for capture hack.
* Specify return type on lambdas.
* Added const.
* Try breakpoint.
* Up count
* Let's capture everything so we can valgrind.
* Disable always writing repros.
* Make Scope non RefCounted.
* Fix issue with not serializing Scope.
* More comments around changes to Scope.
Remove Scope* from serialization.
* Remove code used for testing original issue.
|
|
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Removes read of uniitialized data identified via valgrind runs.
Co-authored-by: Theresa Foley <tfoleyNV@users.noreply.github.com>
|
|
* Fixes related to combined texture/sampler types
Work on #1891
Our intention has always been to support combined texture/sampler types in Slang, both targets like OpenGL where that is the only option available and for targets like Vulkan where it can be beneficial to performance. Because Slang's current users mostly focus on D3D12+Vulkan codebases, they strongly prefer separate textures and samplers, and the relevant support code in Slang has "bit-rotted" over many releases until the functionality that was there isn't useful any more.
This change significantly overhauls the implementation of combined texture/sampler types and adds a test that uses them in the hopes of avoiding future regressions.
The new combined texture/sampler types use the prefix `Sampler`, so where there is an existing standalone `Texture2D` type the equivalent combined texture/sampler type will be `Sampler2D`. The intention is that this naming mirrors the GLSL conventions (where the type is `sampler2d`) while following HLSL naming precedent (to the extent it exists).
The operations available on a `Sampler2D` are intended to be those that are available on a `Texture2D`, and it is just that in cases where the `Texture2D` operation would take a separate sampler argument:
Texture2D t = ...;
SamplerState s = ...;
float4 result = t.Sample(s, uv);
the equivalent `Sampler2D` operation just elides that argument:
Sampler2D s = ...;
float4 result = s.Sample(uv);
In terms of implementation, there are a lot of subtle details here:
* I've tried to use the same metaprogramming logic that generates all the stdlib declarations for `Texture*` to also generate `Sampler*` in the hopes that this helps keep them in sync.
* The big catch to the above is that it means that for certain operations the indidces of parameters depend on whether or not an explicit sampler parameter is used. Rather than try to tweak the indices in the stdlib generation logic (which is already complicated) I went and added Yet Another Hack to the logic that handles intrinsic definition strings. Basically, the special-case handling of `$p` has been modified so that it *also* applies a negative offset to future parameter references in the same intrinsic string.
* Trying to actually bring this up in our test framework revealed that the "flat" reflection API was seemingly not reflecting combined texture/sampler types correctly at all (it was reflecting them as just plain textures). Other than that issue, the Vulkan path seems to work fine with combined texture/samplers.
* I also had to add logic to the `TEST_INPUT` parsing to re-introduce handling of the combined types (that was something I consciously left out to reduce the amount of code in the earlier refactor there). Luckily, the architecture is such that a combined texture/sampler can leverage most of the existing logic for the separate cases.
* fixup: reveiw feedback
|
|
* Fix CentOS/gcc compile issue.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Remove StructTag and associated systems.
* Fix typo and remove unit test for StructTag.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP Abi struct.
* Use AbiSystem on SessionDesc.
* Use mask/shift constants.
* Fix issue causing warning on linux.
* Abi -> Api.
* Fix typo.
* Refactor to use StructTag.
* Mechanism to be able to follow fields.
* Field adding is working.
* WIP with StructTagConverter.
* First pass of StructTag appears to work. Still needs diagnostics.
* Small tidy up around Field.
* Use bit field to record what fields are recorded to remove allocation around the m_stack.
Use ScopeStack for RAII.
* Return SlangResult instead of pointers.
* Use SlangResult with copy.
* Split StructTagConverter implementations.
* Fix some bugs around lazy converting.
* First pass at unit test for StructTag.
* Testing StructTag going backwards in time.
* First pass as StructTag diagnostics.
* Make Traits a namespace.
* Fix some issues with Traits not being a class.
* Fix 32 bit warning.
|
|
This change adds support for variadic macros in the C-style
preprocessor, e.g.:
#define DEBUG(MSG, ...) print(__FILE__, __LINE__, MSG, __VA_ARGS__)
Similar to the gcc preprocessor, this feature supports both named
variadic macro parameters and unnamed ones (which then default to
`__VA_ARGS__`.
The implementation work is mostly straightforward, although there are a
few subtle design choices worth mentioning:
* A variadic macro is represented by it having a variadic *parameter*
that is part of the ordinary parameter list.
* Argument parsing does *not* detect whether the macro being invoked is
variadic and collect/combine arguments to form a single argument value
for the variadic parameter. This is motivated by the need for some
extensions to differentiate a variadic parameter receiving a single
empty argument vs. zero arguments.
* Because any reference to the variadic parameter needs to expand to the
comma-separated arguments that match it, the logic for turning a macro
parameter reference into a list of tokens has been factored out into a
subroutine that handles the details.
* The choice in the earlier refactor to have a macro invocation collect
all the argument tokens (including the intervening commas) into a
single token list seems to pay off here, because it means that the
tokens in the expansion of a variadic parameter reference were already
stored contiguously.
* The special-case logic for handling an empty argument list had to be
tweaked again to ensure that an empty argument list is treated as
having zero arguments for the variadic parameter. Note that
historically C did not define the behavior of this case, and always
required at least one argument for any variadic macro parameter.
* The logic for checking whether the number of arguments to a macro
invocation is valid needed to handle variadic and non-variadic macros
as distinct cases. There really isn't much overlap in how the checks
need to work, even if we tried to change the underling representation.
The main missing feature here is any way to discard a comma in a macro
body that appears before a variadic parameter reference, e.g.:
#define DEBUG(...) print("debug:", __VA_ARGS__)
In this case, an empty invocation list `DEBUG()` will expand to
`print("debug:",)` - a call with a trailing comma in the argument list.
If users end up needing a way to discard commas in cases like this, we
have many options we can consider. This change does not implement any of
them to keep the initial work as minimal as possible.
|