| Age | Commit message (Collapse) | Author |
|
* Fix user-guide typos
Use LLM to scan each of the markdown files to fix typos.
Try not to change anything but the typos in this CL.
* typo not caught by LLM
* add output of ./build_toc.ps1
|
|
This fixes issue #6654
Only hoist instructions that are optimized by prepareFuncForForwardDiff.
Add flag hoistLoopInvariantInsts to IRSimplificationOptions and set this
to true only if called from prepareFuncForForwardDiff, then only hoist
if the flag is set. Additionally, do not hoist loops if they only have a
single trivial iteration.
|
|
interface-typed output parameter (#6788)
* More specific diagnostic for invalid concrete-to-interface arg coercion
* Add test for the new error message
* Fix typo in expected test result
|
|
This change introduces a workflow that runs automatically on new
issues and adds a "Dev Opened" label if the issue was opened by one of
a select set of dev members. This workflow can be used to add
additional labels in the future.
|
|
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
|
* Get real value for typeAdapter
When the types mismatch and typeAdapter is used, get the real value
from typeAdapter so that we don't get nullptr for irValue.
This fixes the assert if uint is used for SV_VertexID, which is an int
in the system binding semantic.
Fixes: #6525
* Add test case; add nullptr check
|
|
* void field rework
* move void cleanup pass earlier
|
|
Closes https://github.com/shader-slang/slang/issues/5995
|
|
Fixes #6624
This commit changes the behavior of getArgumentValueString() to return
the string's value, instead of returning the string's token,
as that token also contains the surrounding quotation marks.
This commit also modifies the relevant unit test accordingly,
to not check for the surrounding quotations.
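The token/value distinction described above can be sketched in Python. This is only an illustration, not the Slang implementation: the helper name `string_token_value` is hypothetical, and escape sequences inside the string are deliberately not handled.

```python
def string_token_value(token: str) -> str:
    """Return the contents of a double-quoted string token.

    Hypothetical helper: the token text includes the surrounding
    quotation marks, while the value does not.
    """
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]
    # Not a quoted string token; return it unchanged.
    return token
```

A test written against the value then compares with `hello` rather than `"hello"`.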
|
|
* Use GITHUB_TOKEN for fetching prebuilt
This PR extends Commit c6b702c to use GITHUB_TOKEN if set for fetching
prebuilt binaries.
This allows webgpu-dawn and slang-tint to be downloaded for certain IPs
where the github API rate is limited.
Fixes #6689
* Don't ignore download failure if github token is provided
* Update readme for getting github access token
* format code
* combine cmake_parse_arguments calls
* format code
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Co-authored-by: Jay Kwak <82421531+jkwak-work@users.noreply.github.com>
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
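As a rough illustration of the idea (the actual change lives in the CMake fetch logic, which is not shown here), an authenticated request could be built as follows in Python. The function name `build_release_request` and the `env` parameter are assumptions made for this sketch.

```python
import os
import urllib.request


def build_release_request(url, env=None):
    """Build a request that authenticates with GITHUB_TOKEN when it is set.

    Hypothetical sketch: `env` defaults to the process environment and
    can be replaced by a plain dict for testing.
    """
    env = os.environ if env is None else env
    headers = {"Accept": "application/vnd.github+json"}
    token = env.get("GITHUB_TOKEN")
    if token:
        # Authenticated requests get a higher API rate limit.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)
```

Without a token the request is sent anonymously, which is where shared IPs can hit the API rate limit.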
|
|
* Add support for Ray Payload Access Qualifiers (PAQs) (#3448)
- Added [raypayload] attribute for struct declarations
- Implemented field validation requiring read/write access qualifiers
- Added diagnostic error for missing qualifiers
- Enabled PAQs in DXC compiler and HLSL emission
- Added new test demonstrating PAQ syntax
- Implemented proper handling of ray payload attributes in IR generation
* format code
* Cleanup: Remove unused vars
* Add check to enablePAQ only for profile >= lib_6_7
* Review Fix - Add PAQ support for DX Raytracing
Add enablePAQ flag to DownstreamCompileOptions, improve PAQ handling,
and update raypayload-attribute-paq.slang to ensure HLSL and DXIL are
validated
* Add diagnostic test for missing paq for lib_6_7
Compile using `-disable-payload-qualifiers` aka lib_6_6 profile
raypayload-attribute-no-struct.slang and
raypayload-attribute.slang
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
|
|
|
(#6651)
* Fix incorrect assert on mixed global uniform and varyings
* add test
* remove unnecessary include
* fix incorrect logic
* fix comment grammar
* address review comments and improve test
* minimize diff
* fix more issues for cuda build
* remove unnecessary line for diff
|
|
|
|
(#6696)
* Initial loop analysis pass
* More changes for a single-pass implication propagation
* Update slang-ir-autodiff-loop-analysis.cpp
* Cleanup + new system for loop analysis
* Fixup bugs in loop analysis
* Remove some relation types to simplify the analysis. Add test
* Remove unused
* Address comments
* Fix issue with continue loops
* Update reverse-loop-exit-value-inference-1.slang
* Update reverse-continue-loop.slang
|
|
output) function outputs (#6737)
Closes https://github.com/shader-slang/slang/issues/6632
|
|
* fix(d3d11): correct parameter in VSSetConstantBuffers1 from uavCount to cbvCount (fixes #6531)
Root cause - Incorrect parameter passing in slang-rhi
1. slang-rhi #281 - Add the correct cbvCount for setting Constant Buffer
2. Prevent render tests from overwriting reference images
* Add missing tests/render/multiple-stage-io-locations.slang.3.expected.png
* Add more expected images from texture2d-gather
* Add new option: skipReferenceImageGeneration
For GitHub CI we set this to true, so we don't overwrite the expected
images
---------
Co-authored-by: Jay Kwak <82421531+jkwak-work@users.noreply.github.com>
|
|
* Implement sparse texture Load intrinsics for SPIRV
* changed test name from TEST_load to TEST_sparse
---------
Co-authored-by: Darren Wihandi <65404740+fairywreath@users.noreply.github.com>
|
|
* implement parameterblock for metal
Metal uses an argument buffer to pass a parameter block to the pipeline. In
this change, we implement a simple way to copy the data to the argument buffer.
Under the argument buffer tier 2 rules, all the fields in a parameter block are
flattened to ordinary data, therefore:
- We keep m_data in ShaderObjectImpl as a CPU buffer that tracks the data that has been set.
- Resource types are represented as a device pointer or resource ID in the argument
buffer, so we set their address or ID at the corresponding offset in the CPU buffer
every time 'setResource' or 'setSampler' is called.
- When binding the pipeline, we simply copy the CPU argument buffer to the GPU argument buffer.
- The only special case is a nested parameter block. Because a nested parameter block is represented
as a device pointer to another argument buffer, we recursively call
`_ensureArgumentBufferUpToDate` to get the sub-object's argument buffer, and write the GPU address of
that 'sub'-argument buffer into the root argument buffer at the correct offset.
* Inform command encoder to hazard track the bindless resources
Since all the resources within an argument buffer are bindless, Metal
won't automatically hazard track them. We have to call
'useResources' to inform Metal to hazard track those resources;
otherwise we would have to wait on a fence after each command submission.
* nullptr check
* address comment
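The CPU-side argument buffer scheme described in this commit can be sketched in Python. This is not the slang-rhi code: the class name `ShaderObjectSketch`, the `gpu_memory` dict standing in for GPU allocations, and the made-up addresses are all assumptions for illustration.

```python
import struct


class ShaderObjectSketch:
    """Toy model of a shader object backed by a CPU copy of its argument buffer."""

    def __init__(self, size):
        self.data = bytearray(size)  # CPU-side copy of the argument buffer
        self.sub_objects = {}        # offset -> nested ShaderObjectSketch

    def set_resource(self, offset, gpu_address_or_id):
        # Resources are stored as a 64-bit device pointer or resource ID.
        struct.pack_into("<Q", self.data, offset, gpu_address_or_id)

    def set_object(self, offset, sub_object):
        self.sub_objects[offset] = sub_object

    def ensure_argument_buffer_up_to_date(self, gpu_memory):
        # Nested parameter blocks: upload each sub-object first, then
        # write its GPU address into this buffer at the right offset.
        for offset, sub in self.sub_objects.items():
            sub_address = sub.ensure_argument_buffer_up_to_date(gpu_memory)
            struct.pack_into("<Q", self.data, offset, sub_address)
        # "Upload": copy the CPU buffer into a fake GPU allocation.
        address = 0x1000 + 0x100 * len(gpu_memory)
        gpu_memory[address] = bytes(self.data)
        return address
```

The recursion mirrors the nested-block case: a child buffer has to exist on the GPU before its address can be written into the parent.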
|
|
* Fixed generic interface specialization crashes:
- Add an export decoration to specialized generic interfaces.
* Fixed generic interface specialization crashes:
- Add an export decoration to specialized generic interfaces.
- Use getTypeNameHint(...) instead of a manual mangler.
* In cloneInstDecorationsAndChildren: specialize all linkage decorations, not just the exports.
- If a linkage decoration is already present, it is not specialized and replaced by the specialized one.
- If a specialization uses the TypeNameHint, sanitize it to be used as an identifier.
- Use the identifier name sanitizer from slang-mangle.
* Added tests/generics/generic-interface-linkage.slang
- See #6601 and #6688
|
|
This PR enables the existing CoopVec tests with the DX12 backend.
To use the CoopVec feature with the DX12 backend, we have to pass the option "-dx12-experimental", because the current implementation of CoopVec in dxcompiler.dll requires the "experimental feature" to be enabled.
Note that when the "experimental feature" is enabled, slang-test becomes less stable.
For that reason, we should use the option "-dx12-experimental" only when it is needed.
All tests for GLSL are deleted because CoopVec support for GLSL in Slang is deprecated and no longer supported.
Some CoopVec tests are still disabled for the DX12 backend because:
- DXC doesn't support 8-bit integer types, and
- some CoopVec features are not implemented in the DXC backend.
|
|
* Reapply "Eliminate empty struct on metal target (#6603)" (#6711)
This reverts commit bc9dc6557fc0cc3a4c0c2ff27e636940e361cf5d.
* Remove argument in make_struct call corresponding to void field
This is a follow-up to #6543, where we leave the VoidType field as is in
the make_struct call during the legalization pass.
So during the cleaning_void IR pass, when we remove "VoidType" from a struct,
we also have to clean up the argument corresponding to the
"VoidType" field.
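A minimal Python sketch of that cleanup, assuming field types and make_struct arguments are kept as parallel lists (the function name `strip_void_fields` and the string type tags are hypothetical, not the IR pass itself):

```python
def strip_void_fields(field_types, make_struct_args):
    """Drop void fields together with their matching make_struct arguments.

    Both lists are parallel: argument i initializes field i. Removing a
    field without its argument would shift every later argument by one.
    """
    kept = [(t, a) for t, a in zip(field_types, make_struct_args) if t != "void"]
    kept_types = [t for t, _ in kept]
    kept_args = [a for _, a in kept]
    return kept_types, kept_args
```

Filtering the pairs together keeps the field and argument lists aligned.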
|
|
* Enable "-HV 2021" option for DXC
|
|
Fixes issue #6533
This patch updates handling of Array and ConstantBuffer types for WGSL
transpiling, giving correct syntax for arrays of buffers in WGSL.
|
|
|
|
The macOS test was accidentally disabled in #6491. Re-enable it.
|
|
|
|
* Use coopvec supporting dxcompiler.dll and dxil.dll
* Fix the failing tests
|
|
* Add GetDimensions support for CUDA
This CL adds GetDimensions support for CUDA using PTX
instructions. Currently, PTX only supports getting width, height and
depth.
This CL also adds a new test to test this support.
Fixes #5139
* format code
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
|
Fixes #6692
This change fixes the hyperlinks used in the User Guide's Reference section
as the previously used paths were leading to a 404
|
|
# Make `IRWitnessTable` Hoistable
## Intention of the PR
This commit makes `IRWitnessTable` Hoistable so that we can avoid duplicated `IRWitnessTable`.
## Problems
This commit tries to address the following issues that arise after making `IRWitnessTable` Hoistable:
1. A Hoistable instance is immutable.
2. When you try to create a duplicate child, you get a previously created instance of `IRWitnessTable`, instead of a new one.
3. We don't actually want to hoist `IRWitnessTable`.
4. There can be only one instance of a Hoistable and it cannot appear as a child multiple times.
5. Different import/export mangled names were used for the same Witness-table when its type is "enum" interface.
## Implementation
### Solution for "1. A Hoistable instance is immutable."
`IRWitnessTable::setConcreteType()` is removed, because when an `IRInst` is Hoistable, it is treated as immutable. None of the `IRInst::setXXX()` methods work anymore.
There were two places calling `setConcreteType()`, and their logic had to change a little.
`DeclLoweringVisitor::visitInheritanceDecl()` in `source/slang/slang-lower-to-ir.cpp` was calling `setConcreteType()`. It had somewhat unusual logic around `lowerType()`: the `IRWitnessTable` was first added with `context->setGlobalValue()` and its `concreteType` was changed later. This commit works around that by temporarily setting the parent of the `IRWitnessTable` and then resetting it with the correct `IRWitnessTable`. Without this logic, it went into infinite recursion.
`AutoDiffPass::fillDifferentialTypeImplementation()` in `source/slang/slang-ir-autodiff.cpp` was calling `setConcreteType()`. It was changing the concreteType of `innerResult.diffWitness`. This commit creates a new `IRWitnessTable` and copies its `IRWitnessTableEntry`.
### Solution for "2. When you try to create a duplicate child, you get a previously created instance of IRWitnessTable, instead of a new one"
After a call to `IRBuilder::createWitnessTable()`, this commit checks whether the returned `IRWitnessTable` is brand new or not. If it is not new, we have to avoid adding the decorations and children again.
This commit decides when to add decorations and children based on whether the `IRWitnessTable` already has any decorations or children. This doesn't seem like a proper way to check, but when I tried, it was difficult to find a single point where the decorations and children are first added to an `IRWitnessTable`. Note that we are not trying to find when the `IRWitnessTable` is created for the first time; we need to find out whether the decorations and children have already been added.
It might be fine to have duplicated `IRWitnessTableEntry` in most cases, but I noticed that it fails an assertion check when `shouldDeepCloneWitnessTable()` returns false in `cloneWitnessTableImpl()`.
### Solution for "3. We don't actually want to hoist IRWitnessTable."
The reason this commit makes `IRWitnessTable` Hoistable is to prevent duplicated instances of `IRInst`. But we don't really want to "Hoist" them.
When an `IRWitnessTable` gets hoisted out, it causes unexpected problems and the specialization process fails due to the missing `IRWitnessTable` in the input.
This commit prevents `IRWitnessTable` from being hoisted in `_replaceInstUsesWith()`. The way this is implemented feels a little hacky, but we discussed it on Slack and decided to go with it. A more proper approach could be to add a new flag to `IROpFlags`, such as `kIROpFlag_Deduplicate`, that is distinct from `kIROpFlag_Hoistable`.
### Solution for "4. There can be only one instance of a Hoistable and it cannot appear as a child multiple times."
When `IRWitnessTable` is Hoistable, each instance is unique, and an instance cannot appear as a child more than once, because `IRInst` has only one pair of `IRInst* next` and `IRInst* prev` pointers.
Before this commit, an instance of `IRGeneral` could have duplicated instances of `IRWitnessTable`. As an example, the `IInteger` interface inherits from two other interfaces, `IArithmetic` and `ILogical`, both of which inherit from `IComparable`.
```
interface IInteger : IArithmetic, ILogical {}
interface IArithmetic : IComparable {}
interface ILogical : IComparable {}
```
When we specialize it in `specializeGenericImpl()`, an `IRBlock` gets the following list of children:
- IRWitnessTable for IComparable,
- IRWitnessTable for IArithmetic,
- IRWitnessTable for IComparable,
- IRWitnessTable for ILogical,
During the cloning step of specialization, "IRWitnessTable for `IComparable`" must be cloned before "IRWitnessTable for `IArithmetic`", because "IRWitnessTable for `IArithmetic`" refers to "IRWitnessTable for `IComparable`" as its `IRWitnessTableEntry`. The order in which they appear as children of the `IRBlock` decides which instances are cloned first, so "IRWitnessTable for `IComparable`" must appear before "IRWitnessTable for `IArithmetic`".
Note that "IRWitnessTable for `IComparable`" appears twice. The first one was added for "IRWitnessTable for `IArithmetic`", and the second one was added for "IRWitnessTable for `ILogical`".
With this commit, "IRWitnessTable for `IComparable`" can appear as a child only once in the `IRBlock`. So it causes an error if the block gets the following list:
- IRWitnessTable for IArithmetic,
- IRWitnessTable for IComparable,
- IRWitnessTable for ILogical,
To resolve the problem, "IRWitnessTable for `IComparable`" must appear before both "IRWitnessTable for `IArithmetic`" and "IRWitnessTable for `ILogical`", as follows:
- IRWitnessTable for IComparable,
- IRWitnessTable for IArithmetic,
- IRWitnessTable for ILogical,
To address the problem, instances of `IRWitnessTable` are always added to the end of the children list. If an instance is already in the list, we don't move it. This works because the AST is built in dependency order.
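The append-if-absent rule can be illustrated with a small Python sketch. It is not the Slang code: the function name `add_witness_table` and the `deps` mapping are hypothetical, and the explicit recursion stands in for the dependency-ordered traversal that the commit relies on.

```python
def add_witness_table(children, name, deps):
    """Append `name` after its dependencies; never move or duplicate an entry.

    `children` models the ordered child list of the block and `deps`
    maps an interface to the interfaces it inherits from.
    """
    for dep in deps.get(name, ()):
        add_witness_table(children, dep, deps)
    if name not in children:
        children.append(name)  # always added at the end, only once


deps = {"IArithmetic": ["IComparable"], "ILogical": ["IComparable"]}
children = []
for table in ["IArithmetic", "ILogical"]:
    add_witness_table(children, table, deps)
# children is now ["IComparable", "IArithmetic", "ILogical"]
```

Because `IComparable` is appended the first time it is reached and never moved afterwards, it ends up ahead of both tables that refer to it.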
### Solution for "5. Different import/export mangled names were used for the same Witness-table when its type is "enum" interface."
This issue was found while running the Falcor tests, which use the Conformance-type feature of Slang.
We were using different import and export mangled names for the same witness-table when the witness-table is for an "Enum" interface.
The way we simplify the implementation of "Enum" causes a problem when generating the export/import for the witness-table. The exact repro steps are still unclear.
There were two suggested solutions for the problem, and this PR adopts the first option for now. We may want to improve it with the second option later.
Option 1: when we produce mangled names for those witness-tables, we can use a mangled name with the underlying "int" type instead of the name of the enum type. This way, all witness-tables for enum types whose underlying type is the same will get the same mangled name, which allows us to deduplicate the witness-tables during linking.
Option 2: we can preserve the type info for enum types when generating IR. We can still erase all other uses of the enum type info for now. But when we generate the witness-table, instead of filling the conforming type operand with IntType, we fill it with EnumType(IntType), where EnumType is a new global IROp code representing all enum types (like InterfaceType/StructType). This way the operands for the two witness-tables will be different.
"Option 1" is quicker and dirtier, and "option 2" is the more proper way to address it.
I will go with "option 1" and improve it with the "option 2" approach later.
|
|
* Include generics' operands in call graph construction
* add test
|
|
This reverts commit b3deec2001ea34e20e9a6af8ddf5cf3866cafac0.
|
|
Fixes #5856
This commit updates the out-of-date license information in
https://github.com/shader-slang/slang/blob/master/CONTRIBUTING.md#license
to state that contributions are licensed under the Apache License 2.0
with LLVM Exception instead of the MIT License.
|
|
* [docs] Admonish slangc entry points / shader attributes
Admonish the related non-functional compilation command in the reference manual until #5541 is addressed.
* Refine shader attribute description.
* Refine shader attribute description
* Update with all supported targets
Add all targets supporting shader attributes per provided verbiage.
---------
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
|
* Ignore failure to fetch webgpu_dawn and slang-tint
Fixes #6683
* format code
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
|
close #6694
We should return nullptr when findAndValidateEntryPoint fails to validate
the entry point.
|
|
* Eliminate empty struct on metal target
Close #6573.
We previously disabled type legalization for ParameterBlock on
Metal, but Metal doesn't allow an empty struct in the argument buffer
that a ParameterBlock is mapped to, so we need legalizeEmptyTypes
on the Metal target.
* update test
* update function name
|
|
* Fix SPV_KHR_maximal_reconvergence extension name spelling
Vulkan validation layers emit warnings on lowercase khr.
* Move OpExtension check
|
|
* Fix mul operator followed by global scope
This should fix expressions like `2.0f * ::a::b::c`.
But it will no longer parse something like
```
extension<T> Ptr<T> { static void foo(); }
int*::foo() // won't work, but this is a less common case
```
Fixes #6684
* Update simpe-namespace.slang to test global scope
|
|
Close #6541.
Previously, in the type legalization pass, we skipped the VoidType field when calling make_struct; however, some optimization passes kept counting the VoidType field. We have to make this behavior consistent across the codebase.
So in this change, we detect the make_struct call and leave the VoidType field as is.
|
|
(#6675)
* Improve embed tool to search all include directories as determined by CMake
Hopefully this puts an end to prelude generation issues.
* Update CMakeLists.txt
* Update CMakeLists.txt
* Use Slang's string representation instead of malloc-ing chars
|
|
* Don't load cached builtin module in slang-bootstrap.
* Fixes.
* format code
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
|
|
|
|
|
* spirv: add support for ops added by multiple extensions
Some spirv ops are added by multiple extensions and capabilities. This
commit adds support to avoid emitting unnecessary extensions and
capabilities if one of the options is already required by some other op.
* spirv: allow OpRaytracingAccelerationStructure to use multiple extensions
This Op is provided by both SPV_KHR_ray_tracing and SPV_KHR_ray_query
and the respective capabilities. Use one if already available and
otherwise fall back to SPV_KHR_ray_tracing.
* tests/vkray: add negative checks for RayTracingKHR and RayQueryKHR
- Add new rayquery-compute.slang to test that only RayQueryKHR is needed
in compute shaders.
- Add checks for RayTracingKHR and RayQueryKHR capabilities and
extensions in raygen.slang
---------
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
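The selection rule for ops provided by several extensions can be sketched as follows in Python. The function name `require_one_of` and the set-based bookkeeping are assumptions for illustration, not the emitter's real interface.

```python
def require_one_of(required, options):
    """Pick an extension for an op that several extensions can provide.

    Reuse an option that is already required by some other op;
    otherwise fall back to the first option and record it.
    """
    for option in options:
        if option in required:
            return option
    required.add(options[0])
    return options[0]


# A compute shader that already uses ray queries needs nothing extra.
required = {"SPV_KHR_ray_query"}
chosen = require_one_of(required, ["SPV_KHR_ray_tracing", "SPV_KHR_ray_query"])
```

Listing the fallback first keeps the default behaviour (SPV_KHR_ray_tracing) when neither extension has been requested yet.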
|
|
* initial wip
* more WIP
* preserve old lower behavior
* remove unnecessary includes
* add test
* add no target case in test
* fix broken test
---------
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
|
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
|
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
|
The previous implementation had two issues in the modifier processing loop:
1. isConst was incorrectly initialized to true, making the const check redundant
2. Premature loop break could skip processing important modifiers, e.g.
isExtern
Changes:
- Initialize isConst to false by default
- Remove early break condition to process all modifiers
Fixes: #6606
Co-authored-by: Yong He <yonghe@outlook.com>
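The shape of the corrected loop can be sketched in Python. The modifier strings and the function name `scan_modifiers` are hypothetical stand-ins for the real modifier classes.

```python
def scan_modifiers(modifiers):
    """Scan every modifier and report (is_const, is_extern).

    Both flags start as False and the loop never breaks early, so a
    modifier that appears after `const` is still seen.
    """
    is_const = False
    is_extern = False
    for modifier in modifiers:
        if modifier == "const":
            is_const = True
        elif modifier == "extern":
            is_extern = True
    return is_const, is_extern
```

With an early break on `const`, a declaration such as `const extern` would have reported `is_extern` as False.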
|