| Commit message (Collapse) | Author | Age |
| |
|
|
|
|
|
| |
There was a bug that causes the compiler failing to treat a `no_diff
TypePack` as a type pack, and thus diagnose an error when resolving the
following call.
The fix is to unwrap any ModifiedType wrappers in `IsTypePack()` check.
|
| |
|
|
|
|
|
|
| |
Fixes #8221
This modifies the code snippet used to demonstrate link-time
specialization to use the public `loadModuleFromSourceString` API
instead of the internal `UnownedRawBlob::create`.
It also corrects a couple variable names in the snippet as well.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As mentioned in #8316 , there is a small duplicated and outdated section
in WGSL-Specific Functionalities documentation about specialization
constants support,
remove the outdated duplicated one
<img width="893" height="146" alt="image"
src="https://github.com/user-attachments/assets/abcd7521-645b-4bd6-b926-ce2d978775bd"
/>
as there is a new section in the page
<img width="851" height="319" alt="image"
src="https://github.com/user-attachments/assets/f52e5230-812b-4b29-88f4-bfff890f37ed"
/>
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#8603)
This change achieves link-time type resolution with a different
mechanism.
For `extern struct Foo : IFoo = FooImpl;`,
instead of synthesizing a wrapper type `Foo` that has a `FooImpl inner`
field and dispatches all interface method calls to `inner.method()`,
this PR completely removes this synthesis step, and instead just lower
such `extern`/`export` types as `IRSymbolAlias` instructions that is
just a reference to the type being wrapped.
Then we extend the linker logic to clone the referenced symbol instead
of the SymbolAlias insts itself during linking.
By doing so, we greatly simply the logic need to support link-time
types, and achieves higher robustness by not having to deal with many
AST synthesis scenarios.
Closes #8554.
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#8547)
This allows us to specialize functions whose argument is a sub element
of a constant buffer, instead of being only applicable to entire buffer
element. Closes #8421.
This change also implements a proper heuristic to determine when to
specialize the calls and defer the buffer loads.
This PR addresses a pathological case exposed in
`slangpy\slangpy\benchmarks\test_benchmark_tensor.py`, which used to
take 27ms to finish, and now takes 1.25ms.
For example, given:
```
struct Bottom
{
float bigArray[1024];
[mutating]
void setVal(int index, float value) { bigArray[index] = value; }
}
struct Root
{
Bottom top[2];
[mutating]
void setTopVal(int x, int y, float value)
{
top[x].setVal(y, value);
}
}
RWStructuredBuffer<Root> sb;
[shader("compute")]
[numthreads(1, 1, 1)]
void compute_main(uint3 tid: SV_DispatchThreadID)
{
sb[0].setTopVal(1, 2, 100.0f);
}
```
We are now able to specialize the call to `setTopVal` into:
```
void compute_main(uint3 tid: SV_DispatchThreadID)
{
setTopVal_specialized(0, 1, 2, 100.0f);
}
void setTopVal_specialized(int sbIdx, int x, int y, float value)
{
Bottom_setVal_specialized(sbIdx, x, y, value);
}
void Bottom_setVal_specialized(int sbIdx, int x, int y, float value)
{
sb[sbIdx].top[x].bigArray[y] = value;
}
```
And get rid of all unnecessary loads. Achieving this requires a
combination of function call specialization and buffer-load-defer pass.
The buffer-load-defer pass has been completely rewritten to be more
correct and avoid introducing redundant loads.
This PR also adds tests to make sure pointers, bindless handles, and
loads from structured buffer or constant buffers works as expected.
|
| |
|
|
|
|
|
|
|
| |
This change relaxes a previous restriction on link-time types and
constants, so that we now allow them to be used to define shader
parameters.
Doing so will result in a parameter layout that is incomplete prior to
linking. The PR added a test to call the reflection API on a fully
linked program and ensure that we can report correct binding info.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The link-time specialization documentation contained an incorrect
example that used `[ForceUnroll]` with a link-time type method call,
which would cause a compilation error. The issue was that
`[ForceUnroll]` requires loop bounds to be known at compile time, but
`sampler.getSampleCount()` is a method call that returns a value at
runtime.
**Problem:**
The documentation example showed:
```csharp
Sampler sampler;
[ForceUnroll]
for (int i = 0; i < sampler.getSampleCount(); i++)
output[tid] += sampler.sample(i);
```
This would fail with error: `loop does not terminate within the limited
number of iterations, unrolling is aborted.`
**Solution:**
Removed the `[ForceUnroll]` attribute entirely, leaving a simple loop:
```csharp
Sampler sampler;
for (int i = 0; i < sampler.getSampleCount(); i++)
output[tid] += sampler.sample(i);
```
Since the loop bounds come from a runtime method call, there's no way
for the loop to be unrolled regardless of the directive used, so the
simplest solution is to remove the unroll attribute completely.
- [x] Remove ForceUnroll attribute from documentation example
- [x] Remove explanatory note about unroll vs ForceUnroll
- [x] Remove test cases for the removed functionality
- [x] Fix missing closing backticks in code block
Fixes #8161.
<!-- START COPILOT CODING AGENT TIPS -->
---
💡 You can make Copilot smarter by setting up custom instructions,
customizing its development environment and configuring Model Context
Protocol (MCP) servers. Learn more [Copilot coding agent
tips](https://gh.io/copilot-coding-agent-tips) in the docs.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: bmillsNV <163073245+bmillsNV@users.noreply.github.com>
Co-authored-by: expipiplus1 <857308+expipiplus1@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Closes #8112. ~~The issue asks for a "C layout", but in this PR I use
the term "CPU layout" because this naming was pre-existing in the
codebase as `kCPULayoutRulesImpl_`. The primary purpose of this layout
is to match CPU-side struct definitions with the shader side. I'm open
to better naming suggestions, though.~~
Edit: switched back to using `CDataLayout` & `-fvk-use-c-layout`, as the
CPU target depends on the object layout rules of existing CPU layout
rules, but they're incompatible with actual shaders. So a new
`kCLayoutRulesImpl_` was needed anyway.
---------
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-Adds semantic SV_VulkanSamplePosition that emits corresponding
gl_SamplePosition and SpvBuiltinSamplePosition
-Adds gl_SamplePosition property to glsl.meta.slang
-Adds SPIRV and GLSL tests for the semantic and property
-Plan is to later implement SV_SamplePosition that follows HLSL range of
-0.5 to +0.5,
and emits GetRenderTargetSamplePosition(SV_SampleIndex) which needs more
complicated IR manipulation for HLSL and Metal
Fixes #7906
---------
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
| |
Added a note section under the Installation section that warns users
about potential conflicts when multiple Slang installations are present
on the system. The note specifically addresses:
* The scenario where Slang from Vulkan SDK might conflict with a
standalone installation
* How LD_LIBRARY_PATH on Linux overrides the RUNPATH in the slangc
executable
Closes https://github.com/shader-slang/slang/issues/7405
|
| |
|
|
|
|
|
|
|
| |
This file is automatically overwritten by the build:
https://github.com/shader-slang/slang/blob/b7df3c7aa27301f88e31ed0a7bbf230688adab6a/source/slang/CMakeLists.txt#L68-L78
It is currently out of date (running the build gives rise to unstaged
changes), so this PR updates it.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes: #7410
Changes:
1. super-type capabilities must be a super-set of sub-type capabilities
(and support the same shader stages/targets)
* InheritanceDecl visits super-type to inherit it's capabilities;
validate InheritanceDecl capabilities against sub-type
* visit all container decl's with a default case
* clean up functionDeclBase visitor
* Simplify `diagnoseUndeclaredCapability` by moving logic into
capability checking (more correct*)
3. added changed behavior to documentation
4. fixed some incorrect capabilities
5. **we do not** diagnose capability errors on interface
requirement-to-implementation if both lack explicit capability
requirements. This change is to work around a slangpy regression (test
case for the failing situation is in
`tests\language-feature\capability\capability-interface-extension-1.slang`),
Note: maybe for slang-2026 we don't do this?
6. requirement & implementation must support the same shader
stage/target. This was changed because otherwise we can have cases where
`X` inherits from `Y`, but `Y` is only expected to be used in `glsl`
whilst `X` is expected to be used in `hlsl | glsl`
7. removed
`tests/language-feature/capability/capabilitySimplification3.slang`
because it tests nothing special (redundant)
Note: not using rebase due to separate branches depending on this PR
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Implement SPV_EXT_fragment_invocation_density
-Adds semantics SV_FragSize and SV_FragInvocationCount and implements them for SPIRV and GLSL using the appropriate target builtins from extensions.
-Adds test case checking for expected target builtins from these semantics.
-For future work, could implement SV_FragSize using pixel shader input SV_ShadingRate for HLSL, and SV_FragInvocationCount needs research.
Fixes #7974
Generated with Claude Code
* address review feedback
https://github.com/shader-slang/slang/pull/8037#pullrequestreview-3084645845
* fixup format
* review feedback
https://github.com/shader-slang/slang/pull/8037#pullrequestreview-3086442819
|
| |
|
|
|
| |
* Fix broken links in User Guide
* Fix link text with filename, use title
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Added small section about default values of `struct` members
* Added one more example and made `int a` to be brace initilised as well
* Implemented suggested changes
* Update docs/user-guide/02-conventional-features.md
Co-authored-by: Yong He <yonghe@outlook.com>
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
groups (#7851)
* Fix capability generator to sort capabilities alphabetically within header groups
The slang-capability-generator now sorts capabilities alphabetically by name
within each header group in the generated a3-02-reference-capability-atoms.md
documentation file. This ensures consistent ordering for better readability
and organization.
Fixes #5030
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-authored-by: aidanfnv <aidanfnv@users.noreply.github.com>
* Regenerate capability atoms documentation with alphabetical sorting
The capability generator now sorts capabilities alphabetically within
each header group. This commit includes the regenerated documentation
file to demonstrate the alphabetical sorting functionality implemented
in the generator code.
Generated with updated capability generator that sorts capabilities
within groups: Targets, Stages, Versions, Extensions, Compound
Capabilities, and Other.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* format code (#7853)
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: aidanfnv <aidanfnv@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
| |
* alias amplification shader as task shader and add mesh shader profile
* add task shader stage alias to capabilities
* regenerate command line reference
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
| |
* Support DeviceIndex
* format code
* regenerate command line reference
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Parse optional witness syntax
* Allow failing optional constraint
* Make `is` work with optional constraint
* Allow using optional constraint in checked if statements
* Fix tests
* Make it work with structs
* Fix MSVC build error
* Disallow using `as` with optional constraints
* Update test to match split is/as errors
* Add tests
* Fix uninitialized variables in tests
* Add tests of incorrect uses & fix related bugs
* Mention optional constraints in docs
* format code
* Fix type unification with NoneWitness
* Fix formatting
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Co-authored-by: Nathan V. Morrical <natemorrical@gmail.com>
|
| |
|
|
|
|
|
| |
* Require `override` keyword for overriding default interface methods.
* Update doc.
* Fix test.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add new capdef for lss intrinsics
Fixes #7426
Raygen shaders need to be supported for only hitobject APIs. So we need
a special capability for that, instead of a common one.
* regenerate command line reference
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Better handling for 16-byte boundary of d3d cbuffer
Fixes #6921
D3D cbuffers have slightly different packing rules that allow packing
vectors into a 16-byte slot at element alignments, except when
a field would cross a 16-byte boundary. In that case, we need to
realign the field to the next 16-byte boundary.
In particular, this impacts vec3s, which are not a power of two in
size and thus require slightly different alignment logic, compared to
std430 and std140. (Example: a float and float3 should fit together in
that order in a single slot.)
Adds test cases.
Adds documentation page for GLSL target
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
* WiP: Add coopvec support for Optix
* format code
* fix minor issues
* Fix review comments
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add legalization for 0-sized arrays.
* Allow 0-sized arrays in the front-end.
* More tests.
* Add `Conditional<T, hasValue>` type to core module.
* Update toc.
* Fix wording.
* Update test.
|
| |
|
|
|
| |
1. "here, $$f$$ here has" => "here, $$f$$ has"
2. $f$ ==> $$f$$
* $f$ does not render on website, works on vscode
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Language version + tuple syntax.
* Fix compile error.
* regenerate documentation Table of Contents
* Fix.
* regenerate command line reference
* Fix.
* Fix.
* Fix more test failures.
* revert empty line change,
* Retrigger CI
* #version->#lang
* Update source/core/slang-type-text-util.cpp
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
* Remove comments.
* Fix parsing logic.
* Fix parser.
* Fix parser.
* update test comment
* Update options.
* regenerate documentation Table of Contents
* regenerate command line reference
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
|
| |
|
|
| |
github pages reqire new-line to render table correctly.
Grammer problem.
|
| |
|
| |
* change default descriptor binding to be VkMutable
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
* WiP: LSS intrinsics: initial commit
* format code
* Fix CI failures
* Address review comment
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Properly implement WaveMask* variants of WaveMultiPrefix* intrinsics
* More partitioned intrinsics
* More partitioned intrinsics and cleaned up non-prefixed WaveMask* implementations
* Refactor HLSL WaveMultiPrefix* implementations
* fix cap atoms
* Clean up implementation
* Add GLSL intrinsics and cleanup
* Add tests
* Fix affected capability test
* Update and fix tests
* Move expected.txt file
* Refactor WaveMask* to call WaveMulti*
* Refactor SPIRV/GLSL preamble code
* Enable emit-via-glsl tests
* remove wave_multi_prefix capability in favor of subgroup_partitioned
* Update docs
* Update cap atoms doc
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Implement throw statement
It already existed in the IR, so only parsing, checking and lowering was
missing.
* Initial catch implementation
Likely very broken.
* Error out when catch() isn't last in scope
* Prevent accessing variables from scope preceding catch
As those may actually not be available at that point.
* Add IError and use it in Result type lowering
* Add diagnostic tests
* Allow caught throws in non-throw functions
* Fix catch propagating between functions & SPIR-V merge issue
* Add test for non-trivial error types
* Fix MSVC build
* Fix invalid value type from Result lowering
* Also lower error handling in templates
* Lower result types only after specialization
* Attempt to disambiguate error enums by witness table
* Revert matching by witness, types should be distinct too
* Don't assert valueField when getting Result's error value
It may not exist if the function returns void, but getting the error
value is still legitimate.
* Update tests for new error numbers & get rid of expected.txt
* Change catch lowering to resemble breaking a loop
... To make SPIR-V happy.
* Fix dead catch blocks and invalid cached dominator tree
* More SPIR-V adjustment
* Lower catch as two nested loops
* Add defer interaction test and revert broken defer changes
* Fix enum type when throwing literals
* Cleanup and bikeshedding
* Document error handling mechanism
* Fix table of contents
* Use boolean tag in Result<T, E>
* Use anyValue storage for Result<T,E>
* Remove IError
* Fix formatting
* Eradicate success values from docs and tests
* Use parseModernParamDecl for catch parameter
* Implement do-catch syntax
* Implement catch-all
* Fix formatting
* Fix marshalling native calls that throw
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes: [#7143](https://github.com/shader-slang/slang/issues/7143)
fixes: [#7146](https://github.com/shader-slang/slang/issues/7146)
Goal of PR:
* This PR is part of the larger #7115 refactor to how dynamic dispatch works.
* The first step is to add the `-std <std-revision>` flag.
* The second step is to provide basic `dyn` keyword support in AST. This does not include `varDecl` support since most of these interactions require `some` keyword support.
Future PR(s) goal:
* Support `some` keyword in AST. With this we will also implement all varDecl interactions between `dyn` and `some`.
* Add IR support for `some` and `dyn`.
Breakdown of PR:
* most of the logic is in `validateDyn.*`. This was done so that in the future when we implement more features we will have an easy time removing/adding restrictions to `dyn` interfaces.
Breaking changes:
* As per spec (https://github.com/shader-slang/spec/pull/14/files), any type conforming to a `dyn` interface errors if member list contains one of the following: opaque type, non copyable type, or unsized type.
* Due to the breaking change, the test `tests\compute\dynamic-dispatch-bindless-texture.slang` is incorrect. This has been fixed.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
semantics (#7150)
* Map SV_VertexID to `gl_VertexIndex - gl_BaseVertex`, provide SV_Vulkan* SV semantics
* Fix docs
* Regenerate toc
* Fix affected pointer-2 test
* Add tests
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The user can explicitly use Vulkan memory model, or it will be
automatically used when cooperative-matrix is used.
When vulkan memory model is used, two keywords, "Coherent" and
"Volatile", are not allowed.
There are many differences regarding atomic and texture but
this PR has changes limited to support `globallycoherent`
keyword. When variables with `globallycoherent` is used with `OpLoad`, it
will use additional options, `MakePointerAvailable|NonPrivatePointer`,
that will provide the same effect. For `OpStore`, it will use
`MakePointerVisible|NonPrivatePointer`.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix correct bindings for bindless resource model [spirv and glsl]
fixes: #6952
Problem:
* Currently all bindless objects are placed in the same set (fine) and same binding (incorrect behavior for vulkan). This is incorrect since as per [spec](https://registry.khronos.org/vulkan/specs/latest/man/html/VkDescriptorType.html), only 1 resource type may be written to each index inside a set (these rules are loosened with VK_EXT_mutable_descriptor_type)
* This means currently generated bindings do not work in practice if we (for example) use `Sampler2D.Handle` and `Texture1D.Handle` in a shader since we would place 2 incompatible objects in the same binding-index and set.
Solution:
* `__getDynamicResourceHeap` was modified to allow bindings to chosen dynamically for a descriptor
* use `IOpaqueDescriptor` to check compile-time information of resource types so that we can identify different resources
* Using this information of `IOpaqueDescriptor`, we modify `defaultGetDescriptorFromHandle` to provide a binding model (1 resource per binding-index) which produces legal spirv/glsl.
* To support `VK_EXT_mutable_descriptor_type` the function `defaultGetDescriptorFromHandle` has a set of options (`BindlessDescriptorOptions`) for a user to pick-from to support their binding model. Capabilities are not used here for flexibility purposes (specifically old shaders mixed with modern vulkan extensions).
Other changes:
* Added `TexelBuffer` DescriptorKind to aid in generating correct bindings
* format code
* Add to docs bindless changes, make AccelerationStructure use its handle directly, adjust tests accordingly
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
| |
This commit implements two new types and related Load/Store functions in CoopMat.
tensor_addrressing.TensorLayout
tensor_addressing.TensorView
CoopMat.Load(..., TensorLayout)
CoopMat.Load(..., TensorLayout, TensorView)
CoopMat.Store(..., TensorLayout)
CoopMat.Store(..., TensorLayout, TensorView)
CoopMat.Load(..., TensorLayout, TensorView)
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This commit adds three new functions for CoopMat as described in the proposal document,
Cooperative matrix 2 proposal spec#12
The new functions are:
CoopMat<T,S,M,N,R>::Transpose
CoopMat<T,S,M,N,R>::ReduceRow
CoopMat<T,S,M,N,R>::ReduceColumn
CoopMat<T,S,M,N,R>::ReduceRowAndColumn
CoopMat<T,S,M,N,R>::Reduce2x2
|
| |
|
|
|
|
|
|
|
| |
**NOTE: This is a breaking change for users who were using POC variant of DXC.
In order to keep the compatibility, the users will have to use -capability hlsl_coopvec_poc to their command line.
This PR adds a new capability "hlsl_coopvec_poc".
When it is used, the HLSL for CoopVec will be emitted for the POC variant of DXC.
When it is not used, the HLSL for CoopVec will be emitted for the DXC that officially supports the cooperative vector.
|
| |
|
|
|
|
|
|
|
|
| |
Closes https://github.com/shader-slang/slang/issues/6805
This change adds a note to the SPIR-V target specific doc that SV_InstanceID does not map directly to SPIR-V's BuiltIn InstanceIndex, and adds a more detailed explanation of the difference, its motivation, and how to derive the actual value equivalent to BuiltIn InstanceIndex with an example.
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes #7049
The root cause of the problem in #7049 is simply that newer NVRTC versions produce a warning when asked to generate code for older CUDA SM versions, and the default that Slang was requesting compilation for was old enough to trigger that warning, and thus trip up the test case (which only looks at the first diagnostic produced by the downstream compiler).
Superficially, the fix was easy: change the test case in question (`tests/diagnostics/local-line.slang`) to request `-capability cuda_sm_8_0`, the minimum version supported by current NVRTC.
Unfortunately, the simple fix required some other fixes in order to actually work.
The capability system includes capability names of the form `cuda_sm_*_*`, but specifying such a capability had *no* impact on the CUDA SM version passed in when invoking NVRTC.
Instead, only the CUDA SM versions requested in the implementation of intrinsics in the core module were affecting the version number passed down.
This change adds logic to `slang-compiler.cpp` to take explicitly requested capabilities into account when inferring the CUDA SM version to be passed downstream.
A more complete fix would also add similar logic for all the other targets.
Unfortunately... yet again... that fix wasn't enough to make things work as expect.
Now I had the problem that requesting `-capability cuda_sm_8_0` was actually causing the NVRTC invocation to request CUDA SM version **9.0**!
The underlying problem *there* was that the `slang-capabilities.capdef` file has defined certain capability names in a way that implies atomic capabilities much higher than one would expect.
E.g., the `cuda_sm_8_0` alias was including HLSL `sm_5_0`, but then `sm_5_0` in turn included `_cuda_sm_9_0`.
The fix, for now, is to change the definitions in `slang-capabilities.capdef` to not have the counter-intuitive definitions for `cuda_sm_*_*`.
With this set of fixes, the test failure in the original bug report no longer occurs.
The work that went into this change suggests several larger-scope fixes that would be good to pursue:
* Ideally the capability definitions would have some sort of validation checking to make sure that counter-intuitive results like `cuda_sm_8_0` requesting CUDA SM 9.0 do not occur.
* The translation of capabilities over to version numbers for a downstream compiler should be expanded to cover other targets, and not just CUDA. It might be better/simpler to just pass the capabilities themselves to the downstream compiler, since it is possible that a downstream compiler could have more fine-grained enable/disable options than a simple version number.
* The entire approach to computing version numbers required for downstream compilation should be cleaned up so that we don't have this duplication between the capabilities that represent those versions and separate syntactic constructs that are used to "request" those versions as part of code generation.
* We are very much at the point where we should consider dropping the current behavior where a profile name or capability like `sm_5_0`, that is specific to a single target or a subset of targets, also implies a set of comparable capabilities for other targets.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add cluster geometry intrinsics for ray tracing
- Added GetClusterID() method to HitObject class
- Added CandidateClusterID() and CommittedClusterID() methods to
RayQuery class
- Added SPV_NV_cluster_acceleration_structure extension support
- Added GL_NV_cluster_acceleration_structure extension support
- Added test files for RayQuery and HitObject cluster methods
Fixes #6431
* OpRayQueryGetIntersectionClusterIdNV - unrecognized spirv
Disabling spirv backend for SPV_NV_cluster_acceleration_structure
hlsl.meta.slang(18674): error 29100: unrecognized spirv opcode:
OpRayQueryGetIntersectionClusterIdNV
result:$$int = OpRayQueryGetIntersectionClusterIdNV
&this $iCandidateOrCommitted;
^~~~~~
hlsl.meta.slang(18670): error 30019: expected an expression of type
'int', got 'void'
return spirv_asm
^~~~~~~~~
ninja: build stopped: subcommand failed.
* 6431 - Fix spirv opcode
* Remove tests
* Add relevant tests
* Review - Simplify tests
|
| |
|
|
|
|
| |
This change fixes some broken links/anchors in the Slang docs that I found after running a link checker tool on the readthedocs site.
Some of these were broken on GitHub Pages as well, such as anchors with apostrophes or docs that were moved.
The doc writer change adds an explicit .html extension to Parameter links, so that Sphinx/RTD does not treat it as an anchor.
|
| |
|
|
|
|
|
| |
into H2 (#6986)
readthedocs treats H1 headings as document titles in the sidebar table of contents.
This change turns all extra H1 headings into H2 in the "Metal-Specific Functionalities" doc.
These extra H1 headings seem to be errors, given that they are inconsistent, and the H2s after them do not belong. For examples, the heading "Resource Types" is H2, but "Array Types" is H1, and "Mesh Shader Support" is H1, but the following "Header Inclusions and Namespace" is not a subtopic of the Mesh Shader Support.
|
| |
|
| |
This change adds hidden (commented out) tables of contents to the index of the User Guide, as well as each of the additional chapters, so that readthedocs will render the User Guide structure.
|
| |
|
|
|
|
|
| |
* Add `IOpaqueHandle::descriptorAccess`.
* Update doc.
* fix.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add Slang Byte Code generation and interpreter.
* Fix compile issues.
* format code
* More compile fix.
* Fix clang issue.
* Fix more clang issues.
* Another clang fix.
* Fix clang issues.
* Fix another clang issue.
* Fix wasm build.
* Update building.md
* Fix test-server.
* Fix compile error.
* Fix bug.
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|