| Commit message (Collapse) | Author | Age |
|
Close #8568.
The root cause of this issue is that when a struct indirectly inherits
from the IDifferentiable type, we do not check the reference in the
DerivativeMember attribute. This PR fixes the issue by checking the
DerivativeMember attribute right before synthesizing the requirement
methods of the IDifferentiable interface.
|
For #8564
Similar to #8580, this re-adds the cross-compile target setup step for
macOS releases that was erroneously removed in
https://github.com/shader-slang/slang/pull/8470, which made x86_64
releases build aarch64 binaries.
It also simplifies the workflow logic a bit by adding a separate `arch`
variable to the release matrix, which refers to the target architecture
in the manner that the setup requires, so that we do not have to replace
the string `"aarch64"` with `"arm64"` in setting
`CMAKE_OSX_ARCHITECTURES` for native aarch64 macOS builds and do not
have to conditionally set the MSVC `arch` to `amd64_arm64` for Windows
cross-compilation.
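As a rough illustration of the idea (not the actual workflow file), the matrix can be thought of as a table in which every entry already carries the spelling that the setup step expects. The `ReleaseTarget` layout, the `setup_args` helper, and the native `amd64` value are assumptions made for this sketch; only `arm64`, `CMAKE_OSX_ARCHITECTURES`, and `amd64_arm64` come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseTarget:
    os: str        # "macos" or "windows"
    platform: str  # name used for the release, e.g. "aarch64"
    arch: str      # spelling the toolchain setup step expects


# Hypothetical matrix entries; the real workflow may differ.
MATRIX = [
    ReleaseTarget("macos", "x86_64", "x86_64"),
    ReleaseTarget("macos", "aarch64", "arm64"),
    ReleaseTarget("windows", "x86_64", "amd64"),          # assumed native value
    ReleaseTarget("windows", "aarch64", "amd64_arm64"),   # cross-compile value
]


def setup_args(target):
    """Return the setup argument for a target without any string rewriting."""
    if target.os == "macos":
        return [f"-DCMAKE_OSX_ARCHITECTURES={target.arch}"]
    return [f"arch={target.arch}"]
```

Because `arch` is stored per entry, no step has to replace `"aarch64"` with `"arm64"` or special-case Windows cross-compilation.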
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
ByteAddressBuffer with DescriptorHandle (#8252)
- [x] Fix segmentation fault in wrapConstantBufferElement for
DescriptorHandle types
- [x] Split DescriptorKind.Buffer into ConstantBuffer and StorageBuffer
- [x] Update binding enums with descriptive names (ConstantBuffer_Read,
StorageBuffer_Read, etc.)
- [x] Update resource type mappings for correct binding assignments
- [x] Update template logic to handle ConstantBuffer and StorageBuffer
kinds separately
- [x] Update tests to reflect correct binding assignments
- [x] Split DescriptorKind.TexelBuffer into UniformTexelBuffer and
StorageTexelBuffer
- [x] Update TextureBuffer<T> to use UniformTexelBuffer kind
- [x] Update _Texture extension to determine texel buffer kind based on
access mode
- [x] Update test desc-handle-1.slang to handle new DescriptorKind enum
cases
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
For #8578
This re-adds the cross-compile target setup step for Windows releases
that was erroneously removed in #8470, which made aarch64 releases build
x64 binaries.
The flow should be:
- setup MSVC for host arch
- build generators with cmake
- setup MSVC for cross-compile target arch
- build slang
Based on the description of #8470, it seems that the cross-compile MSVC
step was mistaken for a duplicate of the host MSVC step and removed as
seemingly redundant.
|
- Fix bug parsing multiple link-time structs on the same line. Closes
#8553.
- Fix bug parsing anonymous struct type as function return type in
modern syntax. Closes #8558
- Support semantics on modern style param/var declarations.
|
Stop including private header (see #8333).
Co-authored-by: Yong He <yonghe@outlook.com>
|
(#8547)
This allows us to specialize functions whose argument is a sub-element
of a constant buffer, instead of only functions whose argument is an
entire buffer element. Closes #8421.
This change also implements a proper heuristic to determine when to
specialize the calls and defer the buffer loads.
This PR addresses a pathological case exposed in
`slangpy\slangpy\benchmarks\test_benchmark_tensor.py`, which used to
take 27ms to finish, and now takes 1.25ms.
For example, given:
```
struct Bottom
{
float bigArray[1024];
[mutating]
void setVal(int index, float value) { bigArray[index] = value; }
}
struct Root
{
Bottom top[2];
[mutating]
void setTopVal(int x, int y, float value)
{
top[x].setVal(y, value);
}
}
RWStructuredBuffer<Root> sb;
[shader("compute")]
[numthreads(1, 1, 1)]
void compute_main(uint3 tid: SV_DispatchThreadID)
{
sb[0].setTopVal(1, 2, 100.0f);
}
```
We are now able to specialize the call to `setTopVal` into:
```
void compute_main(uint3 tid: SV_DispatchThreadID)
{
setTopVal_specialized(0, 1, 2, 100.0f);
}
void setTopVal_specialized(int sbIdx, int x, int y, float value)
{
Bottom_setVal_specialized(sbIdx, x, y, value);
}
void Bottom_setVal_specialized(int sbIdx, int x, int y, float value)
{
sb[sbIdx].top[x].bigArray[y] = value;
}
```
This gets rid of all unnecessary loads. Achieving this requires a
combination of function call specialization and the buffer-load-defer
pass. The buffer-load-defer pass has been completely rewritten to be more
correct and to avoid introducing redundant loads.
This PR also adds tests to make sure pointers, bindless handles, and
loads from structured buffers or constant buffers work as expected.
|
legalization pass. (#8567)
This is a crash that can be triggered by providing a custom
`getDescriptorFromHandle` and using it to access a
ByteAddressBuffer from a bindless handle.
Closes #8355.
|
Enables all tests/metal/ tests that can be easily enabled.
These tests were not originally designed as render tests; they are
generally being enabled for pipecleaning purposes, and will not be
rigorously testing the corresponding functionality.
Where they cannot be enabled as render tests, and a metallib test wasn't
already enabled, a metallib test was enabled instead (where possible).
Fixes #7892
|
Fixes #8439
When checked, the types in a generic type equality constraint are now put in a
canonical order, allowing for a commutative type equality operator.
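A minimal sketch of why a canonical operand order makes equality commutative. The `TypeEqualityConstraint` class and the use of lexicographic ordering on type names are assumptions for illustration; the compiler's actual ordering criterion is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeEqualityConstraint:
    left: str
    right: str


def canonicalize(a, b):
    """Build the constraint `a == b` with its operands in a fixed order."""
    # Any total order works; sorting by name is just the simplest choice.
    left, right = sorted((a, b))
    return TypeEqualityConstraint(left, right)
```

With this, `canonicalize("T", "U.Assoc")` and `canonicalize("U.Assoc", "T")` produce the same object, so later comparisons do not need to try both operand orders.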
---------
Co-authored-by: Mukund Keshava <mkeshava@nvidia.com>
|
packing/unpacking. (#8526)
Part of the effort to improve the performance of generated SPIRV code.
The existing lower-buffer-element-type pass works by loading the entire
buffer element content from memory and translating it to the logical type
stored in a local variable at the earliest reference of a buffer handle.
This means that it can generate inefficient code that reads more than
necessary.
Consider this example:
```
struct BigStruct { bool values[1024]; }
ConstantBuffer<BigStruct> cb;
void test(BigStruct v)
{
if (v.values[0]) { printf("ok"); }
}
[numthreads(1,1,1)]
void computeMain()
{
test(cb);
}
```
In IR, the `computeMain` function before the lower-buffer-element-type
pass looks something like the following:
```
func test:
%v = param : BigStruct
%barr = fieldExtract(%v, "values")
%element = elementExtract(%barr, 0)
... // uses %element
func computeMain:
%v = load(cb)
call %test %v
```
The existing lower-buffer-element-type pass will rewrite the bool array
in `BigStruct` into `int` array so it is legal in SPIRV. However, it
does so by inserting the translation on the first `load` of the constant
buffer:
```
struct BigStruct_std430 {
int values[1024];
}
var cb : ConstantBuffer<BigStruct_std430>;
func computeMain:
%tmpVar : var<BigStruct>
call %unpackStorage(%tmpVar, cb)
%v : BigStruct = load %tmpVar
call %test %v
```
This means that the entire array will be loaded and translated to int,
before calling `test`, which only uses one element. It turns out that
the downstream compiler isn't always able to optimize out this
inefficient translation/copy.
This PR completely rewrites the way buffer-element-type lowering is
handled to avoid producing this inefficient code. It works in two parts:
first we turn on the `transformParamsToConstRef` pass for SPIRV target
as well, so we will translate the `test` function to take the `v`
parameter as `constref`. The second part is a redesigned
buffer-element-type pass that defers the storage-type to logical-type
translation until a value is actually used by a `load` instruction.
In this example, after `transformParamsToConstRef`, the IR is:
```
func test:
%v = param : ConstRef<BigStruct>
%barr = fieldAddr(%v, "values")
%elementPtr = elementAddr(%barr, 0)
%element = load(%elementPtr)
... // uses %element
func computeMain:
call %test %cb
```
The new `buffer-element-type-lowering` pass will take this IR, insert
the translation at the latest possible point across the entire call graph,
and translate the IR into:
```
func test:
%v = param : ConstRef<BigStruct_std430>
%barr = fieldAddr(%v, "values")
%elementPtr : ptr<int> = elementAddr(%barr, 0)
%element_int = load(%elementPtr)
%element = cast(%element_int) : %bool
... // uses %element
func computeMain:
call %test %cb
```
In this new IR, there is no longer a load and conversion of the entire
array.
See new comment in `slang-ir-lower-buffer-element-type.cpp` for more
details of how the pass works.
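The difference between the two strategies can be modelled with a toy buffer that counts element reads. This is only an illustrative sketch: `StorageBuffer`, `first_value_eager`, and `first_value_deferred` are hypothetical names, and the real pass operates on Slang IR rather than on Python lists.

```python
class StorageBuffer:
    """Storage-layout buffer: bools are stored as ints; counts element reads."""

    def __init__(self, ints):
        self._ints = list(ints)
        self.reads = 0

    def load(self, index):
        self.reads += 1
        return self._ints[index]

    def __len__(self):
        return len(self._ints)


def first_value_eager(buf):
    # Old behaviour: unpack the whole array into logical (bool) form first,
    # then read the one element that is actually needed.
    values = [buf.load(i) != 0 for i in range(len(buf))]
    return values[0]


def first_value_deferred(buf):
    # New behaviour: address the element, load it, and cast only that value.
    return buf.load(0) != 0
```

For a 1024-element buffer the eager version performs 1024 reads and conversions, while the deferred version performs one.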
This PR also addresses many other issues surfaced by turning on the
`transformParamsToConstRef` pass on the SPIRV backend.
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
Closes https://github.com/shader-slang/slang/issues/8477
About a 50% reduction in deserialization time for capability sets
|
(#8482)
cmake's find_package() will only find the configs when they are in a
slang/ subdirectory.
Co-authored-by: Mukund Keshava <mkeshava@nvidia.com>
|
Without this, there are functions with missing parameters in their type
in the IR after running the `introduceExplicitGlobalContext` pass:
```
[layout(%15)]
[export("_SV4test12outputBuffer")]
[nameHint("outputBuffer")]
let %outputBuffer : _ = key
[noSideEffect]
[export("_S4test7dostuffp1pi_ff")]
[nameHint("dostuff")]
func %dostuff : Func(Float, Float)
{
block %34(
[nameHint("f")]
param %f : Float,
[nameHint("kernelContext")]
param %kernelContext : Ptr(%KernelContext, 0 : UInt64, 1 : UInt64)):
let %35 : Float = mul(%f, %f)
let %36 : Ptr(ConstantBuffer(%GlobalParams, DefaultLayout), 0 : UInt64, 1 : UInt64) = get_field_addr(%kernelContext, %globalParams)
let %37 : ConstantBuffer(%GlobalParams, DefaultLayout) = load(%36)
let %38 : Ptr(RWStructuredBuffer(Float, DefaultLayout, %20)) = get_field_addr(%37, %outputBuffer)
let %39 : RWStructuredBuffer(Float, DefaultLayout, %20) = load(%38)
let %40 : Ptr(Float) = rwstructuredBufferGetElementPtr(%39, 1 : Int)
let %41 : Float = load(%40)
let %42 : Float = mul(%35, %41)
return_val(%42)
}
```
Not sure why this doesn't seem to negatively affect existing targets,
but it sure is an issue for the LLVM target I'm working on. I could've
left this fix for that PR, but I want to check now if this causes any
issues with the existing targets using the CI.
This also happens with the entry point functions, where the function
type is not updated after adding `ComputeThreadVaryingInput`. This had
no effect in the C++ target because
`convertEntryPointPtrParamsToRawPtrs(irModule);` is called right after
and fixes it.
|
The Slang compiler was segfaulting when trying to compile shaders that
return resource types (like `Texture2D`, `RWTexture2D`, `SamplerState`,
etc.) from entry point functions. This occurred because there was
missing validation that should reject such invalid return types before
they reach IR generation.
For example, this code would cause a segfault:
```slang
StructuredBuffer<Texture2D<int>> skyLight;
[shader("compute")]
Texture2D<int> computeMain(uint3 threadID : SV_DispatchThreadID)
{
return skyLight[threadID.x];
}
```
## Root Cause
The issue was in the entry point validation logic in
`validateEntryPoint()`. While there was a TODO comment indicating that
return type validation should be performed, it was never implemented.
The compiler would accept the invalid shader code and attempt to process
it during IR lowering, where resource types as return values are not
properly handled, leading to a segmentation fault.
## Solution
1. **Added robust validation**: Modified `validateEntryPoint()` in
`slang-check-shader.cpp` to use the existing
`SemanticsVisitor::getTypeTags()` functionality to check for invalid
return types by detecting `TypeTag::Opaque` and `TypeTag::Unsized` bits.
This leverages the existing type analysis infrastructure that
comprehensively handles:
- Direct resource types (Texture2D, RWTexture2D, SamplerState, etc.)
- Structs containing resource-typed fields (through type tag
propagation)
- Nested structures and complex type hierarchies
- Arrays and other composite types
2. **Added diagnostic message**: Uses existing diagnostic
`entryPointCannotReturnResourceType` (error 38010) that provides a clear
error message explaining why resource types cannot be returned from
shader entry points
3. **Updated existing tests**: Modified existing tests to match the
updated validation behavior
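The tag-propagation idea in step 1 can be sketched as follows. This is a simplified model rather than the code in `slang-check-shader.cpp`: the tuple-based type encoding, the `LEAF_TAGS` table, and the function names are assumptions made for the example.

```python
from enum import Flag


class TypeTag(Flag):
    NONE = 0
    OPAQUE = 1
    UNSIZED = 2


# Hypothetical leaf types; resource types carry the OPAQUE tag.
LEAF_TAGS = {
    "float": TypeTag.NONE,
    "int": TypeTag.NONE,
    "Texture2D": TypeTag.OPAQUE,
    "SamplerState": TypeTag.OPAQUE,
}


def get_type_tags(ty):
    """ty is a leaf name, ("struct", [field types]) or ("array", elem, count)."""
    if isinstance(ty, str):
        return LEAF_TAGS.get(ty, TypeTag.NONE)
    if ty[0] == "struct":
        tags = TypeTag.NONE
        for field in ty[1]:
            tags |= get_type_tags(field)  # tags propagate up from fields
        return tags
    if ty[0] == "array":
        _, elem, count = ty
        tags = get_type_tags(elem)
        if count is None:  # an array without a known size is unsized
            tags |= TypeTag.UNSIZED
        return tags
    raise ValueError(f"unknown type kind: {ty[0]}")


def is_valid_entry_point_return(ty):
    """Reject return types that contain opaque or unsized parts."""
    return not (get_type_tags(ty) & (TypeTag.OPAQUE | TypeTag.UNSIZED))
```

Because tags are accumulated bottom-up, a struct that nests a resource several levels deep is rejected by the same check as a direct `Texture2D` return.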
## Result
Instead of a segfault, users now get a clear, actionable error message:
```
error 38010: entry point 'computeMain' cannot return type 'Texture2D<int>' that contains resource types
```
The fix properly handles all resource types including `Texture2D`,
`RWTexture2D`, `SamplerState`, and others, while preserving the ability
to compile valid shaders that return simple data types.
Fixes #6438.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: expipiplus1 <857308+expipiplus1@users.noreply.github.com>
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
|
Treat DescriptorHandle as uint64_t instead of uint2. Implement
target-specific SPIR-V emission with the bindless texture support.
For OpImageTexelPointer, Image must have a type of OpTypePointer with
Type OpTypeImage. Fix the issue by using [constref] in __subscript.
Add test coverage for various texture/sampler handle types.
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
Generate a diagnostic warning whenever unsupported modifiers (keywords,
attributes) are found on entry point parameters. These have been
silently ignored up until now, with the parser accepting them but Slang
not actually doing anything with them.
Fixes #7151
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
Fixes #7715
Updating the Vulkan SDK on the Windows CI machines to 1.4.321.1 has
fixed some illegitimate VVL errors in the `cooperative-vector` tests,
and #8541 has fixed some legitimate VVL errors in some of those tests,
so now they can be removed from the list of expected test failures.
The only expected `cooperative-vector` failures that remain are for
`-emit-spirv-via-glsl`, as we do not support
`GLSL_NV_cooperative_vector` yet, see #7727.
|
It appears that the inputType of the coopvec-mat-mul cannot be signed
int32.
It can be a floating-point type or unsigned int32.
This changes the tests to use uint32 instead of int32.
The spec guarantees the following combinations; support for the rest
should be queried at runtime, since it depends on the hardware.
https://registry.khronos.org/vulkan/specs/latest/man/html/VkCooperativeVectorPropertiesNV.html#_description
inputType | inputInterpretation | matrixInterpretation | biasInterpretation | resultType
-- | -- | -- | -- | --
FLOAT16 | FLOAT16 | FLOAT16 | FLOAT16 | FLOAT16
UINT32 | SINT8_PACKED | SINT8 | SINT32 | SINT32
SINT8 | SINT8 | SINT8 | SINT32 | SINT32
FLOAT32 | SINT8 | SINT8 | SINT32 | SINT32
FLOAT16 | FLOAT_E4M3 | FLOAT_E4M3 | FLOAT16 | FLOAT16
FLOAT16 | FLOAT_E5M2 | FLOAT_E5M2 | FLOAT16 | FLOAT16
|
Fixes #8335
---------
Co-authored-by: Mukund Keshava <mkeshava@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
Four WGPU tests print VVL errors, which is preventing us from upgrading
the Vulkan SDK on the CI machines.
This commit puts them in expected-failure-github.txt so that we can
continue upgrading the Vulkan SDK.
They will be re-enabled when the following issues are resolved:
- https://github.com/shader-slang/slang/issues/8145
- https://github.com/shader-slang/slang/issues/8379
|
Related to
- https://github.com/shader-slang/slang/issues/8519
|
This commit removes unnecessary Load and Store pairs in IR.
When the IR looks like
```
let %1 = var
let %2 = load(%ptr)
store(%1, %2)
```
this PR replaces all uses of %1 with %ptr and removes the load and
store instructions.
However, there can be cases where %2 is still used later by other
instructions.
For these cases, the removal of load instruction relies on DCE.
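The rewrite can be sketched on a toy instruction list. This is a simplified model, not the Slang pass: instructions are hypothetical `(op, dst, *operands)` tuples, and the sketch omits the safety checks a real pass needs (for example, that the memory behind `%ptr` is not written while the copy is in use).

```python
def replace_operand(inst, old, new):
    """Return inst with every operand equal to `old` replaced by `new`."""
    op, dst, *operands = inst
    return (op, dst, *[new if o == old else o for o in operands])


def fold_load_store_pairs(insts):
    """Rewrite `var; load; store(var, loaded)` so later uses read the pointer."""
    out = list(insts)
    i = 0
    while i + 2 < len(out):
        var, load, store = out[i], out[i + 1], out[i + 2]
        matches = (
            var[0] == "var"
            and load[0] == "load"
            and store[0] == "store"
            and store[2:] == (var[1], load[1])
        )
        if matches:
            ptr = load[2]
            tail = [replace_operand(x, var[1], ptr) for x in out[i + 3:]]
            # Keep the load for now; it may still have other users.
            out = out[:i] + [load] + tail
        else:
            i += 1
    return out


def dead_code_eliminate(insts):
    """Drop `var` and `load` instructions whose result is never used."""
    out = list(insts)
    changed = True
    while changed:
        changed = False
        used = {o for _, _, *ops in out for o in ops}
        for inst in out:
            if inst[0] in ("var", "load") and inst[1] not in used:
                out.remove(inst)
                changed = True
                break
    return out
```

The fold step leaves the load in place on purpose: if `%2` has no remaining users, the separate DCE step deletes it; otherwise it stays.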
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
|
With the recent Windows runtime libraries, a new popup window started
appearing when `abort()` is called. This was observed when VVL prints a
message as a part of WGPU test.
Although it can be helpful when we want to debug, it breaks the
behavior of CI scripts when the tests are expected to continue even when
they fail. When a test fails, the CI script stops in the middle and waits
for a user to click a button in the dialog window, which cannot
happen. As a result, when there is a VVL error message, the CI run stops
prematurely.
This commit adds a new command-line argument, `-ignore-abort-msg`, that
suppresses the abort message so that the popup dialog is not shown.
From the implementation perspective, there are three places that are
related.
- slang-test itself should turn off the flag.
- render-test should turn off the flag after getting the argument from
slang-test
- test-server should turn off the flag after getting the argument from
slang-test
When test-server runs render-test, the arguments are already handled by
slang-test, so test-server needs to just pass through the arguments.
|
Added instructions to retrieve the hash value of spirv-headers.
|
This fixes a type mismatch issue. See the generated CUDA code:
```cuda
struct Query_0
{
EmptyExample_0 query_0;
uint hasNonEmptyAbsorbingBoundary_0;
};
struct Query_1
{
uint hasNonEmptyAbsorbingBoundary_0;
};
struct GlobalParams_0
{
Query_0* gQuery_0;
RWStructuredBuffer<float3 > gInput_0;
RWStructuredBuffer<float> gOutput_0;
};
...
Query_1 _S4 = *globalParams_0->gQuery_0; // ==> type mismatch at call site!
```
**Root Cause:** During the empty type legalization pass in Slang's IR
processing, struct types were being optimized (e.g., `Query_0` →
`Query_1` with the empty type removed), but this created an inconsistency:
- **Function parameters were updated:** When the `Query_compute_0` function
was legalized, its parameter type was correctly updated from `Query_0` to
the optimized `Query_1`.
- **Global parameter types were NOT updated:** The
`ParameterBlock<Struct>` type in `globalParams` still referenced the old
`Query_0` type.
The PR adds special handling for type operands in the `legalizeInst`
function. This triggers the legalization of the `StructType` from the
original `legalizeOperand` call site. The legalized result will be saved
in the type-to-legal-type map and be re-used when the same type requires
legalization again (e.g. in the `IRFunc` as parameter)
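A minimal sketch of the memoization idea: whichever site legalizes a struct first stores the result, and every later request for the same type gets the same answer. The `EmptyTypeLegalizer` class, the string-based type names, and the `_legal` suffix are hypothetical and do not mirror the compiler's data structures.

```python
class EmptyTypeLegalizer:
    def __init__(self, structs):
        # struct name -> list of (field name, field type name)
        self.structs = dict(structs)
        # the type-to-legal-type map shared by all call sites
        self.legal = {}

    def is_empty(self, type_name):
        """A struct is empty when all of its fields are empty structs."""
        fields = self.structs.get(type_name)
        return fields is not None and all(self.is_empty(t) for _, t in fields)

    def legalize(self, type_name):
        if type_name in self.legal:
            return self.legal[type_name]  # re-use the earlier result
        fields = self.structs.get(type_name)
        if fields is None:
            result = type_name  # not a struct: nothing to strip
        else:
            kept = [(n, self.legalize(t)) for n, t in fields
                    if not self.is_empty(t)]
            if kept == fields:
                result = type_name
            else:
                result = type_name + "_legal"
                self.structs[result] = kept
        self.legal[type_name] = result
        return result
```

In this model, legalizing `Query` once for the global parameter and once for the function parameter yields the same stripped struct, which is the consistency the fix restores.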
Fixes: https://github.com/shader-slang/slang/issues/7905
|
Close #8201.
This PR unifies the lowering logic for a LookupDeclRef of an interface
requirement. We will always lower this AST node to a
LookupWitness IR. The key of this IR is the special witnessTableType
`ThisTypeWitness`; this witness table is simply a wrapper for an
interface type. Our current specialization pass doesn't handle this kind
of LookupWitness IR at all, so we also add specialization of
this_type IR.
|
Close #8193.
When constructing a `TransitiveTypeWitness` node, we should check whether
any operand represents two equal types. Currently, we only check
whether the operand is a `TypeEqualityWitness`, which is not good enough,
because a `DeclaredSubtypeWitness` could also represent two identical
types; in that case, we should also constant-fold this kind of witness.
If we fail to do so, we could end up generating a lookup
witness IR on a generic parameter that is not supposed to be looked up.
|
Closes #8500.
`slang-ir-translate-global-varying-var.cpp` turns the global varying
outputs into a struct that's returned from the entry point. Currently,
there's a problem when one of the outputs is a struct. It always creates
a generic `IRTypeLayout`, even when a correct type layout already
exists. Somehow, this appears to work when the global varying outputs
aren't structs.
The crash occurs in
`slang-ir-glsl-legalize.cpp:createGLSLGlobalVaryingsImpl()`. It
correctly handles the generated outer struct, but when that contains an
inner struct, it's been given a non-struct type layout and crashes.
This PR uses the correct layout if found, instead of generating a broken
placeholder. This matches the behaviour that has already been
implemented for inputs.
Additionally, I removed a call to `addResourceUsage` from both the input
and output side. I can't see any way in which it would've affected
anything, the layout builder is never used after that call and it
doesn't retroactively modify the layout that was already created.
|
Overview
========
This change is the start of an attempt to address how the Slang compiler
codebase has ended up conflating two similar, but semantically distinct,
concepts:
* The long-standing notion of `ref` parameters (only allowed for use in
the builtin modules), which are encoded using a wrapper `Type` in the
AST as part of the representation of the parameters of a `FuncType`.
* A recently-introduced notion of explicit reference types that mirror
the built-in `Ptr` type, with a relationship comparable to that between
pointer and reference types in C++.
The change splits the `Ref<T>` type in the core module into two distinct
types, with one for each of the two use cases. Similarly, the `RefType`
class in the compiler's AST is split into two distinct classes, to
represent the two cases.
Background
==========
The `Ref<T>` type in the core module (hidden and not intended for users
to ever see or use) was originally introduced to encode the `ref`
parameter-passing mode, comparable to the hidden `Out<T>` and `InOut<T>`
types used to encode `out` and `inout` parameter-passing modes. The
`Ref<T>` type in the core module was encoded as an instance of the
`RefType` class in the Slang AST (similar to how `Out<T>` mapped to an
`OutType`). These AST classes were *only* intended to be used by the
compiler front-end as part of its encoding of function types. The
`FuncType` class needed a way to distinguish an `inout int` parameter
from a plain (implicitly `in`) `int` parameter, so wrappers like
`RefType` and `OutType` were introduced to encode both the parameter
type (`T`) and the parameter-passing mode in a form that could be passed
around as a `Type`.
Notably, the `Ref<T>` type (and `Out<T>`, etc.) were *not* intended to
be type names that ever get uttered in Slang code (not even in the
builtin modules), and the vast majority of the compiler code was not
supposed to ever encounter them. They were an implementation detail of
`FuncType`, and nothing else.
(In hindsight it may have been a mistake to use a nominal type declared
in the core module to implement these wrappers; it might have been a
good idea to use an entirely separate class of `Type` for this case...)
Recent changes to the builtin modules introduced functions that wanted
to *return* a reference (so that the parameter-passing-mode modifiers
like `ref` could not trivially be used), and as part of those changes
the appealingly-named `Ref<T>` type in the core module was re-used for
this new case. Builtin operations were declared with an explicit
`Ref<T>` return type, and parts of the compiler front-end that had
previously been blissfully unaware of the AST's `RefType` (and
`InOutType`, etc.) had to start accounting for the possibility that an
explicit `Ref<T>` would show up.
Related changes also introduced a comparable conflation of the
(unfortunately-named) `constref` parameter-passing modifier and builtin
operations that wanted to return an explicit reference that is
read-only. Both use cases were mapped to the core-module `ConstRef<T>`
type, which appeared in the AST as an instance of the `ConstRefType`
class.
The overlapping use of `ConstRef<T>` is actually significantly more
troublesome than the `Ref<T>` case because, despite what its name
implies, `constref` was not really supposed to be the read-only analogue
of `ref`, but rather it is closer to the "immutable value borrow"
analogue to `inout`'s "mutable value borrow." The semantics of a "value
borrow" vs. a "memory reference" in Slang have not been very carefully
codified, and the conflation around `ConstRef<T>` has contributed to
things becoming increasingly muddy in the compiler back-end.
Main Changes
============
Core Module
-----------
The `Ref<T>` type has been replaced with two distinct types, with one
for each use case:
* `RefParam<T>` is intended for use when encoding a `ref` parameter in a
function type
* `ExplicitRef<T>` is intended for use when an operation in a builtin
module wants to return a reference
The other types used to represent parameter-passing modes (e.g.,
`InOut<T>`) were renamed to better indicate their role in defining
parameter types (e.g., `InOutParam<T>`).
The `ExplicitRef<T>` type was given additional generic parameters for
the allowed access and the address space, akin to what `Ptr<T>` now
supports. The pointer dereference operator (prefix `*`) in the core
module should now properly propagate the access and address space of the
pointer over to the reference that gets returned.
The two distinct use cases of `ConstRef<T>` were not split in the same way as
`Ref<T>`, instead the case for the `constref` parameter-passing mode
uses `ConstParamRef<T>`, while cases that previously used `ConstRef<T>`
to represent a read-only explicit reference instead now use
`ExplicitRef<T, Access.Read>`.
Prior to this change there were two subscripts declared on pointers: one
in the `Ptr` type itself, and another in an `extension` for pointers
with `Access.ReadWrite`. The comments on the code seemed to indicate
that the catch-all subscript used to only have a `get` accessor, while
the `ref` was only available on read-write pointers, but it seems that
subsequent changes converted the default subscript to support `ref`.
This change eliminates the subscript added via `extension`, since it is
redundant.
AST and Front-End
-----------------
Similar to the changes in the core module, the AST `RefType` class was
split into:
* `RefParamType` for the case of encoding `ref` parameters
* `ExplicitRefType` for the case where the user meant an explicit
reference type
All the other classes that represent wrappers for encoding
parameter-passing modes (e.g., `OutType`) were similarly renamed (e.g.,
`OutParamType`).
The `ConstRefType` class was simply renamed to `ConstRefParamType`,
because any use cases of `ConstRefType` that intended an explicit
reference type will now use `ExplicitRefType` with `Access.Read`.
For convenience, this change includes type aliases to map the old names
for these types over to the new ones (e.g., `using OutType =
OutParamType`) so that the change doesn't need to affect quite so many
lines of code. The `RefType` and `ConstRefType` names are intentionally
left undefined, since it would be unsafe to assume that existing use
sites should default to either of the two possible interpretations.
All use cases of `RefType` and `ConstRefType` (and their former shared
base class `RefTypeBase`) were audited and updated to refer to either
`RefParamType`/`ConstRefParamType` or `ExplicitRefType`, as appropriate
(based on whether the context of the code indicated it was working with
parameter-passing mode wrapper types, or explicit reference types).
In many (many) cases comments were added to the code that was updated
(and some unrelated code that needed to be audited along the way) to
note cases where there appears to be something fishy going on in the
compiler and/or there are obvious opportunities for next-step
improvement.
The `QualType` constructor used to infer l-value-ness when passed a
`RefType` or `ConstRefType`; that code was introduced to support
explicit reference types. The code was updated to consult the access
argument of an `ExplicitRefType` to try and determine the right
l-value-ness to use. There is some ambiguity about what should be done
in the case where the value of the generic argument representing the
access cannot be statically determined; a better solution may be needed.
Many other cases in the front-end that were working with `RefType` and
`ConstRefType` for explicit references also need to figure out
l-value-ness, and these were changed to rely on the logic already added
to `QualType` so that it wouldn't have to be duplicated. It isn't clear
if this structure is the best way to tackle the problem, but it seems to
at least be an upgrade over the more strictly ad-hoc logic that was in
place before.
Future Work
===========
IR-Level Work
-------------
The most obvious next step to take is that the split that was made in
the compiler front-end needs to be properly plumbed through all of the
back-end. There appears to be a lot of code in the back end of the
compiler that has made the same conflation of `ref` parameters and
explicit reference types that the front-end did. In practice, any uses
of `ExplicitRef<T>` in the front-end should desugar into plain
pointer-based code in the IR.
Clean Up Parameter-Passing Modes
--------------------------------
The code that handles different parameter-passing modes
(`ParameterDirection`s) and their wrapper types is somewhat scattered
and messy (as found while auditing use cases of `RefType`). A cleanup
pass is warranted to ensure that most code only needs to think about
`ParameterDirection`s. There should ideally be only a single operation
in the front-end that handles determining the `ParameterDirection` of a
parameter based on its modifiers. Similarly, there should be one
operation to wrap a value type based on a parameter direction, and one
operation to derive a `ParameterDirection` from the wrapper type.
Ideally, the accessors for `FuncType` should not provide unrestricted
access to the potentially-wrapped parameter types, and should instead
return some kind of `ParamInfo` struct that encodes both a
`ParameterDirection` and the unwrapped `Type` of the parameter.
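As a rough illustration of the accessor shape described above, the following is a minimal sketch and not the compiler's actual API. `ParamInfo` is the name suggested in the text; the enumerators of `ParameterDirection`, the `Type` stand-in, and `FuncTypeSketch` are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical enumerators; the real set of directions may differ.
enum class ParameterDirection { In, Out, InOut, Ref, ConstRef };

// Stand-in for the compiler's type representation.
struct Type
{
    std::string name;
};

// Pairs a direction with the *unwrapped* value type of a parameter, so
// callers never need to inspect wrapper types themselves.
struct ParamInfo
{
    ParameterDirection direction;
    const Type* valueType;
};

// Hypothetical function type that only exposes parameters as ParamInfo.
class FuncTypeSketch
{
public:
    void addParam(ParameterDirection direction, const Type* valueType)
    {
        m_params.push_back({direction, valueType});
    }

    std::size_t getParamCount() const { return m_params.size(); }

    // The only way to look at a parameter: direction and unwrapped type together.
    ParamInfo getParamInfo(std::size_t index) const { return m_params.at(index); }

private:
    std::vector<ParamInfo> m_params;
};
```

With an accessor like this, code that walks parameters reads `info.direction` and `info.valueType` directly instead of pattern-matching on wrapper types.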
Clean Up `QualType`
-------------------
A significant piece of future work that appears required is to
drastically clean up and improve the way that `QualType`s are represented
and handled in the front-end. There are currently various distinct
`bool` flags in `QualType` (some with very unclear meaning) and
different parts of the codebase consult/modify only subsets of them; a
clear enumeration of the "value categories" (to use the C++ terminology)
that Slang supports could be quite helpful. Naively, a `QualType` should
at least encode the basic information that a `Ptr` type encodes:
* A value type
* Allowed access (read-only, read-write, etc.)
* Address space
The main additional thing that a `QualType` needs is a way to
distinguish cases where an expression evaluates to:
* A reference to a memory location, where all the information from a
`Ptr` is relevant
* A simple value, such that the access and address space are irrelevant
* A reference to an abstract storage location (a `property`,
`subscript`, or an implicit conversion that needs to support being an
l-value), in which case address space is irrelevant and the "allowed
access" basically amounts to a listing of the accessors the storage
location supports
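The three cases listed above could be modeled with an explicit value-category enumeration instead of independent `bool` flags. This is only a sketch under stated assumptions: every name below (`ValueCategory`, `Access`, `AddressSpaceSketch`, `QualTypeSketch`) is hypothetical, and the accessor listing for abstract storage is simplified to the same read-only/read-write access field.

```cpp
#include <cassert>
#include <string>

// Hypothetical enumeration of the cases described in the text.
enum class ValueCategory
{
    Value,              // simple value: access and address space are irrelevant
    MemoryRef,          // reference to a memory location: all fields are relevant
    AbstractStorageRef  // property/subscript-like storage: address space is irrelevant
};

enum class Access { ReadOnly, ReadWrite };

// Illustrative address spaces only; not the compiler's actual list.
enum class AddressSpaceSketch { Generic, Global, ThreadLocal };

struct QualTypeSketch
{
    std::string valueType;           // stand-in for the real value type
    ValueCategory category;
    Access access;                   // ignored when category == Value
    AddressSpaceSketch addressSpace; // meaningful only for MemoryRef

    // An expression is assignable only if it refers to storage that allows writes.
    bool isLValue() const
    {
        return category != ValueCategory::Value && access == Access::ReadWrite;
    }
};
```

The point of the design is that l-value-ness becomes a derived property of one category plus one access field, rather than a flag that each part of the codebase has to keep in sync by hand.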
Eliminate Explicit Reference Types
----------------------------------
Finally, we should eventually eliminate the `ExplicitRef<T>` type from
the core module (and all of the supporting code from the front-end),
since the feature is not a good fit for the Slang language. We should
find some other way to decorate operations in the builtin module that
need to return a reference rather than a value (note how `ref`
accessors already avoided exposing explicit reference types, by design).
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
# Add RHI Device Caching and Test Prefix Exclusion
## Summary
This PR introduces two key improvements to the Slang test
infrastructure:
1. **RHI Device Caching**: Implements device caching to significantly
speed up test execution by reusing graphics devices across tests. **RHI
Device Caching reduces slang-test execution time from ~15 minutes to ~5
minutes in Windows release builds.**
2. **Test Prefix Exclusion**: Adds `-exclude-prefix` option to skip
tests matching specified path prefixes
## Changes
### RHI Device Caching
- **New `DeviceCache` class** (`slang-test-device-cache.h/cpp`):
Thread-safe device cache with LRU eviction (max 10 devices)
- **Cache control option**: `-cache-rhi-device` flag in both
`slang-test` and `render-test`
- Default: **enabled** in slang-test, **disabled** in render-test when
run standalone
- Automatically skips caching for CUDA devices (due to driver issues)
- **Performance benefit**: Eliminates expensive device
creation/destruction cycles, especially beneficial for Vulkan on Tegra
platforms
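For readers unfamiliar with the pattern, a thread-safe cache with LRU eviction can be sketched as below. This is not the code in `slang-test-device-cache.h/cpp`; `DeviceCacheSketch`, `FakeDevice`, and the string key are hypothetical stand-ins, and only the limit of 10 entries is taken from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Stand-in for a graphics device; the real cache would hold RHI devices.
struct FakeDevice
{
    std::string key;
};

class DeviceCacheSketch
{
public:
    explicit DeviceCacheSketch(std::size_t maxDevices = 10) : m_maxDevices(maxDevices) {}

    // Returns the cached device for `key`, creating it on a miss.
    std::shared_ptr<FakeDevice> acquire(const std::string& key)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto found = m_index.find(key);
        if (found != m_index.end())
        {
            // Cache hit: mark as most recently used by moving it to the front.
            m_entries.splice(m_entries.begin(), m_entries, found->second);
            return found->second->second;
        }
        // Cache miss: create the device and insert it at the front.
        auto device = std::make_shared<FakeDevice>(FakeDevice{key});
        m_entries.emplace_front(key, device);
        m_index[key] = m_entries.begin();
        // Evict the least recently used entry once the limit is exceeded.
        if (m_entries.size() > m_maxDevices)
        {
            m_index.erase(m_entries.back().first);
            m_entries.pop_back();
        }
        return device;
    }

    bool contains(const std::string& key) const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_index.count(key) != 0;
    }

    std::size_t size() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_entries.size();
    }

private:
    using Entry = std::pair<std::string, std::shared_ptr<FakeDevice>>;

    std::size_t m_maxDevices;
    mutable std::mutex m_mutex;
    std::list<Entry> m_entries; // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> m_index;
};
```

The list keeps usage order and the map gives constant-time lookup; a single mutex guards both so the cache can be shared between test threads.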
### Test Prefix Exclusion
- **New `-exclude-prefix <prefix>` option** in slang-test
- Allows excluding entire test directories or patterns from execution
- Complements existing `-category` and individual test filtering options
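The filtering itself amounts to a prefix comparison against each excluded path. A minimal sketch follows, assuming tests are identified by their path string; `isExcludedByPrefix` is a hypothetical helper and not the function used in slang-test.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true if `testPath` starts with any of the excluded prefixes.
bool isExcludedByPrefix(
    const std::string& testPath,
    const std::vector<std::string>& excludedPrefixes)
{
    for (const auto& prefix : excludedPrefixes)
    {
        // compare() looks at no more than prefix.size() leading characters,
        // so a path shorter than the prefix never matches.
        if (testPath.compare(0, prefix.size(), prefix) == 0)
            return true;
    }
    return false;
}
```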
### Usage Examples
```bash
# Enable device caching (default)
slang-test
# Disable device caching
slang-test -cache-rhi-device false
# Exclude tests from specific directories
slang-test -exclude-prefix tests/problematic/
slang-test -exclude-prefix tests/slow/ -exclude-prefix tests/experimental/
```
This change should significantly improve test execution performance,
particularly in CI environments with frequent device operations. This is
needed for running the GPU tests on aarch64, where repeated device
creation/destruction is causing driver issues.
Needed by: https://github.com/shader-slang/slang/issues/8346
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| | |
|
| |
|
| |
I think the commit diff speaks for itself.
|
| |
|
|
|
|
| |
It turned out that the SPIRV-Headers update has to be done after
building SPIRV-Tools.
Co-authored-by: Gangzheng Tong <tonggangzheng@gmail.com>
|
| |
Before:
- Uses `LOAD_LIBRARY_SEARCH_USER_DIRS` in `LoadLibraryExW`, which might
cause an exception if no paths have been added by `AddDllDirectory()`
After:
- Use the composite flag `LOAD_LIBRARY_SEARCH_DEFAULT_DIRS`, which
searches several locations.
- Will still search directories added by `AddDllDirectory()`, but avoids
searching an empty path list if there are no `AddDllDirectory()` calls.
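A minimal sketch of the "after" call is shown below. `loadWithDefaultSearchDirs` is a hypothetical wrapper, not the function changed in this commit; the non-Windows branch exists only so the example compiles everywhere. `LOAD_LIBRARY_SEARCH_DEFAULT_DIRS` combines the application directory, `System32`, and the user directories registered with `AddDllDirectory()`.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical wrapper: returns an opaque module handle, or nullptr if the
// library cannot be loaded (always nullptr on non-Windows platforms).
void* loadWithDefaultSearchDirs(const wchar_t* libraryName)
{
#ifdef _WIN32
    // Search the application directory, System32, and any directories
    // added via AddDllDirectory(), rather than the user directories alone.
    HMODULE module = LoadLibraryExW(libraryName, nullptr, LOAD_LIBRARY_SEARCH_DEFAULT_DIRS);
    return reinterpret_cast<void*>(module);
#else
    (void)libraryName;
    return nullptr;
#endif
}
```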
Related to https://github.com/shader-slang/slang/issues/8462
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
| |
This allows doc-only changes to pass CI by skipping jobs.
|
| |
Remove SYSTEM flag from SPIRV-Headers to fix macOS header precedence.
When the path is registered as SYSTEM, it is passed with the `-isystem`
option instead of `-I`, which gives it lower search precedence on macOS.
When spirv.h is also installed in a system directory, the build ends up
using the system-installed spirv.h, which is most likely older than the
version we should use.
|
| |
Fixes https://github.com/shader-slang/slang/issues/8472
Fixes an issue with GCC 9.4.0 on Ubuntu 20.04, which throws an error
about PATH_MAX not being declared.
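The commit message does not say how the error was resolved. One common way to address this class of error, shown here purely as an assumption, is to include `<limits.h>` explicitly instead of relying on another header to pull it in, with a guarded fallback for platforms that do not define `PATH_MAX`. The fallback value and the buffer name are illustrative.

```cpp
#include <cassert>
#include <limits.h> // declares PATH_MAX on Linux with glibc

// Assumed fallback for platforms where <limits.h> does not define PATH_MAX.
#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

// Example use: a fixed-size buffer large enough for any path.
char g_pathBuffer[PATH_MAX];
```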
Co-authored-by: Mukund Keshava <mkeshava@nvidia.com>
|
| |
files (#7957)
This PR implements the requested fix for issue #7923 where
DebugCompilationUnit incorrectly referenced header files instead of the
main shader file.
## Summary
- Modified IRDebugSource to include isIncludedFile flag as third operand
- Updated emitDebugSource function to accept and pass the included file
flag
- Updated call sites to use source->isIncludedFile() from SourceFile
class
- Modified SPIR-V emission to only create DebugCompilationUnit for
non-included files
## Test Results
The fix has been verified with the provided reproducer code. The SPIR-V
output now correctly shows DebugCompilationUnit referencing the main
shader file instead of header files.
Generated with [Claude Code](https://claude.ai/code)
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Lujin Wang <lujinwangnv@users.noreply.github.com>
Co-authored-by: Claude Code <claude@anthropic.com>
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
Fix CUDA global variable initialization with constructor calls
Resolves CUDA compilation failure where global variables with struct
constructor initialization generated illegal `__device__` variable
runtime initialization.
**Problem:**
```cuda
// Generated invalid CUDA code:
__device__ static const Stuff_0 gStuff_0 = Stuff_x24init_0(args...);
// Error: "dynamic initialization is not supported for a __device__ variable"
```
Root Cause Discovered:
Through extensive debugging, found that
moveGlobalVarInitializationToEntryPoints pass only handled
kIROp_GlobalVar instructions, but global constants with constructor
calls appeared as kIROp_Call instructions at module scope.
Solution:
1. IR Pipeline Fix: Extended moveGlobalVarInitializationToEntryPoints to
detect and transform module-level constructor calls into proper global
variables with entry-point initialization
2. Field Access Fix: Enhanced kIROp_FieldExtract logic to emit correct
`->` syntax for pointer types and address-of operations
3. Constructor Emission: Added CUDA-specific handling for constructor
calls
Architecture:
- Transforms let %gStuff = call %Constructor(...) into kernel context
initialization
- Moves runtime initialization from global scope to entry-point
execution
- Follows CUDA best practices for global state management
Files:
- source/slang/slang-ir-explicit-global-init.cpp: Extended IR
transformation pass
- source/slang/slang-emit-c-like.cpp: Enhanced field access and foldable
value logic
- source/slang/slang-emit-cuda.cpp: Added CUDA-specific field extraction
handling
Result:
// Now generates proper CUDA code:
struct KernelContext_0 { Stuff_0 gStuff_1; };
// Runtime initialization in entry point:
kernelContext_1.gStuff_1 = constructor_call();
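The shape of the transformation can be illustrated in plain host-side C++ (used here instead of CUDA so the sketch runs anywhere). The names `Stuff`, `makeStuff`, `KernelContext`, and `entryPoint` are hypothetical and only mirror the generated names above; this is not the emitted code.

```cpp
#include <cassert>

struct Stuff
{
    int a;
    int b;
};

// Stand-in for a struct constructor call whose result is only known at run time.
Stuff makeStuff(int a, int b)
{
    return Stuff{a, b};
}

// Instead of a dynamically initialized global, the value becomes a field of a
// context struct that is owned by the entry point.
struct KernelContext
{
    Stuff gStuff;
};

// Stand-in for a kernel entry point: initialization happens here, at run
// time, before any code reads the former global.
int entryPoint(int a, int b)
{
    KernelContext context;
    context.gStuff = makeStuff(a, b);
    return context.gStuff.a + context.gStuff.b;
}
```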
Fixes: tests/compute/type-legalize-global-with-init.slang
|
| |
Add WASM FS module support for slang-playground
This change adds the necessary Emscripten build flags to export the
FileSystem (FS) module interface in the slang-wasm build:
- Adds -sMODULARIZE=1 to enable modular builds
- Adds -sEXPORTED_RUNTIME_METHODS=['FS'] to export the FS interface
These changes are required to support the slang-playground.
The existing flags are also reformatted for better readability.
Related to https://github.com/shader-slang/slang-playground/issues/170
|
| |
This PR resolves a configuration conflict where the
SLANG_ENABLE_SPIRV_TOOLS_MIMALLOC option was defined twice with
different defaults, causing user settings to be overridden.
Changes:
- Removed duplicate definition in external/CMakeLists.txt that only
applied to WIN32
- Define default in main CMakeLists.txt
- ON by default for Windows
- OFF by default for other platforms (still need to resolve crashes so
set it to `OFF` for now)
- Preserved user configuration: explicit
`-DSLANG_ENABLE_SPIRV_TOOLS_MIMALLOC=value` settings are now properly
respected
Related to https://github.com/shader-slang/slang/issues/8158
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
When Slang detects an assignment to a mesh output reference on Metal,
generate a diagnostic message. (Metal mesh shader outputs must be
assigned via 'set' instead of 'ref'.)
Fixes #7498
|
| |
- Remove hard-coded Win SDK version.
- Use a self-hosted machine for building the release package on Windows.
- Remove the steps from release.yml that are already done in the common
setup.
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
Enable GetLssPositionsAndRadii() call in rayGenLssIntrinsicsHitObject
shader that was previously commented out. This fixes the failing
ray-tracing-lss-intrinsics-hit-object test which was returning all zero
values for LSS position and radius data.
The hit object LSS intrinsics are now working correctly in D3D12
backend, returning proper endcap positions and radii values as expected
by the test. All 27 test assertions now pass successfully.
Fixes #8128
|
| |
Fixes #8406 (and #8410).
`AddressSpace`, `MemoryScope` and `AccessQualifier` are no longer
`BaseType`.
I added a new `__magic_enum` (very similar to `__magic_type`) syntax to
be able to easily create values of these enums from the compiler. (I
don't know if it was the right way to do it, but it works and the
changes are small enough.)
I had a weird bug: `tests/language-feature/capability/address-of.slang`
was failing in `IRBuilder::_findOrEmitConstant(IRConstant& keyInst)`.
When needing a new `u64(0)`, it did not find it in the `ConstantMap`
first, but then failed to add it right after because it already existed
in the map! But this was triggered by `IRPtrType*
IRBuilder::getPtrType(IROp op, IRType* valueType, AccessQualifier
accessQualifier, AddressSpace addressSpace)`, which is a strange
coincidence... but I could not find the issue in what I did. I ended up
bumping unordered_dense, and it solved the issue (so there was a bug in
there).
|
| | |
|
| |
requirements (#7269)
## Summary
This PR enhances constexpr validation by adding proper error checking
when function arguments cannot satisfy constexpr parameter requirements,
addressing issue #6370.
## Problem
Previously, when a function declared constexpr parameters, the compiler
would attempt to propagate constexpr-ness to the call site arguments,
but there was insufficient validation and error reporting when this
propagation failed. This could lead to silent failures where constexpr
requirements weren't properly enforced.
## Solution
This PR adds checks that:
1. **Validates constexpr arguments**: When a function parameter is
marked as `constexpr`, the compiler now explicitly checks that the
corresponding argument can be marked as `constexpr`
2. **Issues clear compilation errors**: Added a new diagnostic
(`Diagnostics::argIsNotConstexpr`)
3. **Handles both call scenarios**: The validation works for both:
- Direct function calls with IR-level function definitions
- Calls to functions from external modules
Fixes #6370
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|