slang.git/tests/compute, branch master

implement dot products for 1 vectors (#8599)

2025-10-10T16:42:43+00:00

Closes https://github.com/shader-slang/slang/issues/8378

Fix DerivativeGroupQuadsKHR workgroup size validation for texture sampling (#8647)

2025-10-08T23:18:50+00:00

Fixes #8545 where Slang generates SPIR-V with DerivativeGroupQuadsKHR execution mode but doesn't validate workgroup sizes when texture sampling triggers automatic derivative computation.

**Root Cause**: Validation code was looking for IRNumThreadsDecoration on the wrong IR node

**Fix**: One-line change in slang-emit-spirv.cpp to search decoration on entryPoint instead of entryPointDecor

**Tests**: Added regression tests for both quad and linear derivative group validation

Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Lujin Wang
Co-authored-by: slangbot
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
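
For readers unfamiliar with the workgroup-size rule being validated, here is a minimal C++ sketch. It is not the code in slang-emit-spirv.cpp: `DerivativeGroup`, `WorkgroupSize`, and `isValidForDerivativeGroup` are hypothetical names, and the divisibility rules are assumed from the compute-shader-derivatives extension (quads: x and y each a multiple of 2; linear: total thread count a multiple of 4).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration; Slang's real IR uses IRNumThreadsDecoration.
enum class DerivativeGroup { Quads, Linear };

struct WorkgroupSize { uint32_t x, y, z; };

// Returns true when the [numthreads] values satisfy the assumed rule
// for the given derivative group execution mode.
bool isValidForDerivativeGroup(DerivativeGroup group, WorkgroupSize size)
{
    // A workgroup dimension of zero is never legal.
    assert(size.x > 0 && size.y > 0 && size.z > 0);

    switch (group) {
    case DerivativeGroup::Quads:
        // 2x2 quads are formed in the x/y plane.
        return size.x % 2 == 0 && size.y % 2 == 0;
    case DerivativeGroup::Linear:
        // Quads are formed from 4 consecutive linear invocation indices.
        return (uint64_t(size.x) * size.y * size.z) % 4 == 0;
    }
    return false;
}
```

Under these assumptions, `[numthreads(2,2,1)]` passes the quad check and `[numthreads(3,2,1)]` fails it, while `[numthreads(4,1,1)]` passes the linear check.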

Rewriting the lower-buffer-element-type pass to avoid unnecessary packing/unpacking. (#8526)

2025-09-30T00:45:08+00:00

Part of the effort to improve the performance of generated SPIRV code.

The existing lower-buffer-element-type pass works by loading the entire buffer element content from memory and translating it to the logical type stored in a local variable at the earliest reference of a buffer handle. This means that it can generate inefficient code that reads more than necessary. Consider this example:

```
struct BigStruct
{
    bool values[1024];
}
ConstantBuffer<BigStruct> cb;
void test(BigStruct v)
{
    if (v.values[0]) { printf("ok"); }
}
[numthreads(1,1,1)]
void computeMain()
{
    test(cb);
}
```

In IR, the `computeMain` function before the lower-buffer-element-type pass looks something like the following:

```
func test:
    %v = param : BigStruct
    %barr = fieldExtract(%v, "values")
    %element = elementExtract(%barr, 0)
    ... // uses %element

func computeMain:
    %v = load(cb)
    call %test %v
```

The existing lower-buffer-element-type pass will rewrite the bool array in `BigStruct` into an `int` array so it is legal in SPIRV. However, it does so by inserting the translation on the first `load` of the constant buffer:

```
struct BigStruct_std430
{
    int values[1024];
}
var cb : ConstantBuffer<BigStruct_std430>;

func computeMain:
    %tmpVar : var
    call %unpackStorage(%tmpVar, cb)
    %v : BigStruct = load %tmpVar
    call %test %v
```

This means that the entire array will be loaded and translated to int before calling `test`, which only uses one element. It turns out that the downstream compiler isn't always able to optimize out this inefficient translation/copy.

This PR completely rewrites the way buffer-element-type lowering is handled to avoid producing this inefficient code. It works in two parts: first, we turn on the `transformParamsToConstRef` pass for the SPIRV target as well, so the `test` function is translated to take the `v` parameter as `constref`. The second part is a redesigned buffer-element-type pass that defers the storage-type to logical-type translation until a value is actually used by a `load` instruction.

In this example, after `transformParamsToConstRef`, the IR is:

```
func test:
    %v = param : ConstRef
    %barr = fieldAddr(%v, "values")
    %elementPtr = elementAddr(%barr, 0)
    %element = load(%elementPtr)
    ... // uses %element

func computeMain:
    call %test %cb
```

The new `buffer-element-type-lowering` pass will take this IR, insert the translation at the latest possible time across the entire call graph, and translate the IR into:

```
func test:
    %v = param : ConstRef
    %barr = fieldAddr(%v, "values")
    %elementPtr : ptr = elementAddr(%barr, 0)
    %element_int = load(%elementPtr)
    %element = cast(%element_int) : %bool
    ... // uses %element

func computeMain:
    call %test %cb
```

In this new IR, there is no longer a load and conversion of the entire array.

See the new comment in `slang-ir-lower-buffer-element-type.cpp` for more details of how the pass works.

This PR also addresses many other issues surfaced by turning on the `transformParamsToConstRef` pass on the SPIRV backend.

---------

Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
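
The difference between the two strategies can be mimicked on the host. The sketch below is only an analogy in plain C++, not the pass itself: `BigStructStorage`, `unpackStorage`, `loadElement`, and the `conversions` counter are hypothetical names introduced to count how many int-to-bool conversions each strategy performs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCount = 1024;

// Storage layout: bools are kept as ints, as in BigStruct_std430.
struct BigStructStorage { std::array<int32_t, kCount> values; };

// Logical layout seen by user code.
struct BigStruct { std::array<bool, kCount> values; };

// Old strategy: translate the whole element on the first load of the buffer.
BigStruct unpackStorage(const BigStructStorage& storage, std::size_t& conversions)
{
    BigStruct logical{};
    for (std::size_t i = 0; i < kCount; ++i) {
        logical.values[i] = storage.values[i] != 0;
        ++conversions;
    }
    return logical;
}

// New strategy: compute the element address in storage, load the int,
// and cast only that one value at the point of use.
bool loadElement(const BigStructStorage& storage, std::size_t index, std::size_t& conversions)
{
    assert(index < kCount);
    const int32_t* elementPtr = &storage.values[index];
    ++conversions;
    return *elementPtr != 0;
}
```

Reading `values[0]` through `unpackStorage` converts all 1024 elements, whereas `loadElement` converts exactly one, which is the saving the deferred translation is after.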

Fix CUDA global variable initialization with constructor calls (#8340)

2025-09-18T15:46:44+00:00

Resolves CUDA compilation failure where global variables with struct constructor initialization generated illegal `__device__` variable runtime initialization.

**Problem:**
```cuda
// Generated invalid CUDA code:
__device__ static const Stuff_0 gStuff_0 = Stuff_x24init_0(args...);
// Error: "dynamic initialization is not supported for a __device__ variable"
```

**Root Cause:** The moveGlobalVarInitializationToEntryPoints pass only handled kIROp_GlobalVar instructions, but global constants with constructor calls appeared as kIROp_Call instructions at module scope.

**Solution:**
1. IR Pipeline Fix: Extended moveGlobalVarInitializationToEntryPoints to detect and transform module-level constructor calls into proper global variables with entry-point initialization
2. Field Access Fix: Enhanced kIROp_FieldExtract logic to emit correct -> syntax for pointer types and address-of operations
3. Constructor Emission: Added CUDA-specific handling for constructor calls

**Architecture:**
- Transforms `let %gStuff = call %Constructor(...)` into kernel context initialization
- Moves runtime initialization from global scope to entry-point execution
- Follows CUDA best practices for global state management

**Files:**
- source/slang/slang-ir-explicit-global-init.cpp: Extended IR transformation pass
- source/slang/slang-emit-c-like.cpp: Enhanced field access and foldable value logic
- source/slang/slang-emit-cuda.cpp: Added CUDA-specific field extraction handling

**Result:**
```cuda
// Now generates proper CUDA code:
struct KernelContext_0 { Stuff_0 gStuff_1; };
// Runtime initialization in entry point:
kernelContext_1.gStuff_1 = constructor_call();
```

Fixes: tests/compute/type-legalize-global-with-init.slang
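
The shape of the transformation can be sketched in host-side C++. This is an illustration under assumptions, not the generated CUDA: `Stuff`, `makeStuff`, `initGlobals`, `readSum`, and `entryPoint` are hypothetical names, and the constructor arguments are made up.

```cpp
#include <cassert>

struct Stuff { int a; int b; };

// Stand-in for a generated constructor: a runtime call, not a constant expression.
Stuff makeStuff(int a, int b) { return Stuff{a, b}; }

// The former global becomes a field of a per-kernel context struct,
// so no dynamically initialized __device__ variable is needed.
struct KernelContext { Stuff gStuff; };

// Runs at the start of each entry point and performs the initialization
// that previously sat at global scope.
void initGlobals(KernelContext& ctx) { ctx.gStuff = makeStuff(1, 2); }

// Code that used to read the global now goes through a context pointer,
// which is why field access has to be emitted with `->`.
int readSum(const KernelContext* ctx)
{
    assert(ctx != nullptr);
    return ctx->gStuff.a + ctx->gStuff.b;
}

int entryPoint()
{
    KernelContext ctx{};
    initGlobals(ctx);
    return readSum(&ctx);
}
```

In this sketch `entryPoint()` initializes its own context before any use, so the value is well defined without relying on static initialization order or device-side dynamic initialization.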

CUDA: Fix compiler crash with unsized array field - nonuniformres-as-… (#8380)

2025-09-10T12:01:36+00:00

…function-parameter.slang #8315

Root Cause: CUDA compilation crashed with `assert failure: !seenFinalUnsizedArrayField` because unsized arrays like `RWStructuredBuffer globalBuffer[]` were not the final field in generated parameter structs, violating the layout constraint in slang-ir-layout.cpp.

Fix: Extended `collectGlobalUniformParameters` to automatically reorder struct fields for CUDA targets - regular fields first, unsized arrays last. Other targets preserve original order.

Impact:
- Enables CUDA support for nonuniform resource indexing as function parameters
- Zero impact on existing GLSL/HLSL/SPIRV targets
- Automatic handling - no manual parameter reordering required

Files: slang-emit.cpp, slang-ir-collect-global-uniforms.cpp/.h, test file

---------

Co-authored-by: slangbot
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Co-authored-by: Ellie Hermaszewska
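
A reordering of this kind can be sketched with a stable partition. The code below is a guess at the idea rather than the implementation in slang-ir-collect-global-uniforms.cpp: `Field`, `orderFields`, and the `targetIsCuda` flag are hypothetical, and whether the real pass preserves relative order within each group is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one field of the generated parameter struct.
struct Field { std::string name; bool isUnsizedArray; };

// For CUDA, move unsized arrays to the end while keeping the relative order
// of all other fields; other targets get the fields back unchanged.
std::vector<Field> orderFields(std::vector<Field> fields, bool targetIsCuda)
{
    auto isSized = [](const Field& f) { return !f.isUnsizedArray; };
    if (targetIsCuda) {
        std::stable_partition(fields.begin(), fields.end(), isSized);
        // Post-condition: no sized field follows an unsized array.
        assert(std::is_partitioned(fields.begin(), fields.end(), isSized));
    }
    return fields;
}
```

With fields declared as `globalBuffer[]`, `count`, `scale`, the CUDA ordering becomes `count`, `scale`, `globalBuffer[]`, which satisfies a "unsized array must be last" layout check.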

Fix #8314 - Enable tests/compute/texture-subscript.slang for CUDA (#8408)

2025-09-09T11:16:51+00:00

The test can be enabled

Fix #8085: Batch-9: Enable cuda tests (#8269)

2025-09-03T16:06:43+00:00

Fix #8086: Batch-10: Enable cuda tests (#8270)

2025-09-03T05:15:57+00:00

render-test: Change D3D12 default to sm_6_5 (#8320)

2025-09-02T23:43:48+00:00

Changes the default for render-test to sm_6_5.

Since sm_6_5 is the new default, remove the -use-dxil option and add the -use-dxcb option.
Remove the -use-dxil option from all test cases.
Add -use-dxcb to two tests that needed it.

Fixes #7611

Fix #8084: Batch-8: Enable cuda tests (#8268)

2025-08-25T05:20:41+00:00