<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/tests/compute, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2025-10-10T16:42:43+00:00</updated>
<entry>
<title>implement dot products for 1 vectors (#8599)</title>
<updated>2025-10-10T16:42:43+00:00</updated>
<author>
<name>Ellie Hermaszewska</name>
<email>ellieh@nvidia.com</email>
</author>
<published>2025-10-10T16:42:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=b4023f715885ada9a2777ea3b0d6d9739860b39b'/>
<id>urn:sha1:b4023f715885ada9a2777ea3b0d6d9739860b39b</id>
<content type='text'>
Closes https://github.com/shader-slang/slang/issues/8378</content>
</entry>
<entry>
<title>Fix DerivativeGroupQuadsKHR workgroup size validation for texture sampling (#8647)</title>
<updated>2025-10-08T23:18:50+00:00</updated>
<author>
<name>Lujin Wang</name>
<email>143145775+lujinwangnv@users.noreply.github.com</email>
</author>
<published>2025-10-08T23:18:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=4e4aad5a0493defde1e0ef29f27e5d663c1182cd'/>
<id>urn:sha1:4e4aad5a0493defde1e0ef29f27e5d663c1182cd</id>
<content type='text'>
Fixes #8545 where Slang generates SPIR-V with DerivativeGroupQuadsKHR
execution mode but doesn't validate workgroup sizes when texture
sampling triggers automatic derivative computation.

**Root Cause**: Validation code was looking for IRNumThreadsDecoration
on the wrong IR node
**Fix**: One-line change in slang-emit-spirv.cpp to search for the
decoration on entryPoint instead of entryPointDecor
**Tests**: Added regression tests for both quad and linear derivative
group validation
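
As a rough illustration of the rule being validated (not the code in
slang-emit-spirv.cpp): under the compute shader derivatives extension,
quad derivative groups need the workgroup X and Y sizes to be multiples
of 2, and linear derivative groups need the total thread count to be a
multiple of 4. The function name and mode strings below are hypothetical.

```python
# Hypothetical sketch of the workgroup-size rule; names are illustrative
# and are not taken from slang-emit-spirv.cpp.

def check_derivative_group(mode, num_threads):
    """Return an error string, or None when the size is acceptable."""
    x, y, z = num_threads
    if mode == "quads":
        # Quad groups are 2x2 tiles, so X and Y must both be even.
        if x % 2 != 0 or y % 2 != 0:
            return "quad derivative group needs X and Y to be multiples of 2"
    elif mode == "linear":
        # Linear groups take four consecutive invocations at a time.
        if (x * y * z) % 4 != 0:
            return "linear derivative group needs a thread count divisible by 4"
    else:
        raise ValueError("unknown derivative group mode: " + mode)
    return None
```

For example, `check_derivative_group("quads", (1, 1, 1))` reports an
error, while `check_derivative_group("linear", (4, 1, 1))` returns None.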

Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: github-actions[bot] &lt;41898282+github-actions[bot]@users.noreply.github.com&gt;
Co-authored-by: Lujin Wang &lt;lujinwangnv@users.noreply.github.com&gt;
Co-authored-by: slangbot &lt;ellieh+slangbot@nvidia.com&gt;
Co-authored-by: slangbot &lt;186143334+slangbot@users.noreply.github.com&gt;</content>
</entry>
<entry>
<title>Rewriting the lower-buffer-element-type pass to avoid unnecessary packing/unpacking. (#8526)</title>
<updated>2025-09-30T00:45:08+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2025-09-30T00:45:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=a6deb5ed82cb8fc6b4f4c5c5fee264e09f97ff89'/>
<id>urn:sha1:a6deb5ed82cb8fc6b4f4c5c5fee264e09f97ff89</id>
<content type='text'>
Part of the effort to improve the performance of generated SPIRV code.

The existing lower-buffer-element-type pass works by loading the entire
buffer element content from memory and translating it to the logical type,
stored in a local variable, at the earliest reference of a buffer handle.
This means that it can generate inefficient code that reads more than
necessary.

Consider this example:
```
struct BigStruct { bool values[1024]; }
ConstantBuffer&lt;BigStruct&gt; cb;

void test(BigStruct v)
{
    if (v.values[0]) { printf("ok"); }
}

[numthreads(1,1,1)]
void computeMain()
{
    test(cb);
}
```

In IR, the `computeMain` function before the lower-buffer-element-type
pass looks something like the following:
```
func test:
    %v = param : BigStruct
    %barr = fieldExtract(%v, "values")
    %element = elementExtract(%barr, 0)
    ... // uses %element

func computeMain:
    %v = load(cb)
    call %test %v
```

The existing lower-buffer-element-type pass will rewrite the bool array
in `BigStruct` into an `int` array so that it is legal in SPIRV. However,
it does so by inserting the translation at the first `load` of the
constant buffer:

```
struct BigStruct_std430 {
    int values[1024];
}
var cb : ConstantBuffer&lt;BigStruct_std430&gt;;
func computeMain:
    %tmpVar : var&lt;BigStruct&gt;
    call %unpackStorage(%tmpVar, cb)
    %v : BigStruct = load %tmpVar
    call %test %v
```

This means that the entire array will be loaded and translated to int,
before calling `test`, which only uses one element. It turns out that
the downstream compiler isn't always able to optimize out this
inefficient translation/copy.

This PR completely rewrites the way buffer-element-type lowering is
handled to avoid producing this inefficient code. It works in two parts.
First, we turn on the `transformParamsToConstRef` pass for the SPIRV
target as well, so the `test` function is translated to take the `v`
parameter as `constref`. Second, a redesigned buffer-element-type pass
defers the storage-type to logical-type translation until a value is
actually used by a `load` instruction.

In this example, after `transformParamsToConstRef`, the IR is:

```
func test:
    %v = param : ConstRef&lt;BigStruct&gt;
    %barr = fieldAddr(%v, "values")
    %elementPtr = elementAddr(%barr, 0)
    %element = load(%elementPtr)
    ... // uses %element

func computeMain:
    call %test %cb
```

The new `buffer-element-type-lowering` pass takes this IR, inserts the
translation at the latest possible point across the entire call graph,
and translates the IR into:

```
func test:
    %v = param : ConstRef&lt;BigStruct_std430&gt;
    %barr = fieldAddr(%v, "values")
    %elementPtr : ptr&lt;int&gt; = elementAddr(%barr, 0)
    %element_int = load(%elementPtr)
    %element = cast(%element_int) : %bool
    ... // uses %element

func computeMain:
    call %test %cb
```

In this new IR, there is no longer a load and conversion of the entire
array.
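
The difference between the two strategies can be sketched with a toy
model. This is not the pass itself: a Python list stands in for the
storage-type buffer, a counter stands in for the conversions the
generated code performs, and all names are hypothetical.

```python
# Toy model only: a list stands in for the std430 storage buffer, and a
# counter stands in for the int-to-bool conversions that get executed.

class Counter:
    def __init__(self):
        self.conversions = 0

def to_logical(storage_int, counter):
    """Translate one storage-type element (int) to its logical type (bool)."""
    counter.conversions += 1
    return storage_int != 0

def run_eager(cb_storage, counter):
    # Old behaviour: unpack the whole struct at the first load of `cb`.
    values = [to_logical(v, counter) for v in cb_storage]
    return values[0]

def run_deferred(cb_storage, counter):
    # New behaviour: keep addressing storage, convert only the loaded element.
    element_int = cb_storage[0]
    return to_logical(element_int, counter)

cb = [1] + [0] * 1023
eager, deferred = Counter(), Counter()
assert run_eager(cb, eager) == run_deferred(cb, deferred)
print(eager.conversions, deferred.conversions)  # prints "1024 1"
```

Both variants compute the same result, but the deferred one touches a
single element instead of all 1024.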

See the new comment in `slang-ir-lower-buffer-element-type.cpp` for more
details on how the pass works.

This PR also addresses many other issues surfaced by turning on the
`transformParamsToConstRef` pass on the SPIRV backend.

---------

Co-authored-by: slangbot &lt;186143334+slangbot@users.noreply.github.com&gt;</content>
</entry>
<entry>
<title>Fix CUDA global variable initialization with constructor calls (#8340)</title>
<updated>2025-09-18T15:46:44+00:00</updated>
<author>
<name>Harsh Aggarwal (NVIDIA)</name>
<email>haaggarwal@nvidia.com</email>
</author>
<published>2025-09-18T15:46:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=54faa55c0bd4c4beede7337a76ed3a56d1eb4f15'/>
<id>urn:sha1:54faa55c0bd4c4beede7337a76ed3a56d1eb4f15</id>
<content type='text'>
Fix CUDA global variable initialization with constructor calls

Resolves CUDA compilation failure where global variables with struct
constructor initialization generated illegal `__device__` variable
runtime initialization.

  **Problem:**
  ```cuda
  // Generated invalid CUDA code:
  __device__ static const Stuff_0 gStuff_0 = Stuff_x24init_0(args...);
  // Error: "dynamic initialization is not supported for a __device__ variable"
  ```

  Root Cause Discovered:
  Through extensive debugging, found that the
  moveGlobalVarInitializationToEntryPoints pass only handled
  kIROp_GlobalVar instructions, but global constants with constructor
  calls appeared as kIROp_Call instructions at module scope.

  Solution:
  1. IR Pipeline Fix: Extended moveGlobalVarInitializationToEntryPoints to
     detect and transform module-level constructor calls into proper
     global variables with entry-point initialization
  2. Field Access Fix: Enhanced kIROp_FieldExtract logic to emit correct
     -&gt; syntax for pointer types and address-of operations
  3. Constructor Emission: Added CUDA-specific handling for constructor
     calls
  Architecture:
  - Transforms let %gStuff = call %Constructor(...) into kernel context
    initialization
  - Moves runtime initialization from global scope to entry-point
    execution
  - Follows CUDA best practices for global state management
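
  The transformation described above can be sketched on a toy IR. This
  is only an illustration under assumed names: the tuple layout, the
  `move_global_init` helper, and the `kernelContext` prefix are all
  hypothetical and are not Slang's real IR API.

```python
# Toy IR: each module-scope instruction is a (name, op, operand) tuple.
# Every name here is illustrative; none of it is Slang's real IR API.

def move_global_init(module_insts, entry_point_body):
    """Turn module-scope constructor calls into context fields that are
    initialized at the start of the entry point."""
    remaining = []        # instructions that stay at module scope
    context_fields = []   # fields of the hypothetical kernel context struct
    init_stmts = []       # assignments to run first in the entry point
    for name, op, operand in module_insts:
        if op == "call":
            context_fields.append(name)
            init_stmts.append("kernelContext." + name + " = " + operand)
        else:
            remaining.append((name, op, operand))
    return remaining, context_fields, init_stmts + entry_point_body

module = [("gStuff", "call", "Stuff_init(1, 2)"), ("gCount", "const", "4")]
remaining, fields, body = move_global_init(module, ["run_kernel()"])
print(body)
# prints ['kernelContext.gStuff = Stuff_init(1, 2)', 'run_kernel()']
```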

  Files:
  - source/slang/slang-ir-explicit-global-init.cpp: Extended IR
    transformation pass
  - source/slang/slang-emit-c-like.cpp: Enhanced field access and foldable
    value logic
  - source/slang/slang-emit-cuda.cpp: Added CUDA-specific field extraction
    handling

  Result:
  // Now generates proper CUDA code:
  struct KernelContext_0 { Stuff_0 gStuff_1; };
  // Runtime initialization in entry point:
  kernelContext_1.gStuff_1 = constructor_call();

  Fixes: tests/compute/type-legalize-global-with-init.slang</content>
</entry>
<entry>
<title>CUDA: Fix compiler crash with unsized array field - nonuniformres-as-… (#8380)</title>
<updated>2025-09-10T12:01:36+00:00</updated>
<author>
<name>Harsh Aggarwal (NVIDIA)</name>
<email>haaggarwal@nvidia.com</email>
</author>
<published>2025-09-10T12:01:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=3d0f5ee55788dca324641ae9268ee37dc4d7d9d5'/>
<id>urn:sha1:3d0f5ee55788dca324641ae9268ee37dc4d7d9d5</id>
<content type='text'>
…function-parameter.slang #8315

Root Cause:
CUDA compilation crashed with `assert failure:
!seenFinalUnsizedArrayField` because unsized arrays like
`RWStructuredBuffer&lt;uint&gt; globalBuffer[]` were not the final field in
generated parameter structs, violating the layout constraint in
slang-ir-layout.cpp.

Fix:
Extended `collectGlobalUniformParameters` to automatically reorder
struct fields for CUDA targets - regular fields first, unsized arrays
last. Other targets preserve original order.
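
A minimal sketch of that reordering, assuming each field carries a flag
saying whether it is an unsized array. The record layout, the
`is_unsized_array` key, the `order_fields` helper, and the
`outputBuffer` field are hypothetical, and keeping the relative order
within each group is a choice made for this sketch.

```python
# Hypothetical field records; `is_unsized_array` is an illustrative flag,
# not a name from slang-ir-collect-global-uniforms.cpp.

def order_fields(fields, target):
    """Return fields with unsized arrays moved last for CUDA, else unchanged."""
    if target != "cuda":
        return list(fields)
    # sorted() is stable, so relative order within each group is preserved.
    return sorted(fields, key=lambda f: f["is_unsized_array"])

fields = [
    {"name": "globalBuffer", "is_unsized_array": True},
    {"name": "outputBuffer", "is_unsized_array": False},
]
print([f["name"] for f in order_fields(fields, "cuda")])
# prints ['outputBuffer', 'globalBuffer']
```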

Impact:
- Enables CUDA support for nonuniform resource indexing as function
parameters
- Zero impact on existing GLSL/HLSL/SPIRV targets
- Automatic handling - no manual parameter reordering required

Files: slang-emit.cpp, slang-ir-collect-global-uniforms.cpp/.h, test
file

---------

Co-authored-by: slangbot &lt;ellieh+slangbot@nvidia.com&gt;
Co-authored-by: slangbot &lt;186143334+slangbot@users.noreply.github.com&gt;
Co-authored-by: Ellie Hermaszewska &lt;ellieh@nvidia.com&gt;</content>
</entry>
<entry>
<title>Fix #8314 - Enable tests/compute/texture-subscript.slang for CUDA (#8408)</title>
<updated>2025-09-09T11:16:51+00:00</updated>
<author>
<name>Harsh Aggarwal (NVIDIA)</name>
<email>haaggarwal@nvidia.com</email>
</author>
<published>2025-09-09T11:16:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=63676c5e51d9d58d3cde7e296f82250b71538b85'/>
<id>urn:sha1:63676c5e51d9d58d3cde7e296f82250b71538b85</id>
<content type='text'>
The test can be enabled</content>
</entry>
<entry>
<title>Fix #8085: Batch-9: Enable cuda tests (#8269)</title>
<updated>2025-09-03T16:06:43+00:00</updated>
<author>
<name>Harsh Aggarwal (NVIDIA)</name>
<email>haaggarwal@nvidia.com</email>
</author>
<published>2025-09-03T16:06:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=bf607e2f3fa183e9a2b18c7a98438a05247d6ed3'/>
<id>urn:sha1:bf607e2f3fa183e9a2b18c7a98438a05247d6ed3</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Fix #8086: Batch-10: Enable cuda tests (#8270)</title>
<updated>2025-09-03T05:15:57+00:00</updated>
<author>
<name>Harsh Aggarwal (NVIDIA)</name>
<email>haaggarwal@nvidia.com</email>
</author>
<published>2025-09-03T05:15:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=639978008de3a74c00e03451cd9fc74452766fcd'/>
<id>urn:sha1:639978008de3a74c00e03451cd9fc74452766fcd</id>
<content type='text'>
</content>
</entry>
<entry>
<title>render-test: Change D3D12 default to sm_6_5 (#8320)</title>
<updated>2025-09-02T23:43:48+00:00</updated>
<author>
<name>James Helferty (NVIDIA)</name>
<email>jhelferty@nvidia.com</email>
</author>
<published>2025-09-02T23:43:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=f02b08490aa905f42a8d90381db84b1f8e409c0c'/>
<id>urn:sha1:f02b08490aa905f42a8d90381db84b1f8e409c0c</id>
<content type='text'>
Changes the default for render-test to sm_6_5.
Since sm_6_5 is the new default, removes the -use-dxil option and adds a
-use-dxcb option.
Removes the -use-dxil option from all test cases.
Adds -use-dxcb to the two tests that needed it.

Fixes #7611</content>
</entry>
<entry>
<title>Fix #8084: Batch-8: Enable cuda tests (#8268)</title>
<updated>2025-08-25T05:20:41+00:00</updated>
<author>
<name>Harsh Aggarwal (NVIDIA)</name>
<email>haaggarwal@nvidia.com</email>
</author>
<published>2025-08-25T05:20:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=1562f98c07954ae17f9e7ef186f6c8eb029740ab'/>
<id>urn:sha1:1562f98c07954ae17f9e7ef186f6c8eb029740ab</id>
<content type='text'>
</content>
</entry>
</feed>
