<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/tests/optimization, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2025-10-16T03:59:47+00:00</updated>
<entry>
<title>Immutable access qualifier for pointers and use `__ldg` on cuda. (#8710)</title>
<updated>2025-10-16T03:59:47+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2025-10-16T03:59:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=01510f2c922af8629c7a730ef92a31fa83bd9f49'/>
<id>urn:sha1:01510f2c922af8629c7a730ef92a31fa83bd9f49</id>
<content type='text'>
This PR implements `Access.Immutable` to allow pointers to immutable
data.

The new type `ImmutablePtr&lt;T&gt;` is defined as an alias of `Ptr&lt;T,
Address.Immutable&gt;`.
By forming a immutable pointer, the programmer is conveying to the
compiler that the data at the pointer address will never change during
the execution of the current program. Therefore loads from immutable
pointers can be deduplicated by the compiler, and will translate to
`__ldg` when generating code for CUDA.

The SPIRV backend is not changed in this PR, since the current SPIRV
spec makes it very difficult to specify loads from immutable address
without generating tons of wrappers and boilerplate type declarations.
We would like to see the spec evolved a bit to around its support of
`NonWritable` physical storage pointers or immutable loads before we
attempt to express such immutability in SPIRV. For now we simply emit
ordinary pointers and loads when generating spirv.

---------

Co-authored-by: slangbot &lt;186143334+slangbot@users.noreply.github.com&gt;</content>
</entry>
<entry>
<title>Small fix to buffer load specialization pass to allow more specialization to happen. (#8653)</title>
<updated>2025-10-10T01:26:28+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2025-10-10T01:26:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=3cf1f5a616917480c63b76aae906dc36b29e46ce'/>
<id>urn:sha1:3cf1f5a616917480c63b76aae906dc36b29e46ce</id>
<content type='text'>
This allows us to further cleanup unnecessary copies in the target code
we generate.

Part of effort of #8652.</content>
</entry>
<entry>
<title>Fix a bug that causes a struct field to be initialized twice. (#8619)</title>
<updated>2025-10-07T15:53:36+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2025-10-07T15:53:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=54e1d02715747ee81585bfd23e96a1f4956dbf66'/>
<id>urn:sha1:54e1d02715747ee81585bfd23e96a1f4956dbf66</id>
<content type='text'>
We insert field initialization logic at the beginning of every ctor in
`synthesizeCtorBody`, but then immediately inserts another round of
initialization again for explicit ctors in `maybeInsertDefaultInitExpr`,
both called from `SemanticsDeclBodyVisitor::visitAggTypeDecl` right next
to each other.

The fix is to remove `maybeInsertDefaultInitExpr`.

This change also enhances the address aliasing analysis, so that for the
following case:
```
this-&gt;member1 = 0;
this-&gt;member2 = 0;
this-&gt;member1 = param;
```
We can still remove the first assignment to `this-&gt;member1` despite
seeing `this-&gt;member2=0`, since it is easy to know that `this-&gt;member2`
cannot alias with `this-&gt;member1`.

Closes #8600.</content>
</entry>
<entry>
<title>Enhance buffer load specialization pass to specialize past field extracts. (#8547)</title>
<updated>2025-10-01T02:08:23+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2025-10-01T02:08:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=e4611e2e30a3e5969d402f5ed7e72706a0e3b024'/>
<id>urn:sha1:e4611e2e30a3e5969d402f5ed7e72706a0e3b024</id>
<content type='text'>
This allows us to specialize functions whose argument is a sub element
of a constant buffer, instead of being only applicable to entire buffer
element. Closes #8421.

This change also implements a proper heuristic to determine when to
specialize the calls and defer the buffer loads.

This PR addresses a pathological case exposed in
`slangpy\slangpy\benchmarks\test_benchmark_tensor.py`, which used to
take 27ms to finish, and now takes 1.25ms.


For example, given:
```
struct Bottom
{
    float bigArray[1024];

    [mutating]
    void setVal(int index, float value) { bigArray[index] = value; }
}

struct Root
{
    Bottom top[2];
    [mutating]
    void setTopVal(int x, int y, float value)
    {
        top[x].setVal(y, value);
    }
}

RWStructuredBuffer&lt;Root&gt; sb;

[shader("compute")]
[numthreads(1, 1, 1)]
void compute_main(uint3 tid: SV_DispatchThreadID)
{
    sb[0].setTopVal(1, 2, 100.0f);
}
```

We are now able to specialize the call to `setTopVal` into:
```
void compute_main(uint3 tid: SV_DispatchThreadID)
{
    setTopVal_specialized(0, 1, 2, 100.0f);
}

void setTopVal_specialized(int sbIdx, int x, int y, float value)
{
      Bottom_setVal_specialized(sbIdx, x, y, value);
}

void Bottom_setVal_specialized(int sbIdx, int x, int y, float value)
{
     sb[sbIdx].top[x].bigArray[y] = value;
}
```

And get rid of all unnecessary loads. Achieving this requires a
combination of function call specialization and buffer-load-defer pass.
The buffer-load-defer pass has been completely rewritten to be more
correct and avoid introducing redundant loads.

This PR also adds tests to make sure pointers, bindless handles, and
loads from structured buffer or constant buffers works as expected.</content>
</entry>
<entry>
<title>Rewriting the lower-buffer-element-type pass to avoid unnecessary packing/unpacking. (#8526)</title>
<updated>2025-09-30T00:45:08+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2025-09-30T00:45:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=a6deb5ed82cb8fc6b4f4c5c5fee264e09f97ff89'/>
<id>urn:sha1:a6deb5ed82cb8fc6b4f4c5c5fee264e09f97ff89</id>
<content type='text'>
Part of the effort to improve the performance of generated SPIRV code.

The existing lower-buffer-element-type pass works by loading the entire
buffer element content from memory, and translate it to logical type
stored in a local variable at the earliest reference of a buffer handle.
This means that is can generate inefficient code that reads more than
necessary.

Consider this example:
```
struct BigStruct { bool values[1024]; }
ConstantBuffer&lt;BigStruct&gt; cb;

void test(BigStruct v)
{
      if (v.values[0]) { printf("ok"); }
}

[numthreads(1,1,1)]
void computeMain()
{
    test(cb);
}
```

In IR, the `computeMain` function before lower-buffer-element-type pass
is something like following:
```
func test:
   %v = param : BigStruct
   %barr = fieldExtract(%v, "values")
   %element = elementExtract(%barr, 0)
    ... // uses %element 

func computeMain:
  %v = load(cb)
  call %test %v
```

The existing lower-buffer-element-type pass will rewrite the bool array
in `BigStruct` into `int` array so it is legal in SPIRV. However, it
does so by inserting the translation on the first `load` of the constant
buffer:

```
struct BigStruct_std430 {
    int values[1024];
}
var cb : ConstantBuffer&lt;BigStruct_std430&gt;;
func computeMain:
   %tmpVar : var&lt;BigStruct&gt;
    call %unpackStorage(%tmpVar, cb)
   %v : BigStruct = load %tmpVar
   call %test %v
```

This means that the entire array will be loaded and translated to int,
before calling `test`, which only uses one element. It turns out that
the downstream compiler isn't always able to optimize out this
inefficient translation/copy.

This PR completely rewrites the way buffer-element-type lowering is
handled to avoid producing this inefficient code. It works in two parts:
first we turn on the `transformParamsToConstRef` pass for SPIRV target
as well, so we will translate the `test` function to take the `v`
parameter as `constref`. The second part is a redesigned
buffer-element-type pass that defers the storage-type to logical-type
translation until a value is actually used by a `load` instruction.

In this example, after `transformParamsToConstRef`, the IR is:

```
func test:
   %v = param : ConstRef&lt;BigStruct&gt;
   %barr = fieldAddr(%v, "values")
   %elementPtr = elementAddr(%barr, 0)
   %element = load(%elementPtr)
    ... // uses %element 

func computeMain:
  call %test %cb
```

The new `buffer-element-type-lowering` pass will take this IR, and
insert translation at latest possible time across the entire call graph,
and translate the IR into:

```
func test:
   %v = param : ConstRef&lt;BigStruct_std430&gt;
   %barr = fieldAddr(%v, "values")
   %elementPtr : ptr&lt;int&gt; = elementAddr(%barr, 0)
   %element_int = load(%elementPtr)
    %element = cast(%element_int) : %bool
    ... // uses %element 

func computeMain:
  call %test %cb
```

In this new IR, there is no longer a load and conversion of the entire
array.

See new comment in `slang-ir-lower-buffer-element-type.cpp` for more
details of how the pass works.

This PR also address many other issues surfaced by turning on
`transformParamsToConstRef` pass on SPIRV backend.

---------

Co-authored-by: slangbot &lt;186143334+slangbot@users.noreply.github.com&gt;</content>
</entry>
<entry>
<title>Refresh of disabled WGPU tests (#5614)</title>
<updated>2024-11-21T07:37:28+00:00</updated>
<author>
<name>Anders Leino</name>
<email>aleino@nvidia.com</email>
</author>
<published>2024-11-21T07:37:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=93f5d13fc90c518ef2829dc16c28f0403e230337'/>
<id>urn:sha1:93f5d13fc90c518ef2829dc16c28f0403e230337</id>
<content type='text'>
Some tests are now passing and are enabled.
Other tests are still failing, but are given comments categorizing the failures.

Tests in the 'Not supported in WGSL' category are also removed from the expected failures
list. (Though they are still kept disabled for WebGPU, of course.)

This closes #5519.</content>
</entry>
<entry>
<title>Make various parameters and return types require specialization when targeting WGSL (#5483)</title>
<updated>2024-11-06T11:14:35+00:00</updated>
<author>
<name>Anders Leino</name>
<email>aleino@nvidia.com</email>
</author>
<published>2024-11-06T11:14:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=f8294202ce8d5658f6308eeaed634058db9bbb4b'/>
<id>urn:sha1:f8294202ce8d5658f6308eeaed634058db9bbb4b</id>
<content type='text'>
Structured buffer types translate to array types in the WGSL emitter.
WGSL doesn't allow passing runtime-sized arrays to functions.
Similarly for pointers to texture handles.
Also, structured buffers (runtime-sized arrays) cannot be returned in WGSL.

This closes issue #5228, issue #5278 and issue #5288 by enabling specialized functions
to be generated in these cases, in order to work around these constraints.</content>
</entry>
<entry>
<title>Enable WebGPU tests in CI (#5239)</title>
<updated>2024-10-15T16:11:53+00:00</updated>
<author>
<name>Anders Leino</name>
<email>aleino@nvidia.com</email>
</author>
<published>2024-10-15T16:11:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=9e3b0367cfd63f21a0519b61b6fd13e94dac1c51'/>
<id>urn:sha1:9e3b0367cfd63f21a0519b61b6fd13e94dac1c51</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Fix return type address space checking. (#4465)</title>
<updated>2024-06-26T00:07:55+00:00</updated>
<author>
<name>ArielG-NV</name>
<email>159081215+ArielG-NV@users.noreply.github.com</email>
</author>
<published>2024-06-26T00:07:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=63e0064bd3a2007adf17a35d3c58894d90ddc04a'/>
<id>urn:sha1:63e0064bd3a2007adf17a35d3c58894d90ddc04a</id>
<content type='text'>
solves: `func-resource-result-simple.slang`, related to #4291

when getting the address space of a function return we now check the return address of the inst if the return type is a non pointer type.</content>
</entry>
<entry>
<title>enable more metal tests (#4326)</title>
<updated>2024-06-10T20:28:36+00:00</updated>
<author>
<name>skallweitNV</name>
<email>64953474+skallweitNV@users.noreply.github.com</email>
</author>
<published>2024-06-10T20:28:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=712ce653d4c3d7284dd71389f31540d0da7f144e'/>
<id>urn:sha1:712ce653d4c3d7284dd71389f31540d0da7f144e</id>
<content type='text'>
</content>
</entry>
</feed>
