<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/source/slang/slang-ir-specialize-address-space.cpp, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2025-10-01T02:08:23+00:00</updated>
<entry>
<title>Enhance buffer load specialization pass to specialize past field extracts. (#8547)</title>
<updated>2025-10-01T02:08:23+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2025-10-01T02:08:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=e4611e2e30a3e5969d402f5ed7e72706a0e3b024'/>
<id>urn:sha1:e4611e2e30a3e5969d402f5ed7e72706a0e3b024</id>
<content type='text'>
This allows us to specialize functions whose argument is a sub-element
of a constant buffer, rather than only functions whose argument is an
entire buffer element. Closes #8421.

This change also implements a proper heuristic to determine when to
specialize the calls and defer the buffer loads.

This PR addresses a pathological case exposed in
`slangpy\slangpy\benchmarks\test_benchmark_tensor.py`, which used to
take 27ms to finish, and now takes 1.25ms.


For example, given:
```
struct Bottom
{
    float bigArray[1024];

    [mutating]
    void setVal(int index, float value) { bigArray[index] = value; }
}

struct Root
{
    Bottom top[2];
    [mutating]
    void setTopVal(int x, int y, float value)
    {
        top[x].setVal(y, value);
    }
}

RWStructuredBuffer&lt;Root&gt; sb;

[shader("compute")]
[numthreads(1, 1, 1)]
void compute_main(uint3 tid: SV_DispatchThreadID)
{
    sb[0].setTopVal(1, 2, 100.0f);
}
```

We are now able to specialize the call to `setTopVal` into:
```
void compute_main(uint3 tid: SV_DispatchThreadID)
{
    setTopVal_specialized(0, 1, 2, 100.0f);
}

void setTopVal_specialized(int sbIdx, int x, int y, float value)
{
    Bottom_setVal_specialized(sbIdx, x, y, value);
}

void Bottom_setVal_specialized(int sbIdx, int x, int y, float value)
{
    sb[sbIdx].top[x].bigArray[y] = value;
}
```

This gets rid of all unnecessary loads. Achieving it requires combining
function call specialization with the buffer-load-defer pass.
The buffer-load-defer pass has been completely rewritten to be more
correct and to avoid introducing redundant loads.
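
As a rough illustration of why passing an access path beats passing a
loaded value, here is a toy Python model. It is not Slang compiler code:
the dictionary layout and the helper names `make_buffer`,
`set_top_val_naive` and `set_top_val_specialized` are assumptions made up
for this sketch. It only counts how many floats each strategy has to read.

```python
# Toy model (not Slang compiler code): count how many floats each strategy reads.
import copy

ARRAY_LEN = 1024

def make_buffer():
    # One Root element holding two Bottom structs, each with a big array.
    return [{"top": [{"bigArray": [0.0] * ARRAY_LEN} for _ in range(2)]}]

def set_top_val_naive(sb, sb_idx, x, y, value):
    # Load the whole Root element into a temporary, mutate it, then store it back.
    root = copy.deepcopy(sb[sb_idx])
    floats_loaded = sum(len(b["bigArray"]) for b in root["top"])
    root["top"][x]["bigArray"][y] = value
    sb[sb_idx] = root
    return floats_loaded

def set_top_val_specialized(sb, sb_idx, x, y, value):
    # Pass only the access path (indices) and write straight to the buffer.
    sb[sb_idx]["top"][x]["bigArray"][y] = value
    return 0  # no element data is loaded

naive_buffer = make_buffer()
specialized_buffer = make_buffer()
naive_loads = set_top_val_naive(naive_buffer, 0, 1, 2, 100.0)
specialized_loads = set_top_val_specialized(specialized_buffer, 0, 1, 2, 100.0)
print(naive_loads, specialized_loads)
```

Both strategies leave the buffer in the same state; only the amount of
data touched differs.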

This PR also adds tests to make sure pointers, bindless handles, and
loads from structured buffers or constant buffers work as expected.</content>
</entry>
<entry>
<title>Rewriting the lower-buffer-element-type pass to avoid unnecessary packing/unpacking. (#8526)</title>
<updated>2025-09-30T00:45:08+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2025-09-30T00:45:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=a6deb5ed82cb8fc6b4f4c5c5fee264e09f97ff89'/>
<id>urn:sha1:a6deb5ed82cb8fc6b4f4c5c5fee264e09f97ff89</id>
<content type='text'>
Part of the effort to improve the performance of generated SPIRV code.

The existing lower-buffer-element-type pass works by loading the entire
buffer element content from memory and translating it to the logical type,
stored in a local variable, at the earliest reference of a buffer handle.
This means that it can generate inefficient code that reads more than
necessary.

Consider this example:
```
struct BigStruct { bool values[1024]; }
ConstantBuffer&lt;BigStruct&gt; cb;

void test(BigStruct v)
{
    if (v.values[0]) { printf("ok"); }
}

[numthreads(1,1,1)]
void computeMain()
{
    test(cb);
}
```

In IR, the `computeMain` function before the lower-buffer-element-type pass
looks something like the following:
```
func test:
    %v = param : BigStruct
    %barr = fieldExtract(%v, "values")
    %element = elementExtract(%barr, 0)
    ... // uses %element

func computeMain:
    %v = load(cb)
    call %test %v
```

The existing lower-buffer-element-type pass will rewrite the bool array
in `BigStruct` into an `int` array so it is legal in SPIRV. However, it
does so by inserting the translation at the first `load` of the constant
buffer:

```
struct BigStruct_std430 {
    int values[1024];
}
var cb : ConstantBuffer&lt;BigStruct_std430&gt;;
func computeMain:
    %tmpVar : var&lt;BigStruct&gt;
    call %unpackStorage(%tmpVar, cb)
    %v : BigStruct = load %tmpVar
    call %test %v
```

This means that the entire array will be loaded and translated to int,
before calling `test`, which only uses one element. It turns out that
the downstream compiler isn't always able to optimize out this
inefficient translation/copy.

This PR completely rewrites the way buffer-element-type lowering is
handled to avoid producing this inefficient code. It works in two parts.
First, we turn on the `transformParamsToConstRef` pass for the SPIRV target
as well, so the `test` function is translated to take the `v`
parameter as `constref`. Second, a redesigned
buffer-element-type pass defers the storage-type to logical-type
translation until a value is actually used by a `load` instruction.

In this example, after `transformParamsToConstRef`, the IR is:

```
func test:
    %v = param : ConstRef&lt;BigStruct&gt;
    %barr = fieldAddr(%v, "values")
    %elementPtr = elementAddr(%barr, 0)
    %element = load(%elementPtr)
    ... // uses %element

func computeMain:
    call %test %cb
```

The new `buffer-element-type-lowering` pass takes this IR, inserts the
translation at the latest possible time across the entire call graph,
and translates the IR into:

```
func test:
    %v = param : ConstRef&lt;BigStruct_std430&gt;
    %barr = fieldAddr(%v, "values")
    %elementPtr : ptr&lt;int&gt; = elementAddr(%barr, 0)
    %element_int = load(%elementPtr)
    %element = cast(%element_int) : %bool
    ... // uses %element

func computeMain:
    call %test %cb
```

In this new IR, there is no longer a load and conversion of the entire
array.

See the new comment in `slang-ir-lower-buffer-element-type.cpp` for more
details on how the pass works.

This PR also addresses many other issues surfaced by turning on the
`transformParamsToConstRef` pass on the SPIRV backend.

---------

Co-authored-by: slangbot &lt;186143334+slangbot@users.noreply.github.com&gt;</content>
</entry>
<entry>
<title>[CBP] Pointer frontend changes + groupshared pointer support (#7848)</title>
<updated>2025-08-29T22:52:34+00:00</updated>
<author>
<name>ArielG-NV</name>
<email>159081215+ArielG-NV@users.noreply.github.com</email>
</author>
<published>2025-08-29T22:52:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=7758625d3fea67e55e98e7e4103d56c9918365be'/>
<id>urn:sha1:7758625d3fea67e55e98e7e4103d56c9918365be</id>
<content type='text'>
Resolves #7628
Resolves: #8197

Primary Goals:
1. Add `Access` to pointer
2. AddressSpace::GroupShared support for pointers (SPIR-V)
3. Add `__getAddress()` to replace `&amp;`
* `&amp;` is not updated to `require(cpu)` since slangpy uses `&amp;`. This
means we must: (1) merge PR; (2) replace `&amp;` with `__getAddress()`; (3)
add `require(cpu)` to `&amp;`

Changes:
* Added to `Ptr` the `Access` generic argument &amp; logic (for
`Access::Read`).
* Moved the generic argument `AddressSpace` from `Ptr` to the end of the
type.
* Added pointer casting support between any `Ptr` as long as the
`AddressSpace` is the same
* Disallow globallycoherent T* and coherent T*
* Disallow const T*, T const*, and const T*
* Fixed .natvis display of `ConstantValue` `ValOperandNode`
* Support generic resolution of type-casted integers
* Added `VariablePointer` emitting for spirv + other minor logic needed
for groupshared pointers

Breaking Changes:
* Anyone using the `AddressSpace` of `Ptr` will now have to account for
the `Access` argument
* we disallow various syntax paired with `Ptr` and `T*`

---------

Co-authored-by: slangbot &lt;186143334+slangbot@users.noreply.github.com&gt;</content>
</entry>
<entry>
<title>Fixup address spaces after inlining. (#7731)</title>
<updated>2025-07-11T23:54:43+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2025-07-11T23:54:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=1e1a49ccf595dcc99bd9792a47199ec89d5b4370'/>
<id>urn:sha1:1e1a49ccf595dcc99bd9792a47199ec89d5b4370</id>
<content type='text'>
* Fixup address spaces after inlining.

* add -O0</content>
</entry>
<entry>
<title>Add validation for destination of atomic operations (#6093)</title>
<updated>2025-01-22T17:10:35+00:00</updated>
<author>
<name>Anders Leino</name>
<email>aleino@nvidia.com</email>
</author>
<published>2025-01-22T17:10:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=04353fb7602b7eb6a8b86193510ebe0c670b7724'/>
<id>urn:sha1:04353fb7602b7eb6a8b86193510ebe0c670b7724</id>
<content type='text'>
* Reimplement the GLSL atomic* functions in terms of __intrinsic_op

Many of these functions map directly to atomic IR instructions.
The functions taking atomic_uint are left as they are.

This helps to address #5989, since the destination pointer type validation can then be
written only for the atomic IR instructions.

* Add validation for atomic operations

Diagnose an error if the destination of the atomic operation is not appropriate, where
appropriate means it's either:
- 'groupshared'
- from a device buffer

This closes #5989.
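
The rule above can be sketched in Python. This is a minimal illustration,
not Slang's implementation: the `AddressSpace` enum here is a hypothetical
three-value subset and `validate_atomic_destination` is a made-up helper.
Only the diagnostic text is taken from this change.

```python
from enum import Enum

class AddressSpace(Enum):
    # Hypothetical subset for illustration; not Slang's actual enum.
    FUNCTION = "function"
    GROUP_SHARED = "groupshared"
    DEVICE_BUFFER = "device buffer"

ATOMIC_DEST_ERROR = (
    "error 41403: cannot perform atomic operation because destination is "
    "neither groupshared nor from a device buffer."
)

def validate_atomic_destination(space):
    # Return None when the destination is acceptable, else the diagnostic text.
    if space in (AddressSpace.GROUP_SHARED, AddressSpace.DEVICE_BUFFER):
        return None
    return ATOMIC_DEST_ERROR

print(validate_atomic_destination(AddressSpace.FUNCTION))
```

Writing the check against the address space of the destination is what
lets it run once, after address space specialization, rather than per
intrinsic.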

* Add tests for GLSL atomics destination validation

Attempting to use the GLSL atomic functions on destinations that are neither groupshared
nor from a device buffer should fail with the following error:

error 41403: cannot perform atomic operation because destination is neither groupshared
             nor from a device buffer.

* Validate atomic operations after address space specialization

Address space specialization for SPIR-V is not done as part of `linkAndOptimizeIR`, as it
is for e.g. Metal, so opt out and add a separate call for SPIR-V.

* Allow unchecked in/inout parameters for non-SPIRV targets

* Clean up callees left without uses during address space specialization

* format code

---------

Co-authored-by: slangbot &lt;186143334+slangbot@users.noreply.github.com&gt;
Co-authored-by: Yong He &lt;yonghe@outlook.com&gt;</content>
</entry>
<entry>
<title>format</title>
<updated>2024-10-29T06:49:26+00:00</updated>
<author>
<name>Ellie Hermaszewska</name>
<email>ellieh@nvidia.com</email>
</author>
<published>2024-10-29T06:49:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=f65d756bff8d4c5cbc15bd0322a2ae8e6b896a21'/>
<id>urn:sha1:f65d756bff8d4c5cbc15bd0322a2ae8e6b896a21</id>
<content type='text'>
* format

* Minor test fixes

* enable checking cpp format in ci</content>
</entry>
<entry>
<title>Fix spirv lowering logic around pointer to unsized array. (#5243)</title>
<updated>2024-10-09T14:35:10+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2024-10-09T14:35:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=ac6f04c15995061ebe8e0ddf62ecf7eb979afb65'/>
<id>urn:sha1:ac6f04c15995061ebe8e0ddf62ecf7eb979afb65</id>
<content type='text'>
* Fix spirv lowering logic around pointer to unsized array.

* Fix.

---------

Co-authored-by: Ellie Hermaszewska &lt;ellieh@nvidia.com&gt;</content>
</entry>
<entry>
<title>Remove using SpvStorageClass values casted into AddressSpace values (#4861)</title>
<updated>2024-08-19T22:06:34+00:00</updated>
<author>
<name>Ellie Hermaszewska</name>
<email>ellieh@nvidia.com</email>
</author>
<published>2024-08-19T22:06:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=f77a5ac9d1547a4394bba4ab8e94d905972c79b7'/>
<id>urn:sha1:f77a5ac9d1547a4394bba4ab8e94d905972c79b7</id>
<content type='text'>
* Remove using SpvStorageClass values casted into AddressSpace values

Also removes support for specific storage classes in __target_intrinsic snippets

* remove SLANG_RETURN_NEVER macro

* squash warnings

* Make nonexhaustive switch statement error on gcc

* Add SLANG_EXHAUSTIVE_SWITCH_BEGIN/END macros

---------

Co-authored-by: Yong He &lt;yonghe@outlook.com&gt;</content>
</entry>
<entry>
<title>Overhaul IR lowering of pointer types. (#4710)</title>
<updated>2024-07-25T22:00:14+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2024-07-25T22:00:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=c9d89a40775a055873adf82cfb0ee1cb6bdcb93c'/>
<id>urn:sha1:c9d89a40775a055873adf82cfb0ee1cb6bdcb93c</id>
<content type='text'>
* Overhaul IR lowering of pointer types.

* Propagate address space in IRBuilder.

* Fixup.

* Fix.

* Fix.

* Change how Ptr type is printed to text.

* Fix.</content>
</entry>
<entry>
<title>Specialize address space during spirv legalization. (#4600)</title>
<updated>2024-07-10T23:17:10+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2024-07-10T23:17:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=746d47bb491e0b97e35ab373b4b78d33b9a61164'/>
<id>urn:sha1:746d47bb491e0b97e35ab373b4b78d33b9a61164</id>
<content type='text'>
* Specialize address space during spirv legalization.

* Fix.

* Fix building doc.

* Fix cmake.

* Update assert.</content>
</entry>
</feed>
