<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/tests/compute/byte-address-buffer-array.slang, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2025-09-30T00:45:08+00:00</updated>
<entry>
<title>Rewriting the lower-buffer-element-type pass to avoid unnecessary packing/unpacking. (#8526)</title>
<updated>2025-09-30T00:45:08+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2025-09-30T00:45:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=a6deb5ed82cb8fc6b4f4c5c5fee264e09f97ff89'/>
<id>urn:sha1:a6deb5ed82cb8fc6b4f4c5c5fee264e09f97ff89</id>
<content type='text'>
Part of the effort to improve the performance of generated SPIRV code.

The existing lower-buffer-element-type pass works by loading the entire
buffer element content from memory and translating it to the logical type,
stored in a local variable, at the earliest reference of a buffer handle.
This means that it can generate inefficient code that reads more than
necessary.

Consider this example:
```
struct BigStruct { bool values[1024]; }
ConstantBuffer&lt;BigStruct&gt; cb;

void test(BigStruct v)
{
    if (v.values[0]) { printf("ok"); }
}

[numthreads(1,1,1)]
void computeMain()
{
    test(cb);
}
```

In IR, the `test` and `computeMain` functions before the
lower-buffer-element-type pass look something like the following:
```
func test:
   %v = param : BigStruct
   %barr = fieldExtract(%v, "values")
   %element = elementExtract(%barr, 0)
    ... // uses %element 

func computeMain:
  %v = load(cb)
  call %test %v
```

The existing lower-buffer-element-type pass will rewrite the bool array
in `BigStruct` into an `int` array so it is legal in SPIRV. However, it
does so by inserting the translation at the first `load` of the constant
buffer:

```
struct BigStruct_std430 {
    int values[1024];
}
var cb : ConstantBuffer&lt;BigStruct_std430&gt;;
func computeMain:
   %tmpVar : var&lt;BigStruct&gt;
    call %unpackStorage(%tmpVar, cb)
   %v : BigStruct = load %tmpVar
   call %test %v
```

This means that the entire array will be loaded and translated from int
back to bool before calling `test`, which only uses one element. It turns
out that the downstream compiler isn't always able to optimize out this
inefficient translation/copy.

This PR completely rewrites the way buffer-element-type lowering is
handled to avoid producing this inefficient code. It works in two parts.
First, we turn on the `transformParamsToConstRef` pass for the SPIRV
target as well, so that the `test` function is translated to take the `v`
parameter as `constref`. Second, a redesigned buffer-element-type pass
defers the storage-type to logical-type translation until a value is
actually used by a `load` instruction.

In this example, after `transformParamsToConstRef`, the IR is:

```
func test:
   %v = param : ConstRef&lt;BigStruct&gt;
   %barr = fieldAddr(%v, "values")
   %elementPtr = elementAddr(%barr, 0)
   %element = load(%elementPtr)
    ... // uses %element 

func computeMain:
  call %test %cb
```

The new `buffer-element-type-lowering` pass takes this IR, inserts the
translation at the latest possible point across the entire call graph,
and translates the IR into:

```
func test:
   %v = param : ConstRef&lt;BigStruct_std430&gt;
   %barr = fieldAddr(%v, "values")
   %elementPtr : ptr&lt;int&gt; = elementAddr(%barr, 0)
   %element_int = load(%elementPtr)
    %element = cast(%element_int) : %bool
    ... // uses %element 

func computeMain:
  call %test %cb
```

In this new IR, there is no longer a load and conversion of the entire
array.

See the new comment in `slang-ir-lower-buffer-element-type.cpp` for more
details on how the pass works.

This PR also addresses many other issues surfaced by turning on the
`transformParamsToConstRef` pass on the SPIRV backend.

---------

Co-authored-by: slangbot &lt;186143334+slangbot@users.noreply.github.com&gt;</content>
</entry>
<entry>
<title>Fix#8085: Batch-9: Enable cuda tests (#8269)</title>
<updated>2025-09-03T16:06:43+00:00</updated>
<author>
<name>Harsh Aggarwal (NVIDIA)</name>
<email>haaggarwal@nvidia.com</email>
</author>
<published>2025-09-03T16:06:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=bf607e2f3fa183e9a2b18c7a98438a05247d6ed3'/>
<id>urn:sha1:bf607e2f3fa183e9a2b18c7a98438a05247d6ed3</id>
<content type='text'>
</content>
</entry>
<entry>
<title>render-test: Change D3D12 default to sm_6_5 (#8320)</title>
<updated>2025-09-02T23:43:48+00:00</updated>
<author>
<name>James Helferty (NVIDIA)</name>
<email>jhelferty@nvidia.com</email>
</author>
<published>2025-09-02T23:43:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=f02b08490aa905f42a8d90381db84b1f8e409c0c'/>
<id>urn:sha1:f02b08490aa905f42a8d90381db84b1f8e409c0c</id>
<content type='text'>
Changes default for render-test to sm_6_5.
Since sm_6_5 is the new default, remove the -use-dxil option and add a
-use-dxcb option.
Remove the -use-dxil option from all test cases.
Add -use-dxcb to two tests that needed it.

Fixes #7611</content>
</entry>
<entry>
<title>Fix HLSL ByteAddressBuffer Load* parameter integer type (#7117)</title>
<updated>2025-05-16T17:42:59+00:00</updated>
<author>
<name>Darren Wihandi</name>
<email>65404740+fairywreath@users.noreply.github.com</email>
</author>
<published>2025-05-16T17:42:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=8683b85c0494db99feb08b6efcdc26dfe006729f'/>
<id>urn:sha1:8683b85c0494db99feb08b6efcdc26dfe006729f</id>
<content type='text'>
* Fix HLSL ByteAddressBuffer Load* parameter integer type

* Fix tests

* Fix load with alignment function signature clash

* Fix LoadAligned tests</content>
</entry>
<entry>
<title>Update SPIRV-Tools and fix new validation errors. (#6511)</title>
<updated>2025-03-06T22:26:34+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2025-03-06T22:26:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=4485cf3eaf142cfd5f8470e86739acc67d4e12ea'/>
<id>urn:sha1:4485cf3eaf142cfd5f8470e86739acc67d4e12ea</id>
<content type='text'>
* Update SPIRV-Tools and fix new validation errors.

* Implement pointers for glsl target.

* Reworked packStorage/unpackStorage code gen to operate on pointers rather than values.</content>
</entry>
<entry>
<title>update slang-rhi (shader object refactor) (#6251)</title>
<updated>2025-02-28T01:54:22+00:00</updated>
<author>
<name>Simon Kallweit</name>
<email>64953474+skallweitNV@users.noreply.github.com</email>
</author>
<published>2025-02-28T01:54:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=38734ec1f6644f1565aeb91106f371b14d3ba07a'/>
<id>urn:sha1:38734ec1f6644f1565aeb91106f371b14d3ba07a</id>
<content type='text'>
* remove unused resource

* define buffer data

* add vs2022 build presets

* update slang-rhi API usage

* update slang-rhi

---------

Co-authored-by: Yong He &lt;yonghe@outlook.com&gt;</content>
</entry>
<entry>
<title>Fix the type error in kIROp_RWStructuredBufferLoad (#4523)</title>
<updated>2024-07-02T03:06:58+00:00</updated>
<author>
<name>kaizhangNV</name>
<email>149626564+kaizhangNV@users.noreply.github.com</email>
</author>
<published>2024-07-02T03:06:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=bd01bd3f4b8eecbfb924b8eb4090694e44e8166c'/>
<id>urn:sha1:bd01bd3f4b8eecbfb924b8eb4090694e44e8166c</id>
<content type='text'>
* Fix the type error in kIROp_RWStructuredBufferLoad

In StructuredBuffer::Load(), we allow any integer type as the index.
However, when emitting GLSL code, StructuredBuffer::Load(index) is
translated to a subscript on the buffer, e.g. buffer[index], and GLSL
doesn't allow a 64-bit integer as the subscript.

So the easiest fix is to convert the index to uint when emitting
GLSL.

* Add commit</content>
</entry>
<entry>
<title>Add LoadAligned and StoreAligned methods to ByteAddressBuffers (#4066)</title>
<updated>2024-05-14T06:57:57+00:00</updated>
<author>
<name>Sriram Murali</name>
<email>85252063+sriramm-nv@users.noreply.github.com</email>
</author>
<published>2024-05-14T06:57:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=487ae034e2b03ddd67945132c8fecbd937952705'/>
<id>urn:sha1:487ae034e2b03ddd67945132c8fecbd937952705</id>
<content type='text'>
Fixes #4062

This change enables wide loads/stores for byte-address-buffer backed
resources when the data is accessed at an aligned offset.

**Goals**
- Improve performance by issuing wider instructions instead of a sequence
  of scalar instructions for loads and stores of byte-address buffers.
- Reduce code size and improve readability of the generated shaders.
- Help naive users as well as ninja programmers generate optimal code.

**Non Goals**
- Help with Structured buffers, or other resources.
- Target compilation time improvements.

**Key changes**
Adds 2 new overloads for Load and Store operations on ByteAddress Buffers.
1. Load / Store with an extra alignment parameter
```
    resource.Load&lt;T&gt;(offset, alignment);
    resource.Store&lt;T&gt;(offset, value, alignment);
```
2. LoadAligned / StoreAligned with no extra parameter,
   with the same signature as the original Load / Store.
```
    resource.LoadAligned&lt;T&gt;(offset);
    resource.StoreAligned&lt;T&gt;(offset, value);
```
    - This overload implicitly derives the alignment value
    from the base type T of the elementary unit of the resource.
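
As a rough illustration of why a known alignment helps, here is a toy
Python cost model. It is a sketch under stated assumptions, not how the
compiler decides: `loads_needed`, the 4-byte scalar width and the 16-byte
maximum load width are all hypothetical values chosen for the example.

```python
SCALAR_BYTES = 4     # assumed width of one scalar load (hypothetical)
MAX_WIDE_BYTES = 16  # assumed widest single load (hypothetical)


def loads_needed(size_bytes, alignment=SCALAR_BYTES):
    """Number of load instructions this toy model issues for one value."""
    # The usable load width is bounded by the known alignment and by the
    # widest load the model allows, and is never narrower than a scalar.
    width = min(max(alignment, SCALAR_BYTES), MAX_WIDE_BYTES)
    # Ceiling division: a partially covered tail still needs one more load.
    return -(-size_bytes // width)


print(loads_needed(16))      # 4: a 16-byte value as four scalar loads
print(loads_needed(16, 16))  # 1: one wide load when 16-byte aligned
print(loads_needed(16, 8))   # 2: two 8-byte loads when 8-byte aligned
```

In this model, passing a larger alignment is what lets a 16-byte value be
fetched with fewer, wider loads.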

**Supported resources**
1. Vectors
   These can have up to 4 elements, i.e. float through float4.
2. Arrays
   There is no limit on the number of elements, but as a
   conservative estimate, we can limit it to a few hundred.
3. Structures
   This is used to group a resource of a single type. 
```
 struct {
    float4 x;
 }
```
**Code updates**
- Modified byte-address-ir legalize to handle struct, array and vector
  kinds of load or store access
- Added custom hlsl stdlib functions to implement all the overloads for Load,
  Store etc.
- Added C-like emitter, SPIR-V emitter for handling ByteAddressBuffers.
- Added a new core stdlib function intrinsic to wrap around alignOf&lt;T&gt;().
- Added a new peephole optimization entry to identify the equivalent
  IntLiteral value from the alignOf&lt;T&gt;() inst.
- Added tests to check explicit, and implicit aligned Load and Store
  operations.
</content>
</entry>
</feed>
