<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/tests/vkray/raygen.slang.glsl, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2023-09-05T15:26:59+00:00</updated>
<entry>
<title>SPIR-V image operations (#3163)</title>
<updated>2023-09-05T15:26:59+00:00</updated>
<author>
<name>Ellie Hermaszewska</name>
<email>ellieh@nvidia.com</email>
</author>
<published>2023-09-05T15:26:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=2c2294d3310b24fd73cd41ec51338a736f3a2886'/>
<id>urn:sha1:2c2294d3310b24fd73cd41ec51338a736f3a2886</id>
<content type='text'>
* Add __truncate and __sampledType for spirv_asm

Allows some texture tests to start passing

* add __isVector

Currently unused

* Add 1-vector legalization pass (WIP)

* Add capabilities for image types

* neaten instruction dumping

* add 1-vector test

* Add a couple of cases to vec1 legalization

* Remove texture tests from expected failures

* comment

* regenerate vs projects

* Remove redundant define from synchapi emulation

* refactoring image methods

* All sample functions refactored

* Remove incorrect glsl intrinsics

Partially addresses https://github.com/shader-slang/slang/issues/3174

* __subscript image ops via writing funcs

* Extract texture struct writing from core.meta.slang

* Abstract out cuda intrinsic

* Remove erroneous call to opDecorateIndex

* spirv asm IR utils

* Correct position of loads for SPIR-V asm inst operands

* Raise constructors to global scope during spir-v legalization

* Correct snippet output

* Implement most texture sampling ops for SPIR-V

* Legalize 1-vectors for glsl too

* Make SPIR-V inst operands non-hoistable

* Better 1-vector legalization

* Put textures in ptrs for spirv

* insert missing break

* Add vec1 legalization test

* Add some missing pieces to slang-ir-insts

* Greatly neaten vec1 legalization

* a

* Neaten vec1 legalization

* Add image read and write intrinsics for spir-v

* Squash warnings

* regenerate vs projects

* Drop redundant guards

* Drop 5 tests from expected failure list

* Inst numbering changes to cross compile tests

* vec1 legalization tests only on vk

* Correct location of asm op emit

* Inline constant in spirv-asm

* Correct signedness for lane in wave intrinsics

* Extract element from float1 for cuda

* squash warnings

* Neaten spirv-emit

* dedupe more capabilities

* warnings

* neaten assert

* comments

* comments</content>
</entry>
<entry>
<title>Misc. SPIRV Fixes, Part 2. (#3147)</title>
<updated>2023-08-24T23:32:33+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2023-08-24T23:32:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=0470ea05a42d6c3f35d81a433fefdd440500cdbd'/>
<id>urn:sha1:0470ea05a42d6c3f35d81a433fefdd440500cdbd</id>
<content type='text'>
* Misc. SPIRV Fixes, Part 2.

* Fix up.

* Fix.

* Add system semantic values.

* 16 bit int and floats, matrix/vector reshape, bool ops.

* Fix.

* Fix.

* Allow push constant entry point params.

* entrypoint params.

* swizzleSet and swizzledStore.

* packoffset.

* string hash.

* Fix.

* Matrix arithmetics.

---------

Co-authored-by: Yong He &lt;yhe@nvidia.com&gt;</content>
</entry>
<entry>
<title>Various dxc/fxc compatibility fixes. (#2863)</title>
<updated>2023-05-03T03:29:38+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2023-05-03T03:29:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=d52376a65f37fcbbb67428b917fd3819436b6dfb'/>
<id>urn:sha1:d52376a65f37fcbbb67428b917fd3819436b6dfb</id>
<content type='text'>
* Various dxc/fxc compatibility fixes.

* Cleanup.

* Fix test cases.

* Fix comments.

---------

Co-authored-by: Yong He &lt;yhe@nvidia.com&gt;</content>
</entry>
<entry>
<title>For C-like targets, emit resource declarations before other globals (#2843)</title>
<updated>2023-04-26T19:46:24+00:00</updated>
<author>
<name>Sai Praveen Bangaru</name>
<email>31557731+saipraveenb25@users.noreply.github.com</email>
</author>
<published>2023-04-26T19:46:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=e1940e53c0f76e91a2616693b261beb9190015be'/>
<id>urn:sha1:e1940e53c0f76e91a2616693b261beb9190015be</id>
<content type='text'>
* For C-like targets, emit resource declarations before other globals

* Remove unused tests</content>
</entry>
<entry>
<title>Fix optimization pass not converging. (#2725)</title>
<updated>2023-03-23T23:59:02+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2023-03-23T23:59:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=50e7d9797d9bf4b98a056d5df128c24dde6e78bd'/>
<id>urn:sha1:50e7d9797d9bf4b98a056d5df128c24dde6e78bd</id>
<content type='text'>
* Fix optimization pass not converging.

* Fix.

* Fix tests.

---------

Co-authored-by: Yong He &lt;yhe@nvidia.com&gt;</content>
</entry>
<entry>
<title>More control flow simplifications. (#2673)</title>
<updated>2023-02-24T18:01:47+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2023-02-24T18:01:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=bd6306cdaa4a49344658bd026721b6532e103d09'/>
<id>urn:sha1:bd6306cdaa4a49344658bd026721b6532e103d09</id>
<content type='text'>
* More control flow and Phi param simplifications.

* Fix.

* Fix gcc error.

* Fix.

* More IR cleanup.

* Fix bug in phi param dce + ifelse simplify.

* Propagate and DCE side-effect-free functions.

* Enhance CFG simplification to remove loops with no side effects.

* Fix.

* Fixes.

* Fix tests. Add [__AlwaysFoldIntoUseSite] for rayPayloadLocation.

* More cleanup.

* Fixes.

* Fix.

---------

Co-authored-by: Yong He &lt;yhe@nvidia.com&gt;</content>
</entry>
<entry>
<title>Overhaul global inst deduplication and cpp/cuda backend. (#2654)</title>
<updated>2023-02-16T21:55:32+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2023-02-16T21:55:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=4c4826d47eeef4675daae4ae53ff76f4d5ebd84a'/>
<id>urn:sha1:4c4826d47eeef4675daae4ae53ff76f4d5ebd84a</id>
<content type='text'>
* Overhaul global inst deduplication and cpp/cuda backend.

* Update IR documentation.

---------

Co-authored-by: Yong He &lt;yhe@nvidia.com&gt;</content>
</entry>
<entry>
<title>Work to mitigate SPIR-V bloat (#1914)</title>
<updated>2021-07-21T19:52:08+00:00</updated>
<author>
<name>Theresa Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2021-07-21T19:52:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=23d406f8a3b325f91fecd9ad52bd510ded5f49a7'/>
<id>urn:sha1:23d406f8a3b325f91fecd9ad52bd510ded5f49a7</id>
<content type='text'>
* Work to mitigate SPIR-V bloat

SPIR-V is not an especially compact format, but some patterns in how Slang generates code and then runs it through `spirv-opt` lead to many redundant field-by-field copy operations being emitted. This change attempts to address some of the resulting bloat from the Slang side of things.

Note: experimentation shows that the bloat is less pronounced when running either *no* SPIR-V optimizations or *full* SPIR-V optimizations, so it is also likely that the bloat should be addressed by changing which `spirv-opt` passes the Slang compiler runs in default (`-O1`) builds. Such changes should come as a distinct pull request.

This change primarily does two things:

First, the code generation strategy for passing arguments to `out` and `inout` parameters has been changed. In the past, the compiler would *always* copy the argument value into a temporary, then pass the address of the temporary, and then write back the value after the call. The new code generation strategy attempts to identify when an argument value already has a simple address in memory and passes that address directly when possible. This eliminates many copy operations that occur before/after calls to functions with `out`/`inout` parameters.

Second, we introduce an IR optimization pass that detects call sites where the entire contents of a buffer (usually a constant buffer) is being passed to a callee function, such that many bytes are loaded and then passed even if only very few are used in the callee. The pass moves the load operations from the caller to a specialized version of the callee where possible (e.g., when the constant buffer in question is a global shader parameter). Doing this eliminates another major category of copies.

Notes:

* The IR lowering logic is complicated by the fact that several kinds of l-values (values that are usable as the destination of assignment, or for `out`/`inout` arguments) are not actually addressable. An easy example is a non-contiguous swizzle like `v.xwz` on a `float4`, where the value occupies 12 bytes, but not 12 consecutive bytes with a single address. There are many more corner cases like that and the IR lowering pass carries a lot of complexity to deal with them. A more systematic overhaul is due some time soon.

* The IR representation of `out` and `inout` parameters deserves some careful scrutiny when making these kinds of changes. The official semantics of `inout` in HLSL has been "copy in copy out" (and `out` is just "copy out") which is observably different from any solution that passes in the address of an l-value directly. By making this change we are saying that Slang's semantics are not precisely those of legacy HLSL, and that our semantics for `inout` parameters are closer to those of `inout` in Swift or of a mutable borrow in Rust. In the Swift case the implementation can freely pass the underlying storage of an l-value or the address of a temporary, and valid programs may not observe the difference. It is thus illegal to observe the value in a storage location while a mutation to that location is "in flight." All of this is way more detailed and technical than 99% of Slang users will ever care about, but importantly it gives us semantic cover to eliminate these copies in the IR, and also to emit output C++ code that implements `out` and `inout` as by-reference parameter passing.

* There was an existing generic pass for specializing functions based on call sites that uses a "template method" style of pattern to customize its behavior. That pass needed to be generalized to handle this use case because it had previously operated on the assumption that the "desire" to specialize a callee function must be driven by the parameter declarations of that function, and not on the argument values passed in. The code has been slightly refactored to allow the policy for specialization to consider both parameters and arguments.

* Unsurprisingly, a bunch of the GLSL (and thus SPIR-V) generated has changed with this work, so several baseline `.slang.glsl` files needed to be updated.

* This change is incomplete in that it does not address broader cases of buffer loads, including both partial loads from constant buffers (just loading one field, but a field that uses a "large" structure type), and loads from multi-element buffers (a load from a structured buffer where the element type is "large"). The main question in each of those cases is how to define how "large" a structure needs to be before we decide to try and sink loads into callee functions like this. In the worst case, sinking loads in this way may actually create *more* memory traffic (because the same values get loaded in multiple callee functions).

* fixup: run premake

* fixup: typo</content>
</entry>
<entry>
<title>Use "capability" system to select VKRT extension (#1647)</title>
<updated>2021-01-05T17:00:00+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2021-01-05T17:00:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=b4f94629f225b837e7102acc337610c5d4d8a7c1'/>
<id>urn:sha1:b4f94629f225b837e7102acc337610c5d4d8a7c1</id>
<content type='text'>
* Use "capability" system to select VKRT extension

Slang currently supports translation of ray tracing shader code to Vulkan GLSL code that uses the `GL_NV_ray_tracing` extension. A multi-vendor equivalent of that extension has been released as `GL_EXT_ray_tracing` and we want Slang to support that extension as well.

At the simplest, making the change from one extension to the other is just a matter of changing a few strings, since it does not appear that anything of significance was changed at the GLSL level (or even in SPIR-V). Where this gets trickier is when we have users who want us to support *both* extensions, and to be able to switch between them.

The solution we've implemented here more or less amounts to:

* If you don't tell the compiler which extension to use, it will default to `GL_EXT_ray_tracing` (the newer multi-vendor one).

* If you explicitly want the older extension, you can opt into it using the `-profile` option or via a new API for explicitly adding capabilities to your target.

Making that work required a few different kinds of changes:

* The options parsing and public API needed ways to add optional capabilities to a target.

* During GLSL code emit, we can check the capabilities that were added to the target to see if the `GL_NV_ray_tracing` extension was explicitly enabled and, if not, default to using the `GL_EXT_ray_tracing` names for things. This step is needed because some of the modifiers/attributes involved in the extension have to be handled explicitly in the code generator rather than implicitly as part of mapping intrinsic functions.

* We add two different translations to the relevant operations in the stdlib, one marked with each of the extensions. If profile/capability-based overload resolution can be relied on to pick the right one, this should Just Work.

* Next, a bunch of work had to go into making capability-based overloading Just Work for the purposes of this change. There's been a nearly complete reworking of the implementation of `CapabilitySet` here to make it more suitable for our needs.

* The tests that were using ray tracing translation for Vulkan needed to be updated. For some of them I updated their baselines to use `GL_EXT_ray_tracing` so that they can test the new path. For others, I updated the command line for the test case so that it explicitly opts into using `GL_NV_ray_tracing`. The result is that we have some coverage of each extension. I would have liked to have each test run in both modes, but our pass-through glslang support doesn't support `-D` options, so I couldn't take that step easily.

This change does *not* add support for `GL_EXT_ray_query`, the extension that supports "DXR 1.1" style queries under Vulkan. Adding support for that extension should hopefully be a smaller step because it doesn't have the same multiple-extensions issue.

This change does *not* address a lot of possible avenues for improvement or cleanup around the capability system. It focuses only on those changes that are necessary to make the ray tracing feature work and leaves the rest for future work.

* fixup: infinite loop

* Comment-only change to retrigger TC build</content>
</entry>
<entry>
<title>Run SSA pass to clean up temporary variables during generics lowering. (#1447)</title>
<updated>2020-07-23T20:47:12+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2020-07-23T20:47:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=fed4292a581364b611a82a0f6c1c1c95f82dfeb2'/>
<id>urn:sha1:fed4292a581364b611a82a0f6c1c1c95f82dfeb2</id>
<content type='text'>
* Run SSA pass to clean up generic temporary variables during lowering.

* Fix `undefined` emitting logic.

* revert dumpir control flag

* Defer fold decision of `undefined` values after special case logic for GLSL and HLSL.

* Update expected test result.

* Manually update raygen.slang.glsl to minimize change.

* fix formatting

Co-authored-by: Tim Foley &lt;tfoleyNV@users.noreply.github.com&gt;</content>
</entry>
</feed>
