<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/tests/compute/entry-point-uniform-params.slang, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2024-07-19T18:49:42+00:00</updated>
<entry>
<title>Support parameter block in metal shader objects. (#4671)</title>
<updated>2024-07-19T18:49:42+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2024-07-19T18:49:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=f114433debfba67cbe1db239b6e92278d41ed438'/>
<id>urn:sha1:f114433debfba67cbe1db239b6e92278d41ed438</id>
<content type='text'>
* Support parameter block in metal shader objects.

* Ingore parameter block tests on devices without tier2 argument buffer.

* Fix warning.

* Fix texture subscript test.

---------

Co-authored-by: Yong He &lt;yhe@nvidia.com&gt;</content>
</entry>
<entry>
<title>enable more metal tests (#4326)</title>
<updated>2024-06-10T20:28:36+00:00</updated>
<author>
<name>skallweitNV</name>
<email>64953474+skallweitNV@users.noreply.github.com</email>
</author>
<published>2024-06-10T20:28:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=712ce653d4c3d7284dd71389f31540d0da7f144e'/>
<id>urn:sha1:712ce653d4c3d7284dd71389f31540d0da7f144e</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Metal compute tests (#4292)</title>
<updated>2024-06-07T07:28:16+00:00</updated>
<author>
<name>skallweitNV</name>
<email>64953474+skallweitNV@users.noreply.github.com</email>
</author>
<published>2024-06-07T07:28:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=004fe27a52b7952111ad7e749397aeff499de7ed'/>
<id>urn:sha1:004fe27a52b7952111ad7e749397aeff499de7ed</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Warning on lossy implicit casts. (#2367)</title>
<updated>2022-08-18T06:08:34+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2022-08-18T06:08:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=adaea0e993fd8db351b5dad92802e47ed6d0ec77'/>
<id>urn:sha1:adaea0e993fd8db351b5dad92802e47ed6d0ec77</id>
<content type='text'>
* Warning on bool to float conversion.

* Fix test cases.

* Improve.

* LanguageServer: don't show constant value for non constant variables.

* Fix tests.

* Fix warnings in tests.

Co-authored-by: Yong He &lt;yhe@nvidia.com&gt;</content>
</entry>
<entry>
<title>Clean up render-test handling of input (#1766)</title>
<updated>2021-03-25T23:40:17+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2021-03-25T23:40:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=abb020b15c4f1057657e5f8f63c02b15615bf6ec'/>
<id>urn:sha1:abb020b15c4f1057657e5f8f63c02b15615bf6ec</id>
<content type='text'>
The original goal of this change was to streamline the `TEST_INPUT` system by eliminating options that are no longer relevant once we have eliminated the non-shader-object execution paths. The result is more or less a re-implementation/refactor of the logic around how input is parsed and represented, that tries to set things up for a more general sytem going forward.

The main changes isthat the `ShaderInputLayout` no longer tracks a simple flat list of `ShaderInputLayoutEntry` (that is a kind of pseudo-union of the various buffer/texture/value cases), and it instead uses a hierarchical representation composed of `RefObject`-derived classes to represent "values."

There are several "simple" cases of values

* Textures

* Samplers

* Uniform/ordinary data (`uniform`)

* Buffers composed of uniform/ordinary data (`ubuffer`)

Then there are composed/aggregate values that nest other values:

* An *aggregate* value is a set of *fields* which are name/value pairs. It can be used to fill in a structure, for example.

* An *array* value is a list of values for the elements of an array. It can be used to fill out an array-of-textures parameter, for example.

* A combined texture/sampler value is a pair of a texture value and a sampler value (easy enough)

* An *object* holds an optional type name for a shader object to allocate (it defaults to the type that is "under" the current shader cursor when binding), and a nested value that describes how to fill in the contents of that object

Finally there are cases of values that are just syntactic sugar:

* A `cbuffer` is just shorthand for creating an object value with a nested uniform/ordinary data value

The big idea with this recursive structure is that it gives us a way to handle more arbitrary data types with name-based binding. Supporting this new capability requires changes to both how input layouts get parsed, and also how they get bound into shader objects.

On the parsing side, things have been refactored a bit so that parsing isn't a single monolithic routine. The refactor also tries to make it so that the various options on an input item (e.g., the `size=...` option for textures) are only supported on the relevant type of entry (so you can't specify as many useless options that will be ignored).

The bigger change to parsing is that it now supports a hierarchical structure, where certain input elements like `begin_array` can push a new "parent" value onto a stack, and subsequent `TEST_INPUT` lines will be parsed as children of that item until a matching `end` item. This approach means that we can now in principle describe arbitrary hierarchical structures as part of test input without endlessly increasing the complexity of invididual `TEST_INPUT` lines.

On the binding side, we now have a central recursive operation called `assign(ShaderCursor, ShaderInputLayout::ValPtr)` that assigns from a parsed `ShaderInputLayout` value to a particular cursor. That operation can then recurse on the fields/elements/contents of whatever the cursor points to.

Major open directions:

* With this change it is still necessary to use `uniform` entries to set things like individual integers or `float`s and that is a little silly. It would be good to have some streamlines cases for setting individual scalar values.

* Further, once we have a hierarchical representation of the values for `TEST_INPUT` lines, it becomes clear that we really ought to move to a format more like `TEST_INPUT: dstLocation = srcValue;` where `srcValue` is some kind of hierarchial expression grammar. Refactoring things in this way should make the binding logic even more clear and easy to understand. The refactored parser should make parsing hierarchical expressions easier to do in the future (even if it uses the push/pop model for now)

* One detailed note is that the representation of buffers in this change is kind of a compromise. Just as an "object" value is a thin wrapper around a recursively-contained value for its "content" it seems clear that a buffer could be represented as a wrapper around a content value that could include hierarchical aggregates/objects instead of just flat binary data (this would be important for things like a buffer over a structure type that lays out different on different targets). The main problem right now with changing the representation is actually needing to compute the size of a buffer based on its content, so that can/should be addressed in a subsequent change.

Details:

* The base `RenderTestApp` class and the `ShaderObjectRenderTestApp` classes have been merged, since the hierarchy no longer serves any purpose.

* Disabled the tess that rely on `StructuredBuffer&lt;IWhatever&gt;` because they aren't really supported by our current shader object implementation

* Replaced used of `Uniform` and `root_constants` in `TEST_INPUT` lines with just `uniform`

* Removed a bunch of uses of `stride` from `cbuffer` inputs, where it wasn't really correct/meaningful

* Added the `copyBuffer()` operation to VK/D3D renderers, along with some missing `Usage` cases to support it.

* Made `ShaderCursor` handle the logic to look up a name in the entry points of a root shader object, rather than just having that logic in `render-test`. (We probably need to make a clear design choice on this issue)</content>
</entry>
<entry>
<title>Improve Vulkan shader-objects implementation. (#1765)</title>
<updated>2021-03-25T16:41:53+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2021-03-25T16:41:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=e050035e0b7d3f257a46bc1cb644163026cb1b23'/>
<id>urn:sha1:e050035e0b7d3f257a46bc1cb644163026cb1b23</id>
<content type='text'>
* Improve Vulkan shader-objects implementation.

1. Null bindings no longer crashes.
2. No longer copies push constants to staging CPU buffer before setting it into command buffer. The entry-point shader object now directly sets it into command buffer upon `bindObject` call.

* Update comments

* Fix

* Re-enable 3 tests.

Improved vulkan implementation so that each shader object is responsible for creating descriptor sets on-demand.

Fixed slang reflection to correctly report `ParameterBlock` binding.

* Fix gcc compile error.</content>
</entry>
<entry>
<title>Remove old code paths from render-test (#1760)</title>
<updated>2021-03-17T19:55:30+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2021-03-17T19:55:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=6e5d85efb9fa5f647f7f0c7ef784a9fd09b29023'/>
<id>urn:sha1:6e5d85efb9fa5f647f7f0c7ef784a9fd09b29023</id>
<content type='text'>
* Remove old code paths from render-test

Historically, the `render-test` tool was using three different code paths:

* One based on `gfx` and manual (non-reflection-based) parameter setting, used for OpenGL, D3D11, D3D12, and Vulkan
* One for CPU that used reflection-based parameter setting but shared no code with the first
* One for CUDA that used reflection-based parameter setting and shared some, but not all, code with the CPU path

Recently we've updated `render-test` to include a fourth option:

* Using `gfx` and the "shader object" system it exposes for a unified reflection-based parameter-setting system taht works across OpenGL, D3D11, D3D12, Vulkan, CUDA, and CPU

This change removes the first three options and leaves only the single unified path. A sa result, a bunch of code in `render-test` is no longer needed, and the codebase no longer relies on things like the `IDescriptorSet`-related APIs in `gfx`.

Several existing tests had to be disabled to make this change possible. Those tests will need to be audited and either re-enabled once we fix issues in the shader object system, or permanently removed if they don't test stuff we intend to support in the long run (e.g., global-scope type parameters, which aren't a clear necessity).

* fixup: CUDA detection logic</content>
</entry>
<entry>
<title>Implements CUDA renderer in gfx. (#1637)</title>
<updated>2020-12-11T17:42:23+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2020-12-11T17:42:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=856d7d3705cedabcc2e9389a3f0ac730b0e33476'/>
<id>urn:sha1:856d7d3705cedabcc2e9389a3f0ac730b0e33476</id>
<content type='text'>
* Implements CUDA renderer in gfx.

* Revert unnecessary change.

* Revert unnecessary changes.

Co-authored-by: Tim Foley &lt;tfoleyNV@users.noreply.github.com&gt;</content>
</entry>
<entry>
<title>Change the policy for entry-point uniform parameters on Vulkan (#1476)</title>
<updated>2020-08-05T18:47:18+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2020-08-05T18:47:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=e713b56a63dcbf945e3e0e6d82666318795c74ff'/>
<id>urn:sha1:e713b56a63dcbf945e3e0e6d82666318795c74ff</id>
<content type='text'>
Entry point `uniform` parameters were a feature of the original Cg and HLSL, but have not been used much in production shader code. One of our goals on Slang is to reduce the (ab)use of the global scope, so bringing entry point `uniform` parameters up to a greater level of usability is an important goal.

Some policy choices about how global vs. entry-point `uniform` parameters behave have already been made, that shape decisions looking forward:

* For DXBC/DXIL, it makes the most sense to follow the lead of fxc/dxc, by treating entry point `uniform` parameters as a kind of syntax sugar for global shader parameters. Any parameters of "ordinary" types are bundles up into an implicit constant buffer, and all the resources (including the implicit constant buffer) are assigned `register`s just as for globals. It is up to the application to decide how to bind those parameters via a root signature (using root descriptors, root constants, descriptor tables, local vs. global root signature, etc.)

* For CPU, it makes sense to pass global vs. entry-point parameters as two different pointers, although the details of what we do for CPU are the least constrained across all current targets.

* For CUDA compute, it makes the most sense to map global shader parameters to `__constant__` global data, and entry-point `uniform` parameters to kernel parameters. This choice ensures that the signature of a kernel when translated from Slang-&gt;CUDA follows the Principle of Least Surprise, at the cost of making entry-point vs. global parameters be passed via different mechanisms.

* For OptiX ray tracing, it makes sense to expand on the precedent from CUDA compute: pass global parameters via global `__constant__` data (as is already expected by OptiX for whole-launch parameters), and pass entry-point `uniform` parameters via the "shader record." This establishes a precedent that for ray-tracing shaders, global-scope parameters map to the "global root signature" concept from DXR, while entry-point `uniform` parameters map to a "local root signature" or "shader record."

* For Vulkan ray tracing, the precedent from OptiX then argues that entry-point `uniform` parameters should map to the Vulkan "shader record" concept (and thus cannot support things like resource types).

* The remaining interesting case is what to do for non-ray-tracing shaders on Vulkan.

The dev team agrees that the most reasonable choice to make for non-ray-tracing Vulkan shaders is to map entry-point `uniform` parameters to "push constants." In particular, this makes it easy to express the case of a compute kernel with direct parameters of ordinary/value types in the way that will be implemented most efficiently.

The big picture is then that a kernel like:

```hlsl
void computeMain(uniform float someValue) { ... }
```

will map to output GLSL like:

```glsl
layout(push_constant)
uniform
{
    float someValue;
} U;

void main() { ... }
```

If the user really wanted a constant-buffer binding to be created instead, they can easily change their input to make the buffer explicit:

```hlsl
struct Params { float someValue; }
void computeMain(uniform ConstantBuffer&lt;Params&gt; params) { ... }
```

(Forcing the user to be explicit about the desire for a buffer here creates a nice symmetry between Vulkan and CUDA; in the first case the user sets up the data in host memory and passes it to the GPU by copy, while in the second case the user must allocate and set up a device-memory buffer for the data. This symmetry extends to D3D if the application chooses to map entry-point `uniform` parameters to root constants.)

This change implements logic in the "parameter binding" part of the Slang compiler to make sure that entry-point `uniform` parameters are wrapped up in a push-constant buffer rather than an ordinary constant buffer for non-ray-tracing shaders on Vulkan (and in a shader record "buffer" for the ray-tracing case).

The majority of the actual work was in adding support for root/push constants to the test framework and the graphics API abstraction it uses. To be clear about that support:

* Root constant ranges are (perhaps confusingly) treated as a new kind of "slot" that can appear on a descriptor set. This choice ensures that the implicit numbering of registers/spaces used by the back-ends can account for these ranges correctly.

* The `TEST_INPUT` lines are extended to allow a `root_constants` case that behaves more or less like `cbuffer`

* The CPU and CUDA paths can treat a `root_constants` input identically to a `cbuffer`. They already allocate the actual buffers based on reflection, and just use `cbuffer` as a directive that causes bytes to be copied in.

* On D3D12 and Vulkan, a descriptor set allocates a `List&lt;char&gt;` to hold the bytes of root constant data assigned into it, and these bytes are flushed to the command list when the table is actually bound (usually right before rendering).

* On D3D11, a descriptor set treats a root constant range more or less like a constant buffer range (with a single buffer), except that it also automatically allocates a buffer to hold the data. Assigning "root constant" data automatically copies it into that buffer.

The small number of tests that used entry-point `uniform` parameters of ordinary types were updated to use the new `root_constant` input type, and the bugs that surfaced were fixed.

A new test to confirm that entry-point `uniform` parameters map to the shader record for VK ray tracing was added.

An important but technically unrelated change is the removal of the `DescriptorSetImpl::Binding` type and related function from the Vulkan implementation of `Renderer`. That type was created to ensure that objects that are bound into a descriptor set don't get released while the descriptor set is still alive, but the implementation relied on a complicated linear search to check for existing bindings, which could create a performance issue for descriptor sets that include large arrays of descriptors. The new implementation makes use of the approach already present in the various `Renderer` implementations (including the Vulkan one) for assigning ranges in a descriptor set a flat/linear index for where their pertinent data is to be bound. As a result, the Vulkan `DescriptorSetImpl` now uses a single flat array of `RefPtr`s to track bound objects, and has no need for linear search when binding.

Co-authored-by: Yong He &lt;yonghe@outlook.com&gt;</content>
</entry>
<entry>
<title>Fixes to make all CPU compute shaders work on CUDA (#1211)</title>
<updated>2020-02-08T16:19:31+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2020-02-08T16:19:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=0eed0125fa5e5f425d546efdc2b284b09ffc2785'/>
<id>urn:sha1:0eed0125fa5e5f425d546efdc2b284b09ffc2785</id>
<content type='text'>
* Launch CUDA test taking into account dispatch size.

* Enable isCPUOnly hack to work on CUDA.

* Rename 'isCPUOnly' hack to 'onlyCPULikeBinding'.

* Add $T special type.
Support SampleLevel on CUDA.

* Fix typo.
</content>
</entry>
</feed>
