| Commit message (Collapse) | Author | Age |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Fix 7441: CUDA boolean vector layout to use 1-byte elements
Boolean vectors (bool1, bool2, bool3, bool4) were incorrectly implemented
as integer-based types using 4 bytes per element instead of actual 1-byte
boolean elements on CUDA targets.
Changes:
- Update CUDA prelude to define boolean vectors as structs with bool fields
instead of typedef aliases to integer vectors
- Implement CUDALayoutRulesImpl::GetVectorLayout to use 1-byte alignment
for boolean vectors, matching actual CUDA memory layout behavior
- Update make_bool functions to populate struct fields correctly
This ensures boolean vectors have the same memory layout as bool[4] arrays:
- bool1: 1 byte (was 4 bytes)
- bool2: 2 bytes (was 8 bytes)
- bool3: 3 bytes (was 12 bytes)
- bool4: 4 bytes (was 16 bytes)
Fixes memory layout mismatch between Slang reflection API and actual
CUDA compilation, achieving 75% memory savings for boolean vector usage.
* Fix CI issues -
Add and update associated functions and operators
* Make boolX same as uchar
* Use align construct on struct for boolX
* Improve Test case for robust alignment checks
* Formatting
* Disable selected slangpy tests
* add metal check which is slightly different than cuda
* Test-1
* Test-2
* Test-3
* Test-4
* ReflectionChange
* cleanup and update
* _slang_select with plain bool is needed for reverse-loop-checkpoint-test
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
* Add inner texture type to reflection json
* Add expected result of test
* Adjust test expected results
* Fix ci test result
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change adds some new entry points to the reflection API to cover corner cases that a majority of applications won't care about.
These are most likely to come up for users who want to make a complete copy of the Slang reflection information into a data format of their own design.
All of the information is stuff that we already computed as part of layout, and just hadn't exposed:
* Alignment information for type layouts. This is only useful for ordinary/uniform data; in all other cases alignment is always one. Even for uniform/ordinary data, it is unlikely that any application would actually make use of it.
* Layout information for the result of an entry point function. This would be useful for applications that need to enumerate the varying outputs (user- or system-defined) of a shader. Having information available for `out` parameters but not the function result was inconsistent.
* The "element type" of a parameter block type (e.g., going from `ParameterBlock<X>` to `X`). This seems to have been an oversight since `ConstantBuffer<X>` appears to have been implemented, and the case for a type *layout* was handled.
* The "container" variable layout for a parameter block or constant buffer. It took a while for us to arrive at the current representation of layout for parameter groups, and most client code continues to use the original API that requires us to generated kludged "do what I mean" data. However, if we don't expose the more useful new representation fully, there is no way for users to take advantage of it!
The reflection test tool has been updated to print the new information where it makes sense, which provides us some level of coverage for the new code.
Unfortunately, this led to some cascading changes:
* First, a bunch of the tests had their output changed since they include new information. That's the easy bit.
* Next, the "container" and "element" var layouts don't actually have names (because there is no actual variable underlying them), which means that the code to emit variable names in the JSON dump needed to be condition.
* Making the `"name"` output conditional messed up a lot of the delicate logic that had been dealing with when to emit commas for the output JSON (JSON uses commas as separators, and doesn't allow trailing commas). I added a bit of new infrastructure to make it simple(-ish) to track when a comma actually needs to be output.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Rework command-line options handling for entry points and targets
Overview:
* The biggest functionality change is that the implicit ordering constraints when multiple `-entry` options are reversed: any `-stage` option affects the `-entry` to its *left* instead of to its *right* as it used to. This is technically a breaking change, but I expect most users aren't using this feature.
* The options parsing tries to handle profile versions and stages as distinct data (rather than using the combined `Profile` type all over), and treats a `-profile` option that specifies both a profile version and a stage (e.g., `-profile ps_5_0`) as if it were sugar for both a `-profile` and a `-stage` (e.g., `-profile sm_5_0 -stage fragment`).
* We now technically handle multiple `-target` options in one invocation of `-slangc`, but do not advertise that fact in the documentation because it might be confusing for users. Similar to the relationship between `-stage` and `-entry`, any `-profile` option affects the most recent `-target` option unless there is only one `-target`.
* The logic for associating `-o` options with corresponding entry points and targets has been beefed up. The rule is that a `-o` option for a compiled kernel binds to the entry point to its left, unless there is only one entry point (just like for `-stage`). The associated target for a `-o` option is found via a search, however, because otherwise it would be impossible to specify `-o` options for both SPIR-V and DXIL in one pass.
* The handling of output paths for entry points in the internal compiler structures was changed, because previously it could only handle one output path per entry point (even when there are multiple targets). The new logic builds up a per-target mapping from an entry point to its desired output path (if any).
Details:
* Support for formatting profile versions, stages, and compile targets (formats) was added to diagnostic printing, so that we can make better error messages. This is fairly ad hoc, and it would be nice to have all of the string<->enum stuff be more data-driven throughout the codebase.
* Test cases were added for (almost) all of the error conditions in the current options validation. The main one that is missing is around specifying an `-entry` option before any source file when compiling multiple files. This is because the test runner is putting the source file name first on the command line automatically, so we can't reproduce that case.
* Several reflection-related tests now reflect entry points where they didn't before, because the logic for detecting when to infer a default `main` entry point have been made more loose
* On the dxc path, beefed up the handling of mapping from Slang `Profile`s to the coresponding string to use when invoking dxc.
* A bunch of tests cases were in violation of the newly imposed rules, so those needed to be cleaned up.
* There were also a bunch of test cases that had accidentally gotten "disabled" at some point because there were comparing output from `slangc` both with and without a `-pass-through` option, but that meant that any errors in command-line parsing produced the *same* error output in both the Slang and pass-through cases. This change updates `slang-test` to always expect a successful run for these tests, and then manually updates or disables the various test cases that are affected.
* When merging the updated test for matrix layout mode, I found that the new command-line logic was failing to propagate a matrix layout mode passed to `render-test` into the compiler. This was because the `-matrix-layout*` options were implemented as per-target, but the target was being set by API while the option came in via command line (passed through the API). It seems like we want matrix layout mode to be a global option anyway (rather than per-target), so I made that change here.
* Add missing expected output files
* A 64-bit fix
* Remove commented-out code noted in review
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Update glslang version
* Fix build for new glslang
The latest glslang required a few changes to our manual build for their code (because we are *not* taking a dependency on CMake).
* Rebuild project files using premake, which picks up a few files added to glslang, but also a few diffs in Slang's own project files in cases where they were edited manually instead of using premake.
* Fix up the declaration our our device limits (which are inentionally set to *not* limit what code passes through our glslang), because the underlying structure definition in glslang has changed. This is a kludgy bit of glslang's design, but it doesn't make sense for us to invest in a more serious workaround.
* Remove the "hack sampler" workaround
When the `GL_KHR_vulkan_glsl` spec was introduced to allow GLSL to be compiled for Vulkan SPIR-V, it made an annoying mistake by leaving a few builtins as taking `sampler2D`, etc. when the equivalent SPIR-V operations only require a `texture2D`, etc. The relevant builtins are:
* `textureSize`
* `textureQueryLevels`
* `textureSamples`
* `texelFetch`
* `texelFetchOffset`
This means that shader code that wanted to use those operations needed to conspire to have a `sampler` handy so they could write, e.g.:
```glsl
vec4 val = texelFetch(sampler2D(myTexture, someRandomSampler), p, lod);
```
when what they really wanted was this:
```glsl
vec4 val = texelFetch(myTexture, p, lod);
```
That is annoying but probably something each to work around for a GLSL programmer, but when cross-compiling from HLSL, you might have an operation like:
```hlsl
float4 val = myTexure.Load(p);
```
in which case a cross-compiler needs to manufacture a sampler out of thin air. If the shader happened to use a sampler for something else you could snag that, but in the worse case you had to cross-compile to GLSL that declared a new sampler.
Slang did this by declaring a sampler called `SLANG_hack_samplerForTexelFetch` (because `texelFetch` is the operation that first surfaced the issue). For complex reasons we *always* define this sampler, even if we turn out not to need it in a particular output kernel. This choice has a bunch of annoying consequences:
* There is *always* a sampler defined in descriptor set zero, because that's where we put the hack sampler, so a user-defined parameter block always has a set number of 1 or greater (see #646).
* The hack sampler shows up in reflection output because users need to size their descriptor sets appropriately to pass along this sampler that won't actually be used if they don't want to get debug spew from the validation layers.
We filed an issue on glslang about this problem, and eventually some kind folks from the gamedev community (who also saw the same problem) defined an extension spec (`GL_EXT_samplerless_texture_functions`) to fix the underlying issue and contributed a patch to glslang to make it support that extension.
This change just backs the hack out of Slang now that we have a glslang version that supports the extension to get past the defect in the original GLSL-for-Vulkan definition. Besides yanking out the code for the hack, we also change the relevant builtins to declare that they require this new GLSL extension (so that we properly request it from glslang when the builtins are used), and fix some reflection test cases that exposed the existence of the "hack sampler."
* Fixup: syntax error in stdlib generator files
* Remove more code for hack sampler
There was logic to ensure we always have a "default" register space/set when cross-compiling, because the hack sampler would need it. This is no longer necessary once we remove the hack sampler.
* Fix expected test output.
Fixing the root cause of issue #646 means that one of our test cases that tickles that issue now produces different output (luckily it can now be used as a regression test for the issue).
|
| |
|
|
|
|
|
|
|
|
|
| |
- We use this to work around the fact that, e.g., `Texture2D.Load` doesn't take a sampler, but the equivalent GLSL operation `texelFetch` requires one
- Previously we tried to hide the sampler from the user, hoping that glslang would drop it and we could just ignore it, but that doesn't work
- For now we'll go ahead and explicitly show the sampler in the reflection info so that an app can react appropriately
- We also generate a unique binding for the sampler, instead of the old behavior that fixed it with `binding = 0`
- We still fix it with `set = 0`, so it might still surprise users
|
|
|
The tricky bit here was that the `reflection-json` output format isn't really a code generation target like the others, and we need to be able to have multiple "targets" active to make sense of it. This needs cleaning-up.
|