|
When a `SIMPLE` type test is used with the `-g[1-3]` option, the filecheck
pattern will most likely match the string in the embedded source code
rather than the emitted spirv-asm code.
This commit avoids the problem by removing the embedded source code.
It also provides an option, `-preserve-embedded-source`, to keep the
embedded source code.
The source code removal happens in two steps:
1. Iterate over all output lines and find SPIRV-ASM matching the pattern
`%N = OpExtInst %void %M DebugSource %fileId %sourceId`, then store the
`%sourceId` value to identify which SPIRV instructions hold the embedded
source code.
2. Iterate over all output lines again to find `%sourceId = OpString
"...."` and replace the whole string with the following:
```
%1 = OpString "// slang-test removed the embedded source // Use `-preserve-embedded-source` to keep it explicitly "
```
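As a rough illustration of the two passes described above, here is a minimal Python sketch. It is not the slang-test implementation; the function name `strip_embedded_source`, both regular expressions, and the choice to keep the original `%sourceId` on the rewritten line are assumptions made for this example. It works on the whole text rather than line by line so that a multi-line string literal is replaced as one unit.

```python
import re

# Pass 1 pattern: `%N = OpExtInst %void %M DebugSource %fileId %sourceId`.
DEBUG_SOURCE = re.compile(
    r"^[ \t]*%\w+ = OpExtInst %void %\w+ DebugSource %\w+ (%\w+)[ \t]*$",
    re.MULTILINE,
)
# Pass 2 pattern: `%id = OpString "..."`; the literal may span several lines
# and may contain backslash-escaped characters.
OP_STRING = re.compile(
    r'^([ \t]*)(%\w+) = OpString "(?:[^"\\]|\\.)*"',
    re.MULTILINE | re.DOTALL,
)

PLACEHOLDER = ('"// slang-test removed the embedded source // Use '
               '`-preserve-embedded-source` to keep it explicitly "')


def strip_embedded_source(asm: str) -> str:
    # Pass 1: collect every %sourceId referenced by a DebugSource instruction.
    source_ids = set(DEBUG_SOURCE.findall(asm))

    # Pass 2: rewrite only the OpString instructions that define those ids.
    def replace(match):
        indent, result_id = match.group(1), match.group(2)
        if result_id in source_ids:
            return f"{indent}{result_id} = OpString {PLACEHOLDER}"
        return match.group(0)

    return OP_STRING.sub(replace, asm)
```

Strings that are not referenced as a `%sourceId` (for example the file name string) are left untouched, so filecheck patterns can still match them.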
This change revealed problems in the existing tests:
- tests/bugs/spirv-debug-info.slang : The expected text was missing and
had to be added. The file also had carriage-return characters on all
lines, and the pre-commit git hook removed them.
- tests/spirv/debug-info.slang : The expected keyword DebugValue had to
change to DebugDeclare, because that's what we get with ToT.
- tests/spirv/debug-value-dynamic-index.slang : This test is currently
failing; it will pass once "DebugLocalVariable instruction missing for
parameter of the entry point function" (#7693) is resolved.
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
|
|
Emits the appropriate OpCapability for 8- and 16-bit type usage:
- UniformAndStorageBuffer8BitAccess: for 8-bit types in
SpvStorageClassUniform and SpvStorageClassStorageBuffer
- UniformAndStorageBuffer16BitAccess: for 16-bit types in
SpvStorageClassUniform and SpvStorageClassStorageBuffer
- StoragePushConstant8: for 8-bit types in SpvStorageClassPushConstant
- StoragePushConstant16: for 16-bit types in SpvStorageClassPushConstant
- StorageInputOutput16: for 16-bit types in SpvStorageClassInput and
SpvStorageClassOutput
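The list above amounts to a lookup from (storage class, bit width) to a capability name. A minimal Python sketch of that lookup follows; the table and the function name `required_capability` are illustrative only and do not reflect how the SPIR-V emitter is actually structured. Storage classes are written without the `SpvStorageClass` prefix.

```python
from typing import Optional

# (storage class, bit width) -> capability, following the list in this message.
_CAPABILITIES = {
    ("Uniform", 8): "UniformAndStorageBuffer8BitAccess",
    ("StorageBuffer", 8): "UniformAndStorageBuffer8BitAccess",
    ("Uniform", 16): "UniformAndStorageBuffer16BitAccess",
    ("StorageBuffer", 16): "UniformAndStorageBuffer16BitAccess",
    ("PushConstant", 8): "StoragePushConstant8",
    ("PushConstant", 16): "StoragePushConstant16",
    ("Input", 16): "StorageInputOutput16",
    ("Output", 16): "StorageInputOutput16",
}


def required_capability(storage_class: str, bit_width: int) -> Optional[str]:
    """Return the capability for this combination, or None if none is listed."""
    return _CAPABILITIES.get((storage_class, bit_width))
```

A result of `None` only means that this message lists no extra capability for that combination.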
Generated with Claude Code, with revisions.
Fixes #7879.
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: James Helferty (NVIDIA) <jhelferty-nv@users.noreply.github.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
|
fixes https://github.com/shader-slang/slang/issues/8271
This PR does the following:
- Fails slang-test when there are VVL error messages.
- Captures VVL errors for `gfx-unit-test-tool/`, which were not captured
properly by the debug callback.
- Sets an environment variable,
`VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation`, for CI and the
Visual Studio project setup.
- Ignores the VVL error about a NullHandle being used for the acceleration
structure; a fix is at ToT of VVL but is not yet available in a release
build.
- Fixes the VVL error complaining that varying inputs are not provided
for the tests `gfx-unit-test-tool/linkTimeTypeLayout.internal` and
`gfx-unit-test-tool/linkTimeTypeLayoutNested.internal`.
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
|
(#8223)
The Metal backend was generating incorrect type names for 8-bit vector
types, causing compilation failures when targeting Metal. According to
the Metal specification, 8-bit vector types should be named `charN` and
`ucharN` (e.g., `char2`, `uchar3`) rather than `int8_tN` and `uint8_tN`.
## Problem
When compiling Slang code with 8-bit vector types for Metal, the
compiler would emit:
```metal
uint8_t2 _S8 = uint8_t2(uint8_t(0U), uint8_t(16U));
int8_t3 _S9 = int8_t3(int8_t(0), int8_t(16), int8_t(48));
```
But the Metal compiler expects:
```metal
uchar2 _S8 = uchar2(uint8_t(0U), uint8_t(16U));
char3 _S9 = char3(int8_t(0), int8_t(16), int8_t(48));
```
This caused errors like:
```
error: unknown type name 'uint8_t2'; did you mean 'uint8_t'?
```
## Solution
Modified `MetalSourceEmitter::emitSimpleTypeImpl()` to emit the correct
Metal-specific type names for 8-bit types:
- `kIROp_Int8Type` now emits `char` instead of `int8_t`
- `kIROp_UInt8Type` now emits `uchar` instead of `uint8_t`
This change only affects the Metal backend and ensures that vector types
like `int8_t2`, `uint8_t3`, etc. are correctly emitted as `char2`,
`uchar3`, etc.
## Testing
- Added a new test case `tests/metal/8bit-vector-types.slang` to verify
the fix
- Re-enabled the previously disabled Metal test in
`tests/hlsl-intrinsic/countbits8.slang`
- Updated `tests/metal/byte-address-buffer.slang` to expect the correct
type names
- Verified that existing Metal tests continue to pass
Fixes #8211.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: bmillsNV <163073245+bmillsNV@users.noreply.github.com>
|
|
(#7929)
Fixes the Slang compiler internal error "subscript had no getter" when
reading from mesh shader output index arrays (e.g., `triangles[0].x`).
## Problem
The `OutputIndices` struct was missing a `ref` accessor in its
`__subscript` implementation, causing the compiler to fail when trying
to materialize subscript expressions as r-values.
## Solution
Added the missing `ref` accessor to `OutputIndices.__subscript` using
the `kIROp_MeshOutputRef` intrinsic operation, matching the pattern used
in `OutputVertices` and `OutputPrimitives`.
## Files Changed
- `source/slang/core.meta.slang` - Added missing `ref` accessor
- `tests/bugs/gh-7925.slang` - Test case to reproduce and verify the fix
Fixes #7925
Generated with [Claude Code](https://claude.ai/code)
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Lujin Wang <lujinwangnv@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
|
`emitReflectionVarLayoutJSON` will output the `userAttribs` section
twice as it gets output by `emitReflectionModifierInfoJSON` first before
being output again by a direct call to `emitUserAttributes`.
It seems the answer here is to just remove the extra explicit call to
`emitUserAttributes` and rely on the call in
`emitReflectionModifierInfoJSON`?
|
|
Closes #8112. ~~The issue asks for a "C layout", but in this PR I use
the term "CPU layout" because this naming was pre-existing in the
codebase as `kCPULayoutRulesImpl_`. The primary purpose of this layout
is to match CPU-side struct definitions with the shader side. I'm open
to better naming suggestions, though.~~
Edit: switched back to using `CDataLayout` & `-fvk-use-c-layout`, as the
CPU target depends on the object layout rules of existing CPU layout
rules, but they're incompatible with actual shaders. So a new
`kCLayoutRulesImpl_` was needed anyway.
---------
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
|
- Adds semantic SV_VulkanSamplePosition that emits corresponding
gl_SamplePosition and SpvBuiltinSamplePosition
- Adds gl_SamplePosition property to glsl.meta.slang
- Adds SPIRV and GLSL tests for the semantic and property
- Plan is to later implement SV_SamplePosition that follows the HLSL range
of -0.5 to +0.5 and emits GetRenderTargetSamplePosition(SV_SampleIndex),
which needs more complicated IR manipulation for HLSL and Metal
Fixes #7906
---------
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
|
|
Fixes #8185. The previous implementation is incorrect and basically only
works in the `x = 0` case. `delta` was the smallest possible positive
value representable as a float, but that's below the rounding error of
addition with almost all reasonably sized floats.
This fixed implementation is based on bit twiddling instead. I've
checked the float case against the C++ `nextafterf` with both a -inf ->
inf and inf -> -inf sweep, in addition to the test included in this PR.
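The new code is not shown in this message. As a hedged illustration of the bit-twiddling idea for 32-bit floats, here is a minimal Python sketch; the helper names are invented for this example, and it assumes the inputs are already exactly representable as float32. It also shows why adding a tiny `delta` cannot work: in float32, `1.0 + 2**-149` rounds straight back to `1.0`, whereas stepping the bit pattern by one always lands on the neighbouring value.

```python
import math
import struct


def float_to_bits(x: float) -> int:
    """Reinterpret a float32 value as its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def next_after_f32(x: float, toward: float) -> float:
    if math.isnan(x) or math.isnan(toward):
        return math.nan
    if x == toward:
        return toward
    if x == 0.0:
        # Step off zero onto the smallest subnormal, signed by the direction.
        return bits_to_float(0x00000001 if toward > 0.0 else 0x80000001)
    bits = float_to_bits(x)
    # For a fixed sign, a larger bit pattern means a larger magnitude, so
    # moving away from zero is +1 and moving toward zero is -1.
    moving_away_from_zero = (x < toward) == (x > 0.0)
    return bits_to_float(bits + 1 if moving_away_from_zero else bits - 1)
```

The same increment also carries correctly across exponent boundaries and from the largest finite value to infinity, because adjacent float32 values of the same sign have adjacent bit patterns.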
|
|
## Summary
- Add Metal platform support for `WaveGetActiveMask()` and
`WaveActiveCountBits()` wave intrinsics
- Update capability requirements to include Metal platform for subgroup
ballot operations
- Implement Metal-specific intrinsic assembly using `simd_ballot()` and
`simd_vote` APIs
## Changes
- **source/slang/hlsl.meta.slang**:
- Add Metal target case for `WaveGetActiveMask()` using
`simd_ballot(true)`
- Update capability requirements from `cuda_glsl_hlsl_spirv` to
`cuda_glsl_hlsl_metal_spirv` for wave ballot functions
- **source/slang/slang-capabilities.capdef**:
- Add `metal` to `subgroup_ballot_activemask` capability alias
|
|
Enable CUDA support for batch 3 tests
- Enhanced wave operations with exclusive support
- Added proper identity values for min/max operations
- Fixed intrinsic name mapping issues
- Updated test configurations
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
|
In Metal, if a `ParameterBlock` contains a `DescriptorHandle` directly, it
is emitted as a DescriptorHandle literal, which is not valid Metal
code.
This fix adds a case for `kIROp_DescriptorHandleType` and directs it to
the parent's `emitType` function to handle it.
|
|
Close #8054.
The detailed root cause is described at:
https://github.com/shader-slang/slang/issues/8054#issuecomment-3189579508
|
|
Close #8090.
When we coerce types, we use a cache to store the conversion cost
between different types. The key of the cache is defined by
    struct BasicTypeKey
    {
        uint32_t baseType : 8;
        uint32_t dim1 : 4;
        uint32_t dim2 : 4;
        ...
    }
where dim1 and dim2 are used for the dimensions of vectors and matrices.
However, each dim is only 4 bits, so `vector<int, 16>` will have the same
key as `int`, which is wrong.
Fix the issue by extending each dim to 8 bits.
To keep the hash key within 32 bits, we also adjust baseType to 5 bits
and knownConstantBitCount to 6 bits.
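The collision can be reproduced with plain bit masking. The Python sketch below models only the fields named in this message; the helper names, the `INT` tag value, and the assumption that a scalar is encoded with a dimension of 0 are all hypothetical. Storing 16 in a 4-bit unsigned bit-field keeps `16 & 0xF == 0`, which is indistinguishable from the scalar case.

```python
def truncate(value: int, bits: int) -> int:
    """Keep only the low `bits` bits, as an unsigned bit-field of that width does."""
    return value & ((1 << bits) - 1)


def cache_key(base_type: int, dim1: int, dim2: int, dim_bits: int) -> tuple:
    """Build a simplified key from the named fields only."""
    return (base_type, truncate(dim1, dim_bits), truncate(dim2, dim_bits))


INT = 1  # hypothetical tag for the `int` base type

# 4-bit dims: vector<int, 16> collides with scalar int (assumed dim 0).
collides_with_4_bits = cache_key(INT, 16, 0, 4) == cache_key(INT, 0, 0, 4)
# 8-bit dims: the two keys are distinct.
collides_with_8_bits = cache_key(INT, 16, 0, 8) == cache_key(INT, 0, 0, 8)
```

With 8-bit dimensions the two named dims plus the 5-bit baseType and 6-bit knownConstantBitCount use 27 bits, leaving the remaining bits for the fields elided above.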
---------
Co-authored-by: kaizhangNV <kazhang@nvidia.com>
|
|
|
|
Metal's popcount prototype is `T popcount(T x)` but we want to use it to
implement `countbits` where the prototype always returns `uint`.
Using `popcount` directly would implicitly cast successfully to the
32-bit return value in all cases except when the argument is a 64-bit
type. Thus, this change always explicitly casts the result to `$TR`,
which should be one of the `uint[N]` types, and should always be able to
hold the number of bits in the type.
Addresses #6877
|
|
logic (#8168)
Fixes: #8167
The current emitting logic does not work; this has been corrected.
The provided test ensures our CUDA code is valid by compiling PTX from
it.
`m_writer->emit("OptixTraversableHandle");` should be `out <<`, since
`out` adds to the type-name cache; otherwise, using a type twice produces
bad type names (because the type-name cache was filled with "" instead of
"typeName").
|
|
Fixes #7011
|
|
AST-parent has target+set the AST-child does not (#8175)
Fixes: #8174
Changes:
* To determine if we propagate capabilities, we need to ensure that a
`join` will do nothing (optimization since `join` is expensive + caching
data for the `join` adds up to be expensive). This logic was changed in
`slang-check-decl.cpp` since the current logic was incorrect.
* A parent could have the set `metal+glsl` and the use-site could have
`glsl`. In this case, we will not remove `metal` from the parent since
`{metal+glsl}.implies({glsl})` is true.
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
|
Fixes #6785
|
|
When we legalize the entry point param, there are cases where we need to
reconstruct a struct for the parameter and the original struct wouldn't
be used. But if the user tries to use the original struct as a type for
a function parameter, we will end up using both the original struct and
the synthesized struct at the same time.
On Metal and WGSL, it causes an error when an identical semantic is used
on more than one variable.
This commit removes the semantics from the original struct after cloning
the type.
Fixes https://github.com/shader-slang/slang/issues/8141
Related to https://github.com/shader-slang/slang/issues/7693
---------
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
|
|
Enable CUDA for the tests listed in issue #8078
This requires a minor CUDA prelude change, adding some math functions.
|
|
Co-authored-by: Jay Kwak <82421531+jkwak-work@users.noreply.github.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
|
Fixes #8116
---------
Co-authored-by: Jay Kwak <82421531+jkwak-work@users.noreply.github.com>
|
|
What is fixed:
Since we make `vk` tests compile as GLSL, we must use a capability that
specifies a GLSL equivalent.
If we do not do this, we will get an error since we are not specifying
GLSL capabilities (but are specifying a profile).
Since a profile is specified, we emit capability errors.
|
|
Fixes: #7410
Changes:
1. super-type capabilities must be a super-set of sub-type capabilities
(and support the same shader stages/targets)
* InheritanceDecl visits super-type to inherit its capabilities;
validate InheritanceDecl capabilities against sub-type
* visit all container decls with a default case
* clean up functionDeclBase visitor
* Simplify `diagnoseUndeclaredCapability` by moving logic into
capability checking (more correct*)
3. added changed behavior to documentation
4. fixed some incorrect capabilities
5. **we do not** diagnose capability errors on interface
requirement-to-implementation if both lack explicit capability
requirements. This change is to work around a slangpy regression (test
case for the failing situation is in
`tests\language-feature\capability\capability-interface-extension-1.slang`),
Note: maybe for slang-2026 we don't do this?
6. requirement & implementation must support the same shader
stage/target. This was changed because otherwise we can have cases where
`X` inherits from `Y`, but `Y` is only expected to be used in `glsl`
whilst `X` is expected to be used in `hlsl | glsl`
7. removed
`tests/language-feature/capability/capabilitySimplification3.slang`
because it tests nothing special (redundant)
Note: not using rebase due to separate branches depending on this PR
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
|
When we have `GetElementPtr -> load -> GetElement` in our use-chain, the
final `GetElement` may use the `load` as an `Index`, not a base.
This is a non-issue with `getFieldExtract` since a field is a StructKey.
We will still add this check to ensure no bugs down the line.
---------
Co-authored-by: Harsh Aggarwal <haaggarwal@nvidia.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
|
Closes https://github.com/shader-slang/slang/issues/5750
|
|
An older version of the spec was referenced, which caused an inconsistency:
v1.29 2/20/2025 - [HitObject LoadLocalRootArgumentsConstant]
Latest spec:
https://microsoft.github.io/DirectX-Specs/d3d/Raytracing.html#hitobject-loadlocalroottableconstant
Refer:
OptiX backend support for Shader Execution Reordering (SER) features as
outlined in issue #6647.
|
|
Closes #8061.
Along with the fix, also enhanced coercion/overload resolution to filter
candidates based on the target type, allowing
`tests\language-feature\higher-order-functions\overloaded.slang` to
pass.
|
|
expressions in legacy mode (#7984)
This PR implements a warning system to help users identify potentially
unintended comma operator usage in expressions. The comma operator can
be confusing when used in contexts like variable initialization where
users might have intended to use braces for initialization instead.
## Problem
The following code compiles without error but is likely not written as
intended:
```slang
float4 vColor = (0.f, 0.f, 0.f, 1.f); // Uses comma operators, evaluates to 1.f
```
The intended code should use braces:
```slang
float4 vColor = {0.f, 0.f, 0.f, 1.f}; // Proper initialization
```
## Solution
Added a new warning diagnostic (`commaOperatorUsedInExpression`, ID:
41024) that warns when comma operators are used in expressions, with
exemptions for contexts where they are commonly intended:
- **For-loop side effects**: `for (int i = 0; i < 10; i++, x++)` - no
warning
- **Expand expressions**: `expand(f(), g(each param))` - no warning
- **Slang 2026+ mode**: `let m = (1,2,3)` creates tuples - no warning
- **All other expressions**: `float4 v = (a, b, c, d)` and `return a, b`
- warns for each comma
## Implementation Details
- Added context tracking in `SemanticsContext` with
`m_inForLoopSideEffect` flag
- Modified `visitForStmt` to use special context when checking side
effect expressions
- Added comma operator detection in `visitInvokeExpr` for `InfixExpr`
nodes
- Added language version check using `isSlang2026OrLater()` to disable
warnings in Slang 2026+ mode where parentheses create tuples
- Performance optimization: language version check is hoisted to avoid
unnecessary casting
- Warning can be suppressed using `-Wno-41024` command line flag
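The exemption rules listed above reduce to a small predicate. The Python sketch below only restates that decision; the `Context` class, its field names, and `should_warn_on_comma` are invented for illustration and do not mirror the actual `SemanticsContext` members beyond what this message names.

```python
from dataclasses import dataclass


@dataclass
class Context:
    slang_2026_or_later: bool = False      # parentheses create tuples here
    in_for_loop_side_effect: bool = False  # e.g. `i++, x++` in a for loop
    in_expand_expr: bool = False           # e.g. inside `expand(...)`


def should_warn_on_comma(ctx: Context) -> bool:
    # Check the language version first: in Slang 2026+ mode no warning
    # is ever produced, so the other checks can be skipped.
    if ctx.slang_2026_or_later:
        return False
    return not (ctx.in_for_loop_side_effect or ctx.in_expand_expr)
```

A default `Context()` corresponds to a general legacy-mode expression such as `float4 v = (a, b, c, d)`, which is the case that warns.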
## Test Coverage
Added comprehensive test cases using filecheck format that verify:
- Warnings are generated for comma operators in variable initialization
(legacy mode only)
- Warnings are generated for comma operators in return statements
(legacy mode only)
- Warnings are generated for comma operators in general expressions
(legacy mode only)
- No warnings for comma operators in for-loop side effects
- No warnings in Slang 2026+ mode where parentheses create tuples
- Warning suppression works correctly
Example output (legacy mode):
```
warning 41024: comma operator used in expression (may be unintended)
float4 vColor = (0.f, 0.f, 0.f, 1.f);
^
warning 41024: comma operator used in expression (may be unintended)
return a *= 2, a + 1;
^
```
Fixes #6732.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: aidanfnv <198290069+aidanfnv@users.noreply.github.com>
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Co-authored-by: aidanfnv <aidanf@nvidia.com>
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
|
|
Fixes #7574
Changes:
* Add an initial (fairly simple) optimization pass which is able to
eliminate redundant copies.
* Our existing optimizer passes remove redundant load/store very
robustly, so this pass focuses on other cases of copy elimination
* Primary approach is to make all functions which are `in T` and `T` is
trivial to copy into a `__constref T`. We then (depending on scenario)
manually insert a variable+load if a pass-by-reference is not possible;
otherwise we pass by `constref`.
* Added optimizations to eliminate redundant code which causes
`constref` to fail to compile
---------
Co-authored-by: Harsh Aggarwal <haaggarwal@nvidia.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
|
Update the SPIRV emit of atomic fp16 vector extension from its previous
incorrect name to SPV_NV_shader_atomic_fp16_vector.
|
|
|
|
* Fix `tools/gfx/gfx.slang`
* Add back `tests/cpu-program/gfx-smoke.slang`
|
|
* Fix GetDimensions to use mipLevel for SPIRV
* format code (#84)
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
|
|
* Fix noperspective modifier for SV_Barycentrics in SPIRV and GLSL
- Added test case with both regular and noperspective SV_Barycentrics inputs
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-authored-by: davli-nv <davli-nv@users.noreply.github.com>
* fixup format
* address review
https://github.com/shader-slang/slang/pull/8067#pullrequestreview-3090037501
* address review
https://github.com/shader-slang/slang/pull/8067#discussion_r2255818595
* add test case from review
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: davli-nv <davli-nv@users.noreply.github.com>
|
|
* Fix 7723 - Add autodiff tests
* Update bug-1.slang
Adding Vulkan
|
|
* Implement SPV_EXT_fragment_invocation_density
- Adds semantics SV_FragSize and SV_FragInvocationCount and implements them for SPIRV and GLSL using the appropriate target builtins from extensions.
- Adds test case checking for expected target builtins from these semantics.
- For future work, could implement SV_FragSize using pixel shader input SV_ShadingRate for HLSL, and SV_FragInvocationCount needs research.
Fixes #7974
Generated with Claude Code
* address review feedback
https://github.com/shader-slang/slang/pull/8037#pullrequestreview-3084645845
* fixup format
* review feedback
https://github.com/shader-slang/slang/pull/8037#pullrequestreview-3086442819
|
|
|
|
* Initial plan
* Fix pragma warning not working with multifile modules
- Check if DiagnosticSink already has a WarningStateTracker before creating new one
- This preserves pragma warning state across __include'd files
- Add regression tests for multifile pragma warnings
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* Add additional test cases for nested pragma warnings
- Test nested __include scenarios with pragma warning directives
- Verify pragma warnings work correctly with multiple levels of includes
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
Add support for kIROp_PtrLit types in Metal and add a test for the null pointer
value, which is the only valid value.
|
|
close #7931.
For a generic callable, overload resolution runs in two passes: in the first pass, we resolve
the generic by checking only the generic parameters, while in the second pass, we check
the function signature to resolve the overload.
But in our candidate comparison logic, we pick a preferred generic even when two generics are equally
good. We should not make this decision in the first pass, because we don't yet know about
the function arguments. So we just return OverloadedExpr2 in this case, and let the
function overload resolution break the tie.
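A toy Python model of the two passes may make the tie-breaking point concrete. Everything here is hypothetical (the `Candidate` fields, the numeric `generic_cost`, and exact type matching in the second pass); the real compiler compares candidates with much richer rules.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    name: str
    generic_cost: int              # lower means a better generic-argument match
    param_types: Tuple[str, ...]   # function signature, only used in pass 2


def resolve_generic_pass(candidates: List[Candidate]) -> List[Candidate]:
    """Pass 1: keep every candidate tied for the best generic-argument match."""
    best = min(c.generic_cost for c in candidates)
    return [c for c in candidates if c.generic_cost == best]


def resolve_call_pass(candidates: List[Candidate],
                      arg_types: Sequence[str]) -> List[Candidate]:
    """Pass 2: the call's argument types break the tie."""
    return [c for c in candidates if c.param_types == tuple(arg_types)]


candidates = [
    Candidate("f_int", 0, ("int",)),
    Candidate("f_float", 0, ("float",)),
]
tied = resolve_generic_pass(candidates)        # both survive pass 1
chosen = resolve_call_pass(tied, ["float"])    # only f_float matches the call
```

Had the first pass arbitrarily kept only `f_int`, the second pass would have had no matching candidate for a `float` argument, which is the failure mode this change avoids.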
|
|
* fix bug
* fix test
* push test changes for clarity
* test what fails
* remove redundant code
|
|
* Fix 7441: CUDA boolean vector layout to use 1-byte elements
Boolean vectors (bool1, bool2, bool3, bool4) were incorrectly implemented
as integer-based types using 4 bytes per element instead of actual 1-byte
boolean elements on CUDA targets.
Changes:
- Update CUDA prelude to define boolean vectors as structs with bool fields
instead of typedef aliases to integer vectors
- Implement CUDALayoutRulesImpl::GetVectorLayout to use 1-byte alignment
for boolean vectors, matching actual CUDA memory layout behavior
- Update make_bool functions to populate struct fields correctly
This ensures boolean vectors have the same memory layout as bool[4] arrays:
- bool1: 1 byte (was 4 bytes)
- bool2: 2 bytes (was 8 bytes)
- bool3: 3 bytes (was 12 bytes)
- bool4: 4 bytes (was 16 bytes)
Fixes memory layout mismatch between Slang reflection API and actual
CUDA compilation, achieving 75% memory savings for boolean vector usage.
* Fix CI issues -
Add and update associated functions and operators
* Make boolX same as uchar
* Use align construct on struct for boolX
* Improve Test case for robust alignment checks
* Formatting
* Disable selected slangpy tests
* add metal check which is slightly different than cuda
* Test-1
* Test-2
* Test-3
* Test-4
* ReflectionChange
* cleanup and update
* _slang_select with plain bool is needed for reverse-loop-checkpoint-test
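The size figures quoted in the first item of this message (16 bytes before, 4 bytes after for `bool4`, hence 75% savings) can be checked with a small host-side model. This Python sketch uses `ctypes` structures as stand-ins; the class names are invented, and it models only element sizes, not the alignment adjustments mentioned in the later items.

```python
import ctypes


class Bool4AsInts(ctypes.Structure):
    # Old behavior per the message: one 4-byte integer per element.
    _fields_ = [(name, ctypes.c_int32) for name in "xyzw"]


class Bool4AsBools(ctypes.Structure):
    # New behavior per the message: one 1-byte bool per element.
    _fields_ = [(name, ctypes.c_bool) for name in "xyzw"]


old_size = ctypes.sizeof(Bool4AsInts)
new_size = ctypes.sizeof(Bool4AsBools)
savings = 1 - new_size / old_size
```

The struct of four `bool` fields has the same size as a `bool[4]` array, which is the layout equivalence the message describes.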
|