slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2025-08-28	Remove the embedded source to avoid self-matching in slang-test (#8305)	Jay Kwak
	When `SIMPLE` type test is used with `-g[1-3]` option, the filecheck pattern will most likely to match to the string itself on the embedded source code rather than match to the emitted spirv-asm code. This commit avoids the problem by removing the embedded source code. This commit also provides an option to keep the embedded source code, `-preserve-embedded-source`. The source code removal is happening in two steps: 1. iterate all output lines and find SPIRV-ASM in the following pattern: `%N = OpExtInst %void %M DebugSource %fileId %sourceId`. And then, store the "%sourceId" value to identify which SPIRV instructions are for the embedded source code. 2. iterate all output lines again to find the `%sourceId = OpString "...."` and replace the whole string with the following string, ``` %1 = OpString "// slang-test removed the embedded source // Use `-preserve-embedded-source` to keep it explicitly " ``` This change revealed problems in the existing tests: - tests/bugs/spirv-debug-info.slang : The expected text was missing and it had to be added. The file also had Carrage-Return character on all lines and the pre-commit git hook removed them. - tests/spirv/debug-info.slang : the expected keyword DebugValue had to change to DebugDeclare, because that's what we get with ToT. - tests/spirv/debug-value-dynamic-index.slang : This test is currently failing, and it will pass once DebugLocalVariable instruction missing for parameter of the entry point function #7693 is resolved. --------- Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
2025-08-28	Add SPIRV OpCapability for 8/16bit use in storage (#8194)	James Helferty (NVIDIA)
	Emits the appropriate OpCapability for 8- and 16-bit type usage: - UniformAndStorageBuffer8BitAccess: for 16-bit types in SpvStorageClassUniform and SpvStorageClassStorageBuffer - UniformAndStorageBuffer16BitAccess: for 16-bit types in SpvStorageClassUniform and SpvStorageClassStorageBuffer - StoragePushConstant8: for 8-bit types in SpvStorageClassPushConstant - StoragePushConstant16: for 16-bit types in SpvStorageClassPushConstant - StorageInputOutput16: for 16-bit types in SpvStorageClassInput and SpvStorageClassOutput Generated with Claude Code, with revisions. Fixes #7879. --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: James Helferty (NVIDIA) <jhelferty-nv@users.noreply.github.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-08-26	Fail slang-test when VVL printed errors (#8280)	Jay Kwak
	fixes https://github.com/shader-slang/slang/issues/8271 This PR does the following, - Fail slang-test when there are VVL error messages. - VVL error for `gfx-unit-test-tool/` were not captured properly by the debug callback. - Set an environment variable, `VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation`, for CI and VisualStudio project setup. - Ignores VVL error about NullHandle is used for the acceleration structure; a fix is at ToT of VVL and not available from release build yet. - Fix VVL error complaining about the varying inputs are not provided for the tests, `gfx-unit-test-tool/linkTimeTypeLayout.internal` and `gfx-unit-test-tool/linkTimeTypeLayoutNested.internal`. --------- Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-08-26	Fix Metal 8-bit vector type names: emit char/uchar instead of int8_t/uint8_t ↵	Copilot
	(#8223) The Metal backend was generating incorrect type names for 8-bit vector types, causing compilation failures when targeting Metal. According to the Metal specification, 8-bit vector types should be named `charN` and `ucharN` (e.g., `char2`, `uchar3`) rather than `int8_tN` and `uint8_tN`. ## Problem When compiling Slang code with 8-bit vector types for Metal, the compiler would emit: ```metal uint8_t2 _S8 = uint8_t2(uint8_t(0U), uint8_t(16U)); int8_t3 _S9 = int8_t3(int8_t(0), int8_t(16), int8_t(48)); ``` But the Metal compiler expects: ```metal uchar2 _S8 = uchar2(uint8_t(0U), uint8_t(16U)); char3 _S9 = char3(int8_t(0), int8_t(16), int8_t(48)); ``` This caused errors like: ``` error: unknown type name 'uint8_t2'; did you mean 'uint8_t'? ``` ## Solution Modified `MetalSourceEmitter::emitSimpleTypeImpl()` to emit the correct Metal-specific type names for 8-bit types: - `kIROp_Int8Type` now emits `char` instead of `int8_t` - `kIROp_UInt8Type` now emits `uchar` instead of `uint8_t` This change only affects the Metal backend and ensures that vector types like `int8_t2`, `uint8_t3`, etc. are correctly emitted as `char2`, `uchar3`, etc. ## Testing - Added a new test case `tests/metal/8bit-vector-types.slang` to verify the fix - Re-enabled the previously disabled Metal test in `tests/hlsl-intrinsic/countbits8.slang` - Updated `tests/metal/byte-address-buffer.slang` to expect the correct type names - Verified that existing Metal tests continue to pass Fixes #8211. <!-- START COPILOT CODING AGENT TIPS --> --- 💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more [Copilot coding agent tips](https://gh.io/copilot-coding-agent-tips) in the docs. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: bmillsNV <163073245+bmillsNV@users.noreply.github.com>
2025-08-25	Fix#8084: Batch-8: Enable cuda tests (#8268)	Harsh Aggarwal (NVIDIA)

2025-08-25	Fix#8083: Batch-7: Enable cuda tests (#8267)	Harsh Aggarwal (NVIDIA)

2025-08-25	Fix#8082: Batch-6: Enable cuda tests (#8266)	Harsh Aggarwal (NVIDIA)

2025-08-25	Fix#8081: Batch-5: Enable cuda tests (#8263)	Harsh Aggarwal (NVIDIA)

2025-08-25	Fix#8080: Batch-4: Enable cuda tests (#8261)	Harsh Aggarwal (NVIDIA)

2025-08-22	Fix mesh shader OutputIndices subscript error by adding missing ref accessor ↵	Lujin Wang
	(#7929) Fixes the Slang compiler internal error "subscript had no getter" when reading from mesh shader output index arrays (e.g., `triangles[0].x`). ## Problem The `OutputIndices` struct was missing a `ref` accessor in its `__subscript` implementation, causing the compiler to fail when trying to materialize subscript expressions as r-values. ## Solution Added the missing `ref` accessor to `OutputIndices.__subscript` using the `kIROp_MeshOutputRef` intrinsic operation, matching the pattern used in `OutputVertices` and `OutputPrimitives`. ## Files Changed - `source/slang/core.meta.slang` - Added missing `ref` accessor - `tests/bugs/gh-7925.slang` - Test case to reproduce and verify the fix Fixes #7925 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Lujin Wang <lujinwangnv@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-08-21	Fix reflection JSON writing userAttribs section twice for some cases. (#8210)	MindSpunk
	`emitReflectionVarLayoutJSON` will output the `userAttribs` section twice as it gets output by `emitReflectionModifierInfoJSON` first before being output again by a direct call to `emitUserAttributes`. It seems the answer here is to just remove the extra explicit call to `emitUserAttributes` and rely on the call in `emitReflectionModifierInfoJSON`?
2025-08-21	Introduce CDataLayout & -fvk-use-c-layout (#8136)	Julius Ikkala
	Closes #8112. ~~The issue asks for a "C layout", but in this PR I use the term "CPU layout" because this naming was pre-existing in the codebase as `kCPULayoutRulesImpl_`. The primary purpose of this layout is to match CPU-side struct definitions with the shader side. I'm open to better naming suggestions, though.~~ Edit: switched back to using `CDataLayout` & `-fvk-use-c-layout`, as the CPU target depends on the object layout rules of existing CPU layout rules, but they're incompatible with actual shaders. So a new `kCLayoutRulesImpl_` was needed anyway. --------- Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
2025-08-21	Implement SV_VulkanSamplePosition (#8236)	davli-nv
	-Adds semantic SV_VulkanSamplePosition that emits corresponding gl_SamplePosition and SpvBuiltinSamplePosition -Adds gl_SamplePosition property to glsl.meta.slang -Adds SPIRV and GLSL tests for the semantic and property -Plan is to later implement SV_SamplePosition that follows HLSL range of -0.5 to +0.5, and emits GetRenderTargetSamplePosition(SV_SampleIndex) which needs more complicated IR manipulation for HLSL and Metal Fixes #7906 --------- Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
2025-08-20	Fix nextafter() (#8195)	Julius Ikkala
	Fixes #8185. The previous implementation is incorrect and basically only works in the `x = 0` case. `delta` was the smallest possible positive value representable as a float, but that's below the rounding error of addition with almost all reasonably sized floats. This fixed implementation is based on bit twiddling instead. I've checked the float case against the C++ `nextafterf` with both a -inf -> inf and inf -> -inf sweep, in addition to the test included in this PR.
2025-08-20	Add Metal support for WaveGetActiveMask and WaveActiveCountBits (#8218)	Tianyu Li
	## Summary - Add Metal platform support for `WaveGetActiveMask()` and `WaveActiveCountBits()` wave intrinsics - Update capability requirements to include Metal platform for subgroup ballot operations - Implement Metal-specific intrinsic assembly using `simd_ballot()` and `simd_vote` APIs ## Changes - source/slang/hlsl.meta.slang: - Add Metal target case for `WaveGetActiveMask()` using `simd_ballot(true)` - Update capability requirements from `cuda_glsl_hlsl_spirv` to `cuda_glsl_hlsl_metal_spirv` for wave ballot functions - source/slang/slang-capabilities.capdef: - Add `metal` to `subgroup_ballot_activemask` capability alias
2025-08-20	Updated support to enable batch3 (#8219)	Harsh Aggarwal (NVIDIA)
	Enable CUDA support for batch 3 tests - Enhanced wave operations with exclusive support - Added proper identity values for min/max operations - Fixed intrinsic name mapping issues - Updated test configurations Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
2025-08-18	Emit descriptor handle correctly for ParameterBlock<DescriptorHandle> (#8206)	Gangzheng Tong
	In Metal, if `ParameterBlock` contains `DescriptorHandle` directly, it would be emitted as DescriptorHandle literal, which is not valid Metal code, This fix adds a case for `kIROp_DescriptorHandleType` and directs it to the Parent's `emitType` function to handle it.
2025-08-18	Fix issue of double lowering issue a differentiable function (#8182)	kaizhangNV
	Close #8054. For detailed root cause is at: https://github.com/shader-slang/slang/issues/8054#issuecomment-3189579508
2025-08-18	Fix constructor overload ambiguity with scalar and vector parameters (#8109)	Copilot
	Close #8090. When we do type coerce, we use a cache to store the conversion cost of different type. The key of the cache is defined by struct BasicTypeKey { uint32_t baseType : 8; uint32_t dim1 : 4; uint32_t dim2 : 4; ... } where dim1 and dim2 is used for dimension of vector and matrix. However the dim is only 4 bits, so `vector<int, 16>` will have the same key as `int`, which is wrong. Fix the issue by extending it to 8 bit. Also to make the hash key still within 32 bits, we adjust baseType to 5 bits, and knownConstantBitCount to 6 bits. --------- Co-authored-by: kaizhangNV <kazhang@nvidia.com>
2025-08-18	Enable CUDA Test Enablement - Batch 1: Autodiff Tests (1-16) (#8139)	Harsh Aggarwal (NVIDIA)

2025-08-15	Use 64bit int instead of emulation on metal (#8180)	James Helferty (NVIDIA)
	Metal's popcount prototype is `T popcount(T x)` but we want to use it to implement `countbits` where the prototype always returns `uint`. Using `popcount` directly would implicitly cast successfully to the 32-bit return value in all cases except when the argument is a 64-bit type. Thus, this change always explicitly casts the result to `$TR`, which should be one of the `uint[N]` types, and should always be able to hold the number of bits in the type. Addresses #6877
2025-08-15	[CUDA] Fix incorrect `kIROp_RaytracingAccelerationStructureType` emitting ↵	ArielG-NV
	logic (#8168) Fixes: #8167 Current emitting logic does not work, this has been corrected. The provided test ensures our CUDA code is valid by compiling PTX from it. `m_writer->emit("OptixTraversableHandle");` should be `out <<` since `out` adds to type-name-cache; otherwise using a type twice will produce bad type-names (since we filled type-name cache with "" instead of "typeName")
2025-08-15	Prohibit use of buffer.GetDimensions on metal (#8156)	James Helferty (NVIDIA)
	Fixes #7011
2025-08-14	[Capability System] Fix bug where capabilities do not correctly propegate if ↵	ArielG-NV
	AST-parent has target+set the AST-child does not (#8175) Fixes: #8174 Changes: * To determine if we propagate capabilities, we need to ensure that a `join` will do nothing (optimization since `join` is expensive + caching data for the `join` adds up to be expensive). This logic was changed in `slang-check-decl.cpp` since the current logic was incorrect. * A parent could have the set `metal+glsl` and the use-site could have `glsl`. In this case, we will not remove `metal` from the parent since `{metal+glsl}.implies({glsl})` is true. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-08-13	Handle SV_Barycentrics on metal (#8163)	James Helferty (NVIDIA)
	Fixes #6785
2025-08-13	Remove the semantic decoration from the original entry struct (#8146)	Jay Kwak
	When we legalize the entry point param, there are cases where we need to reconstruct a struct for the parameter and the original struct wouldn't be used. But if the user tries to use the origianl struct as a type for a function parameter, we will end up using both the original struct and the synthesized struct at the same time. On Metal and WGSL, it causes an error when an identical semtaic is used on more than one variable. This commit removes the semantics from the original struct after cloning the type. Fixes https://github.com/shader-slang/slang/issues/8141 Related to https://github.com/shader-slang/slang/issues/7693 --------- Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
2025-08-12	Enable CUDA testing for batch 2 (#8147)	jarcherNV
	Enable CUDA for the tests listed in issue #8078 This requires a minor CUDA prelude change, adding some math functions.
2025-08-09	[SPIR-V] Emit control flags for `branch/flatten` decorations (#8134)	amidescent
	Co-authored-by: Jay Kwak <82421531+jkwak-work@users.noreply.github.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-08-09	Fix atomics error diagnostics (#8117)	venkataram-nv
	Fixes #8116 --------- Co-authored-by: Jay Kwak <82421531+jkwak-work@users.noreply.github.com>
2025-08-09	Fix `tests\bugs\op-select-return-composite.slang` Test Failing (#8132)	ArielG-NV
	What is fixed: Since we make `vk` tests compile as GLSL, we must use a capability that specifies a GLSL equivalent. If we do not do this, we will get an error since we are not specifying GLSL capabilities (but are specifying a profile). Since a profile is specified, we emit capability errors.
2025-08-08	Error if super-type capabilities are a super-set of sub-type (#7452)	ArielG-NV
	Fixes: #7410 Changes: 1. super-type capabilities must be a super-set of sub-type capabilities (and support the same shader stages/targets) * InheritanceDecl visits super-type to inherit it's capabilities; validate InheritanceDecl capabilities against sub-type * visit all container decl's with a default case * clean up functionDeclBase visitor * Simplify `diagnoseUndeclaredCapability` by moving logic into capability checking (more correct) 3. added changed behavior to documentation 4. fixed some incorrect capabilities 5. we do not* diagnose capability errors on interface requirement-to-implementation if both lack explicit capability requirements. This change is to work around a slangpy regression (test case for the failing situation is in `tests\language-feature\capability\capability-interface-extension-1.slang`), Note: maybe for slang-2026 we don't do this? 6. requirement & implementation must support the same shader stage/target. This was changed because otherwise we can have cases where `X` inherits from `Y`, but `Y` is only expected to be used in `glsl` whilst `X` is expected to be used in `hlsl \| glsl` 7. removed `tests/language-feature/capability/capabilitySimplification3.slang` because it tests nothing special (redundant) Note: not using rebase due to separate branches depending on this PR --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-08-08	Initial copy elision pass regression fix (#8126)	ArielG-NV
	when we have `GetElementPtr -> load -> GetElement` in our use-chain, the final `GetElement` may use the `load` as a `Index`, not a base. This is a non-issue with `getFieldExtract` since a field is a StructKey. We will still add this check to ensure no bugs down the line. --------- Co-authored-by: Harsh Aggarwal <haaggarwal@nvidia.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-08-08	Diagnose on array of parameterblock instead of asserting (#8123)	Ellie Hermaszewska
	Closes https://github.com/shader-slang/slang/issues/5750
2025-08-07	Fix intrinsic LoadLocalRootTableConstant for optix (#7949)	Harsh Aggarwal (NVIDIA)
	Due to an older version of spec referred there was an inconsitency v1.29 2/20/2025 - [HitObject LoadLocalRootArgumentsConstant] Latest spec https://microsoft.github.io/DirectX-Specs/d3d/Raytracing.html#hitobject-loadlocalroottableconstant Refer: OptiX backend support for Shader Execution Reordering (SER) features as outlined in issue #6647. -
2025-08-07	Support `expand` on concrete tuple values. (#8106)	Yong He
	Closes #8061. Along with the fix, also enhanced coercion/overload resolution to filter candidates based on the target type, allowing `tests\language-feature\higher-order-functions\overloaded.slang` to pass.
2025-08-07	Add warning for comma operators used outside for-loops and expand ↵	Copilot
	expressions in legacy mode (#7984) This PR implements a warning system to help users identify potentially unintended comma operator usage in expressions. The comma operator can be confusing when used in contexts like variable initialization where users might have intended to use braces for initialization instead. ## Problem The following code compiles without error but is likely not written as intended: ```slang float4 vColor = (0.f, 0.f, 0.f, 1.f); // Uses comma operators, evaluates to 1.f ``` The intended code should use braces: ```slang float4 vColor = {0.f, 0.f, 0.f, 1.f}; // Proper initialization ``` ## Solution Added a new warning diagnostic (`commaOperatorUsedInExpression`, ID: 41024) that warns when comma operators are used in expressions, with exemptions for contexts where they are commonly intended: - For-loop side effects: `for (int i = 0; i < 10; i++, x++)` - no warning - Expand expressions: `expand(f(), g(each param))` - no warning - Slang 2026+ mode: `let m = (1,2,3)` creates tuples - no warning - All other expressions: `float4 v = (a, b, c, d)` and `return a, b` - warns for each comma ## Implementation Details - Added context tracking in `SemanticsContext` with `m_inForLoopSideEffect` flag - Modified `visitForStmt` to use special context when checking side effect expressions - Added comma operator detection in `visitInvokeExpr` for `InfixExpr` nodes - Added language version check using `isSlang2026OrLater()` to disable warnings in Slang 2026+ mode where parentheses create tuples - Performance optimization: language version check is hoisted to avoid unnecessary casting - Warning can be suppressed using `-Wno-41024` command line flag ## Test Coverage Added comprehensive test cases using filecheck format that verify: - Warnings are generated for comma operators in variable initialization (legacy mode only) - Warnings are generated for comma operators in return statements (legacy mode only) - Warnings are generated for comma operators in general expressions (legacy mode only) - No warnings for comma operators in for-loop side effects - No warnings in Slang 2026+ mode where parentheses create tuples - Warning suppression works correctly Example output (legacy mode): ``` warning 41024: comma operator used in expression (may be unintended) float4 vColor = (0.f, 0.f, 0.f, 1.f); ^ warning 41024: comma operator used in expression (may be unintended) return a *= 2, a + 1; ^ ``` Fixes #6732. <!-- START COPILOT CODING AGENT TIPS --> --- 💬 Share your feedback on Copilot coding agent for the chance to win a $200 gift card! Click [here](https://survey.alchemer.com/s3/8343779/Copilot-Coding-agent) to start the survey. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: aidanfnv <198290069+aidanfnv@users.noreply.github.com> Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: aidanfnv <aidanf@nvidia.com> Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
2025-08-07	Initial copy elision pass (#8042)	ArielG-NV
	Fixes #7574 Changes: * Add an initial (fairly simple) optimization pass which is able to eliminate redundant copies. * Our current existing optimizer passes remove redundant load/store very robustly, this pass will focus on other cases of copy elimination * Primary approach is to make all functions which are `in T` and `T` is trivial to copy into a `__constref T`. We then (depending on scenario) manually insert a variable+load if a pass-by-reference is not possible; otherwise we pass by `constref`. * Added optimizations to eliminate redundant code which causes `constref` to fail to compile --------- Co-authored-by: Harsh Aggarwal <haaggarwal@nvidia.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-08-07	Fix atomic fp16 vector SPIRV emit (#8104)	jarcherNV
	Update the SPIRV emit of atomic fp16 vector extension from its previous incorrect name to SPV_NV_shader_atomic_fp16_vector.
2025-08-06	Fix unused space discovery for bindless heap. (#8075)	Yong He

2025-08-06	Add back GFX smoke test (#8030)	Sam Estep
	* Fix `tools/gfx/gfx.slang` * Add back `tests/cpu-program/gfx-smoke.slang`
2025-08-06	Fix GetDimensions to use mipLevel for SPIRV (#8065)	Jay Kwak
	* Fix GetDimensions to use mipLevel for SPIRV * format code (#84) --------- Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
2025-08-06	Fix noperspective modifier for SV_Barycentrics in SPIRV and GLSL (#8067)	davli-nv
	* Fix noperspective modifier for SV_Barycentrics in SPIRV and GLSL - Added test case with both regular and noperspective SV_Barycentrics inputs 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: davli-nv <davli-nv@users.noreply.github.com> * fixup format * address review https://github.com/shader-slang/slang/pull/8067#pullrequestreview-3090037501 * address review https://github.com/shader-slang/slang/pull/8067#discussion_r2255818595 * add test case from review --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: davli-nv <davli-nv@users.noreply.github.com>
2025-08-06	Fix 7723 - Add autodiff tests (#7919)	Harsh Aggarwal (NVIDIA)
	* Fix 7723 - Add autodiff tests * Update bug-1.slang Adding Vulkan
2025-08-05	Implement SPV_EXT_fragment_invocation_density (SPV_NV_shading_rate) (#8037)	davli-nv
	* Implement SPV_EXT_fragment_invocation_density -Adds semantics SV_FragSize and SV_FragInvocationCount and implements them for SPIRV and GLSL using the appropriate target builtins from extensions. -Adds test case checking for expected target builtins from these semantics. -For future work, could implement SV_FragSize using pixel shader input SV_ShadingRate for HLSL, and SV_FragInvocationCount needs research. Fixes #7974 Generated with Claude Code * address review feedback https://github.com/shader-slang/slang/pull/8037#pullrequestreview-3084645845 * fixup format * review feedback https://github.com/shader-slang/slang/pull/8037#pullrequestreview-3086442819
2025-08-05	Enable compute/ dir which passes (#7991)	Harsh Aggarwal (NVIDIA)

2025-08-05	Fix #pragma warning not working with multifile modules (#7942)	Copilot
	* Initial plan * Fix pragma warning not working with multifile modules - Check if DiagnosticSink already has a WarningStateTracker before creating new one - This preserves pragma warning state across __include'd files - Add regression tests for multifile pragma warnings Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Add additional test cases for nested pragma warnings - Test nested __include scenarios with pragma warning directives - Verify pragma warnings work correctly with multiple levels of includes Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com>
2025-08-04	Add support for pointer literals in metal (#8040)	jarcherNV
	Add support for kIROp_PtrLit types in metal and add a test for null pointer values, which is the only valid value.
2025-08-03	fix overload in extension issue (#7999)	kaizhangNV
	close #7931. For a generic callable, we have two passes overload resolution, in first pass, we will resolve the generic by only checking the generic parameters, while in the second pass, we will resolve the function signature to resolve the overload. But in our candidate comparison logic, we pick a preferred generic even two generics are equally good. However, we should not make this decision in the first pass, because we don't know about the function arguments in this pass yet. So we just return OverloadEpxr2 in this case, and let the function overload resolution to break the tie.
2025-08-01	Drain sink when single-argument constructor call fail (#7883)	ArielG-NV
	* fix bug * fix test * push test changs for clarity * fix bug * fix test * push test changs for clarity * test what fails * remove redundant code
2025-08-01	Fix 7441: CUDA boolean vector layout to use 1-byte elements (#7862)	Harsh Aggarwal (NVIDIA)
	* Fix 7441: CUDA boolean vector layout to use 1-byte elements Boolean vectors (bool1, bool2, bool3, bool4) were incorrectly implemented as integer-based types using 4 bytes per element instead of actual 1-byte boolean elements on CUDA targets. Changes: - Update CUDA prelude to define boolean vectors as structs with bool fields instead of typedef aliases to integer vectors - Implement CUDALayoutRulesImpl::GetVectorLayout to use 1-byte alignment for boolean vectors, matching actual CUDA memory layout behavior - Update make_bool functions to populate struct fields correctly This ensures boolean vectors have the same memory layout as bool[4] arrays: - bool1: 1 byte (was 4 bytes) - bool2: 2 bytes (was 8 bytes) - bool3: 3 bytes (was 12 bytes) - bool4: 4 bytes (was 16 bytes) Fixes memory layout mismatch between Slang reflection API and actual CUDA compilation, achieving 75% memory savings for boolean vector usage. * Fix CI issues - Add and update associated functions and operators * Make boolX same as uchar * Use align construct on struct for boolX * Improve Test case for robust alignment checks * Formatting * Disable selected slangpy tests * add metal check which is slightly different than cuda * Test-1 * Test-2 * Test-3 * Test-4 * ReflectionChange * cleanup and update * _slang_select with plain bool is needed for reverse-loop-checkpoint-test