path: root/tests
Commit message (Collapse)AuthorAge
...
* Fix CUDA global variable initialization with constructor calls (#8340)Harsh Aggarwal (NVIDIA)2025-09-18
Resolves a CUDA compilation failure where global variables initialized through a struct constructor generated illegal runtime initialization of a `__device__` variable.

**Problem:**
```cuda
// Generated invalid CUDA code:
__device__ static const Stuff_0 gStuff_0 = Stuff_x24init_0(args...);
// Error: "dynamic initialization is not supported for a __device__ variable"
```

**Root cause:** Debugging showed that the `moveGlobalVarInitializationToEntryPoints` pass only handled `kIROp_GlobalVar` instructions, but global constants with constructor calls appeared as `kIROp_Call` instructions at module scope.

**Solution:**
1. IR pipeline fix: extended `moveGlobalVarInitializationToEntryPoints` to detect module-level constructor calls and transform them into proper global variables with entry-point initialization.
2. Field access fix: enhanced the `kIROp_FieldExtract` logic to emit correct `->` syntax for pointer types and address-of operations.
3. Constructor emission: added CUDA-specific handling for constructor calls.

**Architecture:**
- Transforms `let %gStuff = call %Constructor(...)` into kernel context initialization
- Moves runtime initialization from global scope to entry-point execution
- Follows CUDA best practices for global state management

**Files:**
- source/slang/slang-ir-explicit-global-init.cpp: extended IR transformation pass
- source/slang/slang-emit-c-like.cpp: enhanced field access and foldable value logic
- source/slang/slang-emit-cuda.cpp: added CUDA-specific field extraction handling

**Result:**

    // Now generates proper CUDA code:
    struct KernelContext_0 { Stuff_0 gStuff_1; };
    // Runtime initialization in entry point:
    kernelContext_1.gStuff_1 = constructor_call();

Fixes: tests/compute/type-legalize-global-with-init.slang
* Diagnostic for metal ref mesh output assignment (#8365)James Helferty (NVIDIA)2025-09-17
| | | | | | | When slang detects assignment to a mesh output reference on metal, generate a diagnostic message. (Metal mesh shader outputs must be assigned via 'set' instead of 'ref'.) Fixes #7498
* Diagnose error when the function args can't satisfy constexpr parameter requirements (#7269)Gangzheng Tong2025-09-16
## Summary
This PR enhances constexpr validation by reporting an error when function arguments cannot satisfy constexpr parameter requirements, addressing issue #6370.

## Problem
Previously, when a function declared constexpr parameters, the compiler attempted to propagate constexpr-ness to the call-site arguments, but validation and error reporting were insufficient when that propagation failed. This could lead to silent failures in which constexpr requirements were not properly enforced.

## Solution
This PR adds checks that:
1. **Validate constexpr arguments**: when a function parameter is marked as `constexpr`, the compiler now explicitly checks that the corresponding argument can be marked as `constexpr`.
2. **Issue clear compilation errors**: adds `Diagnostics::argIsNotConstexpr`.
3. **Handle both call scenarios**:
   - Direct function calls with IR-level function definitions
   - Calls to functions from external modules

Fixes #6370

---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Fix pointers and C-like layout in varying parameters (#8425)Julius Ikkala2025-09-10
| | | | | | | | | | | | | Closes #8409, but ended up being more about fixing another bug. While the issue itself seems to only be a simple typo fix (see second commit in this PR), I found out during writing a test that pointers never got correct locations regardless of layout. Their locations were always assigned to zero due to lacking a resource usage entry in `TypeLayout`. They were also missing the `Flat` decoration, so I went ahead and added that too. I can split this up into two separate PRs if that's preferred; both aspects just share a test right now and fix a similar-looking issue in the resulting SPIR-V.
* CUDA: Fix compiler crash with unsized array field - nonuniformres-as-… (#8380)Harsh Aggarwal (NVIDIA)2025-09-10
…function-parameter.slang #8315

Root cause: CUDA compilation crashed with `assert failure: !seenFinalUnsizedArrayField` because unsized arrays such as `RWStructuredBuffer<uint> globalBuffer[]` were not the final field in the generated parameter structs, violating the layout constraint in slang-ir-layout.cpp.

Fix: Extended `collectGlobalUniformParameters` to automatically reorder struct fields for CUDA targets: regular fields first, unsized arrays last. Other targets preserve the original order.

Impact:
- Enables CUDA support for nonuniform resource indexing as function parameters
- Zero impact on existing GLSL/HLSL/SPIRV targets
- Automatic handling; no manual parameter reordering required

Files: slang-emit.cpp, slang-ir-collect-global-uniforms.cpp/.h, test file

---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
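The "regular fields first, unsized arrays last" reordering from the commit above can be illustrated with a stable partition. This is only a sketch under assumptions: `Field` and `moveUnsizedArraysLast` are hypothetical names, not the types or functions used in slang-ir-collect-global-uniforms.cpp, and the real pass works on IR struct fields rather than a vector of records.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical field record; the real pass operates on Slang IR.
struct Field
{
    std::string name;
    bool isUnsizedArray;
};

// Move unsized-array fields to the end while keeping the relative
// order of all other fields (and of the unsized arrays themselves).
void moveUnsizedArraysLast(std::vector<Field>& fields)
{
    std::stable_partition(
        fields.begin(),
        fields.end(),
        [](const Field& f) { return !f.isUnsizedArray; });
}
```

A stable partition is used here so that fields which do not need to move keep their declaration order, which is presumably what a layout pass would want.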
* Check if debugVar is for debuggable types in the legalization pass (#8326)Gangzheng Tong2025-09-10
## Problem
When generic functions with debug variables were specialized with concrete types containing non-debuggable fields (e.g., `StructuredBuffer`), the IR cloning process would create invalid `DebugVar` instructions without checking whether the substituted types remained debuggable.

## Solution
This fix adds a defensive check in the legalization pass that removes any `DebugVar` created for a non-debuggable type.

---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Fix language server auto-complete regression in debug build. (#8416)Yong He2025-09-10
Fixes this regression:
```slang
struct MyType
{
    // Regression condition 1: there must be more than one member in the lookup scope.
    float v;
    int getSum() { return 0; }
}
void m(MyType t)
{
    // Regression condition 2: the completion must be in an init expression.
    // Regression condition 3: none of the candidate members can coerce to the expected type.
    // Regression behavior: no completion candidates are shown, because
    // SemanticsVisitor::resolveOverloadedLookup throws an error when there are 0 applicable candidates
    // after type coercion filtering.
    Texture2D x = t.; // completion request after . here
}
```
The root cause is that we shouldn't be applying candidate filtering on the candidate list when in completion checking mode.

Closes #8417.
* Fix #8314 - Enable tests/compute/texture-subscript.slang for CUDA (#8408)Harsh Aggarwal (NVIDIA)2025-09-09
| | | The test can be enabled
* Relax restriction on using link-time types for shader parameters. (#8387)Yong He2025-09-05
| | | | | | | | | This change relaxes a previous restriction on link-time types and constants, so that we now allow them to be used to define shader parameters. Doing so will result in a parameter layout that is incomplete prior to linking. The PR added a test to call the reflection API on a fully linked program and ensure that we can report correct binding info.
* Disable a few WGSL tests using Wave functions in branches (#8386)Jay Kwak2025-09-05
WGSL requires Wave functions to be used only in uniform control flow. The latest version of the Tint compiler started to report an error when Wave functions are used in dynamic control flow. This commit disables some of the tests that use Wave functions in dynamic branches. Where possible, the tests are altered to call Wave functions in uniform control flow instead.
* Add warnings for overflows of integer types (#8281)jarcherNV2025-09-05
The code `int x4 = 0xFFFFFFFFFFFFFFFF` previously did not produce a warning even though the value is too large for the type. This patch checks for this and similar issues during parsing.
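As a rough illustration of the kind of check involved, the sketch below tests whether a literal's value fits in a given number of bits. The function name `fitsInBits` is hypothetical, and the commit does not say exactly which rule Slang applies (for example, how signed and unsigned ranges are treated), so this is one possible check, not the parser's actual logic.

```cpp
#include <cstdint>

// True if `value` can be stored in `bits` bits without dropping any
// high-order bits. Signedness is deliberately ignored in this sketch.
constexpr bool fitsInBits(std::uint64_t value, unsigned bits)
{
    // Shifting a 64-bit value by 64 or more is undefined, so handle
    // that case first: every 64-bit value fits in 64 bits.
    return bits >= 64 || (value >> bits) == 0;
}
```

Under this rule the literal `0xFFFFFFFFFFFFFFFF` from the commit message fails the check for a 32-bit type and would be a candidate for a warning.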
* Try both LoadLibrary functions on Windows (#8368)jarcherNV2025-09-05
| | | | If a given library cannot be found using LoadLibraryExA then try again using LoadLibraryA. Return an error only if both of these failed.
* Enable CUDA support for additional HLSL intrinsic tests (#8293)Harsh Aggarwal (NVIDIA)2025-09-04
Enable CUDA support for additional HLSL intrinsic tests by implementing missing functionality and fixing compiler bugs affecting CUDA targets.
- Fix critical bug in InterlockedCompareStore64 where division used /4 instead of /8 for 64-bit types, causing incorrect memory addressing for all signed int64_t atomics
- Add signed int64_t atomic wrappers (atomicExch, atomicCAS) to the CUDA prelude that properly cast to/from unsigned types as required by CUDA's atomic API
- Enable tests: atomic-intrinsics-64bit.slang
- Implement CUDA support for QuadAny and QuadAll operations using warp shuffle primitives (__shfl_sync with quad-level lane masking)
- Add CUDA to quad_control capability definition in slang-capabilities.capdef
- Add _slang_quadAny/_slang_quadAll helper functions to the CUDA prelude
- Enable tests: quad-control-comp-functionality.slang, subgroup-quad.slang

---------
Co-authored-by: szihs <675653+szihs@users.noreply.github.com>
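The /4 versus /8 bug in the first bullet comes down to converting a byte offset into an element index. The sketch below shows the arithmetic only; `elementIndex` is a hypothetical helper, not a function from the CUDA prelude.

```cpp
#include <cstddef>
#include <cstdint>

// Convert a byte offset into an index into an array of T.
// Dividing by sizeof(T) rather than a hard-coded 4 keeps the
// result correct for 64-bit element types.
template <typename T>
constexpr std::size_t elementIndex(std::size_t byteOffset)
{
    return byteOffset / sizeof(T);
}
```

For a byte offset of 16, a hard-coded `/4` yields index 4, while the 64-bit element actually lives at index 2, so the atomic would touch the wrong address.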
* Handle slang-test command comments better (#8363)Jay Kwak2025-09-04
Before this PR, the only valid form was the following line, with no white-space character or additional `/` character:
```
//TEST:
```
This PR allows slang-test to handle the following variants of the test command comments:
```
///TEST:
// TEST:
// TEST:
////// TEST:
```
This PR revealed a regression in two tests:
- tests/cpp-compiler/c-compile-shared-library.c (cpu)
- tests/cpp-compiler/cpp-compile-shared-library.cpp (cpu)

They are disabled as part of this PR, and there is a new GitHub issue to track them:
- https://github.com/shader-slang/slang/issues/8362
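A matcher that accepts the variants listed in the commit above could look like the following sketch. `isTestCommandComment` is a hypothetical name, and the rule it encodes (two or more slashes, optional spaces or tabs, then `TEST:`) is inferred from the examples rather than taken from slang-test's source; real test commands also carry options before the colon, which this sketch does not handle.

```cpp
#include <cstddef>
#include <string_view>

// Returns true for lines such as "//TEST:", "///TEST:", "// TEST:"
// and "////// TEST:". Assumed rule, inferred from the commit message.
bool isTestCommandComment(std::string_view line)
{
    std::size_t i = 0;

    // Require at least two leading slashes; allow any number more.
    while (i < line.size() && line[i] == '/')
        ++i;
    if (i < 2)
        return false;

    // Skip optional spaces or tabs between the slashes and the keyword.
    while (i < line.size() && (line[i] == ' ' || line[i] == '\t'))
        ++i;

    return line.substr(i, 5) == "TEST:";
}
```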
* Diagnose on structured buffers containing resources (#8222)Ellie Hermaszewska2025-09-03
| | | closes https://github.com/shader-slang/slang/issues/3313
* Fix#8085: Batch-9: Enable cuda tests (#8269)Harsh Aggarwal (NVIDIA)2025-09-03
|
* Add Optix Intrinsics Coverage (#8159) (#8310)Harsh Aggarwal (NVIDIA)2025-09-03
| | | Add 29 intrinsics to the list by new test
* Fix#8086: Batch-10: Enable cuda tests (#8270)Harsh Aggarwal (NVIDIA)2025-09-03
|
* render-test: Change D3D12 default to sm_6_5 (#8320)James Helferty (NVIDIA)2025-09-02
| | | | | | | | | Changes default for render-test to sm_6_5. Since sm_6_5 is the new default, remove the -use-dxil option, add -use-dxcb option Remove -use-dxil option from all test cases. Add -use-dxcb to two tests that needed it. Fixes #7611
* Emit DebugInfo for the legalized entry point parameters (#7703)Jay Kwak2025-09-02
This commit emits debug info for the entry point parameters. Two things are implemented or fixed in this PR:
- We were not emitting `DebugVar` and `DebugValue` at the IR lowering level when the type of the entry point parameter is `ConstRef`. This commit handles the `ConstRef` case in the same way as the other types, so that `DebugVar` and `DebugValue` are properly emitted at the IR lowering level.
- Two types for Geometry shaders, `InputPatch` and `OutputPatch`, were incorrectly treated as invalid types for DebugInfo. This commit handles them as valid types for DebugInfo.
* [CBP] Pointer frontend changes + groupshared pointer support (#7848)ArielG-NV2025-08-29
Resolves #7628
Resolves: #8197

Primary Goals:
1. Add `Access` to pointer
2. AddressSpace::GroupShared support for pointers (SPIR-V)
3. Add `__getAddress()` to replace `&`
   * `&` is not updated to `require(cpu)` since slangpy uses `&`. This means we must: (1) merge PR; (2) replace `&` with `__getAddress()`; (3) add `require(cpu)` to `&`

Changes:
* Added to `Ptr` the `Access` generic argument & logic (for `Access::Read`).
* Moved the generic argument `AddressSpace` from `Ptr` to the end of the type.
* Added pointer casting support between any `Ptr` as long as the `AddressSpace` is the same
* Disallow globallycoherent T* and coherent T*
* Disallow const T*, T const*, and const T*
* Fixed .natvis display of `ConstantValue` `ValOperandNode`
* Support generic resolution of type-casted integers
* Added `VariablePointer` emitting for spirv + other minor logic needed for groupshared pointers

Breaking Changes:
* Anyone using the `AddressSpace` of `Ptr` will now have to account for the `Access` argument
* We disallow various syntax paired with `Ptr` and `T*`

---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Remove the embedded source to avoid self-matching in slang-test (#8305)Jay Kwak2025-08-28
When a `SIMPLE` type test is used with the `-g[1-3]` option, the filecheck pattern will most likely match the string itself in the embedded source code rather than the emitted spirv-asm code. This commit avoids the problem by removing the embedded source code. It also provides an option to keep the embedded source code, `-preserve-embedded-source`.

The source code removal happens in two steps:
1. Iterate over all output lines and find SPIRV-ASM in the following pattern: `%N = OpExtInst %void %M DebugSource %fileId %sourceId`. Store the "%sourceId" value to identify which SPIRV instructions hold the embedded source code.
2. Iterate over all output lines again to find the `%sourceId = OpString "...."` and replace the whole string with the following string:
```
%1 = OpString "// slang-test removed the embedded source
// Use `-preserve-embedded-source` to keep it explicitly
"
```
This change revealed problems in the existing tests:
- tests/bugs/spirv-debug-info.slang: The expected text was missing and had to be added. The file also had a carriage-return character on all lines, and the pre-commit git hook removed them.
- tests/spirv/debug-info.slang: The expected keyword DebugValue had to change to DebugDeclare, because that's what we get with ToT.
- tests/spirv/debug-value-dynamic-index.slang: This test is currently failing, and it will pass once "DebugLocalVariable instruction missing for parameter of the entry point function" #7693 is resolved.

---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
* Add SPIRV OpCapability for 8/16bit use in storage (#8194)James Helferty (NVIDIA)2025-08-28
Emits the appropriate OpCapability for 8- and 16-bit type usage:
- UniformAndStorageBuffer8BitAccess: for 8-bit types in SpvStorageClassUniform and SpvStorageClassStorageBuffer
- UniformAndStorageBuffer16BitAccess: for 16-bit types in SpvStorageClassUniform and SpvStorageClassStorageBuffer
- StoragePushConstant8: for 8-bit types in SpvStorageClassPushConstant
- StoragePushConstant16: for 16-bit types in SpvStorageClassPushConstant
- StorageInputOutput16: for 16-bit types in SpvStorageClassInput and SpvStorageClassOutput

Generated with Claude Code, with revisions.

Fixes #7879.

---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: James Helferty (NVIDIA) <jhelferty-nv@users.noreply.github.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Fail slang-test when VVL printed errors (#8280)Jay Kwak2025-08-26
| | | | | | | | | | | | | | | | | | | | | | fixes https://github.com/shader-slang/slang/issues/8271 This PR does the following, - Fail slang-test when there are VVL error messages. - VVL error for `gfx-unit-test-tool/` were not captured properly by the debug callback. - Set an environment variable, `VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation`, for CI and VisualStudio project setup. - Ignores VVL error about NullHandle is used for the acceleration structure; a fix is at ToT of VVL and not available from release build yet. - Fix VVL error complaining about the varying inputs are not provided for the tests, `gfx-unit-test-tool/linkTimeTypeLayout.internal` and `gfx-unit-test-tool/linkTimeTypeLayoutNested.internal`. --------- Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Fix Metal 8-bit vector type names: emit char/uchar instead of int8_t/uint8_t (#8223)Copilot2025-08-26
The Metal backend was generating incorrect type names for 8-bit vector types, causing compilation failures when targeting Metal. According to the Metal specification, 8-bit vector types should be named `charN` and `ucharN` (e.g., `char2`, `uchar3`) rather than `int8_tN` and `uint8_tN`.

## Problem
When compiling Slang code with 8-bit vector types for Metal, the compiler would emit:
```metal
uint8_t2 _S8 = uint8_t2(uint8_t(0U), uint8_t(16U));
int8_t3 _S9 = int8_t3(int8_t(0), int8_t(16), int8_t(48));
```
But the Metal compiler expects:
```metal
uchar2 _S8 = uchar2(uint8_t(0U), uint8_t(16U));
char3 _S9 = char3(int8_t(0), int8_t(16), int8_t(48));
```
This caused errors like:
```
error: unknown type name 'uint8_t2'; did you mean 'uint8_t'?
```

## Solution
Modified `MetalSourceEmitter::emitSimpleTypeImpl()` to emit the correct Metal-specific type names for 8-bit types:
- `kIROp_Int8Type` now emits `char` instead of `int8_t`
- `kIROp_UInt8Type` now emits `uchar` instead of `uint8_t`

This change only affects the Metal backend and ensures that vector types like `int8_t2`, `uint8_t3`, etc. are correctly emitted as `char2`, `uchar3`, etc.

## Testing
- Added a new test case `tests/metal/8bit-vector-types.slang` to verify the fix
- Re-enabled the previously disabled Metal test in `tests/hlsl-intrinsic/countbits8.slang`
- Updated `tests/metal/byte-address-buffer.slang` to expect the correct type names
- Verified that existing Metal tests continue to pass

Fixes #8211.

---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: bmillsNV <163073245+bmillsNV@users.noreply.github.com>
* Fix#8084: Batch-8: Enable cuda tests (#8268)Harsh Aggarwal (NVIDIA)2025-08-25
|
* Fix#8083: Batch-7: Enable cuda tests (#8267)Harsh Aggarwal (NVIDIA)2025-08-25
|
* Fix#8082: Batch-6: Enable cuda tests (#8266)Harsh Aggarwal (NVIDIA)2025-08-25
|
* Fix#8081: Batch-5: Enable cuda tests (#8263)Harsh Aggarwal (NVIDIA)2025-08-25
|
* Fix#8080: Batch-4: Enable cuda tests (#8261)Harsh Aggarwal (NVIDIA)2025-08-25
|
* Fix mesh shader OutputIndices subscript error by adding missing ref accessor (#7929)Lujin Wang2025-08-22
Fixes the Slang compiler internal error "subscript had no getter" when reading from mesh shader output index arrays (e.g., `triangles[0].x`).

## Problem
The `OutputIndices` struct was missing a `ref` accessor in its `__subscript` implementation, causing the compiler to fail when trying to materialize subscript expressions as r-values.

## Solution
Added the missing `ref` accessor to `OutputIndices.__subscript` using the `kIROp_MeshOutputRef` intrinsic operation, matching the pattern used in `OutputVertices` and `OutputPrimitives`.

## Files Changed
- `source/slang/core.meta.slang` - Added missing `ref` accessor
- `tests/bugs/gh-7925.slang` - Test case to reproduce and verify the fix

Fixes #7925

Generated with [Claude Code](https://claude.ai/code)

---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Lujin Wang <lujinwangnv@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Fix reflection JSON writing userAttribs section twice for some cases. (#8210)MindSpunk2025-08-21
| | | | | | | | | `emitReflectionVarLayoutJSON` will output the `userAttribs` section twice as it gets output by `emitReflectionModifierInfoJSON` first before being output again by a direct call to `emitUserAttributes`. It seems the answer here is to just remove the extra explicit call to `emitUserAttributes` and rely on the call in `emitReflectionModifierInfoJSON`?
* Introduce CDataLayout & -fvk-use-c-layout (#8136)Julius Ikkala2025-08-21
| | | | | | | | | | | | | | | | Closes #8112. ~~The issue asks for a "C layout", but in this PR I use the term "CPU layout" because this naming was pre-existing in the codebase as `kCPULayoutRulesImpl_`. The primary purpose of this layout is to match CPU-side struct definitions with the shader side. I'm open to better naming suggestions, though.~~ Edit: switched back to using `CDataLayout` & `-fvk-use-c-layout`, as the CPU target depends on the object layout rules of existing CPU layout rules, but they're incompatible with actual shaders. So a new `kCLayoutRulesImpl_` was needed anyway. --------- Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
* Implement SV_VulkanSamplePosition (#8236)davli-nv2025-08-21
- Adds semantic SV_VulkanSamplePosition that emits corresponding gl_SamplePosition and SpvBuiltinSamplePosition
- Adds gl_SamplePosition property to glsl.meta.slang
- Adds SPIRV and GLSL tests for the semantic and property
- Plan is to later implement SV_SamplePosition that follows the HLSL range of -0.5 to +0.5, and emits GetRenderTargetSamplePosition(SV_SampleIndex), which needs more complicated IR manipulation for HLSL and Metal

Fixes #7906

---------
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
* Fix nextafter() (#8195)Julius Ikkala2025-08-20
| | | | | | | | | | Fixes #8185. The previous implementation is incorrect and basically only works in the `x = 0` case. `delta` was the smallest possible positive value representable as a float, but that's below the rounding error of addition with almost all reasonably sized floats. This fixed implementation is based on bit twiddling instead. I've checked the float case against the C++ `nextafterf` with both a -inf -> inf and inf -> -inf sweep, in addition to the test included in this PR.
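To see why adding a tiny `delta` fails and what a bit-level approach looks like, here is a sketch for 32-bit floats. It is not the code from the PR; `nextAfterBits` is a hypothetical name, and the sketch assumes IEEE 754 binary32 floats without flush-to-zero. With the old approach, `1.0f + delta` rounds straight back to `1.0f`, because the smallest positive float is far below the spacing between floats near 1.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Step x by one representable value toward y by adjusting its bit pattern.
float nextAfterBits(float x, float y)
{
    if (std::isnan(x) || std::isnan(y))
        return x + y; // propagate NaN
    if (x == y)
        return y;

    std::uint32_t bits;
    if (x == 0.0f)
    {
        // From +0 or -0, the next value is the smallest subnormal
        // carrying the sign of the direction of travel.
        bits = (y > 0.0f) ? 0x00000001u : 0x80000001u;
    }
    else
    {
        std::memcpy(&bits, &x, sizeof bits);
        // For a fixed sign, a larger bit pattern means a larger magnitude,
        // so moving away from zero increments and moving toward zero decrements.
        bool awayFromZero = (y > x) == (x > 0.0f);
        bits += awayFromZero ? 1u : std::uint32_t(-1);
    }

    float result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
}
```

Incrementing the pattern of the largest finite float produces the pattern for infinity, so the overflow case needs no special handling in this sketch.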
* Add Metal support for WaveGetActiveMask and WaveActiveCountBits (#8218)Tianyu Li2025-08-20
## Summary
- Add Metal platform support for `WaveGetActiveMask()` and `WaveActiveCountBits()` wave intrinsics
- Update capability requirements to include Metal platform for subgroup ballot operations
- Implement Metal-specific intrinsic assembly using `simd_ballot()` and `simd_vote` APIs

## Changes
- **source/slang/hlsl.meta.slang**:
  - Add Metal target case for `WaveGetActiveMask()` using `simd_ballot(true)`
  - Update capability requirements from `cuda_glsl_hlsl_spirv` to `cuda_glsl_hlsl_metal_spirv` for wave ballot functions
- **source/slang/slang-capabilities.capdef**:
  - Add `metal` to `subgroup_ballot_activemask` capability alias
* Updated support to enable batch3 (#8219)Harsh Aggarwal (NVIDIA)2025-08-20
Enable CUDA support for batch 3 tests:
- Enhanced wave operations with exclusive support
- Added proper identity values for min/max operations
- Fixed intrinsic name mapping issues
- Updated test configurations

Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
* Emit descriptor handle correctly for ParameterBlock<DescriptorHandle> (#8206)Gangzheng Tong2025-08-18
In Metal, if a `ParameterBlock` contains a `DescriptorHandle` directly, it would be emitted as a DescriptorHandle literal, which is not valid Metal code. This fix adds a case for `kIROp_DescriptorHandleType` and directs it to the parent's `emitType` function to handle it.
* Fix issue of double lowering a differentiable function (#8182)kaizhangNV2025-08-18
Closes #8054. The detailed root cause is described at: https://github.com/shader-slang/slang/issues/8054#issuecomment-3189579508
* Fix constructor overload ambiguity with scalar and vector parameters (#8109)Copilot2025-08-18
Closes #8090.

When we coerce types, we use a cache to store the conversion cost between different types. The key of the cache is defined by

    struct BasicTypeKey
    {
        uint32_t baseType : 8;
        uint32_t dim1 : 4;
        uint32_t dim2 : 4;
        ...
    }

where dim1 and dim2 are used for the dimensions of vectors and matrices. However, each dim field is only 4 bits, so `vector<int, 16>` will have the same key as `int`, which is wrong.

Fix the issue by extending the dim fields to 8 bits. To keep the hash key within 32 bits, we also adjust baseType to 5 bits and knownConstantBitCount to 6 bits.

---------
Co-authored-by: kaizhangNV <kazhang@nvidia.com>
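The key collision described in the commit above can be reproduced in isolation. The sketch below is not Slang's `BasicTypeKey`: `NarrowKey`, `WideKey` and `packKey` are hypothetical, only the field widths follow the commit message, and it assumes a scalar is encoded with a dimension of 0.

```cpp
#include <cstdint>

// Field widths before the fix: a 4-bit dimension holds 0..15 only.
struct NarrowKey
{
    std::uint32_t baseType : 8;
    std::uint32_t dim1 : 4;
    std::uint32_t dim2 : 4;
};

// Field widths after the fix: an 8-bit dimension holds 0..255.
struct WideKey
{
    std::uint32_t baseType : 5;
    std::uint32_t dim1 : 8;
    std::uint32_t dim2 : 8;
};

// Store the values in the bit-fields, then pack what was actually
// stored into one integer so two keys can be compared.
template <typename Key>
std::uint32_t packKey(std::uint32_t baseType, std::uint32_t dim1, std::uint32_t dim2)
{
    Key k{};
    k.baseType = baseType;
    k.dim1 = dim1; // mainstream compilers keep only the low bits that fit
    k.dim2 = dim2;
    return (std::uint32_t(k.baseType) << 16) | (std::uint32_t(k.dim1) << 8) |
           std::uint32_t(k.dim2);
}
```

With the narrow layout, a dimension of 16 is stored as 0, so the key for a 16-element vector matches the key for the scalar; with the wide layout the two keys differ.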
* Enable CUDA Test Enablement - Batch 1: Autodiff Tests (1-16) (#8139)Harsh Aggarwal (NVIDIA)2025-08-18
|
* Use 64bit int instead of emulation on metal (#8180)James Helferty (NVIDIA)2025-08-15
| | | | | | | | | | | | Metal's popcount prototype is `T popcount(T x)` but we want to use it to implement `countbits` where the prototype always returns `uint`. Using `popcount` directly would implicitly cast successfully to the 32-bit return value in all cases except when the argument is a 64-bit type. Thus, this change always explicitly casts the result to `$TR`, which should be one of the `uint[N]` types, and should always be able to hold the number of bits in the type. Addresses #6877
* [CUDA] Fix incorrect `kIROp_RaytracingAccelerationStructureType` emitting logic (#8168)ArielG-NV2025-08-15
Fixes: #8167

The current emitting logic does not work; this has been corrected. The provided test ensures our CUDA code is valid by compiling PTX from it.

`m_writer->emit("OptixTraversableHandle");` should be `out <<`, since `out` adds to the type-name cache; otherwise using a type twice will produce bad type names (since we filled the type-name cache with "" instead of "typeName").
* Prohibit use of buffer.GetDimensions on metal (#8156)James Helferty (NVIDIA)2025-08-15
| | | Fixes #7011
* [Capability System] Fix bug where capabilities do not correctly propagate if AST-parent has target+set the AST-child does not (#8175)ArielG-NV2025-08-14
Fixes: #8174

Changes:
* To determine whether we propagate capabilities, we need to ensure that a `join` would do nothing (an optimization, since `join` is expensive and caching data for the `join` adds up to be expensive). This logic was changed in `slang-check-decl.cpp` because the previous logic was incorrect.
* A parent could have the set `metal+glsl` and the use-site could have `glsl`. In this case, we will not remove `metal` from the parent since `{metal+glsl}.implies({glsl})` is true.

---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Handle SV_Barycentrics on metal (#8163)James Helferty (NVIDIA)2025-08-13
| | | Fixes #6785
* Remove the semantic decoration from the original entry struct (#8146)Jay Kwak2025-08-13
When we legalize the entry point parameter, there are cases where we need to reconstruct a struct for the parameter, and the original struct is no longer used. But if the user also uses the original struct as the type of a function parameter, we end up using both the original struct and the synthesized struct at the same time. On Metal and WGSL, it is an error when an identical semantic is used on more than one variable.

This commit removes the semantics from the original struct after cloning the type.

Fixes https://github.com/shader-slang/slang/issues/8141
Related to https://github.com/shader-slang/slang/issues/7693

---------
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
* Enable CUDA testing for batch 2 (#8147)jarcherNV2025-08-12
Enable CUDA for the tests listed in issue #8078. This requires a minor CUDA prelude change that adds some math functions.
* [SPIR-V] Emit control flags for `branch/flatten` decorations (#8134)amidescent2025-08-09
| | | | Co-authored-by: Jay Kwak <82421531+jkwak-work@users.noreply.github.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Fix atomics error diagnostics (#8117)venkataram-nv2025-08-09
| | | | | | | Fixes #8116 --------- Co-authored-by: Jay Kwak <82421531+jkwak-work@users.noreply.github.com>