Age  Commit message  Author
2025-09-22  Fixed typo in `SemanticsVisitor::_readAggregateValueFromInitializerList` (#8504)  Ronan
I think the commit diff speaks for itself.
2025-09-19  Use LOAD_LIBRARY_SEARCH_DEFAULT_DIRS for LoadLibraryExW (#8491)  Gangzheng Tong
Before:
- Uses `LOAD_LIBRARY_SEARCH_USER_DIRS` in `LoadLibraryExW`, which might cause an exception if no paths have been added by `AddDllDirectory()`.
After:
- Uses the composite flag `LOAD_LIBRARY_SEARCH_DEFAULT_DIRS`, which searches several locations.
- Still searches directories added by `AddDllDirectory()`, but avoids searching an empty path list when there are no `AddDllDirectory()` calls.
Related to https://github.com/shader-slang/slang/issues/8462
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-09-19  include limits.h in slang-platform.cpp (#8473)  John Zupin
Fixes https://github.com/shader-slang/slang/issues/8472. With GCC 9.4.0 on Ubuntu 20.04, compilation fails with an error about `PATH_MAX` not being declared. Co-authored-by: Mukund Keshava <mkeshava@nvidia.com>
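The pattern behind this fix can be sketched as follows. This is only an illustration, assuming a POSIX system; `makePathBuffer` and the 4096-byte fallback are hypothetical and do not come from `slang-platform.cpp`. The point is that `PATH_MAX` is declared by `<limits.h>`, so code using it should include that header directly rather than rely on another header pulling it in.

```cpp
#include <limits.h> // declares PATH_MAX on POSIX systems; included explicitly on purpose
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the Slang sources): allocate a zero-filled
// buffer large enough to hold any path the platform accepts.
std::vector<char> makePathBuffer()
{
#ifdef PATH_MAX
    const std::size_t capacity = PATH_MAX;
#else
    const std::size_t capacity = 4096; // assumed fallback for platforms without PATH_MAX
#endif
    return std::vector<char>(capacity, '\0');
}
```

A caller could pass `makePathBuffer().data()` to an API such as `readlink` or `realpath`; the sketch stops short of that to stay portable.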
2025-09-18  Fix DebugCompilationUnit to reference main shader file instead of header files (#7957)  Lujin Wang
This PR implements the requested fix for issue #7923 where DebugCompilationUnit incorrectly referenced header files instead of the main shader file.
## Summary
- Modified IRDebugSource to include isIncludedFile flag as third operand
- Updated emitDebugSource function to accept and pass the included file flag
- Updated call sites to use source->isIncludedFile() from SourceFile class
- Modified SPIR-V emission to only create DebugCompilationUnit for non-included files
## Test Results
The fix has been verified with the provided reproducer code. The SPIR-V output now correctly shows DebugCompilationUnit referencing the main shader file instead of header files.
Generated with [Claude Code](https://claude.ai/code)
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Lujin Wang <lujinwangnv@users.noreply.github.com>
Co-authored-by: Claude Code <claude@anthropic.com>
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-09-18  Fix CUDA global variable initialization with constructor calls (#8340)  Harsh Aggarwal (NVIDIA)
Resolves CUDA compilation failure where global variables with struct constructor initialization generated illegal `__device__` variable runtime initialization.

**Problem:**
```cuda
// Generated invalid CUDA code:
__device__ static const Stuff_0 gStuff_0 = Stuff_x24init_0(args...);
// Error: "dynamic initialization is not supported for a __device__ variable"
```

Root Cause Discovered: Through extensive debugging, found that moveGlobalVarInitializationToEntryPoints pass only handled kIROp_GlobalVar instructions, but global constants with constructor calls appeared as kIROp_Call instructions at module scope.

Solution:
1. IR Pipeline Fix: Extended moveGlobalVarInitializationToEntryPoints to detect and transform module-level constructor calls into proper global variables with entry-point initialization
2. Field Access Fix: Enhanced kIROp_FieldExtract logic to emit correct -> syntax for pointer types and address-of operations
3. Constructor Emission: Added CUDA-specific handling for constructor calls

Architecture:
- Transforms let %gStuff = call %Constructor(...) into kernel context initialization
- Moves runtime initialization from global scope to entry-point execution
- Follows CUDA best practices for global state management

Files:
- source/slang/slang-ir-explicit-global-init.cpp: Extended IR transformation pass
- source/slang/slang-emit-c-like.cpp: Enhanced field access and foldable value logic
- source/slang/slang-emit-cuda.cpp: Added CUDA-specific field extraction handling

Result:
    // Now generates proper CUDA code:
    struct KernelContext_0 { Stuff_0 gStuff_1; };
    // Runtime initialization in entry point:
    kernelContext_1.gStuff_1 = constructor_call();

Fixes: tests/compute/type-legalize-global-with-init.slang
2025-09-17  Add WASM FS module support for slang-playground (#8459)  Gangzheng Tong
This change adds the necessary Emscripten build flags to export the FileSystem (FS) module interface in the slang-wasm build:
- Adds -sMODULARIZE=1 to enable modular builds
- Adds -sEXPORTED_RUNTIME_METHODS=['FS'] to export the FS interface
These changes are required to support the slang-playground. The existing flags are also reformatted for better readability. Related to https://github.com/shader-slang/slang-playground/issues/170
2025-09-17  Diagnostic for metal ref mesh output assignment (#8365)  James Helferty (NVIDIA)
When slang detects assignment to a mesh output reference on metal, generate a diagnostic message. (Metal mesh shader outputs must be assigned via 'set' instead of 'ref'.) Fixes #7498
2025-09-17  Fix LSS intrinsics for hit objects in ray tracing tests (#8469)  Harsh Aggarwal (NVIDIA)
Enable GetLssPositionsAndRadii() call in rayGenLssIntrinsicsHitObject shader that was previously commented out. This fixes the failing ray-tracing-lss-intrinsics-hit-object test which was returning all zero values for LSS position and radius data. The hit object LSS intrinsics are now working correctly in D3D12 backend, returning proper endcap positions and radii values as expected by the test. All 27 test assertions now pass successfully. Fixes #8128
2025-09-17  Added __magic_enum (#8436)  Ronan
Fixes #8406 (and #8410). `AddressSpace`, `MemoryScope` and `AccessQualifier` are no longer `BaseType`. I added a new `__magic_enum` (very similar to `__magic_type`) syntax to be able to easily create values of these enums from the compiler. (I don't know if it was the right way to do it, but it works and the changes are small enough?). I had a weird bug: `tests/language-feature/capability/address-of.slang` was failing in `IRBuilder::_findOrEmitConstant(IRConstant& keyInst)`. When needing a new `u64(0)`, it did not find it in the `ConstantMap` first, but then failed to add it right after because it already existed in the map! But this was triggered by `IRPtrType* IRBuilder::getPtrType(IROp op, IRType* valueType, AccessQualifier accessQualifier, AddressSpace addressSpace)`, which is a strange coincidence... but I could not find the issue in what I did. I ended up bumping unordered_dense, and it solved the issue (so there was a bug in there).
2025-09-16  Diagnose error when the function args can't satisfy constexpr parameter requirements (#7269)  Gangzheng Tong
## Summary
This PR enhances constexpr validation by adding proper error checking when function arguments cannot satisfy constexpr parameter requirements, addressing issue #6370.
## Problem
Previously, when a function declared constexpr parameters, the compiler would attempt to propagate constexpr-ness to the call site arguments, but there was insufficient validation and error reporting when this propagation failed. This could lead to silent failures where constexpr requirements weren't properly enforced.
## Solution
This PR adds checks that:
1. **Validates constexpr arguments**: When a function parameter is marked as `constexpr`, the compiler now explicitly checks that the corresponding argument can be marked as `constexpr`
2. **Issues clear compilation errors**: (added `Diagnostics::argIsNotConstexpr`)
3. **Handles both call scenarios**: The validation works for both:
   - Direct function calls with IR-level function definitions
   - Calls to functions from external modules
Fixes #6370
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-09-10  Fix segfault in SPIR-V header processing in SpirvInstructionHelper (#8428)  Gangzheng Tong
The `SpirvInstructionHelper::loadBlob()` method could segfault when calling `m_headerWords.addRange()` if the SPIR-V blob contained insufficient data for the required 5-word header. To reproduce, run
```
./build/Debug/bin/slangc.exe tests/modules/environment.slang -o tests/modules/environment.slang-module -target spirv -separate-debug-info
(0): error 57004: output SPIR-V contains no exported symbols. Please make sure to specify at least one entrypoint.
Segmentation fault
```
The error is expected, but the `Segmentation fault` is not. This PR adds the check to ensure the SPIR-V blob has at least `SPV_INDEX_INSTRUCTION_START * sizeof(SpvWord)` bytes (20 bytes minimum) before attempting to process the header words. Related to: https://github.com/shader-slang/slang/issues/7547
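The size check described above can be sketched roughly like this. It is not the code from the PR: `readHeaderWords` and `kHeaderWordCount` are hypothetical names, and `std::vector` stands in for whatever container `m_headerWords` actually is. The only facts taken from the commit message are the 5-word header and the 20-byte minimum.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using SpvWord = std::uint32_t;

// A SPIR-V module begins with five header words (magic, version, generator,
// bound, schema), so instructions start at word index 5.
constexpr std::size_t kHeaderWordCount = 5;

// Hypothetical stand-in for the guarded part of loadBlob(): refuse blobs that
// are too small instead of reading past the end of the buffer.
bool readHeaderWords(const void* data, std::size_t sizeInBytes, std::vector<SpvWord>& outHeader)
{
    if (data == nullptr || sizeInBytes < kHeaderWordCount * sizeof(SpvWord))
        return false; // fewer than 20 bytes: no complete header to copy

    outHeader.resize(kHeaderWordCount);
    std::memcpy(outHeader.data(), data, kHeaderWordCount * sizeof(SpvWord));
    return true;
}
```

Returning `false` lets the caller report the existing "no exported symbols" error and stop, rather than crash afterwards.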
2025-09-10  Add FindModifier for Declarations (#8308)  Xuanda Yang
Add `findModifier` for `DeclReflection` so that patterns like `extern struct foo;` can be properly reflected. Closes #8009
2025-09-10  Fix crash when compiling specialized generic entrypoint containing a static const decl. (#8392)  Yong He
Closes #8184. We fixed three issues with this regression test:
1. After generating IR for a `SpecializeComponentType`, we should also strip the frontend decorations from the IR so there is no HighLevelDeclDecoration that will go into the backend.
2. When lowering a static const inside a generic function, we should not give the static const a linkage, because such a constant will not appear in global scope. Trying to give it a linkage decoration causes the parent generic (for the function) to have two duplicate Export/Import decorations with different mangled names, which confuses the linker.
3. Make sure internal exceptions do not leak through `IComponentType::getEntryPointCode`/`getTargetCode`.
2025-09-10  Fix pointers and C-like layout in varying parameters (#8425)  Julius Ikkala
Closes #8409, but ended up being more about fixing another bug. While the issue itself seems to only be a simple typo fix (see second commit in this PR), I found out while writing a test that pointers never got correct locations regardless of layout. Their locations were always assigned to zero due to lacking a resource usage entry in `TypeLayout`. They were also missing the `Flat` decoration, so I went ahead and added that too. I can split this up into two separate PRs if that's preferred; both aspects just share a test right now and fix a similar-looking issue in the resulting SPIR-V.
2025-09-10  Squash warnings on gcc 14 (#8377)  Ellie Hermaszewska
2025-09-10  CUDA: Fix compiler crash with unsized array field - nonuniformres-as-… (#8380)  Harsh Aggarwal (NVIDIA)
…function-parameter.slang #8315 Root Cause: CUDA compilation crashed with `assert failure: !seenFinalUnsizedArrayField` because unsized arrays like `RWStructuredBuffer<uint> globalBuffer[]` were not the final field in generated parameter structs, violating the layout constraint in slang-ir-layout.cpp. Fix: Extended `collectGlobalUniformParameters` to automatically reorder struct fields for CUDA targets - regular fields first, unsized arrays last. Other targets preserve original order. Impact: - Enables CUDA support for nonuniform resource indexing as function parameters - Zero impact on existing GLSL/HLSL/SPIRV targets - Automatic handling - no manual parameter reordering required Files: slang-emit.cpp, slang-ir-collect-global-uniforms.cpp/.h, test file --------- Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
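The reordering rule in the fix (regular fields first, unsized arrays last, relative order otherwise unchanged) can be illustrated with a stable partition. This is a sketch under assumptions: `Field` and `moveUnsizedArraysLast` are hypothetical, and the real pass works on IR struct fields rather than a vector of names.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical field record; the actual pass operates on Slang IR, not on this type.
struct Field
{
    std::string name;
    bool isUnsizedArray;
};

// Move unsized-array fields to the end while keeping the relative order of
// both groups, so the layout rule "unsized array must be the final field" can hold.
void moveUnsizedArraysLast(std::vector<Field>& fields)
{
    std::stable_partition(
        fields.begin(),
        fields.end(),
        [](const Field& f) { return !f.isUnsizedArray; });
}
```

A stable partition is used here so that fields which do not need to move keep their declared order; whether the real pass guarantees that is not stated in the commit message.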
2025-09-10  Check if debugVar for is debuggable types in the legalization pass (#8326)  Gangzheng Tong
## Problem When generic functions with debug variables were specialized with concrete types containing non-debuggable fields (e.g., `StructuredBuffer`), the IR cloning process would create invalid `DebugVar` instructions without checking if the substituted types remained debuggable. ## Solution This fix adds a defensive check in the legalization pass that removes the debugVar created for the non-debuggable types. --------- Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-09-10  Fix language server auto-complete regression in debug build. (#8416)  Yong He
Fixes this regression:
```slang
struct MyType
{
    // Regression Condition 1: there must be more than one member in the lookup scope.
    float v;
    int getSum() { return 0; }
}
void m(MyType t)
{
    // Regression condition 2: the completion must be in an init expression.
    // Regression condition 3: none of the candidate members can coerce to the expected type.
    // Regression behavior: no completion candidates are shown, because
    // SemanticsVisitor::resolveOverloadedLookup throws an error when there are 0 applicable candidates
    // after type coercion filtering.
    Texture2D x = t.; // completion request after . here
}
```
The root cause is that we shouldn't be applying candidate filtering on the candidate list when in completion checking mode. Closes #8417.
2025-09-09  Enable slang_lldb.py on macos (#8327)  James Helferty (NVIDIA)
macOS 15.6 includes Python 3.9.6 with Xcode, which doesn't understand match/case. Changing it to the less spiffy if/elif. Co-authored-by: Yong He <yonghe@outlook.com> Co-authored-by: Sam Estep <sam@samestep.com>
2025-09-08  Use wide char version of Windows API (#8390)  Gangzheng Tong
This PR modernizes the Windows-specific code by replacing ANSI Windows API functions with their Unicode (wide character) counterparts. This change ensures proper handling of Unicode file paths and strings on Windows systems. ### File Operations (`source/core/slang-io.cpp`) - `DeleteFileA` → `DeleteFileW` - `GetTempPathA` → `GetTempPathW` - `GetTempFileNameA` → `GetTempFileNameW` - `RemoveDirectoryA` → `RemoveDirectoryW` - `SHFileOperationA` → `SHFileOperationW` - `GetModuleFileNameA` → `GetModuleFileNameW` with UTF-8 conversion ### Platform Operations (`source/core/slang-platform.cpp`) - `GetModuleHandleExA` → `GetModuleHandleExW` - `LoadLibraryExA` → `LoadLibraryExW` - `LoadLibraryA` → `LoadLibraryW` - `OutputDebugStringA` → `OutputDebugStringW` ### Runtime and Tools - `MessageBoxA` → `MessageBoxW` in slang-rt - `GetCurrentDirectoryA` → `GetCurrentDirectoryW` in slang-fiddle - String literal conversion to wide strings in vk-pipeline-create --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Gangzheng Tong <gtong-nv@users.noreply.github.com> Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-09-08  Don't emit ArrayStride 0 for RWStructuredBuffer pointers (#8397)  Julius Ikkala
Fixes #8396 by not emitting the `ArrayStride` when it would've been zero. The problem is caused by #7848, more details in the issue description. I checked that with equivalent GLSL code, glslangValidator does not emit any `ArrayStride`. I assume that the addition of `storageClass == SpvStorageClassStorageBuffer` to line 1848 is not a mistake. If it is, that addition could also be simply reverted to fix this issue, I tested that option as well. With these changes, Slang tests work locally on my PC again. Related to this; it'd be nice to have GPUs from multiple vendors in the CI to avoid this kind of thing happening again. Or even just llvmpipe; that doesn't even require a GPU and would add at least one more driver to test with.
2025-09-05  Relax restriction on using link-time types for shader parameters. (#8387)  Yong He
This change relaxes a previous restriction on link-time types and constants, so that we now allow them to be used to define shader parameters. Doing so will result in a parameter layout that is incomplete prior to linking. The PR added a test to call the reflection API on a fully linked program and ensure that we can report correct binding info.
2025-09-06  Add check for backtrace availability (#8329)  Dario Mylonopoulos
The header execinfo.h and the related backtrace functionality are not available on all Linux platforms. In particular, they are missing on musl Linux and on Android before API version 33. This causes compilation errors on those platforms. With this change, we first check if backtrace functionality is available by checking if we are using glibc or a compatible Android version. Tested on manylinux_2_28 with glibc 2.28 and musllinux_1_2 with musl 1.2; not tested on Android. Co-authored-by: Yong He <yonghe@outlook.com>
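One way such an availability check might look is sketched below. The exact condition used in the PR is not shown in the commit message, so the macro test here (`__GLIBC__`, or `__ANDROID_API__` at 33 or above) is an assumption based on the description, and `SKETCH_HAS_BACKTRACE` and `captureFrames` are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib> // including a libc header makes __GLIBC__ visible on glibc systems

// Assumed condition: glibc, or Android at API level 33 or newer.
#if defined(__GLIBC__) || (defined(__ANDROID_API__) && __ANDROID_API__ >= 33)
#define SKETCH_HAS_BACKTRACE 1
#include <execinfo.h>
#else
#define SKETCH_HAS_BACKTRACE 0
#endif

// Hypothetical wrapper: returns the number of captured frames, or 0 when the
// platform has no backtrace() (for example musl).
int captureFrames(void** frames, int maxFrames)
{
#if SKETCH_HAS_BACKTRACE
    return backtrace(frames, maxFrames);
#else
    (void)frames;
    (void)maxFrames;
    return 0;
#endif
}
```

Keeping the `#include <execinfo.h>` inside the guard is what avoids the compilation error on platforms that lack the header.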
2025-09-05  Add warnings for overflows of integer types (#8281)  jarcherNV
Previously, code such as `int x4 = 0xFFFFFFFFFFFFFFFF` did not produce a warning even though the value is too large for the type. This patch now checks for this and similar issues during parsing.
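A rough sketch of the kind of range check involved is given below. The actual rule Slang applies (for example, how it treats hex literals that fit the unsigned but not the signed range) is not described in the commit message, so this sketch simply assumes a literal "fits" when its bit pattern needs no more than the target width. `literalFitsInBits` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical check: does the parsed literal value fit in `bitCount` bits
// when treated as a raw bit pattern? (Assumed rule, not Slang's documented one.)
bool literalFitsInBits(std::uint64_t value, unsigned bitCount)
{
    if (bitCount >= 64)
        return true; // every 64-bit value fits; also avoids an out-of-range shift
    return (value >> bitCount) == 0;
}
```

Under this assumption, `0xFFFFFFFFFFFFFFFF` assigned to a 32-bit `int` fails the check and would trigger the warning, while `0xFFFFFFFF` passes.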
2025-09-05  Try both LoadLibrary functions on Windows (#8368)  jarcherNV
If a given library cannot be found using LoadLibraryExA then try again using LoadLibraryA. Return an error only if both of these failed.
2025-09-04  Fix#8128 LSS and sphere hit object intrinsics fail to compile (#8339)  Harsh Aggarwal (NVIDIA)
Update intrinsics signature as per the nvapi header
2025-09-04  Enable CUDA support for additional HLSL intrinsic tests (#8293)  Harsh Aggarwal (NVIDIA)
Enable CUDA support for additional HLSL intrinsic tests by implementing missing functionality and fixing compiler bugs affecting CUDA targets.
- Fix critical bug in InterlockedCompareStore64 where division used /4 instead of /8 for 64-bit types, causing incorrect memory addressing for all signed int64_t atomics
- Add signed int64_t atomic wrappers (atomicExch, atomicCAS) to CUDA prelude that properly cast to/from unsigned types as required by CUDA's atomic API
- Enable tests: atomic-intrinsics-64bit.slang
- Implement CUDA support for QuadAny and QuadAll operations using warp shuffle primitives (__shfl_sync with quad-level lane masking)
- Add CUDA to quad_control capability definition in slang-capabilities.capdef
- Add _slang_quadAny/_slang_quadAll helper functions to CUDA prelude
- Enable tests: quad-control-comp-functionality.slang, subgroup-quad.slang
---------
Co-authored-by: szihs <675653+szihs@users.noreply.github.com>
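The `/4` versus `/8` bug in the first bullet comes down to converting a byte offset into an element index. The sketch below shows the idea in plain C++ rather than in the CUDA prelude; `elementIndexFromByteOffset` is a hypothetical helper, and it assumes the buffer is addressed in bytes and then viewed as an array of `T`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: convert a byte offset into an index into an array of T.
// Dividing by sizeof(T) instead of a hard-coded 4 is what keeps 64-bit
// accesses from landing on the wrong element.
template <typename T>
constexpr std::size_t elementIndexFromByteOffset(std::size_t byteOffset)
{
    return byteOffset / sizeof(T);
}
```

For a byte offset of 16, a 32-bit view yields index 4 while a 64-bit view yields index 2; using the 32-bit divisor for 64-bit data would address element 4 instead of element 2.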
2025-09-03  Diagnose on structured buffers containing resources (#8222)  Ellie Hermaszewska
closes https://github.com/shader-slang/slang/issues/3313
2025-09-03  Enable ccache for self-hosted runner (#8345)  Gangzheng Tong
Related to https://github.com/shader-slang/slang/issues/6728 --------- Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-09-02  Emit DebugInfo for the legalized entry point parameters (#7703)  Jay Kwak
This commit emits the debug info for the entry point parameters. Two things are implemented/fixed in this PR:
- We were not emitting the `DebugVar` and `DebugValue` at the IR lowering level when the type of the entry point parameter is `ConstRef`. This commit handles the `ConstRef` case in the same way that the other types are handled so that `DebugVar` and `DebugValue` are properly emitted at the IR lowering level.
- Two types for Geometry shaders were incorrectly treated as not valid types for the DebugInfo. They are `InputPatch` and `OutputPatch`. This commit handles them as valid types for DebugInfo.
2025-08-29  Remove unused variable in slangc::main (#8325)  Jay Kwak
2025-08-29  [CBP] Pointer frontend changes + groupshared pointer support (#7848)  ArielG-NV
Resolves #7628 Resolves: #8197 Primary Goals: 1. Add `Access` to pointer 2. AddressSpace::GroupShared support for pointers (SPIR-V) 3. Add `__getAddress()` to replace `&` * `&` is not updated to `require(cpu)` since slangpy uses `&`. This means we must: (1) merge PR; (2) replace `&` with `__getAddress()`; (3) add `require(cpu)` to `&` Changes: * Added to `Ptr` the `Access` generic argument & logic (for `Access::Read`). * Moved the generic argument `AddressSpace` from `Ptr` to the end of the type. * Added pointer casting support between any `Ptr` as long as the `AddressSpace` is the same * Disallow globallycoherent T* and coherent T* * Disallow const T*, T const*, and const T* * Fixed .natvis display of `ConstantValue` `ValOperandNode` * Support generic resolution of type-casted integers * Added `VariablePointer` emitting for spirv + other minor logic needed for groupshared pointers Breaking Changes: * Anyone using the `AddressSpace` of `Ptr` will now have to account for the `Access` argument * we disallow various syntax paired with `Ptr` and `T*` --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-08-28  Revert "Reduce the dependency to thread library (#8216)" (#8319)  kaizhangNV
This causes the pthread not found issue on old glibc builds. This reverts commit 113327194d4cf750af6265a560615850a8e7e6fb.
2025-08-28  [Documentation] optix test coverage #463 (#8311)  Harsh Aggarwal (NVIDIA)
Update docs/shader-execution-reordering.md with additional intrinsics Add correct capability `LoadLocalRootTableConstant`
2025-08-28  Add SPIRV OpCapability for 8/16bit use in storage (#8194)  James Helferty (NVIDIA)
Emits the appropriate OpCapability for 8- and 16-bit type usage:
- UniformAndStorageBuffer8BitAccess: for 8-bit types in SpvStorageClassUniform and SpvStorageClassStorageBuffer
- UniformAndStorageBuffer16BitAccess: for 16-bit types in SpvStorageClassUniform and SpvStorageClassStorageBuffer
- StoragePushConstant8: for 8-bit types in SpvStorageClassPushConstant
- StoragePushConstant16: for 16-bit types in SpvStorageClassPushConstant
- StorageInputOutput16: for 16-bit types in SpvStorageClassInput and SpvStorageClassOutput
Generated with Claude Code, with revisions. Fixes #7879.
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: James Helferty (NVIDIA) <jhelferty-nv@users.noreply.github.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
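The mapping in the list above can be written down as a small lookup. This is only a sketch: the `StorageClass` enum and `storageCapabilityFor` are hypothetical, and the real emitter works with `SpvStorageClass` and `SpvCapability` values rather than strings. The capability names themselves are the ones listed in the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the storage classes named in the commit message.
enum class StorageClass { Uniform, StorageBuffer, PushConstant, Input, Output, Other };

// Returns the capability name needed for a scalar of `bitWidth` bits in the
// given storage class, or an empty string when none of the listed rules apply.
std::string storageCapabilityFor(StorageClass storageClass, int bitWidth)
{
    switch (storageClass)
    {
    case StorageClass::Uniform:
    case StorageClass::StorageBuffer:
        if (bitWidth == 8) return "UniformAndStorageBuffer8BitAccess";
        if (bitWidth == 16) return "UniformAndStorageBuffer16BitAccess";
        break;
    case StorageClass::PushConstant:
        if (bitWidth == 8) return "StoragePushConstant8";
        if (bitWidth == 16) return "StoragePushConstant16";
        break;
    case StorageClass::Input:
    case StorageClass::Output:
        if (bitWidth == 16) return "StorageInputOutput16";
        break;
    default:
        break;
    }
    return std::string();
}
```

The empty-string result for, say, 8-bit inputs only reflects that the commit message lists no capability for that combination; it says nothing about what the emitter does in that case.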
2025-08-26  Fix Metal 8-bit vector type names: emit char/uchar instead of int8_t/uint8_t (#8223)  Copilot
The Metal backend was generating incorrect type names for 8-bit vector types, causing compilation failures when targeting Metal. According to the Metal specification, 8-bit vector types should be named `charN` and `ucharN` (e.g., `char2`, `uchar3`) rather than `int8_tN` and `uint8_tN`.
## Problem
When compiling Slang code with 8-bit vector types for Metal, the compiler would emit:
```metal
uint8_t2 _S8 = uint8_t2(uint8_t(0U), uint8_t(16U));
int8_t3 _S9 = int8_t3(int8_t(0), int8_t(16), int8_t(48));
```
But the Metal compiler expects:
```metal
uchar2 _S8 = uchar2(uint8_t(0U), uint8_t(16U));
char3 _S9 = char3(int8_t(0), int8_t(16), int8_t(48));
```
This caused errors like:
```
error: unknown type name 'uint8_t2'; did you mean 'uint8_t'?
```
## Solution
Modified `MetalSourceEmitter::emitSimpleTypeImpl()` to emit the correct Metal-specific type names for 8-bit types:
- `kIROp_Int8Type` now emits `char` instead of `int8_t`
- `kIROp_UInt8Type` now emits `uchar` instead of `uint8_t`
This change only affects the Metal backend and ensures that vector types like `int8_t2`, `uint8_t3`, etc. are correctly emitted as `char2`, `uchar3`, etc.
## Testing
- Added a new test case `tests/metal/8bit-vector-types.slang` to verify the fix
- Re-enabled the previously disabled Metal test in `tests/hlsl-intrinsic/countbits8.slang`
- Updated `tests/metal/byte-address-buffer.slang` to expect the correct type names
- Verified that existing Metal tests continue to pass
Fixes #8211.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: bmillsNV <163073245+bmillsNV@users.noreply.github.com>
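The naming rule from the Solution section can be sketched as a small function. This is not `MetalSourceEmitter::emitSimpleTypeImpl()`; `ScalarKind` and `metalTypeName` are hypothetical, and the sketch only covers the two 8-bit cases the commit message mentions.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the two IR ops named in the commit message
// (kIROp_Int8Type and kIROp_UInt8Type).
enum class ScalarKind { Int8, UInt8 };

// Build the Metal spelling of an 8-bit scalar or vector type: the scalar
// name is char/uchar, and a vector appends its element count (char2, uchar3, ...).
std::string metalTypeName(ScalarKind kind, int elementCount)
{
    const std::string base = (kind == ScalarKind::Int8) ? "char" : "uchar";
    if (elementCount <= 1)
        return base; // scalar: no count suffix
    return base + std::to_string(elementCount);
}
```

Because the vector name is derived from the scalar name, changing only the scalar spelling is enough to turn `uint8_t2` into `uchar2`, which matches the description that the fix was made in the simple-type emission path.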
2025-08-26  Fix `shouldEmitSPIRVDirectly` (#8019)  ArielG-NV
Fixes: #8018 Changes: * Do not emit true for `shouldEmitSPIRVDirectly` with a GLSL target
2025-08-26  fix a autodiff crash (#8259)  kaizhangNV
Closes #8068. Currently, AutoDiff aggressively scans every IR instruction when searching for differentiable IR. This is inefficient and can cause bugs; details are in https://github.com/shader-slang/slang/issues/8068#issuecomment-3214856668. This PR changes that behavior. It first applies an initial filter that gathers only the global differentiable IR instructions, plus IRFunc and IRGeneric. An IRGeneric is picked only when it is used in another generic function (it is only useful when dealing with dynamic dispatch). The search for reachable instructions then starts from this list, using the same method as before.
2025-08-22  Fix mesh shader OutputIndices subscript error by adding missing ref accessor (#7929)  Lujin Wang
Fixes the Slang compiler internal error "subscript had no getter" when reading from mesh shader output index arrays (e.g., `triangles[0].x`).
## Problem
The `OutputIndices` struct was missing a `ref` accessor in its `__subscript` implementation, causing the compiler to fail when trying to materialize subscript expressions as r-values.
## Solution
Added the missing `ref` accessor to `OutputIndices.__subscript` using the `kIROp_MeshOutputRef` intrinsic operation, matching the pattern used in `OutputVertices` and `OutputPrimitives`.
## Files Changed
- `source/slang/core.meta.slang` - Added missing `ref` accessor
- `tests/bugs/gh-7925.slang` - Test case to reproduce and verify the fix
Fixes #7925
Generated with [Claude Code](https://claude.ai/code)
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Lujin Wang <lujinwangnv@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
2025-08-22  Fix readlink missing include (#8260)  TheGoldMonkey
On Linux, `slang-platform.cpp` compiles with libstdc++ only because `unistd.h` is being transitively included. It fails to compile with standard libraries that don't include `unistd.h`, like libc++. This is how it's being transitively included with libstdc++:
```
/home/mcvm/dev/slang/source/core/slang-platform.h
/home/mcvm/dev/slang/source/core/../core/slang-string.h
/home/mcvm/dev/slang/source/core/../core/slang-hash.h
/home/mcvm/dev/slang/external/unordered_dense/include/ankerl/unordered_dense.h
/usr/lib64/gcc/x86_64-unknown-linux-gnu/15.1.0/../../../../include/c++/15.1.0/memory
/usr/lib64/gcc/x86_64-unknown-linux-gnu/15.1.0/../../../../include/c++/15.1.0/bits/shared_ptr_atomic.h
/usr/lib64/gcc/x86_64-unknown-linux-gnu/15.1.0/../../../../include/c++/15.1.0/bits/atomic_base.h
/usr/lib64/gcc/x86_64-unknown-linux-gnu/15.1.0/../../../../include/c++/15.1.0/bits/atomic_wait.h
/usr/include/unistd.h
```
Tested building with `-stdlib=libc++` and `-stdlib=libstdc++`.
2025-08-21  Add record and replay support for IComponentType2 (#8215)  jarcherNV
Add record and replay support for the IComponentType2 struct and its functions getTargetCompileResult and getEntryPointCompileResult.
2025-08-21  Fix reflection JSON writing userAttribs section twice for some cases. (#8210)  MindSpunk
`emitReflectionVarLayoutJSON` will output the `userAttribs` section twice as it gets output by `emitReflectionModifierInfoJSON` first before being output again by a direct call to `emitUserAttributes`. It seems the answer here is to just remove the extra explicit call to `emitUserAttributes` and rely on the call in `emitReflectionModifierInfoJSON`?
2025-08-21  Introduce CDataLayout & -fvk-use-c-layout (#8136)  Julius Ikkala
Closes #8112. ~~The issue asks for a "C layout", but in this PR I use the term "CPU layout" because this naming was pre-existing in the codebase as `kCPULayoutRulesImpl_`. The primary purpose of this layout is to match CPU-side struct definitions with the shader side. I'm open to better naming suggestions, though.~~ Edit: switched back to using `CDataLayout` & `-fvk-use-c-layout`, as the CPU target depends on the object layout rules of existing CPU layout rules, but they're incompatible with actual shaders. So a new `kCLayoutRulesImpl_` was needed anyway. --------- Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
2025-08-21  Implement SV_VulkanSamplePosition (#8236)  davli-nv
- Adds semantic SV_VulkanSamplePosition that emits corresponding gl_SamplePosition and SpvBuiltinSamplePosition
- Adds gl_SamplePosition property to glsl.meta.slang
- Adds SPIRV and GLSL tests for the semantic and property
- Plan is to later implement SV_SamplePosition that follows HLSL range of -0.5 to +0.5, and emits GetRenderTargetSamplePosition(SV_SampleIndex) which needs more complicated IR manipulation for HLSL and Metal
Fixes #7906
---------
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
2025-08-20  Fix nextafter() (#8195)  Julius Ikkala
Fixes #8185. The previous implementation is incorrect and basically only works in the `x = 0` case. `delta` was the smallest possible positive value representable as a float, but that's below the rounding error of addition with almost all reasonably sized floats. This fixed implementation is based on bit twiddling instead. I've checked the float case against the C++ `nextafterf` with both a -inf -> inf and inf -> -inf sweep, in addition to the test included in this PR.
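To illustrate what a bit-twiddling `nextafter` can look like, here is a sketch for 32-bit floats in plain C++. It is not the Slang implementation from the PR; `nextAfterSketch` is a hypothetical name, and the approach shown (stepping the IEEE 754 bit pattern by one) is a common technique that assumes `float` is an IEEE 754 binary32.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Sketch of nextafter for float via its bit pattern (assumes IEEE 754 binary32).
float nextAfterSketch(float x, float y)
{
    if (std::isnan(x) || std::isnan(y))
        return x + y; // propagate NaN
    if (x == y)
        return y; // nothing to step over (also covers +0 vs -0)

    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);

    if (x == 0.0f)
    {
        // Step off zero to the smallest subnormal, signed toward y.
        bits = (y > 0.0f) ? 0x00000001u : 0x80000001u;
    }
    else if ((x < y) == (x > 0.0f))
    {
        ++bits; // moving away from zero: magnitude grows by one ULP
    }
    else
    {
        --bits; // moving toward zero: magnitude shrinks by one ULP
    }

    float result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
}
```

The step works because, for a fixed sign, the ordering of finite floats matches the ordering of their bit patterns as unsigned integers. It also shows why adding a tiny constant fails: for a value like `1024.0f`, the smallest positive float is far below one ULP, so the addition rounds back to `1024.0f`.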
2025-08-20  Add Metal support for WaveGetActiveMask and WaveActiveCountBits (#8218)  Tianyu Li
## Summary - Add Metal platform support for `WaveGetActiveMask()` and `WaveActiveCountBits()` wave intrinsics - Update capability requirements to include Metal platform for subgroup ballot operations - Implement Metal-specific intrinsic assembly using `simd_ballot()` and `simd_vote` APIs ## Changes - **source/slang/hlsl.meta.slang**: - Add Metal target case for `WaveGetActiveMask()` using `simd_ballot(true)` - Update capability requirements from `cuda_glsl_hlsl_spirv` to `cuda_glsl_hlsl_metal_spirv` for wave ballot functions - **source/slang/slang-capabilities.capdef**: - Add `metal` to `subgroup_ballot_activemask` capability alias
2025-08-20  Reduce the dependency to thread library (#8216)  Jay Kwak
The Slang compiler doesn't use threads, and we should not declare a dependency on the thread library when we don't need it. The use of Thread is limited to tools such as slang-test.
2025-08-20  Updated support to enable batch3 (#8219)  Harsh Aggarwal (NVIDIA)
Enable CUDA support for batch 3 tests - Enhanced wave operations with exclusive support - Added proper identity values for min/max operations - Fixed intrinsic name mapping issues - Updated test configurations Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
2025-08-18  Emit descriptor handle correctly for ParameterBlock<DescriptorHandle> (#8206)  Gangzheng Tong
In Metal, if `ParameterBlock` contains `DescriptorHandle` directly, it would be emitted as a DescriptorHandle literal, which is not valid Metal code. This fix adds a case for `kIROp_DescriptorHandleType` and directs it to the parent's `emitType` function to handle it.
2025-08-18  Make LLDB IR data formatters more robust (#7927)  Sam Estep
This is a followup on #7828 to fix bugs that were causing CodeLLDB to give wrong values and hang (see vadimcn/codelldb#1302) because I didn't realize that these data formatters can be passed _either_ a value of a given type _or_ a pointer to a value of that type, and need to handle both cases. I also introduced loop bounds to prevent hangs in the case where these synthetic values are constructed for things like uninitialized variables. From looking at the preexisting data formatters from #4272 in `source/core/core_lldb.py`, it seems like they _technically_ have similar bugs to this, but since those types are simpler, it's unclear to me whether that can actually manifest in meaningful ways like these bugs in `source/slang/slang_lldb.py` were doing. Anyways, to test this, put a breakpoint here: https://github.com/shader-slang/slang/blob/6d399804a353154259cf4410940f144db8f9b5cf/source/slang/slang-emit-cpp.cpp#L1733 And use this `.vscode/launch.json` for CodeLLDB: ```json { "version": "0.2.0", "configurations": [ { "name": "LLDB", "preLaunchTask": "Debug build", "type": "lldb", "request": "launch", "initCommands": ["command source .lldbinit"], "program": "build/Debug/bin/slangc", "args": [ "tests/cpu-program/cpu-hello-world-test.slang", "-target", "executable", "-o", "hello" ] } ] } ``` Before this PR, the `inst` variable will display in the debug pane as `{kIROp_StringLit 0x00007fffffff5f68}`, which is the wrong pointer value. 
You can also check this by running `p inst` in the Debug Console, which will print this: ``` (Slang::IRInst *) 0x000055555fdac3b8 {kIROp_StringLit 0x00007fffffff5f68} ``` In contrast, running `p *inst` prints the correct pointer value: ``` (Slang::IRInst) {kIROp_StringLit 0x000055555fdac3b8} { [op] = kIROp_StringLit [UID] = 76 [type] = 0x000055555fdac348 {kIROp_StringType None} [decorations/children] = {} [parent] = 0x000055555fdac2d0 {kIROp_ModuleInst None} [uses] = 0x000055555fdadf18 {kIROp_StringLit 0x000055555fdac3b8} } ``` But as you can see, in that case the synthetic `[value]` child is completely missing. Then if you try to expand `inst` in the debug pane, CodeLLDB will hang (or at least it does when I try this). After this PR, the hex integer for the pointer is always consistent, and CodeLLDB does not hang in the debug pane when you expand `inst`, and shows the correct `[value]` child just like when running `v *inst`. As an aside: after this PR, the `[value]` child is still missing when specifically running `p *inst` in the Debug Console. 
It _is_ possible to fix this: ```diff diff --git a/source/slang/slang_lldb.py b/source/slang/slang_lldb.py index 23905d8c5..d2b3a4da9 100644 --- a/source/slang/slang_lldb.py +++ b/source/slang/slang_lldb.py @@ -93,13 +93,11 @@ class IRInst_synthetic(lldb.SBSyntheticValueProvider): value: list[tuple[str, lldb.SBValue]] = [] match op.value: case "kIROp_StringLit": - string_lit_t = target.FindFirstType("Slang::IRStringLit") - string_lit = self.valobj.Cast(string_lit_t) + string_lit = self.valobj.EvaluateExpression("(Slang::IRStringLit*)this") val = string_lit.GetChildMemberWithName("value") value = [("[value]", val.GetChildMemberWithName("stringVal"))] case "kIROp_IntLit": - int_lit_t = target.FindFirstType("Slang::IRIntLit") - int_lit = self.valobj.Cast(int_lit_t) + int_lit = self.valobj.EvaluateExpression("(Slang::IRIntLit*)this") val = int_lit.GetChildMemberWithName("value") value = [("[value]", val.GetChildMemberWithName("intVal"))] diff --git a/typings/lldb.pyi b/typings/lldb.pyi index 2672ba244..3a08e9141 100644 --- a/typings/lldb.pyi +++ b/typings/lldb.pyi @@ -496,7 +496,7 @@ class SBValue: def Persist(self): ... def GetDescription(self, description): ... def GetExpressionPath(self, *args): ... - def EvaluateExpression(self, *args): ... + def EvaluateExpression(self, expr: str) -> SBValue: ... def Watch(self, *args): ... def WatchPointee(self, resolve_location, read, write, error): ... def GetVTable(self): ... ``` However, that makes the debugger run _significantly_ slower, so I'm choosing not to do it here. --------- Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>