summaryrefslogtreecommitdiffstats
path: root/tests
Commit message (Collapse)AuthorAge
...
* Fix specialization constants getting incorrectly folded (#7299)Julius Ikkala2025-06-03
|
* Add check for the variable requirement (#6677)Gangzheng Tong2025-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add check for the variable requirement This change adds the capability check for the variables requirement. With this check, the shader ``` [require(cpp_cuda_glsl_hlsl_metal_spirv)] Buffer<float> InputTyped; [require(cpp_cuda_glsl_hlsl_metal_spirv)] RWBuffer<float> OutputTyped; ``` will issue error if targeting to WSGL e.g. `.\build\Debug\bin\slangc .\tests\wgsl_no_buffer.slang -o wgsl_no_buffer.txt -target wgsl -entry Main -stage compute` .\tests\wgsl_no_buffer.slang(2): error 36108: 'InputTyped' has dependencies that are not compatible on the required target 'wgsl'. Buffer<float> InputTyped; ^~~~~~~~~~ .\tests\wgsl_no_buffer.slang(4): error 36108: 'OutputTyped' has dependencies that are not compatible on the required target 'wgsl'. RWBuffer<float> OutputTyped; ^~~~~~~~~~~ Fixes #6304 * Add var capability tests * Do capability checks for global var only * Add inferredCapabilityRequirements to var capability check * Add requirement to the intrinsic types Buffer/RWBuffer * format code * Update capabliity test * use DefaultDataLayout as default data layout * Use visitMemberExpr to check the capabilities * Update the cap tests to match the error messages * update test to use the ScalarDataLayout for hlsl target * Update tests check condition to use error number only * Add default push_constant data layout type --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Ensure we do not have an `initExpr` on a `VarDecl` inside an `InterfaceDecl` ↵ArielG-NV2025-05-30
| | | | | | | | | | | | | | | | (#7283) * Ensure we do not have an initExpr on a var inside an InterfaceDecl Ensure we do not have an initExpr on a var inside an InterfaceDecl. If we do, send an error. Ensure the language server does not segfault with this error as per the issue. * format code * split tests --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Enable LSS hit object test (#7273)Mukund Keshava2025-05-30
| | | | | | | | | | | | | | | | | | | | | * Enable LSS hit object test Enabled LSS SER tests now that PR #7211, which added SER support to OptiX, has been merged. Ran: ./build/Debug/bin/slangc.exe tests/cuda/lss-test.slang -target ptx -Xnvrtc -I"C:/ProgramData/NVIDIA Corporation/OptiX SDK 9.0.0/include" and confirmed that the HitObject intrinsic is called. eg: call (%f15, %f16, %f17, %f18, %f19, %f20, %f21, %f22), _optix_hitobject_get_linear_curve_vertex_data, (); * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Fix SPIRV `OpSpecConstantOp` emit (#7158)Darren Wihandi2025-05-29
| | | | | | | | | | | | | * Fix SPIRV specialization constant with floating-point operations * Improve test * WIP * Restrict `OpSpecConstantOp` allowed operations based on SPIRV specifications * Fix typo on floating type check * Emit error on float to int spec cosnt int val casts
* Implement MapElement for CoopMat (#7159)Jay Kwak2025-05-29
| | | | | | | | | With this PR, MapElement works for the following signatures: - CoopMat<...>::MapElement(functype(...)); - CoopMat<...>::MapElement(capturing-lambda); - CoopMat<...>::MapElement(not-capturing-lambda); - Tuple<CoopMat<...>,...>::MapElement(functype(...)); - Tuple<CoopMat<...>,...>::MapElement(capturing-lambda); - Tuple<CoopMat<...>,...>::MapElement(not-capturing-lambda);
* Language version + tuple syntax. (#7230)Yong He2025-05-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Language version + tuple syntax. * Fix compile error. * regenerate documentation Table of Contents * Fix. * regenerate command line reference * Fix. * Fix. * Fix more test failures. * revert empty line change, * Retrigger CI * #version->#lang * Update source/core/slang-type-text-util.cpp Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> * Remove comments. * Fix parsing logic. * Fix parser. * Fix parser. * update test comment * Update options. * regenerate documentation Table of Contents * regenerate command line reference --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
* Change default descriptor binding to be VkMutable (#7224)ArielG-NV2025-05-28
| | | * change default descriptor binding to be VkMutable
* Add check for subscript operator return type (#7244)Mukund Keshava2025-05-27
| | | Fixes #6987
* Add LSS intrinsics (#7200)Mukund Keshava2025-05-27
| | | | | | | | | | | | | * WiP: LSS intrinsics: initial commit * format code * Fix CI failures * Address review comment --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Fix operator precedence in OptiX ray payload pointer casting which broke due ↵Harsh Aggarwal (NVIDIA)2025-05-26
| | | | | | | | | | | | | | | | | | | | | | | | | to (#6326) (#7194) * Fix operator precedence in OptiX ray payload pointer casting Added extra parentheses around the cast to ensure proper operator precedence when dereferencing the OptiX ray payload pointer. This fixes the issue where the compiler was treating the expression as (RayPayload_0 *)getOptiXRayPayloadPtr()->color_0 instead of ((RayPayload_0 *)getOptiXRayPayloadPtr())->color_0. Error: nvrtc 12.9: tests/cuda/optix-cluster.slang(17): error : expression must have pointer-to-class type but it has type "void *" nvrtc 12.9: note : (RayPayload_0 *)getOptiXRayPayloadPtr()->color_0 = color_1; nvrtc 12.9: note : ^ Tested using: ./build/Debug/bin/slangc -target ptx -Xnvrtc -I"/home/haaggarwal/NVIDIA-OptiX-SDK-9.0.0-linux64-x86_64/include" -DSLANG_CUDA_ENABLE_OPTIX -entry closestHitShaderA ./tests/cuda/optix-cluster.slang * Fix Check
* Implement shader execution reordering support for OptiX (#7211)Harsh Aggarwal (NVIDIA)2025-05-26
| | | | | | | | | | | | | | | | * Implement shader execution reordering support for OptiX Added OptiX backend support for Shader Execution Reordering (SER) features as outlined in issue #6647. This implementation: 1. Added CUDA target support for HitObject API 2. Implemented core SER functionality (TraceRay, MakeHit/Miss, Invoke) 3. Added OptiX-specific hit object handling functions 4. Added test case for OptiX SER functionality * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Fix #7232 (#7236)Julius Ikkala2025-05-25
|
* Add full support for SPV_NV_shader_subgroup_partitioned (#7103)Darren Wihandi2025-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Properly implement WaveMask* variants of WaveMultiPrefix* intrinsics * More partitioned intrinsics * More partitioned intrinsics and cleaned up non-prefixed WaveMask* implementations * Refactor HLSL WaveMultiPrefix* implementations * fix cap atoms * Clean up implementation * Add GLSL intrinsics and cleanup * Add tests * Fix affected capability test * Update and fix tests * Move expected.txt file * Refactor WaveMask* to call WaveMulti* * Refactor SPIRV/GLSL preamble code * Enable emit-via-glsl tests * remove wave_multi_prefix capability in favor of subgroup_partitioned * Update docs * Update cap atoms doc
* Implement throw & catch statements (#6916)Julius Ikkala2025-05-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Implement throw statement It already existed in the IR, so only parsing, checking and lowering was missing. * Initial catch implementation Likely very broken. * Error out when catch() isn't last in scope * Prevent accessing variables from scope preceding catch As those may actually not be available at that point. * Add IError and use it in Result type lowering * Add diagnostic tests * Allow caught throws in non-throw functions * Fix catch propagating between functions & SPIR-V merge issue * Add test for non-trivial error types * Fix MSVC build * Fix invalid value type from Result lowering * Also lower error handling in templates * Lower result types only after specialization * Attempt to disambiguate error enums by witness table * Revert matching by witness, types should be distinct too * Don't assert valueField when getting Result's error value It may not exist if the function returns void, but getting the error value is still legitimate. * Update tests for new error numbers & get rid of expected.txt * Change catch lowering to resemble breaking a loop ... To make SPIR-V happy. * Fix dead catch blocks and invalid cached dominator tree * More SPIR-V adjustment * Lower catch as two nested loops * Add defer interaction test and revert broken defer changes * Fix enum type when throwing literals * Cleanup and bikeshedding * Document error handling mechanism * Fix table of contents * Use boolean tag in Result<T, E> * Use anyValue storage for Result<T,E> * Remove IError * Fix formatting * Eradicate success values from docs and tests * Use parseModernParamDecl for catch parameter * Implement do-catch syntax * Implement catch-all * Fix formatting * Fix marshalling native calls that throw --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Add CoopVec load/store pointer overloads (#6822)Darren Wihandi2025-05-23
| | | | | | | | | | | * Add pointer/T* variants for coop vec load/store * fix stride decoration and improved test * fix compile warnings * Improve test * Use `coopVecLoad` function in test
* Add default constructor for Ptr type (#7214)Darren Wihandi2025-05-23
| | | | | * Add default constructor for Ptr type * Make pointers c-style type, remove __init() constructor
* Implement default initializer list for C-Style type member (#7079)kaizhangNV2025-05-22
| | | | | | | | | | | | | | | | | | | | | | | * Implement default initializer list for C-Style type member Close #6189. Previsouly, for the C-Style member in a struct, if it doesn't have any initialize expression, when we synthesize the ctor, we will not associate the default value for the parameter corresponding to that member. This bring some trouble that existing slang users has to add '= {}' to every struct fields in order to make all the parameters in the synthesized ctor having a default value, so people can still use `Struct a = {}` to create a struct. To make this use case convenience, we will automatically associated a '= {}' as the default value for this case. This PR also add support for empty initializing link-time sized vector/matrix by "= {}". In addition, this PR also fix a bug in auto diff where we should not report error when proccessing transpose on an empty struct.
* Make sizeof(T) & alignof(T) of generic types work as compile-time constants ↵Julius Ikkala2025-05-22
| | | | | | | | | | | (#7213) * Make sizeof(generic) work as compile-time constant * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Initial `dyn` keyword support & `-lang 2026` compiler option (#7172)ArielG-NV2025-05-22
| | | | | | | | | | | | | | | | | | | | fixes: [#7143](https://github.com/shader-slang/slang/issues/7143) fixes: [#7146](https://github.com/shader-slang/slang/issues/7146) Goal of PR: * This PR is part of the larger #7115 refactor to how dynamic dispatch works. * The first step is to add the `-std <std-revision>` flag. * The second step is to provide basic `dyn` keyword support in AST. This does not include `varDecl` support since most of these interactions require `some` keyword support. Future PR(s) goal: * Support `some` keyword in AST. With this we will also implement all varDecl interactions between `dyn` and `some`. * Add IR support for `some` and `dyn`. Breakdown of PR: * most of the logic is in `validateDyn.*`. This was done so that in the future when we implement more features we will have an easy time removing/adding restrictions to `dyn` interfaces. Breaking changes: * As per spec (https://github.com/shader-slang/spec/pull/14/files), any type conforming to a `dyn` interface errors if member list contains one of the following: opaque type, non copyable type, or unsized type. * Due to the breaking change, the test `tests\compute\dynamic-dispatch-bindless-texture.slang` is incorrect. This has been fixed.
* Add inverse hyperbolic derivatives (#7173)Darren Wihandi2025-05-21
| | | | | * Add inverse hyperbolic derivatives * Add test
* Map `SV_VertexID` to `gl_VertexIndex-gl_BaseVertex`, add `SV_Vulkan*ID` ↵Darren Wihandi2025-05-19
| | | | | | | | | | | | | | | | | semantics (#7150) * Map SV_VertexID to `gl_VertexIndex - gl_BaseVertex`, provide SV_Vulkan* SV semantics * Fix docs * Regenerate toc * Fix affected pointer-2 test * Add tests --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Support Vulkan memory model (#7057)Jay Kwak2025-05-16
| | | | | | | | | | | | | | | The user can explicitly use Vulkan memory model, or it will be automatically used when cooperative-matrix is used. When vulkan memory model is used, two keywords, "Coherent" and "Volatile", are not allowed. There are many differences regarding atomic and texture but this PR has changes limited to support `globallycoherent` keyword. When variables with `globallycoherent` is used with `OpLoad`, it will use additional options, `MakePointerAvailable|NonPrivatePointer`, that will provide the same effect. For `OpStore`, it will use `MakePointerVisible|NonPrivatePointer`.
* Fix: Preserve inout param modifications with OptiX IgnoreHit() (#6956)Harsh Aggarwal (NVIDIA)2025-05-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add noreturn attribute to IgnoreHit * Revert "Add noreturn attribute to IgnoreHit" This reverts commit 3cf2354dada71b9a8713b08f3a2e261de4aabfa4. * Fix: Preserve inout param modifications with OptiX IgnoreHit() Issue #6326 identified that in OptiX, when using IgnoreHit() (which maps to the "noreturn" optixIgnoreIntersection() intrinsic), any modifications made to 'inout' parameters within the shader would be lost. This was due to IgnoreHit() preventing the execution of the copy-back operation from the temporary variable (used to implement 'inout' semantics) to the original parameter. This commit introduces a new IR pass, 'undoParameterCopy', specifically for CUDA/OptiX targets to address this. The pass operates as follows: 1. Identifies temporary IR variables created for 'inout' parameters, which are now decorated with 'TempCallArgVarDecoration'. 2. Maps these temporary variables back to their original parameter storage (e.g., the OptiX payload pointer). 3. Replaces all uses of the temporary variable directly with the original parameter pointer. 4. Removes the temporary variable declaration and its initializing store (which copied from the original parameter to the temporary). By transforming the IR to operate directly on the original parameter storage before any potential call to IgnoreHit(), this fix ensures that all modifications are preserved, correctly resolving issue #6326. The pass is integrated into the compilation flow for relevant targets. * Refactor(IR): Optimize GetOptiXRayPayloadPtr for better DCE/CSE To allow for more effective dead code elimination (DCE) and common subexpression elimination (CSE) of `getOptiXRayPayloadPtr` instructions, this commit: - Marks `kIROp_GetOptiXRayPayloadPtr` as side-effect-free within `IRInst::mightHaveSideEffects` (in `slang-ir.cpp`). - Flags `GetOptiXRayPayloadPtr` as `HOISTABLE` in its definition within `slang-ir-inst-defs.h`. This addresses scenarios where multiple, potentially redundant, calls to `getOptiXRayPayloadPtr` might appear in the IR, allowing optimizers to produce cleaner and potentially more efficient code for OptiX targets. This change supports efforts to refine IR handling for ray-tracing shader stages. * Remove debugging code * Refactor UndoParameterCopyVisitor for improved performance - Optimized IR traversal by combining multiple passes into a single scan - Removed unnecessary dictionary, immediately replace uses when a temp var is found - Reduced duplicate code paths by checking for both temp vars and redundant stores in one loop - Better handling of the 'changed' flag to ensure DCE only runs when needed - Results in fewer instruction traversals and improved efficiency for large functions * Add Test
* Enable Windows full debug testsuite in CI (#7085)Gangzheng Tong2025-05-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Unify Debug Layer Control Logic and Add Disable Option for Debug Builds This PR refactors and unifies the debug layer control logic in slang-test. A new `-disable-debug-layers` option is introduced, allowing debug builds to skip enabling the validation (debug) layer. This is currently needed to ensure stability in the debug test suite. Previously, different toggles such as ENABLE_VALIDATION_LAYER, ENABLE_DEBUG_LAYER, and debugLayerEnabled were used inconsistently across different components of slang-test. This PR standardizes the logic by using a single variable, debugLayerEnabled, to control the enabling/disabling of the debug layer internally. Notes: By default, the debug/validation layer is enabled in debug builds and is not supported in release builds of slang-test. Fixes: #7132 * Disable spirv-opt for the DebugFunctionDefinition issue * Run debug build only in GCP machines * Fix VUID-vkCmdPipelineBarrier-pBufferMemoryBarriers-02818 dstAcessMask can't include VK_ACCESS_TRANSFER_READ_BIT when stage mask has VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_KHR * Set failed retry limit to 32 --------- Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Address structured buffer `GetDimensions` issues for WGSL, GLSL and SPIRV ↵Darren Wihandi2025-05-16
| | | | | | | | | | | | | | | | | | | | | | | (#7010) * Fix structured buffer get dimensions * Further fixes and added tests * Remove unnecessary include * Fix test issues * attempt to fix wgpu crash * test remove half usage in test * attempt to fix WGPU test issue * Another attempt to fix WGSL test - make test similar to the existing GetDimensions test --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Fix correct bindings for bindless resource model [SPIRV and GLSL] (#7131)ArielG-NV2025-05-16
| | | | | | | | | | | | | | | | | | | | | | | | | | Fix correct bindings for bindless resource model [spirv and glsl] fixes: #6952 Problem: * Currently all bindless objects are placed in the same set (fine) and same binding (incorrect behavior for vulkan). This is incorrect since as per [spec](https://registry.khronos.org/vulkan/specs/latest/man/html/VkDescriptorType.html), only 1 resource type may be written to each index inside a set (these rules are loosened with VK_EXT_mutable_descriptor_type) * This means currently generated bindings do not work in practice if we (for example) use `Sampler2D.Handle` and `Texture1D.Handle` in a shader since we would place 2 incompatible objects in the same binding-index and set. Solution: * `__getDynamicResourceHeap` was modified to allow bindings to chosen dynamically for a descriptor * use `IOpaqueDescriptor` to check compile-time information of resource types so that we can identify different resources * Using this information of `IOpaqueDescriptor`, we modify `defaultGetDescriptorFromHandle` to provide a binding model (1 resource per binding-index) which produces legal spirv/glsl. * To support `VK_EXT_mutable_descriptor_type` the function `defaultGetDescriptorFromHandle` has a set of options (`BindlessDescriptorOptions`) for a user to pick-from to support their binding model. Capabilities are not used here for flexibility purposes (specifically old shaders mixed with modern vulkan extensions). Other changes: * Added `TexelBuffer` DescriptorKind to aid in generating correct bindings * format code * Add to docs bindless changes, make AccelerationStructure use its handle directly, adjust tests accordingly --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Enforce rule that `export`/`extern` (non cpp) must be `const` (#7113)ArielG-NV2025-05-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | * Enforce rule that `export`/`extern` (non cpp) must be `const` fixes: #5570 Problem: 1. we allow non-const-link-time-var to be linked to a const-link-time-var. 2. problem is that: module use site has const var, so, we emit OpStore %Ptr %Const in IR, this is expected, this is good. We fail because we in reality have a OpStore %Ptr %Var (fails since we need a OpLoad in-between) in IR since the module with our link-time-variable-value is a regular variable. 3. We loose the float_litteral talked about inside the github issue since, we technically don't use our variable "VAL" (we never OpLoad from it), so spirv-opt removes the float_litteral, this is a byproduct of the actual issue. Solution: * `export`/`extern` variables must always be `const`. This excludes `__extern_cpp` since `cpp` does not exhibit this issue and works differently. * format code * changel logic and tests to only ensure `static const` with `export`/`extern` * changing the rules: only reqirement is that if we have const we must have static * remove a spirrious change made * fix merge --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com>
* Fix HLSL ByteAddressBuffer Load* parameter integer type (#7117)Darren Wihandi2025-05-16
| | | | | | | | | * Fix HLSL ByteAddressBuffer Load* parameter integer type * Fix tests * Fix load with alignment function signature clash * Fix LoadAligned tests
* Allow lambda exprs without captures to coerce to `functype`. (#7129)Yong He2025-05-16
|
* Fix RWStructuredBuffer emission (#7139)Mukund Keshava2025-05-16
| | | Fixes #7127
* Fix broken -emit-spirv-via-glsl test option (#7091)sricker-nvidia2025-05-16
| | | | | | | | | | | | | | | | | | | | | | | | Fixes issue #6898 The -emit-spirv-via-glsl slang-test option has been broken for some amount of time. Tests that were using it were operating as if using -emit-spirv-directly, leading to many duplicated tests. After fixing the test option, there were an number of errors that appeared as a result. This change fixes the broken test option and the resulting test errors. Some of the test errors revealed some legitimate issues, such as: -The GLSL bitCount instrinsic only supports 32-bit integers and requires emulation for other bit widths. -Emitting GLSL 8-bit and 16-bit glsl integer types did not emit the proper extension requirements -Emitting GLSL and casting for 16-bit integers was missing a closing parenthesis. -Missing profile for GL_EXT_shader_explicit_arithmetic_types -Missing toType cases for UInt8/Int8 for the kIROp_BitCast case in tryEmitInstExprImpl.
* Add checking for hlsl register semantic. (#7118)Yong He2025-05-15
| | | | | | | | | | | * Add checking for hlsl register semantic. * Fix. * Fix test. * Fix switch error. * Fix tests.
* Implement C++ style default member initializer (#7087)kaizhangNV2025-05-15
| | | | | close #7069. We just insert the assignment expression to the beginning of the user-defined ctor, where the assignee is the member with init expression.
* Implement spec const for generic parameter (#7121)kaizhangNV2025-05-15
| | | | | | | | | | | | | | | | | | Close #6840. This PR add supports to use specialize constant in generic parameter, and that parameter can also be used as array size, e.g. following code should work: ``` struct MyStruct<let N: int> { float buffer[N]; } MyStruct<SpecConstVar> s; ``` - Loose the restriction from Link-Time to SpecializationConstant when extract generic argument - Tweak the logic of how we decide whether a inst is hoistable. Besides checking existing hoistable flag of each IRInst, when we detect a IRInst's type is SpecConstRateType, we will treat that inst hoistable. Because IRInst in global scope can be deduplicated, and every SpecConstRateType inst should be in the global scope or IRGeneric scope (which will be at global scope after specialization). - Remove the SpecConstIntVal to IRInst map in IR lowering logic, because we already have way to deduplicate the global scope IR.
* Rename 'main' on some backends (#7105)Mukund Keshava2025-05-15
| | | | | | | | | | | | | | | | | | | | | | | * Rename 'main' on some backednds Fixes #5542 Some backends like cuda, metal, cpu do not allow 'main' as the entry point. This commit adds a new warning that is emitted when a program uses main as the entry point. In addition to emitting the warning, it internally renames the entry point to "main_". It also adds a test to check these for the three backends. * format code * move test to diagnostics * rename test to diagnostic * generate unique name --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Support tensor addressing (#7060)Jay Kwak2025-05-15
| | | | | | | | | | | This commit implements two new types and related Load/Store functions in CoopMat. tensor_addrressing.TensorLayout tensor_addressing.TensorView CoopMat.Load(..., TensorLayout) CoopMat.Load(..., TensorLayout, TensorView) CoopMat.Store(..., TensorLayout) CoopMat.Store(..., TensorLayout, TensorView) CoopMat.Load(..., TensorLayout, TensorView)
* Re-enable spirv-validation, because the issue is no longer repro (#7082)Jay Kwak2025-05-15
|
* Add new coopmat2 functions: Reduce and Transpose (#7027)Jay Kwak2025-05-14
| | | | | | | | | | | | This commit adds three new functions for CoopMat as described in the proposal document, Cooperative matrix 2 proposal spec#12 The new functions are: CoopMat<T,S,M,N,R>::Transpose CoopMat<T,S,M,N,R>::ReduceRow CoopMat<T,S,M,N,R>::ReduceColumn CoopMat<T,S,M,N,R>::ReduceRowAndColumn CoopMat<T,S,M,N,R>::Reduce2x2
* Support the new CoopVec builtins (#7108)Jay Kwak2025-05-14
| | | | | | | | | **NOTE: This is a breaking change for users who were using POC variant of DXC. In order to keep the compatibility, the users will have to use -capability hlsl_coopvec_poc to their command line. This PR adds a new capability "hlsl_coopvec_poc". When it is used, the HLSL for CoopVec will be emitted for the POC variant of DXC. When it is not used, the HLSL for CoopVec will be emitted for the DXC that officially supports the cooperative vector.
* Do not print errors in _coerce when "JustTrying". (#7064)Jay Kwak2025-05-15
| | | | | | | | | | | | | | | | | | | | * Do not print errors in _coerce when "JustTrying". While figuring out which generic-overload works best, `_coerce()` is printing errors and Slang compilation terminates prematurely. When `TryCheckGenericOverloadCandidateTypes()` is calling `_coerce()` in "JustTrying" mode, the error messages should be snoozed. The following logic shows the intention of how to silence the error messages, but the chain of `sink` was broken in the middle and `_coerce()` was using `getSink()` from the SemanticVisitor. val = ExtractGenericArgInteger( arg, getType(m_astBuilder, valParamRef), context.mode == OverloadResolveContext::Mode::JustTrying ? nullptr : getSink()); * Use tempSink when available.
* Remove readonly keyword from buffer pointer definitions (#7068)aidanfnv2025-05-14
| | | | | | For https://github.com/shader-slang/slang/issues/6880 This change removes the readonly keyword from buffer pointer definitions from the GLSL source emitter, to allow for mutable buffer pointers. Support for readonly will be readded when we add const pointer support later.
* Infer type while constant folding causes failure (#7090)ArielG-NV2025-05-14
| | | | | | | Problem: * Infering type with `let` while using constant-foldable object's means we run in a circular loop of `ensureDecl`. Changes: * If we constant-fold we effectively are ready to check definition. If we are not a constant to fold, we run `setCheckState` after anyways.
* support specialization constant sized array (#6871)kaizhangNV2025-05-14
| | | | | | | | | | | | | | | | | | | | | Close #6859 Goal of this PR We want to support an array whose size can be specialization constant for shared/global variable e.g. layout (constant_id = 0) const uint BLOCK_SIZE = 64; shared float buf_a[(BLOCK_SIZE + 5) * 4]; Overview of the solution: During IndexExpr check, we will loose the restriction to allow SpecConst passing, but the size parameter will not be a constant value because it cannot be folded into a constant, so we will make it follow the same logic as generic parameter value, and the size will be represented by FuncCallIntVal/PolynomialIntVal/DeclRefIntVal. During IR lowering, we will detect whether there is spec constant in the IntVal, and wrap the IRInst with a SpecConstRateType, and propagate the type though the lowering logic, such that the IntVal representing the array size will have SpecConstRateType. During spirv emit stage, if we detect that a IRInst has SpecConstRateType, we will emit it as SpecConstantOp. We have to implement new logic to emit OpSpecConstantOp, the existing emit logic doesn't support emitting OpSpecConstantOp, especially this op can embed arithmetic operation at global scope, where we can only emit arithmetic instruct at local. But there are only few instructs we need to support. Overview of the solution: This PR doesn't support generic, and we will create a separate PR to extend that, tracked in #6840.
* Error out on invalid vector sizes (#7076)Darren Wihandi2025-05-14
| | | | | | | | | * Error out on invalid vector sizes * Remove unnecessary include * Fix incorrect assert * Add test
* Support Array Sizes using Generic arguments to be initialized via {} (#6720)Sruthik P2025-05-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add support for Array Sizes using Generic arguments to be initialized via {} Fixes one subissue of #6138 This change adds support for initializing Arrays with Generic size arguments via {} and adds a test to verify it. The change checks for an array whose size parameter is a GenericParamIntVal and since the size of such an array will be known at link time, is not considered as a case of the size not being known statically. * Add support for Array Sizes using Generic arguments to be initialized via {} Fixes one subissue of #6138. Fixes the issue #6958. This change adds support for initializing Arrays with Generic size arguments via {} and adds a test to verify it. Support is added by means of adding a new AST Expr node that lowers down to the IR MakeArrayFromElement and the emission of a diagnostic is replaced with the creation of this new AST Expr node. * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
* Add half-precision matrix type aliases in GLSL (#7066)James Helferty (NVIDIA)2025-05-13
| | | | | | | | | | | Fixes #6708 This commit adds type aliases for half-precision matrices, including f16mat3x2, f16mat3x3, f16mat3x4, f16mat4x2, f16mat4x3, and f16mat4x4. Convenience aliases for square matrices (f16mat2, f16mat3, f16mat4) are also added. This commit introduces a new test file that validates the usage of half-precision types in a compute shader context.
* Make CUDA version capabilities reach NVRTC (#7074)Theresa Foley2025-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes #7049 The root cause of the problem in #7049 is simply that newer NVRTC versions produce a warning when asked to generate code for older CUDA SM versions, and the default that Slang was requesting compilation for was old enough to trigger that warning, and thus trip up the test case (which only looks at the first diagnostic produced by the downstream compiler). Superficially, the fix was easy: change the test case in question (`tests/diagnostics/local-line.slang`) to request `-capability cuda_sm_8_0`, the minimum version supported by current NVRTC. Unfortunately, the simple fix required some other fixes in order to actually work. The capability system includes capability names of the form `cuda_sm_*_*`, but specifying such a capability had *no* impact on the CUDA SM version passed in when invoking NVRTC. Instead, only the CUDA SM versions requested in the implementation of intrinsics in the core module were affecting the version number passed down. This change adds logic to `slang-compiler.cpp` to take explicitly requested capabilities into account when inferring the CUDA SM version to be passed downstream. A more complete fix would also add similar logic for all the other targets. Unfortunately... yet again... that fix wasn't enough to make things work as expect. Now I had the problem that requesting `-capability cuda_sm_8_0` was actually causing the NVRTC invocation to request CUDA SM version **9.0**! The underlying problem *there* was that the `slang-capabilities.capdef` file has defined certain capability names in a way that implies atomic capabilities much higher than one would expect. E.g., the `cuda_sm_8_0` alias was including HLSL `sm_5_0`, but then `sm_5_0` in turn included `_cuda_sm_9_0`. The fix, for now, is to change the definitions in `slang-capabilities.capdef` to not have the counter-intuitive definitions for `cuda_sm_*_*`. With this set of fixes, the test failure in the original bug report no longer occurs. The work that went into this change suggests several larger-scope fixes that would be good to pursue: * Ideally the capability definitions would have some sort of validation checking to make sure that counter-intuitive results like `cuda_sm_8_0` requesting CUDA SM 9.0 do not occur. * The translation of capabilities over to version numbers for a downstream compiler should be expanded to cover other targets, and not just CUDA. It might be better/simpler to just pass the capabilities themselves to the downstream compiler, since it is possible that a downstream compiler could have more fine-grained enable/disable options than a simple version number. * The entire approach to computing version numbers required for downstream compilation should be cleaned up so that we don't have this duplication between the capabilities that represent those versions and separate syntactic constructs that are used to "request" those versions as part of code generation. * We are very much at the point where we should consider dropping the current behavior where a profile name or capability like `sm_5_0`, that is specific to a single target or a subset of targets, also implies a set of comparable capabilities for other targets.
* Ensure to emit 32-bit type for bitfield operations for SPIRV (#7046)kaizhangNV2025-05-12
| | | Close #7014
* cluster acceleration structure optix 6431 (#7028)Harsh Aggarwal (NVIDIA)2025-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add cluster geometry intrinsics for ray tracing - Added GetClusterID() method to HitObject class - Added CandidateClusterID() and CommittedClusterID() methods to RayQuery class - Added SPV_NV_cluster_acceleration_structure extension support - Added GL_NV_cluster_acceleration_structure extension support - Added test files for RayQuery and HitObject cluster methods Fixes #6431 * OpRayQueryGetIntersectionClusterIdNV - unrecognized spirv Disabling spirv backend for SPV_NV_cluster_acceleration_structure hlsl.meta.slang(18674): error 29100: unrecognized spirv opcode: OpRayQueryGetIntersectionClusterIdNV result:$$int = OpRayQueryGetIntersectionClusterIdNV &this $iCandidateOrCommitted; ^~~~~~ hlsl.meta.slang(18670): error 30019: expected an expression of type 'int', got 'void' return spirv_asm ^~~~~~~~~ ninja: build stopped: subcommand failed. * 6431 - Fix spirv opcode * Remove tests * Add relevant tests * Review - Simplify tests