slang.git - Making it easier to work with shaders

	Commit message	Author	Age
*	Add SPV_NV_bindless_texture support (#8534)	Lujin Wang	2025-09-26
	Treat DescriptorHandle as uint64_t instead of uint2. Implement target-specific SPIR-V emission with bindless texture support. For OpImageTexelPointer, Image must have a type of OpTypePointer with Type OpTypeImage; fix the issue by using [constref] in __subscript. Add test coverage for various texture/sampler handle types.

	---------

	Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Enable CUDA support for additional HLSL intrinsic tests (#8293)	Harsh Aggarwal (NVIDIA)	2025-09-04
	Enable CUDA support for additional HLSL intrinsic tests by implementing missing functionality and fixing compiler bugs affecting CUDA targets.

	- Fix critical bug in InterlockedCompareStore64 where division used /4 instead of /8 for 64-bit types, causing incorrect memory addressing for all signed int64_t atomics
	- Add signed int64_t atomic wrappers (atomicExch, atomicCAS) to CUDA prelude that properly cast to/from unsigned types as required by CUDA's atomic API
	- Enable tests: atomic-intrinsics-64bit.slang
	- Implement CUDA support for QuadAny and QuadAll operations using warp shuffle primitives (__shfl_sync with quad-level lane masking)
	- Add CUDA to quad_control capability definition in slang-capabilities.capdef
	- Add _slang_quadAny/_slang_quadAll helper functions to CUDA prelude
	- Enable tests: quad-control-comp-functionality.slang, subgroup-quad.slang

	---------

	Co-authored-by: szihs <675653+szihs@users.noreply.github.com>
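The `/4` versus `/8` bug described above comes down to converting a byte offset into an element index with the wrong element size. The following is a minimal Python sketch of that arithmetic only; `element_index` is a hypothetical helper for illustration and is not taken from the Slang CUDA prelude.

```python
def element_index(byte_offset: int, element_size: int) -> int:
    """Convert a byte offset into an index into an array of fixed-size elements.

    Hypothetical helper: it only illustrates the arithmetic, not Slang's code.
    """
    if byte_offset % element_size != 0:
        raise ValueError("byte offset is not aligned to the element size")
    return byte_offset // element_size


# A 64-bit value is 8 bytes wide, so byte offset 16 addresses element 2.
correct_index = element_index(16, 8)

# Dividing by 4 (the size of a 32-bit value) yields element 4 instead,
# which would address the wrong 64-bit slot.
wrong_index = element_index(16, 4)
```

With 8-byte elements every index computed with `/4` is twice as large as it should be, which is consistent with the commit's note that every 64-bit atomic going through that path was affected.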
*	[CBP] Pointer frontend changes + groupshared pointer support (#7848)	ArielG-NV	2025-08-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Resolves #7628 Resolves: #8197 Primary Goals: 1. Add `Access` to pointer 2. AddressSpace::GroupShared support for pointers (SPIR-V) 3. Add `__getAddress()` to replace `&` * `&` is not updated to `require(cpu)` since slangpy uses `&`. This means we must: (1) merge PR; (2) replace `&` with `__getAddress()`; (3) add `require(cpu)` to `&` Changes: * Added to `Ptr` the `Access` generic argument & logic (for `Access::Read`). * Moved the generic argument `AddressSpace` from `Ptr` to the end of the type. * Added pointer casting support between any `Ptr` as long as the `AddressSpace` is the same * Disallow globallycoherent T* and coherent T* * Disallow const T, T const, and const T* * Fixed .natvis display of `ConstantValue` `ValOperandNode` * Support generic resolution of type-casted integers * Added `VariablePointer` emitting for spirv + other minor logic needed for groupshared pointers Breaking Changes: * Anyone using the `AddressSpace` of `Ptr` will now have to account for the `Access` argument * we disallow various syntax paired with `Ptr` and `T*` --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Add Metal support for WaveGetActiveMask and WaveActiveCountBits (#8218)	Tianyu Li	2025-08-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	## Summary - Add Metal platform support for `WaveGetActiveMask()` and `WaveActiveCountBits()` wave intrinsics - Update capability requirements to include Metal platform for subgroup ballot operations - Implement Metal-specific intrinsic assembly using `simd_ballot()` and `simd_vote` APIs ## Changes - source/slang/hlsl.meta.slang: - Add Metal target case for `WaveGetActiveMask()` using `simd_ballot(true)` - Update capability requirements from `cuda_glsl_hlsl_spirv` to `cuda_glsl_hlsl_metal_spirv` for wave ballot functions - source/slang/slang-capabilities.capdef: - Add `metal` to `subgroup_ballot_activemask` capability alias
*	Handle SV_Barycentrics on metal (#8163)	James Helferty (NVIDIA)	2025-08-13
\| \| \|	Fixes #6785
*	Error if super-type capabilities are a super-set of sub-type (#7452)	ArielG-NV	2025-08-08
	Fixes: #7410

	Changes:
	1. super-type capabilities must be a super-set of sub-type capabilities (and support the same shader stages/targets)
	* InheritanceDecl visits super-type to inherit its capabilities; validate InheritanceDecl capabilities against sub-type
	* visit all container decls with a default case
	* clean up functionDeclBase visitor
	* Simplify `diagnoseUndeclaredCapability` by moving logic into capability checking (more correct)
	3. added changed behavior to documentation
	4. fixed some incorrect capabilities
	5. we do not diagnose capability errors on interface requirement-to-implementation if both lack explicit capability requirements. This change is to work around a slangpy regression (test case for the failing situation is in `tests\language-feature\capability\capability-interface-extension-1.slang`). Note: maybe for slang-2026 we don't do this?
	6. requirement & implementation must support the same shader stage/target. This was changed because otherwise we can have cases where `X` inherits from `Y`, but `Y` is only expected to be used in `glsl` whilst `X` is expected to be used in `hlsl | glsl`
	7. removed `tests/language-feature/capability/capabilitySimplification3.slang` because it tests nothing special (redundant)

	Note: not using rebase due to separate branches depending on this PR

	---------

	Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	add task shader alias (#7372)	Sirox	2025-07-02
\| \| \| \| \| \| \| \| \| \| \|	* alias amplification shader as task shader and add mesh shader profile * add task shader stage alias to capabilities * regenerate command line reference --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Support the GLSL/SPIR-V Built-in variable `DeviceIndex` (#7552)	ArielG-NV	2025-06-29
\| \| \| \| \| \| \| \| \| \| \|	* Support DeviceIndex * format code * regenerate command line reference --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Add new capdef for lss intrinsics (#7427)	Mukund Keshava	2025-06-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add new capdef for lss intrinsics Fixes #7426 Raygen shaders need to be supported for only hitobject APIs. So we need a special capability for that, instead of a common one. * regenerate command line reference --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Add optix support for coopvec (#7286)	Mukund Keshava	2025-06-10
\| \| \| \| \| \| \| \| \| \| \| \| \|	* WiP: Add coopvec support for Optix * format code * fix minor issues * Fix review comments --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Fix some mis-definitions of capabilities (#7356)	kaizhangNV	2025-06-05
	Close #7315. There are a couple of mis-definitions in the capabilities:

	- sm_50 shouldn't require cuda compute_9_0; drop it to compute_6_0
	- unpack should only require compute_6_0
	- subgroup_ballot will require sm_60

	Co-authored-by: Yong He <yonghe@outlook.com>
*	Language version + tuple syntax. (#7230)	Yong He	2025-05-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Language version + tuple syntax. * Fix compile error. * regenerate documentation Table of Contents * Fix. * regenerate command line reference * Fix. * Fix. * Fix more test failures. * revert empty line change, * Retrigger CI * #version->#lang * Update source/core/slang-type-text-util.cpp Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> * Remove comments. * Fix parsing logic. * Fix parser. * Fix parser. * update test comment * Update options. * regenerate documentation Table of Contents * regenerate command line reference --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
*	Add LSS intrinsics (#7200)	Mukund Keshava	2025-05-27
\| \| \| \| \| \| \| \| \| \| \| \| \|	* WiP: LSS intrinsics: initial commit * format code * Fix CI failures * Address review comment --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Add full support for SPV_NV_shader_subgroup_partitioned (#7103)	Darren Wihandi	2025-05-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Properly implement WaveMask* variants of WaveMultiPrefix* intrinsics * More partitioned intrinsics * More partitioned intrinsics and cleaned up non-prefixed WaveMask* implementations * Refactor HLSL WaveMultiPrefix* implementations * fix cap atoms * Clean up implementation * Add GLSL intrinsics and cleanup * Add tests * Fix affected capability test * Update and fix tests * Move expected.txt file * Refactor WaveMask* to call WaveMulti* * Refactor SPIRV/GLSL preamble code * Enable emit-via-glsl tests * remove wave_multi_prefix capability in favor of subgroup_partitioned * Update docs * Update cap atoms doc
*	Support Vulkan memory model (#7057)	Jay Kwak	2025-05-16
	The user can explicitly select the Vulkan memory model, or it is enabled automatically when cooperative-matrix is used. When the Vulkan memory model is in use, the two decorations "Coherent" and "Volatile" are not allowed. There are many differences regarding atomics and textures, but this PR limits its changes to supporting the `globallycoherent` keyword. When a variable marked `globallycoherent` is used with `OpLoad`, the load carries the additional memory operands `MakePointerVisible|NonPrivatePointer`, which provide the same effect. For `OpStore`, it uses `MakePointerAvailable|NonPrivatePointer`.
*	Fix broken -emit-spirv-via-glsl test option (#7091)	sricker-nvidia	2025-05-16
	Fixes issue #6898

	The -emit-spirv-via-glsl slang-test option has been broken for some amount of time. Tests that were using it were operating as if using -emit-spirv-directly, leading to many duplicated tests. After fixing the test option, a number of errors appeared as a result. This change fixes the broken test option and the resulting test errors.

	Some of the test errors revealed legitimate issues, such as:
	- The GLSL bitCount intrinsic only supports 32-bit integers and requires emulation for other bit widths.
	- Emitting GLSL 8-bit and 16-bit integer types did not emit the proper extension requirements.
	- Emitting GLSL and casting for 16-bit integers was missing a closing parenthesis.
	- Missing profile for GL_EXT_shader_explicit_arithmetic_types.
	- Missing toType cases for UInt8/Int8 for the kIROp_BitCast case in tryEmitInstExprImpl.
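The bitCount issue above can be illustrated with a short sketch. The Python below is not the emulation Slang emits; it only shows one common way to build wider and narrower population counts out of a 32-bit-only primitive. The names `bit_count32`, `bit_count64`, and `bit_count_narrow` are hypothetical.

```python
def bit_count32(x: int) -> int:
    """Stand-in for a population count that only accepts 32-bit unsigned input."""
    if not 0 <= x < 2**32:
        raise ValueError("bit_count32 only accepts 32-bit unsigned values")
    return bin(x).count("1")


def bit_count64(x: int) -> int:
    """Emulate a 64-bit population count with two 32-bit counts."""
    if not 0 <= x < 2**64:
        raise ValueError("bit_count64 only accepts 64-bit unsigned values")
    low = x & 0xFFFFFFFF   # lower 32 bits
    high = x >> 32         # upper 32 bits
    return bit_count32(low) + bit_count32(high)


def bit_count_narrow(x: int, bits: int) -> int:
    """Emulate an 8- or 16-bit population count by masking to the width first.

    Masking keeps sign bits of a negative input from leaking into the count.
    """
    mask = (1 << bits) - 1
    return bit_count32(x & mask)
```

Splitting into halves works because a population count is additive over disjoint bit ranges; the narrow case needs the mask so that a sign-extended value does not count bits outside its declared width.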
*	Support tensor addressing (#7060)	Jay Kwak	2025-05-15
	This commit implements two new types and related Load/Store functions in CoopMat.

	tensor_addressing.TensorLayout
	tensor_addressing.TensorView
	CoopMat.Load(..., TensorLayout)
	CoopMat.Load(..., TensorLayout, TensorView)
	CoopMat.Store(..., TensorLayout)
	CoopMat.Store(..., TensorLayout, TensorView)
*	Add new coopmat2 functions: Reduce and Transpose (#7027)	Jay Kwak	2025-05-14
	This commit adds new functions for CoopMat as described in the proposal document, Cooperative matrix 2 proposal spec#12.

	The new functions are:
	CoopMat<T,S,M,N,R>::Transpose
	CoopMat<T,S,M,N,R>::ReduceRow
	CoopMat<T,S,M,N,R>::ReduceColumn
	CoopMat<T,S,M,N,R>::ReduceRowAndColumn
	CoopMat<T,S,M,N,R>::Reduce2x2
*	Support the new CoopVec builtins (#7108)	Jay Kwak	2025-05-14
	**NOTE:** This is a breaking change for users of the POC variant of DXC. To keep compatibility, those users must add `-capability hlsl_coopvec_poc` to their command line.

	This PR adds a new capability, "hlsl_coopvec_poc". When it is used, the HLSL for CoopVec is emitted for the POC variant of DXC. When it is not used, the HLSL for CoopVec is emitted for the DXC that officially supports cooperative vectors.
*	Make CUDA version capabilities reach NVRTC (#7074)	Theresa Foley	2025-05-12
	Fixes #7049

	The root cause of the problem in #7049 is simply that newer NVRTC versions produce a warning when asked to generate code for older CUDA SM versions, and the default that Slang was requesting compilation for was old enough to trigger that warning, and thus trip up the test case (which only looks at the first diagnostic produced by the downstream compiler).

	Superficially, the fix was easy: change the test case in question (`tests/diagnostics/local-line.slang`) to request `-capability cuda_sm_8_0`, the minimum version supported by current NVRTC.

	Unfortunately, the simple fix required some other fixes in order to actually work. The capability system includes capability names of the form `cuda_sm__`, but specifying such a capability had no impact on the CUDA SM version passed in when invoking NVRTC. Instead, only the CUDA SM versions requested in the implementation of intrinsics in the core module were affecting the version number passed down.

	This change adds logic to `slang-compiler.cpp` to take explicitly requested capabilities into account when inferring the CUDA SM version to be passed downstream. A more complete fix would also add similar logic for all the other targets.

	Unfortunately... yet again... that fix wasn't enough to make things work as expected. Now I had the problem that requesting `-capability cuda_sm_8_0` was actually causing the NVRTC invocation to request CUDA SM version 9.0! The underlying problem there was that the `slang-capabilities.capdef` file has defined certain capability names in a way that implies atomic capabilities much higher than one would expect. E.g., the `cuda_sm_8_0` alias was including HLSL `sm_5_0`, but then `sm_5_0` in turn included `_cuda_sm_9_0`.

	The fix, for now, is to change the definitions in `slang-capabilities.capdef` to not have the counter-intuitive definitions for `cuda_sm__`.

	With this set of fixes, the test failure in the original bug report no longer occurs.

	The work that went into this change suggests several larger-scope fixes that would be good to pursue:

	* Ideally the capability definitions would have some sort of validation checking to make sure that counter-intuitive results like `cuda_sm_8_0` requesting CUDA SM 9.0 do not occur.
	* The translation of capabilities over to version numbers for a downstream compiler should be expanded to cover other targets, and not just CUDA. It might be better/simpler to just pass the capabilities themselves to the downstream compiler, since it is possible that a downstream compiler could have more fine-grained enable/disable options than a simple version number.
	* The entire approach to computing version numbers required for downstream compilation should be cleaned up so that we don't have this duplication between the capabilities that represent those versions and separate syntactic constructs that are used to "request" those versions as part of code generation.
	* We are very much at the point where we should consider dropping the current behavior where a profile name or capability like `sm_5_0`, that is specific to a single target or a subset of targets, also implies a set of comparable capabilities for other targets.
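The transitive implication described above, and the validation check the message suggests, can be sketched in a few lines. This is a toy Python model only: the two tables, the atom names other than `_cuda_sm_9_0`, and the functions `expand`, `inferred_cuda_sm`, and `validate` are assumptions made for illustration and do not reflect the real `slang-capabilities.capdef` contents or Slang's implementation.

```python
import re

# Toy alias tables (hypothetical). An alias maps to the names it includes;
# any name that is not a key is treated as an atom.
CAPDEF_BEFORE = {
    "cuda_sm_8_0": ["_cuda_sm_8_0", "sm_5_0"],
    "sm_5_0": ["_sm_5_0", "_cuda_sm_9_0"],   # the counter-intuitive inclusion
}
CAPDEF_AFTER = {
    "cuda_sm_8_0": ["_cuda_sm_8_0", "sm_5_0"],
    "sm_5_0": ["_sm_5_0", "_cuda_sm_6_0"],   # one possible corrected definition
}

_CUDA_ATOM = re.compile(r"^_cuda_sm_(\d+)_(\d+)$")
_CUDA_ALIAS = re.compile(r"^cuda_sm_(\d+)_(\d+)$")


def expand(name, capdef, seen=None):
    """Return the set of atoms implied by a capability name (transitive closure)."""
    seen = set() if seen is None else seen
    if name in seen:
        return set()
    seen.add(name)
    if name not in capdef:
        return {name}
    atoms = set()
    for child in capdef[name]:
        atoms |= expand(child, capdef, seen)
    return atoms


def inferred_cuda_sm(requested, capdef):
    """Highest CUDA SM version implied by a requested capability, or None."""
    versions = []
    for atom in expand(requested, capdef):
        match = _CUDA_ATOM.match(atom)
        if match:
            versions.append((int(match[1]), int(match[2])))
    return max(versions, default=None)


def validate(capdef):
    """List cuda_sm_X_Y aliases that imply a CUDA SM version above X.Y."""
    offenders = []
    for alias in capdef:
        match = _CUDA_ALIAS.match(alias)
        if not match:
            continue
        named = (int(match[1]), int(match[2]))
        implied = inferred_cuda_sm(alias, capdef)
        if implied is not None and implied > named:
            offenders.append(alias)
    return offenders
```

With the first table, `inferred_cuda_sm("cuda_sm_8_0", CAPDEF_BEFORE)` evaluates to `(9, 0)` and `validate` flags the alias; with the second table the implied version stays at `(8, 0)` and nothing is flagged.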
*	cluster acceleration structure optix 6431 (#7028)	Harsh Aggarwal (NVIDIA)	2025-05-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add cluster geometry intrinsics for ray tracing - Added GetClusterID() method to HitObject class - Added CandidateClusterID() and CommittedClusterID() methods to RayQuery class - Added SPV_NV_cluster_acceleration_structure extension support - Added GL_NV_cluster_acceleration_structure extension support - Added test files for RayQuery and HitObject cluster methods Fixes #6431 * OpRayQueryGetIntersectionClusterIdNV - unrecognized spirv Disabling spirv backend for SPV_NV_cluster_acceleration_structure hlsl.meta.slang(18674): error 29100: unrecognized spirv opcode: OpRayQueryGetIntersectionClusterIdNV result:$$int = OpRayQueryGetIntersectionClusterIdNV &this $iCandidateOrCommitted; ^~~~~~ hlsl.meta.slang(18670): error 30019: expected an expression of type 'int', got 'void' return spirv_asm ^~~~~~~~~ ninja: build stopped: subcommand failed. * 6431 - Fix spirv opcode * Remove tests * Add relevant tests * Review - Simplify tests
*	Add a new capability hlsl_2018 that avoid using select/and/or (#7003)	Jay Kwak	2025-05-05
\| \| \|	Co-authored-by: Yong He <yonghe@outlook.com>
*	Add Slang Byte Code generation and interpreter. (#6896)	Yong He	2025-04-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add Slang Byte Code generation and interpreter. * Fix compile issues. * format code * More compile fix. * Fix clang issue. * Fix more clang issues. * Another clang fix. * Fix clang issues. * Fix another clang issue. * Fix wasm build. * Update building.md * Fix test-server. * Fix compile error. * Fix bug. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Implement shader subgroup rotate intrinsics (#6878)	Darren Wihandi	2025-04-22
\| \| \| \| \| \| \| \| \| \| \| \| \|	* Initial implementation for SPIRV, GLSL and Metal * test add bool test * Fix and improve subgroup rotate tests * Add proper GLSL extensions and proper Metal type checking * Clean up tests and add diagnostics test for subgroup type for Metal * Update wave-intrinsics docs
*	Add a new SM profile 6.9 (#6879)	Jay Kwak	2025-04-22
*	Add cooperative matrix 1 support (#6565)	Darren Wihandi	2025-04-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* initial wip for spirv * working tiled example * clean up store and load * minor fixes * fix loadAny name * add initial tests, including broken/unimplemented intrinsics * fix subscript * run tests at 16x16, remove not supported arithmetic tests * minor fixups on implementation * rename CoopMatMatrixUse * Update tests to pass validation layers locally * Add mat-mul-add test and minor fixes * Add more tests * Remove dead code * Add coopMatLoad function and tests, enforce constexpr for matrix layout * Use getVectorOrCoopMatrixElementType in place of getVectorElementType
*	Output SPV_KHR_compute_shader_derivatives extension string instead of the NV extension (#6641)	Darren Wihandi	2025-03-19
	* Output SPV_KHR_compute_shader_derivatives instead of the NV extension
	* add alias for nv extension
*	Implement floating-point pack/unpack intrinsics for all targets (#6503)	Darren Wihandi	2025-03-18
\| \| \| \| \| \| \|	* Implement floating-point pack/unpack intrinsics * remove unused functions and update caps in glsl meta file * rename pack capability
*	Add support for Metal subgroup/simd operations (#6247)	Darren Wihandi	2025-02-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* initial work for metal subgroups * add glsl intrinsics * enable wave tests * enable glsl subgroup tests, glsl barrier fixes * minor fixes * fix incorrect test target * disable some glsl functional tests * disable failing glsl test --------- Co-authored-by: Yong He <yonghe@outlook.com>
*	Add support for WGSL subgroup operations (#6213)	Darren Wihandi	2025-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* initial work * more work * more work on glsl intrinsics * add subgroup broadcast for glsl * wip add wgsl extension tracking * enable tests, enable extensions and added some todos * format and warning fixes * fix wgsl extension tracker --------- Co-authored-by: Yong He <yonghe@outlook.com>
*	Support cooperative vector (#6223)	Jay Kwak	2025-01-30
	* Support cooperative vector without Vulkan-header update

	Add Slang support for cooperative vector. This commit does not include the Vulkan-header update.
*	Implement WaveMultiPrefix* for SPIRV and GLSL (#6182)	Darren Wihandi	2025-01-29
*	Implement Quad Control intrinsics (#5981)	Darren Wihandi	2025-01-17
*	Fix typo in capdef file (#5711)	Bruce Mitchener	2024-12-02
\| \| \| \| \| \| \| \| \|	After #5671, when a build is done, the doc file is regenerated from the capdef file resulting in a changed file in the build tree. Fixing the typo in the capdef file prevents that from happening. Co-authored-by: Yong He <yonghe@outlook.com>
*	Write only texture types. (#5454)	Yong He	2024-10-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add support for write-only textures. * Fix capabilities. * Fix implementation. * Fix. * format code --------- Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Add `InterlockedAddF64` intrinsic. (#5412)	Yong He	2024-10-27
*	Feature/wgsl intrinsic texture gather (#5141)	Jay Kwak	2024-09-23
	This PR implements the texture gather functions for WGSL. The pattern is very similar to the Metal implementation. Before copying from the Metal implementation, I had to clean it up to make it more readable and maintainable.

	Gather functions are available only for 2D and 3D textures. Their `array` and `depth` variants may or may not be supported depending on the target.

	`static_assert` ensures that Gather functions are available only for 2D and 3D textures.

	Removed incorrect use of "$p" argument for targeting GLSL.
*	WGSL implement texture intrinsics except gather and sampler-less (#5123)	Jay Kwak	2024-09-20
	This commit implements all of the texture intrinsics for WGSL except "Gather" and sampler-less. They will be implemented in a separate PR.

	A few things to note:
	- texture sampling functions are available only for the fragment shader stage; not for compute
	- WGSL doesn't have any functions similar to CalculateLevelOfDetail or CalculateLevelOfDetailUnclamped.
	- WGSL doesn't have a function overload for textureSample with "clamp" or "status" arguments.
	- WGSL doesn't support Load operation with offset for texture_multisampled_XX and texture_storage_XX.
	- WGSL supports only four types of depth textures: 2D, 2D_array, cube and cube_array.
	- WGSL doesn't support "offset" variants for cube and cube_array.
*	Add WGSL intrinsics for synchronization (#5114)	Anders Leino	2024-09-19
\| \| \|	This closes issue #5085.
*	Add WGSL pack/unpack, constructor, derivatives & misc intrinsics (#5102)	Anders Leino	2024-09-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add WGSL pack/unpack intrinsics This addresses issue #5080. * Add WGSL constructor intrinsics This addresses issue #5081. * Add WGSL derivative and miscellaneous intrinsics This addresses issue #5083. * Add some missing WGSL intrinsics - degrees - faceforward
*	Implement math intrinsics for WGSL (#5078)	Jay Kwak	2024-09-17
\| \| \| \| \| \| \| \| \| \| \| \| \|	* Implement math intrinsics for WGSL This commit implements math related intrinsics and a few others for WGSL. The implementation is based on the following doc, https://www.w3.org/TR/WGSL slang-test was looking for the downstream compiler for WGSL even though it is not used. This commit adds a minimal change to avoid the crash.
*	Initial WGSL support (#5006)	Anders Leino	2024-09-09
	* Add WGSL as a target

	This is required for #4807.

	* C-like emitter: Allow the function header emission to be overloaded

	WGSL-style function headers are pretty different from normal C-style headers:

	Normal C-style headers:
	    ReturnType Func(...)
	    void VoidFunc(...)

	WGSL-style headers:
	    fn Func(...) -> ReturnType
	    fn VoidFunc(...)

	This change allows the header style to be overloaded, in order to accommodate WGSL-style headers as required to resolve issue #4807, but retains normal C-style headers as the default implementation.

	[1] https://www.w3.org/TR/WGSL/#function-declaration-sec

	* C-like emitter: Allow emission of switch case selectors to be overloaded

	The C-like emitter will emit code like this:

	    switch(a.x) { case 0: case 1: { ... } break; ... }

	This is not allowed in WGSL. Instead, selectors for cases that share a body must [1] be separated by commas, like this:

	    switch(a.x) { case 0, 1: { ... } break; ... }

	To prepare for addressing issue #4807, this patch makes the emission of switch case selectors overloadable.

	[1] https://www.w3.org/TR/WGSL/#syntax-case_selectors

	* C-like emitter: Support WGSL-style declarations

	This patch helps to address issue 4807.

	C-like languages declare variables like this:
	    i32 a;
	WGSL declares variables like this:
	    var a : i32

	The patch introduces overloads so that the forthcoming WGSL emitter can output WGSL-style declarations, which helps to resolve #4807.

	* C-like emitter: Support overloading of declarators

	Unlike C-like languages, WGSL does not support the following types at the syntax level, via declarators:
	- arrays
	- pointers
	- references

	For this reason, this patch introduces support for overloading the declarator emitter, in order to help address issue #4807.

	C-like languages:
	    int a[3]; // Array-ness of type is mixed into the "declarator"
	WGSL:
	    var a : array<int, 3>; // Array-ness of type is part of the... type_specifier!

	* C-like emitter: Allow struct declaration separator to be overridden

	C-like languages use ';' as a separator, and languages like e.g. WGSL use ','. This change prepares for addressing issue #4807.

	* C-like emitter: Allow overriding of whether pointer-like syntax is necessary

	Things like e.g. structured buffers map to "ptr-to-array" in WGSL, but ptr-typed expressions don't always need C-style pointer-like syntax. Therefore, make it overridable whether or not such syntax is emitted in various cases in order to address #4807.

	* C-like emitter: Emit parentheses to avoid warning about & and + precedence

	This helps with #4807 because WGSL compilers (e.g. Tint) treat absence of parentheses as an error.

	* C-like emitter: Add hook for emitting struct field attributes

	WGSL requires @align attributes to specify explicit field alignment in certain cases. Thus, this patch prepares for addressing #4807.

	* C-like emitter: Add hook for emitting global param types

	Declarations of structured buffers map to global array declarations in WGSL. However, in all other cases such as when structured buffers are used in operands, their types map to ptr-to-array.

	This patch makes it possible for the WGSL back-end to say that structured buffers generally map to "ptr-to-array" types, but still have a special case of just "array" when declaring the global shader parameter. Thus, this patch helps with addressing #4807.

	* IR lowering: Use std140 for WGSL uniform buffers

	This patch just cuts out some logic that prevented std140 from being chosen for WGSL uniform buffers. Note that the layout of WGSL buffers in the uniform address space is not quite std140, but for now it's close enough to avoid compile issues. Later on, a custom layout should be created for WGSL uniform buffers. When that's done, this change will be revisited, but for now it helps to resolve #4807.

	* Don't emit line directives in WGSL by default

	WGSL does not support line directives [1]. The plan currently seems to be to instead support source-map [2]. This is part of addressing issue #4807.

	[1] https://github.com/gpuweb/gpuweb/issues/606
	[2] https://github.com/mozilla/source-map

	* WGSL IR legalization: Map SV's

	The implementation closely follows the corresponding one for Metal.

	Supported:
	- DispatchThreadID
	- GroupID
	- GroupThreadID
	- GroupThreadID

	Unsupported:
	- GSInstanceID

	This is not complete, but it helps to address #4807.

	* WGSL emitter: Add support for basic language constructs

	A lot of the basics are added in order to generate correct WGSL code for basic Slang language constructs. This addresses issue #4807.

	This adds support for at least the following:
	- statements
	- if statements
	- ternary operator
	- while statement
	- for statements
	- variable declarations
	- switch statements
	- Note: Slang may emit non-constant case expressions, see issue 4834
	- literals
	- integer literals
	- u?int[16|32|64]_t
	- float and half literals
	- bool literals
	- vector literals and splatting (e.g 1.xxx)
	- function definitions
	- assignments
	- +=, =, /=
	- array assignments
	- vector assignments/updates
	- swizzles of other vectors
	- from matrix rows ('m[i]' notation)
	- from matrix cols (using swizzle notation, e.g 'm._11_12_13')
	- matrix assignments/updates
	- to rows ('m[i]' notation)
	- to cols (using swizzle notation, e.g 'm._11_12_13')
	- declarations
	- arrays

	[1] https://www.w3.org/TR/WGSL/#syntax-switch_body

	* Add some WGSL capabilities

	This patch registers some WGSL capabilities required to pass many of the initial compute shader compile tests. Many capabilities still remain to be added -- this is just an initial set to help resolve issue #4807.
	- asint
	- min and max
	- cos and sin
	- all and any

	* WGSL and C-like emitters: Add hack to bitcast case expression

	In WGSL, the switch condition and case types must match.
	https://www.w3.org/TR/WGSL/#switch-statement

	Slang currently allows these types to mismatch, as pointed out in #4921. Issue #4921 should eventually be addressed in the front-end by a patch like [1]. However, at the moment that would break Falcor tests. Thus, this patch temporarily works around the issue in the WGSL emitter only in order to help resolve #4807. In the future, the Falcor tests should be fixed, this patch should be dropped and [1] should be merged instead.

	[1] a32156ef52f43b8503b2c77f2f1d51220ab9bdea
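The "overloadable emission" idea in this commit (C-style output by default, WGSL-style output when the target asks for it) can be sketched as two small functions. The Python below is illustrative only: `emit_header`, `emit_case_selectors`, and the `style` parameter are hypothetical names, and the real Slang emitter is C++ that uses virtual methods rather than a string switch.

```python
def emit_header(name, return_type, params, style="c"):
    """Emit a function header.

    `params` is a list of (name, type) pairs and `return_type` is None for void.
    The default style is C-like; "wgsl" selects the WGSL form.
    """
    if style == "c":
        plist = ", ".join(f"{ptype} {pname}" for pname, ptype in params)
        return f"{return_type or 'void'} {name}({plist})"
    if style == "wgsl":
        plist = ", ".join(f"{pname} : {ptype}" for pname, ptype in params)
        header = f"fn {name}({plist})"
        return header if return_type is None else f"{header} -> {return_type}"
    raise ValueError(f"unknown style: {style}")


def emit_case_selectors(values, style="c"):
    """Emit the selectors for switch cases that share one body."""
    if style == "c":
        # C-like: one `case` label per value, falling through to the body.
        return " ".join(f"case {value}:" for value in values)
    if style == "wgsl":
        # WGSL: a single `case` with comma-separated selectors.
        return "case " + ", ".join(str(value) for value in values) + ":"
    raise ValueError(f"unknown style: {style}")
```

For example, `emit_header("Func", "i32", [("a", "i32")], "wgsl")` produces `fn Func(a : i32) -> i32`, and `emit_case_selectors([0, 1], "wgsl")` produces `case 0, 1:`, matching the two shapes the commit message contrasts.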
*	Document All Capability Atoms and Profiles (#5008)	ArielG-NV	2024-09-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Document All Capability Atoms and Profiles Fixes: #4125 Unimplemented Considerations: 1. This PR does not add support to query all capability-atom's from a command-line option. It is understood that this might be desired, due to this, the logic to generate `docs\user-guide\a3-02-reference-capability-atoms.md` was made to be "command-line friendly" so minimal changes are needed to pipe our documentation into a command-line option if this change is to be added. Changes: 1. Added a way to document atoms inside `.capdef`. Method to document is described under `source\slang\slang-capabilities.capdef`. The goal is to error if a public atom does not have any form of documentation to ensure we always have up-to-date documentation to guide user on what an atom is/does. * The following `.capdef` file syntax was added * /// [HEADER_GROUP] * /// regular comment 2. When capability generator runs it auto-generates `docs\user-guide\a3-02-reference-capability-atoms.md` 3. Added to the user-guide 3 sections: `Reference`, `Reference -> Capability Profiles`, `Reference -> Capability atoms` section
*	Allow capabilities to be used with `[shader("...")]` (#4928)	ArielG-NV	2024-08-28
	* Allow capabilities to be used with `[shader("...")]`

	Fixes: #4917

	Changes:
	1. Allow using capabilities instead of `Stage`s with `EntryPointAttribute`.
	2. When resolving capabilities for an entrypoint+profile (per entrypoint) in `resolveStageOfProfileWithEntryPoint` add our `EntryPointAttribute` and resolved capability
	3. Added tests and some capabilities related clean-up

	* fix a warning made by a mistake in syntax
	* change fineStageByName to assume it is passed a stage without a '_'
	* test with and without prefix '_'
	* cleanup some profiles and representation to work better with 'Stage' and 'Profile'

	This use case is why we need to clean all profile-usage into `CapabilityName`s directly.

	* change how we compare
	* only change profiles
	* let all capabilities be resolved by 'shader' profile for now
	* fix warning checks I accidentally broke
	* meshshading_internal to _meshshading

	---------

	Co-authored-by: Yong He <yonghe@outlook.com>
*	Initial support for precompiled DXIL in slang-modules (#4755)	cheneym2	2024-08-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add embedded precompiled binary IR ops Add IR operations to embed precompiled DXIL or SPIR-V blobs into IR. Adds a BlobLit literal that is mostly identical to StringLit except that it cannot be displayed, e.g. in dumped IR. In the future, the blob might be dumped as hexadecimal, but for now it is summarized as "<binary blob>". * EmbeddedDXIL and SPIR-V options The options '-embed-dxil' and '-embed-spirv' in slangc will cause a target dxil or spirv to be compiled and stored in the translation unit IR when written to a slang-module. Subsequent changes actually implement the options. * Per-translation unit DXIL precompilation When -embed-dxil is specified, perform a precompilation to DXIL of each TU, linked only with stdlib. Embed the resulting DXIL for the TU in an IR op. Being part of IR, the precompiled DXIL can be serialized to disk in a slang-module. Upon loading slang-modules, the new IR op will be searched for and the precompiled DXIL blob is saved with the loaded Module. During linking, if all the Modules have precompiled blobs, they will be sent to the downstream compile commands as libraries instead of source, skipping the downstream compilation and using DXC only for linking. Fixes Issue #4580 * Remove placeholder embedded SPIRV support Code was added only to sketch out how other precompiled bins will be supported. * Remove the rest of the SPIRV placeholder support * Fix warnings, test error on non-windows * Remove lib_6_6 hack, add dxil_lib capability * Allocate blob value from irmodule memarena * Add null check after memarena allocation * Restore the request->e2erequest code path for generatewholeprogram * Update capability handling, move EmbedDXIL enum to end to preserve abi * Remove lib_6_6 hack * Move ICompileRequest functions to end
*	Add `_Internal`/`External` atom enforcement and validation. (#4702)	ArielG-NV	2024-07-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add `_Internal`/`External` atom validation and use enforcement. Fixes: #4676 Changes: * Added `validateInternalAtomExternalAtomPair` to the capability generator to ensure all `_Internal` atoms have a corresponding `External` atom. * Validation of 'RequireCapabilityAttribute' warns if a user uses an '_Internal' atom. * Added 'External' atoms to atoms with an already existing '_Internal' atom. * Printing an atom removes '_'. * Fixed some tests that were checking for the incorrect warning/error (capability4.slang, capability5.slang, capability6.slang). * switch capability name to use `UnownedStringSlice` instead of `const char`; this includes using functions like `.startsWith`. * grammar --------- Co-authored-by: Yong He <yonghe@outlook.com>
*	Overhaul IR lowering of pointer types. (#4710)	Yong He	2024-07-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Overhaul IR lowering of pointer types. * Propagate address space in IRBuilder. * Fixup. * Fix. * Fix. * Change how Ptr type is printed to text. * Fix.
*	Metal: `Interlocked` (atomic) member function support for buffers (#4655)	ArielG-NV	2024-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Metal: `Interlocked` (atomic) member function support for buffers fixes: #4654 fixes: #4481 1. Add `Interlocked` (atomic) member function support for buffers to Metal 2. Fix `__getEquivalentStructuredBuffer` so it works with CPP/Metal targets * add `CompareStore` support * legalize RWByteAddressBuffer to fully replace StructuredBuffer * destroy replaced byte-addr buffer * cleanup as per review and add comment to explain why certain code exists * fix flow of byte-address-buffer replacement * toggle on option to translate byteAddrBuffer to StructuredBuffer * cleanup unused buffers * add treatGetEquivalentStructuredBufferAsGetThis flag to treat getEquivStructuredBuffer as a byteAddressBuffer * comment to explain `treatGetEquivalentStructuredBufferAsGetThis` --------- Co-authored-by: Yong He <yonghe@outlook.com>
*	Implement 64-bit version of clockARB (#4571)	venkataram-nv	2024-07-10
\| \| \| \| \| \| \| \| \| \| \|	* Implement 64-bit version of clockARB * Fix capability versions * Corrections to capabilities --------- Co-authored-by: Yong He <yonghe@outlook.com>
*	add GL_EXT_ray_tracing_position_fetch (#4566)	ArielG-NV	2024-07-10
\|