| Age | Commit message (Collapse) | Author |
|
Enable CUDA for the tests listed in issue #8078
This requires a minor CUDA prelude change, adding some math functions.
|
|
* Initial plan
* Add U32_firstbitlow implementation for CUDA and CPP backends
Co-authored-by: bmillsNV <163073245+bmillsNV@users.noreply.github.com>
* Add I32_firstbitlow and comprehensive testing for signed/unsigned firstbitlow
Co-authored-by: bmillsNV <163073245+bmillsNV@users.noreply.github.com>
* Convert firstbitlow test to use inline filecheck syntax
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
* Add U32_firstbithigh and I32_firstbithigh implementations for CUDA and CPP backends
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* Update prelude/slang-cpp-scalar-intrinsics.h
* Update prelude/slang-cpp-scalar-intrinsics.h
* Update prelude/slang-cpp-scalar-intrinsics.h
* Refactor Metal bit intrinsics to handle zero case correctly
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
* Update slang-cuda-prelude.h
remove fake links
* Update hlsl.meta.slang
* if -1, return -1 due to implicit hlsl rule
* -1 or 0 is ~0u as per hlsl implicitly
* 0 or -1 as per hlsl
* fix the math to map to hlsl
* fix compile error
* forgot `31 - clz`
* format code (#7943)
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Update source/slang/hlsl.meta.slang
* Update source/slang/hlsl.meta.slang
* Update source/slang/hlsl.meta.slang
* Update source/slang/hlsl.meta.slang
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: bmillsNV <163073245+bmillsNV@users.noreply.github.com>
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
Co-authored-by: ArielG-NV <aglasroth@nvidia.com>
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
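The zero and sign handling these commits converge on can be sketched in portable C++. This is a minimal sketch, not the prelude code: the `sketch*` names are hypothetical, and it assumes the rules mentioned in the messages above (no set bit yields `~0u`, signed `firstbithigh` treats `0` and `-1` alike, and the index is counted from the least significant bit, i.e. the `31 - clz` form).

```cpp
#include <cstdint>

// Hypothetical helper: index of the lowest set bit, or ~0u if v == 0.
inline uint32_t sketchFirstBitLowU32(uint32_t v)
{
    if (v == 0)
        return ~0u;
    uint32_t index = 0;
    while ((v & 1u) == 0)
    {
        v >>= 1;
        ++index;
    }
    return index;
}

// Hypothetical helper: index (from the LSB) of the highest set bit,
// or ~0u if v == 0. Equivalent to 31 - clz(v) for non-zero v.
inline uint32_t sketchFirstBitHighU32(uint32_t v)
{
    if (v == 0)
        return ~0u;
    uint32_t index = 31;
    while ((v & 0x80000000u) == 0)
    {
        v <<= 1;
        --index;
    }
    return index;
}

// Signed variant (assumption): for negative inputs, search for the highest
// zero bit by inverting first, so both 0 and -1 map to ~0u.
inline uint32_t sketchFirstBitHighI32(int32_t v)
{
    uint32_t bits = static_cast<uint32_t>(v);
    if (v < 0)
        bits = ~bits;
    return sketchFirstBitHighU32(bits);
}
```

A real backend would likely use a native count-leading-zeros or find-first-set instruction instead of the loops shown here.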
|
|
Change adds 16-bit and 8-bit support for the countbits intrinsic. In
cases where a backend's native countbits lacks support, it
is emulated.
New tests are added for 16-bit and 8-bit support. Additional testing
added for 32-bit and minor updates made to 64-bit countbits.
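One way such emulation could look is to widen the narrow value to 32 bits and reuse a 32-bit population count. This is a sketch under that assumption, with hypothetical `sketch*` names; the commit does not state which emulation strategy each backend uses.

```cpp
#include <cstdint>

// Hypothetical 32-bit population count (clears the lowest set bit per step).
inline uint32_t sketchCountBits32(uint32_t v)
{
    uint32_t count = 0;
    while (v != 0)
    {
        v &= v - 1;
        ++count;
    }
    return count;
}

// 16-bit: zero-extend, then count.
inline uint32_t sketchCountBits16(uint16_t v)
{
    return sketchCountBits32(static_cast<uint32_t>(v));
}

// Signed 8-bit: go through uint8_t first so sign extension does not
// add 24 spurious set bits for negative inputs.
inline uint32_t sketchCountBits8(int8_t v)
{
    return sketchCountBits32(static_cast<uint32_t>(static_cast<uint8_t>(v)));
}
```

The detour through the unsigned type of the same width is the important part for signed inputs.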
|
|
Change modifies the countbits intrinsic to use generics in order to
support 64bit countbits on select platforms where this is supported.
On platforms where this is not natively supported, we emulate by
converting the 64-bit type into a uint2 (metal and spir-v).
This should align with the implementation of other uint64_t
intrinsics such as abs, min, max and clamp.
Added new countbits64 test to verify changes.
Updated documentation for 64bit-type-support.html
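The uint2-based emulation described above can be illustrated in C++ by splitting the 64-bit value into two 32-bit halves and summing their counts. The struct and function names are hypothetical, and the sketch only mirrors the idea, not the actual Metal or SPIR-V code paths.

```cpp
#include <cstdint>

// Hypothetical stand-in for a shader-side uint2.
struct SketchUint2
{
    uint32_t x; // low 32 bits
    uint32_t y; // high 32 bits
};

inline uint32_t sketchPopCount32(uint32_t v)
{
    uint32_t count = 0;
    while (v != 0)
    {
        v &= v - 1; // clear the lowest set bit
        ++count;
    }
    return count;
}

inline SketchUint2 sketchSplit64(uint64_t v)
{
    return {static_cast<uint32_t>(v & 0xFFFFFFFFu), static_cast<uint32_t>(v >> 32)};
}

// 64-bit countbits emulated with two 32-bit counts.
inline uint32_t sketchCountBits64(uint64_t v)
{
    const SketchUint2 halves = sketchSplit64(v);
    return sketchPopCount32(halves.x) + sketchPopCount32(halves.y);
}
```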
|
|
* Add intptr_t abs/min/max operations for CPU & CUDA targets
* Define intptr_t and uintptr_t with CUDACC_RTC
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* format
* Minor test fixes
* enable checking cpp format in ci
|
|
(#5415)
This commit changes the word "stdlib" or "standard library" to "core module" in the source code.
|
|
Fixes https://github.com/shader-slang/slang/issues/4549
|
|
|
|
* Make the exponent return value from frexp int
Fixes https://github.com/shader-slang/slang/issues/3282
* Update slang-llvm.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
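For context, the C++ standard library already reports the `frexp` exponent as an `int`. A small sketch of a wrapper with an integer exponent follows; the struct and function names are hypothetical and are not taken from the prelude.

```cpp
#include <cmath>

// Hypothetical result type: mantissa in [0.5, 1) and an integer exponent.
struct SketchFrexpResult
{
    float mantissa;
    int exponent;
};

inline SketchFrexpResult sketchFrexp(float x)
{
    int exponent = 0;
    const float mantissa = std::frexp(x, &exponent);
    return {mantissa, exponent};
}
```

For example, `sketchFrexp(8.0f)` yields a mantissa of `0.5f` and an exponent of `4`, since 8 = 0.5 * 2^4.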
|
|
* Overhaul global inst deduplication and cpp/cuda backend.
* Update IR documentation.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Float16 support for C++/CPU based targets with f16tof32 and f32tof16.
* Small correction around INF/NAN handling for f32tof16
* Small improvement to f16tof32
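As an illustration of what a CPU-side `f16tof32` has to handle, here is a minimal sketch of the half-to-float direction, covering normal values, subnormals, zero, and INF/NAN. It assumes IEEE 754 binary16 and binary32 layouts; the function name is hypothetical and this is not the prelude's implementation.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: expand the 16 bits of a half into a 32-bit float.
inline float sketchF16ToF32(uint16_t h)
{
    const uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
    uint32_t exponent = (h >> 10) & 0x1Fu;
    uint32_t mantissa = h & 0x3FFu;
    uint32_t bits = 0;

    if (exponent == 0x1Fu)
    {
        // INF or NAN: all-ones exponent, mantissa payload preserved.
        bits = sign | 0x7F800000u | (mantissa << 13);
    }
    else if (exponent != 0)
    {
        // Normal value: rebias the exponent from 15 to 127 (+112).
        bits = sign | ((exponent + 112u) << 23) | (mantissa << 13);
    }
    else if (mantissa == 0)
    {
        // Signed zero.
        bits = sign;
    }
    else
    {
        // Subnormal half: shift until the implicit leading 1 appears.
        exponent = 113u; // 127 - 15 + 1
        while ((mantissa & 0x400u) == 0)
        {
            mantissa <<= 1;
            --exponent;
        }
        mantissa &= 0x3FFu; // drop the now-implicit leading 1
        bits = sign | (exponent << 23) | (mantissa << 13);
    }

    float result;
    std::memcpy(&result, &bits, sizeof(result)); // bit cast without UB
    return result;
}
```

The reverse direction (`f32tof16`) additionally needs a rounding policy and overflow handling, which is where the INF/NAN correction mentioned above would apply.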
* Disable CUDA test for now.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* First integration with 'slang-llvm'.
* Fix project.
* Fix test output.
* First pass assert support.
* Add inline impls for min and max.
* Add inline abs impl for llvm.
* Make abs not use ternary op
* Fix typo in slang-llvm.h
* Sundry fixes to make remaining tests using llvm backend pass.
|
|
* Enable default cpp prelude.
* Print the "#include" line as a normal source if the file does not exist.
* Bug fix
* Fix.
* Fix c++ prelude header.
* Remove unnecessary fopen call.
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
etc.
|
|
Use SLANG_PRELUDE_STD macro to prefix functions that may need to be specified in std:: namespace.
|
|
* Make compilation work on gcc by disabling -Wclass-mem-access
|
|
This change continues the work already started in moving the definitions of many built-in functions to the standard library.
The main focus in this change was reducing the number of operations that had to be special-cased on the CPU and CUDA targets by making sure that the scalar cases of built-in functions map to the proper names in the prelude (e.g., `F32_sin()`) via the ordinary `__target_intrinsic` mechanism. In some cases this cleanup meant that special-case logic that was constructing definitions for those functions using C++ code could be scrapped.
Additional changes made along the way:
* A few scalar functions that were missing in the CPU/CUDA preludes got added: `round`, hyperbolic trigonometric functions, `frexp`, `modf`, and `fma`
* The floating-point `min()` and `max()` definitions in the preludes were changed to use intrinsic operations on the target (which are likely to follow IEEE semantics, while our definitions did not)
* For the CUDA target, many of the functions had their names translated during code emit from, e.g., `sin` to `sinf`. This change makes the CUDA target more closely match the C++/CPU target in using names like `F32_sin` consistently.
* For the CUDA target, a few additional functions have intrinsics that don't exist (portably) on CPU: `sincos()` and `rsqrt()`.
* For the Slang stdlib definitions to work, a new `$P` replacement was defined for `__target_intrinsic` that expands to a type based on the first operand of the function (e.g., `F32` for `float`).
* I removed the dedicated opcodes for matrix-matrix, matrix-vector, and vector-matrix multiplication, and instead turned them into ordinary functions with definitions and `__target_intrinsic` modifiers to map them appropriately for HLSL and GLSL. This is realistically how we would have implemented these if we'd had `__target_intrinsic` from the start.
Notes about possible follow-on work:
* The `ldexp` function is still left in the Slang stdlib because it has to account for a floating-point exponent and the `math.h` version only handles integers for the exponent. It is possible that we can/should define another overload for `ldexp` (and `frexp`) that uses an integer for exponent, and then have that one be a built-in on CPU/CUDA, with the HLSL `frexp` being defined in the stdlib to delegate to the correct `frexp` for those targets.
* The `firstbithigh` and related functions are missing for our CPU and CUDA targets, and will need to be added. It is worth noting that `firstbithigh` apparently has some very odd functionality around signed integer arguments (which are supported, despite MSDN being unclear on that point). General cleanup will be required for those functions.
* Making the various matrix and vector products no longer be intrinsic ops might affect how we emit code for them as sub-expressions (both whether we fold them into use sites and how we parenthesize them). This doesn't seem to affect any of our existing tests, but we could consider marking these functions with `[__readNone]` to ensure they can be folded, and then also adding whatever modifier(s) we might invent to control precedence and parentheses insertion during emit.
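To illustrate why the floating-point `min()`/`max()` change matters, the sketch below contrasts a comparison-based select with the C library's `fmin`, which follows IEEE-style NaN handling. It assumes the earlier prelude definition behaved like a plain select, which the message does not spell out; the function names are hypothetical.

```cpp
#include <cmath>

// Assumed shape of a hand-written definition: a plain comparison select.
// If b is NaN the comparison is false and NaN is returned.
inline float sketchSelectMin(float a, float b)
{
    return a < b ? a : b;
}

// Library/intrinsic route: fmin returns the non-NaN operand when
// exactly one operand is NaN.
inline float sketchIntrinsicMin(float a, float b)
{
    return std::fmin(a, b);
}
```

With `a = 1.0f` and `b = NaN`, the select version returns NaN while the `fmin` version returns `1.0f`; the select version is also sensitive to operand order.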
|
|
* Added support ldexp.
* Added classify-float.slang test
Fixed glsl output.
* Added classify-double.slang
* Added ldexp test to scalar-double.slang
* isnan, isinf, isfinite are macros (on some targets) so remove :: prefix.
|
|
* For integer literals add postfix, and use unsigned/signed output appropriately
* Extend GLSL extension handling by type, and for adding 64 bit int extensions
* Added tests for int/uint64 types
* Add explicit Int/UInt64 emit functions to avoid ambiguity.
* Fix uint64_t intrinsics on CUDA/C++.
* WIP 64 bit types documentation.
* Testing int64 intrinsic support.
* Dx12 Dxil sm6.0 does actually support int64_t.
|
|
* Added hlsl-intrinsic test folder.
Enabled ceil as works across targets.
* log10 support.
* Fix float % on CPU/CUDA to match HLSL which is fmod (not fremainder).
* Added log10 tests back to scalar-float.slang
* Don't add the ( for $Sx - it's clearer what's going on without it.
* Works on CUDA/CPU. Problem: asint/asuint do not seem to be found.
* Only asuint exists for double.
* Support countbits on CUDA and C++.
* Fix typo in C++ population count.
* First pass at int vector intrinsic tests.
* Swizzle for int.
* Bit cast tests on CUDA.
* Fix warning on gcc.
* Fix bit-cast-double execution on CUDA.
* scalar-int test working on gcc release.
|
|
* Added hlsl-intrinsic test folder.
Enabled ceil as works across targets.
* log10 support.
* Fix float % on CPU/CUDA to match HLSL which is fmod (not fremainder).
* Added log10 tests back to scalar-float.slang
* Don't add the ( for $Sx - it's clearer what's going on without it.
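The difference between the two C library functions mentioned in the float `%` fix can be shown with a short sketch. The wrapper names are hypothetical; `std::fmod` and `std::remainder` are the standard `<cmath>` functions.

```cpp
#include <cmath>

// Truncating remainder: the result has the sign of x and |result| < |y|.
inline float sketchFRemTruncated(float x, float y)
{
    return std::fmod(x, y);
}

// IEEE remainder: the quotient is rounded to the nearest integer, so the
// result can be negative even when both operands are positive.
inline float sketchFRemNearest(float x, float y)
{
    return std::remainder(x, y);
}
```

For `5.5 % 2`, the truncating version yields `1.5` (quotient 2), while the IEEE remainder yields `-0.5` (quotient rounded to 3).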
|
|
* Add test result for compile-to-cuda
* Add RAII for some CUDA types to simplify usage.
* First pass handling of some intrinsics on CUDA (for example transcendentals)
* CUDA working with built in intrinsics.
* Add missing CUDA prelude intrinsics.
* CUDA matches CPU output on simple-cross-compile.slang
* First pass at hlsl-scalar-float-intrinsic.slang test.
* Fix smoothstep impl on CUDA and CPU.
* Fixed step intrinsic on CUDA/CPU.
* Added operator[] to Matrix for C++, to allow row access.
Needs a fix for CUDA.
* Fixed warning on clang build.
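For reference, the documented HLSL behavior of `step` and `smoothstep` can be sketched in scalar C++ as follows. The function names are hypothetical and this is not the prelude code; it only restates the usual definitions (`step(y, x)` is 1 when `x >= y`, and `smoothstep` is a clamped Hermite interpolation).

```cpp
// step(y, x): 1 if x >= y, otherwise 0.
inline float sketchStep(float y, float x)
{
    return x >= y ? 1.0f : 0.0f;
}

// smoothstep(lo, hi, x): clamp t = (x - lo) / (hi - lo) to [0, 1],
// then evaluate the Hermite polynomial t * t * (3 - 2t).
inline float sketchSmoothStep(float lo, float hi, float x)
{
    float t = (x - lo) / (hi - lo);
    if (t < 0.0f)
        t = 0.0f;
    if (t > 1.0f)
        t = 1.0f;
    return t * t * (3.0f - 2.0f * t);
}
```

Note the argument order of `step`: the threshold comes first, which is an easy place for an implementation to go wrong.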
|
|
|
|
|
|
Work on #1059
The `%` operator in the Slang implementation had several issues, and this change tries to address some of them:
* Renamed most occurrences of "mod" describing this operator to be "rem" for "remainder" to better match its semantics in HLSL
* Split the operator into distinct integer and floating-point variants (`IRem` and `FRem`) to simplify having different codegen for the two
* Added floating-point variants of `operator%` and `operator%=` to the stdlib.
* Added custom C++ codegen for `kIROp_FRem` such that it maps to the standard C/C++ `remainder()` function
* Added custom GLSL codegen so that `kIROp_FRem` maps to the GLSL `mod()` function (which isn't correct...)
* Added a test case to confirm that D3D11, D3D12, and CPU targets all agree on the definition of floating-point `%`
* Fixed `render-test-tool` to allow a negative integer in a `data=...` specification. This didn't end up being used in the final test, but still seems like a good fix.
* Added a customized baseline for the Vulkan flavor of that test to confirm that we are *not* compiling correctly to SPIR-V just yet
Addressing the correctness of the output for GLSL/SPIR-V will have to come as a later change given that the operation we want is not exposed directly by unextended GLSL.
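The mismatch with GLSL `mod()` can be made concrete with a small C++ sketch. It assumes the GLSL definition `x - y * floor(x / y)` and a truncating remainder for HLSL `%` (as with C `fmod`); both function names are hypothetical.

```cpp
#include <cmath>

// GLSL-style mod: floors the quotient, so the result takes the sign of y.
inline float sketchGlslMod(float x, float y)
{
    return x - y * std::floor(x / y);
}

// Truncating remainder: the result takes the sign of x.
inline float sketchTruncatedRem(float x, float y)
{
    return std::fmod(x, y);
}
```

The two agree when both operands are positive but diverge for negative dividends: for `x = -5.5` and `y = 2`, the floored version gives `0.5` and the truncating version gives `-1.5`.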
|
|
* Fix asdouble in C++ prelude.
* Fix small typo
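As background for `asdouble`, the HLSL intrinsic builds a `double` from two 32-bit halves (low bits first, then high bits). A C++ sketch of that bit cast and its inverse is shown below; the names are hypothetical and the prelude's actual code may differ.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical pair of 32-bit halves of a 64-bit pattern.
struct SketchUintPair
{
    uint32_t low;
    uint32_t high;
};

// Combine low and high halves and reinterpret the 64 bits as a double.
inline double sketchAsDouble(uint32_t low, uint32_t high)
{
    const uint64_t bits = (static_cast<uint64_t>(high) << 32) | low;
    double result;
    std::memcpy(&result, &bits, sizeof(result)); // bit cast without UB
    return result;
}

// Inverse: split a double's bit pattern into low and high halves.
inline SketchUintPair sketchAsUint(double value)
{
    uint64_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return {static_cast<uint32_t>(bits & 0xFFFFFFFFu), static_cast<uint32_t>(bits >> 32)};
}
```

For example, `sketchAsDouble(0u, 0x3FF00000u)` is `1.0`, because `0x3FF0000000000000` is the IEEE 754 bit pattern of 1.0.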
|
|
* Add support for '=' when defining a name in test.
* Add support for double intrinsics.
* Add support for asdouble
Add findOrAddInst - used instead of findOrEmitHoistableInst, for nominal instructions.
Support cloning of string literals.
C++ working on more compute tests.
* Constant buffer support in reflection.
Fixed debugging into source for generated C++.
buffer-layout.slang works.
* Added cpu test result.
* Remove some commented out code.
Comment on next fixes.
* Improvements to reflection CPU code.
* C++ working with ByteAddressBuffer.
* Enabled more compute tests for CPU.
* Enabled more compute tests on CPU.
Added support for [] style access to a vector.
* Enabled more CPU compute tests.
* Handling of buffer-type-splitting.slang
Named buffers can be paths to resources
* Fix some warnings, remove some dead code.
* Fix problem with verification of number of operands for asuint/asint as they can have 1 or 3 operands. asdouble takes 2.
* Fix handling in MemoryArena around aligned allocations. `_allocateAlignedFromNewBlock` assumed the allocated block had the requested alignment and so did not correct the start address.
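The kind of start-address correction described in the last item can be sketched as a round-up to the requested alignment. This is a generic illustration with a hypothetical function name, assuming power-of-two alignments; it is not the MemoryArena code.

```cpp
#include <cstddef>
#include <cstdint>

// Round an address up to the next multiple of `alignment`.
// Assumption: `alignment` is a non-zero power of two.
inline uintptr_t sketchAlignUp(uintptr_t address, size_t alignment)
{
    const uintptr_t mask = static_cast<uintptr_t>(alignment) - 1;
    return (address + mask) & ~mask;
}
```

A freshly allocated block is only guaranteed to meet the allocator's default alignment, so a stricter request still needs this adjustment (and enough slack in the block size to absorb it).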
|
|
* Added setDownstreamCompilerPrelude
Renamed setPassThroughPath to setDownstreamCompilerPath.
Fixed tests.
Added prelude directory & code to TestToolUtil to setup default preludes for testing/command line apis.
* Fix merge problem
* Remove hacks to make prelude work by adding a search path as no longer needed with 'user prelude'.
* Split up prelude into scalar intrinsics, and types.
Use slang.h for main header.
slang-cpp-prelude.h can now just include what it needs (relative to prelude directory) and define the few remaining things/work arounds.
* Fix typo.
|