summaryrefslogtreecommitdiffstats
path: root/source/compiler-core/slang-nvrtc-compiler.cpp
Commit message (Collapse)AuthorAge
* Enable ccache for self-hosted runner (#8345)Gangzheng Tong2025-09-03
| | | | | | | | Related to https://github.com/shader-slang/slang/issues/6728 --------- Co-authored-by: slangbot <ellieh+slangbot@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Enabling optix ci pipeline (#7311)Harsh Aggarwal (NVIDIA)2025-06-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Revert "Disable OptiX tests by default. (#1331)" This reverts commit e45f8c1f49855cebe90b6722324ec24146ff5a3d. * Enable optix submodule to build Add support for default entry points in compilation Implemented logic to check for defined entry points in the module when no explicit entry points are provided. If found, these entry points are added to the `specializedEntryPoints` list, with the assumption that no specialization is needed for them at this time. * Disable optix if cuda is not enabled * Add submodule OptixSDK path in search * Distinguish user-explicit vs auto-detected SLANG_ENABLE_OPTIX When SLANG_ENABLE_OPTIX is explicitly set by user and CUDA is not available, show SEND_ERROR to maintain strict validation. When OptiX is auto-detected (e.g., local submodule present) but CUDA unavailable, gracefully disable with STATUS message to allow builds to continue. This addresses review feedback to keep error for explicit requests while handling auto-detection gracefully. * Apply CMake formatting to SLANG_ENABLE_OPTIX validation logic * revert: slang-rhi changes as those are merged independently as in PR # slang-rhi#400
* Fix cuda_fp16 header issue (#7476)Mukund Keshava2025-06-19
| | | | | | | | | | | | * Fix cuda_fp16 header issue This fixes the header include issue. However, it does not add a diagnostic. That will be added in a separate PR. * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Bump the min shader model version for CUDA 12.8+ (#7415)Jay Kwak2025-06-14
|
* try to find cuda headers in /usr/include (#6800)Simon Kallweit2025-04-14
| | | | Co-authored-by: Simon Kallweit <simon.kallweit@gmail.com> Co-authored-by: Yong He <yonghe@outlook.com>
* update slang-rhi + fix nvrtc options (#6080)Simon Kallweit2025-01-14
| | | | | | | | | | | | | | | * update slang-rhi * pass --dopt=on to nvrtc when enabling debug information * fix leaks in slang-rhi * update slang-rhi * only use --dopt when available --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Find OptiX headers (#6071)Simon Kallweit2025-01-13
| | | | | | | * add support for finding OptiX headers * add documentation * fix linux path
* Move switch statement bodies to their own lines (#5493)Ellie Hermaszewska2024-11-05
| | | | | | | | | * Move switch statement bodies to their own lines * format --------- Co-authored-by: Yong He <yonghe@outlook.com>
* formatEllie Hermaszewska2024-10-29
| | | | | | | * format * Minor test fixes * enable checking cpp format in ci
* preparation for clang format (#5422)Ellie Hermaszewska2024-10-29
| | | | | | | | | | | | | | | | | | | | | | | * Clang-format excludes * Add .clang-format * Don't clang-format in external * Missing includes and forward declarations * Replace wonky include-once macro name * neaten include naming * Add clang-format to formatting script * Add xargs and diff to required binaries * add clang-format to ci formatting check * Add max version check to formatting script * temporarily disable checking formatting for cpp files
* Move the file public header files to `include` dir (#4636)kaizhangNV2024-07-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Move the file public header files to `include` dir Close the issue (#4635). Move the following headers files to a `include` dir located at root dir of slang repo: slang-com-helper.h -> include/slang-com-helper.h slang-com-ptr.h -> include/slang-com-ptr.h slang-gfx.h -> include/slang-gfx.h slang.h -> include/slang.h Change cmake/SlangTarget.cmake to add include path to every target, and change the source file to use "#include <slang.h>" to include the public headers. The source code update is by the script like follow: ``` fileNames_slang=$(grep -r "\".*slang\.h\"" source/ -l) for fileName in "${fileNames_slang[@]}" do echo "$fileName" sed -i "s/\".*slang\.h\"/\"slang\.h\"/" $fileName done ``` * Fix the test issues * Fix cpu test issues by adding include seach path * Update cmake to not add include path for every target Also change "#include <slang.h>" to "include "slang.h" " to make the coding style consistent with other slang code. * Change public include to private include for unit-test and slang-glslang
* CUDA: Fixes for NVRTC 12.x and warp mask ambiguity; adds CC 8.x warp ↵Neil Bickford2023-11-07
| | | | | | | | | | | reduction intrinsics. (#3314) * CUDA: Fixes for NVRTC 12.x, warp mask ambiguity; add reduction partial specializations. * Fixes running NVRTC on CUDA 12 without a specified profile (used in testing, e.g. `slang-test -api cuda -category wave`) * Fixes mask ambiguity between getting the lane index from threadId.x and a full mask of threads. * Adds partial specializations for compute capability 8.x warp reduction intrinsics. * Fix formatting
* Fix literals needing cast (#3039)jsmall-nvidia2023-08-01
| | | | | | | | | | | | * Cast integer literals. * Fix expected output. * For CUDA, search global instructions to see what types are used. Improve lookup for fp16 header in CUDA. * Fix issue with f16tof32 * Small improvement around finding used base types.
* Generalize collectInductionValues (#3031)Ellie Hermaszewska2023-08-01
| | | | | | | | | | | | | | | | | * Generalize collectInductionValues * Support affine transformations of loop index as induction variables * Test for generalized induction value collection * Neaten inductive variable finding * Store the type of implication success when finding inductive variables * Test that loop induction finding does not alway succeed * Support chains of additions and branches of additions in induction variable finding * Use c++17 for downstream compilers
* Source embedding for output (#2889)jsmall-nvidia2023-05-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Fix typo. * Add options for source embedding. * Small improvements. * Working with tests. * Add check for supported language types for embedding. * Try and remove assume warning. * Fix warning on MacOSX. * Some extra checking around Style::Text. * Some small improvements to docs/handling for headers extensions. * Fix md issue. * Small fixes around zeroing partial last element. * Another small fix.... * Small improvement in hex conversion. * Add an assert for unsignedness.
* Artifact simplification (#2781)jsmall-nvidia2023-04-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP simplifying artifact interface. * Use ContainedKind. * Remove LazyCastableList. Use ContainedKind for find. * Remove ICastableList. * Remove need for ICastableList. * Remove IArtifactContainer. * Small fixes. * Small improvements around Artifact. * Make explicit find is for *representations* that can cast. Fix bug in handling casting in lookup. * Made associated items artifacts too. * Small fixes. * Small improvements around writing a container.
* Add struct version to DownstreamCompiler::CompileOptions interface (#2692)jsmall-nvidia2023-03-10
| | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Add versioning to CompileOptions for DownstreamCompiler so we can add new options without breaking binary interface. * Use builtin offset of directly.
* Improvements to NVRTC diagnostic parsing (#2504)jsmall-nvidia2022-11-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Float16 support for C++/CPU based targets with f16tof32 and f32tof16. * Small correction around INF/NAN handling for f32tof16 * Small improvement to f16tof32 * Disable CUDA test for now. * Improvements to NVRTC diagnostic parsing. Handle compilerSpecificArgs. Fix issue with terminating nul ending up in diagnostic string. * Improved NVRTC error parsing. f32tof16 and f16tof32 work in principal on CUDA. * Small update to test, although they remain disabled. * Work around SLANG_E_NOT_AVAILABLE being turned into ignored, when a legitimate error is found * A more tightly constrained fallback NVRTC diagnostic parsing. * Remove CharUtil, as not neeed. Co-authored-by: Yong He <yonghe@outlook.com>
* Passing source to Downstream compilation as artifacts (#2382)jsmall-nvidia2022-09-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Make DownstreamCompileOptions use POD types. * CharSliceAllocator -> SliceAllocator Added SliceConverter CharSliceCaster -> SliceCaster * First attempt at zero terminating around blobs. * Fix clang warning. * Add SlangTerminatedChars Make Blob implementations support it. Make most blobs 'terminated'. * Fix bug setting up sourceFiles for CommandLineDownstreamCompiler. * Traffic in TerminatedCharSlice for sourceFiles. Use ArtifactDesc to generate temporary file names for source. * Fix typo in testing for shared library/C++. * Working with source being passed as artifacts to DownstreamCompiler. * Use artifacts in SourceManager/SourceFile. * Support infering extension from the original file extension.
* DownstreamCompileOptions using POD types (#2381)jsmall-nvidia2022-08-26
| | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Make DownstreamCompileOptions use POD types. * CharSliceAllocator -> SliceAllocator Added SliceConverter CharSliceCaster -> SliceCaster * First attempt at zero terminating around blobs. * Fix clang warning. * Add SlangTerminatedChars Make Blob implementations support it. Make most blobs 'terminated'. * Fix bug setting up sourceFiles for CommandLineDownstreamCompiler. * Traffic in TerminatedCharSlice for sourceFiles. Use ArtifactDesc to generate temporary file names for source. * Fix typo in testing for shared library/C++.
* Improve binary compatibility for DownstreamCompiler types (#2371)jsmall-nvidia2022-08-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP replacing DownstreamCompileResult. * First attempt at replacing DownstreamCompileResult with IArtifact and associated types. * Small renaming around CharSlice. * ICastable -> ISlangCastable Added IClonable Fix issue with cloning in ArtifactDiagnostics. * Only add the blob if one is defined in DXC. * Guard adding blob representation. * Make cloneInterface available across code base. Set enums backing type for ArtifactDiagnostic. * Added ::create for ArtifactDiagnostics. * Use SemanticVersion for DownstreamCompilerDesc. Set sizes for enum types. * Depreciate old incompatible CompileOptions. Change SemanticVersion use 32 bits for the patch. * Split out CastableUtil. * Change IDownstreamCompiler to use canConvert and convert to use artifact types. * Fix typos. * Fix typo bug. Allow trafficing in PTX assembly/binaries * struct DownstreamCompilerBaseUtil -> struct DownstreamCompilerUtilBase Co-authored-by: Yong He <yonghe@outlook.com>
* Replace DownstreamCompileResult with Artifact (#2369)jsmall-nvidia2022-08-22
| | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP replacing DownstreamCompileResult. * First attempt at replacing DownstreamCompileResult with IArtifact and associated types. * Small renaming around CharSlice. * ICastable -> ISlangCastable Added IClonable Fix issue with cloning in ArtifactDiagnostics. * Only add the blob if one is defined in DXC. * Guard adding blob representation. * Make cloneInterface available across code base. Set enums backing type for ArtifactDiagnostic. * Added ::create for ArtifactDiagnostics.
* IDownstreamCompiler interface (#2361)jsmall-nvidia2022-08-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * WIP with hierarchical enums. * Some small fixes and improvements around artifact desc related types. * Improvements around hierarchical enum. * Fixes to get Artifact types refactor to be able to execute tests. * Attempt to better categorize PTX. * Work around for potentially unused function warning. * Typo fix. * Simplify Artifact header. * Small improvements around Artifact kind/payload/style. * Added IDestroyable/ICastable * Add IArtifactList. * First impl of IArtifactUtil. * Use the ICastable interface for IArtifactRepresentation. * Added IArtifactRepresentation & IArtifactAssociated. * Add SLANG_OVERRIDE to avoid gcc/clang warning. * Fix calling convention issue on win32. * Fix missing SLANG_OVERRIDE. * First attempt at file abstraction around Artifact. * Added creation of lock file. * Move functionality for determining file paths to the IArtifactUtil. Add casting to ICastable. * Added some casting/finding mechanisms. * Simplify IArtifact interface, and use Items for file reps. * Fix problem with libraries on DXIL. * Split out ArtifactRepresentation. * Move ArtifactDesc functionality to ArtifactDescUtil. ArtifactInfoUtil becomes ArtifactDescUtil. * Split implementations from the interfaces for Artifact. * Use TypeTextUtil for target name outputting. * Add artifact impls. * Add ICastableList * Added UnknownCastableAdapter * Make ISlangSharedLibrary derive from ICastable, and remain backwards compatible with slang-llvm. * Refactor Representation on Artifact. * Make our ISlangBlobs also derive from ICastable. Make ISlangBlob atomic ref counted. * Split out CastableList and related types, and placed in core. * Small fixes around IArtifact. Improve IArtifact docs. First impl of getChildren for IArtifact. * Documentation improvements for Artifact related types. * Fix typo. * Special case adding a ICastableList to a LazyCastableList. * Small simplification of LazyCastableList, by adding State member. * Removed the ILockFile interface because IFileArtifactRepresentation can be used. * Implement DiagnosticsArtifactRepresentation. * Added PostEmitMetadataArtifactRepresentation * Add searching by predicate. Added handling of accessing Artifact as ISharedLibrary * Fix typo. * Add find to IArtifacgtList. Fix some missing SLANG_NO_THROW. * Small improvements around ArtifactDesc types. * Another small change around ArtifactKind. * Some more shuffling of ArtifactDesc. * Make IArtifact castable Remove IArtifactList Made IArtifactContainer derive from IArtifact Made ModuleLibrary atomic ref counted/given IModuleLibrary interface. * Must call _requireChildren before any children access. * Fix missing SLANG_MCALL on castAs. * Fix missing SLANG_OVERRIDE. * Added IArtifactHandler * Use ICastable for basis of scope/lookup. * WIP first attempt to remove CompileResult. * Fix support for for downstream compiler shared library adapter. * Fix issues found when replacing CompileResult. * Fix typo. * Fix getting items form 'significant' member of an Artifact. * Split out ArtifactUtil & ArtifactHandler. * Work around for problem on Visual studio. * Improve searching. * Add missing files. * Split out Artifact associated types. Don't produce a container by default - use associated for 'metadata'. * Remove no longer used ArtifactPayload type. * Generalized converting representations. Small improvements to artifacts. * Fix intermediate dumping issue. * Removed #if 0 out CompileResult. Remove DownstreamCompileResult maybeDumpIntermediate. * Pull out functionality for dumping artifact output into ArtifactOutputUtil Fixed a bug in naming files based on ArtifactDesc. * std::atomic issue. * Pull out types from DownstreamCompile to simplify moving to an interface. * Fix typo. * Use IDownstreamCompiler interface. Split out DownstreamCompilerUtil and DownstreamCompilerSet. * Update projects. * Fix missing SLANG_MCALL. * Fix calling convention of IDownstreamCompiler impls. * Split out binary work arounds into a dep1.cpp/h * Small reorganising around DownstreamCompilerInfo. * Remove Desc library functionality to DownstreamCompilerUtil. * Expand IDiagnostics interface. Rename associated impls with Impl suffix. * Fix outputting as text bug. Some small improvements. * Add fix around prefix for dumping. Improved how handling for extensions work form ArtifactDesc. * Dump assembly if available. * Simplify some of Dep1 definitions.
* Artifact and ICastable (#2351)jsmall-nvidia2022-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP with hierarchical enums. * Some small fixes and improvements around artifact desc related types. * Improvements around hierarchical enum. * Fixes to get Artifact types refactor to be able to execute tests. * Attempt to better categorize PTX. * Work around for potentially unused function warning. * Typo fix. * Simplify Artifact header. * Small improvements around Artifact kind/payload/style. * Added IDestroyable/ICastable * Add IArtifactList. * First impl of IArtifactUtil. * Use the ICastable interface for IArtifactRepresentation. * Added IArtifactRepresentation & IArtifactAssociated. * Add SLANG_OVERRIDE to avoid gcc/clang warning. * Fix calling convention issue on win32. * Fix missing SLANG_OVERRIDE. * First attempt at file abstraction around Artifact. * Added creation of lock file. * Move functionality for determining file paths to the IArtifactUtil. Add casting to ICastable. * Added some casting/finding mechanisms. * Simplify IArtifact interface, and use Items for file reps. * Fix problem with libraries on DXIL. * Split out ArtifactRepresentation. * Move ArtifactDesc functionality to ArtifactDescUtil. ArtifactInfoUtil becomes ArtifactDescUtil. * Split implementations from the interfaces for Artifact. * Use TypeTextUtil for target name outputting. * Add artifact impls. * Add ICastableList * Added UnknownCastableAdapter * Make ISlangSharedLibrary derive from ICastable, and remain backwards compatible with slang-llvm. * Refactor Representation on Artifact. * Make our ISlangBlobs also derive from ICastable. Make ISlangBlob atomic ref counted. * Fix typo.
* Hotfix/osx fix (#1994)jsmall-nvidia2021-10-28
| | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * OSX 10.15 produced error/warning for not using ref.
* Glslang as DownstreamCompiler (#1846)jsmall-nvidia2021-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP Fxc as downstream compiler. * First pass FXC downstream compiler working. * GCC compile fix. * Fix FXC parsing issue. * Special case filesystem access. * Use StringUtil getSlice. * Fix isses with not emitting source for FXC. * WIP on DXC. * Small fixes for DXBC handling. * Removed DXC from ParseDiagnosticUtil (can use generic) Try to improve output for notes from DXC. * FIrst pass of Glslang as DownstreamCompiler * Fix some problems with parsing for glslang replacement. * Add slang-glslang-compiler.cpp/.h * Fix downstream for spir-v output. * dissassemble -> disassemble * Fix typo and improve some naming/comments. * Remove getSharedLibrary from DownstreamCompiler * Removed some no longer used diagnostics.
* FXC as DownstreamCompiler (#1844)jsmall-nvidia2021-05-14
| | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP Fxc as downstream compiler. * First pass FXC downstream compiler working. * GCC compile fix. * Fix FXC parsing issue. * Special case filesystem access. * Use StringUtil getSlice. * Fix isses with not emitting source for FXC. * Small fixes for DXBC handling.
* Simplify CommandLine by removing Escaping (#1825)jsmall-nvidia2021-04-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Split out StringEscapeUtil. * Added StringEscapeUtil. * Fix typo in unix quoting type. * Small comment improvements. * Try to fix linux linking issue. * Fix typo. * Attempt to fix linux link issue. * Update VS proj even though nothing really changed. * Fix another typo issue. * Fix for windows issue. Fixed bug. * Make separate Utils for escaping. * Fix typo. * Split out into StringEscapeHandler. * Windows shell does handle removing quotes (so remove code to remove them). * Handle unescaping if not initiating using the shell. * Slight improvement around shell like decoding. * Simplify command extraction. * Add shared-library category type. * Fix bug in command extraction. * Typo in transcendental category. * Enable unit-test on in smoke test category. * Make parsing failing output as a failing test. * Fixes for transcendental tests. Disable tests that do not work. * Changed category parsing. * Removed the TestResult parameter from _gatherTestsForFile. Made testsList only output. * Remove testing if all tests were disabled. * Make args of CommandLine always unescaped. * Add category. * Don't need escaping on unix/linux. * Remove some no longer used functions.
* Support for escaped paths in tools (#1823)jsmall-nvidia2021-04-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Split out StringEscapeUtil. * Added StringEscapeUtil. * Fix typo in unix quoting type. * Small comment improvements. * Try to fix linux linking issue. * Fix typo. * Attempt to fix linux link issue. * Update VS proj even though nothing really changed. * Fix another typo issue. * Fix for windows issue. Fixed bug. * Make separate Utils for escaping. * Fix typo. * Split out into StringEscapeHandler. * Windows shell does handle removing quotes (so remove code to remove them). * Handle unescaping if not initiating using the shell. * Slight improvement around shell like decoding. * Simplify command extraction. * Add shared-library category type. * Fix bug in command extraction. * Typo in transcendental category. * Enable unit-test on in smoke test category. * Make parsing failing output as a failing test. * Fixes for transcendental tests. Disable tests that do not work. * Changed category parsing. * Removed the TestResult parameter from _gatherTestsForFile. Made testsList only output. * Remove testing if all tests were disabled. * Fix typo. * Disable path canonical test on linux because CI issue.
* Preliminary CUDA Half support (#1808)jsmall-nvidia2021-04-23
| | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP CUDA half support. * Working support for half on CUDA - requires cuda_fp16.h and associated files can be found. * Fix for win32 for unused funcs. * Fix for Clang. * Hack to disable unused local function warning.
* NVTRC 64 bit requirement (#1792)jsmall-nvidia2021-04-14
| | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Made 64 bit requirement clearer for CUDA/PTX. Added compile assert for NVRTC for 32 bit OS, that are known to require 64bit. * Disabled location of NVRTC on linux/win x86. * Only restrict NVRTC on windows * Simplify checking on 64 bit windows for NVRTC. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
* Added compiler-core project (#1775)jsmall-nvidia2021-04-01
* #include an absolute path didn't work - because paths were taken to always be relative. * Split out compiler-core initially with just slang-source-loc.cpp * More lexer, name, token to compiler-core. * Split Lexer and Core diagnostics. * Move slang-file-system to core. * Add slang-file-system to core. * More DownstreamCompiler into compiler-core * Fix typo. * Add compiler-core to bootstrap proj. * Small fixes to premake * For linux try with compiler-core * Remove compiler-core from examples. * Added NameConventionUtil to compiler-core * Add global function to CharUtil to *hopefully* avoid linking issue. * Hack to make linkage of CharUtil work on linux.