summaryrefslogtreecommitdiffstats
path: root/source/slang/slang-emit-cpp.cpp
Commit message (Collapse)AuthorAge
* Refactor _Texture to constrain on texel types. (#6115)Yong He2025-01-17
| | | | | | | | | * Refactor _Texture to constrain on texel types. * Fix tests. * Fix. * Disable glsl texture test because rhi can't run it correctly.
* Lower varying parameters as pointers instead of SSA values. (#5919)Yong He2025-01-07
| | | | | * Add executable test on matrix-typed vertex input. * Fix emit logic of matrix layout qualifier. * Pass fragment shader varying input by constref to allow EvaluateAttributeAtCentroid etc. to be implemented correctly.
* Add packed 8bit builtin types (#5939)Darren Wihandi2024-12-26
| | | | | * Add packed bytes builtin type * fix test
* Move switch statement bodies to their own lines (#5493)Ellie Hermaszewska2024-11-05
| | | | | | | | | * Move switch statement bodies to their own lines * format --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Write only texture types. (#5454)Yong He2024-10-30
| | | | | | | | | | | | | | | | * Add support for write-only textures. * Fix capabilities. * Fix implementation. * Fix. * format code --------- Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com> Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* formatEllie Hermaszewska2024-10-29
| | | | | | | * format * Minor test fixes * enable checking cpp format in ci
* Initial `Atomic<T>` type implementation. (#5125)Yong He2024-09-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Initial Atomic<T> type implementation. * Update design doc. * Fix. * Add test. * Fixes and add tests. * Fix WGSL. * Fix glsl. * Fix metal. * experiemnt with github metal. * experiment github metal 2 * github metal experiment 3 * experiment with github metal 4. * experiment with metal 5. * experiment 7. * metal experiment 8. * Fix metal tests. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Remove use of `G0` and `__target_intrinsic` in stdlib. (#4170)Yong He2024-05-14
| | | | | | | * Remove use of `G0` and `__target_intrinsic` in stdlib. * Fix. * Fix calling intrinsic in global scope.
* Add host shared library target. (#4098)Yong He2024-05-03
| | | | | | | | | | | | | * Add host shared library target. * Attempt fix. * Fix warnings. * try fix. * Fix test. * Fix.
* Support emitting generic target_intrinsic type. (#3745)Yong He2024-03-12
|
* Refactor compiler option representations. (#3598)Yong He2024-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | * Refactor compiler option representation. * Fix binary compatibility. * Add a test for specifying compiler options at link time. * Fix binary compatibility. * Fix binary compatibility. * Fix backward compatibility on matrix layout. * Fix. * Fix. * Fix. * Fix gfx. * Fix gfx. * Fix dynamic dispatch. * Polish.
* Add per-buffer data layout control. (#3551)Yong He2024-02-05
| | | | | | | | | | | | | * Add per-buffer data layout control. Fixes #3534. * Fixes. * Robustness. * Update test. * Fix.
* Unify stdlib `Texture` types into one generic type. (#3327)Yong He2023-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Unify Texture types in stdlib into 1 generic type. * Fixes. * Fix. * Fixes. * Fix reflection. * Fix binding reflection. * Add gather intrinsics. * Fix gather intrinsics. * Fix texture type toText. * Fix intrinsic. * fix cuda intrinsic. * Fix project files. * cleanup. * Fix. * Fix. * Fix sampler feedback test. * Fix getDimension intrinsics. * Fix spirv sample image intrinsics. * Fix test. * Fix GLSL intrinsic. * Cleanup. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Add `requirePrelude()` intrinsic function. (#3250)Yong He2023-09-29
| | | | | | | | | * Add `requirePrelude()` intrinsic function. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Support `constref` parameters passing. (#3249)Yong He2023-09-28
| | | | | | | | | | | | | | | * Support `constref` parameters passing. * Fix. * Fix. * Add test and diagnostic on mix use of __constref and no_diff. * check for [constref] on differentiable member method. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Avoid make copies of __ref parameters when doing autodiff. (#3242)Yong He2023-09-27
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Add `target_switch` and `intrinsic_asm` statement. (#3154)Yong He2023-08-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add `target_switch` and `__intrinsic_asm` statement. * Cleanup. * WaveGetActiveMask, WaveGetActiveMask, WaveCountBits. * WaveIsFirstLane. * More wave intrinsics. * wave intrinsics. * merge fix. * Fix. * Fix. * Update test. * update test. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Lower all ByteAddressBuffer uses for SPIRV. (#3143)Yong He2023-08-23
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Support per field matrix layout (#3101)Yong He2023-08-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | * Support per field matrix layout * Fix warnings. * Fix. * Fix tests. * Fix spiv gen. * Fix. * More test fixes. * Fix. * Run only GPU tests on self-hosted servers. * Remove -use-glsl-matrix-layout-modifier. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Fix native string emit for CUDA/Cpp backend. (#2980)Yong He2023-07-12
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Fix for operator assignment issue (#2951)jsmall-nvidia2023-06-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | * WIP handling LValue coercion via LValueImplicitCast * Need to have the ptr type for the cast. * Casting conversion working on C++. * Make the LValue casts record if in or in/out as we can produce better code if we know the difference. * WIP LValueCast pass * Fix tests so we don't fail because downstream compilers detect use of uninitialized variable. * Do conversions through through tmp for l-value scenarios that can't work other ways. * Fix a typo. * Change diagnostic implicit-cast-lvalue for a type that still exhibits the issue. * Add matrix test. * Added a bit more clarity around LValue casting choices. * Small comment improvements. Improvements based on comments on PR. * Use findOuterGeneric.
* Various fixes for autodiff and slangpy. (#2876)Yong He2023-05-09
| | | | | | | | | | | | | * Various fixes for autodiff and slangpy. * Fix cuda code gen for `select`. * Fix getBuildTagString(). * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Fix most of the disabled warnings on gcc/clang (#2839)Ellie Hermaszewska2023-04-26
|
* StringBuilder to lowerCamel (#2840)jsmall-nvidia2023-04-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP lowerCamel Dictionary. * WIP more lowerCamel fixes for Dictionary. * Add/Remove/Clear * GetValue/Contains * Fix tabs in dictionary. Count -> getCount * Fix fields with caps. * Key -> key Value -> value Use m_ for members where appropriate. Use lowerCamel in linked list. * Some small fixes/improvements to Dictionary. * Kick CI. * Small tidy on String. * Append -> append * ToString -> toString ProduceString -> produceString * Small fixes. * StringToXXX -> stringToXXX * Fix typo introduced by Append -> append. * Made intToAscii do reversal at the end. --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Dictionary using lowerCamel (#2835)jsmall-nvidia2023-04-25
| | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP lowerCamel Dictionary. * WIP more lowerCamel fixes for Dictionary. * Add/Remove/Clear * GetValue/Contains * Fix tabs in dictionary. Count -> getCount * Fix fields with caps. * Key -> key Value -> value Use m_ for members where appropriate. Use lowerCamel in linked list. * Some small fixes/improvements to Dictionary. * Kick CI.
* Fix missing `f` suffix for float lits in CUDA backend. (#2791)Yong He2023-04-11
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Emit simpler vector element access code. (#2770)Yong He2023-04-03
| | | | | | | | | * Emit simpler vector element access code * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Translate all composed types into tuple types in pyBind. (#2744)Yong He2023-03-27
| | | | | | | | | | | * Translate all composed types into tuple types in pyBind. * Delete temp file. * Fix get tuple element code emit logic. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Add PyTorch C++ binding generation. (#2734)Yong He2023-03-26
| | | | | | | | | * Add PyTorch C++ binding generation. * fix --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Support for producing SourceMap on emit (#2707)jsmall-nvidia2023-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP source map. * Split out handling of RttiTypeFuncs to a map type. * Make RttiTypeFuncsMap hold default impls. * Slightly more sophisticated RttiTypeFuncsMap * Source map decoding. * Fix tabs. * Fix asserts due to negative values. * Use less obscure mechanisms in SourceMap. * Source map decoding. Simplifying SourceMap usage. * First attempt at ouputting a source map as part of emit. * Added support for -source-map option. SourceMap is added to the artifact.
* More control flow simplifications. (#2673)Yong He2023-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * More control flow and Phi param simplifications. * Fix. * Fix gcc error. * Fix. * More IR cleanup. * Fix bug in phi param dce + ifelse simplify. * Propagate and DCE side-effect-free functions. * Enhance CFG simplifcation to remove loops with no side effects. * Fix. * Fixes. * Fix tests. Add [__AlwaysFoldIntoUseSite] for rayPayloadLocation. * More cleanup. * Fixes. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Overhaul global inst deduplication and cpp/cuda backend. (#2654)Yong He2023-02-16
| | | | | | | | | * Overhaul global inst deduplication and cpp/cuda backend. * Update IR documentation. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Fix code generation for matrix reshape. (#2568)Yong He2022-12-14
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Rename IR opcodes to unify style. (#2556)Yong He2022-12-07
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Rework differential conformance dictionary checking. (#2483)Yong He2022-11-02
| | | | | | | * Rework differential conformance dictionary checking. * Revert space changes. Co-authored-by: Yong He <yhe@nvidia.com>
* Run simple compute kernel in gfx-smoke test. (#2400)Yong He2022-09-15
|
* Language feature: pointer sized int types. (#2401)Yong He2022-09-15
| | | | | | | | | | | | | | | | | | | | | * Language feature: pointer sized int types. * Fix. * small change to test. * Fix stdlib. * Fix. * Fix. * Add typedef for `size_t` in stdlib. * Fix test. * Add `intptr_t::size` constant. Co-authored-by: Yong He <yhe@nvidia.com>
* Assorted Artifact improvements (#2374)jsmall-nvidia2022-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP replacing DownstreamCompileResult. * First attempt at replacing DownstreamCompileResult with IArtifact and associated types. * Small renaming around CharSlice. * ICastable -> ISlangCastable Added IClonable Fix issue with cloning in ArtifactDiagnostics. * Only add the blob if one is defined in DXC. * Guard adding blob representation. * Make cloneInterface available across code base. Set enums backing type for ArtifactDiagnostic. * Added ::create for ArtifactDiagnostics. * Use SemanticVersion for DownstreamCompilerDesc. Set sizes for enum types. * Depreciate old incompatible CompileOptions. Change SemanticVersion use 32 bits for the patch. * Split out CastableUtil. * Change IDownstreamCompiler to use canConvert and convert to use artifact types. * Fix typos. * Fix typo bug. Allow trafficing in PTX assembly/binaries * struct DownstreamCompilerBaseUtil -> struct DownstreamCompilerUtilBase * Add other riff types. * Small fix around artifact kind. * Make using slices instead of strings explicit on atomic ref counted types. (not complete). Added IArtifactList. Use IArtifactList to hold the 'associated' files. Use IUnknown for scoping for atomic ref counting. Small naming improvements. * Make artifact not use String in construction (so it owns contents). * Calculate compile products as artifacts. * Small improvements around ArtifactDesc. * Use ICastableList for list of artifacts and remove IArtifactList.
* Call `gfx` in slang program. (#2370)Yong He2022-08-20
|
* Artifact split interface and implementation (#2349)jsmall-nvidia2022-08-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP with hierarchical enums. * Some small fixes and improvements around artifact desc related types. * Improvements around hierarchical enum. * Fixes to get Artifact types refactor to be able to execute tests. * Attempt to better categorize PTX. * Work around for potentially unused function warning. * Typo fix. * Simplify Artifact header. * Small improvements around Artifact kind/payload/style. * Added IDestroyable/ICastable * Add IArtifactList. * First impl of IArtifactUtil. * Use the ICastable interface for IArtifactRepresentation. * Added IArtifactRepresentation & IArtifactAssociated. * Add SLANG_OVERRIDE to avoid gcc/clang warning. * Fix calling convention issue on win32. * Fix missing SLANG_OVERRIDE. * First attempt at file abstraction around Artifact. * Added creation of lock file. * Move functionality for determining file paths to the IArtifactUtil. Add casting to ICastable. * Added some casting/finding mechanisms. * Simplify IArtifact interface, and use Items for file reps. * Fix problem with libraries on DXIL. * Split out ArtifactRepresentation. * Move ArtifactDesc functionality to ArtifactDescUtil. ArtifactInfoUtil becomes ArtifactDescUtil. * Split implementations from the interfaces for Artifact. * Use TypeTextUtil for target name outputting. * Add artifact impls.
* Allow `class` to implement COM interface, [DLLExport] (#2338)Yong He2022-07-25
| | | | | | | * Allow `class` to implement COM interface, [DLLExport] * Fix [COM] usage in tests and examples with UUIDs. Co-authored-by: Yong He <yhe@nvidia.com>
* Support `class` types. (#2321)Yong He2022-07-12
| | | | | | | | | * Support `class` types. * Ignore class-keyword test * Fix codereview comments and warnings. Co-authored-by: Yong He <yhe@nvidia.com>
* Native call marshalling for ComPtr parameters and return values. (#2305)Yong He2022-06-29
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Add CPU executable compile test (#2278)Yong He2022-06-21
| | | | | | | | | | | * Add cpu executable compile test * Fix. * Fix permission on linux * retrigger build Co-authored-by: Yong He <yhe@nvidia.com>
* Actual global support (#2262)jsmall-nvidia2022-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Use TerminatedUnownedStringSlice for literals in output C++. * Remove Escape/Unescape functions used in slang-token-reader.cpp Add target type of 'host-cpp' etc to map to the target types. * Fix some corner cases around string encoding. * Added unit test for string escaping. Fixed some assorted escaping bugs. * Updated test output. * Added decode test. * Stop using hex output, to get around 'greedy' aspect. Use octal instead. * Added HostHostCallable Small changes to use ArtifactDesc/Info instead of large switches. * Fix C++ emit to handle arbitrary function export. * Add options handling for callable without an output being specified. * Can compile with COM interface. Added example using com interface. * Use the IR Ptr type instead of hack in C++ emit for interfaces. * Fix issue with outputting the COM call when ptr is used. * Fix crash issue on compilation failure. * Add support for __global. * Added `ActualGlobalRate` Added special handling around globals and COM interfaces. Tested out in cpu-com-example. * Fix typo in NodeBase. * Support for accessing globals by name working. * Check that actual global initialization is working. * Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar. * Remove context. Only create replacement if needed. * Split out COM host-callable into a unit-test. * host-callable com testing on C++and llvm. * Comment around the COM ptr replacement. * Disable com test on vs 32 bit. Fix C++ prelude * Disable 32 bit targets testing com host-callable. * Use JSON parsing to locate VS version. * Need platform detection in C++prelude. * Fix com host callable test for LLVM. * Work around for not being able to include "targetConditionals.h"
* COM interfaces with host callable (#2258)jsmall-nvidia2022-06-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Use TerminatedUnownedStringSlice for literals in output C++. * Remove Escape/Unescape functions used in slang-token-reader.cpp Add target type of 'host-cpp' etc to map to the target types. * Fix some corner cases around string encoding. * Added unit test for string escaping. Fixed some assorted escaping bugs. * Updated test output. * Added decode test. * Stop using hex output, to get around 'greedy' aspect. Use octal instead. * Added HostHostCallable Small changes to use ArtifactDesc/Info instead of large switches. * Fix C++ emit to handle arbitrary function export. * Add options handling for callable without an output being specified. * Can compile with COM interface. Added example using com interface. * Use the IR Ptr type instead of hack in C++ emit for interfaces. * Fix issue with outputting the COM call when ptr is used. * Fix crash issue on compilation failure.
* Added NativeStringType (#2252)jsmall-nvidia2022-05-27
| | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Use TerminatedUnownedStringSlice for literals in output C++. * Remove Escape/Unescape functions used in slang-token-reader.cpp Add target type of 'host-cpp' etc to map to the target types. * Fix some corner cases around string encoding. * Added unit test for string escaping. Fixed some assorted escaping bugs. * Updated test output. * Added decode test. * Stop using hex output, to get around 'greedy' aspect. Use octal instead.
* Refactor prelude emit (#2236)jsmall-nvidia2022-05-17
| | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Refactor how prelude output works in emit. * Small improvement to emit output. * Move around comment on target specific language directives based on review. Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
* Initial support for COM interface in host code. (#2230)Yong He2022-05-10
| | | | Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
* Use IR pass to eliminate phi nodes (#2226)Theresa Foley2022-05-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Use IR pass to eliminate phi nodes "Phi nodes" are one of the key contrivances that makes SSA (Static Single Assignment) form work. Because SSA is so great for compiler IRs, we kind of need to deal with phi nodes, but they also get in the way because they don't have a direct analog in most lower-level machine ISAs or execution models, nor in most of the high-level languages a transpiler wants to emit. As a result a compiler like ours needs to be able to eliminate the phi nodes from a program as part of generating output code. (For any clever people noting that SPIR-V supports phi nodes directly: yes, it does. It doesn't need to and it probably *shouldn't*. Anybody involved in the decision-making knows my reasoning, and anybody else should feel free to ask me if they want the lecture. Anyway...) The basic idea of elimiating phi nodes is simple enough. We replace each phi node with a temporary variable. Uses of the phi use values loaded from the temporary. The operation of the phi itself (assigning a value based on the branch taken) amounts to an assignment into the temporary. Previously, the Slang compiler dealt with phi nodes very late in the process of generating code: in the middle of emitting strings of source code in a high-level language like HLSL or GLSL. Doing the work that late in compilation has two big drawbacks: 1. Our ability to emit clean and/or optimal code is limited because we may not be able to make certain changes to the IR, or because we cannot make use of additional information like a dominator tree that might be available at other points in compilation. 2. Any other IR passes that relate to temporary variables won't be able to see the variables that we generate for phi nodes. This could raise issues with correctness (e.g., if we want to compute live-range information for *all* temporary variables), or performance (we have no way to run additional IR optimization passes after phis are eliminated). This change addresses these problems by making the elimination of phi nodes an explicit IR pass. Additional optimizations can easily be run after this pass (although we'd need to be careful not to run passes that could end up introducing new phis). The pass makes use of the information available to it to try to produce code that will emit to "clean" HLSL/GLSL. The core of the pass is in `slang-ir-eliminate-phis.cpp`, and is heavily commented, so I won't describe the approach in detail here. There are two related issues that came up, though: First, it turned out that our emit logic for local variables (`IRVar` instructions) wasn't using the function we'd defined named `emitVar()`. One worrying consequence of that oversight was that the `precise` modifier would impact generated HLSL/GLSL for variables that turned into SSA values (including phi nodes), but *not* for local variables that had not been SSA'd (or that had been SSA'd and then de-SSA'd). This change also fixes that bug; it is unclear how widespread the impact of the original issue might be. Second, generating explicit IR temporaries for phi nodes exposed a pre-existing bug in the `slang-ir-restructure-scoping` pass. That pass basically detects cases where we have an instruction `I` with a use `U` such that the use follows the rules of SSA form ("def dominates use," meaning `I` dominations `U`), but does not follow the more restrictive scoping rules of high-level-language output (where a value computed "inside" a loop is not automatically visible to code outside the loop just because it dominates that code). That pass did not correctly account for the case where `I` was a temporary variable. It seems that case could not arise before now because we didn't have any passes that would move `var`, `load`, or `store` operations out of the basic block they started in. The fix for that pass was relatively simple, and will make the whole thing more robust in case we add more aggressive optimizations later. * fixup: expected test output