| author | Yong He <yonghe@outlook.com> | 2023-08-28 21:24:49 -0700 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2023-08-28 21:24:49 -0700 |
| commit | c787c4b82ba76f87069911f203eb192060b5264f (patch) | |
| tree | 2aa98326ce8d4c9f2011d79ee9f6d34db14716e7 /source/slang/slang-emit-cuda.cpp | |
| parent | af363c02bf0fa2502c14f454965adff87170ff15 (diff) | |
Add `target_switch` and `intrinsic_asm` statement. (#3154)
* Add `target_switch` and `__intrinsic_asm` statement.
* Cleanup.
* WaveGetActiveMask, WaveGetActiveMask, WaveCountBits.
* WaveIsFirstLane.
* More wave intrinsics.
* wave intrinsics.
* merge fix.
* Fix.
* Fix.
* Update test.
* update test.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
Diffstat (limited to 'source/slang/slang-emit-cuda.cpp')
| -rw-r--r-- | source/slang/slang-emit-cuda.cpp | 6 |
1 files changed, 3 insertions, 3 deletions
diff --git a/source/slang/slang-emit-cuda.cpp b/source/slang/slang-emit-cuda.cpp
index 3e974f60e..fa0e3c7aa 100644
--- a/source/slang/slang-emit-cuda.cpp
+++ b/source/slang/slang-emit-cuda.cpp
@@ -436,17 +436,17 @@ void CUDASourceEmitter::_emitInitializerList(IRType* elementType, IRUse* operand
     m_writer->emit("\n}");
 }
 
-void CUDASourceEmitter::emitIntrinsicCallExprImpl(IRCall* inst, IRTargetIntrinsicDecoration* targetIntrinsic, EmitOpInfo const& inOuterPrec)
+void CUDASourceEmitter::emitIntrinsicCallExprImpl(IRCall* inst, UnownedStringSlice intrinsicDefinition, EmitOpInfo const& inOuterPrec)
 {
     // This works around the problem, where some intrinsics that require the "half" type enabled don't use the half/float16_t type.
     // For example `f16tof32` can operate on float16_t *and* uint. If the input is uint, although we are
     // using the half feature (as far as CUDA is concerned), the half/float16_t type is not visible/directly used.
-    if (targetIntrinsic->getDefinition().startsWith(toSlice("__half")))
+    if (intrinsicDefinition.startsWith(toSlice("__half")))
     {
         m_extensionTracker->requireBaseType(BaseType::Half);
     }
 
-    Super::emitIntrinsicCallExprImpl(inst, targetIntrinsic, inOuterPrec);
+    Super::emitIntrinsicCallExprImpl(inst, intrinsicDefinition, inOuterPrec);
 }
 
 bool CUDASourceEmitter::tryEmitInstStmtImpl(IRInst* inst)
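
The check this hunk preserves is a plain prefix test on the intrinsic's definition text. The following is a minimal standalone sketch of that idea, not Slang's actual code: `HalfRequirementTracker`, `definitionNeedsHalf`, and `noteIntrinsic` are hypothetical names, and `std::string_view` stands in for `UnownedStringSlice`.

```cpp
#include <string_view>

// Hypothetical stand-in for the emitter's extension tracker; not Slang's real class.
struct HalfRequirementTracker
{
    bool requiresHalf = false;
};

// Mirrors the prefix check in the diff: a definition string that begins with
// "__half" implies the half feature is needed, even when no operand of the
// call has a half/float16_t type.
inline bool definitionNeedsHalf(std::string_view intrinsicDefinition)
{
    constexpr std::string_view prefix = "__half";
    // substr clamps the length, so shorter inputs simply compare unequal.
    return intrinsicDefinition.substr(0, prefix.size()) == prefix;
}

// Records the requirement on the (hypothetical) tracker before emission.
inline void noteIntrinsic(HalfRequirementTracker& tracker, std::string_view intrinsicDefinition)
{
    if (definitionNeedsHalf(intrinsicDefinition))
        tracker.requiresHalf = true;
}
```

Because the function now takes the definition text directly rather than a decoration object, a caller that already holds the string (for example, one produced by a statement-level construct rather than a decoration) can reuse the same check. That reading of the motivation is an inference from the signature change, not something the commit message states.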
