From 52b91231cdadc048f93b224f5035759cf1a96eaa Mon Sep 17 00:00:00 2001
From: Sai Praveen Bangaru <31557731+saipraveenb25@users.noreply.github.com>
Date: Tue, 30 Apr 2024 16:05:33 -0400
Subject: Added diagnostics & built-in type lowering for `[CUDAKernel]` functions (#4042)

* Added diagnostics & built-in type lowering for `[CUDAKernel]` functions

This PR adds
- Diagnostics for non-void return from a cuda kernel entry point
- Diagnostics for using differentiable types in a differentiable cuda kernel entry point
- Logic for converting built-in types (float3, float3x3, etc..) to portable struct types and unpacks the parameter back into a built-in type on the CUDA side. This is because built-in types have different implementations in CUDA & CPP targets, which causes signature mis-match when linking.

* Fix error codes

* Add ability to lower structs and arrays that contain built-in types.

+ Added tests
+ Fix issue where the host-side was not marshalling data to lowered types.

* Update slang-ir-pytorch-cpp-binding.cpp

---------

Co-authored-by: Yong He
---
 source/slang/slang-emit.cpp | 3 +++
 1 file changed, 3 insertions(+)

(limited to 'source/slang/slang-emit.cpp')

diff --git a/source/slang/slang-emit.cpp b/source/slang/slang-emit.cpp
index 1fa04b4be..afdd37fce 100644
--- a/source/slang/slang-emit.cpp
+++ b/source/slang/slang-emit.cpp
@@ -471,10 +471,13 @@ Result linkAndOptimizeIR(
     switch (target)
     {
     case CodeGenTarget::PyTorchCppBinding:
+        generateHostFunctionsForAutoBindCuda(irModule, sink);
+        lowerBuiltinTypesForKernelEntryPoints(irModule, sink);
         generatePyTorchCppBinding(irModule, sink);
         handleAutoBindNames(irModule);
         break;
     case CodeGenTarget::CUDASource:
+        lowerBuiltinTypesForKernelEntryPoints(irModule, sink);
         removeTorchKernels(irModule);
         handleAutoBindNames(irModule);
         break;
-- 
cgit v1.2.3