|
|
* Added diagnostics & built-in type lowering for `[CUDAKernel]` functions
This PR adds
- Diagnostics for non-void return from a cuda kernel entry point
- Diagnostics for using differentiable types in a differentiable cuda kernel entry point
- Logic for converting built-in types (float3, float3x3, etc..) to portable struct types and unpacks the parameter back into a built-in type on the CUDA side. This is because built-in types have different implementations in CUDA & CPP targets, which causes signature mis-match when linking.
* Fix error codes
* Add ability to lower structs and arrays that contain built-in types.
+ Added tests
+ Fix issue where the host-side was not marshalling data to lowered types.
* Update slang-ir-pytorch-cpp-binding.cpp
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|