| Commit message | Author | Age |
| |
* Move switch statement bodies to their own lines
* format
---------
Co-authored-by: Yong He <yonghe@outlook.com>
| |
* format
* Minor test fixes
* enable checking cpp format in ci
| |
This avoids a problem with broadcasted tensors. Our tensor-view platform is designed to allow unrestricted access to tensor memory, while broadcasted tensors were designed for read-only use cases. Writing into a broadcasted tensor requires reallocation, which Slang is not designed to do.
For now, we enforce contiguity on any tensor that has a 0 stride.
In the future, we will introduce a ConstTensorView object to allow such tensors to be used as inputs.
This patch also propagates name-hint information through structs and arrays of tensors, to allow sensible names in the error messages (before this, the error messages showed temporary inst numbers, which are nearly impossible to debug).
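The zero-stride rule described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code from this patch: the helper names (`contiguous_strides`, `has_zero_stride`, `strides_after_enforcing_contiguity`) are hypothetical, and strides are taken to be in elements, as reported by PyTorch's `Tensor.stride()`.

```python
from typing import Sequence, Tuple


def contiguous_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """Row-major (C-order) strides, in elements, for a dense tensor of `shape`."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def has_zero_stride(strides: Sequence[int]) -> bool:
    """A 0 stride means several indices alias the same memory (a broadcasted view)."""
    return any(s == 0 for s in strides)


def strides_after_enforcing_contiguity(
    shape: Sequence[int], strides: Sequence[int]
) -> Tuple[int, ...]:
    """Hypothetical policy: tensors with any 0 stride are replaced by a dense copy.

    Returns the strides the tensor would have afterwards. Tensors without a
    0 stride are left untouched, even if they are not contiguous.
    """
    if has_zero_stride(strides):
        return contiguous_strides(shape)
    return tuple(strides)


# A (1, 3) tensor expanded to (4, 3) has strides (0, 1); a dense copy has (3, 1).
broadcasted = strides_after_enforcing_contiguity((4, 3), (0, 1))
# A transposed view has no 0 stride, so it is passed through unchanged.
transposed = strides_after_enforcing_contiguity((4, 3), (1, 4))
```

Copying only when a 0 stride is present leaves ordinary strided views writable in place, while broadcasted inputs get their own backing memory before a kernel can write to them.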
| |
non-CUDA/Torch targets (#4364)
* Remove `IRHLSLExportDecoration` and `IRKeepAliveDecoration` for non-CUDA/Torch targets
* Update hlsl-torch-cross-compile.slang
| |
* fix all Clang-14 warnings
* remove a clang-14 warning fix because it is an MSVC warning
| |
* Added diagnostics & built-in type lowering for `[CUDAKernel]` functions
This PR adds:
- Diagnostics for a non-void return from a CUDA kernel entry point
- Diagnostics for using differentiable types in a differentiable CUDA kernel entry point
- Logic for converting built-in types (float3, float3x3, etc.) to portable struct types and unpacking the parameters back into built-in types on the CUDA side. This is needed because built-in types have different implementations in the CUDA and CPP targets, which causes a signature mismatch when linking.
* Fix error codes
* Add ability to lower structs and arrays that contain built-in types.
+ Added tests
+ Fix issue where the host-side was not marshalling data to lowered types.
* Update slang-ir-pytorch-cpp-binding.cpp
---------
Co-authored-by: Yong He <yonghe@outlook.com>
| |
* Support visibility control and default to `internal`.
* Fix wip.
* Fixes.
* Fix.
* Fix test.
* Add legacy language detection and compatibility for existing code.
* Add doc.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
| |
* Correctly use removeTrivialSingleIterationLoops during simplification
* remove unused variables
* Fix invalid fallthrough
---------
Co-authored-by: Yong He <yonghe@outlook.com>
| |
* Various fixes
* Remove unused parameter
* Update slang-ir-loop-unroll.cpp
---------
Co-authored-by: Yong He <yonghe@outlook.com>
| |
* Make dynamic cast transparent through `IRAttributedType`.
* Add [CUDAXxx] variant of attributes.
* Support marshaling of vector types.
* Wrap cuda kernels in `extern "C"` block.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
| |
* Remove unused variable
* Remove unused variable
* Remove unused if bindings
| |
exporting type information (#3209)
* Initial: add a DiffTensor impl
* Auto-binding and diff tensor implementations now work
* Refactored diff-tensor implementation + added py-export for struct types
* Cleanup
* Update slang-ir-pytorch-cpp-binding.cpp
* Updated test names
* Update autodiff-data-flow.slang.expected
* Add more versions of load/store & default generic args for DiffTensorView.
* Add diagnostic for default generic arg and more tests
* Add more `[AutoPyBind]` tests
| |
Co-authored-by: Yong He <yhe@nvidia.com>
| |
* Add slangpy doc, fix cuda prelude.
* more bug fix.
* fix.
* fix.
* More fix.
* fix.
* f
* fix prelude.
* update prelude.
* update doc
* Update prelude.
* add zeros_like
* update doc.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
| |
* Translate all composed types into tuple types in pyBind.
* Delete temp file.
* Fix get tuple element code emit logic.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
* Add PyTorch C++ binding generation.
* fix
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|