From efeda20ec280771348887ae4eb498a8b158c9c0c Mon Sep 17 00:00:00 2001
From: Yong He
Date: Thu, 30 Mar 2023 14:34:54 -0700
Subject: Fix stdlib definitions for tensor interlocked methods. (#2761)

Co-authored-by: Yong He
---
 docs/user-guide/a1-02-slangpy.md | 79 ++++++++++++++++++++++++++++++++--------
 1 file changed, 63 insertions(+), 16 deletions(-)

diff --git a/docs/user-guide/a1-02-slangpy.md b/docs/user-guide/a1-02-slangpy.md
index 6a9b8baa3..8ee5233ba 100644
--- a/docs/user-guide/a1-02-slangpy.md
+++ b/docs/user-guide/a1-02-slangpy.md
@@ -226,53 +226,100 @@ The `TensorView` represents the GPU view of a tensor and provides accesors to
 Following is a list of builtin methods and attributes for PyTorch interop.

-### `static TorchTensor TorchTensor.alloc(uint x, uint y, ...)`
+### `TorchTensor` methods
+
+#### `static TorchTensor TorchTensor.alloc(uint x, uint y, ...)`
 Allocates a new PyTorch tensor with the given dimensions.

-### `static TorchTensor TorchTensor.zerosLike(TorchTensor other)`
+#### `static TorchTensor TorchTensor.emptyLike(TorchTensor other)`
+Allocates a new PyTorch tensor that has the same dimensions as `other` without initializing it.
+
+#### `static TorchTensor TorchTensor.zerosLike(TorchTensor other)`
 Allocates a new PyTorch tensor that has the same dimensions as `other` and initialize it to zero.

-### `uint TorchTensor.dims()`
+#### `uint TorchTensor.dims()`
 Returns the tensor's dimension count.

-### `uint TorchTensor.size(int dim)`
+#### `uint TorchTensor.size(int dim)`
 Returns the tensor's size (in number of elements) at `dim`.

-### `uint TorchTensor.stride(int dim)`
+#### `uint TorchTensor.stride(int dim)`
 Returns the tensor's stride (in bytes) at `dim`.

-### `TensorView.operator[uint x, uint y, ...]`
+### `TensorView` methods
+
+#### `TensorView.operator[uint x, uint y, ...]`
 Provide an accessor to data content in a tensor.

-### `TensorView.operator[vector index]`
+#### `TensorView.operator[vector index]`
 Provide an accessor to data content in a tensor, indexed by a uint vector. `tensor[uint3(1,2,3)]` is equivalent to `tensor[1,2,3]`.

-### `uint TensorView.dims()`
+#### `uint TensorView.dims()`
 Returns the tensor's dimension count.

-### `uint TensorView.size(int dim)`
+#### `uint TensorView.size(int dim)`
 Returns the tensor's size (in number of elements) at `dim`.

-### `uint TensorView.stride(int dim)`
+#### `uint TensorView.stride(int dim)`
 Returns the tensor's stride (in bytes) at `dim`.

-### `cudaThreadIdx()`
+#### `void TensorView.fillZero()`
+Fills the tensor with zeros. Modifies the tensor in-place.
+
+#### `void TensorView.fillValue(T value)`
+Fills the tensor with the specified value, modifies the tensor in-place.
+
+#### `T* TensorView.data_ptr_at(vector index)`
+Returns a pointer to the element at `index`.
+
+#### `void TensorView.InterlockedAdd(vector index, T val, out T oldVal)`
+Atomically add `val` to element at `index`.
+
+#### `void TensorView.InterlockedMin(vector index, T val, out T oldVal)`
+Atomically computes the min of `val` and the element at `index`. Available for 32 and 64 bit integer types only.
+
+#### `void TensorView.InterlockedMax(vector index, T val, out T oldVal)`
+Atomically computes the max of `val` and the element at `index`. Available for 32 and 64 bit integer types only.
+
+#### `void TensorView.InterlockedAnd(vector index, T val, out T oldVal)`
+Atomically computes the bitwise and of `val` and the element at `index`. Available for 32 and 64 bit integer types only.
+
+#### `void TensorView.InterlockedOr(vector index, T val, out T oldVal)`
+Atomically computes the bitwise or of `val` and the element at `index`. Available for 32 and 64 bit integer types only.
+
+#### `void TensorView.InterlockedXor(vector index, T val, out T oldVal)`
+Atomically computes the bitwise xor of `val` and the element at `index`. Available for 32 and 64 bit integer types only.
+
+#### `void TensorView.InterlockedExchange(vector index, T val, out T oldVal)`
+Atomically swaps `val` into the element at `index`. Available for `float` and 32/64 bit integer types only.
+
+#### `void TensorView.InterlockedCompareExchange(vector index, T compare, T val)`
+Atomically swaps `val` into the element at `index` if the element equals to `compare`. Available for `float` and 32/64 bit integer types only.
+
+### CUDA Support Functions
+
+#### `cudaThreadIdx()`
 Returns the `threadIdx` variable in CUDA.

-### `cudaBlockIdx()`
+#### `cudaBlockIdx()`
 Returns the `blockIdx` variable in CUDA.

-### `cudaBlockDim()`
+#### `cudaBlockDim()`
 Returns the `blockDim` variable in CUDA.

-### `[CudaKernel]` attribute
+#### `syncTorchCudaStream()`
+Waits for all pending CUDA kernel executions to complete on host.
+
+### Attributes for PyTorch Interop
+
+#### `[CudaKernel]` attribute
 Marks a function as a CUDA kernel (maps to a `__global__` function)

-### `[TorchEntryPoint]` attribute
+#### `[TorchEntryPoint]` attribute
 Marks a function for export to Python. Functions marked with `[TorchEntryPoint]` will be accessible from a loaded module returned by `slangpy.loadModule`.

-### `[CudaDeviceExport]` attribute
+#### `[CudaDeviceExport]` attribute
 Marks a function as a cuda device function, and ensures the compiler to include it in the generated cuda source.

 ## Type Marshalling Between Slang and Python
--
cgit v1.2.3