Template Class device_tensor
Defined in File device_tensor.h
Class Documentation
-
template<typename T, uint32_t NumDims>
class device_tensor
A CUDA device tensor buffer object that references a tensor held in CUDA device memory so that the host can manipulate it.
Note
See https://www.quantstart.com/articles/Matrix-Matrix-Multiplication-on-the-GPU-with-Nvidia-CUDA/
- Template Parameters:
T – the tensor data type
NumDims – number of tensor dimensions
Public Functions
-
inline device_tensor(size_t sizeDim0, size_t sizeDim1, size_t sizeDim2, size_t sizeDim3, const T *data = nullptr)
-
inline device_tensor(device_tensor &&other)
Move constructor.
- Parameters:
other – the tensor to move from
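The move semantics of a buffer-owning class like this one can be illustrated with a minimal sketch. It is not the actual device_tensor implementation: `owning_buffer` and `move_transfers_ownership` are hypothetical names, host memory stands in for the CUDA allocation so the code runs without a GPU, and the deleted copy operations are an assumption rather than something this page documents.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a device buffer owner. It allocates host memory
// so the sketch runs without a GPU.
template <typename T>
class owning_buffer {
public:
    explicit owning_buffer(std::size_t count)
        : data_(new T[count]()), count_(count) {}

    // Assumption: copying is disabled, because two owners would otherwise
    // free the same allocation.
    owning_buffer(const owning_buffer&) = delete;
    owning_buffer& operator=(const owning_buffer&) = delete;

    // Move constructor: take over the pointer and leave `other` empty.
    owning_buffer(owning_buffer&& other) noexcept
        : data_(std::exchange(other.data_, nullptr)),
          count_(std::exchange(other.count_, 0)) {}

    // Move assignment: release the current buffer, then take over `other`'s.
    owning_buffer& operator=(owning_buffer&& other) noexcept {
        if (this != &other) {
            delete[] data_;
            data_ = std::exchange(other.data_, nullptr);
            count_ = std::exchange(other.count_, 0);
        }
        return *this;
    }

    ~owning_buffer() { delete[] data_; }

    const T* Get() const { return data_; }
    std::size_t GetCount() const { return count_; }

private:
    T* data_;
    std::size_t count_;
};

// Moves a buffer and reports whether ownership was transferred.
inline bool move_transfers_ownership() {
    owning_buffer<float> a(8);
    const float* original = a.Get();
    assert(original != nullptr);
    owning_buffer<float> b(std::move(a));
    return b.Get() == original && b.GetCount() == 8 &&
           a.Get() == nullptr && a.GetCount() == 0;
}
```

After the move, the destination holds the original pointer and the source is left empty, so only one object releases the buffer on destruction.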
-
inline device_tensor &operator=(device_tensor &&other) noexcept
-
inline ~device_tensor()
-
__host__ inline const T *Get() const
Gets the raw pointer to device buffer memory.
- Returns:
T const*
-
__host__ inline Eigen::DenseIndex GetDimensionSize(int dim) const
-
__host__ inline size_t GetCount() const
Gets the total number of elements in the tensor.
- Returns:
size_t
-
__host__ inline size_t GetSize() const
Gets the total number of elements in the tensor.
- Returns:
size_t
-
__host__ inline size_t GetByteSize() const
Gets the total number of bytes in the memory buffer.
- Returns:
size_t
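The relationship between the element count and the byte size can be sketched as follows. This assumes a densely packed buffer, where the count is the product of the dimension sizes and the byte size is the count times `sizeof(T)`; the page does not confirm that layout. `element_count` and `byte_size` are hypothetical helpers, not part of device_tensor.h.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: total number of elements, computed as the product of
// all dimension sizes.
template <std::size_t NumDims>
std::size_t element_count(const std::array<std::size_t, NumDims>& dims) {
    std::size_t count = 1;
    for (std::size_t d : dims) {
        count *= d;
    }
    return count;
}

// Hypothetical helper: total number of bytes, assuming elements of type T
// are stored contiguously with no padding.
template <typename T, std::size_t NumDims>
std::size_t byte_size(const std::array<std::size_t, NumDims>& dims) {
    return element_count(dims) * sizeof(T);
}
```

Under these assumptions, a float tensor with dimensions 2 x 3 x 4 x 5 has 120 elements and occupies `120 * sizeof(float)` bytes.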
-
__host__ inline void SetDataSync(const T *data)
Performs a synchronous copy of data into the device buffer.
- Parameters:
data – pointer to the source data to copy
-
__host__ inline void SetDataAsync(const T *data)
Sets the data asynchronously from host memory.
- Parameters:
data – pointer to the source data in host memory
-
__host__ inline void SetDataAsync(const device_tensor<T, NumDims> &data)