Template Class device_matrix
Defined in File device_matrix.h
Class Documentation
-
template<typename T>
class icrar::cuda::device_matrix : private noncopyable
A CUDA device buffer object that represents a memory buffer on a CUDA device. The matrix size is fixed at construction and can only be changed through move semantics, that is, by moving another matrix into this one.
- Note
See https://www.quantstart.com/articles/Matrix-Matrix-Multiplication-on-the-GPU-with-Nvidia-CUDA/
- Note
See https://forums.developer.nvidia.com/t/guide-cudamalloc3d-and-cudaarrays/23421
- Template Parameters
T: numeric type
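The ownership rules described above (non-copyable, size fixed at construction, "resize" only by moving) can be illustrated with a host-only sketch. This is not the library's implementation: `host_matrix_sketch` and `resized_example` are hypothetical names, and a `std::vector` stands in for the device allocation so the example runs without CUDA.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical host-only stand-in for device_matrix<T>. A std::vector plays
// the role of the device allocation; only the ownership rules are modelled.
template <typename T>
class host_matrix_sketch {
    std::size_t m_rows = 0;
    std::size_t m_cols = 0;
    std::vector<T> m_buffer;

public:
    host_matrix_sketch() = default;
    host_matrix_sketch(std::size_t rows, std::size_t cols)
        : m_rows(rows), m_cols(cols), m_buffer(rows * cols) {}

    // Copying is disabled, mirroring the private noncopyable base.
    host_matrix_sketch(const host_matrix_sketch&) = delete;
    host_matrix_sketch& operator=(const host_matrix_sketch&) = delete;

    // Moving transfers the buffer and leaves the source empty (0 x 0).
    host_matrix_sketch(host_matrix_sketch&& other) noexcept
        : m_rows(std::exchange(other.m_rows, 0)),
          m_cols(std::exchange(other.m_cols, 0)),
          m_buffer(std::move(other.m_buffer)) {}

    host_matrix_sketch& operator=(host_matrix_sketch&& other) noexcept {
        if (this != &other) {
            m_rows = std::exchange(other.m_rows, 0);
            m_cols = std::exchange(other.m_cols, 0);
            m_buffer = std::move(other.m_buffer);
            other.m_buffer.clear();
        }
        return *this;
    }

    std::size_t GetRows() const { return m_rows; }
    std::size_t GetCols() const { return m_cols; }
};

static_assert(!std::is_copy_constructible<host_matrix_sketch<double>>::value,
              "the sketch must not be copyable");

// "Resizing" means constructing a new matrix and move-assigning it.
inline host_matrix_sketch<double> resized_example() {
    host_matrix_sketch<double> m(2, 3);
    m = host_matrix_sketch<double>(4, 5);  // old buffer released, new one adopted
    assert(m.GetRows() == 4 && m.GetCols() == 5);
    return m;
}
```

With the real class the same pattern would presumably read `m = device_matrix<double>(4, 5);`, using only the constructor and move assignment operator documented below.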
Public Functions
-
device_matrix()
Default constructor.
-
device_matrix(device_matrix &&other) noexcept
Move constructor.
- Parameters
other: the matrix to move from
-
device_matrix &operator=(device_matrix &&other) noexcept
Move assignment operator.
- Return
device_matrix&
- Parameters
other: the matrix to move from
-
device_matrix(size_t rows, size_t cols, const T *data = nullptr)
Construct a new device matrix of fixed size, initialized asynchronously if data is provided.
- Parameters
rows: number of rows
cols: number of columns
data: contiguous column-major data of size rows*cols*sizeof(T) bytes to copy to the device
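The `data` argument is expected in contiguous column-major order. The sketch below shows one way to prepare such a buffer on the host. The helper names `column_major_index`, `make_column_major` and `buffer_bytes` are hypothetical and not part of the library; the fill value `10*row + col` is chosen only to make the layout visible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of element (row, col) in a contiguous column-major buffer:
// each column occupies `rows` consecutive elements.
inline std::size_t column_major_index(std::size_t row, std::size_t col,
                                      std::size_t rows) {
    assert(row < rows);
    return col * rows + row;
}

// Builds a rows x cols host buffer in column-major order where
// element (row, col) holds the value 10*row + col.
inline std::vector<double> make_column_major(std::size_t rows, std::size_t cols) {
    std::vector<double> data(rows * cols);
    for (std::size_t col = 0; col < cols; ++col) {
        for (std::size_t row = 0; row < rows; ++row) {
            data[column_major_index(row, col, rows)] =
                10.0 * static_cast<double>(row) + static_cast<double>(col);
        }
    }
    return data;
}

// Number of bytes the constructor is documented to read from `data`.
template <typename T>
constexpr std::size_t buffer_bytes(std::size_t rows, std::size_t cols) {
    return rows * cols * sizeof(T);
}
```

For a 2 x 3 matrix this produces the element order (0,0), (1,0), (0,1), (1,1), (0,2), (1,2). A buffer built this way could then be passed as `device_matrix<double>(2, 3, host.data())`. Because the copy is documented as asynchronous, it seems prudent to keep the host buffer alive until the device work has been synchronized.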
-
~device_matrix()
-
__host__ __device__ size_t GetRows() const
-
__host__ __device__ size_t GetCols() const
-
__host__ __device__ size_t GetCount() const
-
__host__ __device__ size_t GetSize() const
-
__host__ void SetZeroAsync()