Template Class device_vector
Defined in File device_vector.h
Class Documentation
- template<typename T> class icrar::cuda::device_vector : private noncopyable
A CUDA device buffer object that represents a memory buffer on a CUDA device.
- Note
See https://www.quantstart.com/articles/Matrix-Matrix-Multiplication-on-the-GPU-with-Nvidia-CUDA/
- Note
See https://forums.developer.nvidia.com/t/guide-cudamalloc3d-and-cudaarrays/23421
- Template Parameters
T: numeric type
Public Functions
- device_vector()
Default constructor.
- __host__ device_vector(device_vector &&other) noexcept
- __host__ device_vector &operator=(device_vector &&other) noexcept
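The `noncopyable` base together with the move constructor and move assignment suggests a move-only ownership model. The following is a minimal host-only sketch of that pattern, not the library's implementation: `host_buffer_sketch` and its `Get()` accessor are hypothetical names, and `std::malloc`/`std::free` stand in for whatever device allocation calls the real class uses.

```cpp
#include <cstddef>
#include <cstdlib>
#include <type_traits>
#include <utility>

// Hypothetical host-only stand-in for a move-only buffer of numeric T.
// std::malloc/std::free take the place of device allocation calls (assumption).
template <typename T>
class host_buffer_sketch {
    std::size_t m_count = 0;
    T* m_data = nullptr;

public:
    host_buffer_sketch() = default;

    explicit host_buffer_sketch(std::size_t count)
        : m_count(count),
          m_data(static_cast<T*>(std::malloc(count * sizeof(T)))) {}

    // Copying is disabled so two objects never own the same allocation.
    host_buffer_sketch(const host_buffer_sketch&) = delete;
    host_buffer_sketch& operator=(const host_buffer_sketch&) = delete;

    // Moving transfers ownership and leaves the source empty.
    host_buffer_sketch(host_buffer_sketch&& other) noexcept
        : m_count(std::exchange(other.m_count, 0)),
          m_data(std::exchange(other.m_data, nullptr)) {}

    host_buffer_sketch& operator=(host_buffer_sketch&& other) noexcept {
        if (this != &other) {
            std::free(m_data);  // release the allocation currently held
            m_count = std::exchange(other.m_count, 0);
            m_data = std::exchange(other.m_data, nullptr);
        }
        return *this;
    }

    ~host_buffer_sketch() { std::free(m_data); }  // free(nullptr) is a no-op

    std::size_t GetCount() const { return m_count; }
    const T* Get() const { return m_data; }
};

static_assert(!std::is_copy_constructible<host_buffer_sketch<float>>::value,
              "copying must be disabled");
static_assert(std::is_nothrow_move_constructible<host_buffer_sketch<float>>::value,
              "moving must be noexcept");
```

In this sketch, `host_buffer_sketch<float> b(std::move(a));` leaves `a` with a count of 0 and hands the original allocation to `b`.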
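- __host__ device_vector(size_t count, const T *data = nullptr)
Construct a new device buffer object.
- Parameters
count: number of elements in the buffer
data: optional pointer to the initial buffer contents (defaults to nullptr)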
- __host__ ~device_vector()
- __host__ __device__ size_t GetCount() const
Gets the number of elements in the buffer.
- __host__ __device__ size_t GetRows() const
Gets the number of rows in the column vector.
- __host__ __device__ constexpr size_t GetCols() const
- __host__ __device__ size_t GetSize() const
Gets the buffer size in bytes.
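The size accessors above are presumably related in a simple way. The sketch below assumes (this page does not state it) that `GetRows()` equals `GetCount()`, that `GetCols()` is always 1 for a column vector, and that `GetSize()` is `GetCount() * sizeof(T)`. `column_vector_shape` is a hypothetical host-only type used purely for illustration.

```cpp
#include <cstddef>

// Hypothetical helper mirroring the documented accessors; not part of the library.
template <typename T>
struct column_vector_shape {
    std::size_t count;

    constexpr std::size_t GetCount() const { return count; }
    // Assumption: a column vector has one row per element.
    constexpr std::size_t GetRows() const { return count; }
    // Assumption: a column vector always has exactly one column.
    constexpr std::size_t GetCols() const { return 1; }
    // Assumption: size in bytes is element count times element size.
    constexpr std::size_t GetSize() const { return count * sizeof(T); }
};

static_assert(column_vector_shape<double>{8}.GetSize() == 8 * sizeof(double),
              "byte size is count * sizeof(T) under the stated assumption");
```

Under these assumptions, `GetRows() * GetCols()` always equals `GetCount()`.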
- __host__ void SetZeroAsync()
- __host__ void SetDataSync(const T *data)
Performs a synchronous copy of data into the device buffer.
- Pre
data points to a buffer of byte size >= GetSize()
- Parameters
data: host data buffer to copy to the device
- __host__ void SetDataAsync(const T *data)
Sets buffer data from pinned host memory.
- Pre
Heap memory must be pinned using cudaHostRegister(…, cudaHostRegisterPortable)
- Pre
data points to a buffer of byte size >= GetSize()
- Parameters
data: host data buffer to copy to the device
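Both `SetDataSync` and `SetDataAsync` require that `data` points to at least `GetSize()` bytes. A caller might guard against undersized host buffers with a check like the one sketched below. `satisfies_size_precondition` is a hypothetical helper, not part of the library, and it assumes that `GetSize()` equals `GetCount() * sizeof(T)`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the library: returns true when `host` holds
// at least as many bytes as a device buffer of `count` elements of T would
// occupy (assuming the byte size is count * sizeof(T)).
template <typename T>
bool satisfies_size_precondition(const std::vector<T>& host, std::size_t count) {
    return host.size() * sizeof(T) >= count * sizeof(T);
}
```

A host vector of 8 floats would pass the check for a device buffer of 8 elements and fail it for one of 9.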