Template Class device_vector

Inheritance Relationships

Base Type

  • private noncopyable

Class Documentation

template<typename T>
class icrar::cuda::device_vector : private noncopyable

A CUDA device buffer object that represents a memory buffer on a CUDA device.

Note

See https://www.quantstart.com/articles/Matrix-Matrix-Multiplication-on-the-GPU-with-Nvidia-CUDA/

Note

See https://forums.developer.nvidia.com/t/guide-cudamalloc3d-and-cudaarrays/23421

Template Parameters
  • T: numeric type

Public Functions

device_vector()

Default constructor.

__host__ device_vector(device_vector &&other) noexcept
__host__ device_vector &operator=(device_vector &&other) noexcept
__host__ device_vector(size_t count, const T *data = nullptr)

Construct a new device buffer object.

Parameters
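  • count: number of elements of type T in the buffer

  • data: optional data buffer for host to device copying (defaults to nullptr)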

__host__ device_vector(const std::vector<T> &data)
__host__ device_vector(const Eigen::Matrix<T, Eigen::Dynamic, 1> &data)
__host__ device_vector(const Eigen::Matrix<T, 1, Eigen::Dynamic> &data)
__host__ ~device_vector()
__host__ __device__ T *Get()
__host__ __device__ const T *Get() const
__host__ __device__ size_t GetCount() const

Gets the number of elements in the buffer.

__host__ __device__ size_t GetRows() const

Gets the number of rows in the column vector.

__host__ __device__ constexpr size_t GetCols() const
__host__ __device__ size_t GetSize() const

Gets the buffer size in bytes.
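The relationship between GetCount() and GetSize() can be sketched on the host. This is a minimal illustration, assuming the buffer is a contiguous array of T with no padding, so that the byte size is the element count times sizeof(T). The helper byte_size is hypothetical and not part of device_vector.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper (not part of device_vector): byte size of a
// contiguous, unpadded buffer holding `count` elements of type T.
template <typename T>
std::size_t byte_size(std::size_t count)
{
    // Guard against overflow of count * sizeof(T) in debug builds.
    assert(count <= std::numeric_limits<std::size_t>::max() / sizeof(T));
    return count * sizeof(T);
}
```

Under that assumption, a device_vector<float> with GetCount() == 4 would report GetSize() == byte_size<float>(4).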

__host__ void SetZeroAsync()
__host__ void SetDataSync(const T *data)

Performs a synchronous copy of data into the device buffer.

Pre

data points to a buffer of byte size >= GetSize()

Parameters
  • data: data buffer for host to device copying
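The size precondition can be checked on the host before calling SetDataSync. The sketch below is a hedged example: meets_size_precondition and checked_data are hypothetical helpers, and required_bytes stands in for the value returned by GetSize() on the target buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true when `host` holds at least `required_bytes`
// bytes, i.e. when host.data() satisfies the documented precondition for
// a device buffer whose GetSize() equals `required_bytes`.
template <typename T>
bool meets_size_precondition(const std::vector<T>& host, std::size_t required_bytes)
{
    return host.size() * sizeof(T) >= required_bytes;
}

// Hypothetical helper: returns host.data() after asserting the
// precondition in debug builds.
template <typename T>
const T* checked_data(const std::vector<T>& host, std::size_t required_bytes)
{
    assert(meets_size_precondition(host, required_bytes));
    return host.data();
}
```

A call site could then read buffer.SetDataSync(checked_data(host, buffer.GetSize())), where buffer is a device_vector<T> and host is a std::vector<T>.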

__host__ void SetDataAsync(const T *data)

Sets buffer data from pinned host memory.

Pre

Heap memory must be pinned using cudaHostRegister(…, cudaHostRegisterPortable)

Pre

data points to a buffer of byte size >= GetSize()

Parameters
  • data: data buffer for host to device copying
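One way to satisfy the pinning precondition is a scope guard that pins the host range on construction and unpins it on destruction. The sketch below is an assumption-laden illustration, not part of this library: pinned_scope is a hypothetical class, and the pin and unpin callables are injected so that the example compiles without the CUDA toolkit. In CUDA code they would typically wrap cudaHostRegister(ptr, bytes, cudaHostRegisterPortable) and cudaHostUnregister(ptr).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical RAII guard: pins a host memory range for its lifetime.
class pinned_scope
{
public:
    // pin returns true on success; with CUDA it would compare the result of
    // cudaHostRegister(ptr, bytes, cudaHostRegisterPortable) to cudaSuccess.
    using pin_fn = std::function<bool(void*, std::size_t)>;
    // unpin would call cudaHostUnregister(ptr) in CUDA code.
    using unpin_fn = std::function<void(void*)>;

    pinned_scope(void* ptr, std::size_t bytes, const pin_fn& pin, unpin_fn unpin)
        : m_ptr(ptr), m_unpin(std::move(unpin)), m_pinned(false)
    {
        assert(ptr != nullptr && bytes > 0);
        m_pinned = pin(ptr, bytes);
    }

    // Unpins only if pinning succeeded.
    ~pinned_scope()
    {
        if (m_pinned) { m_unpin(m_ptr); }
    }

    pinned_scope(const pinned_scope&) = delete;
    pinned_scope& operator=(const pinned_scope&) = delete;

    bool pinned() const { return m_pinned; }

private:
    void* m_ptr;
    unpin_fn m_unpin;
    bool m_pinned;
};
```

Because the copy is asynchronous, such a guard would need to stay alive until the copy has completed, for example until after a cudaDeviceSynchronize() call.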

__host__ void ToHost(T *out) const
__host__ void ToHost(std::vector<T> &out) const
__host__ void ToHost(Eigen::Matrix<T, Eigen::Dynamic, 1> &out) const
__host__ void ToHostAsync(T *out) const

Copies buffer data to pinned host memory.

Pre

Heap memory must be pinned using cudaHostRegister(…, cudaHostRegisterPortable)

Pre

out points to a buffer of byte size >= GetSize()

Parameters
  • out: data buffer for device to host copying

__host__ void ToHostAsync(std::vector<T> &out) const
__host__ void ToHostAsync(Eigen::Matrix<T, Eigen::Dynamic, 1> &out) const
__host__ Eigen::Matrix<T, Eigen::Dynamic, 1> ToHostAsync() const