Classes | |
class | Array |
A templated class for managing an array of data using a specified memory type. | |
class | CudaEvent |
class | CudaStream |
Typedefs | |
template<class T > | |
using | DeviceArray = Array<T, DeviceMemory> |
An array of specific type that is allocated on the device with cudaMalloc and freed with cudaFree. | |
template<class T > | |
using | PinnedHostArray = Array<T, PinnedHostMemory> |
A pinned array of specific type that is allocated on the host with cudaMallocHost and freed with cudaFreeHost. | |
Functions | |
template<class T > | |
void | copyDeviceToDevice (const T *src, T *dest, std::size_t n, CudaStream *=nullptr) |
Copies data from the CUDA device to the CUDA device. | |
template<class T > | |
void | copyDeviceToHost (const T *src, T *dest, std::size_t n, CudaStream *=nullptr) |
Copies data from the CUDA device to the host. | |
template<class T > | |
void | copyHostToDevice (const T *src, T *dest, std::size_t n, CudaStream *=nullptr) |
Copies data from the host to the CUDA device. | |
float | cudaEventElapsedTime (CudaEvent &begin, CudaEvent &end) |
Calculates the elapsed time between two CUDA events. | |
void | cudaEventRecord (CudaEvent &event, CudaStream &stream) |
Records a CUDA event. | |
template<class T >
using RooBatchCompute::CudaInterface::DeviceArray = Array<T, DeviceMemory>
An array of specific type that is allocated on the device with cudaMalloc and freed with cudaFree.
Definition at line 209 of file CudaInterface.h.
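The alias pattern used by DeviceArray and PinnedHostArray can be illustrated without a GPU. The following is a minimal host-only sketch, not the code in CudaInterface.h: `HostHeapMemory`, `HostArray`, and the members of this `Array` are hypothetical names chosen for illustration, and `std::malloc`/`std::free` stand in for `cudaMalloc`/`cudaFree`.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical memory policy: std::malloc/std::free stand in for cudaMalloc/cudaFree.
struct HostHeapMemory {
   static void *allocate(std::size_t nBytes)
   {
      void *ptr = std::malloc(nBytes);
      if (!ptr && nBytes != 0)
         throw std::bad_alloc();
      return ptr;
   }
   static void deallocate(void *ptr) { std::free(ptr); }
};

// Hypothetical RAII array: the Memory policy decides where the buffer lives.
// Elements are not constructed, so this sketch suits trivial types such as double.
template <class T, class Memory>
class Array {
public:
   explicit Array(std::size_t n) : _size(n), _data(static_cast<T *>(Memory::allocate(n * sizeof(T)))) {}

   std::size_t size() const { return _size; }
   T *data() { return _data.get(); }
   const T *data() const { return _data.get(); }

private:
   // The deleter hands the buffer back to the same policy that allocated it.
   struct Deleter {
      void operator()(T *ptr) const { Memory::deallocate(ptr); }
   };
   std::size_t _size;
   std::unique_ptr<T, Deleter> _data;
};

// Alias template in the same style as DeviceArray and PinnedHostArray.
template <class T>
using HostArray = Array<T, HostHeapMemory>;
```

With this sketch, `HostArray<double> arr(4);` allocates room for four doubles and releases it when `arr` goes out of scope. Swapping the policy type is the only change needed to move the buffer to a different kind of memory.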
template<class T >
using RooBatchCompute::CudaInterface::PinnedHostArray = Array<T, PinnedHostMemory>
A pinned array of specific type that is allocated on the host with cudaMallocHost and freed with cudaFreeHost.
The memory is "pinned", i.e. page-locked and accessible to the device for fast copying.
See also: cudaMallocHost on developer.download.nvidia.com.
Definition at line 218 of file CudaInterface.h.
template<class T >
void RooBatchCompute::CudaInterface::copyDeviceToDevice (const T *src, T *dest, std::size_t n, CudaStream *=nullptr)
Copies data from the CUDA device to the CUDA device.
[in] | src | Pointer to the source memory on the device. |
[out] | dest | Pointer to the destination memory on the device. |
[in] | n | Number of bytes to copy. |
[in] | stream | CudaStream for asynchronous memory transfer (optional). |
Definition at line 120 of file CudaInterface.h.
template<class T >
void RooBatchCompute::CudaInterface::copyDeviceToHost (const T *src, T *dest, std::size_t n, CudaStream *=nullptr)
Copies data from the CUDA device to the host.
[in] | src | Pointer to the source memory on the device. |
[out] | dest | Pointer to the destination memory on the host. |
[in] | n | Number of bytes to copy. |
[in] | stream | CudaStream for asynchronous memory transfer (optional). |
Definition at line 106 of file CudaInterface.h.
template<class T >
void RooBatchCompute::CudaInterface::copyHostToDevice (const T *src, T *dest, std::size_t n, CudaStream *=nullptr)
Copies data from the host to the CUDA device.
[in] | src | Pointer to the source memory on the host. |
[out] | dest | Pointer to the destination memory on the device. |
[in] | n | Number of bytes to copy. |
[in] | stream | CudaStream for asynchronous memory transfer (optional). |
Definition at line 92 of file CudaInterface.h.
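The optional trailing stream argument shared by the three copy functions can be pictured with a host-only sketch. `FakeStream` and `copyBytes` are hypothetical stand-ins, not part of CudaInterface.h. The sketch also assumes that a null stream means a blocking copy and that a non-null stream defers the copy until the stream is synchronized; this page does not confirm that behavior.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for CudaStream: queues work until synchronize() is called.
class FakeStream {
public:
   void enqueue(std::function<void()> task) { _pending.push_back(std::move(task)); }
   void synchronize()
   {
      for (auto &task : _pending)
         task();
      _pending.clear();
   }

private:
   std::vector<std::function<void()>> _pending;
};

// Hypothetical byte-level copy with an optional stream, mirroring the trailing
// `CudaStream * = nullptr` parameter of the functions documented above.
inline void copyBytes(const void *src, void *dest, std::size_t nBytes, FakeStream *stream = nullptr)
{
   if (stream) {
      // Assumed asynchronous path: the copy runs when the stream is synchronized.
      stream->enqueue([=] { std::memcpy(dest, src, nBytes); });
   } else {
      // Assumed blocking path: the copy is complete when the function returns.
      std::memcpy(dest, src, nBytes);
   }
}
```

In the deferred case the caller has to keep both buffers alive until `synchronize()` has run, which is the same lifetime rule that applies to real asynchronous transfers.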
float RooBatchCompute::CudaInterface::cudaEventElapsedTime (CudaEvent &begin, CudaEvent &end)
Calculates the elapsed time between two CUDA events.
[in] | begin | CudaEvent representing the start event. |
[in] | end | CudaEvent representing the end event. |
Definition at line 146 of file CudaInterface.cu.
void RooBatchCompute::CudaInterface::cudaEventRecord (CudaEvent &event, CudaStream &stream)
Records a CUDA event.
[in] | event | CudaEvent object representing the event to be recorded. |
[in] | stream | CudaStream in which to record the event. |
Definition at line 96 of file CudaInterface.cu.