template<typename AFloat>
class TMVA::DNN::TCudaMatrix< AFloat >
TCudaMatrix Class.
The TCudaMatrix class represents matrices on a CUDA device. The elements of the matrix are stored in a TCudaDeviceBuffer object which takes care of the allocation and freeing of the device memory. TCudaMatrices are lightweight object, that means on assignment and copy creation only a shallow copy is performed and no new element buffer allocated. To perform a deep copy use the static Copy method of the TCuda architecture class.
The TCudaDeviceBuffer has an associated cuda stream, on which the data is transferred to the device. This stream can be accessed through the GetComputeStream member function and used to synchronize computations.
The TCudaMatrix class also holds static references to CUDA resources. Those are the cublas handle, a buffer of curand states for the generation of random numbers as well as a vector containing ones, which is used for summing column matrices using matrix-vector multiplication. The class also has a static buffer for returning results from the device.
Definition at line 98 of file CudaMatrix.h.