Namespace for dispatching RooFit computations to various backends.
Namespaces | |
namespace | CUDA |
namespace | CudaInterface |
Classes | |
class | AbsBuffer |
class | AbsBufferManager |
class | Batch |
class | Batches |
class | BracketAdapter |
Little adapter that gives a bracket operator to types that don't have one. | |
class | BracketAdapterWithMask |
class | Config |
Minimal configuration struct to steer the evaluation of a single node with the RooBatchCompute library. | |
struct | ReduceNLLOutput |
class | RooBatchComputeInterface |
The interface which should be implemented to provide optimised computation functions for implementations of RooAbsReal::doEval(). | |
Typedefs | |
typedef std::span< double > | ArgSpan |
typedef const double *__restrict | InputArr |
typedef std::span< const std::span< const double > > | VarSpan |
Enumerations | |
enum class | Architecture { AVX512 , AVX2 , AVX , SSE4 , GENERIC , CUDA } |
enum | Computer { AddPdf , ArgusBG , BMixDecay , Bernstein , BifurGauss , BreitWigner , Bukin , CBShape , Chebychev , ChiSquare , DeltaFunction , DstD0BG , ExpPoly , Exponential , ExponentialNeg , Gamma , GaussModelExpBasis , Gaussian , Identity , Johnson , Landau , Lognormal , LognormalStandard , NegativeLogarithms , NormalizedPdf , Novosibirsk , Poisson , Polynomial , Power , ProdPdf , Ratio , TruthModelExpBasis , TruthModelSinBasis , TruthModelCosBasis , TruthModelLinBasis , TruthModelQuadBasis , TruthModelSinhBasis , TruthModelCoshBasis , Voigtian } |
Functions | |
void | compute (Config cfg, Computer comp, std::span< double > output, std::initializer_list< std::span< const double > > vars, ArgSpan extraArgs={}) |
A std::span cannot be constructed directly from an initializer list (this is expected to become possible with C++26). | |
void | compute (Config cfg, Computer comp, std::span< double > output, VarSpan vars, ArgSpan extraArgs={}) |
Architecture | cpuArchitecture () |
std::string | cpuArchitectureName () |
__roodevice__ double | fast_cos (double x) |
__roodevice__ double | fast_exp (double x) |
__roodevice__ double | fast_isqrt (double x) |
__roodevice__ double | fast_log (double x) |
__roodevice__ double | fast_sin (double x) |
int | initCPU () |
Inspect hardware capabilities, and load the optimal library for RooFit computations. | |
int | initCUDA () |
ReduceNLLOutput | reduceNLL (Config cfg, std::span< const double > probas, std::span< const double > weights, std::span< const double > offsetProbas) |
double | reduceSum (Config cfg, InputArr input, size_t n) |
Variables | |
constexpr std::size_t | bufferSize = 64 |
R__EXTERN RooBatchComputeInterface * | dispatchCPU = nullptr |
This dispatch pointer points to an implementation of the compute library, provided one has been loaded. | |
R__EXTERN RooBatchComputeInterface * | dispatchCUDA = nullptr |
Namespace for dispatching RooFit computations to various backends.
This namespace contains an interface for providing high-performance computation functions for use in RooAbsReal::doEval(), see RooBatchComputeInterface.
Several implementations of this interface can be provided. They reside in RooBatchCompute::RF_ARCH, where RF_ARCH stands for the architecture that the implementation targets, e.g. SSE4 or AVX.
Using the pointers RooBatchCompute::dispatchCPU and RooBatchCompute::dispatchCUDA, a computation request can be dispatched to the fastest backend that is available on a specific platform.
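The dispatch idea described above can be illustrated with a minimal, self-contained sketch. It is not the RooFit implementation: the names `ComputeInterface`, `GenericBackend`, `VectorBackend`, `selectBackend`, `scaled`, and the `scale` operation are hypothetical and exist only for this example. The only thing taken from the text is the pattern of a global interface pointer that is null until a backend is chosen and is then used for virtual calls. It assumes C++17 for the inline variable.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-in for an interface such as RooBatchComputeInterface.
class ComputeInterface {
public:
   virtual ~ComputeInterface() = default;
   virtual std::string architectureName() const = 0;
   // Hypothetical operation: out[i] = in[i] * factor.
   virtual void scale(const double *in, double *out, std::size_t n, double factor) const = 0;
};

// Portable fallback backend.
class GenericBackend : public ComputeInterface {
public:
   std::string architectureName() const override { return "GENERIC"; }
   void scale(const double *in, double *out, std::size_t n, double factor) const override
   {
      for (std::size_t i = 0; i < n; ++i)
         out[i] = in[i] * factor;
   }
};

// Stand-in for an architecture-specific backend. It reuses the generic loop
// here; a real backend would be compiled with different instruction-set flags.
class VectorBackend : public GenericBackend {
public:
   std::string architectureName() const override { return "AVX2"; }
};

// Global dispatch pointer: null until a backend has been selected.
inline ComputeInterface *dispatch = nullptr;

// Hypothetical selection step: pick the "fastest" backend that is available.
inline void selectBackend(bool hasVectorUnit)
{
   static GenericBackend generic;
   static VectorBackend vector;
   dispatch = hasVectorUnit ? static_cast<ComputeInterface *>(&vector) : &generic;
}

// Callers only go through the pointer and never name a concrete backend.
inline std::vector<double> scaled(const std::vector<double> &in, double factor)
{
   assert(dispatch != nullptr && "selectBackend() must be called first");
   std::vector<double> out(in.size());
   dispatch->scale(in.data(), out.data(), in.size(), factor);
   return out;
}

} // namespace sketch
```

After `sketch::selectBackend(...)` has run once, every call to `sketch::scaled` is routed through the virtual call on `sketch::dispatch`, so calling code stays the same whichever backend was chosen.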
typedef std::span<double> RooBatchCompute::ArgSpan |
Definition at line 45 of file RooBatchCompute.h.
typedef const double* __restrict RooBatchCompute::InputArr |
Definition at line 46 of file RooBatchCompute.h.
typedef std::span<const std::span<const double> > RooBatchCompute::VarSpan |
Definition at line 44 of file RooBatchCompute.h.
enum class RooBatchCompute::Architecture |
strong |
Enumerator | |
---|---|
AVX512 | |
AVX2 | |
AVX | |
SSE4 | |
GENERIC | |
CUDA |
Definition at line 65 of file RooBatchCompute.h.
enum RooBatchCompute::Computer |
Definition at line 67 of file RooBatchCompute.h.
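void RooBatchCompute::compute (Config cfg, Computer comp, std::span< double > output, std::initializer_list< std::span< const double > > vars, ArgSpan extraArgs={}) |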
inline |
A std::span cannot be constructed directly from an initializer list (this is expected to become possible with C++26), which is why an explicit overload is needed.
Definition at line 214 of file RooBatchCompute.h.
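void RooBatchCompute::compute (Config cfg, Computer comp, std::span< double > output, VarSpan vars, ArgSpan extraArgs={}) |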
inline |
Definition at line 205 of file RooBatchCompute.h.
Architecture RooBatchCompute::cpuArchitecture () |
inline |
Definition at line 195 of file RooBatchCompute.h.
std::string RooBatchCompute::cpuArchitectureName () |
inline |
Definition at line 200 of file RooBatchCompute.h.
__roodevice__ double RooBatchCompute::fast_cos (double x) |
inline |
Definition at line 78 of file RooVDTHeaders.h.
__roodevice__ double RooBatchCompute::fast_exp (double x) |
inline |
Definition at line 68 of file RooVDTHeaders.h.
__roodevice__ double RooBatchCompute::fast_isqrt (double x) |
inline |
Definition at line 88 of file RooVDTHeaders.h.
__roodevice__ double RooBatchCompute::fast_log (double x) |
inline |
Definition at line 83 of file RooVDTHeaders.h.
__roodevice__ double RooBatchCompute::fast_sin (double x) |
inline |
Definition at line 73 of file RooVDTHeaders.h.
int RooBatchCompute::initCPU | ( | ) |
Inspect hardware capabilities, and load the optimal library for RooFit computations.
Definition at line 44 of file Initialisation.cxx.
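As an illustration of what "inspecting hardware capabilities" can look like, the sketch below queries CPU features with the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports` on x86 and falls back to a generic answer elsewhere. This is an assumption about one possible approach, not a description of what `initCPU()` actually does. The functions `detectCpuArchitecture` and `architectureName` are hypothetical, the chosen feature strings are an assumption, and the enum is a local copy of the CPU entries of `Architecture` listed on this page. Loading a backend library is not shown.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Local copy of the CPU entries of the Architecture enum on this page.
enum class Architecture { AVX512, AVX2, AVX, SSE4, GENERIC };

// Hypothetical helper: return the most capable instruction set reported by
// the CPU, checking from most to least capable.
inline Architecture detectCpuArchitecture()
{
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
   __builtin_cpu_init(); // harmless if the CPU model was already initialised
   if (__builtin_cpu_supports("avx512f"))
      return Architecture::AVX512;
   if (__builtin_cpu_supports("avx2"))
      return Architecture::AVX2;
   if (__builtin_cpu_supports("avx"))
      return Architecture::AVX;
   if (__builtin_cpu_supports("sse4.1"))
      return Architecture::SSE4;
#endif
   // Non-x86 platforms, other compilers, or no supported extension found.
   return Architecture::GENERIC;
}

// Hypothetical helper: human-readable name for an architecture value.
inline std::string architectureName(Architecture arch)
{
   switch (arch) {
   case Architecture::AVX512: return "AVX512";
   case Architecture::AVX2: return "AVX2";
   case Architecture::AVX: return "AVX";
   case Architecture::SSE4: return "SSE4";
   case Architecture::GENERIC: return "GENERIC";
   }
   return "UNKNOWN";
}

} // namespace sketch
```

The result of `sketch::detectCpuArchitecture()` depends on the machine running the code, so no particular value should be expected. A loader could use it to decide which backend implementation to make available through a dispatch pointer.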
int RooBatchCompute::initCUDA | ( | ) |
Definition at line 89 of file Initialisation.cxx.
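ReduceNLLOutput RooBatchCompute::reduceNLL (Config cfg, std::span< const double > probas, std::span< const double > weights, std::span< const double > offsetProbas) |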
inline |
Definition at line 226 of file RooBatchCompute.h.
double RooBatchCompute::reduceSum (Config cfg, InputArr input, size_t n) |
Definition at line 220 of file RooBatchCompute.h.
constexpr std::size_t RooBatchCompute::bufferSize = 64 |
constexpr |
Definition at line 48 of file RooBatchCompute.h.
RooBatchCompute::RooBatchComputeInterface * RooBatchCompute::dispatchCPU = nullptr |
This dispatch pointer points to an implementation of the compute library, provided one has been loaded.
Using a virtual call, computation requests are dispatched to backends built for a specific instruction set, such as SSE4, AVX, or AVX2.
Definition at line 192 of file RooBatchCompute.h.
RooBatchCompute::RooBatchComputeInterface * RooBatchCompute::dispatchCUDA = nullptr |
Definition at line 193 of file RooBatchCompute.h.