RDataSource defines an API that RDataFrame can use to read arbitrary data formats.
A concrete RDataSource implementation (i.e. a class that inherits from RDataSource and implements all of its pure virtual methods) provides an adaptor that RDataFrame can use to read any tabular data format. RDataFrame calls into RDataSource to retrieve information about the data, to retrieve (thread-local) readers or "cursors" for selected columns, and to advance the readers to the desired data entry.
RDataSource implementations must support running multiple event-loops one after another (never concurrently) on the same dataset.
RDataFrame (or any other client of an RDataSource) calls the interface according to the following rules:
- SetNSlots() is called once per RDataSource object, typically when it is associated with an RDataFrame.
- GetColumnReaders() can be called several times, potentially with the same arguments, also in-between event-loops, but not during an event-loop.
- GetEntryRanges() is called several times, including during an event-loop, as additional ranges are needed. It is never called concurrently.
- Initialise() and Finalise() are each called once per event-loop, right before it starts and right after it finishes.
- InitSlot(), SetEntry(), and FinaliseSlot() can be called concurrently from multiple threads, multiple times per event-loop.
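
The call rules above can be illustrated with a minimal, single-threaded sketch. ToySource and RunEventLoop are hypothetical names introduced here for illustration; ToySource is a stand-in that only mirrors the method names listed above and does not derive from ROOT's RDataSource. GetColumnReaders() is left out for brevity, and the sketch assumes that an empty result from GetEntryRanges() means there is nothing left to process. A real client may call InitSlot(), SetEntry() and FinaliseSlot() from several threads, which this sketch does not attempt.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in, NOT ROOT's RDataSource: it records the order of calls.
class ToySource {
public:
   using Range = std::pair<std::uint64_t, std::uint64_t>; // half-open [first, second)

   explicit ToySource(std::uint64_t nEntries) : fNEntries(nEntries) {}

   void SetNSlots(unsigned int nSlots) { fNSlots = nSlots; fLog.push_back("SetNSlots"); }
   void Initialise() { fHandedOut = false; fLog.push_back("Initialise"); }
   void Finalise() { fLog.push_back("Finalise"); }

   // Assumption of this sketch: one range per event-loop, then an empty
   // vector to tell the caller that no entries are left.
   std::vector<Range> GetEntryRanges()
   {
      fLog.push_back("GetEntryRanges");
      if (fHandedOut)
         return {};
      fHandedOut = true;
      return {{0, fNEntries}};
   }

   void InitSlot(unsigned int slot, std::uint64_t /*firstEntry*/)
   {
      assert(slot < fNSlots);
      fLog.push_back("InitSlot");
   }
   bool SetEntry(unsigned int slot, std::uint64_t entry)
   {
      assert(slot < fNSlots && entry < fNEntries);
      (void)slot;
      (void)entry;
      fLog.push_back("SetEntry");
      return true;
   }
   void FinaliseSlot(unsigned int slot)
   {
      assert(slot < fNSlots);
      (void)slot;
      fLog.push_back("FinaliseSlot");
   }

   const std::vector<std::string> &Log() const { return fLog; }

private:
   std::uint64_t fNEntries;
   unsigned int fNSlots = 0;
   bool fHandedOut = false;
   std::vector<std::string> fLog;
};

// One sequential event-loop, using slot 0 only.
// SetNSlots() must have been called on src beforehand.
void RunEventLoop(ToySource &src)
{
   src.Initialise(); // once, right before the loop starts
   // Keep asking for ranges until the source returns none.
   for (auto ranges = src.GetEntryRanges(); !ranges.empty(); ranges = src.GetEntryRanges()) {
      for (const auto &range : ranges) {
         const unsigned int slot = 0;
         src.InitSlot(slot, range.first);
         for (auto entry = range.first; entry < range.second; ++entry)
            src.SetEntry(slot, entry);
         src.FinaliseSlot(slot);
      }
   }
   src.Finalise(); // once, right after the loop finishes
}
```

In this sketch a caller would construct a ToySource, call SetNSlots(1) once, and then call RunEventLoop() as many times as needed; each run repeats the Initialise() to Finalise() sequence without calling SetNSlots() again.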
Definition at line 105 of file RDataSource.hxx.
virtual std::vector<std::pair<ULong64_t, ULong64_t>> ROOT::RDF::RDataSource::GetEntryRanges() [pure virtual]
virtual void ROOT::RDF::RDataSource::Initialise() [inline, virtual]