ROOT::Internal::RClusterPool Class Reference

Manages a set of clusters containing compressed and packed pages.

The cluster pool steers the preloading of (partial) clusters. There is a two-step pipeline: in the first step, compressed pages are read from clusters into a memory buffer. The second step decompresses the pages and pushes them into the page pool. The actual logic of reading and unzipping is implemented by the page source. The cluster pool only orchestrates the work queues for reading and unzipping. It uses one extra I/O thread for reading; that thread waits for data from storage and generates no CPU load.

The unzipping step of the pipeline behaves differently depending on whether or not implicit multi-threading is turned on. If it is turned off, i.e. in a single-threaded environment, the cluster pool only reads the compressed pages, and the page source has to uncompress the pages at a later point, when data from a page is requested.

Definition at line 56 of file RClusterPool.hxx.
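
The two pipeline steps described above can be illustrated with a small standalone sketch. It does not use the ROOT API: `ToyPage`, `ToyUnzip`, `ToyLoadCluster` and `ToyGetPageData` are hypothetical names, the "compression" is a trivial run-length stand-in, and the `eagerUnzip` flag only plays the role that implicit multi-threading plays in the description.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a page: the bytes as read from storage plus,
// once step two has run, the unpacked bytes.
struct ToyPage {
   std::string fCompressed;
   std::string fUnzipped;
   bool fIsUnzipped;
};

// Stand-in for a real decompressor: expands "<digit><char>" pairs,
// so "3a2b" becomes "aaabb". Counts are single digits to keep the sketch short.
inline std::string ToyUnzip(const std::string &in)
{
   assert(in.size() % 2 == 0);
   std::string out;
   for (std::size_t i = 0; i + 1 < in.size(); i += 2)
      out.append(static_cast<std::size_t>(in[i] - '0'), in[i + 1]);
   return out;
}

// Step 1: "read" the compressed pages into memory.
// Step 2 runs right away only if eagerUnzip is true.
inline std::vector<ToyPage> ToyLoadCluster(const std::vector<std::string> &storage, bool eagerUnzip)
{
   std::vector<ToyPage> pages;
   for (const auto &raw : storage) {
      ToyPage page{raw, std::string(), false};
      if (eagerUnzip) {
         page.fUnzipped = ToyUnzip(page.fCompressed);
         page.fIsUnzipped = true;
      }
      pages.push_back(std::move(page));
   }
   return pages;
}

// Deferred step 2: unzip on first access if it has not happened yet.
inline const std::string &ToyGetPageData(ToyPage &page)
{
   if (!page.fIsUnzipped) {
      page.fUnzipped = ToyUnzip(page.fCompressed);
      page.fIsUnzipped = true;
   }
   return page.fUnzipped;
}
```

With `eagerUnzip == false`, a page stays compressed until `ToyGetPageData()` is first called on it, which mirrors the single-threaded case in which unzipping is left to the page source.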

Classes

struct  RCounters
 Performance counters that get registered in fMetrics. More...
 
struct  RInFlightCluster
 Clusters that are currently being processed by the pipeline. More...
 
struct  RReadItem
 Request to load a subset of the columns of a particular cluster. More...
 

Public Member Functions

 RClusterPool (const RClusterPool &other)=delete
 
 RClusterPool (ROOT::Internal::RPageSource &pageSource)
 
 RClusterPool (ROOT::Internal::RPageSource &pageSource, unsigned int clusterBunchSize)
 
 ~RClusterPool ()
 
RCluster * GetCluster (ROOT::DescriptorId_t clusterId, const RCluster::ColumnSet_t &physicalColumns)
 Returns the requested cluster either from the pool or, in case of a cache miss, lets the I/O thread load the cluster into the pool, blocks until done, and then returns it.
 
ROOT::Experimental::Detail::RNTupleMetrics & GetMetrics ()
 
RClusterPool & operator= (const RClusterPool &other)=delete
 
void StartBackgroundThread ()
 Spawn the I/O background thread. No-op if already started.
 
void StopBackgroundThread ()
 Stop the I/O background thread. No-op if already stopped. Called by the destructor.
 
void WaitForInFlightClusters ()
 Used by the unit tests to drain the queue of clusters to be preloaded.
 

Static Public Attributes

static constexpr unsigned int kDefaultClusterBunchSize = 1
 

Private Member Functions

void ExecReadClusters ()
 The I/O thread routine; there is exactly one I/O thread in flight for every cluster pool.
 
RCluster * WaitFor (ROOT::DescriptorId_t clusterId, const RCluster::ColumnSet_t &physicalColumns)
 Returns the given cluster from the pool, which needs to contain at least the columns physicalColumns.
 

Private Attributes

std::int64_t fBunchId = 0
 Used as an ever-growing counter in GetCluster() to separate bunches of clusters from each other.
 
unsigned int fClusterBunchSize
 The number of clusters that are being read in a single vector read.
 
std::unique_ptr< RCounters > fCounters
 
std::condition_variable fCvHasReadWork
 Signals a non-empty I/O work queue.
 
std::vector< RInFlightCluster > fInFlightClusters
 The clusters that were handed off to the I/O thread.
 
std::mutex fLockWorkQueue
 Protects the shared state between the main thread and the I/O thread, namely the work queue and the in-flight clusters vector.
 
ROOT::Experimental::Detail::RNTupleMetrics fMetrics
 The cluster pool counters are observed by the page source.
 
ROOT::Internal::RPageSource & fPageSource
 Every cluster pool is responsible for exactly one page source that triggers loading of the clusters (GetCluster()) and is used for implementing the I/O and cluster memory allocation (RPageSource::LoadClusters()).
 
std::unordered_map< ROOT::DescriptorId_t, std::unique_ptr< RCluster > > fPool
 The cache of active clusters and their successors.
 
std::deque< RReadItem > fReadQueue
 The communication channel to the I/O thread.
 
std::thread fThreadIo
 The I/O thread calls RPageSource::LoadClusters() asynchronously.
 

#include <ROOT/RClusterPool.hxx>

Constructor & Destructor Documentation

◆ RClusterPool() [1/3]

ROOT::Internal::RClusterPool::RClusterPool ( ROOT::Internal::RPageSource & pageSource,
unsigned int clusterBunchSize )

Definition at line 51 of file RClusterPool.cxx.

◆ RClusterPool() [2/3]

ROOT::Internal::RClusterPool::RClusterPool ( ROOT::Internal::RPageSource & pageSource)
inline explicit

Definition at line 126 of file RClusterPool.hxx.

◆ RClusterPool() [3/3]

ROOT::Internal::RClusterPool::RClusterPool ( const RClusterPool & other)
delete

◆ ~RClusterPool()

ROOT::Internal::RClusterPool::~RClusterPool ( )

Definition at line 61 of file RClusterPool.cxx.
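
The constructor set listed above (a two-argument form, an explicit one-argument form, deleted copy operations) follows a common C++ layout. The sketch below reproduces that layout on hypothetical stand-in types, `ToyPageSource` and `ToyClusterPool`; that the one-argument form delegates to the two-argument form with `kDefaultClusterBunchSize` is an assumption made for illustration, not something the reference states.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in; only the constructor layout mirrors the reference above.
struct ToyPageSource {};

class ToyClusterPool {
public:
   static constexpr unsigned int kDefaultClusterBunchSize = 1;

   ToyClusterPool(ToyPageSource &pageSource, unsigned int clusterBunchSize)
      : fPageSource(pageSource), fClusterBunchSize(clusterBunchSize)
   {
   }

   // One-argument form: explicit, so a page source never converts silently into a pool.
   // Assumed here to delegate to the two-argument form with the default bunch size.
   explicit ToyClusterPool(ToyPageSource &pageSource) : ToyClusterPool(pageSource, kDefaultClusterBunchSize) {}

   // The pool refers to exactly one page source, so copying is disabled.
   ToyClusterPool(const ToyClusterPool &) = delete;
   ToyClusterPool &operator=(const ToyClusterPool &) = delete;

   ToyPageSource &GetPageSource() { return fPageSource; }
   unsigned int GetClusterBunchSize() const { return fClusterBunchSize; }

private:
   ToyPageSource &fPageSource;
   unsigned int fClusterBunchSize;
};

// Compile-time checks that the deleted operations are really unavailable.
static_assert(!std::is_copy_constructible<ToyClusterPool>::value, "copy construction is deleted");
static_assert(!std::is_copy_assignable<ToyClusterPool>::value, "copy assignment is deleted");
```

Because the class holds a reference member, deleting the copy operations also documents the intent that one pool object belongs to one page source for its whole lifetime.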

Member Function Documentation

◆ ExecReadClusters()

void ROOT::Internal::RClusterPool::ExecReadClusters ( )
private

The I/O thread routine; there is exactly one I/O thread in flight for every cluster pool.

Definition at line 88 of file RClusterPool.cxx.

◆ GetCluster()

ROOT::Internal::RCluster * ROOT::Internal::RClusterPool::GetCluster ( ROOT::DescriptorId_t clusterId,
const RCluster::ColumnSet_t & physicalColumns )

Returns the requested cluster either from the pool or, in case of a cache miss, lets the I/O thread load the cluster into the pool, blocks until done, and then returns it.

Along the way, triggers the background loading of the following fClusterBunchSize clusters. The returned cluster has at least all the pages of physicalColumns and possibly pages of other columns, too. If implicit multi-threading is turned on, the uncompressed pages of the returned cluster are already pushed into the page pool associated with the page source upon return. The cluster remains valid until the next call to GetCluster().

Definition at line 179 of file RClusterPool.cxx.
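
The lookup, miss handling and read-ahead described for GetCluster() can be sketched on a toy cache. Everything below is hypothetical (`ToyCluster`, `ToyPool`, `GetLoadLog()`, `GetNCached()`): loading is synchronous instead of going through an I/O thread, columns are ignored, and the eviction rule (drop every cluster with a smaller id than the requested one) is an assumption chosen to show why a returned pointer should not be kept across calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

struct ToyCluster {
   std::uint64_t fId;
};

class ToyPool {
public:
   ToyPool(std::uint64_t nClusters, unsigned int bunchSize) : fNClusters(nClusters), fBunchSize(bunchSize) {}

   // Returns the requested cluster, loading it first on a cache miss.
   // Also loads the following fBunchSize clusters (synchronously in this sketch).
   ToyCluster *GetCluster(std::uint64_t id)
   {
      assert(id < fNClusters);
      // Assumed eviction rule: clusters before the requested one are no longer needed.
      // This invalidates pointers handed out by earlier calls.
      for (auto it = fPool.begin(); it != fPool.end();) {
         if (it->first < id)
            it = fPool.erase(it);
         else
            ++it;
      }
      // Load the requested cluster and its successors unless they are already cached.
      for (std::uint64_t i = id; i <= id + fBunchSize && i < fNClusters; ++i) {
         if (fPool.count(i) == 0) {
            fPool[i] = std::unique_ptr<ToyCluster>(new ToyCluster{i});
            fLoadLog.push_back(i);
         }
      }
      return fPool.at(id).get();
   }

   // Ids in the order in which they were actually loaded (cache hits do not appear).
   const std::vector<std::uint64_t> &GetLoadLog() const { return fLoadLog; }
   std::size_t GetNCached() const { return fPool.size(); }

private:
   std::uint64_t fNClusters;
   unsigned int fBunchSize;
   std::unordered_map<std::uint64_t, std::unique_ptr<ToyCluster>> fPool;
   std::vector<std::uint64_t> fLoadLog;
};
```

When clusters are requested in increasing order, each call after the first finds its cluster already cached and only adds one new successor, which is the effect the read-ahead is meant to achieve.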

◆ GetMetrics()

ROOT::Experimental::Detail::RNTupleMetrics & ROOT::Internal::RClusterPool::GetMetrics ( )
inline

Definition at line 150 of file RClusterPool.hxx.

◆ operator=()

RClusterPool & ROOT::Internal::RClusterPool::operator= ( const RClusterPool & other)
delete

◆ StartBackgroundThread()

void ROOT::Internal::RClusterPool::StartBackgroundThread ( )

Spawn the I/O background thread. No-op if already started.

Definition at line 66 of file RClusterPool.cxx.

◆ StopBackgroundThread()

void ROOT::Internal::RClusterPool::StopBackgroundThread ( )

Stop the I/O background thread. No-op if already stopped. Called by the destructor.

Definition at line 74 of file RClusterPool.cxx.
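
The start/stop contract above (both calls are no-ops when there is nothing to do, and the destructor stops the thread) can be sketched with standard library primitives. `ToyIoWorker` and its members are hypothetical, and signalling shutdown through a marker item in the work queue is just one possible design assumed here; the sketch combines a mutex, a condition variable and a deque in the same roles as fLockWorkQueue, fCvHasReadWork and fReadQueue.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

class ToyIoWorker {
public:
   ToyIoWorker() = default;
   ToyIoWorker(const ToyIoWorker &) = delete;
   ToyIoWorker &operator=(const ToyIoWorker &) = delete;
   ~ToyIoWorker() { Stop(); }

   // Spawns the worker thread; does nothing if it is already running.
   void Start()
   {
      if (fThread.joinable())
         return;
      fThread = std::thread(&ToyIoWorker::Run, this);
   }

   // Enqueues a stop marker and joins the thread; does nothing if it is not running.
   void Stop()
   {
      if (!fThread.joinable())
         return;
      Enqueue(kStopMarker);
      fThread.join();
   }

   // Hands one work item to the worker and wakes it up.
   void Enqueue(int item)
   {
      {
         std::lock_guard<std::mutex> guard(fLock);
         fQueue.push_back(item);
      }
      fCvHasWork.notify_one();
   }

   bool IsRunning() const { return fThread.joinable(); }
   // Only call after Stop(): the worker thread no longer touches fProcessed then.
   const std::vector<int> &GetProcessed() const { return fProcessed; }

private:
   static constexpr int kStopMarker = -1;

   void Run()
   {
      while (true) {
         std::deque<int> batch;
         {
            // Sleep until the queue is non-empty, then take all pending items at once.
            std::unique_lock<std::mutex> lock(fLock);
            fCvHasWork.wait(lock, [this] { return !fQueue.empty(); });
            batch.swap(fQueue);
         }
         for (int item : batch) {
            if (item == kStopMarker)
               return;
            fProcessed.push_back(item); // stand-in for the actual read
         }
      }
   }

   std::mutex fLock;
   std::condition_variable fCvHasWork;
   std::deque<int> fQueue;
   std::vector<int> fProcessed;
   std::thread fThread;
};
```

Items enqueued before the stop marker are still processed in order, so stopping drains pending work rather than dropping it in this sketch.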

◆ WaitFor()

ROOT::Internal::RCluster * ROOT::Internal::RClusterPool::WaitFor ( ROOT::DescriptorId_t clusterId,
const RCluster::ColumnSet_t & physicalColumns )
private

Returns the given cluster from the pool, which needs to contain at least the columns physicalColumns.

Executed at the end of GetCluster when all missing data pieces have been sent to the load queue. Ideally, the function returns without blocking if the cluster is already in the pool.

Definition at line 333 of file RClusterPool.cxx.
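
The non-blocking fast path and the blocking slow path of WaitFor() can be sketched with `std::promise` and `std::future`. The types below (`ToyCluster`, `ToyInFlight`, `ToyWaiter`) and the `Submit()` helper are hypothetical, column sets are left out, and pairing each in-flight cluster with a future is an assumption of this sketch rather than a statement about the actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

struct ToyCluster {
   std::uint64_t fId;
};

// A load that has been requested but whose result may not have arrived yet.
struct ToyInFlight {
   std::uint64_t fId;
   std::future<std::unique_ptr<ToyCluster>> fFuture;
};

class ToyWaiter {
public:
   // Registers a pending load and returns the promise that the loader has to fulfil.
   std::promise<std::unique_ptr<ToyCluster>> Submit(std::uint64_t id)
   {
      std::promise<std::unique_ptr<ToyCluster>> promise;
      fInFlight.push_back(ToyInFlight{id, promise.get_future()});
      return promise;
   }

   // Returns the cluster with the given id, or nullptr if it was never requested.
   ToyCluster *WaitFor(std::uint64_t id)
   {
      // Fast path: the cluster is already in the pool, so nothing blocks.
      auto cached = fPool.find(id);
      if (cached != fPool.end())
         return cached->second.get();
      // Slow path: block on the matching in-flight entry, then move the result into the pool.
      for (auto it = fInFlight.begin(); it != fInFlight.end(); ++it) {
         if (it->fId != id)
            continue;
         std::unique_ptr<ToyCluster> cluster = it->fFuture.get();
         fInFlight.erase(it);
         ToyCluster *raw = cluster.get();
         fPool[id] = std::move(cluster);
         return raw;
      }
      return nullptr;
   }

   std::size_t GetNInFlight() const { return fInFlight.size(); }

private:
   std::unordered_map<std::uint64_t, std::unique_ptr<ToyCluster>> fPool;
   std::vector<ToyInFlight> fInFlight;
};
```

In a threaded setup the promise returned by `Submit()` would be moved to the loading thread, which calls `set_value()` once the data has arrived; `WaitFor()` then blocks in `get()` only for as long as that load is still outstanding.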

◆ WaitForInFlightClusters()

void ROOT::Internal::RClusterPool::WaitForInFlightClusters ( )

Used by the unit tests to drain the queue of clusters to be preloaded.

Definition at line 387 of file RClusterPool.cxx.

Member Data Documentation

◆ fBunchId

std::int64_t ROOT::Internal::RClusterPool::fBunchId = 0
private

Used as an ever-growing counter in GetCluster() to separate bunches of clusters from each other.

Definition at line 94 of file RClusterPool.hxx.

◆ fClusterBunchSize

unsigned int ROOT::Internal::RClusterPool::fClusterBunchSize
private

The number of clusters that are being read in a single vector read.

Definition at line 92 of file RClusterPool.hxx.

◆ fCounters

std::unique_ptr<RCounters> ROOT::Internal::RClusterPool::fCounters
private

Definition at line 86 of file RClusterPool.hxx.

◆ fCvHasReadWork

std::condition_variable ROOT::Internal::RClusterPool::fCvHasReadWork
private

Signals a non-empty I/O work queue.

Definition at line 104 of file RClusterPool.hxx.

◆ fInFlightClusters

std::vector<RInFlightCluster> ROOT::Internal::RClusterPool::fInFlightClusters
private

The clusters that were handed off to the I/O thread.

Definition at line 102 of file RClusterPool.hxx.

◆ fLockWorkQueue

std::mutex ROOT::Internal::RClusterPool::fLockWorkQueue
private

Protects the shared state between the main thread and the I/O thread, namely the work queue and the in-flight clusters vector.

Definition at line 100 of file RClusterPool.hxx.

◆ fMetrics

ROOT::Experimental::Detail::RNTupleMetrics ROOT::Internal::RClusterPool::fMetrics
private

The cluster pool counters are observed by the page source.

Definition at line 114 of file RClusterPool.hxx.

◆ fPageSource

ROOT::Internal::RPageSource& ROOT::Internal::RClusterPool::fPageSource
private

Every cluster pool is responsible for exactly one page source that triggers loading of the clusters (GetCluster()) and is used for implementing the I/O and cluster memory allocation (RPageSource::LoadClusters()).

Definition at line 90 of file RClusterPool.hxx.

◆ fPool

std::unordered_map<ROOT::DescriptorId_t, std::unique_ptr<RCluster> > ROOT::Internal::RClusterPool::fPool
private

The cache of active clusters and their successors.

Definition at line 96 of file RClusterPool.hxx.

◆ fReadQueue

std::deque<RReadItem> ROOT::Internal::RClusterPool::fReadQueue
private

The communication channel to the I/O thread.

Definition at line 106 of file RClusterPool.hxx.

◆ fThreadIo

std::thread ROOT::Internal::RClusterPool::fThreadIo
private

The I/O thread calls RPageSource::LoadClusters() asynchronously.

The thread is mostly waiting for the data to arrive (blocked by the kernel) and therefore can safely run in addition to the application main threads.

Definition at line 111 of file RClusterPool.hxx.

◆ kDefaultClusterBunchSize

constexpr unsigned int ROOT::Internal::RClusterPool::kDefaultClusterBunchSize = 1
static constexpr

Definition at line 124 of file RClusterPool.hxx.

Libraries for ROOT::Internal::RClusterPool:

The documentation for this class was generated from the following files: