Logo ROOT  
Reference Guide
 
Loading...
Searching...
No Matches
ROOT::Experimental::Detail::RClusterPool Class Reference

Managed a set of clusters containing compressed and packed pages.

The cluster pool steers the preloading of (partial) clusters. There is a two-step pipeline: in a first step, compressed pages are read from clusters into a memory buffer. The second pipeline step decompresses the pages and pushes them into the page pool. The actual logic of reading and unzipping is implemented by the page source. The cluster pool only orchestrates the work queues for reading and unzipping. It uses two threads, one for each pipeline step. The I/O thread for reading waits for data from storage and generates no CPU load. In contrast, the unzip thread is supposed to submit multi-threaded, CPU heavy work to the application's task scheduler.

The unzipping step of the pipeline therefore behaves differently depending on whether or not implicit multi-threadin is turned on. If it is turned off, i.e. in a single-threaded environment, the cluster pool will only read the compressed pages and the page source has to uncompresses pages at a later point when data from the page is requested.

Definition at line 56 of file RClusterPool.hxx.

Classes

struct  RInFlightCluster
 Clusters that are currently being processed by the pipeline. More...
 
struct  RReadItem
 Request to load a subset of the columns of a particular cluster. More...
 
struct  RUnzipItem
 Request to decompress and if necessary unpack compressed pages. More...
 

Public Member Functions

 RClusterPool (const RClusterPool &other)=delete
 
 RClusterPool (RPageSource &pageSource)
 
 RClusterPool (RPageSource &pageSource, unsigned int size)
 
 ~RClusterPool ()
 
RClusterGetCluster (DescriptorId_t clusterId, const RPageSource::ColumnSet_t &columns)
 Returns the requested cluster either from the pool or, in case of a cache miss, lets the I/O thread load the cluster in the pool, blocks until done, and then returns it.
 
unsigned int GetWindowPost () const
 
unsigned int GetWindowPre () const
 
RClusterPooloperator= (const RClusterPool &other)=delete
 

Static Public Attributes

static constexpr unsigned int kDefaultPoolSize = 4
 

Private Member Functions

void ExecReadClusters ()
 The I/O thread routine, there is exactly one I/O thread in-flight for every cluster pool.
 
void ExecUnzipClusters ()
 The unzip thread routine which takes a loaded cluster and passes it to fPageSource.UnzipCluster (which might be a no-op if IMT is off).
 
size_t FindFreeSlot () const
 Returns an index of an unused element in fPool; callers of this function (GetCluster() and WaitFor()) make sure that a free slot actually exists.
 
RClusterFindInPool (DescriptorId_t clusterId) const
 Every cluster id has at most one corresponding RCluster pointer in the pool.
 
RClusterWaitFor (DescriptorId_t clusterId, const RPageSource::ColumnSet_t &columns)
 Returns the given cluster from the pool, which needs to contain at least the columns columns.
 

Private Attributes

std::condition_variable fCvHasReadWork
 Signals a non-empty I/O work queue.
 
std::condition_variable fCvHasUnzipWork
 Signals non-empty unzip work queue.
 
std::vector< RInFlightClusterfInFlightClusters
 The clusters that were handed off to the I/O thread.
 
std::mutex fLockUnzipQueue
 The lock associated with the fCvHasUnzipWork conditional variable.
 
std::mutex fLockWorkQueue
 Protects the shared state between the main thread and the pipeline threads, namely the read and unzip work queues and the in-flight clusters vector.
 
RPageSourcefPageSource
 Every cluster pool is responsible for exactly one page source that triggers loading of the clusters (GetCluster()) and is used for implementing the I/O and cluster memory allocation (PageSource::LoadCluster()).
 
std::vector< std::unique_ptr< RCluster > > fPool
 The cache of clusters around the currently active cluster.
 
std::queue< RReadItemfReadQueue
 The communication channel to the I/O thread.
 
std::thread fThreadIo
 The I/O thread calls RPageSource::LoadCluster() asynchronously.
 
std::thread fThreadUnzip
 The unzip thread takes a loaded cluster and passes it to fPageSource->UnzipCluster() on it.
 
std::queue< RUnzipItemfUnzipQueue
 The communication channel between the I/O thread and the unzip thread.
 
unsigned int fWindowPost
 The number of desired clusters in the pool, including the currently active cluster.
 
unsigned int fWindowPre
 The number of clusters before the currently active cluster that should stay in the pool if present.
 

Static Private Attributes

static constexpr unsigned int kWorkQueueLimit = 4
 Maximum number of queued cluster requests for the I/O thread. A single request can span mutliple clusters.
 

#include <ROOT/RClusterPool.hxx>

Constructor & Destructor Documentation

◆ RClusterPool() [1/3]

ROOT::Experimental::Detail::RClusterPool::RClusterPool ( RPageSource pageSource,
unsigned int  size 
)

Definition at line 50 of file RClusterPool.cxx.

◆ RClusterPool() [2/3]

ROOT::Experimental::Detail::RClusterPool::RClusterPool ( RPageSource pageSource)
inlineexplicit

Definition at line 145 of file RClusterPool.hxx.

◆ RClusterPool() [3/3]

ROOT::Experimental::Detail::RClusterPool::RClusterPool ( const RClusterPool other)
delete

◆ ~RClusterPool()

ROOT::Experimental::Detail::RClusterPool::~RClusterPool ( )

Definition at line 66 of file RClusterPool.cxx.

Member Function Documentation

◆ ExecReadClusters()

void ROOT::Experimental::Detail::RClusterPool::ExecReadClusters ( )
private

The I/O thread routine, there is exactly one I/O thread in-flight for every cluster pool.

Definition at line 110 of file RClusterPool.cxx.

◆ ExecUnzipClusters()

void ROOT::Experimental::Detail::RClusterPool::ExecUnzipClusters ( )
private

The unzip thread routine which takes a loaded cluster and passes it to fPageSource.UnzipCluster (which might be a no-op if IMT is off).

Marks the cluster as ready to be picked up by the main thread.

Definition at line 85 of file RClusterPool.cxx.

◆ FindFreeSlot()

size_t ROOT::Experimental::Detail::RClusterPool::FindFreeSlot ( ) const
private

Returns an index of an unused element in fPool; callers of this function (GetCluster() and WaitFor()) make sure that a free slot actually exists.

Definition at line 165 of file RClusterPool.cxx.

◆ FindInPool()

ROOT::Experimental::Detail::RCluster * ROOT::Experimental::Detail::RClusterPool::FindInPool ( DescriptorId_t  clusterId) const
private

Every cluster id has at most one corresponding RCluster pointer in the pool.

Definition at line 156 of file RClusterPool.cxx.

◆ GetCluster()

ROOT::Experimental::Detail::RCluster * ROOT::Experimental::Detail::RClusterPool::GetCluster ( DescriptorId_t  clusterId,
const RPageSource::ColumnSet_t columns 
)

Returns the requested cluster either from the pool or, in case of a cache miss, lets the I/O thread load the cluster in the pool, blocks until done, and then returns it.

Triggers along the way the background loading of the following fWindowPost number of clusters. The returned cluster has at least all the pages of columns and possibly pages of other columns, too. If implicit multi-threading is turned on, the uncompressed pages of the returned cluster are already pushed into the page pool associated with the page source upon return. The cluster remains valid until the next call to GetCluster().

Definition at line 220 of file RClusterPool.cxx.

◆ GetWindowPost()

unsigned int ROOT::Experimental::Detail::RClusterPool::GetWindowPost ( ) const
inline

Definition at line 151 of file RClusterPool.hxx.

◆ GetWindowPre()

unsigned int ROOT::Experimental::Detail::RClusterPool::GetWindowPre ( ) const
inline

Definition at line 150 of file RClusterPool.hxx.

◆ operator=()

RClusterPool & ROOT::Experimental::Detail::RClusterPool::operator= ( const RClusterPool other)
delete

◆ WaitFor()

ROOT::Experimental::Detail::RCluster * ROOT::Experimental::Detail::RClusterPool::WaitFor ( DescriptorId_t  clusterId,
const RPageSource::ColumnSet_t columns 
)
private

Returns the given cluster from the pool, which needs to contain at least the columns columns.

Executed at the end of GetCluster when all missing data pieces have been sent to the load queue. Ideally, the function returns without blocking if the cluster is already in the pool.

Definition at line 334 of file RClusterPool.cxx.

Member Data Documentation

◆ fCvHasReadWork

std::condition_variable ROOT::Experimental::Detail::RClusterPool::fCvHasReadWork
private

Signals a non-empty I/O work queue.

Definition at line 108 of file RClusterPool.hxx.

◆ fCvHasUnzipWork

std::condition_variable ROOT::Experimental::Detail::RClusterPool::fCvHasUnzipWork
private

Signals non-empty unzip work queue.

Definition at line 114 of file RClusterPool.hxx.

◆ fInFlightClusters

std::vector<RInFlightCluster> ROOT::Experimental::Detail::RClusterPool::fInFlightClusters
private

The clusters that were handed off to the I/O thread.

Definition at line 106 of file RClusterPool.hxx.

◆ fLockUnzipQueue

std::mutex ROOT::Experimental::Detail::RClusterPool::fLockUnzipQueue
private

The lock associated with the fCvHasUnzipWork conditional variable.

Definition at line 112 of file RClusterPool.hxx.

◆ fLockWorkQueue

std::mutex ROOT::Experimental::Detail::RClusterPool::fLockWorkQueue
private

Protects the shared state between the main thread and the pipeline threads, namely the read and unzip work queues and the in-flight clusters vector.

Definition at line 104 of file RClusterPool.hxx.

◆ fPageSource

RPageSource& ROOT::Experimental::Detail::RClusterPool::fPageSource
private

Every cluster pool is responsible for exactly one page source that triggers loading of the clusters (GetCluster()) and is used for implementing the I/O and cluster memory allocation (PageSource::LoadCluster()).

Definition at line 94 of file RClusterPool.hxx.

◆ fPool

std::vector<std::unique_ptr<RCluster> > ROOT::Experimental::Detail::RClusterPool::fPool
private

The cache of clusters around the currently active cluster.

Definition at line 100 of file RClusterPool.hxx.

◆ fReadQueue

std::queue<RReadItem> ROOT::Experimental::Detail::RClusterPool::fReadQueue
private

The communication channel to the I/O thread.

Definition at line 110 of file RClusterPool.hxx.

◆ fThreadIo

std::thread ROOT::Experimental::Detail::RClusterPool::fThreadIo
private

The I/O thread calls RPageSource::LoadCluster() asynchronously.

The thread is mostly waiting for the data to arrive (blocked by the kernel) and therefore can safely run in addition to the application main threads.

Definition at line 121 of file RClusterPool.hxx.

◆ fThreadUnzip

std::thread ROOT::Experimental::Detail::RClusterPool::fThreadUnzip
private

The unzip thread takes a loaded cluster and passes it to fPageSource->UnzipCluster() on it.

If implicit multi-threading is turned off, the UnzipCluster() call is a no-op. Otherwise, the UnzipCluster() call schedules the unzipping of pages using the application's task scheduler.

Definition at line 125 of file RClusterPool.hxx.

◆ fUnzipQueue

std::queue<RUnzipItem> ROOT::Experimental::Detail::RClusterPool::fUnzipQueue
private

The communication channel between the I/O thread and the unzip thread.

Definition at line 116 of file RClusterPool.hxx.

◆ fWindowPost

unsigned int ROOT::Experimental::Detail::RClusterPool::fWindowPost
private

The number of desired clusters in the pool, including the currently active cluster.

Definition at line 98 of file RClusterPool.hxx.

◆ fWindowPre

unsigned int ROOT::Experimental::Detail::RClusterPool::fWindowPre
private

The number of clusters before the currently active cluster that should stay in the pool if present.

Definition at line 96 of file RClusterPool.hxx.

◆ kDefaultPoolSize

constexpr unsigned int ROOT::Experimental::Detail::RClusterPool::kDefaultPoolSize = 4
staticconstexpr

Definition at line 143 of file RClusterPool.hxx.

◆ kWorkQueueLimit

constexpr unsigned int ROOT::Experimental::Detail::RClusterPool::kWorkQueueLimit = 4
staticconstexprprivate

Maximum number of queued cluster requests for the I/O thread. A single request can span mutliple clusters.

Definition at line 59 of file RClusterPool.hxx.

Libraries for ROOT::Experimental::Detail::RClusterPool:

The documentation for this class was generated from the following files: