ROOT Reference Guide
 
ROOT::Experimental::Detail::RPageSink Class Reference (abstract)

Abstract interface to write data into an ntuple.

The page sink first takes the list of columns, followed by a series of page commits and cluster commits. The user is responsible for committing clusters at a consistent point, i.e. when all pages corresponding to data up to the given entry number have been committed.
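The call order described above can be sketched with a toy model. This is not the ROOT class: ToySink and all of its members are hypothetical, and the consistency check inside CommitCluster is an assumption added for illustration only (per the description, keeping commits consistent is the caller's job). The toy also assumes one element per entry in every column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a page sink; not the ROOT class.
class ToySink {
public:
   // Register a column and return its handle (here simply an index).
   std::size_t AddColumn()
   {
      fElementsPerColumn.push_back(0);
      return fElementsPerColumn.size() - 1;
   }

   // Commit a page holding nElements values for a previously added column.
   void CommitPage(std::size_t column, std::uint64_t nElements)
   {
      assert(column < fElementsPerColumn.size()); // the column must have been added before
      fElementsPerColumn[column] += nElements;
   }

   // Close the cluster at the cumulative entry count nEntries.
   // Toy assumption: every entry contributes exactly one element per column,
   // so a consistent commit point means each column holds nEntries elements.
   void CommitCluster(std::uint64_t nEntries)
   {
      for (auto n : fElementsPerColumn) {
         if (n != nEntries)
            throw std::logic_error("cluster committed at an inconsistent point");
      }
      fClusterBoundaries.push_back(nEntries);
   }

   std::size_t NClusters() const { return fClusterBoundaries.size(); }

private:
   std::vector<std::uint64_t> fElementsPerColumn; // running element totals per column
   std::vector<std::uint64_t> fClusterBoundaries; // entry count at each cluster commit
};
```

In this sketch, committing ten elements to each of two columns and then calling CommitCluster(10) succeeds, while committing a page to only one column before the next cluster commit is rejected.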

Definition at line 177 of file RPageStorage.hxx.

Classes

struct  RCounters
 Default I/O performance counters that get registered in fMetrics.
 

Public Member Functions

 RPageSink (const RPageSink &)=delete
 
 RPageSink (RPageSink &&)=default
 
 RPageSink (std::string_view ntupleName, const RNTupleWriteOptions &options)
 
 ~RPageSink () override
 
ColumnHandle_t AddColumn (DescriptorId_t fieldId, const RColumn &column) final
 Register a new column.
 
std::uint64_t CommitCluster (NTupleSize_t nEntries)
 Finalize the current cluster and create a new one for the following data.
 
void CommitClusterGroup ()
 Write out the page locations (page list envelope) for all the committed clusters since the last call of CommitClusterGroup (or the beginning of writing).
 
void CommitDataset ()
 Finalize the current cluster and the entire data set.
 
void CommitPage (ColumnHandle_t columnHandle, const RPage &page)
 Write a page to the storage. The column must have been added before.
 
void CommitSealedPage (DescriptorId_t physicalColumnId, const RPageStorage::RSealedPage &sealedPage)
 Write a preprocessed page to storage. The column must have been added before.
 
void CommitSealedPageV (std::span< RPageStorage::RSealedPageGroup > ranges)
 Write a vector of preprocessed pages to storage. The corresponding columns must have been added before.
 
void Create (RNTupleModel &model)
 Physically creates the storage container to hold the ntuple (e.g., a key in a TFile or an S3 bucket). To do so, Create() calls CreateImpl() after updating the descriptor.
 
void DropColumn (ColumnHandle_t) final
 Unregisters a column.
 
RNTupleMetrics & GetMetrics () override
 Returns the default metrics object. Subclasses might alternatively provide their own metrics object by overriding this.
 
EPageStorageType GetType () final
 Whether the concrete implementation is a sink or a source.
 
const RNTupleWriteOptions & GetWriteOptions () const
 Returns the sink's write options.
 
RPageSink & operator= (const RPageSink &)=delete
 
RPageSink & operator= (RPageSink &&)=default
 
virtual RPage ReservePage (ColumnHandle_t columnHandle, std::size_t nElements)=0
 Get a new, empty page for the given column that can be filled with up to nElements.
 
virtual void UpdateSchema (const RNTupleModelChangeset &changeset, NTupleSize_t firstEntry)
 Incorporate incremental changes to the model into the ntuple descriptor.
 
- Public Member Functions inherited from ROOT::Experimental::Detail::RPageStorage
 RPageStorage (const RPageStorage &other)=delete
 
 RPageStorage (RPageStorage &&other)=default
 
 RPageStorage (std::string_view name)
 
virtual ~RPageStorage ()
 
const std::string & GetNTupleName () const
 Returns the NTuple name.
 
RPageStorage & operator= (const RPageStorage &other)=delete
 
RPageStorage & operator= (RPageStorage &&other)=default
 
virtual void ReleasePage (RPage &page)=0
 Every page store needs to be able to free pages it handed out.
 
void SetTaskScheduler (RTaskScheduler *taskScheduler)
 

Static Public Member Functions

static std::unique_ptr< RPageSink > Create (std::string_view ntupleName, std::string_view location, const RNTupleWriteOptions &options=RNTupleWriteOptions())
 Guess the concrete derived page sink from the file name (location).
 

Protected Member Functions

virtual RNTupleLocator CommitClusterGroupImpl (unsigned char *serializedPageList, std::uint32_t length)=0
 Returns the locator of the page list envelope of the given buffer that contains the serialized page list.
 
virtual std::uint64_t CommitClusterImpl (NTupleSize_t nEntries)=0
 Returns the number of bytes written to storage (excluding metadata)
 
virtual void CommitDatasetImpl (unsigned char *serializedFooter, std::uint32_t length)=0
 
virtual RNTupleLocator CommitPageImpl (ColumnHandle_t columnHandle, const RPage &page)=0
 
virtual RNTupleLocator CommitSealedPageImpl (DescriptorId_t physicalColumnId, const RPageStorage::RSealedPage &sealedPage)=0
 
virtual std::vector< RNTupleLocator > CommitSealedPageVImpl (std::span< RPageStorage::RSealedPageGroup > ranges)
 Vector commit of preprocessed pages.
 
virtual void CreateImpl (const RNTupleModel &model, unsigned char *serializedHeader, std::uint32_t length)=0
 
void EnableDefaultMetrics (const std::string &prefix)
 Enables the default set of metrics provided by RPageSink.
 
RSealedPage SealPage (const RPage &page, const RColumnElementBase &element, int compressionSetting)
 Helper for streaming a page.
 
- Protected Member Functions inherited from ROOT::Experimental::Detail::RPageStorage
void WaitForAllTasks ()
 

Static Protected Member Functions

static RSealedPage SealPage (const RPage &page, const RColumnElementBase &element, int compressionSetting, void *buf)
 Seal a page using the provided buffer.
 

Protected Attributes

std::unique_ptr< RNTupleCompressor > fCompressor
 Helper to zip pages and header/footer; includes a 16MB (kMAXZIPBUF) zip buffer.
 
std::unique_ptr< RCounters > fCounters
 
RNTupleDescriptorBuilder fDescriptorBuilder
 
RNTupleMetrics fMetrics
 
std::uint64_t fNextClusterInGroup = 0
 Remembers the starting cluster id for the next cluster group.
 
std::vector< RClusterDescriptor::RColumnRange > fOpenColumnRanges
 Keeps track of the number of elements in the currently open cluster. Indexed by column id.
 
std::vector< RClusterDescriptor::RPageRange > fOpenPageRanges
 Keeps track of the written pages in the currently open cluster. Indexed by column id.
 
std::unique_ptr< RNTupleWriteOptions > fOptions
 
NTupleSize_t fPrevClusterNEntries = 0
 Used to calculate the number of entries in the current cluster.
 
- Protected Attributes inherited from ROOT::Experimental::Detail::RPageStorage
std::string fNTupleName
 
RTaskScheduler * fTaskScheduler = nullptr
 

Private Attributes

Internal::RNTupleSerializer::RContext fSerializationContext
 Used to map the IDs of the descriptor to the physical IDs issued during header/footer serialization.
 

Additional Inherited Members

- Public Types inherited from ROOT::Experimental::Detail::RPageStorage
using ColumnHandle_t = RColumnHandle
 The column handle identifies a column with the current open page storage.
 
using SealedPageSequence_t = std::deque< RSealedPage >
 

#include <ROOT/RPageStorage.hxx>

[Inheritance diagram for ROOT::Experimental::Detail::RPageSink]

Constructor & Destructor Documentation

◆ RPageSink() [1/3]

ROOT::Experimental::Detail::RPageSink::RPageSink ( std::string_view  ntupleName,
const RNTupleWriteOptions & options 
)

Definition at line 297 of file RPageStorage.cxx.

◆ RPageSink() [2/3]

ROOT::Experimental::Detail::RPageSink::RPageSink ( const RPageSink & )
delete

◆ RPageSink() [3/3]

ROOT::Experimental::Detail::RPageSink::RPageSink ( RPageSink &&  )
default

◆ ~RPageSink()

ROOT::Experimental::Detail::RPageSink::~RPageSink ( )
override

Definition at line 302 of file RPageStorage.cxx.

Member Function Documentation

◆ AddColumn()

ROOT::Experimental::Detail::RPageStorage::ColumnHandle_t ROOT::Experimental::Detail::RPageSink::AddColumn ( DescriptorId_t  fieldId,
const RColumn & column 
)
final virtual

Register a new column.

When reading, the column must exist in the ntuple on disk corresponding to the meta-data. When writing, every column can only be attached once.

Implements ROOT::Experimental::Detail::RPageStorage.

Definition at line 332 of file RPageStorage.cxx.

◆ CommitCluster()

std::uint64_t ROOT::Experimental::Detail::RPageSink::CommitCluster ( NTupleSize_t  nEntries)

Finalize the current cluster and create a new one for the following data.

Returns the number of bytes written to storage (excluding meta-data).

Definition at line 473 of file RPageStorage.cxx.

◆ CommitClusterGroup()

void ROOT::Experimental::Detail::RPageSink::CommitClusterGroup ( )

Write out the page locations (page list envelope) for all the committed clusters since the last call of CommitClusterGroup (or the beginning of writing).

Definition at line 495 of file RPageStorage.cxx.

◆ CommitClusterGroupImpl()

virtual RNTupleLocator ROOT::Experimental::Detail::RPageSink::CommitClusterGroupImpl ( unsigned char *  serializedPageList,
std::uint32_t  length 
)
protected pure virtual

Returns the locator of the page list envelope of the given buffer that contains the serialized page list.

Typically, the implementation takes care of compressing and writing the provided buffer.

Implemented in ROOT::Experimental::Detail::RPageSinkBuf, ROOT::Experimental::Detail::RPageSinkDaos, and ROOT::Experimental::Detail::RPageSinkFile.

◆ CommitClusterImpl()

virtual std::uint64_t ROOT::Experimental::Detail::RPageSink::CommitClusterImpl ( NTupleSize_t  nEntries)
protected pure virtual

Returns the number of bytes written to storage (excluding metadata)

Implemented in ROOT::Experimental::Detail::RPageSinkBuf, ROOT::Experimental::Detail::RPageSinkDaos, and ROOT::Experimental::Detail::RPageSinkFile.

◆ CommitDataset()

void ROOT::Experimental::Detail::RPageSink::CommitDataset ( )

Finalize the current cluster and the entire data set.

Definition at line 524 of file RPageStorage.cxx.

◆ CommitDatasetImpl()

virtual void ROOT::Experimental::Detail::RPageSink::CommitDatasetImpl ( unsigned char *  serializedFooter,
std::uint32_t  length 
)
protected pure virtual

◆ CommitPage()

void ROOT::Experimental::Detail::RPageSink::CommitPage ( ColumnHandle_t  columnHandle,
const RPage & page 
)

Write a page to the storage. The column must have been added before.

Definition at line 423 of file RPageStorage.cxx.

◆ CommitPageImpl()

virtual RNTupleLocator ROOT::Experimental::Detail::RPageSink::CommitPageImpl ( ColumnHandle_t  columnHandle,
const RPage & page 
)
protected pure virtual

◆ CommitSealedPage()

void ROOT::Experimental::Detail::RPageSink::CommitSealedPage ( DescriptorId_t  physicalColumnId,
const RPageStorage::RSealedPage & sealedPage 
)

Write a preprocessed page to storage. The column must have been added before.

Definition at line 433 of file RPageStorage.cxx.

◆ CommitSealedPageImpl()

virtual RNTupleLocator ROOT::Experimental::Detail::RPageSink::CommitSealedPageImpl ( DescriptorId_t  physicalColumnId,
const RPageStorage::RSealedPage & sealedPage 
)
protected pure virtual

◆ CommitSealedPageV()

void ROOT::Experimental::Detail::RPageSink::CommitSealedPageV ( std::span< RPageStorage::RSealedPageGroup > ranges )

Write a vector of preprocessed pages to storage. The corresponding columns must have been added before.

Definition at line 456 of file RPageStorage.cxx.

◆ CommitSealedPageVImpl()

std::vector< ROOT::Experimental::RNTupleLocator > ROOT::Experimental::Detail::RPageSink::CommitSealedPageVImpl ( std::span< RPageStorage::RSealedPageGroup > ranges )
protected virtual

Vector commit of preprocessed pages.

The ranges array specifies a range of sealed pages to be committed for each column. The returned vector contains, in order, the RNTupleLocator for each page on each range in ranges, i.e. the first N entries refer to the N pages in ranges[0], followed by M entries that refer to the M pages in ranges[1], etc. The default is to call CommitSealedPageImpl for each page; derived classes may provide an optimized implementation though.

Reimplemented in ROOT::Experimental::Detail::RPageSinkDaos.

Definition at line 446 of file RPageStorage.cxx.
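The locator ordering described for CommitSealedPageVImpl can be illustrated with a small stand-alone sketch. All types and functions here (ToySealedPage, ToyPageGroup, ToyLocator, CommitOne, CommitMany) are hypothetical stand-ins, and the offset-based locator is an assumption; only the flattening order follows the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for a sealed page, a per-column page group, and a locator.
struct ToySealedPage {
   std::uint32_t fSize;
};
struct ToyPageGroup {
   std::uint64_t fColumnId;
   std::vector<ToySealedPage> fPages;
};
struct ToyLocator {
   std::uint64_t fColumnId;
   std::uint64_t fOffset;
};

// Single-page commit: "writes" the page at the current offset and advances it.
ToyLocator CommitOne(std::uint64_t columnId, const ToySealedPage &page, std::uint64_t &offset)
{
   assert(page.fSize > 0); // toy precondition: empty pages are not committed
   ToyLocator locator{columnId, offset};
   offset += page.fSize;
   return locator;
}

// Vector commit: falls back to the single-page commit for every page and
// returns one locator per page, group by group, in the order of `ranges`.
std::vector<ToyLocator> CommitMany(const std::vector<ToyPageGroup> &ranges)
{
   std::vector<ToyLocator> locators;
   std::uint64_t offset = 0;
   for (const auto &group : ranges) {
      for (const auto &page : group.fPages)
         locators.push_back(CommitOne(group.fColumnId, page, offset));
   }
   return locators;
}
```

With two pages in ranges[0] and one page in ranges[1], the result holds three locators: the first two belong to ranges[0] and the third to ranges[1]. A backend that can batch writes would replace the inner loop while keeping the same output order.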

◆ Create() [1/2]

void ROOT::Experimental::Detail::RPageSink::Create ( RNTupleModel & model )

Physically creates the storage container to hold the ntuple (e.g., a key in a TFile or an S3 bucket). To do so, Create() calls CreateImpl() after updating the descriptor.

Create() associates column handles with the columns referenced by the model.

Definition at line 398 of file RPageStorage.cxx.
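The split between the public Create() and the protected CreateImpl() follows the familiar pattern of a non-virtual public method that does the common bookkeeping and then delegates to a backend-specific virtual. A minimal sketch of that pattern, with hypothetical classes (ToySinkBase, ToyMemorySink) and a made-up string "header" standing in for the serialized descriptor:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical base class illustrating the public-method / Impl split.
class ToySinkBase {
public:
   virtual ~ToySinkBase() = default;

   // Public entry point: performs the common work, then delegates to the backend.
   void Create(const std::vector<std::string> &fieldNames)
   {
      assert(!fieldNames.empty()); // toy precondition: the model has at least one field
      // Stand-in for updating the descriptor and serializing the header.
      std::string header;
      for (const auto &name : fieldNames) {
         header += name;
         header += ';';
      }
      CreateImpl(header);
   }

protected:
   // Backend-specific step: place the serialized header into storage.
   virtual void CreateImpl(const std::string &serializedHeader) = 0;
};

// Hypothetical backend that keeps the header in memory.
class ToyMemorySink : public ToySinkBase {
public:
   const std::string &StoredHeader() const { return fStored; }

protected:
   void CreateImpl(const std::string &serializedHeader) override { fStored = serializedHeader; }

private:
   std::string fStored;
};
```

The same arrangement appears in the member list for the commit functions, e.g. CommitCluster() with CommitClusterImpl() and CommitDataset() with CommitDatasetImpl().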

◆ Create() [2/2]

std::unique_ptr< ROOT::Experimental::Detail::RPageSink > ROOT::Experimental::Detail::RPageSink::Create ( std::string_view  ntupleName,
std::string_view  location,
const RNTupleWriteOptions & options = RNTupleWriteOptions() 
)
static

Guess the concrete derived page sink from the file name (location).

Definition at line 306 of file RPageStorage.cxx.
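A factory that picks a concrete sink from the location string could look like the following sketch. The prefix rule ("daos://" selects an object-store backend, anything else a file backend) is an assumption made for illustration, as are the names ToySink, ToyFileSink, ToyObjectStoreSink, and MakeSink; consult RPageStorage.cxx for the actual dispatch logic.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <string_view>

// Hypothetical sink hierarchy; Kind() only exists to make the choice observable.
struct ToySink {
   virtual ~ToySink() = default;
   virtual std::string Kind() const = 0;
};
struct ToyFileSink : ToySink {
   std::string Kind() const override { return "file"; }
};
struct ToyObjectStoreSink : ToySink {
   std::string Kind() const override { return "object-store"; }
};

// Choose the backend from the location string (assumed prefix convention).
std::unique_ptr<ToySink> MakeSink(std::string_view location)
{
   assert(!location.empty()); // toy precondition
   constexpr std::string_view kObjectStorePrefix = "daos://";
   if (location.substr(0, kObjectStorePrefix.size()) == kObjectStorePrefix)
      return std::make_unique<ToyObjectStoreSink>();
   return std::make_unique<ToyFileSink>();
}
```

Returning std::unique_ptr to the base type mirrors the signature of the static Create() above: the caller owns the sink and only sees the abstract interface.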

◆ CreateImpl()

virtual void ROOT::Experimental::Detail::RPageSink::CreateImpl ( const RNTupleModel & model,
unsigned char *  serializedHeader,
std::uint32_t  length 
)
protected pure virtual

◆ DropColumn()

void ROOT::Experimental::Detail::RPageSink::DropColumn ( ColumnHandle_t  columnHandle)
inline final virtual

Unregisters a column.

A page source decreases the reference counter for the corresponding active column. For a page sink, dropping columns is currently a no-op.

Implements ROOT::Experimental::Detail::RPageStorage.

Definition at line 270 of file RPageStorage.hxx.

◆ EnableDefaultMetrics()

void ROOT::Experimental::Detail::RPageSink::EnableDefaultMetrics ( const std::string &  prefix)
protected

Enables the default set of metrics provided by RPageSink.

prefix will be used as the prefix for the counters registered in the internal RNTupleMetrics object. This set of counters can be extended by a subclass by calling fMetrics.MakeCounter<...>().

A subclass using the default set of metrics is always responsible for updating the counters appropriately, e.g. by calling fCounters->fNPageCommited.Inc().

Alternatively, a subclass might provide its own RNTupleMetrics object by overriding the GetMetrics() member function.

Definition at line 572 of file RPageStorage.cxx.
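The interplay of prefix, default counters, and subclass responsibility can be sketched as follows. ToyMetrics and ToySink are hypothetical and much simpler than RNTupleMetrics; the counter name "nPageCommitted" and the map-based registry are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical metrics registry: counters are looked up by their full name.
class ToyMetrics {
public:
   // Registers a counter (initialized to zero) and returns a reference to it.
   // References to std::map elements stay valid when further counters are added.
   std::uint64_t &MakeCounter(const std::string &name) { return fCounters[name]; }

   bool Has(const std::string &name) const { return fCounters.count(name) > 0; }
   std::uint64_t Get(const std::string &name) const { return fCounters.at(name); }

private:
   std::map<std::string, std::uint64_t> fCounters;
};

// Hypothetical sink that opts in to a default counter and updates it itself.
class ToySink {
public:
   void EnableDefaultMetrics(const std::string &prefix)
   {
      assert(fNPageCommitted == nullptr); // enable the default set only once
      fNPageCommitted = &fMetrics.MakeCounter(prefix + "nPageCommitted");
   }

   void CommitPage()
   {
      // The sink, not the metrics object, is responsible for the update.
      if (fNPageCommitted)
         ++(*fNPageCommitted);
   }

   const ToyMetrics &GetMetrics() const { return fMetrics; }

private:
   ToyMetrics fMetrics;
   std::uint64_t *fNPageCommitted = nullptr; // stays null until metrics are enabled
};
```

If the sink never increments the counter in its commit path, the metric stays at zero even though it is registered, which is why the responsibility lies with the subclass.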

◆ GetMetrics()

RNTupleMetrics & ROOT::Experimental::Detail::RPageSink::GetMetrics ( )
inline override virtual

Returns the default metrics object. Subclasses might alternatively provide their own metrics object by overriding this.

Implements ROOT::Experimental::Detail::RPageStorage.

Reimplemented in ROOT::Experimental::Detail::RPageSinkBuf.

Definition at line 301 of file RPageStorage.hxx.

◆ GetType()

EPageStorageType ROOT::Experimental::Detail::RPageSink::GetType ( )
inline final virtual

Whether the concrete implementation is a sink or a source.

Implements ROOT::Experimental::Detail::RPageStorage.

Definition at line 265 of file RPageStorage.hxx.

◆ GetWriteOptions()

const RNTupleWriteOptions & ROOT::Experimental::Detail::RPageSink::GetWriteOptions ( ) const
inline

Returns the sink's write options.

Definition at line 267 of file RPageStorage.hxx.

◆ operator=() [1/2]

RPageSink & ROOT::Experimental::Detail::RPageSink::operator= ( const RPageSink & )
delete

◆ operator=() [2/2]

RPageSink & ROOT::Experimental::Detail::RPageSink::operator= ( RPageSink &&  )
default

◆ ReservePage()

virtual RPage ROOT::Experimental::Detail::RPageSink::ReservePage ( ColumnHandle_t  columnHandle,
std::size_t  nElements 
)
pure virtual

Get a new, empty page for the given column that can be filled with up to nElements.

If nElements is zero, the page sink picks an appropriate size.

Implemented in ROOT::Experimental::Detail::RPageSinkBuf, ROOT::Experimental::Detail::RPageSinkDaos, and ROOT::Experimental::Detail::RPageSinkFile.

◆ SealPage() [1/2]

ROOT::Experimental::Detail::RPageStorage::RSealedPage ROOT::Experimental::Detail::RPageSink::SealPage ( const RPage & page,
const RColumnElementBase & element,
int  compressionSetting 
)
protected

Helper for streaming a page.

This is commonly used in derived, concrete page sinks. Note that if compressionSetting is 0 (uncompressed) and the page is mappable, the returned sealed page will point directly to the input page buffer. Otherwise, the sealed page references an internal buffer of fCompressor. Thus, the buffer pointed to by the RSealedPage should never be freed. Usage of this method requires construction of fCompressor.

Definition at line 565 of file RPageStorage.cxx.
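The ownership rule for the sealed page buffer can be made concrete with a toy sealer. ToySealedPage and ToySealer are hypothetical, the "compression" is a plain copy used as a placeholder, and the mappable-page condition is ignored; the sketch only shows that the returned view never owns its memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view of a sealed page.
struct ToySealedPage {
   const unsigned char *fBuffer;
   std::size_t fSize;
};

// Hypothetical sealer with an internal scratch buffer, standing in for the compressor.
class ToySealer {
public:
   ToySealedPage Seal(const std::vector<unsigned char> &page, bool compress)
   {
      assert(!page.empty()); // toy precondition
      if (!compress) {
         // Uncompressed: the view aliases the caller's page buffer directly.
         return ToySealedPage{page.data(), page.size()};
      }
      // "Compressed": the view points into the sealer's own scratch buffer.
      // A plain copy is used here as a placeholder for real compression.
      fScratch.assign(page.begin(), page.end());
      return ToySealedPage{fScratch.data(), fScratch.size()};
   }

private:
   std::vector<unsigned char> fScratch; // overwritten by every compressing call
};
```

In both branches the caller must not free fBuffer. In the compressed branch the view additionally becomes stale as soon as the sealer handles the next page, so it should be written out before sealing another one.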

◆ SealPage() [2/2]

ROOT::Experimental::Detail::RPageStorage::RSealedPage ROOT::Experimental::Detail::RPageSink::SealPage ( const RPage & page,
const RColumnElementBase & element,
int  compressionSetting,
void *  buf 
)
static protected

Seal a page using the provided buffer.

Definition at line 536 of file RPageStorage.cxx.

◆ UpdateSchema()

void ROOT::Experimental::Detail::RPageSink::UpdateSchema ( const RNTupleModelChangeset & changeset,
NTupleSize_t  firstEntry 
)
virtual

Incorporate incremental changes to the model into the ntuple descriptor.

This happens, e.g. if new fields were added after the initial call to RPageSink::Create(RNTupleModel &). firstEntry specifies the global index for the first stored element in the added columns.

Reimplemented in ROOT::Experimental::Detail::RPageSinkBuf.

Definition at line 340 of file RPageStorage.cxx.

Member Data Documentation

◆ fCompressor

std::unique_ptr<RNTupleCompressor> ROOT::Experimental::Detail::RPageSink::fCompressor
protected

Helper to zip pages and header/footer; includes a 16MB (kMAXZIPBUF) zip buffer.

There could be concrete page sinks that don't need a compressor. Therefore, and in order to stay consistent with the page source, we leave it up to the derived class whether or not the compressor gets constructed.

Definition at line 201 of file RPageStorage.hxx.

◆ fCounters

std::unique_ptr<RCounters> ROOT::Experimental::Detail::RPageSink::fCounters
protected

Definition at line 193 of file RPageStorage.hxx.

◆ fDescriptorBuilder

RNTupleDescriptorBuilder ROOT::Experimental::Detail::RPageSink::fDescriptorBuilder
protected

Definition at line 211 of file RPageStorage.hxx.

◆ fMetrics

RNTupleMetrics ROOT::Experimental::Detail::RPageSink::fMetrics
protected

Definition at line 194 of file RPageStorage.hxx.

◆ fNextClusterInGroup

std::uint64_t ROOT::Experimental::Detail::RPageSink::fNextClusterInGroup = 0
protected

Remembers the starting cluster id for the next cluster group.

Definition at line 204 of file RPageStorage.hxx.

◆ fOpenColumnRanges

std::vector<RClusterDescriptor::RColumnRange> ROOT::Experimental::Detail::RPageSink::fOpenColumnRanges
protected

Keeps track of the number of elements in the currently open cluster. Indexed by column id.

Definition at line 208 of file RPageStorage.hxx.

◆ fOpenPageRanges

std::vector<RClusterDescriptor::RPageRange> ROOT::Experimental::Detail::RPageSink::fOpenPageRanges
protected

Keeps track of the written pages in the currently open cluster. Indexed by column id.

Definition at line 210 of file RPageStorage.hxx.

◆ fOptions

std::unique_ptr<RNTupleWriteOptions> ROOT::Experimental::Detail::RPageSink::fOptions
protected

Definition at line 196 of file RPageStorage.hxx.

◆ fPrevClusterNEntries

NTupleSize_t ROOT::Experimental::Detail::RPageSink::fPrevClusterNEntries = 0
protected

Used to calculate the number of entries in the current cluster.

Definition at line 206 of file RPageStorage.hxx.
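A minimal sketch of how such a member can be used, assuming that the nEntries argument of CommitCluster is the cumulative entry count: the size of the current cluster is the difference to the value remembered from the previous commit. ToyClusterTracker is hypothetical, and unlike the real CommitCluster (which returns bytes written) it returns the per-cluster entry count to make the arithmetic visible.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tracker; not the ROOT class.
class ToyClusterTracker {
public:
   // nEntries is assumed to be the total number of entries written so far.
   // Returns the number of entries that fall into the cluster being closed.
   std::uint64_t CommitCluster(std::uint64_t nEntries)
   {
      assert(nEntries >= fPrevClusterNEntries); // entry counts never decrease
      const std::uint64_t nEntriesInCluster = nEntries - fPrevClusterNEntries;
      fPrevClusterNEntries = nEntries;
      return nEntriesInCluster;
   }

private:
   std::uint64_t fPrevClusterNEntries = 0; // cumulative count at the previous commit
};
```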

◆ fSerializationContext

Internal::RNTupleSerializer::RContext ROOT::Experimental::Detail::RPageSink::fSerializationContext
private

Used to map the IDs of the descriptor to the physical IDs issued during header/footer serialization.

Definition at line 180 of file RPageStorage.hxx.

[Library dependency diagram for ROOT::Experimental::Detail::RPageSink]

The documentation for this class was generated from the following files: