ROOT Reference Guide
ROOT::Internal::RPageSinkBuf Class Reference

Wrapper sink that coalesces cluster column page writes.

Definition at line 39 of file RPageSinkBuf.hxx.
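
The wrapper idea can be illustrated with a standalone sketch. It is not ROOT code: `MockSink`, `RecordingSink`, `BufferingSink` and `DemoCoalescedOrder` are hypothetical names, and the sketch assumes one plausible reading of "coalescing", namely that pages are held back per column and forwarded to the inner sink column by column when the cluster is committed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a page sink; not the ROOT RPageSink interface.
struct MockSink {
   virtual ~MockSink() = default;
   virtual void CommitPage(std::size_t columnId, std::string payload) = 0;
   virtual void CommitCluster() = 0;
};

// Inner sink that records the order in which pages reach "storage".
struct RecordingSink final : MockSink {
   explicit RecordingSink(std::vector<std::string> &log) : fLog(log) {}
   void CommitPage(std::size_t columnId, std::string payload) final
   {
      fLog.push_back("col" + std::to_string(columnId) + ":" + payload);
   }
   void CommitCluster() final { fLog.push_back("cluster"); }
   std::vector<std::string> &fLog;
};

// Wrapper that buffers pages per column and forwards them grouped by column.
class BufferingSink final : public MockSink {
   std::unique_ptr<MockSink> fInner;
   std::vector<std::vector<std::string>> fBuffered; // indexed by column id

public:
   explicit BufferingSink(std::unique_ptr<MockSink> inner) : fInner(std::move(inner))
   {
      assert(fInner && "the wrapper needs an inner sink");
   }
   void CommitPage(std::size_t columnId, std::string payload) final
   {
      if (columnId >= fBuffered.size())
         fBuffered.resize(columnId + 1);
      fBuffered[columnId].push_back(std::move(payload));
   }
   void CommitCluster() final
   {
      for (std::size_t col = 0; col < fBuffered.size(); ++col) {
         for (auto &page : fBuffered[col])
            fInner->CommitPage(col, std::move(page));
         fBuffered[col].clear();
      }
      fInner->CommitCluster();
   }
};

// Interleaves writes to two columns and returns what the inner sink observed.
std::vector<std::string> DemoCoalescedOrder()
{
   std::vector<std::string> log;
   BufferingSink sink(std::make_unique<RecordingSink>(log));
   sink.CommitPage(0, "a");
   sink.CommitPage(1, "x");
   sink.CommitPage(0, "b");
   sink.CommitCluster();
   return log;
}
```

In this sketch the inner sink sees both pages of column 0 before the page of column 1, although the caller interleaved them. The real class additionally seals pages and tracks memory usage, which the sketch leaves out.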

Classes

class  RColumnBuf
 A buffered column.
struct  RCounters
 I/O performance counters that get registered in fMetrics.

Public Types

using Callback_t = std::function<void(RPageSink &)>
using ColumnHandle_t = RColumnHandle
 The column handle identifies a column with the current open page storage.
using SealedPageSequence_t = std::deque<RSealedPage>

Public Member Functions

 RPageSinkBuf (const RPageSinkBuf &)=delete
 RPageSinkBuf (RPageSinkBuf &&)=delete
 RPageSinkBuf (std::unique_ptr< RPageSink > inner)
 ~RPageSinkBuf () override
ColumnHandle_t AddColumn (ROOT::DescriptorId_t fieldId, RColumn &column) final
 Register a new column.
std::unique_ptr< RPageSink > CloneAsHidden (std::string_view name, const RNTupleWriteOptions &opts) const final
 Creates a new sink with the same underlying storage as this but writing to a different RNTuple named name.
void CommitAttributeSet (std::string_view attrSetName, const RNTupleLink &attrAnchorInfo) final
 Adds the given anchor information (name + locator) into the main RNTuple's descriptor as an attribute set linked to it with the given name.
std::uint64_t CommitCluster (ROOT::NTupleSize_t nNewEntries) final
 Finalize the current cluster and create a new one for the following data.
void CommitClusterGroup () final
 Write out the page locations (page list envelope) for all the committed clusters since the last call of CommitClusterGroup (or the beginning of writing).
RNTupleLink CommitDataset ()
 Run the registered callbacks and finalize the current cluster and the entire data set.
RNTupleLink CommitDatasetImpl () final
void CommitPage (ColumnHandle_t columnHandle, const RPage &page) final
 Write a page to the storage. The column must have been added before.
void CommitSealedPage (ROOT::DescriptorId_t physicalColumnId, const RSealedPage &sealedPage) final
 Write a preprocessed page to storage. The column must have been added before.
void CommitSealedPageV (std::span< RPageStorage::RSealedPageGroup > ranges) final
 Write a vector of preprocessed pages to storage. The corresponding columns must have been added before.
void CommitStagedClusters (std::span< RStagedCluster > clusters) final
 Commit staged clusters, logically appending them to the ntuple descriptor.
void CommitSuppressedColumn (ColumnHandle_t columnHandle) final
 Commits a suppressed column for the current cluster.
void DropColumn (ColumnHandle_t) final
 Unregisters a column.
ROOT::DescriptorId_t GetColumnId (ColumnHandle_t columnHandle) const
const ROOT::RNTupleDescriptor & GetDescriptor () const final
 Return the RNTupleDescriptor being constructed.
virtual ROOT::Experimental::Detail::RNTupleMetrics & GetMetrics ()
 Returns the default metrics object.
ROOT::NTupleSize_t GetNEntries () const final
const std::string & GetNTupleName () const
 Returns the NTuple name.
virtual RSinkGuard GetSinkGuard ()
EPageStorageType GetType () final
 Whether the concrete implementation is a sink or a source.
const ROOT::RNTupleWriteOptions & GetWriteOptions () const
 Returns the sink's write options.
void Init (RNTupleModel &model)
 Physically creates the storage container to hold the ntuple (e.g., a key in a TFile or an S3 bucket). Init() associates column handles with the columns referenced by the model.
void InitImpl (ROOT::RNTupleModel &model) final
bool IsInitialized () const
RPageSinkBuf & operator= (const RPageSinkBuf &)=delete
RPageSinkBuf & operator= (RPageSinkBuf &&)=delete
void RegisterOnCommitDatasetCallback (Callback_t callback)
 The registered callback is executed at the beginning of CommitDataset().
RPage ReservePage (ColumnHandle_t columnHandle, std::size_t nElements) final
 Get a new, empty page for the given column that can be filled with up to nElements; nElements must be larger than zero.
void SetTaskScheduler (RTaskScheduler *taskScheduler)
RStagedCluster StageCluster (ROOT::NTupleSize_t nNewEntries) final
 Stage the current cluster and create a new one for the following data.
void UpdateExtraTypeInfo (const ROOT::RExtraTypeInfoDescriptor &extraTypeInfo) final
 Adds an extra type information record to schema.
void UpdateSchema (const RNTupleModelChangeset &changeset, ROOT::NTupleSize_t firstEntry) final
 Incorporate incremental changes to the model into the ntuple descriptor.

Static Public Member Functions

static RSealedPage SealPage (const RSealPageConfig &config)
 Seal a page using the provided info.

Static Public Attributes

static constexpr std::size_t kNBytesPageChecksum = sizeof(std::uint64_t)
 The page checksum is a 64-bit xxhash3.

Protected Member Functions

RSealedPage SealPage (const ROOT::Internal::RPage &page, const ROOT::Internal::RColumnElementBase &element)
 Helper for streaming a page.
void WaitForAllTasks ()

Protected Attributes

bool fIsInitialized = false
 Flag if sink was initialized.
ROOT::Experimental::Detail::RNTupleMetrics fMetrics
std::string fNTupleName
std::unique_ptr< ROOT::RNTupleWriteOptions > fOptions
std::unique_ptr< ROOT::Internal::RPageAllocator > fPageAllocator
 For the time being, we will use the heap allocator for all sources and sinks. This may change in the future.
RTaskScheduler * fTaskScheduler = nullptr

Private Member Functions

void ConnectFields (const std::vector< ROOT::RFieldBase * > &fields, ROOT::NTupleSize_t firstEntry)
void FlushClusterImpl (std::function< void(void)> FlushClusterFn)

Private Attributes

std::vector< RColumnBuf > fBufferedColumns
 Vector of buffered column pages. Indexed by column id.
std::atomic< std::size_t > fBufferedUncompressed = 0
 The sum of uncompressed bytes in buffered pages. Used to heuristically reduce the memory usage.
std::unique_ptr< RCounters > fCounters
std::unique_ptr< ROOT::RNTupleModel > fInnerModel
 The buffered page sink maintains a copy of the RNTupleModel for the inner sink.
std::unique_ptr< RPageSink > fInnerSink
 The inner sink, responsible for actually performing I/O.
ROOT::DescriptorId_t fNColumns = 0
ROOT::DescriptorId_t fNFields = 0
std::vector< Callback_t > fOnDatasetCommitCallbacks
std::vector< unsigned char > fSealPageBuffer
 Used as destination buffer in the simple SealPage overload.
std::vector< ColumnHandle_t > fSuppressedColumns
 Columns committed as suppressed are stored and passed to the inner sink at cluster commit.
RWritePageMemoryManager fWritePageMemoryManager
 Used in ReservePage to maintain the page buffer budget.

#include <ROOT/RPageSinkBuf.hxx>

Inheritance diagram for ROOT::Internal::RPageSinkBuf:
ROOT::Internal::RPageSinkBuf inherits ROOT::Internal::RPageSink, which inherits ROOT::Internal::RPageStorage.

Member Typedef Documentation

◆ Callback_t

using ROOT::Internal::RPageSink::Callback_t = std::function<void(RPageSink &)>
inherited

Definition at line 258 of file RPageStorage.hxx.

◆ ColumnHandle_t

The column handle identifies a column with the current open page storage.

Definition at line 180 of file RPageStorage.hxx.

◆ SealedPageSequence_t

Definition at line 130 of file RPageStorage.hxx.

Constructor & Destructor Documentation

◆ RPageSinkBuf() [1/3]

ROOT::Internal::RPageSinkBuf::RPageSinkBuf ( std::unique_ptr< RPageSink > inner)
explicit

Definition at line 40 of file RPageSinkBuf.cxx.

◆ RPageSinkBuf() [2/3]

ROOT::Internal::RPageSinkBuf::RPageSinkBuf ( const RPageSinkBuf & )
delete

◆ RPageSinkBuf() [3/3]

ROOT::Internal::RPageSinkBuf::RPageSinkBuf ( RPageSinkBuf && )
delete

◆ ~RPageSinkBuf()

ROOT::Internal::RPageSinkBuf::~RPageSinkBuf ( )
override

Definition at line 56 of file RPageSinkBuf.cxx.

Member Function Documentation

◆ AddColumn()

ROOT::Internal::RPageStorage::ColumnHandle_t ROOT::Internal::RPageSinkBuf::AddColumn ( ROOT::DescriptorId_t fieldId,
RColumn & column )
final virtual

Register a new column.

When reading, the column must exist in the ntuple on disk corresponding to the metadata. When writing, every column can only be attached once.

Implements ROOT::Internal::RPageStorage.

Definition at line 65 of file RPageSinkBuf.cxx.

◆ CloneAsHidden()

std::unique_ptr< ROOT::Internal::RPageSink > ROOT::Internal::RPageSinkBuf::CloneAsHidden ( std::string_view name,
const RNTupleWriteOptions & opts ) const
final virtual

Creates a new sink with the same underlying storage as this but writing to a different RNTuple named name.

Only one of the two sinks can safely write at the same time. The RNTuple written by this cloned sink will be stored in a hidden key (this is a convenient assumption we make now since this method is only used to create attribute RNTuples).

Implements ROOT::Internal::RPageSink.

Definition at line 325 of file RPageSinkBuf.cxx.

◆ CommitAttributeSet()

void ROOT::Internal::RPageSinkBuf::CommitAttributeSet ( std::string_view attrSetName,
const RNTupleLink & attrAnchorInfo )
final virtual

Adds the given anchor information (name + locator) into the main RNTuple's descriptor as an attribute set linked to it with the given name.

The attribute set must have already been written to storage via RNTupleAttrSetWriter::Commit(). Note that, by RNTuple specs, this is only legal to call on a non-attribute RNTuple's sink.

Implements ROOT::Internal::RPageSink.

Definition at line 330 of file RPageSinkBuf.cxx.

◆ CommitCluster()

std::uint64_t ROOT::Internal::RPageSinkBuf::CommitCluster ( ROOT::NTupleSize_t nNewEntries)
final virtual

Finalize the current cluster and create a new one for the following data.

Returns the number of bytes written to storage (excluding metadata).

Reimplemented from ROOT::Internal::RPageSink.

Definition at line 284 of file RPageSinkBuf.cxx.

◆ CommitClusterGroup()

void ROOT::Internal::RPageSinkBuf::CommitClusterGroup ( )
final virtual

Write out the page locations (page list envelope) for all the committed clusters since the last call of CommitClusterGroup (or the beginning of writing).

Implements ROOT::Internal::RPageSink.

Definition at line 305 of file RPageSinkBuf.cxx.

◆ CommitDataset()

ROOT::Internal::RNTupleLink ROOT::Internal::RPageSink::CommitDataset ( )
inherited

Run the registered callbacks and finalize the current cluster and the entire data set.

Definition at line 774 of file RPageStorage.cxx.

◆ CommitDatasetImpl()

ROOT::Internal::RNTupleLink ROOT::Internal::RPageSinkBuf::CommitDatasetImpl ( )
final virtual

Implements ROOT::Internal::RPageSink.

Definition at line 312 of file RPageSinkBuf.cxx.

◆ CommitPage()

void ROOT::Internal::RPageSinkBuf::CommitPage ( ColumnHandle_t columnHandle,
const RPage & page )
final virtual

Write a page to the storage. The column must have been added before.

Implements ROOT::Internal::RPageSink.

Definition at line 149 of file RPageSinkBuf.cxx.

◆ CommitSealedPage()

void ROOT::Internal::RPageSinkBuf::CommitSealedPage ( ROOT::DescriptorId_t physicalColumnId,
const RSealedPage & sealedPage )
final virtual

Write a preprocessed page to storage. The column must have been added before.

Implements ROOT::Internal::RPageSink.

Definition at line 240 of file RPageSinkBuf.cxx.

◆ CommitSealedPageV()

void ROOT::Internal::RPageSinkBuf::CommitSealedPageV ( std::span< RPageStorage::RSealedPageGroup > ranges)
final virtual

Write a vector of preprocessed pages to storage. The corresponding columns must have been added before.

Implements ROOT::Internal::RPageSink.

Definition at line 246 of file RPageSinkBuf.cxx.

◆ CommitStagedClusters()

void ROOT::Internal::RPageSinkBuf::CommitStagedClusters ( std::span< RStagedCluster > clusters)
final virtual

Commit staged clusters, logically appending them to the ntuple descriptor.

Implements ROOT::Internal::RPageSink.

Definition at line 298 of file RPageSinkBuf.cxx.

◆ CommitSuppressedColumn()

void ROOT::Internal::RPageSinkBuf::CommitSuppressedColumn ( ColumnHandle_t columnHandle)
final virtual

Commits a suppressed column for the current cluster.

Can be called at any time before CommitCluster(). For any given column and cluster, CommitSuppressedColumn() and page commits are mutually exclusive: a column is either suppressed or has pages committed, never both.
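
The per-cluster rule can be pictured with a small standalone sketch. `ClusterColumnTracker` and the two demo functions are hypothetical and not part of ROOT. Throwing on a violation is an assumption made for illustration only. The documentation does not say how the real sink reacts to misuse.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>

// Hypothetical bookkeeping helper; not part of ROOT.
class ClusterColumnTracker {
   std::set<std::size_t> fWithPages;  // columns that received pages in this cluster
   std::set<std::size_t> fSuppressed; // columns marked as suppressed in this cluster

public:
   void OnCommitPage(std::size_t columnId)
   {
      if (fSuppressed.count(columnId))
         throw std::logic_error("column already suppressed in this cluster");
      fWithPages.insert(columnId);
   }
   void OnCommitSuppressedColumn(std::size_t columnId)
   {
      if (fWithPages.count(columnId))
         throw std::logic_error("column already has pages in this cluster");
      fSuppressed.insert(columnId);
   }
   // A new cluster starts with a clean slate.
   void OnCommitCluster()
   {
      fWithPages.clear();
      fSuppressed.clear();
   }
};

// Returns true if mixing both calls for one column in one cluster is rejected.
bool MixedUseIsRejected()
{
   ClusterColumnTracker tracker;
   tracker.OnCommitPage(0);
   try {
      tracker.OnCommitSuppressedColumn(0);
   } catch (const std::logic_error &) {
      return true;
   }
   return false;
}

// Returns true if a column may carry pages in one cluster and be suppressed in the next.
bool NextClusterIsIndependent()
{
   ClusterColumnTracker tracker;
   tracker.OnCommitPage(0);
   tracker.OnCommitCluster();
   try {
      tracker.OnCommitSuppressedColumn(0);
   } catch (const std::logic_error &) {
      return false;
   }
   return true;
}
```

The restriction is scoped to a single cluster, so the tracker resets its state at every cluster commit.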

Implements ROOT::Internal::RPageSink.

Definition at line 144 of file RPageSinkBuf.cxx.

◆ ConnectFields()

void ROOT::Internal::RPageSinkBuf::ConnectFields ( const std::vector< ROOT::RFieldBase * > & fields,
ROOT::NTupleSize_t firstEntry )
private

Definition at line 70 of file RPageSinkBuf.cxx.

◆ DropColumn()

void ROOT::Internal::RPageSink::DropColumn ( ColumnHandle_t columnHandle)
inline final virtual inherited

Unregisters a column.

A page source decreases the reference counter for the corresponding active column. For a page sink, dropping columns is currently a no-op.

Implements ROOT::Internal::RPageStorage.

Definition at line 307 of file RPageStorage.hxx.

◆ FlushClusterImpl()

void ROOT::Internal::RPageSinkBuf::FlushClusterImpl ( std::function< void(void)> FlushClusterFn)
private

Definition at line 255 of file RPageSinkBuf.cxx.

◆ GetColumnId()

ROOT::DescriptorId_t ROOT::Internal::RPageStorage::GetColumnId ( ColumnHandle_t columnHandle) const
inline inherited

Definition at line 188 of file RPageStorage.hxx.

◆ GetDescriptor()

const ROOT::RNTupleDescriptor & ROOT::Internal::RPageSinkBuf::GetDescriptor ( ) const
final virtual

Return the RNTupleDescriptor being constructed.

Implements ROOT::Internal::RPageSink.

Definition at line 88 of file RPageSinkBuf.cxx.

◆ GetMetrics()

virtual ROOT::Experimental::Detail::RNTupleMetrics & ROOT::Internal::RPageStorage::GetMetrics ( )
inline virtual inherited

Returns the default metrics object.

Subclasses might alternatively provide their own metrics object by overriding this.

Definition at line 192 of file RPageStorage.hxx.

◆ GetNEntries()

ROOT::NTupleSize_t ROOT::Internal::RPageSinkBuf::GetNEntries ( ) const
inline final virtual

Implements ROOT::Internal::RPageSink.

Definition at line 137 of file RPageSinkBuf.hxx.

◆ GetNTupleName()

const std::string & ROOT::Internal::RPageStorage::GetNTupleName ( ) const
inline inherited

Returns the NTuple name.

Definition at line 195 of file RPageStorage.hxx.

◆ GetSinkGuard()

virtual RSinkGuard ROOT::Internal::RPageSink::GetSinkGuard ( )
inline virtual inherited

Definition at line 433 of file RPageStorage.hxx.

◆ GetType()

EPageStorageType ROOT::Internal::RPageSink::GetType ( )
inline final virtual inherited

Whether the concrete implementation is a sink or a source.

Implements ROOT::Internal::RPageStorage.

Definition at line 303 of file RPageStorage.hxx.

◆ GetWriteOptions()

const ROOT::RNTupleWriteOptions & ROOT::Internal::RPageSink::GetWriteOptions ( ) const
inline inherited

Returns the sink's write options.

Definition at line 305 of file RPageStorage.hxx.

◆ Init()

void ROOT::Internal::RPageSink::Init ( RNTupleModel & model)
inline inherited

Physically creates the storage container to hold the ntuple (e.g., a key in a TFile or an S3 bucket). Init() associates column handles with the columns referenced by the model.

Definition at line 318 of file RPageStorage.hxx.

◆ InitImpl()

void ROOT::Internal::RPageSinkBuf::InitImpl ( ROOT::RNTupleModel & model)
final virtual

Implements ROOT::Internal::RPageSink.

Definition at line 93 of file RPageSinkBuf.cxx.

◆ IsInitialized()

bool ROOT::Internal::RPageSink::IsInitialized ( ) const
inline inherited

Definition at line 309 of file RPageStorage.hxx.

◆ operator=() [1/2]

RPageSinkBuf & ROOT::Internal::RPageSinkBuf::operator= ( const RPageSinkBuf & )
delete

◆ operator=() [2/2]

RPageSinkBuf & ROOT::Internal::RPageSinkBuf::operator= ( RPageSinkBuf && )
delete

◆ RegisterOnCommitDatasetCallback()

void ROOT::Internal::RPageSink::RegisterOnCommitDatasetCallback ( Callback_t callback)
inline inherited

The registered callback is executed at the beginning of CommitDataset().
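
A minimal standalone sketch of this ordering follows. `MiniSink`, `Note`, `Events` and `DemoCallbackOrder` are hypothetical. Only the `Callback_t` shape (a `std::function` taking a sink reference) and the "callbacks first, then finalize" order are taken from this page.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature sink; only the callback ordering mirrors the documentation.
class MiniSink {
public:
   using Callback_t = std::function<void(MiniSink &)>;

   void RegisterOnCommitDatasetCallback(Callback_t callback)
   {
      assert(static_cast<bool>(callback) && "an empty callback cannot be invoked");
      fCallbacks.emplace_back(std::move(callback));
   }

   // Runs all registered callbacks, then performs the (mock) finalization step.
   void CommitDataset()
   {
      for (const auto &callback : fCallbacks)
         callback(*this);
      fEvents.push_back("finalize");
   }

   void Note(std::string event) { fEvents.push_back(std::move(event)); }
   const std::vector<std::string> &Events() const { return fEvents; }

private:
   std::vector<Callback_t> fCallbacks;
   std::vector<std::string> fEvents;
};

// Registers one callback and returns the sequence of events seen by the sink.
std::vector<std::string> DemoCallbackOrder()
{
   MiniSink sink;
   sink.RegisterOnCommitDatasetCallback([](MiniSink &s) { s.Note("callback"); });
   sink.CommitDataset();
   return sink.Events();
}
```

Because the callback receives the sink itself, it can still add information to it before finalization. This matches the use described under UpdateExtraTypeInfo(), where fields register such callbacks.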

Definition at line 402 of file RPageStorage.hxx.

◆ ReservePage()

ROOT::Internal::RPage ROOT::Internal::RPageSinkBuf::ReservePage ( ColumnHandle_t columnHandle,
std::size_t nElements )
final virtual

Get a new, empty page for the given column that can be filled with up to nElements; nElements must be larger than zero.

Reimplemented from ROOT::Internal::RPageSink.

Definition at line 319 of file RPageSinkBuf.cxx.

◆ SealPage() [1/2]

ROOT::Internal::RPageStorage::RSealedPage ROOT::Internal::RPageSink::SealPage ( const ROOT::Internal::RPage & page,
const ROOT::Internal::RColumnElementBase & element )
protected inherited

Helper for streaming a page.

This is commonly used in derived, concrete page sinks. Note that if compressionSetting is 0 (uncompressed) and the page is mappable and not checksummed, the returned sealed page will point directly to the input page buffer. Otherwise, the sealed page references fSealPageBuffer. Thus, the buffer pointed to by the RSealedPage should never be freed.
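
The ownership rule can be sketched in isolation. The types below (`SealedView`, `MiniSealer`) and the byte-flipping "transform" are hypothetical placeholders, not the ROOT implementation. The sketch only shows a non-owning result that points either at the caller's buffer or at a scratch buffer owned by the sealer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view, loosely modeled on the role of RSealedPage.
struct SealedView {
   const unsigned char *fBuffer;
   std::size_t fSize;
};

class MiniSealer {
   std::vector<unsigned char> fScratch; // plays the role of fSealPageBuffer

public:
   SealedView Seal(const std::vector<unsigned char> &page, bool needsRewrite)
   {
      if (!needsRewrite)
         return SealedView{page.data(), page.size()}; // alias the caller's buffer
      fScratch.assign(page.begin(), page.end());
      for (auto &byte : fScratch)
         byte = static_cast<unsigned char>(byte ^ 0xFFu); // placeholder for compression or packing
      return SealedView{fScratch.data(), fScratch.size()};
   }
};

// True if an untouched page is returned as a view onto the input buffer.
bool DemoAliasesInputWhenUntouched()
{
   MiniSealer sealer;
   const std::vector<unsigned char> page{1, 2, 3};
   return sealer.Seal(page, false).fBuffer == page.data();
}

// True if a rewritten page lives in the sealer's scratch buffer instead.
bool DemoUsesScratchWhenRewritten()
{
   MiniSealer sealer;
   const std::vector<unsigned char> page{1, 2, 3};
   const SealedView view = sealer.Seal(page, true);
   return view.fBuffer != page.data() && view.fSize == page.size() && view.fBuffer[0] == 0xFE;
}
```

In both branches the view owns nothing, which is why the caller must not free the buffer it points to. In this sketch the scratch buffer is reused by the next `Seal` call, so a view is only valid until then.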

Definition at line 757 of file RPageStorage.cxx.

◆ SealPage() [2/2]

ROOT::Internal::RPageStorage::RSealedPage ROOT::Internal::RPageSink::SealPage ( const RSealPageConfig & config)
static inherited

Seal a page using the provided info.

Definition at line 719 of file RPageStorage.cxx.

◆ SetTaskScheduler()

void ROOT::Internal::RPageStorage::SetTaskScheduler ( RTaskScheduler * taskScheduler)
inline inherited

Definition at line 197 of file RPageStorage.hxx.

◆ StageCluster()

ROOT::Internal::RPageSink::RStagedCluster ROOT::Internal::RPageSinkBuf::StageCluster ( ROOT::NTupleSize_t nNewEntries)
final virtual

Stage the current cluster and create a new one for the following data.

Returns the object that must be passed to CommitStagedClusters to logically append the staged cluster to the ntuple descriptor.
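
The two-step protocol (stage first, append later) can be sketched with mock types. `StagedCluster`, `MiniWriter`, `StageDemoResult` and `DemoStageThenCommit` are hypothetical. The real RStagedCluster carries more information, and the real CommitStagedClusters() takes a `std::span` rather than a `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical token returned by staging; the real RStagedCluster holds more data.
struct StagedCluster {
   std::uint64_t fNEntries;
};

// Hypothetical descriptor-like bookkeeping that only counts entries and clusters.
class MiniWriter {
   std::uint64_t fNCommittedEntries = 0;
   std::size_t fNCommittedClusters = 0;

public:
   // Staging hands back a token; the committed state is left untouched.
   StagedCluster StageCluster(std::uint64_t nNewEntries) const { return StagedCluster{nNewEntries}; }

   // Committing logically appends the staged clusters in the order given.
   void CommitStagedClusters(const std::vector<StagedCluster> &clusters)
   {
      for (const auto &cluster : clusters) {
         fNCommittedEntries += cluster.fNEntries;
         ++fNCommittedClusters;
      }
   }

   std::uint64_t GetNEntries() const { return fNCommittedEntries; }
   std::size_t GetNClusters() const { return fNCommittedClusters; }
};

struct StageDemoResult {
   std::uint64_t fEntriesBeforeCommit;
   std::uint64_t fEntriesAfterCommit;
   std::size_t fClustersAfterCommit;
};

// Stages two clusters, then commits both in one call.
StageDemoResult DemoStageThenCommit()
{
   MiniWriter writer;
   std::vector<StagedCluster> staged;
   staged.push_back(writer.StageCluster(10));
   staged.push_back(writer.StageCluster(5));

   StageDemoResult result{};
   result.fEntriesBeforeCommit = writer.GetNEntries();
   writer.CommitStagedClusters(staged);
   result.fEntriesAfterCommit = writer.GetNEntries();
   result.fClustersAfterCommit = writer.GetNClusters();
   return result;
}
```

In the sketch, staged clusters are invisible to the entry count until they are committed, which is the sense in which the commit "logically appends" them.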

Implements ROOT::Internal::RPageSink.

Definition at line 291 of file RPageSinkBuf.cxx.

◆ UpdateExtraTypeInfo()

void ROOT::Internal::RPageSinkBuf::UpdateExtraTypeInfo ( const ROOT::RExtraTypeInfoDescriptor & extraTypeInfo)
final virtual

Adds an extra type information record to schema.

The extra type information will be written to the extension header. The information in the record will be merged with the existing information, e.g. duplicate streamer info records will be removed. This method is called by the "on commit dataset" callback registered by specific fields (e.g., streamer field) and during merging.

Implements ROOT::Internal::RPageSink.

Definition at line 137 of file RPageSinkBuf.cxx.

◆ UpdateSchema()

void ROOT::Internal::RPageSinkBuf::UpdateSchema ( const RNTupleModelChangeset & changeset,
ROOT::NTupleSize_t firstEntry )
final virtual

Incorporate incremental changes to the model into the ntuple descriptor.

This happens, e.g. if new fields were added after the initial call to RPageSink::Init(RNTupleModel &). firstEntry specifies the global index for the first stored element in the added columns.

Implements ROOT::Internal::RPageSink.

Definition at line 101 of file RPageSinkBuf.cxx.

◆ WaitForAllTasks()

void ROOT::Internal::RPageStorage::WaitForAllTasks ( )
inline protected inherited

Definition at line 153 of file RPageStorage.hxx.

Member Data Documentation

◆ fBufferedColumns

std::vector<RColumnBuf> ROOT::Internal::RPageSinkBuf::fBufferedColumns
private

Vector of buffered column pages. Indexed by column id.

Definition at line 116 of file RPageSinkBuf.hxx.

◆ fBufferedUncompressed

std::atomic<std::size_t> ROOT::Internal::RPageSinkBuf::fBufferedUncompressed = 0
private

The sum of uncompressed bytes in buffered pages. Used to heuristically reduce the memory usage.

Definition at line 114 of file RPageSinkBuf.hxx.

◆ fCounters

std::unique_ptr<RCounters> ROOT::Internal::RPageSinkBuf::fCounters
private

Definition at line 107 of file RPageSinkBuf.hxx.

◆ fInnerModel

std::unique_ptr<ROOT::RNTupleModel> ROOT::Internal::RPageSinkBuf::fInnerModel
private

The buffered page sink maintains a copy of the RNTupleModel for the inner sink.

For the unbuffered case, the RNTupleModel is instead managed by an RNTupleWriter.

Definition at line 112 of file RPageSinkBuf.hxx.

◆ fInnerSink

std::unique_ptr<RPageSink> ROOT::Internal::RPageSinkBuf::fInnerSink
private

The inner sink, responsible for actually performing I/O.

Definition at line 109 of file RPageSinkBuf.hxx.

◆ fIsInitialized

bool ROOT::Internal::RPageSink::fIsInitialized = false
protected inherited

Flag if sink was initialized.

Definition at line 279 of file RPageStorage.hxx.

◆ fMetrics

ROOT::Experimental::Detail::RNTupleMetrics ROOT::Internal::RPageStorage::fMetrics
protected inherited

Definition at line 146 of file RPageStorage.hxx.

◆ fNColumns

ROOT::DescriptorId_t ROOT::Internal::RPageSinkBuf::fNColumns = 0
private

Definition at line 120 of file RPageSinkBuf.hxx.

◆ fNFields

ROOT::DescriptorId_t ROOT::Internal::RPageSinkBuf::fNFields = 0
private

Definition at line 119 of file RPageSinkBuf.hxx.

◆ fNTupleName

std::string ROOT::Internal::RPageStorage::fNTupleName
protected inherited

Definition at line 151 of file RPageStorage.hxx.

◆ fOnDatasetCommitCallbacks

std::vector<Callback_t> ROOT::Internal::RPageSink::fOnDatasetCommitCallbacks
private inherited

Definition at line 288 of file RPageStorage.hxx.

◆ fOptions

std::unique_ptr<ROOT::RNTupleWriteOptions> ROOT::Internal::RPageSink::fOptions
protected inherited

Definition at line 276 of file RPageStorage.hxx.

◆ fPageAllocator

std::unique_ptr<ROOT::Internal::RPageAllocator> ROOT::Internal::RPageStorage::fPageAllocator
protected inherited

For the time being, we will use the heap allocator for all sources and sinks. This may change in the future.

Definition at line 149 of file RPageStorage.hxx.

◆ fSealPageBuffer

std::vector<unsigned char> ROOT::Internal::RPageSink::fSealPageBuffer
private inherited

Used as destination buffer in the simple SealPage overload.

Definition at line 289 of file RPageStorage.hxx.

◆ fSuppressedColumns

std::vector<ColumnHandle_t> ROOT::Internal::RPageSinkBuf::fSuppressedColumns
private

Columns committed as suppressed are stored and passed to the inner sink at cluster commit.

Definition at line 118 of file RPageSinkBuf.hxx.

◆ fTaskScheduler

RTaskScheduler* ROOT::Internal::RPageStorage::fTaskScheduler = nullptr
protected inherited

Definition at line 152 of file RPageStorage.hxx.

◆ fWritePageMemoryManager

RWritePageMemoryManager ROOT::Internal::RPageSink::fWritePageMemoryManager
private inherited

Used in ReservePage to maintain the page buffer budget.

Definition at line 292 of file RPageStorage.hxx.

◆ kNBytesPageChecksum

std::size_t ROOT::Internal::RPageStorage::kNBytesPageChecksum = sizeof(std::uint64_t)
static constexpr inherited

The page checksum is a 64-bit xxhash3.
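
As a small worked example of what the constant implies for buffer sizes, the sketch below redeclares it locally. `SealedSizeWithChecksum` is a hypothetical helper. It assumes that the checksum, when present, is stored in addition to the page payload, and it assumes a platform with 8-bit bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Local copy of the constant for illustration; the real one lives in RPageStorage.
constexpr std::size_t kNBytesPageChecksum = sizeof(std::uint64_t);
static_assert(kNBytesPageChecksum == 8, "a 64-bit checksum occupies 8 bytes on 8-bit-byte platforms");

// Hypothetical helper: total bytes needed for a payload plus an optional checksum.
constexpr std::size_t SealedSizeWithChecksum(std::size_t payloadBytes, bool hasChecksum)
{
   return payloadBytes + (hasChecksum ? kNBytesPageChecksum : 0);
}
```

Under these assumptions a 100-byte payload needs 108 bytes with a checksum and 100 bytes without one.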

Definition at line 73 of file RPageStorage.hxx.


The documentation for this class was generated from the following files: