Base class for a sink with a physical storage backend.
Definition at line 435 of file RPageStorage.hxx.
Classes | |
struct | RCounters |
Default I/O performance counters that get registered in fMetrics. | |
struct | RFeatures |
Set of optional features supported by the persistent sink. | |
Public Member Functions | |
RPagePersistentSink (const RPagePersistentSink &)=delete | |
RPagePersistentSink (RPagePersistentSink &&)=default | |
RPagePersistentSink (std::string_view ntupleName, const ROOT::RNTupleWriteOptions &options) | |
~RPagePersistentSink () override | |
ColumnHandle_t | AddColumn (ROOT::DescriptorId_t fieldId, ROOT::Internal::RColumn &column) final |
Register a new column. | |
void | CommitClusterGroup () final |
Write out the page locations (page list envelope) for all the committed clusters since the last call of CommitClusterGroup (or the beginning of writing). | |
void | CommitDatasetImpl () final |
void | CommitPage (ColumnHandle_t columnHandle, const ROOT::Internal::RPage &page) final |
Write a page to the storage. The column must have been added before. | |
void | CommitSealedPage (ROOT::DescriptorId_t physicalColumnId, const RPageStorage::RSealedPage &sealedPage) final |
Write a preprocessed page to storage. The column must have been added before. | |
void | CommitSealedPageV (std::span< RPageStorage::RSealedPageGroup > ranges) final |
Write a vector of preprocessed pages to storage. The corresponding columns must have been added before. | |
void | CommitStagedClusters (std::span< RStagedCluster > clusters) final |
Commit staged clusters, logically appending them to the ntuple descriptor. | |
void | CommitSuppressedColumn (ColumnHandle_t columnHandle) final |
Commits a suppressed column for the current cluster. | |
const ROOT::RNTupleDescriptor & | GetDescriptor () const final |
Return the RNTupleDescriptor being constructed. | |
ROOT::NTupleSize_t | GetNEntries () const final |
std::unique_ptr< RNTupleModel > | InitFromDescriptor (const ROOT::RNTupleDescriptor &descriptor, bool copyClusters) |
Initialize sink based on an existing descriptor and fill into the descriptor builder, optionally copying over the descriptor's clusters to this sink's descriptor. | |
void | InitImpl (RNTupleModel &model) final |
Updates the descriptor and calls InitImpl() that handles the backend-specific details (file, DAOS, etc.) | |
RPagePersistentSink & | operator= (const RPagePersistentSink &)=delete |
RPagePersistentSink & | operator= (RPagePersistentSink &&)=default |
RStagedCluster | StageCluster (ROOT::NTupleSize_t nNewEntries) final |
Stage the current cluster and create a new one for the following data. | |
void | UpdateExtraTypeInfo (const ROOT::RExtraTypeInfoDescriptor &extraTypeInfo) final |
Adds an extra type information record to schema. | |
void | UpdateSchema (const ROOT::Internal::RNTupleModelChangeset &changeset, ROOT::NTupleSize_t firstEntry) final |
Incorporate incremental changes to the model into the ntuple descriptor. | |
Public Member Functions inherited from ROOT::Internal::RPageSink | |
RPageSink (const RPageSink &)=delete | |
RPageSink (RPageSink &&)=default | |
RPageSink (std::string_view ntupleName, const ROOT::RNTupleWriteOptions &options) | |
~RPageSink () override | |
virtual std::uint64_t | CommitCluster (ROOT::NTupleSize_t nNewEntries) |
Finalize the current cluster and create a new one for the following data. | |
void | CommitDataset () |
Run the registered callbacks and finalize the current cluster and the entire data set. | |
void | DropColumn (ColumnHandle_t) final |
Unregisters a column. | |
virtual RSinkGuard | GetSinkGuard () |
EPageStorageType | GetType () final |
Whether the concrete implementation is a sink or a source. | |
const ROOT::RNTupleWriteOptions & | GetWriteOptions () const |
Returns the sink's write options. | |
void | Init (RNTupleModel &model) |
Physically creates the storage container to hold the ntuple (e.g., a key in a TFile or an S3 bucket). Init() associates column handles with the columns referenced by the model. | |
bool | IsInitialized () const |
RPageSink & | operator= (const RPageSink &)=delete |
RPageSink & | operator= (RPageSink &&)=default |
void | RegisterOnCommitDatasetCallback (Callback_t callback) |
The registered callback is executed at the beginning of CommitDataset(). | |
virtual ROOT::Internal::RPage | ReservePage (ColumnHandle_t columnHandle, std::size_t nElements) |
Get a new, empty page for the given column that can be filled with up to nElements; nElements must be larger than zero. | |
Public Member Functions inherited from ROOT::Internal::RPageStorage | |
RPageStorage (const RPageStorage &other)=delete | |
RPageStorage (RPageStorage &&other)=default | |
RPageStorage (std::string_view name) | |
virtual | ~RPageStorage () |
ROOT::DescriptorId_t | GetColumnId (ColumnHandle_t columnHandle) const |
virtual ROOT::Experimental::Detail::RNTupleMetrics & | GetMetrics () |
Returns the default metrics object. | |
const std::string & | GetNTupleName () const |
Returns the NTuple name. | |
RPageStorage & | operator= (const RPageStorage &other)=delete |
RPageStorage & | operator= (RPageStorage &&other)=default |
void | SetTaskScheduler (RTaskScheduler *taskScheduler) |
Static Public Member Functions | |
static std::unique_ptr< RPageSink > | Create (std::string_view ntupleName, std::string_view location, const ROOT::RNTupleWriteOptions &options=ROOT::RNTupleWriteOptions()) |
Guess the concrete derived page sink from the location. | |
Static Public Member Functions inherited from ROOT::Internal::RPageSink | |
static RSealedPage | SealPage (const RSealPageConfig &config) |
Seal a page using the provided info. | |
Protected Member Functions | |
virtual RNTupleLocator | CommitClusterGroupImpl (unsigned char *serializedPageList, std::uint32_t length)=0 |
Returns the locator of the page list envelope of the given buffer that contains the serialized page list. | |
virtual void | CommitDatasetImpl (unsigned char *serializedFooter, std::uint32_t length)=0 |
virtual RNTupleLocator | CommitPageImpl (ColumnHandle_t columnHandle, const ROOT::Internal::RPage &page)=0 |
virtual RNTupleLocator | CommitSealedPageImpl (ROOT::DescriptorId_t physicalColumnId, const RPageStorage::RSealedPage &sealedPage)=0 |
virtual std::vector< RNTupleLocator > | CommitSealedPageVImpl (std::span< RPageStorage::RSealedPageGroup > ranges, const std::vector< bool > &mask) |
Vector commit of preprocessed pages. | |
void | EnableDefaultMetrics (const std::string &prefix) |
Enables the default set of metrics provided by RPageSink. | |
virtual void | InitImpl (unsigned char *serializedHeader, std::uint32_t length)=0 |
virtual std::uint64_t | StageClusterImpl ()=0 |
Returns the number of bytes written to storage (excluding metadata) | |
Protected Member Functions inherited from ROOT::Internal::RPageSink | |
RSealedPage | SealPage (const ROOT::Internal::RPage &page, const ROOT::Internal::RColumnElementBase &element) |
Helper for streaming a page. | |
Protected Member Functions inherited from ROOT::Internal::RPageStorage | |
void | WaitForAllTasks () |
Protected Attributes | |
std::unique_ptr< RCounters > | fCounters |
ROOT::Internal::RNTupleDescriptorBuilder | fDescriptorBuilder |
RFeatures | fFeatures |
Protected Attributes inherited from ROOT::Internal::RPageSink | |
bool | fIsInitialized = false |
Flag if sink was initialized. | |
std::unique_ptr< ROOT::RNTupleWriteOptions > | fOptions |
Protected Attributes inherited from ROOT::Internal::RPageStorage | |
ROOT::Experimental::Detail::RNTupleMetrics | fMetrics |
std::string | fNTupleName |
std::unique_ptr< ROOT::Internal::RPageAllocator > | fPageAllocator |
For the time being, we will use the heap allocator for all sources and sinks. This may change in the future. | |
RTaskScheduler * | fTaskScheduler = nullptr |
Private Attributes | |
std::uint64_t | fNextClusterInGroup = 0 |
Remembers the starting cluster id for the next cluster group. | |
std::vector< ROOT::RClusterDescriptor::RColumnRange > | fOpenColumnRanges |
Keeps track of the number of elements in the currently open cluster. Indexed by column id. | |
std::vector< ROOT::RClusterDescriptor::RPageRange > | fOpenPageRanges |
Keeps track of the written pages in the currently open cluster. Indexed by column id. | |
ROOT::NTupleSize_t | fPrevClusterNEntries = 0 |
Used to calculate the number of entries in the current cluster. | |
ROOT::Internal::RNTupleSerializer::RContext | fSerializationContext |
Used to map the IDs of the descriptor to the physical IDs issued during header/footer serialization. | |
ROOT::Internal::RNTupleSerializer::StreamerInfoMap_t | fStreamerInfos |
Union of the streamer info records that are sent from streamer fields to the sink before committing the dataset. | |
Additional Inherited Members | |
Public Types inherited from ROOT::Internal::RPageSink | |
using | Callback_t = std::function<void(RPageSink &)> |
Public Types inherited from ROOT::Internal::RPageStorage | |
using | ColumnHandle_t = RColumnHandle |
The column handle identifies a column with the current open page storage. | |
using | SealedPageSequence_t = std::deque<RSealedPage> |
Static Public Attributes inherited from ROOT::Internal::RPageStorage | |
static constexpr std::size_t | kNBytesPageChecksum = sizeof(std::uint64_t) |
The page checksum is a 64-bit xxhash3. | |
#include <ROOT/RPageStorage.hxx>
ROOT::Internal::RPagePersistentSink::RPagePersistentSink (std::string_view ntupleName, const ROOT::RNTupleWriteOptions &options)
Definition at line 799 of file RPageStorage.cxx.
ROOT::Internal::RPagePersistentSink::RPagePersistentSink (const RPagePersistentSink &) [delete]
ROOT::Internal::RPagePersistentSink::RPagePersistentSink (RPagePersistentSink &&) [default]
ROOT::Internal::RPagePersistentSink::~RPagePersistentSink () [override]
Definition at line 805 of file RPageStorage.cxx.
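ColumnHandle_t ROOT::Internal::RPagePersistentSink::AddColumn (ROOT::DescriptorId_t fieldId, ROOT::Internal::RColumn &column) [final, virtual]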
Register a new column.
When reading, the column must exist in the ntuple on disk corresponding to the metadata. When writing, every column can only be attached once.
Implements ROOT::Internal::RPageStorage.
Definition at line 808 of file RPageStorage.cxx.
void ROOT::Internal::RPagePersistentSink::CommitClusterGroup () [final, virtual]
Write out the page locations (page list envelope) for all the committed clusters since the last call of CommitClusterGroup (or the beginning of writing).
Implements ROOT::Internal::RPageSink.
Definition at line 1206 of file RPageStorage.cxx.
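The "since the last call" bookkeeping can be pictured with a toy model. This is a sketch under stated assumptions, not ROOT code: `ClusterGroupTracker`, its members, and the returned index pair are hypothetical. Only the idea of remembering the starting cluster of the next group (compare fNextClusterInGroup) is taken from this page.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Toy model (hypothetical names): tracks which committed clusters belong
// to the next cluster group.
class ClusterGroupTracker {
   std::uint64_t fNClusters = 0;          // clusters committed so far
   std::uint64_t fNextClusterInGroup = 0; // first cluster of the next group

public:
   void CommitCluster() { ++fNClusters; }

   // Returns the half-open range [first, last) of cluster indices covered by
   // the new group and starts the next group right after it.
   std::pair<std::uint64_t, std::uint64_t> CommitClusterGroup()
   {
      assert(fNextClusterInGroup <= fNClusters);
      const std::pair<std::uint64_t, std::uint64_t> range{fNextClusterInGroup, fNClusters};
      fNextClusterInGroup = fNClusters;
      return range;
   }
};
```

In this model, two clusters followed by a group commit yield the range [0, 2), and one more cluster followed by a second group commit yields [2, 3).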
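RNTupleLocator ROOT::Internal::RPagePersistentSink::CommitClusterGroupImpl (unsigned char *serializedPageList, std::uint32_t length) [protected, pure virtual]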
Returns the locator of the page list envelope of the given buffer that contains the serialized page list.
Typically, the implementation takes care of compressing and writing the provided buffer.
Implemented in ROOT::Experimental::Internal::RPageSinkDaos, and ROOT::Internal::RPageSinkFile.
void ROOT::Internal::RPagePersistentSink::CommitDatasetImpl () [final, virtual]
Implements ROOT::Internal::RPageSink.
Reimplemented in ROOT::Experimental::Internal::RPageSinkDaos, and ROOT::Internal::RPageSinkFile.
Definition at line 1248 of file RPageStorage.cxx.
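void ROOT::Internal::RPagePersistentSink::CommitDatasetImpl (unsigned char *serializedFooter, std::uint32_t length) [protected, pure virtual]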
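void ROOT::Internal::RPagePersistentSink::CommitPage (ColumnHandle_t columnHandle, const ROOT::Internal::RPage &page) [final, virtual]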
Write a page to the storage. The column must have been added before.
Implements ROOT::Internal::RPageSink.
Definition at line 1030 of file RPageStorage.cxx.
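RNTupleLocator ROOT::Internal::RPagePersistentSink::CommitPageImpl (ColumnHandle_t columnHandle, const ROOT::Internal::RPage &page) [protected, pure virtual]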
Implemented in ROOT::Experimental::Internal::RPageSinkDaos, and ROOT::Internal::RPageSinkFile.
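The pairing of a final public method with a protected pure virtual `...Impl` hook seen throughout this class can be illustrated with a small, self-contained sketch. All names below (`SinkSketch`, `CountingSink`, `Locator`, `Page`, `GetPages`) are hypothetical stand-ins, and the bookkeeping shown is an assumption about the general pattern rather than the actual ROOT implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for RNTupleLocator and RPage.
struct Locator { std::uint64_t offset; };
struct Page { std::uint32_t nBytes; };

class SinkSketch {
   // Locators of the pages written so far, indexed by column id.
   std::vector<std::vector<Locator>> fOpenPageRanges;

protected:
   // Backend hook: write the page and report where it was stored.
   virtual Locator CommitPageImpl(std::size_t columnId, const Page &page) = 0;

public:
   virtual ~SinkSketch() = default;

   std::size_t AddColumn()
   {
      fOpenPageRanges.emplace_back();
      return fOpenPageRanges.size() - 1;
   }

   // Shared entry point: checks the precondition, delegates the write,
   // then records the returned locator.
   void CommitPage(std::size_t columnId, const Page &page)
   {
      assert(columnId < fOpenPageRanges.size()); // column must have been added
      fOpenPageRanges[columnId].push_back(CommitPageImpl(columnId, page));
   }

   const std::vector<Locator> &GetPages(std::size_t columnId) const { return fOpenPageRanges.at(columnId); }
};

// Hypothetical backend that only advances an offset instead of writing.
class CountingSink final : public SinkSketch {
   std::uint64_t fCursor = 0;

protected:
   Locator CommitPageImpl(std::size_t, const Page &page) final
   {
      Locator loc{fCursor};
      fCursor += page.nBytes;
      return loc;
   }
};
```

The design choice this illustrates is that the base class owns the descriptor bookkeeping, so a backend only has to answer the question "where did these bytes end up".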
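void ROOT::Internal::RPagePersistentSink::CommitSealedPage (ROOT::DescriptorId_t physicalColumnId, const RPageStorage::RSealedPage &sealedPage) [final, virtual]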
Write a preprocessed page to storage. The column must have been added before.
Implements ROOT::Internal::RPageSink.
Definition at line 1041 of file RPageStorage.cxx.
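RNTupleLocator ROOT::Internal::RPagePersistentSink::CommitSealedPageImpl (ROOT::DescriptorId_t physicalColumnId, const RPageStorage::RSealedPage &sealedPage) [protected, pure virtual]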
Implemented in ROOT::Experimental::Internal::RPageSinkDaos, and ROOT::Internal::RPageSinkFile.
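void ROOT::Internal::RPagePersistentSink::CommitSealedPageV (std::span< RPageStorage::RSealedPageGroup > ranges) [final, virtual]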
Write a vector of preprocessed pages to storage. The corresponding columns must have been added before.
Implements ROOT::Internal::RPageSink.
Definition at line 1070 of file RPageStorage.cxx.
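std::vector< RNTupleLocator > ROOT::Internal::RPagePersistentSink::CommitSealedPageVImpl (std::span< RPageStorage::RSealedPageGroup > ranges, const std::vector< bool > &mask) [protected, virtual]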
Vector commit of preprocessed pages.
The ranges array specifies a range of sealed pages to be committed for each column. The returned vector contains, in order, the RNTupleLocator for each page on each range in ranges, i.e. the first N entries refer to the N pages in ranges[0], followed by M entries that refer to the M pages in ranges[1], etc. The mask allows skipping the write of certain pages; it has one entry per page across all ranges. For every false value in the mask, the corresponding locator is omitted from the output vector. The default implementation calls CommitSealedPageImpl for each page; derived classes may provide an optimized implementation.
Reimplemented in ROOT::Experimental::Internal::RPageSinkDaos, and ROOT::Internal::RPageSinkFile.
Definition at line 1054 of file RPageStorage.cxx.
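The relation between ranges, mask, and the returned locators described for CommitSealedPageVImpl can be made concrete with a minimal sketch. The types `Locator`, `SealedPage`, and `PageGroup` and the functions `CommitOne` and `CommitMasked` are hypothetical stand-ins chosen for illustration; the sketch mirrors the documented default strategy (one per-page commit for each unmasked page) but is not the ROOT code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the real types, for illustration only.
struct Locator { std::uint64_t offset; };
struct SealedPage { std::uint32_t nBytes; };
using PageGroup = std::vector<SealedPage>; // sealed pages of one column

// Hypothetical per-page commit: "writes" a page and returns where it landed.
inline Locator CommitOne(const SealedPage &page, std::uint64_t &cursor)
{
   Locator loc{cursor};
   cursor += page.nBytes;
   return loc;
}

// Visit the pages range by range and emit one locator for every page whose
// mask entry is true; masked-out pages leave no entry in the result.
inline std::vector<Locator> CommitMasked(const std::vector<PageGroup> &ranges, const std::vector<bool> &mask)
{
   std::vector<Locator> locators;
   std::uint64_t cursor = 0;
   std::size_t pageIdx = 0; // index into mask, running over all ranges
   for (const auto &group : ranges) {
      for (const auto &page : group) {
         assert(pageIdx < mask.size()); // mask must cover every page
         if (mask[pageIdx++])
            locators.push_back(CommitOne(page, cursor));
      }
   }
   return locators;
}
```

With two pages in the first range, one page in the second, and the mask {true, false, true}, the result holds two locators: one for the first page of the first range and one for the page of the second range.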
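void ROOT::Internal::RPagePersistentSink::CommitStagedClusters (std::span< RStagedCluster > clusters) [final, virtual]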
Commit staged clusters, logically appending them to the ntuple descriptor.
Implements ROOT::Internal::RPageSink.
Definition at line 1169 of file RPageStorage.cxx.
void ROOT::Internal::RPagePersistentSink::CommitSuppressedColumn (ColumnHandle_t columnHandle) [final, virtual]
Commits a suppressed column for the current cluster.
Can be called anytime before CommitCluster(). For any given column and cluster, there must be no calls to both CommitSuppressedColumn() and page commits.
Implements ROOT::Internal::RPageSink.
Definition at line 1025 of file RPageStorage.cxx.
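The exclusivity rule (a column is either suppressed or receives pages within one cluster, never both) can be expressed as a small state check. The class `ClusterColumnState` and its method names are hypothetical, and throwing `std::logic_error` is an assumption made for the sketch; how the real sink reacts to a violation is not stated on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Toy per-cluster tracker (hypothetical): remembers for every column whether
// it received pages or was suppressed in the currently open cluster.
class ClusterColumnState {
   enum class EState { kUntouched, kHasPages, kSuppressed };
   std::vector<EState> fStates;

public:
   explicit ClusterColumnState(std::size_t nColumns) : fStates(nColumns, EState::kUntouched) {}

   void OnCommitPage(std::size_t column)
   {
      if (fStates.at(column) == EState::kSuppressed)
         throw std::logic_error("page commit on a suppressed column");
      fStates[column] = EState::kHasPages;
   }

   void OnCommitSuppressedColumn(std::size_t column)
   {
      if (fStates.at(column) == EState::kHasPages)
         throw std::logic_error("suppressing a column that already has pages");
      fStates[column] = EState::kSuppressed;
   }

   // A new cluster starts with a clean slate for every column.
   void OnCommitCluster() { fStates.assign(fStates.size(), EState::kUntouched); }

   bool IsSuppressed(std::size_t column) const { return fStates.at(column) == EState::kSuppressed; }
};
```

Because the state is reset in `OnCommitCluster()`, a column suppressed in one cluster may carry pages in the next one.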
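std::unique_ptr< RPageSink > ROOT::Internal::RPagePersistentSink::Create (std::string_view ntupleName, std::string_view location, const ROOT::RNTupleWriteOptions &options=ROOT::RNTupleWriteOptions()) [static]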
Guess the concrete derived page sink from the location.
Definition at line 778 of file RPageStorage.cxx.
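What "guessing from the location" might look like is sketched below. The rule used here is an assumption for illustration: a location starting with `daos://` selects the DAOS backend and anything else is treated as a file path. The enum `EBackend` and the function `GuessBackend` are hypothetical; the page itself only names RPageSinkDaos and RPageSinkFile as the concrete sinks.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical tag for the two concrete sinks named on this page.
enum class EBackend { kFile, kDaos };

// Assumed rule: a "daos://" prefix selects DAOS, any other location is
// treated as a file path.
inline EBackend GuessBackend(std::string_view location)
{
   constexpr std::string_view kDaosPrefix = "daos://";
   // substr() clamps the length, so short locations are handled safely.
   if (location.substr(0, kDaosPrefix.size()) == kDaosPrefix)
      return EBackend::kDaos;
   return EBackend::kFile;
}
```

A factory built on such a rule would then construct the matching sink and return it through the common `std::unique_ptr< RPageSink >` interface.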
void ROOT::Internal::RPagePersistentSink::EnableDefaultMetrics (const std::string &prefix) [protected]
Enables the default set of metrics provided by RPageSink.
prefix will be used as the prefix for the counters registered in the internal RNTupleMetrics object. This set of counters can be extended by a subclass by calling fMetrics.MakeCounter<...>().
A subclass using the default set of metrics is always responsible for updating the counters appropriately, e.g. fCounters->fNPageCommited.Inc().
Definition at line 1279 of file RPageStorage.cxx.
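The division of labour described for EnableDefaultMetrics (the base registers prefixed counters, the subclass increments them) is sketched here with a toy registry. `MetricsSketch`, `SinkWithMetrics`, the counter name `nPagesCommitted`, and this `MakeCounter` signature are all hypothetical and do not reproduce the RNTupleMetrics API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical miniature of a metrics registry: counters are stored by name.
class MetricsSketch {
   std::map<std::string, std::uint64_t> fCounters;

public:
   // Registers a counter (initially zero) and returns a pointer to it;
   // std::map keeps element addresses stable across later insertions.
   std::uint64_t *MakeCounter(const std::string &name) { return &fCounters[name]; }
   std::uint64_t Get(const std::string &name) const { return fCounters.at(name); }
};

class SinkWithMetrics {
   MetricsSketch fMetrics;
   std::uint64_t *fNPagesCommitted = nullptr;

public:
   // Registers the default counter under the given prefix.
   void EnableDefaultMetrics(const std::string &prefix)
   {
      fNPagesCommitted = fMetrics.MakeCounter(prefix + "nPagesCommitted");
   }

   void CommitPage()
   {
      assert(fNPagesCommitted != nullptr); // sketch only: metrics enabled first
      ++*fNPagesCommitted;                 // the sink updates the counter itself
   }

   const MetricsSketch &GetMetrics() const { return fMetrics; }
};
```

The prefix keeps counters of different sinks apart when they end up in one report, for instance `Sketch.nPagesCommitted` in the model above.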
const ROOT::RNTupleDescriptor & ROOT::Internal::RPagePersistentSink::GetDescriptor () const [inline, final, virtual]
Return the RNTupleDescriptor being constructed.
Implements ROOT::Internal::RPageSink.
Definition at line 518 of file RPageStorage.hxx.
ROOT::NTupleSize_t ROOT::Internal::RPagePersistentSink::GetNEntries () const [inline, final, virtual]
Implements ROOT::Internal::RPageSink.
Definition at line 520 of file RPageStorage.hxx.
std::unique_ptr< ROOT::RNTupleModel > ROOT::Internal::RPagePersistentSink::InitFromDescriptor (const ROOT::RNTupleDescriptor &descriptor, bool copyClusters)
Initialize sink based on an existing descriptor and fill into the descriptor builder, optionally copying over the descriptor's clusters to this sink's descriptor.
Definition at line 956 of file RPageStorage.cxx.
void ROOT::Internal::RPagePersistentSink::InitImpl (RNTupleModel &model) [final, virtual]
Updates the descriptor and calls InitImpl() that handles the backend-specific details (file, DAOS, etc.)
Implements ROOT::Internal::RPageSink.
Reimplemented in ROOT::Experimental::Internal::RPageSinkDaos, and ROOT::Internal::RPageSinkFile.
Definition at line 927 of file RPageStorage.cxx.
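void ROOT::Internal::RPagePersistentSink::InitImpl (unsigned char *serializedHeader, std::uint32_t length) [protected, pure virtual]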
RPagePersistentSink & ROOT::Internal::RPagePersistentSink::operator= (const RPagePersistentSink &) [delete]
RPagePersistentSink & ROOT::Internal::RPagePersistentSink::operator= (RPagePersistentSink &&) [default]
RStagedCluster ROOT::Internal::RPagePersistentSink::StageCluster (ROOT::NTupleSize_t nNewEntries) [final, virtual]
Stage the current cluster and create a new one for the following data.
Returns the object that must be passed to CommitStagedClusters to logically append the staged cluster to the ntuple descriptor.
Implements ROOT::Internal::RPageSink.
Definition at line 1140 of file RPageStorage.cxx.
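The two-step flow (stage a cluster first, append it to the descriptor later) can be modelled as follows. This is a simplified sketch: `StagedCluster`, `ClusterRecord`, and `DescriptorSketch` are hypothetical, and the rule that a committed cluster starts at the running entry total is an assumption about what "logically appending" means.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result of staging: data is written, but the cluster is not
// yet part of the descriptor.
struct StagedCluster { std::uint64_t nEntries; };

// Hypothetical descriptor entry for a committed cluster.
struct ClusterRecord {
   std::uint64_t firstEntry;
   std::uint64_t nEntries;
};

class DescriptorSketch {
   std::vector<ClusterRecord> fClusters;
   std::uint64_t fNEntries = 0; // total entries of all committed clusters

public:
   // Appends the staged clusters in the given order; each cluster starts at
   // the entry index where the previous one ended.
   void CommitStagedClusters(const std::vector<StagedCluster> &staged)
   {
      for (const auto &cluster : staged) {
         fClusters.push_back(ClusterRecord{fNEntries, cluster.nEntries});
         fNEntries += cluster.nEntries;
      }
   }

   std::uint64_t GetNEntries() const { return fNEntries; }
   const ClusterRecord &GetCluster(std::size_t i) const { return fClusters.at(i); }
   std::size_t GetNClusters() const { return fClusters.size(); }
};
```

Separating the steps means the order in which clusters are appended is decided at commit time, independently of when their data was staged.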
std::uint64_t ROOT::Internal::RPagePersistentSink::StageClusterImpl () [protected, pure virtual]
Returns the number of bytes written to storage (excluding metadata)
Implemented in ROOT::Experimental::Internal::RPageSinkDaos, and ROOT::Internal::RPageSinkFile.
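void ROOT::Internal::RPagePersistentSink::UpdateExtraTypeInfo (const ROOT::RExtraTypeInfoDescriptor &extraTypeInfo) [final, virtual]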
Adds an extra type information record to schema.
The extra type information will be written to the extension header. The information in the record will be merged with the existing information, e.g. duplicate streamer info records will be removed. This method is called by the "on commit dataset" callback registered by specific fields (e.g., streamer field) and during merging.
Implements ROOT::Internal::RPageSink.
Definition at line 919 of file RPageStorage.cxx.
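Merging with duplicate removal, as described for the streamer info records, amounts to a keyed union. The sketch below assumes records are identified by an integer id and uses a string as a placeholder payload; `StreamerInfoMapSketch` and `MergeStreamerInfos` are hypothetical names and do not reproduce StreamerInfoMap_t.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in: maps a record id to a placeholder payload.
using StreamerInfoMapSketch = std::map<int, std::string>;

// Keyed union: records whose id is already present are dropped, so repeated
// updates with the same records leave the destination unchanged.
inline void MergeStreamerInfos(StreamerInfoMapSketch &destination, const StreamerInfoMapSketch &source)
{
   // std::map::insert keeps the existing element when the key is present.
   destination.insert(source.begin(), source.end());
}
```

Merging {1, 2} and then {2, 3} into an empty map leaves three records, with the record for id 2 stored once.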
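void ROOT::Internal::RPagePersistentSink::UpdateSchema (const ROOT::Internal::RNTupleModelChangeset &changeset, ROOT::NTupleSize_t firstEntry) [final, virtual]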
Incorporate incremental changes to the model into the ntuple descriptor.
This happens, e.g., if new fields were added after the initial call to RPageSink::Init(RNTupleModel &). firstEntry specifies the global index for the first stored element in the added columns.
Implements ROOT::Internal::RPageSink.
Definition at line 829 of file RPageStorage.cxx.
std::unique_ptr< RCounters > ROOT::Internal::RPagePersistentSink::fCounters [protected]
Definition at line 471 of file RPageStorage.hxx.
ROOT::Internal::RNTupleDescriptorBuilder ROOT::Internal::RPagePersistentSink::fDescriptorBuilder [protected]
Definition at line 459 of file RPageStorage.hxx.
RFeatures ROOT::Internal::RPagePersistentSink::fFeatures [protected]
Definition at line 458 of file RPageStorage.hxx.
std::uint64_t ROOT::Internal::RPagePersistentSink::fNextClusterInGroup = 0 [private]
Remembers the starting cluster id for the next cluster group.
Definition at line 441 of file RPageStorage.hxx.
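std::vector< ROOT::RClusterDescriptor::RColumnRange > ROOT::Internal::RPagePersistentSink::fOpenColumnRanges [private]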
Keeps track of the number of elements in the currently open cluster. Indexed by column id.
Definition at line 445 of file RPageStorage.hxx.
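std::vector< ROOT::RClusterDescriptor::RPageRange > ROOT::Internal::RPagePersistentSink::fOpenPageRanges [private]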
Keeps track of the written pages in the currently open cluster. Indexed by column id.
Definition at line 447 of file RPageStorage.hxx.
ROOT::NTupleSize_t ROOT::Internal::RPagePersistentSink::fPrevClusterNEntries = 0 [private]
Used to calculate the number of entries in the current cluster.
Definition at line 443 of file RPageStorage.hxx.
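ROOT::Internal::RNTupleSerializer::RContext ROOT::Internal::RPagePersistentSink::fSerializationContext [private]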
Used to map the IDs of the descriptor to the physical IDs issued during header/footer serialization.
Definition at line 438 of file RPageStorage.hxx.
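ROOT::Internal::RNTupleSerializer::StreamerInfoMap_t ROOT::Internal::RPagePersistentSink::fStreamerInfos [private]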
Union of the streamer info records that are sent from streamer fields to the sink before committing the dataset.
Definition at line 450 of file RPageStorage.hxx.