Abstract interface to write data into an ntuple.
The page sink takes the list of columns and afterwards a series of page commits and cluster commits. The user is responsible for committing clusters at a consistent point, i.e. when all pages corresponding to data up to the given entry number are committed.
An object of this class may either be a wrapper (for example an RPageSinkBuf) or a "persistent" sink, inheriting from RPagePersistentSink.
Definition at line 258 of file RPageStorage.hxx.
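The call order described above can be illustrated with a small stand-alone model. This is only a sketch and not the ROOT API: `ToySink`, its methods, and the simplifying assumption of exactly one element per entry in every column are hypothetical. The sketch only mimics the contract that a cluster may be committed once every column has committed pages covering the same entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a page sink; none of these names are ROOT API.
class ToySink {
   std::vector<std::uint64_t> fPending; // elements committed per column since the last cluster
   std::uint64_t fNEntries = 0;         // entries covered by committed clusters

public:
   // Register a column and return its handle (here simply an index).
   std::size_t AddColumn()
   {
      fPending.push_back(0);
      return fPending.size() - 1;
   }

   // Commit a page holding nElements elements of the given column.
   void CommitPage(std::size_t column, std::uint64_t nElements) { fPending.at(column) += nElements; }

   // Commit a cluster; assumes, for simplicity, one element per entry in every column.
   void CommitCluster(std::uint64_t nNewEntries)
   {
      for (auto n : fPending) {
         if (n != nNewEntries)
            throw std::logic_error("cluster committed at an inconsistent point");
      }
      for (auto &n : fPending)
         n = 0;
      fNEntries += nNewEntries;
   }

   std::uint64_t GetNEntries() const { return fNEntries; }
};

// Returns the number of entries after one consistent cluster commit.
std::uint64_t RunConsistentCommit()
{
   ToySink sink;
   auto a = sink.AddColumn();
   auto b = sink.AddColumn();
   sink.CommitPage(a, 60);
   sink.CommitPage(a, 40); // two pages of column a cover entries 0..99
   sink.CommitPage(b, 100);
   assert(sink.GetNEntries() == 0); // nothing is counted before the cluster commit
   sink.CommitCluster(100);
   return sink.GetNEntries();
}

// Returns true if committing before all pages are in is rejected.
bool RunInconsistentCommit()
{
   ToySink sink;
   auto a = sink.AddColumn();
   auto b = sink.AddColumn();
   sink.CommitPage(a, 100);
   sink.CommitPage(b, 50); // column b is still missing 50 elements
   try {
      sink.CommitCluster(100);
   } catch (const std::logic_error &) {
      return true;
   }
   return false;
}
```

In the toy model an inconsistent commit throws; how a real sink reacts to such a call is not stated on this page.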
Classes | |
struct | RSealPageConfig |
Parameters for the SealPage() method. More... | |
class | RSinkGuard |
An RAII wrapper used to synchronize a page sink. See GetSinkGuard(). More... | |
struct | RStagedCluster |
Cluster that was staged, but not yet logically appended to the RNTuple. More... | |
Public Types | |
using | Callback_t = std::function<void(RPageSink &)> |
using | ColumnHandle_t = RColumnHandle |
The column handle identifies a column with the current open page storage. | |
using | SealedPageSequence_t = std::deque<RSealedPage> |
Public Member Functions | |
RPageSink (const RPageSink &)=delete | |
RPageSink (RPageSink &&)=default | |
RPageSink (std::string_view ntupleName, const ROOT::RNTupleWriteOptions &options) | |
~RPageSink () override | |
virtual std::uint64_t | CommitCluster (ROOT::NTupleSize_t nNewEntries) |
Finalize the current cluster and create a new one for the following data. | |
virtual void | CommitClusterGroup ()=0 |
Write out the page locations (page list envelope) for all the committed clusters since the last call of CommitClusterGroup (or the beginning of writing). | |
void | CommitDataset () |
Run the registered callbacks and finalize the current cluster and the entire data set. | |
virtual void | CommitPage (ColumnHandle_t columnHandle, const RPage &page)=0 |
Write a page to the storage. The column must have been added before. | |
virtual void | CommitSealedPage (ROOT::DescriptorId_t physicalColumnId, const RPageStorage::RSealedPage &sealedPage)=0 |
Write a preprocessed page to storage. The column must have been added before. | |
virtual void | CommitSealedPageV (std::span< RPageStorage::RSealedPageGroup > ranges)=0 |
Write a vector of preprocessed pages to storage. The corresponding columns must have been added before. | |
virtual void | CommitStagedClusters (std::span< RStagedCluster > clusters)=0 |
Commit staged clusters, logically appending them to the ntuple descriptor. | |
virtual void | CommitSuppressedColumn (ColumnHandle_t columnHandle)=0 |
Commits a suppressed column for the current cluster. | |
void | DropColumn (ColumnHandle_t) final |
Unregisters a column. | |
virtual const RNTupleDescriptor & | GetDescriptor () const =0 |
Return the RNTupleDescriptor being constructed. | |
virtual ROOT::NTupleSize_t | GetNEntries () const =0 |
virtual RSinkGuard | GetSinkGuard () |
EPageStorageType | GetType () final |
Whether the concrete implementation is a sink or a source. | |
const ROOT::RNTupleWriteOptions & | GetWriteOptions () const |
Returns the sink's write options. | |
void | Init (RNTupleModel &model) |
Physically creates the storage container to hold the ntuple (e.g., a key in a TFile or an S3 bucket). Init() associates column handles to the columns referenced by the model. | |
bool | IsInitialized () const |
RPageSink & | operator= (const RPageSink &)=delete |
RPageSink & | operator= (RPageSink &&)=default |
void | RegisterOnCommitDatasetCallback (Callback_t callback) |
The registered callback is executed at the beginning of CommitDataset(). | |
virtual RPage | ReservePage (ColumnHandle_t columnHandle, std::size_t nElements) |
Get a new, empty page for the given column that can be filled with up to nElements; nElements must be larger than zero. | |
virtual RStagedCluster | StageCluster (ROOT::NTupleSize_t nNewEntries)=0 |
Stage the current cluster and create a new one for the following data. | |
virtual void | UpdateExtraTypeInfo (const RExtraTypeInfoDescriptor &extraTypeInfo)=0 |
Adds an extra type information record to schema. | |
virtual void | UpdateSchema (const RNTupleModelChangeset &changeset, ROOT::NTupleSize_t firstEntry)=0 |
Incorporate incremental changes to the model into the ntuple descriptor. | |
RPageStorage (const RPageStorage &other)=delete | |
RPageStorage (RPageStorage &&other)=default | |
RPageStorage (std::string_view name) | |
virtual | ~RPageStorage () |
virtual ColumnHandle_t | AddColumn (ROOT::DescriptorId_t fieldId, RColumn &column)=0 |
Register a new column. | |
ROOT::DescriptorId_t | GetColumnId (ColumnHandle_t columnHandle) const |
virtual Detail::RNTupleMetrics & | GetMetrics () |
Returns the default metrics object. | |
const std::string & | GetNTupleName () const |
Returns the NTuple name. | |
RPageStorage & | operator= (const RPageStorage &other)=delete |
RPageStorage & | operator= (RPageStorage &&other)=default |
void | SetTaskScheduler (RTaskScheduler *taskScheduler) |
Static Public Member Functions | |
static RSealedPage | SealPage (const RSealPageConfig &config) |
Seal a page using the provided info. | |
Protected Member Functions | |
virtual void | CommitDatasetImpl ()=0 |
virtual void | InitImpl (RNTupleModel &model)=0 |
RSealedPage | SealPage (const RPage &page, const RColumnElementBase &element) |
Helper for streaming a page. | |
void | WaitForAllTasks () |
Protected Attributes | |
bool | fIsInitialized = false |
Flag if sink was initialized. | |
std::unique_ptr< ROOT::RNTupleWriteOptions > | fOptions |
Detail::RNTupleMetrics | fMetrics |
std::string | fNTupleName |
std::unique_ptr< RPageAllocator > | fPageAllocator |
For the time being, we will use the heap allocator for all sources and sinks. This may change in the future. | |
RTaskScheduler * | fTaskScheduler = nullptr |
Private Attributes | |
std::vector< Callback_t > | fOnDatasetCommitCallbacks |
std::vector< unsigned char > | fSealPageBuffer |
Used as destination buffer in the simple SealPage overload. | |
RWritePageMemoryManager | fWritePageMemoryManager |
Used in ReservePage to maintain the page buffer budget. | |
Additional Inherited Members | |
static constexpr std::size_t | kNBytesPageChecksum = sizeof(std::uint64_t) |
The page checksum is a 64-bit xxhash3. | |
#include <ROOT/RPageStorage.hxx>
using ROOT::Experimental::Internal::RPageSink::Callback_t = std::function<void(RPageSink &)> |
Definition at line 260 of file RPageStorage.hxx.
ROOT::Experimental::Internal::RPageSink::RPageSink | ( | std::string_view | ntupleName, |
const ROOT::RNTupleWriteOptions & | options ) |
Definition at line 665 of file RPageStorage.cxx.
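ROOT::Experimental::Internal::RPageSink::RPageSink | ( | RPageSink && | ) |
default |
ROOT::Experimental::Internal::RPageSink::~RPageSink | ( | ) |
override |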
Definition at line 671 of file RPageStorage.cxx.
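virtual std::uint64_t ROOT::Experimental::Internal::RPageSink::CommitCluster | ( | ROOT::NTupleSize_t | nNewEntries | ) |
inline virtual |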
Finalize the current cluster and create a new one for the following data.
Returns the number of bytes written to storage (excluding meta-data).
Reimplemented in ROOT::Experimental::Internal::RPageSinkBuf.
Definition at line 380 of file RPageStorage.hxx.
virtual void ROOT::Experimental::Internal::RPageSink::CommitClusterGroup | ( | ) |
pure virtual |
Write out the page locations (page list envelope) for all the committed clusters since the last call of CommitClusterGroup (or the beginning of writing).
Implemented in ROOT::Experimental::Internal::RPageNullSink, ROOT::Experimental::Internal::RPageSinkBuf, and ROOT::Experimental::Internal::RPagePersistentSink.
void ROOT::Experimental::Internal::RPageSink::CommitDataset | ( | ) |
Run the registered callbacks and finalize the current cluster and the entire data set.
Definition at line 729 of file RPageStorage.cxx.
virtual void ROOT::Experimental::Internal::RPageSink::CommitDatasetImpl | ( | ) |
protected pure virtual |
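virtual void ROOT::Experimental::Internal::RPageSink::CommitPage | ( | ColumnHandle_t | columnHandle, |
const RPage & | page ) |
pure virtual |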
Write a page to the storage. The column must have been added before.
Implemented in ROOT::Experimental::Internal::RPageSinkBuf, ROOT::Experimental::Internal::RPagePersistentSink, and ROOT::Experimental::Internal::RPageNullSink.
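virtual void ROOT::Experimental::Internal::RPageSink::CommitSealedPage | ( | ROOT::DescriptorId_t | physicalColumnId, |
const RPageStorage::RSealedPage & | sealedPage ) |
pure virtual |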
Write a preprocessed page to storage. The column must have been added before.
Implemented in ROOT::Experimental::Internal::RPagePersistentSink, ROOT::Experimental::Internal::RPageSinkBuf, and ROOT::Experimental::Internal::RPageNullSink.
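virtual void ROOT::Experimental::Internal::RPageSink::CommitSealedPageV | ( | std::span< RPageStorage::RSealedPageGroup > | ranges | ) |
pure virtual |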
Write a vector of preprocessed pages to storage. The corresponding columns must have been added before.
Implemented in ROOT::Experimental::Internal::RPageSinkBuf, ROOT::Experimental::Internal::RPagePersistentSink, and ROOT::Experimental::Internal::RPageNullSink.
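virtual void ROOT::Experimental::Internal::RPageSink::CommitStagedClusters | ( | std::span< RStagedCluster > | clusters | ) |
pure virtual |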
Commit staged clusters, logically appending them to the ntuple descriptor.
Implemented in ROOT::Experimental::Internal::RPageSinkBuf, ROOT::Experimental::Internal::RPagePersistentSink, and ROOT::Experimental::Internal::RPageNullSink.
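virtual void ROOT::Experimental::Internal::RPageSink::CommitSuppressedColumn | ( | ColumnHandle_t | columnHandle | ) |
pure virtual |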
Commits a suppressed column for the current cluster.
Can be called at any time before CommitCluster(). For any given column and cluster, calls to CommitSuppressedColumn() and page commits are mutually exclusive.
Implemented in ROOT::Experimental::Internal::RPageSinkBuf, ROOT::Experimental::Internal::RPagePersistentSink, and ROOT::Experimental::Internal::RPageNullSink.
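The exclusivity rule for suppressed columns can be sketched with a small state tracker. Everything here is hypothetical and not ROOT API: `EColumnState`, `ToyClusterState`, and the choice to throw on a violation are assumptions made only to show the per-cluster, per-column bookkeeping the rule implies.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical bookkeeping; the enum and class are illustrative, not ROOT API.
enum class EColumnState { kUntouched, kHasPages, kSuppressed };

class ToyClusterState {
   std::vector<EColumnState> fStates; // one state per column for the open cluster

public:
   explicit ToyClusterState(std::size_t nColumns) : fStates(nColumns, EColumnState::kUntouched) {}

   void CommitPage(std::size_t column)
   {
      if (fStates.at(column) == EColumnState::kSuppressed)
         throw std::logic_error("page commit on a suppressed column");
      fStates[column] = EColumnState::kHasPages;
   }

   void CommitSuppressedColumn(std::size_t column)
   {
      if (fStates.at(column) == EColumnState::kHasPages)
         throw std::logic_error("suppressing a column that already has pages");
      fStates[column] = EColumnState::kSuppressed;
   }

   // Committing the cluster resets the per-cluster state of every column.
   void CommitCluster() { fStates.assign(fStates.size(), EColumnState::kUntouched); }

   bool IsSuppressed(std::size_t column) const { return fStates.at(column) == EColumnState::kSuppressed; }
};

// Returns true if mixing both kinds of commits within one cluster is rejected.
bool RunSuppressionExample()
{
   ToyClusterState cluster(2);
   cluster.CommitPage(0);
   cluster.CommitSuppressedColumn(1); // allowed: column 1 has no pages in this cluster
   assert(cluster.IsSuppressed(1));
   bool rejected = false;
   try {
      cluster.CommitSuppressedColumn(0); // column 0 already has a page
   } catch (const std::logic_error &) {
      rejected = true;
   }
   cluster.CommitCluster();
   cluster.CommitPage(1); // a new cluster starts with a clean slate
   return rejected;
}
```

The reset in `CommitCluster()` reflects the wording "for any given column and cluster": a column suppressed in one cluster may carry pages in the next.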
void ROOT::Experimental::Internal::RPageSink::DropColumn | ( | ColumnHandle_t | ) |
inline final virtual |
Unregisters a column.
A page source decreases the reference counter for the corresponding active column. For a page sink, dropping columns is currently a no-op.
Implements ROOT::Experimental::Internal::RPageStorage.
Definition at line 309 of file RPageStorage.hxx.
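virtual const RNTupleDescriptor & ROOT::Experimental::Internal::RPageSink::GetDescriptor | ( | ) | const |
pure virtual |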
Return the RNTupleDescriptor being constructed.
Implemented in ROOT::Experimental::Internal::RPageNullSink, ROOT::Experimental::Internal::RPageSinkBuf, and ROOT::Experimental::Internal::RPagePersistentSink.
virtual ROOT::NTupleSize_t ROOT::Experimental::Internal::RPageSink::GetNEntries | ( | ) | const |
pure virtual |
virtual RSinkGuard ROOT::Experimental::Internal::RPageSink::GetSinkGuard | ( | ) |
inline virtual |
Definition at line 422 of file RPageStorage.hxx.
EPageStorageType ROOT::Experimental::Internal::RPageSink::GetType | ( | ) |
inline final virtual |
Whether the concrete implementation is a sink or a source.
Implements ROOT::Experimental::Internal::RPageStorage.
Definition at line 305 of file RPageStorage.hxx.
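const ROOT::RNTupleWriteOptions & ROOT::Experimental::Internal::RPageSink::GetWriteOptions | ( | ) | const |
inline |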
Returns the sink's write options.
Definition at line 307 of file RPageStorage.hxx.
void ROOT::Experimental::Internal::RPageSink::Init | ( | RNTupleModel & | model | ) |
inline |
Physically creates the storage container to hold the ntuple (e.g., a key in a TFile or an S3 bucket). Init() associates column handles to the columns referenced by the model.
Definition at line 320 of file RPageStorage.hxx.
virtual void ROOT::Experimental::Internal::RPageSink::InitImpl | ( | RNTupleModel & | model | ) |
protected pure virtual |
bool ROOT::Experimental::Internal::RPageSink::IsInitialized | ( | ) | const |
inline |
Definition at line 311 of file RPageStorage.hxx.
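void ROOT::Experimental::Internal::RPageSink::RegisterOnCommitDatasetCallback | ( | Callback_t | callback | ) |
inline |
The registered callback is executed at the beginning of CommitDataset().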
Definition at line 391 of file RPageStorage.hxx.
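The callback mechanism can be pictured with the following stand-alone sketch. It is not the ROOT implementation: `ToyCallbackSink`, its `Log()` helper, and the assumption that callbacks run in registration order are hypothetical. Only the `std::function<void(Sink &)>` shape of the callback type and the "callbacks first, then finalize" ordering are taken from this page.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sink that records the order of operations in a log.
class ToyCallbackSink {
public:
   // Same shape as Callback_t on this page, but bound to the toy class.
   using Callback_t = std::function<void(ToyCallbackSink &)>;

   void RegisterOnCommitDatasetCallback(Callback_t callback) { fCallbacks.emplace_back(std::move(callback)); }

   void Log(const std::string &what) { fLog.push_back(what); }

   // Run all callbacks first, then finalize the data set.
   void CommitDataset()
   {
      for (const auto &cb : fCallbacks)
         cb(*this); // assumption: callbacks run in registration order
      Log("finalize");
   }

   const std::vector<std::string> &GetLog() const { return fLog; }

private:
   std::vector<Callback_t> fCallbacks;
   std::vector<std::string> fLog;
};

// Returns the recorded order of operations for two registered callbacks.
std::vector<std::string> RunCallbackExample()
{
   ToyCallbackSink sink;
   sink.RegisterOnCommitDatasetCallback([](ToyCallbackSink &s) { s.Log("first callback"); });
   sink.RegisterOnCommitDatasetCallback([](ToyCallbackSink &s) { s.Log("second callback"); });
   assert(sink.GetLog().empty()); // registering alone does not run anything
   sink.CommitDataset();
   return sink.GetLog();
}
```

Because each callback receives the sink by reference, it can still add information to the sink before finalization, which matches the description of UpdateExtraTypeInfo() being called from such a callback.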
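virtual RPage ROOT::Experimental::Internal::RPageSink::ReservePage | ( | ColumnHandle_t | columnHandle, |
std::size_t | nElements ) |
virtual |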
Get a new, empty page for the given column that can be filled with up to nElements; nElements must be larger than zero.
Reimplemented in ROOT::Experimental::Internal::RPageSinkBuf.
Definition at line 737 of file RPageStorage.cxx.
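RSealedPage ROOT::Experimental::Internal::RPageSink::SealPage | ( | const RPage & | page, |
const RColumnElementBase & | element ) |
protected |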
Helper for streaming a page.
This is commonly used in derived, concrete page sinks. Note that if compressionSetting is 0 (uncompressed) and the page is mappable and not checksummed, the returned sealed page will point directly to the input page buffer. Otherwise, the sealed page references fSealPageBuffer. Thus, the buffer pointed to by the RSealedPage should never be freed.
Definition at line 712 of file RPageStorage.cxx.
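The buffer ownership rule above can be illustrated with a simplified model. The types below are hypothetical and not ROOT API: `ToySealedPage` is a plain non-owning view, `ToySealer::fScratch` plays the role of fSealPageBuffer, and the `needsTransform` flag stands in for the "compressed, not mappable, or checksummed" condition. The byte flip is only a placeholder for the real transformation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-owning view of sealed bytes; hypothetical, not the ROOT RSealedPage.
struct ToySealedPage {
   const unsigned char *fBuffer;
   std::size_t fSize;
};

class ToySealer {
   std::vector<unsigned char> fScratch; // plays the role of fSealPageBuffer

public:
   // If no transformation is needed, return a view of the input buffer;
   // otherwise write into the scratch buffer and return a view of that.
   ToySealedPage Seal(const std::vector<unsigned char> &page, bool needsTransform)
   {
      if (!needsTransform)
         return ToySealedPage{page.data(), page.size()};
      fScratch.assign(page.begin(), page.end());
      for (auto &byte : fScratch)
         byte = static_cast<unsigned char>(byte ^ 0xFF); // placeholder for packing or compression
      return ToySealedPage{fScratch.data(), fScratch.size()};
   }

   const unsigned char *ScratchData() const { return fScratch.data(); }
};

// Returns true if the two views alias the input and the scratch buffer, respectively.
bool RunSealExample()
{
   ToySealer sealer;
   std::vector<unsigned char> page = {1, 2, 3};
   auto direct = sealer.Seal(page, false);
   bool aliasesInput = (direct.fBuffer == page.data());
   auto sealed = sealer.Seal(page, true);
   bool aliasesScratch = (sealed.fBuffer == sealer.ScratchData());
   assert(sealed.fBuffer[0] == 0xFE); // 1 ^ 0xFF
   return aliasesInput && aliasesScratch;
}
```

In both branches the returned view owns nothing, so the caller must not free it. In this toy model, a later call that uses the scratch buffer also overwrites what an earlier view pointed to.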
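RSealedPage ROOT::Experimental::Internal::RPageSink::SealPage | ( | const RSealPageConfig & | config | ) |
static |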
Seal a page using the provided info.
Definition at line 674 of file RPageStorage.cxx.
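virtual RStagedCluster ROOT::Experimental::Internal::RPageSink::StageCluster | ( | ROOT::NTupleSize_t | nNewEntries | ) |
pure virtual |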
Stage the current cluster and create a new one for the following data.
Returns the object that must be passed to CommitStagedClusters to logically append the staged cluster to the ntuple descriptor.
Implemented in ROOT::Experimental::Internal::RPageSinkBuf, ROOT::Experimental::Internal::RPagePersistentSink, and ROOT::Experimental::Internal::RPageNullSink.
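The split between staging and committing can be sketched as follows. This is a hypothetical model, not ROOT code: `ToyStagedCluster`, `ToyStagingSink`, and their fields are illustrative names, and treating `CommitCluster()` as "stage, then commit immediately" is an assumption about how the two operations relate.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model; field and class names are illustrative, not ROOT API.
struct ToyStagedCluster {
   std::uint64_t fNEntries;
   std::uint64_t fNBytesWritten;
};

class ToyStagingSink {
   std::uint64_t fPendingBytes = 0; // bytes of pages written for the open cluster
   std::uint64_t fNEntries = 0;     // entries visible in the descriptor

public:
   void CommitPage(std::uint64_t nBytes) { fPendingBytes += nBytes; }

   // Close the open cluster without making it visible in the descriptor.
   ToyStagedCluster StageCluster(std::uint64_t nNewEntries)
   {
      ToyStagedCluster staged{nNewEntries, fPendingBytes};
      fPendingBytes = 0;
      return staged;
   }

   // Logically append previously staged clusters, in the given order.
   void CommitStagedClusters(const std::vector<ToyStagedCluster> &clusters)
   {
      for (const auto &c : clusters)
         fNEntries += c.fNEntries;
   }

   // Assumed relationship: committing a cluster is staging followed by an immediate commit.
   std::uint64_t CommitCluster(std::uint64_t nNewEntries)
   {
      auto staged = StageCluster(nNewEntries);
      CommitStagedClusters({staged});
      return staged.fNBytesWritten;
   }

   std::uint64_t GetNEntries() const { return fNEntries; }
};

// Returns the number of visible entries after committing two staged clusters.
std::uint64_t RunStagingExample()
{
   ToyStagingSink sink;
   sink.CommitPage(4096);
   auto first = sink.StageCluster(10);
   sink.CommitPage(2048);
   auto second = sink.StageCluster(5);
   assert(sink.GetNEntries() == 0); // staged clusters are not yet part of the dataset
   sink.CommitStagedClusters({first, second});
   return sink.GetNEntries();
}
```

Separating the two steps lets the data of several clusters be written before the point at which they are logically appended is chosen.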
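virtual void ROOT::Experimental::Internal::RPageSink::UpdateExtraTypeInfo | ( | const RExtraTypeInfoDescriptor & | extraTypeInfo | ) |
pure virtual |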
Adds an extra type information record to schema.
The extra type information will be written to the extension header. The information in the record will be merged with the existing information, e.g. duplicate streamer info records will be removed. This method is called by the "on commit dataset" callback registered by specific fields (e.g., streamer field) and during merging.
Implemented in ROOT::Experimental::Internal::RPageNullSink, ROOT::Experimental::Internal::RPageSinkBuf, and ROOT::Experimental::Internal::RPagePersistentSink.
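virtual void ROOT::Experimental::Internal::RPageSink::UpdateSchema | ( | const RNTupleModelChangeset & | changeset, |
ROOT::NTupleSize_t | firstEntry ) |
pure virtual |
Incorporate incremental changes to the model into the ntuple descriptor.
This happens, e.g., if new fields were added after the initial call to RPageSink::Init(RNTupleModel &). firstEntry specifies the global index for the first stored element in the added columns.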
Implemented in ROOT::Experimental::Internal::RPageNullSink, ROOT::Experimental::Internal::RPageSinkBuf, and ROOT::Experimental::Internal::RPagePersistentSink.
bool ROOT::Experimental::Internal::RPageSink::fIsInitialized = false |
protected |
Flag if sink was initialized.
Definition at line 281 of file RPageStorage.hxx.
std::vector< Callback_t > ROOT::Experimental::Internal::RPageSink::fOnDatasetCommitCallbacks |
private |
Definition at line 290 of file RPageStorage.hxx.
std::unique_ptr< ROOT::RNTupleWriteOptions > ROOT::Experimental::Internal::RPageSink::fOptions |
protected |
Definition at line 278 of file RPageStorage.hxx.
std::vector< unsigned char > ROOT::Experimental::Internal::RPageSink::fSealPageBuffer |
private |
Used as destination buffer in the simple SealPage overload.
Definition at line 291 of file RPageStorage.hxx.
RWritePageMemoryManager ROOT::Experimental::Internal::RPageSink::fWritePageMemoryManager |
private |
Used in ReservePage to maintain the page buffer budget.
Definition at line 294 of file RPageStorage.hxx.