Storage provider that writes ntuple pages into a file.
The written file can be either in ROOT format or in RNTuple bare format.
Definition at line 54 of file RPageStorageFile.hxx.
Classes | |
| struct | CommitBatch |
Public Types | |
| using | Callback_t = std::function<void(RPageSink &)> |
| using | ColumnHandle_t = RColumnHandle |
| The column handle identifies a column with the current open page storage. | |
| using | SealedPageSequence_t = std::deque<RSealedPage> |
Public Member Functions | |
| RPageSinkFile (const RPageSinkFile &)=delete | |
| RPageSinkFile (RPageSinkFile &&)=default | |
| RPageSinkFile (std::string_view ntupleName, ROOT::Experimental::RFile &file, std::string_view ntupleDir, const ROOT::RNTupleWriteOptions &options) | |
| RPageSinkFile (std::string_view ntupleName, std::string_view path, const ROOT::RNTupleWriteOptions &options) | |
| RPageSinkFile (std::string_view ntupleName, TDirectory &fileOrDirectory, const ROOT::RNTupleWriteOptions &options) | |
| ~RPageSinkFile () override | |
| ColumnHandle_t | AddColumn (ROOT::DescriptorId_t fieldId, ROOT::Internal::RColumn &column) final |
| Register a new column. | |
| std::unique_ptr< RPageSink > | CloneAsHidden (std::string_view name, const ROOT::RNTupleWriteOptions &opts) const override |
| Creates a new sink with the same underlying storage as this but writing to a different RNTuple named name. | |
| void | CommitAttributeSet (std::string_view attrSetName, const RNTupleLink &attrAnchorInfo) final |
| Adds the given anchor information (name + locator) into the main RNTuple's descriptor as an attribute set linked to it with the given name. | |
| virtual std::uint64_t | CommitCluster (ROOT::NTupleSize_t nNewEntries) |
| Finalize the current cluster and create a new one for the following data. | |
| void | CommitClusterGroup () final |
| Write out the page locations (page list envelope) for all the committed clusters since the last call of CommitClusterGroup (or the beginning of writing). | |
| RNTupleLink | CommitDataset () |
| Run the registered callbacks and finalize the current cluster and the entire data set. | |
| void | CommitPage (ColumnHandle_t columnHandle, const ROOT::Internal::RPage &page) final |
| Write a page to the storage. The column must have been added before. | |
| void | CommitSealedPage (ROOT::DescriptorId_t physicalColumnId, const RPageStorage::RSealedPage &sealedPage) final |
| Write a preprocessed page to storage. The column must have been added before. | |
| void | CommitSealedPageV (std::span< RPageStorage::RSealedPageGroup > ranges) final |
| Write a vector of preprocessed pages to storage. The corresponding columns must have been added before. | |
| void | CommitStagedClusters (std::span< RStagedCluster > clusters) final |
| Commit staged clusters, logically appending them to the ntuple descriptor. | |
| void | CommitSuppressedColumn (ColumnHandle_t columnHandle) final |
| Commits a suppressed column for the current cluster. | |
| void | DropColumn (ColumnHandle_t) final |
| Unregisters a column. | |
| ROOT::DescriptorId_t | GetColumnId (ColumnHandle_t columnHandle) const |
| const ROOT::RNTupleDescriptor & | GetDescriptor () const final |
| Return the RNTupleDescriptor being constructed. | |
| virtual ROOT::Experimental::Detail::RNTupleMetrics & | GetMetrics () |
| Returns the default metrics object. | |
| ROOT::NTupleSize_t | GetNEntries () const final |
| const std::string & | GetNTupleName () const |
| Returns the NTuple name. | |
| virtual RSinkGuard | GetSinkGuard () |
| EPageStorageType | GetType () final |
| Whether the concrete implementation is a sink or a source. | |
| const ROOT::RNTupleWriteOptions & | GetWriteOptions () const |
| Returns the sink's write options. | |
| void | Init (RNTupleModel &model) |
| Physically creates the storage container to hold the ntuple (e.g., a key in a TFile or an S3 bucket). Init() associates column handles with the columns referenced by the model. | |
| std::unique_ptr< RNTupleModel > | InitFromDescriptor (const ROOT::RNTupleDescriptor &descriptor, bool copyClusters) |
| Initialize sink based on an existing descriptor and fill into the descriptor builder, optionally copying over the descriptor's clusters to this sink's descriptor. | |
| bool | IsInitialized () const |
| RPageSinkFile & | operator= (const RPageSinkFile &)=delete |
| RPageSinkFile & | operator= (RPageSinkFile &&)=default |
| void | RegisterOnCommitDatasetCallback (Callback_t callback) |
| The registered callback is executed at the beginning of CommitDataset(). | |
| virtual ROOT::Internal::RPage | ReservePage (ColumnHandle_t columnHandle, std::size_t nElements) |
| Get a new, empty page for the given column that can be filled with up to nElements; nElements must be larger than zero. | |
| void | SetTaskScheduler (RTaskScheduler *taskScheduler) |
| RStagedCluster | StageCluster (ROOT::NTupleSize_t nNewEntries) final |
| Stage the current cluster and create a new one for the following data. | |
| void | UpdateExtraTypeInfo (const ROOT::RExtraTypeInfoDescriptor &extraTypeInfo) final |
| Adds an extra type information record to schema. | |
| void | UpdateSchema (const ROOT::Internal::RNTupleModelChangeset &changeset, ROOT::NTupleSize_t firstEntry) final |
| Incorporate incremental changes to the model into the ntuple descriptor. | |
Static Public Member Functions | |
| static std::unique_ptr< RPageSink > | Create (std::string_view ntupleName, std::string_view location, const ROOT::RNTupleWriteOptions &options=ROOT::RNTupleWriteOptions()) |
| Guess the concrete derived page sink from the location. | |
| static RSealedPage | SealPage (const RSealPageConfig &config) |
| Seal a page using the provided info. | |
Static Public Attributes | |
| static constexpr std::size_t | kNBytesPageChecksum = sizeof(std::uint64_t) |
| The page checksum is a 64-bit xxhash3. | |
Protected Member Functions | |
| RNTupleLocator | CommitClusterGroupImpl (unsigned char *serializedPageList, std::uint32_t length) final |
| Returns the locator of the page list envelope of the given buffer that contains the serialized page list. | |
| RNTupleLink | CommitDatasetImpl () final |
| RNTupleLink | CommitDatasetImpl (unsigned char *serializedFooter, std::uint32_t length) final |
| RNTupleLocator | CommitPageImpl (ColumnHandle_t columnHandle, const RPage &page) override |
| RNTupleLocator | CommitSealedPageImpl (ROOT::DescriptorId_t physicalColumnId, const RPageStorage::RSealedPage &sealedPage) final |
| std::vector< RNTupleLocator > | CommitSealedPageVImpl (std::span< RPageStorage::RSealedPageGroup > ranges, const std::vector< bool > &mask) final |
| Vector commit of preprocessed pages. | |
| void | EnableDefaultMetrics (const std::string &prefix) |
| Enables the default set of metrics provided by RPageSink. | |
| void | InitImpl (RNTupleModel &model) final |
| Updates the descriptor and calls InitImpl() that handles the backend-specific details (file, DAOS, etc.). | |
| void | InitImpl (unsigned char *serializedHeader, std::uint32_t length) final |
| RSealedPage | SealPage (const ROOT::Internal::RPage &page, const ROOT::Internal::RColumnElementBase &element) |
| Helper for streaming a page. | |
| std::uint64_t | StageClusterImpl () final |
| Returns the number of bytes written to storage (excluding metadata). | |
| void | WaitForAllTasks () |
Protected Attributes | |
| std::unique_ptr< RCounters > | fCounters |
| ROOT::Internal::RNTupleDescriptorBuilder | fDescriptorBuilder |
| RFeatures | fFeatures |
| bool | fIsInitialized = false |
| Flag if sink was initialized. | |
| ROOT::Experimental::Detail::RNTupleMetrics | fMetrics |
| std::string | fNTupleName |
| std::unique_ptr< ROOT::RNTupleWriteOptions > | fOptions |
| std::unique_ptr< ROOT::Internal::RPageAllocator > | fPageAllocator |
| For the time being, we will use the heap allocator for all sources and sinks. This may change in the future. | |
| RTaskScheduler * | fTaskScheduler = nullptr |
Private Member Functions | |
| RPageSinkFile (std::string_view ntupleName, const ROOT::RNTupleWriteOptions &options) | |
| RPageSinkFile (std::unique_ptr< ROOT::Internal::RNTupleFileWriter > writer, const ROOT::RNTupleWriteOptions &options) | |
| void | CommitBatchOfPages (CommitBatch &batch, std::vector< RNTupleLocator > &locators) |
| Subroutine of CommitSealedPageVImpl, used to perform a vector write of the (multi-)range of pages contained in batch. | |
| RNTupleLocator | WriteSealedPage (const RPageStorage::RSealedPage &sealedPage, std::size_t bytesPacked) |
| We pass bytesPacked so that TFile::ls() reports a reasonable value for the compression ratio of the corresponding key. | |
Private Attributes | |
| ROOT::Internal::RNTupleSerializer::StreamerInfoMap_t | fInfosOfClassFields |
| On UpdateSchema(), the new class fields register the corresponding streamer info here so that the streamer info records in the file can be properly updated on dataset commit. | |
| ROOT::Internal::RNTupleSerializer::StreamerInfoMap_t | fInfosOfStreamerFields |
| Union of the streamer info records that are sent from streamer fields to the sink before committing the dataset. | |
| std::uint64_t | fNBytesCurrentCluster = 0 |
| Number of bytes committed to storage in the current cluster. | |
| std::uint64_t | fNextClusterInGroup = 0 |
| Remembers the starting cluster id for the next cluster group. | |
| std::vector< Callback_t > | fOnDatasetCommitCallbacks |
| std::vector< ROOT::RClusterDescriptor::RColumnRange > | fOpenColumnRanges |
| Keeps track of the number of elements in the currently open cluster. Indexed by column id. | |
| std::vector< ROOT::RClusterDescriptor::RPageRange > | fOpenPageRanges |
| Keeps track of the written pages in the currently open cluster. Indexed by column id. | |
| ROOT::NTupleSize_t | fPrevClusterNEntries = 0 |
| Used to calculate the number of entries in the current cluster. | |
| std::vector< unsigned char > | fSealPageBuffer |
| Used as destination buffer in the simple SealPage overload. | |
| ROOT::Internal::RNTupleSerializer::RContext | fSerializationContext |
| Used to map the IDs of the descriptor to the physical IDs issued during header/footer serialization. | |
| RWritePageMemoryManager | fWritePageMemoryManager |
| Used in ReservePage to maintain the page buffer budget. | |
| std::unique_ptr< ROOT::Internal::RNTupleFileWriter > | fWriter |
#include <ROOT/RPageStorageFile.hxx>
|
inherited |
Definition at line 258 of file RPageStorage.hxx.
|
inherited |
The column handle identifies a column with the current open page storage.
Definition at line 180 of file RPageStorage.hxx.
|
inherited |
Definition at line 130 of file RPageStorage.hxx.
|
private |
Definition at line 59 of file RPageStorageFile.cxx.
|
private |
Definition at line 87 of file RPageStorageFile.cxx.
| ROOT::Internal::RPageSinkFile::RPageSinkFile | ( | std::string_view | ntupleName, |
| std::string_view | path, | ||
| const ROOT::RNTupleWriteOptions & | options ) |
Definition at line 66 of file RPageStorageFile.cxx.
| ROOT::Internal::RPageSinkFile::RPageSinkFile | ( | std::string_view | ntupleName, |
| TDirectory & | fileOrDirectory, | ||
| const ROOT::RNTupleWriteOptions & | options ) |
Definition at line 73 of file RPageStorageFile.cxx.
| ROOT::Internal::RPageSinkFile::RPageSinkFile | ( | std::string_view | ntupleName, |
| ROOT::Experimental::RFile & | file, | ||
| std::string_view | ntupleDir, | ||
| const ROOT::RNTupleWriteOptions & | options ) |
Definition at line 80 of file RPageStorageFile.cxx.
|
delete |
|
default |
|
override |
Definition at line 94 of file RPageStorageFile.cxx.
|
finalvirtualinherited |
Register a new column.
When reading, the column must exist in the ntuple on disk corresponding to the metadata. When writing, every column can only be attached once.
Implements ROOT::Internal::RPageStorage.
Definition at line 824 of file RPageStorage.cxx.
|
overridevirtual |
Creates a new sink with the same underlying storage as this but writing to a different RNTuple named name.
Only one of the two sinks can safely write at the same time. The RNTuple written by this cloned sink will be stored in a hidden key (this is a convenient assumption we make now since this method is only used to create attribute RNTuples).
Implements ROOT::Internal::RPageSink.
Definition at line 317 of file RPageStorageFile.cxx.
|
finalvirtualinherited |
Adds the given anchor information (name + locator) into the main RNTuple's descriptor as an attribute set linked to it with the given name.
The attribute set must have already been written to storage via RNTupleAttrSetWriter::Commit(). Note that, by RNTuple specs, this is only legal to call on a non-attribute RNTuple's sink.
Implements ROOT::Internal::RPageSink.
Definition at line 1274 of file RPageStorage.cxx.
|
private |
Subroutine of CommitSealedPageVImpl, used to perform a vector write of the (multi-)range of pages contained in batch.
The locators for the written pages are appended to locators. This procedure also updates some internal metrics of the page sink, hence it's not const. batch gets reset to size 0 after the writing is done (but its begin and end are not updated).
Definition at line 177 of file RPageStorageFile.cxx.
|
inlinevirtualinherited |
Finalize the current cluster and create a new one for the following data.
Returns the number of bytes written to storage (excluding metadata).
Reimplemented in ROOT::Internal::RPageSinkBuf.
Definition at line 379 of file RPageStorage.hxx.
|
finalvirtualinherited |
Write out the page locations (page list envelope) for all the committed clusters since the last call of CommitClusterGroup (or the beginning of writing).
Implements ROOT::Internal::RPageSink.
Definition at line 1232 of file RPageStorage.cxx.
|
finalprotectedvirtual |
Returns the locator of the page list envelope of the given buffer that contains the serialized page list.
Typically, the implementation takes care of compressing and writing the provided buffer.
Implements ROOT::Internal::RPagePersistentSink.
Definition at line 282 of file RPageStorageFile.cxx.
|
inherited |
Run the registered callbacks and finalize the current cluster and the entire data set.
Definition at line 774 of file RPageStorage.cxx.
|
finalprotectedvirtual |
Reimplemented from ROOT::Internal::RPagePersistentSink.
Definition at line 555 of file RPageStorage.cxx.
|
finalprotectedvirtual |
Implements ROOT::Internal::RPagePersistentSink.
Definition at line 295 of file RPageStorageFile.cxx.
|
finalvirtualinherited |
Write a page to the storage. The column must have been added before.
Implements ROOT::Internal::RPageSink.
Definition at line 1056 of file RPageStorage.cxx.
|
overrideprotectedvirtual |
Implements ROOT::Internal::RPagePersistentSink.
Definition at line 156 of file RPageStorageFile.cxx.
|
finalvirtualinherited |
Write a preprocessed page to storage. The column must have been added before.
Implements ROOT::Internal::RPageSink.
Definition at line 1067 of file RPageStorage.cxx.
|
finalprotectedvirtual |
Implements ROOT::Internal::RPagePersistentSink.
Definition at line 169 of file RPageStorageFile.cxx.
|
finalvirtualinherited |
Write a vector of preprocessed pages to storage. The corresponding columns must have been added before.
Implements ROOT::Internal::RPageSink.
Definition at line 1096 of file RPageStorage.cxx.
|
finalprotectedvirtual |
Vector commit of preprocessed pages.
The ranges array specifies a range of sealed pages to be committed for each column. The returned vector contains, in order, the RNTupleLocator for each page in each range in ranges, i.e. the first N entries refer to the N pages in ranges[0], followed by M entries that refer to the M pages in ranges[1], etc. The mask allows skipping the write of certain pages. The mask vector has one entry per page across all ranges. For every false value in the mask, the corresponding locator is omitted from the output vector. The default is to call CommitSealedPageImpl for each page; derived classes may provide an optimized implementation.
Reimplemented from ROOT::Internal::RPagePersistentSink.
Definition at line 204 of file RPageStorageFile.cxx.
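The ordering and masking rules described for the vector commit can be illustrated with a small standalone sketch. It does not use the ROOT types: `Locator`, `SealedPage`, `PageRange`, and `CommitMasked` are hypothetical stand-ins, and the "write" is simulated by advancing an offset. It only shows how the output vector skips masked-out pages while keeping range order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not the ROOT types.
struct Locator {
   std::uint64_t offset;
   std::uint64_t nBytes;
};
struct SealedPage {
   std::uint64_t nBytes;
};
using PageRange = std::vector<SealedPage>;

// Simulates a vector commit: every page whose mask entry is true is "written"
// at the current offset and gets a locator. Pages with a false mask entry are
// skipped and produce no locator, so the result can be shorter than the mask.
std::vector<Locator> CommitMasked(const std::vector<PageRange> &ranges, const std::vector<bool> &mask)
{
   std::size_t nPages = 0;
   for (const auto &range : ranges)
      nPages += range.size();
   assert(mask.size() == nPages); // one mask entry per page across all ranges

   std::vector<Locator> locators;
   std::uint64_t offset = 0; // simulated write position
   std::size_t pageIdx = 0;  // running index into the mask
   for (const auto &range : ranges) {
      for (const auto &page : range) {
         if (mask[pageIdx++]) {
            locators.push_back({offset, page.nBytes});
            offset += page.nBytes;
         }
      }
   }
   return locators;
}
```

With two ranges of two and one pages and a mask of true, false, true, the sketch returns two locators: one for the first page of the first range and one for the single page of the second range.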
|
finalvirtualinherited |
Commit staged clusters, logically appending them to the ntuple descriptor.
Implements ROOT::Internal::RPageSink.
Definition at line 1195 of file RPageStorage.cxx.
|
finalvirtualinherited |
Commits a suppressed column for the current cluster.
Can be called anytime before CommitCluster(). For any given column and cluster, there must be no calls to both CommitSuppressedColumn() and page commits.
Implements ROOT::Internal::RPageSink.
Definition at line 1051 of file RPageStorage.cxx.
|
staticinherited |
Guess the concrete derived page sink from the location.
Definition at line 794 of file RPageStorage.cxx.
|
inlinefinalvirtualinherited |
Unregisters a column.
A page source decreases the reference counter for the corresponding active column. For a page sink, dropping columns is currently a no-op.
Implements ROOT::Internal::RPageStorage.
Definition at line 307 of file RPageStorage.hxx.
|
protectedinherited |
Enables the default set of metrics provided by RPageSink.
prefix will be used as the prefix for the counters registered in the internal RNTupleMetrics object. This set of counters can be extended by a subclass by calling fMetrics.MakeCounter<...>().
A subclass using the default set of metrics is always responsible for updating the counters appropriately, e.g. fCounters->fNPageCommited.Inc()
Definition at line 1320 of file RPageStorage.cxx.
|
inlineinherited |
Definition at line 188 of file RPageStorage.hxx.
|
inlinefinalvirtualinherited |
Return the RNTupleDescriptor being constructed.
Implements ROOT::Internal::RPageSink.
Definition at line 530 of file RPageStorage.hxx.
|
inlinevirtualinherited |
Returns the default metrics object.
Subclasses might alternatively provide their own metrics object by overriding this.
Definition at line 192 of file RPageStorage.hxx.
|
inlinefinalvirtualinherited |
Implements ROOT::Internal::RPageSink.
Definition at line 532 of file RPageStorage.hxx.
|
inlineinherited |
Returns the NTuple name.
Definition at line 195 of file RPageStorage.hxx.
|
inlinevirtualinherited |
Definition at line 433 of file RPageStorage.hxx.
|
inlinefinalvirtualinherited |
Whether the concrete implementation is a sink or a source.
Implements ROOT::Internal::RPageStorage.
Definition at line 303 of file RPageStorage.hxx.
|
inlineinherited |
Returns the sink's write options.
Definition at line 305 of file RPageStorage.hxx.
|
inlineinherited |
Physically creates the storage container to hold the ntuple (e.g., a key in a TFile or an S3 bucket). Init() associates column handles with the columns referenced by the model.
Definition at line 318 of file RPageStorage.hxx.
|
nodiscardinherited |
Initialize sink based on an existing descriptor and fill into the descriptor builder, optionally copying over the descriptor's clusters to this sink's descriptor.
Definition at line 980 of file RPageStorage.cxx.
|
finalprotectedvirtual |
Updates the descriptor and calls InitImpl() that handles the backend-specific details (file, DAOS, etc.).
Reimplemented from ROOT::Internal::RPagePersistentSink.
Definition at line 535 of file RPageStorage.cxx.
|
finalprotectedvirtual |
Implements ROOT::Internal::RPagePersistentSink.
Definition at line 96 of file RPageStorageFile.cxx.
|
inlineinherited |
Definition at line 309 of file RPageStorage.hxx.
|
delete |
|
default |
|
inlineinherited |
The registered callback is executed at the beginning of CommitDataset().
Definition at line 402 of file RPageStorage.hxx.
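The callback ordering can be sketched with a miniature sink. `MiniSink`, `AddExtraInfo`, and `GetLog` are hypothetical names introduced for illustration; only the `Callback_t` shape and the rule that callbacks run before the data set is finalized are taken from this page.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature sink; it is not the ROOT class and writes nothing.
class MiniSink {
public:
   using Callback_t = std::function<void(MiniSink &)>;

   void RegisterOnCommitDatasetCallback(Callback_t callback)
   {
      fOnDatasetCommitCallbacks.emplace_back(std::move(callback));
   }

   // Stand-in for late metadata that a callback may still hand to the sink.
   void AddExtraInfo(const std::string &info)
   {
      assert(!fIsCommitted); // nothing may be added once the data set is final
      fLog.push_back(info);
   }

   void CommitDataset()
   {
      // Callbacks run first, so they can still contribute metadata.
      for (const auto &callback : fOnDatasetCommitCallbacks)
         callback(*this);
      fLog.push_back("footer"); // stand-in for finalizing the data set
      fIsCommitted = true;
   }

   const std::vector<std::string> &GetLog() const { return fLog; }

private:
   std::vector<Callback_t> fOnDatasetCommitCallbacks;
   std::vector<std::string> fLog;
   bool fIsCommitted = false;
};
```

A callback that calls `AddExtraInfo` in this sketch leaves its entry in the log ahead of the "footer" entry, which mirrors how fields can pass late information to the sink before the commit completes.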
|
virtualinherited |
Get a new, empty page for the given column that can be filled with up to nElements; nElements must be larger than zero.
Reimplemented in ROOT::Internal::RPageSinkBuf.
Definition at line 781 of file RPageStorage.cxx.
|
protectedinherited |
Helper for streaming a page.
This is commonly used in derived, concrete page sinks. Note that if compressionSetting is 0 (uncompressed) and the page is mappable and not checksummed, the returned sealed page will point directly to the input page buffer. Otherwise, the sealed page references fSealPageBuffer. Thus, the buffer pointed to by the RSealedPage should never be freed.
Definition at line 757 of file RPageStorage.cxx.
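The ownership rule for the returned buffer can be made concrete with a simplified sketch. `SealedView` and `MiniSealer` are hypothetical, compression is left out entirely, and the checksum is a plain byte sum used as a placeholder rather than xxhash3. The point is only that the result is a non-owning view into either the caller's page or the sealer's scratch buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Non-owning view of sealed bytes (hypothetical; not ROOT's RSealedPage).
struct SealedView {
   const unsigned char *buffer;
   std::size_t size;
};

class MiniSealer {
public:
   // Uncompressed case only. If the page is mappable and carries no checksum,
   // the view aliases the input page. Otherwise the bytes are copied into the
   // internal scratch buffer, which is reused by the next call.
   SealedView Seal(const unsigned char *page, std::size_t nBytes, bool mappable, bool checksum)
   {
      assert(page != nullptr && nBytes > 0);
      if (mappable && !checksum)
         return {page, nBytes}; // zero-copy: points into the caller's page

      fSealPageBuffer.resize(nBytes + (checksum ? sizeof(std::uint64_t) : 0));
      std::memcpy(fSealPageBuffer.data(), page, nBytes);
      if (checksum) {
         std::uint64_t sum = 0; // placeholder checksum, NOT xxhash3
         for (std::size_t i = 0; i < nBytes; ++i)
            sum += page[i];
         std::memcpy(fSealPageBuffer.data() + nBytes, &sum, sizeof(sum));
      }
      return {fSealPageBuffer.data(), fSealPageBuffer.size()};
   }

   const unsigned char *ScratchData() const { return fSealPageBuffer.data(); }

private:
   std::vector<unsigned char> fSealPageBuffer;
};
```

In both branches the caller receives a pointer it does not own, so freeing it would corrupt either the input page or the sealer's scratch buffer. A view into the scratch buffer also becomes stale as soon as the next page is sealed.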
|
staticinherited |
Seal a page using the provided info.
Definition at line 719 of file RPageStorage.cxx.
|
inlineinherited |
Definition at line 197 of file RPageStorage.hxx.
|
finalvirtualinherited |
Stage the current cluster and create a new one for the following data.
Returns the object that must be passed to CommitStagedClusters to logically append the staged cluster to the ntuple descriptor.
Implements ROOT::Internal::RPageSink.
Definition at line 1166 of file RPageStorage.cxx.
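The two-step protocol of staging and then committing clusters can be sketched as follows. `StagingSink` and its simplified `StagedCluster` are hypothetical and track only entry and byte counts; the real staged object carries more state. The sketch assumes that staging closes the open cluster without making it visible, and that visibility comes only from the later commit call.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for the staged-cluster object (hypothetical fields).
struct StagedCluster {
   std::uint64_t nEntries;
   std::uint64_t nBytesWritten;
};

class StagingSink {
public:
   // Stand-in for a page write: only the byte count is tracked.
   void CommitPage(std::uint64_t nBytes) { fNBytesCurrentCluster += nBytes; }

   // Closes the open cluster and hands back its summary. The cluster is not
   // yet counted in the sink's entry or cluster totals.
   StagedCluster StageCluster(std::uint64_t nNewEntries)
   {
      assert(nNewEntries > 0);
      StagedCluster staged{nNewEntries, fNBytesCurrentCluster};
      fNBytesCurrentCluster = 0; // a fresh cluster starts here
      return staged;
   }

   // Logically appends the staged clusters, in the order given.
   void CommitStagedClusters(const std::vector<StagedCluster> &clusters)
   {
      for (const auto &cluster : clusters) {
         fNEntries += cluster.nEntries;
         ++fNClusters;
      }
   }

   std::uint64_t GetNEntries() const { return fNEntries; }
   std::uint64_t GetNClusters() const { return fNClusters; }

private:
   std::uint64_t fNBytesCurrentCluster = 0;
   std::uint64_t fNEntries = 0;
   std::uint64_t fNClusters = 0;
};
```

Separating the two steps lets a caller decide the final order of clusters after their pages are already on storage, which is the reason a staged object is returned instead of updating the descriptor immediately.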
|
finalprotectedvirtual |
Returns the number of bytes written to storage (excluding metadata).
Implements ROOT::Internal::RPagePersistentSink.
Definition at line 274 of file RPageStorageFile.cxx.
|
finalvirtualinherited |
Adds an extra type information record to schema.
The extra type information will be written to the extension header. The information in the record will be merged with the existing information, e.g. duplicate streamer info records will be removed. This method is called by the "on commit dataset" callback registered by specific fields (e.g., streamer field) and during merging.
Implements ROOT::Internal::RPageSink.
Definition at line 943 of file RPageStorage.cxx.
|
finalvirtual |
Incorporate incremental changes to the model into the ntuple descriptor.
This happens, e.g. if new fields were added after the initial call to RPageSink::Init(RNTupleModel &). firstEntry specifies the global index for the first stored element in the added columns.
Reimplemented from ROOT::Internal::RPagePersistentSink.
Definition at line 104 of file RPageStorageFile.cxx.
|
inlineprotectedinherited |
Definition at line 153 of file RPageStorage.hxx.
|
inlineprivate |
We pass bytesPacked so that TFile::ls() reports a reasonable value for the compression ratio of the corresponding key.
The value is not strictly necessary for writing and reading the sealed page.
Definition at line 138 of file RPageStorageFile.cxx.
|
protectedinherited |
Definition at line 483 of file RPageStorage.hxx.
|
protectedinherited |
Definition at line 471 of file RPageStorage.hxx.
|
protectedinherited |
Definition at line 470 of file RPageStorage.hxx.
|
private |
On UpdateSchema(), the new class fields register the corresponding streamer info here so that the streamer info records in the file can be properly updated on dataset commit.
Definition at line 72 of file RPageStorageFile.hxx.
|
privateinherited |
Union of the streamer info records that are sent from streamer fields to the sink before committing the dataset.
Definition at line 462 of file RPageStorage.hxx.
|
protectedinherited |
Flag if sink was initialized.
Definition at line 279 of file RPageStorage.hxx.
|
protectedinherited |
Definition at line 146 of file RPageStorage.hxx.
|
private |
Number of bytes committed to storage in the current cluster.
Definition at line 69 of file RPageStorageFile.hxx.
|
privateinherited |
Remembers the starting cluster id for the next cluster group.
Definition at line 453 of file RPageStorage.hxx.
|
protectedinherited |
Definition at line 151 of file RPageStorage.hxx.
|
privateinherited |
Definition at line 288 of file RPageStorage.hxx.
|
privateinherited |
Keeps track of the number of elements in the currently open cluster. Indexed by column id.
Definition at line 457 of file RPageStorage.hxx.
|
privateinherited |
Keeps track of the written pages in the currently open cluster. Indexed by column id.
Definition at line 459 of file RPageStorage.hxx.
|
protectedinherited |
Definition at line 276 of file RPageStorage.hxx.
|
protectedinherited |
For the time being, we will use the heap allocator for all sources and sinks. This may change in the future.
Definition at line 149 of file RPageStorage.hxx.
|
privateinherited |
Used to calculate the number of entries in the current cluster.
Definition at line 455 of file RPageStorage.hxx.
|
privateinherited |
Used as destination buffer in the simple SealPage overload.
Definition at line 289 of file RPageStorage.hxx.
|
privateinherited |
Used to map the IDs of the descriptor to the physical IDs issued during header/footer serialization.
Definition at line 450 of file RPageStorage.hxx.
|
protectedinherited |
Definition at line 152 of file RPageStorage.hxx.
|
privateinherited |
Used in ReservePage to maintain the page buffer budget.
Definition at line 292 of file RPageStorage.hxx.
|
private |
Definition at line 67 of file RPageStorageFile.hxx.
|
staticconstexprinherited |
The page checksum is a 64-bit xxhash3.
Definition at line 73 of file RPageStorage.hxx.