ROOT Reference Guide
 
RPageSinkBuf.hxx
/// \file ROOT/RPageSinkBuf.hxx
/// \ingroup NTuple ROOT7
/// \author Jakob Blomer <jblomer@cern.ch>
/// \author Max Orok <maxwellorok@gmail.com>
/// \author Javier Lopez-Gomez <javier.lopez.gomez@cern.ch>
/// \date 2021-03-17
/// \warning This is part of the ROOT 7 prototype! It will change without notice. It might trigger earthquakes. Feedback
/// is welcome!

/*************************************************************************
 * Copyright (C) 1995-2021, Rene Brun and Fons Rademakers.               *
 * All rights reserved.                                                  *
 *                                                                       *
 * For the licensing terms see $ROOTSYS/LICENSE.                         *
 * For the list of contributors see $ROOTSYS/README/CREDITS.             *
 *************************************************************************/

#ifndef ROOT7_RPageSinkBuf
#define ROOT7_RPageSinkBuf

#include <ROOT/RPageStorage.hxx>

#include <deque>
#include <functional>
#include <iterator>
#include <memory>
#include <tuple>

namespace ROOT {
namespace Experimental {
namespace Internal {

// clang-format off
/**
\class ROOT::Experimental::Internal::RPageSinkBuf
\ingroup NTuple
\brief Wrapper sink that coalesces cluster column page writes
*/
// clang-format on
class RPageSinkBuf : public RPageSink {
private:
   /// A buffered column. The column is not responsible for RPage memory management (i.e. ReservePage),
   /// which is handled by the enclosing RPageSinkBuf.
   class RColumnBuf {
   public:
      struct RPageZipItem {
         // Compression scratch buffer for fSealedPage.
         std::unique_ptr<unsigned char[]> fBuf;
         RPageStorage::RSealedPage *fSealedPage = nullptr;
         bool IsSealed() const { return fSealedPage != nullptr; }
      };
   public:
      RColumnBuf() = default;
      RColumnBuf(const RColumnBuf&) = delete;
      RColumnBuf& operator=(const RColumnBuf&) = delete;
      RColumnBuf(RColumnBuf&&) = default;
      RColumnBuf& operator=(RColumnBuf&&) = default;

      /// Returns a reference to the newly buffered page. The reference remains
      /// valid until DropBufferedPages().
      RPageZipItem &BufferPage(RPageStorage::ColumnHandle_t columnHandle)
      {
         if (!fCol) {
            fCol = columnHandle;
         }
         // Safety: Insertion at the end of a deque never invalidates references
         // to existing elements.
         return fBufferedPages.emplace_back();
      }
      const RPageStorage::ColumnHandle_t &GetHandle() const { return fCol; }
      bool IsEmpty() const { return fBufferedPages.empty(); }
      bool HasSealedPagesOnly() const { return fBufferedPages.size() == fSealedPages.size(); }
      const RPageStorage::SealedPageSequence_t &GetSealedPages() const { return fSealedPages; }

      void DropBufferedPages();

      // The returned reference points to a default-constructed RSealedPage. It can be used
      // to fill in data after sealing.
      RPageStorage::RSealedPage &RegisterSealedPage()
      {
         return fSealedPages.emplace_back();
      }

   private:
      RPageStorage::ColumnHandle_t fCol;
      /// Using a deque guarantees that references to elements are never invalidated
      /// by appends to the end of the deque by BufferPage.
      std::deque<RPageZipItem> fBufferedPages;
      /// Pages that have already been sealed by a concurrent task. A vector commit can be issued if all
      /// buffered pages have been sealed.
      /// Note that each RSealedPage refers to the same buffer as `fBufferedPages[i].fBuf` for some value of `i`, and
      /// is thus owned by the corresponding RPageZipItem.
      RPageStorage::SealedPageSequence_t fSealedPages;
   };

private:
   /// I/O performance counters that get registered in fMetrics
   std::unique_ptr<RCounters> fCounters;
   /// The inner sink, responsible for actually performing I/O.
   std::unique_ptr<RPageSink> fInnerSink;
   /// The buffered page sink maintains a copy of the RNTupleModel for the inner sink.
   /// For the unbuffered case, the RNTupleModel is instead managed by a RNTupleWriter.
   std::unique_ptr<RNTupleModel> fInnerModel;
   /// Vector of buffered column pages. Indexed by column id.
   std::vector<RColumnBuf> fBufferedColumns;
   /// Columns committed as suppressed are stored and passed to the inner sink at cluster commit
   std::vector<ColumnHandle_t> fSuppressedColumns;

   void ConnectFields(const std::vector<ROOT::RFieldBase *> &fields, ROOT::NTupleSize_t firstEntry);
   void FlushClusterImpl(std::function<void(void)> FlushClusterFn);

public:
   explicit RPageSinkBuf(std::unique_ptr<RPageSink> inner);
   RPageSinkBuf(const RPageSinkBuf&) = delete;
   RPageSinkBuf& operator=(const RPageSinkBuf&) = delete;
   RPageSinkBuf& operator=(RPageSinkBuf&&) = default;
   ~RPageSinkBuf() override;

   ColumnHandle_t AddColumn(ROOT::DescriptorId_t fieldId, ROOT::Internal::RColumn &column) final;

   const ROOT::RNTupleDescriptor &GetDescriptor() const final { return fInnerSink->GetDescriptor(); }

   ROOT::NTupleSize_t GetNEntries() const final { return fInnerSink->GetNEntries(); }

   void InitImpl(RNTupleModel &model) final;
   void UpdateSchema(const RNTupleModelChangeset &changeset, ROOT::NTupleSize_t firstEntry) final;
   void UpdateExtraTypeInfo(const ROOT::RExtraTypeInfoDescriptor &extraTypeInfo) final;

   void CommitSuppressedColumn(ColumnHandle_t columnHandle) final;
   void CommitPage(ColumnHandle_t columnHandle, const ROOT::Internal::RPage &page) final;
   void CommitSealedPage(ROOT::DescriptorId_t physicalColumnId, const RSealedPage &sealedPage) final;
   void CommitSealedPageV(std::span<RPageStorage::RSealedPageGroup> ranges) final;
   std::uint64_t CommitCluster(ROOT::NTupleSize_t nNewEntries) final;
   RStagedCluster StageCluster(ROOT::NTupleSize_t nNewEntries) final;
   void CommitStagedClusters(std::span<RStagedCluster> clusters) final;
   void CommitClusterGroup() final;

   ROOT::Internal::RPage ReservePage(ColumnHandle_t columnHandle, std::size_t nElements) final;
}; // RPageSinkBuf

} // namespace Internal
} // namespace Experimental
} // namespace ROOT

#endif
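
The safety comment in BufferPage relies on a standard-library guarantee: appending to a std::deque may invalidate iterators, but references to existing elements stay valid. The following is a minimal standalone sketch of that guarantee, not code from ROOT. The names Item and DequeReferenceSurvivesAppends are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for RPageZipItem; only the payload matters here.
struct Item {
   int fValue = 0;
};

// Appends one element, keeps a reference to it, then appends `nMore` further
// elements. Returns true if the reference still denotes the first element.
bool DequeReferenceSurvivesAppends(std::size_t nMore)
{
   std::deque<Item> items;
   Item &first = items.emplace_back(); // C++17: emplace_back returns a reference
   first.fValue = 42;
   for (std::size_t i = 0; i < nMore; ++i)
      items.emplace_back();
   assert(items.size() == nMore + 1);
   // Insertion at the end of a deque invalidates iterators, but never
   // references or pointers to elements that already exist.
   return &first == &items.front() && first.fValue == 42;
}
```

The same test with a std::vector would not be safe, because a reallocation on growth can move every element.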
ROOT::Detail::TRangeCast< T, true > TRangeDynCast
TRangeDynCast is an adapter class that allows the typed iteration through a TCollection.
A thread-safe integral performance counter.
A non thread-safe integral performance counter.
An either thread-safe or non thread safe counter for CPU ticks.
RPageZipItem & BufferPage(RPageStorage::ColumnHandle_t columnHandle)
Returns a reference to the newly buffered page.
RColumnBuf & operator=(const RColumnBuf &)=delete
const RPageStorage::ColumnHandle_t & GetHandle() const
RPageStorage::SealedPageSequence_t fSealedPages
Pages that have been already sealed by a concurrent task.
std::deque< RPageZipItem > fBufferedPages
Using a deque guarantees that element iterators are never invalidated by appends to the end of the it...
RColumnBuf & operator=(RColumnBuf &&)=default
const RPageStorage::SealedPageSequence_t & GetSealedPages() const
Wrapper sink that coalesces cluster column page writes.
RPageSinkBuf & operator=(RPageSinkBuf &&)=default
void UpdateSchema(const RNTupleModelChangeset &changeset, ROOT::NTupleSize_t firstEntry) final
Incorporate incremental changes to the model into the ntuple descriptor.
void CommitPage(ColumnHandle_t columnHandle, const ROOT::Internal::RPage &page) final
Write a page to the storage. The column must have been added before.
void CommitStagedClusters(std::span< RStagedCluster > clusters) final
Commit staged clusters, logically appending them to the ntuple descriptor.
void CommitSealedPage(ROOT::DescriptorId_t physicalColumnId, const RSealedPage &sealedPage) final
Write a preprocessed page to storage. The column must have been added before.
RPageSinkBuf(const RPageSinkBuf &)=delete
std::vector< RColumnBuf > fBufferedColumns
Vector of buffered column pages. Indexed by column id.
void ConnectFields(const std::vector< ROOT::RFieldBase * > &fields, ROOT::NTupleSize_t firstEntry)
ROOT::NTupleSize_t GetNEntries() const final
std::unique_ptr< RNTupleModel > fInnerModel
The buffered page sink maintains a copy of the RNTupleModel for the inner sink.
std::unique_ptr< RPageSink > fInnerSink
The inner sink, responsible for actually performing I/O.
const ROOT::RNTupleDescriptor & GetDescriptor() const final
Return the RNTupleDescriptor being constructed.
void FlushClusterImpl(std::function< void(void)> FlushClusterFn)
void CommitSealedPageV(std::span< RPageStorage::RSealedPageGroup > ranges) final
Write a vector of preprocessed pages to storage. The corresponding columns must have been added befor...
ColumnHandle_t AddColumn(ROOT::DescriptorId_t fieldId, ROOT::Internal::RColumn &column) final
Register a new column.
RPageSinkBuf(std::unique_ptr< RPageSink > inner)
ROOT::Internal::RPage ReservePage(ColumnHandle_t columnHandle, std::size_t nElements) final
Get a new, empty page for the given column that can be filled with up to nElements; nElements must be...
std::unique_ptr< RCounters > fCounters
std::uint64_t CommitCluster(ROOT::NTupleSize_t nNewEntries) final
Finalize the current cluster and create a new one for the following data.
RStagedCluster StageCluster(ROOT::NTupleSize_t nNewEntries) final
Stage the current cluster and create a new one for the following data.
void InitImpl(RNTupleModel &model) final
std::vector< ColumnHandle_t > fSuppressedColumns
Columns committed as suppressed are stored and passed to the inner sink at cluster commit.
void CommitClusterGroup() final
Write out the page locations (page list envelope) for all the committed clusters since the last call ...
void UpdateExtraTypeInfo(const ROOT::RExtraTypeInfoDescriptor &extraTypeInfo) final
Adds an extra type information record to schema.
void CommitSuppressedColumn(ColumnHandle_t columnHandle) final
Commits a suppressed column for the current cluster.
RPageSinkBuf & operator=(const RPageSinkBuf &)=delete
Abstract interface to write data into an ntuple.
std::deque< RSealedPage > SealedPageSequence_t
RColumnHandle ColumnHandle_t
The column handle identifies a column with the current open page storage.
The RNTupleModel encapsulates the schema of an ntuple.
A column is a storage-backed array of a simple, fixed-size type, from which pages can be mapped into ...
Definition RColumn.hxx:40
A page is a slice of a column that is mapped into memory.
Definition RPage.hxx:46
Field-specific extra type information from the header / extension header.
The on-storage meta-data of an ntuple.
tbb::task_arena is an alias of tbb::interface7::task_arena, which doesn't allow to forward declare tb...
std::uint64_t DescriptorId_t
Distinguishes elements of the same type within a descriptor, e.g. different fields.
std::uint64_t NTupleSize_t
Integer type long enough to hold the maximum number of entries in a column.
The incremental changes to a RNTupleModel.
I/O performance counters that get registered in fMetrics.
Detail::RNTupleTickCounter< Detail::RNTupleAtomicCounter > & fTimeCpuZip
Detail::RNTupleTickCounter< Detail::RNTuplePlainCounter > & fTimeCpuCriticalSection
A sealed page contains the bytes of a page as written to storage (packed & compressed).
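
The class brief describes a wrapper sink that coalesces the page writes of a cluster and leaves the actual I/O to an inner sink. As a rough illustration of that pattern only, the sketch below buffers pages per column and forwards them column by column when the cluster is committed. It is not the RPageSinkBuf implementation: ISink, RecordingSink, BufferedSink and DemoCoalescedOrder are hypothetical names, pages are modelled as strings, and sealing, compression and staging are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal sink interface; this is not the RPageSink API.
struct ISink {
   virtual ~ISink() = default;
   virtual void CommitPage(std::size_t columnId, const std::string &page) = 0;
   virtual void CommitCluster() = 0;
};

// Inner sink that records the order in which pages arrive.
struct RecordingSink final : ISink {
   std::vector<std::pair<std::size_t, std::string>> fLog;
   void CommitPage(std::size_t columnId, const std::string &page) final { fLog.emplace_back(columnId, page); }
   void CommitCluster() final {}
};

// Wrapper that buffers pages per column and forwards them, grouped by
// column, to the inner sink at cluster commit.
class BufferedSink final : public ISink {
   std::unique_ptr<ISink> fInner;
   std::vector<std::vector<std::string>> fBuffered; // indexed by column id

public:
   explicit BufferedSink(std::unique_ptr<ISink> inner) : fInner(std::move(inner))
   {
      assert(fInner && "inner sink must not be null");
   }
   void CommitPage(std::size_t columnId, const std::string &page) final
   {
      if (columnId >= fBuffered.size())
         fBuffered.resize(columnId + 1);
      fBuffered[columnId].push_back(page);
   }
   void CommitCluster() final
   {
      for (std::size_t col = 0; col < fBuffered.size(); ++col) {
         for (const auto &page : fBuffered[col])
            fInner->CommitPage(col, page);
         fBuffered[col].clear();
      }
      fInner->CommitCluster();
   }
};

// Interleaves writes to two columns and returns the order seen by the inner sink.
std::vector<std::pair<std::size_t, std::string>> DemoCoalescedOrder()
{
   auto inner = std::make_unique<RecordingSink>();
   RecordingSink *log = inner.get(); // non-owning; stays valid while `sink` lives
   BufferedSink sink(std::move(inner));
   sink.CommitPage(0, "a0");
   sink.CommitPage(1, "b0");
   sink.CommitPage(0, "a1");
   sink.CommitCluster();
   return log->fLog; // copied before `sink` and its inner sink are destroyed
}
```

In this sketch the inner sink receives both pages of column 0 before the page of column 1, although the writes were interleaved. Nothing reaches the inner sink until CommitCluster is called.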