RNTupleModel.hxx
1/// \file ROOT/RNTupleModel.hxx
2/// \ingroup NTuple ROOT7
3/// \author Jakob Blomer <jblomer@cern.ch>
4/// \date 2018-10-04
5/// \warning This is part of the ROOT 7 prototype! It will change without notice. It might trigger earthquakes. Feedback
6/// is welcome!
7
8/*************************************************************************
9 * Copyright (C) 1995-2019, Rene Brun and Fons Rademakers. *
10 * All rights reserved. *
11 * *
12 * For the licensing terms see $ROOTSYS/LICENSE. *
13 * For the list of contributors see $ROOTSYS/README/CREDITS. *
14 *************************************************************************/
15
16#ifndef ROOT7_RNTupleModel
17#define ROOT7_RNTupleModel
18
19#include <ROOT/REntry.hxx>
20#include <ROOT/RError.hxx>
21#include <ROOT/RField.hxx>
22#include <ROOT/RNTupleUtil.hxx>
23#include <string_view>
24
25#include <cstdint>
26#include <functional>
27#include <memory>
28#include <string>
29#include <unordered_map>
30#include <unordered_set>
31#include <utility>
32
33namespace ROOT {
34namespace Experimental {
35
36class RNTupleCollectionWriter;
37class RNTupleModel;
38class RNTupleWriter;
39class RNTupleWriteOptions;
40
41namespace Internal {
42class RPageSinkBuf;
43
44// clang-format off
45/**
46\class ROOT::Experimental::Internal::RNTupleModelChangeset
47\ingroup NTuple
48\brief The incremental changes to an `RNTupleModel`
49
50Represents a set of alterations to an `RNTupleModel` that happened after the model was used to initialize an `RPageSink`
51instance. This object can be used to communicate metadata updates to an `RPageSink`.
52You will not normally use this directly; see `RNTupleModel::RUpdater` instead.
53*/
54// clang-format on
57 /// Points to the fields in fModel that were added as part of an updater transaction
58 std::vector<RFieldBase *> fAddedFields;
59 /// Points to the projected fields in fModel that were added as part of an updater transaction
60 std::vector<RFieldBase *> fAddedProjectedFields;
61
63 bool IsEmpty() const { return fAddedFields.empty() && fAddedProjectedFields.empty(); }
64};
65
66/// Merge two RNTuple models. The resulting model will take the description from the left-hand model.
67/// When `rightFieldPrefix` is specified, the right-hand model will be stored in an untyped sub-collection, identified
68/// by the prefix. This way, a field from the right-hand model is represented as `<prefix>.<fieldname>`.
69/// When no prefix is specified, the fields from the right-hand model get added directly to the resulting model.
70///
71/// Note that both models must be frozen before merging.
72std::unique_ptr<RNTupleModel>
73MergeModels(const RNTupleModel &left, const RNTupleModel &right, std::string_view rightFieldPrefix = "");
74} // namespace Internal
75
76// clang-format off
77/**
78\class ROOT::Experimental::RNTupleModel
79\ingroup NTuple
80\brief The RNTupleModel encapsulates the schema of an ntuple.
81
82The ntuple model comprises a collection of hierarchically organized fields. From a model, "entries"
83can be extracted. For convenience, the model provides a default entry unless it is created as a "bare model".
84Models have a unique model identifier that facilitates checking whether entries are compatible with it
85(i.e., have been extracted from that model).
86
87A model is subject to a state transition during its lifetime: it starts in a building state, in which fields can be
88added and modified. Once the schema is finalized, the model gets frozen. Only frozen models can create entries.
89*/
90// clang-format on
92 friend std::unique_ptr<RNTupleModel>
93 Internal::MergeModels(const RNTupleModel &left, const RNTupleModel &right, std::string_view rightFieldPrefix);
94
95public:
96 /// User-provided function that describes the mapping of existing source fields to projected fields in terms
97 /// of fully qualified field names. The mapping function is called with the qualified field names of the provided
98 /// field and its subfields. For each name, it should return the qualified name of the field used as the mapping source.
99 using FieldMappingFunc_t = std::function<std::string(const std::string &)>;
100
101 /// A wrapper over a field name and an optional description; used in `AddField()` and `RUpdater::AddField()`
104 NameWithDescription_t(const std::string &name) : fName(name) {}
105 NameWithDescription_t(std::string_view name) : fName(name) {}
106 NameWithDescription_t(std::string_view name, std::string_view descr) : fName(name), fDescription(descr) {}
107
108 std::string_view fName;
109 std::string_view fDescription = "";
110 };
111
112 /// Projected fields are fields whose columns are reused from existing fields. Projected fields are not attached
113 /// to the model's zero field. Only the real source fields are written to; projected fields are stored as metadata
114 /// (header) information only. Only top-level projected fields are supported because otherwise the layout of types
115 /// could be altered in unexpected ways.
116 /// All projected fields and the source fields used to back them are kept in this class.
118 public:
119 /// The map keys are the projected target fields, the map values are the backing source fields.
120 /// Note that sub fields are treated individually and independently of their parent field.
121 using FieldMap_t = std::unordered_map<const RFieldBase *, const RFieldBase *>;
122
123 private:
124 explicit RProjectedFields(std::unique_ptr<RFieldZero> fieldZero) : fFieldZero(std::move(fieldZero)) {}
125 /// The projected fields are attached to this zero field
126 std::unique_ptr<RFieldZero> fFieldZero;
127 /// Maps the source fields from fModel to the target projected fields attached to fFieldZero
128 FieldMap_t fFieldMap;
129 /// The model this set of projected fields belongs to
130 const RNTupleModel *fModel;
131
132 /// Asserts that the passed field is a valid target of the source field provided in the field map.
133 /// Checks the field without looking into sub fields.
134 RResult<void> EnsureValidMapping(const RFieldBase *target, const FieldMap_t &fieldMap);
135
136 public:
137 explicit RProjectedFields(const RNTupleModel *model) : fFieldZero(std::make_unique<RFieldZero>()), fModel(model)
138 {
139 }
144 ~RProjectedFields() = default;
145
146 /// The new model needs to be a clone of fModel
147 std::unique_ptr<RProjectedFields> Clone(const RNTupleModel *newModel) const;
148
149 RFieldZero *GetFieldZero() const { return fFieldZero.get(); }
150 const RFieldBase *GetSourceField(const RFieldBase *target) const;
151 /// Adds a new projected field. The field map needs to provide valid source fields of fModel for 'field'
152 /// and each of its sub fields.
153 RResult<void> Add(std::unique_ptr<RFieldBase> field, const FieldMap_t &fieldMap);
154 bool IsEmpty() const { return fFieldZero->begin() == fFieldZero->end(); }
155 };
156
157 /// A model is usually immutable after passing it to an `RNTupleWriter`. However, for the rare
158 /// cases that require changing the model after the fact, `RUpdater` provides limited support for
159 /// incremental updates, e.g. addition of new fields.
160 ///
161 /// See `RNTupleWriter::CreateModelUpdater()` for an example.
162 class RUpdater {
163 private:
165 Internal::RNTupleModelChangeset fOpenChangeset;
166 std::uint64_t fNewModelId = 0; ///< The model ID after committing
167
168 public:
169 explicit RUpdater(RNTupleWriter &writer);
171 /// Begin a new set of alterations to the underlying model. As a side effect, all `REntry` instances related to
172 /// the model are invalidated.
173 void BeginUpdate();
174 /// Commit changes since the last call to `BeginUpdate()`. All the invalidated `REntry`s remain invalid.
175 /// `CreateEntry()` or `CreateBareEntry()` can be used to create an `REntry` that matches the new model.
176 /// Upon completion, `BeginUpdate()` can be called again to begin a new set of changes.
177 void CommitUpdate();
178
179 template <typename T, typename... ArgsT>
180 std::shared_ptr<T> MakeField(const NameWithDescription_t &fieldNameDesc, ArgsT &&...args)
181 {
182 auto objPtr = fOpenChangeset.fModel.MakeField<T>(fieldNameDesc, std::forward<ArgsT>(args)...);
183 auto fieldZero = fOpenChangeset.fModel.fFieldZero.get();
184 auto it = std::find_if(fieldZero->begin(), fieldZero->end(),
185 [&](const auto &f) { return f.GetFieldName() == fieldNameDesc.fName; });
186 R__ASSERT(it != fieldZero->end());
187 fOpenChangeset.fAddedFields.emplace_back(&(*it));
188 return objPtr;
189 }
190
191 void AddField(std::unique_ptr<RFieldBase> field);
192
193 RResult<void> AddProjectedField(std::unique_ptr<RFieldBase> field, FieldMappingFunc_t mapping);
194 };
195
196private:
197 /// Hierarchy of fields consisting of simple types and collections (sub trees)
198 std::unique_ptr<RFieldZero> fFieldZero;
199 /// Contains field values corresponding to the created top-level fields
200 std::unique_ptr<REntry> fDefaultEntry;
201 /// Keeps track of which field names are taken, including projected field names.
202 std::unordered_set<std::string> fFieldNames;
203 /// Free text set by the user
204 std::string fDescription;
205 /// The set of projected top-level fields
206 std::unique_ptr<RProjectedFields> fProjectedFields;
207 /// Every model has a unique ID to distinguish it from other models. Entries are linked to models via the ID.
208 /// Cloned models get a new model ID.
209 std::uint64_t fModelId = 0;
210 /// Changed by Freeze() / Unfreeze() and by the RUpdater.
211 bool fIsFrozen = false;
212
213 /// Checks that user-provided field names are valid in the context
214 /// of this NTuple model. Throws an RException for invalid names.
215 void EnsureValidFieldName(std::string_view fieldName);
216
217 /// Throws an RException if fIsFrozen is true
218 void EnsureNotFrozen() const;
219
220 /// Throws an RException if fDefaultEntry is nullptr
221 void EnsureNotBare() const;
222
223 /// The field name can be a top-level field or a nested field. Returns nullptr if the field is not in the model.
224 RFieldBase *FindField(std::string_view fieldName) const;
225
226 RNTupleModel(std::unique_ptr<RFieldZero> fieldZero);
227
228public:
229 RNTupleModel(const RNTupleModel&) = delete;
230 RNTupleModel &operator=(const RNTupleModel &) = delete;
231 ~RNTupleModel() = default;
232
233 std::unique_ptr<RNTupleModel> Clone() const;
234 static std::unique_ptr<RNTupleModel> Create();
235 static std::unique_ptr<RNTupleModel> Create(std::unique_ptr<RFieldZero> fieldZero);
236 /// A bare model has no default entry
237 static std::unique_ptr<RNTupleModel> CreateBare();
238 static std::unique_ptr<RNTupleModel> CreateBare(std::unique_ptr<RFieldZero> fieldZero);
239
240 /// Creates a new field given a `name` or `{name, description}` pair and a
241 /// corresponding value that is managed by a shared pointer.
242 ///
243 /// **Example: create some fields and fill an %RNTuple**
244 /// ~~~ {.cpp}
245 /// #include <ROOT/RNTupleModel.hxx>
246 /// #include <ROOT/RNTupleWriter.hxx>
247 /// using ROOT::Experimental::RNTupleModel;
248 /// using ROOT::Experimental::RNTupleWriter;
249 ///
250 /// #include <vector>
251 ///
252 /// auto model = RNTupleModel::Create();
253 /// auto pt = model->MakeField<float>("pt");
254 /// auto vec = model->MakeField<std::vector<int>>("vec");
255 ///
256 /// // The RNTuple is written to disk when the RNTupleWriter goes out of scope
257 /// {
258 /// auto writer = RNTupleWriter::Recreate(std::move(model), "myNTuple", "myFile.root");
259 /// for (int i = 0; i < 100; i++) {
260 /// *pt = static_cast<float>(i);
261 /// *vec = {i, i+1, i+2};
262 /// writer->Fill();
263 /// }
264 /// }
265 /// ~~~
266 ///
267 /// **Example: create a field with an initial value**
268 /// ~~~ {.cpp}
269 /// #include <ROOT/RNTupleModel.hxx>
270 /// using ROOT::Experimental::RNTupleModel;
271 ///
272 /// auto model = RNTupleModel::Create();
273 /// // pt's initial value is 42.0
274 /// auto pt = model->MakeField<float>("pt", 42.0);
275 /// ~~~
276 /// **Example: create a field with a description**
277 /// ~~~ {.cpp}
278 /// #include <ROOT/RNTupleModel.hxx>
279 /// using ROOT::Experimental::RNTupleModel;
280 ///
281 /// auto model = RNTupleModel::Create();
282 /// auto hadronFlavour = model->MakeField<float>({
283 /// "hadronFlavour", "flavour from hadron ghost clustering"
284 /// });
285 /// ~~~
286 template <typename T, typename... ArgsT>
287 std::shared_ptr<T> MakeField(const NameWithDescription_t &fieldNameDesc, ArgsT &&...args)
288 {
290 EnsureValidFieldName(fieldNameDesc.fName);
291 auto field = std::make_unique<RField<T>>(fieldNameDesc.fName);
292 field->SetDescription(fieldNameDesc.fDescription);
293 std::shared_ptr<T> ptr;
294 if (fDefaultEntry)
295 ptr = fDefaultEntry->AddValue<T>(*field, std::forward<ArgsT>(args)...);
296 fFieldNames.insert(field->GetFieldName());
297 fFieldZero->Attach(std::move(field));
298 return ptr;
299 }
300
301 /// Adds a field whose type is not known at compile time. Thus there is no shared pointer returned.
302 ///
303 /// Throws an exception if the field is null.
304 void AddField(std::unique_ptr<RFieldBase> field);
305
306 /// Adds a top-level field based on existing fields.
307 RResult<void> AddProjectedField(std::unique_ptr<RFieldBase> field, FieldMappingFunc_t mapping);
309
310 void Freeze();
311 void Unfreeze();
312 bool IsFrozen() const { return fIsFrozen; }
313 bool IsBare() const { return !fDefaultEntry; }
314 std::uint64_t GetModelId() const { return fModelId; }
315
316 /// Ingests a model for a sub collection and attaches it to the current model
317 ///
318 /// Throws an exception if collectionModel is null.
319 std::shared_ptr<RNTupleCollectionWriter>
320 MakeCollection(std::string_view fieldName, std::unique_ptr<RNTupleModel> collectionModel);
321
322 std::unique_ptr<REntry> CreateEntry() const;
323 /// In a bare entry, all values point to nullptr. The resulting entry shall use BindValue() in order
324 /// to set the memory addresses to be serialized / deserialized.
325 std::unique_ptr<REntry> CreateBareEntry() const;
326 /// Creates a token to be used in REntry methods to address a top-level field
327 REntry::RFieldToken GetToken(std::string_view fieldName) const;
328 /// Calls the given field's CreateBulk() method. Throws an exception if no field with the given name exists.
329 RFieldBase::RBulk CreateBulk(std::string_view fieldName) const;
330
332 const REntry &GetDefaultEntry() const;
333
334 /// Non-const access to the root field is used to commit clusters during writing,
335 /// and to make adjustments to the fields between freezing and connecting to a page sink.
336 RFieldZero &GetFieldZero() { return *fFieldZero; }
337 const RFieldZero &GetFieldZero() const { return *fFieldZero; }
338 const RFieldBase &GetField(std::string_view fieldName) const;
339
340 const std::string &GetDescription() const { return fDescription; }
341 void SetDescription(std::string_view description);
342
343 /// Estimate the memory usage for this model during writing
344 ///
345 /// This will return an estimate in bytes for the internal page and compression buffers. The value should be
346 /// understood per sequential RNTupleWriter or per RNTupleFillContext created for an RNTupleParallelWriter
347 /// constructed with this model.
348 std::size_t EstimateWriteMemoryUsage(const RNTupleWriteOptions &options = RNTupleWriteOptions()) const;
349};
350
351} // namespace Experimental
352} // namespace ROOT
353
354#endif
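
The class documentation above describes a lifecycle: a model starts in a building state, is frozen once the schema is final, and only then creates entries, which are tied to the model through its ID. The following is a minimal sketch of that lifecycle without any ROOT dependency. `ToyModel`, `ToyEntry`, `IsCompatible()` and the counter-based ID scheme are hypothetical stand-ins chosen for illustration; they are not the RNTuple implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for an entry: it remembers the ID of the model it came from.
struct ToyEntry {
   std::uint64_t fModelId = 0;
};

// Hypothetical stand-in for a model with a building state and a frozen state.
class ToyModel {
   std::unordered_set<std::string> fFieldNames;
   std::uint64_t fModelId = 0;
   bool fIsFrozen = false;

   // Assumption: IDs come from a simple process-wide counter.
   static std::uint64_t NextId()
   {
      static std::uint64_t counter = 0;
      return ++counter;
   }

public:
   ToyModel() : fModelId(NextId()) {}

   // Fields can only be added while the model is still being built.
   void AddField(const std::string &name)
   {
      if (fIsFrozen)
         throw std::logic_error("cannot add a field to a frozen model");
      if (!fFieldNames.insert(name).second)
         throw std::invalid_argument("field name already taken: " + name);
   }

   void Freeze() { fIsFrozen = true; }
   bool IsFrozen() const { return fIsFrozen; }
   std::uint64_t GetModelId() const { return fModelId; }

   // Only frozen models hand out entries; the entry records the model ID.
   ToyEntry CreateEntry() const
   {
      if (!fIsFrozen)
         throw std::logic_error("model must be frozen before creating entries");
      ToyEntry entry;
      entry.fModelId = fModelId;
      return entry;
   }

   // A clone keeps the schema and the frozen state but receives a fresh ID.
   ToyModel Clone() const
   {
      ToyModel clone; // the constructor assigns the new ID
      clone.fFieldNames = fFieldNames;
      clone.fIsFrozen = fIsFrozen;
      return clone;
   }

   // An entry is compatible only with the model it was extracted from.
   bool IsCompatible(const ToyEntry &entry) const { return entry.fModelId == fModelId; }
};
```

In this sketch, an entry created from a model is rejected by a clone of that model because the clone carries a different ID, which mirrors the note in the listing that cloned models get a new model ID.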
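
`FieldMappingFunc_t` in the listing is a `std::function<std::string(const std::string &)>` that is called with the qualified names of a projected field and its sub fields and returns the qualified name of the source for each. The sketch below shows one way such a function could be built. `MakeRenameMapping` is a hypothetical helper, not part of ROOT, and it assumes that qualified names use `.` as the separator and that an empty string signals "no mapping"; neither assumption is confirmed by the header.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Same signature as RNTupleModel::FieldMappingFunc_t in the listing.
using FieldMappingFunc_t = std::function<std::string(const std::string &)>;

// Hypothetical helper: builds a mapping that redirects a projected field and its
// sub fields to a source field by swapping the leading name component.
FieldMappingFunc_t MakeRenameMapping(const std::string &target, const std::string &source)
{
   return [target, source](const std::string &qualifiedName) -> std::string {
      // The projected field itself maps to the source field.
      if (qualifiedName == target)
         return source;
      // Sub fields keep their relative path below the new parent.
      const std::string prefix = target + ".";
      if (qualifiedName.compare(0, prefix.size(), prefix) == 0)
         return source + "." + qualifiedName.substr(prefix.size());
      // Assumption: an empty string marks a name this mapping does not cover.
      return "";
   };
}
```

A function built this way, for example `MakeRenameMapping("lepton", "muon")`, has the type expected by the `mapping` parameter of `AddProjectedField()`. Whether a given source is accepted is decided by the model's own validation, which this sketch does not reproduce.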