ROOT Reference Guide
RNTupleParallelWriter.hxx
/// \file ROOT/RNTupleParallelWriter.hxx
/// \author Jonas Hahnfeld <jonas.hahnfeld@cern.ch>
/// \date 2024-02-01

/*************************************************************************
 * Copyright (C) 1995-2024, Rene Brun and Fons Rademakers.               *
 * All rights reserved.                                                  *
 *                                                                       *
 * For the licensing terms see $ROOTSYS/LICENSE.                         *
 * For the list of contributors see $ROOTSYS/README/CREDITS.             *
 *************************************************************************/

#ifndef ROOT_RNTupleParallelWriter
#define ROOT_RNTupleParallelWriter

#include <ROOT/RNTupleMetrics.hxx>
#include <ROOT/RNTupleWriteOptions.hxx>

#include <memory>
#include <mutex>
#include <string_view>
#include <vector>

class TDirectory;

namespace ROOT {

class RNTupleModel;

namespace Internal {
class RPageSink;
} // namespace Internal

class RNTupleFillContext;

/**
\class ROOT::RNTupleParallelWriter
\ingroup NTuple
\brief A writer to fill an RNTuple from multiple contexts

Compared to the sequential RNTupleWriter, a parallel writer enables the creation of multiple RNTupleFillContext (see
CreateFillContext()). Each fill context prepares independent clusters that are appended to the common RNTuple with
internal synchronization. Before the writer is destroyed, all fill contexts must have flushed their data and been
destroyed (or data could be lost!).

For user convenience, CreateFillContext() is thread-safe and may be called from multiple threads in parallel at any
time, even after some data has already been written. Internally, the original model is cloned and ownership is passed
to a newly created RNTupleFillContext. For that reason, it is recommended to use RNTupleModel::CreateBare when creating
the model for parallel writing, which avoids allocating a useless default REntry per context.

Note that the sequence of independently prepared clusters is indeterminate and therefore entries are only partially
ordered: Entries from one context are totally ordered as they were filled. However, there is no ordering with respect
to other contexts, and the entries may be appended to the RNTuple either before or after entries written in parallel
into other contexts. In addition, two consecutive entries in one fill context can end up separated in the final
RNTuple if they happen to fall onto a cluster boundary and other contexts append more entries before the next cluster
is full.

At the moment, the parallel writer does not (yet) support incremental updates of the underlying model. Please refer to
RNTupleWriter::CreateModelUpdater if required for your use case.
*/
class RNTupleParallelWriter {
private:
   /// A global mutex to protect the internal data structures of this object.
   std::mutex fMutex;
   /// A mutex to synchronize the final page sink.
   std::mutex fSinkMutex;
   /// The final RPageSink that represents the synchronization point.
   std::unique_ptr<ROOT::Internal::RPageSink> fSink;
   /// The original RNTupleModel connected to fSink; needs to be destructed before it.
   std::unique_ptr<ROOT::RNTupleModel> fModel;
   Experimental::Detail::RNTupleMetrics fMetrics;
   /// List of all created helpers. They must be destroyed before this RNTupleParallelWriter is destructed.
   std::vector<std::weak_ptr<RNTupleFillContext>> fFillContexts;

   RNTupleParallelWriter(std::unique_ptr<ROOT::RNTupleModel> model, std::unique_ptr<ROOT::Internal::RPageSink> sink);
   RNTupleParallelWriter(const RNTupleParallelWriter &) = delete;
   RNTupleParallelWriter(RNTupleParallelWriter &&) = delete;
   RNTupleParallelWriter &operator=(const RNTupleParallelWriter &) = delete;
   RNTupleParallelWriter &operator=(RNTupleParallelWriter &&) = delete;

public:
   /// Recreate a new file and return a writer to write an RNTuple.
   static std::unique_ptr<RNTupleParallelWriter>
   Recreate(std::unique_ptr<ROOT::RNTupleModel> model, std::string_view ntupleName, std::string_view storage,
            const ROOT::RNTupleWriteOptions &options = ROOT::RNTupleWriteOptions());
   /// Append an RNTuple to the existing file.
   ///
   /// While the writer synchronizes between multiple fill contexts created from the same writer, there is no
   /// synchronization with other writers or other clients that write into the same file. The caller must ensure that
   /// the underlying file is not accessed while data is filled into any created context. To improve performance, it
   /// is allowed to use special methods that are guaranteed to not interact with the underlying file, such as
   /// RNTupleFillContext::FillNoFlush().
   static std::unique_ptr<RNTupleParallelWriter>
   Append(std::unique_ptr<ROOT::RNTupleModel> model, std::string_view ntupleName, TDirectory &fileOrDirectory,
          const ROOT::RNTupleWriteOptions &options = ROOT::RNTupleWriteOptions());

   ~RNTupleParallelWriter();

   /// Create a new RNTupleFillContext that can be used to fill entries and prepare clusters in parallel. This method is
   /// thread-safe and may be called from multiple threads in parallel at any time, even after some data has already
   /// been written.
   ///
   /// Note that all fill contexts must be destroyed before CommitDataset() is called.
   std::shared_ptr<RNTupleFillContext> CreateFillContext();

   /// Automatically called by the destructor.
   void CommitDataset();

   void EnableMetrics() { fMetrics.Enable(); }
   const Experimental::Detail::RNTupleMetrics &GetMetrics() const { return fMetrics; }
};

} // namespace ROOT

#endif
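
The class documentation above says that entries from one fill context keep their fill order, while clusters from different contexts are interleaved in an indeterminate sequence. The following is a minimal standard-library sketch of that behaviour, not ROOT's implementation: `ToyEntry`, `ToySink`, `ToyFillContext`, `FillInParallel` and `IsOrderedPerContext` are hypothetical names invented for illustration, and the "sink" is just a vector protected by a mutex.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical toy types; none of these names exist in ROOT.
struct ToyEntry {
   int fContextId; // which toy context produced the entry
   int fSeqNo;     // position of the entry within that context
};

// Stand-in for the shared sink: whole clusters are appended under a mutex.
class ToySink {
   std::mutex fMutex;
   std::vector<ToyEntry> fEntries;

public:
   void CommitCluster(std::vector<ToyEntry> &cluster)
   {
      std::lock_guard<std::mutex> guard(fMutex);
      fEntries.insert(fEntries.end(), cluster.begin(), cluster.end());
      cluster.clear();
   }
   const std::vector<ToyEntry> &GetEntries() const { return fEntries; }
};

// Stand-in for a fill context: buffers entries locally and commits full clusters.
class ToyFillContext {
   ToySink &fSink;
   int fId;
   std::size_t fClusterSize;
   int fNextSeqNo = 0;
   std::vector<ToyEntry> fCluster;

public:
   ToyFillContext(ToySink &sink, int id, std::size_t clusterSize) : fSink(sink), fId(id), fClusterSize(clusterSize) {}
   ~ToyFillContext() { Flush(); } // flush the last, partially filled cluster
   void Fill()
   {
      fCluster.push_back({fId, fNextSeqNo++});
      if (fCluster.size() >= fClusterSize)
         Flush();
   }
   void Flush()
   {
      if (!fCluster.empty())
         fSink.CommitCluster(fCluster);
   }
};

// Fill from nContexts threads (one toy context per thread) and return the merged sequence.
std::vector<ToyEntry> FillInParallel(int nContexts, int nEntriesPerContext, std::size_t clusterSize)
{
   assert(nContexts > 0 && clusterSize > 0);
   ToySink sink;
   std::vector<std::thread> threads;
   for (int id = 0; id < nContexts; ++id) {
      threads.emplace_back([&sink, id, nEntriesPerContext, clusterSize] {
         ToyFillContext context(sink, id, clusterSize);
         for (int i = 0; i < nEntriesPerContext; ++i)
            context.Fill();
      }); // the context is destroyed (and flushed) when the lambda returns
   }
   for (auto &t : threads)
      t.join();
   return sink.GetEntries(); // copied out before the sink goes away
}

// True if, for every context, its entries appear in the order they were filled.
bool IsOrderedPerContext(const std::vector<ToyEntry> &entries, int nContexts)
{
   std::vector<int> nextExpected(nContexts, 0);
   for (const auto &e : entries) {
      if (e.fSeqNo != nextExpected[e.fContextId])
         return false;
      ++nextExpected[e.fContextId];
   }
   return true;
}
```

Calling `FillInParallel(4, 1000, 64)` yields 4000 entries for which `IsOrderedPerContext` holds, while the relative position of clusters from different contexts can change from run to run. With a cluster size of 64, entries 63 and 64 of one context may also end up far apart in the merged sequence, which mirrors the cluster-boundary remark in the documentation.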
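
The header keeps its fill contexts in a `std::vector<std::weak_ptr<RNTupleFillContext>>` and requires that all of them be destroyed before `CommitDataset()` is called. The sketch below illustrates, with the standard library only, how a list of weak pointers lets an owner detect contexts that are still alive. It is an assumption-laden toy: `ToyContext`, `ToyWriter`, `CountLiveContexts` and `CanCommit` are hypothetical names, and the header does not state how ROOT itself reacts to a context that is still alive at commit time.

```cpp
#include <cassert> // used by the checks in the usage note below
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical toy types; none of these names exist in ROOT.
struct ToyContext {};

class ToyWriter {
   std::mutex fMutex;
   // Weak pointers observe the contexts without extending their lifetime.
   std::vector<std::weak_ptr<ToyContext>> fContexts;

public:
   // Thread-safe: the list of created contexts is updated under the mutex.
   std::shared_ptr<ToyContext> CreateContext()
   {
      std::lock_guard<std::mutex> guard(fMutex);
      auto context = std::make_shared<ToyContext>();
      fContexts.push_back(context);
      return context;
   }

   // Number of created contexts that have not been destroyed yet.
   std::size_t CountLiveContexts()
   {
      std::lock_guard<std::mutex> guard(fMutex);
      std::size_t nLive = 0;
      for (const auto &weak : fContexts) {
         if (!weak.expired())
            ++nLive;
      }
      return nLive;
   }

   // In this toy model, committing is only considered safe once no context is alive.
   bool CanCommit() { return CountLiveContexts() == 0; }
};
```

As a usage check under the same assumptions: after creating two contexts, `CountLiveContexts()` returns 2 and `CanCommit()` is false; once both `shared_ptr`s are reset, `assert(writer.CanCommit())` passes. Because the writer holds only weak pointers, dropping the last `shared_ptr` held by user code is what actually destroys a context.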