RNTupleParallelWriter.hxx
/// \file ROOT/RNTupleParallelWriter.hxx
/// \ingroup NTuple
/// \author Jonas Hahnfeld <jonas.hahnfeld@cern.ch>
/// \date 2024-02-01
/// \warning This is part of the ROOT 7 prototype! It will change without notice. It might trigger earthquakes. Feedback
/// is welcome!

/*************************************************************************
 * Copyright (C) 1995-2024, Rene Brun and Fons Rademakers.               *
 * All rights reserved.                                                  *
 *                                                                       *
 * For the licensing terms see $ROOTSYS/LICENSE.                         *
 * For the list of contributors see $ROOTSYS/README/CREDITS.             *
 *************************************************************************/

#ifndef ROOT_RNTupleParallelWriter
#define ROOT_RNTupleParallelWriter

#include <memory>
#include <mutex>
#include <string_view>
#include <vector>

class TDirectory;

namespace ROOT {

class RNTupleModel;

namespace Internal {
class RPageSink;
} // namespace Internal

namespace Experimental {

class RNTupleFillContext;

/**
\class ROOT::Experimental::RNTupleParallelWriter
\ingroup NTuple
\brief A writer to fill an RNTuple from multiple contexts

Compared to the sequential RNTupleWriter, a parallel writer enables the creation of multiple RNTupleFillContext (see
RNTupleParallelWriter::CreateFillContext). Each fill context prepares independent clusters that are appended to the
common ntuple with internal synchronization. Before destruction, all fill contexts must have flushed their data and
been destroyed (or data could be lost!).

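As a hedged illustration of the lifetime requirement above, the following standard-library-only sketch shows how an
owner that keeps std::weak_ptr handles can tell whether the contexts it handed out are still alive. MockWriter and
MockContext are hypothetical names, not ROOT classes, and the checks actually performed by RNTupleParallelWriter may
differ.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a fill context; not a ROOT class.
struct MockContext {
   int fBufferedEntries = 0;
};

// Hypothetical stand-in for the writer: it hands out shared contexts but keeps only weak references to them,
// so the callers alone decide when a context is destroyed.
class MockWriter {
   std::mutex fMutex;
   std::vector<std::weak_ptr<MockContext>> fContexts;

public:
   std::shared_ptr<MockContext> CreateContext()
   {
      auto context = std::make_shared<MockContext>();
      std::lock_guard<std::mutex> guard(fMutex);
      fContexts.push_back(context);
      return context;
   }

   // Number of contexts that callers have not yet destroyed.
   std::size_t CountLiveContexts()
   {
      std::lock_guard<std::mutex> guard(fMutex);
      std::size_t nLive = 0;
      for (const auto &weak : fContexts) {
         if (!weak.expired())
            ++nLive;
      }
      return nLive;
   }
};
```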
For user convenience, RNTupleParallelWriter::CreateFillContext is thread-safe and may be called from multiple threads
in parallel at any time, also after some data has already been written. Internally, the original model is cloned and
ownership is passed to a newly created RNTupleFillContext. For that reason, it is recommended to use
RNTupleModel::CreateBare when creating the model for parallel writing and avoid the allocation of a useless default
REntry per context.

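The creation scheme described above can be sketched with the standard library alone: a mutex serializes creation, and
every context receives its own clone of the model. MockModel, MockFillContext, MockParallelWriter, and
CreateContextsInParallel are hypothetical stand-ins for illustration and do not reflect the actual ROOT implementation.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical model that only stores field names; not ROOT::RNTupleModel.
struct MockModel {
   std::vector<std::string> fFieldNames;
   std::unique_ptr<MockModel> Clone() const { return std::make_unique<MockModel>(*this); }
};

// Hypothetical context that owns its private clone of the model.
struct MockFillContext {
   std::unique_ptr<MockModel> fModel;
   explicit MockFillContext(std::unique_ptr<MockModel> model) : fModel(std::move(model)) {}
};

class MockParallelWriter {
   std::mutex fMutex;
   std::unique_ptr<MockModel> fModel;

public:
   explicit MockParallelWriter(std::unique_ptr<MockModel> model) : fModel(std::move(model)) {}

   // Safe to call from several threads: the lock protects the original model while it is cloned.
   std::shared_ptr<MockFillContext> CreateFillContext()
   {
      std::lock_guard<std::mutex> guard(fMutex);
      return std::make_shared<MockFillContext>(fModel->Clone());
   }
};

// Creates one context per thread concurrently; each thread writes only to its own slot of the result.
std::vector<std::shared_ptr<MockFillContext>> CreateContextsInParallel(MockParallelWriter &writer, unsigned nThreads)
{
   std::vector<std::shared_ptr<MockFillContext>> contexts(nThreads);
   std::vector<std::thread> threads;
   for (unsigned i = 0; i < nThreads; ++i)
      threads.emplace_back([&writer, &contexts, i] { contexts[i] = writer.CreateFillContext(); });
   for (auto &thread : threads)
      thread.join();
   return contexts;
}
```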
Note that the sequence of independently prepared clusters is indeterminate and therefore entries are only partially
ordered: Entries from one context are totally ordered as they were filled. However, there is no ordering with other
contexts and the entries may be appended to the ntuple either before or after other entries written in parallel into
other contexts. In addition, two consecutive entries in one fill context can end up separated in the final ntuple, if
they happen to fall onto a cluster boundary and other contexts append more entries before the next cluster is full.

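To make the partial ordering concrete, the following sketch simulates two contexts with a cluster size of two entries.
It runs single-threaded so that the interleaving is reproducible (an assumption for illustration; real contexts would
run on separate threads), and MockSink, MockContext, TaggedEntry, and SequenceOf are hypothetical names, not ROOT
classes. Filling context 0 three times, context 1 twice, and context 0 once more leaves the sink with the entries
(0,0) (0,1) (1,0) (1,1) (0,2) (0,3): each context keeps its own order, but the consecutive entries 1 and 2 of context
0 are separated by the cluster of context 1.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// An entry tagged with the context that produced it and its per-context sequence number.
struct TaggedEntry {
   int fContextId;
   int fSequence;
};

// Hypothetical shared sink: whole clusters are appended under a lock.
class MockSink {
   std::mutex fMutex;
   std::vector<TaggedEntry> fEntries;

public:
   void AppendCluster(const std::vector<TaggedEntry> &cluster)
   {
      std::lock_guard<std::mutex> guard(fMutex);
      fEntries.insert(fEntries.end(), cluster.begin(), cluster.end());
   }

   // Only meaningful once no context is appending anymore.
   const std::vector<TaggedEntry> &GetEntries() const { return fEntries; }
};

// Hypothetical context: buffers entries locally and appends a cluster to the sink once it is full.
class MockContext {
   MockSink &fSink;
   int fId;
   std::size_t fClusterSize;
   int fNextSequence = 0;
   std::vector<TaggedEntry> fCluster;

public:
   MockContext(MockSink &sink, int id, std::size_t clusterSize) : fSink(sink), fId(id), fClusterSize(clusterSize) {}

   void Fill()
   {
      fCluster.push_back({fId, fNextSequence++});
      if (fCluster.size() == fClusterSize)
         FlushCluster();
   }

   void FlushCluster()
   {
      if (fCluster.empty())
         return;
      fSink.AppendCluster(fCluster);
      fCluster.clear();
   }
};

// Sequence numbers of one context in the order in which they appear in the sink.
std::vector<int> SequenceOf(const MockSink &sink, int contextId)
{
   std::vector<int> sequence;
   for (const auto &entry : sink.GetEntries()) {
      if (entry.fContextId == contextId)
         sequence.push_back(entry.fSequence);
   }
   return sequence;
}
```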
At the moment, the parallel writer does not (yet) support incremental updates of the underlying model. Please refer to
RNTupleWriter::CreateModelUpdater if required for your use case.
*/
class RNTupleParallelWriter {
private:
   /// A global mutex to protect the internal data structures of this object.
   std::mutex fMutex;
   /// A mutex to synchronize the final page sink.
   std::mutex fSinkMutex;
   /// The final RPageSink that represents the synchronization point.
   std::unique_ptr<ROOT::Internal::RPageSink> fSink;
   /// The original RNTupleModel connected to fSink; needs to be destructed before it.
   std::unique_ptr<ROOT::RNTupleModel> fModel;
   Detail::RNTupleMetrics fMetrics;
   /// List of all created helpers. They must be destroyed before this RNTupleParallelWriter is destructed.
   std::vector<std::weak_ptr<RNTupleFillContext>> fFillContexts;

   RNTupleParallelWriter(std::unique_ptr<ROOT::RNTupleModel> model, std::unique_ptr<ROOT::Internal::RPageSink> sink);
   RNTupleParallelWriter(const RNTupleParallelWriter &) = delete;
   RNTupleParallelWriter &operator=(const RNTupleParallelWriter &) = delete;

public:
   /// Recreate a new file and return a writer to write an ntuple.
   static std::unique_ptr<RNTupleParallelWriter>
   Recreate(std::unique_ptr<ROOT::RNTupleModel> model, std::string_view ntupleName, std::string_view storage,
            const ROOT::RNTupleWriteOptions &options = ROOT::RNTupleWriteOptions());
   /// Append an ntuple to the existing file, which must not be accessed while data is filled into any created context.
   static std::unique_ptr<RNTupleParallelWriter>
   Append(std::unique_ptr<ROOT::RNTupleModel> model, std::string_view ntupleName, TDirectory &fileOrDirectory,
          const ROOT::RNTupleWriteOptions &options = ROOT::RNTupleWriteOptions());

   /// Create a new RNTupleFillContext that can be used to fill entries and prepare clusters in parallel. This method is
   /// thread-safe and may be called from multiple threads in parallel at any time, also after some data has already
   /// been written.
   ///
   /// Note that all fill contexts must be destroyed before RNTupleParallelWriter::CommitDataset() is called.
   std::shared_ptr<RNTupleFillContext> CreateFillContext();

   /// Automatically called by the destructor
   void CommitDataset();

   const Detail::RNTupleMetrics &GetMetrics() const { return fMetrics; }
};

} // namespace Experimental
} // namespace ROOT

#endif