ROOT 6.14/05
Reference Guide
df013_InspectAnalysis.C
/// \file
/// \ingroup tutorial_dataframe
/// \notebook -nodraw
/// Showcase registration of callback functions that act on partial results while
/// the event-loop is running using `OnPartialResult` and `OnPartialResultSlot`.
/// This tutorial is not meant to run in batch mode.
///
/// \macro_code
///
/// \date September 2017
/// \author Enrico Guiraud

using namespace ROOT; // RDataFrame lives in here

void df013_InspectAnalysis()
{
   ROOT::EnableImplicitMT();
   const auto poolSize = ROOT::GetImplicitMTPoolSize();
   const auto nSlots = 0 == poolSize ? 1 : poolSize;

   // ## Setup a simple RDataFrame
   // We start by creating an RDataFrame with a good number of empty events
   const auto nEvents = nSlots * 10000ull;
   RDataFrame d(nEvents);

   // `heavyWork` is a lambda that fakes some interesting computation and just returns a normally distributed double
   TRandom r;
   auto heavyWork = [&r]() {
      for (volatile int i = 0; i < 1000000; ++i)
         ;
      return r.Gaus();
   };

   // Let's define a column "x" produced by invoking `heavyWork` for each event
   // `tdf` stores a modified data-frame that contains "x"
   auto tdf = d.Define("x", heavyWork);

   // Now we register a histogram-filling action with the RDataFrame.
   // `h` can be used just like a pointer to TH1D but it is actually a RResultPtr<TH1D>, a smart object that triggers
   // an event-loop to fill the pointee histogram if needed.
   auto h = tdf.Histo1D<double>({"browserHisto", "", 100, -2., 2.}, "x");

   // ## Use the callback mechanism to draw the histogram on a TBrowser while it is being filled
   // So far we have registered a column "x" to a data-frame with `nEvents` events and we registered the filling of a
   // histogram with the values of column "x".
   // In the following we will register three functions for execution during the event-loop:
   // - the first is executed once just before the loop and adds a partially-filled histogram to a TBrowser
   // - the second is executed every 50 events and draws the partial histogram on the TBrowser's TPad
   // - the third is responsible for updating a simple progress bar from multiple threads

   // First off we create a TBrowser that contains a "RDFResults" directory
   auto *tdfDirectory = new TMemFile("RDFResults", "RECREATE");
   auto *browser = new TBrowser("b", tdfDirectory);
   // The global pad should now be set to the TBrowser's canvas, let's store its value in a local variable
   auto browserPad = gPad;

   // A useful feature of `RResultPtr` is its `OnPartialResult` method: it allows us to register a callback that is
   // executed once per specified number of events during the event-loop, on "partial" versions of the result objects
   // contained in the `RResultPtr`. In this case, the partial result is going to be a histogram filled with an
   // increasing number of events.
   // Instead of requesting the callback to be executed every N entries, this time we use the special value `kOnce` to
   // request that it is executed once right before starting the event-loop.
   // The callback is a C++11 lambda that registers the partial result object in `tdfDirectory`.
   h.OnPartialResult(h.kOnce, [tdfDirectory](TH1D &h_) { tdfDirectory->Add(&h_); });
   // Note that we called `OnPartialResult` with a dot, `.`, since this is a method of `RResultPtr` itself.
   // We do not want to call `OnPartialResult` on the pointee histogram!

   // Multiple callbacks can be registered on the same `RResultPtr` (they are executed one after the other in the
   // same order as they were registered). We now request that the partial result is drawn and the TBrowser's TPad is
   // updated every 50 events.
   h.OnPartialResult(50, [&browserPad](TH1D &hist) {
      if (!browserPad)
         return; // in case root -b was invoked
      browserPad->cd();
      hist.Draw();
      browserPad->Update();
      // This call tells ROOT to process all pending GUI events
      // It allows users to use the TBrowser as usual while the event-loop is running
      gSystem->ProcessEvents();
   });

   // Finally, we would like to print a progress bar on the terminal to show how the event-loop is progressing.
   // To take into account _all_ events we use `OnPartialResultSlot`: when Implicit Multi-Threading is enabled,
   // `OnPartialResult` invokes the callback in only one of the worker threads, and always passes it that worker
   // thread's partial result. This is useful because it means we don't have to worry about concurrent execution and
   // thread-safety of the callbacks if we are happy with just one thread's partial result.
   // `OnPartialResultSlot`, on the other hand, invokes the callback in each one of the worker threads, every time a
   // thread finishes processing a batch of `everyN` events. This is what we want for the progress bar, but we need to
   // take care that two threads do not print to the terminal at the same time: we need a std::mutex for synchronization.
   std::string progressBar;
   std::mutex barMutex; // Only one thread at a time can lock a mutex. Let's use this to avoid concurrent printing.
   // Magic numbers that yield good progress bars for nSlots = 1,2,4,8
   const auto everyN = nSlots == 8 ? 1000 : 100ull * nSlots;
   const auto barWidth = nEvents / everyN;
   h.OnPartialResultSlot(everyN, [&barWidth, &progressBar, &barMutex](unsigned int /*slot*/, TH1D & /*partialHist*/) {
      std::lock_guard<std::mutex> l(barMutex); // lock_guard locks the mutex at construction, releases it at destruction
      progressBar.push_back('#');
      // re-print the line with the progress bar
      std::cout << "\r[" << std::left << std::setw(barWidth) << progressBar << ']' << std::flush;
   });

   // ## Running the analysis
   // So far we told RDataFrame what we want to happen during the event-loop, but we have not actually run any of those
   // actions: the TBrowser is still empty, the progress bar has not been printed even once, and we haven't produced
   // a single data-point!
   // As usual with RDataFrame, the event-loop is triggered by accessing the contents of a RResultPtr for the first
   // time. Let's run!
   std::cout << "Analysis running..." << std::endl;
   h->Draw(); // the final, complete result will be drawn after the event-loop has completed.
   std::cout << "\nDone!" << std::endl;

   // Finally, some book-keeping: in the TMemFile that we are using as TBrowser directory, we substitute the partial
   // result with a clone of the final result (the "original" final result will be deleted at the end of the macro).
   tdfDirectory->Clear();
   auto clone = static_cast<TH1D *>(h->Clone());
   clone->SetDirectory(nullptr);
   tdfDirectory->Add(clone);
   if (!browserPad)
      return; // in case root -b was invoked
   browserPad->cd();
   clone->Draw();
}
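
The progress-bar callback above relies on a general pattern: several worker threads report after each batch of `everyN` events, and a `std::mutex` serializes the printing. The sketch below reproduces that pattern with the C++ standard library only, so it can be compiled without ROOT. `RunWithProgressBar` and `onBatchDone` are hypothetical names introduced for illustration; plain `std::thread` workers stand in for RDataFrame's processing slots, and this is not how ROOT schedules its callbacks internally.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper (not part of ROOT): each of `nSlots` threads "processes"
// `eventsPerSlot` events and reports once per completed batch of `everyN` events.
// Returns the final progress bar, one '#' per reported batch.
std::string RunWithProgressBar(unsigned int nSlots, std::size_t eventsPerSlot, std::size_t everyN,
                               std::ostream &out = std::cout)
{
   assert(everyN > 0 && "everyN must be positive");
   const std::size_t barWidth = nSlots * eventsPerSlot / everyN;
   std::string progressBar;
   std::mutex barMutex;

   // Plays the role of the lambda passed to OnPartialResultSlot in the tutorial.
   auto onBatchDone = [&](unsigned int /*slot*/) {
      std::lock_guard<std::mutex> lock(barMutex); // only one thread at a time updates and prints the bar
      progressBar.push_back('#');
      out << "\r[" << std::left << std::setw(static_cast<int>(barWidth)) << progressBar << ']' << std::flush;
   };

   // Each worker invokes the callback every time it has finished `everyN` events.
   auto worker = [&](unsigned int slot) {
      for (std::size_t event = 1; event <= eventsPerSlot; ++event) {
         if (event % everyN == 0)
            onBatchDone(slot);
      }
   };

   std::vector<std::thread> threads;
   for (unsigned int slot = 0; slot < nSlots; ++slot)
      threads.emplace_back(worker, slot);
   for (auto &t : threads)
      t.join(); // all callbacks have finished before progressBar is read
   return progressBar;
}
```

Calling `RunWithProgressBar(4, 1000, 100)` would be the analogue of running the tutorial with four slots. Without the `std::lock_guard`, concurrent `push_back` calls on `progressBar` would be a data race, which is why the tutorial takes the same precaution in its `OnPartialResultSlot` callback.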