ROOT
v6-32
Reference Guide
df019_Cache.C
/// \file
/// \ingroup tutorial_dataframe
/// \notebook -draw
/// Cache a processed RDataFrame in memory for further usage.
///
/// This tutorial shows how the content of a data frame can be cached in memory
/// in the form of a data frame. The content of the columns is stored in
/// contiguous slabs of memory and is "ready to use", i.e. no ROOT IO operation
/// is performed.
///
/// Creating a cached data frame storing all of its content deserialised and uncompressed
/// in memory is particularly useful when dealing with datasets of a moderate size
/// (small enough to fit in RAM) over which several exploratory loops need to be
/// performed as fast as possible. In addition, caching can be useful when no file
/// on disk needs to be created as a side effect of checkpointing part of the analysis.
///
/// All steps in the caching are lazy, i.e. the cached data frame is actually filled
/// only when the event loop is triggered on it.
///
/// \macro_code
/// \macro_image
///
/// \date June 2018
/// \author Danilo Piparo (CERN)

void df019_Cache()
{
   // We create a data frame on top of the hsimple example.
   auto hsimplePath = gROOT->GetTutorialDir();
   hsimplePath += "/hsimple.root";
   ROOT::RDataFrame df("ntuple", hsimplePath.Data());

   // We apply a simple cut and define a new column.
   auto df_cut = df.Filter([](float py) { return py > 0.f; }, {"py"})
                    .Define("px_plus_py", [](float px, float py) { return px + py; }, {"px", "py"});

   // We cache the content of the dataset. Nothing has happened yet: the work to accomplish
   // has only been described. As for `Snapshot`, the types and columns can be written out explicitly
   // or left for the jitting to handle (`df_cached` is intentionally unused - it shows how
   // to create a *cached* dataframe specifying column types explicitly):
   auto df_cached = df_cut.Cache<float, float>({"px_plus_py", "py"});
   auto df_cached_implicit = df_cut.Cache();
   auto h = df_cached_implicit.Histo1D<float>("px_plus_py");

   // Now the event loop on the cached dataset is triggered. This in turn triggers the loop
   // on the `df` data frame, which had been deferred until now.
   h->DrawCopy();
}
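The lazy filling described in the comments above can be illustrated without ROOT. The following is only a toy sketch in standard C++, not how RDataFrame implements `Cache`: `LazyCachedColumn` and `CountUpstreamLoops` are hypothetical names introduced here, and the `px`/`py` values are made up. It shows a column that is materialised into one contiguous `std::vector<float>` on first access and then served from memory.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical toy type (not a ROOT class): a single column that is
// materialised lazily into one contiguous std::vector<float>.
class LazyCachedColumn {
public:
   using Producer = std::function<std::vector<float>()>;

   // `producer` stands in for the upstream event loop (filter + define).
   explicit LazyCachedColumn(Producer producer) : fProducer(std::move(producer)) {}

   // Fill the cache on first access only; later calls reuse the stored values.
   const std::vector<float> &Values()
   {
      if (!fFilled) {
         fValues = fProducer();
         fFilled = true;
      }
      return fValues;
   }

   bool IsFilled() const { return fFilled; }

private:
   Producer fProducer;
   std::vector<float> fValues;
   bool fFilled = false;
};

// Builds a cached "px_plus_py" column from made-up inputs, reads it twice,
// and returns how many times the upstream loop actually ran.
inline int CountUpstreamLoops()
{
   const std::vector<float> px{1.f, 2.f, 3.f, 4.f};
   const std::vector<float> py{0.5f, -1.f, 2.f, -3.f};
   int nLoops = 0;

   LazyCachedColumn cached([&] {
      ++nLoops;
      std::vector<float> out;
      for (std::size_t i = 0; i < px.size(); ++i) {
         if (py[i] > 0.f)                 // same cut as in the tutorial
            out.push_back(px[i] + py[i]); // same derived column
      }
      return out;
   });

   assert(!cached.IsFilled() && nLoops == 0);  // nothing has run yet
   const auto first = cached.Values().size();  // triggers the upstream loop
   const auto second = cached.Values().size(); // served from memory
   assert(first == second);
   return nLoops;
}
```

Calling `CountUpstreamLoops()` reads the cached column twice while the upstream loop runs a single time, which is the property that makes repeated exploratory loops over a cached data frame cheap.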
ROOT v6-32 - Reference Guide Generated on Thu Feb 27 2025 14:17:35 (GVA Time) using Doxygen 1.10.0