df019_Cache.py
## \file
## \ingroup tutorial_dataframe
## \notebook -draw
## Cache a processed RDataFrame in memory for further use.
##
## This tutorial shows how the content of a data frame can be cached in memory
## in the form of a data frame. The content of the columns is stored in
## contiguous slabs of memory and is "ready to use", i.e. no ROOT IO operation
## is performed.
##
## Creating a cached data frame that stores all of its content deserialised and uncompressed
## in memory is particularly useful when dealing with datasets of a moderate size
## (small enough to fit in RAM) over which several exploratory loops need to be
## performed as fast as possible. In addition, caching can be useful when no file
## on disk needs to be created as a side effect of checkpointing part of the analysis.
##
## All steps in the caching are lazy, i.e. the cached data frame is actually filled
## only when the event loop is triggered on it.
##
## \macro_code
## \macro_image
##
## \date June 2018
## \author Danilo Piparo (CERN)

import ROOT
import os

# We create a data frame on top of the hsimple example.
hsimplePath = os.path.join(str(ROOT.gROOT.GetTutorialDir().Data()), "hsimple.root")
df = ROOT.RDataFrame("ntuple", hsimplePath)

# We apply a simple cut and define a new column.
df_cut = df.Filter("py > 0.f").Define("px_plus_py", "px + py")

# We cache the content of the dataset. Nothing has happened yet: the work to be done
# has only been described.
df_cached = df_cut.Cache()

# Booking the histogram is lazy as well: no event loop has run so far.
h = df_cached.Histo1D("px_plus_py")

# Accessing the histogram triggers the event loop on the cached data frame,
# which in turn lazily triggers the loop on the original `df` data frame.
c = ROOT.TCanvas()
h.Draw()
c.SaveAs("df019_Cache.png")

print("Saved figure to df019_Cache.png")
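
The lazy caching behaviour described above can be illustrated with a small pure-Python toy model. This is only a sketch of the idea and not how RDataFrame is implemented: the `LazyFrame` class, its `filter`, `define`, `cache` and `rows` methods, the `read_events` source and the sample values are all hypothetical names and data invented for this illustration. The sketch assumes that "caching" means storing each column as an in-memory list that is filled the first time a loop runs and reused afterwards.

```python
class LazyFrame:
    """Toy stand-in for a lazily evaluated data frame (hypothetical, not ROOT's API)."""

    def __init__(self, source, steps=()):
        self._source = source        # zero-argument callable returning an iterable of row dicts
        self._steps = tuple(steps)   # recorded operations; nothing is executed yet

    def filter(self, predicate):
        return LazyFrame(self._source, self._steps + (("filter", predicate),))

    def define(self, name, func):
        return LazyFrame(self._source, self._steps + (("define", (name, func)),))

    def rows(self):
        """Run the 'event loop': read the source and apply the recorded steps."""
        for row in self._source():
            row = dict(row)
            keep = True
            for kind, arg in self._steps:
                if kind == "filter":
                    if not arg(row):
                        keep = False
                        break
                else:
                    name, func = arg
                    row[name] = func(row)
            if keep:
                yield row

    def cache(self):
        """Return a new frame backed by in-memory columns, filled on first use."""
        upstream = self
        store = {"filled": False, "columns": {}, "size": 0}

        def cached_source():
            if not store["filled"]:
                # First loop only: run the upstream loop once and keep the columns.
                for row in upstream.rows():
                    for name, value in row.items():
                        store["columns"].setdefault(name, []).append(value)
                    store["size"] += 1
                store["filled"] = True
            columns = store["columns"]
            for i in range(store["size"]):
                yield {name: values[i] for name, values in columns.items()}

        return LazyFrame(cached_source)


# Hypothetical input data standing in for the "ntuple" tree.
reads = {"count": 0}

def read_events():
    reads["count"] += 1  # counts how many times the original source is read
    yield {"px": 1.0, "py": -1.0}
    yield {"px": 2.0, "py": 0.5}
    yield {"px": -0.5, "py": 2.0}
    yield {"px": 3.0, "py": 0.0}


df_cut = (
    LazyFrame(read_events)
    .filter(lambda r: r["py"] > 0)
    .define("px_plus_py", lambda r: r["px"] + r["py"])
)
df_cached = df_cut.cache()
reads_before_loop = reads["count"]  # caching alone has not read anything

first_pass = [r["px_plus_py"] for r in df_cached.rows()]
second_pass = [r["px_plus_py"] for r in df_cached.rows()]
print(first_pass, second_pass, reads["count"])
```

In this toy model, calling `cache()` reads nothing; the first loop over `df_cached` reads the source once and fills the columns, and the second loop is served entirely from memory. Looping twice over the uncached `df_cut` would instead read the source twice, which is the cost that caching is meant to avoid for repeated exploratory loops.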