df012_DefinesAndFiltersAsStrings.py
## \file
## \ingroup tutorial_dataframe
## \notebook -nodraw
## Use just-in-time-compiled Filters and Defines for quick prototyping.
##
## This tutorial illustrates how to use jit-compiling features of RDataFrame
## to define data using C++ code in a Python script.
##
## \macro_code
## \macro_output
##
## \date October 2017
## \author Guilherme Amadio (CERN)

import ROOT

## We will inefficiently calculate an approximation of pi by generating
## some data and doing very simple filtering and analysis on it.

## We start by creating an empty dataframe where we will insert 10 million
## random points in a square of side 2.0 (that is, with an inscribed unit
## circle).

npoints = 10000000
df = ROOT.RDataFrame(npoints)

## Define the columns we want inside the dataframe. Column p does not need to
## be an array; we define it that way only to demonstrate how to jit
## multi-statement C++ expressions with RDataFrame.

pidf = df.Define("x", "gRandom->Uniform(-1.0, 1.0)") \
         .Define("y", "gRandom->Uniform(-1.0, 1.0)") \
         .Define("p", "std::array<double, 2> v{x, y}; return v;") \
         .Define("r", "double r2 = 0.0; for (auto&& w : p) r2 += w*w; return sqrt(r2);")

## Now we have a dataframe with columns x, y, p (which is a point based on x
## and y), and the radius r = sqrt(x*x + y*y). In order to approximate pi, we
## need to know how many of our data points fall inside the circle of radius
## one compared with the total number of points. The ratio of the areas is
##
## A_circle / A_square = (pi * r*r) / (l*l) = pi / 4, where r = 1.0 and l = 2.0
##
## Therefore, we can approximate pi with four times the number of points inside
## the unit circle over the total number of points:

incircle = pidf.Filter("r <= 1.0").Count().GetValue()

pi_approx = 4.0 * incircle / npoints

print("pi is approximately equal to %g" % (pi_approx))
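
For readers without a ROOT installation, the same estimate can be sketched in plain Python. This is a minimal illustration and not part of the tutorial: the names `radius` and `estimate_pi`, the fixed seed, and the use of the standard `random` module in place of `gRandom` are all assumptions made here. The per-point loop stands in for the jitted `Define` and `Filter` calls.

```python
import math
import random


def radius(point):
    """Mirror the jitted "r" expression: Euclidean norm of a point."""
    r2 = 0.0
    for w in point:
        r2 += w * w
    return math.sqrt(r2)


def estimate_pi(npoints, seed=12345):
    """Estimate pi from the fraction of uniform points inside the unit circle.

    The seed is an arbitrary value chosen for reproducibility (an assumption
    of this sketch; the tutorial relies on ROOT's gRandom instead).
    """
    rng = random.Random(seed)
    incircle = 0
    for _ in range(npoints):
        # Equivalent of the "x", "y" and "p" columns.
        p = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        # Equivalent of Filter("r <= 1.0") followed by Count().
        if radius(p) <= 1.0:
            incircle += 1
    return 4.0 * incircle / npoints


if __name__ == "__main__":
    print("pi is approximately equal to %g" % estimate_pi(200000))
```

Because each point is an independent trial with success probability pi / 4, the statistical error of the estimate shrinks roughly as 1.64 / sqrt(npoints), so 10 million points should give an uncertainty of about 0.0005.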