ROOT 6.16/01 Reference Guide
df012_DefinesAndFiltersAsStrings.py
## \file
## \ingroup tutorial_dataframe
## \notebook -nodraw
##
## This tutorial illustrates how to use jit-compiling features of RDataFrame
## to define data using C++ code in a Python script
##
## \macro_code
## \macro_output
##
## \date October 2017
## \author Guilherme Amadio

import ROOT

## We will inefficiently calculate an approximation of pi by generating
## some data and doing very simple filtering and analysis on it.

## We start by creating an empty dataframe where we will insert 10 million
## random points in a square of side 2.0 (that is, with an inscribed unit
## circle).

npoints = 10000000
tdf = ROOT.ROOT.RDataFrame(npoints)

## Define what data we want inside the dataframe. We do not need to define p
## as an array, but we do it here to demonstrate how to use jitting with RDataFrame

pidf = tdf.Define("x", "gRandom->Uniform(-1.0, 1.0)") \
          .Define("y", "gRandom->Uniform(-1.0, 1.0)") \
          .Define("p", "std::array<double, 2> v{x, y}; return v;") \
          .Define("r", "double r2 = 0.0; for (auto&& w : p) r2 += w*w; return sqrt(r2);")
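
## The sketch below is not part of the tutorial and does not use ROOT. It is a
## plain-Python analogue of what the four jitted C++ expressions above compute
## for a single row, assuming Python's random module as a stand-in for gRandom.
## The names make_row, rng and row are hypothetical and used for illustration only.

```python
import math
import random


def make_row(rng):
    """Build one row with columns x, y, p and r, mirroring the Define calls."""
    x = rng.uniform(-1.0, 1.0)  # stand-in for gRandom->Uniform(-1.0, 1.0)
    y = rng.uniform(-1.0, 1.0)
    p = (x, y)                  # stand-in for std::array<double, 2> v{x, y}
    r2 = 0.0
    for w in p:                 # same loop as: for (auto&& w : p) r2 += w*w;
        r2 += w * w
    r = math.sqrt(r2)
    return {"x": x, "y": y, "p": p, "r": r}


rng = random.Random(12345)  # fixed seed, chosen only to make the sketch repeatable
row = make_row(rng)
print(row)
```

## In the dataframe itself these expressions are compiled as C++ and evaluated
## for each entry when the event loop runs, not interpreted by Python as here.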

## Now we have a dataframe with columns x, y, p (which is a point based on x
## and y), and the radius r = sqrt(x*x + y*y). In order to approximate pi, we
## need to know how many of our data points fall inside the circle of radius
## one compared with the total number of points. The ratio of the areas is
##
##   A_circle / A_square = pi * r*r / (l*l) = pi / 4, where r = 1.0 and l = 2.0
##
## Therefore, we can approximate pi with 4 times the number of points inside
## the unit circle over the total number of points:

incircle = pidf.Filter("r <= 1.0").Count().GetValue()

pi_approx = 4.0 * incircle / npoints

print("pi is approximately equal to %g" % (pi_approx))
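
## As a cross-check of the area-ratio argument, here is a minimal pure-Python
## sketch of the same estimate, assuming Python's random module in place of
## gRandom and a smaller sample so it finishes quickly. It does not use
## RDataFrame; estimate_pi and pi_sketch are hypothetical names, not ROOT API.

```python
import math
import random


def estimate_pi(npoints, seed=0):
    """Estimate pi as 4 * (points with r <= 1) / (total points)."""
    rng = random.Random(seed)  # seeded so repeated runs give the same number
    incircle = 0
    for _ in range(npoints):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if math.sqrt(x * x + y * y) <= 1.0:  # same cut as Filter("r <= 1.0")
            incircle += 1
    return 4.0 * incircle / npoints


pi_sketch = estimate_pi(200000)
print("pure-Python estimate: %g" % pi_sketch)
```

## The statistical error of such an estimate shrinks roughly as 1/sqrt(N), so
## with 200000 points it typically agrees with pi to about two decimal places.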