Logo ROOT  
Reference Guide
 
All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Properties Friends Macros Modules Pages
Loading...
Searching...
No Matches
rf408_RDataFrameToRooFit.py File Reference

Detailed Description

View in nbviewer Open in SWAN
Fill RooDataSet/RooDataHist in RDataFrame.

This tutorial shows how to fill RooFit data classes directly from RDataFrame. Using two small helpers, we tell RDataFrame where the data has to go.

import ROOT
import math
# Set up
# ------------------------
# We create an RDataFrame with two columns filled with 2 million random numbers.
df = ROOT.RDataFrame(2000000).Define("x", "gRandom->Uniform(-5., 5.)").Define("y", "gRandom->Gaus(1., 3.)")
# We create RooFit variables that will represent the dataset.
x = ROOT.RooRealVar("x", "x", -5.0, 5.0)
y = ROOT.RooRealVar("y", "y", -50.0, 50.0)
x.setBins(10)
y.setBins(20)
# Booking the creation of RooDataSet / RooDataHist in RDataFrame
# ----------------------------------------------------------------
# Method 1:
# ---------
# We directly book the RooDataSetHelper action.
# We need to pass
# - the RDataFrame column types as template parameters
# - the constructor arguments for RooDataSet (they follow the same syntax as the usual RooDataSet constructors)
# - the column names that RDataFrame should fill into the dataset
# NOTE: RDataFrame columns are matched to RooFit variables by position, *not by name*!
#
# The returned object is not yet a RooDataSet, but an RResultPtr that will be
# lazy-evaluated once you call GetValue() on it. We will only evaluate the
# RResultPtr once all other RDataFrame related actions are declared. This way
# we trigger the event loop computation only once, which will improve the
# runtime significantly.
#
# To learn more about lazy actions, see:
# https://root.cern/doc/master/classROOT_1_1RDataFrame.html#actions
roo_data_set_result = df.Book(
ROOT.std.move(ROOT.RooDataSetHelper("dataset", "Title of dataset", ROOT.RooArgSet(x, y))), ("x", "y")
)
# Method 2:
# ---------
# We first declare the RooDataHistHelper
rdhMaker = ROOT.RooDataHistHelper("dataset", "Title of dataset", ROOT.RooArgSet(x, y))
# Then, we move it into an RDataFrame action:
roo_data_hist_result = df.Book(ROOT.std.move(rdhMaker), ("x", "y"))
# Run it and inspect the results
# -------------------------------
# At this point, all RDF actions were defined (namely, the `Book` operations),
# so we can get values from the RResultPtr objects, triggering the event loop
# and getting the actual RooFit data objects.
roo_data_set = roo_data_set_result.GetValue()
roo_data_hist = roo_data_hist_result.GetValue()
# Let's inspect the dataset / datahist.
def print_data(data):
print("")
data.Print()
for i in range(min(data.numEntries(), 20)):
print(
"("
+ ", ".join(["{0:8.3f}".format(var.getVal()) for var in data.get(i)])
+ ", ) weight={0:10.3f}".format(data.weight())
)
print("mean(x) = {0:.3f}".format(data.mean(x)) + "\tsigma(x) = {0:.3f}".format(math.sqrt(data.moment(x, 2.0))))
print("mean(y) = {0:.3f}".format(data.mean(y)) + "\tsigma(y) = {0:.3f}\n".format(math.sqrt(data.moment(y, 2.0))))
print_data(roo_data_set)
print_data(roo_data_hist)
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t Int_t Int_t Window_t TString Int_t GCValues_t GetPrimarySelectionOwner GetDisplay GetScreen GetColormap GetNativeEvent const char const char dpyName wid window const char font_name cursor keysym reg const char only_if_exist regb h Point_t winding char text const char depth char const char Int_t count const char ColorStruct_t color const char Pixmap_t Pixmap_t PictureAttributes_t attr const char char ret_data h unsigned char height h Atom_t Int_t ULong_t ULong_t unsigned char prop_list Atom_t Atom_t Atom_t Time_t format
ROOT's RDataFrame offers a modern, high-level interface for analysis of data stored in TTree ,...
RooDataSet::dataset[x,y] = 2000000 entries
RooDataHist::dataset[x,y] = 200 bins (2e+06 weights)
( 4.997, -0.304, ) weight= 1.000
( 4.472, 0.910, ) weight= 1.000
( 4.575, 0.830, ) weight= 1.000
( 0.400, 0.776, ) weight= 1.000
( 2.599, -0.232, ) weight= 1.000
( -1.844, 1.575, ) weight= 1.000
( 0.197, 0.853, ) weight= 1.000
( -1.077, -0.721, ) weight= 1.000
( -4.697, -3.165, ) weight= 1.000
( 4.437, -1.208, ) weight= 1.000
( 3.983, -0.146, ) weight= 1.000
( -0.014, -1.447, ) weight= 1.000
( -3.177, -2.704, ) weight= 1.000
( -4.371, -0.363, ) weight= 1.000
( 2.254, -0.499, ) weight= 1.000
( 2.139, 6.533, ) weight= 1.000
( 1.993, 6.991, ) weight= 1.000
( -3.708, 7.781, ) weight= 1.000
( -4.168, 1.284, ) weight= 1.000
( -4.177, 4.650, ) weight= 1.000
mean(x) = 0.001 sigma(x) = 2.886
mean(y) = 1.000 sigma(y) = 3.000
( -4.500, -47.500, ) weight= 0.000
( -4.500, -42.500, ) weight= 0.000
( -4.500, -37.500, ) weight= 0.000
( -4.500, -32.500, ) weight= 0.000
( -4.500, -27.500, ) weight= 0.000
( -4.500, -22.500, ) weight= 0.000
( -4.500, -17.500, ) weight= 0.000
( -4.500, -12.500, ) weight= 24.000
( -4.500, -7.500, ) weight= 4537.000
( -4.500, -2.500, ) weight= 69653.000
( -4.500, 2.500, ) weight=107838.000
( -4.500, 7.500, ) weight= 17790.000
( -4.500, 12.500, ) weight= 292.000
( -4.500, 17.500, ) weight= 0.000
( -4.500, 22.500, ) weight= 0.000
( -4.500, 27.500, ) weight= 0.000
( -4.500, 32.500, ) weight= 0.000
( -4.500, 37.500, ) weight= 0.000
( -4.500, 42.500, ) weight= 0.000
( -4.500, 47.500, ) weight= 0.000
mean(x) = 0.001 sigma(x) = 2.872
mean(y) = 0.999 sigma(y) = 3.329
Date
July 2021
Author
Harshal Shende, Stephan Hageboeck (C++ version)

Definition in file rf408_RDataFrameToRooFit.py.