df004_cutFlowReport.py File Reference



Detailed Description

Display cut/Filter efficiencies with RDataFrame.

This tutorial shows how to get information about the efficiency of the filters applied.

import ROOT
def fill_tree(treeName, fileName):
    df = ROOT.RDataFrame(50)
    df.Define("b1", "(double) rdfentry_")\
      .Define("b2", "(int) rdfentry_ * rdfentry_").Snapshot(treeName, fileName)
# We prepare an input tree to run on
fileName = 'df004_cutFlowReport_py.root'
treeName = 'myTree'
fill_tree(treeName, fileName)
# We read the tree from the file and create an RDataFrame, a class that
# allows us to interact with the data contained in the tree.
d = ROOT.RDataFrame(treeName, fileName)
# ## Define cuts and create the report
# An optional string parameter name can be passed to the Filter method to create a named filter.
# Named filters work as usual, but also keep track of how many entries they accept and reject.
filtered1 = d.Filter('b1 > 25', 'Cut1')
filtered2 = d.Filter('0 == b2 % 2', 'Cut2')
augmented1 = filtered2.Define('b3', 'b1 / b2')
filtered3 = augmented1.Filter('b3 < .5','Cut3')
# Statistics are retrieved through a call to the Report method:
# when Report is called on the main RDataFrame object, it retrieves stats for
# all named filters declared up to that point. When called on a stored chain
# state (i.e. a chain/graph node), it retrieves stats for all named filters in
# the section of the chain between the main RDataFrame and that node (included).
# Stats are printed in the same order as named filters have been added to the
# graph, and refer to the latest event-loop that has been run using the relevant
# RDataFrame.
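print('Cut3 stats:')
# Report() only books the report; nothing is printed until Print() is called on it.
filtered3.Report()
print('All stats:')
allCutsReport = d.Report()
allCutsReport.Print()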
Cut3 stats:
All stats:
Cut1 : pass=24 all=50 -- eff=48.00 % cumulative eff=48.00 %
Cut2 : pass=25 all=50 -- eff=50.00 % cumulative eff=50.00 %
Cut3 : pass=23 all=25 -- eff=92.00 % cumulative eff=46.00 %
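
The numbers in the report above can be cross-checked without ROOT. The following is a minimal plain-Python sketch, not part of the tutorial: the helper names count_cuts and format_report are hypothetical. It assumes that the cumulative efficiency is the pass count divided by the number of entries seen by the first named filter, which is consistent with the output shown.

```python
import math

N_ENTRIES = 50  # same number of entries as the tutorial's input tree


def count_cuts(n_entries=N_ENTRIES):
    """Return a list of (name, passed, seen) tuples for Cut1, Cut2 and Cut3."""
    stats = {"Cut1": [0, 0], "Cut2": [0, 0], "Cut3": [0, 0]}  # name -> [passed, seen]
    for entry in range(n_entries):
        b1 = float(entry)   # mirrors "(double) rdfentry_"
        b2 = entry * entry  # mirrors "(int) rdfentry_ * rdfentry_"

        # Cut1 and Cut2 are both booked directly on the main dataframe,
        # so each of them sees every entry.
        stats["Cut1"][1] += 1
        stats["Cut1"][0] += b1 > 25
        passed_cut2 = b2 % 2 == 0
        stats["Cut2"][1] += 1
        stats["Cut2"][0] += passed_cut2

        # Cut3 sits downstream of Cut2 only, so it sees just the entries Cut2 accepted.
        if passed_cut2:
            # In C++ 0.0 / 0 yields NaN and NaN < .5 is false; Python raises
            # instead, so NaN is substituted by hand for the entry with b2 == 0.
            b3 = b1 / b2 if b2 != 0 else math.nan
            stats["Cut3"][1] += 1
            stats["Cut3"][0] += b3 < 0.5
    return [(name, passed, seen) for name, (passed, seen) in stats.items()]


def format_report(cuts):
    """Format the counts in the same style as the report shown above."""
    first_seen = cuts[0][2]  # assumption: denominator of the cumulative efficiency
    lines = []
    for name, passed, seen in cuts:
        eff = 100.0 * passed / seen
        cumulative = 100.0 * passed / first_seen
        lines.append(
            f"{name} : pass={passed} all={seen} -- "
            f"eff={eff:.2f} % cumulative eff={cumulative:.2f} %"
        )
    return lines


for line in format_report(count_cuts()):
    print(line)
```

Cut3 rejects two of the 25 entries it sees: entry 0, where b3 is not a number, and entry 2, where b3 is exactly 0.5 and therefore not below the threshold.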
Date: May 2017
Author: Danilo Piparo

Definition in file df004_cutFlowReport.py.

