The RooFit library, distributed with ROOT, is a toolkit for modeling the expected distribution of events in a physics analysis.
See also: RooFit tutorials.
In-depth information about RooFit is available in the Topical Manuals - RooFit.
The core functionality of RooFit is to enable the modeling of ‘event data’ distributions, where each event is a discrete occurrence in time, and has one or more measured observables associated with it. Experiments of this nature result in data sets obeying Poisson (or binomial) statistics.
The natural modeling language for such distributions is the probability density function (PDF) F(x;p), which describes the probability density of the observables x as a function of the parameters p.
The defining properties of probability density functions, unit normalization with respect to all observables and positive definiteness, also provide important benefits for the design of a structured modeling language: PDFs are easily added with intuitive interpretation of fraction coefficients.
They allow the construction of higher-dimensional PDFs out of lower-dimensional building blocks, with an intuitive language to introduce and describe correlations between observables.
They also allow the universal implementation of toy Monte Carlo sampling techniques, and are a prerequisite for the use of (unbinned) maximum likelihood parameter estimation techniques.
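The claim about fraction coefficients follows directly from unit normalization. As a short worked step, assume two PDFs F_1 and F_2 normalized over the same observable x and a fraction f between 0 and 1 (these symbols are introduced here only for illustration):

```latex
\int \bigl[ f\,F_1(x) + (1-f)\,F_2(x) \bigr]\,dx
  = f \int F_1(x)\,dx + (1-f) \int F_2(x)\,dx
  = f + (1-f)
  = 1
```

The sum is again a normalized PDF, and f can be read directly as the fraction of events described by F_1.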
RooFit introduces a granular structure in its mapping of mathematical data model components to C++ objects: rather than aiming at a monolithic entity that describes a data model, each mathematical symbol is represented by a separate object. A consequence of this design philosophy is that every RooFit model consists of multiple objects.
| Mathematical concept | RooFit class |
|---|---|
| List of space points | RooAbsData |
A Gaussian probability density function (PDF) typically consists of four objects:
- three objects representing the observable, the mean and the sigma parameters,
- one object representing a Gaussian probability density function.
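The four-object layout can be sketched in plain C++ without any RooFit classes. The `Variable` and `GaussianPdf` types below are hypothetical stand-ins invented for this illustration, not RooFit's own classes, and the density is normalized over the whole real line rather than over an observable range.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// A named real-valued quantity; plays the role of an observable or a parameter.
// (Hypothetical type for illustration only.)
struct Variable {
    std::string name;
    double value;
};

// The PDF object stores references to its three variables, not copies,
// so a change to any variable is immediately visible to the PDF.
class GaussianPdf {
public:
    GaussianPdf(const Variable& x, const Variable& mean, const Variable& sigma)
        : x_(x), mean_(mean), sigma_(sigma) {}

    // Density at the current value of x, normalized over the whole real line.
    double density() const {
        const double pi = std::acos(-1.0);
        const double z = (x_.value - mean_.value) / sigma_.value;
        return std::exp(-0.5 * z * z) / (sigma_.value * std::sqrt(2.0 * pi));
    }

private:
    const Variable& x_;
    const Variable& mean_;
    const Variable& sigma_;
};
```

Creating three `Variable` objects and passing them to one `GaussianPdf` gives the four objects described above; updating `mean.value` afterwards changes what `density()` returns without touching the PDF object itself.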
Model building operations such as addition, multiplication, integration are represented by separate operator objects and make the modeling language scale well to models of arbitrary complexity.
Signal and background model
Taking a Gaussian probability density function, the following example constructs a one-dimensional probability density function with a Gaussian signal component and an ARGUS phase space background component.
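As a library-independent sketch of the same idea, the signal-plus-background sum can be written with plain C++ callables. This is not RooFit code: the helper names (`integrate`, `normalize`, `gaussianShape`, `argusShape`, `addPdf`) are made up for this illustration, the ARGUS power is assumed fixed to 1/2, and normalization is done numerically with a midpoint rule.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

using Shape = std::function<double(double)>;

// Midpoint-rule integral of f over [lo, hi].
double integrate(const Shape& f, double lo, double hi, int n = 20000) {
    const double h = (hi - lo) / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += f(lo + (i + 0.5) * h);
    return sum * h;
}

// Turn an unnormalized shape into a PDF with unit integral on [lo, hi].
Shape normalize(const Shape& f, double lo, double hi) {
    const double norm = integrate(f, lo, hi);
    return [f, norm](double x) { return f(x) / norm; };
}

// Unnormalized Gaussian shape.
Shape gaussianShape(double mean, double sigma) {
    return [=](double x) {
        const double z = (x - mean) / sigma;
        return std::exp(-0.5 * z * z);
    };
}

// Unnormalized ARGUS shape with endpoint m0 and slope c (power assumed to be 1/2).
Shape argusShape(double m0, double c) {
    return [=](double m) {
        const double t = 1.0 - (m / m0) * (m / m0);
        return t > 0.0 ? m * std::sqrt(t) * std::exp(c * t) : 0.0;
    };
}

// Fraction-weighted sum of two normalized PDFs: f * sig + (1 - f) * bkg.
Shape addPdf(const Shape& sig, const Shape& bkg, double f) {
    return [=](double x) { return f * sig(x) + (1.0 - f) * bkg(x); };
}
```

With illustrative values such as `normalize(gaussianShape(5.28, 0.0027), 5.20, 5.30)` for the signal, `normalize(argusShape(5.291, -20.0), 5.20, 5.30)` for the background and a signal fraction of 0.2, `addPdf` returns a function that still integrates to one over the observable range.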
Figure: RooFit plot.
It is also possible to organize all individual components of the RooFit PDF (the variables, component PDFs and combined PDF) in a container class, the workspace (here named myWorkspace), which has an associated factory tool to create trees of RooFit objects of arbitrary complexity using a construction language.
After executing the ROOT macro, the objects defined in the workspace are also available in a namespace with the same name as the workspace if the second argument of the workspace constructor is set to
That is, typing myWorkspace::sum at the root prompt yields:
Convolution of two PDFs
The Signal and background model example illustrated the use of the RooAddPdf addition operator. It is also possible to construct convolutions of PDFs using the FFT convolution operator.
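To make the operation concrete, here is a minimal sketch of what a convolution of two PDFs computes. It deliberately uses a direct numerical integral instead of an FFT, and it does not use RooFit; the `gaussian` and `convolve` helpers are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

using Pdf = std::function<double(double)>;

// Gaussian PDF normalized over the whole real line.
Pdf gaussian(double mean, double sigma) {
    return [=](double x) {
        const double pi = std::acos(-1.0);
        const double z = (x - mean) / sigma;
        return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
    };
}

// Direct (non-FFT) convolution (f * g)(x) = integral of f(t) g(x - t) dt,
// evaluated with the midpoint rule on [lo, hi]. The interval must be wide
// enough to contain essentially all of f.
Pdf convolve(const Pdf& f, const Pdf& g, double lo, double hi, int n = 4000) {
    return [=](double x) {
        const double h = (hi - lo) / n;
        double sum = 0.0;
        for (int i = 0; i < n; ++i) {
            const double t = lo + (i + 0.5) * h;
            sum += f(t) * g(x - t);
        }
        return sum * h;
    };
}
```

A convenient sanity check is the Gaussian case: convolving a Gaussian of width 3 with one of width 4 should reproduce a Gaussian of width 5 whose mean is the sum of the two means. An FFT-based operator evaluates the same integral on a sampled grid, which is much faster when the result is needed at many points.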
You can construct multi-dimensional PDFs with and without correlations using the RooProdPdf product operator. The example below shows how to construct a 2-dimensional PDF with correlations of the form F(x|y)*G(y), where the conditional PDF F(x|y) describes the distribution in observable x given a value of y, and the PDF G(y) describes the distribution in observable y.
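The structure F(x|y)*G(y) can be illustrated with a small standalone C++ sketch. It does not use RooProdPdf; the specific choice of a Gaussian in x whose mean shifts linearly with y, and all numerical values, are assumptions made only for this example.

```cpp
#include <cassert>
#include <cmath>

// Gaussian density normalized over the whole real line.
double gauss(double x, double mean, double sigma) {
    const double pi = std::acos(-1.0);
    const double z = (x - mean) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
}

// Conditional PDF F(x|y): a Gaussian in x whose mean shifts linearly with y.
// It is normalized in x for every fixed value of y.
double condF(double x, double y) { return gauss(x, 0.5 * y, 1.0); }

// PDF G(y) for the conditioning observable.
double pdfG(double y) { return gauss(y, 0.0, 2.0); }

// Joint PDF P(x, y) = F(x|y) * G(y).
double joint(double x, double y) { return condF(x, y) * pdfG(y); }

// Midpoint-rule integral of the joint PDF over the square [-L, L] x [-L, L].
double integrateJoint(double L, int n) {
    const double h = 2.0 * L / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        const double x = -L + (i + 0.5) * h;
        for (int j = 0; j < n; ++j) {
            const double y = -L + (j + 0.5) * h;
            sum += joint(x, y);
        }
    }
    return sum * h * h;
}
```

Because F(x|y) integrates to one in x for each y and G(y) integrates to one in y, the product is a normalized two-dimensional PDF; `integrateJoint(20.0, 400)` should come out very close to 1. The correlation shows up in the fact that the most probable x moves with y.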
The result is:
Working with likelihood functions and profile likelihood
Given a PDF and a data set, a likelihood function can be constructed as:
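Conceptually, an unbinned likelihood is the product of the PDF values at all observed events, and in practice its negative logarithm is minimized. The sketch below shows that quantity for a Gaussian PDF in plain C++; it is not the RooFit call, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Logarithm of a Gaussian density normalized over the whole real line.
double logGauss(double x, double mean, double sigma) {
    const double pi = std::acos(-1.0);
    const double z = (x - mean) / sigma;
    return -0.5 * z * z - std::log(sigma * std::sqrt(2.0 * pi));
}

// Unbinned negative log-likelihood: -sum_i log F(x_i; mean, sigma).
double nll(const std::vector<double>& data, double mean, double sigma) {
    double sum = 0.0;
    for (double x : data) sum -= logGauss(x, mean, sigma);
    return sum;
}
```

For a fixed sigma, `nll` viewed as a function of `mean` is smallest at the sample mean of the data, which is the maximum likelihood estimate in this simple case.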
The likelihood function behaves like a regular RooFit function and can be plotted in the same way as probability density functions:
Since likelihood evaluations are potentially time-consuming, RooFit facilitates calculation of likelihood in parallel on multiple processes. This parallelization process is transparent to the user. To request parallel calculation on 8 processors (on the same host), construct the likelihood function as follows
Along similar lines, you can also construct the profile likelihood, which is the likelihood minimized with respect to the nuisance parameters. That is, for a likelihood L(p,q), where p is a parameter of interest and q is a nuisance parameter, the value of the profile likelihood PL(p) is the value of L(p,q) at the value of q where L(p,q) is lowest. A profile likelihood is constructed as follows:
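The definition can be sketched independently of RooFit as PL(p) = min over q of L(p,q), with L understood as a negative log-likelihood so that lower values are better. The helpers `minimize1D` and `profile` below are hypothetical names, and the golden-section search assumes that L has a single minimum in q on the given interval.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Minimum value of a unimodal function on [lo, hi], found by golden-section search.
double minimize1D(const std::function<double(double)>& f, double lo, double hi,
                  double tol = 1e-10) {
    const double r = (std::sqrt(5.0) - 1.0) / 2.0;
    double a = lo, b = hi;
    double c = b - r * (b - a), d = a + r * (b - a);
    while (b - a > tol) {
        // Keep the sub-interval that must contain the minimum.
        if (f(c) < f(d)) { b = d; } else { a = c; }
        c = b - r * (b - a);
        d = a + r * (b - a);
    }
    return f(0.5 * (a + b));
}

// Profile of negLogL(p, q) over the nuisance parameter q in [qLo, qHi]:
// for the given p, return the minimum over q.
double profile(const std::function<double(double, double)>& negLogL, double p,
               double qLo, double qHi) {
    return minimize1D([&](double q) { return negLogL(p, q); }, qLo, qHi);
}
```

For a correlated quadratic such as L(p,q) = (p^2 - 2*rho*p*q + q^2) / (2*(1 - rho^2)), the minimum over q sits at q = rho*p and the profile reduces to p^2/2, which is always below the value obtained by holding q fixed at 0. This is why a profile likelihood scan is typically wider than a plain likelihood scan in the same parameter.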
A toy PDF and a data set are constructed. A likelihood scan and a profile likelihood scan are compared in one of the parameters:
Figure: The likelihood and the profile likelihood in the frac parameter.