With the RooFit library, ROOT provides a toolkit for modeling the expected distribution of events in a physics analysis. Models can be used to perform unbinned maximum likelihood fits, create plots, and generate “toy Monte Carlo” samples for various studies.
The core functionality of RooFit is to enable the modeling of ‘event data’ distributions, where each event is a discrete occurrence in time, and has one or more measured observables associated with it. Experiments of this nature result in data sets obeying Poisson (or binomial) statistics.
The natural modeling language for such distributions is the probability density function (PDF) F(x;p), which describes the probability density of the observables x as a function of the parameters p.
The defining properties of PDFs, unit normalization with respect to all observables and positive definiteness, also provide important benefits for the design of a structured modeling language: PDFs can be added easily, with an intuitive interpretation of the fraction coefficients.
They allow the construction of higher-dimensional PDFs out of lower-dimensional building blocks, with an intuitive language to introduce and describe correlations between observables.
They also allow a universal implementation of toy Monte Carlo sampling techniques, and are a prerequisite for the use of (unbinned) maximum likelihood parameter estimation techniques.
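The additivity property can be checked numerically. The following is a minimal sketch in plain Python, not RooFit code: the helper names `gaussian`, `add_pdfs`, and `integrate`, and all parameter values, are assumptions made for illustration. It shows that a sum of two unit-normalized PDFs weighted by `frac` and `1 - frac` is again unit-normalized, which is why the coefficient can be read directly as a fraction.

```python
import math


def gaussian(x, mean, sigma):
    """Gaussian density, unit-normalized over the whole real line."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def add_pdfs(pdf_a, pdf_b, frac):
    """Return the density frac*pdf_a + (1 - frac)*pdf_b (hypothetical helper)."""
    if not 0.0 <= frac <= 1.0:
        raise ValueError("frac must lie in [0, 1]")
    return lambda x: frac * pdf_a(x) + (1.0 - frac) * pdf_b(x)


def integrate(func, lo, hi, steps=20000):
    """Midpoint-rule estimate of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))


# Two component densities with illustrative parameter values.
narrow = lambda x: gaussian(x, 0.0, 1.0)
wide = lambda x: gaussian(x, 2.0, 3.0)
model = add_pdfs(narrow, wide, frac=0.3)

# The range is wide enough that the tails outside it are negligible.
total = integrate(model, -40.0, 40.0)
print(f"integral of 0.3*narrow + 0.7*wide = {total:.6f}")
```

Because both components integrate to one, no extra normalization step is needed after the addition.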
RooFit introduces a granular structure in its mapping of data model components to C++ objects: instead of aiming for a monolithic entity describing a data model, each mathematical symbol is represented by a separate object. A consequence of this design philosophy is that every RooFit model consists of multiple objects.
For example, a Gaussian PDF typically consists of four objects:
three objects representing the observable, the mean and the sigma parameters,
one object representing a Gaussian PDF.
Model building operations such as addition, multiplication, and integration are represented by separate operator objects, which lets the modeling language scale well to models of arbitrary complexity.
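The one-object-per-symbol idea can be illustrated independently of RooFit. The sketch below is plain Python with hypothetical classes `Var`, `Gaussian`, and `Sum`; these are not RooFit classes and only mimic the structure described above. Three variable objects and one PDF object make up the Gaussian, and a separate operator object represents the addition.

```python
import math
from dataclasses import dataclass


@dataclass
class Var:
    """A named real value standing in for one mathematical symbol."""
    name: str
    value: float


class Gaussian:
    """PDF object that keeps references to its variables, not copies."""

    def __init__(self, name, x, mean, sigma):
        self.name, self.x, self.mean, self.sigma = name, x, mean, sigma

    def value(self):
        z = (self.x.value - self.mean.value) / self.sigma.value
        return math.exp(-0.5 * z * z) / (self.sigma.value * math.sqrt(2.0 * math.pi))


class Sum:
    """Operator object representing frac*a + (1 - frac)*b."""

    def __init__(self, name, a, b, frac):
        self.name, self.a, self.b, self.frac = name, a, b, frac

    def value(self):
        f = self.frac.value
        return f * self.a.value() + (1.0 - f) * self.b.value()


# Four objects for one Gaussian: observable, mean, sigma, and the PDF itself.
x = Var("x", 0.0)
mean = Var("mean", 0.0)
sigma = Var("sigma", 1.0)
gauss = Gaussian("gauss", x, mean, sigma)

before = gauss.value()  # density at x = 0 with mean = 0
mean.value = 1.0        # change the parameter object, not the PDF
after = gauss.value()   # the PDF sees the new mean through its reference

# A second Gaussian shares x and mean; the sum is a separate operator object.
sigma_wide = Var("sigma_wide", 2.0)
gauss_wide = Gaussian("gauss_wide", x, mean, sigma_wide)
frac = Var("frac", 0.5)
total = Sum("total", gauss, gauss_wide, frac)
print(before, after, total.value())
```

Sharing the `x` and `mean` objects between the two Gaussians is what ties the components of the composite model together in this sketch.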
Signal and background model
Starting from a Gaussian PDF, the following example constructs a one-dimensional PDF with a Gaussian signal component and an ARGUS background component. All components of the RooFit PDF (the variables, the component PDFs, and the combined PDF) are created individually by calling their constructors directly.
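To make the shape of such a model concrete, here is a library-independent sketch in plain Python. It is not the RooFit listing: the function names, the observable range, and every parameter value are assumptions chosen for illustration, and the ARGUS shape is written in its commonly used form m·sqrt(1 − (m/m0)²)·exp(c·(1 − (m/m0)²)). Each component is normalized numerically over the observable range before the two are added with a signal fraction.

```python
import math

M_LO, M_HI = 5.20, 5.30  # observable range (illustrative values)


def gauss_shape(m, mean=5.28, sigma=0.0027):
    """Unnormalized Gaussian signal shape."""
    z = (m - mean) / sigma
    return math.exp(-0.5 * z * z)


def argus_shape(m, m0=5.291, c=-20.0):
    """Unnormalized ARGUS background shape; zero at and above the endpoint m0."""
    if m >= m0:
        return 0.0
    t = 1.0 - (m / m0) ** 2
    return m * math.sqrt(t) * math.exp(c * t)


def integrate(func, lo, hi, steps=20000):
    """Midpoint-rule estimate of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))


def normalized(shape):
    """Turn a shape into a density that integrates to one on [M_LO, M_HI]."""
    norm = integrate(shape, M_LO, M_HI)
    return lambda m: shape(m) / norm


signal = normalized(gauss_shape)
background = normalized(argus_shape)


def model(m, frac_sig=0.2):
    """Composite density: frac_sig*signal + (1 - frac_sig)*background."""
    return frac_sig * signal(m) + (1.0 - frac_sig) * background(m)


model_integral = integrate(model, M_LO, M_HI)
print(f"integral of the composite model = {model_integral:.6f}")
```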
Figure: RooFit plot.
It is also possible to organize all individual components of the RooFit PDF (the variables, component PDFs, and the combined PDF) in a container class myWorkspace that has an associated factory tool to create trees of RooFit objects of arbitrary complexity using a construction language.
Convolution of two PDFs
The Signal and background model example illustrated the use of the RooAddPdf addition operator. It is also possible to construct convolutions of PDFs using the FFT convolution operator, RooFFTConvPdf.
You can use the PDF lxg for fitting, plotting, and event generation in exactly the same way as the PDF model of the Signal and background model example.
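What a convolution operator computes can be sketched without RooFit. The plain-Python example below makes two substitutions, both assumptions for illustration: it convolves an exponential decay with a Gaussian resolution instead of the shapes used in the RooFit example, and it evaluates the convolution integral by direct midpoint summation rather than by FFT. The point it shows is that the convolution of two normalized PDFs is again a normalized PDF, with a mean equal to the sum of the two means.

```python
import math

TAU = 1.0    # decay constant of the exponential (illustrative)
SIGMA = 0.5  # width of the Gaussian resolution (illustrative)


def decay(t):
    """Exponential density on t >= 0."""
    return math.exp(-t / TAU) / TAU if t >= 0.0 else 0.0


def resolution(u):
    """Zero-mean Gaussian density."""
    return math.exp(-0.5 * (u / SIGMA) ** 2) / (SIGMA * math.sqrt(2.0 * math.pi))


def convolve_at(x, t_max=20.0, steps=1000):
    """Midpoint-rule estimate of h(x) = integral of decay(t)*resolution(x - t) dt."""
    width = t_max / steps
    return width * sum(
        decay((i + 0.5) * width) * resolution(x - (i + 0.5) * width)
        for i in range(steps)
    )


# Tabulate h on a grid and check that it is again a normalized density.
X_LO, X_HI, N_X = -5.0, 25.0, 600
dx = (X_HI - X_LO) / N_X
grid = [X_LO + (j + 0.5) * dx for j in range(N_X)]
h_values = [convolve_at(x) for x in grid]

norm = dx * sum(h_values)
mean = dx * sum(x * h for x, h in zip(grid, h_values))
print(f"norm = {norm:.4f}, mean = {mean:.4f}")
```

Direct summation costs one full integral per grid point, which is the cost an FFT-based operator avoids.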
You can construct multi-dimensional PDFs with and without correlations using the RooProdPdf product operator. The example below shows how to construct a 2-dimensional PDF with correlations of the form F(x|y)*G(y) where the conditional PDF F(x|y) describes the distribution in observable x given a value of y, and PDF G(y) describes the distribution in observable y.
The result is:
RooProdPdf::model[ gaussy * gaussx|y ] = 0.606531
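The conditional product F(x|y)*G(y) described above can also be written out in plain Python. This is a sketch of the mathematics only, not of the RooProdPdf interface: the function names and the parameter values (a mean of x that shifts linearly with y, and the two widths) are assumptions for illustration. Sampling y from G first and then x from F(x|y) produces correlated events.

```python
import math
import random

A0, A1 = -0.5, -0.5  # mean of x given y is A0 + A1*y (illustrative values)
SIGMA_X = 0.5        # width of F(x|y)
SIGMA_Y = 3.0        # width of G(y)


def gaussian(x, mean, sigma):
    """Gaussian density, unit-normalized over the whole real line."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def cond_pdf(x, y):
    """F(x|y): Gaussian in x whose mean shifts linearly with y."""
    return gaussian(x, A0 + A1 * y, SIGMA_X)


def marginal_pdf(y):
    """G(y): Gaussian in y centered at zero."""
    return gaussian(y, 0.0, SIGMA_Y)


def joint_pdf(x, y):
    """Two-dimensional density F(x|y)*G(y)."""
    return cond_pdf(x, y) * marginal_pdf(y)


def sample(rng):
    """Draw one (x, y) pair: first y from G, then x from F(.|y)."""
    y = rng.gauss(0.0, SIGMA_Y)
    x = rng.gauss(A0 + A1 * y, SIGMA_X)
    return x, y


rng = random.Random(42)
events = [sample(rng) for _ in range(20000)]

mean_x = sum(x for x, _ in events) / len(events)
mean_y = sum(y for _, y in events) / len(events)
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in events) / len(events)
print(f"mean_x = {mean_x:.3f}, cov(x, y) = {cov_xy:.3f}")
```

For this linear dependence the expected covariance is A1 times the variance of y, so a nonzero `A1` is what introduces the correlation between the two observables.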
Fitting, plotting, and event generation with multi-dimensional PDFs work in much the same way as with one-dimensional PDFs. Continuing the above example, you can use:
Working with likelihood functions and profile likelihood
The likelihood function behaves like a regular RooFit function and can be drawn in the same way as PDFs.
You can similarly construct the profile likelihood, which is the likelihood minimized with respect to the nuisance parameters. Here the likelihood function L is the negative log-likelihood that RooFit minimizes, so lower values mean better agreement with the data. That is, for a likelihood L(p,q), where p is a parameter of interest and q is a nuisance parameter, the value of the profile likelihood PL(p) is the value of L(p,q) at the value of q where L(p,q) is lowest. A profile likelihood is constructed as follows:
A toy PDF and a data set are constructed. A likelihood scan and a profile likelihood scan are compared in one of the parameters:
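The difference between the two scans can be sketched in plain Python. This is not the RooFit example: the two-Gaussian model, the parameter values, the grid search over the nuisance parameter, and all names are assumptions for illustration. The plain scan keeps the nuisance parameter `sigma2` fixed, while the profile scan re-minimizes the negative log-likelihood over `sigma2` at every value of `frac`.

```python
import math
import random


def gaussian(x, mean, sigma):
    """Gaussian density, unit-normalized over the whole real line."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def model(x, frac, sigma2):
    """frac * narrow Gaussian + (1 - frac) * wide Gaussian, common mean 0."""
    return frac * gaussian(x, 0.0, 1.0) + (1.0 - frac) * gaussian(x, 0.0, sigma2)


def nll(data, frac, sigma2):
    """Negative log-likelihood of the data for the given parameters."""
    return -sum(math.log(model(x, frac, sigma2)) for x in data)


# Toy data set generated from the model with illustrative true values.
rng = random.Random(1234)
TRUE_FRAC, TRUE_SIGMA2 = 0.5, 4.0
data = [
    rng.gauss(0.0, 1.0) if rng.random() < TRUE_FRAC else rng.gauss(0.0, TRUE_SIGMA2)
    for _ in range(300)
]

frac_grid = [0.1 * i for i in range(1, 10)]      # parameter of interest
sigma2_grid = [2.0 + 0.2 * i for i in range(21)]  # nuisance parameter
fixed_sigma2 = sigma2_grid[10]                    # value held fixed in the plain scan

# Plain scan: nuisance parameter fixed. Profile scan: minimized at each frac.
plain_scan = [nll(data, f, fixed_sigma2) for f in frac_grid]
profile_scan = [min(nll(data, f, s) for s in sigma2_grid) for f in frac_grid]

# Shift both curves so that the profile minimum sits at zero.
offset = min(profile_scan)
for f, plain, prof in zip(frac_grid, plain_scan, profile_scan):
    print(f"frac = {f:.1f}  NLL = {plain - offset:8.3f}  profile = {prof - offset:8.3f}")
```

Because the profile curve is minimized over the nuisance parameter at each point, it can never lie above the plain scan, and it is typically wider around the minimum.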
Figure: The likelihood and the profile likelihood in the frac parameter.
In Python, the example above looks like this: