Fitting is the method for modeling the expected distribution of events in a physics data analysis. ROOT offers various options to perform the fitting of the data:
- Fit() method: You can fit histograms and graphs programmatically with the Fit() method.
- Minimization packages: ROOT provides several minimization packages.
- Using the ROOT::Fit classes
- Fit Panel: After a histogram is drawn, the Fit Panel GUI is best used for prototyping the fit.
- RooFit: The RooFit library is a toolkit for modeling the expected distribution of events in a physics analysis.
Using the Fit() method
The Fit() method is implemented for:
- the histogram classes TH1
- the sparse histogram classes THnSparse
- the graph classes TGraph
Using TH1::Fit() and TGraph::Fit()
Both methods return a TFitResultPtr, which is explained later in this manual.
By default, the fitted function object is added to the histogram and is drawn on the current pad.
For a detailed explanation of the option and goption strings, see TH1::Fit().
The xmin and xmax parameters optionally specify the fit range.
The signature of TGraph::Fit() is the same, but the supported options are slightly different: L, WL, and I are exclusive to TH1::Fit(), while EX0 and ROB only apply to TGraph::Fit() (these options are explained in the linked function documentation).
Fitting 1-D histograms with pre-defined functions
Use the TH1::Fit() method to fit a 1-D histogram with a pre-defined function. The name of the pre-defined function is the first parameter. For pre-defined functions, you do not need to set initial values for the parameters.
See the TH1::Fit() documentation for the full list of pre-defined functions.
Example: a histogram object hist is fit with a Gaussian by calling hist.Fit("gaus").
Fitting 1-D histograms with user-defined functions
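In this example, a TF1 object func is created from a general C++ function fitf with parameters. The function fitf has the signature double fitf(double *x, double *par). After the initial parameter values are set on func, the fitf function is fitted to the histogram.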
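A user-defined fit function of this kind might look like the following sketch. It uses plain C++ only and assumes a Gaussian shape with the parameter layout amplitude, mean, width; that layout is an illustrative choice, not something the text prescribes.

```cpp
#include <cmath>

// Gaussian-shaped model in the (x, par) calling convention:
// x[0] is the coordinate, par[] holds the fit parameters.
// Assumed layout: par[0] = amplitude, par[1] = mean, par[2] = width.
double fitf(double *x, double *par)
{
    double arg = 0.0;
    if (par[2] != 0.0) {
        // Guard against a division by zero when the width is 0.
        arg = (x[0] - par[1]) / par[2];
    }
    return par[0] * std::exp(-0.5 * arg * arg);
}
```

In ROOT, a function with this signature can be handed to the TF1 constructor together with a name, the range, and the number of parameters (3 in this sketch).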
Configuring the fit
The following configuration actions are available when fitting a histogram or graph using the Fit() method (relevant tutorials are linked in parentheses):
- Fixing and setting parameter bounds
- Fitting subranges and multiple subranges (multifit.C / multifit.py). The tutorial shows how to fit several Gaussian functions with different parameters to separate subranges of the same histogram.
- Fitting the convolution of two functions (fitConvolution.C / fitConvolution.py)
- Fitting the normalized sum of functions (fitNormSum.C / fitNormSum.py)
- Adding functions to the list
Fixing and setting parameter bounds
For pre-defined functions like landau, the initial parameter values are set automatically.
For functions that are not pre-defined, the fit parameters must be initialized before invoking the Fit() method.
- Use the TF1::SetParLimits() method to set the bounds for one parameter.
When the lower and upper limits are equal, the parameter is fixed.
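Example: after its value is set to 10 with TF1::SetParameter(), parameter 4 is fixed at 10 by calling func->SetParLimits(4, 10, 10).
- Use the TF1::FixParameter() method to fix a parameter at a given value, for example 0.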
You do not need to set the limits for all parameters.
Consider a function with 6 parameters. A setup like the following is then possible: parameters 0 to 2 can vary freely, parameter 3 has the bounds [-10, 4] with the initial value -1.5, and parameter 4 is fixed to 0.
Adding functions to the list
$ROOTSYS/tutorials/fit/multifit.C illustrates how to fit several functions on the same histogram.
By default, a fit command deletes the previously fitted function in the histogram object. You can specify the option + in the second parameter to add the newly fitted function to the existing list of functions for the histogram.
Note that the fitted function(s) are saved with the histogram when it is written to a ROOT file.
Accessing fit results
You can obtain the following results of a fit:
- associated function
- parameter values
- covariance and correlation matrix (via the fit result object explained below)
One or more objects (typically a TF1*) can be added to the list of functions associated with each histogram.
TH1::Fit() adds the fitted function to this list.
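Given a histogram h, you can retrieve the associated function by name with the TH1::GetFunction() method, for example h->GetFunction("myfunc"), where myfunc stands for the name of the fitted function.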
Accessing the fit parameters and results
If the histogram or graph is made persistent, the list of associated functions is also persistent.
Retrieve a pointer to the function with the TH1::GetFunction() method. Then you can retrieve the fit parameters from the function, for example with TF1::GetParameter() and TF1::GetParError().
With the fit option S, you can access the full result of the fit including the covariance and correlation matrix.
By default, for each bin, the sum of weights is computed at fill time. You can also call TH1::Sumw2() to force the storage and computation of the sum of the squares of weights per bin. If Sumw2() has been called, the error per bin is computed as sqrt(sum of squares of weights). Otherwise, the error is set equal to sqrt(bin content).
To return the error for a given bin number, use TH1::GetBinError().
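The two error definitions can be made concrete with a small worked example. The following is a plain C++ sketch of the formulas only, not ROOT code; the function name binError and the use of the absolute value for negative bin content are assumptions of this sketch.

```cpp
#include <cmath>
#include <vector>

// Error of one bin, computed from the weights of the entries filled into it.
// sumw2Stored == true  : sqrt(sum of squares of weights)
// sumw2Stored == false : sqrt(bin content), where content = sum of weights
double binError(const std::vector<double> &weights, bool sumw2Stored)
{
    double sumw = 0.0;   // bin content
    double sumw2 = 0.0;  // sum of squared weights
    for (double w : weights) {
        sumw += w;
        sumw2 += w * w;
    }
    // std::fabs keeps the square root defined if negative weights make the
    // content negative (an assumption of this sketch).
    return sumw2Stored ? std::sqrt(sumw2) : std::sqrt(std::fabs(sumw));
}
```

For unit weights both definitions agree: four entries of weight 1 give an error of 2 either way. For the weights 0.5, 0.5, and 2.0 the content is 3, so the default error is sqrt(3), while the stored squared weights give sqrt(4.5).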
Empty bins are excluded in the fit when using the Chi-square fit method. When fitting a histogram representing counts (that is, with Poisson statistics), it is recommended to use the Log-Likelihood method (option L or WL), particularly in case of low statistics.
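Why this matters can be seen in a small numerical sketch. The plain C++ code below is not ROOT's implementation; it assumes a chi-square that uses sqrt(bin content) as the bin error and a Poisson negative log-likelihood written up to a constant term. The function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Chi-square with the bin error taken as sqrt(content). Bins with zero
// content have zero error and are skipped.
double chi2SkipEmpty(const std::vector<double> &counts,
                     const std::vector<double> &expected)
{
    double chi2 = 0.0;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        if (counts[i] <= 0.0) continue;  // empty bin: excluded
        const double d = counts[i] - expected[i];
        chi2 += d * d / counts[i];
    }
    return chi2;
}

// Poisson negative log-likelihood up to a constant: sum(mu - n * log(mu)).
// Expected values must be positive. An empty bin still contributes mu.
double poissonNll(const std::vector<double> &counts,
                  const std::vector<double> &expected)
{
    double nll = 0.0;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        nll += expected[i] - counts[i] * std::log(expected[i]);
    }
    return nll;
}
```

With the counts {0, 4}, the chi-square is identical for the predictions {2, 4} and {0.5, 4}, because the empty bin is ignored. The likelihood differs by 1.5 between the two predictions, so it still uses the information that no events were observed in the first bin.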
Fit statistics box for plots
You can change the statistics box to display the fit parameters with the TStyle::SetOptFit() method. This parameter has four digits:
mode = pcev (default = 0111)
p = 1: Print probability.
c = 1: Print Chi-square/number of degrees of freedom.
e = 1: Print errors (if e=1, v must be 1).
v = 1: Print name/values of parameters.
To print the fit probability, parameter names, values, and errors, use gStyle->SetOptFit(1011).
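The digit convention can be checked with a small decoder. This is a plain C++ sketch of the pcev layout described above, not ROOT code; the struct and function names are hypothetical.

```cpp
// Flags encoded in the four decimal digits "pcev" of the mode value.
struct FitBoxOptions {
    bool probability;  // p
    bool chi2;         // c
    bool errors;       // e
    bool values;       // v
};

// Split a decimal mode such as 1011 into its four digits.
// Note: in C++ source code write 111, not 0111. A leading zero makes the
// literal octal, so 0111 would be the decimal value 73.
FitBoxOptions decodeOptFit(int mode)
{
    FitBoxOptions o;
    o.values      = (mode % 10) != 0;
    o.errors      = (mode / 10 % 10) != 0;
    o.chi2        = (mode / 100 % 10) != 0;
    o.probability = (mode / 1000 % 10) != 0;
    return o;
}
```

Decoding 1011 yields probability, errors, and values switched on and the chi-square line switched off. Decoding 111 corresponds to the default listed above.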
The fit result object
When fitting a histogram (a TH1 object) or a graph (a TGraph object), it is possible to return a TFitResultPtr object, which behaves as a smart pointer to a TFitResult. TFitResultPtr is the return object of TH1::Fit or TGraph::Fit.
By default, the TFitResultPtr contains only the status of the fit, which can be obtained by an automatic conversion of the TFitResultPtr to an integer. If the fit option S is used instead, the TFitResultPtr contains the TFitResult and behaves as a smart pointer to it.
The TFitResult class inherits from ROOT::Fit::FitResult.
In addition to the base FitResult class, it provides some methods to return a covariance or correlation matrix as a TMatrixDSym object. TFitResult objects can be stored in ROOT files.
All fit result objects support printing with FitResult::Print().
Using ROOT::Fit classes
- Fit method classes: Classes describing fit method functions, like ROOT::Fit::Chi2FCN, ROOT::Fit::PoissonLikelihoodFCN, and ROOT::Fit::LogLikelihoodFCN.
- Fit data classes: Classes for describing the input data for fitting, including:
- Binned datasets (ROOT::Fit::BinData): data points containing both coordinates and a corresponding value or weight with optionally an error on the value or the coordinate. They are used for least square (chi-square) fits of histograms or TGraph objects.
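- Un-binned datasets (ROOT::Fit::UnBinData): They are used for fitting vectors of data points, for example from a TTree.
- User fitting classes: Classes for fitting a given dataset, like ROOT::Fit::Fitter, ROOT::Fit::FitConfig, ROOT::Fit::ParameterSettings, and ROOT::Fit::FitResult.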
The fitter classes use the generic interfaces for parametric function evaluations, ROOT::Math::IParametricFunctionMultiDim, to define the fitting model function, and the ROOT::Math::Minimizer interface to perform the minimization of the target function.
Creating the data object
Example: filling a binned dataset from a histogram
Given a histogram, represented as a TH1 type object, a ROOT::Fit::BinData object is created and filled, for example with ROOT::Fit::FillData().
ROOT::Fit::DataOptions controls some fitting options.
In this example, the fIntegral option is set to integrate the fit function over each bin instead of using the value at the bin centers.
A ROOT::Fit::DataRange object restricts the fit to a chosen interval.
Using un-binned data
- Use the ROOT::Fit::UnBinData class for un-binned data.
For creating un-binned datasets, there are two possibilities:
- Copy the data inside a ROOT::Fit::UnBinData object: Create an empty ROOT::Fit::UnBinData object, iterate on the data, and add the data points one by one. An input ROOT::Fit::DataRange object is passed in order to copy the data according to the given range.
- Use ROOT::Fit::UnBinData as a wrapper to an external data storage: In this case the ROOT::Fit::UnBinData object is created from an iterator or pointers to the data and the data are not copied inside. The data cannot be selected according to a specified range. All the data points will be included in the fit.
ROOT::Fit::UnBinData also supports weighted data. In addition to the data points (coordinates), which can be of arbitrary k dimensions, the class can be constructed from a vector of weights.
Data are taken from a histogram (TH1 object).
In this example a two-dimensional UnBinData object is created with the contents from a tree.
Creating a fit model
To fit a dataset, a model is needed to describe the data, such as a probability density function (PDF) describing the observed data or a hypothetical function describing the relationship between the independent variables X and the single dependent variable Y. The model can have any number k of independent variables. For example, in fitting a k-dimensional histogram, the independent variables X are the coordinates of the bin centers and Y is the bin weight.
The model function needs to be expressed as function of some unknown parameters. The fitting will find the best parameter value to describe the observed data.
You can, for example, use the TF1 class, the parametric function class, to describe the model function.
But the ROOT::Fit::Fitter class takes as input a more general parametric function object, the abstract interface class ROOT::Math::IParametricFunctionMultiDim. It describes a generic one-dimensional or multi-dimensional function with parameters.
This interface extends the abstract ROOT::Math::IBaseFunctionMultiDim class with methods to set or retrieve parameter values and to evaluate the function given the vector of independent values X and the vector of parameters P.
When creating a wrapper, the parameter values stored in TF1 are copied to the ROOT::Math::WrappedMultiTF1 object. The function object representing the model function is given to the ROOT::Fit::Fitter class using the Fitter::SetFunction method.
You can also provide a function object that implements the derivatives of the function with respect to the parameters. In this case you must provide the function object as a class deriving from the ROOT::Math::IParametricGradFunctionMultiDim interface.
Note that the ROOT::Math::WrappedMultiTF1 wrapper class also implements the gradient interface, using internally TF1::GradientPar, which is based on numerical differentiation, except for the case of linear functions (that is, when TF1::IsLinear() is true). The parameter derivatives of the model function can be useful to some minimization algorithms, such as FUMILI (see → FUMILI). However, in general it is better to leave the minimization algorithm (for example TMinuit, see → TMinuit) to compute the needed derivatives using its own customised numerical differentiation algorithm. To avoid providing the parameter derivatives to the fitter, explicitly set the second argument of Fitter::SetFunction to false.
Configuring the fit
There are the following fit configurations:
- Setting the initial values of the parameters.
- Setting the parameter step sizes.
- Setting optional parameter bounds.
- Setting the minimizer library and the particular algorithm to use.
- Setting different minimization options (print level, tolerance, max iterations, etc.).
- Setting the type of parameter errors to compute (parabolic errors, MINOS errors, re-normalize errors using fitted chi2 values).
Example: the lower and upper bounds for the first parameter and a lower bound for the second parameter are set through the corresponding ROOT::Fit::ParameterSettings objects, using ParameterSettings::SetLimits() and ParameterSettings::SetLowerLimit().
Note that a ROOT::Fit::ParameterSettings object exists for each fit parameter and is created by the ROOT::Fit::FitConfig class after the model function has been set in the fitter. Only when the function is set is the number of parameters known, and the FitConfig then automatically creates the corresponding ParameterSettings objects.
Various minimizers can be used in the fitting process. They can be implemented in different libraries and loaded at run time. Each different minimizer (for example Minuit, Minuit2, FUMILI, etc.) consists of a different implementation of the ROOT::Math::Minimizer interface. Within the same minimizer, thus within the same class implementing the Minimizer interface, different algorithms exist.
If the requested minimizer is not available in ROOT, the default one is used. The default minimizer type and algorithm can be specified by using the static function ROOT::Math::MinimizerOptions::SetDefaultMinimizer().
Performing the fit
Depending on the available input data and the selected function for fitting, you can use one of the methods of the ROOT::Fit::Fitter class to perform the fit.
Pre-defined fitting methods
The following pre-defined fitting methods are available:
- Least-square fit: Fitter::LeastSquareFit(const BinData &) or Fitter::Fit(const BinData &). Both methods should be used when the binned data values follow a Gaussian distribution. These fit methods are implemented using the ROOT::Fit::Chi2FCN class.
- Binned likelihood fit: Fitter::LikelihoodFit(const BinData &). This method should be used when the binned data values follow a Poisson or a multinomial distribution. The Poisson case (extended fit) is the default, and in this case the function normalization is also fit to the data. This method is implemented by the ROOT::Fit::PoissonLikelihoodFCN class.
- Un-binned likelihood fit: Fitter::LikelihoodFit(const UnBinData &). By default the fit is not extended, that is, the normalization is not fitted to the data. This method is implemented using the ROOT::Fit::LogLikelihoodFCN class.
- Linear fit: A linear fit can be chosen if the model function is linear in the parameters.
User-defined fitting methods
You can also implement your own fitting methods by providing your own version of the method function, which uses its own dataset objects and functions.
Use ROOT::Fit::Fitter::SetFCN to set the method function and ROOT::Fit::Fitter::FitFCN for fitting.
You can also pass the method function directly to ROOT::Fit::Fitter::FitFCN, but in this case a previously defined fitting configuration is used.
The possible types of method functions that can be passed to ROOT::Fit::Fitter::SetFCN are:
- A generic functor object implementing operator()(const double * p), where p is the parameter vector. In this case you need to pass the number of parameters, the function object, and optionally a vector of initial parameter values. Other optional parameters include the size of the dataset and a flag specifying whether it is a chi2 (least-square fit). If the last two parameters are given, the chi2/ndf can be computed after fitting the data.
- A function object implementing the ROOT::Math::IBaseFunctionMultiDim interface.
- A function object implementing the ROOT::Math::FitMethodFunction interface. This is an interface class that extends ROOT::Math::IBaseFunctionMultiDim with some additional functions which can be used when fitting is done. The extra functionality is required by some fitting algorithms like FUMILI or GSLMultiFit.
- An old-Minuit like FCN interface, that is, a free function with the signature fcn(int &npar, double *gin, double &f, double *u, int flag).
Example: simultaneous fit of two histograms
Computing confidence intervals
With the fit result object returned by Fitter::Result(),
you can compute the confidence intervals after the fit (see ROOT::Fit::FitResult::GetConfidenceIntervals).
Given an input dataset (for example a BinData object) and a confidence level value (for example 68%), it computes the lower and upper band values of the model function at the given data points.
You can take a look at the ConfidenceIntervals.C tutorial for an example.
Using the Fit Panel
After you have drawn a histogram (see → Drawing histograms), you can use the Fit Panel for fitting the data. The Fit Panel is best suited for prototyping.
The following section describes how to use the Fit Panel using an example.
Given is a histogram following a Gaussian distribution.
Right-click on the object and then click FitPanel.
You can also select Tools and then click Fit Panel.
Figure: FitPanel in the context menu.
The Fit Panel is displayed.
Figure: Fit Panel.
In the Fit Function section, you can select the function that should be used for fitting.
The following types of functions are listed here:
Pre-defined functions that will depend on the dimensionality of the data.
Functions present in gDirectory. These functions were already created by the user through a ROOT macro.
Previously used functions. Functions that fitted the current data previously, if the data is able to store such functions.
Select a fitting function.
Click Set Parameters... to set the parameters of the selected function.
The Set Parameters of... dialog window is displayed.
Figure: Set Parameters of… dialog window.
Set the parameters for the fit function.
In the General tab, select the general options for fitting.
This includes the method that will be used, as well as what fit options will be used with it and the draw options. You can also constrain the range of the function used for the fitting.
In the Minimization tab, select the minimization algorithm for fitting.
Figure: A fitted histogram.