Fitting histograms

Fitting is the method for modeling the expected distribution of events in a physics data analysis. ROOT offers various options to perform the fitting of the data:

Fit() method: You can fit histograms and graphs programmatically with the Fit() method.
Minimization packages: ROOT provides several minimization packages.
Using the ROOT::Fit classes
Fit Panel: After a histogram is drawn, the Fit Panel GUI is best used for prototyping the fit.
RooFit: The RooFit library is a toolkit for modeling the expected distribution of events in a physics analysis.

→ Fit tutorials

Using the Fit() method

The Fit() method is implemented for:

the histogram classes TH1
the sparse histogram classes THnSparse
the graph classes TGraph, TGraph2D and TMultiGraph (for fitting a collection of graphs with the same function)

Using TH1::Fit() and TGraph::Fit()

You can fit a TF1 function f1 to a histogram with the TH1::Fit() method, which has the following signature:

TFitResultPtr Fit(TF1 *f1,
                  const char *option="",
                  const char *goption="",
                  double xxmin=0.0,
                  double xxmax=0.0);

The function returns a TFitResultPtr, which is explained later in this manual.

By default, the fitted function object is added to the histogram and is drawn on the current pad.

For a detailed explanation of the option and goption strings, see TH1::Fit().

The xmin and xmax parameters optionally specify the fit range.

The signature of TGraph::Fit() is the same, but the supported options are slightly different: L, WL, and I are exclusive to TH1::Fit(), while EX0 and ROB only apply to TGraph::Fit() (these options are explained in the linked function documentations).

Fitting 1-D histograms with pre-defined functions

Use the TH1::Fit() method to fit a 1-D histogram with a pre-defined function. The name of the pre-defined function is the first parameter. For pre-defined functions, you do not need to set initial values for the parameters.

See the TH1::Fit() documentation for the full list of pre-defined functions.

Example

A histogram object hist is fit with a Gaussian:

hist.Fit("gaus");

Fitting 1-D histograms with user-defined functions

You can also fit any TF1 function that you defined yourself in one of the ways listed in the class documentation to a 1-D histogram.

Example

In this example, we create a TF1 func from a general C++ function with parameters:

// Define a function with three parameters.
double fitf(double *x,double *par) {
   double arg = 0;
   if (par[2]!=0) arg = (x[0] - par[1])/par[2];
   double fitval = par[0]*TMath::Exp(-0.5*arg*arg);
   return fitval;
}

// Create a TF1 object using the fitf function. The last three parameters
// specify the range and the number of parameters for the function.
TF1 *func = new TF1("fit",fitf,-3,3,3);

Now the fitf function is fitted to the histogram.

// Open a ROOT file and get a histogram.
TFile *f = new TFile("hsimple.root");
TH1F *hpx = f->Get<TH1F>("hpx");

// Set the initial parameters to the mean and RMS of the histogram.
func->SetParameters(500,hpx->GetMean(),hpx->GetRMS());

// Give the parameters names.
func->SetParNames("Constant","Mean_value","Sigma");

// Call TH1::Fit with the name of the TF1 object.
hpx->Fit("fit");

Configuring the fit

The following configuration actions are available when fitting a histogram or graph using the Fit() method (relevant tutorials linked in parathesis):

Fixing and setting parameter bounds
Fitting subranges and multiple subranges (multifit.C / multifit.py). The tutorial shows how to fit several Gaussian functions with different parameters to separate subranges of the same histogram.
Fitting the convolution of two functions (fitConvolution.C / fitConvolution.py)
Fitting the normalized sum of functions (fitNormSum.C / fitNormSum.py)
Adding functions to the list

Fixing and setting parameter bounds

For pre-defined functions like poln, exp, gaus, and landau, the parameter initial values are set automatically.

For not pre-defined functions, the fit parameters must be initialized before invoking the Fit() method.

Use the TF1::SetParLimits() method to set the bounds for one parameter.

func->SetParLimits(0,-1,1);

When the lower and upper limits are equal, the parameter is fixed.

Example

The parameter is fixed 4 at 10.

func->SetParameter(4,10);
func->SetParLimits(4,10,10);

Use the TF1::FixParameter() method to fix a parameter to 0.

Example

func->SetParameter(4,0);
func->FixParameter(4,0);

You do not need to set the limits for all parameters.

Example

There is function with 6 parameters. Then a setup like the following is possible: Parameters 0 to 2 can vary freely, parameter 3 has boundaries [-10, 4] with the initial value -1.5, and parameter 4 is fixed to 0.

func->SetParameters(0,3.1,1.e-6,-1.5,0,100);
func->SetParLimits(3,-10,4);
func->FixParameter(4,0);

Adding functions to the list

The example $ROOTSYS/tutorials/fit/multifit.C illustrates how to fit several functions on the same histogram. By default a fit command deletes the previously fitted function in the histogram object. You can specify the option + in the second parameter to add the newly fitted function to the existing list of functions for the histogram.

hist->Fit("f1","+","",-2,2)

Note that the fitted function(s) are saved with the histogram when it is written to a ROOT file.

Accessing fit results

You can obtain the following results of a fit:

associated function
parameter values
errors
covariance and correlation matrix (via the fit result object explained below)

Associated function

One or more objects (typically a TF1\*) can be added to the list of functions associated to each histogram. TH1::Fit() adds the fitted function to this list.

Given a histogram h, you can retrieve the associated function with:

TF1 *myfunc = h->GetFunction("myfunc");

Accessing the fit parameters and results

If the histogram or graph is made persistent, the list of associated functions is also persistent.

Retrieve a pointer to the function with the TH1::GetFunction() method. Then you can retrieve the fit parameters from the function.

Example

TF1 *fit = hist->GetFunction(function_name);
double chi2 = fit->GetChisquare();

// Value of the first parameter.
double p1 = fit->GetParameter(0);

// Error of the first parameter.
double e1 = fit->GetParError(0);

With the fit option S, you can access the full result of the fit including the covariance and correlation matrix.

Associated errors

By default, for each bin, the sum of weights is computed at fill time. You can also call TH1::Sumw2() to force the storage and computation of the sum of the square of weights per bin. If Sumw2() has been called, the error per bin is computed as the sqrt(sum of squares of weights). Otherwise, the error is set equal to the `sqrt(bin content).

To return the error for a given bin number, use:

double error = h->GetBinError(bin);

Empty bins are excluded in the fit when using the Chi-square fit method. When fitting an histogram representing counts (that is with Poisson statistics) it is recommended to use the Log-Likelihood method (option L or WL), particularly in case of low statistics.

Fit statistics box for plots

You can change the statistics box to display the fit parameters with the TStyle::SetOptFit() method. This parameter has four digits: mode = pcev (default = 0111)

p = 1: Print probability.
c = 1: Print Chi-square/number of degrees of freedom.
e = 1: Print errors (if e=1, v must be 1).
v = 1: Print name/values of parameters.

Example

To print the fit probability, parameter names, values, and errors, use:

gStyle->SetOptFit(1011);

The fit result object

When fitting an histogram (a TH1 object) or a graph (a TGraph object), it is possible to return a TFitResult via the TFitResultPtr object, which behaves as a smart pointer to a TFitResult. TFitResultPtr is the return object of TH1::Fit or TGraph::Fit.

By default TFitResultPtr contains only the status of the fit and can be obtained by an automatic conversion of TFitResultPtr to an integer. If the fit option S is used instead, TFitResultPtr contains TFitResult and behaves as a smart pointer to it.

The TFitResult class inherits from ROOT::Fit::FitResult. In addition to the base FitResult class, it provides some methods to return a covariance or correlation matrix as a TMatrixDSym object. Furthermore, TFitResult can be stored in ROOT files. All fit result objects support printing with FitResult::Print().

Example

// TFitResultPtr contains only the fit status.
int fitStatus = hist->Fit(myFunction);

// TFitResultPtr contains the TFitResult.
TFitResultPtr r = hist->Fit(myFunction,"S");

// Access the covariance matrix.
TMatrixDSym cov = r->GetCovarianceMatrix();

// Retrieve the fit chi2.
double chi2 = r->Chi2();

// Retrieve the value for the parameter 0.
double par0 = r->Parameter(0);

// Retrieve the error for the parameter 0.
double err0 = r->ParError(0);

// Print the full information of the fit including covariance matrix.
r->Print("V");

// Store the result in a ROOT file.
r->Write();

Using ROOT::Fit classes

ROOT::Fit is the namespace for fitting classes (regression analysis). The fitting classes are part of the MathCore library.
The defined classes can be classified in the following groups:

Fit method classes: Classes describing fit method functions like:
- ROOT::Fit::Chi2FCN: Class for binned fits using the least square methods.
- ROOT::Fit::PoissonLikelihoodFCN: Class for evaluating the log likelihood for binned Poisson likelihood fits.
- ROOT::Fit::LogLikelihoodFCN: Calls for likelihood fits.
Fit data classes: Classes for describing the input data for fitting, including:
- Binned datasets (ROOT::Fit::BinData): data points containing both coordinates and a corresponding value or weight with optionally an error on the value or the coordinate. They are used for least square (chi-square) fits of histograms or TGraph objects.
- Un-binned datasets (ROOT::Fit::UnBinData): They are used for fitting vectors of data points, for example from a TTree.
User fitting classes: Classes for fitting a given dataset:
- ROOT::Fit::Fitter for executing the fit.
- ROOT::Fit::FitConfig for configuring the fit.
- ROOT::Fit::ParameterSettings to define the properties of the fit parameters (initial values, bounds, etc.).
- ROOT::Fit::FitResult for storing the result of the fit.

The fitter classes use the generic interfaces for parametric function evaluations, ROOT::Math::IParametricFunctionMultiDim, to define the fitting model function, and the ROOT::Math::Minimizer interface to perform the minimization of the target function.

Creating the data object

Example: filling a binned dataset from a histogram

There is histogram, represented as a TH1 type object. Now a ROOT:Fit::BinData object is created and filled.

ROOT::Fit::DataOptions opt;
opt.fIntegral = true;
ROOT::Fit::DataRange range(xmin,xmax);
ROOT::Fit::BinData data(opt,range);

// Fill the bin data using the histogram. You can do this by using the
// ROOT::Fit::FillData() helper function from the histogram library.
TH1 * h1 = gDirectory->Get<TH1>("myHistogram");
ROOT::Fit::FillData(data, h1);

ROOT::Fit::DataOptions controls some fitting options. In this example, the fIntegral option is set to integrate the fit function over each bin instead of using the value at the bin centers. The call to ROOT::Fit::DataRange sets the fit range to the interval between xmin and xmax.

Using un-binned data

Use the ROOT::Fit::UnBinData class for un-binned data.

For creating un-binned datasets, there are two possibilities:

Copy the data inside a ROOT::Fit::UnBinData object.
Create an empty ROOT::Fit::UnBinData object, iterate on the data and add the data point one by one. An input ROOT::Fit::DataRange object is passed in order to copy the data according to the given range.
Use ROOT::Fit::UnBinData as a wrapper to an external data storage.
In this case the ROOT::Fit::UnBinData object is created from an iterator or pointers to the data and the data are not copied inside. The data cannot be selected according to a specified range. All the data points will be included in the fit.

ROOT::Fit::UnBinData supports also weighted data. In addition to the data points (coordinates), which can be of arbitrary k dimensions, the class can be constructed from a vector of weights.

Example

Data are taken from a standard vector.

// The data points that we want to transfer to the ROOT::Fit::UnBinData
std::vector<double> dataVec{1., 2., 3., 4., 5.};

// When creating the fitData object, it is important to create it with the right size.
ROOT::Fit::UnBinData fitData(dataVec.size());
for (std::size_t i = 0; i < dataVec.size(); ++i) {
   fitData.Add(dataVec[i]);
}

Example

In this example a two-dimensional UnBinData object is created with the contents from a tree.

TFile * file = TFile::Open("hsimple.root");
TTree *ntuple = 0;
file->GetObject("ntuple",ntuple);

// Select from the tree the data that should be used for fitting.
// Use TTree::Draw.
int nevt = ntuple->Draw("px:py","","goff");
double * x = ntuple->GetV1();
double * y = ntuple->GetV2();
ROOT::Fit::UnBinData data(nevt, x, y );

Creating a fit model

To fit a dataset, a model is needed to describe the data, such as a probability density function (PDF) describing the observed data or a hypothetical function describing the relationship between the independent variables X and the single dependent variable Y. The model can have any number of k independent variables. For example, in fitting a k-dimensional histogram, the independent variables X are the coordinates of the bin centers and Y is the bin weight.

The model function needs to be expressed as function of some unknown parameters. The fitting will find the best parameter value to describe the observed data.

You can for example use the TF1 class, the parametric function class, to describe the model function. But the ROOT::Fit::Fitter class takes as input a more general parametric function object, the abstract interface class ROOT::Math::IParametricFunctionMultiDim. It describes a generic one-dimensional or multi-dimensional function with parameters. This interface extends the abstract ROOT::Math::IBaseFunctionMultiDim class with methods to set or retrieve parameter values and to evaluate the function given by the independent vector of values X and the vector of parameters P.

You convert a TF1 object in a ROOT::Math::IParametricFunctionMultiDim, using the wrapper class ROOT::Math::WrappedMultiTF1.

Example

TF1 * f1 = new TF1("f1","gaus");
ROOT::Math::WrappedMultiTF1 fitFunction(f1, f1->GetNdim() );
ROOT::Fit::Fitter fitter;
fitter.SetFunction( fitFunction, false);

When creating a wrapper, the parameter values stored in TF1 are copied to the ROOT::Math::WrappedMultiTF1 object. The function object representing the model function is given to the ROOT::Fit::Fitter class using the Fitter::SetFunction method.

You can also provide a function object that implements the derivatives of the function with respect to the parameters. In this case you must provide the function object as a class deriving from the ROOT::Math::IParametricGradFunctionMultiDim interface.

Note that the ROOT::Math::WrappedMultiTF1 wrapper class implements also the gradient interface, using internally TF1::GradientPar, which is based on numerical differentiation, apart for the case of linear functions (this is when TF1::IsLinear() is true). The parameter derivatives of the model function can be useful to some minimization algorithms, such as FUMILI (see → FUMILI). However, in general is better to leave the minimization algorithm (for example TMinuit, see → TMinuit) to compute the needed derivatives using its own customised numerical differentiation algorithm. To avoid providing the parameter derivations to the fitter, explicitly set Fitter::SetFunction to false.

Configuring the fit

Use the ROOT::Fit::FitConfig class (contained in the ROOT::Fit::ParameterSettings class) for configuring the fit.

There are the following fit configurations:

Setting the initial values of the parameters.
Setting the parameter step sizes.
Setting eventual parameter bounds.
Setting the minimizer library and the particular algorithm to use.
Setting different minimization options (print level, tolerance, max iterations, etc. . . ).
Setting the type of parameter errors to compute (parabolic error, minor errors, re-normalize errors using fitted chi2 values).

Example

Setting the lower and upper bounds for the first parameter and a lower bound for the second parameter:

fitter.SetFunction( fitFunction, false);
fitter.Config().ParSettings(0).SetLimits(0,1.E6);
fitter.Config().ParSettings(2).SetLowerLimit(0);

Note that a ROOT::Fit::ParameterSettings objects exists for each fit parameter and it created by the ROOT::Fit::FitConfig class, after the model function has been set in the fitter. Only when the function is set, the number of parameter is known and automatically the FitConfig creates the corresponding ParameterSetting objects.

Various minimizers can be used in the fitting process. They can be implemented in different libraries and loaded at run time. Each different minimizer (for example Minuit, Minuit2, FUMILI, etc.) consists of a different implementation of the ROOT::Math::Minimizer interface. Within the same minimizer, thus within the same class implementing the Minimizer interface, different algorithms exist.

If the requested minimizer is not available in ROOT, the default one is used. The default minimizer type and algorithm can be specified by using the static function ROOT::Math::MinimizerOptions::SetDefaultMinimizer("minimizerName").

Performing the fit

Depending on the available input data and the selected function for fitting, you can use one of the methods of the ROOT::Fit::Fitter class to perform the fit.

Pre-defined fitting methods

The following pre-defined fitting methods are available:

Least-square fit: Fitter::LeastSquare(const BinData &) or Fitter::Fit(const Bindata &). Both methods should be used when the binned data values follow a Gaussian distribution. These fit methods are implemented using the ROOT::Fit::Chi2FCN class.
Binned likelihood fit: Fitter::LikelihoodFit(const Bindata &). This method should be used when the binned data values follow a Poisson or a multinomial distribution. The Poisson case (extended fit) is the default and in this case the function normalization is also fit to the data. This method is implemented by the ROOT::Fit::PoissonLikelihoodFCN class.
Un-binned likelihood fit: Fitter::LikelihoodFit(const UnBindata &). By default the fit is not extended, this is the normalization is not fitted to the data. This method is implemented using the LogLikelihoodFCN class.
Linear fit: A linear fit can be chosen if the model function is linear in the parameters.

User-defined fitting methods

You can also implement your own fitting methods. You can implement your own version of the method function using on its own dataset objects and functions.

Use ROOT::Fit::Fitter::SetFCN to set the method function and ROOT::Fit::FitFCN for fitting.
You can pass the method function also in ROOT::Fit::FitFCN, but in this case a previously defined fitting configuration is used.

The possible type of method functions that are based in ROOT::Fit::Fitter::SetFCN are:

A generic functor object implementing operator()(const double * p) where p is the parameter vector. In this case you need to pass the number of parameters, the function object and optionally a vector of initial parameter values. Other optional parameter include the size of the datasets and a flag specifying if it is a chi2 (least-square fit). If the last two parameters are given, the chi2/ndf can be computed after fitting the data.

template <class Function>
bool Fitter::SetFCN(unsigned int npar, Function &f,
                    const double *initialParameters = nullptr,
                    unsigned int dataSize = 0,
                    bool isChi2Fit = false);

A function object implementing the ROOT::Math::IBaseFunctionMultiDim interface.

bool Fitter::SetFCN(const ROOT::Math::IBaseFunctionMultiDim &f,
                    const double *initialParameters = nullptr,
                    unsigned int dataSize = 0,
                    bool isChi2Fit = false);

A function object implementing the ROOT::Math::FitMethodFunction interface. This is an interface class that extends ROOT::Math::IBaseFunctionMultiDim with some additional functions which can be used when fitting is done. The extra functionality is required by some fitting algorithms like FUMILI or GSLMultiFit.

bool Fitter::SetFCN(const ROOT::Math::FitMethodFunction &f,
                    const double *initialParameters = nullptr,
                    unsigned int dataSize = 0);

An old-Minuit like FCN interface (this is a free function with the signature fcn(int &npar, double *gin, double &f, double *u, int flag).

typedef void (*MinuitFCN)(int &npar, double *gin, double &f, double *u, int flag);
bool Fitter::SetFCN(MinuitFCN fcn, int npar,
                    const double *initialParameters = nullptr,
                    unsigned int dataSize = 0,
                    bool isChi2Fit = false)

Example: simultaneous fit of two histograms

One good example that covers most of the ROOT::Fit features is the simultaneous fit of two histograms (see the combinedFit.C / combinedFit.py tutorial).

Computing confidence intervals

With the fit result object returned by Fitter::Result(), you can compute the confidence intervals after the fit (see ROOT::Fit::FitResult::GetConfidenceIntervals). Given an input dataset (for example a BinData object) and a confidence level value (for example 68%), it computes the lower and upper band values of the model function at the given data points.

You can take a loot at the ConfidenceIntervals.C tutorial for an example.

Using the Fit Panel

After you have drawn a histogram (see → Drawing a histograms), you can use the Fit Panel for fitting the data. The Fit Panel is best suited for prototyping

The following section describes how to use the Fit Panel using an example.

Given is a histogram following a Gaussian distribution.

TH1F *h1 = new TH1F("h1", "h1", 200, -5,5);
TF1 *f1 = new TF1("f1", "[2]*TMath::Gaus(x,[0],[1])");
f1->SetParameters(1,1,1);
h1->FillRandom("f1");
h1->Draw();

Right-click on the object and then click FitPanel.
You also can select Tools and then click Fit Panel.

Figure: FitPanel in the context menu.

The Fit Panel is displayed.

Figure: Fit Panel.

In the Fit Function section you can select the function that should be used for fitting.
The following types of functions are listed here:

Pre-defined functions that will depend on the dimensionality of the data.
Functions present in gDirectory. These functions were already created by the user through a ROOT macro.
Previously used functions. Functions that fitted the current data previously, if the data is able to store such functions.

Select a fitting function.

Click Set Parameters... to set the parameters of the selected function.

The Set Parameters of... dialog window is displayed.

Figure: Set Parameters of… dialog window.

Set the parameters for the fit function.
In the General tab, select the general options for fitting.
This includes the method that will be used, as well as what fit options will be used with it and the draw options. You can also constrain the range of the function used for the fitting.
In the Minimization tab, select the minimization algorithm for fitting.
Click Fit.

Figure: A fitted histogram.