Fitting histograms

Fitting is the method for modeling the expected distribution of events in a physics data analysis. ROOT offers various options to perform the fitting of the data:

  • Fit Panel: After a histogram is drawn, the Fit Panel GUI is best used for prototyping the fit.
  • Fit() method: You can fit histograms and graphs programmatically with the Fit() method.
  • Minimization packages: ROOT provides several minimization packages.
  • RooFit: The RooFit library is a toolkit for modeling the expected distribution of events in a physics analysis.
→ Fit tutorials

Using the Fit Panel

After you have drawn a histogram (see → Drawing a histogram), you can use the Fit Panel for fitting the data. The Fit Panel is best suited for prototyping.

The following section describes how to use the Fit Panel using an example.

Given is a histogram following a Gaussian distribution.

   TH1F *h1 = new TH1F("h1", "h1", 200, -5,5);
   TF1 *f1 = new TF1("f1", "[2]*TMath::Gaus(x,[0],[1])");
   f1->SetParameters(1,1,1);
   h1->FillRandom("f1");
   h1->Draw();
  • Right-click on the object and then click FitPanel.
    You can also select Tools and then click Fit Panel.

Figure: FitPanel in the context menu.

The Fit Panel is displayed.

Figure: Fit Panel.

In the Fit Function section you can select the function that should be used for fitting.
The following types of functions are listed here:

  • Pre-defined functions that will depend on the dimensionality of the data.

  • Functions present in gDirectory. These functions were already created by the user through a ROOT macro.

  • Previously used functions. Functions that were previously fitted to the current data, if the data object is able to store such functions.

Select a fitting function.

  • Click Set Parameters... to set the parameters of the selected function.

The Set Parameters of... dialog window is displayed.

Figure: Set Parameters of… dialog window.

  • Set the parameters for the fit function.

  • In the General tab, select the general options for fitting.
    This includes the method that will be used, as well as what fit options will be used with it and the draw options. You can also constrain the range of the function used for the fitting.

  • In the Minimization tab, select the minimization algorithm for fitting.

  • Click Fit.

Figure: A fitted histogram.

Using the Fit() method

The Fit() method is implemented, among other classes, for histograms (TH1::Fit()) and graphs (TGraph::Fit()).

Using TH1::Fit()

  • Use the TH1::Fit() method to fit a histogram programmatically.
    By default, the fitted function object is added to the histogram and is drawn in the current pad.

The signature is:

   TFitResultPtr Fit(TF1 *function, Option_t *option, Option_t *goption,
   Axis_t xxmin, Axis_t xxmax)

function: Pointer to the fitted function (the fit model) object.

option: The fitting option, with the following options:

  • W: Sets all weights to 1 for non-empty bins; ignores error bars.
  • WW: Sets all weights to 1 including empty bins; ignores error bars.
  • I: Uses integral of function in bin instead of value at bin center.
  • L: Uses a log likelihood method (default is chi-square method). To be used when the histogram represents counts.
  • WL: Weighted log likelihood method. To be used when the histogram has been filled with weights different than 1.
  • P: Uses Pearson chi-square method. Uses expected errors instead of the observed one given by TH1::GetBinError() (default case). The expected error is instead estimated from the square-root of the bin function value.
  • Q: Quiet mode (minimum printing).
  • V: Verbose mode (default is between Q and V).
  • S: The result of the fit is returned in the TFitResultPtr.
  • E: Performs better error estimation using the Minos technique.
  • M: Improves fit results by using the IMPROVE algorithm of TMinuit.
  • R: Uses the range specified in the function range.
  • N: Does not store the graphics function, does not draw.
  • 0: Does not plot the result of the fit. By default the fitted function is drawn unless the option N is specified.
  • +: Adds this new fitted function to the list of fitted functions (by default, the previous function is deleted and only the last one is kept).
  • B: Use this option when you want to fix one or more parameters and the fitting function is a predefined one, like polN, expo, landau, gaus.
    Note that in case of pre-defined functions, some default initial values and limits are set.
  • C: In case of linear fitting, do not calculate the chi-square (saves time).
  • F: If fitting a linear function (e.g., polN), switch to use the default minimizer (e.g., TMinuit). By default, polN functions are fitted by the linear fitter.

goption: The graphics option, which is the same as for TH1::Draw().

xxmin, xxmax: Specifies the range over which to apply the fit.

Using TGraph::Fit()

The signature for fitting a TGraph is the same as for TH1.

The following options apply only when fitting histograms:

  • L: Uses a log likelihood method (default is chi-square method). To be used when the histogram represents counts.
  • WL: Weighted log likelihood method. To be used when the histogram has been filled with weights different than 1.
  • I: Uses integral of function in bin instead of value at bin center.

The following options apply only to TGraph::Fit():

  • EX0: When fitting a TGraphErrors or a TGraphAsymmErrors, the errors on the coordinates are not used in the fit.
  • ROB: Uses robust fitting in case of linear fitting. Computes the least trimmed squares (LTS) regression coefficients (robust (resistant) regression), using the default fraction of good points.
  • ROB=0.x: As above, but compute the LTS regression coefficients, using 0.x as a fraction of good points.

Using the TF1 function class

The following section describes how to use the TF1 class for fitting histograms and graphs.

Fitting 1-D histograms with pre-defined functions

  • Use the TH1::Fit() method to fit a 1-D histogram with a pre-defined function. The name of the pre-defined function is the first parameter. For pre-defined functions, you do not need to set initial values for the parameters.

Example

A histogram object hist is fit with a Gaussian:

   root[] hist.Fit("gaus");

The following pre-defined functions are available:

  • gaus: Gaussian function with three parameters: f(x) = p0*exp(-0.5*((x-p1)/p2)^2)

  • expo: Exponential function with two parameters: f(x) = exp(p0+p1*x)

  • polN: Polynomial of degree N, where N is a number between 0 and 9: f(x) = p0 + p1*x + p2*x^2 +...

  • chebyshevN: Chebyshev polynomial of degree N, where N is a number between 0 and 9: f(x) = p0 + p1*x + p2*(2*x^2-1) +...

  • landau: Landau function with mean and sigma. This function has been adapted from the CERNLIB routine G110 denlan (see → TMath::Landau).

  • gausn: Normalized form of the Gaussian function with three parameters: f(x) = p0*exp(-0.5*((x-p1)/p2)^2)/(p2*sqrt(2*pi))
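
The practical difference between gaus and gausn is the meaning of p0: for gaus it is the height of the peak, for gausn it is the area under the curve. The following plain C++ sketch (not ROOT code) writes out both formulas from the list above and checks the area numerically. The helper names gausValue, gausnValue, and midpointIntegral are hypothetical.

```cpp
#include <cmath>

// gaus: p[0] is the peak height, p[1] the mean, p[2] the sigma.
double gausValue(double x, const double* p) {
    const double arg = (x - p[1]) / p[2];
    return p[0] * std::exp(-0.5 * arg * arg);
}

// gausn: same shape, divided by p[2]*sqrt(2*pi), so p[0] becomes the area.
double gausnValue(double x, const double* p) {
    const double sqrt2Pi = std::sqrt(2.0 * std::acos(-1.0));
    return gausValue(x, p) / (p[2] * sqrt2Pi);
}

// Midpoint-rule integral of f over [a, b] using n equal steps.
double midpointIntegral(double (*f)(double, const double*), const double* p,
                        double a, double b, int n) {
    const double h = (b - a) / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += f(a + (i + 0.5) * h, p);
    return sum * h;
}
```

Integrating gausnValue over a range that covers many sigmas around the mean returns approximately p[0], while gausValue evaluated at the mean returns p[0] itself.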

Fitting 1-D histograms with user-defined functions

First you create a TF1 object, then you use the name of the TF1 fitting function in the Fit() method.

You can create the TF1 fitting function as follows:

  • from an existing expression defined in TFormula (with and without parameters),

  • by defining your own function.

Creating a TF1 fitting function with a TFormula expression

Example

The TF1 constructor is used with the formula sin(x)/x.

   root[] TF1 *f1 = new TF1("f1","sin(x)/x",0,10)

You can also use a TF1 object in the constructor of another TF1 object.

   root[] TF1 *f2 = new TF1("f2","f1*2",0,10)

Example

The TF1 constructor is used with the formula x*sin(x) and two parameters.
The parameter index is enclosed in square brackets.

   root[] TF1 *f1 = new TF1("f1","[0]*x*sin([1]*x)",-3,3);

Use SetParameter() to set the initial values.

   root[] f1->SetParameter(0,10);

Use SetParameters() to set multiple parameters at once.

   root[] f1->SetParameters(10,5);

Creating a user TF1 fitting function

You can define your own function and then pass the function pointer to the TF1 constructor. Your function for a TF1 constructor must have the following signature:

   Double_t fitf(Double_t *x,Double_t *par)

Double_t *x: Pointer to the variable array. This array must be a 1-D array with x[0] = x in case of a 1-D histogram, x[0] = x, x[1] = y for a 2-D histogram, etc.

Double_t *par: Pointer to the parameter array. par contains the current values of parameters when it is called by the FCN() function.

Example

A 1-D histogram is fit with a user-defined function.
See also the fitexample.C tutorial.

// Define a function with three parameters.
   Double_t fitf(Double_t *x,Double_t *par) {
      Double_t arg = 0;
      if (par[2]!=0) arg = (x[0] - par[1])/par[2];
      Double_t fitval = par[0]*TMath::Exp(-0.5*arg*arg);
      return fitval;
   }

Now the fitf function is used to fit the histogram.

   void fitexample() {

// Open a ROOT file and get a histogram.
   TFile *f = new TFile("hsimple.root");
   TH1F *hpx = (TH1F*)f->Get("hpx");

// Create a TF1 object using the fitf function.
// The arguments -3 and 3 set the function range; the last argument (3) is the number of parameters.
   TF1 *func = new TF1("fit",fitf,-3,3,3);

// Set the initial parameter values: a constant of 500, and the mean and RMS of the histogram.
   func->SetParameters(500,hpx->GetMean(),hpx->GetRMS());

// Give the parameters names.
   func->SetParNames ("Constant","Mean_value","Sigma");

// Call TH1::Fit with the name of the TF1 object.
   hpx->Fit("fit");
   }

Accessing the fitted function parameters and results

Examples

   root[] TF1 *fit = hist->GetFunction(function_name);
   root[] Double_t chi2 = fit->GetChisquare();

// Value of the first parameter:
   root[] Double_t p1 = fit->GetParameter(0);

// Error of the first parameter:
   root[] Double_t e1 = fit->GetParError(0);

Configuring the fit

The following configuration actions are available when fitting a histogram or graph using the Fit() method:

Fixing and setting parameter bounds

For pre-defined functions like polN, expo, gaus, and landau, the parameter initial values are set automatically.

For functions that are not pre-defined, the fit parameters must be initialized before invoking the Fit() method. You can also set a lower and an upper bound for a parameter with SetParLimits():

   func->SetParLimits(0,-1,1);

When the lower and upper limits are equal, the parameter is fixed.

Example

Parameter 4 is fixed at 10.

   func->SetParameter(4,10);
   func->SetParLimits(4,10,10);

Example

You can also fix a parameter with FixParameter(). Here, parameter 4 is fixed at 0.

   func->SetParameter(4,0);
   func->FixParameter(4,0);

You do not need to set the limits for all parameters.

Example

Consider a function with 6 parameters. A setup like the following is then possible: Parameters 0 to 2 and parameter 5 can vary freely, parameter 3 has boundaries [-10, 4] with the initial value -1.5, and parameter 4 is fixed to 0.

   func->SetParameters(0,3.1,1.e-6,-1.5,0,100);
   func->SetParLimits(3,-10,4);
   func->FixParameter(4,0);

Fitting subranges

By default, TH1::Fit() fits the function on the defined histogram range. You can specify the R option in the second parameter of TH1::Fit() to restrict the fit to the range specified in the TF1 constructor.

Example

The fit will be limited to -3 to 3, the range specified in the TF1 constructor:

   root[] TF1 *f1 = new TF1("f1","[0]*x*sin([1]*x)",-3,3);
   root[] hist->Fit("f1","R");

You can also specify a range in the call to TH1::Fit().

   root[] hist->Fit("f1","","",-2,2)

See also the ROOT macros myfit.C and multifit.C for more detailed examples.

Fitting multiple subranges

You can find a ROOT macro for fitting multiple subranges at multifit.C. It shows how to use several Gaussian functions with different parameters on separate subranges of the same histogram.

Example

Four TF1 objects are created: one for each of the three subranges and one for their sum.

   TF1 *g1 = new TF1("m1","gaus",85,95);
   TF1 *g2 = new TF1("m2","gaus",98,108);
   TF1 *g3 = new TF1("m3","gaus",110,121);

// The total is the sum of the three, each has three parameters.
   TF1 *total = new TF1("mstotal","gaus(0)+gaus(3)+gaus(6)",85,125);

The histogram is filled with the bin contents stored in the array x, which has np entries.

   TH1F *h = new TH1F("g1","Example of several fits in subranges",np,85,134);
   h->SetMaximum(7);
   for (int i=0; i<np; i++) {
      h->SetBinContent(i+1,x[i]);
   }

// Define the parameter array for the total function.
   Double_t par[9];

When fitting simple functions, such as a Gaussian, the initial values of the parameters are automatically computed. In the more complicated case of the sum of three Gaussian functions, the initial values of parameters must be set. In this particular case, the initial values are taken from the result of the individual fits.

// Fit each function and add it to the list of functions.
   h->Fit(g1,"R");
   h->Fit(g2,"R+");
   h->Fit(g3,"R+");

// Get the parameters from the fit
   g1->GetParameters(&par[0]);
   g2->GetParameters(&par[3]);
   g3->GetParameters(&par[6]);

// Use the parameters on the sum
   total->SetParameters(par);
   h->Fit(total,"R+");
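
The formula "gaus(0)+gaus(3)+gaus(6)" used for the total function means that each Gaussian reads three consecutive parameters, starting at the index given in parentheses. This is why the results of the three individual fits are copied to par[0], par[3], and par[6]. The layout can be sketched in plain C++ (not ROOT code). The function names gausAt and threeGaus are hypothetical.

```cpp
#include <cmath>

// One Gaussian that reads constant, mean, and sigma from
// p[offset], p[offset + 1], and p[offset + 2].
double gausAt(double x, const double* p, int offset) {
    const double arg = (x - p[offset + 1]) / p[offset + 2];
    return p[offset] * std::exp(-0.5 * arg * arg);
}

// Plain C++ equivalent of the formula "gaus(0)+gaus(3)+gaus(6)"
// evaluated on a 9-element parameter array.
double threeGaus(double x, const double* p) {
    return gausAt(x, p, 0) + gausAt(x, p, 3) + gausAt(x, p, 6);
}
```

With well-separated peaks, the value of the sum at the mean of the first Gaussian is dominated by par[0], which is why the individual fit results are good starting values for the combined fit.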

Adding functions to the list

The example $ROOTSYS/tutorials/fit/multifit.C illustrates how to fit several functions on the same histogram. By default a fit command deletes the previously fitted function in the histogram object. You can specify the option + in the second parameter to add the newly fitted function to the existing list of functions for the histogram.

   hist->Fit("f1","+","",-2,2)

Note that the fitted function(s) are saved with the histogram when it is written to a ROOT file.

Result of the fit

You can obtain the following results of a fit:

Associated function

One or more objects (typically a TF1*) can be added to the list of functions (fFunctions) associated to each histogram. TH1::Fit() adds the fitted function to this list.

Given a histogram h, you can retrieve the associated function with:

   TF1 *myfunc = h->GetFunction("myfunc");

Accessing the fit parameters and results

If the histogram or graph is made persistent, the list of associated functions is also persistent.

Retrieve a pointer to the function with the TH1::GetFunction() method. Then you can retrieve the fit parameters from the function.

Example

   TF1 *fit = hist->GetFunction(function_name);
   Double_t chi2 = fit->GetChisquare();

// Value of the first parameter.
   Double_t p1 = fit->GetParameter(0);

// Error of the first parameter.
   Double_t e1 = fit->GetParError(0);

With the fit option S, you can access the full result of the fit including the covariance and correlation matrix.

Associated errors

By default, for each bin, the sum of weights is computed at fill time. You can also call TH1::Sumw2() to force the storage and computation of the sum of the square of weights per bin. If Sumw2() has been called, the error per bin is computed as the sqrt(sum of squares of weights). Otherwise, the error is set equal to the sqrt(bin content).

To return the error for a given bin number, use:

   Double_t error = h->GetBinError(bin);
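
The two error definitions can be compared with a small plain C++ sketch (not ROOT code). The struct BinSums is a hypothetical stand-in for the sums that a histogram keeps for one bin.

```cpp
#include <cmath>

// Hypothetical stand-in for the per-bin sums kept by a histogram.
struct BinSums {
    double sumW = 0.0;   // sum of weights, that is the bin content
    double sumW2 = 0.0;  // sum of squared weights
    void fill(double w) { sumW += w; sumW2 += w * w; }
};

// Error when the squared weights are tracked: sqrt(sum of squares of weights).
double errorWithSumw2(const BinSums& b) { return std::sqrt(b.sumW2); }

// Error when only the content is known: sqrt(bin content).
double errorFromContent(const BinSums& b) { return std::sqrt(std::fabs(b.sumW)); }
```

For unit weights both definitions agree. For three entries of weight 2, the content is 6 and the sum of squared weights is 12, so the two errors differ (about 2.45 versus about 3.46).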

Empty bins are excluded in the fit when using the Chi-square fit method. When fitting a histogram representing counts (that is, with Poisson statistics), it is recommended to use the Log-Likelihood method (option L or WL), particularly in case of low statistics.
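
One reason the likelihood method behaves better at low statistics is that empty bins still carry information. The following plain C++ sketch shows a textbook Poisson negative log-likelihood for binned counts. It is an illustration under stated assumptions, not the expression ROOT evaluates internally, and the function name poissonNll is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Negative log-likelihood for Poisson-distributed bin counts.
// The ln(n!) term is dropped because it does not depend on the model.
// Assumption: every expected value is strictly positive.
double poissonNll(const std::vector<double>& counts,
                  const std::vector<double>& expected) {
    double nll = 0.0;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        nll += expected[i];  // contributes even when the bin is empty
        if (counts[i] > 0.0) nll -= counts[i] * std::log(expected[i]);
    }
    return nll;
}
```

An empty bin adds its expected value to the sum, so a model that predicts many events where none were observed is penalized, whereas a chi-square fit that skips empty bins ignores this.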

Fit statistics

You can change the statistics box to display the fit parameters with the TStyle::SetOptFit() method. This parameter has four digits: mode = pcev (default = 0111)

  • p = 1: Print probability.
  • c = 1: Print Chi-square/number of degrees of freedom.
  • e = 1: Print errors (if e=1, v must be 1).
  • v = 1: Print name/values of parameters.

Example

To print the fit probability, parameter names, values, and errors, use:

   gStyle->SetOptFit(1011);
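
The mode is read digit by digit as a decimal number. The following plain C++ sketch (not ROOT code) decodes a mode value into the four flags listed above. The names OptFitFlags and decodeOptFit are hypothetical.

```cpp
// Flags encoded in the decimal digits of the mode, read as "pcev".
struct OptFitFlags {
    int probability;  // p
    int chi2;         // c
    int errors;       // e
    int values;       // v
};

// Splits the mode into its decimal digits, lowest digit first.
OptFitFlags decodeOptFit(int mode) {
    OptFitFlags flags;
    flags.values = mode % 10;
    flags.errors = (mode / 10) % 10;
    flags.chi2 = (mode / 100) % 10;
    flags.probability = (mode / 1000) % 10;
    return flags;
}
```

Note that in C++ an integer literal with a leading zero, such as 0111, is an octal number (decimal 73). When writing the mode in code, omit the leading zeros, for example 111 instead of 0111.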

Using ROOT::Fit classes

ROOT::Fit is the namespace for fitting classes (regression analysis). The fitting classes are part of the MathCore library.
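The classes used in this section can be grouped as follows:

  • Classes for the input data: ROOT::Fit::BinData and ROOT::Fit::UnBinData, together with ROOT::Fit::DataOptions and ROOT::Fit::DataRange.
  • Classes for configuring the fit: ROOT::Fit::FitConfig and ROOT::Fit::ParameterSettings.
  • The class for performing the fit: ROOT::Fit::Fitter.
  • The class holding the result of the fit: ROOT::Fit::FitResult.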

In addition, the fitter classes use the generic interfaces for parametric function evaluations, ROOT::Math::IParametricFunctionMultiDim to define the fitting model function, and ROOT::Math::Minimizer interface to perform the minimization of the target function.

Creating the input fit data

There are two types of input data: binned data (ROOT::Fit::BinData) and un-binned data (ROOT::Fit::UnBinData).

Using binned data

Example

Consider a histogram, represented as a TH1 type object. A ROOT::Fit::BinData object is created and filled from it.

   ROOT::Fit::DataOptions opt;
   opt.fIntegral = true;
   ROOT::Fit::BinData data(opt);

// Fill the bin data by using the histogram.
   TH1 * h1 = (TH1*) gDirectory->Get("myHistogram");
   ROOT::Fit::FillData(data, h1);

By using ROOT::Fit::DataOptions some fitting options are controlled and by using ROOT::Fit::DataRange you can specify the data range.

Example

This example shows how to use the integral of the function in each bin instead of the function value at the bin center when performing the fit, and how to restrict the data to the range between xmin and xmax.

   ROOT::Fit::DataOptions opt;
   opt.fIntegral = true;
   ROOT::Fit::DataRange range(xmin,xmax);
   ROOT::Fit::BinData data(opt,range);

// Fill the bin data using the histogram.
// You can do this by using the following helper function from the histogram library.
   TH1 * h1 = (TH1*) gDirectory->Get("myHistogram");
   ROOT::Fit::FillData(data, h1);
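
Whether the integral option matters depends on how much the function changes within one bin. The following plain C++ sketch (not ROOT code) compares the two estimates of the expected bin content for a falling exponential. The function names are hypothetical, and the model exp(-x) is chosen only because its integral is known analytically.

```cpp
#include <cmath>

// Model used for this illustration: a falling exponential.
double expModel(double x) { return std::exp(-x); }

// Expected bin content estimated from the value at the bin center.
double contentFromCenter(double lo, double hi) {
    return expModel(0.5 * (lo + hi)) * (hi - lo);
}

// Expected bin content from the analytic integral of exp(-x) over [lo, hi].
double contentFromIntegral(double lo, double hi) {
    return std::exp(-lo) - std::exp(-hi);
}
```

For a wide bin such as [0, 2], the center estimate is about 0.74 while the integral is about 0.86. For a narrow bin such as [0, 0.01], the two agree to better than one part in a million of the bin width, so the option mainly matters for coarse binning or steep functions.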

Using un-binned data

For creating un-binned data sets, there are two possibilities:

  1. Copy the data inside a ROOT::Fit::UnBinData object.
    Create an empty ROOT::Fit::UnBinData object, iterate on the data and add the data point one by one. An input ROOT::Fit::DataRange object is passed in order to copy the data according to the given range.
  2. Use ROOT::Fit::UnBinData as a wrapper to an external data storage.
    In this case the ROOT::Fit::UnBinData object is created from an iterator or pointers to the data and the data are not copied inside. The data cannot be selected according to a specified range. All the data points will be included in the fit.

ROOT::Fit::UnBinData also supports weighted data. In addition to the data points (coordinates), which can be of arbitrary dimension k, the class can be constructed from a vector of weights.

Example

Data are taken from a histogram ( TH1 object).

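   double * buffer = histogram->GetBuffer();

// The number of entries is the first element of the buffer.
   int n = buffer[0];

// When creating the data object, it is important to create it with the size of the data.
   ROOT::Fit::UnBinData data(n);

// After the entry count, the buffer of a 1-D histogram stores (weight, x) pairs,
// so the x value of entry i is at index 2*i+2.
   for (int i = 0; i < n; ++i)
      data.Add(buffer[2*i+2]);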

Example

In this example a two-dimensional UnBinData object is created with the contents from a tree.

   TFile * file = TFile::Open("hsimple.root");
   TTree *ntuple = 0; file->GetObject("ntuple",ntuple);

// Select from the tree the data that should be used for fitting.
// Use TTree::Draw.
   int nevt = ntuple->Draw("px:py","","goff");
   double * x = ntuple->GetV1();
   double * y = ntuple->GetV2();
   ROOT::Fit::UnBinData data(nevt, x, y );

Creating a fit model

To fit a data set, a model is needed to describe the data, such as a probability density function (PDF) describing the observed data or a hypothetical function describing the relationship between the independent variables X and the single dependent variable Y. The model can have any number of k independent variables. For example, in fitting a k-dimensional histogram, the independent variables X are the coordinates of the bin centers and Y is the bin weight.

The model function needs to be expressed as function of some unknown parameters. The fitting will find the best parameter value to describe the observed data.

You can for example use the TF1 class, the parametric function class, to describe the model function. But the ROOT::Fit::Fitter class takes as input a more general parametric function object, the abstract interface class ROOT::Math::IParametricFunctionMultiDim. It describes a generic one-dimensional or multi-dimensional function with parameters. This interface extends the abstract ROOT::Math::IBaseFunctionMultiDim class with methods to set or retrieve parameter values and to evaluate the function given by the independent vector of values X and the vector of parameters P.

You can convert a TF1 object into a ROOT::Math::IParametricFunctionMultiDim by using the wrapper class ROOT::Math::WrappedMultiTF1.

Example

   TF1 * f1 = new TF1("f1","gaus");
   ROOT::Math::WrappedMultiTF1 fitFunction(f1, f1->GetNdim() );
   ROOT::Fit::Fitter fitter;
   fitter.SetFunction( fitFunction, false);

When creating a wrapper, the parameter values stored in TF1 are copied to the ROOT::Math::WrappedMultiTF1 object. The function object representing the model function is given to the ROOT::Fit::Fitter class using the Fitter::SetFunction method.

You can also provide a function object that implements the derivatives of the function with respect to the parameters. In this case you must provide the function object as a class deriving from the ROOT::Math::IParametricGradFunctionMultiDim interface.

Note that the ROOT::Math::WrappedMultiTF1 wrapper class also implements the gradient interface, internally using TF1::GradientPar, which is based on numerical differentiation, except for linear functions (that is, when TF1::IsLinear() is true). The parameter derivatives of the model function can be useful to some minimization algorithms, such as FUMILI (see → FUMILI). However, in general it is better to let the minimization algorithm (for example TMinuit, see → TMinuit) compute the needed derivatives using its own customised numerical differentiation algorithm. To avoid providing the parameter derivatives to the fitter, pass false as the second argument of Fitter::SetFunction.
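
To illustrate what a numerical parameter derivative is, the following plain C++ sketch estimates the derivatives of a Gaussian model with respect to its parameters by central differences. It is a generic illustration and makes no claim about how TF1::GradientPar or a minimizer chooses its step sizes. The names gausModel and parameterGradient and the default relative step are assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Example model: Gaussian with p = {constant, mean, sigma}.
double gausModel(double x, const std::vector<double>& p) {
    const double arg = (x - p[1]) / p[2];
    return p[0] * std::exp(-0.5 * arg * arg);
}

// Central-difference estimate of d(model)/d(p[k]) at the point x.
// The parameter vector is taken by value so the caller's copy is untouched.
std::vector<double> parameterGradient(double x, std::vector<double> p,
                                      double relStep = 1e-5) {
    std::vector<double> grad(p.size());
    for (std::size_t k = 0; k < p.size(); ++k) {
        const double saved = p[k];
        const double scale = std::fabs(saved) > 1.0 ? std::fabs(saved) : 1.0;
        const double h = relStep * scale;
        p[k] = saved + h;
        const double up = gausModel(x, p);
        p[k] = saved - h;
        const double down = gausModel(x, p);
        p[k] = saved;  // restore before moving to the next parameter
        grad[k] = (up - down) / (2.0 * h);
    }
    return grad;
}
```

Each derivative costs two extra model evaluations per parameter and per point, which is one reason why supplying analytic derivatives can be attractive for algorithms that use them heavily.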

Configuring the fit

Use the ROOT::Fit::FitConfig class, which contains a ROOT::Fit::ParameterSettings object for each fit parameter, for configuring the fit.

There are the following fit configurations:

  • Setting the initial values of the parameters.
  • Setting the parameter step sizes.
  • Setting parameter bounds, if any.
  • Setting the minimizer library and the particular algorithm to use.
  • Setting different minimization options (print level, tolerance, max iterations, etc.).
  • Setting the type of parameter errors to compute (parabolic errors, Minos errors, re-normalized errors using fitted chi2 values).

Example

Setting the lower and upper bounds for the first parameter (index 0) and a lower bound for the third parameter (index 2):

   fitter.SetFunction( fitFunction, false);
   fitter.Config().ParSettings(0).SetLimits(0,1.E6);
   fitter.Config().ParSettings(2).SetLowerLimit(0);

Note that a ROOT::Fit::ParameterSettings object exists for each fit parameter and is created by the ROOT::Fit::FitConfig class after the model function has been set in the fitter. Only when the function is set is the number of parameters known, and FitConfig then automatically creates the corresponding ParameterSettings objects.

Various minimizers can be used in the fitting process. They can be implemented in different libraries and loaded at run time. Each different minimizer (for example Minuit, Minuit2, FUMILI, etc.) consists of a different implementation of the ROOT::Math::Minimizer interface. Within the same minimizer, thus within the same class implementing the Minimizer interface, different algorithms exist.

If the requested minimizer is not available in ROOT, the default one is used. The default minimizer type and algorithm can be specified by using the static function ROOT::Math::MinimizerOptions::SetDefaultMinimizer("minimizerName").

Performing the fit

Depending on the available input data and the selected function for fitting, you can use one of the methods of the ROOT::Fit::Fitter class to perform the fit.

Pre-defined fitting methods

The following pre-defined fitting methods are available:

User-defined fitting methods

You can also implement your own fitting methods, that is, your own version of the method function, using your own data set objects and functions.

Use ROOT::Fit::Fitter::SetFCN to set the method function and ROOT::Fit::Fitter::FitFCN to perform the fit.
You can also pass the method function directly to ROOT::Fit::Fitter::FitFCN, but in this case a previously defined fitting configuration is used.

The possible types of method functions that can be passed to ROOT::Fit::Fitter::SetFCN are:

  • A generic functor object implementing operator()(const double * p), where p is the parameter vector. In this case you need to pass the number of parameters, the function object, and optionally a vector of initial parameter values. Other optional parameters include the size of the data set and a flag specifying whether it is a chi2 (least-square) fit. If the last two parameters are given, the chi2/ndf can be computed after fitting the data.
   template <class Function>
   bool Fitter::SetFCN(unsigned int npar, Function & f, const double * initialParameters = 0, unsigned int dataSize=0, bool isChi2Fit = false)
  • A function object implementing the ROOT::Math::IBaseFunctionMultiDim interface.
   bool Fitter::SetFCN(const ROOT::Math::IBaseFunctionMultiDim & f, const double * initialParameters = 0, unsigned int dataSize=0, bool isChi2Fit = false)
  • A function object implementing the ROOT::Math::FitMethodFunction interface.
   bool Fitter::SetFCN(const ROOT::Math::FitMethodFunction & f, const double * initialParameters = 0, unsigned int dataSize=0)
  • An old-Minuit like FCN interface (a free function with the signature fcn(int &npar, double *gin, double &f, double *u, int flag)).
   typedef void(* MinuitFCN)(int &npar, double *gin, double &f, double *u, int flag)
   bool Fitter::SetFCN(MinuitFCN fcn, int npar, const double * initialParameters = 0, unsigned int dataSize=0, bool isChi2Fit = false)
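
A functor of the first kind can be as small as the following plain C++ sketch. It only shows the shape of the object, a call operator that takes the parameter vector and returns the value to be minimized, and it contains no ROOT calls. The struct name LineChi2 and its data members are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Least-squares objective for a straight line y = p[0] + p[1]*x.
// The object owns its data; the call operator receives only the parameters.
struct LineChi2 {
    std::vector<double> x;    // measured positions
    std::vector<double> y;    // measured values
    std::vector<double> err;  // errors on y, assumed to be non-zero

    double operator()(const double* p) const {
        double chi2 = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i) {
            const double pull = (y[i] - (p[0] + p[1] * x[i])) / err[i];
            chi2 += pull * pull;
        }
        return chi2;
    }
};
```

According to the first signature quoted above, such an object would be passed together with the number of parameters (here 2), and optionally the number of data points and a flag marking it as a chi2 fit, so that chi2/ndf can be computed afterwards.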

Fit result

The result of the fit is contained in the ROOT::Fit::FitResult object.

You can print the result of the fit with the FitResult::Print() method.

By using ROOT::Fit::FitResult, you can compute the confidence intervals after the fit with ROOT::Fit::FitResult::GetConfidenceIntervals. Given an input data set (for example a BinData object) and a confidence level value (for example 68%), it computes the lower and upper band values of the model function at the given data points.
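
The idea behind such a band can be sketched with standard linear error propagation: the variance of the model value at one point is g^T C g, where g is the gradient of the model with respect to the parameters and C is the parameter covariance matrix. The following plain C++ sketch computes only this one-sigma half-width. Any scaling to a requested confidence level is omitted, and the sketch is not a description of the ROOT implementation. The function name bandHalfWidth is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One-sigma half-width of the model band at a single point:
// sqrt(g^T * C * g), with g the parameter gradient of the model at that
// point and C the covariance matrix of the fitted parameters.
double bandHalfWidth(const std::vector<double>& grad,
                     const std::vector<std::vector<double>>& cov) {
    double variance = 0.0;
    for (std::size_t i = 0; i < grad.size(); ++i) {
        for (std::size_t j = 0; j < grad.size(); ++j) {
            variance += grad[i] * cov[i][j] * grad[j];
        }
    }
    // Guard against tiny negative values caused by rounding.
    return std::sqrt(variance > 0.0 ? variance : 0.0);
}
```

For a straight line y = p0 + p1*x, the gradient at x is {1, x}, so the band width depends on x and on the correlation between the offset and the slope.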

TFitResult

TFitResult is a class deriving from ROOT::Fit::FitResult and providing in addition some convenient methods to return a covariance or correlation matrix as a TMatrixDSym object. Furthermore, TFitResult derives from TNamed and can be conveniently stored in a file.
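
The relation between the two matrices is the usual one: each covariance element is divided by the product of the two corresponding standard deviations. The following plain C++ sketch (not ROOT code) performs this conversion on a matrix stored as nested vectors. The alias Matrix and the function name correlationFromCovariance are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// corr(i,j) = cov(i,j) / sqrt(cov(i,i) * cov(j,j)).
// Assumption: a parameter with zero variance (for example a fixed one)
// gets correlation 0 instead of a division by zero.
Matrix correlationFromCovariance(const Matrix& cov) {
    const std::size_t n = cov.size();
    Matrix corr(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            const double norm = std::sqrt(cov[i][i] * cov[j][j]);
            corr[i][j] = norm > 0.0 ? cov[i][j] / norm : 0.0;
        }
    }
    return corr;
}
```

For a covariance matrix with variances 4 and 9 and covariance 1, the correlation coefficient is 1/6, and the diagonal of the correlation matrix is 1.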

When fitting a histogram (a TH1 object) or a graph (a TGraph object), it is possible to return a TFitResult via the TFitResultPtr object, which behaves as a smart pointer to a TFitResult. TFitResultPtr is the return object of TH1::Fit or TGraph::Fit.

By default, TFitResultPtr contains only the status of the fit, which can be obtained by an automatic conversion of TFitResultPtr to an integer. If the fit option S is used instead, TFitResultPtr contains the TFitResult and behaves as a smart pointer to it.

Example

// TFitResultPtr contains only the fit status.
   int fitStatus = hist->Fit(myFunction);

// TFitResultPtr contains the TFitResult.
   TFitResultPtr r = hist->Fit(myFunction,"S");

// Access the covariance matrix.
   TMatrixDSym cov = r->GetCovarianceMatrix();

// Retrieve the fit chi2.
   Double_t chi2 = r->Chi2();

// Retrieve the value for the parameter 0.
   Double_t par0 = r->Parameter(0);

// Retrieve the error for the parameter 0.
   Double_t err0 = r->ParError(0);

// Print the full information of the fit including covariance matrix.
   r->Print("V");

// Store the result in a ROOT file.
   r->Write();