Fitting histograms

Fitting is the method for modeling the expected distribution of events in a physics data analysis. ROOT offers various options to perform the fitting of the data:

  • Fit Panel: After a histogram is drawn, the Fit Panel GUI is best used for prototyping the fit.
  • Fit() method: You can fit histograms and graphs programmatically with the Fit() method.
  • Minimization packages: ROOT provides several minimization packages like Minuit2 and FUMILI.
  • RooFit: The RooFit library is a toolkit for modeling the expected distribution of events in a physics analysis.
→ Fit tutorials

Using the Fit Panel

After you have drawn a histogram (see → Drawing a histogram), you can use the Fit Panel for fitting the data. The Fit Panel is best suited for prototyping.

The following section describes how to use the Fit Panel using an example.

Given is a histogram following a Gaussian distribution.

   TH1F *h1 = new TH1F("h1", "h1", 200, -5, 5);
// Define the function over the histogram range; without an explicit range, TF1 uses [0, 1].
   TF1 *f1 = new TF1("f1", "[2]*TMath::Gaus(x,[0],[1])", -5, 5);
   f1->SetParameters(1, 1, 1);
   h1->FillRandom("f1");
   h1->Draw();

  • Right-click on the object and then click FitPanel.
    You can also select Tools and then click Fit Panel.

Figure: FitPanel in the context menu.

The Fit Panel is displayed.

Figure: Fit Panel.

In the Fit Function section you can select the function that should be used for fitting.
The following types of functions are listed here:

  • Pre-defined functions that will depend on the dimensionality of the data.

  • Functions present in gDirectory. These functions were already created by the user through a ROOT macro.

  • Previously used functions. Functions that fitted the current data previously, if the data is able to store such functions.

Select a fitting function.

  • Click Set Parameters... to set the parameters of the selected function.

The Set Parameters of... dialog window is displayed.

Figure: Set Parameters of… dialog window.

  • Set the parameters for the fit function.

  • In the General tab, select the general options for fitting.
    This includes the method that will be used, as well as what fit options will be used with it and the draw options. You can also constrain the range of the function used for the fitting.

  • In the Minimization tab, select the minimization algorithm for fitting.

  • Click Fit.

Figure: A fitted histogram.

Using the Fit() method

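The Fit() method is implemented for several classes, including the histogram class TH1 and the graph class TGraph, which are described in the following sections.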

Using TH1::Fit()

  • Use the TH1::Fit() method to fit a histogram programmatically.
    By default, the fitted function object is added to the histogram and is drawn in the current pad.

The signature is:

   TFitResultPtr Fit(TF1 *function, Option_t *option, Option_t *goption,
   Axis_t xxmin, Axis_t xxmax)

function: Pointer to the fitted function (the fit model) object.

option: The fitting option, with the following options:

  • W: Sets all weights to 1 for non empty bins; ignore error bars.
  • WW: Sets all weights to 1 including empty bins; ignore error bars.
  • I: Uses integral of function in bin instead of value at bin center.
  • L: Uses a log likelihood method (default is chi-square method). To be used when the histogram represents counts.
  • WL: Weighted log likelihood method. To be used when the histogram has been filled with weights different than 1.
  • P: Uses Pearson chi-square method. Uses expected errors instead of the observed one given by TH1::GetBinError() (default case). The expected error is instead estimated from the square-root of the bin function value.
  • Q: Quiet mode (minimum printing).
  • V: Verbose mode (default is between Q and V).
  • S: The result of the fit is returned in the TFitResultPtr.
  • E: Performs better error estimation using the Minos technique.
  • M: Improves fit results by using the IMPROVE algorithm of TMinuit.
  • R: Uses the range specified in the function range.
  • N: Does not store the graphics function, does not draw.
  • 0: Does not plot the result of the fit. By default the fitted function is drawn unless the option N is specified.
  • +: Adds this new fitted function to the list of fitted functions (by default, the previous function is deleted and only the last one is kept).
  • B: Use this option when you want to fix one or more parameters and the fitting function is a predefined one, like polN, expo, landau, gaus.
    Note that in case of pre-defined functions, some default initial values and limits are set.
  • C: In case of linear fitting, do not calculate the chi-square (saves time).
  • F: If fitting a linear function (e.g., polN), switch to use the default minimizer (e.g., TMinuit). By default, polN functions are fitted by the linear fitter.

goption: The graphics option, which is the same as for TH1::Draw().

xxmin, xxmax: Specifies the range over which to apply the fit.
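
The difference between the default chi-square and the Pearson chi-square (option P) described in the option list can be sketched in plain C++. This is only an illustration of the two error conventions, not ROOT's internal implementation: the names model, chi2Observed, chi2Pearson and chi2Demo are hypothetical, the model is evaluated at the bin center, and the observed error is assumed to be sqrt(bin content) for unweighted counts.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Illustrative model (not a ROOT function): Gaussian evaluated at x,
// with par[0] = constant, par[1] = mean, par[2] = sigma.
double model(double x, const double *par) {
   const double arg = (x - par[1]) / par[2];
   return par[0] * std::exp(-0.5 * arg * arg);
}

// Default chi-square: uses the observed error of each bin, taken here
// as sqrt(content) for unweighted counts. Empty bins are skipped.
double chi2Observed(const std::vector<double> &centers,
                    const std::vector<double> &contents, const double *par) {
   double chi2 = 0;
   for (std::size_t i = 0; i < centers.size(); ++i) {
      if (contents[i] <= 0) continue;
      const double diff = contents[i] - model(centers[i], par);
      chi2 += diff * diff / contents[i];
   }
   return chi2;
}

// Pearson chi-square (option P): uses the expected error, estimated
// as sqrt(f(x_i)) from the model value in the bin.
double chi2Pearson(const std::vector<double> &centers,
                   const std::vector<double> &contents, const double *par) {
   double chi2 = 0;
   for (std::size_t i = 0; i < centers.size(); ++i) {
      const double expected = model(centers[i], par);
      if (expected <= 0) continue;
      const double diff = contents[i] - expected;
      chi2 += diff * diff / expected;
   }
   return chi2;
}

// Demo entry point; call it from main() or run the file as a macro.
void chi2Demo() {
   const double par[3] = {100, 0, 1};
   const std::vector<double> centers = {-1, 0, 1};
   const std::vector<double> contents = {55, 81, 70};
   std::printf("observed-error chi2 = %g\n", chi2Observed(centers, contents, par));
   std::printf("Pearson chi2        = %g\n", chi2Pearson(centers, contents, par));
}
```

The two values differ only through the denominator: the observed content in the first case, the model value in the second.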

Using TGraph::Fit()

The signature for fitting a TGraph is the same as for TH1.

The following options apply only to fitting histograms:

  • L
  • WL
  • I

The following options apply only to TGraph::Fit():

  • EX0: When fitting a TGraphErrors or a TGraphAsymmErrors, the errors on the coordinates are not used in the fit.
  • ROB: Uses robust fitting in case of linear fitting. Computes the LTS regression coefficients (robust (resistant) regression), using the default fraction of good points.
  • ROB=0.x: As above, but compute the LTS regression coefficients, using 0.x as a fraction of good points.

Using the TF1 function class

The following section describes how to use the TF1 class for fitting histograms and graphs.

Fitting 1-D histograms with pre-defined functions

  • Use the TH1::Fit() method to fit a 1-D histogram with a pre-defined function. The name of the pre-defined function is the first parameter. For pre-defined functions, you do not need to set initial values for the parameters.

Example

A histogram object hist is fit with a Gaussian:

   root[] hist.Fit("gaus");

The following pre-defined functions are available:

  • gaus: Gaussian function with three parameters: f(x) = p0*exp(-0.5*((x-p1)/p2)^2)

  • expo: Exponential function with two parameters: f(x) = exp(p0+p1*x)

  • polN: Polynomial of degree N, where N is a number between 0 and 9: f(x) = p0 + p1*x + p2*x^2 + ...

  • chebyshevN: Chebyshev polynomial of degree N, where N is a number between 0 and 9: f(x) = p0 + p1*x + p2*(2*x^2-1) + ...

  • landau: Landau function with mean and sigma. This function has been adapted from the CERNLIB routine G110 denlan (see → TMath::Landau).

  • gausn: Normalized form of the Gaussian function with three parameters: f(x) = p0*exp(-0.5*((x-p1)/p2)^2)/(p2*sqrt(2*pi))
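
The practical difference between gaus and gausn is the meaning of p0: for gaus it is the peak height, for gausn it is the integral of the function. The following plain C++ sketch checks this numerically using the two formulas from the list. The names gausShape, gausnShape, integrate and gausDemo are hypothetical helpers for this illustration and are not ROOT functions.

```cpp
#include <cmath>
#include <cstdio>

// gaus: f(x) = p0*exp(-0.5*((x-p1)/p2)^2), so p0 is the peak height.
double gausShape(double x, double p0, double p1, double p2) {
   const double arg = (x - p1) / p2;
   return p0 * std::exp(-0.5 * arg * arg);
}

// gausn: the same shape divided by p2*sqrt(2*pi), so p0 is the integral.
double gausnShape(double x, double p0, double p1, double p2) {
   const double pi = std::acos(-1.0);
   return gausShape(x, p0, p1, p2) / (p2 * std::sqrt(2 * pi));
}

// Midpoint-rule integral of f over [lo, hi] with n steps.
double integrate(double (*f)(double, double, double, double),
                 double p0, double p1, double p2,
                 double lo, double hi, int n) {
   const double step = (hi - lo) / n;
   double sum = 0;
   for (int i = 0; i < n; ++i) {
      sum += f(lo + (i + 0.5) * step, p0, p1, p2);
   }
   return sum * step;
}

// Demo entry point; call it from main() or run the file as a macro.
void gausDemo() {
   std::printf("gaus  peak     = %g\n", gausShape(0, 5, 0, 1));
   std::printf("gaus  integral = %g\n", integrate(gausShape, 5, 0, 1, -10, 10, 20000));
   std::printf("gausn integral = %g\n", integrate(gausnShape, 5, 0, 1, -10, 10, 20000));
}
```

When the fitted quantity of interest is the area under a peak, gausn gives it directly as p0, whereas with gaus it has to be computed from p0 and p2.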

Fitting 1-D histograms with user-defined functions

First you create a TF1 object, then use the name of the TF1 fitting function in the Fit() method.

You can create the TF1 fitting function as follows:

  • from an existing expression defined in TFormula (with and without parameters),

  • by defining your own function.

Creating a TF1 fitting function with a TFormula expression

Example

The TF1 constructor is used with the formula sin(x)/x.

   root[] TF1 *f1 = new TF1("f1","sin(x)/x",0,10)

You can also use a TF1 object in the constructor of another TF1 object.

   root[] TF1 *f2 = new TF1("f2","f1*2",0,10)

Example

The TF1 constructor is used with the formula x*sin(x) and two parameters.
The parameter index is enclosed in square brackets.

   root[] TF1 *f1 = new TF1("f1","[0]*x*sin([1]*x)",-3,3);

Use SetParameter() to set the initial values.

   root[] f1->SetParameter(0,10);

Use SetParameters() to set multiple parameters at once.

   root[] f1->SetParameters(10,5);

Creating a user TF1 fitting function

You can define your own function and then pass the function pointer to the TF1 constructor. Your function for a TF1 constructor must have the following signature:

   Double_t fitf(Double_t *x,Double_t *par)

Double_t *x: Pointer to the variable array. This array must be a 1-D array with x[0] = x in case of a 1-D histogram, x[0] = x, x[1] = y for a 2-D histogram, etc.

Double_t *par: Pointer to the parameter array. par contains the current values of parameters when it is called by the FCN() function.

Example

A 1-D histogram is fit with a user-defined function.
See also the fitexample.C tutorial.

// Define a function with 3 parameters
   Double_t fitf(Double_t *x,Double_t *par) {
      Double_t arg = 0;
      if (par[2]!=0) arg = (x[0] - par[1])/par[2];
      Double_t fitval = par[0]*TMath::Exp(-0.5*arg*arg);
      return fitval;
   }

Now the fitf function is used to fit the histogram.

   void fitexample() {

// Open a ROOT file and get a histogram.
   TFile *f = new TFile("hsimple.root");
   TH1F *hpx = (TH1F*)f->Get("hpx");

// Create a TF1 object using the fitf function.
// The last three arguments specify the range (-3 to 3) and the number of parameters (3).
   TF1 *func = new TF1("fit",fitf,-3,3,3);

// Set the initial values: the constant, and the mean and RMS of the histogram.
   func->SetParameters(500,hpx->GetMean(),hpx->GetRMS());

// Give the parameters names.
   func->SetParNames ("Constant","Mean_value","Sigma");

// Call TH1::Fit with the name of the TF1 object.
   hpx->Fit("fit");
   }

Accessing the fitted function parameters and results

Examples

   root[] TF1 *fit = hist->GetFunction(function_name);
   root[] Double_t chi2 = fit->GetChisquare();

// Value of the first parameter:
   root[] Double_t p1 = fit->GetParameter(0);

// Error of the first parameter:
   root[] Double_t e1 = fit->GetParError(0);

Configuring the fit

The following configuration actions are available when fitting a histogram or graph using the Fit() method:

Fixing and setting parameter bounds

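For pre-defined functions like polN, expo, gaus, and landau, the parameter initial values are set automatically.

For functions that are not pre-defined, the fit parameters must be initialized before invoking the Fit() method. You can also set bounds on a parameter with SetParLimits(). For example, to constrain parameter 0 to the range [-1, 1]: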

   func->SetParLimits(0,-1,1);

When the lower and upper limits are equal, the parameter is fixed.

Example

Parameter 4 is fixed at 10.

   func->SetParameter(4,10);
   func->SetParLimits(4,10,10);

Example

Parameter 4 is fixed at 0 with FixParameter().

   func->SetParameter(4,0);
   func->FixParameter(4,0);

You do not need to set the limits for all parameters.

Example

Given a function with 6 parameters, a setup like the following is possible: parameters 0 to 2 can vary freely (as can parameter 5, for which no limits are set), parameter 3 has boundaries [-10, 4] with the initial value -1.5, and parameter 4 is fixed to 0.

   func->SetParameters(0,3.1,1.e-6,-1.5,0,100);
   func->SetParLimits(3,-10,4);
   func->FixParameter(4,0);

Fitting subranges

By default, TH1::Fit() fits the function on the defined histogram range. You can specify the R option in the second parameter of TH1::Fit() to restrict the fit to the range specified in the TF1 constructor.

Example

The fit will be limited to -3 to 3, the range specified in the TF1 constructor:

   root[] TF1 *f1 = new TF1("f1","[0]*x*sin([1]*x)",-3,3);
   root[] hist->Fit("f1","R");

You can also specify a range in the call to TH1::Fit().

   root[] hist->Fit("f1","","",-2,2)

See also the ROOT macros $ROOTSYS/tutorials/fit/myfit.C and multifit.C for more detailed examples.

Fitting multiple sub ranges

You can find a ROOT macro for fitting multiple sub ranges at $ROOTSYS/tutorials/fit/multifit.C. It shows how to use several Gaussian functions with different parameters on separate sub ranges of the same histogram.

Example

Four TF1 objects are created: one for each of the three sub ranges, and one for their sum.

   TF1 *g1 = new TF1("m1","gaus",85,95);
   TF1 *g2 = new TF1("m2","gaus",98,108);
   TF1 *g3 = new TF1("m3","gaus",110,121);

// The total is the sum of the three, each has 3 parameters.
   TF1 *total = new TF1("mstotal","gaus(0)+gaus(3)+gaus(6)",85,125);

The histogram is filled with the bin contents defined in the array x.

   TH1F *h = new TH1F("g1","Example of several fits in subranges",np,85,134);
   h->SetMaximum(7);
   for (int i=0; i<np; i++) {
      h->SetBinContent(i+1,x[i]);
   }

// Define the parameter array for the total function.
   Double_t par[9];

When fitting simple functions, such as a Gaussian, the initial values of the parameters are automatically computed. In the more complicated case of the sum of 3 Gaussian functions, the initial values of parameters must be set. In this particular case, the initial values are taken from the result of the individual fits.

// Fit each function and add it to the list of functions.
   h->Fit(g1,"R");
   h->Fit(g2,"R+");
   h->Fit(g3,"R+");

// Get the parameters from the fit
   g1->GetParameters(&par[0]);
   g2->GetParameters(&par[3]);
   g3->GetParameters(&par[6]);

// Use the parameters on the sum
   total->SetParameters(par);
   h->Fit(total,"R+");
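
The notation gaus(0)+gaus(3)+gaus(6) means that each Gaussian reads its three parameters starting at the given offset in one shared parameter array, which is why the individual fit results are copied to par[0], par[3] and par[6]. The plain C++ sketch below mimics that layout. The names gausTerm, totalFunction, packParameters and offsetDemo are hypothetical and only illustrate the indexing; they are not part of ROOT, and the numeric values are made up.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>

// One Gaussian term reading three consecutive parameters:
// p[0] = constant, p[1] = mean, p[2] = sigma.
double gausTerm(double x, const double *p) {
   const double arg = (x - p[1]) / p[2];
   return p[0] * std::exp(-0.5 * arg * arg);
}

// Equivalent of "gaus(0)+gaus(3)+gaus(6)": three terms that start at
// offsets 0, 3 and 6 of the same 9-element parameter array.
double totalFunction(double x, const double *par) {
   return gausTerm(x, par) + gausTerm(x, par + 3) + gausTerm(x, par + 6);
}

// Copy the parameters of three individual fits into the shared array,
// as done with GetParameters(&par[0]), (&par[3]) and (&par[6]).
void packParameters(const double *g1, const double *g2, const double *g3,
                    double *par) {
   std::copy(g1, g1 + 3, par);
   std::copy(g2, g2 + 3, par + 3);
   std::copy(g3, g3 + 3, par + 6);
}

// Demo entry point; call it from main() or run the file as a macro.
void offsetDemo() {
   const double g1[3] = {4, 90, 2};
   const double g2[3] = {5, 103, 2};
   const double g3[3] = {3, 115, 2};
   double par[9];
   packParameters(g1, g2, g3, par);
   std::printf("total(90)  = %g\n", totalFunction(90, par));
   std::printf("total(103) = %g\n", totalFunction(103, par));
}
```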

Result of the fit

You can obtain the following results of a fit:

Associated function

One or more objects (typically a TF1*) can be added to the list of functions (fFunctions) associated with each histogram. TH1::Fit() adds the fitted function to this list.

Given a histogram h, you can retrieve the associated function with:

   TF1 *myfunc = h->GetFunction("myfunc");

Accessing the fit parameters and results

If the histogram or graph is made persistent, the list of associated functions is also persistent.

Retrieve a pointer to the function with the TH1::GetFunction() method. Then you can retrieve the fit parameters from the function.

Example

   TF1 *fit = hist->GetFunction(function_name);
   Double_t chi2 = fit->GetChisquare();

// Value of the first parameter.
   Double_t p1 = fit->GetParameter(0);

// Error of the first parameter.
   Double_t e1 = fit->GetParError(0);

With the fit option S, you can access the full result of the fit, including the covariance and correlation matrices.

Associated errors

By default, for each bin, the sum of weights is computed at fill time. You can also call TH1::Sumw2() to force the storage and computation of the sum of the square of weights per bin. If Sumw2() has been called, the error per bin is computed as the sqrt(sum of squares of weights). Otherwise, the error is set equal to the sqrt(bin content).

To return the error for a given bin number, use:

   Double_t error = h->GetBinError(bin);
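
The two error conventions described above can be sketched in plain C++ as follows. This is a simplified model of a single bin, not ROOT's implementation: the Bin struct and the functions fillBin, errorWithSumw2, errorDefault and binErrorDemo are hypothetical names used only for this illustration.

```cpp
#include <cmath>
#include <cstdio>

// Simplified single bin: the sum of weights is always accumulated,
// the sum of squared weights corresponds to what Sumw2() stores.
struct Bin {
   double sumw = 0;
   double sumw2 = 0;
};

// Fill the bin with one entry of weight w.
void fillBin(Bin &bin, double w = 1.0) {
   bin.sumw += w;
   bin.sumw2 += w * w;
}

// Error when the sum of squared weights is stored: sqrt(sum of w^2).
double errorWithSumw2(const Bin &bin) {
   return std::sqrt(bin.sumw2);
}

// Error otherwise: sqrt(bin content).
double errorDefault(const Bin &bin) {
   return std::sqrt(std::fabs(bin.sumw));
}

// Demo entry point; call it from main() or run the file as a macro.
void binErrorDemo() {
   Bin unweighted;
   Bin weighted;
   for (int i = 0; i < 4; ++i) {
      fillBin(unweighted);
      fillBin(weighted, 0.5);
   }
   std::printf("unweighted: %g vs %g\n", errorWithSumw2(unweighted), errorDefault(unweighted));
   std::printf("weighted:   %g vs %g\n", errorWithSumw2(weighted), errorDefault(weighted));
}
```

For entries with weight 1 both conventions agree; they differ as soon as the weights are not all equal to 1.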

Empty bins are excluded from the fit when using the chi-square fit method. When fitting a histogram representing counts (that is, with Poisson statistics), it is recommended to use the log-likelihood method (option L or WL), particularly in case of low statistics.
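
Why empty bins matter at low statistics can be seen in a small plain C++ sketch. It compares a chi-square that skips empty bins with one common form of the binned Poisson likelihood, written as twice the negative log of the likelihood ratio. This is an assumption made for illustration and is not claimed to be the exact expression ROOT minimizes; the names chi2SkipEmpty, poissonLikelihoodChi2 and emptyBinDemo are hypothetical.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Chi-square with observed errors sqrt(n_i). Bins with n_i = 0 are
// skipped, so the model is not penalized for predicting events there.
double chi2SkipEmpty(const std::vector<double> &n, const std::vector<double> &f) {
   double sum = 0;
   for (std::size_t i = 0; i < n.size(); ++i) {
      if (n[i] <= 0) continue;
      const double diff = n[i] - f[i];
      sum += diff * diff / n[i];
   }
   return sum;
}

// 2 * sum_i [ f_i - n_i + n_i * ln(n_i / f_i) ] for Poisson counts.
// An empty bin contributes 2 * f_i, so it still constrains the model.
double poissonLikelihoodChi2(const std::vector<double> &n, const std::vector<double> &f) {
   double sum = 0;
   for (std::size_t i = 0; i < n.size(); ++i) {
      if (f[i] <= 0) continue;
      sum += f[i] - n[i];
      if (n[i] > 0) sum += n[i] * std::log(n[i] / f[i]);
   }
   return 2 * sum;
}

// Demo entry point; call it from main() or run the file as a macro.
void emptyBinDemo() {
   // Two empty bins where the model predicts 3 events each.
   const std::vector<double> observed = {0, 0};
   const std::vector<double> expected = {3, 3};
   std::printf("chi2 (empty bins skipped) = %g\n", chi2SkipEmpty(observed, expected));
   std::printf("Poisson likelihood chi2   = %g\n", poissonLikelihoodChi2(observed, expected));
}
```

In this sketch the chi-square sees no disagreement at all, while the likelihood form penalizes a model that predicts events in bins where none were observed.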

Fit statistics

You can change the statistics box to display the fit parameters with the TStyle::SetOptFit() method. This parameter has four digits: mode = pcev (default = 0111)

  • p = 1: Print probability.
  • c = 1: Print Chi-square/number of degrees of freedom.
  • e = 1: Print errors (if e=1, v must be 1).
  • v = 1: Print name/values of parameters.

Example

To print the fit probability, parameter names/values, and errors, use:

   gStyle->SetOptFit(1011);
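
The mode value is read digit by digit in decimal. The plain C++ sketch below decodes a mode into the four flags p, c, e and v to make the convention explicit. The FitStatFlags struct and the functions decodeOptFit and optFitDemo are hypothetical names for this illustration, not ROOT API. One point worth keeping in mind: in C++ source code an integer literal with a leading zero, such as 0111, is octal (decimal 73), so the default is presumably best written as 111 when passed in code.

```cpp
#include <cstdio>

// The four flags encoded in mode = pcev.
struct FitStatFlags {
   bool probability; // p: print probability
   bool chi2;        // c: print chi-square / number of degrees of freedom
   bool errors;      // e: print errors
   bool values;      // v: print names/values of parameters
};

// Split a decimal mode into its digits, starting from the last one (v).
FitStatFlags decodeOptFit(int mode) {
   FitStatFlags flags;
   flags.values = (mode % 10) != 0;
   flags.errors = (mode / 10 % 10) != 0;
   flags.chi2 = (mode / 100 % 10) != 0;
   flags.probability = (mode / 1000 % 10) != 0;
   return flags;
}

// Demo entry point; call it from main() or run the file as a macro.
void optFitDemo() {
   const FitStatFlags f = decodeOptFit(1011);
   std::printf("p=%d c=%d e=%d v=%d\n", f.probability, f.chi2, f.errors, f.values);
   // A leading zero makes the literal octal: 0111 is decimal 73, not 111.
   std::printf("0111 as a C++ literal = %d\n", 0111);
}
```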

Using ROOT::Fit classes

ROOT::Fit is the namespace for fitting classes (regression analysis). The fitting classes are part of the MathCore library.
The defined classes can be classified in the following groups:

Creating the input data

There are two types of input data: binned data and un-binned data.

Using binned data

Example

Given a histogram, represented as a TH1 type object, a ROOT::Fit::BinData object is created and filled.

   ROOT::Fit::DataOptions opt;
   opt.fIntegral = true;
   ROOT::Fit::BinData data(opt);

// Fill the bin data by using the histogram:
   TH1 * h1 = (TH1*) gDirectory->Get("myHistogram");
   ROOT::Fit::FillData(data, h1);

By using ROOT::Fit::DataOptions you can specify the data range and some fitting options.

Using un-binned data

For creating un-binned data sets, there are two possibilities:

  1. Copy the data inside a ROOT::Fit::UnBinData object.
    Create an empty ROOT::Fit::UnBinData object, iterate on the data and add the data points one by one. An input ROOT::Fit::DataRange object is passed in order to copy the data according to the given range.
  2. Use ROOT::Fit::UnBinData as a wrapper to an external data storage.
    In this case the ROOT::Fit::UnBinData object is created from an iterator or pointers to the data and the data are not copied inside. The data cannot be selected according to a specified range. All the data points will be included in the fit.

ROOT::Fit::UnBinData also supports weighted data. In addition to the data points (coordinates), which can be of arbitrary k dimensions, the class can be constructed from a vector of weights.

Example

Data are taken from a histogram (TH1 object).

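// GetBuffer() returns a read-only pointer to the histogram buffer.
   const double * buffer = histogram->GetBuffer();

// The number of entries is stored in the first element of the buffer.
   int n = static_cast<int>(buffer[0]);

// When creating the data object, it is important to create it with the size of the data.
   ROOT::Fit::UnBinData data(n);

// After the first element, the buffer stores (weight, x) pairs, so x is at index 2*i+2.
   for (int i = 0; i < n; ++i)
      data.Add(buffer[2*i+2]);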

Creating a fit model

The model function needs to be expressed as a function of some unknown parameters. The fit finds the parameter values that best describe the observed data.

For example, you can use TF1, the parametric function class, to describe the model function. However, the ROOT::Fit::Fitter class takes as input a more general parametric function object, the abstract interface class ROOT::Math::IParametricFunctionMultiDim. It describes a generic one-dimensional or multi-dimensional function with parameters. This interface extends the abstract ROOT::Math::IBaseFunctionMultiDim class with methods to set/retrieve parameter values and to evaluate the function given the independent vector of values X and vector of parameters P.

Configuring the fit

Use the ROOT::Fit::FitConfig class for configuring the fit. It holds a ROOT::Fit::ParameterSettings object for each parameter.

The following fit configurations are available:

  • Setting the initial values of the parameters.
  • Setting the parameter step sizes.
  • Setting optional parameter bounds.
  • Setting the minimizer library and the particular algorithm to use.
  • Setting different minimization options (print level, tolerance, max iterations, etc.).
  • Setting the type of parameter errors to compute (parabolic errors, Minos errors, re-normalized errors using the fitted chi2 value).

Example

Setting the lower/upper bounds for the first parameter (index 0) and a lower bound for the third parameter (index 2):

   fitter.SetFunction( fitFunction, false);
   fitter.Config().ParSettings(0).SetLimits(0,1.E6);
   fitter.Config().ParSettings(2).SetLowerLimit(0);

Performing the fit

Depending on the available input data and the selected function for fitting, you can use one of the methods of the ROOT::Fit::Fitter class to perform the fit.

The following pre-defined fitting methods are available:

Fit result

The result of the fit is contained in the ROOT::Fit::FitResult object.

You can print the result of the fit with the FitResult::Print() method.