Logo ROOT  
Reference Guide
 
Loading...
Searching...
No Matches
TBinomialEfficiencyFitter.cxx
Go to the documentation of this file.
1// @(#)root/hist:$Id$
2// Author: Frank Filthaut, Rene Brun 30/05/2007
3
4/*************************************************************************
5 * Copyright (C) 1995-2007, Rene Brun and Fons Rademakers. *
6 * All rights reserved. *
7 * *
8 * For the licensing terms see $ROOTSYS/LICENSE. *
9 * For the list of contributors see $ROOTSYS/README/CREDITS. *
10 *************************************************************************/
11
12/** \class TBinomialEfficiencyFitter
13 \ingroup Hist
14 \brief Binomial fitter for the division of two histograms.
15
16Use when you need to calculate a selection's efficiency from two histograms,
17one containing all entries, and one containing the subset of these entries
18that pass the selection, and when you have a parametrization available for
19the efficiency as a function of the variable(s) under consideration.
20
21A very common problem when estimating efficiencies is that of error estimation:
22when no other information is available than the total number of events N and
23the selected number n, the best estimate for the selection efficiency p is n/N.
24Standard binomial statistics dictates that the uncertainty (this presupposes
25sufficiently high statistics that an approximation by a normal distribution is
26reasonable) on p, given N, is
27\f[
28 \sqrt{\frac{p(1-p)}{N}}
29\f]
30However, when p is estimated as n/N, fluctuations from the true p to its
31estimate become important, especially for low numbers of events, and giving
32rise to biased results.
33
34When fitting a parametrized efficiency, these problems can largely be overcome,
35as a hypothesized true efficiency is available by construction. Even so, simply
36using the corresponding uncertainty still presupposes that Gaussian errors
37yields a reasonable approximation. When using, instead of binned efficiency
38histograms, the original numerator and denominator histograms, a binned maximum
39likelihood can be constructed as the product of bin-by-bin binomial probabilities
40to select n out of N events. Assuming that a correct parametrization of the
41efficiency is provided, this construction in general yields less biased results
42(and is much less sensitive to binning details).
43
44A generic use of this method is given below (note that the method works for 2D
45and 3D histograms as well):
46
47~~~ {.cpp}
48 {
49 TH1* denominator; // denominator histogram
50 TH1* numerator; // corresponding numerator histogram
51 TF1* eff; // efficiency parametrization
52 .... // set step sizes and initial parameter
53 .... // values for the fit function
54 .... // possibly also set ranges, see TF1::SetRange()
55 TBinomialEfficiencyFitter* f = new TBinomialEfficiencyFitter(
56 numerator, denominator);
57 Int_t status = f->Fit(eff, "I");
58 if (status == 0) {
59 // if the fit was successful, display bin-by-bin efficiencies
60 // as well as the result of the fit
61 numerator->Sumw2();
62 TH1* hEff = dynamic_cast<TH1*>(numerator->Clone("heff"));
63 hEff->Divide(hEff, denominator, 1.0, 1.0, "B");
64 hEff->Draw("E");
65 eff->Draw("same");
66 }
67 }
68~~~
69
70Note that this method cannot be expected to yield reliable results when using
71weighted histograms (because the likelihood computation will be incorrect).
72
73*/
74
76
77#include "TMath.h"
78#include "TPluginManager.h"
79#include "TROOT.h"
80#include "TH1.h"
81#include "TF1.h"
82#include "TF2.h"
83#include "TF3.h"
84#include "Fit/FitConfig.h"
85#include "Fit/Fitter.h"
86#include "TFitResult.h"
87#include "Math/Functor.h"
89
90#include <limits>
91
92
94
95
96
97////////////////////////////////////////////////////////////////////////////////
98/// default constructor
99
101 fNumerator = nullptr;
102 fDenominator = nullptr;
103 fFunction = nullptr;
106 fRange = kFALSE;
108 fFitter = nullptr;
109}
110
111////////////////////////////////////////////////////////////////////////////////
112/// Constructor.
113///
114/// Note that no objects are copied, so it is up to the user to ensure that the
115/// histogram pointers remain valid.
116///
117/// Both histograms need to be "consistent". This is not checked here, but in
118/// TBinomialEfficiencyFitter::Fit().
119
120TBinomialEfficiencyFitter::TBinomialEfficiencyFitter(const TH1 *numerator, const TH1 *denominator) {
122 fFunction = nullptr;
123 fFitter = nullptr;
124 Set(numerator,denominator);
125}
126
127////////////////////////////////////////////////////////////////////////////////
128/// destructor
129
134
135////////////////////////////////////////////////////////////////////////////////
136/// Initialize with a new set of inputs.
137
138void TBinomialEfficiencyFitter::Set(const TH1 *numerator, const TH1 *denominator)
139{
140 fNumerator = (TH1*)numerator;
141 fDenominator = (TH1*)denominator;
142
145 fRange = kFALSE;
146}
147
148////////////////////////////////////////////////////////////////////////////////
149/// Set the required integration precision, see TF1::Integral()
150
152{
153 fEpsilon = epsilon;
154}
155
156////////////////////////////////////////////////////////////////////////////////
157/// Provide access to the underlying fitter object.
158/// This may be useful e.g. for the retrieval of additional information (such
159/// as the output covariance matrix of the fit).
160
167
168////////////////////////////////////////////////////////////////////////////////
169/// Carry out the fit of the given function to the given histograms.
170///
171/// If option "I" is used, the fit function will be averaged over the
172/// bin (the default is to evaluate it simply at the bin center).
173///
174/// If option "R" is used, the fit range will be taken from the fit
175/// function (the default is to use the entire histogram).
176///
177/// If option "S" a TFitResult object is returned and it can be used to obtain
178/// additional fit information, like covariance or correlation matrix.
179///
180/// Note that all parameter values, limits, and step sizes are copied
181/// from the input fit function f1 (so they should be set before calling
182/// this method. This is particularly relevant for the step sizes, taken
183/// to be the "error" set on input, as a null step size usually fixes the
184/// corresponding parameter. That is protected against, but in such cases
185/// an arbitrary starting step size will be used, and the reliability of
186/// the fit should be questioned). If parameters are to be fixed, this
187/// should be done by specifying non-null parameter limits, with lower
188/// limits larger than upper limits.
189///
190/// On output, f1 contains the fitted parameters and errors, as well as
191/// the number of degrees of freedom, and the goodness-of-fit estimator
192/// as given by S. Baker and R. Cousins, Nucl. Instr. Meth. A221 (1984) 437.
193
195{
196 TString opt = option;
197 opt.ToUpper();
198 fAverage = opt.Contains("I");
199 fRange = opt.Contains("R");
200 Bool_t verbose = opt.Contains("V");
201 Bool_t quiet = opt.Contains("Q");
202 Bool_t saveResult = opt.Contains("S");
203 if (!f1) return -1;
204 fFunction = (TF1*)f1;
205 Int_t i, npar;
206 npar = f1->GetNpar();
207 if (npar <= 0) {
208 Error("Fit", "function %s has illegal number of parameters = %d",
209 f1->GetName(), npar);
210 return -3;
211 }
212
213 // Check that function has same dimension as histogram
214 if (!fNumerator || !fDenominator) {
215 Error("Fit","No numerator or denominator histograms set");
216 return -5;
217 }
218 if (f1->GetNdim() != fNumerator->GetDimension()) {
219 Error("Fit","function %s dimension, %d, does not match histogram dimension, %d",
221 return -4;
222 }
223 // Check that the numbers of bins for the histograms match
225 (f1->GetNdim() > 1 && fNumerator->GetNbinsY() != fDenominator->GetNbinsY()) ||
226 (f1->GetNdim() > 2 && fNumerator->GetNbinsZ() != fDenominator->GetNbinsZ())) {
227 Error("Fit", "numerator and denominator histograms do not have identical numbers of bins");
228 return -6;
229 }
230
231 // initialize the fitter
232
233 if (!fFitter) {
235 }
236
237
238 std::vector<ROOT::Fit::ParameterSettings> & parameters = fFitter->Config().ParamsSettings();
239 parameters.reserve(npar);
240 for (i = 0; i < npar; i++) {
241
242 // assign an ARBITRARY starting error to ensure the parameter won't be fixed!
244 if (we <= 0) we = 0.3*TMath::Abs(f1->GetParameter(i));
245 if (we == 0) we = 0.01;
246
247 parameters.push_back(ROOT::Fit::ParameterSettings(f1->GetParName(i), f1->GetParameter(i), we) );
248
251 // when a parameter is fixed is having plow and pup equal to the value (if this is not zero)
252 // we handle special case when fixed parameter has zero value (in that case plow=1 and pup =1 )
253 if (plow >= pup && (plow==f1->GetParameter(i) || pup==f1->GetParameter(i) ||
254 ( f1->GetParameter(i) == 0 && plow==1. && pup == 1.) ) ) {
255 parameters.back().Fix();
256 Info("Fit", "Fixing parameter %s to value %f", f1->GetParName(i), f1->GetParameter(i));
257 } else if (plow < pup) {
258 parameters.back().SetLimits(plow,pup);
259 Info("Fit", "Setting limits for parameter %s to [%f,%f]", f1->GetParName(i), plow,pup);
260 }
261 }
262
263 // fcn must be set after setting the parameters
265
266 // set also model function in fitter to have it in FitResult
267 // in this way one can compute for example the confidence intervals
269
270 // in case default value of 1.0 is used
271 if (fFitter->Config().MinimizerOptions().ErrorDef() == 1.0 ) {
272 fFitter->Config().MinimizerOptions().SetErrorDef(0.5);
273 }
274
275 if (verbose) {
276 fFitter->Config().MinimizerOptions().SetPrintLevel(3);
277 }
278 else if (quiet) {
279 fFitter->Config().MinimizerOptions().SetPrintLevel(0);
280 }
281
282 // perform the actual fit
283
284 // set the fit to be a binned likelihood fit
285 // so use as chi2 for goodness of fit Baker&Cousins LR
287 Bool_t status = fFitter->FitFCN();
288 if ( !status && !quiet)
289 Warning("Fit","Abnormal termination of minimization.");
290
291 fFitDone = kTRUE;
292
293 // set the number of fitted points
294 // number of fit points is set in ComputeFCN in the TF1 object
296
297 //Store fit results in fitFunction
298 const ROOT::Fit::FitResult & fitResult = fFitter->Result();
299 if (!fitResult.IsEmpty() ) {
300 f1->SetNDF(fitResult.Ndf() );
301 f1->SetChisquare(fitResult.Chi2());
302
303 f1->SetParameters( &(fitResult.Parameters().front()) );
304 if ( int( fitResult.Errors().size()) >= f1->GetNpar() )
305 f1->SetParErrors( &(fitResult.Errors().front()) );
306
307 if (!quiet) {
308 Info("Fit","Successful Result from Binomial Efficiency fitter of function %s",f1->GetName());
309 fitResult.Print(std::cout);
310 }
311 }
312 // create a new result class if needed
313 if (saveResult) {
314 TFitResult* fr = new TFitResult(fitResult);
315 TString name = TString::Format("TBinomialEfficiencyFitter_result_of_%s",f1->GetName() );
316 fr->SetName(name); fr->SetTitle(name);
317 return TFitResultPtr(fr);
318 }
319 else {
320 return TFitResultPtr(fitResult.Status() );
321 }
322
323}
324
325////////////////////////////////////////////////////////////////////////////////
326/// Compute the likelihood.
327
329{
331
334 int ylowbin = 0, yhighbin = 0, zlowbin = 0, zhighbin = 0;
335 if (nDim > 1) {
338 if (nDim > 2) {
341 }
342 }
343
345
346 if (fRange) {
347 double xmin, xmax, ymin, ymax, zmin, zmax;
348
349 // This way to ensure that a minimum range chosen exactly at a
350 // bin boundary is far from elegant, but is hopefully adequate.
351
352 if (nDim == 1) {
356 } else if (nDim == 2) {
362 } else if (nDim == 3) {
363 fFunction->GetRange(xmin, ymin, zmin, xmax, ymax, zmax);
370 }
371 }
372
373 // The coding below is perhaps somewhat awkward -- but it is done
374 // so that 1D, 2D, and 3D cases can be covered in the same loops.
375
376 f = 0.;
377
378 Int_t npoints = 0;
379 Double_t nmax = 0;
380 for (int xbin = xlowbin; xbin <= xhighbin; ++xbin) {
381
382 // compute the bin edges
385
386 for (int ybin = ylowbin; ybin <= yhighbin; ++ybin) {
387
388 // compute the bin edges (if applicable)
389 Double_t ylow = (nDim > 1) ? fDenominator->GetYaxis()->GetBinLowEdge(ybin) : 0;
390 Double_t yup = (nDim > 1) ? fDenominator->GetYaxis()->GetBinLowEdge(ybin+1) : 0;
391
392 for (int zbin = zlowbin; zbin <= zhighbin; ++zbin) {
393
394 // compute the bin edges (if applicable)
397
398 int bin = fDenominator->GetBin(xbin, ybin, zbin);
401
402 // count maximum value to use in the likelihood for inf
403 // i.e. a number much larger than the other terms
404 if (nDen> nmax) nmax = nDen;
405 if (nDen <= 0.) continue;
406 npoints++;
407
408 // mu is the average of the function over the bin OR
409 // the function evaluated at the bin centre
410 // As yet, there is nothing to prevent mu from being
411 // outside the range <0,1> !!
412
413 Double_t mu = 0;
414 switch (nDim) {
415 case 1:
416 mu = (fAverage) ?
417 fFunction->Integral(xlow, xup, fEpsilon)
418 / (xup-xlow) :
420 break;
421 case 2:
422 {
423 mu = (fAverage) ?
424 ((TF2*)fFunction)->Integral(xlow, xup, ylow, yup, fEpsilon)
425 / ((xup-xlow)*(yup-ylow)) :
428 }
429 break;
430 case 3:
431 {
432 mu = (fAverage) ?
433 ((TF3*)fFunction)->Integral(xlow, xup, ylow, yup, zlow, zup, fEpsilon)
434 / ((xup-xlow)*(yup-ylow)*(zup-zlow)) :
438 }
439 }
440
441 // binomial formula (forgetting about the factorials)
442 if (nNum != 0.) {
443 if (mu > 0.)
444 f -= nNum * TMath::Log(mu*nDen/nNum);
445 else
446 f -= nmax * -1E30; // crossing our fingers
447 }
448 if (nDen - nNum != 0.) {
449 if (1. - mu > 0.)
450 f -= (nDen - nNum) * TMath::Log((1. - mu)*nDen/(nDen-nNum));
451 else
452 f -= nmax * -1E30; // crossing our fingers
453 }
454 }
455 }
456 }
457
459}
#define f(i)
Definition RSha256.hxx:104
constexpr Bool_t kFALSE
Definition RtypesCore.h:108
double Double_t
Double 8 bytes.
Definition RtypesCore.h:73
constexpr Bool_t kTRUE
Definition RtypesCore.h:107
const char Option_t
Option string (const char)
Definition RtypesCore.h:80
const Double_t kDefaultEpsilon
ROOT::Detail::TRangeCast< T, true > TRangeDynCast
TRangeDynCast is an adapter class that allows the typed iteration through a TCollection.
Option_t Option_t option
char name[80]
Definition TGX11.cxx:110
float xmin
float ymin
float xmax
float ymax
class containing the result of the fit and all the related information (fitted parameter values,...
Definition FitResult.h:47
bool IsEmpty() const
True if a fit result does not exist (even invalid) with parameter values.
Definition FitResult.h:108
const std::vector< double > & Errors() const
parameter errors (return st::vector)
Definition FitResult.h:162
const std::vector< double > & Parameters() const
parameter values (return std::vector)
Definition FitResult.h:167
unsigned int Ndf() const
Number of degree of freedom.
Definition FitResult.h:156
double Chi2() const
Return the Chi2 value after fitting In case of unbinned fits (or not defined one, see the documentati...
Definition FitResult.h:153
void Print(std::ostream &os, bool covmat=false) const
print the result and optionally covariance matrix and correlations
int Status() const
minimizer status code
Definition FitResult.h:128
Fitter class, entry point for performing all type of fits.
Definition Fitter.h:79
void SetNumberOfFitPoints(unsigned int npoints)
Set number of fit points when using an external FCN function This function can be called after Fit to...
Definition Fitter.h:475
void SetFitType(int type)
Set the type of fit when using an external FCN possible types are : 1 (least-square),...
Definition Fitter.h:484
const FitResult & Result() const
get fit result
Definition Fitter.h:397
bool FitFCN(unsigned int npar, Function &fcn, const double *params=nullptr, unsigned int dataSize=0, int fitType=0)
Fit using the a generic FCN function as a C++ callable object implementing double () (const double *)...
Definition Fitter.h:647
const FitConfig & Config() const
access to the fit configuration (const method)
Definition Fitter.h:425
bool SetFCN(unsigned int npar, Function &fcn, const double *params=nullptr, unsigned int dataSize=0, int fitType=0)
Set a generic FCN function as a C++ callable object implementing double () (const double *) Note that...
Definition Fitter.h:654
Class, describing value, limits and step size of the parameters Provides functionality also to set/re...
Documentation for class Functor class.
Definition Functor.h:49
Documentation for the abstract class IBaseFunctionMultiDim.
Definition IFunction.h:63
virtual Double_t GetBinCenter(Int_t bin) const
Return center of bin.
Definition TAxis.cxx:481
virtual Int_t FindBin(Double_t x)
Find bin number corresponding to abscissa x.
Definition TAxis.cxx:292
virtual Double_t GetBinLowEdge(Int_t bin) const
Return low edge of bin.
Definition TAxis.cxx:521
Int_t GetLast() const
Return last bin on the axis i.e.
Definition TAxis.cxx:472
Int_t GetFirst() const
Return first bin on the axis i.e.
Definition TAxis.cxx:461
Double_t fEpsilon
Precision required for function integration (option "I")
void Set(const TH1 *numerator, const TH1 *denominator)
Initialize with a new set of inputs.
ROOT::Fit::Fitter * fFitter
pointer to the real fitter
TH1 * fDenominator
Denominator histogram.
Bool_t fRange
True if the fit range must be taken from the function range.
Bool_t fAverage
True if the fit function must be averaged over the bin.
~TBinomialEfficiencyFitter() override
destructor
TBinomialEfficiencyFitter()
default constructor
ROOT::Fit::Fitter * GetFitter()
Provide access to the underlying fitter object.
Double_t EvaluateFCN(const Double_t *par)
void SetPrecision(Double_t epsilon)
Set the required integration precision, see TF1::Integral()
Bool_t fFitDone
Set to kTRUE when the fit has been done.
void ComputeFCN(Double_t &f, const Double_t *par)
Compute the likelihood.
TFitResultPtr Fit(TF1 *f1, Option_t *option="")
Carry out the fit of the given function to the given histograms.
TH1 * fNumerator
Numerator histogram.
1-Dim function class
Definition TF1.h:182
virtual void GetParLimits(Int_t ipar, Double_t &parmin, Double_t &parmax) const
Return limits for parameter ipar.
Definition TF1.cxx:1967
virtual void SetNDF(Int_t ndf)
Set the number of degrees of freedom ndf should be the number of points used in a fit - the number of...
Definition TF1.cxx:3450
virtual Double_t GetParError(Int_t ipar) const
Return value of parameter number ipar.
Definition TF1.cxx:1957
virtual void SetChisquare(Double_t chi2)
Definition TF1.h:596
virtual Int_t GetNpar() const
Definition TF1.h:461
virtual void SetParErrors(const Double_t *errors)
Set errors for all active parameters when calling this function, the array errors must have at least ...
Definition TF1.cxx:3521
virtual Double_t Integral(Double_t a, Double_t b, Double_t epsrel=1.e-12)
IntegralOneDim or analytical integral.
Definition TF1.cxx:2556
virtual Int_t GetNumberFitPoints() const
Definition TF1.h:483
virtual void SetNumberFitPoints(Int_t npfits)
Definition TF1.h:608
virtual void GetRange(Double_t *xmin, Double_t *xmax) const
Return range of a generic N-D function.
Definition TF1.cxx:2306
virtual const char * GetParName(Int_t ipar) const
Definition TF1.h:509
virtual void SetParameters(const Double_t *params)
Definition TF1.h:633
virtual Double_t Eval(Double_t x, Double_t y=0, Double_t z=0, Double_t t=0) const
Evaluate this function.
Definition TF1.cxx:1446
virtual Int_t GetNdim() const
Definition TF1.h:465
virtual Double_t GetParameter(Int_t ipar) const
Definition TF1.h:492
A 2-Dim function with parameters.
Definition TF2.h:29
A 3-Dim function with parameters.
Definition TF3.h:28
Provides an indirection to the TFitResult class and with a semantics identical to a TFitResult pointe...
Extends the ROOT::Fit::Result class with a TNamed inheritance providing easy possibility for I/O.
Definition TFitResult.h:34
TH1 is the base class of all histogram classes in ROOT.
Definition TH1.h:109
virtual Double_t GetBinCenter(Int_t bin) const
Return bin center for 1D histogram.
Definition TH1.cxx:9179
TAxis * GetZaxis()
Definition TH1.h:574
virtual Int_t GetNbinsY() const
Definition TH1.h:543
virtual Int_t GetNbinsZ() const
Definition TH1.h:544
virtual Int_t GetDimension() const
Definition TH1.h:528
TAxis * GetXaxis()
Definition TH1.h:572
virtual Int_t GetBin(Int_t binx, Int_t biny=0, Int_t binz=0) const
Return Global bin number corresponding to binx,y,z.
Definition TH1.cxx:4974
virtual Int_t GetNbinsX() const
Definition TH1.h:542
TAxis * GetYaxis()
Definition TH1.h:573
virtual Double_t GetBinContent(Int_t bin) const
Return content of bin number bin.
Definition TH1.cxx:5076
virtual void SetTitle(const char *title="")
Set the title of the TNamed.
Definition TNamed.cxx:173
const char * GetName() const override
Returns name of object.
Definition TNamed.h:49
virtual void SetName(const char *name)
Set the name of the TNamed.
Definition TNamed.cxx:149
virtual void Warning(const char *method, const char *msgfmt,...) const
Issue warning message.
Definition TObject.cxx:1057
virtual void Error(const char *method, const char *msgfmt,...) const
Issue error message.
Definition TObject.cxx:1071
virtual void Info(const char *method, const char *msgfmt,...) const
Issue info message.
Definition TObject.cxx:1045
Basic string class.
Definition TString.h:138
void ToUpper()
Change string to upper case.
Definition TString.cxx:1202
static TString Format(const char *fmt,...)
Static method which formats a string using a printf style format descriptor and return a TString.
Definition TString.cxx:2384
Bool_t Contains(const char *pat, ECaseCompare cmp=kExact) const
Definition TString.h:640
TF1 * f1
Definition legend1.C:11
Double_t Log(Double_t x)
Returns the natural logarithm of x.
Definition TMath.h:767
Short_t Abs(Short_t d)
Returns the absolute value of parameter Short_t d.
Definition TMathBase.h:124