Fits MC fractions to data histogram (a la HMCMLL, see R. Barlow and C. Beeston, Comp. Phys. Comm. 77 (1993) 219-228, and http://www.hep.man.ac.uk/~roger/hfrac.f). The virtue of this fit is that it takes into account both data and Monte Carlo statistical uncertainties. The way in which this is done is through a standard likelihood fit using Poisson statistics; however, the template (MC) predictions are also varied within statistics, leading to additional contributions to the overall likelihood. This leads to many more fit parameters (one per bin per template), but the minimisation with respect to these additional parameters is done analytically rather than introducing them as formal fit parameters. Some special care needs to be taken in the case of bins with zero content. For more details please see the original publication cited above. An example application of this fit is given below. For a TH1* histogram ("data") fitted as the sum of three Monte Carlo sources ("mc"): { TH1F *data; //data histogram TH1F *mc0; // first MC histogram TH1F *mc1; // second MC histogram TH1F *mc2; // third MC histogram .... // retrieve histograms TObjArray *mc = new TObjArray(3); // MC histograms are put in this array mc->Add(mc0); mc->Add(mc1); mc->Add(mc2); TFractionFitter* fit = new TFractionFitter(data, mc); // initialise fit->Constrain(1,0.0,1.0); // constrain fraction 1 to be between 0 and 1 fit->SetRangeX(1,15); // use only the first 15 bins in the fit Int_t status = fit->Fit(); // perform the fit cout << "fit status: " << status << endl; if (status == 0) { // check on fit status TH1F* result = (TH1F*) fit->GetPlot(); data->Draw("Ep"); result->Draw("same"); } } Assumptions A few assumptions need to be made for the fit procedure to be carried out: (1) The total number of events in each template is not too small (so that its Poisson uncertainty can be neglected). (2) The number of events in each bin is much smaller than the total number of events in each template (so that multinomial uncertainties can be replaced with Poisson uncertainties). Biased fit uncertainties may result if these conditions are not fulfilled (see e.g. arXiv:0803.2711). Instantiation A fit object is instantiated through TFractionFitter* fit = new TFractionFitter(data, mc); A number of basic checks (intended to ensure that the template histograms represent the same "kind" of distribution as the data one) are carried out. The TVirtualFitter object is then addressed and all fit parameters (the template fractions) declared (initially unbounded). Applying constraints Fit parameters can be constrained through fit->Constrain(parameter #, lower bound, upper bound); Setting lower bound = upper bound = 0 removes the constraint (a la Minuit); however, a function fit->Unconstrain(parameter #) is also provided to simplify this. Setting parameter values The function TVirtualFitter* vFit = fit->GetFitter(); is provided for direct access to the TVirtualFitter object. This allows to set and fix parameter values, and set step sizes directly. Restricting the fit range The fit range can be restricted through fit->SetRangeX(first bin #, last bin #); and freed using fit->ReleaseRangeX(); For 2D histograms the Y range can be similarly restricted using fit->SetRangeY(first bin #, last bin #); fit->ReleaseRangeY(); and for 3D histograms also fit->SetRangeZ(first bin #, last bin #); fit->ReleaseRangeZ(); It is also possible to exclude individual bins from the fit through fit->ExcludeBin(bin #); where the given bin number is assumed to follow the TH1::GetBin() numbering. Any bins excluded in this way can be included again using the corresponding fit->IncludeBin(bin #); Weights histograms Weights histograms (for a motivation see the above publication) can be specified for the individual MC sources through fit->SetWeight(parameter #, pointer to weights histogram); and unset by specifying a null pointer. Obtaining fit results The fit is carried out through Int_t status = fit->Fit(); where status is the code returned from the "MINIMIZE" command. For fits that converged, parameter values and errors can be obtained through fit->GetResult(parameter #, value, error); and the histogram corresponding to the total Monte Carlo prediction (which is not the same as a simple weighted sum of the input Monte Carlo distributions) can be obtained by TH1* result = fit->GetPlot(); Using different histograms It is possible to change the histogram being fitted through fit->SetData(TH1* data); and to change the template histogram for a given parameter number through fit->SetMC(parameter #, TH1* MC); This can speed up code in case of multiple data or template histograms; however, it should be done with care as any settings are taken over from the previous fit. In addition, neither the dimensionality nor the numbers of bins of the histograms should change (in that case it is better to instantiate a new TFractionFitter object). Errors Any serious inconsistency results in an error.
TFractionFitter() | |
TFractionFitter(TH1* data, TObjArray* MCs, Option_t* option = "") | |
virtual | ~TFractionFitter() |
void | TObject::AbstractMethod(const char* method) const |
virtual void | TObject::AppendPad(Option_t* option = "") |
virtual void | TObject::Browse(TBrowser* b) |
static TClass* | Class() |
virtual const char* | TObject::ClassName() const |
virtual void | TObject::Clear(Option_t* = "") |
virtual TObject* | TObject::Clone(const char* newname = "") const |
virtual Int_t | TObject::Compare(const TObject* obj) const |
void | Constrain(Int_t parm, Double_t low, Double_t high) |
virtual void | TObject::Copy(TObject& object) const |
virtual void | TObject::Delete(Option_t* option = "")MENU |
virtual Int_t | TObject::DistancetoPrimitive(Int_t px, Int_t py) |
virtual void | TObject::Draw(Option_t* option = "") |
virtual void | TObject::DrawClass() constMENU |
virtual TObject* | TObject::DrawClone(Option_t* option = "") constMENU |
virtual void | TObject::Dump() constMENU |
virtual void | TObject::Error(const char* method, const char* msgfmt) const |
void | ErrorAnalysis(Double_t UP) |
void | ExcludeBin(Int_t bin) |
virtual void | TObject::Execute(const char* method, const char* params, Int_t* error = 0) |
virtual void | TObject::Execute(TMethod* method, TObjArray* params, Int_t* error = 0) |
virtual void | TObject::ExecuteEvent(Int_t event, Int_t px, Int_t py) |
virtual void | TObject::Fatal(const char* method, const char* msgfmt) const |
virtual TObject* | TObject::FindObject(const char* name) const |
virtual TObject* | TObject::FindObject(const TObject* obj) const |
Int_t | Fit() |
Double_t | GetChisquare() const |
virtual Option_t* | TObject::GetDrawOption() const |
static Long_t | TObject::GetDtorOnly() |
TVirtualFitter* | GetFitter() const |
virtual const char* | TObject::GetIconName() const |
TH1* | GetMCPrediction(Int_t parm) const |
virtual const char* | TObject::GetName() const |
Int_t | GetNDF() const |
virtual char* | TObject::GetObjectInfo(Int_t px, Int_t py) const |
static Bool_t | TObject::GetObjectStat() |
virtual Option_t* | TObject::GetOption() const |
TH1* | GetPlot() |
Double_t | GetProb() const |
void | GetResult(Int_t parm, Double_t& value, Double_t& error) const |
virtual const char* | TObject::GetTitle() const |
virtual UInt_t | TObject::GetUniqueID() const |
virtual Bool_t | TObject::HandleTimer(TTimer* timer) |
virtual ULong_t | TObject::Hash() const |
void | IncludeBin(Int_t bin) |
virtual void | TObject::Info(const char* method, const char* msgfmt) const |
virtual Bool_t | TObject::InheritsFrom(const char* classname) const |
virtual Bool_t | TObject::InheritsFrom(const TClass* cl) const |
virtual void | TObject::Inspect() constMENU |
void | TObject::InvertBit(UInt_t f) |
virtual TClass* | IsA() const |
virtual Bool_t | TObject::IsEqual(const TObject* obj) const |
virtual Bool_t | TObject::IsFolder() const |
Bool_t | TObject::IsOnHeap() const |
virtual Bool_t | TObject::IsSortable() const |
Bool_t | TObject::IsZombie() const |
virtual void | TObject::ls(Option_t* option = "") const |
void | TObject::MayNotUse(const char* method) const |
virtual Bool_t | TObject::Notify() |
void | TObject::Obsolete(const char* method, const char* asOfVers, const char* removedFromVers) const |
static void | TObject::operator delete(void* ptr) |
static void | TObject::operator delete(void* ptr, void* vp) |
static void | TObject::operator delete[](void* ptr) |
static void | TObject::operator delete[](void* ptr, void* vp) |
void* | TObject::operator new(size_t sz) |
void* | TObject::operator new(size_t sz, void* vp) |
void* | TObject::operator new[](size_t sz) |
void* | TObject::operator new[](size_t sz, void* vp) |
TObject& | TObject::operator=(const TObject& rhs) |
virtual void | TObject::Paint(Option_t* option = "") |
virtual void | TObject::Pop() |
virtual void | TObject::Print(Option_t* option = "") const |
virtual Int_t | TObject::Read(const char* name) |
virtual void | TObject::RecursiveRemove(TObject* obj) |
void | ReleaseRangeX() |
void | ReleaseRangeY() |
void | ReleaseRangeZ() |
void | TObject::ResetBit(UInt_t f) |
virtual void | TObject::SaveAs(const char* filename = "", Option_t* option = "") constMENU |
virtual void | TObject::SavePrimitive(ostream& out, Option_t* option = "") |
void | TObject::SetBit(UInt_t f) |
void | TObject::SetBit(UInt_t f, Bool_t set) |
void | SetData(TH1* data) |
virtual void | TObject::SetDrawOption(Option_t* option = "")MENU |
static void | TObject::SetDtorOnly(void* obj) |
void | SetMC(Int_t parm, TH1* MC) |
static void | TObject::SetObjectStat(Bool_t stat) |
void | SetRangeX(Int_t low, Int_t high) |
void | SetRangeY(Int_t low, Int_t high) |
void | SetRangeZ(Int_t low, Int_t high) |
virtual void | TObject::SetUniqueID(UInt_t uid) |
void | SetWeight(Int_t parm, TH1* weight) |
virtual void | ShowMembers(TMemberInspector&) |
virtual void | Streamer(TBuffer&) |
void | StreamerNVirtual(TBuffer& ClassDef_StreamerNVirtual_b) |
virtual void | TObject::SysError(const char* method, const char* msgfmt) const |
Bool_t | TObject::TestBit(UInt_t f) const |
Int_t | TObject::TestBits(UInt_t f) const |
void | UnConstrain(Int_t parm) |
virtual void | TObject::UseCurrentStyle() |
virtual void | TObject::Warning(const char* method, const char* msgfmt) const |
virtual Int_t | TObject::Write(const char* name = 0, Int_t option = 0, Int_t bufsize = 0) |
virtual Int_t | TObject::Write(const char* name = 0, Int_t option = 0, Int_t bufsize = 0) const |
virtual void | TObject::DoError(int level, const char* location, const char* fmt, va_list va) const |
void | TObject::MakeZombie() |
void | CheckConsistency() |
void | CheckParNo(Int_t parm) const |
void | ComputeChisquareLambda() |
void | ComputeFCN(Int_t& npar, Double_t* gin, Double_t& f, Double_t* par, Int_t flag) |
void | FindPrediction(int bin, double& t_i, int& k_0, double& A_ki) const |
void | GetRanges(Int_t& minX, Int_t& maxX, Int_t& minY, Int_t& maxY, Int_t& minZ, Int_t& maxZ) const |
bool | IsExcluded(Int_t bin) const |
enum TObject::EStatusBits { | kCanDelete | |
kMustCleanup | ||
kObjInCanvas | ||
kIsReferenced | ||
kHasUUID | ||
kCannotPick | ||
kNoContextMenu | ||
kInvalidObject | ||
}; | ||
enum TObject::[unnamed] { | kIsOnHeap | |
kNotDeleted | ||
kZombie | ||
kBitMask | ||
kSingleKey | ||
kOverwrite | ||
kWriteDelete | ||
}; |
TObjArray | fAji | array of pointers to predictions of real template distributions |
Double_t | fChisquare | Template fit chisquare |
TH1* | fData | pointer to the "data" histogram to be fitted to |
vector<Int_t> | fExcludedBins | bins excluded from the fit |
Bool_t | fFitDone | flags whether a valid fit has been performed |
Double_t* | fFractions | template fractions scaled to the "data" histogram statistics |
Int_t | fHighLimitX | last bin in X dimension |
Int_t | fHighLimitY | last bin in Y dimension |
Int_t | fHighLimitZ | last bin in Z dimension |
Double_t | fIntegralData | "data" histogram content integral over allowed fit range |
Double_t* | fIntegralMCs | same for template histograms (weights not taken into account) |
Int_t | fLowLimitX | first bin in X dimension |
Int_t | fLowLimitY | first bin in Y dimension |
Int_t | fLowLimitZ | first bin in Z dimension |
TObjArray | fMCs | array of pointers to template histograms |
Int_t | fNDF | Number of degrees of freedom in the fit |
Int_t | fNpar | number of fit parameters |
Int_t | fNpfits | Number of points used in the fit |
TH1* | fPlot | pointer to histogram containing summed template predictions |
TObjArray | fWeights | array of pointers to corresponding weight factors (may be null) |
TFractionFitter constructor. Does a complete initialisation (including consistency checks, default fit range as the whole histogram but without under- and overflows, and declaration of the fit parameters). Note that the histograms are not copied, only references are used. Arguments: data: histogram to be fitted MCs: array of TH1* corresponding template distributions Option: can be used to control the print level of the minimization algorithm option = "Q" : quite - no message is printed option = "V" : verbose - max print out option = "" : default: print initial fraction values and result
Change the histogram to be fitted to. Notes: - Parameter constraints and settings are retained from a possible previous fit. - Modifying the dimension or number of bins results in an error (in this case rather instantiate a new TFractionFitter object)
Change the histogram for template number <parm>. Notes: - Parameter constraints and settings are retained from a possible previous fit. - Modifying the dimension or number of bins results in an error (in this case rather instantiate a new TFractionFitter object)
Set bin by bin weights for template number <parm> (the parameter numbering
follows that of the input template vector).
Weights can be "unset" by passing a null pointer.
Consistency of the weights histogram with the data histogram is checked at
this point, and an error in case of problems.
Give direct access to the underlying minimisation class. This can be used e.g. to modify parameter values or step sizes.
Function for internal use, checking parameter validity An invalid parameter results in an error.
Set the X range of the histogram to be used in the fit. Use ReleaseRangeX() to go back to fitting the full histogram. The consistency check ensures that no empty fit range occurs (and also recomputes the bin content integrals). Arguments: low: lower X bin number high: upper X bin number
Set the Y range of the histogram to be used in the fit (2D or 3D histograms only). Use ReleaseRangeY() to go back to fitting the full histogram. The consistency check ensures that no empty fit range occurs (and also recomputes the bin content integrals). Arguments: low: lower Y bin number high: upper Y bin number
Set the Z range of the histogram to be used in the fit (3D histograms only). Use ReleaseRangeY() to go back to fitting the full histogram. The consistency check ensures that no empty fit range occurs (and also recomputes the bin content integrals). Arguments: low: lower Y bin number high: upper Y bin number
Include the given bin in the fit, if it was excluded before using ExcludeBin(). The bin numbering to be used is that of TH1::GetBin().
Function for internal use, checking whether the given bin is excluded from the fit or not.
Constrain the values of parameter number <parm> (the parameter numbering follows that of the input template vector). Use UnConstrain() to remove this constraint.
Function used internally to check the consistency between the
various histograms. Checks are performed on nonexistent or empty
histograms, the precise histogram class, and the number of bins.
In addition, integrals over the "allowed" bin ranges are computed.
Any inconsistency results in a error.
Perform the fit with the default UP value. The value returned is the minimisation status.
Set UP to the given value (see class TMinuit), and perform a MINOS minimisation.
Obtain the fit result for parameter <parm> (the parameter numbering follows that of the input template vector).
Return the "template prediction" corresponding to the fit result (this is not the same as the weighted sum of template distributions, as template statistical uncertainties are taken into account). Note that the name of this histogram will simply be the same as that of the "data" histogram, prefixed with the string "Fraction fit to hist: ".
Used internally to obtain the bin ranges according to the dimensionality of the histogram and the limits set by hand.
Used internally to compute the likelihood value.
Function used internally to obtain the template prediction in the individual bins 'bin' <=> 'i' (paper) 'par' <=> 'j' (paper)
Return the likelihood ratio Chi-squared (chi2) for the fit. The value is computed when the fit is executed successfully. Chi2 calculation is based on the "likelihood ratio" lambda, lambda = L(y;n) / L(m;n), where L(y;n) is the likelihood of the fit result <y> describing the data <n> and L(m;n) is the likelihood of an unknown "true" underlying distribution <m> describing the data <n>. Since <m> is unknown, the data distribution is used instead, lambda = L(y;n) / L(n;n). Note that this ratio is 1 if the fit is perfect. The chi2 value is then computed according to chi2 = -2*ln(lambda). This parameter can be shown to follow a Chi-square distribution. See for example S. Baker and R. Cousins, "Clarification of the use of chi-square and likelihood functions in fits to histograms", Nucl. Instr. Meth. A221, pp. 437-442 (1984)
return the number of degrees of freedom in the fit the fNDF parameter has been previously computed during a fit. The number of degrees of freedom corresponds to the number of points used in the fit minus the number of templates.
Method used internally to compute the likelihood ratio chi2 See the function GetChisquare() for details
Return the adjusted MC template (Aji) for template (parm). Note that the (Aji) times fractions only sum to the total prediction of the fit if all weights are 1.