MCMCInterval is a concrete implementation of the RooStats::ConfInterval interface. It takes as input a Markov Chain of data points in the parameter space, generated by Monte Carlo using the Metropolis algorithm. From the Markov Chain, the confidence interval can be determined in two ways:
Using a Kernel-Estimated PDF: (not the default method)
A RooNDKeysPdf is constructed from the data set using adaptive kernel width. With this RooNDKeysPdf F, we then integrate over the most likely domain in the parameter space (tallest points in the posterior RooNDKeysPdf) until the target confidence level is reached within an acceptable neighborhood as defined by SetEpsilon(). More specifically: we calculate the following for different cutoff values C until we reach the target confidence level: \int_{ F >= C } F d{normset}. Important note: this is not the default method because of a bug in constructing the RooNDKeysPdf from a weighted data set. Configure to use this method by calling SetUseKeys(true), and the data set will be interpreted without weights.
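The cutoff search described above can be sketched without ROOT. This is a minimal illustration under stated assumptions, not the RooStats implementation: the density F is sampled on a uniform 1-D grid, the integral over { F >= C } is approximated by a Riemann sum, and C is located by bisection. The names MassAbove and FindCutoff are hypothetical, and the epsilon argument plays the role of the tolerance configured with SetEpsilon().

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Riemann-sum approximation of the integral of F over the region { F >= cutoff },
// for a density sampled at uniformly spaced points with spacing dx.
double MassAbove(const std::vector<double>& f, double dx, double cutoff) {
    double sum = 0.0;
    for (double v : f)
        if (v >= cutoff) sum += v * dx;
    return sum;
}

// Bisection on the cutoff C. The mass above C never increases as C grows,
// so [lo, hi] is narrowed until the covered fraction is within epsilon of
// confLevel or the iteration limit is hit. Returns the last cutoff tried.
double FindCutoff(const std::vector<double>& f, double dx,
                  double confLevel, double epsilon, int maxIter = 200) {
    assert(!f.empty() && confLevel > 0.0 && confLevel < 1.0);
    const double full = MassAbove(f, dx, 0.0);  // normalization (assumes F >= 0)
    double lo = 0.0;
    double hi = *std::max_element(f.begin(), f.end());
    double c = 0.5 * (lo + hi);
    for (int i = 0; i < maxIter; ++i) {
        c = 0.5 * (lo + hi);
        const double cl = MassAbove(f, dx, c) / full;
        if (std::fabs(cl - confLevel) < epsilon) break;
        if (cl > confLevel) lo = c;  // too much mass covered: raise the cutoff
        else hi = c;                 // too little mass covered: lower the cutoff
    }
    return c;
}
```

On a coarse grid the covered fraction changes in visible steps as C moves, so a very small epsilon may never be satisfied; the iteration limit guards against that case.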
Using a binned data set: (the default method)
This is the binned analog of the continuous integrative method that uses the kernel-estimated PDF. The points in the Markov Chain are put into a binned data set, and the interval is then calculated by adding the heights of the bins in decreasing order until the desired level of confidence has been reached. Note that this means the actual confidence level is >= the confidence level prescribed by the client (unless the user calls SetHistStrict(kFALSE)). This method is the default but may not remain so in future releases, so you may wish to configure it explicitly by calling SetUseKeys(false).
These are not the only ways for the confidence interval to be determined; other possibilities are being considered, especially for the 1-dimensional case.
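The bin-summing procedure of the binned method can be sketched in plain C++. This is an illustrative sketch only, not the RooStats code: HistInterval and IntervalFromBins are hypothetical names, bins of equal height are not grouped together, and the behaviour shown for strict == false is an assumed reading of SetHistStrict(kFALSE), in which the covered fraction is kept at or below the requested level.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <limits>
#include <vector>

struct HistInterval {
    double cutoff;     // smallest bin height included in the interval
    double confLevel;  // fraction of the total weight actually covered
};

// Add bin heights in decreasing order until the requested level is reached.
// strict == true : stop at the first bin that brings coverage >= confLevel.
// strict == false: stop before the bin that would push coverage above confLevel.
HistInterval IntervalFromBins(std::vector<double> heights, double confLevel,
                              bool strict = true) {
    assert(!heights.empty() && confLevel > 0.0 && confLevel < 1.0);
    std::sort(heights.begin(), heights.end(), std::greater<double>());
    double total = 0.0;
    for (double h : heights) total += h;
    assert(total > 0.0);

    // Start with an empty interval: no bin height reaches an infinite cutoff.
    HistInterval result{std::numeric_limits<double>::infinity(), 0.0};
    double sum = 0.0;
    for (double h : heights) {
        const double next = (sum + h) / total;
        if (!strict && next > confLevel) break;
        sum += h;
        result = {h, sum / total};
        if (strict && result.confLevel >= confLevel) break;
    }
    return result;
}
```

With bin heights 5, 3 and 2 and a requested level of 0.7, the strict variant includes the two tallest bins and covers 0.8, while the non-strict variant includes only the tallest bin and covers 0.5.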
One can ask an MCMCInterval for the lower and upper limits on a specific parameter of interest in the interval. Note that this works better for some distributions (ones with exactly one local maximum) than others, and sometimes has little value.
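A small sketch shows why limits are more informative for single-peaked distributions. It assumes a uniformly binned 1-D histogram and a cutoff height that has already been determined; LimitsFromBins is a hypothetical helper, not part of RooStats. The limits are taken as the outermost edges of the bins that reach the cutoff, so any low bins lying between two peaks end up inside the reported range.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Uniform 1-D binning starting at xMin with bin width `width`.
// Returns (lower, upper): the lowest lower edge and the highest upper edge
// among bins whose height is >= cutoff. Bins between those edges are not
// checked, so a gap between two peaks is silently spanned.
std::pair<double, double> LimitsFromBins(const std::vector<double>& heights,
                                         double xMin, double width, double cutoff) {
    bool found = false;
    double lower = 0.0, upper = 0.0;
    for (std::size_t i = 0; i < heights.size(); ++i) {
        if (heights[i] < cutoff) continue;
        const double lo = xMin + i * width;
        if (!found) { lower = lo; found = true; }
        upper = lo + width;
    }
    assert(found && "no bin reaches the cutoff");
    return {lower, upper};
}
```

For a single-peaked histogram the pair describes the interval exactly. For a two-peaked histogram such as heights 9, 1, 0, 1, 9 with cutoff 9, the limits cover the whole axis even though only the two outer bins belong to the interval.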
virtual Double_t | CalcConfLevel(Double_t cutoff, Double_t full) |
virtual void | CreateDataHist() |
virtual void | CreateHist() |
virtual void | CreateKeysDataHist() |
virtual void | CreateKeysPdf() |
virtual void | CreateSparseHist() |
virtual void | CreateVector(RooRealVar* param) |
virtual void | DetermineByDataHist() |
virtual void | DetermineByHist() |
virtual void | DetermineByKeys() |
virtual void | DetermineBySparseHist() |
virtual void | DetermineInterval() |
virtual void | DetermineShortestInterval() |
virtual void | DetermineTailFractionInterval() |
virtual void | TObject::DoError(int level, const char* location, const char* fmt, va_list va) const |
void | TObject::MakeZombie() |
Bool_t | AcceptableConfLevel(Double_t confLevel) |
Bool_t | WithinDeltaFraction(Double_t a, Double_t b) |
enum { | DEFAULT_NUM_BINS | |
}; | ||
enum IntervalType { | kShortest | |
kTailFraction | ||
}; | ||
enum TObject::EStatusBits { | kCanDelete | |
kMustCleanup | ||
kObjInCanvas | ||
kIsReferenced | ||
kHasUUID | ||
kCannotPick | ||
kNoContextMenu | ||
kInvalidObject | ||
}; | ||
enum TObject::[unnamed] { | kIsOnHeap | |
kNotDeleted | ||
kZombie | ||
kBitMask | ||
kSingleKey | ||
kOverwrite | ||
kWriteDelete | ||
}; |
RooRealVar** | fAxes | array of pointers to RooRealVars representing the axes |
RooStats::MarkovChain* | fChain | the markov chain |
Double_t | fConfidenceLevel | Requested confidence level (e.g. 0.95 for 95% CL) |
RooRealVar* | fCutoffVar | cutoff variable to use for integrating keys pdf |
RooDataHist* | fDataHist | the binned Markov Chain data |
Double_t | fDelta | topCutoff (a) considered == bottomCutoff (b) iff (TMath::Abs(a - b) < TMath::Abs(fDelta * (a + b)/2)) |
Int_t | fDimension | number of variables |
Double_t | fEpsilon | acceptable error for Keys interval determination |
Double_t | fFull | Value of integral of fProduct |
RooStats::Heaviside* | fHeaviside | the Heaviside function |
TH1* | fHist | the binned Markov Chain data |
Double_t | fHistConfLevel | the actual conf level determined by hist |
Double_t | fHistCutoff | cutoff bin size to be in interval |
RooStats::MCMCInterval::IntervalType | fIntervalType | |
Bool_t | fIsHistStrict | whether the specified confidence level is a floor for the actual confidence level (strict), or a ceiling (not strict) |
Double_t | fKeysConfLevel | the actual conf level determined by keys |
Double_t | fKeysCutoff | cutoff keys pdf value to be in interval |
RooDataHist* | fKeysDataHist | data hist representing product |
RooNDKeysPdf* | fKeysPdf | the kernel estimation pdf |
Double_t | fLeftSideTF | left side tail-fraction for interval |
TString | TNamed::fName | object identifier |
Int_t | fNumBurnInSteps | number of steps to discard as burn in, starting from the first |
RooArgSet | fParameters | parameters of interest for this interval |
RooProduct* | fProduct | the (keysPdf * heaviside) product |
THnSparse* | fSparseHist | the binned Markov Chain data |
Double_t | fTFConfLevel | the actual conf level of tail-fraction interval |
Double_t | fTFLower | lower limit of the tail-fraction interval |
Double_t | fTFUpper | upper limit of the tail-fraction interval |
TString | TNamed::fTitle | object title |
Bool_t | fUseKeys | whether to use kernel estimation |
Bool_t | fUseSparseHist | whether to use sparse hist (vs. RooDataHist) |
Double_t | fVecWeight | sum of weights of all entries in fVector |
vector<Int_t> | fVector | vector containing the Markov chain data |
kbelasco: Check here for a memory leak. Does RooNDKeysPdf use the RooArgList passed to it, or does it make a clone? Also check for a memory leak from the chain: does RooNDKeysPdf clone that?
get the desired confidence level (see GetActualConfidenceLevel())
{return fConfidenceLevel;}
whether the specified confidence level is a floor for the actual confidence level (strict), or a ceiling (not strict)
{ fIsHistStrict = isHistStrict; }
Set the MarkovChain that this interval is based on
{ fChain = &chain; }
return a list of RooRealVars representing the axes. You own the returned RooArgList
set the number of steps in the chain to discard as burn-in, starting from the first
{ fNumBurnInSteps = numBurnInSteps; }
set whether to use kernel estimation to determine the interval
{ fUseKeys = useKeys; }
set whether to use a sparse histogram. you MUST also call SetUseKeys(kFALSE) to use a histogram.
{ fUseSparseHist = useSparseHist; }
get whether we used kernel estimation to determine the interval
{ return fUseKeys; }
get the number of steps in the chain to discard as burn-in, starting from the first
{ return fNumBurnInSteps; }
Get the number of parameters of interest in this interval
{ return fDimension; }
Get the markov chain on which this interval is based. You do not own the returned MarkovChain*
{ return fChain; }
Get a clone of the markov chain on which this interval is based as a RooDataSet. You own the returned RooDataSet*
{ return fChain->GetAsDataSet(whichVars); }
Get the markov chain on which this interval is based as a RooDataSet. You do not own the returned RooDataSet*
{ return fChain->GetAsConstDataSet(); }
Get a clone of the markov chain on which this interval is based as a RooDataHist. You own the returned RooDataHist*
{ return fChain->GetAsDataHist(whichVars); }
Get a clone of the markov chain on which this interval is based as a THnSparse. You own the returned THnSparse*
{ return fChain->GetAsSparseHist(whichVars); }
Get a clone of the weight variable from the markov chain
{ return fChain->GetWeightVar(); }
Set the type of interval to find. This will only have an effect for
1-D intervals. If there is more than one parameter of interest, then a
"shortest" interval will always be used, since it generalizes directly
to N dimensions.
{ fIntervalType = intervalType; }
set the left-side tail fraction for a tail-fraction interval
{ fLeftSideTF = a; }
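A tail-fraction interval for a single parameter can be sketched from sorted samples. The sketch assumes that the left-side tail fraction a is the share of the excluded probability (1 - confidence level) removed from the lower tail, with the remainder removed from the upper tail; that interpretation, the unweighted samples, and the name TailFractionInterval are assumptions for illustration, not taken from the RooStats source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Tail-fraction interval from unweighted samples of one parameter.
// Assumption: a fraction `leftTF` of the excluded probability (1 - confLevel)
// is cut from the lower tail and the rest from the upper tail.
std::pair<double, double> TailFractionInterval(std::vector<double> samples,
                                               double confLevel, double leftTF) {
    assert(!samples.empty());
    assert(confLevel > 0.0 && confLevel < 1.0);
    assert(leftTF >= 0.0 && leftTF <= 1.0);
    std::sort(samples.begin(), samples.end());
    const double n = static_cast<double>(samples.size());
    const double leftCut = leftTF * (1.0 - confLevel);
    const double rightCut = (1.0 - leftTF) * (1.0 - confLevel);
    // Number of samples dropped from each end, rounded down so that the
    // retained fraction is never below confLevel.
    const std::size_t dropLow = static_cast<std::size_t>(leftCut * n);
    const std::size_t dropHigh = static_cast<std::size_t>(rightCut * n);
    return {samples[dropLow], samples[samples.size() - 1 - dropHigh]};
}
```

Under this reading, a left-side tail fraction of 0.5 gives a central interval, 0 gives an upper limit (nothing is cut from the lower tail), and 1 gives a lower limit.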
kbelasco: The inner workings of the class really should not be exposed like this in a comment, but it seems to be the only way to give the user any control over this process, if desired. Set the fraction delta such that topCutoff (a) is considered == bottomCutoff (b) iff (TMath::Abs(a - b) < TMath::Abs(fDelta * (a + b)/2)) when determining the confidence interval by Keys.
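The comparison stated above can be written as a standalone function. This sketch takes delta as an explicit argument for testability, whereas the class method WithinDeltaFraction(Double_t a, Double_t b) listed earlier presumably reads it from fDelta; std::fabs stands in for TMath::Abs.

```cpp
#include <cmath>

// True when a and b differ by less than the fraction `delta` of their mean,
// i.e. |a - b| < |delta * (a + b) / 2|.
bool WithinDeltaFraction(double a, double b, double delta) {
    return std::fabs(a - b) < std::fabs(delta * (a + b) / 2.0);
}
```

Because the inequality is strict, two identical cutoffs are never considered equal when delta is zero, and a larger delta makes the Keys search accept a wider gap between the top and bottom cutoffs.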