With default parameters the macro will attempt to run the standard hist2workspace example and read the ROOT file that it produces.
This uses a modified version of the profile likelihood ratio as a test statistic for upper limits (e.g. test stat = 0 if muhat > mu).
Using the observed data, one defines a set of parameter points to be tested, each given by a value of the parameter of interest and the conditional MLE (i.e. profiled) values of the nuisance parameters.
At each parameter point, pseudo-experiments are generated using this fixed reference model and then the test statistic is evaluated. The auxiliary measurements (global observables) associated with the constraint terms on the nuisance parameters are also fluctuated when generating the pseudo-experiments, in a frequentist manner, forming an 'unconditional ensemble'. One could instead form a 'conditional' ensemble in which these auxiliary measurements are held fixed. Note that the nuisance parameters themselves are not randomized (randomizing them would be a Bayesian procedure), but they are floating in the fits. For each point, the threshold that defines the 95% acceptance region is found. Together, these acceptance regions form a "Confidence Belt".
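As an illustration of how a single threshold of the belt is obtained, here is a minimal sketch in plain C++ (no ROOT). It assumes a toy model that is not the one used by the macro: one Gaussian measurement x ~ N(mu, 1) with no nuisance parameters, for which -log(lambda(mu)) reduces to 0.5 * (x - mu)^2. The names `toyTestStat` and `acceptanceThreshold` are hypothetical and are not part of RooStats.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Toy stand-in for the test statistic (assumption, not the macro's model):
// for one Gaussian measurement x ~ N(mu, 1), -log(lambda(mu)) = 0.5 * (x - mu)^2.
double toyTestStat(double x, double mu)
{
   return 0.5 * (x - mu) * (x - mu);
}

// Upper edge of the acceptance region at one scan point: the confidenceLevel
// quantile of the test statistic over nToys pseudo-experiments generated at mu.
double acceptanceThreshold(double mu, std::size_t nToys, double confidenceLevel, unsigned seed)
{
   assert(nToys > 0 && confidenceLevel > 0. && confidenceLevel < 1.);
   std::mt19937 rng(seed);
   std::normal_distribution<double> gen(mu, 1.0);

   std::vector<double> stats(nToys);
   for (double &s : stats)
      s = toyTestStat(gen(rng), mu); // one pseudo-experiment per entry

   std::sort(stats.begin(), stats.end());
   const std::size_t idx = static_cast<std::size_t>(confidenceLevel * (nToys - 1));
   return stats[idx];
}
```

Calling `acceptanceThreshold` once per scan point and storing the results gives a one-dimensional belt. In the real macro the pseudo-experiments also fluctuate the global observables and every evaluation involves a fit, which is why the construction is expensive.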
After constructing the confidence belt, one can find the confidence interval for any particular dataset by finding the intersection of the observed test statistic and the confidence belt. First this is done on the observed data to get an observed 1-sided upper limit.
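The inversion step can be sketched in plain C++ using the numbers printed in the sample output further down: the 20 "this test stat" values and the upper edges of the corresponding acceptance regions. The `ScanPoint` and `Interval` structs and the functions `sampleBelt` and `invertBelt` are hypothetical helpers written for this illustration. They are not RooStats classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ScanPoint {
   double poi;       // tested value of SigXsecOverSM
   double observed;  // test statistic evaluated on the observed data
   double threshold; // upper edge of the 95% acceptance region
};

struct Interval {
   std::size_t nAccepted; // number of scan points inside the interval
   double lower;          // smallest accepted poi (0 if none accepted)
   double upper;          // largest accepted poi (0 if none accepted)
};

// Values copied from the NeymanConstruction lines of the sample output.
std::vector<ScanPoint> sampleBelt()
{
   const double observed[20] = {1.54009,  1.12265,  0.7727,    0.488747,    0.270286,
                                0.116895, 0.0272646, 0.000124673, 0.0345285, 0.129724,
                                0.284455, 0.497383, 0.767683,  1.09484,     1.47726,
                                1.91416,  2.40455,  2.94758,   3.54228,     4.18788};
   const double threshold[20] = {1.25525, 1.40915, 1.50661, 1.65905, 1.78779,
                                 1.97144, 2.23383, 1.96757, 2.05116, 2.19007,
                                 1.86276, 2.01379, 2.26953, 2.17449, 2.14316,
                                 1.70746, 1.71386, 1.36182, 1.38914, 1.36873};
   std::vector<ScanPoint> belt;
   for (int i = 0; i < 20; ++i)
      belt.push_back({0.075 + 0.15 * i, observed[i], threshold[i]}); // bin centres of 20 bins on [0, 3]
   return belt;
}

// A scan point is accepted when the observed test statistic lies inside
// the acceptance region, i.e. does not exceed the threshold at that point.
Interval invertBelt(const std::vector<ScanPoint> &belt)
{
   Interval result{0, 0., 0.};
   for (const ScanPoint &p : belt) {
      if (p.observed > p.threshold)
         continue;
      if (result.nAccepted == 0 || p.poi < result.lower)
         result.lower = p.poi;
      if (result.nAccepted == 0 || p.poi > result.upper)
         result.upper = p.poi;
      ++result.nAccepted;
   }
   return result;
}
```

With these inputs `invertBelt(sampleBelt())` accepts 14 points and spans 0.225 to 2.175, matching the "14 points in interval" and "95% interval on SigXsecOverSM" lines of the output.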
Finally, the expected limit and bands (from background-only) are formed by generating background-only data and finding the upper limit for each pseudo-dataset. The background-only model is defined such that the nuisance parameters are fixed to their best-fit values based on the data with the signal rate fixed to 0. The bands are done by hand for now; they will later be part of the RooStats tools.
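The band edges are quantiles of the distribution of background-only upper limits, taken at the normal probabilities for -2, -1, 0, +1 and +2 sigma. The sketch below is a simplified stand-in rather than the macro's own bookkeeping: it sorts a vector of toy upper limits instead of reading them off a cumulative histogram. The function names `normalCdf` and `bandEdge` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standard normal CDF: the fraction of toys expected below an nSigma fluctuation.
double normalCdf(double nSigma)
{
   return 0.5 * std::erfc(-nSigma / std::sqrt(2.0));
}

// Upper limit below which a fraction normalCdf(nSigma) of the toy limits fall.
// nSigma = 0 gives the median expected limit, +-1 and +-2 give the band edges.
double bandEdge(std::vector<double> toyLimits, double nSigma)
{
   assert(!toyLimits.empty());
   std::sort(toyLimits.begin(), toyLimits.end());
   const double p = normalCdf(nSigma);
   // Nearest-rank index into the sorted toy limits.
   const std::size_t idx = static_cast<std::size_t>(std::lround(p * (toyLimits.size() - 1)));
   return toyLimits[idx];
}
```

Because the scan uses a finite number of points, the toy upper limits take only a few discrete values, so the band edges are quantized to the scan spacing as well.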
On a technical note, this technique IS the generalization of Feldman-Cousins with nuisance parameters.
Building the confidence belt can be computationally expensive. Once it is built, one could save it to a file and use it in a separate step.
We can use PROOF to speed things along in parallel. However, the test statistic has to be installed on the workers, so either turn off PROOF or include the modified test statistic in your $ROOTSYS/roofit/roostats/inc directory, add the additional line to the LinkDef.h file, and recompile ROOT.
Note, if you have a boundary on the parameter of interest (e.g. cross-section), the threshold on the two-sided test statistic starts off at moderate values and plateaus.
[#0] PROGRESS:Generation -- generated toys: 500 / 999
NeymanConstruction: Prog: 12/50 total MC = 39 this test stat = 0
SigXsecOverSM=0.69 alpha_syst1=0.136515 alpha_syst3=0.425415 beta_syst2=1.08496 [-1e+30, 0.011215] in interval = 1
This output tells you the values of the parameters being used to generate the pseudo-experiments; the threshold in this case is 0.011215. One would expect for 95% that the threshold would be ~1.35 once the cross-section is far enough away from 0 that it is essentially unaffected by the boundary. As one reaches the last points in the scan, the threshold starts to get artificially high. This is because the range of the parameter in the fit is the same as the range in the scan. In the future these should be independently controlled, but they are not now. As a result, the ~50% of pseudo-experiments that have an upward fluctuation end up with muhat = muMax. Because of this, the upper range of the parameter should be well above the expected upper limit, but not too high, or one will need a very large value of nPointsToScan to resolve the relevant region. This can be improved, but this is the first version of this script.
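For orientation, the asymptotic value of the threshold can be computed directly. The sketch below assumes that the test statistic is -log(lambda), not -2 log(lambda), and that 2 * (-log(lambda)) follows a chi-square distribution with one degree of freedom far from the boundary. Under those assumptions the ~1.35 quoted above is the one-sided 95% value, while a two-sided 95% acceptance region gives about 1.92, which is closer to the mid-scan thresholds in the sample output below. The function names are hypothetical, and the normal quantile is found by simple bisection to keep the example free of external libraries.

```cpp
#include <cassert>
#include <cmath>

// Standard normal CDF.
double normalCdf(double z)
{
   return 0.5 * std::erfc(-z / std::sqrt(2.0));
}

// Inverse of the standard normal CDF, found by bisection on [-10, 10].
double normalQuantile(double p)
{
   assert(p > 0. && p < 1.);
   double lo = -10., hi = 10.;
   for (int i = 0; i < 200; ++i) {
      const double mid = 0.5 * (lo + hi);
      if (normalCdf(mid) < p)
         lo = mid;
      else
         hi = mid;
   }
   return 0.5 * (lo + hi);
}

// Two-sided case: -log(lambda) = z^2 / 2 with both tails excluded,
// so the threshold is half the chi-square(1) quantile at the given CL.
double twoSidedThreshold(double confidenceLevel)
{
   const double z = normalQuantile(0.5 * (1.0 + confidenceLevel));
   return 0.5 * z * z;
}

// One-sided case (test stat = 0 if muhat > mu): only one tail is excluded.
double oneSidedThreshold(double confidenceLevel)
{
   const double z = normalQuantile(confidenceLevel);
   return 0.5 * z * z;
}
```

Thresholds obtained from toys scatter around these values because only a few hundred pseudo-experiments are generated per point.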
Important note: when the model includes external constraint terms, like a Gaussian constraint on a nuisance parameter centered around some nominal value, there is a subtlety. The asymptotic results are all based on the assumption that all the measurements fluctuate, including the nominal values from auxiliary measurements. If these do not fluctuate, this corresponds to a "conditional ensemble". The result is that the distribution of the test statistic can become very non-chi^2, which leads to thresholds that become very large.
Found data and ModelConfig:
=== Using the following for ModelConfig ===
Observables: RooArgSet:: = (obs_x_channel1,channelCat)
Parameters of Interest: RooArgSet:: = (SigXsecOverSM)
Nuisance Parameters: RooArgSet:: = (alpha_syst2,alpha_syst3,gamma_stat_channel1_bin_0,gamma_stat_channel1_bin_1)
Global Observables: RooArgSet:: = (nominalLumi,nom_alpha_syst1,nom_alpha_syst2,nom_alpha_syst3,nom_gamma_stat_channel1_bin_0,nom_gamma_stat_channel1_bin_1)
PDF: RooSimultaneous::simPdf[ indexCat=channelCat channel1=model_channel1 ] = 0.190787
FeldmanCousins: ntoys per point = 499
FeldmanCousins: nEvents per toy will fluctuate about expectation
will use global observables for unconditional ensemble
RooArgSet:: = (nominalLumi,nom_alpha_syst1,nom_alpha_syst2,nom_alpha_syst3,nom_gamma_stat_channel1_bin_0,nom_gamma_stat_channel1_bin_1)
=== Using the following for ModelConfig ===
Observables: RooArgSet:: = (obs_x_channel1,channelCat)
Parameters of Interest: RooArgSet:: = (SigXsecOverSM)
Nuisance Parameters: RooArgSet:: = (alpha_syst2,alpha_syst3,gamma_stat_channel1_bin_0,gamma_stat_channel1_bin_1)
Global Observables: RooArgSet:: = (nominalLumi,nom_alpha_syst1,nom_alpha_syst2,nom_alpha_syst3,nom_gamma_stat_channel1_bin_0,nom_gamma_stat_channel1_bin_1)
PDF: RooSimultaneous::simPdf[ indexCat=channelCat channel1=model_channel1 ] = 0.190787
FeldmanCousins: Model has nuisance parameters, will do profile construction
FeldmanCousins: # points to test = 20
lookup index = 0
NeymanConstruction: Prog: 1/20 total MC = 499 this test stat = 1.54009
SigXsecOverSM=0.075 alpha_syst2=0.656138 alpha_syst3=0.244593 gamma_stat_channel1_bin_0=1.03396 gamma_stat_channel1_bin_1=1.04971 [-inf, 1.25525] in interval = 0
NeymanConstruction: Prog: 2/20 total MC = 499 this test stat = 1.12265
SigXsecOverSM=0.225 alpha_syst2=0.562087 alpha_syst3=0.218282 gamma_stat_channel1_bin_0=1.02843 gamma_stat_channel1_bin_1=1.04073 [-inf, 1.40915] in interval = 1
NeymanConstruction: Prog: 3/20 total MC = 499 this test stat = 0.7727
SigXsecOverSM=0.375 alpha_syst2=0.457123 alpha_syst3=0.184027 gamma_stat_channel1_bin_0=1.02317 gamma_stat_channel1_bin_1=1.03439 [-inf, 1.50661] in interval = 1
NeymanConstruction: Prog: 4/20 total MC = 499 this test stat = 0.488747
SigXsecOverSM=0.525 alpha_syst2=0.356387 alpha_syst3=0.15007 gamma_stat_channel1_bin_0=1.01805 gamma_stat_channel1_bin_1=1.02809 [-inf, 1.65905] in interval = 1
NeymanConstruction: Prog: 5/20 total MC = 499 this test stat = 0.270286
SigXsecOverSM=0.675 alpha_syst2=0.259065 alpha_syst3=0.116822 gamma_stat_channel1_bin_0=1.01309 gamma_stat_channel1_bin_1=1.02183 [-inf, 1.78779] in interval = 1
NeymanConstruction: Prog: 6/20 total MC = 499 this test stat = 0.116895
SigXsecOverSM=0.825 alpha_syst2=0.159293 alpha_syst3=0.099895 gamma_stat_channel1_bin_0=1.00789 gamma_stat_channel1_bin_1=1.01569 [-inf, 1.97144] in interval = 1
NeymanConstruction: Prog: 7/20 total MC = 499 this test stat = 0.0272646
SigXsecOverSM=0.975 alpha_syst2=0.0707634 alpha_syst3=0.0527046 gamma_stat_channel1_bin_0=1.00355 gamma_stat_channel1_bin_1=1.00975 [-inf, 2.23383] in interval = 1
NeymanConstruction: Prog: 8/20 total MC = 499 this test stat = 0.000124673
SigXsecOverSM=1.125 alpha_syst2=-0.0163928 alpha_syst3=0.00771932 gamma_stat_channel1_bin_0=0.999196 gamma_stat_channel1_bin_1=1.00336 [-inf, 1.96757] in interval = 1
NeymanConstruction: Prog: 9/20 total MC = 499 this test stat = 0.0345285
SigXsecOverSM=1.275 alpha_syst2=-0.0986159 alpha_syst3=-0.0389895 gamma_stat_channel1_bin_0=0.995081 gamma_stat_channel1_bin_1=0.997784 [-inf, 2.05116] in interval = 1
NeymanConstruction: Prog: 10/20 total MC = 499 this test stat = 0.129724
SigXsecOverSM=1.425 alpha_syst2=-0.187205 alpha_syst3=-0.0491003 gamma_stat_channel1_bin_0=0.990771 gamma_stat_channel1_bin_1=0.991205 [-inf, 2.19007] in interval = 1
NeymanConstruction: Prog: 11/20 total MC = 499 this test stat = 0.284455
SigXsecOverSM=1.575 alpha_syst2=-0.26955 alpha_syst3=-0.0814599 gamma_stat_channel1_bin_0=0.986772 gamma_stat_channel1_bin_1=0.985267 [-inf, 1.86276] in interval = 1
NeymanConstruction: Prog: 12/20 total MC = 499 this test stat = 0.497383
SigXsecOverSM=1.725 alpha_syst2=-0.349331 alpha_syst3=-0.113443 gamma_stat_channel1_bin_0=0.982921 gamma_stat_channel1_bin_1=0.979383 [-inf, 2.01379] in interval = 1
NeymanConstruction: Prog: 13/20 total MC = 499 this test stat = 0.767683
SigXsecOverSM=1.875 alpha_syst2=-0.426406 alpha_syst3=-0.145041 gamma_stat_channel1_bin_0=0.979212 gamma_stat_channel1_bin_1=0.973556 [-inf, 2.26953] in interval = 1
NeymanConstruction: Prog: 14/20 total MC = 499 this test stat = 1.09484
SigXsecOverSM=2.025 alpha_syst2=-0.500642 alpha_syst3=-0.176255 gamma_stat_channel1_bin_0=0.975641 gamma_stat_channel1_bin_1=0.967788 [-inf, 2.17449] in interval = 1
NeymanConstruction: Prog: 15/20 total MC = 499 this test stat = 1.47726
SigXsecOverSM=2.175 alpha_syst2=-0.570354 alpha_syst3=-0.210605 gamma_stat_channel1_bin_0=0.9722 gamma_stat_channel1_bin_1=0.962111 [-inf, 2.14316] in interval = 1
NeymanConstruction: Prog: 16/20 total MC = 499 this test stat = 1.91416
SigXsecOverSM=2.325 alpha_syst2=-0.638736 alpha_syst3=-0.240819 gamma_stat_channel1_bin_0=0.968884 gamma_stat_channel1_bin_1=0.956461 [-inf, 1.70746] in interval = 0
NeymanConstruction: Prog: 17/20 total MC = 499 this test stat = 2.40455
SigXsecOverSM=2.475 alpha_syst2=-0.704252 alpha_syst3=-0.270579 gamma_stat_channel1_bin_0=0.96569 gamma_stat_channel1_bin_1=0.950877 [-inf, 1.71386] in interval = 0
NeymanConstruction: Prog: 18/20 total MC = 499 this test stat = 2.94758
SigXsecOverSM=2.625 alpha_syst2=-0.767005 alpha_syst3=-0.299874 gamma_stat_channel1_bin_0=0.962614 gamma_stat_channel1_bin_1=0.945359 [-inf, 1.36182] in interval = 0
NeymanConstruction: Prog: 19/20 total MC = 499 this test stat = 3.54228
SigXsecOverSM=2.775 alpha_syst2=-0.82716 alpha_syst3=-0.328689 gamma_stat_channel1_bin_0=0.959653 gamma_stat_channel1_bin_1=0.939908 [-inf, 1.38914] in interval = 0
NeymanConstruction: Prog: 20/20 total MC = 499 this test stat = 4.18788
SigXsecOverSM=2.925 alpha_syst2=-0.884931 alpha_syst3=-0.357011 gamma_stat_channel1_bin_0=0.9568 gamma_stat_channel1_bin_1=0.934526 [-inf, 1.36873] in interval = 0
[#1] INFO:Eval -- 14 points in interval
95% interval on SigXsecOverSM is : [0.225, 2.175]
[#1] INFO:Minimization -- p.d.f. provides expected number of events, including extended term in likelihood.
[#1] INFO:Minimization -- Including the following constraint terms in minimization: (lumiConstraint,alpha_syst1Constraint,alpha_syst2Constraint,alpha_syst3Constraint,gamma_stat_channel1_bin_0_constraint,gamma_stat_channel1_bin_1_constraint)
[#1] INFO:Minimization -- The global observables are not defined , normalize constraints with respect to the parameters (Lumi,SigXsecOverSM,alpha_syst1,alpha_syst2,alpha_syst3,gamma_stat_channel1_bin_0,gamma_stat_channel1_bin_1)
[#1] INFO:Fitting -- RooAbsPdf::fitTo(simPdf) fixing normalization set for coefficient determination to observables in data
[#1] INFO:Fitting -- using CPU computation library compiled with -mavx512
[#1] INFO:Minimization -- RooProfileLL::evaluate(RooEvaluatorWrapper_Profile[SigXsecOverSM]) Creating instance of MINUIT
[#1] INFO:Fitting -- RooAddition::defaultErrorLevel(nll_simPdf_obsData) Summation contains a RooNLLVar, using its error level
[#1] INFO:Minimization -- RooProfileLL::evaluate(RooEvaluatorWrapper_Profile[SigXsecOverSM]) determining minimum likelihood for current configurations w.r.t all observable
[#1] INFO:Minimization -- RooProfileLL::evaluate(RooEvaluatorWrapper_Profile[SigXsecOverSM]) minimum found at (SigXsecOverSM=1.12313)
.
Will use these parameter points to generate pseudo data for bkg only
1) 0x7d91780 RooRealVar:: alpha_syst2 = 0.710945 +/- 0.914123 L(-5 - 5) "alpha_syst2"
2) 0x7d91ce0 RooRealVar:: alpha_syst3 = 0.261483 +/- 0.929174 L(-5 - 5) "alpha_syst3"
3) 0x7d92260 RooRealVar:: gamma_stat_channel1_bin_0 = 1.03677 +/- 0.0462911 L(0 - 1.25) "gamma_stat_channel1_bin_0"
4) 0x7d92810 RooRealVar:: gamma_stat_channel1_bin_1 = 1.05318 +/- 0.0761262 L(0 - 1.5) "gamma_stat_channel1_bin_1"
5) 0x7d92df0 RooRealVar:: SigXsecOverSM = 0 +/- 0 L(0 - 3) B(20) "SigXsecOverSM"
-2 sigma band 0
-1 sigma band 0.495 [Power Constraint)]
median of band 1.095
+1 sigma band 1.545
+2 sigma band 1.995
observed 95% upper-limit 2.175
CLb strict [P(toy>obs|0)] for observed 95% upper-limit 0.975
CLb inclusive [P(toy>=obs|0)] for observed 95% upper-limit 0.975
#include <iostream>
using std::cout, std::endl;
{
filename = "results/example_combined_GaussExample_model.root";
cout << "will run standard hist2workspace example" << endl;
gROOT->ProcessLine(".! prepareHistFactory .");
gROOT->ProcessLine(".! hist2workspace config/example.xml");
cout << "\n\n---------------------" << endl;
cout << "Done creating example input" << endl;
cout << "---------------------\n\n" << endl;
}
} else
cout << "Found data and ModelConfig:" << endl;
if (!mc->GetPdf()->canBeExtended()) {
   if (data->numEntries() == 1)
      fc.FluctuateNumDataEntries(false);
   else
      cout << "Not sure what to do about this model" << endl;
}
if (mc->GetGlobalObservables()) {
   cout << "will use global observables for unconditional ensemble" << endl;
   mc->GetGlobalObservables()->Print();
}
}
std::unique_ptr<RooAbsReal> nll{mc->GetPdf()->createNLL(*data)};
std::unique_ptr<RooAbsReal> profile{nll->createProfile(*mc->GetParametersOfInterest())};
profile->getVal();
if (mc->GetNuisanceParameters())
cout << "\nWill use these parameter points to generate pseudo data for bkg only" << endl;
double CLb = 0;
histOfUL->GetXaxis()->SetTitle("Upper Limit (background only)");
histOfUL->GetYaxis()->SetTitle("Entries");
w->loadSnapshot("paramsToGenerateData");
std::unique_ptr<RooDataSet> toyData;
if (!mc->GetPdf()->canBeExtended()) {
   if (data->numEntries() == 1)
      toyData = std::unique_ptr<RooDataSet>{mc->GetPdf()->generate(*mc->GetObservables(), 1)};
   else
      cout << "Not sure what to do about this model" << endl;
} else {
   toyData = std::unique_ptr<RooDataSet>{mc->GetPdf()->generate(*mc->GetObservables(), Extended())};
}
std::unique_ptr<RooDataSet> one{mc->GetPdf()->generate(*mc->GetGlobalObservables(), 1)};
std::unique_ptr<RooArgSet> allVars{mc->GetPdf()->getVariables()};
allVars->assign(*values);
} else {
break;
}
}
}
c1->SaveAs("two-sided_upper_limit_output.pdf");
for (int i = 1; i <= cumulative->GetNbinsX(); ++i) {
if (bins[i] < 0.5)
}
cout << "-1 sigma band " << band1sigDown << " [Power Constraint)]" << endl;
cout << "\nobserved 95% upper-limit " << interval->UpperLimit(*firstPOI) << endl;
cout << "CLb strict [P(toy>obs|0)] for observed 95% upper-limit " << CLb << endl;
cout << "CLb inclusive [P(toy>=obs|0)] for observed 95% upper-limit " << CLbinclusive << endl;
}