Logo ROOT  
Reference Guide
 
Loading...
Searching...
No Matches
rs_numberCountingCombination.C File Reference

Detailed Description

View in nbviewer Open in SWAN
'Number Counting Example' RooStats tutorial macro #100

This tutorial shows an example of a combination of two searches using number counting with background uncertainty.

The macro uses a RooStats "factory" to construct a PDF that represents the two number counting analyses with background uncertainties. The uncertainties are taken into account by considering a sideband measurement of a size that corresponds to the background uncertainty. The problem has been studied in these references:

After using the factory to make the model, we use a RooStats ProfileLikelihoodCalculator for a Hypothesis test and a confidence interval. The calculator takes into account systematics by eliminating nuisance parameters with the profile likelihood. This is equivalent to the method of MINOS.

[#0] WARNING:ObjectHandling -- NumberCountingPdfFactory: changed value of tau_0 to 100.01 to be consistent with background and its uncertainty. Also stored these values of tau into workspace with name . tau_0ExpectedNumberCountingData if you test with a different dataset, you should adjust tau appropriately.
[#0] WARNING:ObjectHandling -- NumberCountingPdfFactory: changed value of tau_1 to 100.01 to be consistent with background and its uncertainty. Also stored these values of tau into workspace with name . tau_1ExpectedNumberCountingData if you test with a different dataset, you should adjust tau appropriately.
[#1] INFO:InputArguments -- The deprecated RooFit::CloneData(1) option passed to createNLL() is ignored.
[#0] PROGRESS:Minimization -- ProfileLikelihoodCalcultor::DoGLobalFit - find MLE
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: activating const optimization
[#1] INFO:Minimization -- The following expressions will be evaluated in cache-and-track mode: (sigRegion_0,sideband_0,sigRegion_1,sideband_1)
[#0] PROGRESS:Minimization -- ProfileLikelihoodCalcultor::DoMinimizeNLL - using Minuit2 / with strategy 1
[#1] INFO:Minimization --
RooFitResult: minimized FCN value: 17.6316, estimated distance to minimum: 5.21488e-14
covariance matrix quality: Full, accurate covariance matrix
Status : MINIMIZE=0
Floating Parameter FinalValue +/- Error
-------------------- --------------------------
b_0 1.0000e+02 +/- 9.99e-01
b_1 1.0000e+02 +/- 9.96e-01
masterSignal 1.0000e+00 +/- 4.78e-01
[#0] PROGRESS:Minimization -- ProfileLikelihoodCalcultor::GetHypoTest - do conditional fit
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: activating const optimization
[#0] PROGRESS:Minimization -- ProfileLikelihoodCalcultor::DoMinimizeNLL - using Minuit2 / with strategy 1
[#1] INFO:Minimization --
RooFitResult: minimized FCN value: 19.9696, estimated distance to minimum: 2.51262e-13
covariance matrix quality: Full, accurate covariance matrix
Status : MINIMIZE=0
Floating Parameter FinalValue +/- Error
-------------------- --------------------------
b_0 1.0020e+02 +/- 9.96e-01
b_1 1.0010e+02 +/- 9.95e-01
-------------------------------------------------
The p-value for the null is 0.015294
Corresponding to a significance of 2.16239
-------------------------------------------------
[#1] INFO:InputArguments -- The deprecated RooFit::CloneData(1) option passed to createNLL() is ignored.
[#0] PROGRESS:Minimization -- ProfileLikelihoodCalcultor::DoGLobalFit - find MLE
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: activating const optimization
[#1] INFO:Minimization -- The following expressions will be evaluated in cache-and-track mode: (sigRegion_0,sideband_0,sigRegion_1,sideband_1)
[#0] PROGRESS:Minimization -- ProfileLikelihoodCalcultor::DoMinimizeNLL - using Minuit2 / with strategy 1
[#1] INFO:Minimization --
RooFitResult: minimized FCN value: 17.6316, estimated distance to minimum: 7.47226e-13
covariance matrix quality: Full, accurate covariance matrix
Status : MINIMIZE=0
Floating Parameter FinalValue +/- Error
-------------------- --------------------------
b_0 1.0000e+02 +/- 9.99e-01
b_1 1.0000e+02 +/- 9.96e-01
masterSignal 1.0000e+00 +/- 4.78e-01
[#1] INFO:Minimization -- RooProfileLL::evaluate(nll_TopLevelPdf_ExpectedNumberCountingData_Profile[masterSignal]) Creating instance of MINUIT
[#1] INFO:Minimization -- RooProfileLL::evaluate(nll_TopLevelPdf_ExpectedNumberCountingData_Profile[masterSignal]) determining minimum likelihood for current configurations w.r.t all observable
[#1] INFO:Minimization -- RooProfileLL::evaluate(nll_TopLevelPdf_ExpectedNumberCountingData_Profile[masterSignal]) minimum found at (masterSignal=1.00388)
.
[#1] INFO:Minimization -- RooProfileLL::evaluate(nll_TopLevelPdf_ExpectedNumberCountingData_Profile[masterSignal]) Creating instance of MINUIT
[#1] INFO:Minimization -- RooProfileLL::evaluate(nll_TopLevelPdf_ExpectedNumberCountingData_Profile[masterSignal]) determining minimum likelihood for current configurations w.r.t all observable
[#0] ERROR:InputArguments -- RooArgSet::checkForDup: ERROR argument with name masterSignal is already in this set
[#1] INFO:Minimization -- RooProfileLL::evaluate(nll_TopLevelPdf_ExpectedNumberCountingData_Profile[masterSignal]) minimum found at (masterSignal=1.00009)
..........................................................................................................................................................................................................lower limit on master signal = 0.089069
upper limit on master signal = 2.00127
-------------------------------------------------
Consider this parameter point:
RooRealVar::masterSignal = 0 +/- 0.477961 L(0 - 3)
It is NOT in the interval.
-------------------------------------------------
-------------------------------------------------
Consider this parameter point:
RooRealVar::masterSignal = 2 +/- 0.477961 L(0 - 3)
It IS in the interval.
-------------------------------------------------
#include "RooRealVar.h"
#include <cassert>
// use this order for safety on library loading
using namespace RooFit;
using namespace RooStats;
// declare three variations on the same tutorial
void rs_numberCountingCombination_expected();
void rs_numberCountingCombination_observed();
void rs_numberCountingCombination_observedWithTau();
// -------------------------------
// main driver to choose one
void rs_numberCountingCombination(int flag = 1)
{
if (flag == 1)
rs_numberCountingCombination_expected();
if (flag == 2)
rs_numberCountingCombination_observed();
if (flag == 3)
rs_numberCountingCombination_observedWithTau();
}
// -------------------------------
void rs_numberCountingCombination_expected()
{
/////////////////////////////////////////
// An example of a number counting combination with two channels.
// We consider both hypothesis testing and the equivalent confidence interval.
/////////////////////////////////////////
/////////////////////////////////////////
// The Model building stage
/////////////////////////////////////////
// Step 1, define arrays with signal & bkg expectations and background uncertainties
Double_t s[2] = {20., 10.}; // expected signal
Double_t b[2] = {100., 100.}; // expected background
Double_t db[2] = {.0100, .0100}; // fractional background uncertainty
// Step 2, use a RooStats factory to build a PDF for a
// number counting combination and add it to the workspace.
// We need to give the signal expectation to relate the masterSignal
// to the signal contribution in the individual channels.
// The model neglects correlations in background uncertainty,
// but they could be added without much change to the example.
RooWorkspace *wspace = new RooWorkspace();
f.AddModel(s, 2, wspace, "TopLevelPdf", "masterSignal");
// Step 3, use a RooStats factory to add datasets to the workspace.
// Step 3a.
// Add the expected data to the workspace
f.AddExpData(s, b, db, 2, wspace, "ExpectedNumberCountingData");
// see below for a printout of the workspace
// wspace->Print(); //uncomment to see structure of workspace
/////////////////////////////////////////
// The Hypothesis testing stage:
/////////////////////////////////////////
// Step 4, Define the null hypothesis for the calculator
// Here you need to know the name of the variables corresponding to hypothesis.
RooRealVar *mu = wspace->var("masterSignal");
RooArgSet *poi = new RooArgSet(*mu);
RooArgSet *nullParams = new RooArgSet("nullParams");
nullParams->addClone(*mu);
// here we explicitly set the value of the parameters for the null
nullParams->setRealValue("masterSignal", 0);
// Step 5, Create a calculator for doing the hypothesis test.
// because this is a
ProfileLikelihoodCalculator plc(*wspace->data("ExpectedNumberCountingData"), *wspace->pdf("TopLevelPdf"), *poi, 0.05,
nullParams);
// Step 6, Use the Calculator to get a HypoTestResult
HypoTestResult *htr = plc.GetHypoTest();
assert(htr != 0);
cout << "-------------------------------------------------" << endl;
cout << "The p-value for the null is " << htr->NullPValue() << endl;
cout << "Corresponding to a significance of " << htr->Significance() << endl;
cout << "-------------------------------------------------\n\n" << endl;
/* expected case should return:
-------------------------------------------------
The p-value for the null is 0.015294
Corresponding to a significance of 2.16239
-------------------------------------------------
*/
//////////////////////////////////////////
// Confidence Interval Stage
// Step 8, Here we re-use the ProfileLikelihoodCalculator to return a confidence interval.
// We need to specify what are our parameters of interest
RooArgSet *paramsOfInterest = nullParams; // they are the same as before in this case
plc.SetParameters(*paramsOfInterest);
LikelihoodInterval *lrint = (LikelihoodInterval *)plc.GetInterval();
lrint->SetConfidenceLevel(0.95);
// Step 9, make a plot of the likelihood ratio and the interval obtained
// paramsOfInterest->setRealValue("masterSignal",1.);
// find limits
double lower = lrint->LowerLimit(*mu);
double upper = lrint->UpperLimit(*mu);
LikelihoodIntervalPlot lrPlot(lrint);
lrPlot.SetMaximum(3.);
lrPlot.Draw();
// Step 10a. Get upper and lower limits
cout << "lower limit on master signal = " << lower << endl;
cout << "upper limit on master signal = " << upper << endl;
// Step 10b, Ask if masterSignal=0 is in the interval.
// Note, this is equivalent to the question of a 2-sigma hypothesis test:
// "is the parameter point masterSignal=0 inside the 95% confidence interval?"
// Since the significance of the Hypothesis test was > 2-sigma it should not be:
// eg. we exclude masterSignal=0 at 95% confidence.
paramsOfInterest->setRealValue("masterSignal", 0.);
cout << "-------------------------------------------------" << endl;
std::cout << "Consider this parameter point:" << std::endl;
paramsOfInterest->first()->Print();
if (lrint->IsInInterval(*paramsOfInterest))
std::cout << "It IS in the interval." << std::endl;
else
std::cout << "It is NOT in the interval." << std::endl;
cout << "-------------------------------------------------\n\n" << endl;
// Step 10c, We also ask about the parameter point masterSignal=2, which is inside the interval.
paramsOfInterest->setRealValue("masterSignal", 2.);
cout << "-------------------------------------------------" << endl;
std::cout << "Consider this parameter point:" << std::endl;
paramsOfInterest->first()->Print();
if (lrint->IsInInterval(*paramsOfInterest))
std::cout << "It IS in the interval." << std::endl;
else
std::cout << "It is NOT in the interval." << std::endl;
cout << "-------------------------------------------------\n\n" << endl;
delete lrint;
delete htr;
delete wspace;
delete poi;
delete nullParams;
/*
// Here's an example of what is in the workspace
// wspace->Print();
RooWorkspace(NumberCountingWS) Number Counting WS contents
variables
---------
(x_0,masterSignal,expected_s_0,b_0,y_0,tau_0,x_1,expected_s_1,b_1,y_1,tau_1)
p.d.f.s
-------
RooProdPdf::joint[ pdfs=(sigRegion_0,sideband_0,sigRegion_1,sideband_1) ] = 2.20148e-08
RooPoisson::sigRegion_0[ x=x_0 mean=splusb_0 ] = 0.036393
RooPoisson::sideband_0[ x=y_0 mean=bTau_0 ] = 0.00398939
RooPoisson::sigRegion_1[ x=x_1 mean=splusb_1 ] = 0.0380088
RooPoisson::sideband_1[ x=y_1 mean=bTau_1 ] = 0.00398939
functions
--------
RooAddition::splusb_0[ set1=(s_0,b_0) set2=() ] = 120
RooProduct::s_0[ compRSet=(masterSignal,expected_s_0) compCSet=() ] = 20
RooProduct::bTau_0[ compRSet=(b_0,tau_0) compCSet=() ] = 10000
RooAddition::splusb_1[ set1=(s_1,b_1) set2=() ] = 110
RooProduct::s_1[ compRSet=(masterSignal,expected_s_1) compCSet=() ] = 10
RooProduct::bTau_1[ compRSet=(b_1,tau_1) compCSet=() ] = 10000
datasets
--------
RooDataSet::ExpectedNumberCountingData(x_0,y_0,x_1,y_1)
embedded pre-calculated expensive components
-------------------------------------------
*/
}
void rs_numberCountingCombination_observed()
{
/////////////////////////////////////////
// The same example with observed data in a main
// measurement and an background-only auxiliary
// measurement with a factor tau more background
// than in the main measurement.
/////////////////////////////////////////
// The Model building stage
/////////////////////////////////////////
// Step 1, define arrays with signal & bkg expectations and background uncertainties
// We still need the expectation to relate signal in different channels with the master signal
Double_t s[2] = {20., 10.}; // expected signal
// Step 2, use a RooStats factory to build a PDF for a
// number counting combination and add it to the workspace.
// We need to give the signal expectation to relate the masterSignal
// to the signal contribution in the individual channels.
// The model neglects correlations in background uncertainty,
// but they could be added without much change to the example.
RooWorkspace *wspace = new RooWorkspace();
f.AddModel(s, 2, wspace, "TopLevelPdf", "masterSignal");
// Step 3, use a RooStats factory to add datasets to the workspace.
// Add the observed data to the workspace
Double_t mainMeas[2] = {123., 117.}; // observed main measurement
Double_t bkgMeas[2] = {111.23, 98.76}; // observed background
Double_t dbMeas[2] = {.011, .0095}; // observed fractional background uncertainty
f.AddData(mainMeas, bkgMeas, dbMeas, 2, wspace, "ObservedNumberCountingData");
// see below for a printout of the workspace
// wspace->Print(); //uncomment to see structure of workspace
/////////////////////////////////////////
// The Hypothesis testing stage:
/////////////////////////////////////////
// Step 4, Define the null hypothesis for the calculator
// Here you need to know the name of the variables corresponding to hypothesis.
RooRealVar *mu = wspace->var("masterSignal");
RooArgSet *poi = new RooArgSet(*mu);
RooArgSet *nullParams = new RooArgSet("nullParams");
nullParams->addClone(*mu);
// here we explicitly set the value of the parameters for the null
nullParams->setRealValue("masterSignal", 0);
// Step 5, Create a calculator for doing the hypothesis test.
// because this is a
ProfileLikelihoodCalculator plc(*wspace->data("ObservedNumberCountingData"), *wspace->pdf("TopLevelPdf"), *poi, 0.05,
nullParams);
wspace->var("tau_0")->Print();
wspace->var("tau_1")->Print();
// Step 7, Use the Calculator to get a HypoTestResult
HypoTestResult *htr = plc.GetHypoTest();
cout << "-------------------------------------------------" << endl;
cout << "The p-value for the null is " << htr->NullPValue() << endl;
cout << "Corresponding to a significance of " << htr->Significance() << endl;
cout << "-------------------------------------------------\n\n" << endl;
/* observed case should return:
-------------------------------------------------
The p-value for the null is 0.0351669
Corresponding to a significance of 1.80975
-------------------------------------------------
*/
//////////////////////////////////////////
// Confidence Interval Stage
// Step 8, Here we re-use the ProfileLikelihoodCalculator to return a confidence interval.
// We need to specify what are our parameters of interest
RooArgSet *paramsOfInterest = nullParams; // they are the same as before in this case
plc.SetParameters(*paramsOfInterest);
LikelihoodInterval *lrint = (LikelihoodInterval *)plc.GetInterval();
lrint->SetConfidenceLevel(0.95);
// Step 9c. Get upper and lower limits
cout << "lower limit on master signal = " << lrint->LowerLimit(*mu) << endl;
cout << "upper limit on master signal = " << lrint->UpperLimit(*mu) << endl;
delete lrint;
delete htr;
delete wspace;
delete nullParams;
delete poi;
}
void rs_numberCountingCombination_observedWithTau()
{
/////////////////////////////////////////
// The same example with observed data in a main
// measurement and an background-only auxiliary
// measurement with a factor tau more background
// than in the main measurement.
/////////////////////////////////////////
// The Model building stage
/////////////////////////////////////////
// Step 1, define arrays with signal & bkg expectations and background uncertainties
// We still need the expectation to relate signal in different channels with the master signal
Double_t s[2] = {20., 10.}; // expected signal
// Step 2, use a RooStats factory to build a PDF for a
// number counting combination and add it to the workspace.
// We need to give the signal expectation to relate the masterSignal
// to the signal contribution in the individual channels.
// The model neglects correlations in background uncertainty,
// but they could be added without much change to the example.
RooWorkspace *wspace = new RooWorkspace();
f.AddModel(s, 2, wspace, "TopLevelPdf", "masterSignal");
// Step 3, use a RooStats factory to add datasets to the workspace.
// Add the observed data to the workspace in the on-off problem.
Double_t mainMeas[2] = {123., 117.}; // observed main measurement
Double_t sideband[2] = {11123., 9876.}; // observed sideband
Double_t tau[2] = {100., 100.}; // ratio of bkg in sideband to bkg in main measurement, from experimental design.
f.AddDataWithSideband(mainMeas, sideband, tau, 2, wspace, "ObservedNumberCountingDataWithSideband");
// see below for a printout of the workspace
// wspace->Print(); //uncomment to see structure of workspace
/////////////////////////////////////////
// The Hypothesis testing stage:
/////////////////////////////////////////
// Step 4, Define the null hypothesis for the calculator
// Here you need to know the name of the variables corresponding to hypothesis.
RooRealVar *mu = wspace->var("masterSignal");
RooArgSet *poi = new RooArgSet(*mu);
RooArgSet *nullParams = new RooArgSet("nullParams");
nullParams->addClone(*mu);
// here we explicitly set the value of the parameters for the null
nullParams->setRealValue("masterSignal", 0);
// Step 5, Create a calculator for doing the hypothesis test.
// because this is a
ProfileLikelihoodCalculator plc(*wspace->data("ObservedNumberCountingDataWithSideband"), *wspace->pdf("TopLevelPdf"),
*poi, 0.05, nullParams);
// Step 7, Use the Calculator to get a HypoTestResult
HypoTestResult *htr = plc.GetHypoTest();
cout << "-------------------------------------------------" << endl;
cout << "The p-value for the null is " << htr->NullPValue() << endl;
cout << "Corresponding to a significance of " << htr->Significance() << endl;
cout << "-------------------------------------------------\n\n" << endl;
/* observed case should return:
-------------------------------------------------
The p-value for the null is 0.0352035
Corresponding to a significance of 1.80928
-------------------------------------------------
*/
//////////////////////////////////////////
// Confidence Interval Stage
// Step 8, Here we re-use the ProfileLikelihoodCalculator to return a confidence interval.
// We need to specify what are our parameters of interest
RooArgSet *paramsOfInterest = nullParams; // they are the same as before in this case
plc.SetParameters(*paramsOfInterest);
LikelihoodInterval *lrint = (LikelihoodInterval *)plc.GetInterval();
lrint->SetConfidenceLevel(0.95);
// Step 9c. Get upper and lower limits
cout << "lower limit on master signal = " << lrint->LowerLimit(*mu) << endl;
cout << "upper limit on master signal = " << lrint->UpperLimit(*mu) << endl;
delete lrint;
delete htr;
delete wspace;
delete nullParams;
delete poi;
}
#define b(i)
Definition RSha256.hxx:100
#define f(i)
Definition RSha256.hxx:104
double Double_t
Definition RtypesCore.h:59
void Print(Option_t *options=nullptr) const override
Print the object to the defaultPrintStream().
Definition RooAbsArg.h:322
RooAbsArg * first() const
bool setRealValue(const char *name, double newVal=0.0, bool verbose=false)
Set value of a RooAbsRealLValue stored in set with given name to newVal No error messages are printed...
virtual RooAbsArg * addClone(const RooAbsArg &var, bool silent=false)
Add a clone of the specified argument to list.
RooArgSet is a container object that can hold multiple RooAbsArg objects.
Definition RooArgSet.h:55
RooRealVar represents a variable that can be changed from the outside.
Definition RooRealVar.h:37
HypoTestResult is a base class for results from hypothesis tests.
virtual double Significance() const
familiar name for the Null p-value in terms of 1-sided Gaussian significance
virtual double NullPValue() const
Return p-value for null hypothesis.
This class provides simple and straightforward utilities to plot a LikelihoodInterval object.
LikelihoodInterval is a concrete implementation of the RooStats::ConfInterval interface.
double UpperLimit(const RooRealVar &param)
return the upper bound of the interval on a given parameter
void SetConfidenceLevel(double cl) override
set the confidence level for the interval (e.g 0.682 for a 1-sigma interval)
double LowerLimit(const RooRealVar &param)
return the lower bound of the interval on a given parameter
bool IsInInterval(const RooArgSet &) const override
check if given point is in the interval
A factory for building PDFs and data for a number counting combination.
The ProfileLikelihoodCalculator is a concrete implementation of CombinedCalculator (the interface cla...
Persistable container for RooFit projects.
RooAbsPdf * pdf(RooStringView name) const
Retrieve p.d.f (RooAbsPdf) with given name. A null pointer is returned if not found.
RooRealVar * var(RooStringView name) const
Retrieve real-valued variable (RooRealVar) with given name. A null pointer is returned if not found.
RooAbsData * data(RooStringView name) const
Retrieve dataset (binned or unbinned) with given name. A null pointer is returned if not found.
The namespace RooFit contains mostly switches that change the behaviour of functions of PDFs (or othe...
Definition JSONIO.h:26
Namespace for the RooStats classes.
Definition Asimov.h:19
Author
Kyle Cranmer

Definition in file rs_numberCountingCombination.C.