Logo ROOT  
Reference Guide
 
Loading...
Searching...
No Matches
MultivariateGaussianTest.C File Reference

Detailed Description

View in nbviewer Open in SWAN
Comparison of MCMC and PLC in a multi-variate gaussian problem

This tutorial produces an N-dimensional multivariate Gaussian with a non-trivial covariance matrix. By default N=4 (called "dim").

A subset of these are considered parameters of interest. This problem is tractable analytically.

We use this mainly as a test of Markov Chain Monte Carlo and we compare the result to the profile likelihood ratio.

We use the proposal helper to create a customized proposal function for this problem.

For N=4 and 2 parameters of interest it takes about 10-20 seconds and the acceptance rate is 37%

[#1] INFO:Fitting -- RooAbsPdf::fitTo(mvg_over_mvg_Int[x0,x1,x2,x3]) fixing normalization set for coefficient determination to observables in data
[#1] INFO:Fitting -- using CPU computation library compiled with -mavx2
[#1] INFO:Fitting -- RooAddition::defaultErrorLevel(nll_mvg_over_mvg_Int[x0,x1,x2,x3]_mvgData) Summation contains a RooNLLVar, using its error level
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: activating const optimization
Minuit2Minimizer: Minimize with max-calls 2000 convergence for edm < 1 strategy 1
Minuit2Minimizer : Valid minimum - status = 0
FVAL = 706.560063684865781
Edm = 9.79361278577427602e-06
Nfcn = 68
mu_x0 = 0.180728 +/- 0.17298 (limited)
mu_x1 = 0.207351 +/- 0.172978 (limited)
mu_x2 = -0.0159412 +/- 0.172984 (limited)
mu_x3 = 0.12343 +/- 0.172982 (limited)
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: deactivating const optimization
[#1] INFO:Fitting -- RooAbsPdf::fitTo(mvg_over_mvg_Int[x0,x1,x2,x3]) fixing normalization set for coefficient determination to observables in data
Metropolis-Hastings progress: ....................................................................................................
[#1] INFO:Eval -- Proposal acceptance rate: 37.1%
[#1] INFO:Eval -- Number of steps in chain: 3710
[#1] INFO:InputArguments -- The deprecated RooFit::CloneData(1) option passed to createNLL() is ignored.
[#1] INFO:Fitting -- RooAbsPdf::fitTo(mvg_over_mvg_Int[x0,x1,x2,x3]) fixing normalization set for coefficient determination to observables in data
[#0] PROGRESS:Minimization -- ProfileLikelihoodCalcultor::DoGLobalFit - find MLE
[#1] INFO:Fitting -- RooAddition::defaultErrorLevel(nll_mvg_over_mvg_Int[x0,x1,x2,x3]_mvgData) Summation contains a RooNLLVar, using its error level
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: activating const optimization
[#0] PROGRESS:Minimization -- ProfileLikelihoodCalcultor::DoMinimizeNLL - using Minuit2 / Migrad with strategy 1
[#1] INFO:Minimization --
RooFitResult: minimized FCN value: 706.56, estimated distance to minimum: 3.96149e-12
covariance matrix quality: Full, accurate covariance matrix
Status : MINIMIZE=0
Floating Parameter FinalValue +/- Error
-------------------- --------------------------
mu_x0 1.8118e-01 +/- 1.73e-01
mu_x1 2.0792e-01 +/- 1.73e-01
mu_x2 -1.6067e-02 +/- 1.73e-01
mu_x3 1.2371e-01 +/- 1.73e-01
[#1] INFO:Minimization -- RooProfileLL::evaluate(RooEvaluatorWrapper_Profile[mu_x0,mu_x1]) Creating instance of MINUIT
[#1] INFO:Fitting -- RooAddition::defaultErrorLevel(nll_mvg_over_mvg_Int[x0,x1,x2,x3]_mvgData) Summation contains a RooNLLVar, using its error level
[#1] INFO:Minimization -- RooProfileLL::evaluate(RooEvaluatorWrapper_Profile[mu_x0,mu_x1]) determining minimum likelihood for current configurations w.r.t all observable
[#1] INFO:Minimization -- RooProfileLL::evaluate(RooEvaluatorWrapper_Profile[mu_x0,mu_x1]) minimum found at (mu_x0=0.181184, mu_x1=0.207918)
..[#1] INFO:Minimization -- LikelihoodInterval - Finding the contour of mu_x0 ( 0 ) and mu_x1 ( 1 )
MCMC interval on p0: [-0.28, 0.6]
MCMC interval on p1: [-0.2, 0.6]
Real time 0:00:01, CP time 1.460
#include "RooGlobalFunc.h"
#include <cstdlib>
#include "TMatrixDSym.h"
#include "RooArgList.h"
#include "RooRealVar.h"
#include "TH2F.h"
#include "TCanvas.h"
#include "RooAbsReal.h"
#include "RooFitResult.h"
#include "TStopwatch.h"
using std::cout, std::endl;
using namespace RooFit;
using namespace RooStats;
void MultivariateGaussianTest(Int_t dim = 4, Int_t nPOI = 2)
{
// let's time this challenging example
t.Start();
RooArgList xVec;
RooArgList muVec;
RooArgSet poi;
// make the observable and means
Int_t i, j;
RooRealVar *mu_x;
for (i = 0; i < dim; i++) {
char *name = Form("x%d", i);
x = new RooRealVar(name, name, 0, -3, 3);
xVec.add(*x);
char *mu_name = Form("mu_x%d", i);
mu_x = new RooRealVar(mu_name, mu_name, 0, -2, 2);
muVec.add(*mu_x);
}
// put them into the list of parameters of interest
for (i = 0; i < nPOI; i++) {
poi.add(*muVec.at(i));
}
// make a covariance matrix that is all 1's
TMatrixDSym cov(dim);
for (i = 0; i < dim; i++) {
for (j = 0; j < dim; j++) {
if (i == j)
cov(i, j) = 3.;
else
cov(i, j) = 1.0;
}
}
// now make the multivariate Gaussian
RooMultiVarGaussian mvg("mvg", "mvg", xVec, muVec, cov);
// --------------------
// make a toy dataset
std::unique_ptr<RooDataSet> data{mvg.generate(xVec, 100)};
// now create the model config for this problem
RooWorkspace *w = new RooWorkspace("MVG");
ModelConfig modelConfig(w);
modelConfig.SetPdf(mvg);
modelConfig.SetParametersOfInterest(poi);
// -------------------------------------------------------
// Setup calculators
// MCMC
// we want to setup an efficient proposal function
// using the covariance matrix from a fit to the data
std::unique_ptr<RooFitResult> fit{mvg.fitTo(*data, Save(true))};
ph.SetVariables((RooArgSet &)fit->floatParsFinal());
ph.SetCovMatrix(fit->covarianceMatrix());
ph.SetCacheSize(100);
// now create the calculator
MCMCCalculator mc(*data, modelConfig);
mc.SetConfidenceLevel(0.95);
mc.SetNumBurnInSteps(100);
mc.SetNumIters(10000);
mc.SetNumBins(50);
mc.SetProposalFunction(*pdfProp);
MCMCInterval *mcInt = mc.GetInterval();
RooArgList *poiList = mcInt->GetAxes();
// now setup the profile likelihood calculator
ProfileLikelihoodCalculator plc(*data, modelConfig);
plc.SetConfidenceLevel(0.95);
LikelihoodInterval *plInt = (LikelihoodInterval *)plc.GetInterval();
// make some plots
MCMCIntervalPlot mcPlot(*mcInt);
TCanvas *c1 = new TCanvas();
mcPlot.SetLineColor(kGreen);
mcPlot.SetLineWidth(2);
mcPlot.Draw();
LikelihoodIntervalPlot plPlot(plInt);
plPlot.Draw("same");
if (poiList->getSize() == 1) {
RooRealVar *p = (RooRealVar *)poiList->at(0);
Double_t ll = mcInt->LowerLimit(*p);
Double_t ul = mcInt->UpperLimit(*p);
cout << "MCMC interval: [" << ll << ", " << ul << "]" << endl;
}
if (poiList->getSize() == 2) {
RooRealVar *p0 = (RooRealVar *)poiList->at(0);
RooRealVar *p1 = (RooRealVar *)poiList->at(1);
TCanvas *scatter = new TCanvas();
Double_t ll = mcInt->LowerLimit(*p0);
Double_t ul = mcInt->UpperLimit(*p0);
cout << "MCMC interval on p0: [" << ll << ", " << ul << "]" << endl;
ll = mcInt->LowerLimit(*p1);
ul = mcInt->UpperLimit(*p1);
cout << "MCMC interval on p1: [" << ll << ", " << ul << "]" << endl;
// MCMC interval on p0: [-0.2, 0.6]
// MCMC interval on p1: [-0.2, 0.6]
mcPlot.DrawChainScatter(*p0, *p1);
scatter->Update();
}
t.Print();
}
int Int_t
Definition RtypesCore.h:45
double Double_t
Definition RtypesCore.h:59
@ kGreen
Definition Rtypes.h:66
winID h TVirtualViewer3D TVirtualGLPainter p
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void data
char name[80]
Definition TGX11.cxx:110
char * Form(const char *fmt,...)
Formats a string in a circular formatting buffer.
Definition TString.cxx:2489
Int_t getSize() const
Return the number of elements in the collection.
virtual bool add(const RooAbsArg &var, bool silent=false)
Add the specified argument to list.
RooArgList is a container object that can hold multiple RooAbsArg objects.
Definition RooArgList.h:22
RooAbsArg * at(Int_t idx) const
Return object at given index, or nullptr if index is out of range.
Definition RooArgList.h:110
RooArgSet is a container object that can hold multiple RooAbsArg objects.
Definition RooArgSet.h:24
Multivariate Gaussian p.d.f.
Variable that can be changed from the outside.
Definition RooRealVar.h:37
This class provides simple and straightforward utilities to plot a LikelihoodInterval object.
LikelihoodInterval is a concrete implementation of the RooStats::ConfInterval interface.
Bayesian Calculator estimating an interval or a credible region using the Markov-Chain Monte Carlo me...
This class provides simple and straightforward utilities to plot a MCMCInterval object.
MCMCInterval is a concrete implementation of the RooStats::ConfInterval interface.
virtual double UpperLimit(RooRealVar &param)
get the highest value of param that is within the confidence interval
virtual RooArgList * GetAxes()
return a list of RooRealVars representing the axes you own the returned RooArgList
virtual double LowerLimit(RooRealVar &param)
get the lowest value of param that is within the confidence interval
ModelConfig is a simple class that holds configuration information specifying how a model should be u...
Definition ModelConfig.h:35
The ProfileLikelihoodCalculator is a concrete implementation of CombinedCalculator (the interface cla...
ProposalFunction is an interface for all proposal functions that would be used with a Markov Chain Mo...
virtual void SetCovMatrix(const TMatrixDSym &covMatrix)
set the covariance matrix to use for a multi-variate Gaussian proposal
virtual ProposalFunction * GetProposalFunction()
Get the ProposalFunction that we've been designing.
virtual void SetVariables(RooArgList &vars)
virtual void SetCacheSize(Int_t size)
virtual void SetUpdateProposalParameters(bool updateParams)
Persistable container for RooFit projects.
The Canvas class.
Definition TCanvas.h:23
void Update() override
Update canvas pad buffers.
Definition TCanvas.cxx:2489
Stopwatch class.
Definition TStopwatch.h:28
void Start(Bool_t reset=kTRUE)
Start the stopwatch.
void Print(Option_t *option="") const override
Print the real and cpu time passed between the start and stop events.
RooCmdArg Save(bool flag=true)
return c1
Definition legend1.C:41
Double_t x[n]
Definition legend1.C:17
fit(model, train_loader, val_loader, num_epochs, batch_size, optimizer, criterion, save_best, scheduler)
The namespace RooFit contains mostly switches that change the behaviour of functions of PDFs (or othe...
Definition CodegenImpl.h:64
Namespace for the RooStats classes.
Definition CodegenImpl.h:58
Authors
Kevin Belasco, Kyle Cranmer

Definition in file MultivariateGaussianTest.C.