Logo ROOT  
Reference Guide
 
Loading...
Searching...
No Matches
MultivariateGaussianTest.py File Reference

Namespaces

namespace  MultivariateGaussianTest
 

Detailed Description

View in nbviewer Open in SWAN
Comparison of MCMC and PLC in a multi-variate gaussian problem

This tutorial produces an N-dimensional multivariate Gaussian with a non-trivial covariance matrix. By default N=4 (called "dim").

A subset of these are considered parameters of interest. This problem is tractable analytically.

We use this mainly as a test of Markov Chain Monte Carlo and we compare the result to the profile likelihood ratio.

We use the proposal helper to create a customized proposal function for this problem.

For N=4 and 2 parameters of interest it takes about 10-20 seconds and the acceptance rate is 37%

[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: activating const optimization
**********
** 1 **SET PRINT 1
**********
**********
** 2 **SET NOGRAD
**********
PARAMETER DEFINITIONS:
NO. NAME VALUE STEP SIZE LIMITS
1 mu_x0 0.00000e+00 4.00000e-01 -2.00000e+00 2.00000e+00
2 mu_x1 0.00000e+00 4.00000e-01 -2.00000e+00 2.00000e+00
3 mu_x2 0.00000e+00 4.00000e-01 -2.00000e+00 2.00000e+00
4 mu_x3 0.00000e+00 4.00000e-01 -2.00000e+00 2.00000e+00
**********
** 3 **SET ERR 0.5
**********
**********
** 4 **SET PRINT 1
**********
**********
** 5 **SET STR 1
**********
NOW USING STRATEGY 1: TRY TO BALANCE SPEED AGAINST RELIABILITY
**********
** 6 **MIGRAD 2000 1
**********
FIRST CALL TO USER FUNCTION AT NEW START POINT, WITH IFLAG=4.
START MIGRAD MINIMIZATION. STRATEGY 1. CONVERGENCE WHEN EDM .LT. 1.00e-03
FCN=707.822 FROM MIGRAD STATUS=INITIATE 14 CALLS 15 TOTAL
EDM= unknown STRATEGY= 1 NO ERROR MATRIX
EXT PARAMETER CURRENT GUESS STEP FIRST
NO. NAME VALUE ERROR SIZE DERIVATIVE
1 mu_x0 0.00000e+00 4.00000e-01 2.01358e-01 -9.83942e+00
2 mu_x1 0.00000e+00 4.00000e-01 2.01358e-01 -1.25127e+01
3 mu_x2 0.00000e+00 4.00000e-01 2.01358e-01 9.88572e+00
4 mu_x3 0.00000e+00 4.00000e-01 2.01358e-01 -4.09155e+00
ERR DEF= 0.5
MIGRAD MINIMIZATION HAS CONVERGED.
MIGRAD WILL VERIFY CONVERGENCE AND ERROR MATRIX.
COVARIANCE MATRIX CALCULATED SUCCESSFULLY
FCN=706.56 FROM MIGRAD STATUS=CONVERGED 65 CALLS 66 TOTAL
EDM=1.95881e-05 STRATEGY= 1 ERROR MATRIX ACCURATE
EXT PARAMETER STEP FIRST
NO. NAME VALUE ERROR SIZE DERIVATIVE
1 mu_x0 1.80728e-01 1.72980e-01 1.42731e-03 -2.59592e-02
2 mu_x1 2.07351e-01 1.72978e-01 1.42884e-03 -3.69006e-02
3 mu_x2 -1.59412e-02 1.72984e-01 1.42230e-03 3.21539e-02
4 mu_x3 1.23430e-01 1.72982e-01 1.42472e-03 -8.04153e-03
ERR DEF= 0.5
EXTERNAL ERROR MATRIX. NDIM= 25 NPAR= 4 ERR DEF=0.5
3.000e-02 9.997e-03 9.998e-03 9.997e-03
9.997e-03 3.000e-02 9.998e-03 9.997e-03
9.998e-03 9.998e-03 3.000e-02 9.998e-03
9.997e-03 9.997e-03 9.998e-03 3.000e-02
PARAMETER CORRELATION COEFFICIENTS
NO. GLOBAL 1 2 3 4
1 0.44715 1.000 0.333 0.333 0.333
2 0.44715 0.333 1.000 0.333 0.333
3 0.44717 0.333 0.333 1.000 0.333
4 0.44716 0.333 0.333 0.333 1.000
**********
** 7 **SET ERR 0.5
**********
**********
** 8 **SET PRINT 1
**********
**********
** 9 **HESSE 2000
**********
COVARIANCE MATRIX CALCULATED SUCCESSFULLY
FCN=706.56 FROM HESSE STATUS=OK 23 CALLS 89 TOTAL
EDM=1.95881e-05 STRATEGY= 1 ERROR MATRIX ACCURATE
EXT PARAMETER INTERNAL INTERNAL
NO. NAME VALUE ERROR STEP SIZE VALUE
1 mu_x0 1.80728e-01 1.72984e-01 2.85462e-04 9.04875e-02
2 mu_x1 2.07351e-01 1.72982e-01 2.85769e-04 1.03862e-01
3 mu_x2 -1.59412e-02 1.72987e-01 2.84460e-04 -7.97069e-03
4 mu_x3 1.23430e-01 1.72986e-01 2.84944e-04 6.17540e-02
ERR DEF= 0.5
EXTERNAL ERROR MATRIX. NDIM= 25 NPAR= 4 ERR DEF=0.5
3.000e-02 9.999e-03 9.999e-03 9.999e-03
9.999e-03 3.000e-02 9.999e-03 9.999e-03
9.999e-03 9.999e-03 3.000e-02 9.999e-03
9.999e-03 9.999e-03 9.999e-03 3.000e-02
PARAMETER CORRELATION COEFFICIENTS
NO. GLOBAL 1 2 3 4
1 0.44720 1.000 0.333 0.333 0.333
2 0.44719 0.333 1.000 0.333 0.333
3 0.44720 0.333 0.333 1.000 0.333
4 0.44720 0.333 0.333 0.333 1.000
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: deactivating const optimization
Metropolis-Hastings progress: ....................................................................................................
[#1] INFO:Eval -- Proposal acceptance rate: 37.1%
[#1] INFO:Eval -- Number of steps in chain: 3710
[#1] INFO:InputArguments -- The deprecated RooFit::CloneData(1) option passed to createNLL() is ignored.
[#0] PROGRESS:Minimization -- ProfileLikelihoodCalcultor::DoGLobalFit - find MLE
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: activating const optimization
[#0] PROGRESS:Minimization -- ProfileLikelihoodCalcultor::DoMinimizeNLL - using Minuit / Migrad with strategy 1
[#1] INFO:Minimization --
RooFitResult: minimized FCN value: 706.56, estimated distance to minimum: 1.16082e-08
covariance matrix quality: Full, accurate covariance matrix
Status : MINIMIZE=0
Floating Parameter FinalValue +/- Error
-------------------- --------------------------
mu_x0 1.8119e-01 +/- 1.73e-01
mu_x1 2.0792e-01 +/- 1.73e-01
mu_x2 -1.6078e-02 +/- 1.73e-01
mu_x3 1.2370e-01 +/- 1.73e-01
[#1] INFO:Minimization -- RooProfileLL::evaluate(nll_mvg_mvgData_Profile[mu_x1,mu_x0]) Creating instance of MINUIT
[#1] INFO:Minimization -- RooProfileLL::evaluate(nll_mvg_mvgData_Profile[mu_x1,mu_x0]) determining minimum likelihood for current configurations w.r.t all observable
[#1] INFO:Minimization -- RooProfileLL::evaluate(nll_mvg_mvgData_Profile[mu_x1,mu_x0]) minimum found at (mu_x0=0.181184, mu_x1=0.207917)
..[#1] INFO:Minimization -- LikelihoodInterval - Finding the contour of mu_x1 ( 1 ) and mu_x0 ( 0 )
Real time 0:00:02, CP time 2.970
MCMC interval on p0: [-0.19999999999999996, 0.6000000000000001]
MCMC interval on p1: [-0.28, 0.6000000000000001]
import ROOT
dim = 4
nPOI = 2
# let's time this challenging example
t = ROOT.TStopwatch()
t.Start()
xVec = []
muVec = []
poi = set()
# make the observable and means
for i in range(dim):
name = "x{}".format(i)
x = ROOT.RooRealVar(name, name, 0, -3, 3)
xVec.append(x)
mu_name = "mu_x{}".format(i)
mu_x = ROOT.RooRealVar(mu_name, mu_name, 0, -2, 2)
muVec.append(mu_x)
# put them into the list of parameters of interest
for i in range(nPOI):
poi.add(muVec[i])
# make a covariance matrix that is all 1's
cov = ROOT.TMatrixDSym(dim)
for i in range(dim):
for j in range(dim):
if i == j:
cov[i, j] = 3.0
else:
cov[i, j] = 1.0
# now make the multivariate Gaussian
mvg = ROOT.RooMultiVarGaussian("mvg", "mvg", xVec, muVec, cov)
# --------------------
# make a toy dataset
data = mvg.generate(xVec, 100)
# now create the model config for this problem
w = ROOT.RooWorkspace("MVG")
modelConfig = ROOT.RooStats.ModelConfig(w)
modelConfig.SetPdf(mvg)
modelConfig.SetParametersOfInterest(poi)
# -------------------------------------------------------
# Setup calculators
# MCMC
# we want to setup an efficient proposal function
# using the covariance matrix from a fit to the data
fit = mvg.fitTo(data, Save=True)
ph = ROOT.RooStats.ProposalHelper()
ph.SetVariables(fit.floatParsFinal())
ph.SetCovMatrix(fit.covarianceMatrix())
ph.SetUpdateProposalParameters(True)
ph.SetCacheSize(100)
pdfProp = ph.GetProposalFunction()
# now create the calculator
mc = ROOT.RooStats.MCMCCalculator(data, modelConfig)
mc.SetConfidenceLevel(0.95)
mc.SetNumBurnInSteps(100)
mc.SetNumIters(10000)
mc.SetNumBins(50)
mc.SetProposalFunction(pdfProp)
mcInt = mc.GetInterval()
poiList = mcInt.GetAxes()
# now setup the profile likelihood calculator
plc = ROOT.RooStats.ProfileLikelihoodCalculator(data, modelConfig)
plc.SetConfidenceLevel(0.95)
plInt = plc.GetInterval()
# make some plots
mcPlot = ROOT.RooStats.MCMCIntervalPlot(mcInt)
c1 = ROOT.TCanvas()
mcPlot.SetLineColor(ROOT.kGreen)
mcPlot.SetLineWidth(2)
mcPlot.Draw()
plPlot = ROOT.RooStats.LikelihoodIntervalPlot(plInt)
plPlot.Draw("same")
if len(poiList) == 1:
p = poiList.at(0)
print("MCMC interval: [{}, {}]".format(mcInt.LowerLimit(p), mcInt.UpperLimit(p)))
if len(poiList) == 2:
p0 = poiList.at(0)
p1 = poiList.at(1)
scatter = ROOT.TCanvas()
print("MCMC interval on p0: [{}, {}]".format(mcInt.LowerLimit(p0), mcInt.UpperLimit(p0)))
print("MCMC interval on p1: [{}, {}]".format(mcInt.LowerLimit(p1), mcInt.UpperLimit(p1)))
# MCMC interval on p0: [-0.2, 0.6]
# MCMC interval on p1: [-0.2, 0.6]
mcPlot.DrawChainScatter(p0, p1)
scatter.Update()
scatter.SaveAs("MultivariateGaussianTest_scatter.png")
t.Stop()
t.Print()
c1.SaveAs("MultivariateGaussianTest_plot.png")
# TODO: The MCMCCalculator has to be destructed first. Otherwise, we can get
# segmentation faults depending on the destruction order, which is random in
# Python. Probably the issue is that some object has a non-owning pointer to
# another object, which it uses in its destructor. This should be fixed either
# in the design of RooStats in C++, or with phythonizations.
del mc
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t Int_t Int_t Window_t TString Int_t GCValues_t GetPrimarySelectionOwner GetDisplay GetScreen GetColormap GetNativeEvent const char const char dpyName wid window const char font_name cursor keysym reg const char only_if_exist regb h Point_t winding char text const char depth char const char Int_t count const char ColorStruct_t color const char Pixmap_t Pixmap_t PictureAttributes_t attr const char char ret_data h unsigned char height h Atom_t Int_t ULong_t ULong_t unsigned char prop_list Atom_t Atom_t Atom_t Time_t UChar_t len
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t Int_t Int_t Window_t TString Int_t GCValues_t GetPrimarySelectionOwner GetDisplay GetScreen GetColormap GetNativeEvent const char const char dpyName wid window const char font_name cursor keysym reg const char only_if_exist regb h Point_t winding char text const char depth char const char Int_t count const char ColorStruct_t color const char Pixmap_t Pixmap_t PictureAttributes_t attr const char char ret_data h unsigned char height h Atom_t Int_t ULong_t ULong_t unsigned char prop_list Atom_t Atom_t Atom_t Time_t format
Date
July 2022
Authors
Artem Busorgin, Kevin Belasco and Kyle Cranmer (C++ version)

Definition in file MultivariateGaussianTest.py.