A tutorial that explains how to solve problems with binning effects and numerical stability in binned fits.
Introduction
In this tutorial, you will learn three new things:
- How to reduce the bias in binned fits by changing the definition of the normalization integral
- How to completely get rid of binning effects by integrating the pdf over each bin
- How to improve the numerical stability of fits in which the number of events per bin varies greatly, using a constant per-bin counterterm
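The first two points can be illustrated without RooFit. The sketch below is a minimal pure-Python illustration, not part of the tutorial code. It compares the bin-center estimate f(center) * width with the exact bin integral for an exponential and for a power law, using the same range, binning, and parameter values as the fits in this tutorial. The helper names `bin_edges` and `center_over_integral` are hypothetical.

```python
import math


def bin_edges(lo, hi, n_bins):
    """Return (low, high) pairs for n_bins uniform bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    return [(lo + i * width, lo + (i + 1) * width) for i in range(n_bins)]


def center_over_integral(f, antiderivative, edges):
    """Ratio of f(center) * width to the exact integral of f, for each bin."""
    ratios = []
    for low, high in edges:
        center = 0.5 * (low + high)
        exact = antiderivative(high) - antiderivative(low)
        ratios.append(f(center) * (high - low) / exact)
    return ratios


c, a = -1.8, -0.3  # same starting values as in the tutorial
edges = bin_edges(0.1, 5.1, 10)

# Unnormalized exponential exp(c*x) with antiderivative exp(c*x) / c
expo_ratios = center_over_integral(
    lambda x: math.exp(c * x), lambda x: math.exp(c * x) / c, edges
)
# Unnormalized power law x**a with antiderivative x**(a+1) / (a+1)
power_ratios = center_over_integral(
    lambda x: x**a, lambda x: x ** (a + 1) / (a + 1), edges
)

print("exponential:", ["%.6f" % r for r in expo_ratios])
print("power law:  ", ["%.6f" % r for r in power_ratios])
```

For the exponential the ratio is the same in every bin, so a normalization computed from the same bin-center values cancels the bias exactly. For the power law the ratio changes from bin to bin, so only an integration of the pdf over each bin removes the effect.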
import ROOT
def generateBinnedAsimov(pdf, x, n_events):
    """
    Generate binned Asimov dataset for a continuous pdf.
    One should in principle be able to use
    pdf.generateBinned(x, n_events, RooFit::ExpectedData()).
    Unfortunately it has a problem: it also has the bin bias that this tutorial
    demonstrates, so if we used it, the biases would cancel out.
    """
    data_h = ROOT.RooDataHist("dataH", "dataH", {x})
    x_binning = x.getBinning()
    for i_bin in range(x.numBins()):
        # Restrict the named range "bin" to the current bin and integrate the pdf over it
        x.setRange("bin", x_binning.binLow(i_bin), x_binning.binHigh(i_bin))
        integ = pdf.createIntegral(x, NormSet=x, Range="bin")
        ROOT.SetOwnership(integ, True)
        integ.getVal()
        # Expected bin content: total number of events times the bin probability
        data_h.set(i_bin, n_events * integ.getVal(), -1)
    return data_h
def enableBinIntegrator(func, num_bins):
    """
    Force numeric integration and do this numeric integration with the
    RooBinIntegrator, which sums the function values at the bin centers.
    """
    custom_config = ROOT.RooNumIntConfig(func.getIntegratorConfig())
    custom_config.method1D().setLabel("RooBinIntegrator")
    custom_config.getConfigSection("RooBinIntegrator").setRealValue("numBins", num_bins)
    func.setIntegratorConfig(custom_config)
    func.forceNumInt(True)
def disableBinIntegrator(func):
    """
    Reset the integrator config to disable the RooBinIntegrator.
    """
    func.setIntegratorConfig()
    func.forceNumInt(False)
ROOT.RooMsgService.instance().getStream(1).removeTopic(ROOT.RooFit.Minimization)
ROOT.RooMsgService.instance().getStream(1).removeTopic(ROOT.RooFit.Fitting)
ROOT.RooMsgService.instance().getStream(1).removeTopic(ROOT.RooFit.Generation)
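# Observable with 10 uniform bins
x = ROOT.RooRealVar("x", "x", 0.1, 5.1)
x.setBins(10)
# Exponential pdf and its binned Asimov dataset
c = ROOT.RooRealVar("c", "c", -1.8, -5, 5)
expo = ROOT.RooExponential("expo", "expo", x, c)
expo_data = generateBinnedAsimov(expo, x, 10000)
# Default binned fit: the pdf is evaluated only at the bin centers,
# which biases the fitted value of c
fit1 = expo.fitTo(expo_data, Save=True, PrintLevel=-1, SumW2Error=False)
fit1.Print()
# Compute the normalization integral from the bin centers as well, then refit
enableBinIntegrator(expo, x.numBins())
fit2 = expo.fitTo(expo_data, Save=True, PrintLevel=-1, SumW2Error=False)
fit2.Print()
# Restore the default integrator configuration
disableBinIntegrator(expo)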
# Power law pdf and its binned Asimov dataset
a = ROOT.RooRealVar("a", "a", -0.3, -5.0, 5.0)
powerlaw = ROOT.RooGenericPdf("powerlaw", "std::pow(x, a)", [x, a])
powerlaw_data = generateBinnedAsimov(powerlaw, x, 10000)
# Default binned fit: biased again
fit3 = powerlaw.fitTo(powerlaw_data, Save=True, PrintLevel=-1, SumW2Error=False)
fit3.Print()
# The bin-center normalization reduces the bias for a power law but does not remove it
enableBinIntegrator(powerlaw, x.numBins())
fit4 = powerlaw.fitTo(powerlaw_data, Save=True, PrintLevel=-1, SumW2Error=False)
fit4.Print()
disableBinIntegrator(powerlaw)
# Integrate the pdf over each bin (precision 1e-3) to remove the binning effect
fit5 = powerlaw.fitTo(powerlaw_data, IntegrateBins=1e-3, Save=True, PrintLevel=-1, SumW2Error=False)
fit5.Print()
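# Finer binning for the signal-plus-background example
x.setBins(100)
# Gaussian signal on top of the exponential background, with very different yields
mu = ROOT.RooRealVar("mu", "mu", 3.0, 0.1, 5.1)
sigma = ROOT.RooRealVar("sigma", "sigma", 0.5, 0.01, 5.0)
gauss = ROOT.RooGaussian("gauss", "gauss", x, mu, sigma)
nsig = ROOT.RooRealVar("nsig", "nsig", 10000, 0, 1e9)
nbkg = ROOT.RooRealVar("nbkg", "nbkg", 10000000, 0, 1e9)
frac = ROOT.RooRealVar("frac", "frac", nsig.getVal() / (nsig.getVal() + nbkg.getVal()), 0.0, 1.0)
model = ROOT.RooAddPdf("model", "model", [gauss, expo], [nsig, nbkg])
# Binned toy dataset generated from the model
model_data = model.generateBinned(x)
# Start the fit away from the values used for generation
mu.setVal(2.0)
sigma.setVal(1.0)
# Standard fit: the bin contents differ by orders of magnitude
fit6 = model.fitTo(model_data, Save=True, PrintLevel=-1, SumW2Error=False)
fit6.Print()
# Same fit with the constant per-bin counterterm enabled
fit7 = model.fitTo(model_data, Offset="bin", Save=True, PrintLevel=-1, SumW2Error=False)
fit7.Print()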
[#1] INFO:Eval -- RooRealVar::setRange(x) new range named 'bin' created with bounds [0.1,0.6]
RooFitResult: minimized FCN value: 4754.37, estimated distance to minimum: 3.09852e-09
covariance matrix quality: Full, accurate covariance matrix
Status : MINIMIZE=0 HESSE=0
Floating Parameter FinalValue +/- Error
-------------------- --------------------------
c -1.6862e+00 +/- 1.70e-02
[#0] WARNING:Integration -- RooBinIntegrator::RooBinIntegrator WARNING: integrand provide no binning definition observable #0 substituting default binning of 10 bins
[#1] INFO:NumericIntegration -- RooRealIntegral::init(expo_Int[x]) using numeric integrator RooBinIntegrator to calculate Int(x)
RooFitResult: minimized FCN value: 4440.6, estimated distance to minimum: 5.599e-07
covariance matrix quality: Full, accurate covariance matrix
Status : MINIMIZE=0 HESSE=0
Floating Parameter FinalValue +/- Error
-------------------- --------------------------
c -1.8000e+00 +/- 1.87e-02
[#1] INFO:NumericIntegration -- RooRealIntegral::init(powerlaw_Int[x]) using numeric integrator RooRombergIntegrator to calculate Int(x)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(powerlaw_Int[x|bin]_Norm[x]) using numeric integrator RooRombergIntegrator to calculate Int(x)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(powerlaw_Int[x|bin]_Norm[x]) using numeric integrator RooRombergIntegrator to calculate Int(x)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(powerlaw_Int[x|bin]_Norm[x]) using numeric integrator RooRombergIntegrator to calculate Int(x)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(powerlaw_Int[x|bin]_Norm[x]) using numeric integrator RooRombergIntegrator to calculate Int(x)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(powerlaw_Int[x|bin]_Norm[x]) using numeric integrator RooRombergIntegrator to calculate Int(x)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(powerlaw_Int[x|bin]_Norm[x]) using numeric integrator RooRombergIntegrator to calculate Int(x)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(powerlaw_Int[x|bin]_Norm[x]) using numeric integrator RooRombergIntegrator to calculate Int(x)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(powerlaw_Int[x|bin]_Norm[x]) using numeric integrator RooRombergIntegrator to calculate Int(x)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(powerlaw_Int[x|bin]_Norm[x]) using numeric integrator RooRombergIntegrator to calculate Int(x)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(powerlaw_Int[x|bin]_Norm[x]) using numeric integrator RooRombergIntegrator to calculate Int(x)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(powerlaw_Int[x]) using numeric integrator RooRombergIntegrator to calculate Int(x)
RooFitResult: minimized FCN value: 15816.4, estimated distance to minimum: 4.97929e-07
covariance matrix quality: Full, accurate covariance matrix
Status : MINIMIZE=0 HESSE=0
Floating Parameter FinalValue +/- Error
-------------------- --------------------------
a -2.6105e-01 +/- 1.06e-02
[#0] WARNING:Integration -- RooBinIntegrator::RooBinIntegrator WARNING: integrand provide no binning definition observable #0 substituting default binning of 10 bins
[#1] INFO:NumericIntegration -- RooRealIntegral::init(powerlaw_Int[x]) using numeric integrator RooBinIntegrator to calculate Int(x)
RooFitResult: minimized FCN value: 15739.9, estimated distance to minimum: 4.99845e-07
covariance matrix quality: Full, accurate covariance matrix
Status : MINIMIZE=0 HESSE=0
Floating Parameter FinalValue +/- Error
-------------------- --------------------------
a -3.1481e-01 +/- 1.15e-02
[#1] INFO:NumericIntegration -- RooRealIntegral::init(powerlaw_Int[x]) using numeric integrator RooRombergIntegrator to calculate Int(x)
RooFitResult: minimized FCN value: 15739.6, estimated distance to minimum: 3.93505e-05
covariance matrix quality: Full, accurate covariance matrix
Status : MINIMIZE=0 HESSE=0
Floating Parameter FinalValue +/- Error
-------------------- --------------------------
a -3.0009e-01 +/- 1.07e-02
[#0] PROGRESS:Generation -- RooAbsPdf::generateBinned(model) Performing costly accept/reject sampling. If this takes too long, use extended mode to speed up the process.
RooFitResult: minimized FCN value: -1.47174e+08, estimated distance to minimum: 0.162057
covariance matrix quality: Full, accurate covariance matrix
Status : MINIMIZE=-1 HESSE=3
Floating Parameter FinalValue +/- Error
-------------------- --------------------------
c -1.7972e+00 +/- 7.39e-04
mu 2.9756e+00 +/- 3.90e-02
nbkg 1.0001e+07 +/- 3.25e+03
nsig 9.4264e+03 +/- 7.36e+02
sigma 4.6849e-01 +/- 2.75e-02
RooFitResult: minimized FCN value: 3416.14, estimated distance to minimum: 0.000238317
covariance matrix quality: Full, accurate covariance matrix
Status : MINIMIZE=0 HESSE=0
Floating Parameter FinalValue +/- Error
-------------------- --------------------------
c -1.7971e+00 +/- 7.26e-04
mu 2.9939e+00 +/- 3.64e-02
nbkg 1.0001e+07 +/- 3.24e+03
nsig 9.2425e+03 +/- 6.93e+02
sigma 4.5747e-01 +/- 2.59e-02
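The effect of the per-bin counterterm can be sketched in plain Python. This is an illustration of the idea only, not RooFit's implementation of `Offset="bin"`: the bin contents are made up, the model is assumed to sit one standard deviation above the data in every bin, and the counterterm is assumed to be the same Poisson term evaluated at mu = n, which does not depend on the fit parameters.

```python
import math
import sys


def raw_terms(observed, expected):
    """Per-bin Poisson NLL terms mu - n*log(mu), dropping the constant log(n!)."""
    return [mu - n * math.log(mu) for n, mu in zip(observed, expected)]


def offset_terms(observed, expected):
    """Raw term minus the counterterm n - n*log(n).

    Written as (mu - n) - n*log(mu / n) so that the large parts cancel
    analytically instead of numerically.
    """
    return [(mu - n) - n * math.log1p((mu - n) / n) for n, mu in zip(observed, expected)]


# Made-up bin contents spanning many orders of magnitude
observed = [1.0e7, 3.0e6, 5.0e5, 2.0e3, 40.0, 3.0]
# Hypothetical model prediction, one standard deviation above the data in each bin
expected = [n + math.sqrt(n) for n in observed]

raw = raw_terms(observed, expected)
off = offset_terms(observed, expected)
raw_total = math.fsum(raw)
offset_total = math.fsum(off)

# Approximate spacing of double-precision numbers near each total
raw_resolution = abs(raw_total) * sys.float_info.epsilon
offset_resolution = abs(offset_total) * sys.float_info.epsilon

print("raw terms:   ", ["%.4g" % t for t in raw])
print("offset terms:", ["%.4g" % t for t in off])
print("raw total = %.6g, resolution ~ %.1e" % (raw_total, raw_resolution))
print("offset total = %.6g, resolution ~ %.1e" % (offset_total, offset_resolution))
```

In this sketch the raw total is of order 1e8 while every offset term is of order one, so changes in the low-count bins are no longer lost in the rounding of the sum. The same pattern is visible in the minimized FCN values reported above: -1.47174e+08 without the offset and 3416.14 with it.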
- Date: January 2023
- Author: Jonas Rembser
Definition in file rf614_binned_fit_problems.py.