Logo ROOT  
Reference Guide
 
Loading...
Searching...
No Matches
RooBinSamplingPdf.cxx
Go to the documentation of this file.
1// Authors: Stephan Hageboeck, CERN; Andrea Sciandra, SCIPP-UCSC/Atlas; Nov 2020
2
3/*****************************************************************************
4 * RooFit
5 * Authors: *
6 * WV, Wouter Verkerke, UC Santa Barbara, verkerke@slac.stanford.edu *
7 * DK, David Kirkby, UC Irvine, dkirkby@uci.edu *
8 * *
9 * Copyright (c) 2000-2020, Regents of the University of California *
10 * and Stanford University. All rights reserved. *
11 * *
12 * Redistribution and use in source and binary forms, *
13 * with or without modification, are permitted according to the terms *
14 * listed in LICENSE (http://roofit.sourceforge.net/license.txt) *
15 *****************************************************************************/
16
17/**
18 * \class RooBinSamplingPdf
19 * The RooBinSamplingPdf is supposed to be used as an adapter between a continuous PDF
20 * and a binned distribution.
21 * When RooFit is used to fit binned data, and the PDF is continuous, it takes the probability density
22 * at the bin centre as a proxy for the probability averaged (integrated) over the entire bin. This is
23 * correct only if the second derivative of the function vanishes, though. This is shown in the plots
24 * below.
25 *
26 * For PDFs that have larger curvatures, the RooBinSamplingPdf can be used. It integrates the PDF in each
27 * bin using an adaptive integrator. This usually requires 21 times more function evaluations, but significantly
28 * reduces biases due to better sampling of the PDF. The integrator can be accessed from the outside
29 * using integrator(). This can be used to change the integration rules, so less/more function evaluations are
30 * performed. The target precision of the integrator can be set in the constructor.
31 *
32 *
33 * ### How to use it
34 * There are two ways to use this class:
35 * - Manually wrap a PDF:
36 * ```
37 * RooBinSamplingPdf binSampler("<name>", "title", <binned observable of PDF>, <original PDF> [, <precision for integrator>]);
38 * binSampler.fitTo(data);
39 * ```
40 * When a PDF is wrapped with a RooBinSamplingPDF, just use the bin sampling PDF instead of the original one for fits
41 * or plotting etc.
42 * \note The binning will be taken from the observable. Make sure that this binning is the same as the one of the dataset that should be fit.
43 * Use RooRealVar::setBinning() to adapt it.
44 * - Instruct test statistics to carry out this wrapping automatically:
45 * ```
46 * pdf.fitTo(data, IntegrateBins(<precision>));
47 * ```
48 * This method is especially useful when used with a simultaneous PDF, since each component will automatically be wrapped,
49 * depending on the value of `precision`:
50 * - `precision < 0.`: None of the PDFs are touched, bin sampling is off.
51 * - `precision = 0.`: Continuous PDFs that are fit to a RooDataHist are wrapped into a RooBinSamplingPdf. The target precision
52 * forwarded to the integrator is 1.E-4 (the default argument of the constructor).
53 * - `precision > 0.`: All continuous PDFs are automatically wrapped into a RooBinSamplingPdf, regardless of what data they are
54 * fit to (see next paragraph). The same `'precision'` is used for all integrators.
55 *
56 * ### Simulating a binned fit using RooDataSet
57 * Some frameworks use unbinned data (RooDataSet) to simulate binned datasets. By adding one entry for each bin centre with the
58 * appropriate weight, one can achieve the same result as fitting with RooDataHist. In this case, however, RooFit cannot
59 * auto-detect that a binned fit is running, and that an integration over the bin is desired (note that there are no bins to
60 * integrate over in this kind of dataset).
61 *
62 * In this case, `IntegrateBins(>0.)` needs to be used, and the desired binning needs to be assigned to the observable
63 * of the dataset:
64 * ```
65 * RooRealVar x("x", "x", 0., 5.);
66 * x.setBins(10);
67 *
68 * // <create dataset and model>
69 *
70 * model.fitTo(data, IntegrateBins(>0.));
71 * ```
72 *
73 * \see RooAbsPdf::fitTo()
74 * \see IntegrateBins()
75 *
76 * \note This feature is currently limited to one-dimensional PDFs.
77 *
78 *
79 * \htmlonly <style>div.image img[src="RooBinSamplingPdf_OFF.png"]{width:12cm;}</style> \endhtmlonly
80 * \htmlonly <style>div.image img[src="RooBinSamplingPdf_ON.png" ]{width:12cm;}</style> \endhtmlonly
81 * <table>
82 * <tr><th> Binned fit without %RooBinSamplingPdf <th> Binned fit with %RooBinSamplingPdf </td></tr>
83 * <tr><td> \image html RooBinSamplingPdf_OFF.png ""
84 * </td>
85 * <td> \image html RooBinSamplingPdf_ON.png ""
86 * </td></tr>
87 * </table>
88 *
89 */
90
91
92#include "RooBinSamplingPdf.h"
93
94#include "RooFitImplHelpers.h"
95#include "RooRealBinding.h"
96#include "RooRealVar.h"
97#include "RooGlobalFunc.h"
98#include "RooDataHist.h"
99
100#include "Math/Integrator.h"
101
102#include <algorithm>
103
104////////////////////////////////////////////////////////////////////////////////
105/// Construct a new RooBinSamplingPdf.
106/// \param[in] name A name to identify this object.
107/// \param[in] title Title (for e.g. plotting)
108/// \param[in] observable Observable to integrate over (the one that is binned).
109/// \param[in] inputPdf A PDF whose bins should be sampled with higher precision.
110/// \param[in] epsilon Relative precision for the integrator, which is used to sample the bins.
111/// Note that ROOT's default is to use an adaptive integrator, which in its first iteration usually reaches
112/// relative precision of 1.E-4 or better. Therefore, asking for lower precision rarely has an effect.
113RooBinSamplingPdf::RooBinSamplingPdf(const char *name, const char *title, RooAbsRealLValue& observable,
114 RooAbsPdf& inputPdf, double epsilon) :
115 RooAbsPdf(name, title),
116 _pdf("inputPdf", "Function to be converted into a PDF", this, inputPdf),
117 _observable("observable", "Observable to integrate over", this, observable, true, true),
118 _relEpsilon(epsilon) {
119 if (!_pdf->dependsOn(*_observable)) {
120 throw std::invalid_argument(std::string("RooBinSamplingPDF(") + GetName()
121 + "): The PDF " + _pdf->GetName() + " needs to depend on the observable "
122 + _observable->GetName());
123 }
124}
125
126
127 ////////////////////////////////////////////////////////////////////////////////
128 /// Copy a RooBinSamplingPdf.
129 /// \param[in] other PDF to copy.
130 /// \param[in] name Optionally rename the copy.
132 RooAbsPdf(other, name),
133 _pdf("inputPdf", this, other._pdf),
134 _observable("observable", this, other._observable),
135 _relEpsilon(other._relEpsilon) { }
136
137
138////////////////////////////////////////////////////////////////////////////////
139/// Integrate the PDF over the current bin of the observable.
141 const unsigned int bin = _observable->getBin();
142 const double low = _observable->getBinning().binLow(bin);
143 const double high = _observable->getBinning().binHigh(bin);
144
145 const double oldX = _observable->getVal();
146 double result;
147 {
148 // Important: When the integrator samples x, caching of sub-tree values needs to be off.
149 DisableCachingRAII disableCaching(inhibitDirty());
150 result = integrate(_normSet, low, high) / (high-low);
151 }
152
153 _observable->setVal(oldX);
154
155 return result;
156}
157
158
159////////////////////////////////////////////////////////////////////////////////
160/// Integrate the PDF over all its bins, and return a batch with those values.
161/// \param[in,out] evalData Struct with evaluation data.
162/// \param[in] normSet Normalisation set that's used to evaluate the PDF.
164{
165 std::span<double> output = ctx.output();
166
167 // Retrieve binning, which we need to compute the probabilities
168 auto boundaries = binBoundaries();
169 auto xValues = ctx.at(_observable);
170
171 // Important: When the integrator samples x, caching of sub-tree values needs to be off.
172 DisableCachingRAII disableCaching(inhibitDirty());
173
174 // Now integrate PDF in each bin:
175 for (unsigned int i = 0; i < xValues.size(); ++i) {
176 const double x = xValues[i];
177 const auto upperIt = std::upper_bound(boundaries.begin(), boundaries.end(), x);
178 const unsigned int bin = std::distance(boundaries.begin(), upperIt) - 1;
179 assert(bin < boundaries.size());
180
181 output[i] = integrate(nullptr, boundaries[bin], boundaries[bin + 1]) / (boundaries[bin + 1] - boundaries[bin]);
182 }
183}
184
185
186////////////////////////////////////////////////////////////////////////////////
187/// Get the bin boundaries for the observable.
188/// These will be recomputed whenever the shape of this object is dirty.
189std::span<const double> RooBinSamplingPdf::binBoundaries() const {
190 if (isShapeDirty() || _binBoundaries.empty()) {
191 _binBoundaries.clear();
192 const RooAbsBinning& binning = _observable->getBinning(nullptr);
193 const double* boundaries = binning.array();
194
195 for (int i=0; i < binning.numBoundaries(); ++i) {
196 _binBoundaries.push_back(boundaries[i]);
197 }
198
199 assert(std::is_sorted(_binBoundaries.begin(), _binBoundaries.end()));
200
202 }
203
204 return {_binBoundaries};
205}
206
207
208////////////////////////////////////////////////////////////////////////////////
209/// Return a list of all bin boundaries, so the PDF is plotted correctly.
210/// \param[in] obs Observable to generate the boundaries for.
211/// \param[in] xlo Beginning of range to create list of boundaries for.
212/// \param[in] xhi End of range to create to create list of boundaries for.
213/// \return Pointer to a list to be deleted by caller.
214std::list<double>* RooBinSamplingPdf::binBoundaries(RooAbsRealLValue& obs, double xlo, double xhi) const {
215 if (obs.namePtr() != _observable->namePtr()) {
216 coutE(Plotting) << "RooBinSamplingPdf::binBoundaries(" << GetName() << "): observable '" << obs.GetName()
217 << "' is not the observable of this PDF ('" << _observable->GetName() << "')." << std::endl;
218 return nullptr;
219 }
220
221 auto list = new std::list<double>;
222 for (double val : binBoundaries()) {
223 if (xlo <= val && val < xhi)
224 list->push_back(val);
225 }
226
227 return list;
228}
229
230
231////////////////////////////////////////////////////////////////////////////////
232/// Return a list of all bin edges, so the PDF is plotted as a step function.
233/// \param[in] obs Observable to generate the sampling hint for.
234/// \param[in] xlo Beginning of range to create sampling hint for.
235/// \param[in] xhi End of range to create sampling hint for.
236/// \return Pointer to a list to be deleted by caller.
237std::list<double>* RooBinSamplingPdf::plotSamplingHint(RooAbsRealLValue& obs, double xlo, double xhi) const {
238 if (obs.namePtr() != _observable->namePtr()) {
239 coutE(Plotting) << "RooBinSamplingPdf::plotSamplingHint(" << GetName() << "): observable '" << obs.GetName()
240 << "' is not the observable of this PDF ('" << _observable->GetName() << "')." << std::endl;
241 return nullptr;
242 }
243
244 auto binEdges = new std::list<double>;
245 const auto& binning = obs.getBinning();
246
247 for (unsigned int bin=0, numBins = static_cast<unsigned int>(binning.numBins()); bin < numBins; ++bin) {
248 const double low = std::max(binning.binLow(bin), xlo);
249 const double high = std::min(binning.binHigh(bin), xhi);
250 const double width = high - low;
251
252 // Check if this bin is in plotting range at all
253 if (low >= high)
254 continue;
255
256 // Move support points slightly inside the bin, so step function is plotted correctly.
257 binEdges->push_back(low + 0.001 * width);
258 binEdges->push_back(high - 0.001 * width);
259 }
260
261 return binEdges;
262}
263
264
265////////////////////////////////////////////////////////////////////////////////
266/// Direct access to the unique_ptr holding the integrator that's used to sample the bins.
267/// This can be used to change options such as sampling accuracy or to entirely exchange the integrator.
268///
269/// #### Example: Use the 61-point Gauss-Kronrod integration rule
270/// ```{.cpp}
271/// ROOT::Math::IntegratorOneDimOptions intOptions = pdf.integrator()->Options();
272/// intOptions.SetNPoints(6); // 61-point integration rule
273/// intOptions.SetRelTolerance(1.E-9); // Smaller tolerance -> more subdivisions
274/// pdf.integrator()->SetOptions(intOptions);
275/// ```
276/// \see ROOT::Math::IntegratorOneDim::SetOptions for more details on integration options.
277/// \note When RooBinSamplingPdf is loaded from files, integrator options will fall back to the default values.
278std::unique_ptr<ROOT::Math::IntegratorOneDim>& RooBinSamplingPdf::integrator() const {
279 if (!_integrator) {
280 _integrator = std::make_unique<ROOT::Math::IntegratorOneDim>(*this,
281 ROOT::Math::IntegrationOneDim::kADAPTIVE, // GSL Integrator. Will really get it only if MathMore enabled.
282 -1., _relEpsilon, // Abs epsilon = default, rel epsilon set by us.
283 0, // We don't limit the sub-intervals. Steer run time via _relEpsilon.
284 2 // This should read ROOT::Math::Integration::kGAUSS21, but this is in MathMore, so we cannot include it here.
285 );
286 }
287
288 return _integrator;
289}
290
291
292////////////////////////////////////////////////////////////////////////////////
293/// Binding used by the integrator to evaluate the PDF.
294double RooBinSamplingPdf::operator()(double x) const {
296 return _pdf;
297}
298
299
300////////////////////////////////////////////////////////////////////////////////
301/// Integrate the wrapped PDF using our current integrator, with given norm set and limits.
302double RooBinSamplingPdf::integrate(const RooArgSet* /*normSet*/, double low, double high) const {
303 return integrator()->Integral(low, high);
304}
305
306
307/// Creates a wrapping RooBinSamplingPdf if appropriate.
308/// \param[in] pdf The input pdf.
309/// \param[in] data The dataset to be used in the fit, used to figure out the
310/// observables and whether the dataset is binned.
311/// \param[in] precision Precision argument for all created RooBinSamplingPdfs.
312std::unique_ptr<RooAbsPdf> RooBinSamplingPdf::create(RooAbsPdf& pdf, RooAbsData const &data, double precision) {
313 if (precision < 0.)
314 return nullptr;
315
316 std::unique_ptr<RooArgSet> funcObservables( pdf.getObservables(data) );
317 const bool oneDimAndBinned = (1 == std::count_if(funcObservables->begin(), funcObservables->end(), [](const RooAbsArg* arg) {
318 auto var = dynamic_cast<const RooRealVar*>(arg);
319 return var && var->numBins() > 1;
320 }));
321
322 if (!oneDimAndBinned) {
323 if (precision > 0.) {
324 oocoutE(&pdf, Fitting)
325 << "Integration over bins was requested, but this is currently only implemented for 1-D fits." << std::endl;
326 }
327 return nullptr;
328 }
329
330 // Find the real-valued observable. We don't care about categories.
331 auto theObs = std::find_if(funcObservables->begin(), funcObservables->end(), [](const RooAbsArg* arg){
332 return dynamic_cast<const RooAbsRealLValue*>(arg);
333 });
334 assert(theObs != funcObservables->end());
335
336 std::unique_ptr<RooAbsPdf> newPdf;
337
338 if (precision > 0.) {
339 // User forced integration. Let just apply it.
340 newPdf = std::make_unique<RooBinSamplingPdf>(
341 (std::string(pdf.GetName()) + "_binSampling").c_str(), pdf.GetTitle(),
342 *static_cast<RooAbsRealLValue *>(*theObs), pdf, precision);
343 } else if (dynamic_cast<RooDataHist const *>(&data) != nullptr &&
344 precision == 0. && !pdf.isBinnedDistribution(*data.get())) {
345 // User didn't forbid integration, and it seems appropriate with a
346 // RooDataHist.
347 newPdf = std::make_unique<RooBinSamplingPdf>(
348 (std::string(pdf.GetName()) + "_binSampling").c_str(), pdf.GetTitle(),
349 *static_cast<RooAbsRealLValue *>(*theObs), pdf);
350 }
351
352 return newPdf;
353}
#define oocoutE(o, a)
#define coutE(a)
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void data
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t result
Option_t Option_t width
char name[80]
Definition TGX11.cxx:110
Disable all caches for sub-branches in an expression tree.
Common abstract base class for objects that represent a value and a "shape" in RooFit.
Definition RooAbsArg.h:77
bool dependsOn(const RooAbsCollection &serverList, const RooAbsArg *ignoreArg=nullptr, bool valueOnly=false) const
Test whether we depend on (ie, are served by) any object in the specified collection.
const TNamed * namePtr() const
De-duplicated pointer to this object's name.
Definition RooAbsArg.h:504
bool isShapeDirty() const
Definition RooAbsArg.h:357
RooFit::OwningPtr< RooArgSet > getObservables(const RooArgSet &set, bool valueOnly=true) const
Given a set of possible observables, return the observables that this PDF depends on.
bool inhibitDirty() const
Delete watch flag.
void clearShapeDirty() const
Definition RooAbsArg.h:546
Abstract base class for RooRealVar binning definitions.
virtual double binLow(Int_t bin) const =0
virtual double binHigh(Int_t bin) const =0
virtual Int_t numBoundaries() const =0
virtual double * array() const =0
Abstract base class for binned and unbinned datasets.
Definition RooAbsData.h:57
Abstract interface for all probability density functions.
Definition RooAbsPdf.h:40
RooArgSet const * _normSet
Normalization integral (owned by _normMgr)
Definition RooAbsPdf.h:320
Abstract base class for objects that represent a real value that may appear on the left hand side of ...
virtual const RooAbsBinning & getBinning(const char *name=nullptr, bool verbose=true, bool createOnTheFly=false) const =0
Retrieve binning configuration with given name or default binning.
Int_t getBin(const char *rangeName=nullptr) const override
virtual void setVal(double value)=0
Set the current value of the object. Needs to be overridden by implementations.
double getVal(const RooArgSet *normalisationSet=nullptr) const
Evaluate object.
Definition RooAbsReal.h:103
virtual bool isBinnedDistribution(const RooArgSet &) const
Tests if the distribution is binned. Unless overridden by derived classes, this always returns false.
Definition RooAbsReal.h:348
RooArgSet is a container object that can hold multiple RooAbsArg objects.
Definition RooArgSet.h:24
The RooBinSamplingPdf is supposed to be used as an adapter between a continuous PDF and a binned dist...
double integrate(const RooArgSet *normSet, double low, double high) const
Integrate the wrapped PDF using our current integrator, with given norm set and limits.
RooTemplateProxy< RooAbsPdf > _pdf
std::vector< double > _binBoundaries
! Workspace to store data for bin sampling
void doEval(RooFit::EvalContext &) const override
Integrate the PDF over all its bins, and return a batch with those values.
std::unique_ptr< ROOT::Math::IntegratorOneDim > & integrator() const
Direct access to the unique_ptr holding the integrator that's used to sample the bins.
RooTemplateProxy< RooAbsRealLValue > _observable
std::unique_ptr< ROOT::Math::IntegratorOneDim > _integrator
! Integrator used to sample bins.
double _relEpsilon
Default integrator precision.
std::span< const double > binBoundaries() const
Get the bin boundaries for the observable.
static std::unique_ptr< RooAbsPdf > create(RooAbsPdf &pdf, RooAbsData const &data, double precision)
Creates a wrapping RooBinSamplingPdf if appropriate.
double operator()(double x) const
Binding used by the integrator to evaluate the PDF.
double evaluate() const override
Integrate the PDF over the current bin of the observable.
const RooAbsPdf & pdf() const
std::list< double > * plotSamplingHint(RooAbsRealLValue &obs, double xlo, double xhi) const override
Return a list of all bin edges, so the PDF is plotted as a step function.
Container class to hold N-dimensional binned data.
Definition RooDataHist.h:40
std::span< const double > at(RooAbsArg const *arg, RooAbsArg const *caller=nullptr)
std::span< double > output()
const char * GetName() const override
Returns name of object.
Definition TNamed.h:47
const char * GetTitle() const override
Returns title of object.
Definition TNamed.h:48
@ kADAPTIVE
to be used for general functions without singularities
Double_t x[n]
Definition legend1.C:17
static void output()