Logo ROOT  
Reference Guide
 
Loading...
Searching...
No Matches
RooBinSamplingPdf.cxx
Go to the documentation of this file.
1// Authors: Stephan Hageboeck, CERN; Andrea Sciandra, SCIPP-UCSC/Atlas; Nov 2020
2
3/*****************************************************************************
4 * RooFit
5 * Authors: *
6 * WV, Wouter Verkerke, UC Santa Barbara, verkerke@slac.stanford.edu *
7 * DK, David Kirkby, UC Irvine, dkirkby@uci.edu *
8 * *
9 * Copyright (c) 2000-2020, Regents of the University of California *
10 * and Stanford University. All rights reserved. *
11 * *
12 * Redistribution and use in source and binary forms, *
13 * with or without modification, are permitted according to the terms *
14 * listed in LICENSE (http://roofit.sourceforge.net/license.txt) *
15 *****************************************************************************/
16
17/**
18 * \class RooBinSamplingPdf
19 * The RooBinSamplingPdf is supposed to be used as an adapter between a continuous PDF
20 * and a binned distribution.
21 * When RooFit is used to fit binned data, and the PDF is continuous, it takes the probability density
22 * at the bin centre as a proxy for the probability averaged (integrated) over the entire bin. This is
23 * correct only if the second derivative of the function vanishes, though. This is shown in the plots
24 * below.
25 *
26 * For PDFs that have larger curvatures, the RooBinSamplingPdf can be used. It integrates the PDF in each
27 * bin using an adaptive integrator. This usually requires 21 times more function evaluations, but significantly
28 * reduces biases due to better sampling of the PDF. The integrator can be accessed from the outside
29 * using integrator(). This can be used to change the integration rules, so less/more function evaluations are
30 * performed. The target precision of the integrator can be set in the constructor.
31 *
32 * ### How to use it
33 * There are two ways to use this class:
34 * - Manually wrap a PDF:
35 * ```
36 * RooBinSamplingPdf binSampler("<name>", "title", <binned observable of PDF>, <original PDF> [, <precision for integrator>]);
37 * binSampler.fitTo(data);
38 * ```
39 * When a PDF is wrapped with a RooBinSamplingPDF, just use the bin sampling PDF instead of the original one for fits
40 * or plotting etc. Note that the binning will be taken from the observable.
41 * - Instruct test statistics to carry out this wrapping automatically:
42 * ```
43 * pdf.fitTo(data, IntegrateBins(<precision>));
44 * ```
45 * This method is especially useful when used with a simultaneous PDF, since each component will automatically be wrapped,
46 * depending on the value of `precision`:
47 * - `precision < 0.`: None of the PDFs are touched, bin sampling is off.
48 * - `precision = 0.`: Continuous PDFs that are fit to a RooDataHist are wrapped into a RooBinSamplingPdf. The target precision
49 * forwarded to the integrator is 1.E-4 (the default argument of the constructor).
50 * - `precision > 0.`: All continuous PDFs are automatically wrapped into a RooBinSamplingPdf, regardless of what data they are
51 * fit to (see next paragraph). The same `'precision'` is used for all integrators.
52 *
53 * ### Simulating a binned fit using RooDataSet
54 * Some frameworks use unbinned data (RooDataSet) to simulate binned datasets. By adding one entry for each bin centre with the
55 * appropriate weight, one can achieve the same result as fitting with RooDataHist. In this case, however, RooFit cannot
56 * auto-detect that a binned fit is running, and that an integration over the bin is desired (note that there are no bins to
57 * integrate over in this kind of dataset).
58 *
59 * In this case, `IntegrateBins(>0.)` needs to be used, and the desired binning needs to be assigned to the observable
60 * of the dataset:
61 * ```
62 * RooRealVar x("x", "x", 0., 5.);
63 * x.setBins(10);
64 *
65 * // <create dataset and model>
66 *
67 * model.fitTo(data, IntegrateBins(>0.));
68 * ```
69 *
70 * \see RooAbsPdf::fitTo()
71 * \see IntegrateBins()
72 *
73 * \note This feature is currently limited to one-dimensional PDFs.
74 *
75 *
76 * \htmlonly <style>div.image img[src="RooBinSamplingPdf_OFF.png"]{width:12cm;}</style> \endhtmlonly
77 * \htmlonly <style>div.image img[src="RooBinSamplingPdf_ON.png" ]{width:12cm;}</style> \endhtmlonly
78 * <table>
79 * <tr><th> Binned fit without %RooBinSamplingPdf <th> Binned fit with %RooBinSamplingPdf </td></tr>
80 * <tr><td> \image html RooBinSamplingPdf_OFF.png ""
81 * </td>
82 * <td> \image html RooBinSamplingPdf_ON.png ""
83 * </td></tr>
84 * </table>
85 *
86 */
87
88
89#include "RooBinSamplingPdf.h"
90
91#include "RooHelpers.h"
92#include "RooRealBinding.h"
93#include "RunContext.h"
94
95#include "Math/Integrator.h"
96
97#include <algorithm>
98
99////////////////////////////////////////////////////////////////////////////////
100/// Construct a new RooBinSamplingPdf.
101/// \param[in] name A name to identify this object.
102/// \param[in] title Title (for e.g. plotting)
103/// \param[in] observable Observable to integrate over (the one that is binned).
104/// \param[in] inputPdf A PDF whose bins should be sampled with higher precision.
105/// \param[in] epsilon Relative precision for the integrator, which is used to sample the bins.
106/// Note that ROOT's default is to use an adaptive integrator, which in its first iteration usually reaches
107/// relative precision of 1.E-4 or better. Therefore, asking for lower precision rarely has an effect.
108RooBinSamplingPdf::RooBinSamplingPdf(const char *name, const char *title, RooAbsRealLValue& observable,
109 RooAbsPdf& inputPdf, double epsilon) :
110 RooAbsPdf(name, title),
111 _pdf("inputPdf", "Function to be converted into a PDF", this, inputPdf),
112 _observable("observable", "Observable to integrate over", this, observable, true, true),
113 _relEpsilon(epsilon) {
114 if (!_pdf->dependsOn(*_observable)) {
115 throw std::invalid_argument(std::string("RooBinSamplingPDF(") + GetName()
116 + "): The PDF " + _pdf->GetName() + " needs to depend on the observable "
117 + _observable->GetName());
118 }
119}
120
121
122 ////////////////////////////////////////////////////////////////////////////////
123 /// Copy a RooBinSamplingPdf.
124 /// \param[in] other PDF to copy.
125 /// \param[in] name Optionally rename the copy.
127 RooAbsPdf(other, name),
128 _pdf("inputPdf", this, other._pdf),
129 _observable("observable", this, other._observable),
130 _relEpsilon(other._relEpsilon) { }
131
132
133////////////////////////////////////////////////////////////////////////////////
134/// Integrate the PDF over the current bin of the observable.
136 const unsigned int bin = _observable->getBin();
137 const double low = _observable->getBinning().binLow(bin);
138 const double high = _observable->getBinning().binHigh(bin);
139
140 const double oldX = _observable->getVal();
141 double result;
142 {
143 // Important: When the integrator samples x, caching of sub-tree values needs to be off.
145 result = integrate(_normSet, low, high) / (high-low);
146 }
147
148 _observable->setVal(oldX);
149
150 return result;
151}
152
153
154////////////////////////////////////////////////////////////////////////////////
155/// Integrate the PDF over all its bins, and return a batch with those values.
156/// \param[in,out] evalData Struct with evaluation data.
157/// \param[in] normSet Normalisation set that's used to evaluate the PDF.
159 // Retrieve binning, which we need to compute the probabilities
160 auto boundaries = binBoundaries();
161 auto xValues = _observable->getValues(evalData, normSet);
162 auto results = evalData.makeBatch(this, xValues.size());
163
164 // Important: When the integrator samples x, caching of sub-tree values needs to be off.
166
167 // Now integrate PDF in each bin:
168 for (unsigned int i=0; i < xValues.size(); ++i) {
169 const double x = xValues[i];
170 const auto upperIt = std::upper_bound(boundaries.begin(), boundaries.end(), x);
171 const unsigned int bin = std::distance(boundaries.begin(), upperIt) - 1;
172 assert(bin < boundaries.size());
173
174 results[i] = integrate(normSet, boundaries[bin], boundaries[bin+1]) / (boundaries[bin+1]-boundaries[bin]);
175 }
176
177 return results;
178}
179
180
181////////////////////////////////////////////////////////////////////////////////
182/// Get the bin boundaries for the observable.
183/// These will be recomputed whenever the shape of this object is dirty.
185 if (isShapeDirty() || _binBoundaries.empty()) {
186 _binBoundaries.clear();
187 const RooAbsBinning& binning = _observable->getBinning(nullptr);
188 const double* boundaries = binning.array();
189
190 for (int i=0; i < binning.numBoundaries(); ++i) {
191 _binBoundaries.push_back(boundaries[i]);
192 }
193
194 assert(std::is_sorted(_binBoundaries.begin(), _binBoundaries.end()));
195
197 }
198
199 return {_binBoundaries};
200}
201
202
203////////////////////////////////////////////////////////////////////////////////
204/// Return a list of all bin boundaries, so the PDF is plotted correctly.
205/// \param[in] obs Observable to generate the boundaries for.
206/// \param[in] xlo Beginning of range to create list of boundaries for.
207/// \param[in] xhi End of range to create to create list of boundaries for.
208/// \return Pointer to a list to be deleted by caller.
209std::list<double>* RooBinSamplingPdf::binBoundaries(RooAbsRealLValue& obs, Double_t xlo, Double_t xhi) const {
210 if (obs.namePtr() != _observable->namePtr()) {
211 coutE(Plotting) << "RooBinSamplingPdf::binBoundaries(" << GetName() << "): observable '" << obs.GetName()
212 << "' is not the observable of this PDF ('" << _observable->GetName() << "')." << std::endl;
213 return nullptr;
214 }
215
216 auto list = new std::list<double>;
217 for (double val : binBoundaries()) {
218 if (xlo <= val && val < xhi)
219 list->push_back(val);
220 }
221
222 return list;
223}
224
225
226////////////////////////////////////////////////////////////////////////////////
227/// Return a list of all bin edges, so the PDF is plotted as a step function.
228/// \param[in] obs Observable to generate the sampling hint for.
229/// \param[in] xlo Beginning of range to create sampling hint for.
230/// \param[in] xhi End of range to create sampling hint for.
231/// \return Pointer to a list to be deleted by caller.
233 if (obs.namePtr() != _observable->namePtr()) {
234 coutE(Plotting) << "RooBinSamplingPdf::plotSamplingHint(" << GetName() << "): observable '" << obs.GetName()
235 << "' is not the observable of this PDF ('" << _observable->GetName() << "')." << std::endl;
236 return nullptr;
237 }
238
239 auto binEdges = new std::list<double>;
240 const auto& binning = obs.getBinning();
241
242 for (unsigned int bin=0, numBins = static_cast<unsigned int>(binning.numBins()); bin < numBins; ++bin) {
243 const double low = std::max(binning.binLow(bin), xlo);
244 const double high = std::min(binning.binHigh(bin), xhi);
245 const double width = high - low;
246
247 // Check if this bin is in plotting range at all
248 if (low >= high)
249 continue;
250
251 // Move support points slightly inside the bin, so step function is plotted correctly.
252 binEdges->push_back(low + 0.001 * width);
253 binEdges->push_back(high - 0.001 * width);
254 }
255
256 return binEdges;
257}
258
259
260////////////////////////////////////////////////////////////////////////////////
261/// Direct access to the unique_ptr holding the integrator that's used to sample the bins.
262/// This can be used to change options such as sampling accuracy or to entirely exchange the integrator.
263///
264/// #### Example: Use the 61-point Gauss-Kronrod integration rule
265/// ```{.cpp}
266/// ROOT::Math::IntegratorOneDimOptions intOptions = pdf.integrator()->Options();
267/// intOptions.SetNPoints(6); // 61-point integration rule
268/// intOptions.SetRelTolerance(1.E-9); // Smaller tolerance -> more subdivisions
269/// pdf.integrator()->SetOptions(intOptions);
270/// ```
271/// \see ROOT::Math::IntegratorOneDim::SetOptions for more details on integration options.
272/// \note When RooBinSamplingPdf is loaded from files, integrator options will fall back to the default values.
273std::unique_ptr<ROOT::Math::IntegratorOneDim>& RooBinSamplingPdf::integrator() const {
274 if (!_integrator) {
276 ROOT::Math::IntegrationOneDim::kADAPTIVE, // GSL Integrator. Will really get it only if MathMore enabled.
277 -1., _relEpsilon, // Abs epsilon = default, rel epsilon set by us.
278 0, // We don't limit the sub-intervals. Steer run time via _relEpsilon.
279 2 // This should read ROOT::Math::Integration::kGAUSS21, but this is in MathMore, so we cannot include it here.
280 ));
281 }
282
283 return _integrator;
284}
285
286
287////////////////////////////////////////////////////////////////////////////////
288/// Binding used by the integrator to evaluate the PDF.
289double RooBinSamplingPdf::operator()(double x) const {
292}
293
294
295////////////////////////////////////////////////////////////////////////////////
296/// Integrate the wrapped PDF using our current integrator, with given norm set and limits.
297double RooBinSamplingPdf::integrate(const RooArgSet* normSet, double low, double high) const {
298 // Need to set this because operator() only takes one argument.
299 _normSetForIntegrator = normSet;
300
301 return integrator()->Integral(low, high);
302}
303
#define coutE(a)
double Double_t
Definition RtypesCore.h:59
include TDocParser_001 C image html pict1_TDocParser_001 png width
char name[80]
Definition TGX11.cxx:110
User Class for performing numerical integration of a function in one dimension.
Definition Integrator.h:98
const TNamed * namePtr() const
Definition RooAbsArg.h:550
Bool_t dependsOn(const RooAbsCollection &serverList, const RooAbsArg *ignoreArg=0, Bool_t valueOnly=kFALSE) const
Test whether we depend on (ie, are served by) any object in the specified collection.
Bool_t isShapeDirty() const
Definition RooAbsArg.h:434
Bool_t inhibitDirty() const
Delete watch flag.
void clearShapeDirty() const
Definition RooAbsArg.h:584
RooAbsBinning is the abstract base class for RooRealVar binning definitions.
virtual Double_t binLow(Int_t bin) const =0
virtual Double_t * array() const =0
virtual Double_t binHigh(Int_t bin) const =0
virtual Int_t numBoundaries() const =0
RooArgSet const * _normSet
Normalization integral (owned by _normMgr)
Definition RooAbsPdf.h:322
RooAbsRealLValue is the common abstract base class for objects that represent a real value that may a...
virtual const RooAbsBinning & getBinning(const char *name=0, Bool_t verbose=kTRUE, Bool_t createOnTheFly=kFALSE) const =0
Retrive binning configuration with given name or default binning.
virtual void setVal(Double_t value)=0
Set the current value of the object. Needs to be overridden by implementations.
virtual Int_t getBin(const char *rangeName=0) const
virtual RooSpan< const double > getValues(RooBatchCompute::RunContext &evalData, const RooArgSet *normSet=nullptr) const
by this change, please consult the release notes for ROOT 6.24 for guidance on how to make this trans...
Double_t getVal(const RooArgSet *normalisationSet=nullptr) const
Evaluate object.
Definition RooAbsReal.h:91
RooArgSet is a container object that can hold multiple RooAbsArg objects.
Definition RooArgSet.h:29
The RooBinSamplingPdf is supposed to be used as an adapter between a continuous PDF and a binned dist...
double integrate(const RooArgSet *normSet, double low, double high) const
Integrate the wrapped PDF using our current integrator, with given norm set and limits.
RooTemplateProxy< RooAbsPdf > _pdf
std::vector< double > _binBoundaries
Integrator used to sample bins.
const RooArgSet * _normSetForIntegrator
Workspace to store data for bin sampling.
std::unique_ptr< ROOT::Math::IntegratorOneDim > & integrator() const
Direct access to the unique_ptr holding the integrator that's used to sample the bins.
RooTemplateProxy< RooAbsRealLValue > _observable
RooSpan< const double > binBoundaries() const
Get the bin boundaries for the observable.
std::unique_ptr< ROOT::Math::IntegratorOneDim > _integrator
Default integrator precision.
double operator()(double x) const
Binding used by the integrator to evaluate the PDF.
double evaluate() const override
Integrate the PDF over the current bin of the observable.
std::list< double > * plotSamplingHint(RooAbsRealLValue &obs, double xlo, double xhi) const override
Return a list of all bin edges, so the PDF is plotted as a step function.
RooSpan< double > evaluateSpan(RooBatchCompute::RunContext &evalData, const RooArgSet *normSet) const override
Integrate the PDF over all its bins, and return a batch with those values.
A simple container to hold a batch of data values.
Definition RooSpan.h:34
virtual const char * GetName() const
Returns name of object.
Definition TNamed.h:47
Double_t x[n]
Definition legend1.C:17
This struct enables passing computation data around between elements of a computation graph.
Definition RunContext.h:31
RooSpan< double > makeBatch(const RooAbsReal *owner, std::size_t size)
Create a writable batch.
Disable all caches for sub-branches in an expression tree.
Definition RooHelpers.h:102
REAL epsilon
Definition triangle.c:617