Logo ROOT   6.16/01
Reference Guide
TSPlot.cxx
Go to the documentation of this file.
1// @(#)root/splot:$Id$
2// Author: Muriel Pivk, Anna Kreshuk 10/2005
3
4/**********************************************************************
5 * *
6 * Copyright (c) 2005 ROOT Foundation, CERN/PH-SFT *
7 * *
8 **********************************************************************/
9
10#include "TSPlot.h"
11#include "TVirtualFitter.h"
12#include "TH1.h"
13#include "TTreePlayer.h"
14#include "TTreeFormula.h"
15#include "TTreeFormulaManager.h"
16#include "TSelectorDraw.h"
17#include "TBrowser.h"
18#include "TClass.h"
19#include "TMath.h"
20
21extern void Yields(Int_t &, Double_t *, Double_t &f, Double_t *x, Int_t iflag);
22
24
25//____________________________________________________________________
26//Begin_Html <!--
27/* -->
28<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
29<p>
30<b><font size="+2">Overview</font></b>
31
32</p><p>
33A common method used in High Energy Physics to perform measurements is
34the maximum Likelihood method, exploiting discriminating variables to
35disentangle signal from background. The crucial point for such an
36analysis to be reliable is to use an exhaustive list of sources of
37events combined with an accurate description of all the Probability
38Density Functions (PDF).
39</p><p>To assess the validity of the fit, a convincing quality check
40is to explore further the data sample by examining the distributions of
41control variables. A control variable can be obtained for instance by
42removing one of the discriminating variables before performing again
43the maximum Likelihood fit: this removed variable is a control
44variable. The expected distribution of this control variable, for
45signal, is to be compared to the one extracted, for signal, from the
46data sample. In order to be able to do so, one must be able to unfold
47from the distribution of the whole data sample.
48</p><p>The TSPlot method allows to reconstruct the distributions for
49the control variable, independently for each of the various sources of
50events, without making use of any <em>a priori</em> knowledge on <u>this</u>
51variable. The aim is thus to use the knowledge available for the
52discriminating variables to infer the behaviour of the individual
53sources of events with respect to the control variable.
54</p><p>
55TSPlot is optimal if the control variable is uncorrelated with the discriminating variables.
56
57</p><p>
58A detail description of the formalism itself, called <!-- MATH
59 $\hbox{$_s$}{\cal P}lot$
60 -->
61<img src="gif/sPlot_img5.png" alt="$\hbox{$_s$}{\cal P}lot$" align="middle" border="0" height="34" width="48">, is given in&nbsp;[<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/node1.html#bib:sNIM">1</a>].
62
63</p><p>
64<b><font size="+2">The method</font></b>
65
66</p><p>
67The <!-- MATH
68 $\hbox{$_s$}{\cal P}lot$
69 -->
70<img src="gif/sPlot_img5.png" alt="$\hbox{$_s$}{\cal P}lot$" align="middle" border="0" height="34" width="48"> technique is developped in the above context of a maximum Likelihood method making use of discriminating variables.
71
72</p><p>One considers a data sample in which are merged several species
73of events. These species represent various signal components and
74background components which all together account for the data sample.
75The different terms of the log-Likelihood are:
76</p><ul>
77<li><img src="gif/sPlot_img6.png" alt="$N$" align="bottom" border="0" height="17" width="22">: the total number of events in the data sample,
78</li>
79<li><!-- MATH
80 ${\rm N}_{\rm s}$
81 -->
82<img src="gif/sPlot_img7.png" alt="${\rm N}_{\rm s}$" align="middle" border="0" height="34" width="25">: the number of species of events populating the data sample,
83</li>
84<li><img src="gif/sPlot_img8.png" alt="$N_i$" align="middle" border="0" height="34" width="25">: the number of events expected on the average for the <img src="gif/sPlot_img9.png" alt="$i^{\rm th}$" align="bottom" border="0" height="20" width="25"> species,
85</li>
86<li><!-- MATH
87 ${\rm f}_i(y_e)$
88 -->
89<img src="gif/sPlot_img10.png" alt="${\rm f}_i(y_e)$" align="middle" border="0" height="37" width="47">: the value of the PDFs of the discriminating variables <img src="gif/sPlot_img11.png" alt="$y$" align="middle" border="0" height="33" width="15"> for the <img src="gif/sPlot_img12.png" alt="$i^{th}$" align="bottom" border="0" height="20" width="25"> species and for event <img src="gif/sPlot_img13.png" alt="$e$" align="bottom" border="0" height="17" width="13">,
90</li>
91<li><img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15">: the set of control variables which, by definition, do not appear in the expression of the Likelihood function <img src="gif/sPlot_img15.png" alt="${\cal L}$" align="bottom" border="0" height="18" width="18">.
92</li>
93</ul>
94The extended log-Likelihood reads:
95<br>
96<div align="right">
97
98<!-- MATH
99 \begin{equation}
100{\cal L}=\sum_{e=1}^{N}\ln \Big\{ \sum_{i=1}^{{\rm N}_{\rm s}}N_i{\rm f}_i(y_e) \Big\} -\sum_{i=1}^{{\rm N}_{\rm s}}N_i ~.
101\end{equation}
102 -->
103<table align="center" width="100%">
104<tbody><tr valign="middle"><td align="center" nowrap="nowrap"><a name="eq:eLik"></a><img src="gif/sPlot_img16.png" alt="\begin{displaymath}
105{\cal L}=\sum_{e=1}^{N}\ln \Big\{ \sum_{i=1}^{{\rm N}_{\rm s}}N_i{\rm f}_i(y_e) \Big\} -\sum_{i=1}^{{\rm N}_{\rm s}}N_i ~.
106\end{displaymath}" border="0" height="59" width="276"></td>
107<td align="right" width="10">
108(1)</td></tr>
109</tbody></table>
110<br clear="all"></div><p></p>
111From this expression, after maximization of <img src="gif/sPlot_img15.png" alt="${\cal L}$" align="bottom" border="0" height="18" width="18"> with respect to the <img src="gif/sPlot_img8.png" alt="$N_i$" align="middle" border="0" height="34" width="25"> parameters, a weight can be computed for every event and each species, in order to obtain later the true distribution <!-- MATH
112 ${\hbox{\bf {M}}}_i(x)$
113 -->
114<img src="gif/sPlot_img17.png" alt="${\hbox{\bf {M}}}_i(x)$" align="middle" border="0" height="37" width="56"> of variable <img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15">. If <img src="gif/sPlot_img18.png" alt="${\rm n}$" align="bottom" border="0" height="17" width="15"> is one of the <!-- MATH
115 ${\rm N}_{\rm s}$
116 -->
117<img src="gif/sPlot_img7.png" alt="${\rm N}_{\rm s}$" align="middle" border="0" height="34" width="25"> species present in the data sample, the weight for this species is defined by:
118<br>
119<div align="right">
120
121<!-- MATH
122 \begin{equation}
123\begin{Large}\fbox{$
124{_s{\cal P}}_{\rm n}(y_e)={\sum_{j=1}^{{\rm N}_{\rm s}} \hbox{\bf V}_{{\rm n}j}{\rm f}_j(y_e)\over\sum_{k=1}^{{\rm N}_{\rm s}}N_k{\rm f}_k(y_e) } $}\end{Large} ~,
125\end{equation}
126 -->
127<table align="center" width="100%">
128<tbody><tr valign="middle"><td align="center" nowrap="nowrap"><a name="eq:weightxnotiny"></a><img src="gif/sPlot_img19.png" alt="\begin{displaymath}
129\begin{Large}
130\fbox{$
131{_s{\cal P}}_{\rm n}(y_e)={\sum_{j=1}^...
132...um_{k=1}^{{\rm N}_{\rm s}}N_k{\rm f}_k(y_e) } $}\end{Large} ~,
133\end{displaymath}" border="0" height="76" width="279"></td>
134<td align="right" width="10">
135(2)</td></tr>
136</tbody></table>
137<br clear="all"></div><p></p>
138where <!-- MATH
139 $\hbox{\bf V}_{{\rm n}j}$
140 -->
141<img src="gif/sPlot_img20.png" alt="$\hbox{\bf V}_{{\rm n}j}$" align="middle" border="0" height="34" width="35">
142is the covariance matrix resulting from the Likelihood maximization.
143This matrix can be used directly from the fit, but this is numerically
144less accurate than the direct computation:
145<br>
146<div align="right">
147
148<!-- MATH
149 \begin{equation}
150\hbox{\bf V}^{-1}_{{\rm n}j}~=~
151{\partial^2(-{\cal L})\over\partial N_{\rm n}\partial N_j}~=~
152\sum_{e=1}^N {{\rm f}_{\rm n}(y_e){\rm f}_j(y_e)\over(\sum_{k=1}^{{\rm N}_{\rm s}}N_k{\rm f}_k(y_e))^2} ~.
153\end{equation}
154 -->
155<table align="center" width="100%">
156<tbody><tr valign="middle"><td align="center" nowrap="nowrap"><a name="eq:VarianceMatrixDirect"></a><img src="gif/sPlot_img21.png" alt="\begin{displaymath}
157\hbox{\bf V}^{-1}_{{\rm n}j}~=~
158{\partial^2(-{\cal L})\over\...
159...y_e)\over(\sum_{k=1}^{{\rm N}_{\rm s}}N_k{\rm f}_k(y_e))^2} ~.
160\end{displaymath}" border="0" height="58" width="360"></td>
161<td align="right" width="10">
162(3)</td></tr>
163</tbody></table>
164<br clear="all"></div><p></p>
165The distribution of the control variable&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15"> obtained by histogramming the weighted events reproduces, on average, the true distribution <!-- MATH
166 ${\hbox{\bf {M}}}_{\rm n}(x)$
167 -->
168<img src="gif/sPlot_img22.png" alt="${\hbox{\bf {M}}}_{\rm n}(x)$" align="middle" border="0" height="37" width="59">.
169
170<p>
171The class TSPlot allows to reconstruct the true distribution <!-- MATH
172 ${\hbox{\bf {M}}}_{\rm n}(x)$
173 -->
174<img src="gif/sPlot_img22.png" alt="${\hbox{\bf {M}}}_{\rm n}(x)$" align="middle" border="0" height="37" width="59"> of a control variable&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15"> for each of the <!-- MATH
175 ${\rm N}_{\rm s}$
176 -->
177<img src="gif/sPlot_img7.png" alt="${\rm N}_{\rm s}$" align="middle" border="0" height="34" width="25"> species from the sole knowledge of the PDFs of the discriminating variables <img src="gif/sPlot_img23.png" alt="${\rm f}_i(y)$" align="middle" border="0" height="37" width="40">. The plots obtained thanks to the TSPlot class are called <!-- MATH
178 $\hbox{$_s$}{\cal P}lots$
179 -->
180<img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57">.
181
182</p><p>
183<b><font size="+2">Some properties and checks</font></b>
184
185</p><p>
186Beside reproducing the true distribution, <!-- MATH
187 $\hbox{$_s$}{\cal P}lots$
188 -->
189<img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57"> bear remarkable properties:
190
191</p><ul>
192<li>
193Each <img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15">-distribution is properly normalized:
194<br>
195<div align="right">
196
197<!-- MATH
198 \begin{equation}
199\sum_{e=1}^{N} {_s{\cal P}}_{\rm n}(y_e)~=~N_{\rm n}~.
200\end{equation}
201 -->
202<table align="center" width="100%">
203<tbody><tr valign="middle"><td align="center" nowrap="nowrap"><a name="eq:NormalizationOK"></a><img src="gif/sPlot_img24.png" alt="\begin{displaymath}
204\sum_{e=1}^{N} {_s{\cal P}}_{\rm n}(y_e)~=~N_{\rm n}~.
205\end{displaymath}" border="0" height="58" width="158"></td>
206<td align="right" width="10">
207(4)</td></tr>
208</tbody></table>
209<br clear="all"></div><p></p>
210</li>
211<li>
212For any event:
213<br>
214<div align="right">
215
216<!-- MATH
217 \begin{equation}
218\sum_{l=1}^{{\rm N}_{\rm s}} {_s{\cal P}}_l(y_e) ~=~1 ~.
219\end{equation}
220 -->
221<table align="center" width="100%">
222<tbody><tr valign="middle"><td align="center" nowrap="nowrap"><a name="eq:numberconservation"></a><img src="gif/sPlot_img25.png" alt="\begin{displaymath}
223\sum_{l=1}^{{\rm N}_{\rm s}} {_s{\cal P}}_l(y_e) ~=~1 ~.
224\end{displaymath}" border="0" height="59" width="140"></td>
225<td align="right" width="10">
226(5)</td></tr>
227</tbody></table>
228<br clear="all"></div><p></p>
229That is to say that, summing up the <!-- MATH
230 ${\rm N}_{\rm s}$
231 -->
232<img src="gif/sPlot_img7.png" alt="${\rm N}_{\rm s}$" align="middle" border="0" height="34" width="25"> <!-- MATH
233 $\hbox{$_s$}{\cal P}lots$
234 -->
235<img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57">, one recovers the data sample distribution in&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15">, and summing up the number of events entering in a <!-- MATH
236 $\hbox{$_s$}{\cal P}lot$
237 -->
238<img src="gif/sPlot_img5.png" alt="$\hbox{$_s$}{\cal P}lot$" align="middle" border="0" height="34" width="48"> for a given species, one recovers the yield of the species, as provided by the fit. The property&nbsp;<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#eq:NormalizationOK">4</a> is implemented in the TSPlot class as a check.
239</li>
240<li>the sum of the statistical uncertainties per bin
241<br>
242<div align="right">
243
244<!-- MATH
245 \begin{equation}
246\sigma[N_{\rm n}\ _s\tilde{\rm M}_{\rm n}(x) {\delta x}]~=~\sqrt{\sum_{e \subset {\delta x}} ({_s{\cal P}}_{\rm n})^2} ~.
247\end{equation}
248 -->
249<table align="center" width="100%">
250<tbody><tr valign="middle"><td align="center" nowrap="nowrap"><a name="eq:ErrorPerBin"></a><img src="gif/sPlot_img26.png" alt="\begin{displaymath}
251\sigma[N_{\rm n}\ _s\tilde{\rm M}_{\rm n}(x) {\delta x}]~=~\sqrt{\sum_{e \subset {\delta x}} ({_s{\cal P}}_{\rm n})^2} ~.
252\end{displaymath}" border="0" height="55" width="276"></td>
253<td align="right" width="10">
254(6)</td></tr>
255</tbody></table>
256<br clear="all"></div><p></p>
257reproduces the statistical uncertainty on the yield <img src="gif/sPlot_img27.png" alt="$N_{\rm n}$" align="middle" border="0" height="34" width="28">, as provided by the fit: <!-- MATH
258 $\sigma[N_{\rm n}]\equiv\sqrt{\hbox{\bf V}_{{\rm n}{\rm n}}}$
259 -->
260<img src="gif/sPlot_img28.png" alt="$\sigma[N_{\rm n}]\equiv\sqrt{\hbox{\bf V}_{{\rm n}{\rm n}}}$" align="middle" border="0" height="40" width="123">.
261Because of that and since the determination of the yields is optimal
262when obtained using a Likelihood fit, one can conclude that the<!-- MATH
263 $\hbox{$_s$}{\cal P}lot$
264 -->
265 <img src="gif/sPlot_img5.png" alt="$\hbox{$_s$}{\cal P}lot$" align="middle" border="0" height="34" width="48"> technique is itself an optimal method to reconstruct distributions of control variables.
266</li>
267</ul>
268
269<p>
270<b><font size="+2">Different steps followed by TSPlot</font></b>
271
272</p><p>
273
274</p><ol>
275<li>A maximum Likelihood fit is performed to obtain the yields <img src="gif/sPlot_img8.png" alt="$N_i$" align="middle" border="0" height="34" width="25"> of the various species.
276The fit relies on discriminating variables&nbsp;<img src="gif/sPlot_img11.png" alt="$y$" align="middle" border="0" height="33" width="15"> uncorrelated with a control variable&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15">:
277the later is therefore totally absent from the fit.
278</li>
279<li>The weights <img src="gif/sPlot_img29.png" alt="${_s{\cal P}}$" align="middle" border="0" height="34" width="27"> are calculated using Eq.&nbsp;(<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#eq:weightxnotiny">2</a>) where the covariance matrix is taken from Minuit.
280</li>
281<li>Histograms of&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15"> are filled by weighting the events with <img src="gif/sPlot_img29.png" alt="${_s{\cal P}}$" align="middle" border="0" height="34" width="27">.
282</li>
283<li>Error bars per bin are given by Eq.&nbsp;(<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#eq:ErrorPerBin">6</a>).
284</li>
285</ol>
286The <!-- MATH
287 $\hbox{$_s$}{\cal P}lots$
288 -->
289<img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57"> reproduce the true distributions of the species in the control variable&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15">, within the above defined statistical uncertainties.
290
291<p>
292<b><font size="+2">Illustrations</font></b>
293
294</p><p>
295To illustrate the technique, one considers an example derived from the analysis where <!-- MATH
296 $\hbox{$_s$}{\cal P}lots$
297 -->
298<img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57">
299have been first used (charmless B decays). One is dealing with a data
300sample in which two species are present: the first is termed signal and
301the second background. A maximum Likelihood fit is performed to obtain
302the two yields <img src="gif/sPlot_img30.png" alt="$N_1$" align="middle" border="0" height="34" width="27"> and <img src="gif/sPlot_img31.png" alt="$N_2$" align="middle" border="0" height="34" width="27">. The fit relies on two discriminating variables collectively denoted&nbsp;<img src="gif/sPlot_img11.png" alt="$y$" align="middle" border="0" height="33" width="15"> which are chosen within three possible variables denoted <img src="gif/sPlot_img1.png" alt="${m_{\rm ES}}$" align="middle" border="0" height="33" width="39">, <img src="gif/sPlot_img2.png" alt="$\Delta E$" align="bottom" border="0" height="17" width="35"> and <img src="gif/sPlot_img3.png" alt="${\cal F}$" align="bottom" border="0" height="18" width="20">.
303The variable which is not incorporated in&nbsp;<img src="gif/sPlot_img11.png" alt="$y$" align="middle" border="0" height="33" width="15"> is used as the control variable&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15">. The six distributions of the three variables are assumed to be the ones depicted in Fig.&nbsp;<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#fig:pdfs">1</a>.
304
305</p><p>
306
307</p><div align="center"><a name="fig:pdfs"></a><a name="106"></a>
308<table>
309<caption align="bottom"><strong>Figure 1:</strong>
310Distributions of the three discriminating variables available to perform the Likelihood fit:
311<img src="gif/sPlot_img32.png" alt="${m_{\rm ES}}$" align="middle" border="0" height="33" width="39">, <img src="gif/sPlot_img2.png" alt="$\Delta E$" align="bottom" border="0" height="17" width="35">, <img src="gif/sPlot_img3.png" alt="${\cal F}$" align="bottom" border="0" height="18" width="20">.
312Among the three variables, two are used to perform the fit while one is
313kept out of the fit to serve the purpose of a control variable. The
314three distributions on the top (resp. bottom) of the figure correspond
315to the signal (resp. background). The unit of the vertical axis is
316chosen such that it indicates the number of entries per bin, if one
317slices the histograms in 25 bins.</caption>
318<tbody><tr><td><img src="gif/sPlot_img33.png" alt="\begin{figure}\begin{center}
319\mbox{{\psfig{file=pdfmesNIM.eps,width=0.33\linewi...
320...th}}
321{\psfig{file=pdffiNIM.eps,width=0.33\linewidth}}}
322\end{center}\end{figure}" border="0" height="162" width="544"></td></tr>
323</tbody></table>
324</div>
325
326<p>
327A data sample being built through a Monte Carlo simulation based on the distributions shown in Fig.&nbsp;<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#fig:pdfs">1</a>, one obtains the three distributions of Fig.&nbsp;<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#fig:pdfstot">2</a>. Whereas the distribution of&nbsp;<img src="gif/sPlot_img2.png" alt="$\Delta E$" align="bottom" border="0" height="17" width="35"> clearly indicates the presence of the signal, the distribution of <img src="gif/sPlot_img1.png" alt="${m_{\rm ES}}$" align="middle" border="0" height="33" width="39"> and <img src="gif/sPlot_img3.png" alt="${\cal F}$" align="bottom" border="0" height="18" width="20"> are less obviously populated by signal.
328
329</p><p>
330
331</p><div align="center"><a name="fig:pdfstot"></a><a name="169"></a>
332<table>
333<caption align="bottom"><strong>Figure 2:</strong>
334Distributions of the three discriminating variables for signal plus
335background. The three distributions are the ones obtained from a data
336sample obtained through a Monte Carlo simulation based on the
337distributions shown in Fig.&nbsp;<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#fig:pdfs">1</a>. The data sample consists of 500 signal events and 5000 background events.</caption>
338<tbody><tr><td><img src="gif/sPlot_img34.png" alt="\begin{figure}\begin{center}
339\mbox{{\psfig{file=genmesTOTNIM.eps,width=0.33\lin...
340...}
341{\psfig{file=genfiTOTNIM.eps,width=0.33\linewidth}}}
342\end{center}\end{figure}" border="0" height="160" width="545"></td></tr>
343</tbody></table>
344</div>
345
346<p>
347Chosing <img src="gif/sPlot_img2.png" alt="$\Delta E$" align="bottom" border="0" height="17" width="35"> and <img src="gif/sPlot_img3.png" alt="${\cal F}$" align="bottom" border="0" height="18" width="20"> as discriminating variables to determine <img src="gif/sPlot_img30.png" alt="$N_1$" align="middle" border="0" height="34" width="27"> and <img src="gif/sPlot_img31.png" alt="$N_2$" align="middle" border="0" height="34" width="27"> through a maximum Likelihood fit, one builds, for the control variable <img src="gif/sPlot_img1.png" alt="${m_{\rm ES}}$" align="middle" border="0" height="33" width="39"> which is unknown to the fit, the two <!-- MATH
348 $\hbox{$_s$}{\cal P}lots$
349 -->
350<img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57"> for signal and background shown in Fig.&nbsp;<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#fig:messPlots">3</a>. One observes that the <!-- MATH
351 $\hbox{$_s$}{\cal P}lot$
352 -->
353<img src="gif/sPlot_img5.png" alt="$\hbox{$_s$}{\cal P}lot$" align="middle" border="0" height="34" width="48">
354for signal reproduces correctly the PDF even where the latter vanishes,
355although the error bars remain sizeable. This results from the almost
356complete cancellation between positive and negative weights: the sum of
357weights is close to zero while the sum of weights squared is not. The
358occurence of negative weights occurs through the appearance of the
359covariance matrix, and its negative components, in the definition of
360Eq.&nbsp;(<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#eq:weightxnotiny">2</a>).
361
362</p><p>
363A word of caution is in order with respect to the error bars. Whereas
364their sum in quadrature is identical to the statistical uncertainties
365of the yields determined by the fit, and if, in addition, they are
366asymptotically correct, the error bars should be handled with care for
367low statistics and/or for too fine binning. This is because the error
368bars do not incorporate two known properties of the PDFs: PDFs are
369positive definite and can be non-zero in a given x-bin, even if in the
370particular data sample at hand, no event is observed in this bin. The
371latter limitation is not specific to<!-- MATH
372 $\hbox{$_s$}{\cal P}lots$
373 -->
374 <img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57">,
375rather it is always present when one is willing to infer the PDF at the
376origin of an histogram, when, for some bins, the number of entries does
377not guaranty the applicability of the Gaussian regime. In such
378situations, a satisfactory practice is to attach allowed ranges to the
379histogram to indicate the upper and lower limits of the PDF value which
380are consistent with the actual observation, at a given confidence
381level.
382</p><p>
383
384</p><div align="center"><a name="fig:messPlots"></a><a name="127"></a>
385<table>
386<caption align="bottom"><strong>Figure 3:</strong>
387The <!-- MATH
388 $\hbox{$_s$}{\cal P}lots$
389 -->
390<img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57"> (signal on the left, background on the right) obtained for <img src="gif/sPlot_img32.png" alt="${m_{\rm ES}}$" align="middle" border="0" height="33" width="39"> are represented as dots with error bars. They are obtained from a fit using only information from <img src="gif/sPlot_img2.png" alt="$\Delta E$" align="bottom" border="0" height="17" width="35"> and <img src="gif/sPlot_img3.png" alt="${\cal F}$" align="bottom" border="0" height="18" width="20">.</caption>
391<tbody><tr><td><img src="gif/sPlot_img35.png" alt="\begin{figure}\begin{center}
392\mbox{\psfig{file=mass-sig-sPlot.eps,width=0.48\li...
393... \psfig{file=mass-bkg-sPlot.eps,width=0.48\linewidth}}
394\end{center}\end{figure}" border="0" height="181" width="539"></td></tr>
395</tbody></table>
396</div>
397
398<p>
399Chosing <img src="gif/sPlot_img1.png" alt="${m_{\rm ES}}$" align="middle" border="0" height="33" width="39"> and <img src="gif/sPlot_img2.png" alt="$\Delta E$" align="bottom" border="0" height="17" width="35"> as discriminating variables to determine <img src="gif/sPlot_img30.png" alt="$N_1$" align="middle" border="0" height="34" width="27"> and <img src="gif/sPlot_img31.png" alt="$N_2$" align="middle" border="0" height="34" width="27"> through a maximum Likelihood fit, one builds, for the control variable <img src="gif/sPlot_img3.png" alt="${\cal F}$" align="bottom" border="0" height="18" width="20"> which is unknown to the fit, the two <!-- MATH
400 $\hbox{$_s$}{\cal P}lots$
401 -->
402<img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57"> for signal and background shown in Fig.&nbsp;<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#fig:FisPlots">4</a>. In the <!-- MATH
403 $\hbox{$_s$}{\cal P}lot$
404 -->
405<img src="gif/sPlot_img5.png" alt="$\hbox{$_s$}{\cal P}lot$" align="middle" border="0" height="34" width="48"> for signal one observes that error bars are the largest in the&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15"> regions where the background is the largest.
406
407</p><p>
408
409</p><div align="center"><a name="fig:FisPlots"></a><a name="136"></a>
410<table>
411<caption align="bottom"><strong>Figure 4:</strong>
412The <!-- MATH
413 $\hbox{$_s$}{\cal P}lots$
414 -->
415<img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57"> (signal on the left, background on the right) obtained for <img src="gif/sPlot_img3.png" alt="${\cal F}$" align="bottom" border="0" height="18" width="20"> are represented as dots with error bars. They are obtained from a fit using only information from <img src="gif/sPlot_img32.png" alt="${m_{\rm ES}}$" align="middle" border="0" height="33" width="39"> and <img src="gif/sPlot_img2.png" alt="$\Delta E$" align="bottom" border="0" height="17" width="35">.</caption>
416<tbody><tr><td><img src="gif/sPlot_img36.png" alt="\begin{figure}\begin{center}
417\mbox{\psfig{file=fisher-sig-sPlot.eps,width=0.48\...
418...psfig{file=fisher-bkg-sPlot.eps,width=0.48\linewidth}}
419\end{center}\end{figure}" border="0" height="180" width="539"></td></tr>
420</tbody></table>
421</div>
422
423<p>
424The results above can be obtained by running the tutorial TestSPlot.C
425</p>
426<!--*/
427//-->End_Html
428
429
430////////////////////////////////////////////////////////////////////////////////
431/// default constructor (used by I/O only)
432
434 fTree(0),
435 fTreename(0),
436 fVarexp(0),
437 fSelection(0)
438{
439 fNx = 0;
440 fNy=0;
441 fNevents = 0;
442 fNSpecies=0;
444}
445
446////////////////////////////////////////////////////////////////////////////////
447
449 fTreename(0),
450 fVarexp(0),
451 fSelection(0)
452
453{
454 //normal TSPlot constructor
455 // nx : number of control variables
456 // ny : number of discriminating variables
457 // ne : total number of events
458 // ns : number of species
459 // tree: input data
460
461 fNx = nx;
462 fNy=ny;
463 fNevents = ne;
465
470 fTree = tree;
472}
473
474////////////////////////////////////////////////////////////////////////////////
475/// destructor
476
478{
480 delete [] fNumbersOfEvents;
481 if (!fXvarHists.IsEmpty())
483 if (!fYvarHists.IsEmpty())
485 if (!fYpdfHists.IsEmpty())
487}
488
489////////////////////////////////////////////////////////////////////////////////
490///To browse the histograms
491
493{
494 if (!fSWeightsHists.IsEmpty()) {
495 TIter next(&fSWeightsHists);
496 TH1D* h = 0;
497 while ((h = (TH1D*)next()))
498 b->Add(h,h->GetName());
499 }
500
501 if (!fYpdfHists.IsEmpty()) {
502 TIter next(&fYpdfHists);
503 TH1D* h = 0;
504 while ((h = (TH1D*)next()))
505 b->Add(h,h->GetName());
506 }
507 if (!fYvarHists.IsEmpty()) {
508 TIter next(&fYvarHists);
509 TH1D* h = 0;
510 while ((h = (TH1D*)next()))
511 b->Add(h,h->GetName());
512 }
513 if (!fXvarHists.IsEmpty()) {
514 TIter next(&fXvarHists);
515 TH1D* h = 0;
516 while ((h = (TH1D*)next()))
517 b->Add(h,h->GetName());
518 }
519 b->Add(&fSWeights, "sWeights");
520}
521
522
523////////////////////////////////////////////////////////////////////////////////
524///Set the initial number of events of each species - used
525///as initial estimates in minuit
526
528{
529 if (!fNumbersOfEvents)
531 for (Int_t i=0; i<fNSpecies; i++)
532 fNumbersOfEvents[i]=numbers[i];
533}
534
535////////////////////////////////////////////////////////////////////////////////
536///Calculates the sWeights
537///The option controls the print level
538///"Q" - no print out
539///"V" - prints the estimated #of events in species - default
540///"VV" - as "V" + the minuit printing + sums of weights for control
541
543{
544
545 if (!fNumbersOfEvents){
546 Error("MakeSPlot","Initial numbers of events in species have not been set");
547 return;
548 }
549 Int_t i, j, ispecies;
550
551 TString opt = option;
552 opt.ToUpper();
553 opt.ReplaceAll("VV", "W");
554
555 //make sure that global fitter is minuit
556 char s[]="TFitter";
558 Int_t strdiff=strcmp(TVirtualFitter::GetFitter()->IsA()->GetName(), s);
559 if (strdiff!=0)
561 }
562
563
566
567 //now let's do it, excluding different yvars
568 //for iplot = -1 none is excluded
569 for (Int_t iplot=-1; iplot<fNy; iplot++){
570 for (i=0; i<fNevents; i++){
571 for (ispecies=0; ispecies<fNSpecies; ispecies++){
572 fPdfTot(i, ispecies)=1;
573 for (j=0; j<fNy; j++){
574 if (j!=iplot)
575 fPdfTot(i, ispecies)*=fYpdf(i, ispecies*fNy+j);
576 }
577 }
578 }
579 minuit->Clear();
580 minuit->SetFCN(Yields);
581 Double_t arglist[10];
582 //set the print level
583 if (opt.Contains("Q")||opt.Contains("V")){
584 arglist[0]=-1;
585 }
586 if (opt.Contains("W"))
587 arglist[0]=0;
588 minuit->ExecuteCommand("SET PRINT", arglist, 1);
589
590 minuit->SetObjectFit(&fPdfTot); //a tricky way to get fPdfTot matrix to fcn
591 for (ispecies=0; ispecies<fNSpecies; ispecies++)
592 minuit->SetParameter(ispecies, "", fNumbersOfEvents[ispecies], 1, 0, 0);
593
594 minuit->ExecuteCommand("MIGRAD", arglist, 0);
595 for (ispecies=0; ispecies<fNSpecies; ispecies++){
596 fNumbersOfEvents[ispecies]=minuit->GetParameter(ispecies);
597 if (!opt.Contains("Q"))
598 printf("estimated #of events in species %d = %f\n", ispecies, fNumbersOfEvents[ispecies]);
599 }
600 if (!opt.Contains("Q"))
601 printf("\n");
602 Double_t *covmat = minuit->GetCovarianceMatrix();
603 SPlots(covmat, iplot);
604
605 if (opt.Contains("W")){
606 Double_t *sumweight = new Double_t[fNSpecies];
607 for (i=0; i<fNSpecies; i++){
608 sumweight[i]=0;
609 for (j=0; j<fNevents; j++)
610 sumweight[i]+=fSWeights(j, (iplot+1)*fNSpecies + i);
611 printf("checking sum of weights[%d]=%f\n", i, sumweight[i]);
612 }
613 printf("\n");
614 delete [] sumweight;
615 }
616 }
617}
618
619////////////////////////////////////////////////////////////////////////////////
620///Computes the sWeights from the covariance matrix
621
622void TSPlot::SPlots(Double_t *covmat, Int_t i_excl)
623{
624 Int_t i, ispecies, k;
625 Double_t numerator, denominator;
626 for (i=0; i<fNevents; i++){
627 denominator=0;
628 for (ispecies=0; ispecies<fNSpecies; ispecies++)
629 denominator+=fNumbersOfEvents[ispecies]*fPdfTot(i, ispecies);
630 for (ispecies=0; ispecies<fNSpecies; ispecies++){
631 numerator=0;
632 for (k=0; k<fNSpecies; k++)
633 numerator+=covmat[ispecies*fNSpecies+k]*fPdfTot(i, k);
634 fSWeights(i, (i_excl+1)*fNSpecies + ispecies)=numerator/denominator;
635 }
636 }
637
638}
639
640////////////////////////////////////////////////////////////////////////////////
641///Returns the matrix of sweights
642
644{
645 if (weights.GetNcols()!=fNSpecies*(fNy+1) || weights.GetNrows()!=fNevents)
646 weights.ResizeTo(fNevents, fNSpecies*(fNy+1));
647 weights = fSWeights;
648}
649
650////////////////////////////////////////////////////////////////////////////////
651///Returns the matrix of sweights. It is assumed that the array passed in the argurment is big enough
652
654{
655 for (Int_t i=0; i<fNevents; i++){
656 for (Int_t j=0; j<fNSpecies; j++){
657 weights[i*fNSpecies+j]=fSWeights(i, j);
658 }
659 }
660}
661
662////////////////////////////////////////////////////////////////////////////////
663///Fills the histograms of x variables (not weighted) with nbins
664
666{
667 Int_t i, j;
668
669 if (!fXvarHists.IsEmpty()){
670 if (((TH1D*)fXvarHists.First())->GetNbinsX()!=nbins)
672 else
673 return;
674 }
675
676 //make the histograms
677 char name[12];
678 for (i=0; i<fNx; i++){
679 snprintf(name,sizeof(name), "x%d", i);
680 TH1D *h = new TH1D(name, name, nbins, fMinmax(0, i), fMinmax(1, i));
681 for (j=0; j<fNevents; j++)
682 h->Fill(fXvar(j, i));
684 }
685
686}
687
688////////////////////////////////////////////////////////////////////////////////
689///Returns the array of histograms of x variables (not weighted)
690///If histograms have not already
691///been filled, they are filled with default binning 100.
692
694{
695 Int_t nbins = 100;
696 if (fXvarHists.IsEmpty())
697 FillXvarHists(nbins);
698 else if (((TH1D*)fXvarHists.First())->GetNbinsX()!=nbins)
699 FillXvarHists(nbins);
700 return &fXvarHists;
701}
702
703////////////////////////////////////////////////////////////////////////////////
704///Returns the histogram of variable #ixvar
705///If histograms have not already
706///been filled, they are filled with default binning 100.
707
709{
710 Int_t nbins = 100;
711 if (fXvarHists.IsEmpty())
712 FillXvarHists(nbins);
713 else if (((TH1D*)fXvarHists.First())->GetNbinsX()!=nbins)
714 FillXvarHists(nbins);
715
716 return (TH1D*)fXvarHists.UncheckedAt(ixvar);
717}
718
719////////////////////////////////////////////////////////////////////////////////
720///Fill the histograms of y variables
721
723{
724 Int_t i, j;
725
726 if (!fYvarHists.IsEmpty()){
727 if (((TH1D*)fYvarHists.First())->GetNbinsX()!=nbins)
729 else
730 return;
731 }
732
733 //make the histograms
734 char name[12];
735 for (i=0; i<fNy; i++){
736 snprintf(name,sizeof(name), "y%d", i);
737 TH1D *h=new TH1D(name, name, nbins, fMinmax(0, fNx+i), fMinmax(1, fNx+i));
738 for (j=0; j<fNevents; j++)
739 h->Fill(fYvar(j, i));
741 }
742}
743
744////////////////////////////////////////////////////////////////////////////////
745///Returns the array of histograms of y variables. If histograms have not already
746///been filled, they are filled with default binning 100.
747
749{
750 Int_t nbins = 100;
751 if (fYvarHists.IsEmpty())
752 FillYvarHists(nbins);
753 else if (((TH1D*)fYvarHists.First())->GetNbinsX()!=nbins)
754 FillYvarHists(nbins);
755 return &fYvarHists;
756}
757
758////////////////////////////////////////////////////////////////////////////////
759///Returns the histogram of variable iyvar.If histograms have not already
760///been filled, they are filled with default binning 100.
761
763{
764 Int_t nbins = 100;
765 if (fYvarHists.IsEmpty())
766 FillYvarHists(nbins);
767 else if (((TH1D*)fYvarHists.First())->GetNbinsX()!=nbins)
768 FillYvarHists(nbins);
769 return (TH1D*)fYvarHists.UncheckedAt(iyvar);
770}
771
772////////////////////////////////////////////////////////////////////////////////
773///Fills the histograms of pdf-s of y variables with binning nbins
774
776{
777 Int_t i, j, ispecies;
778
779 if (!fYpdfHists.IsEmpty()){
780 if (((TH1D*)fYpdfHists.First())->GetNbinsX()!=nbins)
782 else
783 return;
784 }
785
786 char name[34];
787 for (ispecies=0; ispecies<fNSpecies; ispecies++){
788 for (i=0; i<fNy; i++){
789 snprintf(name,sizeof(name), "pdf_species%d_y%d", ispecies, i);
790 //TH1D *h = new TH1D(name, name, nbins, ypdfmin[ispecies*fNy+i], ypdfmax[ispecies*fNy+i]);
791 TH1D *h = new TH1D(name, name, nbins, fMinmax(0, fNx+fNy+ispecies*fNy+i), fMinmax(1, fNx+fNy+ispecies*fNy+i));
792 for (j=0; j<fNevents; j++)
793 h->Fill(fYpdf(j, ispecies*fNy+i));
795 }
796 }
797}
798
799////////////////////////////////////////////////////////////////////////////////
800///Returns the array of histograms of pdf's of y variables with binning nbins
801///If histograms have not already
802///been filled, they are filled with default binning 100.
803
805{
806 Int_t nbins = 100;
807 if (fYpdfHists.IsEmpty())
808 FillYpdfHists(nbins);
809
810 return &fYpdfHists;
811}
812
813////////////////////////////////////////////////////////////////////////////////
814///Returns the histogram of the pdf of variable #iyvar for species #ispecies, binning nbins
815///If histograms have not already
816///been filled, they are filled with default binning 100.
817
819{
820 Int_t nbins = 100;
821 if (fYpdfHists.IsEmpty())
822 FillYpdfHists(nbins);
823
824 return (TH1D*)fYpdfHists.UncheckedAt(fNy*ispecies+iyvar);
825}
826
827////////////////////////////////////////////////////////////////////////////////
828///The order of histograms in the array:
829///x0_species0, x0_species1,..., x1_species0, x1_species1,..., y0_species0, y0_species1,...
830///If the histograms have already been filled with a different binning, they are refilled
831///and all histograms are deleted
832
834{
835 if (fSWeights.GetNoElements()==0){
836 Error("GetSWeightsHists", "SWeights were not computed");
837 return;
838 }
839
840 if (!fSWeightsHists.IsEmpty()){
841 if (((TH1D*)fSWeightsHists.First())->GetNbinsX()!=nbins)
843 else
844 return;
845 }
846
847 char name[30];
848
849 //Fill histograms of x-variables weighted with sWeights
850 for (Int_t ivar=0; ivar<fNx; ivar++){
851 for (Int_t ispecies=0; ispecies<fNSpecies; ispecies++){
852 snprintf(name,30, "x%d_species%d", ivar, ispecies);
853 TH1D *h = new TH1D(name, name, nbins, fMinmax(0, ivar), fMinmax(1, ivar));
854 h->Sumw2();
855 for (Int_t ievent=0; ievent<fNevents; ievent++)
856 h->Fill(fXvar(ievent, ivar), fSWeights(ievent, ispecies));
858 }
859 }
860
861 //Fill histograms of y-variables (exluded from the fit), weighted with sWeights
862 for (Int_t iexcl=0; iexcl<fNy; iexcl++){
863 for(Int_t ispecies=0; ispecies<fNSpecies; ispecies++){
864 snprintf(name,30, "y%d_species%d", iexcl, ispecies);
865 TH1D *h = new TH1D(name, name, nbins, fMinmax(0, fNx+iexcl), fMinmax(1, fNx+iexcl));
866 h->Sumw2();
867 for (Int_t ievent=0; ievent<fNevents; ievent++)
868 h->Fill(fYvar(ievent, iexcl), fSWeights(ievent, fNSpecies*(iexcl+1)+ispecies));
870 }
871 }
872}
873
874////////////////////////////////////////////////////////////////////////////////
875///Returns an array of all histograms of variables, weighted with sWeights
876///If histograms have not been already filled, they are filled with default binning 50
877///The order of histograms in the array:
878///x0_species0, x0_species1,..., x1_species0, x1_species1,..., y0_species0, y0_species1,...
879
881{
882 Int_t nbins = 50; //default binning
884 FillSWeightsHists(nbins);
885
886 return &fSWeightsHists;
887}
888
889////////////////////////////////////////////////////////////////////////////////
890///The Fill...Hist() methods fill the histograms with the real limits on the variables
891///This method allows to refill the specified histogram with user-set boundaries min and max
892///Parameters:
893///type = 1 - histogram of x variable #nvar
894/// = 2 - histogram of y variable #nvar
895/// = 3 - histogram of y_pdf for y #nvar and species #nspecies
896/// = 4 - histogram of x variable #nvar, species #nspecies, WITH sWeights
897/// = 5 - histogram of y variable #nvar, species #nspecies, WITH sWeights
898
899void TSPlot::RefillHist(Int_t type, Int_t nvar, Int_t nbins, Double_t min, Double_t max, Int_t nspecies)
900{
901 if (type<1 || type>5){
902 Error("RefillHist", "type must lie between 1 and 5");
903 return;
904 }
905 char name[20];
906 Int_t j;
907 TH1D *hremove;
908 if (type==1){
909 hremove = (TH1D*)fXvarHists.RemoveAt(nvar);
910 delete hremove;
911 snprintf(name,20,"x%d",nvar);
912 TH1D *h = new TH1D(name, name, nbins, min, max);
913 for (j=0; j<fNevents;j++)
914 h->Fill(fXvar(j, nvar));
915 fXvarHists.AddAt(h, nvar);
916 }
917 if (type==2){
918 hremove = (TH1D*)fYvarHists.RemoveAt(nvar);
919 delete hremove;
920 snprintf(name,20, "y%d", nvar);
921 TH1D *h = new TH1D(name, name, nbins, min, max);
922 for (j=0; j<fNevents;j++)
923 h->Fill(fYvar(j, nvar));
924 fXvarHists.AddAt(h, nvar);
925 }
926 if (type==3){
927 hremove = (TH1D*)fYpdfHists.RemoveAt(nspecies*fNy+nvar);
928 delete hremove;
929 snprintf(name,20, "pdf_species%d_y%d", nspecies, nvar);
930 TH1D *h=new TH1D(name, name, nbins, min, max);
931 for (j=0; j<fNevents; j++)
932 h->Fill(fYpdf(j, nspecies*fNy+nvar));
933 fYpdfHists.AddAt(h, nspecies*fNy+nvar);
934 }
935 if (type==4){
936 hremove = (TH1D*)fSWeightsHists.RemoveAt(fNSpecies*nvar+nspecies);
937 delete hremove;
938 snprintf(name,20, "x%d_species%d", nvar, nspecies);
939 TH1D *h = new TH1D(name, name, nbins, min, max);
940 h->Sumw2();
941 for (Int_t ievent=0; ievent<fNevents; ievent++)
942 h->Fill(fXvar(ievent, nvar), fSWeights(ievent, nspecies));
943 fSWeightsHists.AddAt(h, fNSpecies*nvar+nspecies);
944 }
945 if (type==5){
946 hremove = (TH1D*)fSWeightsHists.RemoveAt(fNx*fNSpecies + fNSpecies*nvar+nspecies);
947 delete hremove;
948 snprintf(name,20, "y%d_species%d", nvar, nspecies);
949 TH1D *h = new TH1D(name, name, nbins, min, max);
950 h->Sumw2();
951 for (Int_t ievent=0; ievent<fNevents; ievent++)
952 h->Fill(fYvar(ievent, nvar), fSWeights(ievent, nspecies));
953 fSWeightsHists.AddAt(h, fNx*fNSpecies + fNSpecies*nvar+nspecies);
954 }
955}
956////////////////////////////////////////////////////////////////////////////////
957///Returns the histogram of a variable, weithed with sWeights
958///If histograms have not been already filled, they are filled with default binning 50
959///If parameter ixvar!=-1, the histogram of x-variable #ixvar is returned for species ispecies
960///If parameter ixvar==-1, the histogram of y-variable #iyexcl is returned for species ispecies
961///If the histogram has already been filled and the binning is different from the parameter nbins
962///all histograms with old binning will be deleted and refilled.
963
965{
966
967 Int_t nbins = 50; //default binning
969 FillSWeightsHists(nbins);
970
971 if (ixvar==-1)
972 return (TH1D*)fSWeightsHists.UncheckedAt(fNx*fNSpecies + fNSpecies*iyexcl+ispecies);
973 else
974 return (TH1D*)fSWeightsHists.UncheckedAt(fNSpecies*ixvar + ispecies);
975
976}
977
978
979////////////////////////////////////////////////////////////////////////////////
980/// Set the input Tree
981
983{
984 fTree = tree;
985}
986
987////////////////////////////////////////////////////////////////////////////////
988///Specifies the variables from the tree to be used for splot
989///
990///Variables fNx, fNy, fNSpecies and fNEvents should already be set!
991///
992///In the 1st parameter it is assumed that first fNx variables are x(control variables),
993///then fNy y variables (discriminating variables),
994///then fNy*fNSpecies ypdf variables (probability distribution functions of dicriminating
995///variables for different species). The order of pdfs should be: species0_y0, species0_y1,...
996///species1_y0, species1_y1,...species[fNSpecies-1]_y0...
997///The 2nd parameter allows to make a cut
998///TTree::Draw method description contains more details on specifying expression and selection
999
1000void TSPlot::SetTreeSelection(const char* varexp, const char *selection, Long64_t firstentry)
1001{
1002 TTreeFormula **var;
1003 std::vector<TString> cnames;
1004 TList *formulaList = new TList();
1005 TSelectorDraw *selector = (TSelectorDraw*)(((TTreePlayer*)fTree->GetPlayer())->GetSelector());
1006
1007 Long64_t entry, entryNumber;
1008 Int_t i,nch;
1009 Int_t ncols;
1010 TObjArray *leaves = fTree->GetListOfLeaves();
1011
1012 fTreename= new TString(fTree->GetName());
1013 if (varexp)
1014 fVarexp = new TString(varexp);
1015 if (selection)
1016 fSelection= new TString(selection);
1017
1018 nch = varexp ? strlen(varexp) : 0;
1019
1020
1021//*-*- Compile selection expression if there is one
1022 TTreeFormula *select = 0;
1023 if (selection && strlen(selection)) {
1024 select = new TTreeFormula("Selection",selection,fTree);
1025 if (!select) return;
1026 if (!select->GetNdim()) { delete select; return; }
1027 formulaList->Add(select);
1028 }
1029//*-*- if varexp is empty, take first nx + ny + ny*nspecies columns by default
1030
1031 if (nch == 0) {
1032 ncols = fNx + fNy + fNy*fNSpecies;
1033 for (i=0;i<ncols;i++) {
1034 cnames.push_back( leaves->At(i)->GetName() );
1035 }
1036//*-*- otherwise select only the specified columns
1037 } else {
1038 ncols = selector->SplitNames(varexp,cnames);
1039 }
1040 var = new TTreeFormula* [ncols];
1041 Double_t *xvars = new Double_t[ncols];
1042
1043 fMinmax.ResizeTo(2, ncols);
1044 for (i=0; i<ncols; i++){
1045 fMinmax(0, i)=1e30;
1046 fMinmax(1, i)=-1e30;
1047 }
1048
1049//*-*- Create the TreeFormula objects corresponding to each column
1050 for (i=0;i<ncols;i++) {
1051 var[i] = new TTreeFormula("Var1",cnames[i].Data(),fTree);
1052 formulaList->Add(var[i]);
1053 }
1054
1055//*-*- Create a TreeFormulaManager to coordinate the formulas
1056 TTreeFormulaManager *manager=0;
1057 if (formulaList->LastIndex()>=0) {
1058 manager = new TTreeFormulaManager;
1059 for(i=0;i<=formulaList->LastIndex();i++) {
1060 manager->Add((TTreeFormula*)formulaList->At(i));
1061 }
1062 manager->Sync();
1063 }
1064//*-*- loop on all selected entries
1065 // fSelectedRows = 0;
1066 Int_t tnumber = -1;
1067 Long64_t selectedrows=0;
1068 for (entry=firstentry;entry<firstentry+fNevents;entry++) {
1069 entryNumber = fTree->GetEntryNumber(entry);
1070 if (entryNumber < 0) break;
1071 Long64_t localEntry = fTree->LoadTree(entryNumber);
1072 if (localEntry < 0) break;
1073 if (tnumber != fTree->GetTreeNumber()) {
1074 tnumber = fTree->GetTreeNumber();
1075 if (manager) manager->UpdateFormulaLeaves();
1076 }
1077 Int_t ndata = 1;
1078 if (manager && manager->GetMultiplicity()) {
1079 ndata = manager->GetNdata();
1080 }
1081
1082 for(Int_t inst=0;inst<ndata;inst++) {
1083 Bool_t loaded = kFALSE;
1084 if (select) {
1085 if (select->EvalInstance(inst) == 0) {
1086 continue;
1087 }
1088 }
1089
1090 if (inst==0) loaded = kTRUE;
1091 else if (!loaded) {
1092 // EvalInstance(0) always needs to be called so that
1093 // the proper branches are loaded.
1094 for (i=0;i<ncols;i++) {
1095 var[i]->EvalInstance(0);
1096 }
1097 loaded = kTRUE;
1098 }
1099
1100 for (i=0;i<ncols;i++) {
1101 xvars[i] = var[i]->EvalInstance(inst);
1102 }
1103
1104 // curentry = entry-firstentry;
1105 //printf("event#%d\n", curentry);
1106 //for (i=0; i<ncols; i++)
1107 // printf("xvars[%d]=%f\n", i, xvars[i]);
1108 //selectedrows++;
1109 for (i=0; i<fNx; i++){
1110 fXvar(selectedrows, i) = xvars[i];
1111 if (fXvar(selectedrows, i) < fMinmax(0, i))
1112 fMinmax(0, i)=fXvar(selectedrows, i);
1113 if (fXvar(selectedrows, i) > fMinmax(1, i))
1114 fMinmax(1, i)=fXvar(selectedrows, i);
1115 }
1116 for (i=0; i<fNy; i++){
1117 fYvar(selectedrows, i) = xvars[i+fNx];
1118 //printf("y_in_loop(%d, %d)=%f, xvars[%d]=%f\n", selectedrows, i, fYvar(selectedrows, i), i+fNx, xvars[i+fNx]);
1119 if (fYvar(selectedrows, i) < fMinmax(0, i+fNx))
1120 fMinmax(0, i+fNx) = fYvar(selectedrows, i);
1121 if (fYvar(selectedrows, i) > fMinmax(1, i+fNx))
1122 fMinmax(1, i+fNx) = fYvar(selectedrows, i);
1123 for (Int_t j=0; j<fNSpecies; j++){
1124 fYpdf(selectedrows, j*fNy + i)=xvars[j*fNy + i+fNx+fNy];
1125 if (fYpdf(selectedrows, j*fNy+i) < fMinmax(0, j*fNy+i+fNx+fNy))
1126 fMinmax(0, j*fNy+i+fNx+fNy) = fYpdf(selectedrows, j*fNy+i);
1127 if (fYpdf(selectedrows, j*fNy+i) > fMinmax(1, j*fNy+i+fNx+fNy))
1128 fMinmax(1, j*fNy+i+fNx+fNy) = fYpdf(selectedrows, j*fNy+i);
1129 }
1130 }
1131 selectedrows++;
1132 }
1133 }
1134 fNevents=selectedrows;
1135 // for (i=0; i<fNevents; i++){
1136 // printf("event#%d\n", i);
1137 //for (Int_t iy=0; iy<fNy; iy++)
1138 // printf("y[%d]=%f\n", iy, fYvar(i, iy));
1139 //for (Int_t ispecies=0; ispecies<fNSpecies; ispecies++){
1140 // for (Int_t iy=0; iy<fNy; iy++)
1141 // printf("ypdf[sp. %d, y %d]=%f\n", ispecies, iy, fYpdf(i, ispecies*fNy+iy));
1142 // }
1143 //}
1144 delete [] xvars;
1145 delete [] var;
1146}
1147
1148////////////////////////////////////////////////////////////////////////////////
1149/// FCN-function for Minuit
1150
1151void Yields(Int_t &, Double_t *, Double_t &f, Double_t *x, Int_t /*iflag*/)
1152{
1153 Double_t lik;
1154 Int_t i, ispecies;
1155
1157 TMatrixD *pdftot = (TMatrixD*)fitter->GetObjectFit();
1158 Int_t nev = pdftot->GetNrows();
1159 Int_t nes = pdftot->GetNcols();
1160 f=0;
1161 for (i=0; i<nev; i++){
1162 lik=0;
1163 for (ispecies=0; ispecies<nes; ispecies++)
1164 lik+=x[ispecies]*(*pdftot)(i, ispecies);
1165 if (lik<0) lik=1;
1166 f+=TMath::Log(lik);
1167 }
1168 //extended likelihood, equivalent to chi2
1169 Double_t ntot=0;
1170 for (i=0; i<nes; i++)
1171 ntot += x[i];
1172 f = -2*(f-ntot);
1173}
1174
#define b(i)
Definition: RSha256.hxx:100
#define f(i)
Definition: RSha256.hxx:104
#define h(i)
Definition: RSha256.hxx:106
int Int_t
Definition: RtypesCore.h:41
const Bool_t kFALSE
Definition: RtypesCore.h:88
bool Bool_t
Definition: RtypesCore.h:59
double Double_t
Definition: RtypesCore.h:55
long long Long64_t
Definition: RtypesCore.h:69
const Bool_t kTRUE
Definition: RtypesCore.h:87
const char Option_t
Definition: RtypesCore.h:62
#define ClassImp(name)
Definition: Rtypes.h:363
int type
Definition: TGX11.cxx:120
void Yields(Int_t &, Double_t *, Double_t &f, Double_t *x, Int_t iflag)
FCN-function for Minuit.
Definition: TSPlot.cxx:1151
#define snprintf
Definition: civetweb.c:1540
virtual Int_t GetNdim() const
Definition: TFormula.h:237
Using a TBrowser one can browse all ROOT objects.
Definition: TBrowser.h:37
1-D histogram with a double per channel (see TH1 documentation)}
Definition: TH1.h:614
A doubly linked list.
Definition: TList.h:44
virtual void Add(TObject *obj)
Definition: TList.h:87
virtual TObject * At(Int_t idx) const
Returns the object at position idx. Returns 0 if idx is out of range.
Definition: TList.cxx:354
Int_t GetNrows() const
Definition: TMatrixTBase.h:124
Int_t GetNoElements() const
Definition: TMatrixTBase.h:128
Int_t GetNcols() const
Definition: TMatrixTBase.h:127
virtual TMatrixTBase< Element > & ResizeTo(Int_t nrows, Int_t ncols, Int_t=-1)
Set size of the matrix to nrows x ncols New dynamic elements are created, the overlapping part of the...
Definition: TMatrixT.cxx:1210
virtual const char * GetName() const
Returns name of object.
Definition: TNamed.h:47
An array of TObjects.
Definition: TObjArray.h:37
virtual void AddLast(TObject *obj)
Add object in the next empty slot in the array.
Definition: TObjArray.cxx:177
void Add(TObject *obj)
Definition: TObjArray.h:73
TObject * UncheckedAt(Int_t i) const
Definition: TObjArray.h:89
TObject * First() const
Return the object in the first slot.
Definition: TObjArray.cxx:495
virtual void Delete(Option_t *option="")
Remove all objects from the array AND delete all heap based objects.
Definition: TObjArray.cxx:355
Bool_t IsEmpty() const
Definition: TObjArray.h:70
virtual void AddAt(TObject *obj, Int_t idx)
Add object at position ids.
Definition: TObjArray.cxx:253
virtual TObject * RemoveAt(Int_t idx)
Remove object at index idx.
Definition: TObjArray.cxx:678
TObject * At(Int_t idx) const
Definition: TObjArray.h:165
virtual const char * GetName() const
Returns name of object.
Definition: TObject.cxx:357
virtual void Error(const char *method, const char *msgfmt,...) const
Issue error message.
Definition: TObject.cxx:880
Definition: TSPlot.h:21
void FillYvarHists(Int_t nbins=100)
Fill the histograms of y variables.
Definition: TSPlot.cxx:722
TObjArray fXvarHists
Definition: TSPlot.h:30
TObjArray fSWeightsHists
Definition: TSPlot.h:33
void RefillHist(Int_t type, Int_t var, Int_t nbins, Double_t min, Double_t max, Int_t nspecies=-1)
The Fill...Hist() methods fill the histograms with the real limits on the variables This method allow...
Definition: TSPlot.cxx:899
TMatrixD fSWeights
Definition: TSPlot.h:28
Int_t fNevents
Definition: TSPlot.h:44
Int_t fNy
Definition: TSPlot.h:42
void Browse(TBrowser *b)
To browse the histograms.
Definition: TSPlot.cxx:492
Int_t fNSpecies
Definition: TSPlot.h:43
TString * fTreename
Definition: TSPlot.h:36
void SPlots(Double_t *covmat, Int_t i_excl)
Computes the sWeights from the covariance matrix.
Definition: TSPlot.cxx:622
void GetSWeights(TMatrixD &weights)
Returns the matrix of sweights.
Definition: TSPlot.cxx:643
virtual ~TSPlot()
destructor
Definition: TSPlot.cxx:477
TMatrixD fMinmax
Definition: TSPlot.h:27
void FillSWeightsHists(Int_t nbins=50)
The order of histograms in the array: x0_species0, x0_species1,..., x1_species0, x1_species1,...
Definition: TSPlot.cxx:833
Int_t fNx
Definition: TSPlot.h:41
TString * fSelection
Definition: TSPlot.h:38
TH1D * GetSWeightsHist(Int_t ixvar, Int_t ispecies, Int_t iyexcl=-1)
Returns the histogram of a variable, weithed with sWeights If histograms have not been already filled...
Definition: TSPlot.cxx:964
TMatrixD fPdfTot
Definition: TSPlot.h:26
void MakeSPlot(Option_t *option="v")
Calculates the sWeights The option controls the print level "Q" - no print out "V" - prints the estim...
Definition: TSPlot.cxx:542
TH1D * GetYpdfHist(Int_t iyvar, Int_t ispecies)
Returns the histogram of the pdf of variable #iyvar for species #ispecies, binning nbins If histogram...
Definition: TSPlot.cxx:818
TObjArray * GetSWeightsHists()
Returns an array of all histograms of variables, weighted with sWeights If histograms have not been a...
Definition: TSPlot.cxx:880
TTree * fTree
Definition: TSPlot.h:35
TObjArray * GetYpdfHists()
Returns the array of histograms of pdf's of y variables with binning nbins If histograms have not alr...
Definition: TSPlot.cxx:804
void FillXvarHists(Int_t nbins=100)
Fills the histograms of x variables (not weighted) with nbins.
Definition: TSPlot.cxx:665
TObjArray * GetXvarHists()
Returns the array of histograms of x variables (not weighted) If histograms have not already been fil...
Definition: TSPlot.cxx:693
TObjArray fYvarHists
Definition: TSPlot.h:31
TString * fVarexp
Definition: TSPlot.h:37
void SetTree(TTree *tree)
Set the input Tree.
Definition: TSPlot.cxx:982
TObjArray * GetYvarHists()
Returns the array of histograms of y variables.
Definition: TSPlot.cxx:748
void FillYpdfHists(Int_t nbins=100)
Fills the histograms of pdf-s of y variables with binning nbins.
Definition: TSPlot.cxx:775
void SetTreeSelection(const char *varexp="", const char *selection="", Long64_t firstentry=0)
Specifies the variables from the tree to be used for splot.
Definition: TSPlot.cxx:1000
TMatrixD fXvar
Definition: TSPlot.h:23
TSPlot()
default constructor (used by I/O only)
Definition: TSPlot.cxx:433
TH1D * GetYvarHist(Int_t iyvar)
Returns the histogram of variable iyvar.If histograms have not already been filled,...
Definition: TSPlot.cxx:762
void SetInitialNumbersOfSpecies(Int_t *numbers)
Set the initial number of events of each species - used as initial estimates in minuit.
Definition: TSPlot.cxx:527
TObjArray fYpdfHists
Definition: TSPlot.h:32
Double_t * fNumbersOfEvents
Definition: TSPlot.h:46
TMatrixD fYpdf
Definition: TSPlot.h:25
TH1D * GetXvarHist(Int_t ixvar)
Returns the histogram of variable #ixvar If histograms have not already been filled,...
Definition: TSPlot.cxx:708
TMatrixD fYvar
Definition: TSPlot.h:24
A specialized TSelector for TTree::Draw.
Definition: TSelectorDraw.h:31
virtual UInt_t SplitNames(const TString &varexp, std::vector< TString > &names)
Build Index array for names in varexp.
Int_t LastIndex() const
Basic string class.
Definition: TString.h:131
TString & ReplaceAll(const TString &s1, const TString &s2)
Definition: TString.h:687
void ToUpper()
Change string to upper case.
Definition: TString.cxx:1113
Bool_t Contains(const char *pat, ECaseCompare cmp=kExact) const
Definition: TString.h:619
Used to coordinate one or more TTreeFormula objects.
virtual void UpdateFormulaLeaves()
This function could be called TTreePlayer::UpdateFormulaLeaves, itself called by TChain::LoadTree whe...
virtual Int_t GetNdata(Bool_t forceLoadDim=kFALSE)
Return number of available instances in the formulas.
virtual Bool_t Sync()
Synchronize all the formulae.
virtual void Add(TTreeFormula *)
Add a new formula to the list of formulas managed The manager of the formula will be changed and the ...
virtual Int_t GetMultiplicity() const
Used to pass a selection expression to the Tree drawing routine.
Definition: TTreeFormula.h:58
T EvalInstance(Int_t i=0, const char *stringStack[]=0)
Evaluate this treeformula.
Implement some of the functionality of the class TTree requiring access to extra libraries (Histogram...
Definition: TTreePlayer.h:37
A TTree object has a header with a name and a title.
Definition: TTree.h:71
virtual TObjArray * GetListOfLeaves()
Definition: TTree.h:428
TVirtualTreePlayer * GetPlayer()
Load the TTreePlayer (if not already done).
Definition: TTree.cxx:6059
virtual Long64_t GetEntryNumber(Long64_t entry) const
Return entry number corresponding to entry.
Definition: TTree.cxx:5626
virtual Long64_t LoadTree(Long64_t entry)
Set current entry.
Definition: TTree.cxx:6226
virtual Int_t GetTreeNumber() const
Definition: TTree.h:458
Abstract Base Class for Fitting.
virtual void SetObjectFit(TObject *obj)
virtual TObject * GetObjectFit() const
virtual Double_t * GetCovarianceMatrix() const =0
virtual void SetFCN(void(*fcn)(Int_t &, Double_t *, Double_t &f, Double_t *, Int_t))
To set the address of the minimization objective function called by the native compiler (see function...
virtual Int_t SetParameter(Int_t ipar, const char *parname, Double_t value, Double_t verr, Double_t vlow, Double_t vhigh)=0
virtual void Clear(Option_t *option="")=0
Set name and title to empty strings ("").
virtual Int_t ExecuteCommand(const char *command, Double_t *args, Int_t nargs)=0
virtual Double_t GetParameter(Int_t ipar) const =0
static TVirtualFitter * GetFitter()
static: return the current Fitter
static TVirtualFitter * Fitter(TObject *obj, Int_t maxpar=25)
Static function returning a pointer to the current fitter.
Double_t x[n]
Definition: legend1.C:17
static constexpr double s
static constexpr double ns
Double_t Log(Double_t x)
Definition: TMath.h:748
Definition: tree.py:1