Logo ROOT  
Reference Guide
 
Loading...
Searching...
No Matches
TPrincipal.cxx
Go to the documentation of this file.
1// @(#)root/hist:$Id$
2// Author: Christian Holm Christensen 1/8/2000
3
4/*************************************************************************
5 * Copyright (C) 1995-2004, Rene Brun and Fons Rademakers. *
6 * All rights reserved. *
7 * *
8 * For the licensing terms see $ROOTSYS/LICENSE. *
9 * For the list of contributors see $ROOTSYS/README/CREDITS. *
10 *************************************************************************/
11
12/** \class TPrincipal
13 \ingroup Hist
14Principal Components Analysis (PCA)
15
16The current implementation is based on the LINTRA package from CERNLIB
17by R. Brun, H. Hansroul, and J. Kubler.
18The class has been implemented by Christian Holm Christensen in August 2000.
19
20## Introduction
21
22In many applications of various fields of research, the treatment of
23large amounts of data requires powerful techniques capable of rapid
24data reduction and analysis. Usually, the quantities most
25conveniently measured by the experimentalist, are not necessarily the
26most significant for classification and analysis of the data. It is
27then useful to have a way of selecting an optimal set of variables
28necessary for the recognition process and reducing the dimensionality
29of the problem, resulting in an easier classification procedure.
30
31This paper describes the implementation of one such method of
32feature selection, namely the principal components analysis. This
33multidimensional technique is well known in the field of pattern
34recognition and and its use in Particle Physics has been documented
35elsewhere (cf. H. Wind, <I>Function Parameterization</I>, CERN
3672-21).
37
38## Overview
39Suppose we have prototypes which are trajectories of particles,
40passing through a spectrometer. If one measures the passage of the
41particle at say 8 fixed planes, the trajectory is described by an
428-component vector:
43\f[
44 \mathbf{x} = \left(x_0, x_1, \ldots, x_7\right)
45\f]
46in 8-dimensional pattern space.
47
48One proceeds by generating a representative tracks sample and
49building up the covariance matrix \f$\mathsf{C}\f$. Its eigenvectors and
50eigenvalues are computed by standard methods, and thus a new basis is
51obtained for the original 8-dimensional space the expansion of the
52prototypes,
53\f[
54 \mathbf{x}_m = \sum^7_{i=0} a_{m_i} \mathbf{e}_i
55 \quad
56 \mbox{where}
57 \quad
58 a_{m_i} = \mathbf{x}^T\bullet\mathbf{e}_i
59\f]
60allows the study of the behavior of the coefficients \f$a_{m_i}\f$ for all
61the tracks of the sample. The eigenvectors which are insignificant for
62the trajectory description in the expansion will have their
63corresponding coefficients \f$a_{m_i}\f$ close to zero for all the
64prototypes.
65
66On one hand, a reduction of the dimensionality is then obtained by
67omitting these least significant vectors in the subsequent analysis.
68
69On the other hand, in the analysis of real data, these least
70significant variables(?) can be used for the pattern
71recognition problem of extracting the valid combinations of
72coordinates describing a true trajectory from the set of all possible
73wrong combinations.
74
75The program described here performs this principal components analysis
76on a sample of data provided by the user. It computes the covariance
77matrix, its eigenvalues ands corresponding eigenvectors and exhibits
78the behavior of the principal components \f$a_{m_i}\f$, thus providing
79to the user all the means of understanding their data.
80
81## Principal Components Method
82Let's consider a sample of \f$M\f$ prototypes each being characterized by
83\f$P\f$ variables \f$x_0, x_1, \ldots, x_{P-1}\f$. Each prototype is a point, or a
84column vector, in a \f$P\f$-dimensional *Pattern space*.
85\f[
86 \mathbf{x} = \left[\begin{array}{c}
87 x_0\\x_1\\\vdots\\x_{P-1}\end{array}\right]\,,
88\f]
89where each \f$x_n\f$ represents the particular value associated with the
90\f$n\f$-dimension.
91
92Those \f$P\f$ variables are the quantities accessible to the
93experimentalist, but are not necessarily the most significant for the
94classification purpose.
95
96The *Principal Components Method* consists of applying a
97*linear* transformation to the original variables. This
98transformation is described by an orthogonal matrix and is equivalent
99to a rotation of the original pattern space into a new set of
100coordinate vectors, which hopefully provide easier feature
101identification and dimensionality reduction.
102
103Let's define the covariance matrix:
104\f[
105 \mathsf{C} = \left\langle\mathbf{y}\mathbf{y}^T\right\rangle
106 \quad\mbox{where}\quad
107 \mathbf{y} = \mathbf{x} - \left\langle\mathbf{x}\right\rangle\,,
108\f]
109and the brackets indicate mean value over the sample of \f$M\f$
110prototypes.
111
112This matrix \f$\mathsf{C}\f$ is real, positive definite, symmetric, and will
113have all its eigenvalues greater then zero. It will now be show that
114among the family of all the complete orthonormal bases of the pattern
115space, the base formed by the eigenvectors of the covariance matrix
116and belonging to the largest eigenvalues, corresponds to the most
117significant features of the description of the original prototypes.
118
119let the prototypes be expanded on into a set of \f$N\f$ basis vectors
120\f$\mathbf{e}_n, n=0,\ldots,N,N+1, \ldots, P-1\f$
121\f[
122 \mathbf{y}_i = \sum^N_{i=0} a_{i_n} \mathbf{e}_n,
123 \quad
124 i = 1, \ldots, M,
125 \quad
126 N < P-1
127\f]
128The `best' feature coordinates \f$\mathbf{e}_n\f$, spanning a *feature
129space*, will be obtained by minimizing the error due to this
130truncated expansion, i.e.,
131\f[
132 \min\left(E_N\right) =
133 \min\left[\left\langle\left(\mathbf{y}_i - \sum^N_{i=0} a_{i_n} \mathbf{e}_n\right)^2\right\rangle\right]
134\f]
135with the conditions:
136\f[
137 \mathbf{e}_k\bullet\mathbf{e}_j = \delta_{jk} =
138 \left\{\begin{array}{rcl}
139 1 & \mbox{for} & k = j\\
140 0 & \mbox{for} & k \neq j
141 \end{array}\right.
142\f]
143Multiplying (3) by \f$\mathbf{e}^T_n\f$ using (5),
144we get
145\f[
146 a_{i_n} = \mathbf{y}_i^T\bullet\mathbf{e}_n\,,
147\f]
148so the error becomes
149\f{eqnarray*}{
150 E_N &=&
151 \left\langle\left[\sum_{n=N+1}^{P-1} a_{i_n}\mathbf{e}_n\right]^2\right\rangle\nonumber\\
152 &=&
153 \left\langle\left[\sum_{n=N+1}^{P-1} \mathbf{y}_i^T\bullet\mathbf{e}_n\mathbf{e}_n\right]^2\right\rangle\nonumber\\
154 &=&
155 \left\langle\sum_{n=N+1}^{P-1} \mathbf{e}_n^T\mathbf{y}_i\mathbf{y}_i^T\mathbf{e}_n\right\rangle\nonumber\\
156 &=&
157 \sum_{n=N+1}^{P-1} \mathbf{e}_n^T\mathsf{C}\mathbf{e}_n
158\f}
159The minimization of the sum in (7) is obtained when each
160term \f$\mathbf{e}_n^\mathsf{C}\mathbf{e}_n\f$ is minimum, since \f$\mathsf{C}\f$ is
161positive definite. By the method of Lagrange multipliers, and the
162condition (5), we get
163\f[
164 E_N = \sum^{P-1}_{n=N+1} \left(\mathbf{e}_n^T\mathsf{C}\mathbf{e}_n -
165 l_n\mathbf{e}_n^T\bullet\mathbf{e}_n + l_n\right)
166\f]
167The minimum condition \f$\frac{dE_N}{d\mathbf{e}^T_n} = 0\f$ leads to the
168equation
169\f[
170 \mathsf{C}\mathbf{e}_n = l_n\mathbf{e}_n\,,
171\f]
172which shows that \f$\mathbf{e}_n\f$ is an eigenvector of the covariance
173matrix \f$\mathsf{C}\f$ with eigenvalue \f$l_n\f$. The estimated minimum error is
174then given by
175\f[
176 E_N \sim \sum^{P-1}_{n=N+1} \mathbf{e}_n^T\bullet l_n\mathbf{e}_n
177 = \sum^{P-1}_{n=N+1} l_n\,,
178\f]
179where \f$l_n,\,n=N+1,\ldots,P\f$ \f$l_n,\,n=N+1,\ldots,P-1\f$ are the eigenvalues associated with the
180omitted eigenvectors in the expansion (3). Thus, by choosing
181the \f$N\f$ largest eigenvalues, and their associated eigenvectors, the
182error \f$E_N\f$ is minimized.
183
184The transformation matrix to go from the pattern space to the feature
185space consists of the ordered eigenvectors \f$\mathbf{e}_1,\ldots,\mathbf{e}_P\f$
186\f$\mathbf{e}_0,\ldots,\mathbf{e}_{P-1}\f$ for its columns
187\f[
188 \mathsf{T} = \left[
189 \begin{array}{cccc}
190 \mathbf{e}_0 &
191 \mathbf{e}_1 &
192 \vdots &
193 \mathbf{e}_{P-1}
194 \end{array}\right]
195 = \left[
196 \begin{array}{cccc}
197 \mathbf{e}_{0_0} & \mathbf{e}_{1_0} & \cdots & \mathbf{e}_{{P-1}_0}\\
198 \mathbf{e}_{0_1} & \mathbf{e}_{1_1} & \cdots & \mathbf{e}_{{P-1}_1}\\
199 \vdots & \vdots & \ddots & \vdots \\
200 \mathbf{e}_{0_{P-1}} & \mathbf{e}_{1_{P-1}} & \cdots & \mathbf{e}_{{P-1}_{P-1}}\\
201 \end{array}\right]
202\f]
203This is an orthogonal transformation, or rotation, of the pattern
204space and feature selection results in ignoring certain coordinates
205in the transformed space.
206
207Christian Holm August 2000, CERN
208*/
209
210#include "TPrincipal.h"
211
212#include "TVectorD.h"
213#include "TMatrixD.h"
214#include "TMatrixDSymEigen.h"
215#include "TMath.h"
216#include "TList.h"
217#include "TH2.h"
218#include "TDatime.h"
219#include "TBrowser.h"
220#include "TROOT.h"
221#include "Riostream.h"
222
223
225
226////////////////////////////////////////////////////////////////////////////////
227/// Empty constructor. Do not use.
228
230 : fMeanValues(0),
231 fSigmas(0),
232 fCovarianceMatrix(1,1),
233 fEigenVectors(1,1),
234 fEigenValues(0),
235 fOffDiagonal(0),
236 fStoreData(kFALSE)
237{
238 fTrace = 0;
239 fHistograms = nullptr;
243}
244
245////////////////////////////////////////////////////////////////////////////////
246/// Constructor. Argument is number of variables in the sample of data
247/// Options are:
248/// - N Normalize the covariance matrix (default)
249/// - D Store input data (default)
250///
251/// The created object is named "principal" by default.
252
254 : fMeanValues(nVariables),
255 fSigmas(nVariables),
256 fCovarianceMatrix(nVariables,nVariables),
257 fEigenVectors(nVariables,nVariables),
258 fEigenValues(nVariables),
259 fOffDiagonal(nVariables),
260 fStoreData(kFALSE)
261{
262 if (nVariables <= 1) {
263 Error("TPrincipal", "You can't be serious - nVariables == 1!!!");
264 return;
265 }
266
267 SetName("principal");
268
269 fTrace = 0;
270 fHistograms = nullptr;
273 fNumberOfVariables = nVariables;
274 while (opt && strlen(opt) > 0) {
275 switch(*opt++) {
276 case 'N':
277 case 'n':
279 break;
280 case 'D':
281 case 'd':
283 break;
284 default:
285 break;
286 }
287 }
288
289 if (!fMeanValues.IsValid())
290 Error("TPrincipal","Couldn't create vector mean values");
291 if (!fSigmas.IsValid())
292 Error("TPrincipal","Couldn't create vector sigmas");
294 Error("TPrincipal","Couldn't create covariance matrix");
295 if (!fEigenVectors.IsValid())
296 Error("TPrincipal","Couldn't create eigenvector matrix");
297 if (!fEigenValues.IsValid())
298 Error("TPrincipal","Couldn't create eigenvalue vector");
299 if (!fOffDiagonal.IsValid())
300 Error("TPrincipal","Couldn't create offdiagonal vector");
301 if (fStoreData) {
302 fUserData.ResizeTo(nVariables*1000);
303 fUserData.Zero();
304 if (!fUserData.IsValid())
305 Error("TPrincipal","Couldn't create user data vector");
306 }
307}
308
309////////////////////////////////////////////////////////////////////////////////
310/// Copy constructor.
311
313 TNamed(pr),
314 fNumberOfDataPoints(pr.fNumberOfDataPoints),
315 fNumberOfVariables(pr.fNumberOfVariables),
316 fMeanValues(pr.fMeanValues),
317 fSigmas(pr.fSigmas),
318 fCovarianceMatrix(pr.fCovarianceMatrix),
319 fEigenVectors(pr.fEigenVectors),
320 fEigenValues(pr.fEigenValues),
321 fOffDiagonal(pr.fOffDiagonal),
322 fUserData(pr.fUserData),
323 fTrace(pr.fTrace),
324 fHistograms(pr.fHistograms),
325 fIsNormalised(pr.fIsNormalised),
326 fStoreData(pr.fStoreData)
327{
328}
329
330////////////////////////////////////////////////////////////////////////////////
331/// Assignment operator.
332
334{
335 if(this!=&pr) {
340 fSigmas=pr.fSigmas;
346 fTrace=pr.fTrace;
350 }
351 return *this;
352}
353
354////////////////////////////////////////////////////////////////////////////////
355/// Destructor.
356
358{
359 if (fHistograms) {
361 delete fHistograms;
362 }
363}
364
365////////////////////////////////////////////////////////////////////////////////
366/// Add a data point and update the covariance matrix. The input
367/// array must be <TT>fNumberOfVariables</TT> long.
368///
369///
370/// The Covariance matrix and mean values of the input data is calculated
371/// on the fly by the following equations:
372///
373/// \f[
374/// \left<x_i\right>^{(0)} = x_{i0}
375/// \f]
376///
377///
378/// \f[
379/// \left<x_i\right>^{(n)} = \left<x_i\right>^{(n-1)}
380/// + \frac1n \left(x_{in} - \left<x_i\right>^{(n-1)}\right)
381/// \f]
382///
383/// \f[
384/// C_{ij}^{(0)} = 0
385/// \f]
386///
387///
388///
389/// \f[
390/// C_{ij}^{(n)} = C_{ij}^{(n-1)}
391/// + \frac1{n-1}\left[\left(x_{in} - \left<x_i\right>^{(n)}\right)
392/// \left(x_{jn} - \left<x_j\right>^{(n)}\right)\right]
393/// - \frac1n C_{ij}^{(n-1)}
394/// \f]
395///
396/// since this is a really fast method, with no rounding errors (please
397/// refer to CERN 72-21 pp. 54-106).
398///
399///
400/// The data is stored internally in a <TT>TVectorD</TT>, in the following
401/// way:
402///
403/// \f[
404/// \mathbf{x} = \left[\left(x_{0_0},\ldots,x_{{P-1}_0}\right),\ldots,
405/// \left(x_{0_i},\ldots,x_{{P-1}_i}\right), \ldots\right]
406/// \f]
407///
408/// With \f$P\f$ as defined in the class description.
409
411{
412 if (!p)
413 return;
414
415 // Increment the data point counter
416 Int_t i,j;
417 if (++fNumberOfDataPoints == 1) {
418 for (i = 0; i < fNumberOfVariables; i++)
419 fMeanValues(i) = p[i];
420 }
421 else {
422
423 const Double_t invnp = 1. / Double_t(fNumberOfDataPoints);
424 const Double_t invnpM1 = 1. /(Double_t(fNumberOfDataPoints - 1));
425 const Double_t cor = 1. - invnp;
426 // use directly vector array for faster element access
427 Double_t * meanValues = fMeanValues.GetMatrixArray();
429 for (i = 0; i < fNumberOfVariables; i++) {
430
431 meanValues[i] *= cor;
432 meanValues[i] += p[i] * invnp;
433 const Double_t t1 = (p[i] - meanValues[i]) * invnpM1;
434
435 // Setting Matrix (lower triangle) elements
436 for (j = 0; j < i + 1; j++) {
437 const Int_t index = i * fNumberOfVariables + j;
438 covMatrix[index] *= cor;
439 covMatrix[index] += t1 * (p[j] - meanValues[j]);
440 }
441 }
442 }
443
444 // Store data point in internal vector
445 // If the vector isn't big enough to hold the new data, then
446 // expand the vector by half it's size.
447 if (!fStoreData)
448 return;
452
453 for (i = 0; i < fNumberOfVariables; i++) {
455 fUserData.GetMatrixArray()[j] = p[i];
456 }
457
458}
459
460////////////////////////////////////////////////////////////////////////////////
461/// Browse the TPrincipal object in the TBrowser.
462
464{
465 if (fHistograms) {
466 TIter next(fHistograms);
467 TH1* h = nullptr;
468 while ((h = (TH1*)next()))
469 b->Add(h,h->GetName());
470 }
471
472 if (fStoreData)
473 b->Add(&fUserData,"User Data");
474 b->Add(&fCovarianceMatrix,"Covariance Matrix");
475 b->Add(&fMeanValues,"Mean value vector");
476 b->Add(&fSigmas,"Sigma value vector");
477 b->Add(&fEigenValues,"Eigenvalue vector");
478 b->Add(&fEigenVectors,"Eigenvector Matrix");
479
480}
481
482////////////////////////////////////////////////////////////////////////////////
483/// Clear the data in Object. Notice, that's not possible to change
484/// the dimension of the original data.
485
487{
488 if (fHistograms) {
489 fHistograms->Delete(opt);
490 }
491
493 fTrace = 0;
498 fSigmas.Zero();
500
501 if (fStoreData) {
503 fUserData.Zero();
504 }
505}
506
507////////////////////////////////////////////////////////////////////////////////
508/// Return a row of the user supplied data.
509/// If row is out of bounds, 0 is returned.
510/// It's up to the user to delete the returned array.
511/// Row 0 is the first row;
512
514{
515 if (row >= fNumberOfDataPoints)
516 return nullptr;
517
518 if (!fStoreData)
519 return nullptr;
520
522 return &fUserData(index);
523}
524
525
526////////////////////////////////////////////////////////////////////////////////
527/// Generates the file `<filename>`, with `.C` appended if it does
528/// argument doesn't end in .cxx or .C.
529///
530/// The file contains the implementation of two functions
531/// ~~~ {.cpp}
532/// void X2P(Double_t *x, Double *p)
533/// void P2X(Double_t *p, Double *x, Int_t nTest)
534/// ~~~
535/// which does the same as `TPrincipal::X2P` and `TPrincipal::P2X`
536/// respectively. Please refer to these methods.
537///
538/// Further, the static variables:
539/// ~~~ {.cpp}
540/// Int_t gNVariables
541/// Double_t gEigenValues[]
542/// Double_t gEigenVectors[]
543/// Double_t gMeanValues[]
544/// Double_t gSigmaValues[]
545/// ~~~
546/// are initialized. The only ROOT header file needed is Rtypes.h
547///
548/// See TPrincipal::MakeRealCode for a list of options
549
551{
552 TString outName(filename);
553 if (!outName.EndsWith(".C") && !outName.EndsWith(".cxx"))
554 outName += ".C";
555
556 MakeRealCode(outName.Data(),"",opt);
557}
558
559////////////////////////////////////////////////////////////////////////////////
560/// Make histograms of the result of the analysis.
561/// The option string say which histograms to create
562/// - X Histogram original data
563/// - P Histogram principal components corresponding to
564/// original data
565/// - D Histogram the difference between the original data
566/// and the projection of principal unto a lower
567/// dimensional subspace (2D histograms)
568/// - E Histogram the eigenvalues
569/// - S Histogram the square of the residues
570/// (see `TPrincipal::SumOfSquareResiduals`)
571/// The histograms will be named `<name>_<type><number>`, where `<name>`
572/// is the first argument, `<type>` is one of X,P,D,E,S, and `<number>`
573/// is the variable.
574
576{
577 Bool_t makeX = kFALSE;
578 Bool_t makeD = kFALSE;
579 Bool_t makeP = kFALSE;
580 Bool_t makeE = kFALSE;
581 Bool_t makeS = kFALSE;
582
583 Int_t len = opt ? strlen(opt) : 0;
584 Int_t i,j,k;
585 for (i = 0; i < len; i++) {
586 switch (opt[i]) {
587 case 'X':
588 case 'x':
589 if (fStoreData)
590 makeX = kTRUE;
591 break;
592 case 'd':
593 case 'D':
594 if (fStoreData)
595 makeD = kTRUE;
596 break;
597 case 'P':
598 case 'p':
599 if (fStoreData)
600 makeP = kTRUE;
601 break;
602 case 'E':
603 case 'e':
604 makeE = kTRUE;
605 break;
606 case 's':
607 case 'S':
608 if (fStoreData)
609 makeS = kTRUE;
610 break;
611 default:
612 Warning("MakeHistograms","Unknown option: %c",opt[i]);
613 }
614 }
615
616 // If no option was given, then exit gracefully
617 if (!makeX && !makeD && !makeP && !makeE && !makeS)
618 return;
619
620 // If the list of histograms doesn't exist, create it.
621 if (!fHistograms)
622 fHistograms = new TList;
623
624 // Don't create the histograms if they are already in the TList.
625 if (makeX && fHistograms->FindObject(TString::Format("%s_x000",name)))
626 makeX = kFALSE;
627 if (makeD && fHistograms->FindObject(TString::Format("%s_d000",name)))
628 makeD = kFALSE;
629 if (makeP && fHistograms->FindObject(TString::Format("%s_p000",name)))
630 makeP = kFALSE;
631 if (makeE && fHistograms->FindObject(TString::Format("%s_e",name)))
632 makeE = kFALSE;
633 if (makeS && fHistograms->FindObject(TString::Format("%s_s",name)))
634 makeS = kFALSE;
635
636 TH1F **hX = nullptr;
637 TH2F **hD = nullptr;
638 TH1F **hP = nullptr;
639 TH1F *hE = nullptr;
640 TH1F *hS = nullptr;
641
642 // Initialize the arrays of histograms needed
643 if (makeX)
644 hX = new TH1F * [fNumberOfVariables];
645
646 if (makeD)
647 hD = new TH2F * [fNumberOfVariables];
648
649 if (makeP)
650 hP = new TH1F * [fNumberOfVariables];
651
652 if (makeE){
653 hE = new TH1F(TString::Format("%s_e",name), "Eigenvalues of Covariance matrix",
655 hE->SetXTitle("Eigenvalue");
656 fHistograms->Add(hE);
657 }
658
659 if (makeS) {
660 hS = new TH1F(TString::Format("%s_s",name),"E_{N}",
662 hS->SetXTitle("N");
663 hS->SetYTitle("#sum_{i=1}^{M} (x_{i} - x'_{N,i})^{2}");
664 fHistograms->Add(hS);
665 }
666
667 // Initialize sub elements of the histogram arrays
668 for (i = 0; i < fNumberOfVariables; i++) {
669 if (makeX) {
670 // We allow 4 sigma spread in the original data in our
671 // histogram.
672 Double_t xlowb = fMeanValues(i) - 4 * fSigmas(i);
673 Double_t xhighb = fMeanValues(i) + 4 * fSigmas(i);
674 Int_t xbins = (fNumberOfDataPoints > 0 && fNumberOfDataPoints < 100 ? 1 : fNumberOfDataPoints/100);
675 hX[i] = new TH1F(TString::Format("%s_x%03d", name, i),
676 TString::Format("Pattern space, variable %d", i),
677 xbins,xlowb,xhighb);
678 hX[i]->SetXTitle(Form("x_{%d}",i));
679 fHistograms->Add(hX[i]);
680 }
681
682 if(makeD) {
683 // The upper limit below is arbitrary!!!
684 Double_t dlowb = 0;
685 Double_t dhighb = 20;
686 Int_t dbins = (fNumberOfDataPoints > 0 && fNumberOfDataPoints < 100 ? 1 : fNumberOfDataPoints/100);
687 hD[i] = new TH2F(TString::Format("%s_d%03d", name, i),
688 TString::Format("Distance from pattern to feature space, variable %d", i),
689 dbins,dlowb,dhighb,
691 hD[i]->SetXTitle(TString::Format("|x_{%d} - x'_{%d,N}|/#sigma_{%d}",i,i,i));
692 hD[i]->SetYTitle("N");
693 fHistograms->Add(hD[i]);
694 }
695
696 if(makeP) {
697 // For some reason, the trace of the none-scaled matrix
698 // (see TPrincipal::MakeNormalised) should enter here. Taken
699 // from LINTRA code.
701 Double_t plowb = -10 * TMath::Sqrt(et);
702 Double_t phighb = -plowb;
703 Int_t pbins = 100;
704 hP[i] = new TH1F(TString::Format("%s_p%03d", name, i),
705 TString::Format("Feature space, variable %d", i),
706 pbins,plowb,phighb);
707 hP[i]->SetXTitle(TString::Format("p_{%d}",i));
708 fHistograms->Add(hP[i]);
709 }
710
711 if (makeE)
712 // The Eigenvector histogram is easy
713 hE->Fill(i,fEigenValues(i));
714
715 }
716 if (!makeX && !makeP && !makeD && !makeS) {
717 if (hX)
718 delete[] hX;
719 if (hD)
720 delete[] hD;
721 if (hP)
722 delete[] hP;
723 return;
724 }
725
726 Double_t *x = nullptr;
729 for (i = 0; i < fNumberOfDataPoints; i++) {
730
731 // Zero arrays
732 for (j = 0; j < fNumberOfVariables; j++)
733 p[j] = d[j] = 0;
734
735 // update the original data histogram
736 x = (Double_t*)(GetRow(i));
737 R__ASSERT(x);
738
739 if (makeP||makeD||makeS)
740 // calculate the corresponding principal component
741 X2P(x,p);
742
743 if (makeD || makeS) {
744 // Calculate the difference between the original data, and the
745 // same project onto principal components, and then to a lower
746 // dimensional sub-space
747 for (j = fNumberOfVariables; j > 0; j--) {
748 P2X(p,d,j);
749
750 for (k = 0; k < fNumberOfVariables; k++) {
751 // We use the absolute value of the difference!
752 d[k] = x[k] - d[k];
753
754 if (makeS)
755 hS->Fill(j,d[k]*d[k]);
756
757 if (makeD) {
758 d[k] = TMath::Abs(d[k]) / (fIsNormalised ? fSigmas(k) : 1);
759 (hD[k])->Fill(d[k],j);
760 }
761 }
762 }
763 }
764
765 if (makeX||makeP) {
766 // If we are asked to make any of these histograms, we have to
767 // go here
768 for (j = 0; j < fNumberOfVariables; j++) {
769 if (makeX)
770 (hX[j])->Fill(x[j]);
771
772 if (makeP)
773 (hP[j])->Fill(p[j]);
774 }
775 }
776 }
777 // Clean up
778 if (hX)
779 delete [] hX;
780 if (hD)
781 delete [] hD;
782 if (hP)
783 delete [] hP;
784 if (d)
785 delete [] d;
786 if (p)
787 delete [] p;
788
789 // Normalize the residues
790 if (makeS)
791 hS->Scale(Double_t(1.)/fNumberOfDataPoints);
792}
793
794////////////////////////////////////////////////////////////////////////////////
795/// Normalize the covariance matrix
796
798{
799 Int_t i,j;
800 for (i = 0; i < fNumberOfVariables; i++) {
802 if (fIsNormalised)
803 for (j = 0; j <= i; j++)
804 fCovarianceMatrix(i,j) /= (fSigmas(i) * fSigmas(j));
805
807 }
808
809 // Fill remaining parts of matrix, and scale.
810 for (i = 0; i < fNumberOfVariables; i++)
811 for (j = 0; j <= i; j++) {
814 }
815
816}
817
818////////////////////////////////////////////////////////////////////////////////
819/// Generate the file `<classname>PCA.cxx` which contains the
820/// implementation of two methods:
821/// ~~~ {.cpp}
822/// void <classname>::X2P(Double_t *x, Double *p)
823/// void <classname>::P2X(Double_t *p, Double *x, Int_t nTest)
824/// ~~~
825/// which does the same as TPrincipal::X2P and TPrincipal::P2X
826/// respectively. Please refer to these methods.
827///
828/// Further, the public static members:
829/// ~~~ {.cpp}
830/// Int_t <classname>::fgNVariables
831/// Double_t <classname>::fgEigenValues[]
832/// Double_t <classname>::fgEigenVectors[]
833/// Double_t <classname>::fgMeanValues[]
834/// Double_t <classname>::fgSigmaValues[]
835/// ~~~
836/// are initialized, and assumed to exist. The class declaration is
837/// assumed to be in `<classname>.h` and assumed to be provided by the
838/// user.
839///
840/// See TPrincipal::MakeRealCode for a list of options
841///
842/// The minimal class definition is:
843/// ~~~ {.cpp}
844/// class <classname> {
845/// public:
846/// static Int_t fgNVariables;
847/// static Double_t fgEigenVectors[];
848/// static Double_t fgEigenValues[];
849/// static Double_t fgMeanValues[];
850/// static Double_t fgSigmaValues[];
851///
852/// void X2P(Double_t *x, Double_t *p);
853/// void P2X(Double_t *p, Double_t *x, Int_t nTest);
854/// };
855/// ~~~
856/// Whether the methods `<classname>::%X2P` and `<classname>::%P2X` should
857/// be static or not, is up to the user.
858
859void TPrincipal::MakeMethods(const char *classname, Option_t *opt)
860{
861
862 MakeRealCode(TString::Format("%sPCA.cxx", classname), classname, opt);
863}
864
865
866////////////////////////////////////////////////////////////////////////////////
867/// Perform the principal components analysis.
868/// This is done in several stages in the TMatrix::EigenVectors method:
869/// - Transform the covariance matrix into a tridiagonal matrix.
870/// - Find the eigenvalues and vectors of the tridiagonal matrix.
871
873{
874 // Normalize covariance matrix
876
878 TMatrixDSymEigen eigen(sym);
881 //make sure that eigenvalues are positive
882 for (Int_t i = 0; i < fNumberOfVariables; i++) {
883 if (fEigenValues[i] < 0) fEigenValues[i] = -fEigenValues[i];
884 }
885}
886
887////////////////////////////////////////////////////////////////////////////////
888/// This is the method that actually generates the code for the
889/// transformations to and from feature space and pattern space
890/// It's called by TPrincipal::MakeCode and TPrincipal::MakeMethods.
891///
892/// The options are: NONE so far
893
894void TPrincipal::MakeRealCode(const char *filename, const char *classname,
895 Option_t *)
896{
897 Bool_t isMethod = classname[0] == '\0' ? kFALSE : kTRUE;
898 TString prefix;
899 const char *cv_qual = isMethod ? "" : "static ";
900 if (isMethod)
901 prefix.Form("%s::", classname);
902
903 std::ofstream outFile(filename,std::ios::out|std::ios::trunc);
904 if (!outFile) {
905 Error("MakeRealCode","couldn't open output file '%s'",filename);
906 return;
907 }
908
909 std::cout << "Writing on file \"" << filename << "\" ... " << std::flush;
910 //
911 // Write header of file
912 //
913 // Emacs mode line ;-)
914 outFile << "// -*- mode: c++ -*-" << std::endl;
915 // Info about creator
916 outFile << "// " << std::endl
917 << "// File " << filename
918 << " generated by TPrincipal::MakeCode" << std::endl;
919 // Time stamp
920 TDatime date;
921 outFile << "// on " << date.AsString() << std::endl;
922 // ROOT version info
923 outFile << "// ROOT version " << gROOT->GetVersion()
924 << std::endl << "//" << std::endl;
925 // General information on the code
926 outFile << "// This file contains the functions " << std::endl
927 << "//" << std::endl
928 << "// void " << prefix
929 << "X2P(Double_t *x, Double_t *p); " << std::endl
930 << "// void " << prefix
931 << "P2X(Double_t *p, Double_t *x, Int_t nTest);"
932 << std::endl << "//" << std::endl
933 << "// The first for transforming original data x in " << std::endl
934 << "// pattern space, to principal components p in " << std::endl
935 << "// feature space. The second function is for the" << std::endl
936 << "// inverse transformation, but using only nTest" << std::endl
937 << "// of the principal components in the expansion" << std::endl
938 << "// " << std::endl
939 << "// See TPrincipal class documentation for more "
940 << "information " << std::endl << "// " << std::endl;
941 // Header files
942 outFile << "#ifndef __CINT__" << std::endl;
943 if (isMethod)
944 // If these are methods, we need the class header
945 outFile << "#include \"" << classname << ".h\"" << std::endl;
946 else
947 // otherwise, we need the typedefs of Int_t and Double_t
948 outFile << "#include <Rtypes.h> // needed for Double_t etc" << std::endl;
949 // Finish the preprocessor block
950 outFile << "#endif" << std::endl << std::endl;
951
952 //
953 // Now for the data
954 //
955 // We make the Eigenvector matrix, Eigenvalue vector, Sigma vector,
956 // and Mean value vector static, since all are needed in both
957 // functions. Also ,the number of variables are stored in a static
958 // variable.
959 outFile << "//" << std::endl
960 << "// Static data variables" << std::endl
961 << "//" << std::endl;
962 outFile << cv_qual << "Int_t " << prefix << "gNVariables = "
963 << fNumberOfVariables << ";" << std::endl;
964
965 // Assign the values to the Eigenvector matrix. The elements are
966 // stored row-wise, that is element
967 // M[i][j] = e[i * nVariables + j]
968 // where i and j are zero-based.
969 outFile << std::endl << "// Assignment of eigenvector matrix." << std::endl
970 << "// Elements are stored row-wise, that is" << std::endl
971 << "// M[i][j] = e[i * nVariables + j] " << std::endl
972 << "// where i and j are zero-based" << std::endl;
973 outFile << cv_qual << "Double_t " << prefix
974 << "gEigenVectors[] = {" << std::flush;
975 Int_t i,j;
976 for (i = 0; i < fNumberOfVariables; i++) {
977 for (j = 0; j < fNumberOfVariables; j++) {
979 outFile << (index != 0 ? "," : "" ) << std::endl
980 << " " << fEigenVectors(i,j) << std::flush;
981 }
982 }
983 outFile << "};" << std::endl << std::endl;
984
985 // Assignment to eigenvalue vector. Zero-based.
986 outFile << "// Assignment to eigen value vector. Zero-based." << std::endl;
987 outFile << cv_qual << "Double_t " << prefix
988 << "gEigenValues[] = {" << std::flush;
989 for (i = 0; i < fNumberOfVariables; i++)
990 outFile << (i != 0 ? "," : "") << std::endl
991 << " " << fEigenValues(i) << std::flush;
992 outFile << std::endl << "};" << std::endl << std::endl;
993
994 // Assignment to mean Values vector. Zero-based.
995 outFile << "// Assignment to mean value vector. Zero-based." << std::endl;
996 outFile << cv_qual << "Double_t " << prefix
997 << "gMeanValues[] = {" << std::flush;
998 for (i = 0; i < fNumberOfVariables; i++)
999 outFile << (i != 0 ? "," : "") << std::endl
1000 << " " << fMeanValues(i) << std::flush;
1001 outFile << std::endl << "};" << std::endl << std::endl;
1002
1003 // Assignment to mean Values vector. Zero-based.
1004 outFile << "// Assignment to sigma value vector. Zero-based." << std::endl;
1005 outFile << cv_qual << "Double_t " << prefix
1006 << "gSigmaValues[] = {" << std::flush;
1007 for (i = 0; i < fNumberOfVariables; i++)
1008 outFile << (i != 0 ? "," : "") << std::endl
1009 << " " << (fIsNormalised ? fSigmas(i) : 1) << std::flush;
1010 // << " " << fSigmas(i) << std::flush;
1011 outFile << std::endl << "};" << std::endl << std::endl;
1012
1013 //
1014 // Finally we reach the functions themselves
1015 //
1016 // First: void x2p(Double_t *x, Double_t *p);
1017 //
1018 outFile << "// " << std::endl
1019 << "// The "
1020 << (isMethod ? "method " : "function ")
1021 << " void " << prefix
1022 << "X2P(Double_t *x, Double_t *p)"
1023 << std::endl << "// " << std::endl;
1024 outFile << "void " << prefix
1025 << "X2P(Double_t *x, Double_t *p) {" << std::endl
1026 << " for (Int_t i = 0; i < gNVariables; i++) {" << std::endl
1027 << " p[i] = 0;" << std::endl
1028 << " for (Int_t j = 0; j < gNVariables; j++)" << std::endl
1029 << " p[i] += (x[j] - gMeanValues[j]) " << std::endl
1030 << " * gEigenVectors[j * gNVariables + i] "
1031 << "/ gSigmaValues[j];" << std::endl << std::endl << " }"
1032 << std::endl << "}" << std::endl << std::endl;
1033 //
1034 // Now: void p2x(Double_t *p, Double_t *x, Int_t nTest);
1035 //
1036 outFile << "// " << std::endl << "// The "
1037 << (isMethod ? "method " : "function ")
1038 << " void " << prefix
1039 << "P2X(Double_t *p, Double_t *x, Int_t nTest)"
1040 << std::endl << "// " << std::endl;
1041 outFile << "void " << prefix
1042 << "P2X(Double_t *p, Double_t *x, Int_t nTest) {" << std::endl
1043 << " for (Int_t i = 0; i < gNVariables; i++) {" << std::endl
1044 << " x[i] = gMeanValues[i];" << std::endl
1045 << " for (Int_t j = 0; j < nTest; j++)" << std::endl
1046 << " x[i] += p[j] * gSigmaValues[i] " << std::endl
1047 << " * gEigenVectors[i * gNVariables + j];" << std::endl
1048 << " }" << std::endl << "}" << std::endl << std::endl;
1049
1050 // EOF
1051 outFile << "// EOF for " << filename << std::endl;
1052
1053 // Close the file
1054 outFile.close();
1055
1056 std::cout << "done" << std::endl;
1057}
1058
1059////////////////////////////////////////////////////////////////////////////////
1060/// Calculate x as a function of nTest of the most significant
1061/// principal components p, and return it in x.
1062/// It's the users responsibility to make sure that both x and p are
1063/// of the right size (i.e., memory must be allocated for x).
1064
1066{
1067 for (Int_t i = 0; i < fNumberOfVariables; i++){
1068 x[i] = fMeanValues(i);
1069 for (Int_t j = 0; j < nTest; j++)
1070 x[i] += p[j] * (fIsNormalised ? fSigmas(i) : 1)
1071 * fEigenVectors(i,j);
1072 }
1073
1074}
1075
1076////////////////////////////////////////////////////////////////////////////////
1077/// Print the statistics
1078/// Options are
1079/// - M Print mean values of original data
1080/// - S Print sigma values of original data
1081/// - E Print eigenvalues of covariance matrix
1082/// - V Print eigenvectors of covariance matrix
1083/// Default is MSE
1084
1086{
1087 Bool_t printV = kFALSE;
1088 Bool_t printM = kFALSE;
1089 Bool_t printS = kFALSE;
1090 Bool_t printE = kFALSE;
1091
1092 Int_t len = opt ? strlen(opt) : 0;
1093 for (Int_t i = 0; i < len; i++) {
1094 switch (opt[i]) {
1095 case 'V':
1096 case 'v':
1097 printV = kTRUE;
1098 break;
1099 case 'M':
1100 case 'm':
1101 printM = kTRUE;
1102 break;
1103 case 'S':
1104 case 's':
1105 printS = kTRUE;
1106 break;
1107 case 'E':
1108 case 'e':
1109 printE = kTRUE;
1110 break;
1111 default:
1112 Warning("Print", "Unknown option '%c'",opt[i]);
1113 break;
1114 }
1115 }
1116
1117 if (printM||printS||printE) {
1118 std::cout << " Variable # " << std::flush;
1119 if (printM)
1120 std::cout << "| Mean Value " << std::flush;
1121 if (printS)
1122 std::cout << "| Sigma " << std::flush;
1123 if (printE)
1124 std::cout << "| Eigenvalue" << std::flush;
1125 std::cout << std::endl;
1126
1127 std::cout << "-------------" << std::flush;
1128 if (printM)
1129 std::cout << "+------------" << std::flush;
1130 if (printS)
1131 std::cout << "+------------" << std::flush;
1132 if (printE)
1133 std::cout << "+------------" << std::flush;
1134 std::cout << std::endl;
1135
1136 for (Int_t i = 0; i < fNumberOfVariables; i++) {
1137 std::cout << std::setw(12) << i << " " << std::flush;
1138 if (printM)
1139 std::cout << "| " << std::setw(10) << std::setprecision(4)
1140 << fMeanValues(i) << " " << std::flush;
1141 if (printS)
1142 std::cout << "| " << std::setw(10) << std::setprecision(4)
1143 << fSigmas(i) << " " << std::flush;
1144 if (printE)
1145 std::cout << "| " << std::setw(10) << std::setprecision(4)
1146 << fEigenValues(i) << " " << std::flush;
1147 std::cout << std::endl;
1148 }
1149 std::cout << std::endl;
1150 }
1151
1152 if(printV) {
1153 for (Int_t i = 0; i < fNumberOfVariables; i++) {
1154 std::cout << "Eigenvector # " << i << std::flush;
1157 v.Print();
1158 }
1159 }
1160}
1161
1162////////////////////////////////////////////////////////////////////////////////
1163/// Calculates the sum of the square residuals, that is
1164///
1165/// \f[
1166/// E_N = \sum_{i=0}^{P-1} \left(x_i - x^\prime_i\right)^2
1167/// \f]
1168///
1169/// where \f$x^\prime_i = \sum_{j=i}^N p_i e_{n_j}\f$
1170/// is the \f$i^{\mbox{th}}\f$ component of the principal vector, corresponding to
1171/// \f$x_i\f$, the original data; I.e., the square distance to the space
1172/// spanned by \f$N\f$ eigenvectors.
1173
1175{
1176
1177 if (!x)
1178 return;
1179
1180 Double_t p[100];
1181 Double_t xp[100];
1182
1183 X2P(x,p);
1184 for (Int_t i = fNumberOfVariables-1; i >= 0; i--) {
1185 P2X(p,xp,i);
1186 for (Int_t j = 0; j < fNumberOfVariables; j++) {
1187 s[i] += (x[j] - xp[j])*(x[j] - xp[j]);
1188 }
1189 }
1190}
1191
1192////////////////////////////////////////////////////////////////////////////////
1193/// Test the PCA, bye calculating the sum square of residuals
1194/// (see method SumOfSquareResiduals), and display the histogram
1195
1197{
1198 MakeHistograms("pca","S");
1199
1200 if (!fStoreData)
1201 return;
1202
1203 TH1 *pca_s = nullptr;
1204 if (fHistograms) pca_s = (TH1*)fHistograms->FindObject("pca_s");
1205 if (!pca_s) {
1206 Warning("Test", "Couldn't get histogram of square residuals");
1207 return;
1208 }
1209
1210 pca_s->Draw();
1211}
1212
1213////////////////////////////////////////////////////////////////////////////////
1214/// Calculate the principal components from the original data vector
1215/// x, and return it in p.
1216///
1217/// It's the users responsibility to make sure that both x and p are
1218/// of the right size (i.e., memory must be allocated for p).
1219
1221{
1222 for (Int_t i = 0; i < fNumberOfVariables; i++){
1223 p[i] = 0;
1224 for (Int_t j = 0; j < fNumberOfVariables; j++)
1225 p[i] += (x[j] - fMeanValues(j)) * fEigenVectors(j,i) /
1226 (fIsNormalised ? fSigmas(j) : 1);
1227 }
1228
1229}
#define d(i)
Definition RSha256.hxx:102
#define b(i)
Definition RSha256.hxx:100
#define h(i)
Definition RSha256.hxx:106
size_t size(const MatrixT &matrix)
retrieve the size of a square matrix
constexpr Bool_t kFALSE
Definition RtypesCore.h:101
double Double_t
Definition RtypesCore.h:59
constexpr Bool_t kTRUE
Definition RtypesCore.h:100
const char Option_t
Definition RtypesCore.h:66
#define ClassImp(name)
Definition Rtypes.h:377
#define R__ASSERT(e)
Definition TError.h:118
winID h TVirtualViewer3D TVirtualGLPainter p
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t Int_t Int_t Window_t TString Int_t GCValues_t GetPrimarySelectionOwner GetDisplay GetScreen GetColormap GetNativeEvent const char const char dpyName wid window const char font_name cursor keysym reg const char only_if_exist regb h Point_t winding char text const char depth char const char Int_t count const char ColorStruct_t color const char filename
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t index
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t Int_t Int_t Window_t TString Int_t GCValues_t GetPrimarySelectionOwner GetDisplay GetScreen GetColormap GetNativeEvent const char const char dpyName wid window const char font_name cursor keysym reg const char only_if_exist regb h Point_t winding char text const char depth char const char Int_t count const char ColorStruct_t color const char Pixmap_t Pixmap_t PictureAttributes_t attr const char char ret_data h unsigned char height h Atom_t Int_t ULong_t ULong_t unsigned char prop_list Atom_t Atom_t Atom_t Time_t UChar_t len
char name[80]
Definition TGX11.cxx:110
TMatrixTColumn_const< Double_t > TMatrixDColumn_const
#define gROOT
Definition TROOT.h:407
char * Form(const char *fmt,...)
Formats a string in a circular formatting buffer.
Definition TString.cxx:2467
Using a TBrowser one can browse all ROOT objects.
Definition TBrowser.h:37
This class stores the date and time with a precision of one second in an unsigned 32 bit word (950130...
Definition TDatime.h:37
const char * AsString() const
Return the date & time as a string (ctime() format).
Definition TDatime.cxx:102
1-D histogram with a float per channel (see TH1 documentation)}
Definition TH1.h:577
TH1 is the base class of all histogram classes in ROOT.
Definition TH1.h:58
virtual void SetXTitle(const char *title)
Definition TH1.h:415
virtual Bool_t Add(TF1 *h1, Double_t c1=1, Option_t *option="")
Performs the operation: this = this + c1*f1 if errors are defined (see TH1::Sumw2),...
Definition TH1.cxx:807
virtual Int_t Fill(Double_t x)
Increment bin with abscissa X by 1.
Definition TH1.cxx:3345
void Draw(Option_t *option="") override
Draw this histogram with options.
Definition TH1.cxx:3067
virtual void SetYTitle(const char *title)
Definition TH1.h:416
2-D histogram with a float per channel (see TH1 documentation)}
Definition TH2.h:258
A doubly linked list.
Definition TList.h:38
TObject * FindObject(const char *name) const override
Find an object in this list using its name.
Definition TList.cxx:578
void Add(TObject *obj) override
Definition TList.h:81
void Delete(Option_t *option="") override
Remove all objects from the list AND delete all heap based objects.
Definition TList.cxx:470
TMatrixDSymEigen.
const TVectorD & GetEigenValues() const
const TMatrixD & GetEigenVectors() const
Int_t GetNrows() const
virtual TMatrixTBase< Element > & Zero()
Set matrix elements to zero.
Bool_t IsValid() const
TMatrixTSym< Element > & Use(Int_t row_lwb, Int_t row_upb, Element *data)
const Element * GetMatrixArray() const override
Definition TMatrixT.h:227
The TNamed class is the base class for all named ROOT classes.
Definition TNamed.h:29
virtual void SetName(const char *name)
Set the name of the TNamed.
Definition TNamed.cxx:140
TNamed & operator=(const TNamed &rhs)
TNamed assignment operator.
Definition TNamed.cxx:51
virtual void Warning(const char *method, const char *msgfmt,...) const
Issue warning message.
Definition TObject.cxx:973
virtual void Error(const char *method, const char *msgfmt,...) const
Issue error message.
Definition TObject.cxx:987
Principal Components Analysis (PCA)
Definition TPrincipal.h:21
virtual void MakeMethods(const char *classname="PCA", Option_t *option="")
Generate the file <classname>PCA.cxx which contains the implementation of two methods:
virtual void AddRow(const Double_t *x)
Add a data point and update the covariance matrix.
virtual void X2P(const Double_t *x, Double_t *p)
Calculate the principal components from the original data vector x, and return it in p.
Double_t fTrace
Trace of covarience matrix.
Definition TPrincipal.h:38
const Double_t * GetRow(Int_t row)
Return a row of the user supplied data.
TPrincipal()
Empty constructor. Do not use.
void Clear(Option_t *option="") override
Clear the data in Object.
virtual void MakeHistograms(const char *name="pca", Option_t *option="epsdx")
Make histograms of the result of the analysis.
void Print(Option_t *opt="MSE") const override
Print the statistics Options are.
TVectorD fUserData
Vector of original data points.
Definition TPrincipal.h:36
virtual void MakeCode(const char *filename="pca", Option_t *option="")
Generates the file <filename>, with .C appended if it does argument doesn't end in ....
TMatrixD fCovarianceMatrix
Covariance matrix.
Definition TPrincipal.h:29
Int_t fNumberOfVariables
Number of variables.
Definition TPrincipal.h:25
TVectorD fSigmas
vector of sigmas
Definition TPrincipal.h:28
TVectorD fOffDiagonal
Elements of the tridiagonal.
Definition TPrincipal.h:34
TVectorD fEigenValues
Eigenvalue vector of trans.
Definition TPrincipal.h:32
TVectorD fMeanValues
Mean value over all data points.
Definition TPrincipal.h:27
TList * fHistograms
List of histograms.
Definition TPrincipal.h:40
void MakeRealCode(const char *filename, const char *prefix, Option_t *option="")
This is the method that actually generates the code for the transformations to and from feature space...
Bool_t fStoreData
Should we store input data?
Definition TPrincipal.h:43
TMatrixD fEigenVectors
Eigenvector matrix of trans.
Definition TPrincipal.h:31
~TPrincipal() override
Destructor.
virtual void P2X(const Double_t *p, Double_t *x, Int_t nTest)
Calculate x as a function of nTest of the most significant principal components p,...
void Browse(TBrowser *b) override
Browse the TPrincipal object in the TBrowser.
Int_t fNumberOfDataPoints
Number of data points.
Definition TPrincipal.h:24
void MakeNormalised()
Normalize the covariance matrix.
virtual void MakePrincipals()
Perform the principal components analysis.
virtual void SumOfSquareResiduals(const Double_t *x, Double_t *s)
Calculates the sum of the square residuals, that is.
void Test(Option_t *option="")
Test the PCA, bye calculating the sum square of residuals (see method SumOfSquareResiduals),...
Bool_t fIsNormalised
Normalize matrix?
Definition TPrincipal.h:42
TPrincipal & operator=(const TPrincipal &)
Assignment operator.
Basic string class.
Definition TString.h:139
Bool_t EndsWith(const char *pat, ECaseCompare cmp=kExact) const
Return true if string ends with the specified string.
Definition TString.cxx:2222
const char * Data() const
Definition TString.h:380
static TString Format(const char *fmt,...)
Static method which formats a string using a printf style format descriptor and return a TString.
Definition TString.cxx:2356
void Form(const char *fmt,...)
Formats a string using a printf style format descriptor.
Definition TString.cxx:2334
TVectorT< Element > & Zero()
Set vector elements to zero.
Definition TVectorT.cxx:453
TVectorT< Element > & ResizeTo(Int_t lwb, Int_t upb)
Resize the vector to [lwb:upb] .
Definition TVectorT.cxx:294
Int_t GetNrows() const
Definition TVectorT.h:75
Bool_t IsValid() const
Definition TVectorT.h:83
Element * GetMatrixArray()
Definition TVectorT.h:78
Double_t x[n]
Definition legend1.C:17
Double_t Sqrt(Double_t x)
Returns the square root of x.
Definition TMath.h:662
Short_t Abs(Short_t d)
Returns the absolute value of parameter Short_t d.
Definition TMathBase.h:123
auto * t1
Definition textangle.C:20
#define sym(otri1, otri2)
Definition triangle.c:933