Logo ROOT   6.14/05
Reference Guide
principal.py
Go to the documentation of this file.
1 ## \file
2 ## \ingroup tutorial_math
3 ## \notebook
4 ## Principal Components Analysis (PCA) example
5 ##
6 ## Example of using TPrincipal as a stand alone class.
7 ##
8 ## I create n-dimensional data points, where c = trunc(n / 5) + 1
9 ## are correlated with the rest n - c randomly distributed variables.
10 ##
11 ## Based on principal.C by Rene Brun and Christian Holm Christensen
12 ##
13 ## \macro_output
14 ## \macro_code
15 ##
16 ## \authors Juan Fernando Jaramillo Botero
17 
18 from ROOT import TPrincipal, gRandom, TBrowser, vector
19 
20 
21 n = 10
22 m = 10000
23 
24 c = int(n / 5) + 1
25 
26 print ("""*************************************************
27 * Principal Component Analysis *
28 * *
29 * Number of variables: {0:4d} *
30 * Number of data points: {1:8d} *
31 * Number of dependent variables: {2:4d} *
32 * *
33 *************************************************""".format(n, m, c))
34 
35 # Initilase the TPrincipal object. Use the empty string for the
36 # final argument, if you don't wan't the covariance
37 # matrix. Normalising the covariance matrix is a good idea if your
38 # variables have different orders of magnitude.
39 principal = TPrincipal(n, "ND")
40 
41 # Use a pseudo-random number generator
42 randumNum = gRandom
43 
44 # Make the m data-points
45 # Make a variable to hold our data
46 # Allocate memory for the data point
47 data = vector('double')()
48 for i in range(m):
49  # First we create the un-correlated, random variables, according
50  # to one of three distributions
51  for j in range(n - c):
52  if j % 3 == 0:
53  data.push_back(randumNum.Gaus(5, 1))
54  elif j % 3 == 1:
55  data.push_back(randumNum.Poisson(8))
56  else:
57  data.push_back(randumNum.Exp(2))
58 
59  # Then we create the correlated variables
60  for j in range(c):
61  data.push_back(0)
62  for k in range(n - c - j):
63  data[n - c + j] += data[k]
64 
65  # Finally we're ready to add this datapoint to the PCA
66  principal.AddRow(data.data())
67  data.clear()
68 
69 # Do the actual analysis
70 principal.MakePrincipals()
71 
72 # Print out the result on
73 principal.Print()
74 
75 # Test the PCA
76 principal.Test()
77 
78 # Make some histograms of the orginal, principal, residue, etc data
79 principal.MakeHistograms()
80 
81 # Make two functions to map between feature and pattern space
82 # Start a browser, so that we may browse the histograms generated
83 # above
84 principal.MakeCode()
85 b = TBrowser("principalBrowser", principal)
Principal Components Analysis (PCA)
Definition: TPrincipal.h:20
Using a TBrowser one can browse all ROOT objects.
Definition: TBrowser.h:37