Logo ROOT  
Reference Guide
principal.py
Go to the documentation of this file.
1## \file
2## \ingroup tutorial_math
3## \notebook
4## Principal Components Analysis (PCA) example
5##
6## Example of using TPrincipal as a stand alone class.
7##
8## I create n-dimensional data points, where c = trunc(n / 5) + 1
9## are correlated with the rest n - c randomly distributed variables.
10##
11## Based on principal.C by Rene Brun and Christian Holm Christensen
12##
13## \macro_output
14## \macro_code
15##
16## \authors Juan Fernando Jaramillo Botero
17
18from ROOT import TPrincipal, gRandom, TBrowser, vector
19
20
21n = 10
22m = 10000
23
24c = int(n / 5) + 1
25
26print ("""*************************************************
27* Principal Component Analysis *
28* *
29* Number of variables: {0:4d} *
30* Number of data points: {1:8d} *
31* Number of dependent variables: {2:4d} *
32* *
33*************************************************""".format(n, m, c))
34
35# Initilase the TPrincipal object. Use the empty string for the
36# final argument, if you don't wan't the covariance
37# matrix. Normalising the covariance matrix is a good idea if your
38# variables have different orders of magnitude.
39principal = TPrincipal(n, "ND")
40
41# Use a pseudo-random number generator
42randumNum = gRandom
43
44# Make the m data-points
45# Make a variable to hold our data
46# Allocate memory for the data point
47data = vector('double')()
48for i in range(m):
49 # First we create the un-correlated, random variables, according
50 # to one of three distributions
51 for j in range(n - c):
52 if j % 3 == 0:
53 data.push_back(randumNum.Gaus(5, 1))
54 elif j % 3 == 1:
55 data.push_back(randumNum.Poisson(8))
56 else:
57 data.push_back(randumNum.Exp(2))
58
59 # Then we create the correlated variables
60 for j in range(c):
61 data.push_back(0)
62 for k in range(n - c - j):
63 data[n - c + j] += data[k]
64
65 # Finally we're ready to add this datapoint to the PCA
66 principal.AddRow(data.data())
67 data.clear()
68
69# Do the actual analysis
70principal.MakePrincipals()
71
72# Print out the result on
73principal.Print()
74
75# Test the PCA
76principal.Test()
77
78# Make some histograms of the orginal, principal, residue, etc data
79principal.MakeHistograms()
80
81# Make two functions to map between feature and pattern space
82# Start a browser, so that we may browse the histograms generated
83# above
84principal.MakeCode()
85b = TBrowser("principalBrowser", principal)
Using a TBrowser one can browse all ROOT objects.
Definition: TBrowser.h:37