Logo ROOT   6.10/09
Reference Guide
portfolio.C
Go to the documentation of this file.
1 /// \file
2 /// \ingroup tutorial_quadp
3 /// \notebook
4 /// This macro shows in detail the use of the quadratic programming package quadp .
5 /// Running this macro :
6 ///
7 /// ~~~{.cpp}
8 /// .x portfolio.C+
9 /// ~~~
10 ///
11 /// or
12 ///
13 /// ~~~{.cpp}
14 /// gSystem->Load("libQuadp");
15 /// .L portFolio.C+; portfolio()
16 /// ~~~
17 ///
18 /// Let's first review what we exactly mean by "quadratic programming" :
19 ///
20 /// We want to minimize the following objective function :
21 ///
22 /// \f$ c^T x + ( 1/2 ) x^T Q x \f$ wrt. the vector \f$ x \f$
23 ///
24 /// \f$ c \f$ is a vector and \f$ Q \f$ a symmetric positive definite matrix
25 ///
26 /// You might wonder what is so special about this objective which is quadratic in
27 /// the unknowns, that can not be done by Minuit/Fumili . Well, we have in addition
28 /// the following boundary conditions on \f$ x \f$:
29 ///
30 /// \f[
31 /// A x = b \\
32 /// clo \le C x \le cup \\
33 /// xlo \le x \le xup
34 /// \f] where A and C are arbitrary matrices and the rest are vectors
35 ///
36 /// Not all these constraints have to be defined . Our example will only use \f$ xlo \f$,
37 /// \f$ A \f$ and \f$ b \f$
38 /// Still, this could be handled by a general non-linear minimizer like Minuit by introducing
39 /// so-called "slack" variables . However, quadp is tailored to objective functions not more
40 /// complex than being quadratic . This allows usage of solving techniques which are even
41 /// stable for problems involving for instance 500 variables, 100 inequality conditions
42 /// and 50 equality conditions .
43 ///
44 /// Enough said about quadratic programming, let's return to our example .
45 /// Suppose, after a long day of doing physics, you have a look at your investments and
46 /// realize that an early retirement is not possible, given the returns of your stocks .
47 /// So what now ? ROOT to the rescue ...
48 ///
49 /// In 1990 Harry Markowitz was awarded the Nobel prize for economics: " his work provided new tools
50 /// for weighing the risks and rewards of different investments and for valuing corporate stocks and bonds" .
51 /// In plain English, he developed the tools to balance greed and fear, we want the maximum return
52 /// with the minimum amount of risk. Our stock portfolio should be at the
53 /// ["Efficient Frontier"](see http://www.riskglossary.com/articles/efficient_frontier.htm).
54 /// To quantify better the risk we are willing to take, we define a utility function \f$ U(x) \f$. It describes
55 /// as a function of our total assets \f$ x \f$, our "satisfaction" . A common choice is \f$ 1-exp(-k*x) \f$ (the reason for
56 /// the exponent will be clear later) . The parameter \f$ k \f$ is the risk-aversion factor . For small values of \f$ k \f$
57 /// the satisfaction is small for small values of \f$ x \f$; by increasing \f$ x \f$ the satisfaction can still be increased
58 /// significantly . For large values of \f$ k \f$, \f$ U(x) \f$ increases rapidly to 1, there is no increase in satisfaction
59 /// for additional dollars earned .
60 ///
61 /// In summary :
62 /// - small \f$ k \f$ ==> risk-loving investor
63 /// - large \f$ k \f$ ==> risk-averse investor
64 ///
65 /// Suppose we have for nrStocks the historical daily returns \f$ r = closing_price(n) - closing_price(n-1) \f$.
66 /// Define a vector \f$ x \f$ of length of \f$ nrStocks \f$, which contains the fraction of our money invested in
67 /// each stock . We can calculate the average daily return \f$ z \f$ of our portfolio and its variance using
68 /// the portfolio covariance Covar :
69 ///
70 /// \f$ z = r^T x \f$ and \f$ var = x^T Covar x \f$
71 ///
72 /// Assuming that the daily returns have a Normal distribution, \f$ N(x) \f$, so will \f$ z \f$ with mean \f$ r^T x \f$
73 /// and variance \f$ x^T Covar x \f$
74 ///
75 /// The expected value of the utility function is :
76 ///
77 /// \f[
78 /// E(u(x)) = Int (1-exp(-k*x) N(x) dx \\
79 /// = 1-exp(-k (r^T x - 0.5 k x^T Covar x) ) \\
80 /// \f]
81 ///
82 /// Its value is maximised by maximising \f$ r^T x -0.5 k x^T Covar x \f$
83 /// under the condition \f$ sum (x_i) = 1 \f$, meaning we want all our money invested and
84 /// \f$ x_i \ge 0 \f$, we can not "short" a stock
85 ///
86 /// For 10 stocks we got the historical daily data for Sep-2000 to Jun-2004:
87 ///
88 /// - GE : General Electric Co
89 /// - SUNW : Sun Microsystems Inc
90 /// - QCOM : Qualcomm Inc
91 /// - BRCM : Broadcom Corp
92 /// - TYC : Tyco International Ltd
93 /// - IBM : International Business Machines Corp
94 /// - AMAT : Applied Materials Inc
95 /// - C : Citigroup Inc
96 /// - PFE : Pfizer Inc
97 /// - HD : Home Depot Inc
98 ///
99 /// We calculate the optimal portfolio for 2.0 and 10.0 .
100 ///
101 /// Food for thought :
102 /// - We assumed that the stock returns have a Normal distribution . Check this assumption by
103 /// histogramming the stock returns !
104 /// - We used for the expected return in the objective function, the flat average over a time
105 /// period . Investment firms will put significant resources in improving the return prediction .
106 /// - If you want to trade significant number of shares, several other considerations have
107 /// to be taken into account :
108 /// + If you are going to buy, you will drive the price up (so-called "slippage") .
109 /// This can be taken into account by adding terms to the objective
110 /// (Google for "slippage optimization")
111 /// + FTC regulations might have to be added to the inequality constraints
112 /// - Investment firms do not want to be exposed to the "market" as defined by a broad
113 /// index like the S&P and "hedge" this exposure away . A perfect hedge this can be added
114 /// as an equality constrain, otherwise add an inequality constrain .
115 ///
116 /// \macro_image
117 /// \macro_output
118 /// \macro_code
119 ///
120 /// \author Eddy Offermann
121 
122 #include "Riostream.h"
123 #include "TCanvas.h"
124 #include "TFile.h"
125 #include "TMath.h"
126 #include "TTree.h"
127 #include "TArrayF.h"
128 #include "TH1.h"
129 #include "TF1.h"
130 #include "TLegend.h"
131 #include "TSystem.h"
132 
133 #include "TMatrixD.h"
134 #include "TMatrixDSym.h"
135 #include "TVectorD.h"
136 #include "TQpProbDens.h"
137 #include "TGondzioSolver.h"
138 
139 const Int_t nrStocks = 10;
140 static const Char_t *stocks[] =
141  {"GE","SUNW","QCOM","BRCM","TYC","IBM","AMAT","C","PFE","HD"};
142 
143 class TStockDaily {
144 public:
145  Int_t fDate;
146  Int_t fOpen; // 100*open_price
147  Int_t fHigh; // 100*high_price
148  Int_t fLow; // 100*low_price
149  Int_t fClose; // 100*close_price
150  Int_t fVol;
151  Int_t fCloseAdj; // 100*close_price adjusted for splits and dividend
152 
153  TStockDaily() {
154  fDate = fVol = fOpen = fHigh = fLow = fClose = fCloseAdj = 0;
155  }
156  virtual ~TStockDaily() {}
157 
158  ClassDef(TStockDaily,1)
159 };
160 
161 //---------------------------------------------------------------------------
162 Double_t RiskProfile(Double_t *x, Double_t *par) {
163  Double_t riskFactor = par[0];
164  return 1-TMath::Exp(-riskFactor*x[0]);
165 }
166 
167 //---------------------------------------------------------------------------
168 TArrayF &StockReturn(TFile *f,const TString &name,Int_t sDay,Int_t eDay)
169 {
170  TTree *tDaily = (TTree*)f->Get(name);
171  TStockDaily *data = 0;
172  tDaily->SetBranchAddress("daily",&data);
173  TBranch *b_closeAdj = tDaily->GetBranch("fCloseAdj");
174  TBranch *b_date = tDaily->GetBranch("fDate");
175 
176  //read only the "adjusted close" branch for all entries
177  const Int_t nrEntries = (Int_t)tDaily->GetEntries();
178  TArrayF closeAdj(nrEntries);
179  for (Int_t i = 0; i < nrEntries; i++) {
180  b_date->GetEntry(i);
181  b_closeAdj->GetEntry(i);
182  if (data->fDate >= sDay && data->fDate <= eDay)
183  closeAdj[i] = data->fCloseAdj/100.;
184  }
185 
186  TArrayF *r = new TArrayF(nrEntries-1);
187  for (Int_t i = 1; i < nrEntries; i++)
188 // (*r)[i-1] = closeAdj[i]-closeAdj[i-1];
189  (*r)[i-1] = closeAdj[i]/closeAdj[i-1];
190 
191  return *r;
192 }
193 
194 #ifndef __MAKECINT__
195 //---------------------------------------------------------------------------
196 TVectorD OptimalInvest(Double_t riskFactor,TVectorD r,TMatrixDSym Covar)
197 {
198 // what the quadratic programming package will do:
199 //
200 // minimize c^T x + ( 1/2 ) x^T Q x
201 // subject to A x = b
202 // clo <= C x <= cup
203 // xlo <= x <= xup
204 // what we want :
205 //
206 // maximize c^T x - k ( 1/2 ) x^T Q x
207 // subject to sum_x x_i = 1
208 // 0 <= x_i
209 
210  // We have nrStocks weights to determine,
211  // 1 equality- and 0 inequality- equations (the simple square boundary
212  // condition (xlo <= x <= xup) does not count)
213 
214  const Int_t nrVar = nrStocks;
215  const Int_t nrEqual = 1;
216  const Int_t nrInEqual = 0;
217 
218  // flip the sign of the objective function because we want to maximize
219  TVectorD c = -1.*r;
220  TMatrixDSym Q = riskFactor*Covar;
221 
222  // equality equation
223  TMatrixD A(nrEqual,nrVar); A = 1;
224  TVectorD b(nrEqual); b = 1;
225 
226  // inequality equation
227  //
228  // - although not applicable in the current situation since nrInEqual = 0, one
229  // has to specify not only clo and cup but also an index vector iclo and icup,
230  // whose values are either 0 or 1 . If iclo[j] = 1, the lower boundary condition
231  // is active on x[j], etc. ...
232 
233  TMatrixD C (nrInEqual,nrVar);
234  TVectorD clo (nrInEqual);
235  TVectorD cup (nrInEqual);
236  TVectorD iclo(nrInEqual);
237  TVectorD icup(nrInEqual);
238 
239  // simple square boundary condition : 0 <= x_i, so only xlo is relevant .
240  // Like for clo and cup above, we have to define an index vector ixlo and ixup .
241  // Since each variable has the lower boundary, we can set the whole vector
242  // ixlo = 1
243 
244  TVectorD xlo (nrVar); xlo = 0;
245  TVectorD xup (nrVar); xup = 0;
246  TVectorD ixlo(nrVar); ixlo = 1;
247  TVectorD ixup(nrVar); ixup = 0;
248 
249  // setup the quadratic programming problem . Since a small number of variables are
250  // involved and "Q" has everywhere entries, we chose the dense version "TQpProbDens" .
251  // In case of a sparse formulation, simply replace all "Dens" by "Sparse" below and
252  // use TMatrixDSparse instead of TMatrixDSym and TMatrixD
253 
254  TQpProbDens *qp = new TQpProbDens(nrVar,nrEqual,nrInEqual);
255 
256  // stuff all the matrices/vectors defined above in the proper places
257 
258  TQpDataDens *prob = (TQpDataDens *)qp->MakeData(c,Q,xlo,ixlo,xup,ixup,A,b,C,clo,iclo,cup,icup);
259 
260  // setup the nrStock variables, vars->fX will contain the final solution
261 
262  TQpVar *vars = qp->MakeVariables(prob);
263  TQpResidual *resid = qp->MakeResiduals(prob);
264 
265  // Now we have to choose the method of solving, either TGondzioSolver or TMehrotraSolver
266  // The Gondzio method is more sophisticated and therefore numerically more involved
267  // If one want the Mehrotra method, simply replace "Gondzio" by "Mehrotra" .
268 
269  TGondzioSolver *s = new TGondzioSolver(qp,prob);
270  const Int_t status = s->Solve(prob,vars,resid);
271 
272  const TVectorD weight = vars->fX;
273 
274  delete qp; delete prob; delete vars; delete resid; delete s;
275  if (status != 0) {
276  cout << "Could not solve this problem." <<endl;
277  return TVectorD(nrStocks);
278  }
279 
280  return weight;
281 }
282 #endif
283 
284  //---------------------------------------------------------------------------
285 void portfolio()
286 {
287  const Int_t sDay = 20000809;
288  const Int_t eDay = 20040602;
289 
290  const char *fname = "stock.root";
291  TFile *f = 0;
292  if (!gSystem->AccessPathName(fname)) {
293  f = TFile::Open(fname);
294  } else if (!gSystem->AccessPathName(Form("%s/quadp/%s", TROOT::GetTutorialDir().Data(), fname))) {
295  f = TFile::Open(Form("%s/quadp/%s", TROOT::GetTutorialDir().Data(), fname));
296  } else {
297  printf("accessing %s file from http://root.cern.ch/files\n",fname);
298  f = TFile::Open(Form("http://root.cern.ch/files/%s",fname));
299  }
300  if (!f) return;
301 
302  TArrayF *data = new TArrayF[nrStocks];
303  for (Int_t i = 0; i < nrStocks; i++) {
304  const TString symbol = stocks[i];
305  data[i] = StockReturn(f,symbol,sDay,eDay);
306  }
307 
308  const Int_t nrData = data[0].GetSize();
309 
310  TVectorD r(nrStocks);
311  for (Int_t i = 0; i < nrStocks; i++)
312  r[i] = data[i].GetSum()/nrData;
313 
314  TMatrixDSym Covar(nrStocks);
315  for (Int_t i = 0; i < nrStocks; i++) {
316  for (Int_t j = 0; j <= i; j++) {
317  Double_t sum = 0.;
318  for (Int_t k = 0; k < nrData; k++) {
319  sum += (data[i][k] - r[i]) * (data[j][k] - r[j]);
320  }
321  Covar(i,j) = Covar(j,i) = sum/nrData;
322  }
323  }
324 
325  const TVectorD weight1 = OptimalInvest(2.0,r,Covar);
326  const TVectorD weight2 = OptimalInvest(10.,r,Covar);
327 
328  cout << "stock daily daily w1 w2" <<endl;
329  cout << "symb return sdv " <<endl;
330  for (Int_t i = 0; i < nrStocks; i++)
331  printf("%s\t: %.3f %.3f %.3f %.3f\n",stocks[i],r[i],TMath::Sqrt(Covar[i][i]),weight1[i],weight2[i]);
332 
333  TCanvas *c1 = new TCanvas("c1","Portfolio Optimizations",10,10,800,900);
334  c1->Divide(1,2);
335 
336  // utility function / risk profile
337 
338  c1->cd(1);
339  gPad->SetGridx();
340  gPad->SetGridy();
341 
342  TF1 *f1 = new TF1("f1",RiskProfile,0,2.5,1);
343  f1->SetParameter(0,2.0);
344  f1->SetLineColor(49);
345  f1->Draw("AC");
346  f1->GetHistogram()->SetXTitle("dollar");
347  f1->GetHistogram()->SetYTitle("utility");
348  f1->GetHistogram()->SetMinimum(0.0);
349  f1->GetHistogram()->SetMaximum(1.0);
350  TF1 *f2 = new TF1("f2",RiskProfile,0,2.5,1);
351  f2->SetParameter(0,10.);
352  f2->SetLineColor(50);
353  f2->Draw("CSAME");
354 
355  TLegend *legend1 = new TLegend(0.50,0.65,0.70,0.82);
356  legend1->AddEntry(f1,"1-exp(-2.0*x)","l");
357  legend1->AddEntry(f2,"1-exp(-10.*x)","l");
358  legend1->Draw();
359 
360  // vertical bar chart of portfolio distribution
361 
362  c1->cd(2);
363  TH1F *h1 = new TH1F("h1","Portfolio Distribution",nrStocks,0,0);
364  TH1F *h2 = new TH1F("h2","Portfolio Distribution",nrStocks,0,0);
365  h1->SetStats(0);
366  h1->SetFillColor(49);
367  h2->SetFillColor(50);
368  h1->SetBarWidth(0.45);
369  h1->SetBarOffset(0.1);
370  h2->SetBarWidth(0.4);
371  h2->SetBarOffset(0.55);
372  for (Int_t i = 0; i < nrStocks; i++) {
373  h1->Fill(stocks[i],weight1[i]);
374  h2->Fill(stocks[i],weight2[i]);
375  }
376 
377  h1->Draw("BAR2 HIST");
378  h2->Draw("BAR2SAME HIST");
379 
380  TLegend *legend2 = new TLegend(0.50,0.65,0.70,0.82);
381  legend2->AddEntry(h1,"high risk","f");
382  legend2->AddEntry(h2,"low risk","f");
383  legend2->Draw();
384 }
virtual Bool_t AccessPathName(const char *path, EAccessMode mode=kFileExists)
Returns FALSE if one can access a file using the specified access mode.
Definition: TSystem.cxx:1272
double par[1]
Definition: unuranDistr.cxx:38
virtual void SetBarOffset(Float_t offset=0.25)
Definition: TH1.h:337
virtual Int_t Fill(Double_t x)
Increment bin with abscissa X by 1.
Definition: TH1.cxx:3126
static long int sum(long int i)
Definition: Factory.cxx:2162
virtual void SetMaximum(Double_t maximum=-1111)
Definition: TH1.h:375
This class displays a legend box (TPaveText) containing several legend entries.
Definition: TLegend.h:23
static const TString & GetTutorialDir()
Get the tutorials directory in the installation. Static utility function.
Definition: TROOT.cxx:2865
return c1
Definition: legend1.C:41
virtual void Draw(Option_t *option="")
Draw this legend with its current attributes.
Definition: TLegend.cxx:452
THist< 1, float, THistStatContent, THistStatUncertainty > TH1F
Definition: THist.hxx:311
A ROOT file is a suite of consecutive data records (TKey instances) with a well defined format...
Definition: TFile.h:46
TVirtualPad * cd(Int_t subpadnumber=0)
Set current canvas & pad.
Definition: TCanvas.cxx:679
TVectorT.
Definition: TMatrixTBase.h:77
virtual TObject * Get(const char *namecycle)
Return pointer to object identified by namecycle.
virtual void SetMinimum(Double_t minimum=-1111)
Definition: TH1.h:376
Basic string class.
Definition: TString.h:129
tomato 1-D histogram with a float per channel (see TH1 documentation)}
Definition: TH1.h:551
Array of floats (32 bits per element).
Definition: TArrayF.h:27
virtual TQpResidual * MakeResiduals(const TQpDataBase *data)
Setup the residuals.
int Int_t
Definition: RtypesCore.h:41
virtual void SetYTitle(const char *title)
Definition: TH1.h:390
virtual void Draw(Option_t *option="")
Draw this function with its current attributes.
Definition: TF1.cxx:1087
TVectorD fX
Definition: TQpVar.h:91
static double A[]
virtual void SetBarWidth(Float_t width=0.5)
Definition: TH1.h:338
TMatrixT.
Definition: TMatrixDfwd.h:22
static TFile * Open(const char *name, Option_t *option="", const char *ftitle="", Int_t compress=1, Int_t netopt=0)
Create / open a file.
Definition: TFile.cxx:3909
Double_t x[n]
Definition: legend1.C:17
#define ClassDef(name, id)
Definition: Rtypes.h:297
virtual Int_t SetBranchAddress(const char *bname, void *add, TBranch **ptr=0)
Change branch address, dealing with clone trees properly.
Definition: TTree.cxx:7873
std::vector< std::vector< double > > Data
TH1F * h1
Definition: legend1.C:5
TVectorT< Double_t > TVectorD
Definition: TVectorDfwd.h:22
virtual TBranch * GetBranch(const char *name)
Return pointer to the branch with the given name in this tree or its friends.
Definition: TTree.cxx:4979
virtual void SetLineColor(Color_t lcolor)
Set the line color.
Definition: TAttLine.h:40
virtual Int_t Solve(TQpDataBase *prob, TQpVar *iterate, TQpResidual *resid)
Solve the quadratic programming problem as formulated through prob, store the final solution in itera...
static double C[]
TRandom2 r(17)
R__EXTERN TSystem * gSystem
Definition: TSystem.h:539
virtual void Draw(Option_t *option="")
Draw this histogram with options.
Definition: TH1.cxx:2851
Int_t GetSize() const
Definition: TArray.h:47
virtual void SetFillColor(Color_t fcolor)
Set the fill area color.
Definition: TAttFill.h:37
char * Form(const char *fmt,...)
virtual Int_t GetEntry(Long64_t entry=0, Int_t getall=0)
Read all leaves of entry and return total number of bytes read.
Definition: TBranch.cxx:1291
The Canvas class.
Definition: TCanvas.h:31
Double_t Exp(Double_t x)
Definition: TMath.h:622
double f(double x)
double Double_t
Definition: RtypesCore.h:55
TLegendEntry * AddEntry(const TObject *obj, const char *label="", Option_t *option="lpf")
Add a new entry to this legend.
Definition: TLegend.cxx:359
virtual Long64_t GetEntries() const
Definition: TTree.h:381
char Char_t
Definition: RtypesCore.h:29
virtual void SetXTitle(const char *title)
Definition: TH1.h:389
Definition: TQpVar.h:59
virtual void Divide(Int_t nx=1, Int_t ny=1, Float_t xmargin=0.01, Float_t ymargin=0.01, Int_t color=0)
Automatic pad generation by division.
Definition: TPad.cxx:1135
double f2(const double *x)
1-Dim function class
Definition: TF1.h:150
TF1 * f1
Definition: legend1.C:11
you should not use this method at all Int_t Int_t Double_t Double_t Double_t Int_t Double_t Double_t Double_t Double_t b
Definition: TRolke.cxx:630
#define gPad
Definition: TVirtualPad.h:284
A TTree object has a header with a name and a title.
Definition: TTree.h:78
virtual TH1 * GetHistogram() const
Return a pointer to the histogram used to visualise the function.
Definition: TF1.cxx:1311
virtual void SetParameter(Int_t param, Double_t value)
Definition: TF1.h:578
Double_t Sqrt(Double_t x)
Definition: TMath.h:591
virtual TQpDataBase * MakeData(Double_t *c, Double_t *Q, Double_t *xlo, Bool_t *ixlo, Double_t *xup, Bool_t *ixup, Double_t *A, Double_t *bA, Double_t *C, Double_t *clo, Bool_t *iclo, Double_t *cup, Bool_t *icup)
Setup the data.
Definition: TQpProbDens.cxx:80
A TTree is a list of TBranches.
Definition: TBranch.h:57
virtual void SetStats(Bool_t stats=kTRUE)
Set statistics option on/off.
Definition: TH1.cxx:8103
static double Q[]
virtual TQpVar * MakeVariables(const TQpDataBase *data)
Setup the variables.