
 // @(#)root/physics:$Id$

 // Author: Anna Kreshuk  08/10/2004


 /*************************************************************************

  * Copyright (C) 1995-2004, Rene Brun and Fons Rademakers.               *

  * All rights reserved.                                                  *

  *                                                                       *

  * For the licensing terms see $ROOTSYS/LICENSE.                         *

  * For the list of contributors see $ROOTSYS/README/CREDITS.             *

  *************************************************************************/


 //////////////////////////////////////////////////////////////////////////////

 //

 //  TRobustEstimator

 //

 // Minimum Covariance Determinant Estimator - a Fast Algorithm

 // invented by Peter J. Rousseeuw and Katrien Van Driessen

 // "A Fast Algorithm for the Minimum Covariance Determinant Estimator"

 // Technometrics, August 1999, Vol. 41, No. 3

 //

 // What are robust estimators?

 // "An important property of an estimator is its robustness. An estimator

 // is called robust if it is insensitive to measurements that deviate

 // from the expected behaviour. There are 2 ways to treat such deviating

 // measurements: one may either try to recognize them and then remove

 // them from the data sample; or one may leave them in the sample, taking

 // care that they do not influence the estimate unduly. In both cases robust

 // estimators are needed...Robust procedures compensate for systematic errors

 // as much as possible, and indicate any situation in which a danger of not being

 // able to operate reliably is detected."

 // R.Fruhwirth, M.Regler, R.K.Bock, H.Grote, D.Notz

 // "Data Analysis Techniques for High-Energy Physics", 2nd edition

 //

 // What does this algorithm do?

 // It computes a highly robust estimator of multivariate location and scatter.

 // Then, it takes those estimates to compute robust distances of all the

 // data vectors. Those with large robust distances are considered outliers.

 // Robust distances can then be plotted for better visualization of the data.

 //

 // How does this algorithm do it?

 // The MCD objective is to find h observations (out of n) whose classical

 // covariance matrix has the lowest determinant. The MCD estimator of location

 // is then the average of those h points and the MCD estimate of scatter

 // is their covariance matrix. The minimum (and default) h = (n+nvariables+1)/2,

 // so the algorithm is effective when fewer than n-h+1 observations are outliers.

 // The algorithm also allows for exact fit situations - that is, when h or more

 // observations lie on a hyperplane. Then the algorithm still yields the MCD location T

 // and scatter matrix S, the latter being singular as it should be. From (T,S) the

 // program then computes the equation of the hyperplane.

 //

 // How can this algorithm be used?

 // It can be used whenever the data are suspected to contain contamination

 // that might influence the classical estimates.

 // Also, robust estimation of location and scatter is a tool to robustify

 // other multivariate techniques such as, for example, principal-component analysis

 // and discriminant analysis.

 //

 //

 //

 //

 // Technical details of the algorithm:

 // 0.The default h = (n+nvariables+1)/2, but the user may choose any integer h with

 //   (n+nvariables+1)/2<=h<=n. The program then reports the MCD's breakdown value

 //   (n-h+1)/n. If you are sure that the dataset contains less than 25% contamination,

 //   which is usually the case, a good compromise between breakdown value and

 //   efficiency is obtained by putting h=[.75*n].

 // 1.If h=n, the MCD location estimate is the average of the whole dataset, and

 //   the MCD scatter estimate is its covariance matrix. Report this and stop.

 // 2.If nvariables=1 (univariate data), compute the MCD estimate by the exact

 //   algorithm of Rousseeuw and Leroy (1987, pp.171-172) in O(n log n) time and stop.

 // 3.From here on, h<n and nvariables>=2.

 //   3a.If n is small:

 //    - repeat (say) 500 times:

 //    -- construct an initial h-subset, starting from a random (nvar+1)-subset

 //    -- carry out 2 C-steps (described in the comments of CStep function)

 //    - for the 10 results with lowest det(S):

 //    -- carry out C-steps until convergence

 //    - report the solution (T, S) with the lowest det(S)

 //   3b.If n is larger (say, n>600), then

 //    - construct up to 5 disjoint random subsets of size nsub (say, nsub=300)

 //    - inside each subset repeat 500/5 times:

 //    -- construct an initial subset of size hsub=[nsub*h/n]

 //    -- carry out 2 C-steps

 //    -- keep the best 10 results (Tsub, Ssub)

 //    - pool the subsets, yielding the merged set (say, of size nmerged=1500)

 //    - in the merged set, repeat for each of the 50 solutions (Tsub, Ssub)

 //    -- carry out 2 C-steps

 //    -- keep the 10 best results

 //    - in the full dataset, repeat for those best results:

 //    -- take several C-steps, using n and h

 //    -- report the best final result (T, S)

 // 4.To obtain consistency when the data come from a multivariate normal

 //   distribution, the covariance matrix is multiplied by a correction factor.

 // 5.Robust distances for all elements are calculated using the final (T, S).

 //   The final mean and covariance estimates are then calculated only from

 //   values whose robust distances are less than a cutoff value (the 0.975 quantile

 //   of the chi2 distribution with nvariables degrees of freedom).

 //

 //////////////////////////////////////////////////////////////////////////////
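
 /*
  A small standalone sketch of the subset-size arithmetic described in the
  comment above. It does not use ROOT, and the helper names DefaultH and
  BreakdownValue are hypothetical, introduced only for illustration. The two
  formulas, h=(n+nvariables+1)/2 and (n-h+1)/n, are the ones quoted above.

```cpp
#include <cassert>

// Hypothetical helper: smallest (and default) subset size h for n observations
// of nvar variables, using integer division as in h=(n+nvar+1)/2.
inline int DefaultH(int n, int nvar)
{
   assert(n > 0 && nvar > 0);
   return (n + nvar + 1) / 2;
}

// Hypothetical helper: breakdown value (n-h+1)/n as a floating-point fraction.
inline double BreakdownValue(int n, int h)
{
   assert(n > 0 && h > 0 && h <= n);
   return static_cast<double>(n - h + 1) / n;
}
```

  For n=100 and nvariables=2 this gives h=51 and a breakdown value of 0.5;
  choosing h=75 instead lowers the breakdown value to 0.26.
 */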


 #include "TRobustEstimator.h"

 #include "TRandom.h"

 #include "TMath.h"

 #include "TH1D.h"

 #include "TPaveLabel.h"

 #include "TDecompChol.h"


 ClassImp(TRobustEstimator)


 const Double_t kChiMedian[50]= {

          0.454937, 1.38629, 2.36597, 3.35670, 4.35146, 5.34812, 6.34581, 7.34412, 8.34283,

          9.34182, 10.34, 11.34, 12.34, 13.34, 14.34, 15.34, 16.34, 17.34, 18.34, 19.34,

         20.34, 21.34, 22.34, 23.34, 24.34, 25.34, 26.34, 27.34, 28.34, 29.34, 30.34,

         31.34, 32.34, 33.34, 34.34, 35.34, 36.34, 37.34, 38.34, 39.34, 40.34,

         41.34, 42.34, 43.34, 44.34, 45.34, 46.34, 47.34, 48.34, 49.33};


 const Double_t kChiQuant[50]={

          5.02389, 7.3776,9.34840,11.1433,12.8325,

         14.4494,16.0128,17.5346,19.0228,20.4831,21.920,23.337,

         24.736,26.119,27.488,28.845,30.191,31.526,32.852,34.170,

         35.479,36.781,38.076,39.364,40.646,41.923,43.194,44.461,

         45.722,46.979,48.232,49.481,50.725,51.966,53.203,54.437,

         55.668,56.896,58.120,59.342,60.561,61.777,62.990,64.201,

         65.410,66.617,67.821,69.022,70.222,71.420};


 ////////////////////////////////////////////////////////////////////////////////

 ///this constructor should be used in the univariate case:

 ///first call this constructor, then call the EvaluateUni(..) function


 TRobustEstimator::TRobustEstimator(){

 }


 ////////////////////////////////////////////////////////////////////////////////

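 ///constructor for the multivariate case (nvariables>=2)

 ///hh is the subset size h; if it is smaller than (nvectors+nvariables+1)/2,

 ///this default value is used instead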


 TRobustEstimator::TRobustEstimator(Int_t nvectors, Int_t nvariables, Int_t hh)

    :fMean(nvariables),

     fCovariance(nvariables),

     fInvcovariance(nvariables),

     fCorrelation(nvariables),

     fRd(nvectors),

     fSd(nvectors),

     fOut(1),

     fHyperplane(nvariables),

     fData(nvectors, nvariables)

 {

    if ((nvectors<=1)||(nvariables<=0)){

       Error("TRobustEstimator","Not enough vectors or variables");

       return;

    }

    if (nvariables==1){

       Error("TRobustEstimator","For the univariate case, use the default constructor and EvaluateUni() function");

       return;

    }


    fN=nvectors;

    fNvar=nvariables;

    if (hh<(fN+fNvar+1)/2){

       if (hh>0)

          Warning("TRobustEstimator","chosen h is too small, default h is taken instead");

       fH=(fN+fNvar+1)/2;

    } else

       fH=hh;


    fVarTemp=0;

    fVecTemp=0;

    fExact=0;

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///adds a column to the data matrix

 ///it is assumed that the column has size fN

 ///variable fVarTemp keeps the number of columns

 ///already added


 void TRobustEstimator::AddColumn(Double_t *col)

 {

    if (fVarTemp==fNvar) {

       fNvar++;

       fCovariance.ResizeTo(fNvar, fNvar);

       fInvcovariance.ResizeTo(fNvar, fNvar);

       fCorrelation.ResizeTo(fNvar, fNvar);

       fMean.ResizeTo(fNvar);

       fHyperplane.ResizeTo(fNvar);

       fData.ResizeTo(fN, fNvar);

    }

    for (Int_t i=0; i<fN; i++) {

       fData(i, fVarTemp)=col[i];

    }

    fVarTemp++;

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///adds a vector to the data matrix

 ///it is supposed that the vector is of size fNvar


 void TRobustEstimator::AddRow(Double_t *row)

 {

    if(fVecTemp==fN) {

       fN++;

       fRd.ResizeTo(fN);

       fSd.ResizeTo(fN);

       fData.ResizeTo(fN, fNvar);

    }

    for (Int_t i=0; i<fNvar; i++)

       fData(fVecTemp, i)=row[i];


    fVecTemp++;

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///Finds the estimate of multivariate mean and covariance


 void TRobustEstimator::Evaluate()

 {

    Double_t kEps=1e-14;


    if (fH==fN){

       Warning("Evaluate","Chosen h = #observations, so classic estimates of location and scatter will be calculated");

       Classic();

       return;

    }


    Int_t i, j, k;

    Int_t ii, jj;

    Int_t nmini = 300;

    Int_t k1=500;

    Int_t nbest=10;

    TMatrixD sscp(fNvar+1, fNvar+1);

    TVectorD vec(fNvar);


    Int_t *index = new Int_t[fN];

    Double_t *ndist = new Double_t[fN];

    Double_t det;

    Double_t *deti=new Double_t[nbest];

    for (i=0; i<nbest; i++)

       deti[i]=1e16;


    for (i=0; i<fN; i++)

       fRd(i)=0;

    ////////////////////////////

    //for small n

    ////////////////////////////

    if (fN<nmini*2) {

       //for storing the best fMeans and covariances


       TMatrixD mstock(nbest, fNvar);

       TMatrixD cstock(fNvar, fNvar*nbest);


       for (k=0; k<k1; k++) {

          CreateSubset(fN, fH, fNvar, index, fData, sscp, ndist);

          //calculate the mean and covariance of the created subset

          ClearSscp(sscp);

          for (i=0; i<fH; i++) {

             for(j=0; j<fNvar; j++)

                vec(j)=fData[index[i]][j];

             AddToSscp(sscp, vec);

          }

          Covar(sscp, fMean, fCovariance, fSd, fH);

          det = fCovariance.Determinant();

          if (det < kEps) {

             fExact = Exact(ndist);

             delete [] index;

             delete [] ndist;

             delete [] deti;

             return;

          }

          //make 2 CSteps

          det = CStep(fN, fH, index, fData, sscp, ndist);

          if (det < kEps) {

             fExact = Exact(ndist);

             delete [] index;

             delete [] ndist;

             delete [] deti;

             return;

          }

          det = CStep(fN, fH, index, fData, sscp, ndist);

          if (det < kEps) {

             fExact = Exact(ndist);

             delete [] index;

             delete [] ndist;

             delete [] deti;

             return;

          } else {

             Int_t maxind=TMath::LocMax(nbest, deti);

             if(det<deti[maxind]) {

                deti[maxind]=det;

                for(ii=0; ii<fNvar; ii++) {

                   mstock(maxind, ii)=fMean(ii);

                   for(jj=0; jj<fNvar; jj++)

                      cstock(ii, jj+maxind*fNvar)=fCovariance(ii, jj);

                }

             }

          }

       }


       //now for nbest best results perform CSteps until convergence


       for (i=0; i<nbest; i++) {

          for(ii=0; ii<fNvar; ii++) {

             fMean(ii)=mstock(i, ii);

             for (jj=0; jj<fNvar; jj++)

                fCovariance(ii, jj)=cstock(ii, jj+i*fNvar);

          }


          det=1;

          while (det>kEps) {

             det=CStep(fN, fH, index, fData, sscp, ndist);

             if(TMath::Abs(det-deti[i])<kEps)

                break;

             else

                deti[i]=det;

          }

          for(ii=0; ii<fNvar; ii++) {

             mstock(i,ii)=fMean(ii);

             for (jj=0; jj<fNvar; jj++)

                cstock(ii,jj+i*fNvar)=fCovariance(ii, jj);

          }

       }


       Int_t detind=TMath::LocMin(nbest, deti);

       for(ii=0; ii<fNvar; ii++) {

          fMean(ii)=mstock(detind,ii);


          for(jj=0; jj<fNvar; jj++)

             fCovariance(ii, jj)=cstock(ii,jj+detind*fNvar);

       }


       if (deti[detind]!=0) {

          //calculate robust distances and throw out the bad points

          Int_t nout = RDist(sscp);

          Double_t cutoff=kChiQuant[fNvar-1];


          fOut.Set(nout);


          j=0;

          for (i=0; i<fN; i++) {

             if(fRd(i)>cutoff) {

                fOut[j]=i;

                j++;

             }

          }


       } else {

          fExact=Exact(ndist);

       }

       delete [] index;

       delete [] ndist;

       delete [] deti;

       return;


    }

    /////////////////////////////////////////////////

    //if n>=2*nmini, the dataset should be partitioned

    //partitioning

    ////////////////////////////////////////////////

    Int_t indsubdat[5];

    Int_t nsub;

    for (ii=0; ii<5; ii++)

       indsubdat[ii]=0;


    nsub = Partition(nmini, indsubdat);


    Int_t sum=0;

    for (ii=0; ii<5; ii++)

       sum+=indsubdat[ii];

    Int_t *subdat=new Int_t[sum];

    //printf("allocates subdat[ %d ]\n", sum);

    // init the subdat matrix

    for (int iii = 0; iii < sum; ++iii) subdat[iii] = -999;

    RDraw(subdat, nsub, indsubdat);

    for (int iii = 0; iii < sum; ++iii) {

       if (subdat[iii] < 0 || subdat[iii] >= fN ) {

          Error("Evaluate","subdat index is invalid subdat[%d] = %d",iii, subdat[iii] );

          R__ASSERT(0);

       }

    }

    //now the indexes of selected cases are in the array subdat

    //matrices to store best means and covariances

    Int_t nbestsub=nbest*nsub;

    TMatrixD mstockbig(nbestsub, fNvar);

    TMatrixD cstockbig(fNvar, fNvar*nbestsub);

    TMatrixD hyperplane(nbestsub, fNvar);

    for (i=0; i<nbestsub; i++) {

       for(j=0; j<fNvar; j++)

          hyperplane(i,j)=0;

    }

    Double_t *detibig = new Double_t[nbestsub];

    Int_t maxind;

    maxind=TMath::LocMax(5, indsubdat);

    TMatrixD dattemp(indsubdat[maxind], fNvar);


    Int_t k2=Int_t(k1/nsub);

    //construct h-subsets and perform 2 CSteps in subgroups


    for (Int_t kgroup=0; kgroup<nsub; kgroup++) {

       //printf("group #%d\n", kgroup);

       Int_t ntemp=indsubdat[kgroup];

       Int_t temp=0;

       for (i=0; i<kgroup; i++)

          temp+=indsubdat[i];

       Int_t par;


       for(i=0; i<ntemp; i++) {

          for (j=0; j<fNvar; j++) {

             dattemp(i,j)=fData[subdat[temp+i]][j];

          }

       }

       Int_t htemp=Int_t(fH*ntemp/fN);


       for (i=0; i<nbest; i++)

          deti[i]=1e16;


       for(k=0; k<k2; k++) {

          CreateSubset(ntemp, htemp, fNvar, index, dattemp, sscp, ndist);

          ClearSscp(sscp);

          for (i=0; i<htemp; i++) {

             for(j=0; j<fNvar; j++) {

                vec(j)=dattemp(index[i],j);

             }

             AddToSscp(sscp, vec);

          }

          Covar(sscp, fMean, fCovariance, fSd, htemp);

          det = fCovariance.Determinant();

          if (det<kEps) {

             par =Exact2(mstockbig, cstockbig, hyperplane, deti, nbest, kgroup, sscp,ndist);

             if(par==nbest+1) {


                delete [] detibig;

                delete [] deti;

                delete [] subdat;

                delete [] ndist;

                delete [] index;

                return;

             } else

                deti[par]=det;

          } else {

             det = CStep(ntemp, htemp, index, dattemp, sscp, ndist);

             if (det<kEps) {

                par=Exact2(mstockbig, cstockbig, hyperplane, deti, nbest, kgroup, sscp, ndist);

                if(par==nbest+1) {


                   delete [] detibig;

                   delete [] deti;

                   delete [] subdat;

                   delete [] ndist;

                   delete [] index;

                   return;

                } else

                   deti[par]=det;

             } else {

                det=CStep(ntemp,htemp, index, dattemp, sscp, ndist);

                if(det<kEps){

                   par=Exact2(mstockbig, cstockbig, hyperplane, deti, nbest, kgroup, sscp,ndist);

                   if(par==nbest+1) {


                      delete [] detibig;

                      delete [] deti;

                      delete [] subdat;

                      delete [] ndist;

                      delete [] index;

                      return;

                   } else {

                      deti[par]=det;

                   }

                } else {

                   maxind=TMath::LocMax(nbest, deti);

                   if(det<deti[maxind]) {

                      deti[maxind]=det;

                      for(i=0; i<fNvar; i++) {

                         mstockbig(nbest*kgroup+maxind,i)=fMean(i);

                         for(j=0; j<fNvar; j++) {

                            cstockbig(i,nbest*kgroup*fNvar+maxind*fNvar+j)=fCovariance(i,j);


                         }

                      }

                   }


                }

             }

          }


          maxind=TMath::LocMax(nbest, deti);

          if (deti[maxind]<kEps)

             break;

       }


       for(i=0; i<nbest; i++) {

          detibig[kgroup*nbest + i]=deti[i];


       }


    }


    //now the arrays mstockbig and cstockbig store nbest*nsub best means and covariances

    //detibig stores their nbest*nsub determinants

    //merge the subsets and carry out 2 CSteps on the merged set for all 50 best solutions


    TMatrixD datmerged(sum, fNvar);

    for(i=0; i<sum; i++) {

       for (j=0; j<fNvar; j++)

          datmerged(i,j)=fData[subdat[i]][j];

    }

    //  printf("performing calculations for merged set\n");

    Int_t hmerged=Int_t(sum*fH/fN);


    Int_t nh;

    for(k=0; k<nbestsub; k++) {

       //for all best solutions perform 2 CSteps and then choose the very best

       for(ii=0; ii<fNvar; ii++) {

          fMean(ii)=mstockbig(k,ii);

          for(jj=0; jj<fNvar; jj++)

             fCovariance(ii, jj)=cstockbig(ii,k*fNvar+jj);

       }

       if(detibig[k]==0) {

          for(i=0; i<fNvar; i++)

             fHyperplane(i)=hyperplane(k,i);

          CreateOrtSubset(datmerged,index, hmerged, sum, sscp, ndist);


       }

       det=CStep(sum, hmerged, index, datmerged, sscp, ndist);

       if (det<kEps) {

          nh= Exact(ndist);

          if (nh>=fH) {

             fExact = nh;


             delete [] detibig;

             delete [] deti;

             delete [] subdat;

             delete [] ndist;

             delete [] index;

             return;

          } else {

             CreateOrtSubset(datmerged, index, hmerged, sum, sscp, ndist);

          }

       }


       det=CStep(sum, hmerged, index, datmerged, sscp, ndist);

       if (det<kEps) {

          nh=Exact(ndist);

          if (nh>=fH) {

             fExact = nh;

             delete [] detibig;

             delete [] deti;

             delete [] subdat;

             delete [] ndist;

             delete [] index;

             return;

          }

       }

       detibig[k]=det;

       for(i=0; i<fNvar; i++) {

          mstockbig(k,i)=fMean(i);

          for(j=0; j<fNvar; j++) {

             cstockbig(i,k*fNvar+j)=fCovariance(i, j);

          }

       }

    }

    //now for the subset with the smallest determinant

    //repeat CSteps until convergence

    Int_t minind=TMath::LocMin(nbestsub, detibig);

    det=detibig[minind];

    for(i=0; i<fNvar; i++) {

       fMean(i)=mstockbig(minind,i);

       fHyperplane(i)=hyperplane(minind,i);

       for(j=0; j<fNvar; j++)

          fCovariance(i, j)=cstockbig(i,minind*fNvar + j);

    }

    if(det<kEps)

       CreateOrtSubset(fData, index, fH, fN, sscp, ndist);

    det=1;

    while (det>kEps) {

       det=CStep(fN, fH, index, fData, sscp, ndist);

       if(TMath::Abs(det-detibig[minind])<kEps) {

          break;

       } else {

          detibig[minind]=det;

       }

    }

    if(det<kEps) {

       Exact(ndist);

       fExact=kTRUE;

    }

    Int_t nout = RDist(sscp);

    Double_t cutoff=kChiQuant[fNvar-1];


    fOut.Set(nout);


    j=0;

    for (i=0; i<fN; i++) {

       if(fRd(i)>cutoff) {

          fOut[j]=i;

          j++;

       }

    }


    delete [] detibig;

    delete [] deti;

    delete [] subdat;

    delete [] ndist;

    delete [] index;

    return;

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///for the univariate case

 ///estimates of location and scatter are returned in mean and sigma parameters

 ///the algorithm works on the same principle as in multivariate case -

 ///it finds a subset of size hh with smallest sigma, and then returns mean and

 ///sigma of this subset


 void TRobustEstimator::EvaluateUni(Int_t nvectors, Double_t *data, Double_t &mean, Double_t &sigma, Int_t hh)

 {

    if (hh==0)

       hh=(nvectors+2)/2;

    Double_t faclts[]={2.6477,2.5092,2.3826,2.2662,2.1587,2.0589,1.9660,1.879,1.7973,1.7203,1.6473};

    Int_t *index=new Int_t[nvectors];

    TMath::Sort(nvectors, data, index, kFALSE);


    Int_t nquant;

    nquant=TMath::Min(Int_t(Double_t(((hh*1./nvectors)-0.5)*40))+1, 11);

    Double_t factor=faclts[nquant-1];


    Double_t *aw=new Double_t[nvectors];

    Double_t *aw2=new Double_t[nvectors];

    Double_t sq=0;

    Double_t sqmin=0;

    Int_t ndup=0;

    Int_t len=nvectors-hh;

    Double_t *slutn=new Double_t[len];

    for(Int_t i=0; i<len; i++)

       slutn[i]=0;

    for(Int_t jint=0; jint<len; jint++) {

       aw[jint]=0;

       for (Int_t j=0; j<hh; j++) {

          aw[jint]+=data[index[j+jint]];

          if(jint==0)

             sq+=data[index[j]]*data[index[j]];

       }

       aw2[jint]=aw[jint]*aw[jint]/hh;


       if(jint==0) {

          sq=sq-aw2[jint];

          sqmin=sq;

          slutn[ndup]=aw[jint];


       } else {

          //the element entering the window is index[jint+hh-1]

          sq=sq - data[index[jint-1]]*data[index[jint-1]]+

             data[index[jint+hh-1]]*data[index[jint+hh-1]]-

             aw2[jint]+aw2[jint-1];

          if(sq<sqmin) {

             ndup=0;

             sqmin=sq;

             slutn[ndup]=aw[jint];


          } else {

             if(sq==sqmin) {

                ndup++;

                slutn[ndup]=aw[jint];

             }

          }

       }

    }


    slutn[0]=slutn[Int_t((ndup)/2)]/hh;

    Double_t bstd=factor*TMath::Sqrt(sqmin/hh);

    mean=slutn[0];

    sigma=bstd;

    delete [] aw;

    delete [] aw2;

    delete [] slutn;

    delete [] index;

 }
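
 /*
  The sliding-window idea behind the univariate estimate can be sketched
  without ROOT as follows. This is an illustration under stated assumptions,
  not the routine above: UniEstimate and UniMcdSketch are hypothetical names,
  the consistency factor taken from faclts is left out, and when several
  windows tie the first one is kept (the routine above takes the middle one).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct UniEstimate {
   double mean;   // mean of the selected h-subset
   double sigma;  // sqrt(sum of squared deviations / h), no correction factor
};

// Hypothetical helper: among all windows of h consecutive values of the
// sorted data, pick the one with the smallest sum of squared deviations.
inline UniEstimate UniMcdSketch(std::vector<double> data, std::size_t h)
{
   assert(h >= 1 && h <= data.size());
   std::sort(data.begin(), data.end());

   // Sum and sum of squares of the first window.
   double sum = 0, sumsq = 0;
   for (std::size_t i = 0; i < h; ++i) {
      sum += data[i];
      sumsq += data[i] * data[i];
   }
   double bestSq = sumsq - sum * sum / h;
   double bestSum = sum;

   // Slide the window: drop data[start-1], add data[start+h-1].
   for (std::size_t start = 1; start + h <= data.size(); ++start) {
      const double out = data[start - 1];
      const double in = data[start + h - 1];
      sum += in - out;
      sumsq += in * in - out * out;
      const double sq = sumsq - sum * sum / h;
      if (sq < bestSq) {
         bestSq = sq;
         bestSum = sum;
      }
   }
   UniEstimate r;
   r.mean = bestSum / h;
   r.sigma = std::sqrt(std::max(bestSq, 0.0) / h);
   return r;
}
```

  For the values {10, 1, 2, 3, 50} and h=3 the tightest window is {1, 2, 3},
  so the sketch returns mean 2 and sigma sqrt(2/3).
 */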


 ////////////////////////////////////////////////////////////////////////////////

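 ///returns the breakdown point of the algorithm, (fN-fH+1)/fN

 ///note: the division is done in integers, so the result is truncated

 ///(it is 0 whenever fH>1)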


 Int_t TRobustEstimator::GetBDPoint()

 {

    Int_t n;

    n=(fN-fH+1)/fN;

    return n;

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///returns the chi2 quantiles


 Double_t TRobustEstimator::GetChiQuant(Int_t i) const

 {

    if (i < 0 || i >= 50) return 0;

    return kChiQuant[i];

 }
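
 /*
  A minimal standalone sketch of how such a quantile can serve as the outlier
  cutoff, mirroring the loops in Evaluate() that compare each robust distance
  with kChiQuant[fNvar-1]. FlagOutliers and kQuantSketch are hypothetical
  names; only the first three table entries are copied here, so the sketch is
  limited to at most three variables.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// First entries of the 0.975 chi2 quantile table (1, 2 and 3 degrees of
// freedom), copied from kChiQuant.
static const double kQuantSketch[3] = {5.02389, 7.3776, 9.34840};

// Hypothetical helper: indices of the entries of rd that exceed the cutoff
// for nvar variables.
inline std::vector<int> FlagOutliers(const std::vector<double> &rd, int nvar)
{
   assert(nvar >= 1 && nvar <= 3);
   const double cutoff = kQuantSketch[nvar - 1];
   std::vector<int> out;
   for (std::size_t i = 0; i < rd.size(); ++i) {
      if (rd[i] > cutoff)
         out.push_back(static_cast<int>(i));
   }
   return out;
}
```

  The comparison is strict, so a value exactly equal to the cutoff is not
  flagged.
 */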


 ////////////////////////////////////////////////////////////////////////////////

 ///returns the covariance matrix


 void TRobustEstimator::GetCovariance(TMatrixDSym &matr)

 {

    if (matr.GetNrows()!=fNvar || matr.GetNcols()!=fNvar){

       Warning("GetCovariance","provided matrix is of the wrong size, it will be resized");

       matr.ResizeTo(fNvar, fNvar);

    }

    matr=fCovariance;

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///returns the correlation matrix


 void TRobustEstimator::GetCorrelation(TMatrixDSym &matr)

 {

    if (matr.GetNrows()!=fNvar || matr.GetNcols()!=fNvar) {

       Warning("GetCorrelation","provided matrix is of the wrong size, it will be resized");

       matr.ResizeTo(fNvar, fNvar);

    }

    matr=fCorrelation;

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///if the points are on a hyperplane, returns this hyperplane


 const TVectorD* TRobustEstimator::GetHyperplane() const

 {

    if (fExact==0) {

       Error("GetHyperplane","the data doesn't lie on a hyperplane!\n");

       return 0;

    } else {

       return &fHyperplane;

    }

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///if the points are on a hyperplane, copies this hyperplane into vec


 void TRobustEstimator::GetHyperplane(TVectorD &vec)

 {

    if (fExact==0){

       Error("GetHyperplane","the data doesn't lie on a hyperplane!\n");

       return;

    }

    if (vec.GetNoElements()!=fNvar) {

       Warning("GetHyperplane","provided vector is of the wrong size, it will be resized");

       vec.ResizeTo(fNvar);

    }

    vec=fHyperplane;

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///return the estimate of the mean


 void TRobustEstimator::GetMean(TVectorD &means)

 {

    if (means.GetNoElements()!=fNvar) {

       Warning("GetMean","provided vector is of the wrong size, it will be resized");

       means.ResizeTo(fNvar);

    }

    means=fMean;

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///returns the robust distances (helps to find outliers)


 void TRobustEstimator::GetRDistances(TVectorD &rdist)

 {

    if (rdist.GetNoElements()!=fN) {

       Warning("GetRDistances","provided vector is of the wrong size, it will be resized");

       rdist.ResizeTo(fN);

    }

    rdist=fRd;

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///returns the number of outliers


 Int_t TRobustEstimator::GetNOut()

 {

    return fOut.GetSize();

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///update the sscp matrix with vector vec


 void TRobustEstimator::AddToSscp(TMatrixD &sscp, TVectorD &vec)

 {

    Int_t i, j;

    for (j=1; j<fNvar+1; j++) {

       sscp(0, j) +=vec(j-1);

       sscp(j, 0) = sscp(0, j);

    }

    for (i=1; i<fNvar+1; i++) {

       for (j=1; j<fNvar+1; j++) {

          sscp(i, j) += vec(i-1)*vec(j-1);

       }

    }

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///clear the sscp matrix, used for covariance and mean calculation


 void TRobustEstimator::ClearSscp(TMatrixD &sscp)

 {

    for (Int_t i=0; i<fNvar+1; i++) {

       for (Int_t j=0; j<fNvar+1; j++) {

          sscp(i, j)=0;

       }

    }

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///called when h=n. Returns classic covariance matrix

 ///and mean


 void TRobustEstimator::Classic()

 {

    TMatrixD sscp(fNvar+1, fNvar+1);

    TVectorD temp(fNvar);

    ClearSscp(sscp);

    for (Int_t i=0; i<fN; i++) {

       for (Int_t j=0; j<fNvar; j++)

          temp(j)=fData(i, j);

       AddToSscp(sscp, temp);

    }

    Covar(sscp, fMean, fCovariance, fSd, fN);

    Correl();


 }


 ////////////////////////////////////////////////////////////////////////////////

 ///calculates mean and covariance


 void TRobustEstimator::Covar(TMatrixD &sscp, TVectorD &m, TMatrixDSym &cov, TVectorD &sd, Int_t nvec)

 {

    Int_t i, j;

    Double_t f;

    for (i=0; i<fNvar; i++) {

       m(i)=sscp(0, i+1);

       sd[i]=sscp(i+1, i+1);

       f=(sd[i]-m(i)*m(i)/nvec)/(nvec-1);

       if (f>1e-14) sd[i]=TMath::Sqrt(f);

       else sd[i]=0;

       m(i)/=nvec;

    }

    for (i=0; i<fNvar; i++) {

       for (j=0; j<fNvar; j++) {

          cov(i, j)=sscp(i+1, j+1)-nvec*m(i)*m(j);

          cov(i, j)/=nvec-1;

       }

    }

 }
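
 /*
  The SSCP bookkeeping used by AddToSscp and Covar can be sketched without
  ROOT types. This is an illustrative sketch only: Matrix, AddToSscpSketch and
  CovarSketch are hypothetical names, and plain std::vector storage stands in
  for TMatrixD and TVectorD. The standard deviations that Covar also fills
  are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<double> > Matrix;

// Hypothetical helper: add one observation to a (p+1)x(p+1) SSCP matrix.
// Row and column 0 hold the running sums, the lower-right pxp block holds
// the running sums of cross-products.
inline void AddToSscpSketch(Matrix &sscp, const std::vector<double> &x)
{
   const std::size_t p = x.size();
   assert(sscp.size() == p + 1);
   for (std::size_t j = 0; j < p; ++j) {
      sscp[0][j + 1] += x[j];
      sscp[j + 1][0] = sscp[0][j + 1];
   }
   for (std::size_t i = 0; i < p; ++i)
      for (std::size_t j = 0; j < p; ++j)
         sscp[i + 1][j + 1] += x[i] * x[j];
}

// Hypothetical helper: mean vector and sample covariance (divisor nvec-1)
// from the accumulated sums.
inline void CovarSketch(const Matrix &sscp, int nvec, std::vector<double> &mean, Matrix &cov)
{
   const std::size_t p = sscp.size() - 1;
   assert(nvec > 1);
   mean.assign(p, 0.0);
   cov.assign(p, std::vector<double>(p, 0.0));
   for (std::size_t i = 0; i < p; ++i)
      mean[i] = sscp[0][i + 1] / nvec;
   for (std::size_t i = 0; i < p; ++i)
      for (std::size_t j = 0; j < p; ++j)
         cov[i][j] = (sscp[i + 1][j + 1] - nvec * mean[i] * mean[j]) / (nvec - 1);
}
```

  For the three points (1,2), (3,4), (5,9) the sketch yields the mean (3,5)
  and the covariance entries 4, 13 and 7 (xx, yy and xy).
 */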


 ////////////////////////////////////////////////////////////////////////////////

 ///transforms covariance matrix into correlation matrix


 void TRobustEstimator::Correl()

 {

    Int_t i, j;

    Double_t *sd=new Double_t[fNvar];

    for(j=0; j<fNvar; j++)

       sd[j]=1./TMath::Sqrt(fCovariance(j, j));

    for(i=0; i<fNvar; i++) {

       for (j=0; j<fNvar; j++) {

          if (i==j)

             fCorrelation(i, j)=1.;

          else

             fCorrelation(i, j)=fCovariance(i, j)*sd[i]*sd[j];

       }

    }

    delete [] sd;

 }
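
 /*
  The covariance-to-correlation step can be illustrated without ROOT types.
  CorrelSketch is a hypothetical name, and the sketch assumes that every
  variable has a strictly positive variance.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: turn a covariance matrix into a correlation matrix,
// corr(i,j) = cov(i,j) / sqrt(cov(i,i)*cov(j,j)), with 1 on the diagonal.
inline std::vector<std::vector<double> >
CorrelSketch(const std::vector<std::vector<double> > &cov)
{
   const std::size_t p = cov.size();
   std::vector<double> invsd(p);
   for (std::size_t j = 0; j < p; ++j) {
      assert(cov[j][j] > 0);  // assumption: no variable has zero variance
      invsd[j] = 1.0 / std::sqrt(cov[j][j]);
   }
   std::vector<std::vector<double> > corr(p, std::vector<double>(p, 1.0));
   for (std::size_t i = 0; i < p; ++i)
      for (std::size_t j = 0; j < p; ++j)
         if (i != j)
            corr[i][j] = cov[i][j] * invsd[i] * invsd[j];
   return corr;
}
```

  With variances 4 and 16 and covariance 2, the off-diagonal correlation is
  2/(2*4) = 0.25.
 */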


 ////////////////////////////////////////////////////////////////////////////////

 ///creates a subset of htotal elements from ntotal elements

 ///first, p+1 elements are drawn randomly (without repetitions)

 ///if their covariance matrix is singular, more elements are

 ///added one by one, until their covariance matrix becomes regular

 ///or it becomes clear that htotal observations lie on a hyperplane

 ///If covariance matrix determinant!=0, distances of all ntotal elements

 ///are calculated, using formula d_i=Sqrt((x_i-M)*S_inv*(x_i-M)), where

 ///M is mean and S_inv is the inverse of the covariance matrix

 ///htotal points with smallest distances are included in the returned subset.


 void TRobustEstimator::CreateSubset(Int_t ntotal, Int_t htotal, Int_t p, Int_t *index, TMatrixD &data, TMatrixD &sscp, Double_t *ndist)

 {

    Double_t kEps = 1e-14;

    Int_t i, j;

    Bool_t repeat=kFALSE;

    Int_t nindex=0;

    Int_t num;

    for(i=0; i<ntotal; i++)

       index[i]=ntotal+1;


    for (i=0; i<p+1; i++) {

       num=Int_t(gRandom->Uniform(0, 1)*(ntotal-1));

       if (i>0){

          for(j=0; j<=i-1; j++) {

             if(index[j]==num)

                repeat=kTRUE;

          }

       }

       if(repeat==kTRUE) {

          i--;

          repeat=kFALSE;

       } else {

          index[i]=num;

          nindex++;

       }

    }


    ClearSscp(sscp);


    TVectorD vec(fNvar);

    Double_t det;

    for (i=0; i<p+1; i++) {

       for (j=0; j<fNvar; j++) {

          vec[j]=data[index[i]][j];


       }

       AddToSscp(sscp, vec);

    }


    Covar(sscp, fMean, fCovariance, fSd, p+1);

    det=fCovariance.Determinant();

    while((det<kEps)&&(nindex < htotal)) {

       //if covariance matrix is singular, another vector is added until

       //the matrix becomes regular or it becomes clear that all

       //vectors of the group lie on a hyperplane

       repeat=kFALSE;

       do{

          num=Int_t(gRandom->Uniform(0,1)*(ntotal-1));

          repeat=kFALSE;

          for(i=0; i<nindex; i++) {

             if(index[i]==num) {

                repeat=kTRUE;

                break;

             }

          }

       }while(repeat==kTRUE);


       index[nindex]=num;

       nindex++;

       //check if covariance matrix is singular

       for(j=0; j<fNvar; j++)

          vec[j]=data[index[nindex-1]][j];

       AddToSscp(sscp, vec);

       Covar(sscp, fMean, fCovariance, fSd, nindex);

       det=fCovariance.Determinant();

    }


    if(nindex!=htotal) {

       TDecompChol chol(fCovariance);

       fInvcovariance = chol.Invert();


       TVectorD temp(fNvar);

       for(j=0; j<ntotal; j++) {

          ndist[j]=0;

          for(i=0; i<fNvar; i++)

             temp[i]=data[j][i] - fMean(i);

          temp*=fInvcovariance;

          for(i=0; i<fNvar; i++)

             ndist[j]+=(data[j][i]-fMean(i))*temp[i];

       }

       KOrdStat(ntotal, ndist, htotal-1,index);

    }


 }
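
 /*
  The last step of the function above, ranking all points by their distance
  to (M, S) and keeping the htotal closest ones, can be sketched without ROOT.
  The names Matrix, SquaredDistance and SmallestDistances are hypothetical.
  The sketch takes the inverse covariance matrix as an input and uses
  std::partial_sort in place of KOrdStat, so it illustrates the selection
  rather than reproducing the routine above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

typedef std::vector<std::vector<double> > Matrix;

// Hypothetical helper: squared distance (x-M)^T * Sinv * (x-M).
inline double SquaredDistance(const std::vector<double> &x, const std::vector<double> &mean, const Matrix &sinv)
{
   const std::size_t p = x.size();
   assert(mean.size() == p && sinv.size() == p);
   double d2 = 0;
   for (std::size_t i = 0; i < p; ++i)
      for (std::size_t j = 0; j < p; ++j)
         d2 += (x[i] - mean[i]) * sinv[i][j] * (x[j] - mean[j]);
   return d2;
}

// Hypothetical helper: indices of the h rows of data with the smallest
// squared distances, sorted by increasing distance.
inline std::vector<int> SmallestDistances(const Matrix &data, const std::vector<double> &mean, const Matrix &sinv, std::size_t h)
{
   assert(h <= data.size());
   std::vector<double> d2(data.size());
   for (std::size_t k = 0; k < data.size(); ++k)
      d2[k] = SquaredDistance(data[k], mean, sinv);
   std::vector<int> idx(data.size());
   std::iota(idx.begin(), idx.end(), 0);
   std::partial_sort(idx.begin(), idx.begin() + h, idx.end(),
                     [&d2](int a, int b) { return d2[a] < d2[b]; });
   idx.resize(h);
   return idx;
}
```

  Ranking by squared distances gives the same subset as ranking by the
  distances themselves, because the square root is monotonic.
 */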


 ////////////////////////////////////////////////////////////////////////////////

 ///Creates a subset of hmerged vectors with the smallest orthogonal distances to the hyperplane

 ///hyp[1]*(x1-mean[1])+...+hyp[nvar]*(xnvar-mean[nvar])=0

 ///This function is called when fewer than fH samples lie on a hyperplane.


 void TRobustEstimator::CreateOrtSubset(TMatrixD &dat,Int_t *index, Int_t hmerged, Int_t nmerged, TMatrixD &sscp, Double_t *ndist)

 {

    Int_t i, j;

    TVectorD vec(fNvar);

    for (i=0; i<nmerged; i++) {

       ndist[i]=0;

       for(j=0; j<fNvar; j++) {

          ndist[i]+=fHyperplane[j]*(dat[i][j]-fMean[j]);

       }

       //take the absolute value once, after the full sum is accumulated

       ndist[i]=TMath::Abs(ndist[i]);

    }

    KOrdStat(nmerged, ndist, hmerged-1, index);

    ClearSscp(sscp);

    for (i=0; i<hmerged; i++) {

       for(j=0; j<fNvar; j++)

          vec[j]=dat[index[i]][j];

       AddToSscp(sscp, vec);

    }

    Covar(sscp, fMean, fCovariance, fSd, hmerged);

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///from the input htotal-subset constructs another htotal subset with lower determinant

 ///

 ///As proven by Peter J.Rousseeuw and Katrien Van Driessen, if distances for all elements

 ///are calculated, using the formula:d_i=Sqrt((x_i-M)*S_inv*(x_i-M)), where M is the mean

 ///of the input htotal-subset, and S_inv - the inverse of its covariance matrix, then

 ///htotal elements with smallest distances will have covariance matrix with determinant

 ///less than or equal to the determinant of the input subset covariance matrix.

 ///

 ///determinant for this htotal-subset with smallest distances is returned


 Double_t TRobustEstimator::CStep(Int_t ntotal, Int_t htotal, Int_t *index, TMatrixD &data, TMatrixD &sscp, Double_t *ndist)

 {

    Int_t i, j;

    TVectorD vec(fNvar);

    Double_t det;


    TDecompChol chol(fCovariance);

    fInvcovariance = chol.Invert();


    TVectorD temp(fNvar);

    for(j=0; j<ntotal; j++) {

       ndist[j]=0;

       for(i=0; i<fNvar; i++)

          temp[i]=data[j][i]-fMean[i];

       temp*=fInvcovariance;

       for(i=0; i<fNvar; i++)

          ndist[j]+=(data[j][i]-fMean[i])*temp[i];

    }


    //taking h smallest

    KOrdStat(ntotal, ndist, htotal-1, index);

    //writing their mean and covariance

    ClearSscp(sscp);

    for (i=0; i<htotal; i++) {

       for (j=0; j<fNvar; j++)

          temp[j]=data[index[i]][j];

       AddToSscp(sscp, temp);

    }

    Covar(sscp, fMean, fCovariance, fSd, htotal);

    det = fCovariance.Determinant();

    return det;

 }
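
 // Illustration only (not part of TRobustEstimator): a minimal standalone sketch
 // of the selection at the heart of a C-step. It computes squared Mahalanobis
 // distances from a given mean and inverse covariance and keeps the h closest
 // points. The names Matrix and ClosestSubset are assumptions for this example.
 // Ordering by squared distance gives the same subset as ordering by distance,
 // so no square root is needed. Recomputing the mean and covariance of the new
 // subset, as CStep does, is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper: indices of the h rows of `data` with the smallest
// squared Mahalanobis distance (x-mean)^T * invCov * (x-mean).
std::vector<int> ClosestSubset(const Matrix &data, const std::vector<double> &mean,
                               const Matrix &invCov, int h)
{
   const int n = static_cast<int>(data.size());
   const int p = static_cast<int>(mean.size());
   assert(h >= 1 && h <= n);

   std::vector<double> dist2(n, 0.0);
   for (int i = 0; i < n; ++i)
      for (int a = 0; a < p; ++a)
         for (int b = 0; b < p; ++b)
            dist2[i] += (data[i][a] - mean[a]) * invCov[a][b] * (data[i][b] - mean[b]);

   std::vector<int> index(n);
   std::iota(index.begin(), index.end(), 0);
   // Order only the first h positions; the rest stay unordered.
   std::partial_sort(index.begin(), index.begin() + h, index.end(),
                     [&dist2](int lhs, int rhs) { return dist2[lhs] < dist2[rhs]; });
   index.resize(h);
   return index;
}
```

 // Iterating this selection, with the mean and covariance refreshed from each
 // new subset, is what drives the determinant down from one step to the next.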


 ////////////////////////////////////////////////////////////////////////////////

 ///For exact-fit situations:

 ///returns the number of observations on the hyperplane


 Int_t TRobustEstimator::Exact(Double_t *ndist)

 {

    Int_t i, j;


    TMatrixDSymEigen eigen(fCovariance);

    TVectorD eigenValues=eigen.GetEigenValues();

    TMatrixD eigenMatrix=eigen.GetEigenVectors();


    for (j=0; j<fNvar; j++) {

       fHyperplane[j]=eigenMatrix(j,fNvar-1);

    }

    //calculate and return how many observations lie on the hyperplane

    for (i=0; i<fN; i++) {

       ndist[i]=0;

       for(j=0; j<fNvar; j++) {

          ndist[i]+=fHyperplane[j]*(fData[i][j]-fMean[j]);

       }

       //take the absolute value once, after the full sum is accumulated

       ndist[i]=TMath::Abs(ndist[i]);

    }

    Int_t nhyp=0;


    for (i=0; i<fN; i++) {

       if(ndist[i] < 1e-14) nhyp++;

    }

    return nhyp;


 }
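
 // Illustration only (not part of TRobustEstimator): a minimal standalone sketch
 // of the orthogonal distance of a point from the hyperplane
 // normal . (x - mean) = 0. The helper name HyperplaneDistance is an assumption
 // for this example. The absolute value is taken once, after the complete dot
 // product. The result is a true distance only for a unit-length normal;
 // otherwise it is scaled by the norm of the normal vector.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: |normal . (x - mean)|.
double HyperplaneDistance(const std::vector<double> &normal,
                          const std::vector<double> &mean,
                          const std::vector<double> &x)
{
   assert(normal.size() == mean.size() && mean.size() == x.size());
   double sum = 0.0;
   for (std::size_t j = 0; j < x.size(); ++j)
      sum += normal[j] * (x[j] - mean[j]);   // signed terms may cancel
   return std::fabs(sum);                    // sign is dropped only at the end
}
```

 // For normal (1,-2,1), mean (0,0,0) and x (1,1,1) the terms cancel, so the
 // point lies on the hyperplane.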


 ////////////////////////////////////////////////////////////////////////////////

 ///This function is called if determinant of the covariance matrix of a subset=0.

 ///

 ///If at least fH vectors lie on a hyperplane,

 ///returns this hyperplane and stops;

 ///otherwise stores the hyperplane coordinates in the hyperplane matrix


 Int_t TRobustEstimator::Exact2(TMatrixD &mstockbig, TMatrixD &cstockbig, TMatrixD &hyperplane,

                              Double_t *deti, Int_t nbest, Int_t kgroup,

                              TMatrixD &sscp, Double_t *ndist)

 {

    Int_t i, j;


    TVectorD vec(fNvar);

    Int_t maxind = TMath::LocMax(nbest, deti);

    Int_t nh=Exact(ndist);

    //now nh is the number of observations on the hyperplane

    //ndist stores the distances of the observations from this hyperplane

    if(nh>=fH) {

       ClearSscp(sscp);

       for (i=0; i<fN; i++) {

          if(ndist[i]<1e-14) {

             for (j=0; j<fNvar; j++)

                vec[j]=fData[i][j];

             AddToSscp(sscp, vec);

          }

       }

       Covar(sscp, fMean, fCovariance, fSd, nh);


       fExact=nh;

       return nbest+1;


    } else {

       //if fewer than fH observations lie on a hyperplane,

       //mean and covariance matrix are stored in mstockbig

       //and cstockbig in place of the previous maximum determinant

       //mean and covariance

       for(i=0; i<fNvar; i++) {

          mstockbig(nbest*kgroup+maxind,i)=fMean(i);

          hyperplane(nbest*kgroup+maxind,i)=fHyperplane(i);

          for(j=0; j<fNvar; j++) {

             cstockbig(i,nbest*kgroup*fNvar+maxind*fNvar+j)=fCovariance(i,j);

          }

       }

       return maxind;

    }

 }


 ////////////////////////////////////////////////////////////////////////////////

 ///divides the elements into approximately equal subgroups

 ///number of elements in each subgroup is stored in indsubdat

 ///number of subgroups is returned


 Int_t TRobustEstimator::Partition(Int_t nmini, Int_t *indsubdat)

 {

    Int_t nsub;

    if ((fN>=2*nmini) && (fN<=(3*nmini-1))) {

       if (fN%2==1){

          indsubdat[0]=Int_t(fN*0.5);

          indsubdat[1]=Int_t(fN*0.5)+1;

       } else

          indsubdat[0]=indsubdat[1]=Int_t(fN/2);

       nsub=2;

    }

    else{

       if((fN>=3*nmini) && (fN<=(4*nmini-1))) {

          if(fN%3==0){

             indsubdat[0]=indsubdat[1]=indsubdat[2]=Int_t(fN/3);

          } else {

             indsubdat[0]=Int_t(fN/3);

             indsubdat[1]=Int_t(fN/3)+1;

             if (fN%3==1) indsubdat[2]=Int_t(fN/3);

             else indsubdat[2]=Int_t(fN/3)+1;

          }

          nsub=3;

       }

       else{

          if((fN>=4*nmini)&&(fN<=(5*nmini-1))){

             if (fN%4==0) indsubdat[0]=indsubdat[1]=indsubdat[2]=indsubdat[3]=Int_t(fN/4);

             else {

                indsubdat[0]=Int_t(fN/4);

                indsubdat[1]=Int_t(fN/4)+1;

                if(fN%4==1) indsubdat[2]=indsubdat[3]=Int_t(fN/4);

                if(fN%4==2) {

                   indsubdat[2]=Int_t(fN/4)+1;

                   indsubdat[3]=Int_t(fN/4);

                }

                if(fN%4==3) indsubdat[2]=indsubdat[3]=Int_t(fN/4)+1;

             }

             nsub=4;

          } else {

             for(Int_t i=0; i<5; i++)

                indsubdat[i]=nmini;

             nsub=5;

          }

       }

    }

    return nsub;

 }
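
 // Illustration only (not part of TRobustEstimator): a minimal standalone sketch
 // of splitting n elements into nsub groups whose sizes differ by at most one,
 // which is what the branches of Partition spell out case by case for 2, 3 and 4
 // groups. The helper name SplitEvenly is an assumption for this example, and
 // the order in which the larger groups appear differs from Partition.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: sizes of nsub near-equal groups that sum to n.
std::vector<int> SplitEvenly(int n, int nsub)
{
   assert(nsub > 0 && n >= 0);
   const int base = n / nsub;   // minimum size of every group
   const int rem = n % nsub;    // this many groups receive one extra element
   std::vector<int> sizes(nsub, base);
   for (int i = 0; i < rem; ++i)
      ++sizes[i];
   return sizes;
}
```

 // The group sizes always add up to n, so a later draw of that many distinct
 // elements from n cannot run out of candidates.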


 ////////////////////////////////////////////////////////////////////////////////

 ///Calculates robust distances. Then the samples with robust distances

 ///greater than a cutoff value (0.975 quantile of chi2 distribution with

 ///fNvar degrees of freedom, multiplied by a correction factor) are given

 ///weight=0, and new, reweighted estimates of location and scatter are calculated.

 ///The function returns the number of outliers.


 Int_t TRobustEstimator::RDist(TMatrixD &sscp)

 {

    Int_t i, j;

    Int_t nout=0;


    TVectorD temp(fNvar);

    TDecompChol chol(fCovariance);

    fInvcovariance = chol.Invert();


    for (i=0; i<fN; i++) {

       fRd[i]=0;

       for(j=0; j<fNvar; j++) {

          temp[j]=fData[i][j]-fMean[j];

       }

       temp*=fInvcovariance;

       for(j=0; j<fNvar; j++) {

          fRd[i]+=(fData[i][j]-fMean[j])*temp[j];

       }

    }


    Double_t med;

    Double_t chi = kChiMedian[fNvar-1];


    med=TMath::Median(fN, fRd.GetMatrixArray());

    med/=chi;

    fCovariance*=med;

    TDecompChol chol2(fCovariance);

    fInvcovariance = chol2.Invert();


    for (i=0; i<fN; i++) {

       fRd[i]=0;

       for(j=0; j<fNvar; j++) {

          temp[j]=fData[i][j]-fMean[j];

       }


       temp*=fInvcovariance;

       for(j=0; j<fNvar; j++) {

          fRd[i]+=(fData[i][j]-fMean[j])*temp[j];

       }

    }


    Double_t cutoff = kChiQuant[fNvar-1];


    ClearSscp(sscp);

    for(i=0; i<fN; i++) {

       if (fRd[i]<=cutoff) {

          for(j=0; j<fNvar; j++)

             temp[j]=fData[i][j];

          AddToSscp(sscp,temp);

       } else {

          nout++;

       }

    }


    Covar(sscp, fMean, fCovariance, fSd, fN-nout);

    return nout;

 }
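
 // Illustration only (not part of TRobustEstimator): a minimal standalone sketch
 // of the reweighting idea in the univariate case. Samples whose squared robust
 // distance exceeds the cutoff get weight 0; the mean is recomputed from the
 // rest. The helper name ReweightedMean and its signature are assumptions for
 // this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: mean of the samples with dist2 <= cutoff.
// The number of rejected samples is written to nout.
double ReweightedMean(const std::vector<double> &values,
                      const std::vector<double> &dist2,
                      double cutoff, int &nout)
{
   assert(values.size() == dist2.size());
   double sum = 0.0;
   int kept = 0;
   nout = 0;
   for (std::size_t i = 0; i < values.size(); ++i) {
      if (dist2[i] <= cutoff) {
         sum += values[i];   // weight 1: sample contributes
         ++kept;
      } else {
         ++nout;             // weight 0: sample counted as an outlier
      }
   }
   assert(kept > 0);
   return sum / kept;
}
```

 // With values {1, 2, 3, 100}, squared distances {0.1, 0.2, 0.3, 50} and a
 // cutoff of 5.02, the last sample is rejected and the mean of the rest is 2.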


 ////////////////////////////////////////////////////////////////////////////////

 ///Draws ngroup nonoverlapping subdatasets out of a dataset of size n

 ///such that the selected case numbers are uniformly distributed from 1 to n


 void TRobustEstimator::RDraw(Int_t *subdat, Int_t ngroup, Int_t *indsubdat)

 {

    Int_t jndex = 0;

    Int_t nrand;

    Int_t i, k, m, j;

    for (k=1; k<=ngroup; k++) {

       for (m=1; m<=indsubdat[k-1]; m++) {

          nrand = Int_t(gRandom->Uniform(0, 1) * double(fN-jndex))+1;

          //printf("nrand = %d - jndex %d\n",nrand,jndex);

          jndex++;

          if (jndex==1) {

             subdat[0]=nrand-1;  // in case nrand is equal to fN

          } else {

             subdat[jndex-1]=nrand+jndex-2;

             for (i=1; i<=jndex-1; i++) {

                if(subdat[i-1] > nrand+i-2) {

                   for(j=jndex; j>=i+1; j--) {

                      subdat[j-1]=subdat[j-2];

                   }

                   //printf("subdata[] i = %d - nrand %d\n",i,nrand);

                   subdat[i-1]=nrand+i-2;

                   break;  //breaking the loop for(i=1...

                }

             }

          }

       }

    }

 }
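
 // Illustration only (not part of TRobustEstimator): a minimal standalone,
 // 0-based sketch of the drawing scheme used by RDraw. Each draw picks a rank r
 // among the elements not yet selected; walking the sorted list of selected
 // indices converts that rank into an actual index and gives the insertion
 // position at the same time. The helper name DrawSorted and the use of
 // std::mt19937 instead of gRandom are assumptions for this example.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper: `count` distinct indices from [0, n), kept sorted.
std::vector<int> DrawSorted(int count, int n, std::mt19937 &rng)
{
   assert(count >= 0 && count <= n);
   std::vector<int> sorted;
   sorted.reserve(count);
   for (int drawn = 0; drawn < count; ++drawn) {
      // Rank of the new element among the n-drawn unselected ones.
      std::uniform_int_distribution<int> pick(0, n - drawn - 1);
      const int r = pick(rng);
      // Every selected index <= r+pos shifts the target up by one.
      int pos = 0;
      while (pos < drawn && sorted[pos] <= r + pos)
         ++pos;
      sorted.insert(sorted.begin() + pos, r + pos);
   }
   return sorted;
}
```

 // No redraw is ever needed: each rank maps to exactly one unselected index.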


 ////////////////////////////////////////////////////////////////////////////////

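 ///Returns the k-th order statistic (the k-th smallest element, counting from 0) of the array a.

 ///Reimplemented here, rather than calling TMath::KOrdStat, because an Int_t work array is needed.

 ///If a work array is supplied, on return work[0..k] hold the indices of the k+1 smallest elements.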


 Double_t TRobustEstimator::KOrdStat(Int_t ntotal, Double_t *a, Int_t k, Int_t *work){

    Bool_t isAllocated = kFALSE;

    const Int_t kWorkMax=100;

    Int_t i, ir, j, l, mid;

    Int_t arr;

    Int_t *ind;

    Int_t workLocal[kWorkMax];

    Int_t temp;


    if (work) {

       ind = work;

    } else {

       ind = workLocal;

       if (ntotal > kWorkMax) {

          isAllocated = kTRUE;

          ind = new Int_t[ntotal];

       }

    }


    for (Int_t ii=0; ii<ntotal; ii++) {

       ind[ii]=ii;

    }

    Int_t rk = k;

    l=0;

    ir = ntotal-1;

    for(;;) {

       if (ir<=l+1) { //active partition contains 1 or 2 elements

          if (ir == l+1 && a[ind[ir]]<a[ind[l]])

             {temp = ind[l]; ind[l]=ind[ir]; ind[ir]=temp;}

          Double_t tmp = a[ind[rk]];

          if (isAllocated)

             delete [] ind;

          return tmp;

       } else {

          mid = (l+ir) >> 1; //choose median of left, center and right

          {temp = ind[mid]; ind[mid]=ind[l+1]; ind[l+1]=temp;}//elements as partitioning element arr.

          if (a[ind[l]]>a[ind[ir]])  //also rearrange so that a[l]<=a[l+1]

             {temp = ind[l]; ind[l]=ind[ir]; ind[ir]=temp;}


          if (a[ind[l+1]]>a[ind[ir]])

             {temp=ind[l+1]; ind[l+1]=ind[ir]; ind[ir]=temp;}


          if (a[ind[l]]>a[ind[l+1]])

             {temp = ind[l]; ind[l]=ind[l+1]; ind[l+1]=temp;}


          i=l+1;        //initialize pointers for partitioning

          j=ir;

          arr = ind[l+1];

          for (;;) {

             do i++; while (a[ind[i]]<a[arr]);

             do j--; while (a[ind[j]]>a[arr]);

             if (j<i) break;  //pointers crossed, partitioning complete

             {temp=ind[i]; ind[i]=ind[j]; ind[j]=temp;}

          }

          ind[l+1]=ind[j];

          ind[j]=arr;

          if (j>=rk) ir = j-1; //keep active the partition that

          if (j<=rk) l=i;      //contains the k_th element

       }

    }

 }
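
 // Illustration only (not part of TRobustEstimator): a minimal standalone sketch
 // of a k-th order statistic computed through an index array, so that the data
 // array itself is left untouched, as in KOrdStat above. It relies on
 // std::nth_element rather than the hand-written partitioning; the helper name
 // KthSmallest is an assumption for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: returns the k-th smallest value of a (k counted from 0).
// On return, index[0..k] refer to the k+1 smallest elements of a.
double KthSmallest(const std::vector<double> &a, int k, std::vector<int> &index)
{
   const int n = static_cast<int>(a.size());
   assert(k >= 0 && k < n);
   index.resize(n);
   std::iota(index.begin(), index.end(), 0);
   // Partition the indices around position k by the values they refer to.
   std::nth_element(index.begin(), index.begin() + k, index.end(),
                    [&a](int lhs, int rhs) { return a[lhs] < a[rhs]; });
   return a[index[k]];
}
```

 // For a = {5, 1, 4, 2, 3} and k = 2 the result is 3, and the first two
 // entries of index refer to the values 1 and 2 in unspecified order.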
