doc/v628/TRobustEstimator_8cxx_source.html

// @(#)root/physics:$Id$

// Author: Anna Kreshuk  08/10/2004


/*************************************************************************

 * Copyright (C) 1995-2004, Rene Brun and Fons Rademakers.               *

 * All rights reserved.                                                  *

 *                                                                       *

 * For the licensing terms see $ROOTSYS/LICENSE.                         *

 * For the list of contributors see $ROOTSYS/README/CREDITS.             *

 *************************************************************************/


/** \class TRobustEstimator

    \ingroup Physics

Minimum Covariance Determinant Estimator - a Fast Algorithm

invented by Peter J.Rousseeuw and Katrien Van Dreissen

"A Fast Algorithm for the Minimum covariance Determinant Estimator"

Technometrics, August 1999, Vol.41, NO.3


What are robust estimators?

"An important property of an estimator is its robustness. An estimator

is called robust if it is insensitive to measurements that deviate

from the expected behaviour. There are 2 ways to treat such deviating

measurements: one may either try to recognise them and then remove

them from the data sample; or one may leave them in the sample, taking

care that they do not influence the estimate unduly. In both cases robust

estimators are needed...Robust procedures compensate for systematic errors

as much as possible, and indicate any situation in which a danger of not being

able to operate reliably is detected."

R.Fruhwirth, M.Regler, R.K.Bock, H.Grote, D.Notz

"Data Analysis Techniques for High-Energy Physics", 2nd edition


What does this algorithm do?

It computes a highly robust estimator of multivariate location and scatter.

Then, it takes those estimates to compute robust distances of all the

data vectors. Those with large robust distances are considered outliers.

Robust distances can then be plotted for better visualization of the data.


How does this algorithm do it?

The MCD objective is to find h observations(out of n) whose classical

covariance matrix has the lowest determinant. The MCD estimator of location

is then the average of those h points and the MCD estimate of scatter

is their covariance matrix. The minimum(and default) h = (n+nvariables+1)/2

so the algorithm is effective when less than (n+nvar+1)/2 variables are outliers.

The algorithm also allows for exact fit situations - that is, when h or more

observations lie on a hyperplane. Then the algorithm still yields the MCD location T

and scatter matrix S, the latter being singular as it should be. From (T,S) the

program then computes the equation of the hyperplane.


How can this algorithm be used?

In any case, when contamination of data is suspected, that might influence

the classical estimates.

Also, robust estimation of location and scatter is a tool to robustify

other multivariate techniques such as, for example, principal-component analysis

and discriminant analysis.


Technical details of the algorithm:


1. The default h = (n+nvariables+1)/2, but the user may choose any integer h with

   (n+nvariables+1)/2<=h<=n. The program then reports the MCD's breakdown value

   (n-h+1)/n. If you are sure that the dataset contains less than 25% contamination

   which is usually the case, a good compromise between breakdown value and

   efficiency is obtained by putting h=[.75*n].

2. If h=n,the MCD location estimate is the average of the whole dataset, and

   the MCD scatter estimate is its covariance matrix. Report this and stop

3. If nvariables=1 (univariate data), compute the MCD estimate by the exact

   algorithm of Rousseeuw and Leroy (1987, pp.171-172) in O(nlogn)time and stop

4. From here on, h<n and nvariables>=2.

   1. If n is small:

      - repeat (say) 500 times:

        - construct an initial h-subset, starting from a random (nvar+1)-subset

        - carry out 2 C-steps (described in the comments of CStep function)

      - for the 10 results with lowest det(S):

        - carry out C-steps until convergence

      - report the solution (T, S) with the lowest det(S)

   2. If n is larger (say, n>600), then

      - construct up to 5 disjoint random subsets of size nsub (say, nsub=300)

      - inside each subset repeat 500/5 times:

         - construct an initial subset of size hsub=[nsub*h/n]

         - carry out 2 C-steps

         - keep the best 10 results (Tsub, Ssub)

      - pool the subsets, yielding the merged set (say, of size nmerged=1500)

      - in the merged set, repeat for each of the 50 solutions (Tsub, Ssub)

         - carry out 2 C-steps

         - keep the 10 best results

      - in the full dataset, repeat for those best results:

         - take several C-steps, using n and h

         - report the best final result (T, S)

5. To obtain consistency when the data comes from a multivariate normal

   distribution, covariance matrix is multiplied by a correction factor

6. Robust distances for all elements, using the final (T, S) are calculated

   Then the very final mean and covariance estimates are calculated only for

   values, whose robust distances are less than a cutoff value (0.975 quantile

   of chi2 distribution with nvariables degrees of freedom)

*/


#include "TRobustEstimator.h"

#include "TMatrixDSymEigen.h"

#include "TRandom.h"

#include "TMath.h"

#include "TDecompChol.h"


ClassImp(TRobustEstimator);


const Double_t kChiMedian[50]= {

         0.454937, 1.38629, 2.36597, 3.35670, 4.35146, 5.34812, 6.34581, 7.34412, 8.34283,

         9.34182, 10.34, 11.34, 12.34, 13.34, 14.34, 15.34, 16.34, 17.34, 18.34, 19.34,

        20.34, 21.34, 22.34, 23.34, 24.34, 25.34, 26.34, 27.34, 28.34, 29.34, 30.34,

        31.34, 32.34, 33.34, 34.34, 35.34, 36.34, 37.34, 38.34, 39.34, 40.34,

        41.34, 42.34, 43.34, 44.34, 45.34, 46.34, 47.34, 48.34, 49.33};


const Double_t kChiQuant[50]={

         5.02389, 7.3776,9.34840,11.1433,12.8325,

        14.4494,16.0128,17.5346,19.0228,20.4831,21.920,23.337,

        24.736,26.119,27.488,28.845,30.191,31.526,32.852,34.170,

        35.479,36.781,38.076,39.364,40.646,41.923,43.194,44.461,

        45.722,46.979,48.232,49.481,50.725,51.966,53.203,54.437,

        55.668,56.896,58.120,59.342,60.561,61.777,62.990,64.201,

        65.410,66.617,67.821,69.022,70.222,71.420};


////////////////////////////////////////////////////////////////////////////////

///this constructor should be used in a univariate case:

///first call this constructor, then - the EvaluateUni(..) function


TRobustEstimator::TRobustEstimator(){

}


////////////////////////////////////////////////////////////////////////////////

///constructor


TRobustEstimator::TRobustEstimator(Int_t nvectors, Int_t nvariables, Int_t hh)

   :fMean(nvariables),

    fCovariance(nvariables),

    fInvcovariance(nvariables),

    fCorrelation(nvariables),

    fRd(nvectors),

    fSd(nvectors),

    fOut(1),

    fHyperplane(nvariables),

    fData(nvectors, nvariables)

{

   if ((nvectors<=1)||(nvariables<=0)){

      Error("TRobustEstimator","Not enough vectors or variables");

      return;

   }

   if (nvariables==1){

      Error("TRobustEstimator","For the univariate case, use the default constructor and EvaluateUni() function");

      return;

   }


   fN=nvectors;

   fNvar=nvariables;

   if (hh<(fN+fNvar+1)/2){

      if (hh>0)

         Warning("TRobustEstimator","chosen h is too small, default h is taken instead");

      fH=(fN+fNvar+1)/2;

   } else

      fH=hh;


   fVarTemp=0;

   fVecTemp=0;

   fExact=0;

}


////////////////////////////////////////////////////////////////////////////////

///adds a column to the data matrix

///it is assumed that the column has size fN

///variable fVarTemp keeps the number of columns l

///already added


void TRobustEstimator::AddColumn(Double_t *col)

{

   if (fVarTemp==fNvar) {

      fNvar++;

      fCovariance.ResizeTo(fNvar, fNvar);

      fInvcovariance.ResizeTo(fNvar, fNvar);

      fCorrelation.ResizeTo(fNvar, fNvar);

      fMean.ResizeTo(fNvar);

      fHyperplane.ResizeTo(fNvar);

      fData.ResizeTo(fN, fNvar);

   }

   for (Int_t i=0; i<fN; i++) {

      fData(i, fVarTemp)=col[i];

   }

   fVarTemp++;

}


////////////////////////////////////////////////////////////////////////////////

///adds a vector to the data matrix

///it is supposed that the vector is of size fNvar


void TRobustEstimator::AddRow(Double_t *row)

{

   if(fVecTemp==fN) {

      fN++;

      fRd.ResizeTo(fN);

      fSd.ResizeTo(fN);

      fData.ResizeTo(fN, fNvar);

   }

   for (Int_t i=0; i<fNvar; i++)

      fData(fVecTemp, i)=row[i];


   fVecTemp++;

}


////////////////////////////////////////////////////////////////////////////////

///Finds the estimate of multivariate mean and variance


void TRobustEstimator::Evaluate()

{

   Double_t kEps=1e-14;


   if (fH==fN){

      Warning("Evaluate","Chosen h = #observations, so classic estimates of location and scatter will be calculated");

      Classic();

      return;

   }


   Int_t i, j, k;

   Int_t ii, jj;

   Int_t nmini = 300;

   Int_t k1=500;

   Int_t nbest=10;

   TMatrixD sscp(fNvar+1, fNvar+1);

   TVectorD vec(fNvar);


   Int_t *index = new Int_t[fN];

   Double_t *ndist = new Double_t[fN];

   Double_t det;

   Double_t *deti=new Double_t[nbest];

   for (i=0; i<nbest; i++)

      deti[i]=1e16;


   for (i=0; i<fN; i++)

      fRd(i)=0;

   ////////////////////////////

   //for small n

   ////////////////////////////

   if (fN<nmini*2) {

      //for storing the best fMeans and covariances


      TMatrixD mstock(nbest, fNvar);

      TMatrixD cstock(fNvar, fNvar*nbest);


      for (k=0; k<k1; k++) {

         CreateSubset(fN, fH, fNvar, index, fData, sscp, ndist);

         //calculate the mean and covariance of the created subset

         ClearSscp(sscp);

         for (i=0; i<fH; i++) {

            for(j=0; j<fNvar; j++)

               vec(j)=fData[index[i]][j];

            AddToSscp(sscp, vec);

         }

         Covar(sscp, fMean, fCovariance, fSd, fH);

         det = fCovariance.Determinant();

         if (det < kEps) {

            fExact = Exact(ndist);

            delete [] index;

            delete [] ndist;

            delete [] deti;

            return;

         }

         //make 2 CSteps

         det = CStep(fN, fH, index, fData, sscp, ndist);

         if (det < kEps) {

            fExact = Exact(ndist);

            delete [] index;

            delete [] ndist;

            delete [] deti;

            return;

         }

         det = CStep(fN, fH, index, fData, sscp, ndist);

         if (det < kEps) {

            fExact = Exact(ndist);

            delete [] index;

            delete [] ndist;

            delete [] deti;

            return;

         } else {

            Int_t maxind=TMath::LocMax(nbest, deti);

            if(det<deti[maxind]) {

               deti[maxind]=det;

               for(ii=0; ii<fNvar; ii++) {

                  mstock(maxind, ii)=fMean(ii);

                  for(jj=0; jj<fNvar; jj++)

                     cstock(ii, jj+maxind*fNvar)=fCovariance(ii, jj);

               }

            }

         }

      }


      //now for nbest best results perform CSteps until convergence


      for (i=0; i<nbest; i++) {

         for(ii=0; ii<fNvar; ii++) {

            fMean(ii)=mstock(i, ii);

            for (jj=0; jj<fNvar; jj++)

               fCovariance(ii, jj)=cstock(ii, jj+i*fNvar);

         }


         det=1;

         while (det>kEps) {

            det=CStep(fN, fH, index, fData, sscp, ndist);

            if(TMath::Abs(det-deti[i])<kEps)

               break;

            else

               deti[i]=det;

         }

         for(ii=0; ii<fNvar; ii++) {

            mstock(i,ii)=fMean(ii);

            for (jj=0; jj<fNvar; jj++)

               cstock(ii,jj+i*fNvar)=fCovariance(ii, jj);

         }

      }


      Int_t detind=TMath::LocMin(nbest, deti);

      for(ii=0; ii<fNvar; ii++) {

         fMean(ii)=mstock(detind,ii);


         for(jj=0; jj<fNvar; jj++)

            fCovariance(ii, jj)=cstock(ii,jj+detind*fNvar);

      }


      if (deti[detind]!=0) {

         //calculate robust distances and throw out the bad points

         Int_t nout = RDist(sscp);

         Double_t cutoff=kChiQuant[fNvar-1];


         fOut.Set(nout);


         j=0;

         for (i=0; i<fN; i++) {

            if(fRd(i)>cutoff) {

               fOut[j]=i;

               j++;

            }

         }


      } else {

         fExact=Exact(ndist);

      }

      delete [] index;

      delete [] ndist;

      delete [] deti;

      return;


   }

   /////////////////////////////////////////////////

  //if n>nmini, the dataset should be partitioned

  //partitioning

  ////////////////////////////////////////////////

   Int_t indsubdat[5];

   Int_t nsub;

   for (ii=0; ii<5; ii++)

      indsubdat[ii]=0;


   nsub = Partition(nmini, indsubdat);


   Int_t sum=0;

   for (ii=0; ii<5; ii++)

      sum+=indsubdat[ii];

   Int_t *subdat=new Int_t[sum];

   //printf("allocates subdat[ %d ]\n", sum);

   // init the subdat matrix

   for (int iii = 0; iii < sum; ++iii) subdat[iii] = -999;

   RDraw(subdat, nsub, indsubdat);

   for (int iii = 0; iii < sum; ++iii) {

      if (subdat[iii] < 0 || subdat[iii] >= fN ) {

         Error("Evaluate","subdat index is invalid subdat[%d] = %d",iii, subdat[iii] );

         R__ASSERT(0);

      }

   }

   //now the indexes of selected cases are in the array subdat

   //matrices to store best means and covariances

   Int_t nbestsub=nbest*nsub;

   TMatrixD mstockbig(nbestsub, fNvar);

   TMatrixD cstockbig(fNvar, fNvar*nbestsub);

   TMatrixD hyperplane(nbestsub, fNvar);

   for (i=0; i<nbestsub; i++) {

      for(j=0; j<fNvar; j++)

         hyperplane(i,j)=0;

   }

   Double_t *detibig = new Double_t[nbestsub];

   Int_t maxind;

   maxind=TMath::LocMax(5, indsubdat);

   TMatrixD dattemp(indsubdat[maxind], fNvar);


   Int_t k2=Int_t(k1/nsub);

   //construct h-subsets and perform 2 CSteps in subgroups


   for (Int_t kgroup=0; kgroup<nsub; kgroup++) {

      //printf("group #%d\n", kgroup);

      Int_t ntemp=indsubdat[kgroup];

      Int_t temp=0;

      for (i=0; i<kgroup; i++)

         temp+=indsubdat[i];

      Int_t par;


      for(i=0; i<ntemp; i++) {

         for (j=0; j<fNvar; j++) {

            dattemp(i,j)=fData[subdat[temp+i]][j];

         }

      }

      Int_t htemp=Int_t(fH*ntemp/fN);


      for (i=0; i<nbest; i++)

         deti[i]=1e16;


      for(k=0; k<k2; k++) {

         CreateSubset(ntemp, htemp, fNvar, index, dattemp, sscp, ndist);

         ClearSscp(sscp);

         for (i=0; i<htemp; i++) {

            for(j=0; j<fNvar; j++) {

               vec(j)=dattemp(index[i],j);

            }

            AddToSscp(sscp, vec);

         }

         Covar(sscp, fMean, fCovariance, fSd, htemp);

         det = fCovariance.Determinant();

         if (det<kEps) {

            par =Exact2(mstockbig, cstockbig, hyperplane, deti, nbest, kgroup, sscp,ndist);

            if(par==nbest+1) {


               delete [] detibig;

               delete [] deti;

               delete [] subdat;

               delete [] ndist;

               delete [] index;

               return;

            } else

               deti[par]=det;

         } else {

            det = CStep(ntemp, htemp, index, dattemp, sscp, ndist);

            if (det<kEps) {

               par=Exact2(mstockbig, cstockbig, hyperplane, deti, nbest, kgroup, sscp, ndist);

               if(par==nbest+1) {


                  delete [] detibig;

                  delete [] deti;

                  delete [] subdat;

                  delete [] ndist;

                  delete [] index;

                  return;

               } else

                  deti[par]=det;

            } else {

               det=CStep(ntemp,htemp, index, dattemp, sscp, ndist);

               if(det<kEps){

                  par=Exact2(mstockbig, cstockbig, hyperplane, deti, nbest, kgroup, sscp,ndist);

                  if(par==nbest+1) {


                     delete [] detibig;

                     delete [] deti;

                     delete [] subdat;

                     delete [] ndist;

                     delete [] index;

                     return;

                  } else {

                     deti[par]=det;

                  }

               } else {

                  maxind=TMath::LocMax(nbest, deti);

                  if(det<deti[maxind]) {

                     deti[maxind]=det;

                     for(i=0; i<fNvar; i++) {

                        mstockbig(nbest*kgroup+maxind,i)=fMean(i);

                        for(j=0; j<fNvar; j++) {

                           cstockbig(i,nbest*kgroup*fNvar+maxind*fNvar+j)=fCovariance(i,j);


                        }

                     }

                  }


               }

            }

         }


         maxind=TMath::LocMax(nbest, deti);

         if (deti[maxind]<kEps)

            break;

      }


      for(i=0; i<nbest; i++) {

         detibig[kgroup*nbest + i]=deti[i];


      }


   }


   //now the arrays mstockbig and cstockbig store nbest*nsub best means and covariances

   //detibig stores nbest*nsub their determinants

   //merge the subsets and carry out 2 CSteps on the merged set for all 50 best solutions


   TMatrixD datmerged(sum, fNvar);

   for(i=0; i<sum; i++) {

      for (j=0; j<fNvar; j++)

         datmerged(i,j)=fData[subdat[i]][j];

   }

   //  printf("performing calculations for merged set\n");

   Int_t hmerged=Int_t(sum*fH/fN);


   Int_t nh;

   for(k=0; k<nbestsub; k++) {

      //for all best solutions perform 2 CSteps and then choose the very best

      for(ii=0; ii<fNvar; ii++) {

         fMean(ii)=mstockbig(k,ii);

         for(jj=0; jj<fNvar; jj++)

            fCovariance(ii, jj)=cstockbig(ii,k*fNvar+jj);

      }

      if(detibig[k]==0) {

         for(i=0; i<fNvar; i++)

            fHyperplane(i)=hyperplane(k,i);

         CreateOrtSubset(datmerged,index, hmerged, sum, sscp, ndist);


      }

      det=CStep(sum, hmerged, index, datmerged, sscp, ndist);

      if (det<kEps) {

         nh= Exact(ndist);

         if (nh>=fH) {

            fExact = nh;


            delete [] detibig;

            delete [] deti;

            delete [] subdat;

            delete [] ndist;

            delete [] index;

            return;

         } else {

            CreateOrtSubset(datmerged, index, hmerged, sum, sscp, ndist);

         }

      }


      det=CStep(sum, hmerged, index, datmerged, sscp, ndist);

      if (det<kEps) {

         nh=Exact(ndist);

         if (nh>=fH) {

            fExact = nh;

            delete [] detibig;

            delete [] deti;

            delete [] subdat;

            delete [] ndist;

            delete [] index;

            return;

         }

      }

      detibig[k]=det;

      for(i=0; i<fNvar; i++) {

         mstockbig(k,i)=fMean(i);

         for(j=0; j<fNvar; j++) {

            cstockbig(i,k*fNvar+j)=fCovariance(i, j);

         }

      }

   }

   //now for the subset with the smallest determinant

   //repeat CSteps until convergence

   Int_t minind=TMath::LocMin(nbestsub, detibig);

   det=detibig[minind];

   for(i=0; i<fNvar; i++) {

      fMean(i)=mstockbig(minind,i);

      fHyperplane(i)=hyperplane(minind,i);

      for(j=0; j<fNvar; j++)

         fCovariance(i, j)=cstockbig(i,minind*fNvar + j);

   }

   if(det<kEps)

      CreateOrtSubset(fData, index, fH, fN, sscp, ndist);

   det=1;

   while (det>kEps) {

      det=CStep(fN, fH, index, fData, sscp, ndist);

      if(TMath::Abs(det-detibig[minind])<kEps) {

         break;

      } else {

         detibig[minind]=det;

      }

   }

   if(det<kEps) {

      Exact(ndist);

      fExact=kTRUE;

   }

   Int_t nout = RDist(sscp);

   Double_t cutoff=kChiQuant[fNvar-1];


   fOut.Set(nout);


   j=0;

   for (i=0; i<fN; i++) {

      if(fRd(i)>cutoff) {

         fOut[j]=i;

         j++;

      }

   }


   delete [] detibig;

   delete [] deti;

   delete [] subdat;

   delete [] ndist;

   delete [] index;

   return;

}


////////////////////////////////////////////////////////////////////////////////

///for the univariate case

///estimates of location and scatter are returned in mean and sigma parameters

///the algorithm works on the same principle as in multivariate case -

///it finds a subset of size hh with smallest sigma, and then returns mean and

///sigma of this subset


void TRobustEstimator::EvaluateUni(Int_t nvectors, Double_t *data, Double_t &mean, Double_t &sigma, Int_t hh)

{

   if (hh==0)

      hh=(nvectors+2)/2;

   Double_t faclts[]={2.6477,2.5092,2.3826,2.2662,2.1587,2.0589,1.9660,1.879,1.7973,1.7203,1.6473};

   Int_t *index=new Int_t[nvectors];

   TMath::Sort(nvectors, data, index, kFALSE);


   Int_t nquant;

   nquant=TMath::Min(Int_t(Double_t(((hh*1./nvectors)-0.5)*40))+1, 11);

   Double_t factor=faclts[nquant-1];


   Double_t *aw=new Double_t[nvectors];

   Double_t *aw2=new Double_t[nvectors];

   Double_t sq=0;

   Double_t sqmin=0;

   Int_t ndup=0;

   Int_t len=nvectors-hh;

   Double_t *slutn=new Double_t[len];

   for(Int_t i=0; i<len; i++)

      slutn[i]=0;

   for(Int_t jint=0; jint<len; jint++) {

      aw[jint]=0;

      for (Int_t j=0; j<hh; j++) {

         aw[jint]+=data[index[j+jint]];

         if(jint==0)

            sq+=data[index[j]]*data[index[j]];

      }

      aw2[jint]=aw[jint]*aw[jint]/hh;


      if(jint==0) {

         sq=sq-aw2[jint];

         sqmin=sq;

         slutn[ndup]=aw[jint];


      } else {

         sq=sq - data[index[jint-1]]*data[index[jint-1]]+

            data[index[jint+hh]]*data[index[jint+hh]]-

            aw2[jint]+aw2[jint-1];

         if(sq<sqmin) {

            ndup=0;

            sqmin=sq;

            slutn[ndup]=aw[jint];


         } else {

            if(sq==sqmin) {

               ndup++;

               slutn[ndup]=aw[jint];

            }

         }

      }

   }


   slutn[0]=slutn[Int_t((ndup)/2)]/hh;

   Double_t bstd=factor*TMath::Sqrt(sqmin/hh);

   mean=slutn[0];

   sigma=bstd;

   delete [] aw;

   delete [] aw2;

   delete [] slutn;

   delete [] index;

}


////////////////////////////////////////////////////////////////////////////////

///returns the breakdown point of the algorithm


Int_t TRobustEstimator::GetBDPoint()

{

   Int_t n;

   n=(fN-fH+1)/fN;

   return n;

}


////////////////////////////////////////////////////////////////////////////////

///returns the chi2 quantiles


Double_t TRobustEstimator::GetChiQuant(Int_t i) const

{

   if (i < 0 || i >= 50) return 0;

   return kChiQuant[i];

}


////////////////////////////////////////////////////////////////////////////////

///returns the covariance matrix


void TRobustEstimator::GetCovariance(TMatrixDSym &matr)

{

   if (matr.GetNrows()!=fNvar || matr.GetNcols()!=fNvar){

      Warning("GetCovariance","provided matrix is of the wrong size, it will be resized");

      matr.ResizeTo(fNvar, fNvar);

   }

   matr=fCovariance;

}


////////////////////////////////////////////////////////////////////////////////

///returns the correlation matrix


void TRobustEstimator::GetCorrelation(TMatrixDSym &matr)

{

   if (matr.GetNrows()!=fNvar || matr.GetNcols()!=fNvar) {

      Warning("GetCorrelation","provided matrix is of the wrong size, it will be resized");

      matr.ResizeTo(fNvar, fNvar);

   }

   matr=fCorrelation;

}


////////////////////////////////////////////////////////////////////////////////

///if the points are on a hyperplane, returns this hyperplane


const TVectorD* TRobustEstimator::GetHyperplane() const

{

   if (fExact==0) {

      Error("GetHyperplane","the data doesn't lie on a hyperplane!\n");

      return 0;

   } else {

      return &fHyperplane;

   }

}


////////////////////////////////////////////////////////////////////////////////

///if the points are on a hyperplane, returns this hyperplane


void TRobustEstimator::GetHyperplane(TVectorD &vec)

{

   if (fExact==0){

      Error("GetHyperplane","the data doesn't lie on a hyperplane!\n");

      return;

   }

   if (vec.GetNoElements()!=fNvar) {

      Warning("GetHyperPlane","provided vector is of the wrong size, it will be resized");

      vec.ResizeTo(fNvar);

   }

   vec=fHyperplane;

}


////////////////////////////////////////////////////////////////////////////////

///return the estimate of the mean


void TRobustEstimator::GetMean(TVectorD &means)

{

   if (means.GetNoElements()!=fNvar) {

      Warning("GetMean","provided vector is of the wrong size, it will be resized");

      means.ResizeTo(fNvar);

   }

   means=fMean;

}


////////////////////////////////////////////////////////////////////////////////

///returns the robust distances (helps to find outliers)


void TRobustEstimator::GetRDistances(TVectorD &rdist)

{

   if (rdist.GetNoElements()!=fN) {

      Warning("GetRDistances","provided vector is of the wrong size, it will be resized");

      rdist.ResizeTo(fN);

   }

   rdist=fRd;

}


////////////////////////////////////////////////////////////////////////////////

///returns the number of outliers


Int_t TRobustEstimator::GetNOut()

{

   return fOut.GetSize();

}


////////////////////////////////////////////////////////////////////////////////

///update the sscp matrix with vector vec


void TRobustEstimator::AddToSscp(TMatrixD &sscp, TVectorD &vec)

{

   Int_t i, j;

   for (j=1; j<fNvar+1; j++) {

      sscp(0, j) +=vec(j-1);

      sscp(j, 0) = sscp(0, j);

   }

   for (i=1; i<fNvar+1; i++) {

      for (j=1; j<fNvar+1; j++) {

         sscp(i, j) += vec(i-1)*vec(j-1);

      }

   }

}


////////////////////////////////////////////////////////////////////////////////

///clear the sscp matrix, used for covariance and mean calculation


void TRobustEstimator::ClearSscp(TMatrixD &sscp)

{

   for (Int_t i=0; i<fNvar+1; i++) {

      for (Int_t j=0; j<fNvar+1; j++) {

         sscp(i, j)=0;

      }

   }

}


////////////////////////////////////////////////////////////////////////////////

///called when h=n. Returns classic covariance matrix

///and mean


void TRobustEstimator::Classic()

{

   TMatrixD sscp(fNvar+1, fNvar+1);

   TVectorD temp(fNvar);

   ClearSscp(sscp);

   for (Int_t i=0; i<fN; i++) {

      for (Int_t j=0; j<fNvar; j++)

         temp(j)=fData(i, j);

      AddToSscp(sscp, temp);

   }

   Covar(sscp, fMean, fCovariance, fSd, fN);

   Correl();


}


////////////////////////////////////////////////////////////////////////////////

///calculates mean and covariance


void TRobustEstimator::Covar(TMatrixD &sscp, TVectorD &m, TMatrixDSym &cov, TVectorD &sd, Int_t nvec)

{

   Int_t i, j;

   Double_t f;

   for (i=0; i<fNvar; i++) {

      m(i)=sscp(0, i+1);

      sd[i]=sscp(i+1, i+1);

      f=(sd[i]-m(i)*m(i)/nvec)/(nvec-1);

      if (f>1e-14) sd[i]=TMath::Sqrt(f);

      else sd[i]=0;

      m(i)/=nvec;

   }

   for (i=0; i<fNvar; i++) {

      for (j=0; j<fNvar; j++) {

         cov(i, j)=sscp(i+1, j+1)-nvec*m(i)*m(j);

      cov(i, j)/=nvec-1;

      }

   }

}


////////////////////////////////////////////////////////////////////////////////

///transforms covariance matrix into correlation matrix


void TRobustEstimator::Correl()

{

   Int_t i, j;

   Double_t *sd=new Double_t[fNvar];

   for(j=0; j<fNvar; j++)

      sd[j]=1./TMath::Sqrt(fCovariance(j, j));

   for(i=0; i<fNvar; i++) {

      for (j=0; j<fNvar; j++) {

         if (i==j)

            fCorrelation(i, j)=1.;

         else

            fCorrelation(i, j)=fCovariance(i, j)*sd[i]*sd[j];

      }

   }

   delete [] sd;

}


////////////////////////////////////////////////////////////////////////////////

///creates a subset of htotal elements from ntotal elements

///first, p+1 elements are drawn randomly(without repetitions)

///if their covariance matrix is singular, more elements are

///added one by one, until their covariance matrix becomes regular

///or it becomes clear that htotal observations lie on a hyperplane

///If covariance matrix determinant!=0, distances of all ntotal elements

///are calculated, using formula d_i=Sqrt((x_i-M)*S_inv*(x_i-M)), where

///M is mean and S_inv is the inverse of the covariance matrix

///htotal points with smallest distances are included in the returned subset.


void TRobustEstimator::CreateSubset(Int_t ntotal, Int_t htotal, Int_t p, Int_t *index, TMatrixD &data, TMatrixD &sscp, Double_t *ndist)

{

   Double_t kEps = 1e-14;

   Int_t i, j;

   Bool_t repeat=kFALSE;

   Int_t nindex=0;

   Int_t num;

   for(i=0; i<ntotal; i++)

      index[i]=ntotal+1;


   for (i=0; i<p+1; i++) {

      num=Int_t(gRandom->Uniform(0, 1)*(ntotal-1));

      if (i>0){

         for(j=0; j<=i-1; j++) {

            if(index[j]==num)

            repeat=kTRUE;

         }

      }

      if(repeat==kTRUE) {

         i--;

         repeat=kFALSE;

      } else {

         index[i]=num;

         nindex++;

      }

   }


   ClearSscp(sscp);


   TVectorD vec(fNvar);

   Double_t det;

   for (i=0; i<p+1; i++) {

      for (j=0; j<fNvar; j++) {

         vec[j]=data[index[i]][j];


      }

      AddToSscp(sscp, vec);

   }


   Covar(sscp, fMean, fCovariance, fSd, p+1);

   det=fCovariance.Determinant();

   while((det<kEps)&&(nindex < htotal)) {

    //if covariance matrix is singular,another vector is added until

    //the matrix becomes regular or it becomes clear that all

    //vectors of the group lie on a hyperplane

      repeat=kFALSE;

      do{

         num=Int_t(gRandom->Uniform(0,1)*(ntotal-1));

         repeat=kFALSE;

         for(i=0; i<nindex; i++) {

            if(index[i]==num) {

               repeat=kTRUE;

               break;

            }

         }

      }while(repeat==kTRUE);


      index[nindex]=num;

      nindex++;

      //check if covariance matrix is singular

      for(j=0; j<fNvar; j++)

         vec[j]=data[index[nindex-1]][j];

      AddToSscp(sscp, vec);

      Covar(sscp, fMean, fCovariance, fSd, nindex);

      det=fCovariance.Determinant();

   }


   if(nindex!=htotal) {

      TDecompChol chol(fCovariance);

      fInvcovariance = chol.Invert();


      TVectorD temp(fNvar);

      for(j=0; j<ntotal; j++) {

         ndist[j]=0;

         for(i=0; i<fNvar; i++)

            temp[i]=data[j][i] - fMean(i);

         temp*=fInvcovariance;

         for(i=0; i<fNvar; i++)

            ndist[j]+=(data[j][i]-fMean(i))*temp[i];

      }

      KOrdStat(ntotal, ndist, htotal-1,index);

   }


}


////////////////////////////////////////////////////////////////////////////////

///creates a subset of hmerged vectors with smallest orthogonal distances to the hyperplane

///hyp[1]*(x1-mean[1])+...+hyp[nvar]*(xnvar-mean[nvar])=0

///This function is called in case when less than fH samples lie on a hyperplane.


void TRobustEstimator::CreateOrtSubset(TMatrixD &dat,Int_t *index, Int_t hmerged, Int_t nmerged, TMatrixD &sscp, Double_t *ndist)

{

   Int_t i, j;

      TVectorD vec(fNvar);

   for (i=0; i<nmerged; i++) {

      ndist[i]=0;

      for(j=0; j<fNvar; j++) {

         ndist[i]+=fHyperplane[j]*(dat[i][j]-fMean[j]);

         ndist[i]=TMath::Abs(ndist[i]);

      }

   }

   KOrdStat(nmerged, ndist, hmerged-1, index);

   ClearSscp(sscp);

   for (i=0; i<hmerged; i++) {

      for(j=0; j<fNvar; j++)

         vec[j]=dat[index[i]][j];

      AddToSscp(sscp, vec);

   }

   Covar(sscp, fMean, fCovariance, fSd, hmerged);

}


////////////////////////////////////////////////////////////////////////////////

///from the input htotal-subset constructs another htotal subset with lower determinant

///

///As proven by Peter J.Rousseeuw and Katrien Van Driessen, if distances for all elements

///are calculated, using the formula:d_i=Sqrt((x_i-M)*S_inv*(x_i-M)), where M is the mean

///of the input htotal-subset, and S_inv - the inverse of its covariance matrix, then

///htotal elements with smallest distances will have covariance matrix with determinant

///less or equal to the determinant of the input subset covariance matrix.

///

///determinant for this htotal-subset with smallest distances is returned


Double_t TRobustEstimator::CStep(Int_t ntotal, Int_t htotal, Int_t *index, TMatrixD &data, TMatrixD &sscp, Double_t *ndist)

{

   Int_t i, j;

   TVectorD vec(fNvar);

   Double_t det;


   TDecompChol chol(fCovariance);

   fInvcovariance = chol.Invert();


   TVectorD temp(fNvar);

   for(j=0; j<ntotal; j++) {

      ndist[j]=0;

      for(i=0; i<fNvar; i++)

         temp[i]=data[j][i]-fMean[i];

      temp*=fInvcovariance;

      for(i=0; i<fNvar; i++)

         ndist[j]+=(data[j][i]-fMean[i])*temp[i];

   }


   //taking h smallest

   KOrdStat(ntotal, ndist, htotal-1, index);

   //writing their mean and covariance

   ClearSscp(sscp);

   for (i=0; i<htotal; i++) {

      for (j=0; j<fNvar; j++)

         temp[j]=data[index[i]][j];

      AddToSscp(sscp, temp);

   }

   Covar(sscp, fMean, fCovariance, fSd, htotal);

   det = fCovariance.Determinant();

   return det;

}


////////////////////////////////////////////////////////////////////////////////

///for the exact fit situations

///returns number of observations on the hyperplane


Int_t TRobustEstimator::Exact(Double_t *ndist)

{

   Int_t i, j;


   TMatrixDSymEigen eigen(fCovariance);

   TVectorD eigenValues=eigen.GetEigenValues();

   TMatrixD eigenMatrix=eigen.GetEigenVectors();


   for (j=0; j<fNvar; j++) {

      fHyperplane[j]=eigenMatrix(j,fNvar-1);

   }

   //calculate and return how many observations lie on the hyperplane

   for (i=0; i<fN; i++) {

      ndist[i]=0;

      for(j=0; j<fNvar; j++) {

         ndist[i]+=fHyperplane[j]*(fData[i][j]-fMean[j]);

         ndist[i]=TMath::Abs(ndist[i]);

      }

   }

   Int_t nhyp=0;


   for (i=0; i<fN; i++) {

      if(ndist[i] < 1e-14) nhyp++;

   }

   return nhyp;


}


////////////////////////////////////////////////////////////////////////////////

///This function is called if determinant of the covariance matrix of a subset=0.

///

///If there are more then fH vectors on a hyperplane,

///returns this hyperplane and stops

///else stores the hyperplane coordinates in hyperplane matrix


Int_t TRobustEstimator::Exact2(TMatrixD &mstockbig, TMatrixD &cstockbig, TMatrixD &hyperplane,

                             Double_t *deti, Int_t nbest, Int_t kgroup,

                             TMatrixD &sscp, Double_t *ndist)

{

   Int_t i, j;


   TVectorD vec(fNvar);

   Int_t maxind = TMath::LocMax(nbest, deti);

   Int_t nh=Exact(ndist);

   //now nh is the number of observation on the hyperplane

   //ndist stores distances of observation from this hyperplane

   if(nh>=fH) {

      ClearSscp(sscp);

      for (i=0; i<fN; i++) {

         if(ndist[i]<1e-14) {

            for (j=0; j<fNvar; j++)

               vec[j]=fData[i][j];

            AddToSscp(sscp, vec);

         }

      }

      Covar(sscp, fMean, fCovariance, fSd, nh);


      fExact=nh;

      return nbest+1;


   } else {

      //if less than fH observations lie on a hyperplane,

      //mean and covariance matrix are stored in mstockbig

      //and cstockbig in place of the previous maximum determinant

      //mean and covariance

      for(i=0; i<fNvar; i++) {

         mstockbig(nbest*kgroup+maxind,i)=fMean(i);

         hyperplane(nbest*kgroup+maxind,i)=fHyperplane(i);

         for(j=0; j<fNvar; j++) {

            cstockbig(i,nbest*kgroup*fNvar+maxind*fNvar+j)=fCovariance(i,j);

         }

      }

      return maxind;

   }

}


////////////////////////////////////////////////////////////////////////////////

///divides the elements into approximately equal subgroups

///number of elements in each subgroup is stored in indsubdat

///number of subgroups is returned


Int_t TRobustEstimator::Partition(Int_t nmini, Int_t *indsubdat)

{

   Int_t nsub;

   if ((fN>=2*nmini) && (fN<=(3*nmini-1))) {

      if (fN%2==1){

         indsubdat[0]=Int_t(fN*0.5);

      indsubdat[1]=Int_t(fN*0.5)+1;

      } else

         indsubdat[0]=indsubdat[1]=Int_t(fN/2);

      nsub=2;

   }

   else{

      if((fN>=3*nmini) && (fN<(4*nmini -1))) {

         if(fN%3==0){

            indsubdat[0]=indsubdat[1]=indsubdat[2]=Int_t(fN/3);

         } else {

            indsubdat[0]=Int_t(fN/3);

            indsubdat[1]=Int_t(fN/3)+1;

            if (fN%3==1) indsubdat[2]=Int_t(fN/3);

            else indsubdat[2]=Int_t(fN/3)+1;

         }

         nsub=3;

      }

      else{

         if((fN>=4*nmini)&&(fN<=(5*nmini-1))){

            if (fN%4==0) indsubdat[0]=indsubdat[1]=indsubdat[2]=indsubdat[3]=Int_t(fN/4);

            else {

               indsubdat[0]=Int_t(fN/4);

               indsubdat[1]=Int_t(fN/4)+1;

               if(fN%4==1) indsubdat[2]=indsubdat[3]=Int_t(fN/4);

               if(fN%4==2) {

                  indsubdat[2]=Int_t(fN/4)+1;

                  indsubdat[3]=Int_t(fN/4);

               }

               if(fN%4==3) indsubdat[2]=indsubdat[3]=Int_t(fN/4)+1;

            }

            nsub=4;

         } else {

            for(Int_t i=0; i<5; i++)

               indsubdat[i]=nmini;

            nsub=5;

         }

      }

   }

   return nsub;

}


////////////////////////////////////////////////////////////////////////////////

///Calculates robust distances.Then the samples with robust distances

///greater than a cutoff value (0.975 quantile of chi2 distribution with

///fNvar degrees of freedom, multiplied by a correction factor), are given

///weiht=0, and new, reweighted estimates of location and scatter are calculated

///The function returns the number of outliers.


Int_t TRobustEstimator::RDist(TMatrixD &sscp)

{

   Int_t i, j;

   Int_t nout=0;


   TVectorD temp(fNvar);

   TDecompChol chol(fCovariance);

   fInvcovariance = chol.Invert();


   for (i=0; i<fN; i++) {

      fRd[i]=0;

      for(j=0; j<fNvar; j++) {

         temp[j]=fData[i][j]-fMean[j];

      }

      temp*=fInvcovariance;

      for(j=0; j<fNvar; j++) {

         fRd[i]+=(fData[i][j]-fMean[j])*temp[j];

      }

   }


   Double_t med;

   Double_t chi = kChiMedian[fNvar-1];


   med=TMath::Median(fN, fRd.GetMatrixArray());

   med/=chi;

   fCovariance*=med;

   TDecompChol chol2(fCovariance);

   fInvcovariance = chol2.Invert();


   for (i=0; i<fN; i++) {

      fRd[i]=0;

      for(j=0; j<fNvar; j++) {

         temp[j]=fData[i][j]-fMean[j];

   }


      temp*=fInvcovariance;

      for(j=0; j<fNvar; j++) {

         fRd[i]+=(fData[i][j]-fMean[j])*temp[j];

      }

   }


   Double_t cutoff = kChiQuant[fNvar-1];


   ClearSscp(sscp);

   for(i=0; i<fN; i++) {

      if (fRd[i]<=cutoff) {

         for(j=0; j<fNvar; j++)

            temp[j]=fData[i][j];

         AddToSscp(sscp,temp);

      } else {

         nout++;

      }

   }


   Covar(sscp, fMean, fCovariance, fSd, fN-nout);

   return nout;

}


////////////////////////////////////////////////////////////////////////////////

///Draws ngroup nonoverlapping subdatasets out of a dataset of size n

///such that the selected case numbers are uniformly distributed from 1 to n


void TRobustEstimator::RDraw(Int_t *subdat, Int_t ngroup, Int_t *indsubdat)

{

   Int_t jndex = 0;

   Int_t nrand;

   Int_t i, k, m, j;

   for (k=1; k<=ngroup; k++) {

      for (m=1; m<=indsubdat[k-1]; m++) {

         nrand = Int_t(gRandom->Uniform(0, 1) * double(fN-jndex))+1;

         //printf("nrand = %d - jndex %d\n",nrand,jndex);

         jndex++;

         if (jndex==1) {

            subdat[0]=nrand-1;  // in case nrand is equal to fN

         } else {

            subdat[jndex-1]=nrand+jndex-2;

            for (i=1; i<=jndex-1; i++) {

               if(subdat[i-1] > nrand+i-2) {

                  for(j=jndex; j>=i+1; j--) {

                     subdat[j-1]=subdat[j-2];

                  }

                  //printf("subdata[] i = %d - nrand %d\n",i,nrand);

                  subdat[i-1]=nrand+i-2;

                  break;  //breaking the loop for(i=1...

               }

            }

         }

      }

   }

}


////////////////////////////////////////////////////////////////////////////////

///because I need an Int_t work array


Double_t TRobustEstimator::KOrdStat(Int_t ntotal, Double_t *a, Int_t k, Int_t *work){

   Bool_t isAllocated = kFALSE;

   const Int_t kWorkMax=100;

   Int_t i, ir, j, l, mid;

   Int_t arr;

   Int_t *ind;

   Int_t workLocal[kWorkMax];

   Int_t temp;


   if (work) {

      ind = work;

   } else {

      ind = workLocal;

      if (ntotal > kWorkMax) {

         isAllocated = kTRUE;

         ind = new Int_t[ntotal];

      }

   }


   for (Int_t ii=0; ii<ntotal; ii++) {

      ind[ii]=ii;

   }

   Int_t rk = k;

   l=0;

   ir = ntotal-1;

   for(;;) {

      if (ir<=l+1) { //active partition contains 1 or 2 elements

         if (ir == l+1 && a[ind[ir]]<a[ind[l]])

            {temp = ind[l]; ind[l]=ind[ir]; ind[ir]=temp;}

         Double_t tmp = a[ind[rk]];

         if (isAllocated)

            delete [] ind;

         return tmp;

      } else {

         mid = (l+ir) >> 1; //choose median of left, center and right

         {temp = ind[mid]; ind[mid]=ind[l+1]; ind[l+1]=temp;}//elements as partitioning element arr.

         if (a[ind[l]]>a[ind[ir]])  //also rearrange so that a[l]<=a[l+1]

            {temp = ind[l]; ind[l]=ind[ir]; ind[ir]=temp;}


         if (a[ind[l+1]]>a[ind[ir]])

            {temp=ind[l+1]; ind[l+1]=ind[ir]; ind[ir]=temp;}


         if (a[ind[l]]>a[ind[l+1]])

                {temp = ind[l]; ind[l]=ind[l+1]; ind[l+1]=temp;}


         i=l+1;        //initialize pointers for partitioning

         j=ir;

         arr = ind[l+1];

         for (;;) {

            do i++; while (a[ind[i]]<a[arr]);

            do j--; while (a[ind[j]]>a[arr]);

            if (j<i) break;  //pointers crossed, partitioning complete

               {temp=ind[i]; ind[i]=ind[j]; ind[j]=temp;}

         }

         ind[l+1]=ind[j];

         ind[j]=arr;

         if (j>=rk) ir = j-1; //keep active the partition that

         if (j<=rk) l=i;      //contains the k_th element

      }

   }

}

f
#define f(i)
Definition RSha256.hxx:104

a
#define a(i)
Definition RSha256.hxx:99

e
#define e(i)
Definition RSha256.hxx:103

Int_t
int Int_t
Definition RtypesCore.h:45

kFALSE
constexpr Bool_t kFALSE
Definition RtypesCore.h:101

Double_t
double Double_t
Definition RtypesCore.h:59

kTRUE
constexpr Bool_t kTRUE
Definition RtypesCore.h:100

ClassImp
#define ClassImp(name)
Definition Rtypes.h:377

TDecompChol.h

R__ASSERT
#define R__ASSERT(e)
Definition TError.h:117

p
winID h TVirtualViewer3D TVirtualGLPainter p
Definition TGWin32VirtualGLProxy.cxx:51

data
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void data
Definition TGWin32VirtualXProxy.cxx:104

index
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t index
Definition TGWin32VirtualXProxy.cxx:168

len
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t Int_t Int_t Window_t TString Int_t GCValues_t GetPrimarySelectionOwner GetDisplay GetScreen GetColormap GetNativeEvent const char const char dpyName wid window const char font_name cursor keysym reg const char only_if_exist regb h Point_t winding char text const char depth char const char Int_t count const char ColorStruct_t color const char Pixmap_t Pixmap_t PictureAttributes_t attr const char char ret_data h unsigned char height h Atom_t Int_t ULong_t ULong_t unsigned char prop_list Atom_t Atom_t Atom_t Time_t UChar_t len
Definition TGWin32VirtualXProxy.cxx:249

TMath.h

TMatrixDSymEigen.h

TRandom.h

gRandom
R__EXTERN TRandom * gRandom
Definition TRandom.h:62

kChiMedian
const Double_t kChiMedian[50]
Definition TRobustEstimator.cxx:104

kChiQuant
const Double_t kChiQuant[50]
Definition TRobustEstimator.cxx:111

TRobustEstimator.h

TArrayI::Set
void Set(Int_t n) override
Set size of this array to n ints.
Definition TArrayI.cxx:105

TArray::GetSize
Int_t GetSize() const
Definition TArray.h:47

TDecompChol
Cholesky Decomposition class.
Definition TDecompChol.h:25

TDecompChol::Invert
Bool_t Invert(TMatrixDSym &inv)
For a symmetric matrix A(m,m), its inverse A_inv(m,m) is returned .
Definition TDecompChol.cxx:341

TMatrixDSymEigen
TMatrixDSymEigen.
Definition TMatrixDSymEigen.h:28

TMatrixDSymEigen::GetEigenValues
const TVectorD & GetEigenValues() const
Definition TMatrixDSymEigen.h:54

TMatrixDSymEigen::GetEigenVectors
const TMatrixD & GetEigenVectors() const
Definition TMatrixDSymEigen.h:53

TMatrixTBase::GetNrows
Int_t GetNrows() const
Definition TMatrixTBase.h:123

TMatrixTBase::GetNcols
Int_t GetNcols() const
Definition TMatrixTBase.h:126

TMatrixTSym< Double_t >

TMatrixTSym::ResizeTo
TMatrixTBase< Element > & ResizeTo(Int_t nrows, Int_t ncols, Int_t=-1) override
Set size of the matrix to nrows x ncols New dynamic elements are created, the overlapping part of the...
Definition TMatrixTSym.cxx:771

TMatrixTSym::Determinant
Double_t Determinant() const override
Definition TMatrixTSym.cxx:935

TMatrixT< Double_t >

TMatrixT::ResizeTo
TMatrixTBase< Element > & ResizeTo(Int_t nrows, Int_t ncols, Int_t=-1) override
Set size of the matrix to nrows x ncols New dynamic elements are created, the overlapping part of the...
Definition TMatrixT.cxx:1211

TObject::Warning
virtual void Warning(const char *method, const char *msgfmt,...) const
Issue warning message.
Definition TObject.cxx:956

TObject::Error
virtual void Error(const char *method, const char *msgfmt,...) const
Issue error message.
Definition TObject.cxx:970

TRandom::Uniform
virtual Double_t Uniform(Double_t x1=1)
Returns a uniform deviate on the interval (0, x1).
Definition TRandom.cxx:672

TRobustEstimator
Minimum Covariance Determinant Estimator - a Fast Algorithm invented by Peter J.Rousseeuw and Katrien...
Definition TRobustEstimator.h:23

TRobustEstimator::fNvar
Int_t fNvar
Definition TRobustEstimator.h:27

TRobustEstimator::RDist
Int_t RDist(TMatrixD &sscp)
Calculates robust distances.Then the samples with robust distances greater than a cutoff value (0....
Definition TRobustEstimator.cxx:1172

TRobustEstimator::fCovariance
TMatrixDSym fCovariance
Definition TRobustEstimator.h:37

TRobustEstimator::CStep
Double_t CStep(Int_t ntotal, Int_t htotal, Int_t *index, TMatrixD &data, TMatrixD &sscp, Double_t *ndist)
from the input htotal-subset constructs another htotal subset with lower determinant
Definition TRobustEstimator.cxx:999

TRobustEstimator::GetCovariance
const TMatrixDSym * GetCovariance() const
Definition TRobustEstimator.h:92

TRobustEstimator::fInvcovariance
TMatrixDSym fInvcovariance
Definition TRobustEstimator.h:38

TRobustEstimator::CreateSubset
void CreateSubset(Int_t ntotal, Int_t htotal, Int_t p, Int_t *index, TMatrixD &data, TMatrixD &sscp, Double_t *ndist)
creates a subset of htotal elements from ntotal elements first, p+1 elements are drawn randomly(witho...
Definition TRobustEstimator.cxx:877

TRobustEstimator::Covar
void Covar(TMatrixD &sscp, TVectorD &m, TMatrixDSym &cov, TVectorD &sd, Int_t nvec)
calculates mean and covariance
Definition TRobustEstimator.cxx:826

TRobustEstimator::Exact
Int_t Exact(Double_t *ndist)
for the exact fit situations returns number of observations on the hyperplane
Definition TRobustEstimator.cxx:1036

TRobustEstimator::KOrdStat
Double_t KOrdStat(Int_t ntotal, Double_t *arr, Int_t k, Int_t *work)
because I need an Int_t work array
Definition TRobustEstimator.cxx:1267

TRobustEstimator::GetNOut
Int_t GetNOut()
returns the number of outliers
Definition TRobustEstimator.cxx:770

TRobustEstimator::TRobustEstimator
TRobustEstimator()
this constructor should be used in a univariate case: first call this constructor,...
Definition TRobustEstimator.cxx:124

TRobustEstimator::Correl
void Correl()
transforms covariance matrix into correlation matrix
Definition TRobustEstimator.cxx:849

TRobustEstimator::AddColumn
void AddColumn(Double_t *col)
adds a column to the data matrix it is assumed that the column has size fN variable fVarTemp keeps th...
Definition TRobustEstimator.cxx:170

TRobustEstimator::RDraw
void RDraw(Int_t *subdat, Int_t ngroup, Int_t *indsubdat)
Draws ngroup nonoverlapping subdatasets out of a dataset of size n such that the selected case number...
Definition TRobustEstimator.cxx:1235

TRobustEstimator::Classic
void Classic()
called when h=n.
Definition TRobustEstimator.cxx:808

TRobustEstimator::fN
Int_t fN
Definition TRobustEstimator.h:29

TRobustEstimator::Evaluate
void Evaluate()
Finds the estimate of multivariate mean and variance.
Definition TRobustEstimator.cxx:208

TRobustEstimator::fRd
TVectorD fRd
Definition TRobustEstimator.h:40

TRobustEstimator::GetMean
const TVectorD * GetMean() const
Definition TRobustEstimator.h:99

TRobustEstimator::Exact2
Int_t Exact2(TMatrixD &mstockbig, TMatrixD &cstockbig, TMatrixD &hyperplane, Double_t *deti, Int_t nbest, Int_t kgroup, TMatrixD &sscp, Double_t *ndist)
This function is called if determinant of the covariance matrix of a subset=0.
Definition TRobustEstimator.cxx:1071

TRobustEstimator::fSd
TVectorD fSd
Definition TRobustEstimator.h:41

TRobustEstimator::fMean
TVectorD fMean
Definition TRobustEstimator.h:36

TRobustEstimator::GetRDistances
const TVectorD * GetRDistances() const
Definition TRobustEstimator.h:101

TRobustEstimator::fOut
TArrayI fOut
Definition TRobustEstimator.h:42

TRobustEstimator::fVecTemp
Int_t fVecTemp
Definition TRobustEstimator.h:32

TRobustEstimator::fVarTemp
Int_t fVarTemp
Definition TRobustEstimator.h:31

TRobustEstimator::GetBDPoint
Int_t GetBDPoint()
returns the breakdown point of the algorithm
Definition TRobustEstimator.cxx:674

TRobustEstimator::fH
Int_t fH
Definition TRobustEstimator.h:28

TRobustEstimator::CreateOrtSubset
void CreateOrtSubset(TMatrixD &dat, Int_t *index, Int_t hmerged, Int_t nmerged, TMatrixD &sscp, Double_t *ndist)
creates a subset of hmerged vectors with smallest orthogonal distances to the hyperplane hyp[1]*(x1-m...
Definition TRobustEstimator.cxx:967

TRobustEstimator::fHyperplane
TVectorD fHyperplane
Definition TRobustEstimator.h:43

TRobustEstimator::AddRow
void AddRow(Double_t *row)
adds a vector to the data matrix it is supposed that the vector is of size fNvar
Definition TRobustEstimator.cxx:191

TRobustEstimator::fCorrelation
TMatrixDSym fCorrelation
Definition TRobustEstimator.h:39

TRobustEstimator::GetHyperplane
const TVectorD * GetHyperplane() const
if the points are on a hyperplane, returns this hyperplane
Definition TRobustEstimator.cxx:717

TRobustEstimator::Partition
Int_t Partition(Int_t nmini, Int_t *indsubdat)
divides the elements into approximately equal subgroups number of elements in each subgroup is stored...
Definition TRobustEstimator.cxx:1118

TRobustEstimator::fData
TMatrixD fData
Definition TRobustEstimator.h:46

TRobustEstimator::GetCorrelation
const TMatrixDSym * GetCorrelation() const
Definition TRobustEstimator.h:94

TRobustEstimator::GetChiQuant
Double_t GetChiQuant(Int_t i) const
returns the chi2 quantiles
Definition TRobustEstimator.cxx:684

TRobustEstimator::EvaluateUni
void EvaluateUni(Int_t nvectors, Double_t *data, Double_t &mean, Double_t &sigma, Int_t hh=0)
for the univariate case estimates of location and scatter are returned in mean and sigma parameters t...
Definition TRobustEstimator.cxx:608

TRobustEstimator::AddToSscp
void AddToSscp(TMatrixD &sscp, TVectorD &vec)
update the sscp matrix with vector vec
Definition TRobustEstimator.cxx:778

TRobustEstimator::ClearSscp
void ClearSscp(TMatrixD &sscp)
clear the sscp matrix, used for covariance and mean calculation
Definition TRobustEstimator.cxx:795

TRobustEstimator::fExact
Int_t fExact
Definition TRobustEstimator.h:34

TVectorT< Double_t >

TVectorT::ResizeTo
TVectorT< Element > & ResizeTo(Int_t lwb, Int_t upb)
Resize the vector to [lwb:upb] .
Definition TVectorT.cxx:294

TVectorT::GetNoElements
Int_t GetNoElements() const
Definition TVectorT.h:76

TVectorT::GetMatrixArray
Element * GetMatrixArray()
Definition TVectorT.h:78

bool

double

int

sigma
const Double_t sigma
Definition h1analysisProxy.h:11

n
const Int_t n
Definition legend1.C:16

TMath::LocMin
Long64_t LocMin(Long64_t n, const T *a)
Returns index of array with the minimum element.
Definition TMath.h:980

TMath::LocMax
Long64_t LocMax(Long64_t n, const T *a)
Returns index of array with the maximum element.
Definition TMath.h:1010

TMath::Sqrt
Double_t Sqrt(Double_t x)
Returns the square root of x.
Definition TMath.h:660

TMath::Min
Short_t Min(Short_t a, Short_t b)
Returns the smallest of a and b.
Definition TMathBase.h:198

TMath::Sort
void Sort(Index n, const Element *a, Index *index, Bool_t down=kTRUE)
Sort the n elements of the array a of generic templated type Element.
Definition TMathBase.h:431

TMath::Median
Double_t Median(Long64_t n, const T *a, const Double_t *w=0, Long64_t *work=0)
Same as RMS.
Definition TMath.h:1270

TMath::Abs
Short_t Abs(Short_t d)
Returns the absolute value of parameter Short_t d.
Definition TMathBase.h:123

vec
Definition civetweb.c:1856

m
TMarker m
Definition textangle.C:8

l
TLine l
Definition textangle.C:4

sum
static uint64_t sum(uint64_t i)
Definition Factory.cxx:2345