ROOT  6.07/01
Reference Guide
MethodFisher.cxx
1 // @(#)root/tmva $Id$
2 // Author: Andreas Hoecker, Xavier Prudent, Joerg Stelzer, Helge Voss, Kai Voss
3 
4 /**********************************************************************************
5  * Project: TMVA - a Root-integrated toolkit for multivariate Data analysis *
6  * Package: TMVA *
7  * Class : MethodFisher *
8  * Web : http://tmva.sourceforge.net *
9  * *
10  * Description: *
11  * Implementation (see header for description) *
12  * *
13  * Original author of this Fisher-Discriminant implementation: *
14  * Andre Gaidot, CEA-France; *
15  * (Translation from FORTRAN) *
16  * *
17  * Authors (alphabetical): *
18  * Andreas Hoecker <Andreas.Hocker@cern.ch> - CERN, Switzerland *
19  * Xavier Prudent <prudent@lapp.in2p3.fr> - LAPP, France *
20  * Helge Voss <Helge.Voss@cern.ch> - MPI-K Heidelberg, Germany *
21  * Kai Voss <Kai.Voss@cern.ch> - U. of Victoria, Canada *
22  * *
23  * Copyright (c) 2005: *
24  * CERN, Switzerland *
25  * U. of Victoria, Canada *
26  * MPI-K Heidelberg, Germany *
27  * LAPP, Annecy, France *
28  * *
29  * Redistribution and use in source and binary forms, with or without *
30  * modification, are permitted according to the terms listed in LICENSE *
31  * (http://tmva.sourceforge.net/LICENSE) *
32  **********************************************************************************/
33 
34 ////////////////////////////////////////////////////////////////////////////////
35 
36 /* Begin_Html
37  Fisher and Mahalanobis Discriminants (Linear Discriminant Analysis)
38 
39  <p>
40  In the method of Fisher discriminants, event selection is performed
41  in a transformed variable space with zero linear correlations, by
42  distinguishing the mean values of the signal and background
43  distributions.<br></p>
44 
45  <p>
46  The linear discriminant analysis determines an axis in the (correlated)
47  hyperspace of the input variables
48  such that, when projecting the output classes (signal and background)
49  upon this axis, they are pushed as far as possible away from each other,
50  while events of the same class are confined to a close vicinity.
51  The linearity property of this method is reflected in the metric with
52  which "far apart" and "close vicinity" are determined: the covariance
53  matrix of the discriminant variable space.
54  </p>
55 
56  <p>
57  The classification of the events in signal and background classes
58  relies on the following characteristics (only): overall sample means,
59  <i><my:o>x</my:o><sub>i</sub></i>, for each input variable, <i>i</i>,
60  class-specific sample means, <i><my:o>x</my:o><sub>S(B),i</sub></i>,
61  and total covariance matrix <i>T<sub>ij</sub></i>. The covariance matrix
62  can be decomposed into the sum of a <i>within-class</i> (<i>W<sub>ij</sub></i>)
63  and a <i>between-class</i> (<i>B<sub>ij</sub></i>) matrix. They describe
64  the dispersion of events relative to the means of their own class (within-class
65  matrix), and relative to the overall sample means (between-class matrix).
66  The Fisher coefficients, <i>F<sub>i</sub></i>, are then given by <br>
67  <center>
68  <img vspace=6 src="gif/tmva_fisherC.gif" align="bottom" >
69  </center>
70  where TMVA sets <i>N<sub>S</sub>=N<sub>B</sub></i>, so that the factor
71  in front of the sum simplifies to &frac12;.
72  The Fisher discriminant then reads<br>
73  <center>
74  <img vspace=6 src="gif/tmva_fisherD.gif" align="bottom" >
75  </center>
76  The offset <i>F</i><sub>0</sub> centers the sample mean of <i>x</i><sub>Fi</sub>
77  at zero. Instead of using the within-class matrix, the Mahalanobis variant
78  determines the Fisher coefficients as follows:<br>
79  <center>
80  <img vspace=6 src="gif/tmva_mahaC.gif" align="bottom" >
81  </center>
82  with resulting <i>x</i><sub>Ma</sub> that are very similar to the
83  <i>x</i><sub>Fi</sub>. <br></p>
84 
85  TMVA provides two outputs for the ranking of the input variables:<br>
86  <ul>
87  <li> <u>Fisher test:</u> the Fisher analysis aims at simultaneously maximising
88  the between-class separation, while minimising the within-class dispersion.
89  A useful measure of the discrimination power of a variable is hence given
90  by the diagonal quantity: <i>B<sub>ii</sub>/W<sub>ii</sub></i>.
91  </li>
92 
93  <li> <u>Discrimination power:</u> the value of the Fisher coefficient is a
94  measure of the discriminating power of a variable. The discrimination power
95  of a set of input variables can therefore be measured by the scalar
96  <center>
97  <img vspace=6 src="gif/tmva_discpower.gif" align="bottom" >
98  </center>
99  </li>
100  </ul>
101  The corresponding numbers are printed on standard output.
102  End_Html */
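The formulas above can be checked with a minimal numeric sketch, independent of TMVA. The toy samples and the helper name `toyFisherCoefficients` are invented for this illustration; it mirrors the coefficient formula F = xfact * InvW * (meanS - meanB) implemented below in GetFisherCoeff(), for two variables and unit event weights.

```cpp
#include <array>
#include <cmath>

// Toy illustration of the Fisher coefficient formula:
//   F_i = sqrt(N_S*N_B)/(N_S+N_B) * sum_j InvW_ij * (xbar_S,j - xbar_B,j)
// for two input variables, three events per class, unit weights.
std::array<double, 2> toyFisherCoefficients()
{
   const double sig[3][2] = {{ 1, 2}, { 2, 3}, { 3, 4}}; // signal events
   const double bgd[3][2] = {{-1, 0}, {-2, 1}, {-3, 2}}; // background events
   const int nS = 3, nB = 3;

   // class means xbar_S, xbar_B
   double mS[2] = {0, 0}, mB[2] = {0, 0};
   for (int i = 0; i < nS; i++)
      for (int v = 0; v < 2; v++) mS[v] += sig[i][v] / nS;
   for (int i = 0; i < nB; i++)
      for (int v = 0; v < 2; v++) mB[v] += bgd[i][v] / nB;

   // within-class matrix W: per-class scatter, each normalised to its sum of weights
   double W[2][2] = {{0, 0}, {0, 0}};
   for (int x = 0; x < 2; x++)
      for (int y = 0; y < 2; y++) {
         double s = 0, b = 0;
         for (int i = 0; i < nS; i++) s += (sig[i][x] - mS[x]) * (sig[i][y] - mS[y]);
         for (int i = 0; i < nB; i++) b += (bgd[i][x] - mB[x]) * (bgd[i][y] - mB[y]);
         W[x][y] = s / nS + b / nB;
      }

   // invert the 2x2 within-class matrix
   const double det = W[0][0] * W[1][1] - W[0][1] * W[1][0];
   const double inv[2][2] = {{ W[1][1] / det, -W[0][1] / det},
                             {-W[1][0] / det,  W[0][0] / det}};

   // Fisher coefficients; the prefactor equals 1/2 when N_S == N_B
   const double xfact = std::sqrt(double(nS) * nB) / (nS + nB);
   std::array<double, 2> F{};
   for (int i = 0; i < 2; i++) {
      for (int j = 0; j < 2; j++) F[i] += inv[i][j] * (mS[j] - mB[j]);
      F[i] *= xfact;
   }
   return F;
}
```

For this toy sample the class means are (2,3) and (-2,1), the within-class matrix is diagonal, and the coefficients come out to (1.5, 0.75); the dominant weight on the first variable reflects its larger mean separation.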
103 //_______________________________________________________________________
104 
105 #include "TMVA/MethodFisher.h"
106 
107 #include <iomanip>
108 #include <cassert>
109 
110 #include "TMath.h"
111 #include "TMatrix.h"
112 #include "Riostream.h"
113 
114 #include "TMVA/ClassifierFactory.h"
115 #include "TMVA/DataSet.h"
116 #include "TMVA/DataSetInfo.h"
117 #include "TMVA/Event.h"
118 #include "TMVA/MsgLogger.h"
119 #include "TMVA/Ranking.h"
120 #include "TMVA/Tools.h"
121 #include "TMVA/TransformationHandler.h"
122 #include "TMVA/Types.h"
123 #include "TMVA/VariableTransformBase.h"
124 
125 REGISTER_METHOD(Fisher)
126 
127 ClassImp(TMVA::MethodFisher);
128 
129 ////////////////////////////////////////////////////////////////////////////////
130 /// standard constructor for the "Fisher"
131 
132 TMVA::MethodFisher::MethodFisher( const TString& jobName,
133  const TString& methodTitle,
134  DataSetInfo& dsi,
135  const TString& theOption,
136  TDirectory* theTargetDir ) :
137  MethodBase( jobName, Types::kFisher, methodTitle, dsi, theOption, theTargetDir ),
138  fMeanMatx ( 0 ),
139  fTheMethod ( "Fisher" ),
140  fFisherMethod ( kFisher ),
141  fBetw ( 0 ),
142  fWith ( 0 ),
143  fCov ( 0 ),
144  fSumOfWeightsS( 0 ),
145  fSumOfWeightsB( 0 ),
146  fDiscrimPow ( 0 ),
147  fFisherCoeff ( 0 ),
148  fF0 ( 0 )
149 {
150 }
151 
152 ////////////////////////////////////////////////////////////////////////////////
153 /// constructor from weight file
154 
155 TMVA::MethodFisher::MethodFisher( DataSetInfo& dsi,
156  const TString& theWeightFile,
157  TDirectory* theTargetDir ) :
158  MethodBase( Types::kFisher, dsi, theWeightFile, theTargetDir ),
159  fMeanMatx ( 0 ),
160  fTheMethod ( "Fisher" ),
161  fFisherMethod ( kFisher ),
162  fBetw ( 0 ),
163  fWith ( 0 ),
164  fCov ( 0 ),
165  fSumOfWeightsS( 0 ),
166  fSumOfWeightsB( 0 ),
167  fDiscrimPow ( 0 ),
168  fFisherCoeff ( 0 ),
169  fF0 ( 0 )
170 {
171 }
172 
173 ////////////////////////////////////////////////////////////////////////////////
174 /// default initialization called by all constructors
175 
176 void TMVA::MethodFisher::Init( void )
177 {
178  // allocate Fisher coefficients
179  fFisherCoeff = new std::vector<Double_t>( GetNvar() );
180 
181  // the minimum requirement to declare an event signal-like
182  SetSignalReferenceCut( 0.0 );
183 
184  // this is the preparation for training
185  InitMatrices();
186 }
187 
188 ////////////////////////////////////////////////////////////////////////////////
189 ///
190 /// MethodFisher options:
191 /// format and syntax of option string: "type"
192 /// where type is "Fisher" or "Mahalanobis"
193 ///
194 
195 void TMVA::MethodFisher::DeclareOptions()
196 {
197  DeclareOptionRef( fTheMethod = "Fisher", "Method", "Discrimination method" );
198  AddPreDefVal(TString("Fisher"));
199  AddPreDefVal(TString("Mahalanobis"));
200 }
201 
202 ////////////////////////////////////////////////////////////////////////////////
203 /// process user options
204 
205 void TMVA::MethodFisher::ProcessOptions()
206 {
207  if (fTheMethod == "Fisher" ) fFisherMethod = kFisher;
208  else fFisherMethod = kMahalanobis;
209 
210  // this is the preparation for training
211  InitMatrices();
212 }
213 
214 ////////////////////////////////////////////////////////////////////////////////
215 /// destructor
216 
217 TMVA::MethodFisher::~MethodFisher( void )
218 {
219  if (fBetw ) { delete fBetw; fBetw = 0; }
220  if (fWith ) { delete fWith; fWith = 0; }
221  if (fCov ) { delete fCov; fCov = 0; }
222  if (fDiscrimPow ) { delete fDiscrimPow; fDiscrimPow = 0; }
223  if (fFisherCoeff) { delete fFisherCoeff; fFisherCoeff = 0; }
224 }
225 
226 ////////////////////////////////////////////////////////////////////////////////
227 /// Fisher can only handle classification with 2 classes
228 
229 Bool_t TMVA::MethodFisher::HasAnalysisType( Types::EAnalysisType type, UInt_t numberClasses, UInt_t /*numberTargets*/ )
230 {
231  if (type == Types::kClassification && numberClasses == 2) return kTRUE;
232  return kFALSE;
233 }
234 
235 ////////////////////////////////////////////////////////////////////////////////
236 /// computation of Fisher coefficients by series of matrix operations
237 
238 void TMVA::MethodFisher::Train( void )
239 {
240  // get mean value of each variable for signal, backgd and signal+backgd
241  GetMean();
242 
243  // get the matrix of covariance 'within class'
244  GetCov_WithinClass();
245 
246  // get the matrix of covariance 'between class'
247  GetCov_BetweenClass();
248 
249  // get the full covariance matrix
250  GetCov_Full();
251 
252  //--------------------------------------------------------------
253 
254  // get the Fisher coefficients
255  GetFisherCoeff();
256 
257  // get the discriminating power of each variable
258  GetDiscrimPower();
259 
260  // nice output
261  PrintCoefficients();
262 }
263 
264 ////////////////////////////////////////////////////////////////////////////////
265 /// returns the Fisher value (no fixed range)
266 
267 Double_t TMVA::MethodFisher::GetMvaValue( Double_t* err, Double_t* errUpper )
268 {
269  const Event * ev = GetEvent();
270  Double_t result = fF0;
271  for (UInt_t ivar=0; ivar<GetNvar(); ivar++)
272  result += (*fFisherCoeff)[ivar]*ev->GetValue(ivar);
273 
274  // cannot determine error
275  NoErrorCalc(err, errUpper);
276 
277  return result;
278 
279 }
280 
281 ////////////////////////////////////////////////////////////////////////////////
282 /// initialization method; creates global matrices and vectors
283 
284 void TMVA::MethodFisher::InitMatrices( void )
285 {
286  // average value of each variable for S, B, S+B
287  fMeanMatx = new TMatrixD( GetNvar(), 3 );
288 
289  // the covariance 'within class' and 'between class' matrices
290  fBetw = new TMatrixD( GetNvar(), GetNvar() );
291  fWith = new TMatrixD( GetNvar(), GetNvar() );
292  fCov = new TMatrixD( GetNvar(), GetNvar() );
293 
294  // discriminating power
295  fDiscrimPow = new std::vector<Double_t>( GetNvar() );
296 }
297 
298 ////////////////////////////////////////////////////////////////////////////////
299 /// compute mean values of variables in each sample, and the overall means
300 
301 void TMVA::MethodFisher::GetMean( void )
302 {
303  // initialize internal sum-of-weights variables
304  fSumOfWeightsS = 0;
305  fSumOfWeightsB = 0;
306 
307  const UInt_t nvar = DataInfo().GetNVariables();
308 
309  // init vectors
310  Double_t* sumS = new Double_t[nvar];
311  Double_t* sumB = new Double_t[nvar];
312  for (UInt_t ivar=0; ivar<nvar; ivar++) { sumS[ivar] = sumB[ivar] = 0; }
313 
314  // compute sample means
315  for (Int_t ievt=0; ievt<Data()->GetNEvents(); ievt++) {
316 
317  // read the Training Event into "event"
318  const Event * ev = GetEvent(ievt);
319 
320  // sum of weights
321  Double_t weight = ev->GetWeight();
322  if (DataInfo().IsSignal(ev)) fSumOfWeightsS += weight;
323  else fSumOfWeightsB += weight;
324 
325  Double_t* sum = DataInfo().IsSignal(ev) ? sumS : sumB;
326 
327  for (UInt_t ivar=0; ivar<nvar; ivar++) sum[ivar] += ev->GetValue( ivar )*weight;
328  }
329 
330  for (UInt_t ivar=0; ivar<nvar; ivar++) {
331  (*fMeanMatx)( ivar, 2 ) = sumS[ivar];
332  (*fMeanMatx)( ivar, 0 ) = sumS[ivar]/fSumOfWeightsS;
333 
334  (*fMeanMatx)( ivar, 2 ) += sumB[ivar];
335  (*fMeanMatx)( ivar, 1 ) = sumB[ivar]/fSumOfWeightsB;
336 
337  // signal + background
338  (*fMeanMatx)( ivar, 2 ) /= (fSumOfWeightsS + fSumOfWeightsB);
339  }
340 
341  // fMeanMatx->Print();
342  delete [] sumS;
343  delete [] sumB;
344 }
345 
346 ////////////////////////////////////////////////////////////////////////////////
347 /// the matrix of covariance 'within class' reflects the dispersion of the
348 /// events relative to the center of gravity of their own class
349 
350 void TMVA::MethodFisher::GetCov_WithinClass( void )
351 {
352  // assert required
353  assert( fSumOfWeightsS > 0 && fSumOfWeightsB > 0 );
354 
355  // product matrices (x-<x>)(y-<y>) where x;y are variables
356 
357  // init
358  const Int_t nvar = GetNvar();
359  const Int_t nvar2 = nvar*nvar;
360  Double_t *sumSig = new Double_t[nvar2];
361  Double_t *sumBgd = new Double_t[nvar2];
362  Double_t *xval = new Double_t[nvar];
363  memset(sumSig,0,nvar2*sizeof(Double_t));
364  memset(sumBgd,0,nvar2*sizeof(Double_t));
365 
366  // 'within class' covariance
367  for (Int_t ievt=0; ievt<Data()->GetNEvents(); ievt++) {
368 
369  // read the Training Event into "event"
370  const Event* ev = GetEvent(ievt);
371 
372  Double_t weight = ev->GetWeight(); // may ignore events with negative weights
373 
374  for (Int_t x=0; x<nvar; x++) xval[x] = ev->GetValue( x );
375  Int_t k=0;
376  for (Int_t x=0; x<nvar; x++) {
377  for (Int_t y=0; y<nvar; y++) {
378  if (DataInfo().IsSignal(ev)) {
379  Double_t v = ( (xval[x] - (*fMeanMatx)(x, 0))*(xval[y] - (*fMeanMatx)(y, 0)) )*weight;
380  sumSig[k] += v;
381  }else{
382  Double_t v = ( (xval[x] - (*fMeanMatx)(x, 1))*(xval[y] - (*fMeanMatx)(y, 1)) )*weight;
383  sumBgd[k] += v;
384  }
385  k++;
386  }
387  }
388  }
389  Int_t k=0;
390  for (Int_t x=0; x<nvar; x++) {
391  for (Int_t y=0; y<nvar; y++) {
392  //(*fWith)(x, y) = (sumSig[k] + sumBgd[k])/(fSumOfWeightsS + fSumOfWeightsB);
393  // HHV: I am still convinced that THIS is how it should be (below) However, while
394  // the old version corresponded so nicely with LD, the FIXED version does not, unless
395  // we agree to change LD. For LD, it is not "defined" to my knowledge how the weights
396  // are weighted, while it is clear how the "Within" matrix for Fisher should be calculated
397  // (i.e. as seen below). In order to agree with the Fisher classifier, one would have to
398  // weigh signal and background such that they correspond to the same number of effective
399  // (weighted) events.
400  // THAT is NOT done currently, but just "event weights" are used.
401  (*fWith)(x, y) = sumSig[k]/fSumOfWeightsS + sumBgd[k]/fSumOfWeightsB;
402  k++;
403  }
404  }
405 
406  delete [] sumSig;
407  delete [] sumBgd;
408  delete [] xval;
409 }
410 
411 ////////////////////////////////////////////////////////////////////////////////
412 /// the matrix of covariance 'between class' reflects the dispersion of the
413 /// events of a class relative to the global center of gravity of all the classes,
414 /// hence the separation between classes
415 
416 void TMVA::MethodFisher::GetCov_BetweenClass( void )
417 {
418  // assert required
419  assert( fSumOfWeightsS > 0 && fSumOfWeightsB > 0);
420 
421  Double_t prodSig, prodBgd;
422 
423  for (UInt_t x=0; x<GetNvar(); x++) {
424  for (UInt_t y=0; y<GetNvar(); y++) {
425 
426  prodSig = ( ((*fMeanMatx)(x, 0) - (*fMeanMatx)(x, 2))*
427  ((*fMeanMatx)(y, 0) - (*fMeanMatx)(y, 2)) );
428  prodBgd = ( ((*fMeanMatx)(x, 1) - (*fMeanMatx)(x, 2))*
429  ((*fMeanMatx)(y, 1) - (*fMeanMatx)(y, 2)) );
430 
431  (*fBetw)(x, y) = (fSumOfWeightsS*prodSig + fSumOfWeightsB*prodBgd) / (fSumOfWeightsS + fSumOfWeightsB);
432  }
433  }
434 }
435 
436 ////////////////////////////////////////////////////////////////////////////////
437 /// compute full covariance matrix from sum of within and between matrices
438 
439 void TMVA::MethodFisher::GetCov_Full( void )
440 {
441  for (UInt_t x=0; x<GetNvar(); x++)
442  for (UInt_t y=0; y<GetNvar(); y++)
443  (*fCov)(x, y) = (*fWith)(x, y) + (*fBetw)(x, y);
444 }
445 
446 ////////////////////////////////////////////////////////////////////////////////
447 /// Fisher = Sum { [coeff]*[variables] }
448 ///
449 /// let Xs be the array of the mean values of variables for signal evts
450 /// let Xb be the array of the mean values of variables for backgd evts
451 /// let InvWith be the inverse of the 'within class' covariance matrix
452 ///
453 /// then the array of Fisher coefficients is
454 /// [coeff] = sqrt(fNsig*fNbgd)/fNevt * transpose{Xs-Xb} * InvWith
455 
456 void TMVA::MethodFisher::GetFisherCoeff( void )
457 {
458  // assert required
459  assert( fSumOfWeightsS > 0 && fSumOfWeightsB > 0);
460 
461  // invert covariance matrix
462  TMatrixD* theMat = 0;
463  switch (GetFisherMethod()) {
464  case kFisher:
465  theMat = fWith;
466  break;
467  case kMahalanobis:
468  theMat = fCov;
469  break;
470  default:
471  Log() << kFATAL << "<GetFisherCoeff> undefined method" << GetFisherMethod() << Endl;
472  }
473 
474  TMatrixD invCov( *theMat );
475 
476  if ( TMath::Abs(invCov.Determinant()) < 10E-24 ) {
477  Log() << kWARNING << "<GetFisherCoeff> matrix is almost singular with determinant="
478  << TMath::Abs(invCov.Determinant())
479  << " did you use variables that are linear combinations or highly correlated?"
480  << Endl;
481  }
482  if ( TMath::Abs(invCov.Determinant()) < 10E-120 ) {
483  theMat->Print();
484  Log() << kFATAL << "<GetFisherCoeff> matrix is singular with determinant="
485  << TMath::Abs(invCov.Determinant())
486  << " did you use variables that are linear combinations? \n"
487  << " do you have any clue as to what went wrong in the above printout of the covariance matrix? "
488  << Endl;
489  }
490 
491  invCov.Invert();
492 
493  // apply rescaling factor
494  Double_t xfact = TMath::Sqrt( fSumOfWeightsS*fSumOfWeightsB ) / (fSumOfWeightsS + fSumOfWeightsB);
495 
496  // compute difference of mean values
497  std::vector<Double_t> diffMeans( GetNvar() );
498  UInt_t ivar, jvar;
499  for (ivar=0; ivar<GetNvar(); ivar++) {
500  (*fFisherCoeff)[ivar] = 0;
501 
502  for (jvar=0; jvar<GetNvar(); jvar++) {
503  Double_t d = (*fMeanMatx)(jvar, 0) - (*fMeanMatx)(jvar, 1);
504  (*fFisherCoeff)[ivar] += invCov(ivar, jvar)*d;
505  }
506  // rescale
507  (*fFisherCoeff)[ivar] *= xfact;
508  }
509 
510 
511  // offset correction
512  fF0 = 0.0;
513  for (ivar=0; ivar<GetNvar(); ivar++){
514  fF0 += (*fFisherCoeff)[ivar]*((*fMeanMatx)(ivar, 0) + (*fMeanMatx)(ivar, 1));
515  }
516  fF0 /= -2.0;
517 }
518 
519 ////////////////////////////////////////////////////////////////////////////////
520 /// computation of discrimination power indicator for each variable
521 /// small values of "fWith" indicate little compactness of sig & of backgd
522 /// big values of "fBetw" indicate large separation between sig & backgd
523 ///
524 /// we want signal & backgd classes as compact and separated as possible
525 /// the discriminating power is then defined as the ratio "fBetw/fWith"
526 
527 void TMVA::MethodFisher::GetDiscrimPower( void )
528 {
529  for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
530  if ((*fCov)(ivar, ivar) != 0)
531  (*fDiscrimPow)[ivar] = (*fBetw)(ivar, ivar)/(*fCov)(ivar, ivar);
532  else
533  (*fDiscrimPow)[ivar] = 0;
534  }
535 }
536 
537 ////////////////////////////////////////////////////////////////////////////////
538 /// computes ranking of input variables
539 
540 const TMVA::Ranking* TMVA::MethodFisher::CreateRanking()
541 {
542  // create the ranking object
543  fRanking = new Ranking( GetName(), "Discr. power" );
544 
545  for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
546  fRanking->AddRank( Rank( GetInputLabel(ivar), (*fDiscrimPow)[ivar] ) );
547  }
548 
549  return fRanking;
550 }
551 
552 ////////////////////////////////////////////////////////////////////////////////
553 /// display Fisher coefficients and discriminating power for each variable
554 /// check maximum length of variable name
555 
556 void TMVA::MethodFisher::PrintCoefficients( void )
557 {
558  Log() << kINFO << "Results for Fisher coefficients:" << Endl;
559 
560  if (GetTransformationHandler().GetTransformationList().GetSize() != 0) {
561  Log() << kINFO << "NOTE: The coefficients must be applied to TRANSFORMED variables" << Endl;
562  Log() << kINFO << " List of the transformations: " << Endl;
563  TListIter trIt(&GetTransformationHandler().GetTransformationList());
564  while (VariableTransformBase *trf = (VariableTransformBase*) trIt()) {
565  Log() << kINFO << " -- " << trf->GetName() << Endl;
566  }
567  }
568  std::vector<TString> vars;
569  std::vector<Double_t> coeffs;
570  for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
571  vars .push_back( GetInputLabel(ivar) );
572  coeffs.push_back( (*fFisherCoeff)[ivar] );
573  }
574  vars .push_back( "(offset)" );
575  coeffs.push_back( fF0 );
576  TMVA::gTools().FormattedOutput( coeffs, vars, "Variable" , "Coefficient", Log() );
577 
578  // for (int i=0; i<coeffs.size(); i++)
579  // std::cout << "fisher coeff["<<i<<"]="<<coeffs[i]<<std::endl;
580 
581  if (IsNormalised()) {
582  Log() << kINFO << "NOTE: You have chosen to use the \"Normalise\" booking option. Hence, the" << Endl;
583  Log() << kINFO << " coefficients must be applied to NORMALISED (') variables as follows:" << Endl;
584  Int_t maxL = 0;
585  for (UInt_t ivar=0; ivar<GetNvar(); ivar++) if (GetInputLabel(ivar).Length() > maxL) maxL = GetInputLabel(ivar).Length();
586 
587  // Print normalisation expression (see Tools.cxx): "2*(x - xmin)/(xmax - xmin) - 1.0"
588  for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
589  Log() << kINFO
590  << std::setw(maxL+9) << TString("[") + GetInputLabel(ivar) + "]' = 2*("
591  << std::setw(maxL+2) << TString("[") + GetInputLabel(ivar) + "]"
592  << std::setw(3) << (GetXmin(ivar) > 0 ? " - " : " + ")
593  << std::setw(6) << TMath::Abs(GetXmin(ivar)) << std::setw(3) << ")/"
594  << std::setw(6) << (GetXmax(ivar) - GetXmin(ivar) )
595  << std::setw(3) << " - 1"
596  << Endl;
597  }
598  Log() << kINFO << "The TMVA Reader will properly account for this normalisation, but if the" << Endl;
599  Log() << kINFO << "Fisher classifier is applied outside the Reader, the transformation must be" << Endl;
600  Log() << kINFO << "implemented -- or the \"Normalise\" option is removed and Fisher retrained." << Endl;
601  Log() << kINFO << Endl;
602  }
603 }
604 
605 ////////////////////////////////////////////////////////////////////////////////
606 /// read Fisher coefficients from weight file
607 
608 void TMVA::MethodFisher::ReadWeightsFromStream( std::istream& istr )
609 {
610  istr >> fF0;
611  for (UInt_t ivar=0; ivar<GetNvar(); ivar++) istr >> (*fFisherCoeff)[ivar];
612 }
613 
614 ////////////////////////////////////////////////////////////////////////////////
615 /// create XML description of Fisher classifier
616 
617 void TMVA::MethodFisher::AddWeightsXMLTo( void* parent ) const
618 {
619  void* wght = gTools().AddChild(parent, "Weights");
620  gTools().AddAttr( wght, "NCoeff", GetNvar()+1 );
621  void* coeffxml = gTools().AddChild(wght, "Coefficient");
622  gTools().AddAttr( coeffxml, "Index", 0 );
623  gTools().AddAttr( coeffxml, "Value", fF0 );
624  for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
625  coeffxml = gTools().AddChild( wght, "Coefficient" );
626  gTools().AddAttr( coeffxml, "Index", ivar+1 );
627  gTools().AddAttr( coeffxml, "Value", (*fFisherCoeff)[ivar] );
628  }
629 }
630 
631 ////////////////////////////////////////////////////////////////////////////////
632 /// read Fisher coefficients from xml weight file
633 
634 void TMVA::MethodFisher::ReadWeightsFromXML( void* wghtnode )
635 {
636  UInt_t ncoeff, coeffidx;
637  gTools().ReadAttr( wghtnode, "NCoeff", ncoeff );
638  fFisherCoeff->resize(ncoeff-1);
639 
640  void* ch = gTools().GetChild(wghtnode);
641  Double_t coeff;
642  while (ch) {
643  gTools().ReadAttr( ch, "Index", coeffidx );
644  gTools().ReadAttr( ch, "Value", coeff );
645  if (coeffidx==0) fF0 = coeff;
646  else (*fFisherCoeff)[coeffidx-1] = coeff;
647  ch = gTools().GetNextChild(ch);
648  }
649 }
650 
651 ////////////////////////////////////////////////////////////////////////////////
652 /// write Fisher-specific classifier response
653 
654 void TMVA::MethodFisher::MakeClassSpecific( std::ostream& fout, const TString& className ) const
655 {
656  Int_t dp = fout.precision();
657  fout << " double fFisher0;" << std::endl;
658  fout << " std::vector<double> fFisherCoefficients;" << std::endl;
659  fout << "};" << std::endl;
660  fout << "" << std::endl;
661  fout << "inline void " << className << "::Initialize() " << std::endl;
662  fout << "{" << std::endl;
663  fout << " fFisher0 = " << std::setprecision(12) << fF0 << ";" << std::endl;
664  for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
665  fout << " fFisherCoefficients.push_back( " << std::setprecision(12) << (*fFisherCoeff)[ivar] << " );" << std::endl;
666  }
667  fout << std::endl;
668  fout << " // sanity check" << std::endl;
669  fout << " if (fFisherCoefficients.size() != fNvars) {" << std::endl;
670  fout << " std::cout << \"Problem in class \\\"\" << fClassName << \"\\\"::Initialize: mismatch in number of input values\"" << std::endl;
671  fout << " << fFisherCoefficients.size() << \" != \" << fNvars << std::endl;" << std::endl;
672  fout << " fStatusIsClean = false;" << std::endl;
673  fout << " } " << std::endl;
674  fout << "}" << std::endl;
675  fout << std::endl;
676  fout << "inline double " << className << "::GetMvaValue__( const std::vector<double>& inputValues ) const" << std::endl;
677  fout << "{" << std::endl;
678  fout << " double retval = fFisher0;" << std::endl;
679  fout << " for (size_t ivar = 0; ivar < fNvars; ivar++) {" << std::endl;
680  fout << " retval += fFisherCoefficients[ivar]*inputValues[ivar];" << std::endl;
681  fout << " }" << std::endl;
682  fout << std::endl;
683  fout << " return retval;" << std::endl;
684  fout << "}" << std::endl;
685  fout << std::endl;
686  fout << "// Clean up" << std::endl;
687  fout << "inline void " << className << "::Clear() " << std::endl;
688  fout << "{" << std::endl;
689  fout << " // clear coefficients" << std::endl;
690  fout << " fFisherCoefficients.clear(); " << std::endl;
691  fout << "}" << std::endl;
692  fout << std::setprecision(dp);
693 }
694 
695 ////////////////////////////////////////////////////////////////////////////////
696 /// get help message text
697 ///
698 /// typical length of text line:
699 /// "|--------------------------------------------------------------|"
700 
701 void TMVA::MethodFisher::GetHelpMessage() const
702 {
703  Log() << Endl;
704  Log() << gTools().Color("bold") << "--- Short description:" << gTools().Color("reset") << Endl;
705  Log() << Endl;
706  Log() << "Fisher discriminants select events by distinguishing the mean " << Endl;
707  Log() << "values of the signal and background distributions in a trans- " << Endl;
708  Log() << "formed variable space where linear correlations are removed." << Endl;
709  Log() << Endl;
710  Log() << " (More precisely: the \"linear discriminator\" determines" << Endl;
711  Log() << " an axis in the (correlated) hyperspace of the input " << Endl;
712  Log() << " variables such that, when projecting the output classes " << Endl;
713  Log() << " (signal and background) upon this axis, they are pushed " << Endl;
714  Log() << " as far as possible away from each other, while events" << Endl;
715  Log() << " of a same class are confined in a close vicinity. The " << Endl;
716  Log() << " linearity property of this classifier is reflected in the " << Endl;
717  Log() << " metric with which \"far apart\" and \"close vicinity\" are " << Endl;
718  Log() << " determined: the covariance matrix of the discriminating" << Endl;
719  Log() << " variable space.)" << Endl;
720  Log() << Endl;
721  Log() << gTools().Color("bold") << "--- Performance optimisation:" << gTools().Color("reset") << Endl;
722  Log() << Endl;
723  Log() << "Optimal performance for Fisher discriminants is obtained for " << Endl;
724  Log() << "linearly correlated Gaussian-distributed variables. Any deviation" << Endl;
725  Log() << "from this ideal reduces the achievable separation power. In " << Endl;
726  Log() << "particular, no discrimination at all is achieved for a variable" << Endl;
727  Log() << "that has the same sample mean for signal and background, even if " << Endl;
728  Log() << "the shapes of the distributions are very different. Thus, Fisher " << Endl;
729  Log() << "discriminants often benefit from suitable transformations of the " << Endl;
730  Log() << "input variables. For example, if a variable x in [-1,1] has a " << Endl;
731  Log() << "parabolic signal distribution and a uniform background " << Endl;
732  Log() << "distribution, the mean value is zero in both cases, leading " << Endl;
733  Log() << "to no separation. The simple transformation x -> |x| renders this " << Endl;
734  Log() << "variable powerful for use in a Fisher discriminant." << Endl;
735  Log() << Endl;
736  Log() << gTools().Color("bold") << "--- Performance tuning via configuration options:" << gTools().Color("reset") << Endl;
737  Log() << Endl;
738  Log() << "<None>" << Endl;
739 }