doc/v608/MethodFisher_8cxx_source.html

 // @(#)root/tmva $Id$
 // Author: Andreas Hoecker, Xavier Prudent, Joerg Stelzer, Helge Voss, Kai Voss

 /**********************************************************************************
  * Project: TMVA - a Root-integrated toolkit for multivariate Data analysis       *
  * Package: TMVA                                                                  *
  * Class  : MethodFisher                                                          *
  * Web    : http://tmva.sourceforge.net                                           *
  *                                                                                *
  * Description:                                                                   *
  *      Implementation (see header for description)                               *
  *                                                                                *
  * Original author of this Fisher-Discriminant implementation:                    *
  *      Andre Gaidot, CEA-France;                                                 *
  *      (Translation from FORTRAN)                                                *
  *                                                                                *
  * Authors (alphabetical):                                                        *
  *      Andreas Hoecker <Andreas.Hocker@cern.ch> - CERN, Switzerland              *
  *      Xavier Prudent  <prudent@lapp.in2p3.fr>  - LAPP, France                   *
  *      Helge Voss      <Helge.Voss@cern.ch>     - MPI-K Heidelberg, Germany      *
  *      Kai Voss        <Kai.Voss@cern.ch>       - U. of Victoria, Canada         *
  *                                                                                *
  * Copyright (c) 2005:                                                            *
  *      CERN, Switzerland                                                         *
  *      U. of Victoria, Canada                                                    *
  *      MPI-K Heidelberg, Germany                                                 *
  *      LAPP, Annecy, France                                                      *
  *                                                                                *
  * Redistribution and use in source and binary forms, with or without             *
  * modification, are permitted according to the terms listed in LICENSE           *
  * (http://tmva.sourceforge.net/LICENSE)                                          *
  **********************************************************************************/

 ////////////////////////////////////////////////////////////////////////////////

 /* Begin_Html
    Fisher and Mahalanobis Discriminants (Linear Discriminant Analysis)

    <p>
    In the method of Fisher discriminants event selection is performed
    in a transformed variable space with zero linear correlations, by
    distinguishing the mean values of the signal and background
    distributions.<br></p>

    <p>
    The linear discriminant analysis determines an axis in the (correlated)
    hyperspace of the input variables
    such that, when projecting the output classes (signal and background)
    upon this axis, they are pushed as far as possible away from each other,
    while events of a same class are confined in a close vicinity.
    The linearity property of this method is reflected in the metric with
    which "far apart" and "close vicinity" are determined: the covariance
    matrix of the discriminant variable space.
    </p>

    <p>
    The classification of the events in signal and background classes
    relies on the following characteristics (only): overall sample means,
    <i><my:o>x</my:o><sub>i</sub></i>, for each input variable, <i>i</i>,
    class-specific sample means, <i><my:o>x</my:o><sub>S(B),i</sub></i>,
    and total covariance matrix <i>T<sub>ij</sub></i>. The covariance matrix
    can be decomposed into the sum of a <i>within-</i> (<i>W<sub>ij</sub></i>)
    and a <i>between-class</i> (<i>B<sub>ij</sub></i>) class matrix. They describe
    the dispersion of events relative to the means of their own class (within-class
    matrix), and relative to the overall sample means (between-class matrix).
    The Fisher coefficients, <i>F<sub>i</sub></i>, are then given by <br>
    <center>
    <img vspace=6 src="gif/tmva_fisherC.gif" align="bottom" >
    </center>
    where in TMVA is set <i>N<sub>S</sub>=N<sub>B</sub></i>, so that the factor
    in front of the sum simplifies to &frac12;.
    The Fisher discriminant then reads<br>
    <center>
    <img vspace=6 src="gif/tmva_fisherD.gif" align="bottom" >
    </center>
    The offset <i>F</i><sub>0</sub> centers the sample mean of <i>x</i><sub>Fi</sub>
    at zero. Instead of using the within-class matrix, the Mahalanobis variant
    determines the Fisher coefficients as follows:<br>
    <center>
    <img vspace=6 src="gif/tmva_mahaC.gif" align="bottom" >
    </center>
    with resulting <i>x</i><sub>Ma</sub> that are very similar to the
    <i>x</i><sub>Fi</sub>. <br></p>

    TMVA provides two outputs for the ranking of the input variables:<br><p></p>
    <ul>
    <li> <u>Fisher test:</u> the Fisher analysis aims at simultaneously maximising
    the between-class separation, while minimising the within-class dispersion.
    A useful measure of the discrimination power of a variable is hence given
    by the diagonal quantity: <i>B<sub>ii</sub>/W<sub>ii</sub></i>.
    </li>

    <li> <u>Discrimination power:</u> the value of the Fisher coefficient is a
    measure of the discriminating power of a variable. The discrimination power
    of set of input variables can therefore be measured by the scalar
    <center>
    <img vspace=6 src="gif/tmva_discpower.gif" align="bottom" >
    </center>
    </li>
    </ul>
    The corresponding numbers are printed on standard output.
    End_Html */
 //_______________________________________________________________________

 #include "TMVA/MethodFisher.h"

 #include "TMVA/ClassifierFactory.h"
 #include "TMVA/Configurable.h"
 #include "TMVA/DataSet.h"
 #include "TMVA/DataSetInfo.h"
 #include "TMVA/Event.h"
 #include "TMVA/IMethod.h"
 #include "TMVA/MethodBase.h"
 #include "TMVA/MsgLogger.h"
 #include "TMVA/Ranking.h"
 #include "TMVA/Tools.h"
 #include "TMVA/TransformationHandler.h"
 #include "TMVA/Types.h"
 #include "TMVA/VariableTransformBase.h"

 #include "TMath.h"
 #include "TMatrix.h"
 #include "TList.h"
 #include "Riostream.h"

 #include <iomanip>
 #include <cassert>

 REGISTER_METHOD(Fisher)

 ClassImp(TMVA::MethodFisher);

 ////////////////////////////////////////////////////////////////////////////////
 /// standard constructor for the "Fisher"

 TMVA::MethodFisher::MethodFisher( const TString& jobName,
                                   const TString& methodTitle,
                                   DataSetInfo& dsi,
                                   const TString& theOption ) :
    MethodBase( jobName, Types::kFisher, methodTitle, dsi, theOption),
    fMeanMatx     ( 0 ),
    fTheMethod    ( "Fisher" ),
    fFisherMethod ( kFisher ),
    fBetw         ( 0 ),
    fWith         ( 0 ),
    fCov          ( 0 ),
    fSumOfWeightsS( 0 ),
    fSumOfWeightsB( 0 ),
    fDiscrimPow   ( 0 ),
    fFisherCoeff  ( 0 ),
    fF0           ( 0 )
 {
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// constructor from weight file

 TMVA::MethodFisher::MethodFisher( DataSetInfo& dsi,
                                   const TString& theWeightFile) :
    MethodBase( Types::kFisher, dsi, theWeightFile),
    fMeanMatx     ( 0 ),
    fTheMethod    ( "Fisher" ),
    fFisherMethod ( kFisher ),
    fBetw         ( 0 ),
    fWith         ( 0 ),
    fCov          ( 0 ),
    fSumOfWeightsS( 0 ),
    fSumOfWeightsB( 0 ),
    fDiscrimPow   ( 0 ),
    fFisherCoeff  ( 0 ),
    fF0           ( 0 )
 {
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// default initialization called by all constructors

 void TMVA::MethodFisher::Init( void )
 {
    // allocate Fisher coefficients
    fFisherCoeff = new std::vector<Double_t>( GetNvar() );

    // the minimum requirement to declare an event signal-like
    SetSignalReferenceCut( 0.0 );

    // this is the preparation for training
    InitMatrices();
 }

 ////////////////////////////////////////////////////////////////////////////////
 ///
 /// MethodFisher options:
 /// format and syntax of option string: "type"
 /// where type is "Fisher" or "Mahalanobis"
 ///

 void TMVA::MethodFisher::DeclareOptions()
 {
    DeclareOptionRef( fTheMethod = "Fisher", "Method", "Discrimination method" );
    AddPreDefVal(TString("Fisher"));
    AddPreDefVal(TString("Mahalanobis"));
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// process user options

 void TMVA::MethodFisher::ProcessOptions()
 {
    if (fTheMethod ==  "Fisher" ) fFisherMethod = kFisher;
    else                          fFisherMethod = kMahalanobis;

    // this is the preparation for training
    InitMatrices();
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// destructor

 TMVA::MethodFisher::~MethodFisher( void )
 {
    if (fBetw       ) { delete fBetw; fBetw = 0; }
    if (fWith       ) { delete fWith; fWith = 0; }
    if (fCov        ) { delete fCov;  fCov = 0; }
    if (fDiscrimPow ) { delete fDiscrimPow; fDiscrimPow = 0; }
    if (fFisherCoeff) { delete fFisherCoeff; fFisherCoeff = 0; }
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// Fisher can only handle classification with 2 classes

 Bool_t TMVA::MethodFisher::HasAnalysisType( Types::EAnalysisType type, UInt_t numberClasses, UInt_t /*numberTargets*/ )
 {
    if (type == Types::kClassification && numberClasses == 2) return kTRUE;
    return kFALSE;
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// computation of Fisher coefficients by series of matrix operations

 void TMVA::MethodFisher::Train( void )
 {
    // get mean value of each variables for signal, backgd and signal+backgd
    GetMean();

    // get the matrix of covariance 'within class'
    GetCov_WithinClass();

    // get the matrix of covariance 'between class'
    GetCov_BetweenClass();

    // get the matrix of covariance 'between class'
    GetCov_Full();

    //--------------------------------------------------------------

    // get the Fisher coefficients
    GetFisherCoeff();

    // get the discriminating power of each variables
    GetDiscrimPower();

    // nice output
    PrintCoefficients();

    ExitFromTraining();
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// returns the Fisher value (no fixed range)

 Double_t TMVA::MethodFisher::GetMvaValue( Double_t* err, Double_t* errUpper )
 {
    const Event * ev = GetEvent();
    Double_t result = fF0;
    for (UInt_t ivar=0; ivar<GetNvar(); ivar++)
       result += (*fFisherCoeff)[ivar]*ev->GetValue(ivar);

    // cannot determine error
    NoErrorCalc(err, errUpper);

    return result;

 }

 ////////////////////////////////////////////////////////////////////////////////
 /// initializaton method; creates global matrices and vectors

 void TMVA::MethodFisher::InitMatrices( void )
 {
    // average value of each variables for S, B, S+B
    fMeanMatx = new TMatrixD( GetNvar(), 3 );

    // the covariance 'within class' and 'between class' matrices
    fBetw = new TMatrixD( GetNvar(), GetNvar() );
    fWith = new TMatrixD( GetNvar(), GetNvar() );
    fCov  = new TMatrixD( GetNvar(), GetNvar() );

    // discriminating power
    fDiscrimPow = new std::vector<Double_t>( GetNvar() );
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// compute mean values of variables in each sample, and the overall means

 void TMVA::MethodFisher::GetMean( void )
 {
    // initialize internal sum-of-weights variables
    fSumOfWeightsS = 0;
    fSumOfWeightsB = 0;

    const UInt_t nvar = DataInfo().GetNVariables();

    // init vectors
    Double_t* sumS = new Double_t[nvar];
    Double_t* sumB = new Double_t[nvar];
    for (UInt_t ivar=0; ivar<nvar; ivar++) { sumS[ivar] = sumB[ivar] = 0; }

    // compute sample means
    for (Int_t ievt=0; ievt<Data()->GetNEvents(); ievt++) {

       // read the Training Event into "event"
       const Event * ev = GetEvent(ievt);

       // sum of weights
       Double_t weight = ev->GetWeight();
       if (DataInfo().IsSignal(ev)) fSumOfWeightsS += weight;
       else                         fSumOfWeightsB += weight;

       Double_t* sum = DataInfo().IsSignal(ev) ? sumS : sumB;

       for (UInt_t ivar=0; ivar<nvar; ivar++) sum[ivar] += ev->GetValue( ivar )*weight;
    }

    for (UInt_t ivar=0; ivar<nvar; ivar++) {
       (*fMeanMatx)( ivar, 2 ) = sumS[ivar];
       (*fMeanMatx)( ivar, 0 ) = sumS[ivar]/fSumOfWeightsS;

       (*fMeanMatx)( ivar, 2 ) += sumB[ivar];
       (*fMeanMatx)( ivar, 1 ) = sumB[ivar]/fSumOfWeightsB;

       // signal + background
       (*fMeanMatx)( ivar, 2 ) /= (fSumOfWeightsS + fSumOfWeightsB);
    }

    //   fMeanMatx->Print();
    delete [] sumS;
    delete [] sumB;
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// the matrix of covariance 'within class' reflects the dispersion of the
 /// events relative to the center of gravity of their own class

 void TMVA::MethodFisher::GetCov_WithinClass( void )
 {
    // assert required
    assert( fSumOfWeightsS > 0 && fSumOfWeightsB > 0 );

    // product matrices (x-<x>)(y-<y>) where x;y are variables

    // init
    const Int_t nvar  = GetNvar();
    const Int_t nvar2 = nvar*nvar;
    Double_t *sumSig  = new Double_t[nvar2];
    Double_t *sumBgd  = new Double_t[nvar2];
    Double_t *xval    = new Double_t[nvar];
    memset(sumSig,0,nvar2*sizeof(Double_t));
    memset(sumBgd,0,nvar2*sizeof(Double_t));

    // 'within class' covariance
    for (Int_t ievt=0; ievt<Data()->GetNEvents(); ievt++) {

       // read the Training Event into "event"
       const Event* ev = GetEvent(ievt);

       Double_t weight = ev->GetWeight(); // may ignore events with negative weights

       for (Int_t x=0; x<nvar; x++) xval[x] = ev->GetValue( x );
       Int_t k=0;
       for (Int_t x=0; x<nvar; x++) {
          for (Int_t y=0; y<nvar; y++) {
             if (DataInfo().IsSignal(ev)) {
                Double_t v = ( (xval[x] - (*fMeanMatx)(x, 0))*(xval[y] - (*fMeanMatx)(y, 0)) )*weight;
                sumSig[k] += v;
             }else{
                Double_t v = ( (xval[x] - (*fMeanMatx)(x, 1))*(xval[y] - (*fMeanMatx)(y, 1)) )*weight;
                sumBgd[k] += v;
             }
             k++;
          }
       }
    }
    Int_t k=0;
    for (Int_t x=0; x<nvar; x++) {
       for (Int_t y=0; y<nvar; y++) {
          //(*fWith)(x, y) = (sumSig[k] + sumBgd[k])/(fSumOfWeightsS + fSumOfWeightsB);
          // HHV: I am still convinced that THIS is how it should be (below) However, while
          // the old version corresponded so nicely with LD, the FIXED version does not, unless
          // we agree to change LD. For LD, it is not "defined" to my knowledge how the weights
          // are weighted, while it is clear how the "Within" matrix for Fisher should be calcuated
          // (i.e. as seen below). In order to agree with the Fisher classifier, one would have to
          // weigh signal and background such that they correspond to the same number of effective
          // (weithed) events.
          // THAT is NOT done currently, but just "event weights" are used.
          (*fWith)(x, y) = sumSig[k]/fSumOfWeightsS + sumBgd[k]/fSumOfWeightsB;
          k++;
       }
    }

    delete [] sumSig;
    delete [] sumBgd;
    delete [] xval;
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// the matrix of covariance 'between class' reflects the dispersion of the
 /// events of a class relative to the global center of gravity of all the class
 /// hence the separation between classes

 void TMVA::MethodFisher::GetCov_BetweenClass( void )
 {
    // assert required
    assert( fSumOfWeightsS > 0 && fSumOfWeightsB > 0);

    Double_t prodSig, prodBgd;

    for (UInt_t x=0; x<GetNvar(); x++) {
       for (UInt_t y=0; y<GetNvar(); y++) {

          prodSig = ( ((*fMeanMatx)(x, 0) - (*fMeanMatx)(x, 2))*
                      ((*fMeanMatx)(y, 0) - (*fMeanMatx)(y, 2)) );
          prodBgd = ( ((*fMeanMatx)(x, 1) - (*fMeanMatx)(x, 2))*
                      ((*fMeanMatx)(y, 1) - (*fMeanMatx)(y, 2)) );

          (*fBetw)(x, y) = (fSumOfWeightsS*prodSig + fSumOfWeightsB*prodBgd) / (fSumOfWeightsS + fSumOfWeightsB);
       }
    }
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// compute full covariance matrix from sum of within and between matrices

 void TMVA::MethodFisher::GetCov_Full( void )
 {
    for (UInt_t x=0; x<GetNvar(); x++)
       for (UInt_t y=0; y<GetNvar(); y++)
          (*fCov)(x, y) = (*fWith)(x, y) + (*fBetw)(x, y);
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// Fisher = Sum { [coeff]*[variables] }
 ///
 /// let Xs be the array of the mean values of variables for signal evts
 /// let Xb be the array of the mean values of variables for backgd evts
 /// let InvWith be the inverse matrix of the 'within class' correlation matrix
 ///
 /// then the array of Fisher coefficients is
 /// [coeff] =sqrt(fNsig*fNbgd)/fNevt*transpose{Xs-Xb}*InvWith

 void TMVA::MethodFisher::GetFisherCoeff( void )
 {
    // assert required
    assert( fSumOfWeightsS > 0 && fSumOfWeightsB > 0);

    // invert covariance matrix
    TMatrixD* theMat = 0;
    switch (GetFisherMethod()) {
    case kFisher:
       theMat = fWith;
       break;
    case kMahalanobis:
       theMat = fCov;
       break;
    default:
       Log() << kFATAL << "<GetFisherCoeff> undefined method" << GetFisherMethod() << Endl;
    }

    TMatrixD invCov( *theMat );

    if ( TMath::Abs(invCov.Determinant()) < 10E-24 ) {
       Log() << kWARNING << "<GetFisherCoeff> matrix is almost singular with deterninant="
             << TMath::Abs(invCov.Determinant())
             << " did you use the variables that are linear combinations or highly correlated?"
             << Endl;
    }
    if ( TMath::Abs(invCov.Determinant()) < 10E-120 ) {
       theMat->Print();
       Log() << kFATAL << "<GetFisherCoeff> matrix is singular with determinant="
             << TMath::Abs(invCov.Determinant())
             << " did you use the variables that are linear combinations? \n"
             << " do you any clue as to what went wrong in above printout of the covariance matrix? "
             << Endl;
    }

    invCov.Invert();

    // apply rescaling factor
    Double_t xfact = TMath::Sqrt( fSumOfWeightsS*fSumOfWeightsB ) / (fSumOfWeightsS + fSumOfWeightsB);

    // compute difference of mean values
    std::vector<Double_t> diffMeans( GetNvar() );
    UInt_t ivar, jvar;
    for (ivar=0; ivar<GetNvar(); ivar++) {
       (*fFisherCoeff)[ivar] = 0;

       for (jvar=0; jvar<GetNvar(); jvar++) {
          Double_t d = (*fMeanMatx)(jvar, 0) - (*fMeanMatx)(jvar, 1);
          (*fFisherCoeff)[ivar] += invCov(ivar, jvar)*d;
       }
       // rescale
       (*fFisherCoeff)[ivar] *= xfact;
    }


    // offset correction
    fF0 = 0.0;
    for (ivar=0; ivar<GetNvar(); ivar++){
       fF0 += (*fFisherCoeff)[ivar]*((*fMeanMatx)(ivar, 0) + (*fMeanMatx)(ivar, 1));
    }
    fF0 /= -2.0;
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// computation of discrimination power indicator for each variable
 /// small values of "fWith" indicates little compactness of sig & of backgd
 /// big values of "fBetw" indicates large separation between sig & backgd
 ///
 /// we want signal & backgd classes as compact and separated as possible
 /// the discriminating power is then defined as the ration "fBetw/fWith"

 void TMVA::MethodFisher::GetDiscrimPower( void )
 {
    for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
       if ((*fCov)(ivar, ivar) != 0)
          (*fDiscrimPow)[ivar] = (*fBetw)(ivar, ivar)/(*fCov)(ivar, ivar);
       else
          (*fDiscrimPow)[ivar] = 0;
    }
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// computes ranking of input variables

 const TMVA::Ranking* TMVA::MethodFisher::CreateRanking()
 {
    // create the ranking object
    fRanking = new Ranking( GetName(), "Discr. power" );

    for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
       fRanking->AddRank( Rank( GetInputLabel(ivar), (*fDiscrimPow)[ivar] ) );
    }

    return fRanking;
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// display Fisher coefficients and discriminating power for each variable
 /// check maximum length of variable name

 void TMVA::MethodFisher::PrintCoefficients( void )
 {
    Log() << kHEADER << "Results for Fisher coefficients:" << Endl;

    if (GetTransformationHandler().GetTransformationList().GetSize() != 0) {
       Log() << kINFO << "NOTE: The coefficients must be applied to TRANFORMED variables" << Endl;
       Log() << kINFO << "  List of the transformation: " << Endl;
       TListIter trIt(&GetTransformationHandler().GetTransformationList());
       while (VariableTransformBase *trf = (VariableTransformBase*) trIt()) {
          Log() << kINFO << "  -- " << trf->GetName() << Endl;
       }
    }
    std::vector<TString>  vars;
    std::vector<Double_t> coeffs;
    for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
       vars  .push_back( GetInputLabel(ivar) );
       coeffs.push_back(  (*fFisherCoeff)[ivar] );
    }
    vars  .push_back( "(offset)" );
    coeffs.push_back( fF0 );
    TMVA::gTools().FormattedOutput( coeffs, vars, "Variable" , "Coefficient", Log() );

    // for (int i=0; i<coeffs.size(); i++)
    //    std::cout << "fisher coeff["<<i<<"]="<<coeffs[i]<<std::endl;

    if (IsNormalised()) {
       Log() << kINFO << "NOTE: You have chosen to use the \"Normalise\" booking option. Hence, the" << Endl;
       Log() << kINFO << "      coefficients must be applied to NORMALISED (') variables as follows:" << Endl;
       Int_t maxL = 0;
       for (UInt_t ivar=0; ivar<GetNvar(); ivar++) if (GetInputLabel(ivar).Length() > maxL) maxL = GetInputLabel(ivar).Length();

       // Print normalisation expression (see Tools.cxx): "2*(x - xmin)/(xmax - xmin) - 1.0"
       for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
          Log() << kINFO
                << std::setw(maxL+9) << TString("[") + GetInputLabel(ivar) + "]' = 2*("
                << std::setw(maxL+2) << TString("[") + GetInputLabel(ivar) + "]"
                << std::setw(3) << (GetXmin(ivar) > 0 ? " - " : " + ")
                << std::setw(6) << TMath::Abs(GetXmin(ivar)) << std::setw(3) << ")/"
                << std::setw(6) << (GetXmax(ivar) -  GetXmin(ivar) )
                << std::setw(3) << " - 1"
                << Endl;
       }
       Log() << kINFO << "The TMVA Reader will properly account for this normalisation, but if the" << Endl;
       Log() << kINFO << "Fisher classifier is applied outside the Reader, the transformation must be" << Endl;
       Log() << kINFO << "implemented -- or the \"Normalise\" option is removed and Fisher retrained." << Endl;
       Log() << kINFO << Endl;
    }
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// read Fisher coefficients from weight file

 void TMVA::MethodFisher::ReadWeightsFromStream( std::istream& istr )
 {
    istr >> fF0;
    for (UInt_t ivar=0; ivar<GetNvar(); ivar++) istr >> (*fFisherCoeff)[ivar];
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// create XML description of Fisher classifier

 void TMVA::MethodFisher::AddWeightsXMLTo( void* parent ) const
 {
    void* wght = gTools().AddChild(parent, "Weights");
    gTools().AddAttr( wght, "NCoeff", GetNvar()+1 );
    void* coeffxml = gTools().AddChild(wght, "Coefficient");
    gTools().AddAttr( coeffxml, "Index", 0   );
    gTools().AddAttr( coeffxml, "Value", fF0 );
    for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
       coeffxml = gTools().AddChild( wght, "Coefficient" );
       gTools().AddAttr( coeffxml, "Index", ivar+1 );
       gTools().AddAttr( coeffxml, "Value", (*fFisherCoeff)[ivar] );
    }
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// read Fisher coefficients from xml weight file

 void TMVA::MethodFisher::ReadWeightsFromXML( void* wghtnode )
 {
    UInt_t ncoeff, coeffidx;
    gTools().ReadAttr( wghtnode, "NCoeff", ncoeff );
    fFisherCoeff->resize(ncoeff-1);

    void* ch = gTools().GetChild(wghtnode);
    Double_t coeff;
    while (ch) {
       gTools().ReadAttr( ch, "Index", coeffidx );
       gTools().ReadAttr( ch, "Value", coeff    );
       if (coeffidx==0) fF0 = coeff;
       else             (*fFisherCoeff)[coeffidx-1] = coeff;
       ch = gTools().GetNextChild(ch);
    }
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// write Fisher-specific classifier response

 void TMVA::MethodFisher::MakeClassSpecific( std::ostream& fout, const TString& className ) const
 {
    Int_t dp = fout.precision();
    fout << "   double              fFisher0;" << std::endl;
    fout << "   std::vector<double> fFisherCoefficients;" << std::endl;
    fout << "};" << std::endl;
    fout << "" << std::endl;
    fout << "inline void " << className << "::Initialize() " << std::endl;
    fout << "{" << std::endl;
    fout << "   fFisher0 = " << std::setprecision(12) << fF0 << ";" << std::endl;
    for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
       fout << "   fFisherCoefficients.push_back( " << std::setprecision(12) << (*fFisherCoeff)[ivar] << " );" << std::endl;
    }
    fout << std::endl;
    fout << "   // sanity check" << std::endl;
    fout << "   if (fFisherCoefficients.size() != fNvars) {" << std::endl;
    fout << "      std::cout << \"Problem in class \\\"\" << fClassName << \"\\\"::Initialize: mismatch in number of input values\"" << std::endl;
    fout << "                << fFisherCoefficients.size() << \" != \" << fNvars << std::endl;" << std::endl;
    fout << "      fStatusIsClean = false;" << std::endl;
    fout << "   }         " << std::endl;
    fout << "}" << std::endl;
    fout << std::endl;
    fout << "inline double " << className << "::GetMvaValue__( const std::vector<double>& inputValues ) const" << std::endl;
    fout << "{" << std::endl;
    fout << "   double retval = fFisher0;" << std::endl;
    fout << "   for (size_t ivar = 0; ivar < fNvars; ivar++) {" << std::endl;
    fout << "      retval += fFisherCoefficients[ivar]*inputValues[ivar];" << std::endl;
    fout << "   }" << std::endl;
    fout << std::endl;
    fout << "   return retval;" << std::endl;
    fout << "}" << std::endl;
    fout << std::endl;
    fout << "// Clean up" << std::endl;
    fout << "inline void " << className << "::Clear() " << std::endl;
    fout << "{" << std::endl;
    fout << "   // clear coefficients" << std::endl;
    fout << "   fFisherCoefficients.clear(); " << std::endl;
    fout << "}" << std::endl;
    fout << std::setprecision(dp);
 }

 ////////////////////////////////////////////////////////////////////////////////
 /// get help message text
 ///
 /// typical length of text line:
 ///         "|--------------------------------------------------------------|"

 void TMVA::MethodFisher::GetHelpMessage() const
 {
    Log() << Endl;
    Log() << gTools().Color("bold") << "--- Short description:" << gTools().Color("reset") << Endl;
    Log() << Endl;
    Log() << "Fisher discriminants select events by distinguishing the mean " << Endl;
    Log() << "values of the signal and background distributions in a trans- " << Endl;
    Log() << "formed variable space where linear correlations are removed." << Endl;
    Log() << Endl;
    Log() << "   (More precisely: the \"linear discriminator\" determines" << Endl;
    Log() << "    an axis in the (correlated) hyperspace of the input " << Endl;
    Log() << "    variables such that, when projecting the output classes " << Endl;
    Log() << "    (signal and background) upon this axis, they are pushed " << Endl;
    Log() << "    as far as possible away from each other, while events" << Endl;
    Log() << "    of a same class are confined in a close vicinity. The  " << Endl;
    Log() << "    linearity property of this classifier is reflected in the " << Endl;
    Log() << "    metric with which \"far apart\" and \"close vicinity\" are " << Endl;
    Log() << "    determined: the covariance matrix of the discriminating" << Endl;
    Log() << "    variable space.)" << Endl;
    Log() << Endl;
    Log() << gTools().Color("bold") << "--- Performance optimisation:" << gTools().Color("reset") << Endl;
    Log() << Endl;
    Log() << "Optimal performance for Fisher discriminants is obtained for " << Endl;
    Log() << "linearly correlated Gaussian-distributed variables. Any deviation" << Endl;
    Log() << "from this ideal reduces the achievable separation power. In " << Endl;
    Log() << "particular, no discrimination at all is achieved for a variable" << Endl;
    Log() << "that has the same sample mean for signal and background, even if " << Endl;
    Log() << "the shapes of the distributions are very different. Thus, Fisher " << Endl;
    Log() << "discriminants often benefit from suitable transformations of the " << Endl;
    Log() << "input variables. For example, if a variable x in [-1,1] has a " << Endl;
    Log() << "a parabolic signal distributions, and a uniform background" << Endl;
    Log() << "distributions, their mean value is zero in both cases, leading " << Endl;
    Log() << "to no separation. The simple transformation x -> |x| renders this " << Endl;
    Log() << "variable powerful for the use in a Fisher discriminant." << Endl;
    Log() << Endl;
    Log() << gTools().Color("bold") << "--- Performance tuning via configuration options:" << gTools().Color("reset") << Endl;
    Log() << Endl;
    Log() << "<None>" << Endl;
 }
TMVA::MethodFisher::GetCov_BetweenClass
void GetCov_BetweenClass(void)
the matrix of covariance &#39;between class&#39; reflects the dispersion of the events of a class relative to...
Definition: MethodFisher.cxx:420

TMVA::MethodFisher::CreateRanking
const Ranking * CreateRanking()
computes ranking of input variables
Definition: MethodFisher.cxx:544

TMVA::DataSetInfo::GetNVariables
UInt_t GetNVariables() const
Definition: DataSetInfo.h:128

sum
static long int sum(long int i)
Definition: Factory.cxx:1786

TMVA::kFATAL
Definition: Types.h:67

TMVA::MethodFisher::MethodFisher
MethodFisher(const TString &jobName, const TString &methodTitle, DataSetInfo &dsi, const TString &theOption="Fisher")
standard constructor for the "Fisher"
Definition: MethodFisher.cxx:136

TMVA::Endl
MsgLogger & Endl(MsgLogger &ml)
Definition: MsgLogger.h:162

Types.h

TMVA::Types
Definition: Types.h:77

TMVA::MethodFisher::ReadWeightsFromStream
void ReadWeightsFromStream(std::istream &i)
read Fisher coefficients from weight file
Definition: MethodFisher.cxx:612

TMVA::MethodFisher::~MethodFisher
virtual ~MethodFisher(void)
destructor
Definition: MethodFisher.cxx:219

DataSetInfo.h

TMVA::MethodFisher::GetCov_Full
void GetCov_Full(void)
compute full covariance matrix from sum of within and between matrices
Definition: MethodFisher.cxx:443

TMVA::MethodFisher::GetDiscrimPower
void GetDiscrimPower(void)
computation of discrimination power indicator for each variable small values of "fWith" indicates lit...
Definition: MethodFisher.cxx:531

TMVA::MethodFisher::HasAnalysisType
virtual Bool_t HasAnalysisType(Types::EAnalysisType type, UInt_t numberClasses, UInt_t numberTargets)
Fisher can only handle classification with 2 classes.
Definition: MethodFisher.cxx:231

TMVA::MethodBase::GetNvar
UInt_t GetNvar() const
Definition: MethodBase.h:340

TMVA::MethodFisher::GetFisherMethod
EFisherMethod GetFisherMethod(void)
Definition: MethodFisher.h:91

TMVA::Configurable::Log
MsgLogger & Log() const
Definition: Configurable.h:128

TMVA::Configurable::DeclareOptionRef
OptionBase * DeclareOptionRef(T &ref, const TString &name, const TString &desc="")

TMVA::MethodFisher::fBetw
TMatrixD * fBetw
Definition: MethodFisher.h:143

TMVA::MethodFisher::fSumOfWeightsS
Double_t fSumOfWeightsS
Definition: MethodFisher.h:148

TMVA::Types::EAnalysisType
EAnalysisType
Definition: Types.h:129

TMVA::MethodBase
Definition: MethodBase.h:119

TMVA::MethodFisher::AddWeightsXMLTo
void AddWeightsXMLTo(void *parent) const
create XML description of Fisher classifier
Definition: MethodFisher.cxx:621

TString
Basic string class.
Definition: TString.h:137

TMVA::MethodFisher::fMeanMatx
TMatrixD * fMeanMatx
Definition: MethodFisher.h:136

TMVA::MethodBase::GetTransformationHandler
TransformationHandler & GetTransformationHandler(Bool_t takeReroutedIfAvailable=true)
Definition: MethodBase.h:390

TMVA::Ranking
Definition: Ranking.h:50

Int_t
int Int_t
Definition: RtypesCore.h:41

Bool_t
bool Bool_t
Definition: RtypesCore.h:59

TMVA::MethodFisher::fFisherCoeff
std::vector< Double_t > * fFisherCoeff
Definition: MethodFisher.h:152

kFALSE
const Bool_t kFALSE
Definition: Rtypes.h:92

TMatrixT::Determinant
virtual Double_t Determinant() const
Return the matrix determinant.
Definition: TMatrixT.cxx:1361

TMVA::Tools::AddAttr
void AddAttr(void *node, const char *, const T &value, Int_t precision=16)
Definition: Tools.h:309

TMVA::Tools::AddChild
void * AddChild(void *parent, const char *childname, const char *content=0, bool isRootNode=false)
add child node
Definition: Tools.cxx:1134

TMath::Abs
Short_t Abs(Short_t d)
Definition: TMathBase.h:110

TMVA::MethodFisher::GetHelpMessage
void GetHelpMessage() const
get help message text
Definition: MethodFisher.cxx:705

TListIter
Iterator of linked list.
Definition: TList.h:187

TMVA::MethodBase::GetInputLabel
const TString & GetInputLabel(Int_t i) const
Definition: MethodBase.h:346

TMatrixT< Double_t >

TMVA::gTools
Tools & gTools()
Definition: Tools.cxx:79

x
Double_t x[n]
Definition: legend1.C:17

TMVA::MethodFisher::fCov
TMatrixD * fCov
Definition: MethodFisher.h:145

TMVA::MethodFisher::fSumOfWeightsB
Double_t fSumOfWeightsB
Definition: MethodFisher.h:149

TMVA::MethodFisher::kMahalanobis
Definition: MethodFisher.h:90

TMVA::MethodBase::GetEvent
const Event * GetEvent() const
Definition: MethodBase.h:745

TMVA::MethodBase::Data
DataSet * Data() const
Definition: MethodBase.h:405

TMVA::MethodFisher::fTheMethod
TString fTheMethod
Definition: MethodFisher.h:139

TMVA::Tools::GetChild
void * GetChild(void *parent, const char *childname=0)
get child node
Definition: Tools.cxx:1158

TMVA::MethodFisher::fFisherMethod
EFisherMethod fFisherMethod
Definition: MethodFisher.h:140

TMVA::kHEADER
Definition: Types.h:69

TMVA::MethodBase::GetXmin
Double_t GetXmin(Int_t ivar) const
Definition: MethodBase.h:352

TMVA::MethodBase::DataInfo
DataSetInfo & DataInfo() const
Definition: MethodBase.h:406

TList.h

TMVA::DataSetInfo
Definition: DataSetInfo.h:78

TMVA::Event::GetWeight
Double_t GetWeight() const
return the event weight - depending on whether the flag IgnoreNegWeightsInTraining is or not...
Definition: Event.cxx:378

DataSet.h

TMatrixT::Invert
TMatrixT< Element > & Invert(Double_t *det=0)
Invert the matrix and calculate its determinant.
Definition: TMatrixT.cxx:1396

Ranking.h

TMVA::MethodBase::GetXmax
Double_t GetXmax(Int_t ivar) const
Definition: MethodBase.h:353

TMatrixD
TMatrixT< Double_t > TMatrixD
Definition: TMatrixDfwd.h:24

TMVA::MethodFisher::GetMvaValue
Double_t GetMvaValue(Double_t *err=0, Double_t *errUpper=0)
returns the Fisher value (no fixed range)
Definition: MethodFisher.cxx:271

TMVA::MethodFisher::Train
void Train(void)
computation of Fisher coefficients by series of matrix operations
Definition: MethodFisher.cxx:240

TMVA::MethodFisher::ReadWeightsFromXML
void ReadWeightsFromXML(void *wghtnode)
read Fisher coefficients from xml weight file
Definition: MethodFisher.cxx:638

TMVA::VariableTransformBase
Definition: VariableTransformBase.h:67

TMVA::MethodFisher::ProcessOptions
void ProcessOptions()
process user options
Definition: MethodFisher.cxx:207

MethodFisher.h

v
SVector< double, 2 > v
Definition: Dict.h:5

TMVA::MethodBase::GetName
const char * GetName() const
Definition: MethodBase.h:330

TMVA::Event
Definition: Event.h:60

UInt_t
unsigned int UInt_t
Definition: RtypesCore.h:42

TString::Length
Ssiz_t Length() const
Definition: TString.h:390

TMVA::Types::kClassification
Definition: Types.h:130

MsgLogger.h

TMath::E
Double_t E()
Definition: TMath.h:54

TMVA::MethodFisher::InitMatrices
void InitMatrices(void)
initializaton method; creates global matrices and vectors
Definition: MethodFisher.cxx:288

TMVA::Tools::ReadAttr
void ReadAttr(void *node, const char *, T &value)
Definition: Tools.h:296

TMVA::MethodFisher::fWith
TMatrixD * fWith
Definition: MethodFisher.h:144

Riostream.h

TMVA::MethodFisher::DeclareOptions
void DeclareOptions()
MethodFisher options: format and syntax of option string: "type" where type is "Fisher" or "Mahalanob...
Definition: MethodFisher.cxx:197

TMVA::Event::GetValue
Float_t GetValue(UInt_t ivar) const
return value of i&#39;th variable
Definition: Event.cxx:233

IMethod.h

TransformationHandler.h

ClassImp
#define ClassImp(name)
Definition: Rtypes.h:279

Double_t
double Double_t
Definition: RtypesCore.h:55

TMVA::MethodFisher::GetMean
void GetMean(void)
compute mean values of variables in each sample, and the overall means
Definition: MethodFisher.cxx:305

TMVA::MethodFisher::GetCov_WithinClass
void GetCov_WithinClass(void)
the matrix of covariance &#39;within class&#39; reflects the dispersion of the events relative to the center ...
Definition: MethodFisher.cxx:354

TMVA::MethodBase::IsNormalised
Bool_t IsNormalised() const
Definition: MethodBase.h:490

type
int type
Definition: TGX11.cxx:120

Event.h

TMVA::Tools::GetNextChild
void * GetNextChild(void *prevchild, const char *childname=0)
XML helpers.
Definition: Tools.cxx:1170

y
Double_t y[n]
Definition: legend1.C:17

TMVA::Configurable::AddPreDefVal
void AddPreDefVal(const T &)
Definition: Configurable.h:174

TMVA::Rank
Definition: Ranking.h:78

TMVA::MethodBase::ExitFromTraining
void ExitFromTraining()
Definition: MethodBase.h:458

TMatrixTBase::Print
void Print(Option_t *name="") const
Print the matrix as a table of elements.
Definition: TMatrixTBase.cxx:832

TMVA::Tools::FormattedOutput
void FormattedOutput(const std::vector< Double_t > &, const std::vector< TString > &, const TString titleVars, const TString titleValues, MsgLogger &logger, TString format="%+1.3f")
formatted output of simple table
Definition: Tools.cxx:896

TMVA::Tools::Color
const TString & Color(const TString &)
human readable color strings
Definition: Tools.cxx:837

REGISTER_METHOD
#define REGISTER_METHOD(CLASS)
for example
Definition: ClassifierFactory.h:126

TMVA::MethodFisher::MakeClassSpecific
void MakeClassSpecific(std::ostream &, const TString &) const
write Fisher-specific classifier response
Definition: MethodFisher.cxx:658

TMVA::MethodBase::fRanking
Ranking * fRanking
Definition: MethodBase.h:581

VariableTransformBase.h

Tools.h

TMVA::MethodFisher::kFisher
Definition: MethodFisher.h:90

TMVA::Ranking::AddRank
virtual void AddRank(const Rank &rank)
Add a new rank take ownership of it.
Definition: Ranking.cxx:86

TMVA::MethodFisher::fF0
Double_t fF0
Definition: MethodFisher.h:153

TMVA::DataSet::GetNEvents
Long64_t GetNEvents(Types::ETreeType type=Types::kMaxTreeType) const
Definition: DataSet.h:229

TMVA::MethodFisher::PrintCoefficients
void PrintCoefficients(void)
display Fisher coefficients and discriminating power for each variable check maximum length of variab...
Definition: MethodFisher.cxx:560

TMatrix.h

TMVA::DataSetInfo::IsSignal
Bool_t IsSignal(const Event *ev) const
Definition: DataSetInfo.cxx:176

TMVA::MethodFisher
Definition: MethodFisher.h:58

result
double result[121]
Definition: testSampleQuantiles.cxx:17

Configurable.h

TMath::Sqrt
Double_t Sqrt(Double_t x)
Definition: TMath.h:464

TMath.h

TMVA::kINFO
Definition: Types.h:64

TMVA::MethodFisher::GetFisherCoeff
void GetFisherCoeff(void)
Fisher = Sum { [coeff]*[variables] }.
Definition: MethodFisher.cxx:460

kTRUE
const Bool_t kTRUE
Definition: Rtypes.h:91

MethodBase.h

ClassifierFactory.h

TMVA::kWARNING
Definition: Types.h:65

TMVA::MethodFisher::fDiscrimPow
std::vector< Double_t > * fDiscrimPow
Definition: MethodFisher.h:151

TMVA::MethodBase::NoErrorCalc
void NoErrorCalc(Double_t *const err, Double_t *const errUpper)
Definition: MethodBase.cxx:819

TMVA::MethodBase::SetSignalReferenceCut
void SetSignalReferenceCut(Double_t cut)
Definition: MethodBase.h:360

TMVA::MethodFisher::Init
void Init(void)
default initialization called by all constructors
Definition: MethodFisher.cxx:178