
class TMVA::MethodBDT: public TMVA::MethodBase


 Analysis of Boosted Decision Trees

 Boosted decision trees have been successfully used in High Energy
 Physics analysis, for example by the MiniBooNE experiment
 (Yang-Roe-Zhu, physics/0508045). In Boosted Decision Trees, the
 selection is based on a majority vote over the results of several decision
 trees, which are all derived from the same training sample by
 supplying different event weights during the training.
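
 In practice the method is usually booked through the TMVA::Factory rather
 than constructed directly. The following is a minimal sketch against the
 Factory interface of the same TMVA generation as this page (pre-DataLoader);
 the file names, tree name and variable names are placeholders:

   // Minimal sketch of booking a BDT via the TMVA Factory (placeholder names).
   #include "TCut.h"
   #include "TFile.h"
   #include "TTree.h"
   #include "TMVA/Factory.h"
   #include "TMVA/Tools.h"

   void trainBDT()
   {
      TMVA::Tools::Instance();
      TFile* sig = TFile::Open("signal.root");       // placeholder input files
      TFile* bkg = TFile::Open("background.root");
      TFile* out = TFile::Open("TMVA.root", "RECREATE");

      TMVA::Factory factory("TMVAClassification", out, "AnalysisType=Classification");
      factory.AddVariable("var1", 'F');              // placeholder variables
      factory.AddVariable("var2", 'F');
      factory.AddSignalTree((TTree*)sig->Get("tree"), 1.0);
      factory.AddBackgroundTree((TTree*)bkg->Get("tree"), 1.0);
      factory.PrepareTrainingAndTestTree("", "SplitMode=Random");

      // book boosted decision trees; option names as in DeclareOptions() below
      factory.BookMethod(TMVA::Types::kBDT, "BDT",
                         "NTrees=400:MaxDepth=3:BoostType=AdaBoost:AdaBoostBeta=0.5");

      factory.TrainAllMethods();
      factory.TestAllMethods();
      factory.EvaluateAllMethods();
      out->Close();
   }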

 Decision trees:

 Successive decision nodes are used to categorize the
 events out of the sample as either signal or background. Each node
 uses only a single discriminating variable to decide if the event is
 signal-like ("goes right") or background-like ("goes left"). This
 forms a tree-like structure with "baskets" at the end (leaf nodes),
 and an event is classified as either signal or background according to
 whether the basket where it ends up has been classified as signal or
 background during the training. Training of a decision tree is the
 process of defining the "cut criteria" for each node. The training
 starts with the root node. Here one takes the full training event
 sample and selects the variable and corresponding cut value that give
 the best separation between signal and background at this stage. Using
 this cut criterion, the sample is then divided into two subsamples, a
 signal-like (right) and a background-like (left) sample. Two new nodes
 are then created, one for each of the two subsamples, and they are
 constructed using the same mechanism as described for the root
 node. The division is stopped once a certain node has reached either a
 minimum number of events, or a minimum or maximum signal purity. These
 leaf nodes are then called "signal" or "background" depending on whether
 they contain more signal or background events from the training sample.
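
 The "best separation" at each node is measured with a separation criterion
 such as the Gini index (see SeparationType under DeclareOptions() below). The
 following is an illustrative sketch of how one candidate cut could be scored;
 it is not the actual TMVA implementation:

   #include <vector>

   // Sketch: score one candidate cut by the decrease of the (weighted) Gini
   // impurity p*(1-p), with p the signal purity. Not TMVA's actual code.
   struct Evt { double x; double weight; bool isSignal; };

   static double gini(double s, double b)
   {
      double n = s + b;
      if (n <= 0) return 0;
      double p = s / n;
      return p * (1 - p) * n;   // impurity scaled by the node's total weight
   }

   double separationGain(const std::vector<Evt>& events, double cut)
   {
      double sL = 0, bL = 0, sR = 0, bR = 0;
      for (const Evt& e : events) {
         double& target = (e.x < cut) ? (e.isSignal ? sL : bL)
                                      : (e.isSignal ? sR : bR);
         target += e.weight;
      }
      // gain = impurity of the parent minus the summed daughter impurities
      return gini(sL + sR, bL + bR) - gini(sL, bL) - gini(sR, bR);
   }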

 Boosting:

 The idea behind adaptive boosting (AdaBoost) is that signal events
 from the training sample that end up in a background node
 (and vice versa) are given a larger weight than events that are in
 the correct leaf node. This results in a re-weighted training event
 sample, with which a new decision tree can then be developed.
 The boosting can be applied several times (typically 100-500 times)
 and one ends up with a set of decision trees (a forest).
 Gradient boosting works more like a function expansion approach, where
 each tree corresponds to a summand. The parameters for each summand (tree)
 are determined by the minimization of an error function (binomial log-
 likelihood for classification and Huber loss for regression).
 A greedy algorithm is used, which means that only one tree is modified
 at a time, while the other trees stay fixed.
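
 Schematically, one AdaBoost re-weighting step looks as follows. This is an
 illustration of the formula documented for AdaBoost() below, not the TMVA
 code itself:

   #include <cmath>
   #include <vector>

   // Sketch of one AdaBoost step: events misclassified by the current tree
   // get their weight multiplied by ((1-err)/err)^beta (assumes 0 < err < 0.5).
   struct TrainEvt { double weight; bool misclassified; };

   double adaBoostStep(std::vector<TrainEvt>& sample, double beta = 1.0)
   {
      double sumW = 0, errW = 0;
      for (const TrainEvt& e : sample) {
         sumW += e.weight;
         if (e.misclassified) errW += e.weight;
      }
      double err = errW / sumW;                       // misclassified fraction
      double boostWeight = std::pow((1 - err) / err, beta);

      double newSum = 0;
      for (TrainEvt& e : sample) {
         if (e.misclassified) e.weight *= boostWeight;
         newSum += e.weight;
      }
      for (TrainEvt& e : sample) e.weight *= sumW / newSum;  // renormalise
      return boostWeight;                 // also enters the final forest vote
   }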

 Bagging:

 In this particular variant of the Boosted Decision Trees the boosting
 is not done on the basis of previous training results, but by a simple
 stochastic re-sampling of the initial training event sample.

 Random Trees:

 Similar to the "Random Forests" of Leo Breiman and Adele Cutler, this
 variant uses the bagging algorithm and bases the determination of the
 best node split during the training on a random subset of variables only,
 which is individually chosen for each split.

 Analysis:

 Applying an individual decision tree to a test event results in a
 classification of the event as either signal or background. For the
 boosted decision tree selection, an event is successively subjected to
 the whole set of decision trees, and depending on how often it is
 classified as signal, a "likelihood" estimator is constructed for the
 event being signal or background. The value of this estimator is then
 used to select the events from an event sample, and the cut value on
 this estimator defines the efficiency and purity of the selection.
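
 Schematically, this estimator is a boost-weighted vote over the forest
 (compare GetMvaValue() below); again a sketch, not the exact implementation:

   #include <vector>

   // Sketch: boost-weighted majority vote of a forest. Each tree votes
   // +1 (signal leaf) or -1 (background leaf); the weighted average is
   // the BDT response in [-1, 1].
   double forestResponse(const std::vector<int>& treeVotes,        // +1 or -1 per tree
                         const std::vector<double>& boostWeights)  // one weight per tree
   {
      double num = 0, den = 0;
      for (size_t i = 0; i < treeVotes.size(); ++i) {
         num += boostWeights[i] * treeVotes[i];
         den += boostWeights[i];
      }
      return (den > 0) ? num / den : 0;
   }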


Function Members (Methods)

public:
virtual ~MethodBDT()
void TObject::AbstractMethod(const char* method) const
void TMVA::Configurable::AddOptionsXMLTo(void* parent) const
void TMVA::MethodBase::AddOutput(TMVA::Types::ETreeType type, TMVA::Types::EAnalysisType analysisType)
virtual void AddWeightsXMLTo(void* parent) const
virtual void TObject::AppendPad(Option_t* option = "")
TDirectory* TMVA::MethodBase::BaseDir() const
Double_t Boost(vector<const TMVA::Event*>&, TMVA::DecisionTree* dt, UInt_t cls = 0)
virtual void TObject::Browse(TBrowser* b)
void TMVA::Configurable::CheckForUnusedOptions() const
virtual void TMVA::MethodBase::CheckSetup()
static TClass* Class()
virtual const char* TObject::ClassName() const
virtual void TObject::Clear(Option_t* = "")
virtual TObject* TObject::Clone(const char* newname = "") const
virtual Int_t TObject::Compare(const TObject* obj) const
TMVA::Configurable TMVA::Configurable::Configurable(const TString& theOption = "")
TMVA::Configurable TMVA::Configurable::Configurable(const TMVA::Configurable&)
virtual void TObject::Copy(TObject& object) const
virtual const TMVA::Ranking* CreateRanking()
TMVA::DataSet* TMVA::MethodBase::Data() const
TMVA::DataSetInfo& TMVA::MethodBase::DataInfo() const
virtual void DeclareOptions()
virtual void TObject::Delete(Option_t* option = "")
void TMVA::MethodBase::DisableWriting(Bool_t setter)
virtual Int_t TObject::DistancetoPrimitive(Int_t px, Int_t py)
Bool_t TMVA::MethodBase::DoMulticlass() const
Bool_t TMVA::MethodBase::DoRegression() const
virtual void TObject::Draw(Option_t* option = "")
virtual void TObject::DrawClass() const
virtual TObject* TObject::DrawClone(Option_t* option = "") const
virtual void TObject::Dump() const
virtual void TObject::Error(const char* method, const char* msgfmt) const
virtual void TObject::Execute(const char* method, const char* params, Int_t* error = 0)
virtual void TObject::Execute(TMethod* method, TObjArray* params, Int_t* error = 0)
virtual void TObject::ExecuteEvent(Int_t event, Int_t px, Int_t py)
virtual void TObject::Fatal(const char* method, const char* msgfmt) const
virtual TObject* TObject::FindObject(const char* name) const
virtual TObject* TObject::FindObject(const TObject* obj) const
TMVA::Types::EAnalysisType TMVA::MethodBase::GetAnalysisType() const
const vector<double>& GetBoostWeights() const
const char* TMVA::Configurable::GetConfigDescription() const
const char* TMVA::Configurable::GetConfigName() const
virtual Option_t* TObject::GetDrawOption() const
static Long_t TObject::GetDtorOnly()
virtual Double_t TMVA::MethodBase::GetEfficiency(const TString&, TMVA::Types::ETreeType, Double_t& err)
const TMVA::Event* TMVA::MethodBase::GetEvent() const
const TMVA::Event* TMVA::MethodBase::GetEvent(const TMVA::Event* ev) const
const TMVA::Event* TMVA::MethodBase::GetEvent(Long64_t ievt) const
const TMVA::Event* TMVA::MethodBase::GetEvent(Long64_t ievt, TMVA::Types::ETreeType type) const
const vector<TMVA::Event*>& TMVA::MethodBase::GetEventCollection(TMVA::Types::ETreeType type)
const vector<TMVA::DecisionTree*>& GetForest() const
virtual void GetHelpMessage() const
virtual const char* TObject::GetIconName() const
const TString& TMVA::MethodBase::GetInputLabel(Int_t i) const
const TString& TMVA::MethodBase::GetInputTitle(Int_t i) const
const TString& TMVA::MethodBase::GetInputVar(Int_t i) const
const TString& TMVA::MethodBase::GetJobName() const
virtual Double_t TMVA::MethodBase::GetKSTrainingVsTest(Char_t SorB, TString opt = "X")
virtual Double_t TMVA::MethodBase::GetMaximumSignificance(Double_t SignalEvents, Double_t BackgroundEvents, Double_t& optimal_significance_value) const
Double_t TMVA::MethodBase::GetMean(Int_t ivar) const
const TString& TMVA::MethodBase::GetMethodName() const
TMVA::Types::EMVA TMVA::MethodBase::GetMethodType() const
TString TMVA::MethodBase::GetMethodTypeName() const
virtual vector<Float_t> TMVA::MethodBase::GetMulticlassEfficiency(vector<vector<Float_t> >& purity)
virtual vector<Float_t> TMVA::MethodBase::GetMulticlassTrainingEfficiency(vector<vector<Float_t> >& purity)
virtual const vector<Float_t>& GetMulticlassValues()
virtual Double_t GetMvaValue(Double_t* err = 0, Double_t* errUpper = 0)
virtual const char* TMVA::MethodBase::GetName() const
UInt_t TMVA::MethodBase::GetNEvents() const
UInt_t TMVA::MethodBase::GetNTargets() const
UInt_t GetNTrees() const
UInt_t TMVA::MethodBase::GetNvar() const
UInt_t TMVA::MethodBase::GetNVariables() const
virtual char* TObject::GetObjectInfo(Int_t px, Int_t py) const
static Bool_t TObject::GetObjectStat()
virtual Option_t* TObject::GetOption() const
const TString& TMVA::Configurable::GetOptions() const
virtual Double_t TMVA::MethodBase::GetProba(const TMVA::Event* ev)
virtual Double_t TMVA::MethodBase::GetProba(Double_t mvaVal, Double_t ap_sig)
const TString TMVA::MethodBase::GetProbaName() const
virtual Double_t TMVA::MethodBase::GetRarity(Double_t mvaVal, TMVA::Types::ESBType reftype = Types::kBackground) const
virtual void TMVA::MethodBase::GetRegressionDeviation(UInt_t tgtNum, TMVA::Types::ETreeType type, Double_t& stddev, Double_t& stddev90Percent) const
virtual const vector<Float_t>& GetRegressionValues()
Double_t TMVA::MethodBase::GetRMS(Int_t ivar) const
virtual Double_t TMVA::MethodBase::GetROCIntegral(TH1D* histS, TH1D* histB) const
virtual Double_t TMVA::MethodBase::GetROCIntegral(TMVA::PDF* pdfS = 0, TMVA::PDF* pdfB = 0) const
virtual Double_t TMVA::MethodBase::GetSeparation(TH1*, TH1*) const
virtual Double_t TMVA::MethodBase::GetSeparation(TMVA::PDF* pdfS = 0, TMVA::PDF* pdfB = 0) const
Double_t TMVA::MethodBase::GetSignalReferenceCut() const
Double_t TMVA::MethodBase::GetSignalReferenceCutOrientation() const
virtual Double_t TMVA::MethodBase::GetSignificance() const
const TMVA::Event* TMVA::MethodBase::GetTestingEvent(Long64_t ievt) const
Double_t TMVA::MethodBase::GetTestTime() const
const TString& TMVA::MethodBase::GetTestvarName() const
virtual const char* TObject::GetTitle() const
virtual Double_t TMVA::MethodBase::GetTrainingEfficiency(const TString&)
const TMVA::Event* TMVA::MethodBase::GetTrainingEvent(Long64_t ievt) const
const vector<const TMVA::Event*>& GetTrainingEvents() const
UInt_t TMVA::MethodBase::GetTrainingROOTVersionCode() const
TString TMVA::MethodBase::GetTrainingROOTVersionString() const
UInt_t TMVA::MethodBase::GetTrainingTMVAVersionCode() const
TString TMVA::MethodBase::GetTrainingTMVAVersionString() const
Double_t TMVA::MethodBase::GetTrainTime() const
TMVA::TransformationHandler& TMVA::MethodBase::GetTransformationHandler(Bool_t takeReroutedIfAvailable = true)
const TMVA::TransformationHandler& TMVA::MethodBase::GetTransformationHandler(Bool_t takeReroutedIfAvailable = true) const
virtual UInt_t TObject::GetUniqueID() const
vector<Double_t> GetVariableImportance()
Double_t GetVariableImportance(UInt_t ivar)
TString TMVA::MethodBase::GetWeightFileName() const
Double_t TMVA::MethodBase::GetXmax(Int_t ivar) const
Double_t TMVA::MethodBase::GetXmin(Int_t ivar) const
virtual Bool_t TObject::HandleTimer(TTimer* timer)
virtual Bool_t HasAnalysisType(TMVA::Types::EAnalysisType type, UInt_t numberClasses, UInt_t numberTargets)
virtual ULong_t TObject::Hash() const
Bool_t TMVA::MethodBase::HasMVAPdfs() const
TMVA::IMethod TMVA::IMethod::IMethod()
TMVA::IMethod TMVA::IMethod::IMethod(const TMVA::IMethod&)
virtual void TObject::Info(const char* method, const char* msgfmt) const
virtual Bool_t TObject::InheritsFrom(const char* classname) const
virtual Bool_t TObject::InheritsFrom(const TClass* cl) const
void InitEventSample()
virtual void TObject::Inspect() const
void TObject::InvertBit(UInt_t f)
virtual TClass* IsA() const
virtual Bool_t TObject::IsEqual(const TObject* obj) const
virtual Bool_t TObject::IsFolder() const
Bool_t TObject::IsOnHeap() const
virtual Bool_t TMVA::MethodBase::IsSignalLike()
virtual Bool_t TMVA::MethodBase::IsSignalLike(Double_t mvaVal)
virtual Bool_t TObject::IsSortable() const
Bool_t TObject::IsZombie() const
virtual void TObject::ls(Option_t* option = "") const
virtual void TMVA::MethodBase::MakeClass(const TString& classFileName = TString("")) const
void MakeClassInstantiateNode(TMVA::DecisionTreeNode* n, ostream& fout, const TString& className) const
virtual void MakeClassSpecific(ostream&, const TString&) const
virtual void MakeClassSpecificHeader(ostream&, const TString&) const
void TObject::MayNotUse(const char* method) const
TMVA::MethodBase TMVA::MethodBase::MethodBase(const TMVA::MethodBase&)
TMVA::MethodBase TMVA::MethodBase::MethodBase(TMVA::Types::EMVA methodType, TMVA::DataSetInfo& dsi, const TString& weightFile, TDirectory* theBaseDir = 0)
TMVA::MethodBase TMVA::MethodBase::MethodBase(const TString& jobName, TMVA::Types::EMVA methodType, const TString& methodTitle, TMVA::DataSetInfo& dsi, const TString& theOption = "", TDirectory* theBaseDir = 0)
TDirectory* TMVA::MethodBase::MethodBaseDir() const
TMVA::MethodBDT MethodBDT(const TMVA::MethodBDT&)
TMVA::MethodBDT MethodBDT(TMVA::DataSetInfo& theData, const TString& theWeightFile, TDirectory* theTargetDir = 0)
TMVA::MethodBDT MethodBDT(const TString& jobName, const TString& methodTitle, TMVA::DataSetInfo& theData, const TString& theOption = "", TDirectory* theTargetDir = 0)
virtual Bool_t TObject::Notify()
void TObject::Obsolete(const char* method, const char* asOfVers, const char* removedFromVers) const
void TObject::operator delete(void* ptr)
void TObject::operator delete(void* ptr, void* vp)
void TObject::operator delete[](void* ptr)
void TObject::operator delete[](void* ptr, void* vp)
void* TObject::operator new(size_t sz)
void* TObject::operator new(size_t sz, void* vp)
void* TObject::operator new[](size_t sz)
void* TObject::operator new[](size_t sz, void* vp)
TMVA::MethodBDT& operator=(const TMVA::MethodBDT&)
virtual map<TString,Double_t> OptimizeTuningParameters(TString fomType = "ROCIntegral", TString fitType = "FitGA")
virtual void TObject::Paint(Option_t* option = "")
virtual void TMVA::Configurable::ParseOptions()
virtual void TObject::Pop()
virtual void TObject::Print(Option_t* option = "") const
virtual void TMVA::MethodBase::PrintHelpMessage() const
void TMVA::Configurable::PrintOptions() const
virtual void ProcessOptions()
void TMVA::MethodBase::ProcessSetup()
virtual Int_t TObject::Read(const char* name)
void TMVA::Configurable::ReadOptionsFromStream(istream& istr)
void TMVA::Configurable::ReadOptionsFromXML(void* node)
void TMVA::MethodBase::ReadStateFromFile()
void TMVA::MethodBase::ReadStateFromStream(istream& tf)
void TMVA::MethodBase::ReadStateFromStream(TFile& rf)
void TMVA::MethodBase::ReadStateFromXMLString(const char* xmlstr)
virtual void ReadWeightsFromStream(istream& istr)
virtual void ReadWeightsFromXML(void* parent)
virtual void TObject::RecursiveRemove(TObject* obj)
void TMVA::MethodBase::RerouteTransformationHandler(TMVA::TransformationHandler* fTargetTransformation)
virtual void Reset()
void TObject::ResetBit(UInt_t f)
virtual void TObject::SaveAs(const char* filename = "", Option_t* option = "") const
virtual void TObject::SavePrimitive(ostream& out, Option_t* option = "")
void SetAdaBoostBeta(Double_t b)
virtual void TMVA::MethodBase::SetAnalysisType(TMVA::Types::EAnalysisType type)
void SetBaggedSampleFraction(Double_t f)
void TMVA::MethodBase::SetBaseDir(TDirectory* methodDir)
void TObject::SetBit(UInt_t f)
void TObject::SetBit(UInt_t f, Bool_t set)
void TMVA::Configurable::SetConfigDescription(const char* d)
void TMVA::Configurable::SetConfigName(const char* n)
virtual void TObject::SetDrawOption(Option_t* option = "")
static void TObject::SetDtorOnly(void* obj)
void SetMaxDepth(Int_t d)
void TMVA::MethodBase::SetMethodBaseDir(TDirectory* methodDir)
void TMVA::MethodBase::SetMethodDir(TDirectory* methodDir)
void SetMinNodeSize(Double_t sizeInPercent)
void SetMinNodeSize(TString sizeInPercent)
void TMVA::Configurable::SetMsgType(TMVA::EMsgType t)
void SetNodePurityLimit(Double_t l)
void SetNTrees(Int_t d)
static void TObject::SetObjectStat(Bool_t stat)
void TMVA::Configurable::SetOptions(const TString& s)
void SetShrinkage(Double_t s)
void TMVA::MethodBase::SetSignalReferenceCut(Double_t cut)
void TMVA::MethodBase::SetSignalReferenceCutOrientation(Double_t cutOrientation)
void TMVA::MethodBase::SetTestTime(Double_t testTime)
void TMVA::MethodBase::SetTestvarName(const TString& v = "")
void TMVA::MethodBase::SetTrainTime(Double_t trainTime)
virtual void SetTuneParameters(map<TString,Double_t> tuneParameters)
virtual void TObject::SetUniqueID(UInt_t uid)
void TMVA::MethodBase::SetupMethod()
void SetUseNvars(Int_t n)
virtual void ShowMembers(TMemberInspector& insp) const
virtual void Streamer(TBuffer&)
void StreamerNVirtual(TBuffer& ClassDef_StreamerNVirtual_b)
virtual void TObject::SysError(const char* method, const char* msgfmt) const
Bool_t TObject::TestBit(UInt_t f) const
Int_t TObject::TestBits(UInt_t f) const
virtual void TMVA::MethodBase::TestClassification()
virtual void TMVA::MethodBase::TestMulticlass()
virtual void TMVA::MethodBase::TestRegression(Double_t& bias, Double_t& biasT, Double_t& dev, Double_t& devT, Double_t& rms, Double_t& rmsT, Double_t& mInf, Double_t& mInfT, Double_t& corr, TMVA::Types::ETreeType type)
Double_t TestTreeQuality(TMVA::DecisionTree* dt)
virtual void Train()
void TMVA::MethodBase::TrainMethod()
virtual void TObject::UseCurrentStyle()
virtual void TObject::Warning(const char* method, const char* msgfmt) const
virtual Int_t TObject::Write(const char* name = 0, Int_t option = 0, Int_t bufsize = 0)
virtual Int_t TObject::Write(const char* name = 0, Int_t option = 0, Int_t bufsize = 0) const
virtual void TMVA::MethodBase::WriteEvaluationHistosToFile(TMVA::Types::ETreeType treetype)
virtual void WriteMonitoringHistosToFile() const
void TMVA::Configurable::WriteOptionsToStream(ostream& o, const TString& prefix) const
void TMVA::MethodBase::WriteStateToFile() const
private:
Double_t AdaBoost(vector<const TMVA::Event*>&, TMVA::DecisionTree* dt)
Double_t AdaBoostR2(vector<const TMVA::Event*>&, TMVA::DecisionTree* dt)
Double_t AdaCost(vector<const TMVA::Event*>&, TMVA::DecisionTree* dt)
Double_t ApplyPreselectionCuts(const TMVA::Event* ev)
Double_t Bagging()
void BoostMonitor(Int_t iTree)
void DeterminePreselectionCuts(const vector<const TMVA::Event*>& eventSample)
void GetBaggedSubSample(vector<const TMVA::Event*>&)
Double_t GetGradBoostMVA(const TMVA::Event* e, UInt_t nTrees)
Double_t GetMvaValue(Double_t* err, Double_t* errUpper, UInt_t useNTrees)
Double_t GetWeightedQuantile(vector<pair<Double_t,Double_t> > vec, const Double_t quantile, const Double_t SumOfWeights = 0.)
Double_t GradBoost(vector<const TMVA::Event*>&, TMVA::DecisionTree* dt, UInt_t cls = 0)
Double_t GradBoostRegression(vector<const TMVA::Event*>&, TMVA::DecisionTree* dt)
virtual void Init()
void InitGradBoost(vector<const TMVA::Event*>&)
void PreProcessNegativeEventWeights()
Double_t PrivateGetMvaValue(const TMVA::Event* ev, Double_t* err = 0, Double_t* errUpper = 0, UInt_t useNTrees = 0)
Double_t RegBoost(vector<const TMVA::Event*>&, TMVA::DecisionTree* dt)
void UpdateTargets(vector<const TMVA::Event*>&, UInt_t cls = 0)
void UpdateTargetsRegression(vector<const TMVA::Event*>&, Bool_t first = kFALSE)

Data Members

public:
Bool_t TMVA::MethodBase::fSetupCompleted  is method setup
const TMVA::Event* TMVA::MethodBase::fTmpEvent  ! temporary event when testing on a different DataSet than the own one
static TObject::(anonymous) TObject::kBitMask
static TObject::EStatusBits TObject::kCanDelete
static TObject::EStatusBits TObject::kCannotPick
static TObject::EStatusBits TObject::kHasUUID
static TObject::EStatusBits TObject::kInvalidObject
static TObject::(anonymous) TObject::kIsOnHeap
static TObject::EStatusBits TObject::kIsReferenced
static TObject::EStatusBits TObject::kMustCleanup
static TObject::EStatusBits TObject::kNoContextMenu
static TObject::(anonymous) TObject::kNotDeleted
static TObject::EStatusBits TObject::kObjInCanvas
static TObject::(anonymous) TObject::kOverwrite
static TMVA::MethodBase::EWeightFileType TMVA::MethodBase::kROOT
static TObject::(anonymous) TObject::kSingleKey
static TMVA::MethodBase::EWeightFileType TMVA::MethodBase::kTEXT
static TObject::(anonymous) TObject::kWriteDelete
static TObject::(anonymous) TObject::kZombie
protected:
TMVA::Types::EAnalysisType TMVA::MethodBase::fAnalysisType  method-mode: true --> regression, false --> classification
UInt_t TMVA::MethodBase::fBackgroundClass  index of the Background-class
vector<TString>* TMVA::MethodBase::fInputVars  vector of input variables used in MVA
vector<Float_t>* TMVA::MethodBase::fMulticlassReturnVal  holds the return-values for the multiclass classification
Int_t TMVA::MethodBase::fNbins  number of bins in input variable histograms
Int_t TMVA::MethodBase::fNbinsH  number of bins in evaluation histograms
Int_t TMVA::MethodBase::fNbinsMVAoutput  number of bins in MVA output histograms
TMVA::Ranking* TMVA::MethodBase::fRanking  pointer to ranking object (created by derived classifiers)
vector<Float_t>* TMVA::MethodBase::fRegressionReturnVal  holds the return-values for the regression
UInt_t TMVA::MethodBase::fSignalClass  index of the Signal-class
private:
Double_t fAdaBoostBeta  beta parameter for AdaBoost algorithm
TString fAdaBoostR2Loss  loss type used in AdaBoostR2 (Linear, Quadratic or Exponential)
Bool_t fAutomatic  use user-given prune strength or one automatically determined using a validation sample
Bool_t fBaggedBoost  turn bagging in combination with boost on/off
Bool_t fBaggedGradBoost  turn bagging in combination with grad boost on/off
Double_t fBaggedSampleFraction  relative size of bagged event sample to original sample size
TString fBoostType  string specifying the boost type
Double_t fBoostWeight  ntuple var: boost weight
vector<double> fBoostWeights  the weights applied in the individual boosts
Double_t fCbb  cost factor
Double_t fCss  cost factor
Double_t fCtb_ss  cost factor
Double_t fCts_sb  cost factor
Bool_t fDoBoostMonitor  create control plot with ROC integral vs tree number
Bool_t fDoPreselection  do or do not perform automatic pre-selection of 100% eff. cuts
Double_t fErrorFraction  ntuple var: misclassification error fraction
vector<const TMVA::Event*> fEventSample  the training events
Double_t fFValidationEvents  fraction of events to use for pruning
vector<TMVA::DecisionTree*> fForest  the collection of decision trees
vector<Double_t> fHighBkgCut
vector<Double_t> fHighSigCut
Bool_t fHistoricBool  historic variable, only needed for "CompatibilityOptions"
Int_t fITree  ntuple var: ith tree
Bool_t fInverseBoostNegWeights  boost events with negative weights with 1/boostweight rather than boostweight
vector<Bool_t> fIsHighBkgCut
vector<Bool_t> fIsHighSigCut
vector<Bool_t> fIsLowBkgCut
vector<Bool_t> fIsLowSigCut
vector<Double_t> fLowBkgCut
vector<Double_t> fLowSigCut
UInt_t fMaxDepth  max depth
Double_t fMinLinCorrForFisher  the minimum linear correlation between two variables demanded for use in the Fisher criterion in node splitting
Int_t fMinNodeEvents  min number of events in node
Float_t fMinNodeSize  min percentage of training events in node
TString fMinNodeSizeS  string containing min percentage of training events in node
TTree* fMonitorNtuple  monitoring ntuple
Int_t fNCuts  grid used in cut applied in node splitting
UInt_t fNNodesMax  max # of nodes
Int_t fNTrees  number of decision trees requested
TString fNegWeightTreatment  variable that holds the option of how to treat negative event weights in training
Bool_t fNoNegWeightsInTraining  ignore negative event weights in the training
Double_t fNodePurityLimit  purity limit for sig/bkg nodes
Bool_t fPairNegWeightsGlobal  pair events with negative and positive weights in the training sample and "annihilate" them
TMVA::DecisionTree::EPruneMethod fPruneMethod  method used for pruning
TString fPruneMethodS  prune method option string
Double_t fPruneStrength  a parameter to set the "amount" of pruning; needs to be adjusted
Bool_t fRandomisedTrees  choose a random subset of possible cut variables at each node during training
map<const TMVA::Event*,vector<double> > fResiduals  individual event residuals for gradient boost
TMVA::SeparationBase* fSepType  the separation used in node splitting
TString fSepTypeS  the separation (option string) used in node splitting
Double_t fShrinkage  learning rate for gradient boost
Double_t fSigToBkgFraction  signal to background fraction assumed during training
vector<const TMVA::Event*> fSubSample  subsample for bagged grad boost
Double_t fSumOfWeights  sum of all event weights
vector<const TMVA::Event*>* fTrainSample  pointer to sample actually used in training (fEventSample or fSubSample, for example)
Bool_t fTrainWithNegWeights  yes, there are negative event weights and we don't ignore them
Double_t fTransitionPoint  break-down point for gradient regression
Bool_t fUseExclusiveVars  individual variables already used in the Fisher criterion are not analysed individually again for node splitting
Bool_t fUseFisherCuts  use multivariate splits using the Fisher criterion
UInt_t fUseNTrainEvents  number of randomly picked training events used in randomised (and bagged) trees
UInt_t fUseNvars  the number of variables used in the randomised tree splitting
Bool_t fUsePoissonNvars  use "fUseNvars" not as a fixed number but as the mean of a Poisson distribution in each split
Bool_t fUseYesNoLeaf  use sig or bkg classification in leaf nodes, or the sig/bkg purity
vector<const TMVA::Event*> fValidationSample  the validation events
vector<Double_t> fVariableImportance  the relative importance of the different variables
map<const TMVA::Event*,pair<Double_t,Double_t> > fWeightedResiduals  weighted regression residuals
static const Int_t fgDebugLevel  debug level determining some printout/control plots etc.


Function documentation

MethodBDT(const TString& jobName, const TString& methodTitle, TMVA::DataSetInfo& theData, const TString& theOption = "", TDirectory* theTargetDir = 0)
 the standard constructor for the "boosted decision trees"
MethodBDT(TMVA::DataSetInfo& theData, const TString& theWeightFile, TDirectory* theTargetDir = 0)
Bool_t HasAnalysisType(TMVA::Types::EAnalysisType type, UInt_t numberClasses, UInt_t numberTargets)
 BDT can handle classification with multiple classes and regression with one regression-target
void DeclareOptions()
 define the options (their key words) that can be set in the option string
 known options:
 nTrees           number of trees in the forest to be created
 BoostType        the boosting type for the trees in the forest (AdaBoost etc.)
                  known: AdaBoost
                         AdaBoostR2 (AdaBoost for regression)
                         Bagging
                         GradBoost
 AdaBoostBeta     the boosting parameter, beta, for AdaBoost
 UseRandomisedTrees  choose at each node splitting a random set of variables
 UseNvars         use UseNvars variables in randomised trees
 UsePoissonNvars  use UseNvars not as a fixed number but as the mean of a Poisson distribution
 SeparationType   the separation criterion applied in the node splitting
                  known: GiniIndex
                         MisClassificationError
                         CrossEntropy
                         SDivSqrtSPlusB
 MinNodeSize      minimum percentage of training events in a leaf node (leaf criterion, stop splitting)
 nCuts            the number of steps in the optimisation of the cut for a node (if < 0, the
                  step size is determined by the events)
 UseFisherCuts    use multivariate splits using the Fisher criterion
 UseYesNoLeaf     decide if the classification is done simply by the node type, or by the S/B
                  (from the training) in the leaf node
 NodePurityLimit  the minimum purity to classify a node as a signal node (used in pruning and boosting to determine
                  the misclassification error rate)
 PruneMethod      the pruning method:
                  known: NoPruning  // switch off pruning completely
                         ExpectedError
                         CostComplexity
 PruneStrength    a parameter to adjust the amount of pruning. Should be large enough such that overtraining is avoided.
 PruningValFraction   fraction of events to use for optimizing pruning (only if PruneStrength < 0, i.e. automatic pruning)
 NegWeightTreatment      IgnoreNegWeightsInTraining  ignore negative weight events in the training
                         DecreaseBoostWeight         boost events with negative weight with 1/boostweight instead of boostweight
                         PairNegWeightsGlobal        pair events with negative and positive weights in the training sample and "annihilate" them
 MaxDepth         maximum depth of the decision tree allowed before further splitting is stopped
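
 For illustration, several of the options listed above combined into one option
 string as it would be passed to the booking call (the values are arbitrary
 example settings, not recommendations):

   factory.BookMethod(TMVA::Types::kBDT, "BDT",
      "NTrees=500:BoostType=AdaBoost:AdaBoostBeta=0.5:"
      "SeparationType=GiniIndex:nCuts=20:MaxDepth=3:MinNodeSize=2.5%:"
      "UseYesNoLeaf=True:PruneMethod=NoPruning");
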
void DeclareCompatibilityOptions()
 options that are used ONLY for the READER to ensure backward compatibility
void ProcessOptions()
 the option string is decoded, for available options see "DeclareOptions"
void SetMinNodeSize(Double_t sizeInPercent)
void SetMinNodeSize(TString sizeInPercent)
void Init( void )
 common initialisation with defaults for the BDT-Method
void Reset( void )
 reset the method, as if it had just been instantiated (forget all training etc.)
~MethodBDT( void )
destructor
   Note: fEventSample and fValidationSample are already deleted at the end of TRAIN,
         when they are not used anymore
   for (UInt_t i=0; i<fEventSample.size();      i++) delete fEventSample[i];
   for (UInt_t i=0; i<fValidationSample.size(); i++) delete fValidationSample[i];
void InitEventSample( void )
 initialize the event sample (i.e. reset the boost-weights... etc)
void PreProcessNegativeEventWeights()
 O.k., you know there are events with negative event weights. This routine will remove
 them by pairing them with the closest event(s) of the same event class with positive
 weights.
 A first attempt is "brute force"; I don't try to be clever using search trees etc.,
 just quick and dirty to see if the result is any good.
std::map<TString,Double_t> OptimizeTuningParameters(TString fomType = "ROCIntegral", TString fitType = "FitGA")
 call the Optimizer with the set of parameters and ranges that
 are meant to be tuned.
void SetTuneParameters(map<TString,Double_t> tuneParameters)
 set the tuning parameters according to the argument
void Train()
 BDT training
Double_t GetGradBoostMVA(const TMVA::Event* e, UInt_t nTrees)
returns MVA value: -1 for background, 1 for signal
void UpdateTargets(vector<const TMVA::Event*>& , UInt_t cls = 0)
Calculate residuals for all events.
void UpdateTargetsRegression(vector<const TMVA::Event*>& , Bool_t first = kFALSE)
Calculate current residuals for all events and update targets for next iteration
Double_t GetWeightedQuantile(vector<pair<Double_t,Double_t> > vec, const Double_t quantile, const Double_t SumOfWeights = 0.)
calculates the quantile of the distribution of the first pair entries weighted with the values in the second pair entries
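 A sketch of such a weighted quantile (sort by value, then walk up the
 cumulative weight until it reaches quantile * total weight); an illustration,
 not the exact TMVA code:

   #include <algorithm>
   #include <utility>
   #include <vector>

   // Sketch: weighted quantile of (value, weight) pairs.
   double weightedQuantile(std::vector<std::pair<double,double> > vec, double quantile)
   {
      std::sort(vec.begin(), vec.end());            // sorts by .first (the value)
      double total = 0;
      for (const auto& p : vec) total += p.second;
      double running = 0;
      for (const auto& p : vec) {
         running += p.second;
         if (running >= quantile * total) return p.first;
      }
      return vec.empty() ? 0 : vec.back().first;
   }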
Double_t GradBoost(vector<const TMVA::Event*>& , TMVA::DecisionTree* dt, UInt_t cls = 0)
Calculate the desired response value for each region
Double_t GradBoostRegression(vector<const TMVA::Event*>& , TMVA::DecisionTree* dt)
 Implementation of M_TreeBoost using a Huber loss function as described by Friedman (1999)
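 The Huber loss used here is quadratic for small residuals and linear beyond a
 transition point delta (cf. fTransitionPoint), roughly:

    L(y,F) = 1/2 (y-F)^2               for |y-F| <= delta
           = delta*(|y-F| - delta/2)   for |y-F| >  delta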
void InitGradBoost(vector<const TMVA::Event*>& )
 initialize targets for first tree
Double_t TestTreeQuality(TMVA::DecisionTree* dt)
 test the tree quality, in terms of misclassification
Double_t Boost(vector<const TMVA::Event*>& , TMVA::DecisionTree* dt, UInt_t cls = 0)
 apply the boosting algorithm (the algorithm is selected via the "option" given
 in the constructor). The return value is the boosting weight.
void BoostMonitor(Int_t iTree)
 fills the ROC integral vs. tree number from the test sample for the monitoring
 plots during the training, using the testing events
Double_t AdaBoost(vector<const TMVA::Event*>& , TMVA::DecisionTree* dt)
 the AdaBoost implementation.
 A new training sample is generated by weighting
 events that are misclassified by the decision tree. The weight
 applied is w = (1-err)/err, or more generally:
            w = ((1-err)/err)^beta
 where err is the fraction of misclassified events in the tree (err < 0.5,
 assuming that the previous selection was better than random guessing)
 and "beta" is a free parameter (standard: beta = 1) that modifies the
 boosting.
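 For example, with err = 0.2 and beta = 1, each misclassified event gets its
 weight multiplied by w = 0.8/0.2 = 4 before the next tree is trained.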
Double_t AdaCost(vector<const TMVA::Event*>& , TMVA::DecisionTree* dt)
 the AdaCost boosting algorithm takes a simple cost matrix (currently fixed for
 all events... later it could be modified to use individual cost matrices for
 each event, as in the original paper):

                   true_signal  true_bkg

     sel_signal |   Css          Ctb_ss    Cxx.. in the range [0,1]
     sel_bkg    |   Cts_sb       Cbb

    and takes this into account when calculating the misclassification cost
    (formerly: error fraction):

    err = sum_events ( weight * y_true * y_sel * beta(event) )

Double_t Bagging()
 call it bootstrapping, re-sampling or whatever you like; in the end it is nothing
 else but applying "random" Poisson weights to each event.
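 A sketch of that re-sampling step using ROOT's TRandom3; the "fraction"
 parameter plays the role of fBaggedSampleFraction (an illustration, not the
 TMVA code itself):

   #include <vector>
   #include "TRandom3.h"

   // Sketch: "bagging" as Poisson re-weighting. Each event enters the bagged
   // sample with a random multiplicity drawn from Poisson(fraction).
   template <typename Event>
   std::vector<const Event*> baggedSample(const std::vector<const Event*>& sample,
                                          double fraction, TRandom3& rng)
   {
      std::vector<const Event*> sub;
      for (const Event* ev : sample) {
         int n = rng.Poisson(fraction);   // multiplicity for this event
         for (int i = 0; i < n; ++i) sub.push_back(ev);
      }
      return sub;
   }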
void GetBaggedSubSample(vector<const TMVA::Event*>& )
 fills fEventSample with fBaggedSampleFraction*NEvents random training events
Double_t RegBoost(vector<const TMVA::Event*>& , TMVA::DecisionTree* dt)
 a special boosting only for Regression ...
 maybe I'll implement it later...
Double_t AdaBoostR2(vector<const TMVA::Event*>& , TMVA::DecisionTree* dt)
 adaptation of AdaBoost to regression problems (see H. Drucker, 1997)
void AddWeightsXMLTo(void* parent) const
 write weights to XML
void ReadWeightsFromXML(void* parent)
 read the BDT from the XML file
void ReadWeightsFromStream(istream& istr)
 read the weights (BDT coefficients)
Double_t GetMvaValue(Double_t* err = 0, Double_t* errUpper = 0)
Double_t GetMvaValue(Double_t* err, Double_t* errUpper, UInt_t useNTrees)
 Return the MVA value (range [-1;1]) that classifies the
 event according to the majority vote from the total number of
 decision trees.
Double_t PrivateGetMvaValue(const TMVA::Event* ev, Double_t* err = 0, Double_t* errUpper = 0, UInt_t useNTrees = 0)
 Return the MVA value (range [-1;1]) that classifies the
 event according to the majority vote from the total number of
 decision trees.
const std::vector<Float_t>& GetMulticlassValues()
 get the multiclass MVA response for the BDT classifier
const std::vector<Float_t> & GetRegressionValues()
 get the regression value generated by the BDTs
void WriteMonitoringHistosToFile( void )
 Here we could write some histograms created during the processing
 to the output file.
vector< Double_t > GetVariableImportance()
 Return the relative variable importance, normalized so that all
 variables together have an importance of 1. The importance is
 evaluated as the total separation gain that this variable achieved in
 the decision trees (weighted by the number of events).
Double_t GetVariableImportance(UInt_t ivar)
 Returns the measure for the variable importance of variable "ivar"
 which is later used in GetVariableImportance() to calculate the
 relative variable importances.
const TMVA::Ranking* CreateRanking()
 Compute ranking of input variables
void GetHelpMessage() const
 Get help message text

 typical length of text line:
         "|--------------------------------------------------------------|"
void MakeClassSpecific(ostream& , const TString& ) const
 make ROOT-independent C++ class for classifier response (classifier-specific implementation)
void MakeClassSpecificHeader(ostream& , const TString& ) const
 specific class header
void MakeClassInstantiateNode(TMVA::DecisionTreeNode* n, ostream& fout, const TString& className) const
 recursively descends a tree and writes the node instance to the output stream
void DeterminePreselectionCuts(const vector<const TMVA::Event*>& eventSample)
 find useful preselection cuts that will be applied before
 any decision tree training (and of course are also applied
 in GetMvaValue(): --> -1 for background, +1 for signal)
Double_t ApplyPreselectionCuts(const TMVA::Event* ev)
 apply the preselection cuts before even bothering about any
 decision trees in GetMvaValue(): --> -1 for background, +1 for signal
const std::vector<TMVA::DecisionTree*>& GetForest() const
{ return fForest; }
const std::vector<const TMVA::Event*> & GetTrainingEvents() const
{ return fEventSample; }
const std::vector<double>& GetBoostWeights() const
{ return fBoostWeights; }
MethodBDT(const TString& jobName, const TString& methodTitle, TMVA::DataSetInfo& theData, const TString& theOption = "", TDirectory* theTargetDir = 0)
 constructor for training and reading
UInt_t GetNTrees() const
 get the actual forest size (might be less than fNTrees, the requested number, if boosting is stopped early)
{return fForest.size();}
void SetMaxDepth(Int_t d)
{fMaxDepth = d;}
void SetNTrees(Int_t d)
{fNTrees = d;}
void SetAdaBoostBeta(Double_t b)
void SetNodePurityLimit(Double_t l)
void SetShrinkage(Double_t s)
{fShrinkage = s;}
void SetUseNvars(Int_t n)
{fUseNvars = n;}
void SetBaggedSampleFraction(Double_t f)