ROOT Reference Guide

TMVA::MethodBDT Class Reference

Analysis of Boosted Decision Trees.

Boosted decision trees have been used successfully in high-energy physics analyses, for example by the MiniBooNE experiment (Yang-Roe-Zhu, physics/0508045). In boosted decision trees, the selection is based on a majority vote over the results of several decision trees, all of which are derived from the same training sample by supplying different event weights during training.

Decision trees:

Successive decision nodes are used to categorize the events in the sample as either signal or background. Each node uses only a single discriminating variable to decide whether the event is signal-like ("goes right") or background-like ("goes left"). This forms a tree-like structure with "baskets" at the end (leaf nodes), and an event is classified as signal or background according to whether the basket in which it ends up was classified as signal or background during training.

Training a decision tree is the process of defining the "cut criteria" for each node. The training starts at the root node. Here one takes the full training event sample and selects the variable and corresponding cut value that give the best separation between signal and background at this stage. Using this cut criterion, the sample is then divided into two subsamples, a signal-like (right) and a background-like (left) sample. Two new nodes are then created, one for each subsample, and they are constructed using the same mechanism as described for the root node. The division stops once a node has reached either a minimum number of events or a minimum or maximum signal purity. These leaf nodes are then called "signal" or "background" according to whether they contain more signal or more background events from the training sample.
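
The cut search at a single node can be sketched as follows. This is a minimal illustration and not the TMVA implementation: the `Event` and `Cut` structs and the function names are hypothetical, only one variable is scanned, and the Gini index is assumed as the separation measure.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal event: one discriminating variable, a weight and a truth label.
struct Event {
    double x;
    double weight;
    bool isSignal;
};

// Result of the scan: the chosen cut value and its separation gain.
struct Cut {
    double value;
    double gain;
};

// Gini index p(1-p) multiplied by the node weight n = s + b, i.e. s*b/n.
double WeightedGini(double s, double b) {
    const double n = s + b;
    return n > 0.0 ? s * b / n : 0.0;
}

// Scan the candidate cut values and keep the one with the largest separation gain.
// Events with x > cut "go right" (signal-like), the others "go left".
// If no candidate is given, the returned gain stays at -1.
Cut FindBestCut(const std::vector<Event>& events, const std::vector<double>& candidates) {
    assert(!events.empty());
    double sTot = 0.0, bTot = 0.0;
    for (const Event& e : events) {
        if (e.isSignal) sTot += e.weight; else bTot += e.weight;
    }
    const double parent = WeightedGini(sTot, bTot);

    Cut best{0.0, -1.0};
    for (double c : candidates) {
        double sRight = 0.0, bRight = 0.0;
        for (const Event& e : events) {
            if (e.x > c) {
                if (e.isSignal) sRight += e.weight; else bRight += e.weight;
            }
        }
        // Gain = impurity of the parent minus the summed impurity of both daughters.
        const double gain = parent - WeightedGini(sRight, bRight)
                                   - WeightedGini(sTot - sRight, bTot - bRight);
        if (gain > best.gain) best = {c, gain};
    }
    return best;
}
```

A full tree would repeat this scan over all variables at each node and recurse into the two subsamples until one of the stopping criteria described above is met.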

Boosting:

The idea behind adaptive boosting (AdaBoost) is that signal events from the training sample that end up in a background node (and vice versa) are given a larger weight than events that are in the correct leaf node. This results in a re-weighted training event sample, from which a new decision tree can then be developed. The boosting can be applied several times (typically 100-500 times), and one ends up with a set of decision trees (a forest).

Gradient boosting works more like a function expansion approach, where each tree corresponds to a summand. The parameters for each summand (tree) are determined by minimizing an error function (binomial log-likelihood for classification and Huber loss for regression). A greedy algorithm is used, which means that only one tree is modified at a time while the other trees stay fixed.
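
The AdaBoost re-weighting step can be sketched as below. This follows the textbook AdaBoost formulation, in which misclassified events are scaled by ((1 - err) / err)^beta; the function name, the `beta` parameter and the renormalisation to the original sum of weights are assumptions made for illustration, and the actual TMVA code may differ in detail.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Re-weight the training events after one tree has been trained.
// misclassified[i] is true if event i ended up in a leaf of the wrong type.
// Returns the boost factor that was applied to the misclassified events.
double AdaBoostReweight(std::vector<double>& weights,
                        const std::vector<bool>& misclassified,
                        double beta = 1.0) {
    assert(weights.size() == misclassified.size());

    double total = 0.0, wrong = 0.0;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        total += weights[i];
        if (misclassified[i]) wrong += weights[i];
    }
    const double err = wrong / total;  // weighted misclassification rate
    assert(err > 0.0 && err < 0.5);    // the tree must be better than random guessing

    const double boost = std::pow((1.0 - err) / err, beta);

    // Increase the weight of the misclassified events ...
    double newTotal = 0.0;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        if (misclassified[i]) weights[i] *= boost;
        newTotal += weights[i];
    }
    // ... and rescale so that the sum of weights is unchanged.
    for (double& w : weights) w *= total / newTotal;

    return boost;
}
```

With four unit-weight events of which one is misclassified, the error rate is 0.25, the boost factor is 3, and after renormalisation the misclassified event carries half of the total weight.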

Bagging:

In this particular variant of the boosted decision trees, the boosting is not based on previous training results but on a simple stochastic re-sampling of the initial training event sample.
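
One way to picture the re-sampling is through random Poisson weights, which is how the description of the private `Bagging()` member on this page characterises it. The sketch below is an illustration only: the function name, the use of the sample fraction as the Poisson mean and the explicit seed are assumptions.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Multiply each event weight by an independent Poisson-distributed integer.
// A draw of 0 removes the event from this tree's sample, a draw of 2 counts it twice.
std::vector<double> PoissonBaggedWeights(const std::vector<double>& weights,
                                         double sampleFraction,
                                         unsigned seed) {
    assert(sampleFraction > 0.0);
    std::mt19937 rng(seed);
    std::poisson_distribution<int> poisson(sampleFraction);

    std::vector<double> bagged;
    bagged.reserve(weights.size());
    for (double w : weights) {
        bagged.push_back(w * poisson(rng));
    }
    return bagged;
}
```

Each tree would be trained with a fresh set of such weights, so that the trees differ even though they all come from the same training sample.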

Random Trees:

Similar to the "Random Forests" of Leo Breiman and Adele Cutler, this variant uses the bagging algorithm and bases the determination of the best node split during training on a random subset of variables only, which is chosen individually for each split.
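
Choosing the random subset of variables for one split can be sketched as follows. The function name and its parameters are hypothetical; the sketch simply shuffles the variable indices and keeps the first few.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Return useNvars distinct variable indices drawn at random from [0, nVars).
// The function is meant to be called once per node split.
std::vector<std::size_t> RandomVariableSubset(std::size_t nVars,
                                              std::size_t useNvars,
                                              std::mt19937& rng) {
    assert(useNvars <= nVars);
    std::vector<std::size_t> indices(nVars);
    std::iota(indices.begin(), indices.end(), 0);   // 0, 1, ..., nVars-1
    std::shuffle(indices.begin(), indices.end(), rng);
    indices.resize(useNvars);                       // keep the first useNvars entries
    return indices;
}
```

Only the variables returned by this function would then be scanned for the best cut at that node.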

Analysis:

Applying an individual decision tree to a test event results in a classification of the event as either signal or background. For the boosted decision tree selection, an event is successively subjected to the whole set of decision trees, and depending on how often it is classified as signal, a "likelihood" estimator is constructed for the event being signal or background. The value of this estimator is then used to select events from an event sample, and the cut value on this estimator defines the efficiency and purity of the selection.
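
A weighted majority vote of this kind can be sketched as below. The function names are hypothetical, each tree is assumed to vote +1 for signal and -1 for background, and the per-tree weights stand in for whatever boost weights the training produced; the result lies in the range [-1, 1].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Combine the votes of all trees into a single estimator in [-1, 1].
// votes[i] is +1 if tree i classifies the event as signal, -1 otherwise.
double ForestResponse(const std::vector<int>& votes,
                      const std::vector<double>& treeWeights) {
    assert(!votes.empty() && votes.size() == treeWeights.size());
    double sum = 0.0, norm = 0.0;
    for (std::size_t i = 0; i < votes.size(); ++i) {
        sum += treeWeights[i] * votes[i];
        norm += treeWeights[i];
    }
    return sum / norm;
}

// An event is selected as signal if its estimator value exceeds the chosen cut.
bool IsSelected(double response, double cut) {
    return response > cut;
}
```

Raising the cut in `IsSelected` generally trades signal efficiency for purity, which is the trade-off described in the paragraph above.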

Definition at line 63 of file MethodBDT.h.

Public Member Functions

 MethodBDT (const TString &jobName, const TString &methodTitle, DataSetInfo &theData, const TString &theOption="")
 The standard constructor for the "boosted decision trees".
 
 MethodBDT (DataSetInfo &theData, const TString &theWeightFile)
 
virtual ~MethodBDT (void)
 Destructor.
 
void AddWeightsXMLTo (void *parent) const
 Write weights to XML.
 
Double_t Boost (std::vector< const TMVA::Event * > &, DecisionTree *dt, UInt_t cls=0)
 Apply the boosting algorithm (the algorithm is selected via the "option" given in the constructor).
 
const Ranking * CreateRanking ()
 Compute ranking of input variables.
 
void DeclareOptions ()
 Define the options (their key words).
 
const std::vector< double > & GetBoostWeights () const
 
const std::vector< TMVA::DecisionTree * > & GetForest () const
 
void GetHelpMessage () const
 Get help message text.
 
const std::vector< Float_t > & GetMulticlassValues ()
 Get the multiclass MVA response for the BDT classifier.
 
Double_t GetMvaValue (Double_t *err=nullptr, Double_t *errUpper=nullptr)
 
UInt_t GetNTrees () const
 
const std::vector< Float_t > & GetRegressionValues ()
 Get the regression value generated by the BDTs.
 
const std::vector< const TMVA::Event * > & GetTrainingEvents () const
 
std::vector< Double_t > GetVariableImportance ()
 Return the relative variable importance, normalized to all variables together having the importance 1.
 
Double_t GetVariableImportance (UInt_t ivar)
 Returns the measure for the variable importance of variable "ivar" which is later used in GetVariableImportance() to calculate the relative variable importances.
 
virtual Bool_t HasAnalysisType (Types::EAnalysisType type, UInt_t numberClasses, UInt_t numberTargets)
 BDT can handle classification with multiple classes and regression with one regression-target.
 
void InitEventSample ()
 Initialize the event sample (i.e. reset the boost-weights... etc).
 
virtual TClass * IsA () const
 
void MakeClassInstantiateNode (DecisionTreeNode *n, std::ostream &fout, const TString &className) const
 Recursively descends a tree and writes the node instance to the output stream.
 
void MakeClassSpecific (std::ostream &, const TString &) const
 Make ROOT-independent C++ class for classifier response (classifier-specific implementation).
 
void MakeClassSpecificHeader (std::ostream &, const TString &) const
 Specific class header.
 
virtual std::map< TString, Double_t > OptimizeTuningParameters (TString fomType="ROCIntegral", TString fitType="FitGA")
 Call the Optimizer with the set of parameters and ranges that are meant to be tuned.
 
void ProcessOptions ()
 The option string is decoded, for available options see "DeclareOptions".
 
virtual void ReadWeightsFromStream (std::istream &)=0
 
void ReadWeightsFromStream (std::istream &istr)
 Read the weights (BDT coefficients).
 
virtual void ReadWeightsFromStream (TFile &)
 
void ReadWeightsFromXML (void *parent)
 Reads the BDT from the xml file.
 
void Reset (void)
 Reset the method, as if it had just been instantiated (forget all training etc.).
 
void SetAdaBoostBeta (Double_t b)
 
void SetBaggedSampleFraction (Double_t f)
 
void SetMaxDepth (Int_t d)
 
void SetMinNodeSize (Double_t sizeInPercent)
 
void SetMinNodeSize (TString sizeInPercent)
 
void SetNodePurityLimit (Double_t l)
 
void SetNTrees (Int_t d)
 
void SetShrinkage (Double_t s)
 
virtual void SetTuneParameters (std::map< TString, Double_t > tuneParameters)
 Set the tuning parameters according to the argument.
 
void SetUseNvars (Int_t n)
 
virtual void Streamer (TBuffer &)
 
void StreamerNVirtual (TBuffer &ClassDef_StreamerNVirtual_b)
 
Double_t TestTreeQuality (DecisionTree *dt)
 Test the tree quality in terms of misclassification.
 
void Train (void)
 BDT training.
 
void WriteMonitoringHistosToFile (void) const
 Here we could write some histograms created during the processing to the output file.
 
- Public Member Functions inherited from TMVA::MethodBase
 MethodBase (const TString &jobName, Types::EMVA methodType, const TString &methodTitle, DataSetInfo &dsi, const TString &theOption="")
 standard constructor
 
 MethodBase (Types::EMVA methodType, DataSetInfo &dsi, const TString &weightFile)
 constructor used for Testing + Application of the MVA, only (no training), using given WeightFiles
 
virtual ~MethodBase ()
 destructor
 
void AddOutput (Types::ETreeType type, Types::EAnalysisType analysisType)
 
TDirectory * BaseDir () const
 returns the ROOT directory where info/histograms etc of the corresponding MVA method instance are stored
 
virtual void CheckSetup ()
 check may be overridden by a derived class (sometimes, e.g., fitters are used which can only be implemented during the training phase)
 
DataSet * Data () const
 
DataSetInfo & DataInfo () const
 
void DisableWriting (Bool_t setter)
 
Bool_t DoMulticlass () const
 
Bool_t DoRegression () const
 
void ExitFromTraining ()
 
Types::EAnalysisType GetAnalysisType () const
 
UInt_t GetCurrentIter ()
 
virtual Double_t GetEfficiency (const TString &, Types::ETreeType, Double_t &err)
 fill background efficiency (resp.
 
const Event * GetEvent () const
 
const Event * GetEvent (const TMVA::Event *ev) const
 
const Event * GetEvent (Long64_t ievt) const
 
const Event * GetEvent (Long64_t ievt, Types::ETreeType type) const
 
const std::vector< TMVA::Event * > & GetEventCollection (Types::ETreeType type)
 returns the event collection (i.e.
 
TFile * GetFile () const
 
const TString & GetInputLabel (Int_t i) const
 
const char * GetInputTitle (Int_t i) const
 
const TString & GetInputVar (Int_t i) const
 
TMultiGraph * GetInteractiveTrainingError ()
 
const TString & GetJobName () const
 
virtual Double_t GetKSTrainingVsTest (Char_t SorB, TString opt="X")
 
virtual Double_t GetMaximumSignificance (Double_t SignalEvents, Double_t BackgroundEvents, Double_t &optimal_significance_value) const
 plot the significance curve, \( \frac{S}{\sqrt{S^2 + B^2}} \), for the given numbers of signal and background events; returns the cut for maximum significance, while the maximum significance itself is returned via reference
 
UInt_t GetMaxIter ()
 
Double_t GetMean (Int_t ivar) const
 
const TString & GetMethodName () const
 
Types::EMVA GetMethodType () const
 
TString GetMethodTypeName () const
 
virtual TMatrixD GetMulticlassConfusionMatrix (Double_t effB, Types::ETreeType type)
 Construct a confusion matrix for a multiclass classifier.
 
virtual std::vector< Float_t > GetMulticlassEfficiency (std::vector< std::vector< Float_t > > &purity)
 
virtual std::vector< Float_t > GetMulticlassTrainingEfficiency (std::vector< std::vector< Float_t > > &purity)
 
Double_t GetMvaValue (const TMVA::Event *const ev, Double_t *err=nullptr, Double_t *errUpper=nullptr)
 
const char * GetName () const
 
UInt_t GetNEvents () const
 
UInt_t GetNTargets () const
 
UInt_t GetNvar () const
 
UInt_t GetNVariables () const
 
virtual Double_t GetProba (const Event *ev)
 
virtual Double_t GetProba (Double_t mvaVal, Double_t ap_sig)
 compute likelihood ratio
 
const TString GetProbaName () const
 
virtual Double_t GetRarity (Double_t mvaVal, Types::ESBType reftype=Types::kBackground) const
 compute rarity:
 
virtual void GetRegressionDeviation (UInt_t tgtNum, Types::ETreeType type, Double_t &stddev, Double_t &stddev90Percent) const
 
const std::vector< Float_t > & GetRegressionValues (const TMVA::Event *const ev)
 
Double_t GetRMS (Int_t ivar) const
 
virtual Double_t GetROCIntegral (PDF *pdfS=nullptr, PDF *pdfB=nullptr) const
 calculate the area (integral) under the ROC curve as an overall quality measure of the classification
 
virtual Double_t GetROCIntegral (TH1D *histS, TH1D *histB) const
 calculate the area (integral) under the ROC curve as an overall quality measure of the classification
 
virtual Double_t GetSeparation (PDF *pdfS=nullptr, PDF *pdfB=nullptr) const
 compute "separation" defined as
 
virtual Double_t GetSeparation (TH1 *, TH1 *) const
 compute "separation" defined as
 
Double_t GetSignalReferenceCut () const
 
Double_t GetSignalReferenceCutOrientation () const
 
virtual Double_t GetSignificance () const
 compute significance of mean difference
 
const Event * GetTestingEvent (Long64_t ievt) const
 
Double_t GetTestTime () const
 
const TString & GetTestvarName () const
 
virtual Double_t GetTrainingEfficiency (const TString &)
 
const Event * GetTrainingEvent (Long64_t ievt) const
 
virtual const std::vector< Float_t > & GetTrainingHistory (const char *)
 
UInt_t GetTrainingROOTVersionCode () const
 
TString GetTrainingROOTVersionString () const
 calculates the ROOT version string from the training version code on the fly
 
UInt_t GetTrainingTMVAVersionCode () const
 
TString GetTrainingTMVAVersionString () const
 calculates the TMVA version string from the training version code on the fly
 
Double_t GetTrainTime () const
 
TransformationHandler & GetTransformationHandler (Bool_t takeReroutedIfAvailable=true)
 
const TransformationHandler & GetTransformationHandler (Bool_t takeReroutedIfAvailable=true) const
 
TString GetWeightFileName () const
 retrieve weight file name
 
Double_t GetXmax (Int_t ivar) const
 
Double_t GetXmin (Int_t ivar) const
 
Bool_t HasMVAPdfs () const
 
void InitIPythonInteractive ()
 
Bool_t IsModelPersistence () const
 
virtual Bool_t IsSignalLike ()
 uses a pre-set cut on the MVA output (SetSignalReferenceCut and SetSignalReferenceCutOrientation) for a quick determination if an event would be selected as signal or background
 
virtual Bool_t IsSignalLike (Double_t mvaVal)
 uses a pre-set cut on the MVA output (SetSignalReferenceCut and SetSignalReferenceCutOrientation) for a quick determination if an event with this mva output value would be selected as signal or background
 
Bool_t IsSilentFile () const
 
virtual void MakeClass (const TString &classFileName=TString("")) const
 create reader class for method (classification only at present)
 
TDirectory * MethodBaseDir () const
 returns the ROOT directory where all instances of the corresponding MVA method are stored
 
void PrintHelpMessage () const
 prints out the method-specific help message
 
void ProcessSetup ()
 process all options; the "CheckForUnusedOptions" is done in an independent call, since it may be overridden by a derived class (sometimes, e.g., fitters are used which can only be implemented during the training phase)
 
void ReadStateFromFile ()
 Function to read options and weights from file.
 
void ReadStateFromStream (std::istream &tf)
 read the header from the weight files of the different MVA methods
 
void ReadStateFromStream (TFile &rf)
 read reference MVA distributions (and other information) from a ROOT type weight file
 
void ReadStateFromXMLString (const char *xmlstr)
 for reading from memory
 
void RerouteTransformationHandler (TransformationHandler *fTargetTransformation)
 
virtual void SetAnalysisType (Types::EAnalysisType type)
 
void SetBaseDir (TDirectory *methodDir)
 
void SetFile (TFile *file)
 
void SetMethodBaseDir (TDirectory *methodDir)
 
void SetMethodDir (TDirectory *methodDir)
 
void SetModelPersistence (Bool_t status)
 
void SetSignalReferenceCut (Double_t cut)
 
void SetSignalReferenceCutOrientation (Double_t cutOrientation)
 
void SetSilentFile (Bool_t status)
 
void SetTestTime (Double_t testTime)
 
void SetTestvarName (const TString &v="")
 
void SetTrainTime (Double_t trainTime)
 
void SetupMethod ()
 setup of methods
 
void StreamerNVirtual (TBuffer &ClassDef_StreamerNVirtual_b)
 
virtual void TestClassification ()
 initialization
 
virtual void TestMulticlass ()
 test multiclass classification
 
virtual void TestRegression (Double_t &bias, Double_t &biasT, Double_t &dev, Double_t &devT, Double_t &rms, Double_t &rmsT, Double_t &mInf, Double_t &mInfT, Double_t &corr, Types::ETreeType type)
 calculate <sum-of-deviation-squared> of regression output versus "true" value from test sample
 
bool TrainingEnded ()
 
void TrainMethod ()
 
virtual void WriteEvaluationHistosToFile (Types::ETreeType treetype)
 writes all MVA evaluation histograms to file
 
void WriteStateToFile () const
 write options and weights to file; note that one text file for the main configuration information and one ROOT file for ROOT objects are created
 
- Public Member Functions inherited from TMVA::IMethod
 IMethod ()
 
virtual ~IMethod ()
 
void StreamerNVirtual (TBuffer &ClassDef_StreamerNVirtual_b)
 
- Public Member Functions inherited from TMVA::Configurable
 Configurable (const TString &theOption="")
 constructor
 
virtual ~Configurable ()
 default destructor
 
void AddOptionsXMLTo (void *parent) const
 write options to XML file
 
template<class T >
void AddPreDefVal (const T &)
 
template<class T >
void AddPreDefVal (const TString &optname, const T &)
 
void CheckForUnusedOptions () const
 checks for unused options in option string
 
template<class T >
TMVA::OptionBase * DeclareOptionRef (T &ref, const TString &name, const TString &desc)
 
template<class T >
OptionBase * DeclareOptionRef (T &ref, const TString &name, const TString &desc="")
 
template<class T >
TMVA::OptionBase * DeclareOptionRef (T *&ref, Int_t size, const TString &name, const TString &desc)
 
template<class T >
OptionBase * DeclareOptionRef (T *&ref, Int_t size, const TString &name, const TString &desc="")
 
const char * GetConfigDescription () const
 
const char * GetConfigName () const
 
const TString & GetOptions () const
 
MsgLogger & Log () const
 
virtual void ParseOptions ()
 options parser
 
void PrintOptions () const
 prints out the options set in the options string and the defaults
 
void ReadOptionsFromStream (std::istream &istr)
 read option back from the weight file
 
void ReadOptionsFromXML (void *node)
 
void SetConfigDescription (const char *d)
 
void SetConfigName (const char *n)
 
void SetMsgType (EMsgType t)
 
void SetOptions (const TString &s)
 
void StreamerNVirtual (TBuffer &ClassDef_StreamerNVirtual_b)
 
void WriteOptionsToStream (std::ostream &o, const TString &prefix) const
 write options to output stream (e.g. in writing the MVA weight files)
 
- Public Member Functions inherited from TNamed
 TNamed ()
 
 TNamed (const char *name, const char *title)
 
 TNamed (const TNamed &named)
 TNamed copy ctor.
 
 TNamed (const TString &name, const TString &title)
 
virtual ~TNamed ()
 TNamed destructor.
 
void Clear (Option_t *option="") override
 Set name and title to empty strings ("").
 
TObject * Clone (const char *newname="") const override
 Make a clone of an object using the Streamer facility.
 
Int_t Compare (const TObject *obj) const override
 Compare two TNamed objects.
 
void Copy (TObject &named) const override
 Copy this to obj.
 
virtual void FillBuffer (char *&buffer)
 Encode TNamed into output buffer.
 
const char * GetName () const override
 Returns name of object.
 
const char * GetTitle () const override
 Returns title of object.
 
ULong_t Hash () const override
 Return hash value for this object.
 
TClass * IsA () const override
 
Bool_t IsSortable () const override
 
void ls (Option_t *option="") const override
 List TNamed name and title.
 
TNamed & operator= (const TNamed &rhs)
 TNamed assignment operator.
 
void Print (Option_t *option="") const override
 Print TNamed name and title.
 
virtual void SetName (const char *name)
 Set the name of the TNamed.
 
virtual void SetNameTitle (const char *name, const char *title)
 Set all the TNamed parameters (name and title).
 
virtual void SetTitle (const char *title="")
 Set the title of the TNamed.
 
virtual Int_t Sizeof () const
 Return size of the TNamed part of the TObject.
 
void Streamer (TBuffer &) override
 Stream an object of class TObject.
 
void StreamerNVirtual (TBuffer &ClassDef_StreamerNVirtual_b)
 
- Public Member Functions inherited from TObject
 TObject ()
 TObject constructor.
 
 TObject (const TObject &object)
 TObject copy ctor.
 
virtual ~TObject ()
 TObject destructor.
 
void AbstractMethod (const char *method) const
 Use this method to implement an "abstract" method that you don't want to leave purely abstract.
 
virtual void AppendPad (Option_t *option="")
 Append graphics object to current pad.
 
virtual void Browse (TBrowser *b)
 Browse object. May be overridden for another default action.
 
ULong_t CheckedHash ()
 Check and record whether this class has a consistent Hash/RecursiveRemove setup (*) and then return the regular Hash value for this object.
 
virtual const char * ClassName () const
 Returns name of class to which the object belongs.
 
virtual void Delete (Option_t *option="")
 Delete this object.
 
virtual Int_t DistancetoPrimitive (Int_t px, Int_t py)
 Computes distance from point (px,py) to the object.
 
virtual void Draw (Option_t *option="")
 Default Draw method for all objects.
 
virtual void DrawClass () const
 Draw class inheritance tree of the class to which this object belongs.
 
virtual TObject * DrawClone (Option_t *option="") const
 Draw a clone of this object in the current selected pad with: gROOT->SetSelectedPad(c1).
 
virtual void Dump () const
 Dump contents of object on stdout.
 
virtual void Error (const char *method, const char *msgfmt,...) const
 Issue error message.
 
virtual void Execute (const char *method, const char *params, Int_t *error=nullptr)
 Execute method on this object with the given parameter string, e.g.
 
virtual void Execute (TMethod *method, TObjArray *params, Int_t *error=nullptr)
 Execute method on this object with parameters stored in the TObjArray.
 
virtual void ExecuteEvent (Int_t event, Int_t px, Int_t py)
 Execute action corresponding to an event at (px,py).
 
virtual void Fatal (const char *method, const char *msgfmt,...) const
 Issue fatal error message.
 
virtual TObject * FindObject (const char *name) const
 Must be redefined in derived classes.
 
virtual TObject * FindObject (const TObject *obj) const
 Must be redefined in derived classes.
 
virtual Option_t * GetDrawOption () const
 Get option used by the graphics system to draw this object.
 
virtual const char * GetIconName () const
 Returns mime type name of object.
 
virtual char * GetObjectInfo (Int_t px, Int_t py) const
 Returns string containing info about the object at position (px,py).
 
virtual Option_t * GetOption () const
 
virtual UInt_t GetUniqueID () const
 Return the unique object id.
 
virtual Bool_t HandleTimer (TTimer *timer)
 Execute action in response of a timer timing out.
 
Bool_t HasInconsistentHash () const
 Return true if the type of this object is known to have an inconsistent setup for Hash and RecursiveRemove (i.e.
 
virtual void Info (const char *method, const char *msgfmt,...) const
 Issue info message.
 
virtual Bool_t InheritsFrom (const char *classname) const
 Returns kTRUE if object inherits from class "classname".
 
virtual Bool_t InheritsFrom (const TClass *cl) const
 Returns kTRUE if object inherits from TClass cl.
 
virtual void Inspect () const
 Dump contents of this object in a graphics canvas.
 
void InvertBit (UInt_t f)
 
Bool_t IsDestructed () const
 IsDestructed.
 
virtual Bool_t IsEqual (const TObject *obj) const
 Default equal comparison (objects are equal if they have the same address in memory).
 
virtual Bool_t IsFolder () const
 Returns kTRUE in case object contains browsable objects (like containers or lists of other objects).
 
R__ALWAYS_INLINE Bool_t IsOnHeap () const
 
R__ALWAYS_INLINE Bool_t IsZombie () const
 
void MayNotUse (const char *method) const
 Use this method to signal that a method (defined in a base class) may not be called in a derived class (in principle against good design since a child class should not provide less functionality than its parent, however, sometimes it is necessary).
 
virtual Bool_t Notify ()
 This method must be overridden to handle object notification.
 
void Obsolete (const char *method, const char *asOfVers, const char *removedFromVers) const
 Use this method to declare a method obsolete.
 
void operator delete (void *ptr)
 Operator delete.
 
void operator delete[] (void *ptr)
 Operator delete [].
 
void * operator new (size_t sz)
 
void * operator new (size_t sz, void *vp)
 
void * operator new[] (size_t sz)
 
void * operator new[] (size_t sz, void *vp)
 
TObject & operator= (const TObject &rhs)
 TObject assignment operator.
 
virtual void Paint (Option_t *option="")
 This method must be overridden if a class wants to paint itself.
 
virtual void Pop ()
 Pop on object drawn in a pad to the top of the display list.
 
virtual Int_t Read (const char *name)
 Read contents of object with specified name from the current directory.
 
virtual void RecursiveRemove (TObject *obj)
 Recursively remove this object from a list.
 
void ResetBit (UInt_t f)
 
virtual void SaveAs (const char *filename="", Option_t *option="") const
 Save this object in the file specified by filename.
 
virtual void SavePrimitive (std::ostream &out, Option_t *option="")
 Save a primitive as a C++ statement(s) on output stream "out".
 
void SetBit (UInt_t f)
 
void SetBit (UInt_t f, Bool_t set)
 Set or unset the user status bits as specified in f.
 
virtual void SetDrawOption (Option_t *option="")
 Set drawing option for object.
 
virtual void SetUniqueID (UInt_t uid)
 Set the unique object id.
 
void StreamerNVirtual (TBuffer &ClassDef_StreamerNVirtual_b)
 
virtual void SysError (const char *method, const char *msgfmt,...) const
 Issue system error message.
 
R__ALWAYS_INLINE Bool_t TestBit (UInt_t f) const
 
Int_t TestBits (UInt_t f) const
 
virtual void UseCurrentStyle ()
 Set current style settings in this object. This function is called when either TCanvas::UseCurrentStyle or TROOT::ForceStyle has been invoked.
 
virtual void Warning (const char *method, const char *msgfmt,...) const
 Issue warning message.
 
virtual Int_t Write (const char *name=nullptr, Int_t option=0, Int_t bufsize=0)
 Write this object to the current directory.
 
virtual Int_t Write (const char *name=nullptr, Int_t option=0, Int_t bufsize=0) const
 Write this object to the current directory.
 

Static Public Member Functions

static TClass * Class ()
 
static const char * Class_Name ()
 
static constexpr Version_t Class_Version ()
 
static const char * DeclFileName ()
 
- Static Public Member Functions inherited from TMVA::MethodBase
static TClass * Class ()
 
static const char * Class_Name ()
 
static constexpr Version_t Class_Version ()
 
static const char * DeclFileName ()
 
- Static Public Member Functions inherited from TMVA::IMethod
static TClass * Class ()
 
static const char * Class_Name ()
 
static constexpr Version_t Class_Version ()
 
static const char * DeclFileName ()
 
- Static Public Member Functions inherited from TMVA::Configurable
static TClass * Class ()
 
static const char * Class_Name ()
 
static constexpr Version_t Class_Version ()
 
static const char * DeclFileName ()
 
- Static Public Member Functions inherited from TNamed
static TClass * Class ()
 
static const char * Class_Name ()
 
static constexpr Version_t Class_Version ()
 
static const char * DeclFileName ()
 
- Static Public Member Functions inherited from TObject
static TClass * Class ()
 
static const char * Class_Name ()
 
static constexpr Version_t Class_Version ()
 
static const char * DeclFileName ()
 
static Longptr_t GetDtorOnly ()
 Return destructor only flag.
 
static Bool_t GetObjectStat ()
 Get status of object stat flag.
 
static void SetDtorOnly (void *obj)
 Set destructor only flag.
 
static void SetObjectStat (Bool_t stat)
 Turn on/off tracking of objects in the TObjectTable.
 

Protected Member Functions

void DeclareCompatibilityOptions ()
 Options that are used ONLY for the READER to ensure backward compatibility.
 
- Protected Member Functions inherited from TMVA::MethodBase
virtual std::vector< Double_t > GetDataMvaValues (DataSet *data=nullptr, Long64_t firstEvt=0, Long64_t lastEvt=-1, Bool_t logProgress=false)
 get all the MVA values for the events of the given Data type
 
const TString & GetInternalVarName (Int_t ivar) const
 
virtual std::vector< Double_t > GetMvaValues (Long64_t firstEvt=0, Long64_t lastEvt=-1, Bool_t logProgress=false)
 get all the MVA values for the events of the current Data type
 
const TString & GetOriginalVarName (Int_t ivar) const
 
const TString & GetWeightFileDir () const
 
Bool_t HasTrainingTree () const
 
Bool_t Help () const
 
Bool_t IgnoreEventsWithNegWeightsInTraining () const
 
Bool_t IsConstructedFromWeightFile () const
 
Bool_t IsNormalised () const
 
void NoErrorCalc (Double_t *const err, Double_t *const errUpper)
 
void SetNormalised (Bool_t norm)
 
void SetWeightFileDir (TString fileDir)
 set directory of weight file
 
void SetWeightFileName (TString)
 set the weight file name (deprecated)
 
void Statistics (Types::ETreeType treeType, const TString &theVarName, Double_t &, Double_t &, Double_t &, Double_t &, Double_t &, Double_t &)
 calculates rms, mean, xmin, xmax of the event variable; this can be done either for the variables as they are or for normalised variables (in the range of 0-1) if "norm" is set to kTRUE
 
Bool_t TxtWeightsOnly () const
 
Bool_t Verbose () const
 
- Protected Member Functions inherited from TMVA::Configurable
void EnableLooseOptions (Bool_t b=kTRUE)
 
const TString & GetReferenceFile () const
 
Bool_t LooseOptionCheckingEnabled () const
 
void ResetSetFlag ()
 resets the IsSet flag for all declared options; to be called before options are read from stream
 
void WriteOptionsReferenceToFile ()
 write complete options to output stream
 
- Protected Member Functions inherited from TObject
virtual void DoError (int level, const char *location, const char *fmt, va_list va) const
 Interface to ErrorHandler (protected).
 
void MakeZombie ()
 

Private Member Functions

Double_t AdaBoost (std::vector< const TMVA::Event * > &, DecisionTree *dt)
 The AdaBoost implementation.
 
Double_t AdaBoostR2 (std::vector< const TMVA::Event * > &, DecisionTree *dt)
 Adaption of the AdaBoost to regression problems (see H.Drucker 1997).
 
Double_t AdaCost (std::vector< const TMVA::Event * > &, DecisionTree *dt)
 The AdaCost boosting algorithm takes a simple cost matrix (currently fixed for all events... later could be modified to use individual cost matrices for each event as in the original paper...
 
Double_t ApplyPreselectionCuts (const Event *ev)
 Apply the preselection cuts before evaluating any decision trees in the GetMVA.
 
Double_t Bagging ()
 Call it boot-strapping, re-sampling or whatever you like, in the end it is nothing else but applying "random" Poisson weights to each event.
 
void BoostMonitor (Int_t iTree)
 Fills the ROCIntegral vs Itree from the testSample for the monitoring plots during the training.
 
void DeterminePreselectionCuts (const std::vector< const TMVA::Event * > &eventSample)
 Find useful preselection cuts that will be applied before any Decision Tree training.
 
void GetBaggedSubSample (std::vector< const TMVA::Event * > &)
 Fills fEventSample with fBaggedSampleFraction*NEvents random training events.
 
Double_t GetGradBoostMVA (const TMVA::Event *e, UInt_t nTrees)
 Returns MVA value: -1 for background, 1 for signal.
 
Double_t GetMvaValue (Double_t *err, Double_t *errUpper, UInt_t useNTrees)
 Return the MVA value (range [-1;1]) that classifies the event according to the majority vote from the total number of decision trees.
 
Double_t GradBoost (std::vector< const TMVA::Event * > &, DecisionTree *dt, UInt_t cls=0)
 Calculate the desired response value for each region.
 
Double_t GradBoostRegression (std::vector< const TMVA::Event * > &, DecisionTree *dt)
 Implementation of M_TreeBoost using any loss function as described by Friedman 1999.
 
void Init (void)
 Common initialisation with defaults for the BDT-Method.
 
void InitGradBoost (std::vector< const TMVA::Event * > &)
 Initialize targets for first tree.
 
void PreProcessNegativeEventWeights ()
 O.k.
 
Double_t PrivateGetMvaValue (const TMVA::Event *ev, Double_t *err=nullptr, Double_t *errUpper=nullptr, UInt_t useNTrees=0)
 Return the MVA value (range [-1;1]) that classifies the event according to the majority vote from the total number of decision trees.
 
Double_t RegBoost (std::vector< const TMVA::Event * > &, DecisionTree *dt)
 A special boosting only for Regression (not implemented).
 
void UpdateTargets (std::vector< const TMVA::Event * > &, UInt_t cls=0)
 Calculate residual for all events.
 
void UpdateTargetsRegression (std::vector< const TMVA::Event * > &, Bool_t first=kFALSE)
 Calculate residuals for all events and update targets for next iter.
 

Private Attributes

Double_t fAdaBoostBeta
 beta parameter for AdaBoost algorithm
 
TString fAdaBoostR2Loss
 loss type used in AdaBoostR2 (Linear,Quadratic or Exponential)
 
Bool_t fAutomatic
 use user given prune strength or automatically determined one using a validation sample
 
Bool_t fBaggedBoost
 turn bagging in combination with boost on/off
 
Bool_t fBaggedGradBoost
 turn bagging in combination with grad boost on/off
 
Double_t fBaggedSampleFraction
 relative size of bagged event sample to original sample size
 
TString fBoostType
 string specifying the boost type
 
Double_t fBoostWeight
 ntuple var: boost weight
 
std::vector< double > fBoostWeights
 the weights applied in the individual boosts
 
Double_t fCbb
 Cost factor.
 
Double_t fCss
 Cost factor.
 
Double_t fCtb_ss
 Cost factor.
 
Double_t fCts_sb
 Cost factor.
 
Bool_t fDoBoostMonitor
 create control plot with ROC integral vs tree number
 
Bool_t fDoPreselection
 do or do not perform automatic pre-selection of 100% eff. cuts
 
Double_t fErrorFraction
 ntuple var: misclassification error fraction
 
std::vector< const TMVA::Event * > fEventSample
 the training events
 
std::vector< DecisionTree * > fForest
 the collection of decision trees
 
Double_t fFValidationEvents
 fraction of events to use for pruning
 
std::vector< Double_t > fHighBkgCut
 
std::vector< Double_t > fHighSigCut
 
Bool_t fHistoricBool
 
Double_t fHuberQuantile
 the option string determining the quantile for the Huber Loss Function in BDT regression.
 
Bool_t fInverseBoostNegWeights
 boost ev. with neg. weights with 1/boostweight rather than boostweight
 
std::vector< Bool_t > fIsHighBkgCut
 
std::vector< Bool_t > fIsHighSigCut
 
std::vector< Bool_t > fIsLowBkgCut
 
std::vector< Bool_t > fIsLowSigCut
 
Int_t fITree
 ntuple var: ith tree
 
std::map< const TMVA::Event *, LossFunctionEventInfo > fLossFunctionEventInfo
 map event to true value, predicted value, and weight used by different loss functions for BDT regression
 
std::vector< Double_t > fLowBkgCut
 
std::vector< Double_t > fLowSigCut
 
UInt_t fMaxDepth
 max depth
 
Double_t fMinLinCorrForFisher
 the minimum linear correlation between two variables demanded for use in the Fisher criterion in node splitting
 
Int_t fMinNodeEvents
 min number of events in node
 
Float_t fMinNodeSize
 min percentage of training events in node
 
TString fMinNodeSizeS
 string containing min percentage of training events in node
 
TTree * fMonitorNtuple
 monitoring ntuple
 
Int_t fNCuts
 grid used in cut applied in node splitting
 
TString fNegWeightTreatment
 variable that holds the option of how to treat negative event weights in training
 
UInt_t fNNodesMax
 max # of nodes
 
Double_t fNodePurityLimit
 purity limit for sig/bkg nodes
 
Bool_t fNoNegWeightsInTraining
 ignore negative event weights in the training
 
Int_t fNTrees
 number of decision trees requested
 
Bool_t fPairNegWeightsGlobal
 pair ev. with neg. and pos. weights in training sample and "annihilate" them
 
DecisionTree::EPruneMethod fPruneMethod
 method used for pruning
 
TString fPruneMethodS
 prune method option String
 
Double_t fPruneStrength
 a parameter to set the "amount" of pruning; needs to be adjusted
 
Bool_t fRandomisedTrees
 choose a random subset of possible cut variables at each node during training
 
LossFunctionBDT * fRegressionLossFunctionBDTG
 
TString fRegressionLossFunctionBDTGS
 the option string determining the loss function for BDT regression
 
std::map< const TMVA::Event *, std::vector< double > > fResiduals
 individual event residuals for gradient boost
 
SeparationBase * fSepType
 the separation used in node splitting
 
TString fSepTypeS
 the separation (option string) used in node splitting
 
Double_t fShrinkage
 learning rate for gradient boost
 
Double_t fSigToBkgFraction
 Signal to Background fraction assumed during training.
 
Bool_t fSkipNormalization
 true for skipping normalization at initialization of trees
 
std::vector< const TMVA::Event * > fSubSample
 subsample for bagged grad boost
 
std::vector< const TMVA::Event * > * fTrainSample
 pointer to sample actually used in training (fEventSample or fSubSample) for example
 
Bool_t fTrainWithNegWeights
 yes there are negative event weights and we don't ignore them
 
Bool_t fUseExclusiveVars
 individual variables already used in the Fisher criterion are no longer analysed individually for node splitting
 
Bool_t fUseFisherCuts
 use multivariate splits using the Fisher criterion
 
UInt_t fUseNTrainEvents
 number of randomly picked training events used in randomised (and bagged) trees
 
UInt_t fUseNvars
 the number of variables used in the randomised tree splitting
 
Bool_t fUsePoissonNvars
 use "fUseNvars" not as a fixed number but as the mean of a Poisson distribution in each split
 
Bool_t fUseYesNoLeaf
 use the sig or bkg classification of the leaf node, or its S/B
 
std::vector< const TMVA::Event * > fValidationSample
 the Validation events
 
std::vector< Double_t > fVariableImportance
 the relative importance of the different variables
 

Static Private Attributes

static const Int_t fgDebugLevel = 0
 debug level determining some printout/control plots etc.
 

Additional Inherited Members

- Public Types inherited from TMVA::MethodBase
enum  EWeightFileType { kROOT =0 , kTEXT }
 
- Public Types inherited from TObject
enum  {
  kIsOnHeap = 0x01000000 , kNotDeleted = 0x02000000 , kZombie = 0x04000000 , kInconsistent = 0x08000000 ,
  kBitMask = 0x00ffffff
}
 
enum  { kSingleKey = (1ULL << ( 0 )) , kOverwrite = (1ULL << ( 1 )) , kWriteDelete = (1ULL << ( 2 )) }
 
enum  EDeprecatedStatusBits { kObjInCanvas = (1ULL << ( 3 )) }
 
enum  EStatusBits {
  kCanDelete = (1ULL << ( 0 )) , kMustCleanup = (1ULL << ( 3 )) , kIsReferenced = (1ULL << ( 4 )) , kHasUUID = (1ULL << ( 5 )) ,
  kCannotPick = (1ULL << ( 6 )) , kNoContextMenu = (1ULL << ( 8 )) , kInvalidObject = (1ULL << ( 13 ))
}
 
- Public Attributes inherited from TMVA::MethodBase
Bool_t fSetupCompleted
 
TrainingHistory fTrainHistory
 
- Protected Types inherited from TObject
enum  { kOnlyPrepStep = (1ULL << ( 3 )) }
 
- Protected Attributes inherited from TMVA::MethodBase
Types::EAnalysisType fAnalysisType
 
UInt_t fBackgroundClass
 
bool fExitFromTraining = false
 
std::vector< TString > * fInputVars
 
IPythonInteractive * fInteractive = nullptr
 temporary dataset used when evaluating on a different data (used by MethodCategory::GetMvaValues)
 
UInt_t fIPyCurrentIter = 0
 
UInt_t fIPyMaxIter = 0
 
std::vector< Float_t > * fMulticlassReturnVal
 
Int_t fNbins
 
Int_t fNbinsH
 
Int_t fNbinsMVAoutput
 
Ranking * fRanking
 
std::vector< Float_t > * fRegressionReturnVal
 
Results * fResults
 
UInt_t fSignalClass
 
DataSet * fTmpData = nullptr
 temporary event when testing on a different DataSet than the own one
 
const Event * fTmpEvent
 
- Protected Attributes inherited from TMVA::Configurable
MsgLogger * fLogger
 message logger
 
- Protected Attributes inherited from TNamed
TString fName
 
TString fTitle
 

#include <TMVA/MethodBDT.h>

Constructor & Destructor Documentation

◆ MethodBDT() [1/2]

TMVA::MethodBDT::MethodBDT ( const TString & jobName,
const TString & methodTitle,
DataSetInfo & theData,
const TString & theOption = "" 
)

The standard constructor for the "boosted decision trees".

Definition at line 163 of file MethodBDT.cxx.

◆ MethodBDT() [2/2]

TMVA::MethodBDT::MethodBDT ( DataSetInfo & theData,
const TString & theWeightFile 
)

Definition at line 220 of file MethodBDT.cxx.

◆ ~MethodBDT()

TMVA::MethodBDT::~MethodBDT ( void  )
virtual

Destructor.

  • Note: fEventSample and fValidationSample are already deleted at the end of Train(), when they are no longer used.

Definition at line 753 of file MethodBDT.cxx.

Member Function Documentation

◆ AdaBoost()

Double_t TMVA::MethodBDT::AdaBoost ( std::vector< const TMVA::Event * > &  eventSample,
DecisionTree * dt 
)
private

The AdaBoost implementation.

A new training sample is generated by reweighting the events that are misclassified by the decision tree. The weight applied is \( w = \frac{(1-err)}{err} \) or, more generally, \( w = (\frac{(1-err)}{err})^\beta \), where \(err\) is the fraction of misclassified events in the tree (< 0.5, assuming that the previous selection was better than random guessing) and \(\beta\) is a free parameter (default: \(\beta = 1\)) that modifies the boosting.
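
The weight formula above can be illustrated with a small standalone sketch. It is not the MethodBDT implementation: the `WeightedEvent` struct and both function names are hypothetical, and the final rescaling that keeps the sum of weights constant is an assumption made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for a training event: only the fields the sketch needs.
struct WeightedEvent {
    double weight;        // current event weight
    bool   misclassified; // true if the tree put the event in the wrong leaf
};

// Boost weight w = ((1 - err) / err)^beta for a misclassified fraction err.
double AdaBoostWeight(double err, double beta) {
    assert(err > 0.0 && err < 0.5); // the tree must beat random guessing
    return std::pow((1.0 - err) / err, beta);
}

// Multiply the weight of every misclassified event by w, then rescale all
// weights so that their sum is unchanged (assumption of this sketch).
// Returns the boost weight w that was applied.
double ReweightEvents(std::vector<WeightedEvent>& events, double beta) {
    double sumAll = 0.0, sumWrong = 0.0;
    for (const WeightedEvent& e : events) {
        sumAll += e.weight;
        if (e.misclassified) sumWrong += e.weight;
    }
    const double w = AdaBoostWeight(sumWrong / sumAll, beta);
    double newSum = 0.0;
    for (WeightedEvent& e : events) {
        if (e.misclassified) e.weight *= w;
        newSum += e.weight;
    }
    for (WeightedEvent& e : events) e.weight *= sumAll / newSum;
    return w;
}
```

With four unit-weight events of which one is misclassified, err is 0.25 and the boost weight for beta = 1 is 3.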

Definition at line 1846 of file MethodBDT.cxx.

◆ AdaBoostR2()

Double_t TMVA::MethodBDT::AdaBoostR2 ( std::vector< const TMVA::Event * > &  eventSample,
DecisionTree * dt 
)
private

Adaptation of AdaBoost to regression problems (see H.Drucker 1997).

Definition at line 2193 of file MethodBDT.cxx.

◆ AdaCost()

Double_t TMVA::MethodBDT::AdaCost ( std::vector< const TMVA::Event * > &  eventSample,
DecisionTree * dt 
)
private

The AdaCost boosting algorithm takes a simple cost matrix (currently fixed for all events; it could later be modified to use individual cost matrices for each event, as in the original paper):

              true_signal true_bkg
----------------------------------
sel_signal |   Css         Ctb_ss    Cxx.. in the range [0,1]
sel_bkg    |   Cts_sb      Cbb

and takes this into account when calculating the misclassification cost (formerly: error fraction):

err = sum_events ( weight * y_true * y_sel * beta(event) )

Definition at line 2024 of file MethodBDT.cxx.

◆ AddWeightsXMLTo()

void TMVA::MethodBDT::AddWeightsXMLTo ( void *  parent) const
virtual

Write weights to XML.

Implements TMVA::MethodBase.

Definition at line 2310 of file MethodBDT.cxx.

◆ ApplyPreselectionCuts()

Double_t TMVA::MethodBDT::ApplyPreselectionCuts ( const Event * ev)
private

Apply the preselection cuts in GetMVA before evaluating any decision trees.

Result: -1 for background, +1 for signal.

Definition at line 3133 of file MethodBDT.cxx.

◆ Bagging()

Double_t TMVA::MethodBDT::Bagging ( )
private

Call it bootstrapping, re-sampling or whatever you like, in the end it is nothing else but applying "random" Poisson weights to each event.
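
A minimal sketch of the idea, using only the C++ standard library. The function name `PoissonBagWeights`, the seed handling, and the choice of Poisson mean are assumptions for illustration and are not taken from MethodBDT.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Draw one Poisson-distributed weight per event. An event with weight 0 is
// effectively dropped from this bagging round, one with weight 2 counts twice.
// `mean` is the Poisson mean (assumed here to play the role of a sample fraction).
std::vector<double> PoissonBagWeights(std::size_t nEvents, double mean, unsigned seed) {
    assert(mean > 0.0);
    std::mt19937 rng(seed);                       // fixed seed gives reproducible weights
    std::poisson_distribution<int> poisson(mean);
    std::vector<double> weights(nEvents);
    for (double& w : weights) w = static_cast<double>(poisson(rng));
    return weights;
}
```

The returned weights are non-negative integers stored as doubles, and their average approaches `mean` for large samples.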

Definition at line 2140 of file MethodBDT.cxx.

◆ Boost()

Double_t TMVA::MethodBDT::Boost ( std::vector< const TMVA::Event * > &  eventSample,
DecisionTree * dt,
UInt_t  cls = 0 
)

Apply the boosting algorithm (the algorithm is selected via the "option" given in the constructor).

The return value is the boosting weight.

Definition at line 1718 of file MethodBDT.cxx.

◆ BoostMonitor()

void TMVA::MethodBDT::BoostMonitor ( Int_t  iTree)
private

Fills the ROC integral vs. tree number for the monitoring plots during training, using the testing events.

Definition at line 1752 of file MethodBDT.cxx.

◆ Class()

static TClass * TMVA::MethodBDT::Class ( )
static
Returns
TClass describing this class

◆ Class_Name()

static const char * TMVA::MethodBDT::Class_Name ( )
static
Returns
Name of this class

◆ Class_Version()

static constexpr Version_t TMVA::MethodBDT::Class_Version ( )
inline static constexpr
Returns
Version of this class

Definition at line 305 of file MethodBDT.h.

◆ CreateRanking()

const TMVA::Ranking * TMVA::MethodBDT::CreateRanking ( )
virtual

Compute ranking of input variables.

Implements TMVA::MethodBase.

Definition at line 2683 of file MethodBDT.cxx.

◆ DeclareCompatibilityOptions()

void TMVA::MethodBDT::DeclareCompatibilityOptions ( )
protected virtual

Options that are used ONLY for the READER to ensure backward compatibility.

Reimplemented from TMVA::MethodBase.

Definition at line 454 of file MethodBDT.cxx.

◆ DeclareOptions()

void TMVA::MethodBDT::DeclareOptions ( )
virtual

Define the options (their keywords) that can be set in the option string.

Known options:

  • nTrees number of trees in the forest to be created
  • BoostType the boosting type for the trees in the forest (AdaBoost etc.). Known:
    • AdaBoost
    • AdaBoostR2 (Adaboost for regression)
    • Bagging
    • GradBoost
  • AdaBoostBeta the boosting parameter, beta, for AdaBoost
  • UseRandomisedTrees choose at each node splitting a random set of variables
  • UseNvars use UseNvars variables in randomised trees
  • UsePoissonNvars use UseNvars not as a fixed number but as the mean of a Poisson distribution
  • SeparationType the separation criterion applied in the node splitting. Known:
  • MinNodeSize: minimum percentage of training events in a leaf node (leaf criteria, stop splitting)
  • nCuts: the number of steps in the optimisation of the cut for a node (if < 0, then step size is determined by the events)
  • UseFisherCuts: use multivariate splits using the Fisher criterion
  • UseYesNoLeaf decide if the classification is done simply by the node type, or the S/B (from the training) in the leaf node
  • NodePurityLimit the minimum purity to classify a node as a signal node (used in pruning and boosting to determine misclassification error rate)
  • PruneMethod The Pruning method. Known:
    • NoPruning // switch off pruning completely
    • ExpectedError
    • CostComplexity
  • PruneStrength a parameter to adjust the amount of pruning. Should be large enough such that overtraining is avoided.
  • PruningValFraction number of events to use for optimizing pruning (only if PruneStrength < 0, i.e. automatic pruning)
  • NegWeightTreatment
    • IgnoreNegWeightsInTraining Ignore negative weight events in the training.
    • DecreaseBoostWeight Boost ev. with neg. weight with 1/boostweight instead of boostweight
    • PairNegWeightsGlobal Pair ev. with neg. and pos. weights in training sample and "annihilate" them
  • MaxDepth maximum depth of the decision tree allowed before further splitting is stopped
  • SkipNormalization Skip normalization at initialization, to keep expectation value of BDT output according to the fraction of events
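
TMVA option strings are colon-separated `key=value` pairs. The helper below is a hypothetical convenience for composing such a string from some of the options listed above; `MakeOptionString` is not part of TMVA, and the values shown in the usage note are arbitrary examples.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Join key/value pairs into the "key=value:key=value" form.
std::string MakeOptionString(
    const std::vector<std::pair<std::string, std::string>>& options) {
    std::string result;
    for (const auto& kv : options) {
        assert(!kv.first.empty());          // every option needs a keyword
        if (!result.empty()) result += ':'; // separator between options
        result += kv.first + "=" + kv.second;
    }
    return result;
}
```

For example, `MakeOptionString({{"BoostType", "AdaBoost"}, {"AdaBoostBeta", "0.5"}, {"MaxDepth", "3"}})` yields `BoostType=AdaBoost:AdaBoostBeta=0.5:MaxDepth=3`.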

Implements TMVA::MethodBase.

Definition at line 333 of file MethodBDT.cxx.

◆ DeclFileName()

static const char * TMVA::MethodBDT::DeclFileName ( )
inline static
Returns
Name of the file containing the class declaration

Definition at line 305 of file MethodBDT.h.

◆ DeterminePreselectionCuts()

void TMVA::MethodBDT::DeterminePreselectionCuts ( const std::vector< const TMVA::Event * > &  eventSample)
private

Find useful preselection cuts that will be applied before any Decision Tree training.

They are also applied in GetMVA (-1 for background, +1 for signal).

Definition at line 3036 of file MethodBDT.cxx.

◆ GetBaggedSubSample()

void TMVA::MethodBDT::GetBaggedSubSample ( std::vector< const TMVA::Event * > &  eventSample)
private

Fills fEventSample with fBaggedSampleFraction*NEvents random training events.

Definition at line 2151 of file MethodBDT.cxx.

◆ GetBoostWeights()

const std::vector< double > & TMVA::MethodBDT::GetBoostWeights ( ) const
inline

Definition at line 312 of file MethodBDT.h.

◆ GetForest()

const std::vector< TMVA::DecisionTree * > & TMVA::MethodBDT::GetForest ( ) const
inline

Definition at line 310 of file MethodBDT.h.

◆ GetGradBoostMVA()

Double_t TMVA::MethodBDT::GetGradBoostMVA ( const TMVA::Event * e,
UInt_t  nTrees 
)
private

Returns MVA value: -1 for background, 1 for signal.
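
One way to obtain a value between -1 and 1 from a sum of tree responses is a logistic-type squashing function. The sketch below assumes that form; the function name `GradBoostResponse` is hypothetical, and the actual mapping should be checked against MethodBDT.cxx.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sum the per-tree responses and map the sum into (-1, 1).
// The expression is mathematically equal to tanh(sum).
double GradBoostResponse(const std::vector<double>& treeResponses) {
    double sum = 0.0;
    for (double r : treeResponses) sum += r;
    assert(std::isfinite(sum)); // guard against NaN or infinite responses
    return 2.0 / (1.0 + std::exp(-2.0 * sum)) - 1.0;
}
```

A sum of zero maps to 0, large positive sums approach +1 (signal-like), and large negative sums approach -1 (background-like).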

Definition at line 1421 of file MethodBDT.cxx.

◆ GetHelpMessage()

void TMVA::MethodBDT::GetHelpMessage ( ) const
virtual

Get help message text.

Implements TMVA::IMethod.

Definition at line 2700 of file MethodBDT.cxx.

◆ GetMulticlassValues()

const std::vector< Float_t > & TMVA::MethodBDT::GetMulticlassValues ( )
virtual

Get the multiclass MVA response for the BDT classifier.

Reimplemented from TMVA::MethodBase.

Definition at line 2495 of file MethodBDT.cxx.

◆ GetMvaValue() [1/2]

Double_t TMVA::MethodBDT::GetMvaValue ( Double_t * err,
Double_t * errUpper,
UInt_t  useNTrees 
)
private

Return the MVA value (range [-1;1]) that classifies the event according to the majority vote from the total number of decision trees.

Definition at line 2452 of file MethodBDT.cxx.

◆ GetMvaValue() [2/2]

Double_t TMVA::MethodBDT::GetMvaValue ( Double_t * err = nullptr,
Double_t * errUpper = nullptr 
)
virtual

Implements TMVA::MethodBase.

Definition at line 2443 of file MethodBDT.cxx.

◆ GetNTrees()

UInt_t TMVA::MethodBDT::GetNTrees ( ) const
inline

Definition at line 112 of file MethodBDT.h.

◆ GetRegressionValues()

const std::vector< Float_t > & TMVA::MethodBDT::GetRegressionValues ( )
virtual

Get the regression value generated by the BDTs.

Reimplemented from TMVA::MethodBase.

Definition at line 2542 of file MethodBDT.cxx.

◆ GetTrainingEvents()

const std::vector< const TMVA::Event * > & TMVA::MethodBDT::GetTrainingEvents ( ) const
inline

Definition at line 311 of file MethodBDT.h.

◆ GetVariableImportance() [1/2]

vector< Double_t > TMVA::MethodBDT::GetVariableImportance ( )

Return the relative variable importance, normalized to all variables together having the importance 1.

The importance is evaluated as the total separation gain that this variable had in the decision trees (weighted by the number of events).
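
The normalization step alone can be sketched as follows. The function name `NormalizeImportance` is hypothetical, and the sketch does not reproduce how the raw per-variable values are accumulated from the trees.

```cpp
#include <cassert>
#include <vector>

// Scale raw, non-negative importance values so that they sum to 1.
// If all values are zero the input is returned unchanged.
std::vector<double> NormalizeImportance(std::vector<double> raw) {
    double sum = 0.0;
    for (double v : raw) {
        assert(v >= 0.0); // a separation gain is not expected to be negative
        sum += v;
    }
    if (sum > 0.0) {
        for (double& v : raw) v /= sum;
    }
    return raw;
}
```

Raw values of 1 and 3, for instance, become relative importances of 0.25 and 0.75.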

Definition at line 2643 of file MethodBDT.cxx.

◆ GetVariableImportance() [2/2]

Double_t TMVA::MethodBDT::GetVariableImportance ( UInt_t  ivar)

Returns the measure for the variable importance of variable "ivar" which is later used in GetVariableImportance() to calculate the relative variable importances.

Definition at line 2671 of file MethodBDT.cxx.

◆ GradBoost()

Double_t TMVA::MethodBDT::GradBoost ( std::vector< const TMVA::Event * > &  eventSample,
DecisionTree * dt,
UInt_t  cls = 0 
)
private

Calculate the desired response value for each region.

Definition at line 1595 of file MethodBDT.cxx.

◆ GradBoostRegression()

Double_t TMVA::MethodBDT::GradBoostRegression ( std::vector< const TMVA::Event * > &  eventSample,
DecisionTree * dt 
)
private

Implementation of M_TreeBoost using any loss function as described by Friedman 1999.

Definition at line 1629 of file MethodBDT.cxx.

◆ HasAnalysisType()

Bool_t TMVA::MethodBDT::HasAnalysisType ( Types::EAnalysisType  type,
UInt_t  numberClasses,
UInt_t  numberTargets 
)
virtual

BDT can handle classification with multiple classes and regression with one regression-target.

Implements TMVA::IMethod.

Definition at line 280 of file MethodBDT.cxx.

◆ Init()

void TMVA::MethodBDT::Init ( void  )
private virtual

Common initialisation with defaults for the BDT-Method.

Implements TMVA::MethodBase.

Definition at line 687 of file MethodBDT.cxx.

◆ InitEventSample()

void TMVA::MethodBDT::InitEventSample ( void  )

Initialize the event sample (i.e. reset the boost-weights... etc).

Definition at line 761 of file MethodBDT.cxx.

◆ InitGradBoost()

void TMVA::MethodBDT::InitGradBoost ( std::vector< const TMVA::Event * > &  eventSample)
private

Initialize targets for first tree.

Definition at line 1658 of file MethodBDT.cxx.

◆ IsA()

virtual TClass * TMVA::MethodBDT::IsA ( ) const
inline virtual
Returns
TClass describing current object

Reimplemented from TMVA::MethodBase.

Definition at line 305 of file MethodBDT.h.

◆ MakeClassInstantiateNode()

void TMVA::MethodBDT::MakeClassInstantiateNode ( DecisionTreeNode * n,
std::ostream &  fout,
const TString & className 
) const

Recursively descends a tree and writes the node instance to the output stream.

Definition at line 2991 of file MethodBDT.cxx.

◆ MakeClassSpecific()

void TMVA::MethodBDT::MakeClassSpecific ( std::ostream &  fout,
const TString & className 
) const
virtual

Make ROOT-independent C++ class for classifier response (classifier-specific implementation).

Reimplemented from TMVA::MethodBase.

Definition at line 2757 of file MethodBDT.cxx.

◆ MakeClassSpecificHeader()

void TMVA::MethodBDT::MakeClassSpecificHeader ( std::ostream &  fout,
const TString & className 
) const
virtual

Specific class header.

Reimplemented from TMVA::MethodBase.

Definition at line 2877 of file MethodBDT.cxx.

◆ OptimizeTuningParameters()

std::map< TString, Double_t > TMVA::MethodBDT::OptimizeTuningParameters ( TString  fomType = "ROCIntegral",
TString  fitType = "FitGA" 
)
virtual

Call the Optimizer with the set of parameters and ranges that are meant to be tuned.

Reimplemented from TMVA::MethodBase.

Definition at line 1068 of file MethodBDT.cxx.

◆ PreProcessNegativeEventWeights()

void TMVA::MethodBDT::PreProcessNegativeEventWeights ( )
private

Preprocess events with negative weights.

This routine removes events with negative weights by pairing them with the closest event(s) of the same event class with positive weights. It is a first, "brute force" attempt: it does not try to be clever with search trees etc., it is just quick and dirty to see whether the result is any good.
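
The brute-force pairing can be sketched as below. This is an illustration under stated assumptions, not the routine in MethodBDT.cxx: the `PairableEvent` struct and function names are hypothetical, the distance is a plain Euclidean metric, and paired weights are simply added until the negative weight is used up.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a training event.
struct PairableEvent {
    std::vector<double> x; // input variables
    int    cls;            // event class (e.g. 0 = background, 1 = signal)
    double weight;         // may be negative
};

// Squared Euclidean distance (assumption: plain, unscaled metric).
double Dist2(const PairableEvent& a, const PairableEvent& b) {
    assert(a.x.size() == b.x.size());
    double d2 = 0.0;
    for (std::size_t i = 0; i < a.x.size(); ++i) {
        const double d = a.x[i] - b.x[i];
        d2 += d * d;
    }
    return d2;
}

// For every negative-weight event, repeatedly find the closest same-class
// event with positive weight and let the two weights cancel.
void AnnihilateNegativeWeights(std::vector<PairableEvent>& events) {
    for (PairableEvent& neg : events) {
        while (neg.weight < 0.0) {
            PairableEvent* best = nullptr;
            double bestD2 = std::numeric_limits<double>::max();
            for (PairableEvent& pos : events) {
                if (pos.cls != neg.cls || pos.weight <= 0.0) continue;
                const double d2 = Dist2(neg, pos);
                if (d2 < bestD2) { bestD2 = d2; best = &pos; }
            }
            if (best == nullptr) break; // no positive partner left
            const double sum = neg.weight + best->weight;
            if (sum >= 0.0) { best->weight = sum; neg.weight = 0.0; }
            else            { best->weight = 0.0; neg.weight = sum; }
        }
    }
}
```

The linear scan over all events for each negative weight is the "brute force" part; the total weight per class is preserved as long as enough positive weight is available.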

Definition at line 932 of file MethodBDT.cxx.

◆ PrivateGetMvaValue()

Double_t TMVA::MethodBDT::PrivateGetMvaValue ( const TMVA::Event * ev,
Double_t * err = nullptr,
Double_t * errUpper = nullptr,
UInt_t  useNTrees = 0 
)
private

Return the MVA value (range [-1;1]) that classifies the event according to the majority vote from the total number of decision trees.
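
A weighted majority vote of this kind can be sketched as a normalized weighted average. The function name `WeightedVote` is hypothetical, and the sketch assumes each tree votes +1 (signal) or -1 (background) and that the boost weights serve as vote weights.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Combine per-tree votes into one value in [-1, 1] by averaging them with
// the boost weight of each tree. Returns 0 if the total weight is not positive.
double WeightedVote(const std::vector<double>& treeVotes,
                    const std::vector<double>& boostWeights) {
    assert(treeVotes.size() == boostWeights.size());
    double sum = 0.0, norm = 0.0;
    for (std::size_t i = 0; i < treeVotes.size(); ++i) {
        sum  += boostWeights[i] * treeVotes[i];
        norm += boostWeights[i];
    }
    return norm > 0.0 ? sum / norm : 0.0;
}
```

Three trees voting +1, -1, +1 with boost weights 1, 1, 2 give (1 - 1 + 2) / 4 = 0.5.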

Definition at line 2468 of file MethodBDT.cxx.

◆ ProcessOptions()

void TMVA::MethodBDT::ProcessOptions ( )
virtual

The option string is decoded, for available options see "DeclareOptions".

Implements TMVA::MethodBase.

Definition at line 470 of file MethodBDT.cxx.

◆ ReadWeightsFromStream() [1/3]

virtual void TMVA::MethodBase::ReadWeightsFromStream ( std::istream &  )
virtual

Implements TMVA::MethodBase.

◆ ReadWeightsFromStream() [2/3]

void TMVA::MethodBDT::ReadWeightsFromStream ( std::istream &  istr)
virtual

Read the weights (BDT coefficients).

Implements TMVA::MethodBase.

Definition at line 2408 of file MethodBDT.cxx.

◆ ReadWeightsFromStream() [3/3]

virtual void TMVA::MethodBase::ReadWeightsFromStream ( TFile & )
inline virtual

Reimplemented from TMVA::MethodBase.

Definition at line 266 of file MethodBase.h.

◆ ReadWeightsFromXML()

void TMVA::MethodBDT::ReadWeightsFromXML ( void *  parent)
virtual

Reads the BDT from the xml file.

Implements TMVA::MethodBase.

Definition at line 2341 of file MethodBDT.cxx.

◆ RegBoost()

Double_t TMVA::MethodBDT::RegBoost ( std::vector< const TMVA::Event * > &  ,
DecisionTree * dt 
)
private

A special boosting only for Regression (not implemented).

Definition at line 2185 of file MethodBDT.cxx.

◆ Reset()

void TMVA::MethodBDT::Reset ( void  )
virtual

Reset the method, as if it had just been instantiated (forget all training etc.).

Reimplemented from TMVA::MethodBase.

Definition at line 725 of file MethodBDT.cxx.

◆ SetAdaBoostBeta()

void TMVA::MethodBDT::SetAdaBoostBeta ( Double_t  b)
inline

Definition at line 139 of file MethodBDT.h.

◆ SetBaggedSampleFraction()

void TMVA::MethodBDT::SetBaggedSampleFraction ( Double_t  f)
inline

Definition at line 143 of file MethodBDT.h.

◆ SetMaxDepth()

void TMVA::MethodBDT::SetMaxDepth ( Int_t  d)
inline

Definition at line 134 of file MethodBDT.h.

◆ SetMinNodeSize() [1/2]

void TMVA::MethodBDT::SetMinNodeSize ( Double_t  sizeInPercent)

Definition at line 660 of file MethodBDT.cxx.

◆ SetMinNodeSize() [2/2]

void TMVA::MethodBDT::SetMinNodeSize ( TString  sizeInPercent)

Definition at line 674 of file MethodBDT.cxx.

◆ SetNodePurityLimit()

void TMVA::MethodBDT::SetNodePurityLimit ( Double_t  l)
inline

Definition at line 140 of file MethodBDT.h.

◆ SetNTrees()

void TMVA::MethodBDT::SetNTrees ( Int_t  d)
inline

Definition at line 138 of file MethodBDT.h.

◆ SetShrinkage()

void TMVA::MethodBDT::SetShrinkage ( Double_t  s)
inline

Definition at line 141 of file MethodBDT.h.

◆ SetTuneParameters()

void TMVA::MethodBDT::SetTuneParameters ( std::map< TString, Double_t > tuneParameters)
virtual

Set the tuning parameters according to the argument.

Reimplemented from TMVA::MethodBase.

Definition at line 1121 of file MethodBDT.cxx.

◆ SetUseNvars()

void TMVA::MethodBDT::SetUseNvars ( Int_t  n)
inline

Definition at line 142 of file MethodBDT.h.

◆ Streamer()

virtual void TMVA::MethodBDT::Streamer ( TBuffer & )
virtual

Reimplemented from TMVA::MethodBase.

◆ StreamerNVirtual()

void TMVA::MethodBDT::StreamerNVirtual ( TBuffer & ClassDef_StreamerNVirtual_b)
inline

Definition at line 305 of file MethodBDT.h.

◆ TestTreeQuality()

Double_t TMVA::MethodBDT::TestTreeQuality ( DecisionTree * dt)

Test the tree quality in terms of misclassification.

Definition at line 1697 of file MethodBDT.cxx.

◆ Train()

void TMVA::MethodBDT::Train ( void  )
virtual

BDT training.

Implements TMVA::MethodBase.

Definition at line 1142 of file MethodBDT.cxx.

◆ UpdateTargets()

void TMVA::MethodBDT::UpdateTargets ( std::vector< const TMVA::Event * > &  eventSample,
UInt_t  cls = 0 
)
private

Calculate residual for all events.

Definition at line 1435 of file MethodBDT.cxx.

◆ UpdateTargetsRegression()

void TMVA::MethodBDT::UpdateTargetsRegression ( std::vector< const TMVA::Event * > &  eventSample,
Bool_t  first = kFALSE 
)
private

Calculate residuals for all events and update targets for next iter.

Parameters
[in] eventSample  The collection of events currently under training.
[in] first  Should be true when called before the first boosting iteration has been run.

Definition at line 1557 of file MethodBDT.cxx.

◆ WriteMonitoringHistosToFile()

void TMVA::MethodBDT::WriteMonitoringHistosToFile ( void  ) const
virtual

Here we could write some histograms created during the processing to the output file.

Reimplemented from TMVA::MethodBase.

Definition at line 2628 of file MethodBDT.cxx.

Member Data Documentation

◆ fAdaBoostBeta

Double_t TMVA::MethodBDT::fAdaBoostBeta
private

beta parameter for AdaBoost algorithm

Definition at line 216 of file MethodBDT.h.

◆ fAdaBoostR2Loss

TString TMVA::MethodBDT::fAdaBoostR2Loss
private

loss type used in AdaBoostR2 (Linear,Quadratic or Exponential)

Definition at line 217 of file MethodBDT.h.

◆ fAutomatic

Bool_t TMVA::MethodBDT::fAutomatic
private

use user given prune strength or automatically determined one using a validation sample

Definition at line 248 of file MethodBDT.h.

◆ fBaggedBoost

Bool_t TMVA::MethodBDT::fBaggedBoost
private

turn bagging in combination with boost on/off

Definition at line 220 of file MethodBDT.h.

◆ fBaggedGradBoost

Bool_t TMVA::MethodBDT::fBaggedGradBoost
private

turn bagging in combination with grad boost on/off

Definition at line 221 of file MethodBDT.h.

◆ fBaggedSampleFraction

Double_t TMVA::MethodBDT::fBaggedSampleFraction
private

relative size of bagged event sample to original sample size

Definition at line 254 of file MethodBDT.h.

◆ fBoostType

TString TMVA::MethodBDT::fBoostType
private

string specifying the boost type

Definition at line 215 of file MethodBDT.h.

◆ fBoostWeight

Double_t TMVA::MethodBDT::fBoostWeight
private

ntuple var: boost weight

Definition at line 266 of file MethodBDT.h.

◆ fBoostWeights

std::vector<double> TMVA::MethodBDT::fBoostWeights
private

the weights applied in the individual boosts

Definition at line 213 of file MethodBDT.h.

◆ fCbb

Double_t TMVA::MethodBDT::fCbb
private

Cost factor.

Definition at line 272 of file MethodBDT.h.

◆ fCss

Double_t TMVA::MethodBDT::fCss
private

Cost factor.

Definition at line 269 of file MethodBDT.h.

◆ fCtb_ss

Double_t TMVA::MethodBDT::fCtb_ss
private

Cost factor.

Definition at line 271 of file MethodBDT.h.

◆ fCts_sb

Double_t TMVA::MethodBDT::fCts_sb
private

Cost factor.

Definition at line 270 of file MethodBDT.h.

◆ fDoBoostMonitor

Bool_t TMVA::MethodBDT::fDoBoostMonitor
private

create control plot with ROC integral vs tree number

Definition at line 260 of file MethodBDT.h.

◆ fDoPreselection

Bool_t TMVA::MethodBDT::fDoPreselection
private

do or do not perform automatic pre-selection of 100% eff. cuts

Definition at line 274 of file MethodBDT.h.

◆ fErrorFraction

Double_t TMVA::MethodBDT::fErrorFraction
private

ntuple var: misclassification error fraction

Definition at line 267 of file MethodBDT.h.

◆ fEventSample

std::vector<const TMVA::Event*> TMVA::MethodBDT::fEventSample
private

the training events

Definition at line 206 of file MethodBDT.h.

◆ fForest

std::vector<DecisionTree*> TMVA::MethodBDT::fForest
private

the collection of decision trees

Definition at line 212 of file MethodBDT.h.

◆ fFValidationEvents

Double_t TMVA::MethodBDT::fFValidationEvents
private

fraction of events to use for pruning

Definition at line 247 of file MethodBDT.h.

◆ fgDebugLevel

const Int_t TMVA::MethodBDT::fgDebugLevel = 0
static private

debug level determining some printout/control plots etc.

Definition at line 302 of file MethodBDT.h.

◆ fHighBkgCut

std::vector<Double_t> TMVA::MethodBDT::fHighBkgCut
private

Definition at line 287 of file MethodBDT.h.

◆ fHighSigCut

std::vector<Double_t> TMVA::MethodBDT::fHighSigCut
private

Definition at line 286 of file MethodBDT.h.

◆ fHistoricBool

Bool_t TMVA::MethodBDT::fHistoricBool
private

Definition at line 294 of file MethodBDT.h.

◆ fHuberQuantile

Double_t TMVA::MethodBDT::fHuberQuantile
private

the option string determining the quantile for the Huber Loss Function in BDT regression.

Definition at line 297 of file MethodBDT.h.

◆ fInverseBoostNegWeights

Bool_t TMVA::MethodBDT::fInverseBoostNegWeights
private

boost ev. with neg. weights with 1/boostweight rather than boostweight

Definition at line 257 of file MethodBDT.h.

◆ fIsHighBkgCut

std::vector<Bool_t> TMVA::MethodBDT::fIsHighBkgCut
private

Definition at line 292 of file MethodBDT.h.

◆ fIsHighSigCut

std::vector<Bool_t> TMVA::MethodBDT::fIsHighSigCut
private

Definition at line 291 of file MethodBDT.h.

◆ fIsLowBkgCut

std::vector<Bool_t> TMVA::MethodBDT::fIsLowBkgCut
private

Definition at line 290 of file MethodBDT.h.

◆ fIsLowSigCut

std::vector<Bool_t> TMVA::MethodBDT::fIsLowSigCut
private

Definition at line 289 of file MethodBDT.h.

◆ fITree

Int_t TMVA::MethodBDT::fITree
private

ntuple var: ith tree

Definition at line 265 of file MethodBDT.h.

◆ fLossFunctionEventInfo

std::map< const TMVA::Event*, LossFunctionEventInfo> TMVA::MethodBDT::fLossFunctionEventInfo
private

map event to true value, predicted value, and weight used by different loss functions for BDT regression

Definition at line 224 of file MethodBDT.h.

◆ fLowBkgCut

std::vector<Double_t> TMVA::MethodBDT::fLowBkgCut
private

Definition at line 285 of file MethodBDT.h.

◆ fLowSigCut

std::vector<Double_t> TMVA::MethodBDT::fLowSigCut
private

Definition at line 284 of file MethodBDT.h.

◆ fMaxDepth

UInt_t TMVA::MethodBDT::fMaxDepth
private

max depth

Definition at line 242 of file MethodBDT.h.

◆ fMinLinCorrForFisher

Double_t TMVA::MethodBDT::fMinLinCorrForFisher
private

the minimum linear correlation between two variables demanded for use in the Fisher criterion in node splitting

Definition at line 237 of file MethodBDT.h.

◆ fMinNodeEvents

Int_t TMVA::MethodBDT::fMinNodeEvents
private

min number of events in node

Definition at line 231 of file MethodBDT.h.

◆ fMinNodeSize

Float_t TMVA::MethodBDT::fMinNodeSize
private

min percentage of training events in node

Definition at line 232 of file MethodBDT.h.

◆ fMinNodeSizeS

TString TMVA::MethodBDT::fMinNodeSizeS
private

string containing min percentage of training events in node

Definition at line 233 of file MethodBDT.h.

◆ fMonitorNtuple

TTree* TMVA::MethodBDT::fMonitorNtuple
private

monitoring ntuple

Definition at line 264 of file MethodBDT.h.

◆ fNCuts

Int_t TMVA::MethodBDT::fNCuts
private

grid used in cut applied in node splitting

Definition at line 235 of file MethodBDT.h.

◆ fNegWeightTreatment

TString TMVA::MethodBDT::fNegWeightTreatment
private

variable that holds the option of how to treat negative event weights in training

Definition at line 255 of file MethodBDT.h.

◆ fNNodesMax

UInt_t TMVA::MethodBDT::fNNodesMax
private

max # of nodes

Definition at line 241 of file MethodBDT.h.

◆ fNodePurityLimit

Double_t TMVA::MethodBDT::fNodePurityLimit
private

purity limit for sig/bkg nodes

Definition at line 240 of file MethodBDT.h.

◆ fNoNegWeightsInTraining

Bool_t TMVA::MethodBDT::fNoNegWeightsInTraining
private

ignore negative event weights in the training

Definition at line 256 of file MethodBDT.h.

◆ fNTrees

Int_t TMVA::MethodBDT::fNTrees
private

number of decision trees requested

Definition at line 211 of file MethodBDT.h.

◆ fPairNegWeightsGlobal

Bool_t TMVA::MethodBDT::fPairNegWeightsGlobal
private

pair events with negative and positive weights in the training sample and "annihilate" them

Definition at line 258 of file MethodBDT.h.

◆ fPruneMethod

DecisionTree::EPruneMethod TMVA::MethodBDT::fPruneMethod
private

method used for pruning

Definition at line 244 of file MethodBDT.h.

◆ fPruneMethodS

TString TMVA::MethodBDT::fPruneMethodS
private

prune method option String

Definition at line 245 of file MethodBDT.h.

◆ fPruneStrength

Double_t TMVA::MethodBDT::fPruneStrength
private

parameter setting the "amount" of pruning; needs to be adjusted

Definition at line 246 of file MethodBDT.h.

◆ fRandomisedTrees

Bool_t TMVA::MethodBDT::fRandomisedTrees
private

choose a random subset of possible cut variables at each node during training

Definition at line 249 of file MethodBDT.h.

◆ fRegressionLossFunctionBDTG

LossFunctionBDT* TMVA::MethodBDT::fRegressionLossFunctionBDTG
private

Definition at line 299 of file MethodBDT.h.

◆ fRegressionLossFunctionBDTGS

TString TMVA::MethodBDT::fRegressionLossFunctionBDTGS
private

the option string determining the loss function for BDT regression

Definition at line 296 of file MethodBDT.h.

◆ fResiduals

std::map< const TMVA::Event*,std::vector<double> > TMVA::MethodBDT::fResiduals
private

individual event residuals for gradient boost

Definition at line 226 of file MethodBDT.h.

◆ fSepType

SeparationBase* TMVA::MethodBDT::fSepType
private

the separation used in node splitting

Definition at line 229 of file MethodBDT.h.

◆ fSepTypeS

TString TMVA::MethodBDT::fSepTypeS
private

the separation (option string) used in node splitting

Definition at line 230 of file MethodBDT.h.

◆ fShrinkage

Double_t TMVA::MethodBDT::fShrinkage
private

learning rate for gradient boost

Definition at line 219 of file MethodBDT.h.
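
To illustrate the role of a learning rate in gradient boosting, the toy sketch below adds each new summand to the model damped by a shrinkage factor. It deliberately uses squared-error loss and a constant weak learner (the mean residual) instead of decision trees and the loss functions named in the class description; the function name BoostConstantModel is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Toy gradient boosting for squared-error loss with the simplest possible weak
// learner: a constant equal to the mean residual. Illustrative only, not TMVA code.
double BoostConstantModel(const std::vector<double> &targets, double shrinkage, int nTrees)
{
   assert(!targets.empty() && shrinkage > 0.0 && shrinkage <= 1.0);
   double prediction = 0.0; // initial model F_0
   for (int m = 0; m < nTrees; ++m) {
      // For squared-error loss the negative gradient is the residual y - F.
      double meanResidual = 0.0;
      for (double y : targets) meanResidual += y - prediction;
      meanResidual /= targets.size();
      // Add the new summand, damped by the learning rate.
      prediction += shrinkage * meanResidual;
   }
   return prediction;
}
```

With a shrinkage below 1 the model approaches the target mean in smaller steps, so more summands are needed to reach the same fit.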

◆ fSigToBkgFraction

Double_t TMVA::MethodBDT::fSigToBkgFraction
private

Signal to Background fraction assumed during training.

Definition at line 214 of file MethodBDT.h.

◆ fSkipNormalization

Bool_t TMVA::MethodBDT::fSkipNormalization
private

true for skipping normalization at initialization of trees

Definition at line 276 of file MethodBDT.h.

◆ fSubSample

std::vector<const TMVA::Event*> TMVA::MethodBDT::fSubSample
private

subsample for bagged grad boost

Definition at line 208 of file MethodBDT.h.

◆ fTrainSample

std::vector<const TMVA::Event*>* TMVA::MethodBDT::fTrainSample
private

pointer to the sample actually used in training (for example fEventSample or fSubSample)

Definition at line 209 of file MethodBDT.h.

◆ fTrainWithNegWeights

Bool_t TMVA::MethodBDT::fTrainWithNegWeights
private

true if there are negative event weights and they are not ignored

Definition at line 259 of file MethodBDT.h.

◆ fUseExclusiveVars

Bool_t TMVA::MethodBDT::fUseExclusiveVars
private

individual variables already used in the Fisher criterion are no longer analysed individually for node splitting

Definition at line 238 of file MethodBDT.h.

◆ fUseFisherCuts

Bool_t TMVA::MethodBDT::fUseFisherCuts
private

use multivariate splits based on the Fisher criterion

Definition at line 236 of file MethodBDT.h.

◆ fUseNTrainEvents

UInt_t TMVA::MethodBDT::fUseNTrainEvents
private

number of randomly picked training events used in randomised (and bagged) trees

Definition at line 252 of file MethodBDT.h.

◆ fUseNvars

UInt_t TMVA::MethodBDT::fUseNvars
private

the number of variables used in the randomised tree splitting

Definition at line 250 of file MethodBDT.h.

◆ fUsePoissonNvars

Bool_t TMVA::MethodBDT::fUsePoissonNvars
private

use "fUseNvars" not as a fixed number but as the mean of a Poisson distribution in each split

Definition at line 251 of file MethodBDT.h.
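
The difference between a fixed number of variables per split and a Poisson-distributed one can be sketched with the standard library as follows. The helper name VarsForThisSplit and the clamping of the result to the range [1, nVarsTotal] are assumptions made for this example, not a description of the TMVA code.

```cpp
#include <algorithm>
#include <cassert>
#include <random>

// Hypothetical helper (not TMVA code): decide how many variables are
// considered as cut candidates at one node split.
unsigned VarsForThisSplit(unsigned useNvars, bool usePoisson, unsigned nVarsTotal, std::mt19937 &rng)
{
   assert(nVarsTotal > 0 && useNvars > 0);
   unsigned n = useNvars; // fixed number by default
   if (usePoisson) {
      // Draw the number from a Poisson distribution with mean useNvars.
      std::poisson_distribution<unsigned> dist(static_cast<double>(useNvars));
      n = dist(rng);
   }
   // Assumption: clamp to [1, nVarsTotal] so every split has a candidate.
   return std::max(1u, std::min(n, nVarsTotal));
}
```

In the Poisson case the number of candidate variables fluctuates from split to split around the configured mean.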

◆ fUseYesNoLeaf

Bool_t TMVA::MethodBDT::fUseYesNoLeaf
private

use a signal/background (yes/no) classification in leaf nodes instead of the signal purity

Definition at line 239 of file MethodBDT.h.
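
The two kinds of leaf response can be sketched as below: either a hard signal/background decision based on a purity limit, or the signal purity itself. The function name LeafResponse, the +1/-1 encoding and the default limit of 0.5 are assumptions made for this example.

```cpp
#include <cassert>

// Hypothetical helper (not TMVA code): response of a leaf node holding
// signal weight s and background weight b.
double LeafResponse(double s, double b, bool useYesNoLeaf, double purityLimit = 0.5)
{
   assert(s >= 0.0 && b >= 0.0 && s + b > 0.0);
   const double purity = s / (s + b); // signal purity of the leaf
   if (useYesNoLeaf)
      return purity > purityLimit ? 1.0 : -1.0; // hard signal/background decision
   return purity; // continuous response
}
```

For a leaf with signal weight 3 and background weight 1, the yes/no response is +1 while the purity response is 0.75.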

◆ fValidationSample

std::vector<const TMVA::Event*> TMVA::MethodBDT::fValidationSample
private

the validation events

Definition at line 207 of file MethodBDT.h.

◆ fVariableImportance

std::vector<Double_t> TMVA::MethodBDT::fVariableImportance
private

the relative importance of the different variables

Definition at line 278 of file MethodBDT.h.
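
One simple way to obtain relative importances for a forest is to sum per-tree importances, weighted by each tree's weight, and normalise the result to unit sum. The sketch below shows this under those assumptions; the function name RelativeImportance and the input layout are hypothetical, and the actual TMVA computation may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not TMVA code).
// perTree[t][v]: importance of variable v in tree t; treeWeights[t]: weight of tree t.
std::vector<double> RelativeImportance(const std::vector<std::vector<double>> &perTree,
                                       const std::vector<double> &treeWeights)
{
   assert(!perTree.empty() && perTree.size() == treeWeights.size());
   std::vector<double> total(perTree.front().size(), 0.0);
   for (std::size_t t = 0; t < perTree.size(); ++t) {
      assert(perTree[t].size() == total.size());
      for (std::size_t v = 0; v < total.size(); ++v)
         total[v] += treeWeights[t] * perTree[t][v]; // weighted sum over trees
   }
   double sum = 0.0;
   for (double x : total) sum += x;
   if (sum > 0.0)
      for (double &x : total) x /= sum; // normalise to unit sum
   return total;
}
```

After normalisation the entries can be compared directly as fractions of the total importance.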

The documentation for this class was generated from the following files: