ROOT 6.14/05
Reference Guide
TMVAMultipleBackgroundExample.C File Reference

Detailed Description

This example shows the training of signal against three different backgrounds. In the application step, a tree is created that contains all signal and background events, with the true class ID and the three classifier outputs added to each event. Finally, using this application tree, the significance is maximized with the help of the TMVA genetic algorithm.
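As a rough illustration of how one cut per classifier output acts on the combined tree, the sketch below counts selected signal and background entries. It is a minimal sketch and not code from the macro: `CombinedEvent`, `PassesCuts`, `CountSelected`, the class-ID encoding (0 for signal, anything else for background), and the "response greater than cut" convention are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for one entry of the combined application tree.
struct CombinedEvent {
   int classID;                     // assumed encoding: 0 = signal, otherwise background
   std::array<double, 3> response;  // outputs of the three trained classifiers
};

// An event is selected only if every classifier output exceeds its cut
// (the direction of the comparison is an assumption).
bool PassesCuts(const CombinedEvent &ev, const std::array<double, 3> &cuts)
{
   for (std::size_t i = 0; i < cuts.size(); ++i) {
      if (ev.response[i] <= cuts[i]) return false;
   }
   return true;
}

struct Counts {
   int truePositive = 0;   // selected signal events
   int falsePositive = 0;  // selected background events
};

// Count how many signal and background events survive the three cuts.
Counts CountSelected(const std::vector<CombinedEvent> &events,
                     const std::array<double, 3> &cuts)
{
   Counts counts;
   for (const CombinedEvent &ev : events) {
      if (!PassesCuts(ev, cuts)) continue;
      if (ev.classID == 0) ++counts.truePositive;
      else ++counts.falsePositive;
   }
   return counts;
}
```

An optimizer such as a genetic algorithm would then vary the three cut values and keep the combination that gives the best figure of merit built from these counts.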

0.0191218852997
22.8295149803
Processing /mnt/build/workspace/root-makedoc-v614/rootspi/rdoc/src/v6-14-00-patches/tutorials/tmva/TMVAMultipleBackgroundExample.C...
Start Test TMVAGAexample
========================
... event: 0 (200)
======> EVENT:0
var1 = -1.14361
var2 = -0.822373
var3 = -0.395426
var4 = -0.529427
created tree: TreeS
... event: 0 (200)
======> EVENT:0
var1 = -1.54361
var2 = -1.42237
var3 = -1.39543
var4 = -2.02943
created tree: TreeB0
... event: 0 (200)
======> EVENT:0
var1 = -1.54361
var2 = -0.822373
var3 = -0.395426
var4 = -2.02943
created tree: TreeB1
======> EVENT:0
var1 = 0.463304
var2 = 1.37192
var3 = -1.16769
var4 = -1.77551
created tree: TreeB2
created data file: tmva_example_multiple_background.root
========================
--- Training
<HEADER> DataSetInfo : [datasetBkg0] : Added class "Signal"
: Add Tree TreeS of type Signal with 200 events
<HEADER> DataSetInfo : [datasetBkg0] : Added class "Background"
: Add Tree TreeB0 of type Background with 200 events
<HEADER> Factory : Booking method: BDTG
:
: the option *InverseBoostNegWeights* does not exist for BoostType=Grad --> change
: to new default for GradBoost *Pray*
<HEADER> DataSetFactory : [datasetBkg0] : Number of events in input trees
:
:
: Number of training and testing events
: ---------------------------------------------------------------------------
: Signal -- training events : 100
: Signal -- testing events : 100
: Signal -- training and testing events: 200
: Background -- training events : 100
: Background -- testing events : 100
: Background -- training and testing events: 200
:
<HEADER> DataSetInfo : Correlation matrix (Signal):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 +0.485 +0.637 +0.878
: var2: +0.485 +1.000 +0.752 +0.759
: var3: +0.637 +0.752 +1.000 +0.840
: var4: +0.878 +0.759 +0.840 +1.000
: ----------------------------------------
<HEADER> DataSetInfo : Correlation matrix (Background):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 +0.377 +0.577 +0.847
: var2: +0.377 +1.000 +0.745 +0.722
: var3: +0.577 +0.745 +1.000 +0.811
: var4: +0.847 +0.722 +0.811 +1.000
: ----------------------------------------
<HEADER> DataSetFactory : [datasetBkg0] :
:
<HEADER> Factory : Train all methods
<HEADER> Factory : [datasetBkg0] : Create Transformation "I" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> Factory : [datasetBkg0] : Create Transformation "D" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> Factory : [datasetBkg0] : Create Transformation "P" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> Factory : [datasetBkg0] : Create Transformation "G" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> Factory : [datasetBkg0] : Create Transformation "D" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.066427 1.0417 [ -3.1150 2.9998 ]
: var2: 0.074159 1.0451 [ -3.4854 3.1113 ]
: var3: 0.11230 1.1191 [ -3.0033 3.9796 ]
: var4: 0.25340 1.3586 [ -3.2294 4.1179 ]
: -----------------------------------------------------------
: Preparing the Decorrelation transformation...
<HEADER> TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: -0.089897 1.0000 [ -2.8690 2.6768 ]
: var2: -0.048622 1.0000 [ -3.1024 2.5656 ]
: var3: -0.019979 1.0000 [ -2.8162 3.4529 ]
: var4: 0.31232 1.0000 [ -1.8094 2.4786 ]
: -----------------------------------------------------------
: Preparing the Principle Component (PCA) transformation...
<HEADER> TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1:-1.4540e-09 2.0807 [ -5.7703 6.1568 ]
: var2: 4.0047e-10 0.78255 [ -2.1728 2.0976 ]
: var3:-4.5751e-10 0.47194 [ -1.3320 1.1953 ]
: var4:-5.3842e-10 0.33329 [ -0.78875 0.87706 ]
: -----------------------------------------------------------
: Preparing the Gaussian transformation...
: Preparing the Decorrelation transformation...
<HEADER> TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.15835 1.0000 [ -1.3229 6.2791 ]
: var2: 0.12263 1.0000 [ -2.5143 6.0808 ]
: var3: 0.14347 1.0000 [ -1.7961 6.9066 ]
: var4: 0.048926 1.0000 [ -2.5286 6.0560 ]
: -----------------------------------------------------------
: Ranking input variables (method unspecific)...
<HEADER> IdTransformation : Ranking result (top variable is best ranked)
: -----------------------------------
: Rank : Variable : Separation
: -----------------------------------
: 1 : Variable 4 : 4.424e-01
: 2 : Variable 3 : 3.801e-01
: 3 : Variable 2 : 2.435e-01
: 4 : Variable 1 : 1.922e-01
: -----------------------------------
<HEADER> Factory : Train method: BDTG for Classification
:
<HEADER> BDTG : #events: (reweighted) sig: 100 bkg: 100
: #events: (unweighted) sig: 100 bkg: 100
: Training 1000 Decision Trees ... patience please
: Elapsed time for training with 200 events: 0.248 sec
<HEADER> BDTG : [datasetBkg0] : Evaluation of BDTG on training sample (200 events)
: Elapsed time for evaluation of 200 events: 0.0194 sec
: Creating xml weight file: datasetBkg0/weights/TMVAMultiBkg0_BDTG.weights.xml
: Creating standalone class: datasetBkg0/weights/TMVAMultiBkg0_BDTG.class.C
<HEADER> Factory : Training finished
:
: Ranking input variables (method specific)...
<HEADER> BDTG : Ranking result (top variable is best ranked)
: --------------------------------------
: Rank : Variable : Variable Importance
: --------------------------------------
: 1 : var1 : 2.838e-01
: 2 : var2 : 2.537e-01
: 3 : var4 : 2.384e-01
: 4 : var3 : 2.240e-01
: --------------------------------------
<HEADER> Factory : === Destroy and recreate all methods via weight files for testing ===
:
: Reading weight file: datasetBkg0/weights/TMVAMultiBkg0_BDTG.weights.xml
<HEADER> Factory : Test all methods
<HEADER> Factory : Test method: BDTG for Classification performance
:
<HEADER> BDTG : [datasetBkg0] : Evaluation of BDTG on testing sample (200 events)
: Elapsed time for evaluation of 200 events: 0.0122 sec
<HEADER> Factory : Evaluate all methods
<HEADER> Factory : Evaluate classifier: BDTG
:
<HEADER> BDTG : [datasetBkg0] : Loop over test events and fill histograms with classifier response...
:
<HEADER> TFHandler_BDTG : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.072229 0.95447 [ -2.7150 2.2789 ]
: var2: 0.026802 0.96431 [ -3.6952 2.5113 ]
: var3: 0.14087 1.0567 [ -3.3587 3.3281 ]
: var4: 0.27038 1.2168 [ -3.7913 3.5074 ]
: -----------------------------------------------------------
:
: Evaluation results ranked by best signal efficiency and purity (area)
: -------------------------------------------------------------------------------------------------------------------
: DataSet MVA
: Name: Method: ROC-integ
: datasetBkg0 BDTG : 0.953
: -------------------------------------------------------------------------------------------------------------------
:
: Testing efficiency compared to training efficiency (overtraining check)
: -------------------------------------------------------------------------------------------------------------------
: DataSet MVA Signal efficiency: from test sample (from training sample)
: Name: Method: @B=0.01 @B=0.10 @B=0.30
: -------------------------------------------------------------------------------------------------------------------
: datasetBkg0 BDTG : 0.000 (0.985) 0.905 (0.987) 0.976 (0.991)
: -------------------------------------------------------------------------------------------------------------------
:
<HEADER> Dataset:datasetBkg0 : Created tree 'TestTree' with 200 events
:
<HEADER> Dataset:datasetBkg0 : Created tree 'TrainTree' with 200 events
:
<HEADER> Factory : Thank you for using TMVA!
: For citation information, please visit: http://tmva.sf.net/citeTMVA.html
<HEADER> DataSetInfo : [datasetBkg1] : Added class "Signal"
: Add Tree TreeS of type Signal with 200 events
<HEADER> DataSetInfo : [datasetBkg1] : Added class "Background"
: Add Tree TreeB1 of type Background with 200 events
<HEADER> Factory : Booking method: BDTG
:
: the option *InverseBoostNegWeights* does not exist for BoostType=Grad --> change
: to new default for GradBoost *Pray*
<HEADER> DataSetFactory : [datasetBkg1] : Number of events in input trees
:
:
: Number of training and testing events
: ---------------------------------------------------------------------------
: Signal -- training events : 100
: Signal -- testing events : 100
: Signal -- training and testing events: 200
: Background -- training events : 100
: Background -- testing events : 100
: Background -- training and testing events: 200
:
<HEADER> DataSetInfo : Correlation matrix (Signal):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 +0.485 +0.637 +0.878
: var2: +0.485 +1.000 +0.752 +0.759
: var3: +0.637 +0.752 +1.000 +0.840
: var4: +0.878 +0.759 +0.840 +1.000
: ----------------------------------------
<HEADER> DataSetInfo : Correlation matrix (Background):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 +0.377 +0.577 +0.847
: var2: +0.377 +1.000 +0.745 +0.722
: var3: +0.577 +0.745 +1.000 +0.811
: var4: +0.847 +0.722 +0.811 +1.000
: ----------------------------------------
<HEADER> DataSetFactory : [datasetBkg1] :
:
<HEADER> Factory : Train all methods
<HEADER> Factory : [datasetBkg1] : Create Transformation "I" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> Factory : [datasetBkg1] : Create Transformation "D" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> Factory : [datasetBkg1] : Create Transformation "P" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> Factory : [datasetBkg1] : Create Transformation "G" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> Factory : [datasetBkg1] : Create Transformation "D" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.066427 1.0417 [ -3.1150 2.9998 ]
: var2: 0.37416 0.97541 [ -3.0952 3.1113 ]
: var3: 0.61230 0.96750 [ -2.3587 3.9796 ]
: var4: 0.25340 1.3586 [ -3.2294 4.1179 ]
: -----------------------------------------------------------
: Preparing the Decorrelation transformation...
<HEADER> TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: -0.15565 1.0000 [ -2.9801 2.6746 ]
: var2: 0.15984 1.0000 [ -2.9641 2.4763 ]
: var3: 0.73277 1.0000 [ -1.9228 4.1869 ]
: var4: 0.020567 1.0000 [ -2.0336 2.3391 ]
: -----------------------------------------------------------
: Preparing the Principle Component (PCA) transformation...
<HEADER> TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1:-9.2201e-10 1.9235 [ -5.3639 5.7144 ]
: var2: 1.3318e-09 0.81666 [ -2.6634 2.0151 ]
: var3:-1.1642e-10 0.52391 [ -1.7345 1.3129 ]
: var4:-6.6590e-10 0.42084 [ -0.86901 1.1757 ]
: -----------------------------------------------------------
: Preparing the Gaussian transformation...
: Preparing the Decorrelation transformation...
<HEADER> TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.14994 1.0000 [ -1.2992 6.2304 ]
: var2: 0.14446 1.0000 [ -2.1183 5.6897 ]
: var3: 0.091479 1.0000 [ -1.8403 6.2664 ]
: var4: 0.092468 1.0000 [ -2.1129 5.4495 ]
: -----------------------------------------------------------
: Ranking input variables (method unspecific)...
<HEADER> IdTransformation : Ranking result (top variable is best ranked)
: -----------------------------------
: Rank : Variable : Separation
: -----------------------------------
: 1 : Variable 4 : 4.424e-01
: 2 : Variable 1 : 1.922e-01
: 3 : Variable 2 : 1.264e-01
: 4 : Variable 3 : 7.836e-02
: -----------------------------------
<HEADER> Factory : Train method: BDTG for Classification
:
<HEADER> BDTG : #events: (reweighted) sig: 100 bkg: 100
: #events: (unweighted) sig: 100 bkg: 100
: Training 1000 Decision Trees ... patience please
: Elapsed time for training with 200 events: 0.245 sec
<HEADER> BDTG : [datasetBkg1] : Evaluation of BDTG on training sample (200 events)
: Elapsed time for evaluation of 200 events: 0.025 sec
: Creating xml weight file: datasetBkg1/weights/TMVAMultiBkg1_BDTG.weights.xml
: Creating standalone class: datasetBkg1/weights/TMVAMultiBkg1_BDTG.class.C
<HEADER> Factory : Training finished
:
: Ranking input variables (method specific)...
<HEADER> BDTG : Ranking result (top variable is best ranked)
: --------------------------------------
: Rank : Variable : Variable Importance
: --------------------------------------
: 1 : var1 : 2.933e-01
: 2 : var4 : 2.742e-01
: 3 : var2 : 2.180e-01
: 4 : var3 : 2.146e-01
: --------------------------------------
<HEADER> Factory : === Destroy and recreate all methods via weight files for testing ===
:
: Reading weight file: datasetBkg1/weights/TMVAMultiBkg1_BDTG.weights.xml
<HEADER> Factory : Test all methods
<HEADER> Factory : Test method: BDTG for Classification performance
:
<HEADER> BDTG : [datasetBkg1] : Evaluation of BDTG on testing sample (200 events)
: Elapsed time for evaluation of 200 events: 0.0116 sec
<HEADER> Factory : Evaluate all methods
<HEADER> Factory : Evaluate classifier: BDTG
:
<HEADER> BDTG : [datasetBkg1] : Loop over test events and fill histograms with classifier response...
:
<HEADER> TFHandler_BDTG : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.072229 0.95447 [ -2.7150 2.2789 ]
: var2: 0.32680 0.94378 [ -3.0952 3.1113 ]
: var3: 0.64087 0.96582 [ -2.3587 3.9796 ]
: var4: 0.27038 1.2168 [ -3.7913 3.5074 ]
: -----------------------------------------------------------
:
: Evaluation results ranked by best signal efficiency and purity (area)
: -------------------------------------------------------------------------------------------------------------------
: DataSet MVA
: Name: Method: ROC-integ
: datasetBkg1 BDTG : 0.989
: -------------------------------------------------------------------------------------------------------------------
:
: Testing efficiency compared to training efficiency (overtraining check)
: -------------------------------------------------------------------------------------------------------------------
: DataSet MVA Signal efficiency: from test sample (from training sample)
: Name: Method: @B=0.01 @B=0.10 @B=0.30
: -------------------------------------------------------------------------------------------------------------------
: datasetBkg1 BDTG : 0.000 (1.000) 1.000 (1.000) 1.000 (1.000)
: -------------------------------------------------------------------------------------------------------------------
:
<HEADER> Dataset:datasetBkg1 : Created tree 'TestTree' with 200 events
:
<HEADER> Dataset:datasetBkg1 : Created tree 'TrainTree' with 200 events
:
<HEADER> Factory : Thank you for using TMVA!
: For citation information, please visit: http://tmva.sf.net/citeTMVA.html
<HEADER> DataSetInfo : [datasetBkg2] : Added class "Signal"
: Add Tree TreeS of type Signal with 200 events
<HEADER> DataSetInfo : [datasetBkg2] : Added class "Background"
: Add Tree TreeB2 of type Background with 200 events
<HEADER> Factory : Booking method: BDTG
:
: the option *InverseBoostNegWeights* does not exist for BoostType=Grad --> change
: to new default for GradBoost *Pray*
<HEADER> DataSetFactory : [datasetBkg2] : Number of events in input trees
:
:
: Number of training and testing events
: ---------------------------------------------------------------------------
: Signal -- training events : 100
: Signal -- testing events : 100
: Signal -- training and testing events: 200
: Background -- training events : 100
: Background -- testing events : 100
: Background -- training and testing events: 200
:
<HEADER> DataSetInfo : Correlation matrix (Signal):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 +0.485 +0.637 +0.878
: var2: +0.485 +1.000 +0.752 +0.759
: var3: +0.637 +0.752 +1.000 +0.840
: var4: +0.878 +0.759 +0.840 +1.000
: ----------------------------------------
<HEADER> DataSetInfo : Correlation matrix (Background):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 -0.656 -0.044 +0.068
: var2: -0.656 +1.000 -0.013 -0.139
: var3: -0.044 -0.013 +1.000 +0.110
: var4: +0.068 -0.139 +0.110 +1.000
: ----------------------------------------
<HEADER> DataSetFactory : [datasetBkg2] :
:
<HEADER> Factory : Train all methods
<HEADER> Factory : [datasetBkg2] : Create Transformation "I" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> Factory : [datasetBkg2] : Create Transformation "D" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> Factory : [datasetBkg2] : Create Transformation "P" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> Factory : [datasetBkg2] : Create Transformation "G" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> Factory : [datasetBkg2] : Create Transformation "D" with events from all classes.
:
<HEADER> : Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
<HEADER> TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.35135 0.91590 [ -2.1665 2.9998 ]
: var2: 0.72107 0.88032 [ -3.0952 3.1113 ]
: var3: 0.29319 1.1286 [ -2.3587 3.9796 ]
: var4: 0.65463 1.1780 [ -2.2913 4.1179 ]
: -----------------------------------------------------------
: Preparing the Decorrelation transformation...
<HEADER> TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.25774 1.0000 [ -2.0792 2.7730 ]
: var2: 0.77022 1.0000 [ -3.2294 3.1618 ]
: var3: 0.024586 1.0000 [ -2.2489 2.6129 ]
: var4: 0.45801 1.0000 [ -2.3000 2.5395 ]
: -----------------------------------------------------------
: Preparing the Principle Component (PCA) transformation...
<HEADER> TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 9.5926e-10 1.5373 [ -5.3473 5.5326 ]
: var2: 8.9407e-10 0.88855 [ -2.2471 2.6430 ]
: var3:-2.9337e-10 0.79188 [ -2.3380 1.9125 ]
: var4: 4.9826e-10 0.70386 [ -1.5948 2.1465 ]
: -----------------------------------------------------------
: Preparing the Gaussian transformation...
: Preparing the Decorrelation transformation...
<HEADER> TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.15141 1.0000 [ -1.6172 5.6829 ]
: var2: 0.17168 1.0000 [ -1.5359 5.4248 ]
: var3: 0.14179 1.0000 [ -1.8210 5.3102 ]
: var4: 0.10065 1.0000 [ -2.3131 4.5774 ]
: -----------------------------------------------------------
: Ranking input variables (method unspecific)...
<HEADER> IdTransformation : Ranking result (top variable is best ranked)
: -----------------------------------
: Rank : Variable : Separation
: -----------------------------------
: 1 : Variable 2 : 3.627e-01
: 2 : Variable 4 : 3.197e-01
: 3 : Variable 3 : 2.418e-01
: 4 : Variable 1 : 1.907e-01
: -----------------------------------
<HEADER> Factory : Train method: BDTG for Classification
:
<HEADER> BDTG : #events: (reweighted) sig: 100 bkg: 100
: #events: (unweighted) sig: 100 bkg: 100
: Training 1000 Decision Trees ... patience please
: Elapsed time for training with 200 events: 0.273 sec
<HEADER> BDTG : [datasetBkg2] : Evaluation of BDTG on training sample (200 events)
: Elapsed time for evaluation of 200 events: 0.0266 sec
: Creating xml weight file: datasetBkg2/weights/TMVAMultiBkg2_BDTG.weights.xml
: Creating standalone class: datasetBkg2/weights/TMVAMultiBkg2_BDTG.class.C
<HEADER> Factory : Training finished
:
: Ranking input variables (method specific)...
<HEADER> BDTG : Ranking result (top variable is best ranked)
: --------------------------------------
: Rank : Variable : Variable Importance
: --------------------------------------
: 1 : var2 : 2.722e-01
: 2 : var1 : 2.666e-01
: 3 : var3 : 2.432e-01
: 4 : var4 : 2.180e-01
: --------------------------------------
<HEADER> Factory : === Destroy and recreate all methods via weight files for testing ===
:
: Reading weight file: datasetBkg2/weights/TMVAMultiBkg2_BDTG.weights.xml
<HEADER> Factory : Test all methods
<HEADER> Factory : Test method: BDTG for Classification performance
:
<HEADER> BDTG : [datasetBkg2] : Evaluation of BDTG on testing sample (200 events)
: Elapsed time for evaluation of 200 events: 0.0123 sec
<HEADER> Factory : Evaluate all methods
<HEADER> Factory : Evaluate classifier: BDTG
:
<HEADER> BDTG : [datasetBkg2] : Loop over test events and fill histograms with classifier response...
:
<HEADER> TFHandler_BDTG : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.26457 0.87243 [ -2.7150 2.2789 ]
: var2: 0.63463 0.90997 [ -2.8854 2.3222 ]
: var3: 0.29991 1.0505 [ -2.0033 3.3281 ]
: var4: 0.49000 1.1314 [ -1.8141 3.5074 ]
: -----------------------------------------------------------
:
: Evaluation results ranked by best signal efficiency and purity (area)
: -------------------------------------------------------------------------------------------------------------------
: DataSet MVA
: Name: Method: ROC-integ
: datasetBkg2 BDTG : 0.961
: -------------------------------------------------------------------------------------------------------------------
:
: Testing efficiency compared to training efficiency (overtraining check)
: -------------------------------------------------------------------------------------------------------------------
: DataSet MVA Signal efficiency: from test sample (from training sample)
: Name: Method: @B=0.01 @B=0.10 @B=0.30
: -------------------------------------------------------------------------------------------------------------------
: datasetBkg2 BDTG : 0.000 (0.936) 0.898 (0.946) 0.950 (0.958)
: -------------------------------------------------------------------------------------------------------------------
:
<HEADER> Dataset:datasetBkg2 : Created tree 'TestTree' with 200 events
:
<HEADER> Dataset:datasetBkg2 : Created tree 'TrainTree' with 200 events
:
<HEADER> Factory : Thank you for using TMVA!
: For citation information, please visit: http://tmva.sf.net/citeTMVA.html
========================
--- Application & create combined tree
: Booking "BDT method" of type "BDT" from datasetBkg0/weights/TMVAMultiBkg0_BDTG.weights.xml.
: Reading weight file: datasetBkg0/weights/TMVAMultiBkg0_BDTG.weights.xml
<HEADER> DataSetInfo : [Default] : Added class "Signal"
<HEADER> DataSetInfo : [Default] : Added class "Background"
: Booked classifier "BDTG" of type: "BDT"
: Booking "BDT method" of type "BDT" from datasetBkg1/weights/TMVAMultiBkg1_BDTG.weights.xml.
: Reading weight file: datasetBkg1/weights/TMVAMultiBkg1_BDTG.weights.xml
<HEADER> DataSetInfo : [Default] : Added class "Signal"
<HEADER> DataSetInfo : [Default] : Added class "Background"
: Booked classifier "BDTG" of type: "BDT"
: Booking "BDT method" of type "BDT" from datasetBkg2/weights/TMVAMultiBkg2_BDTG.weights.xml.
: Reading weight file: datasetBkg2/weights/TMVAMultiBkg2_BDTG.weights.xml
<HEADER> DataSetInfo : [Default] : Added class "Signal"
<HEADER> DataSetInfo : [Default] : Added class "Background"
: Booked classifier "BDTG" of type: "BDT"
--- Select signal sample
--- Processing: 200 events
--- ... Processing event: 0
--- End of event loop: Real time 0:00:00, CP time 0.030
--- Select background 0 sample
--- Processing: 200 events
--- ... Processing event: 0
--- End of event loop: Real time 0:00:00, CP time 0.040
--- Select background 1 sample
--- Processing: 200 events
--- ... Processing event: 0
--- End of event loop: Real time 0:00:00, CP time 0.040
--- Select background 2 sample
--- Processing: 200 events
--- ... Processing event: 0
--- End of event loop: Real time 0:00:00, CP time 0.040
--- Created root file: "tmva_example_multiple_backgrounds__applied.root" containing the MVA output histograms
==> Application of readers is done! combined tree created
========================
--- maximize significance
Classifier ranges (defined by the user)
range: -1 1
range: -1 1
range: -1 1
<HEADER> FitterBase : <GeneticFitter> Optimisation, please be patient ... (inaccurate progress timing for GA)
: Elapsed time: 16.3 sec
======================
Efficiency : 0.93
Purity : 0.907317
True positive weights : 186
False positive weights: 19
Signal weights : 200
cutValue[0] = 0.719237;
cutValue[1] = -0.998596;
cutValue[2] = -0.999939;
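The efficiency and purity printed at the end of the log are consistent with the usual definitions: efficiency is the selected signal weight divided by the total signal weight (186 / 200 = 0.93), and purity is the selected signal weight divided by all selected weight (186 / (186 + 19) ≈ 0.907317). The sketch below reproduces that arithmetic. The function names are hypothetical, and the definitions are inferred from the logged numbers rather than taken from the part of the macro shown here.

```cpp
#include <cassert>

// Efficiency: fraction of the total signal weight that passes the cuts.
double Efficiency(double truePositive, double totalSignal)
{
   assert(totalSignal > 0.0);
   return truePositive / totalSignal;
}

// Purity: fraction of the selected weight that is true signal.
double Purity(double truePositive, double falsePositive)
{
   assert(truePositive + falsePositive > 0.0);
   return truePositive / (truePositive + falsePositive);
}
```

With the values from the log, `Efficiency(186, 200)` and `Purity(186, 19)` give the two figures reported above.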
#include <iostream> // Stream declarations
#include <vector>
#include <limits>
#include "TChain.h"
#include "TCut.h"
#include "TDirectory.h"
#include "TH1F.h"
#include "TH1.h"
#include "TMath.h"
#include "TFile.h"
#include "TStopwatch.h"
#include "TROOT.h"
#include "TSystem.h"
#include "TMVA/Factory.h"
#include "TMVA/DataLoader.h"//required to load dataset
#include "TMVA/Reader.h"
using namespace std;
using namespace TMVA;
// ----------------------------------------------------------------------------------------------
// Training
// ----------------------------------------------------------------------------------------------
//
void Training(){
std::string factoryOptions( "!V:!Silent:Transformations=I;D;P;G,D:AnalysisType=Classification" );
TString fname = "./tmva_example_multiple_background.root";
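TFile *input = TFile::Open( fname );
// stop early if the input file is missing or unreadable
if (!input || input->IsZombie()) {
   std::cout << "ERROR: could not open data file " << fname << std::endl;
   return;
}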
TTree *signal = (TTree*)input->Get("TreeS");
TTree *background0 = (TTree*)input->Get("TreeB0");
TTree *background1 = (TTree*)input->Get("TreeB1");
TTree *background2 = (TTree*)input->Get("TreeB2");
/// global event weights per tree (see below for setting event-wise weights)
Double_t signalWeight = 1.0;
Double_t background0Weight = 1.0;
Double_t background1Weight = 1.0;
Double_t background2Weight = 1.0;
// Create a new root output file.
TString outfileName( "TMVASignalBackground0.root" );
TFile* outputFile = TFile::Open( outfileName, "RECREATE" );
// background 0
// ____________
TMVA::Factory *factory = new TMVA::Factory( "TMVAMultiBkg0", outputFile, factoryOptions );
TMVA::DataLoader *dataloader=new TMVA::DataLoader("datasetBkg0");
dataloader->AddVariable( "var1", "Variable 1", "", 'F' );
dataloader->AddVariable( "var2", "Variable 2", "", 'F' );
dataloader->AddVariable( "var3", "Variable 3", "units", 'F' );
dataloader->AddVariable( "var4", "Variable 4", "units", 'F' );
dataloader->AddSignalTree ( signal, signalWeight );
dataloader->AddBackgroundTree( background0, background0Weight );
// dataloader->SetBackgroundWeightExpression("weight");
TCut mycuts = ""; // for example: TCut mycuts = "abs(var1)<0.5 && abs(var2-0.5)<1";
TCut mycutb = ""; // for example: TCut mycutb = "abs(var1)<0.5";
// tell the dataloader to use all remaining events in the trees after training for testing:
dataloader->PrepareTrainingAndTestTree( mycuts, mycutb,
"nTrain_Signal=0:nTrain_Background=0:SplitMode=Random:NormMode=NumEvents:!V" );
// Boosted Decision Trees
factory->BookMethod( dataloader, TMVA::Types::kBDT, "BDTG",
"!H:!V:NTrees=1000:BoostType=Grad:Shrinkage=0.30:UseBaggedBoost:BaggedSampleFraction=0.6:SeparationType=GiniIndex:nCuts=20:MaxDepth=2" );
factory->TrainAllMethods();
factory->TestAllMethods();
factory->EvaluateAllMethods();
outputFile->Close();
delete factory;
delete dataloader;
// background 1
// ____________
outfileName = "TMVASignalBackground1.root";
outputFile = TFile::Open( outfileName, "RECREATE" );
dataloader=new TMVA::DataLoader("datasetBkg1");
factory = new TMVA::Factory( "TMVAMultiBkg1", outputFile, factoryOptions );
dataloader->AddVariable( "var1", "Variable 1", "", 'F' );
dataloader->AddVariable( "var2", "Variable 2", "", 'F' );
dataloader->AddVariable( "var3", "Variable 3", "units", 'F' );
dataloader->AddVariable( "var4", "Variable 4", "units", 'F' );
dataloader->AddSignalTree ( signal, signalWeight );
dataloader->AddBackgroundTree( background1, background1Weight );
// dataloader->SetBackgroundWeightExpression("weight");
// tell the dataloader to use all remaining events in the trees after training for testing:
dataloader->PrepareTrainingAndTestTree( mycuts, mycutb,
"nTrain_Signal=0:nTrain_Background=0:SplitMode=Random:NormMode=NumEvents:!V" );
// Boosted Decision Trees
factory->BookMethod( dataloader, TMVA::Types::kBDT, "BDTG",
"!H:!V:NTrees=1000:BoostType=Grad:Shrinkage=0.30:UseBaggedBoost:BaggedSampleFraction=0.6:SeparationType=GiniIndex:nCuts=20:MaxDepth=2" );
factory->TrainAllMethods();
factory->TestAllMethods();
factory->EvaluateAllMethods();
outputFile->Close();
delete factory;
delete dataloader;
// background 2
// ____________
outfileName = "TMVASignalBackground2.root";
outputFile = TFile::Open( outfileName, "RECREATE" );
factory = new TMVA::Factory( "TMVAMultiBkg2", outputFile, factoryOptions );
dataloader=new TMVA::DataLoader("datasetBkg2");
dataloader->AddVariable( "var1", "Variable 1", "", 'F' );
dataloader->AddVariable( "var2", "Variable 2", "", 'F' );
dataloader->AddVariable( "var3", "Variable 3", "units", 'F' );
dataloader->AddVariable( "var4", "Variable 4", "units", 'F' );
dataloader->AddSignalTree ( signal, signalWeight );
dataloader->AddBackgroundTree( background2, background2Weight );
// dataloader->SetBackgroundWeightExpression("weight");
// tell the dataloader to use all remaining events in the trees after training for testing:
dataloader->PrepareTrainingAndTestTree( mycuts, mycutb,
"nTrain_Signal=0:nTrain_Background=0:SplitMode=Random:NormMode=NumEvents:!V" );
// Boosted Decision Trees
factory->BookMethod( dataloader, TMVA::Types::kBDT, "BDTG",
"!H:!V:NTrees=1000:BoostType=Grad:Shrinkage=0.30:UseBaggedBoost:BaggedSampleFraction=0.5:SeparationType=GiniIndex:nCuts=20:MaxDepth=2" );
factory->TrainAllMethods();
factory->TestAllMethods();
factory->EvaluateAllMethods();
outputFile->Close();
delete factory;
delete dataloader;
}
// ----------------------------------------------------------------------------------------------
// Application
// ----------------------------------------------------------------------------------------------
//
// create a summary tree with all signal and background events and for each event the three classifier values and the true classID
void ApplicationCreateCombinedTree(){
// Create a new root output file.
TString outfileName( "tmva_example_multiple_backgrounds__applied.root" );
TFile* outputFile = TFile::Open( outfileName, "RECREATE" );
TTree* outputTree = new TTree("multiBkg","multiple backgrounds tree");
Float_t var1, var2;
Float_t var3, var4;
Int_t classID = 0;
Float_t weight = 1.f;
Float_t classifier0, classifier1, classifier2;
outputTree->Branch("classID", &classID, "classID/I");
outputTree->Branch("var1", &var1, "var1/F");
outputTree->Branch("var2", &var2, "var2/F");
outputTree->Branch("var3", &var3, "var3/F");
outputTree->Branch("var4", &var4, "var4/F");
outputTree->Branch("weight", &weight, "weight/F");
outputTree->Branch("cls0", &classifier0, "cls0/F");
outputTree->Branch("cls1", &classifier1, "cls1/F");
outputTree->Branch("cls2", &classifier2, "cls2/F");
// create three readers, one for each signal/background classification (i.e. one per background)
TMVA::Reader *reader0 = new TMVA::Reader( "!Color:!Silent" );
TMVA::Reader *reader1 = new TMVA::Reader( "!Color:!Silent" );
TMVA::Reader *reader2 = new TMVA::Reader( "!Color:!Silent" );
reader0->AddVariable( "var1", &var1 );
reader0->AddVariable( "var2", &var2 );
reader0->AddVariable( "var3", &var3 );
reader0->AddVariable( "var4", &var4 );
reader1->AddVariable( "var1", &var1 );
reader1->AddVariable( "var2", &var2 );
reader1->AddVariable( "var3", &var3 );
reader1->AddVariable( "var4", &var4 );
reader2->AddVariable( "var1", &var1 );
reader2->AddVariable( "var2", &var2 );
reader2->AddVariable( "var3", &var3 );
reader2->AddVariable( "var4", &var4 );
// load the weight files for the readers
TString method = "BDT method"; // tag under which each reader books and later evaluates its BDTG
reader0->BookMVA( method, "datasetBkg0/weights/TMVAMultiBkg0_BDTG.weights.xml" );
reader1->BookMVA( method, "datasetBkg1/weights/TMVAMultiBkg1_BDTG.weights.xml" );
reader2->BookMVA( method, "datasetBkg2/weights/TMVAMultiBkg2_BDTG.weights.xml" );
// load the input file
TFile *input(0);
TString fname = "./tmva_example_multiple_background.root";
input = TFile::Open( fname );
TTree* theTree = NULL;
// loop through signal and all background trees
for( int treeNumber = 0; treeNumber < 4; ++treeNumber ) {
if( treeNumber == 0 ){
theTree = (TTree*)input->Get("TreeS");
std::cout << "--- Select signal sample" << std::endl;
// theTree->SetBranchAddress( "weight", &weight );
weight = 1;
classID = 0;
}else if( treeNumber == 1 ){
theTree = (TTree*)input->Get("TreeB0");
std::cout << "--- Select background 0 sample" << std::endl;
// theTree->SetBranchAddress( "weight", &weight );
weight = 1;
classID = 1;
}else if( treeNumber == 2 ){
theTree = (TTree*)input->Get("TreeB1");
std::cout << "--- Select background 1 sample" << std::endl;
// theTree->SetBranchAddress( "weight", &weight );
weight = 1;
classID = 2;
}else if( treeNumber == 3 ){
theTree = (TTree*)input->Get("TreeB2");
std::cout << "--- Select background 2 sample" << std::endl;
// theTree->SetBranchAddress( "weight", &weight );
weight = 1;
classID = 3;
}
theTree->SetBranchAddress( "var1", &var1 );
theTree->SetBranchAddress( "var2", &var2 );
theTree->SetBranchAddress( "var3", &var3 );
theTree->SetBranchAddress( "var4", &var4 );
std::cout << "--- Processing: " << theTree->GetEntries() << " events" << std::endl;
TStopwatch sw; // times the event loop for this tree
sw.Start();
Long64_t nEvent = theTree->GetEntries();
// Int_t nEvent = 100;
for (Long64_t ievt=0; ievt<nEvent; ievt++) {
if (ievt%1000 == 0){
std::cout << "--- ... Processing event: " << ievt << std::endl;
}
theTree->GetEntry(ievt);
// get the classifiers for each of the signal/background classifications
classifier0 = reader0->EvaluateMVA( method );
classifier1 = reader1->EvaluateMVA( method );
classifier2 = reader2->EvaluateMVA( method );
outputTree->Fill();
}
// get elapsed time
sw.Stop();
std::cout << "--- End of event loop: "; sw.Print();
}
input->Close();
// write output tree
/* outputTree->SetDirectory(outputFile);
outputTree->Write(); */
outputFile->Write();
outputFile->Close();
std::cout << "--- Created root file: \"" << outfileName.Data() << "\" containing the combined tree with the MVA outputs" << std::endl;
delete reader0;
delete reader1;
delete reader2;
std::cout << "==> Application of readers is done! combined tree created" << std::endl << std::endl;
}
// -----------------------------------------------------------------------------------------
// Genetic Algorithm Fitness definition
// -----------------------------------------------------------------------------------------
//
class MyFitness : public IFitterTarget {
public:
// constructor
MyFitness( TChain* _chain ) : IFitterTarget() {
chain = _chain;
hSignal = new TH1F("hsignal","hsignal",100,-1,1);
hFP = new TH1F("hfp","hfp",100,-1,1);
hTP = new TH1F("htp","htp",100,-1,1);
TString cutsAndWeightSignal = "weight*(classID==0)";
nSignal = chain->Draw("Entry$/Entries$>>hsignal",cutsAndWeightSignal,"goff");
weightsSignal = hSignal->Integral();
}
// the output of this function will be minimized
Double_t EstimatorFunction( std::vector<Double_t> & factors ){
TString cutsAndWeightTruePositive = Form("weight*((classID==0) && cls0>%f && cls1>%f && cls2>%f )",factors.at(0), factors.at(1), factors.at(2));
TString cutsAndWeightFalsePositive = Form("weight*((classID >0) && cls0>%f && cls1>%f && cls2>%f )",factors.at(0), factors.at(1), factors.at(2));
// Entry$/Entries$ just draws something reasonable. It could in principle be anything
Float_t nTP = chain->Draw("Entry$/Entries$>>htp",cutsAndWeightTruePositive,"goff");
Float_t nFP = chain->Draw("Entry$/Entries$>>hfp",cutsAndWeightFalsePositive,"goff");
weightsTruePositive = hTP->Integral();
weightsFalsePositive = hFP->Integral();
efficiency = 0;
if( weightsSignal > 0 )
efficiency = weightsTruePositive/weightsSignal;
purity = 0;
if( weightsTruePositive+weightsFalsePositive > 0 )
purity = weightsTruePositive/(weightsTruePositive+weightsFalsePositive);
Float_t effTimesPur = efficiency*purity;
Float_t toMinimize = std::numeric_limits<float>::max(); // start from the largest representable float
if( effTimesPur > 0 ) // if larger than 0, take 1/x. This is the value to minimize
toMinimize = 1./(effTimesPur); // we want to minimize 1/(efficiency*purity)
// Print();
return toMinimize;
}
void Print(){
std::cout << std::endl;
std::cout << "======================" << std::endl
<< "Efficiency : " << efficiency << std::endl
<< "Purity : " << purity << std::endl << std::endl
<< "True positive weights : " << weightsTruePositive << std::endl
<< "False positive weights: " << weightsFalsePositive << std::endl
<< "Signal weights : " << weightsSignal << std::endl;
}
Float_t nSignal;
Float_t efficiency;
Float_t purity;
Float_t weightsTruePositive;
Float_t weightsFalsePositive;
Float_t weightsSignal;
private:
TChain* chain;
TH1F* hSignal;
TH1F* hFP;
TH1F* hTP;
};
// ----------------------------------------------------------------------------------------------
// Call of Genetic algorithm
// ----------------------------------------------------------------------------------------------
//
void MaximizeSignificance(){
// define all the parameters by their minimum and maximum value
// in this example 3 parameters (=cuts on the classifiers) are defined.
vector<Interval*> ranges;
ranges.push_back( new Interval(-1,1) ); // for some classifiers (especially LD) the ranges have to be taken larger
ranges.push_back( new Interval(-1,1) );
ranges.push_back( new Interval(-1,1) );
std::cout << "Classifier ranges (defined by the user)" << std::endl;
for( std::vector<Interval*>::iterator it = ranges.begin(); it != ranges.end(); it++ ){
std::cout << " range: " << (*it)->GetMin() << " " << (*it)->GetMax() << std::endl;
}
TChain* chain = new TChain("multiBkg");
chain->Add("tmva_example_multiple_backgrounds__applied.root");
IFitterTarget* myFitness = new MyFitness( chain );
// prepare the genetic algorithm with a population size of 100 and 30 steps (see opts below)
// mind: big population sizes will help in searching the domain space of the solution,
// but you have to weigh this against the number of generations
// the extreme case of 1 generation and population size n is equal to
// a Monte Carlo calculation with n tries
const TString name( "multipleBackgroundGA" );
const TString opts( "PopSize=100:Steps=30" );
GeneticFitter mg( *myFitness, name, ranges, opts);
// mg.SetParameters( 4, 30, 200, 10,5, 0.95, 0.001 );
std::vector<Double_t> result;
Double_t estimator = mg.Run(result);
dynamic_cast<MyFitness*>(myFitness)->Print();
std::cout << std::endl;
int n = 0;
for( std::vector<Double_t>::iterator it = result.begin(); it<result.end(); it++ ){
std::cout << " cutValue[" << n << "] = " << (*it) << ";"<< std::endl;
n++;
}
}
void TMVAMultipleBackgroundExample()
{
// ----------------------------------------------------------------------------------------
// Run all
// ----------------------------------------------------------------------------------------
cout << "Start Test TMVAGAexample" << endl
<< "========================" << endl
<< endl;
TString createDataMacro = gROOT->GetTutorialDir() + "/tmva/createData.C";
gROOT->ProcessLine(TString::Format(".L %s",createDataMacro.Data()));
gROOT->ProcessLine("create_MultipleBackground(200)");
cout << endl;
cout << "========================" << endl;
cout << "--- Training" << endl;
Training();
cout << endl;
cout << "========================" << endl;
cout << "--- Application & create combined tree" << endl;
ApplicationCreateCombinedTree();
cout << endl;
cout << "========================" << endl;
cout << "--- maximize significance" << endl;
MaximizeSignificance();
}
int main( int argc, char** argv ) {
TMVAMultipleBackgroundExample();
}
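The figure of merit that MyFitness::EstimatorFunction minimizes can be reproduced without ROOT. The sketch below is a minimal, hypothetical re-implementation over an in-memory event list: the Event struct and the function name inverseEffTimesPurity are illustrative assumptions and are not part of the tutorial or of TMVA. It applies the same selection as the cut strings in the listing (classID 0 counts as signal, every other class as background) and returns 1/(efficiency*purity). Unlike the tutorial, it uses double precision throughout.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical in-memory stand-in for one entry of the "multiBkg" tree.
struct Event {
   int    classID;  // 0 = signal, >0 = one of the backgrounds
   double weight;   // event weight
   double cls[3];   // outputs of the three signal-vs-background classifiers
};

// Returns 1/(efficiency*purity) for the selection cls[i] > cuts[i] for all i.
// Mirrors the logic of MyFitness::EstimatorFunction; it is not the tutorial's code.
double inverseEffTimesPurity(const std::vector<Event>& events,
                             const std::vector<double>& cuts)
{
   assert(cuts.size() == 3); // one cut per classifier
   double wSignal = 0.0, wTruePos = 0.0, wFalsePos = 0.0;
   for (const Event& e : events) {
      const bool isSignal = (e.classID == 0);
      if (isSignal) wSignal += e.weight; // all signal, before any cut
      const bool passes = e.cls[0] > cuts[0] && e.cls[1] > cuts[1] && e.cls[2] > cuts[2];
      if (!passes) continue;
      if (isSignal) wTruePos += e.weight;
      else          wFalsePos += e.weight;
   }
   const double efficiency = wSignal > 0.0 ? wTruePos / wSignal : 0.0;
   const double selected   = wTruePos + wFalsePos;
   const double purity     = selected > 0.0 ? wTruePos / selected : 0.0;
   const double effTimesPur = efficiency * purity;
   // No selected signal: return the largest value so that a minimizer avoids this point.
   return effTimesPur > 0.0 ? 1.0 / effTimesPur : std::numeric_limits<double>::max();
}
```

A call such as inverseEffTimesPurity(events, {0.5, 0.5, 0.5}) evaluates one candidate set of cuts; a fitter would call it repeatedly with different cut values.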
Author
Andreas Hoecker

Definition in file TMVAMultipleBackgroundExample.C.
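The comment in MaximizeSignificance notes that a genetic algorithm with a single generation and population size n is equivalent to a Monte Carlo calculation with n tries. A minimal sketch of that limiting case, independent of ROOT, is given below. The function name randomSearch, its parameters, and the fixed seed are hypothetical choices made for illustration; the sketch does not reflect how TMVA::GeneticFitter is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <random>
#include <utility>
#include <vector>

// Draws nTries uniform random points inside the given [min, max] ranges and
// returns the point with the smallest target value (a plain Monte Carlo scan).
// Hypothetical helper for illustration; not part of TMVA.
std::vector<double> randomSearch(
   const std::function<double(const std::vector<double>&)>& target,
   const std::vector<std::pair<double, double>>& ranges,
   std::size_t nTries, unsigned seed = 12345)
{
   assert(nTries > 0);
   std::mt19937 rng(seed); // fixed seed (an assumption) so that runs are repeatable
   std::vector<double> best(ranges.size());
   std::vector<double> trial(ranges.size());
   double bestValue = std::numeric_limits<double>::max();
   for (std::size_t t = 0; t < nTries; ++t) {
      // one random candidate: an independent uniform draw per parameter
      for (std::size_t i = 0; i < ranges.size(); ++i) {
         std::uniform_real_distribution<double> uniform(ranges[i].first, ranges[i].second);
         trial[i] = uniform(rng);
      }
      const double value = target(trial);
      // always accept the first try so that 'best' is a sampled point
      if (t == 0 || value < bestValue) {
         bestValue = value;
         best = trial;
      }
   }
   return best;
}
```

With three ranges of (-1, 1), as in the listing, the target would be a function of the three classifier cuts. Additional generations in a genetic algorithm refine such an initial random population instead of only sampling it once.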