The input data is a toy-MC sample. In the run logged below, it provides two input variables (var1, var2), which are essentially uncorrelated, and one regression target (fvalue).
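The methods to be used can be switched on and off by means of booleans, or by passing a method list to the macro on the command line.
When the macro is invoked from a shell in this way, the backslashes that escape the parentheses and quotes are mandatory. If no method is given, a default set is used.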
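The on/off switching described above can be sketched as follows. This is a minimal illustration in Python, not the macro's actual code: the function name select_methods and the DEFAULT_METHODS table are hypothetical, the list is assumed to be comma-separated, and the default set is simply taken to be the five methods booked in the log below.

```python
# Hypothetical sketch of the boolean method switches; not TMVA API.
# Assumption: the default set is the five methods booked in this run's log.
DEFAULT_METHODS = {
    "PDEFoam": True,
    "KNN": True,
    "LD": True,
    "DNN_CPU": True,
    "BDTG": True,
}


def select_methods(method_list=""):
    """Map each known method name to True (use it) or False (skip it).

    An empty list keeps the default set. Otherwise every method is
    switched off first, and only the listed ones are switched on.
    """
    if not method_list.strip():
        return dict(DEFAULT_METHODS)

    use = {name: False for name in DEFAULT_METHODS}
    for name in method_list.split(","):
        name = name.strip()
        if name not in use:
            raise ValueError(f"unknown method: {name!r}")
        use[name] = True
    return use


if __name__ == "__main__":
    selected = select_methods("LD,KNN")
    print([name for name, enabled in selected.items() if enabled])
```

Rejecting unknown names early, rather than silently ignoring them, is a design choice of this sketch: a typo in the list is caught before any training time is spent.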
The output file "TMVAReg.root" can be analysed with dedicated macros (simply run: root -l <macro.C>), which can be conveniently invoked through a GUI that appears at the end of this macro's run.
==> Start TMVARegression
--- TMVARegression : Using input file: ./files/tmva_reg_example.root
DataSetInfo : [dataset] : Added class "Regression"
: Add Tree TreeR of type Regression with 10000 events
: Dataset[dataset] : Class index : 0 name : Regression
Factory : Booking method: PDEFoam
:
DataSetFactory : [dataset] : Number of events in input trees
:
: Number of training and testing events
: ---------------------------------------------------------------------------
: Regression -- training events : 1000
: Regression -- testing events : 9000
: Regression -- training and testing events: 10000
:
DataSetInfo : Correlation matrix (Regression):
: ------------------------
: var1 var2
: var1: +1.000 +0.006
: var2: +0.006 +1.000
: ------------------------
DataSetFactory : [dataset] :
:
Factory : Booking method: KNN
:
Factory : Booking method: LD
:
Factory : Booking method: DNN_CPU
:
: Parsing option string:
: ... "!H:V:ErrorStrategy=SUMOFSQUARES:VarTransform=G:WeightInitialization=XAVIERUNIFORM:Architecture=CPU:Layout=TANH|50,Layout=TANH|50,Layout=TANH|50,LINEAR:TrainingStrategy=LearningRate=1e-2,Momentum=0.5,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=10,WeightDecay=0.01,Regularization=NONE,DropConfig=0.2+0.2+0.2+0.,DropRepetitions=2|LearningRate=1e-3,Momentum=0.9,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=L2,DropConfig=0.1+0.1+0.1,DropRepetitions=1|LearningRate=1e-4,Momentum=0.3,Repetitions=1,ConvergenceSteps=10,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=NONE"
: The following options are set:
: - By User:
: <none>
: - Default:
: Boost_num: "0" [Number of times the classifier will be boosted]
: Parsing option string:
: ... "!H:V:ErrorStrategy=SUMOFSQUARES:VarTransform=G:WeightInitialization=XAVIERUNIFORM:Architecture=CPU:Layout=TANH|50,Layout=TANH|50,Layout=TANH|50,LINEAR:TrainingStrategy=LearningRate=1e-2,Momentum=0.5,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=10,WeightDecay=0.01,Regularization=NONE,DropConfig=0.2+0.2+0.2+0.,DropRepetitions=2|LearningRate=1e-3,Momentum=0.9,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=L2,DropConfig=0.1+0.1+0.1,DropRepetitions=1|LearningRate=1e-4,Momentum=0.3,Repetitions=1,ConvergenceSteps=10,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=NONE"
: The following options are set:
: - By User:
: V: "True" [Verbose output (short form of "VerbosityLevel" below - overrides the latter one)]
: VarTransform: "G" [List of variable transformations performed before training, e.g., "D_Background,P_Signal,G,N_AllClasses" for: "Decorrelation, PCA-transformation, Gaussianisation, Normalisation, each for the given class of events ('AllClasses' denotes all events of all classes, if no class indication is given, 'All' is assumed)"]
: H: "False" [Print method-specific help message]
: Layout: "TANH|50,Layout=TANH|50,Layout=TANH|50,LINEAR" [Layout of the network.]
: ErrorStrategy: "SUMOFSQUARES" [Loss function: Mean squared error (regression) or cross entropy (binary classification).]
: WeightInitialization: "XAVIERUNIFORM" [Weight initialization strategy]
: Architecture: "CPU" [Which architecture to perform the training on.]
: TrainingStrategy: "LearningRate=1e-2,Momentum=0.5,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=10,WeightDecay=0.01,Regularization=NONE,DropConfig=0.2+0.2+0.2+0.,DropRepetitions=2|LearningRate=1e-3,Momentum=0.9,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=L2,DropConfig=0.1+0.1+0.1,DropRepetitions=1|LearningRate=1e-4,Momentum=0.3,Repetitions=1,ConvergenceSteps=10,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=NONE" [Defines the training strategies.]
: - Default:
: VerbosityLevel: "Default" [Verbosity level]
: CreateMVAPdfs: "False" [Create PDFs for classifier outputs (signal and background)]
: IgnoreNegWeightsInTraining: "False" [Events with negative weights are ignored in the training (but are included for testing and performance evaluation)]
: ValidationSize: "20%" [Part of the training data to use for validation. Specify as 0.2 or 20% to use a fifth of the data set as validation set. Specify as 100 to use exactly 100 events. (Default: 20%)]
DNN_CPU : [dataset] : Create Transformation "G" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Preparing the Gaussian transformation...
TFHandler_DNN_CPU : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.012586 1.0260 [ -3.3377 5.7307 ]
: var2: 0.0043504 1.0383 [ -4.5564 5.7307 ]
: fvalue: 165.93 84.643 [ 2.0973 391.01 ]
: -----------------------------------------------------------
Parsed Training DNN string LearningRate=1e-2,Momentum=0.5,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=10,WeightDecay=0.01,Regularization=NONE,DropConfig=0.2+0.2+0.2+0.,DropRepetitions=2|LearningRate=1e-3,Momentum=0.9,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=L2,DropConfig=0.1+0.1+0.1,DropRepetitions=1|LearningRate=1e-4,Momentum=0.3,Repetitions=1,ConvergenceSteps=10,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=NONE
STring has size 3
Factory : Booking method: BDTG
:
<WARNING> : Value for option maxdepth was previously set to 3
: the option NegWeightTreatment=InverseBoostNegWeights does not exist for BoostType=Grad
: --> change to new default NegWeightTreatment=Pray
Factory : Train all methods
Factory : [dataset] : Create Transformation "I" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 3.3759 1.1674 [ 0.0058046 4.9975 ]
: var2: 2.4823 1.4587 [ 0.0032142 4.9971 ]
: fvalue: 165.93 84.643 [ 2.0973 391.01 ]
: -----------------------------------------------------------
: Ranking input variables (method unspecific)...
IdTransformation : Ranking result (top variable is best ranked)
: --------------------------------------------
: Rank : Variable : |Correlation with target|
: --------------------------------------------
: 1 : var2 : 7.636e-01
: 2 : var1 : 5.936e-01
: --------------------------------------------
IdTransformation : Ranking result (top variable is best ranked)
: -------------------------------------
: Rank : Variable : Mutual information
: -------------------------------------
: 1 : var2 : 2.315e+00
: 2 : var1 : 1.882e+00
: -------------------------------------
IdTransformation : Ranking result (top variable is best ranked)
: ------------------------------------
: Rank : Variable : Correlation Ratio
: ------------------------------------
: 1 : var1 : 6.545e+00
: 2 : var2 : 2.414e+00
: ------------------------------------
IdTransformation : Ranking result (top variable is best ranked)
: ----------------------------------------
: Rank : Variable : Correlation Ratio (T)
: ----------------------------------------
: 1 : var2 : 8.189e-01
: 2 : var1 : 3.128e-01
: ----------------------------------------
Factory : Train method: PDEFoam for Regression
:
: Build mono target regression foam
: Elapsed time: 0.611 sec
: Elapsed time for training with 1000 events: 0.618 sec
: Dataset[dataset] : Create results for training
: Dataset[dataset] : Evaluation of PDEFoam on training sample
: Dataset[dataset] : Elapsed time for evaluation of 1000 events: 0.0092 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
: Creating xml weight file: dataset/weights/TMVARegression_PDEFoam.weights.xml
: writing foam MonoTargetRegressionFoam to file
: Foams written to file: dataset/weights/TMVARegression_PDEFoam.weights_foams.root
Factory : Training finished
:
Factory : Train method: KNN for Regression
:
KNN : <Train> start...
: Reading 1000 events
: Number of signal events 1000
: Number of background events 0
: Creating kd-tree with 1000 events
: Computing scale factor for 1d distributions: (ifrac, bottom, top) = (80%, 10%, 90%)
ModulekNN : Optimizing tree for 2 variables with 1000 values
: <Fill> Class 1 has 1000 events
: Elapsed time for training with 1000 events: 0.00149 sec
: Dataset[dataset] : Create results for training
: Dataset[dataset] : Evaluation of KNN on training sample
: Dataset[dataset] : Elapsed time for evaluation of 1000 events: 0.0116 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
: Creating xml weight file: dataset/weights/TMVARegression_KNN.weights.xml
Factory : Training finished
:
Factory : Train method: LD for Regression
:
LD : Results for LD coefficients:
: -----------------------
: Variable: Coefficient:
: -----------------------
: var1: +42.509
: var2: +44.738
: (offset): -88.627
: -----------------------
: Elapsed time for training with 1000 events: 0.000398 sec
: Dataset[dataset] : Create results for training
: Dataset[dataset] : Evaluation of LD on training sample
: Dataset[dataset] : Elapsed time for evaluation of 1000 events: 0.00289 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
: Creating xml weight file: dataset/weights/TMVARegression_LD.weights.xml
Factory : Training finished
:
Factory : Train method: DNN_CPU for Regression
:
TFHandler_DNN_CPU : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.012586 1.0260 [ -3.3377 5.7307 ]
: var2: 0.0043504 1.0383 [ -4.5564 5.7307 ]
: fvalue: 165.93 84.643 [ 2.0973 391.01 ]
: -----------------------------------------------------------
: Start of neural network training on CPU.
:
: Training phase 1 of 3:
: Epoch | Train Err. Test Err. GFLOP/s Conv. Steps
: --------------------------------------------------------------
: 10 | 1858.9 1474.22 1.72701 0
: 20 | 1926.46 1405.63 1.76594 0
: 30 | 1673.29 1378.99 1.76961 0
: 40 | 1330.14 1270.79 1.68858 0
: 50 | 1465.01 1600.2 1.68894 10
: 60 | 3028.62 2368.09 1.6286 20
:
: Training phase 2 of 3:
: Epoch | Train Err. Test Err. GFLOP/s Conv. Steps
: --------------------------------------------------------------
: 5 | 26760.1 1780.6 1.83868 0
: 10 | 27595.6 2705.74 1.84688 5
: 15 | 24662.5 1062.78 1.85341 0
: 20 | 24412 1074.68 1.85325 5
: 25 | 23858 1235.48 1.85213 10
: 30 | 22821.5 1142.51 1.86105 15
: 35 | 22497.9 1310.45 1.8529 20
:
: Training phase 3 of 3:
: Epoch | Train Err. Test Err. GFLOP/s Conv. Steps
: --------------------------------------------------------------
: 5 | 1417.03 1308.75 1.9609 0
: 10 | 1392.86 1305.38 1.9586 0
: 15 | 1384.12 1292.85 1.95911 0
: 20 | 1377.57 1282.7 1.95964 0
: 25 | 1371.45 1276.53 1.96121 0
: 30 | 1365.44 1268.5 1.96087 0
: 35 | 1357.64 1262.33 1.95718 0
: 40 | 1351.91 1256.26 1.96182 0
: 45 | 1345.6 1251.28 1.96416 0
: 50 | 1339.93 1244.63 1.957 0
: 55 | 1334.54 1238.21 1.95992 0
: 60 | 1329.42 1231.37 1.96037 0
: 65 | 1324.35 1226.22 1.96485 0
: 70 | 1319.42 1221.63 1.9556 0
: 75 | 1314.65 1215.19 1.93574 0
: 80 | 1309.92 1210.23 1.94545 0
: 85 | 1305.32 1204.7 1.95174 0
: 90 | 1300.87 1198.81 1.96039 0
: 95 | 1296.31 1195.56 1.95658 0
: 100 | 1291.97 1189.86 1.95759 0
: 105 | 1287.66 1185.39 1.95285 0
: 110 | 1283.44 1180.65 1.95753 0
: 115 | 1279.28 1175.8 1.96025 0
: 120 | 1275.21 1171.34 1.95348 0
: 125 | 1271.25 1165.81 1.95345 0
: 130 | 1267.4 1160.38 1.94094 0
: 135 | 1273.85 1159.83 1.95935 5
: 140 | 1270.61 1155.56 1.95675 0
: 145 | 1266.63 1151.79 1.96056 0
: 150 | 1262.77 1145.71 1.95193 0
: 155 | 1258.28 1144.41 1.95989 0
: 160 | 1254.1 1141.3 1.96107 0
: 165 | 1238.48 1136.84 1.95848 0
: 170 | 1233.01 1132.01 1.95614 0
: 175 | 1226.94 1121.01 1.95869 0
: 180 | 1218.65 1114.16 1.9567 0
: 185 | 1213.1 1117.05 1.941 5
: 190 | 1209.13 1116.57 1.93849 10
:
: Elapsed time for training with 1000 events: 3.99 sec
: Dataset[dataset] : Create results for training
: Dataset[dataset] : Evaluation of DNN_CPU on training sample
: Dataset[dataset] : Elapsed time for evaluation of 1000 events: 0.0198 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
: Creating xml weight file: dataset/weights/TMVARegression_DNN_CPU.weights.xml
Factory : Training finished
:
Factory : Train method: BDTG for Regression
:
: Regression Loss Function: Huber
: Training 2000 Decision Trees ... patience please
: Elapsed time for training with 1000 events: 1.56 sec
: Dataset[dataset] : Create results for training
: Dataset[dataset] : Evaluation of BDTG on training sample
: Dataset[dataset] : Elapsed time for evaluation of 1000 events: 0.314 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
: Creating xml weight file: dataset/weights/TMVARegression_BDTG.weights.xml
: TMVAReg.root:/dataset/Method_BDT/BDTG
Factory : Training finished
:
Factory : === Destroy and recreate all methods via weight files for testing ===
:
: Reading weight file: dataset/weights/TMVARegression_PDEFoam.weights.xml
: Read foams from file: dataset/weights/TMVARegression_PDEFoam.weights_foams.root
: Reading weight file: dataset/weights/TMVARegression_KNN.weights.xml
: Creating kd-tree with 1000 events
: Computing scale factor for 1d distributions: (ifrac, bottom, top) = (80%, 10%, 90%)
ModulekNN : Optimizing tree for 2 variables with 1000 values
: <Fill> Class 1 has 1000 events
: Reading weight file: dataset/weights/TMVARegression_LD.weights.xml
: Reading weight file: dataset/weights/TMVARegression_DNN_CPU.weights.xml
: Reading weight file: dataset/weights/TMVARegression_BDTG.weights.xml
Factory : Test all methods
Factory : Test method: PDEFoam for Regression performance
:
: Dataset[dataset] : Create results for testing
: Dataset[dataset] : Evaluation of PDEFoam on testing sample
: Dataset[dataset] : Elapsed time for evaluation of 9000 events: 0.0611 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
Factory : Test method: KNN for Regression performance
:
: Dataset[dataset] : Create results for testing
: Dataset[dataset] : Evaluation of KNN on testing sample
: Dataset[dataset] : Elapsed time for evaluation of 9000 events: 0.068 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
Factory : Test method: LD for Regression performance
:
: Dataset[dataset] : Create results for testing
: Dataset[dataset] : Evaluation of LD on testing sample
: Dataset[dataset] : Elapsed time for evaluation of 9000 events: 0.00347 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
Factory : Test method: DNN_CPU for Regression performance
:
: Dataset[dataset] : Create results for testing
: Dataset[dataset] : Evaluation of DNN_CPU on testing sample
: Dataset[dataset] : Elapsed time for evaluation of 9000 events: 0.174 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
Factory : Test method: BDTG for Regression performance
:
: Dataset[dataset] : Create results for testing
: Dataset[dataset] : Evaluation of BDTG on testing sample
: Dataset[dataset] : Elapsed time for evaluation of 9000 events: 1.93 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
Factory : Evaluate all methods
: Evaluate regression method: PDEFoam
: TestRegression (testing)
: Calculate regression for all events
: Elapsed time for evaluation of 9000 events: 0.0418 sec
: TestRegression (training)
: Calculate regression for all events
: Elapsed time for evaluation of 1000 events: 0.00497 sec
TFHandler_PDEFoam : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 3.3352 1.1893 [ 0.00020069 5.0000 ]
: var2: 2.4860 1.4342 [ 0.00071490 5.0000 ]
: fvalue: 163.91 83.651 [ 1.6186 394.84 ]
: -----------------------------------------------------------
: Evaluate regression method: KNN
: TestRegression (testing)
: Calculate regression for all events
: Elapsed time for evaluation of 9000 events: 0.0664 sec
: TestRegression (training)
: Calculate regression for all events
: Elapsed time for evaluation of 1000 events: 0.0075 sec
TFHandler_KNN : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 3.3352 1.1893 [ 0.00020069 5.0000 ]
: var2: 2.4860 1.4342 [ 0.00071490 5.0000 ]
: fvalue: 163.91 83.651 [ 1.6186 394.84 ]
: -----------------------------------------------------------
: Evaluate regression method: LD
: TestRegression (testing)
: Calculate regression for all events
: Elapsed time for evaluation of 9000 events: 0.00436 sec
: TestRegression (training)
: Calculate regression for all events
: Elapsed time for evaluation of 1000 events: 0.000518 sec
TFHandler_LD : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 3.3352 1.1893 [ 0.00020069 5.0000 ]
: var2: 2.4860 1.4342 [ 0.00071490 5.0000 ]
: fvalue: 163.91 83.651 [ 1.6186 394.84 ]
: -----------------------------------------------------------
: Evaluate regression method: DNN_CPU
: TestRegression (testing)
: Calculate regression for all events
: Elapsed time for evaluation of 9000 events: 0.166 sec
: TestRegression (training)
: Calculate regression for all events
: Elapsed time for evaluation of 1000 events: 0.0183 sec
TFHandler_DNN_CPU : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: -0.027278 1.0264 [ -3.3694 5.7307 ]
: var2: 0.0056047 0.98632 [ -5.7307 5.7307 ]
: fvalue: 163.91 83.651 [ 1.6186 394.84 ]
: -----------------------------------------------------------
TFHandler_DNN_CPU : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: -0.027278 1.0264 [ -3.3694 5.7307 ]
: var2: 0.0056047 0.98632 [ -5.7307 5.7307 ]
: fvalue: 163.91 83.651 [ 1.6186 394.84 ]
: -----------------------------------------------------------
: Evaluate regression method: BDTG
: TestRegression (testing)
: Calculate regression for all events
: Elapsed time for evaluation of 9000 events: 1.95 sec
: TestRegression (training)
: Calculate regression for all events
: Elapsed time for evaluation of 1000 events: 0.215 sec
TFHandler_BDTG : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 3.3352 1.1893 [ 0.00020069 5.0000 ]
: var2: 2.4860 1.4342 [ 0.00071490 5.0000 ]
: fvalue: 163.91 83.651 [ 1.6186 394.84 ]
: -----------------------------------------------------------
:
: Evaluation results ranked by smallest RMS on test sample:
: ("Bias" quotes the mean deviation of the regression from true target.
: "MutInf" is the "Mutual Information" between regression and target.
: Indicated by "_T" are the corresponding "truncated" quantities ob-
: tained when removing events deviating more than 2sigma from average.)
: --------------------------------------------------------------------------------------------------
: DataSet Name: MVA Method: <Bias> <Bias_T> RMS RMS_T | MutInf MutInf_T
: --------------------------------------------------------------------------------------------------
: dataset BDTG : 0.0707 0.102 2.45 1.95 | 3.100 3.175
: dataset KNN : -0.237 0.578 5.17 3.44 | 2.898 2.939
: dataset PDEFoam : 0.106 -0.0677 9.22 7.74 | 2.283 2.375
: dataset LD : 0.461 2.22 19.6 17.6 | 1.985 1.979
: dataset DNN_CPU : 1.45 3.52 34.5 28.9 | 1.229 1.270
: --------------------------------------------------------------------------------------------------
:
: Evaluation results ranked by smallest RMS on training sample:
: (overtraining check)
: --------------------------------------------------------------------------------------------------
: DataSet Name: MVA Method: <Bias> <Bias_T> RMS RMS_T | MutInf MutInf_T
: --------------------------------------------------------------------------------------------------
: dataset BDTG : 0.0597 0.0107 0.566 0.293 | 3.441 3.466
: dataset KNN : -0.425 0.423 5.19 3.54 | 3.006 3.034
: dataset PDEFoam : 8.35e-07 0.106 8.04 6.57 | 2.488 2.579
: dataset LD :-1.03e-06 1.54 20.1 18.5 | 2.134 2.153
: dataset DNN_CPU : 0.658 3.07 34.5 29.0 | 1.337 1.367
: --------------------------------------------------------------------------------------------------
:
Dataset:dataset : Created tree 'TestTree' with 9000 events
:
Dataset:dataset : Created tree 'TrainTree' with 1000 events
:
Factory : Thank you for using TMVA!
: For citation information, please visit: http://tmva.sf.net/citeTMVA.html
==> Wrote root file: TMVAReg.root
==> TMVARegression is done!
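The legend of the evaluation tables above defines "Bias" as the mean deviation of the regression from the true target, and the "_T" quantities as the same values after removing events that deviate by more than 2 sigma from the average. The sketch below shows one plausible way to compute these four numbers. It is an illustration under stated assumptions, not TMVA's implementation: the function names are hypothetical, the RMS is taken here as the spread of the deviations about their mean, events are unweighted, and the sample values are made up. The mutual information columns are not covered.

```python
import math


def bias_and_rms(deviations):
    """Return (mean deviation, spread of the deviations about that mean)."""
    n = len(deviations)
    bias = sum(deviations) / n
    mean_sq = sum(d * d for d in deviations) / n
    # Guard against tiny negative values caused by floating-point rounding.
    rms = math.sqrt(max(mean_sq - bias * bias, 0.0))
    return bias, rms


def regression_summary(predictions, targets, n_sigma=2.0):
    """Compute bias and RMS, plus their truncated counterparts.

    Assumption: the truncated quantities are recomputed after dropping
    events whose deviation lies more than n_sigma * rms from the bias.
    """
    deviations = [p - t for p, t in zip(predictions, targets)]
    bias, rms = bias_and_rms(deviations)

    kept = [d for d in deviations if abs(d - bias) <= n_sigma * rms]
    bias_t, rms_t = bias_and_rms(kept)

    return {"bias": bias, "rms": rms, "bias_T": bias_t, "rms_T": rms_t}


if __name__ == "__main__":
    # Made-up values for illustration only; the last event is an outlier.
    targets = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
    predictions = [10.5, 19.5, 30.5, 39.5, 50.5, 80.0]
    for name, value in regression_summary(predictions, targets).items():
        print(f"{name}: {value:.3f}")
```

In this toy input the single outlier dominates both the bias and the RMS, so the truncated values are much smaller. The same pattern appears in the tables above wherever RMS_T is clearly below RMS.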