As input data is used a toy-MC sample consisting of four Gaussian-distributed and linearly correlated input variables. The methods to be used can be switched on and off by means of booleans, or via the prompt command, for example:
(note that the backslashes are mandatory) If no method given, a default set of classifiers is used. The output file "TMVA.root" can be analysed with the use of dedicated macros (simply say: root -l <macro.C>), which can be conveniently invoked through a GUI that will appear at the end of the run of this macro. Launch the GUI via the command:
==> Start TMVAClassification
--- TMVAClassification : Using input file: ./files/tmva_class_example.root
DataSetInfo : [dataset] : Added class "Signal"
: Add Tree TreeS of type Signal with 6000 events
DataSetInfo : [dataset] : Added class "Background"
: Add Tree TreeB of type Background with 6000 events
Factory : Booking method: ␛[1mCuts␛[0m
:
: Use optimization method: "Monte Carlo"
: Use efficiency computation method: "Event Selection"
: Use "FSmart" cuts for variable: 'myvar1'
: Use "FSmart" cuts for variable: 'myvar2'
: Use "FSmart" cuts for variable: 'var3'
: Use "FSmart" cuts for variable: 'var4'
Factory : Booking method: ␛[1mCutsD␛[0m
:
CutsD : [dataset] : Create Transformation "Decorrelate" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'myvar1' <---> Output : variable 'myvar1'
: Input : variable 'myvar2' <---> Output : variable 'myvar2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
: Use optimization method: "Monte Carlo"
: Use efficiency computation method: "Event Selection"
: Use "FSmart" cuts for variable: 'myvar1'
: Use "FSmart" cuts for variable: 'myvar2'
: Use "FSmart" cuts for variable: 'var3'
: Use "FSmart" cuts for variable: 'var4'
Factory : Booking method: ␛[1mLikelihood␛[0m
:
Factory : Booking method: ␛[1mLikelihoodPCA␛[0m
:
LikelihoodPCA : [dataset] : Create Transformation "PCA" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'myvar1' <---> Output : variable 'myvar1'
: Input : variable 'myvar2' <---> Output : variable 'myvar2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
Factory : Booking method: ␛[1mPDERS␛[0m
:
Factory : Booking method: ␛[1mPDEFoam␛[0m
:
Factory : Booking method: ␛[1mKNN␛[0m
:
Factory : Booking method: ␛[1mLD␛[0m
:
DataSetFactory : [dataset] : Number of events in input trees
:
:
: Number of training and testing events
: ---------------------------------------------------------------------------
: Signal -- training events : 1000
: Signal -- testing events : 5000
: Signal -- training and testing events: 6000
: Background -- training events : 1000
: Background -- testing events : 5000
: Background -- training and testing events: 6000
:
DataSetInfo : Correlation matrix (Signal):
: ----------------------------------------------
: var1+var2 var1-var2 var3 var4
: var1+var2: +1.000 +0.038 +0.748 +0.922
: var1-var2: +0.038 +1.000 -0.058 +0.128
: var3: +0.748 -0.058 +1.000 +0.831
: var4: +0.922 +0.128 +0.831 +1.000
: ----------------------------------------------
DataSetInfo : Correlation matrix (Background):
: ----------------------------------------------
: var1+var2 var1-var2 var3 var4
: var1+var2: +1.000 -0.021 +0.783 +0.931
: var1-var2: -0.021 +1.000 -0.162 +0.057
: var3: +0.783 -0.162 +1.000 +0.841
: var4: +0.931 +0.057 +0.841 +1.000
: ----------------------------------------------
DataSetFactory : [dataset] :
:
Factory : Booking method: ␛[1mFDA_GA␛[0m
:
: Create parameter interval for parameter 0 : [-1,1]
: Create parameter interval for parameter 1 : [-10,10]
: Create parameter interval for parameter 2 : [-10,10]
: Create parameter interval for parameter 3 : [-10,10]
: Create parameter interval for parameter 4 : [-10,10]
: User-defined formula string : "(0)+(1)*x0+(2)*x1+(3)*x2+(4)*x3"
: TFormula-compatible formula string: "[0]+[1]*[5]+[2]*[6]+[3]*[7]+[4]*[8]"
Factory : Booking method: ␛[1mMLPBNN␛[0m
:
MLPBNN : [dataset] : Create Transformation "N" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'myvar1' <---> Output : variable 'myvar1'
: Input : variable 'myvar2' <---> Output : variable 'myvar2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
MLPBNN : Building Network.
: Initializing weights
Factory : Booking method: ␛[1mSVM␛[0m
:
SVM : [dataset] : Create Transformation "Norm" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'myvar1' <---> Output : variable 'myvar1'
: Input : variable 'myvar2' <---> Output : variable 'myvar2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
Factory : Booking method: ␛[1mBDT␛[0m
:
Factory : Booking method: ␛[1mRuleFit␛[0m
:
Factory : ␛[1mTrain all methods␛[0m
Factory : [dataset] : Create Transformation "I" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'myvar1' <---> Output : variable 'myvar1'
: Input : variable 'myvar2' <---> Output : variable 'myvar2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
Factory : [dataset] : Create Transformation "D" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'myvar1' <---> Output : variable 'myvar1'
: Input : variable 'myvar2' <---> Output : variable 'myvar2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
Factory : [dataset] : Create Transformation "P" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'myvar1' <---> Output : variable 'myvar1'
: Input : variable 'myvar2' <---> Output : variable 'myvar2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
Factory : [dataset] : Create Transformation "G" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'myvar1' <---> Output : variable 'myvar1'
: Input : variable 'myvar2' <---> Output : variable 'myvar2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
Factory : [dataset] : Create Transformation "D" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'myvar1' <---> Output : variable 'myvar1'
: Input : variable 'myvar2' <---> Output : variable 'myvar2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: -0.062775 1.7187 [ -9.3380 7.6931 ]
: myvar2: 0.056495 1.0784 [ -3.2551 4.0291 ]
: var3: -0.020366 1.0633 [ -5.2777 4.6430 ]
: var4: 0.13214 1.2464 [ -5.6007 4.6744 ]
: -----------------------------------------------------------
: Preparing the Decorrelation transformation...
TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: -0.17586 1.0000 [ -5.6401 4.8529 ]
: myvar2: 0.026952 1.0000 [ -2.9292 3.7065 ]
: var3: -0.11549 1.0000 [ -4.1792 3.5180 ]
: var4: 0.34819 1.0000 [ -3.3363 3.3963 ]
: -----------------------------------------------------------
: Preparing the Principle Component (PCA) transformation...
TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: -0.11433 2.2714 [ -11.272 9.0916 ]
: myvar2: -0.0070834 1.0934 [ -3.9875 3.3836 ]
: var3: 0.011107 0.57824 [ -2.0171 2.1958 ]
: var4: -0.0094450 0.33437 [ -1.0176 1.0617 ]
: -----------------------------------------------------------
: Preparing the Gaussian transformation...
: Preparing the Decorrelation transformation...
TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: -0.054412 1.0000 [ -3.0924 8.1350 ]
: myvar2: -0.0021417 1.0000 [ -4.5913 5.6461 ]
: var3: -0.0051998 1.0000 [ -3.1457 4.6043 ]
: var4: 0.074624 1.0000 [ -3.4587 5.9397 ]
: -----------------------------------------------------------
: Ranking input variables (method unspecific)...
IdTransformation : Ranking result (top variable is best ranked)
: -------------------------------------
: Rank : Variable : Separation
: -------------------------------------
: 1 : Variable 4 : 2.843e-01
: 2 : Variable 3 : 1.756e-01
: 3 : myvar1 : 1.018e-01
: 4 : Expression 2 : 3.860e-02
: -------------------------------------
Factory : Train method: Cuts for Classification
:
FitterBase : <MCFitter> Sampling, please be patient ...
: Elapsed time: 3.54 sec
: ------------------------------------------
Cuts : Cut values for requested signal efficiency: 0.1
: Corresponding background efficiency : 0.00621902
: Transformation applied to input variables : None
: ------------------------------------------
: Cut[ 0]: -1.19223 < myvar1 <= 1e+30
: Cut[ 1]: -1e+30 < myvar2 <= 2.126
: Cut[ 2]: -2.90978 < var3 <= 1e+30
: Cut[ 3]: 2.16207 < var4 <= 1e+30
: ------------------------------------------
: ------------------------------------------
Cuts : Cut values for requested signal efficiency: 0.2
: Corresponding background efficiency : 0.0171253
: Transformation applied to input variables : None
: ------------------------------------------
: Cut[ 0]: -5.85714 < myvar1 <= 1e+30
: Cut[ 1]: -1e+30 < myvar2 <= 2.21109
: Cut[ 2]: -0.759439 < var3 <= 1e+30
: Cut[ 3]: 1.66846 < var4 <= 1e+30
: ------------------------------------------
: ------------------------------------------
Cuts : Cut values for requested signal efficiency: 0.3
: Corresponding background efficiency : 0.0401486
: Transformation applied to input variables : None
: ------------------------------------------
: Cut[ 0]: -6.09813 < myvar1 <= 1e+30
: Cut[ 1]: -1e+30 < myvar2 <= 2.81831
: Cut[ 2]: -2.09336 < var3 <= 1e+30
: Cut[ 3]: 1.34308 < var4 <= 1e+30
: ------------------------------------------
: ------------------------------------------
Cuts : Cut values for requested signal efficiency: 0.4
: Corresponding background efficiency : 0.062887
: Transformation applied to input variables : None
: ------------------------------------------
: Cut[ 0]: -4.55141 < myvar1 <= 1e+30
: Cut[ 1]: -1e+30 < myvar2 <= 2.94573
: Cut[ 2]: -4.68697 < var3 <= 1e+30
: Cut[ 3]: 1.07157 < var4 <= 1e+30
: ------------------------------------------
: ------------------------------------------
Cuts : Cut values for requested signal efficiency: 0.5
: Corresponding background efficiency : 0.104486
: Transformation applied to input variables : None
: ------------------------------------------
: Cut[ 0]: -5.86032 < myvar1 <= 1e+30
: Cut[ 1]: -1e+30 < myvar2 <= 2.89615
: Cut[ 2]: -0.966191 < var3 <= 1e+30
: Cut[ 3]: 0.773848 < var4 <= 1e+30
: ------------------------------------------
: ------------------------------------------
Cuts : Cut values for requested signal efficiency: 0.6
: Corresponding background efficiency : 0.172806
: Transformation applied to input variables : None
: ------------------------------------------
: Cut[ 0]: -5.52552 < myvar1 <= 1e+30
: Cut[ 1]: -1e+30 < myvar2 <= 4.08498
: Cut[ 2]: -2.61706 < var3 <= 1e+30
: Cut[ 3]: 0.469684 < var4 <= 1e+30
: ------------------------------------------
: ------------------------------------------
Cuts : Cut values for requested signal efficiency: 0.7
: Corresponding background efficiency : 0.258379
: Transformation applied to input variables : None
: ------------------------------------------
: Cut[ 0]: -5.69875 < myvar1 <= 1e+30
: Cut[ 1]: -1e+30 < myvar2 <= 1.73784
: Cut[ 2]: -1.21467 < var3 <= 1e+30
: Cut[ 3]: 0.109026 < var4 <= 1e+30
: ------------------------------------------
: ------------------------------------------
Cuts : Cut values for requested signal efficiency: 0.8
: Corresponding background efficiency : 0.362964
: Transformation applied to input variables : None
: ------------------------------------------
: Cut[ 0]: -1.99372 < myvar1 <= 1e+30
: Cut[ 1]: -1e+30 < myvar2 <= 3.93767
: Cut[ 2]: -1.56317 < var3 <= 1e+30
: Cut[ 3]: -0.124013 < var4 <= 1e+30
: ------------------------------------------
: ------------------------------------------
Cuts : Cut values for requested signal efficiency: 0.9
: Corresponding background efficiency : 0.503885
: Transformation applied to input variables : None
: ------------------------------------------
: Cut[ 0]: -3.97304 < myvar1 <= 1e+30
: Cut[ 1]: -1e+30 < myvar2 <= 3.31284
: Cut[ 2]: -2.82879 < var3 <= 1e+30
: Cut[ 3]: -0.577302 < var4 <= 1e+30
: ------------------------------------------
: Elapsed time for training with 2000 events: 3.54 sec
Cuts : [dataset] : Evaluation of Cuts on training sample (2000 events)
: Elapsed time for evaluation of 2000 events: 0.000383 sec
: Creating xml weight file: ␛[0;36mdataset/weights/TMVAClassification_Cuts.weights.xml␛[0m
: Creating standalone class: ␛[0;36mdataset/weights/TMVAClassification_Cuts.class.C␛[0m
: TMVA.root:/dataset/Method_Cuts/Cuts
Factory : Training finished
:
Factory : Train method: CutsD for Classification
:
: Preparing the Decorrelation transformation...
TFHandler_CutsD : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: -0.17586 1.0000 [ -5.6401 4.8529 ]
: myvar2: 0.026952 1.0000 [ -2.9292 3.7065 ]
: var3: -0.11549 1.0000 [ -4.1792 3.5180 ]
: var4: 0.34819 1.0000 [ -3.3363 3.3963 ]
: -----------------------------------------------------------
TFHandler_CutsD : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: -0.17586 1.0000 [ -5.6401 4.8529 ]
: myvar2: 0.026952 1.0000 [ -2.9292 3.7065 ]
: var3: -0.11549 1.0000 [ -4.1792 3.5180 ]
: var4: 0.34819 1.0000 [ -3.3363 3.3963 ]
: -----------------------------------------------------------
FitterBase : <MCFitter> Sampling, please be patient ...
: Elapsed time: 2.81 sec
: ------------------------------------------------------------------------------------------------------------------------
CutsD : Cut values for requested signal efficiency: 0.1
: Corresponding background efficiency : 0
: Transformation applied to input variables : "Deco"
: ------------------------------------------------------------------------------------------------------------------------
: Cut[ 0]: -1e+30 < + 1.1476*[myvar1] + 0.027923*[myvar2] - 0.19981*[var3] - 0.82843*[var4] <= 0.513038
: Cut[ 1]: -1e+30 < + 0.027923*[myvar1] + 0.95469*[myvar2] + 0.18581*[var3] - 0.1623*[var4] <= -0.733858
: Cut[ 2]: -0.87113 < - 0.19981*[myvar1] + 0.18581*[myvar2] + 1.7913*[var3] - 0.77231*[var4] <= 1e+30
: Cut[ 3]: 0.687739 < - 0.82843*[myvar1] - 0.1623*[myvar2] - 0.77231*[var3] + 2.1918*[var4] <= 1e+30
: ------------------------------------------------------------------------------------------------------------------------
: ------------------------------------------------------------------------------------------------------------------------
CutsD : Cut values for requested signal efficiency: 0.2
: Corresponding background efficiency : 0.000493656
: Transformation applied to input variables : "Deco"
: ------------------------------------------------------------------------------------------------------------------------
: Cut[ 0]: -1e+30 < + 1.1476*[myvar1] + 0.027923*[myvar2] - 0.19981*[var3] - 0.82843*[var4] <= 1.60056
: Cut[ 1]: -1e+30 < + 0.027923*[myvar1] + 0.95469*[myvar2] + 0.18581*[var3] - 0.1623*[var4] <= 1.26936
: Cut[ 2]: -1.50073 < - 0.19981*[myvar1] + 0.18581*[myvar2] + 1.7913*[var3] - 0.77231*[var4] <= 1e+30
: Cut[ 3]: 1.54845 < - 0.82843*[myvar1] - 0.1623*[myvar2] - 0.77231*[var3] + 2.1918*[var4] <= 1e+30
: ------------------------------------------------------------------------------------------------------------------------
: ------------------------------------------------------------------------------------------------------------------------
CutsD : Cut values for requested signal efficiency: 0.3
: Corresponding background efficiency : 0.00334252
: Transformation applied to input variables : "Deco"
: ------------------------------------------------------------------------------------------------------------------------
: Cut[ 0]: -1e+30 < + 1.1476*[myvar1] + 0.027923*[myvar2] - 0.19981*[var3] - 0.82843*[var4] <= 2.16898
: Cut[ 1]: -1e+30 < + 0.027923*[myvar1] + 0.95469*[myvar2] + 0.18581*[var3] - 0.1623*[var4] <= 3.25932
: Cut[ 2]: -2.08503 < - 0.19981*[myvar1] + 0.18581*[myvar2] + 1.7913*[var3] - 0.77231*[var4] <= 1e+30
: Cut[ 3]: 1.43959 < - 0.82843*[myvar1] - 0.1623*[myvar2] - 0.77231*[var3] + 2.1918*[var4] <= 1e+30
: ------------------------------------------------------------------------------------------------------------------------
: ------------------------------------------------------------------------------------------------------------------------
CutsD : Cut values for requested signal efficiency: 0.4
: Corresponding background efficiency : 0.00821453
: Transformation applied to input variables : "Deco"
: ------------------------------------------------------------------------------------------------------------------------
: Cut[ 0]: -1e+30 < + 1.1476*[myvar1] + 0.027923*[myvar2] - 0.19981*[var3] - 0.82843*[var4] <= 1.9086
: Cut[ 1]: -1e+30 < + 0.027923*[myvar1] + 0.95469*[myvar2] + 0.18581*[var3] - 0.1623*[var4] <= 1.94778
: Cut[ 2]: -2.11471 < - 0.19981*[myvar1] + 0.18581*[myvar2] + 1.7913*[var3] - 0.77231*[var4] <= 1e+30
: Cut[ 3]: 1.1885 < - 0.82843*[myvar1] - 0.1623*[myvar2] - 0.77231*[var3] + 2.1918*[var4] <= 1e+30
: ------------------------------------------------------------------------------------------------------------------------
: ------------------------------------------------------------------------------------------------------------------------
CutsD : Cut values for requested signal efficiency: 0.5
: Corresponding background efficiency : 0.0209024
: Transformation applied to input variables : "Deco"
: ------------------------------------------------------------------------------------------------------------------------
: Cut[ 0]: -1e+30 < + 1.1476*[myvar1] + 0.027923*[myvar2] - 0.19981*[var3] - 0.82843*[var4] <= 3.97301
: Cut[ 1]: -1e+30 < + 0.027923*[myvar1] + 0.95469*[myvar2] + 0.18581*[var3] - 0.1623*[var4] <= 2.87835
: Cut[ 2]: -1.68889 < - 0.19981*[myvar1] + 0.18581*[myvar2] + 1.7913*[var3] - 0.77231*[var4] <= 1e+30
: Cut[ 3]: 0.969507 < - 0.82843*[myvar1] - 0.1623*[myvar2] - 0.77231*[var3] + 2.1918*[var4] <= 1e+30
: ------------------------------------------------------------------------------------------------------------------------
: ------------------------------------------------------------------------------------------------------------------------
CutsD : Cut values for requested signal efficiency: 0.6
: Corresponding background efficiency : 0.055037
: Transformation applied to input variables : "Deco"
: ------------------------------------------------------------------------------------------------------------------------
: Cut[ 0]: -1e+30 < + 1.1476*[myvar1] + 0.027923*[myvar2] - 0.19981*[var3] - 0.82843*[var4] <= 2.57624
: Cut[ 1]: -1e+30 < + 0.027923*[myvar1] + 0.95469*[myvar2] + 0.18581*[var3] - 0.1623*[var4] <= 2.20263
: Cut[ 2]: -3.86902 < - 0.19981*[myvar1] + 0.18581*[myvar2] + 1.7913*[var3] - 0.77231*[var4] <= 1e+30
: Cut[ 3]: 0.802122 < - 0.82843*[myvar1] - 0.1623*[myvar2] - 0.77231*[var3] + 2.1918*[var4] <= 1e+30
: ------------------------------------------------------------------------------------------------------------------------
: ------------------------------------------------------------------------------------------------------------------------
CutsD : Cut values for requested signal efficiency: 0.7
: Corresponding background efficiency : 0.0975699
: Transformation applied to input variables : "Deco"
: ------------------------------------------------------------------------------------------------------------------------
: Cut[ 0]: -1e+30 < + 1.1476*[myvar1] + 0.027923*[myvar2] - 0.19981*[var3] - 0.82843*[var4] <= 3.65719
: Cut[ 1]: -1e+30 < + 0.027923*[myvar1] + 0.95469*[myvar2] + 0.18581*[var3] - 0.1623*[var4] <= 3.19411
: Cut[ 2]: -2.87372 < - 0.19981*[myvar1] + 0.18581*[myvar2] + 1.7913*[var3] - 0.77231*[var4] <= 1e+30
: Cut[ 3]: 0.583961 < - 0.82843*[myvar1] - 0.1623*[myvar2] - 0.77231*[var3] + 2.1918*[var4] <= 1e+30
: ------------------------------------------------------------------------------------------------------------------------
: ------------------------------------------------------------------------------------------------------------------------
CutsD : Cut values for requested signal efficiency: 0.8
: Corresponding background efficiency : 0.170999
: Transformation applied to input variables : "Deco"
: ------------------------------------------------------------------------------------------------------------------------
: Cut[ 0]: -1e+30 < + 1.1476*[myvar1] + 0.027923*[myvar2] - 0.19981*[var3] - 0.82843*[var4] <= 4.74857
: Cut[ 1]: -1e+30 < + 0.027923*[myvar1] + 0.95469*[myvar2] + 0.18581*[var3] - 0.1623*[var4] <= 2.75269
: Cut[ 2]: -3.22043 < - 0.19981*[myvar1] + 0.18581*[myvar2] + 1.7913*[var3] - 0.77231*[var4] <= 1e+30
: Cut[ 3]: 0.327788 < - 0.82843*[myvar1] - 0.1623*[myvar2] - 0.77231*[var3] + 2.1918*[var4] <= 1e+30
: ------------------------------------------------------------------------------------------------------------------------
: ------------------------------------------------------------------------------------------------------------------------
CutsD : Cut values for requested signal efficiency: 0.9
: Corresponding background efficiency : 0.326977
: Transformation applied to input variables : "Deco"
: ------------------------------------------------------------------------------------------------------------------------
: Cut[ 0]: -1e+30 < + 1.1476*[myvar1] + 0.027923*[myvar2] - 0.19981*[var3] - 0.82843*[var4] <= 3.56614
: Cut[ 1]: -1e+30 < + 0.027923*[myvar1] + 0.95469*[myvar2] + 0.18581*[var3] - 0.1623*[var4] <= 3.09071
: Cut[ 2]: -3.9944 < - 0.19981*[myvar1] + 0.18581*[myvar2] + 1.7913*[var3] - 0.77231*[var4] <= 1e+30
: Cut[ 3]: 0.0311777 < - 0.82843*[myvar1] - 0.1623*[myvar2] - 0.77231*[var3] + 2.1918*[var4] <= 1e+30
: ------------------------------------------------------------------------------------------------------------------------
: Elapsed time for training with 2000 events: 2.81 sec
CutsD : [dataset] : Evaluation of CutsD on training sample (2000 events)
: Elapsed time for evaluation of 2000 events: 0.00172 sec
: Creating xml weight file: ␛[0;36mdataset/weights/TMVAClassification_CutsD.weights.xml␛[0m
: Creating standalone class: ␛[0;36mdataset/weights/TMVAClassification_CutsD.class.C␛[0m
: TMVA.root:/dataset/Method_CutsD/CutsD
Factory : Training finished
:
Factory : Train method: Likelihood for Classification
:
:
: ␛[1m================================================================␛[0m
: ␛[1mH e l p f o r M V A m e t h o d [ Likelihood ] :␛[0m
:
: ␛[1m--- Short description:␛[0m
:
: The maximum-likelihood classifier models the data with probability
: density functions (PDF) reproducing the signal and background
: distributions of the input variables. Correlations among the
: variables are ignored.
:
: ␛[1m--- Performance optimisation:␛[0m
:
: Required for good performance are decorrelated input variables
: (PCA transformation via the option "VarTransform=Decorrelate"
: may be tried). Irreducible non-linear correlations may be reduced
: by precombining strongly correlated input variables, or by simply
: removing one of the variables.
:
: ␛[1m--- Performance tuning via configuration options:␛[0m
:
: High fidelity PDF estimates are mandatory, i.e., sufficient training
: statistics is required to populate the tails of the distributions
: It would be a surprise if the default Spline or KDE kernel parameters
: provide a satisfying fit to the data. The user is advised to properly
: tune the events per bin and smooth options in the spline cases
: individually per variable. If the KDE kernel is used, the adaptive
: Gaussian kernel may lead to artefacts, so please always also try
: the non-adaptive one.
:
: All tuning parameters must be adjusted individually for each input
: variable!
:
: <Suppress this message by specifying "!H" in the booking option>
: ␛[1m================================================================␛[0m
:
: Filling reference histograms
: Building PDF out of reference histograms
: Elapsed time for training with 2000 events: 0.0138 sec
Likelihood : [dataset] : Evaluation of Likelihood on training sample (2000 events)
: Elapsed time for evaluation of 2000 events: 0.00202 sec
: Creating xml weight file: ␛[0;36mdataset/weights/TMVAClassification_Likelihood.weights.xml␛[0m
: Creating standalone class: ␛[0;36mdataset/weights/TMVAClassification_Likelihood.class.C␛[0m
: TMVA.root:/dataset/Method_Likelihood/Likelihood
Factory : Training finished
:
Factory : Train method: LikelihoodPCA for Classification
:
: Preparing the Principle Component (PCA) transformation...
TFHandler_LikelihoodPCA : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: -0.11433 2.2714 [ -11.272 9.0916 ]
: myvar2: -0.0070834 1.0934 [ -3.9875 3.3836 ]
: var3: 0.011107 0.57824 [ -2.0171 2.1958 ]
: var4: -0.0094450 0.33437 [ -1.0176 1.0617 ]
: -----------------------------------------------------------
: Filling reference histograms
: Building PDF out of reference histograms
: Elapsed time for training with 2000 events: 0.0163 sec
LikelihoodPCA : [dataset] : Evaluation of LikelihoodPCA on training sample (2000 events)
: Elapsed time for evaluation of 2000 events: 0.00403 sec
: Creating xml weight file: ␛[0;36mdataset/weights/TMVAClassification_LikelihoodPCA.weights.xml␛[0m
: Creating standalone class: ␛[0;36mdataset/weights/TMVAClassification_LikelihoodPCA.class.C␛[0m
: TMVA.root:/dataset/Method_LikelihoodPCA/LikelihoodPCA
Factory : Training finished
:
Factory : Train method: PDERS for Classification
:
: Elapsed time for training with 2000 events: 0.00378 sec
PDERS : [dataset] : Evaluation of PDERS on training sample (2000 events)
: Elapsed time for evaluation of 2000 events: 0.26 sec
: Creating xml weight file: ␛[0;36mdataset/weights/TMVAClassification_PDERS.weights.xml␛[0m
: Creating standalone class: ␛[0;36mdataset/weights/TMVAClassification_PDERS.class.C␛[0m
Factory : Training finished
:
Factory : Train method: PDEFoam for Classification
:
PDEFoam : NormMode=NUMEVENTS chosen. Note that only NormMode=EqualNumEvents ensures that Discriminant values correspond to signal probabilities.
: Build up discriminator foam
: Elapsed time: 0.336 sec
: Elapsed time for training with 2000 events: 0.364 sec
PDEFoam : [dataset] : Evaluation of PDEFoam on training sample (2000 events)
: Elapsed time for evaluation of 2000 events: 0.0141 sec
: Creating xml weight file: ␛[0;36mdataset/weights/TMVAClassification_PDEFoam.weights.xml␛[0m
: writing foam DiscrFoam to file
: Foams written to file: ␛[0;36mdataset/weights/TMVAClassification_PDEFoam.weights_foams.root␛[0m
: Creating standalone class: ␛[0;36mdataset/weights/TMVAClassification_PDEFoam.class.C␛[0m
Factory : Training finished
:
Factory : Train method: KNN for Classification
:
:
: ␛[1m================================================================␛[0m
: ␛[1mH e l p f o r M V A m e t h o d [ KNN ] :␛[0m
:
: ␛[1m--- Short description:␛[0m
:
: The k-nearest neighbor (k-NN) algorithm is a multi-dimensional classification
: and regression algorithm. Similarly to other TMVA algorithms, k-NN uses a set of
: training events for which a classification category/regression target is known.
: The k-NN method compares a test event to all training events using a distance
: function, which is an Euclidean distance in a space defined by the input variables.
: The k-NN method, as implemented in TMVA, uses a kd-tree algorithm to perform a
: quick search for the k events with shortest distance to the test event. The method
: returns a fraction of signal events among the k neighbors. It is recommended
: that a histogram which stores the k-NN decision variable is binned with k+1 bins
: between 0 and 1.
:
: ␛[1m--- Performance tuning via configuration options: ␛[0m
:
: The k-NN method estimates a density of signal and background events in a
: neighborhood around the test event. The method assumes that the density of the
: signal and background events is uniform and constant within the neighborhood.
: k is an adjustable parameter and it determines an average size of the
: neighborhood. Small k values (less than 10) are sensitive to statistical
: fluctuations and large (greater than 100) values might not sufficiently capture
: local differences between events in the training set. The speed of the k-NN
: method also increases with larger values of k.
:
: The k-NN method assigns equal weight to all input variables. Different scales
: among the input variables is compensated using ScaleFrac parameter: the input
: variables are scaled so that the widths for central ScaleFrac*100% events are
: equal among all the input variables.
:
: ␛[1m--- Additional configuration options: ␛[0m
:
: The method inclues an option to use a Gaussian kernel to smooth out the k-NN
: response. The kernel re-weights events using a distance to the test event.
:
: <Suppress this message by specifying "!H" in the booking option>
: ␛[1m================================================================␛[0m
:
KNN : <Train> start...
: Reading 2000 events
: Number of signal events 1000
: Number of background events 1000
: Creating kd-tree with 2000 events
: Computing scale factor for 1d distributions: (ifrac, bottom, top) = (80%, 10%, 90%)
ModulekNN : Optimizing tree for 4 variables with 2000 values
: <Fill> Class 1 has 1000 events
: <Fill> Class 2 has 1000 events
: Elapsed time for training with 2000 events: 0.00271 sec
KNN : [dataset] : Evaluation of KNN on training sample (2000 events)
: Elapsed time for evaluation of 2000 events: 0.0386 sec
: Creating xml weight file: ␛[0;36mdataset/weights/TMVAClassification_KNN.weights.xml␛[0m
: Creating standalone class: ␛[0;36mdataset/weights/TMVAClassification_KNN.class.C␛[0m
Factory : Training finished
:
Factory : Train method: LD for Classification
:
:
: ␛[1m================================================================␛[0m
: ␛[1mH e l p f o r M V A m e t h o d [ LD ] :␛[0m
:
: ␛[1m--- Short description:␛[0m
:
: Linear discriminants select events by distinguishing the mean
: values of the signal and background distributions in a trans-
: formed variable space where linear correlations are removed.
: The LD implementation here is equivalent to the "Fisher" discriminant
: for classification, but also provides linear regression.
:
: (More precisely: the "linear discriminator" determines
: an axis in the (correlated) hyperspace of the input
: variables such that, when projecting the output classes
: (signal and background) upon this axis, they are pushed
: as far as possible away from each other, while events
: of a same class are confined in a close vicinity. The
: linearity property of this classifier is reflected in the
: metric with which "far apart" and "close vicinity" are
: determined: the covariance matrix of the discriminating
: variable space.)
:
: ␛[1m--- Performance optimisation:␛[0m
:
: Optimal performance for the linear discriminant is obtained for
: linearly correlated Gaussian-distributed variables. Any deviation
: from this ideal reduces the achievable separation power. In
: particular, no discrimination at all is achieved for a variable
: that has the same sample mean for signal and background, even if
: the shapes of the distributions are very different. Thus, the linear
: discriminant often benefits from a suitable transformation of the
: input variables. For example, if a variable x in [-1,1] has a
: a parabolic signal distributions, and a uniform background
: distributions, their mean value is zero in both cases, leading
: to no separation. The simple transformation x -> |x| renders this
: variable powerful for the use in a linear discriminant.
:
: ␛[1m--- Performance tuning via configuration options:␛[0m
:
: <None>
:
: <Suppress this message by specifying "!H" in the booking option>
: ␛[1m================================================================␛[0m
:
LD : Results for LD coefficients:
: -----------------------
: Variable: Coefficient:
: -----------------------
: myvar1: -0.309
: myvar2: -0.102
: var3: -0.142
: var4: +0.705
: (offset): -0.055
: -----------------------
: Elapsed time for training with 2000 events: 0.000881 sec
LD : [dataset] : Evaluation of LD on training sample (2000 events)
: Elapsed time for evaluation of 2000 events: 0.000412 sec
: <CreateMVAPdfs> Separation from histogram (PDF): 0.540 (0.000)
: Dataset[dataset] : Evaluation of LD on training sample
: Creating xml weight file: ␛[0;36mdataset/weights/TMVAClassification_LD.weights.xml␛[0m
: Creating standalone class: ␛[0;36mdataset/weights/TMVAClassification_LD.class.C␛[0m
Factory : Training finished
:
Factory : Train method: FDA_GA for Classification
:
:
: ␛[1m================================================================␛[0m
: ␛[1mH e l p f o r M V A m e t h o d [ FDA_GA ] :␛[0m
:
: ␛[1m--- Short description:␛[0m
:
: The function discriminant analysis (FDA) is a classifier suitable
: to solve linear or simple nonlinear discrimination problems.
:
: The user provides the desired function with adjustable parameters
: via the configuration option string, and FDA fits the parameters to
: it, requiring the signal (background) function value to be as close
: as possible to 1 (0). Its advantage over the more involved and
: automatic nonlinear discriminators is the simplicity and transparency
: of the discrimination expression. A shortcoming is that FDA will
: underperform for involved problems with complicated, phase space
: dependent nonlinear correlations.
:
: Please consult the Users Guide for the format of the formula string
: and the allowed parameter ranges:
: http://tmva.sourceforge.net/docu/TMVAUsersGuide.pdf
:
: ␛[1m--- Performance optimisation:␛[0m
:
: The FDA performance depends on the complexity and fidelity of the
: user-defined discriminator function. As a general rule, it should
: be able to reproduce the discrimination power of any linear
: discriminant analysis. To reach into the nonlinear domain, it is
: useful to inspect the correlation profiles of the input variables,
: and add quadratic and higher polynomial terms between variables as
: necessary. Comparison with more involved nonlinear classifiers can
: be used as a guide.
:
: ␛[1m--- Performance tuning via configuration options:␛[0m
:
: Depending on the function used, the choice of "FitMethod" is
: crucial for getting valuable solutions with FDA. As a guideline it
: is recommended to start with "FitMethod=MINUIT". When more complex
: functions are used where MINUIT does not converge to reasonable
: results, the user should switch to non-gradient FitMethods such
: as GeneticAlgorithm (GA) or Monte Carlo (MC). It might prove to be
: useful to combine GA (or MC) with MINUIT by setting the option
: "Converger=MINUIT". GA (MC) will then set the starting parameters
: for MINUIT such that the basic quality of GA (MC) of finding global
: minima is combined with the efficacy of MINUIT of finding local
: minima.
:
: <Suppress this message by specifying "!H" in the booking option>
: ␛[1m================================================================␛[0m
:
FitterBase : <GeneticFitter> Optimisation, please be patient ... (inaccurate progress timing for GA)
: Elapsed time: 0.681 sec
FDA_GA : Results for parameter fit using "GA" fitter:
: -----------------------
: Parameter: Fit result:
: -----------------------
: Par(0): 0.516172
: Par(1): -0.223976
: Par(2): 0
: Par(3): -0.0484459
: Par(4): 0.519119
: -----------------------
: Discriminator expression: "(0)+(1)*x0+(2)*x1+(3)*x2+(4)*x3"
: Value of estimator at minimum: 0.299995
: Elapsed time for training with 2000 events: 0.712 sec
FDA_GA : [dataset] : Evaluation of FDA_GA on training sample (2000 events)
: Elapsed time for evaluation of 2000 events: 0.000508 sec
: Creating xml weight file: ␛[0;36mdataset/weights/TMVAClassification_FDA_GA.weights.xml␛[0m
: Creating standalone class: ␛[0;36mdataset/weights/TMVAClassification_FDA_GA.class.C␛[0m
Factory : Training finished
:
Factory : Train method: MLPBNN for Classification
:
:
: ␛[1m================================================================␛[0m
: ␛[1mH e l p f o r M V A m e t h o d [ MLPBNN ] :␛[0m
:
: ␛[1m--- Short description:␛[0m
:
: The MLP artificial neural network (ANN) is a traditional feed-
: forward multilayer perceptron implementation. The MLP has a user-
: defined hidden layer architecture, while the number of input (output)
: nodes is determined by the input variables (output classes, i.e.,
: signal and one background).
:
: ␛[1m--- Performance optimisation:␛[0m
:
: Neural networks are stable and performing for a large variety of
: linear and non-linear classification problems. However, in contrast
: to (e.g.) boosted decision trees, the user is advised to reduce the
: number of input variables that have only little discrimination power.
:
: In the tests we have carried out so far, the MLP and ROOT networks
: (TMlpANN, interfaced via TMVA) performed equally well, with however
: a clear speed advantage for the MLP. The Clermont-Ferrand neural
: net (CFMlpANN) exhibited worse classification performance in these
: tests, which is partly due to the slow convergence of its training
: (at least 10k training cycles are required to achieve approximately
: competitive results).
:
: ␛[1mOvertraining: ␛[0monly the TMlpANN performs an explicit separation of the
: full training sample into independent training and validation samples.
: We have found that in most high-energy physics applications the
: available degrees of freedom (training events) are sufficient to
: constrain the weights of the relatively simple architectures required
: to achieve good performance. Hence no overtraining should occur, and
: the use of validation samples would only reduce the available training
: information. However, if the performance on the training sample is
: found to be significantly better than the one found with the inde-
: pendent test sample, caution is needed. The results for these samples
: are printed to standard output at the end of each training job.
:
: ␛[1m--- Performance tuning via configuration options:␛[0m
:
: The hidden layer architecture for all ANNs is defined by the option
: "HiddenLayers=N+1,N,...", where here the first hidden layer has N+1
: neurons and the second N neurons (and so on), and where N is the number
: of input variables. Excessive numbers of hidden layers should be avoided,
: in favour of more neurons in the first hidden layer.
:
: The number of cycles should be above 500. As said, if the number of
: adjustable weights is small compared to the training sample size,
: using a large number of training samples should not lead to overtraining.
:
: <Suppress this message by specifying "!H" in the booking option>
: ␛[1m================================================================␛[0m
:
TFHandler_MLPBNN : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.089214 0.20183 [ -1.0000 1.0000 ]
: myvar2: -0.090751 0.29609 [ -1.0000 1.0000 ]
: var3: 0.059878 0.21436 [ -1.0000 1.0000 ]
: var4: 0.11587 0.24261 [ -1.0000 1.0000 ]
: -----------------------------------------------------------
: Training Network
:
: Finalizing handling of Regulator terms, trainE=0.713219 testE=0.724617
: Done with handling of Regulator terms
: Elapsed time for training with 2000 events: 2.45 sec
MLPBNN : [dataset] : Evaluation of MLPBNN on training sample (2000 events)
: Elapsed time for evaluation of 2000 events: 0.00353 sec
: Creating xml weight file: ␛[0;36mdataset/weights/TMVAClassification_MLPBNN.weights.xml␛[0m
: Creating standalone class: ␛[0;36mdataset/weights/TMVAClassification_MLPBNN.class.C␛[0m
: Write special histos to file: TMVA.root:/dataset/Method_MLPBNN/MLPBNN
Factory : Training finished
:
Factory : Train method: SVM for Classification
:
TFHandler_SVM : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.089214 0.20183 [ -1.0000 1.0000 ]
: myvar2: -0.090751 0.29609 [ -1.0000 1.0000 ]
: var3: 0.059878 0.21436 [ -1.0000 1.0000 ]
: var4: 0.11587 0.24261 [ -1.0000 1.0000 ]
: -----------------------------------------------------------
: Building SVM Working Set...with 2000 event instances
: Elapsed time for Working Set build: 0.085 sec
: Sorry, no computing time forecast available for SVM, please wait ...
: Elapsed time: 0.327 sec
: Elapsed time for training with 2000 events: 0.417 sec
SVM : [dataset] : Evaluation of SVM on training sample (2000 events)
: Elapsed time for evaluation of 2000 events: 0.0628 sec
: Creating xml weight file: ␛[0;36mdataset/weights/TMVAClassification_SVM.weights.xml␛[0m
: Creating standalone class: ␛[0;36mdataset/weights/TMVAClassification_SVM.class.C␛[0m
Factory : Training finished
:
Factory : Train method: BDT for Classification
:
BDT : #events: (reweighted) sig: 1000 bkg: 1000
: #events: (unweighted) sig: 1000 bkg: 1000
: Training 850 Decision Trees ... patience please
: Elapsed time for training with 2000 events: 0.737 sec
BDT : [dataset] : Evaluation of BDT on training sample (2000 events)
: Elapsed time for evaluation of 2000 events: 0.146 sec
: Creating xml weight file: ␛[0;36mdataset/weights/TMVAClassification_BDT.weights.xml␛[0m
: Creating standalone class: ␛[0;36mdataset/weights/TMVAClassification_BDT.class.C␛[0m
: TMVA.root:/dataset/Method_BDT/BDT
Factory : Training finished
:
Factory : Train method: RuleFit for Classification
:
:
: ␛[1m================================================================␛[0m
: ␛[1mH e l p f o r M V A m e t h o d [ RuleFit ] :␛[0m
:
: ␛[1m--- Short description:␛[0m
:
: This method uses a collection of so called rules to create a
: discriminating scoring function. Each rule consists of a series
: of cuts in parameter space. The ensemble of rules are created
: from a forest of decision trees, trained using the training data.
: Each node (apart from the root) corresponds to one rule.
: The scoring function is then obtained by linearly combining
: the rules. A fitting procedure is applied to find the optimum
: set of coefficients. The goal is to find a model with few rules
: but with a strong discriminating power.
:
: ␛[1m--- Performance optimisation:␛[0m
:
: There are two important considerations to make when optimising:
:
: 1. Topology of the decision tree forest
: 2. Fitting of the coefficients
:
: The maximum complexity of the rules is defined by the size of
: the trees. Large trees will yield many complex rules and capture
: higher order correlations. On the other hand, small trees will
: lead to a smaller ensemble with simple rules, only capable of
: modeling simple structures.
: Several parameters exists for controlling the complexity of the
: rule ensemble.
:
: The fitting procedure searches for a minimum using a gradient
: directed path. Apart from step size and number of steps, the
: evolution of the path is defined by a cut-off parameter, tau.
: This parameter is unknown and depends on the training data.
: A large value will tend to give large weights to a few rules.
: Similarly, a small value will lead to a large set of rules
: with similar weights.
:
: A final point is the model used; rules and/or linear terms.
: For a given training sample, the result may improve by adding
: linear terms. If best performance is obtained using only linear
: terms, it is very likely that the Fisher discriminant would be
: a better choice. Ideally the fitting procedure should be able to
: make this choice by giving appropriate weights for either terms.
:
: ␛[1m--- Performance tuning via configuration options:␛[0m
:
: I. TUNING OF RULE ENSEMBLE:
:
: ␛[1mForestType ␛[0m: Recommended is to use the default "AdaBoost".
: ␛[1mnTrees ␛[0m: More trees leads to more rules but also slow
: performance. With too few trees the risk is
: that the rule ensemble becomes too simple.
: ␛[1mfEventsMin ␛[0m
: ␛[1mfEventsMax ␛[0m: With a lower min, more large trees will be generated
: leading to more complex rules.
: With a higher max, more small trees will be
: generated leading to more simple rules.
: By changing this range, the average complexity
: of the rule ensemble can be controlled.
: ␛[1mRuleMinDist ␛[0m: By increasing the minimum distance between
: rules, fewer and more diverse rules will remain.
: Initially it is a good idea to keep this small
: or zero and let the fitting do the selection of
: rules. In order to reduce the ensemble size,
: the value can then be increased.
:
: II. TUNING OF THE FITTING:
:
: ␛[1mGDPathEveFrac ␛[0m: fraction of events in path evaluation
: Increasing this fraction will improve the path
: finding. However, a too high value will give few
: unique events available for error estimation.
: It is recommended to use the default = 0.5.
: ␛[1mGDTau ␛[0m: cutoff parameter tau
: By default this value is set to -1.0.
: This means that the cut off parameter is
: automatically estimated. In most cases
: this should be fine. However, you may want
: to fix this value if you already know it
: and want to reduce on training time.
: ␛[1mGDTauPrec ␛[0m: precision of estimated tau
: Increase this precision to find a more
: optimum cut-off parameter.
: ␛[1mGDNStep ␛[0m: number of steps in path search
: If the number of steps is too small, then
: the program will give a warning message.
:
: III. WARNING MESSAGES
:
: ␛[1mRisk(i+1)>=Risk(i) in path␛[0m
: ␛[1mChaotic behaviour of risk evolution.␛[0m
: The error rate was still decreasing at the end
: By construction the Risk should always decrease.
: However, if the training sample is too small or
: the model is overtrained, such warnings can
: occur.
: The warnings can safely be ignored if only a
: few (<3) occur. If more warnings are generated,
: the fitting fails.
: A remedy may be to increase the value
: ␛[1mGDValidEveFrac␛[0m to 1.0 (or a larger value).
: In addition, if ␛[1mGDPathEveFrac␛[0m is too high
: the same warnings may occur since the events
: used for error estimation are also used for
: path estimation.
: Another possibility is to modify the model -
: See above on tuning the rule ensemble.
:
: ␛[1mThe error rate was still decreasing at the end of the path␛[0m
: Too few steps in path! Increase ␛[1mGDNSteps␛[0m.
:
: ␛[1mReached minimum early in the search␛[0m
: Minimum was found early in the fitting. This
: may indicate that the used step size ␛[1mGDStep␛[0m.
: was too large. Reduce it and rerun.
: If the results still are not OK, modify the
: model either by modifying the rule ensemble
: or add/remove linear terms
:
: <Suppress this message by specifying "!H" in the booking option>
: ␛[1m================================================================␛[0m
:
RuleFit : -------------------RULE ENSEMBLE SUMMARY------------------------
: Tree training method : AdaBoost
: Number of events per tree : 2000
: Number of trees : 20
: Number of generated rules : 196
: Idem, after cleanup : 80
: Average number of cuts per rule : 3.01
: Spread in number of cuts per rules : 1.23
: ----------------------------------------------------------------
:
: GD path scan - the scan stops when the max num. of steps is reached or a min is found
: Estimating the cutoff parameter tau. The estimated time is a pessimistic maximum.
: Best path found with tau = 0.0000 after 2.22 sec
: Fitting model...
<WARNING> :
: Minimisation elapsed time : 1.2 sec
: ----------------------------------------------------------------
: Found minimum at step 10000 with error = 0.552378
: Reason for ending loop: end of loop reached
: ----------------------------------------------------------------
: The error rate was still decreasing at the end of the path
: Increase number of steps (GDNSteps).
: Removed 28 out of a total of 80 rules with importance < 0.001
:
: ================================================================
: M o d e l
: ================================================================
RuleFit : Offset (a0) = 9.46803
: ------------------------------------
: Linear model (weights unnormalised)
: ------------------------------------
: Variable : Weights : Importance
: ------------------------------------
: myvar1 : -6.338e-01 : 0.472
: myvar2 : -4.488e-01 : 0.209
: var3 : -2.810e-01 : 0.129
: var4 : 1.850e+00 : 1.000
: ------------------------------------
: Number of rules = 52
: Printing the first 10 rules, ordered in importance.
: Rule 1 : Importance = 0.4294
: Cut 1 : -0.708 < var4
: Rule 2 : Importance = 0.3676
: Cut 1 : var3 < -0.0812
: Rule 3 : Importance = 0.3363
: Cut 1 : -0.0812 < var3
: Rule 4 : Importance = 0.2934
: Cut 1 : -0.877 < var3
: Cut 2 : 0.271 < var4
: Rule 5 : Importance = 0.2706
: Cut 1 : myvar1 < 2.83
: Cut 2 : -1.67 < var3
: Rule 6 : Importance = 0.2387
: Cut 1 : myvar1 < 1.46
: Cut 2 : var4 < 0.271
: Rule 7 : Importance = 0.1904
: Cut 1 : var4 < -0.708
: Rule 8 : Importance = 0.1897
: Cut 1 : var3 < 0.256
: Cut 2 : var4 < -0.708
: Rule 9 : Importance = 0.1689
: Cut 1 : myvar1 < -2.85
: Rule 10 : Importance = 0.1611
: Cut 1 : -2.85 < myvar1 < 2.68
: Skipping the next 42 rules
: ================================================================
:
<WARNING> : No input variable directory found - BUG?
: Elapsed time for training with 2000 events: 3.48 sec
RuleFit : [dataset] : Evaluation of RuleFit on training sample (2000 events)
: Elapsed time for evaluation of 2000 events: 0.0025 sec
: Creating xml weight file: ␛[0;36mdataset/weights/TMVAClassification_RuleFit.weights.xml␛[0m
: Creating standalone class: ␛[0;36mdataset/weights/TMVAClassification_RuleFit.class.C␛[0m
: TMVA.root:/dataset/Method_RuleFit/RuleFit
Factory : Training finished
:
: Ranking input variables (method specific)...
: No variable ranking supplied by classifier: Cuts
: No variable ranking supplied by classifier: CutsD
Likelihood : Ranking result (top variable is best ranked)
: -------------------------------------
: Rank : Variable : Delta Separation
: -------------------------------------
: 1 : var4 : 5.365e-02
: 2 : myvar1 : -4.483e-04
: 3 : myvar2 : -2.298e-03
: 4 : var3 : -9.266e-03
: -------------------------------------
LikelihoodPCA : Ranking result (top variable is best ranked)
: -------------------------------------
: Rank : Variable : Delta Separation
: -------------------------------------
: 1 : var4 : 2.952e-01
: 2 : myvar1 : 7.646e-02
: 3 : var3 : 2.035e-02
: 4 : myvar2 : 1.950e-02
: -------------------------------------
: No variable ranking supplied by classifier: PDERS
PDEFoam : Ranking result (top variable is best ranked)
: ----------------------------------------
: Rank : Variable : Variable Importance
: ----------------------------------------
: 1 : var4 : 3.830e-01
: 2 : myvar1 : 2.979e-01
: 3 : var3 : 1.915e-01
: 4 : myvar2 : 1.277e-01
: ----------------------------------------
: No variable ranking supplied by classifier: KNN
LD : Ranking result (top variable is best ranked)
: ---------------------------------
: Rank : Variable : Discr. power
: ---------------------------------
: 1 : var4 : 7.053e-01
: 2 : myvar1 : 3.094e-01
: 3 : var3 : 1.423e-01
: 4 : myvar2 : 1.019e-01
: ---------------------------------
: No variable ranking supplied by classifier: FDA_GA
MLPBNN : Ranking result (top variable is best ranked)
: -------------------------------
: Rank : Variable : Importance
: -------------------------------
: 1 : var4 : 1.360e+00
: 2 : myvar2 : 1.009e+00
: 3 : myvar1 : 8.834e-01
: 4 : var3 : 3.562e-01
: -------------------------------
: No variable ranking supplied by classifier: SVM
BDT : Ranking result (top variable is best ranked)
: ----------------------------------------
: Rank : Variable : Variable Importance
: ----------------------------------------
: 1 : var4 : 2.697e-01
: 2 : myvar1 : 2.467e-01
: 3 : myvar2 : 2.460e-01
: 4 : var3 : 2.377e-01
: ----------------------------------------
RuleFit : Ranking result (top variable is best ranked)
: -------------------------------
: Rank : Variable : Importance
: -------------------------------
: 1 : var4 : 1.000e+00
: 2 : myvar1 : 6.981e-01
: 3 : var3 : 5.947e-01
: 4 : myvar2 : 4.105e-01
: -------------------------------
Factory : === Destroy and recreate all methods via weight files for testing ===
:
: Reading weight file: ␛[0;36mdataset/weights/TMVAClassification_Cuts.weights.xml␛[0m
: Read cuts optimised using sample of MC events
: Reading 100 signal efficiency bins for 4 variables
: Reading weight file: ␛[0;36mdataset/weights/TMVAClassification_CutsD.weights.xml␛[0m
: Read cuts optimised using sample of MC events
: Reading 100 signal efficiency bins for 4 variables
: Reading weight file: ␛[0;36mdataset/weights/TMVAClassification_Likelihood.weights.xml␛[0m
: Reading weight file: ␛[0;36mdataset/weights/TMVAClassification_LikelihoodPCA.weights.xml␛[0m
: Reading weight file: ␛[0;36mdataset/weights/TMVAClassification_PDERS.weights.xml␛[0m
: signal and background scales: 0.001 0.001
: Reading weight file: ␛[0;36mdataset/weights/TMVAClassification_PDEFoam.weights.xml␛[0m
: Read foams from file: ␛[0;36mdataset/weights/TMVAClassification_PDEFoam.weights_foams.root␛[0m
: Reading weight file: ␛[0;36mdataset/weights/TMVAClassification_KNN.weights.xml␛[0m
: Creating kd-tree with 2000 events
: Computing scale factor for 1d distributions: (ifrac, bottom, top) = (80%, 10%, 90%)
ModulekNN : Optimizing tree for 4 variables with 2000 values
: <Fill> Class 1 has 1000 events
: <Fill> Class 2 has 1000 events
: Reading weight file: ␛[0;36mdataset/weights/TMVAClassification_LD.weights.xml␛[0m
: Reading weight file: ␛[0;36mdataset/weights/TMVAClassification_FDA_GA.weights.xml␛[0m
: User-defined formula string : "(0)+(1)*x0+(2)*x1+(3)*x2+(4)*x3"
: TFormula-compatible formula string: "[0]+[1]*[5]+[2]*[6]+[3]*[7]+[4]*[8]"
: Reading weight file: ␛[0;36mdataset/weights/TMVAClassification_MLPBNN.weights.xml␛[0m
MLPBNN : Building Network.
: Initializing weights
: Reading weight file: ␛[0;36mdataset/weights/TMVAClassification_SVM.weights.xml␛[0m
: Reading weight file: ␛[0;36mdataset/weights/TMVAClassification_BDT.weights.xml␛[0m
: Reading weight file: ␛[0;36mdataset/weights/TMVAClassification_RuleFit.weights.xml␛[0m
Factory : ␛[1mTest all methods␛[0m
Factory : Test method: Cuts for Classification performance
:
Cuts : [dataset] : Evaluation of Cuts on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.000614 sec
Factory : Test method: CutsD for Classification performance
:
CutsD : [dataset] : Evaluation of CutsD on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.00589 sec
Factory : Test method: Likelihood for Classification performance
:
Likelihood : [dataset] : Evaluation of Likelihood on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.00774 sec
Factory : Test method: LikelihoodPCA for Classification performance
:
LikelihoodPCA : [dataset] : Evaluation of LikelihoodPCA on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.0173 sec
Factory : Test method: PDERS for Classification performance
:
PDERS : [dataset] : Evaluation of PDERS on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.951 sec
Factory : Test method: PDEFoam for Classification performance
:
PDEFoam : [dataset] : Evaluation of PDEFoam on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.0724 sec
Factory : Test method: KNN for Classification performance
:
KNN : [dataset] : Evaluation of KNN on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.192 sec
Factory : Test method: LD for Classification performance
:
LD : [dataset] : Evaluation of LD on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.00193 sec
: Dataset[dataset] : Evaluation of LD on testing sample
Factory : Test method: FDA_GA for Classification performance
:
FDA_GA : [dataset] : Evaluation of FDA_GA on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.00141 sec
Factory : Test method: MLPBNN for Classification performance
:
MLPBNN : [dataset] : Evaluation of MLPBNN on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.0115 sec
Factory : Test method: SVM for Classification performance
:
SVM : [dataset] : Evaluation of SVM on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.277 sec
Factory : Test method: BDT for Classification performance
:
BDT : [dataset] : Evaluation of BDT on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.582 sec
Factory : Test method: RuleFit for Classification performance
:
RuleFit : [dataset] : Evaluation of RuleFit on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.0123 sec
Factory : ␛[1mEvaluate all methods␛[0m
Factory : Evaluate classifier: Cuts
:
<WARNING> : You have asked for histogram MVA_EFF_BvsS which does not seem to exist in *Results* .. better don't use it
<WARNING> : You have asked for histogram EFF_BVSS_TR which does not seem to exist in *Results* .. better don't use it
TFHandler_Cuts : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.21781 1.7248 [ -9.8605 7.9024 ]
: myvar2: -0.062175 1.1106 [ -4.0854 4.0259 ]
: var3: 0.16451 1.0589 [ -5.3563 4.6422 ]
: var4: 0.43566 1.2253 [ -6.9675 5.0307 ]
: -----------------------------------------------------------
Factory : Evaluate classifier: CutsD
:
<WARNING> : You have asked for histogram MVA_EFF_BvsS which does not seem to exist in *Results* .. better don't use it
TFHandler_CutsD : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: -0.14555 1.0166 [ -5.5736 5.0206 ]
: myvar2: -0.093417 1.0353 [ -3.8442 3.7856 ]
: var3: -0.096857 1.0078 [ -4.5469 4.5058 ]
: var4: 0.65748 0.95864 [ -4.0893 3.7760 ]
: -----------------------------------------------------------
<WARNING> : You have asked for histogram EFF_BVSS_TR which does not seem to exist in *Results* .. better don't use it
TFHandler_CutsD : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: -0.17586 1.0000 [ -5.6401 4.8529 ]
: myvar2: 0.026952 1.0000 [ -2.9292 3.7065 ]
: var3: -0.11549 1.0000 [ -4.1792 3.5180 ]
: var4: 0.34819 1.0000 [ -3.3363 3.3963 ]
: -----------------------------------------------------------
TFHandler_CutsD : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: -0.14555 1.0166 [ -5.5736 5.0206 ]
: myvar2: -0.093417 1.0353 [ -3.8442 3.7856 ]
: var3: -0.096857 1.0078 [ -4.5469 4.5058 ]
: var4: 0.65748 0.95864 [ -4.0893 3.7760 ]
: -----------------------------------------------------------
Factory : Evaluate classifier: Likelihood
:
Likelihood : [dataset] : Loop over test events and fill histograms with classifier response...
:
TFHandler_Likelihood : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.21781 1.7248 [ -9.8605 7.9024 ]
: myvar2: -0.062175 1.1106 [ -4.0854 4.0259 ]
: var3: 0.16451 1.0589 [ -5.3563 4.6422 ]
: var4: 0.43566 1.2253 [ -6.9675 5.0307 ]
: -----------------------------------------------------------
Factory : Evaluate classifier: LikelihoodPCA
:
TFHandler_LikelihoodPCA : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 1.1147 2.2628 [ -12.508 10.719 ]
: myvar2: -0.25554 1.1225 [ -4.1578 3.8995 ]
: var3: -0.19401 0.58225 [ -2.2950 1.8880 ]
: var4: -0.32038 0.33412 [ -1.3929 0.88819 ]
: -----------------------------------------------------------
LikelihoodPCA : [dataset] : Loop over test events and fill histograms with classifier response...
:
TFHandler_LikelihoodPCA : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 1.1147 2.2628 [ -12.508 10.719 ]
: myvar2: -0.25554 1.1225 [ -4.1578 3.8995 ]
: var3: -0.19401 0.58225 [ -2.2950 1.8880 ]
: var4: -0.32038 0.33412 [ -1.3929 0.88819 ]
: -----------------------------------------------------------
Factory : Evaluate classifier: PDERS
:
PDERS : [dataset] : Loop over test events and fill histograms with classifier response...
:
TFHandler_PDERS : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.21781 1.7248 [ -9.8605 7.9024 ]
: myvar2: -0.062175 1.1106 [ -4.0854 4.0259 ]
: var3: 0.16451 1.0589 [ -5.3563 4.6422 ]
: var4: 0.43566 1.2253 [ -6.9675 5.0307 ]
: -----------------------------------------------------------
Factory : Evaluate classifier: PDEFoam
:
PDEFoam : [dataset] : Loop over test events and fill histograms with classifier response...
:
TFHandler_PDEFoam : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.21781 1.7248 [ -9.8605 7.9024 ]
: myvar2: -0.062175 1.1106 [ -4.0854 4.0259 ]
: var3: 0.16451 1.0589 [ -5.3563 4.6422 ]
: var4: 0.43566 1.2253 [ -6.9675 5.0307 ]
: -----------------------------------------------------------
Factory : Evaluate classifier: KNN
:
KNN : [dataset] : Loop over test events and fill histograms with classifier response...
:
TFHandler_KNN : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.21781 1.7248 [ -9.8605 7.9024 ]
: myvar2: -0.062175 1.1106 [ -4.0854 4.0259 ]
: var3: 0.16451 1.0589 [ -5.3563 4.6422 ]
: var4: 0.43566 1.2253 [ -6.9675 5.0307 ]
: -----------------------------------------------------------
Factory : Evaluate classifier: LD
:
LD : [dataset] : Loop over test events and fill histograms with classifier response...
:
: Also filling probability and rarity histograms (on request)...
TFHandler_LD : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.21781 1.7248 [ -9.8605 7.9024 ]
: myvar2: -0.062175 1.1106 [ -4.0854 4.0259 ]
: var3: 0.16451 1.0589 [ -5.3563 4.6422 ]
: var4: 0.43566 1.2253 [ -6.9675 5.0307 ]
: -----------------------------------------------------------
Factory : Evaluate classifier: FDA_GA
:
FDA_GA : [dataset] : Loop over test events and fill histograms with classifier response...
:
TFHandler_FDA_GA : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.21781 1.7248 [ -9.8605 7.9024 ]
: myvar2: -0.062175 1.1106 [ -4.0854 4.0259 ]
: var3: 0.16451 1.0589 [ -5.3563 4.6422 ]
: var4: 0.43566 1.2253 [ -6.9675 5.0307 ]
: -----------------------------------------------------------
Factory : Evaluate classifier: MLPBNN
:
TFHandler_MLPBNN : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.12216 0.20255 [ -1.0614 1.0246 ]
: myvar2: -0.12333 0.30492 [ -1.2280 0.99911 ]
: var3: 0.097148 0.21347 [ -1.0158 0.99984 ]
: var4: 0.17495 0.23851 [ -1.2661 1.0694 ]
: -----------------------------------------------------------
MLPBNN : [dataset] : Loop over test events and fill histograms with classifier response...
:
TFHandler_MLPBNN : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.12216 0.20255 [ -1.0614 1.0246 ]
: myvar2: -0.12333 0.30492 [ -1.2280 0.99911 ]
: var3: 0.097148 0.21347 [ -1.0158 0.99984 ]
: var4: 0.17495 0.23851 [ -1.2661 1.0694 ]
: -----------------------------------------------------------
Factory : Evaluate classifier: SVM
:
TFHandler_SVM : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.12216 0.20255 [ -1.0614 1.0246 ]
: myvar2: -0.12333 0.30492 [ -1.2280 0.99911 ]
: var3: 0.097148 0.21347 [ -1.0158 0.99984 ]
: var4: 0.17495 0.23851 [ -1.2661 1.0694 ]
: -----------------------------------------------------------
SVM : [dataset] : Loop over test events and fill histograms with classifier response...
:
TFHandler_SVM : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.12216 0.20255 [ -1.0614 1.0246 ]
: myvar2: -0.12333 0.30492 [ -1.2280 0.99911 ]
: var3: 0.097148 0.21347 [ -1.0158 0.99984 ]
: var4: 0.17495 0.23851 [ -1.2661 1.0694 ]
: -----------------------------------------------------------
Factory : Evaluate classifier: BDT
:
BDT : [dataset] : Loop over test events and fill histograms with classifier response...
:
TFHandler_BDT : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.21781 1.7248 [ -9.8605 7.9024 ]
: myvar2: -0.062175 1.1106 [ -4.0854 4.0259 ]
: var3: 0.16451 1.0589 [ -5.3563 4.6422 ]
: var4: 0.43566 1.2253 [ -6.9675 5.0307 ]
: -----------------------------------------------------------
Factory : Evaluate classifier: RuleFit
:
RuleFit : [dataset] : Loop over test events and fill histograms with classifier response...
:
TFHandler_RuleFit : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: myvar1: 0.21781 1.7248 [ -9.8605 7.9024 ]
: myvar2: -0.062175 1.1106 [ -4.0854 4.0259 ]
: var3: 0.16451 1.0589 [ -5.3563 4.6422 ]
: var4: 0.43566 1.2253 [ -6.9675 5.0307 ]
: -----------------------------------------------------------
:
: Evaluation results ranked by best signal efficiency and purity (area)
: -------------------------------------------------------------------------------------------------------------------
: DataSet MVA
: Name: Method: ROC-integ
: dataset LD : 0.921
: dataset MLPBNN : 0.919
: dataset LikelihoodPCA : 0.914
: dataset CutsD : 0.908
: dataset FDA_GA : 0.899
: dataset SVM : 0.898
: dataset RuleFit : 0.881
: dataset BDT : 0.881
: dataset KNN : 0.838
: dataset PDEFoam : 0.822
: dataset PDERS : 0.797
: dataset Cuts : 0.792
: dataset Likelihood : 0.757
: -------------------------------------------------------------------------------------------------------------------
:
: Testing efficiency compared to training efficiency (overtraining check)
: -------------------------------------------------------------------------------------------------------------------
: DataSet MVA Signal efficiency: from test sample (from training sample)
: Name: Method: @B=0.01 @B=0.10 @B=0.30
: -------------------------------------------------------------------------------------------------------------------
: dataset LD : 0.364 (0.438) 0.781 (0.758) 0.929 (0.920)
: dataset MLPBNN : 0.343 (0.432) 0.777 (0.768) 0.926 (0.920)
: dataset LikelihoodPCA : 0.308 (0.345) 0.757 (0.728) 0.920 (0.911)
: dataset CutsD : 0.262 (0.449) 0.735 (0.709) 0.914 (0.890)
: dataset FDA_GA : 0.267 (0.344) 0.707 (0.704) 0.906 (0.885)
: dataset SVM : 0.321 (0.332) 0.711 (0.725) 0.894 (0.898)
: dataset RuleFit : 0.075 (0.077) 0.667 (0.718) 0.893 (0.896)
: dataset BDT : 0.275 (0.402) 0.661 (0.731) 0.870 (0.899)
: dataset KNN : 0.195 (0.252) 0.561 (0.642) 0.810 (0.843)
: dataset PDEFoam : 0.173 (0.219) 0.499 (0.541) 0.761 (0.773)
: dataset PDERS : 0.158 (0.171) 0.465 (0.492) 0.750 (0.756)
: dataset Cuts : 0.112 (0.133) 0.444 (0.496) 0.741 (0.758)
: dataset Likelihood : 0.076 (0.089) 0.389 (0.420) 0.686 (0.692)
: -------------------------------------------------------------------------------------------------------------------
:
Dataset:dataset : Created tree 'TestTree' with 10000 events
:
Dataset:dataset : Created tree 'TrainTree' with 2000 events
:
Factory : ␛[1mThank you for using TMVA!␛[0m
: ␛[1mFor citation information, please visit: http://tmva.sf.net/citeTMVA.html␛[0m
==> Wrote root file: TMVA.root
==> TMVAClassification is done!
(int) 0