A class doing the actual fitting of a linear model using rules as base functions.
Definition at line 49 of file RuleFitParams.h.
Public Member Functions | |
RuleFitParams () | |
constructor | |
virtual | ~RuleFitParams () |
destructor | |
Int_t | FindGDTau () |
This finds the cutoff parameter tau by scanning several different paths. | |
UInt_t | GetPathIdx1 () const |
UInt_t | GetPathIdx2 () const |
UInt_t | GetPerfIdx1 () const |
UInt_t | GetPerfIdx2 () const |
void | Init () |
Initializes all parameters using the RuleEnsemble and the training tree. | |
void | InitGD () |
Initialize GD path search. | |
Double_t | LossFunction (const Event &e) const |
Implementation of the squared-error ramp loss function (eqs. 39 and 40 in ref. 1). This is used for binary classification, where y = {+1,-1} for (sig,bkg). | |
Double_t | LossFunction (UInt_t evtidx) const |
Implementation of the squared-error ramp loss function (eqs. 39 and 40 in ref. 1). This is used for binary classification, where y = {+1,-1} for (sig,bkg). | |
Double_t | LossFunction (UInt_t evtidx, UInt_t itau) const |
Implementation of the squared-error ramp loss function (eqs. 39 and 40 in ref. 1). This is used for binary classification, where y = {+1,-1} for (sig,bkg). | |
void | MakeGDPath () |
The following finds the gradient directed path in parameter space. | |
Double_t | Penalty () const |
This is the "lasso" penalty, to be used for regression. | |
Double_t | Risk (UInt_t ind1, UInt_t ind2, Double_t neff) const |
risk assessment | |
Double_t | Risk (UInt_t ind1, UInt_t ind2, Double_t neff, UInt_t itau) const |
risk assessment for tau model <itau> | |
Double_t | RiskPath () const |
Double_t | RiskPerf () const |
Double_t | RiskPerf (UInt_t itau) const |
UInt_t | RiskPerfTst () |
Estimates the error rate with the current set of parameters. | |
void | SetGDErrScale (Double_t s) |
void | SetGDNPathSteps (Int_t np) |
void | SetGDPathStep (Double_t s) |
void | SetGDTau (Double_t t) |
void | SetGDTauPrec (Double_t p) |
void | SetGDTauRange (Double_t t0, Double_t t1) |
void | SetGDTauScan (UInt_t n) |
void | SetMsgType (EMsgType t) |
void | SetRuleFit (RuleFit *rf) |
Int_t | Type (const Event *e) const |
Protected Types | |
typedef std::vector< const TMVA::Event * >::const_iterator | EventItr |
Protected Member Functions | |
Double_t | CalcAverageResponse () |
Calculates the average response. TODO: rewrite the bad dependency on EvaluateAverage(). | |
Double_t | CalcAverageResponseOLD () |
Double_t | CalcAverageTruth () |
calculate the average truth | |
void | CalcFStar () |
Estimates F* (optimum scoring function) for all events for the given sets. | |
void | CalcGDNTau () |
void | CalcTstAverageResponse () |
Calculates the average response for all test paths. TODO: see the comment under CalcAverageResponse(). Note that a zero offset is used. | |
Double_t | ErrorRateBin () |
Estimates the error rate with the current set of parameters. It uses a binary estimate of (y-F*(x)): (y-F*(x)) = (number of events where sign(F) != sign(y))/Neve, with y = +1 if the event is signal and -1 otherwise. Not used. | |
Double_t | ErrorRateReg () |
Estimates the error rate with the current set of parameters. This code is pretty messy at the moment. | |
Double_t | ErrorRateRoc () |
Estimates the error rate with the current set of parameters. | |
Double_t | ErrorRateRocRaw (std::vector< Double_t > &sFsig, std::vector< Double_t > &sFbkg) |
Estimates the error rate with the current set of parameters. | |
void | ErrorRateRocTst () |
Estimates the error rate with the current set of parameters. | |
void | EvaluateAverage (UInt_t ind1, UInt_t ind2, std::vector< Double_t > &avsel, std::vector< Double_t > &avrul) |
evaluate the average of each variable and f(x) in the given range | |
void | EvaluateAveragePath () |
void | EvaluateAveragePerf () |
void | FillCoefficients () |
helper function to store the rule coefficients in local arrays | |
void | InitNtuple () |
initializes the ntuple | |
void | MakeGradientVector () |
make gradient vector | |
void | MakeTstGradientVector () |
Makes the test gradient vector for all tau, using the same algorithm as MakeGradientVector(). | |
Double_t | Optimism () |
Implementation of eq. 7.17 in the Hastie, Tibshirani & Friedman book. | |
void | UpdateCoefficients () |
Establish maximum gradient for rules, linear terms and the offset. | |
void | UpdateTstCoefficients () |
Establish maximum gradient for rules, linear terms and the offset for all taus. TODO: the index range is not needed. | |
Protected Attributes | |
std::vector< Double_t > | fAverageRulePath |
average of each rule, same range | |
std::vector< Double_t > | fAverageRulePerf |
average of each rule, same range | |
std::vector< Double_t > | fAverageSelectorPath |
average of each variable over the range fPathIdx1,2 | |
std::vector< Double_t > | fAverageSelectorPerf |
average of each variable over the range fPerfIdx1,2 | |
Double_t | fAverageTruth |
average truth, i.e. sum(y)/N, with y = +1 or -1 | |
Double_t | fbkgave |
Average of F(bkg) | |
Double_t | fbkgrms |
Rms of F(bkg) | |
std::vector< Double_t > | fFstar |
vector of F*() - filled in CalcFStar() | |
Double_t | fFstarMedian |
median value of F*() using | |
std::vector< std::vector< Double_t > > | fGDCoefLinTst |
linear coeffs - one per tau | |
std::vector< std::vector< Double_t > > | fGDCoefTst |
rule coeffs - one per tau | |
Double_t | fGDErrScale |
stop scan at error = scale*errmin | |
std::vector< Double_t > | fGDErrTst |
error rates per tau | |
std::vector< Char_t > | fGDErrTstOK |
error rate is sufficiently low (stores a boolean) | |
Int_t | fGDNPathSteps |
number of path steps | |
UInt_t | fGDNTau |
number of tau-paths - calculated in SetGDTauPrec | |
UInt_t | fGDNTauTstOK |
number of tau in the test-phase that are ok | |
TTree * | fGDNtuple |
Gradient path ntuple, contains params for each step along the path. | |
std::vector< Double_t > | fGDOfsTst |
offset per tau | |
Double_t | fGDPathStep |
step size along path (delta nu in eq 22, ref 1) | |
Double_t | fGDTau |
selected threshold parameter (tau in eq 26, ref 1) | |
Double_t | fGDTauMax |
max threshold parameter (tau in eq 26, ref 1) | |
Double_t | fGDTauMin |
min threshold parameter (tau in eq 26, ref 1) | |
Double_t | fGDTauPrec |
precision in tau | |
UInt_t | fGDTauScan |
number scan for tau-paths | |
std::vector< Double_t > | fGDTauVec |
the tau's | |
std::vector< Double_t > | fGradVec |
gradient vector - dimension = number of rules in ensemble | |
std::vector< Double_t > | fGradVecLin |
gradient vector - dimension = number of variables | |
std::vector< std::vector< Double_t > > | fGradVecLinTst |
gradient vector, linear terms - one per tau | |
std::vector< std::vector< Double_t > > | fGradVecTst |
gradient vector - one per tau | |
Double_t | fNEveEffPath |
sum of weights for Path events | |
Double_t | fNEveEffPerf |
idem for Perf events | |
UInt_t | fNLinear |
number of linear terms | |
UInt_t | fNRules |
number of rules | |
Double_t * | fNTCoeff |
GD path: rule coefficients. | |
Double_t | fNTCoefRad |
GD path: 'radius' of all rulecoeffs. | |
Double_t | fNTErrorRate |
GD path: error rate (or performance) | |
Double_t * | fNTLinCoeff |
GD path: linear coefficients. | |
Double_t | fNTNuval |
GD path: value of nu. | |
Double_t | fNTOffset |
GD path: model offset. | |
Double_t | fNTRisk |
GD path: risk. | |
UInt_t | fPathIdx1 |
first event index for path search | |
UInt_t | fPathIdx2 |
last event index for path search | |
UInt_t | fPerfIdx1 |
first event index for performance evaluation | |
UInt_t | fPerfIdx2 |
last event index for performance evaluation | |
RuleEnsemble * | fRuleEnsemble |
rule ensemble | |
RuleFit * | fRuleFit |
rule fit | |
Double_t | fsigave |
Average of the current signal score function F(sig) | |
Double_t | fsigrms |
Rms of F(sig) | |
Private Member Functions | |
MsgLogger & | Log () const |
Private Attributes | |
MsgLogger * | fLogger |
message logger (transient: not written by ROOT I/O) | |
#include <TMVA/RuleFitParams.h>
|
protected |
Definition at line 130 of file RuleFitParams.h.
TMVA::RuleFitParams::RuleFitParams | ( | ) |
constructor
Definition at line 64 of file RuleFitParams.cxx.
|
virtual |
destructor
Definition at line 104 of file RuleFitParams.cxx.
|
protected |
Calculates the average response. TODO: rewrite the bad dependency on EvaluateAverage().
Note that a zero offset is used.
Definition at line 1512 of file RuleFitParams.cxx.
|
protected |
|
protected |
calculate the average truth
Definition at line 1527 of file RuleFitParams.cxx.
|
protected |
Estimates F* (optimum scoring function) for all events for the given sets.
The result is used in ErrorRateReg(). Currently not used.
Definition at line 885 of file RuleFitParams.cxx.
|
inlineprotected |
Definition at line 136 of file RuleFitParams.h.
|
protected |
Calculates the average response for all test paths. TODO: see the comment under CalcAverageResponse(). Note that a zero offset is used.
Definition at line 1491 of file RuleFitParams.cxx.
|
protected |
Estimates the error rate with the current set of parameters. It uses a binary estimate of (y-F*(x)): (y-F*(x)) = (number of events where sign(F) != sign(y))/Neve, with y = +1 if the event is signal and -1 otherwise. Not used.
Definition at line 1008 of file RuleFitParams.cxx.
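The binary estimate described above can be illustrated with a minimal standalone sketch. It is not the TMVA implementation: the function name `BinaryErrorRate`, the plain vectors of scores and truth labels, and the convention that F = 0 counts as signal-like are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: fraction of events whose predicted sign differs
// from the true label. truth[i] is +1 for signal and -1 for background.
double BinaryErrorRate(const std::vector<double>& scores,
                       const std::vector<int>& truth) {
    if (scores.empty() || scores.size() != truth.size()) return 0.0;
    std::size_t nWrong = 0;
    for (std::size_t i = 0; i < scores.size(); ++i) {
        // Assumption: a score of exactly zero is treated as signal-like.
        const int signF = (scores[i] >= 0.0) ? 1 : -1;
        if (signF != truth[i]) ++nWrong;
    }
    return static_cast<double>(nWrong) / static_cast<double>(scores.size());
}
```

Event weights are ignored here; a weighted variant would sum the weights of misclassified events and divide by the total weight.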
|
protected |
Estimates the error rate with the current set of parameters. This code is pretty messy at the moment.
Cleanup is needed. Not used.
Definition at line 964 of file RuleFitParams.cxx.
|
protected |
Estimates the error rate with the current set of parameters.
It calculates the area under the bkg rejection vs signal efficiency curve. The value returned is 1-area. This works but is less efficient than calculating the Risk using RiskPerf().
Definition at line 1107 of file RuleFitParams.cxx.
|
protected |
Estimates the error rate with the current set of parameters.
It calculates the area under the bkg rejection vs signal efficiency curve. The value returned is 1-area.
Definition at line 1042 of file RuleFitParams.cxx.
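As an illustration of the "1-area" quantity, the sketch below computes the area under the background rejection versus signal efficiency curve through the equivalent pairwise (Mann-Whitney) formulation, which is a swapped-in technique and not necessarily how the class scans the curve. The name `RocErrorRate` and the unweighted inputs are assumptions.

```cpp
#include <vector>

// Hypothetical helper: returns 1 - (area under the background-rejection
// vs signal-efficiency curve). The area equals the probability that a
// random signal score exceeds a random background score (ties count 1/2).
double RocErrorRate(const std::vector<double>& sigScores,
                    const std::vector<double>& bkgScores) {
    if (sigScores.empty() || bkgScores.empty()) return 0.0;
    double wins = 0.0;
    for (double s : sigScores) {
        for (double b : bkgScores) {
            if (s > b)       wins += 1.0;  // correctly ordered pair
            else if (s == b) wins += 0.5;  // tie counts as half
        }
    }
    const double area = wins / (static_cast<double>(sigScores.size()) *
                                static_cast<double>(bkgScores.size()));
    return 1.0 - area;
}
```

The double loop costs O(Nsig * Nbkg); sorting both score vectors first, as the sorted-vector arguments of the documented function suggest, would allow a single pass instead.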
|
protected |
Estimates the error rate with the current set of parameters.
It calculates the area under the bkg rejection vs signal efficiency curve. The value returned is 1-area.
See comment under ErrorRateRoc().
Definition at line 1155 of file RuleFitParams.cxx.
|
protected |
evaluate the average of each variable and f(x) in the given range
Definition at line 208 of file RuleFitParams.cxx.
|
inlineprotected |
Definition at line 177 of file RuleFitParams.h.
|
inlineprotected |
Definition at line 180 of file RuleFitParams.h.
|
protected |
helper function to store the rule coefficients in local arrays
Definition at line 868 of file RuleFitParams.cxx.
Int_t TMVA::RuleFitParams::FindGDTau | ( | ) |
This finds the cutoff parameter tau by scanning several different paths.
Definition at line 449 of file RuleFitParams.cxx.
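A scan over several paths needs a set of candidate tau values. The sketch below builds an evenly spaced grid between a minimum and a maximum threshold; the even spacing, the function name `MakeTauGrid`, and its parameters are assumptions for illustration and are not taken from the TMVA source.

```cpp
#include <vector>

// Hypothetical helper: nTau evenly spaced tau values in [tauMin, tauMax].
std::vector<double> MakeTauGrid(double tauMin, double tauMax, unsigned nTau) {
    std::vector<double> taus;
    if (nTau == 0) return taus;
    if (nTau == 1) {            // a single path: use the lower bound
        taus.push_back(tauMin);
        return taus;
    }
    const double dtau = (tauMax - tauMin) / static_cast<double>(nTau - 1);
    for (unsigned i = 0; i < nTau; ++i) {
        taus.push_back(tauMin + i * dtau);
    }
    return taus;
}
```

Each grid value would then define one candidate path, and the path with the lowest estimated error would determine the selected tau.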
|
inline |
Definition at line 91 of file RuleFitParams.h.
|
inline |
Definition at line 92 of file RuleFitParams.h.
|
inline |
Definition at line 93 of file RuleFitParams.h.
|
inline |
Definition at line 94 of file RuleFitParams.h.
void TMVA::RuleFitParams::Init | ( | ) |
Initializes all parameters using the RuleEnsemble and the training tree.
Definition at line 114 of file RuleFitParams.cxx.
void TMVA::RuleFitParams::InitGD | ( | ) |
Initialize GD path search.
Definition at line 373 of file RuleFitParams.cxx.
|
protected |
initializes the ntuple
Definition at line 185 of file RuleFitParams.cxx.
|
inlineprivate |
Definition at line 254 of file RuleFitParams.h.
Implementation of the squared-error ramp loss function (eqs. 39 and 40 in ref. 1). This is used for binary classification, where y = {+1,-1} for (sig,bkg).
Definition at line 278 of file RuleFitParams.cxx.
Implementation of the squared-error ramp loss function (eqs. 39 and 40 in ref. 1). This is used for binary classification, where y = {+1,-1} for (sig,bkg).
Definition at line 290 of file RuleFitParams.cxx.
Implementation of the squared-error ramp loss function (eqs. 39 and 40 in ref. 1). This is used for binary classification, where y = {+1,-1} for (sig,bkg).
Definition at line 302 of file RuleFitParams.cxx.
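For orientation, a squared-error ramp loss is commonly written as (y - H(F))^2 with H(F) = max(-1, min(1, F)). The standalone sketch below follows that form; the function name `RampLoss`, its arguments, and the multiplicative event weight are assumptions, and the score would in practice come from the rule ensemble.

```cpp
#include <algorithm>

// Hypothetical helper: squared-error ramp loss for a single event.
//   score    : model response F(x)
//   isSignal : true for signal (y = +1), false for background (y = -1)
//   weight   : event weight (assumed to multiply the loss)
double RampLoss(double score, bool isSignal, double weight = 1.0) {
    const double h = std::max(-1.0, std::min(1.0, score));  // clamp F to [-1, +1]
    const double y = isSignal ? 1.0 : -1.0;
    const double diff = y - h;
    return weight * diff * diff;
}
```

Because of the clamp, a signal event with F >= 1 or a background event with F <= -1 contributes zero loss, so confidently correct predictions are not penalised for overshooting.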
void TMVA::RuleFitParams::MakeGDPath | ( | ) |
The following finds the gradient directed path in parameter space.
More work is needed... FT, 24/9/2006
The algorithm is currently as follows (if not otherwise stated, the sample used below is [fPathIdx1,fPathIdx2]):
The algorithm will warn if:
Definition at line 538 of file RuleFitParams.cxx.
|
protected |
make gradient vector
Definition at line 1375 of file RuleFitParams.cxx.
|
protected |
Makes the test gradient vector for all tau, using the same algorithm as MakeGradientVector().
Definition at line 1259 of file RuleFitParams.cxx.
|
protected |
Implementation of eq. 7.17 in the Hastie, Tibshirani & Friedman book.
This is the covariance between the estimated response yhat and the true value y. The original author notes that it is not certain this is correct. Not used.
Definition at line 925 of file RuleFitParams.cxx.
Double_t TMVA::RuleFitParams::Penalty | ( | ) | const |
This is the "lasso" penalty, to be used for regression.
Not used.
Definition at line 356 of file RuleFitParams.cxx.
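A lasso penalty is the L1 norm of the model coefficients. The sketch below sums the absolute values of rule and linear coefficients; the function name `LassoPenalty` and the choice to include both coefficient sets with equal weight are assumptions for illustration.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: L1 ("lasso") penalty over rule and linear coefficients.
double LassoPenalty(const std::vector<double>& ruleCoefs,
                    const std::vector<double>& linCoefs) {
    double sum = 0.0;
    for (double a : ruleCoefs) sum += std::fabs(a);  // rule terms
    for (double b : linCoefs)  sum += std::fabs(b);  // linear terms
    return sum;
}
```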
risk assessment
Definition at line 314 of file RuleFitParams.cxx.
risk assessment for tau model <itau>
Definition at line 334 of file RuleFitParams.cxx.
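One plausible reading of the (ind1, ind2, neff) arguments is a weighted loss summed over an inclusive event range and normalised by the effective number of events. The sketch below follows that reading with a ramp loss; the `ToyEvent` struct, the function name `ToyRisk`, and the normalisation are assumptions, not the documented behaviour.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical event record used only for this illustration.
struct ToyEvent {
    double score;   // model response F(x)
    bool isSignal;  // true: y = +1, false: y = -1
    double weight;  // event weight
};

// Hypothetical helper: weighted ramp loss summed over events
// [ind1, ind2] (inclusive) and divided by the effective event count.
double ToyRisk(const std::vector<ToyEvent>& events,
               std::size_t ind1, std::size_t ind2, double neff) {
    double sum = 0.0;
    for (std::size_t i = ind1; i <= ind2 && i < events.size(); ++i) {
        const double h = std::max(-1.0, std::min(1.0, events[i].score));
        const double diff = (events[i].isSignal ? 1.0 : -1.0) - h;
        sum += events[i].weight * diff * diff;
    }
    return (neff > 0.0) ? sum / neff : 0.0;  // guard against empty samples
}
```

With this convention, neff would be the sum of weights over the same range, matching the "sum of weights" description of fNEveEffPath and fNEveEffPerf.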
|
inline |
Definition at line 108 of file RuleFitParams.h.
|
inline |
Definition at line 109 of file RuleFitParams.h.
Definition at line 110 of file RuleFitParams.h.
UInt_t TMVA::RuleFitParams::RiskPerfTst | ( | ) |
Estimates the error rate with the current set of parameters, using the <Perf> subsample.
Returns the tau index giving the lowest error.
Definition at line 1201 of file RuleFitParams.cxx.
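Selecting the tau index with the lowest error can be sketched as an argmin over a per-tau error vector. The sketch also flags which taus stay within a scale factor of the minimum, which is one reading of the fGDErrScale and fGDErrTstOK descriptions; the struct `TauScanResult`, the function name `SelectTau`, and that flagging rule are assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical result type: best index plus one "still ok" flag per tau.
struct TauScanResult {
    std::size_t best = 0;
    std::vector<char> ok;  // 1 if err <= errScale * minimum error, else 0
};

// Hypothetical helper: pick the tau with the lowest error and flag the
// taus whose error is within errScale times that minimum.
TauScanResult SelectTau(const std::vector<double>& errPerTau, double errScale) {
    TauScanResult res;
    if (errPerTau.empty()) return res;
    const auto minIt = std::min_element(errPerTau.begin(), errPerTau.end());
    res.best = static_cast<std::size_t>(minIt - errPerTau.begin());
    const double limit = errScale * (*minIt);
    for (double e : errPerTau) {
        res.ok.push_back(e <= limit ? 1 : 0);
    }
    return res;
}
```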
|
inline |
Definition at line 85 of file RuleFitParams.h.
|
inline |
Definition at line 65 of file RuleFitParams.h.
|
inline |
Definition at line 68 of file RuleFitParams.h.
|
inline |
Definition at line 82 of file RuleFitParams.h.
|
inline |
Definition at line 86 of file RuleFitParams.h.
Definition at line 71 of file RuleFitParams.h.
|
inline |
Definition at line 79 of file RuleFitParams.h.
void TMVA::RuleFitParams::SetMsgType | ( | EMsgType | t | ) |
Definition at line 1556 of file RuleFitParams.cxx.
|
inline |
Definition at line 62 of file RuleFitParams.h.
Definition at line 1550 of file RuleFitParams.cxx.
|
protected |
Establish maximum gradient for rules, linear terms and the offset.
Definition at line 1441 of file RuleFitParams.cxx.
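The role of the maximum gradient can be illustrated with a thresholded gradient step: only coefficients whose gradient magnitude reaches tau times the largest magnitude are moved, each by the path step times its gradient. This is a sketch under assumptions. The function name `ThresholdStep` is hypothetical, the gradient is taken to point in the descent direction, and rules, linear terms and offset are collapsed into one coefficient vector.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: one thresholded gradient step.
//   coefs : coefficients, updated in place
//   grad  : gradient per coefficient (assumed to be the descent direction)
//   tau   : threshold in [0, 1]; 1 moves only the largest-gradient terms
//   step  : step size along the path
void ThresholdStep(std::vector<double>& coefs, const std::vector<double>& grad,
                   double tau, double step) {
    double maxAbs = 0.0;
    for (double g : grad) maxAbs = std::max(maxAbs, std::fabs(g));
    if (maxAbs <= 0.0) return;  // flat gradient: nothing to update
    const double threshold = tau * maxAbs;
    const std::size_t n = std::min(coefs.size(), grad.size());
    for (std::size_t k = 0; k < n; ++k) {
        if (std::fabs(grad[k]) >= threshold) coefs[k] += step * grad[k];
    }
}
```

A tau near 0 updates almost every coefficient at each step, while a tau near 1 changes only the few with the largest gradients, which tends to give sparser models.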
|
protected |
Establish maximum gradient for rules, linear terms and the offset for all taus. TODO: the index range is not needed.
Definition at line 1327 of file RuleFitParams.cxx.
|
protected |
average of each rule, same range
Definition at line 205 of file RuleFitParams.h.
|
protected |
average of each rule, same range
Definition at line 207 of file RuleFitParams.h.
|
protected |
average of each variable over the range fPathIdx1,2
Definition at line 204 of file RuleFitParams.h.
|
protected |
average of each variable over the range fPerfIdx1,2
Definition at line 206 of file RuleFitParams.h.
|
protected |
average truth, i.e. sum(y)/N, with y = +1 or -1
Definition at line 232 of file RuleFitParams.h.
|
protected |
Average of F(bkg)
Definition at line 248 of file RuleFitParams.h.
|
protected |
Rms of F(bkg)
Definition at line 249 of file RuleFitParams.h.
|
protected |
vector of F*() - filled in CalcFStar()
Definition at line 234 of file RuleFitParams.h.
|
protected |
median value of F*() using
Definition at line 235 of file RuleFitParams.h.
|
protected |
linear coeffs - one per tau
Definition at line 218 of file RuleFitParams.h.
|
protected |
rule coeffs - one per tau
Definition at line 217 of file RuleFitParams.h.
|
protected |
stop scan at error = scale*errmin
Definition at line 230 of file RuleFitParams.h.
|
protected |
error rates per tau
Definition at line 215 of file RuleFitParams.h.
|
protected |
error rate is sufficiently low (stores a boolean)
Definition at line 216 of file RuleFitParams.h.
|
protected |
number of path steps
Definition at line 229 of file RuleFitParams.h.
|
protected |
number of tau-paths - calculated in SetGDTauPrec
Definition at line 222 of file RuleFitParams.h.
|
protected |
number of tau in the test-phase that are ok
Definition at line 221 of file RuleFitParams.h.
|
protected |
Gradient path ntuple, contains params for each step along the path.
Definition at line 237 of file RuleFitParams.h.
|
protected |
offset per tau
Definition at line 219 of file RuleFitParams.h.
|
protected |
step size along path (delta nu in eq 22, ref 1)
Definition at line 228 of file RuleFitParams.h.
|
protected |
selected threshold parameter (tau in eq 26, ref 1)
Definition at line 227 of file RuleFitParams.h.
|
protected |
max threshold parameter (tau in eq 26, ref 1)
Definition at line 226 of file RuleFitParams.h.
|
protected |
min threshold parameter (tau in eq 26, ref 1)
Definition at line 225 of file RuleFitParams.h.
|
protected |
precision in tau
Definition at line 223 of file RuleFitParams.h.
|
protected |
number scan for tau-paths
Definition at line 224 of file RuleFitParams.h.
|
protected |
the tau's
Definition at line 220 of file RuleFitParams.h.
|
protected |
gradient vector - dimension = number of rules in ensemble
Definition at line 209 of file RuleFitParams.h.
|
protected |
gradient vector - dimension = number of variables
Definition at line 210 of file RuleFitParams.h.
|
protected |
gradient vector, linear terms - one per tau
Definition at line 213 of file RuleFitParams.h.
|
protected |
gradient vector - one per tau
Definition at line 212 of file RuleFitParams.h.
|
mutableprivate |
message logger (transient: not written by ROOT I/O)
Definition at line 253 of file RuleFitParams.h.
|
protected |
sum of weights for Path events
Definition at line 201 of file RuleFitParams.h.
|
protected |
idem for Perf events
Definition at line 202 of file RuleFitParams.h.
|
protected |
number of linear terms
Definition at line 192 of file RuleFitParams.h.
|
protected |
number of rules
Definition at line 191 of file RuleFitParams.h.
|
protected |
GD path: rule coefficients.
Definition at line 243 of file RuleFitParams.h.
|
protected |
GD path: 'radius' of all rulecoeffs.
Definition at line 241 of file RuleFitParams.h.
|
protected |
GD path: error rate (or performance)
Definition at line 239 of file RuleFitParams.h.
|
protected |
GD path: linear coefficients.
Definition at line 244 of file RuleFitParams.h.
|
protected |
GD path: value of nu.
Definition at line 240 of file RuleFitParams.h.
|
protected |
GD path: model offset.
Definition at line 242 of file RuleFitParams.h.
|
protected |
GD path: risk.
Definition at line 238 of file RuleFitParams.h.
|
protected |
first event index for path search
Definition at line 197 of file RuleFitParams.h.
|
protected |
last event index for path search
Definition at line 198 of file RuleFitParams.h.
|
protected |
first event index for performance evaluation
Definition at line 199 of file RuleFitParams.h.
|
protected |
last event index for performance evaluation
Definition at line 200 of file RuleFitParams.h.
|
protected |
rule ensemble
Definition at line 189 of file RuleFitParams.h.
|
protected |
rule fit
Definition at line 188 of file RuleFitParams.h.
|
protected |
Average of the current signal score function F(sig)
Definition at line 246 of file RuleFitParams.h.
|
protected |
Rms of F(sig)
Definition at line 247 of file RuleFitParams.h.