Implemented Performance Measures
The following tables list, in alphabetical order, the performance measures available for each type of learning problem, as well as general performance measures. (See also the documentation about measures and makeMeasure for available measures and their properties.)
Column Minimize indicates whether the measure is minimized during, e.g., tuning or feature selection. Best and Worst show the best and worst values the performance measure can attain. For classification, column MultiClass indicates whether a measure is suitable for multi-class problems; if not, the measure can only be used for binary classification problems.
The next six columns refer to information required to calculate the performance measure.
- Prediction: The Prediction object.
- Truth: The true values of the response variable(s) (for supervised learning).
- Probs: The predicted probabilities (might be needed for classification).
- Model: The WrappedModel (e.g., for calculating the training time).
- Task: The Task (relevant for cost-sensitive classification).
- Feats: The predicted data (relevant for clustering).
Aggregation shows the default aggregation method tied to the measure.
Classification
| Measure | Note | Minimize | Best | Worst | MultiClass | Prediction | Truth | Probs | Model | Task | Feats | Aggregation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| acc - Accuracy | | | 1 | 0 | X | X | X | | | | | test.mean |
| auc - Area under the curve | | | 1 | 0 | | X | X | X | | | | test.mean |
| bac - Balanced accuracy | Mean of true positive rate and true negative rate. | | 1 | 0 | | X | X | | | | | test.mean |
| ber - Balanced error rate | Mean of misclassification error rates on all individual classes. | X | 0 | 1 | X | X | X | | | | | test.mean |
| brier - Brier score | | X | 0 | 1 | | X | X | X | | | | test.mean |
| f1 - F1 measure | | | 1 | 0 | | X | X | | | | | test.mean |
| fdr - False discovery rate | | X | 0 | 1 | | X | X | | | | | test.mean |
| fn - False negatives | Also called misses. | X | 0 | Inf | | X | X | | | | | test.mean |
| fnr - False negative rate | | X | 0 | 1 | | X | X | | | | | test.mean |
| fp - False positives | Also called false alarms. | X | 0 | Inf | | X | X | | | | | test.mean |
| fpr - False positive rate | Also called false alarm rate or fall-out. | X | 0 | 1 | | X | X | | | | | test.mean |
| gmean - G-mean | Geometric mean of recall and specificity. | | 1 | 0 | | X | X | | | | | test.mean |
| gpr - Geometric mean of precision and recall | | | 1 | 0 | | X | X | | | | | test.mean |
| mcc - Matthews correlation coefficient | | | 1 | -1 | | X | X | | | | | test.mean |
| mmce - Mean misclassification error | | X | 0 | 1 | X | X | X | | | | | test.mean |
| multiclass.auc - Multiclass area under the curve | Calls pROC::multiclass.roc. | | 1 | 0 | X | X | X | X | | | | test.mean |
| npv - Negative predictive value | | | 1 | 0 | | X | X | | | | | test.mean |
| ppv - Positive predictive value | Also called precision. | | 1 | 0 | | X | X | | | | | test.mean |
| tn - True negatives | Also called correct rejections. | | Inf | 0 | | X | X | | | | | test.mean |
| tnr - True negative rate | Also called specificity. | | 1 | 0 | | X | X | | | | | test.mean |
| tp - True positives | | | Inf | 0 | | X | X | | | | | test.mean |
| tpr - True positive rate | Also called hit rate or recall. | | 1 | 0 | | X | X | | | | | test.mean |
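To make the binary definitions above concrete, here is a small Python sketch (not mlr code; the function name `binary_measures` is made up for illustration) that computes several of the tabulated measures from the four confusion-matrix counts:

```python
import math

def binary_measures(y_true, y_pred, positive=1):
    """Compute several binary classification measures from true labels
    and hard predictions, following the definitions in the table."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    n = tp + tn + fp + fn
    tpr = tp / (tp + fn)  # true positive rate (hit rate, recall)
    tnr = tn / (tn + fp)  # true negative rate (specificity)
    ppv = tp / (tp + fp)  # positive predictive value (precision)
    return {
        "acc": (tp + tn) / n,
        "mmce": (fp + fn) / n,
        "tpr": tpr,
        "tnr": tnr,
        "ppv": ppv,
        "bac": (tpr + tnr) / 2,         # mean of TPR and TNR
        "gmean": math.sqrt(tpr * tnr),  # geometric mean of recall and specificity
        "f1": 2 * ppv * tpr / (ppv + tpr),
        "mcc": (tp * tn - fp * fn) / math.sqrt(
            (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Example: 6 observations, positive class = 1
m = binary_measures([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
print(m["acc"], m["bac"], m["mcc"])  # → 0.666..., 0.666..., 0.333...
```

Note that `bac` can differ substantially from `acc` on imbalanced data, which is why it appears separately in the table.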
Regression
| Measure | Note | Minimize | Best | Worst | Prediction | Truth | Probs | Model | Task | Feats | Aggregation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| mae - Mean of absolute errors | | X | 0 | Inf | X | X | | | | | test.mean |
| medae - Median of absolute errors | | X | 0 | Inf | X | X | | | | | test.mean |
| medse - Median of squared errors | | X | 0 | Inf | X | X | | | | | test.mean |
| mse - Mean of squared errors | | X | 0 | Inf | X | X | | | | | test.mean |
| rmse - Root mean square error | | X | 0 | Inf | X | X | | | | | test.sqrt.of.mean |
| sae - Sum of absolute errors | | X | 0 | Inf | X | X | | | | | test.mean |
| sse - Sum of squared errors | | X | 0 | Inf | X | X | | | | | test.mean |
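The regression measures are all simple functions of the residuals. A short Python sketch (illustrative only; `regression_measures` is a made-up helper, not part of mlr) showing all seven, plus the reason rmse uses the `test.sqrt.of.mean` aggregation:

```python
import math
import statistics

def regression_measures(y_true, y_pred):
    """Residual-based regression measures from the table above."""
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    sq_err = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return {
        "mae": statistics.mean(abs_err),
        "medae": statistics.median(abs_err),
        "medse": statistics.median(sq_err),
        "mse": statistics.mean(sq_err),
        "rmse": math.sqrt(statistics.mean(sq_err)),
        "sae": sum(abs_err),
        "sse": sum(sq_err),
    }

m = regression_measures([3.0, 0.0, 2.0], [2.5, 0.5, 2.0])

# test.sqrt.of.mean aggregates rmse across resampling folds as the
# square root of the mean per-fold MSE, not the mean of per-fold RMSEs,
# so the aggregate matches the RMSE over the pooled squared errors.
fold_mse = [0.25, 0.16, 0.09]  # hypothetical per-fold MSE values
rmse_agg = math.sqrt(statistics.mean(fold_mse))
```

The sqrt-of-mean distinction matters because the mean of per-fold RMSEs would, by Jensen's inequality, generally understate the pooled RMSE.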
Survival analysis
| Measure | Note | Minimize | Best | Worst | Prediction | Truth | Probs | Model | Task | Feats | Aggregation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| cindex - Concordance index | | | 1 | 0 | X | X | | | | | test.mean |
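The concordance index measures how well the predicted risk ordering agrees with the observed survival times. A naive O(n²) Python sketch of the idea (illustrative only, not mlr's implementation, which handles ties and censoring with more care):

```python
def concordance_index(times, events, risk):
    """Naive concordance index. A pair (i, j) is comparable when
    subject i has the shorter observed time and experienced the event
    (events[i] == 1). A comparable pair is concordant when the
    shorter-lived subject also has the higher predicted risk;
    ties in risk count as half-concordant."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

# Perfectly concordant: highest risk dies first → cindex = 1.0
c = concordance_index([2, 4, 6], [1, 1, 1], [0.9, 0.5, 0.1])
```

Censored observations (event indicator 0) never serve as the shorter-lived member of a pair, which is why cindex needs the true event/time information, not just the predictions.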
Cluster analysis
| Measure | Note | Minimize | Best | Worst | Prediction | Truth | Probs | Model | Task | Feats | Aggregation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| db - Davies-Bouldin cluster separation measure | See ?clusterSim::index.DB. | X | 0 | Inf | X | | | | | X | test.mean |
| dunn - Dunn index | See ?clValid::dunn. | | Inf | 0 | X | | | | | X | test.mean |
| G1 - Calinski-Harabasz pseudo F statistic | See ?clusterSim::index.G1. | | Inf | 0 | X | | | | | X | test.mean |
| G2 - Baker and Hubert adaptation of Goodman-Kruskal's gamma statistic | See ?clusterSim::index.G2. | | Inf | 0 | X | | | | | X | test.mean |
| silhouette - Rousseeuw's silhouette internal cluster quality index | See ?clusterSim::index.S. | | Inf | 0 | X | | | | | X | test.mean |
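These internal cluster indices all relate within-cluster compactness to between-cluster separation, which is why they need both the predicted cluster labels and the feature data (Feats). As one example, a minimal Python sketch of the Dunn index (illustrative only; the R packages referenced above are the actual implementations mlr calls):

```python
import math
from itertools import combinations

def dunn_index(points, labels):
    """Dunn index: smallest distance between points in different
    clusters divided by the largest within-cluster diameter.
    Larger values indicate better-separated, more compact clusters."""
    pairs = list(combinations(range(len(points)), 2))
    inter = min(math.dist(points[i], points[j])
                for i, j in pairs if labels[i] != labels[j])
    intra = max(math.dist(points[i], points[j])
                for i, j in pairs if labels[i] == labels[j])
    return inter / intra

# Two tight clusters far apart → large Dunn index
d = dunn_index([(0, 0), (0, 1), (5, 0), (5, 1)], [0, 0, 1, 1])
```

Note the sign conventions in the table: db is minimized, while dunn, G1, G2, and silhouette are maximized.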
Cost-sensitive classification
| Measure | Note | Minimize | Best | Worst | Prediction | Truth | Probs | Model | Task | Feats | Aggregation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| mcp - Misclassification penalty | Average difference between costs of oracle and model prediction. | X | 0 | Inf | X | | | | X | | test.mean |
| meancosts - Mean costs of the predicted choices | X | 0 | Inf | X | | | | X | | test.mean |
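In cost-sensitive classification each observation carries its own vector of class-specific costs, which is why these measures need the Task (where the costs live) in addition to the prediction. A small Python sketch of both measures (illustrative only; `cost_measures` is a made-up helper, not mlr's API):

```python
def cost_measures(costs, pred):
    """costs: one row per observation, one cost entry per class;
    pred: index of the predicted class for each observation.
    meancosts = average cost of the predicted choices.
    mcp = meancosts minus the average cost of the oracle, which
    always picks the minimum-cost class for each observation."""
    n = len(costs)
    meancosts = sum(row[p] for row, p in zip(costs, pred)) / n
    oracle = sum(min(row) for row in costs) / n
    return {"meancosts": meancosts, "mcp": meancosts - oracle}

# 3 observations, 2 classes; the last prediction picks the costlier class
m = cost_measures([[0, 5], [2, 0], [1, 3]], [0, 0, 1])
```

mcp is 0 exactly when the model matches the oracle everywhere, so unlike meancosts it has a fixed best value even when no zero-cost choice exists.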
General performance measures
| Measure | Note | Minimize | Best | Worst | Prediction | Truth | Probs | Model | Task | Feats | Aggregation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| featperc - Percentage of original features used for model | Useful for feature selection. | X | 0 | 1 | X | | | X | | | test.mean |
| timeboth - timetrain + timepredict | | X | 0 | Inf | X | | | X | | | test.mean |
| timepredict - Time of predicting test set | | X | 0 | Inf | X | | | | | | test.mean |
| timetrain - Time of fitting the model | | X | 0 | Inf | | | | X | | | test.mean |