Evaluation Metrics in Machine Learning
This section gives brief introductions to evaluation metrics as they come up, focusing on the most important points; other details can be found in the relevant literature and blog posts.
1. About P and R Values
These are probably the two most commonly used evaluation metrics in machine learning. Both are computed from the confusion matrix, whose simplest (binary) form is shown below.
| | Predicted Positive | Predicted Negative |
|---|---|---|
| Truth: Positive | TP | FN |
| Truth: Negative | FP | TN |
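In each of the four two-letter cell names, the second letter (P or N) is the predicted label, and the first letter (T or F) records whether that prediction agrees with the true label: T (True) if they are the same, F (False) if they differ. For example, FN is a sample predicted negative whose true label is positive.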
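The naming rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `confusion_counts` and the encoding of labels as 1 (positive) and 0 (negative) are assumptions made for the example, not something fixed by the text.

```python
def confusion_counts(y_true, y_pred):
    """Count (TP, FN, FP, TN) for binary labels.

    Assumed encoding for this sketch: 1 = positive, 0 = negative.
    """
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # predicted positive, and the prediction is correct
        elif truth == 1 and pred == 0:
            fn += 1  # predicted negative, but the prediction is wrong
        elif truth == 0 and pred == 1:
            fp += 1  # predicted positive, but the prediction is wrong
        else:
            tn += 1  # predicted negative, and the prediction is correct
    return tp, fn, fp, tn


# Made-up labels, purely for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(confusion_counts(y_true, y_pred))  # (3, 1, 1, 3)
```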
Based on this table, various metrics can be calculated, with the most commonly used being P (Precision), R (Recall), and accuracy. The formulas for calculating them are as follows:
P = TP / (TP + FP)
R = TP / (TP + FN)
Accuracy = (TP + TN) / (TP + FN + FP + TN)
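As a rough sketch, the three formulas translate directly into code. The function name `precision_recall_accuracy` and the counts used below are made up for illustration, and returning 0.0 when a denominator is zero is one common convention that is assumed here rather than taken from the text.

```python
def precision_recall_accuracy(tp, fn, fp, tn):
    """Compute (P, R, accuracy) from the four confusion-matrix counts."""
    # Assumption: return 0.0 when a denominator is zero (e.g. nothing was predicted positive).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Hypothetical counts on an imbalanced data set: 10 positives, 90 negatives.
p, r, acc = precision_recall_accuracy(tp=8, fn=2, fp=4, tn=86)
print(round(p, 3), round(r, 3), round(acc, 3))  # 0.667 0.8 0.94
```

With these made-up counts, accuracy looks high mostly because negatives dominate, while P and R describe how well the positive class is handled.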
The meanings of these metrics are relatively easy to understand, so I won't elaborate further.
- Calculation of F Value
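The F value is the weighted harmonic mean of P and R; F1 is the case in which the two are weighted equally. How can we understand it, and why use this form? A harmonic mean is pulled toward the smaller of its inputs, so F1 is high only when both P and R are high, whereas an arithmetic mean would let one high value mask a poor one. If we look closely, the commonly used F1 formula also has the same shape as the formula for two resistors in parallel, R1R2 / (R1 + R2), scaled by a factor of 2; as with parallel resistors, the result is dominated by the smaller value:
F1 = 2PR / (P + R)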
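A minimal sketch of the F value in code, using the standard weighted form F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), which reduces to F1 when beta = 1. The function name `f_beta` and the example values are illustrative assumptions, as is the choice to return 0.0 when the denominator is zero.

```python
def f_beta(p, r, beta=1.0):
    """Weighted harmonic mean of precision p and recall r (beta=1 gives F1)."""
    b2 = beta ** 2
    denom = b2 * p + r
    if denom == 0:
        return 0.0  # assumed convention for this sketch when the score is undefined
    return (1 + b2) * p * r / denom


# Made-up example: one metric high, the other poor.
p, r = 0.9, 0.1
print(round(f_beta(p, r), 3))   # 0.18, pulled toward the smaller value
print(round((p + r) / 2, 3))    # 0.5, the arithmetic mean hides the poor recall
```

In this form, a beta greater than 1 weights recall more heavily and a beta less than 1 weights precision more heavily.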
- Others
Many other metrics and curves are derived from these statistics to describe different aspects of performance. Common ones include the PR curve and the ROC curve (together with AUC, the area under it); they are not complex and can be looked up when needed.