Evaluation metrics for classification tasks

I. Binary classification

1.1 confusion matrix
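As a minimal sketch, the four cells of a binary confusion matrix can be counted directly from the labels: TP (actual positive, predicted positive), FP (actual negative, predicted positive), FN (actual positive, predicted negative), and TN (actual negative, predicted negative). The function name `confusion_counts` and the toy labels are assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for binary labels."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif t == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


# Hypothetical toy labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(confusion_counts(y_true, y_pred))  # (3, 1, 1, 3)
```

All of the binary metrics in this section can be written in terms of these four counts.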

1.2 accuracy

accuracy is the fraction of all samples that are classified correctly: $accuracy=\frac{TP+TN}{TP+TN+FP+FN}$.
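A short sketch of the computation, assuming labels are given as plain Python lists; the helper name `accuracy` and the data are illustrative only. The second call shows why accuracy alone can mislead on imbalanced data: predicting the majority class for every sample still scores highly.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical toy labels: 6 of the 8 predictions are correct.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(accuracy(y_true, y_pred))  # 0.75

# Imbalanced case: 95 negatives, 5 positives, everything predicted negative.
print(accuracy([0] * 95 + [1] * 5, [0] * 100))  # 0.95
```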

1.3 precision

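precision is the number of correctly predicted positive samples divided by the number of all samples predicted positive, that is, how often the system is right when it says "positive": $precision=\frac{TP}{TP+FP}$. Informally: when a doctor examines patients and diagnoses some of them as sick, precision is the fraction of those diagnoses that are correct.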

1.4 recall

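recall is the number of correctly predicted positive samples divided by the number of actual positive samples, that is, the fraction of positives that are retrieved: $recall=\frac{TP}{TP+FN}$. Informally: given a group of people who are actually sick, recall is the fraction of them that the doctor identifies as sick.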
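The doctor analogy can be made concrete with a small sketch. The numbers are hypothetical: 10 people are actually sick, the doctor diagnoses 8 people as sick, and 6 of those 8 really are. Returning 0.0 when a denominator is zero is a convention assumed here, not something the definitions require.

```python
def precision(tp, fp):
    """TP / (TP + FP); 0.0 by convention when nothing was predicted positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """TP / (TP + FN); 0.0 by convention when there are no actual positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts: 6 correct diagnoses, 2 false alarms, 4 missed patients.
tp, fp, fn = 6, 2, 4
print(precision(tp, fp))  # 0.75 -> 6 of the 8 diagnoses were right
print(recall(tp, fn))     # 0.6  -> 6 of the 10 sick patients were found
```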

1.5 F1

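precision and recall often pull in opposite directions, so F1 is introduced to take both into account. It is the harmonic mean of precision and recall:

$F_1=\frac{2\cdot precision\cdot recall}{precision+recall}$

If either $precision$ or $recall$ is 0, $F_1$ is 0 (taken as 0 by convention when the denominator vanishes).

The harmonic mean is used because it is dominated by the smaller of the two values, so $F_1$ is high only when both precision and recall are high. This makes it better suited to evaluating classifiers on imbalanced data.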

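The general form of the F score:

$F_\beta=(1+\beta^2)\cdot\frac{precision\cdot recall}{\beta^2\cdot precision+recall}$

Besides the $F_1$ score, the $F_2$ and $F_{0.5}$ scores are also widely used in statistics. $F_2$ weights recall more heavily than precision, whereas $F_{0.5}$ weights precision more heavily than recall.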
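The effect of the weight can be checked numerically. The sketch below uses the standard $F_\beta$ definition (written out in the code comment); the function name `f_beta` and the sample values are assumptions for illustration.

```python
def f_beta(p, r, beta=1.0):
    """F_beta = (1 + beta^2) * p * r / (beta^2 * p + r); 0.0 if the denominator is 0."""
    b2 = beta * beta
    denom = b2 * p + r
    return (1 + b2) * p * r / denom if denom else 0.0


# Hypothetical classifier with low precision and high recall.
p, r = 0.5, 1.0
for beta in (0.5, 1.0, 2.0):
    print(beta, round(f_beta(p, r, beta), 4))
```

Because recall is the stronger of the two values in this example, the score rises as beta grows: $F_{0.5} < F_1 < F_2$. With the values swapped (high precision, low recall) the ordering reverses.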

1.6 ROC

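ROC curve: receiver operating characteristic. Each point on the ROC curve is the (FPR, TPR) pair obtained at one decision threshold.

Horizontal axis: $FPR$, the false positive rate, is the fraction of all actual negatives that are predicted positive: $FPR=\frac{FP}{TN+FP}$.

Vertical axis: $TPR$, the true positive rate, is the fraction of all actual positives that are predicted positive: $TPR=\frac{TP}{TP+FN}$.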
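To see how thresholds map to points on the curve, here is a minimal sketch that computes one (FPR, TPR) pair per threshold. It assumes a sample is predicted positive when its score is greater than or equal to the threshold, and that the labels contain at least one positive and one negative; `roc_point` and the scores are hypothetical.

```python
def roc_point(y_true, scores, threshold):
    """Return (FPR, TPR) when samples with score >= threshold are predicted positive."""
    tp = fp = fn = tn = 0
    for t, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and t == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif t == 1:
            fn += 1
        else:
            tn += 1
    # Assumes y_true contains at least one positive and one negative.
    return fp / (fp + tn), tp / (tp + fn)


# Hypothetical labels and classifier scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
for thr in (0.9, 0.8, 0.4, 0.35, 0.1):
    print(thr, roc_point(y_true, scores, thr))
```

Lowering the threshold moves the point from (0, 0) toward (1, 1); connecting the points gives the ROC curve.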

1.7 AUC

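$AUC$ (Area Under Curve): the area under the ROC curve. It summarizes classifier quality in a single number, and larger is better. The value lies in [0, 1]: 1 is a perfect classifier and 0.5 corresponds to random guessing. For binary classification a useful classifier therefore scores between 0.5 and 1, because a classifier that scores below 0.5 can be turned into one that scores above 0.5 by inverting its predictions.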
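The area under the ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes AUC that way rather than by integrating the curve; `auc_pairwise` and the data are illustrative assumptions, and the quadratic loop is only meant for small inputs.

```python
def auc_pairwise(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical labels and classifier scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc_pairwise(y_true, scores))                # 0.75
print(auc_pairwise(y_true, [-s for s in scores]))  # 0.25
```

Negating the scores reverses the ranking and gives 1 minus the original AUC, which is why a value below 0.5 indicates a ranking that is worse than random.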

II. Multi-class classification

2.1 confusion matrix

2.2 accuracy

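2.3 precision, recall, F1 for a single class

The formulas are the same as in the binary case, treating the class in question as positive and all other classes as negative.

2.4 precision, recall, F1 for the whole system

The system-level precision, recall, and $F_1$ must take every class into account, that is, the precision, recall, and $F_1$ of the cat, dog, and pig classes together. There are several options: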

2.4.1 Macro average

2.4.2 Weighted average

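A generalization of the macro average: instead of a plain mean, each class's metric is weighted by that class's share of the true samples.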

2.4.3 Micro average
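The three averaging schemes can be compared side by side. The sketch below computes them for precision only (recall and $F_1$ follow the same pattern), using their standard definitions: macro is the unweighted mean of the per-class values, weighted uses each class's number of true samples as the weight, and micro pools TP and FP over all classes before dividing. The helper names and the cat/dog/pig labels are made up for illustration.

```python
from collections import Counter


def per_class_counts(y_true, y_pred, labels):
    """Return {label: (TP, FP, FN)}, treating each label in turn as the positive class."""
    pairs = list(zip(y_true, y_pred))
    counts = {}
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        counts[c] = (tp, fp, fn)
    return counts


def averaged_precision(y_true, y_pred, labels):
    """Return (macro, weighted, micro) precision."""
    counts = per_class_counts(y_true, y_pred, labels)
    support = Counter(y_true)  # number of true samples per class
    per_class = {c: (tp / (tp + fp) if (tp + fp) else 0.0)
                 for c, (tp, fp, fn) in counts.items()}
    macro = sum(per_class[c] for c in labels) / len(labels)
    weighted = sum(per_class[c] * support[c] for c in labels) / len(y_true)
    total_tp = sum(tp for tp, fp, fn in counts.values())
    total_pred = sum(tp + fp for tp, fp, fn in counts.values())
    micro = total_tp / total_pred if total_pred else 0.0
    return macro, weighted, micro


# Hypothetical labels: 4 cats, 2 dogs, 2 pigs; one dog is misclassified as a cat.
labels = ["cat", "dog", "pig"]
y_true = ["cat"] * 4 + ["dog"] * 2 + ["pig"] * 2
y_pred = ["cat"] * 4 + ["cat", "dog"] + ["pig"] * 2
macro, weighted, micro = averaged_precision(y_true, y_pred, labels)
print(round(macro, 4), round(weighted, 4), round(micro, 4))
```

Here the per-class precisions are 0.8 (cat), 1.0 (dog), and 1.0 (pig), so the three averages come out differently. When every sample has exactly one label, micro-averaged precision coincides with accuracy.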

2.5 ROC

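For a multi-class classifier, the overall ROC is the micro- or macro-average curve in the plot produced by the code in Section III, and the other three curves in that plot describe the individual classes. As in the binary case, each point on a multi-class ROC curve is the (FPR, TPR) pair at one threshold.

There are several ways to compute $FPR$ and $TPR$ in the multi-class case: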

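a. micro average: binarize the labels one-vs-rest, flatten the label matrix and the score matrix so that every (sample, class) entry counts as one binary decision, and compute a single ROC curve from them.

b. macro average: compute one ROC curve per class, interpolate each curve onto a common set of $FPR$ values, and average the $TPR$ values across classes.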

2.6 AUC

$AUC$ is still the area under the ROC curve. For multi-class problems, in my view, its range is [0, 1].

III. Code

accuracy, precision, recall, F1

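from sklearn.metrics import precision_recall_fscore_support, accuracy_score


def eval_acc_f1(y_true, y_pred):
    """Return accuracy and the macro-averaged (precision, recall, F1, support) tuple."""
    acc = accuracy_score(y_true, y_pred)
    # With average="macro", the support entry of the returned tuple is None.
    prf = precision_recall_fscore_support(y_true, y_pred, average="macro")
    return acc, prf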

ROC and AUC

# Import the required libraries
import numpy as np
import matplotlib.pyplot as plt
from itertools import cycle
from sklearn import svm, datasets
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize
from sklearn.multiclass import OneVsRestClassifier
# scipy.interp is deprecated and missing from recent SciPy releases; np.interp is used below instead

# Load the data
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Binarize the labels (one column per class)
y = label_binarize(y, classes=[0, 1, 2])
# Number of classes
n_classes = y.shape[1]

# Train the model and predict
random_state = np.random.RandomState(0)
n_samples, n_features = X.shape

# Shuffle and split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5, random_state=0)

# Learn to predict each class against the others
classifier = OneVsRestClassifier(svm.SVC(kernel='linear', probability=True,
                                         random_state=random_state))
y_score = classifier.fit(X_train, y_train).decision_function(X_test)

# Compute the ROC curve and AUC for each class
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])

# Compute micro-average ROC curve and ROC area (option a in Section 2.5)
fpr["micro"], tpr["micro"], _ = roc_curve(y_test.ravel(), y_score.ravel())
roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])

# Compute macro-average ROC curve and ROC area (option b in Section 2.5)
# First aggregate all false positive rates
all_fpr = np.unique(np.concatenate([fpr[i] for i in range(n_classes)]))
# Then interpolate all ROC curves at these points
mean_tpr = np.zeros_like(all_fpr)
for i in range(n_classes):
    mean_tpr += np.interp(all_fpr, fpr[i], tpr[i])
# Finally average it and compute AUC
mean_tpr /= n_classes
fpr["macro"] = all_fpr
tpr["macro"] = mean_tpr
roc_auc["macro"] = auc(fpr["macro"], tpr["macro"])

# Plot all ROC curves
lw = 2
plt.figure()
plt.plot(fpr["micro"], tpr["micro"],
         label='micro-average ROC curve (area = {0:0.2f})'
               ''.format(roc_auc["micro"]),
         color='deeppink', linestyle=':', linewidth=4)

plt.plot(fpr["macro"], tpr["macro"],
         label='macro-average ROC curve (area = {0:0.2f})'
               ''.format(roc_auc["macro"]),
         color='navy', linestyle=':', linewidth=4)

colors = cycle(['aqua', 'darkorange', 'cornflowerblue'])
for i, color in zip(range(n_classes), colors):
    plt.plot(fpr[i], tpr[i], color=color, lw=lw,
             label='ROC curve of class {0} (area = {1:0.2f})'
                   ''.format(i, roc_auc[i]))

# Diagonal reference line: the ROC of random guessing
plt.plot([0, 1], [0, 1], 'k--', lw=lw)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Some extension of Receiver operating characteristic to multi-class')
plt.legend(loc="lower right")
plt.show()

References:

https://blog.csdn.net/Orange_Spotty_Cat/article/details/80520839

https://zhuanlan.zhihu.com/p/147663370

https://zhuanlan.zhihu.com/p/81202617

https://zhuanlan.zhihu.com/p/266386193

