Published 2019-08-07 12:40
ML model evaluation metrics: a collection of model-evaluation functions (scoring/metrics) across machine-learning frameworks (sklearn/TF) (code only)
ML loss functions: evaluation metrics for regression problems (common error measures: MSE/RMSE/MAE), with usage notes, code implementations, and worked examples
(1) The MSE function
```python
import numpy as np

def mean_squared_error(y, t):
    # half the sum of squared errors (a common neural-network loss form)
    return 0.5 * np.sum((y - t) ** 2)
```
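For a standalone sanity check of this function (the definition is repeated so the snippet runs on its own; the prediction/target vectors below are made up for illustration):

```python
import numpy as np

def mean_squared_error(y, t):
    # half the sum of squared errors (a common neural-network loss form)
    return 0.5 * np.sum((y - t) ** 2)

# softmax-like output vs. a one-hot target (class 2 is the correct class)
y = np.array([0.1, 0.05, 0.6, 0.0, 0.05, 0.1, 0.0, 0.1, 0.0, 0.0])
t = np.array([0, 0, 1, 0, 0, 0, 0, 0, 0, 0])
print(mean_squared_error(y, t))  # ≈ 0.0975
```

Note that because of the 0.5 factor and the sum (rather than mean), this is half the total squared error, not the statistical MSE computed later in this article.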
(2) Worked example: compute MSE, RMSE, and MAE by hand, and compare the MSE with the target variance.
```python
# Worked example: compute MSE, RMSE, and MAE by hand; compare MSE with the target variance
target = [1.5, 2.1, 3.3, -4.7, -2.3, 0.75]
prediction = [0.5, 1.5, 2.1, -2.2, 0.1, -0.5]

# collect the per-sample errors with a for loop
error = []
for i in range(len(target)):
    error.append(target[i] - prediction[i])
print("Errors ", error)


# compute MSE by hand:
# loop over the errors to calculate each element's squared error and absolute error
squaredError = []
absError = []
for val in error:
    squaredError.append(val * val)
    absError.append(abs(val))
print("Squared Error", squaredError)
print("Absolute Value of Error", absError)
print("MSE = ", sum(squaredError) / len(squaredError))  # mean of the squared errors


# compute RMSE and MAE by hand
from math import sqrt
print("RMSE = ", sqrt(sum(squaredError) / len(squaredError)))
print("MAE = ", sum(absError) / len(absError))


# compare the MSE with the target variance
targetDeviation = []
targetMean = sum(target) / len(target)
for val in target:
    targetDeviation.append((val - targetMean) * (val - targetMean))
print("Target Variance = ", sum(targetDeviation) / len(targetDeviation))  # variance of the targets
print("Target Standard Deviation = ", sqrt(sum(targetDeviation) / len(targetDeviation)))  # standard deviation (square root of the variance)
```
(1) The cross_entropy_error() function
```python
import numpy as np

def cross_entropy_error(y, t):
    if y.ndim == 1:
        t = t.reshape(1, t.size)
        y = y.reshape(1, y.size)
    # if the labels are one-hot vectors, convert them to class-index form
    if t.size == y.size:
        t = t.argmax(axis=1)
    batch_size = y.shape[0]
    # 1e-7 guards against log(0); average the per-sample losses over the batch
    return -np.sum(np.log(y[np.arange(batch_size), t] + 1e-7)) / batch_size
```
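The key step in this function is the fancy-indexing expression `y[np.arange(batch_size), t]`, which picks out each sample's predicted probability for its true class. A minimal sketch of that mechanism, with a made-up batch of two softmax outputs:

```python
import numpy as np

# softmax outputs for a batch of 2 samples over 3 classes (illustrative values)
y = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.1, 0.8]])
t_onehot = np.array([[1, 0, 0],
                     [0, 0, 1]])
t_idx = t_onehot.argmax(axis=1)           # one-hot -> class indices: [0, 2]

batch_size = y.shape[0]
# pick each sample's predicted probability for its true class: [0.7, 0.8]
picked = y[np.arange(batch_size), t_idx]
loss = -np.sum(np.log(picked + 1e-7)) / batch_size
print(loss)  # ≈ 0.2899
```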
explained_variance_score(): explained variance score
```python
# explained_variance_score(): explained variance score
from sklearn.metrics import explained_variance_score
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
EVS01 = explained_variance_score(y_true, y_pred)
print(EVS01)

y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]
EVS02 = explained_variance_score(y_true, y_pred, multioutput='raw_values')
print(EVS02)
EVS03 = explained_variance_score(y_true, y_pred, multioutput=[0.3, 0.7])
print(EVS03)
```
mean_absolute_error()
```python
# mean_absolute_error(): mean absolute error
from sklearn.metrics import mean_absolute_error
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
MAE01 = mean_absolute_error(y_true, y_pred)
print(MAE01)

y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]
MAE02 = mean_absolute_error(y_true, y_pred)
print(MAE02)
MAE03 = mean_absolute_error(y_true, y_pred, multioutput='raw_values')
MAE04 = mean_absolute_error(y_true, y_pred, multioutput=[0.3, 0.7])
print(MAE03)
print(MAE04)
```
mean_squared_error()
```python
# mean_squared_error(): mean squared error
from sklearn.metrics import mean_squared_error
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
MSE01 = mean_squared_error(y_true, y_pred)
print(MSE01)

y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]
MSE02 = mean_squared_error(y_true, y_pred)
print(MSE02)

# RMSE (root mean squared error): the square root of the MSE
from sklearn.metrics import mean_squared_error
import numpy as np

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
MSE01 = mean_squared_error(y_true, y_pred)
RMSE01 = np.sqrt(MSE01)
print(RMSE01)
```
mean_squared_log_error(): mean squared logarithmic error
```python
# mean_squared_log_error(): mean squared logarithmic error
from sklearn.metrics import mean_squared_log_error
y_true = [3, 5, 2.5, 7]
y_pred = [2.5, 5, 4, 8]
MSLE01 = mean_squared_log_error(y_true, y_pred)
print(MSLE01)

y_true = [[0.5, 1], [1, 2], [7, 6]]
y_pred = [[0.5, 2], [1, 2.5], [8, 8]]
MSLE02 = mean_squared_log_error(y_true, y_pred)
print(MSLE02)
```
median_absolute_error()
```python
# MeAE: median absolute error
from sklearn.metrics import median_absolute_error
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
MeAE01 = median_absolute_error(y_true, y_pred)
print(MeAE01)
```
r2_score()
```python
# R^2: coefficient of determination
from sklearn.metrics import r2_score
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
r2_score01 = r2_score(y_true, y_pred)
print(r2_score01)

y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]
r2_score02 = r2_score(y_true, y_pred, multioutput='variance_weighted')
print(r2_score02)

y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]
r2_score03 = r2_score(y_true, y_pred, multioutput='uniform_average')
print(r2_score03)

r2_score04 = r2_score(y_true, y_pred, multioutput='raw_values')
r2_score05 = r2_score(y_true, y_pred, multioutput=[0.3, 0.7])
print(r2_score04)
print(r2_score05)

# Adjusted R^2: adjusted coefficient of determination
import numpy as np
from sklearn.metrics import r2_score

y_true = [[0.5, 1], [0.1, 1], [7, 6], [7.5, 6.5]]
y_pred = [[0, 2], [0.1, 2], [8, 5], [7.2, 6.2]]
y_true_array = np.array([[0.5, 1], [0.1, 1], [7, 6], [7.5, 6.5]])
n = y_true_array.shape[0]  # number of samples
p = y_true_array.shape[1]  # number of features
print(n, p)
r2_score01 = r2_score(y_true, y_pred, multioutput='variance_weighted')
print(r2_score01)
Adj_r2_score01 = 1 - ((1 - r2_score01) * (n - 1)) / (n - p - 1)
print(Adj_r2_score01)
```
The metrics module also provides prediction-error evaluation functions for other purposes. The evaluation functions for classification tasks are listed in the table below; for the evaluation functions of other tasks, see https://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics
Classification metrics
See the Classification metrics section of the user guide for further details.
| Function | Description |
| --- | --- |
| metrics.accuracy_score(y_true, y_pred[, …]) | Accuracy classification score. |
| metrics.auc(x, y[, reorder]) | Compute Area Under the Curve (AUC) using the trapezoidal rule |
| metrics.average_precision_score(y_true, y_score) | Compute average precision (AP) from prediction scores |
| metrics.balanced_accuracy_score(y_true, y_pred) | Compute the balanced accuracy |
| metrics.brier_score_loss(y_true, y_prob[, …]) | Compute the Brier score. |
| metrics.classification_report(y_true, y_pred) | Build a text report showing the main classification metrics |
| metrics.cohen_kappa_score(y1, y2[, labels, …]) | Cohen's kappa: a statistic that measures inter-annotator agreement. |
| metrics.confusion_matrix(y_true, y_pred[, …]) | Compute confusion matrix to evaluate the accuracy of a classification |
| metrics.f1_score(y_true, y_pred[, labels, …]) | Compute the F1 score, also known as balanced F-score or F-measure |
| metrics.fbeta_score(y_true, y_pred, beta[, …]) | Compute the F-beta score |
| metrics.hamming_loss(y_true, y_pred[, …]) | Compute the average Hamming loss. |
| metrics.hinge_loss(y_true, pred_decision[, …]) | Average hinge loss (non-regularized) |
| metrics.jaccard_score(y_true, y_pred[, …]) | Jaccard similarity coefficient score |
| metrics.log_loss(y_true, y_pred[, eps, …]) | Log loss, aka logistic loss or cross-entropy loss. |
| metrics.matthews_corrcoef(y_true, y_pred[, …]) | Compute the Matthews correlation coefficient (MCC) |
| metrics.multilabel_confusion_matrix(y_true, …) | Compute a confusion matrix for each class or sample |
| metrics.precision_recall_curve(y_true, …) | Compute precision-recall pairs for different probability thresholds |
| metrics.precision_recall_fscore_support(…) | Compute precision, recall, F-measure and support for each class |
| metrics.precision_score(y_true, y_pred[, …]) | Compute the precision |
| metrics.recall_score(y_true, y_pred[, …]) | Compute the recall |
| metrics.roc_auc_score(y_true, y_score[, …]) | Compute Area Under the Receiver Operating Characteristic Curve (ROC AUC) from prediction scores. |
| metrics.roc_curve(y_true, y_score[, …]) | Compute Receiver operating characteristic (ROC) |
| metrics.zero_one_loss(y_true, y_pred[, …]) | Zero-one classification loss. |
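A few of the metrics above can be exercised on a toy binary-classification problem (the labels below are made up for illustration):

```python
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

y_true = [0, 1, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1]

print(accuracy_score(y_true, y_pred))    # 5 of 6 correct -> 0.833...
print(confusion_matrix(y_true, y_pred))  # rows: true class, columns: predicted class
print(f1_score(y_true, y_pred))          # harmonic mean of precision and recall
```

Here precision is 3/3 and recall is 3/4, so the F1 score is 2 * (1 * 0.75) / 1.75 = 6/7 ≈ 0.857.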
When evaluating model performance with cross-validation (cross_val_score and GridSearchCV), the scoring parameter defines the evaluation metric.
(1) Scoring metrics follow a "higher is better" convention, so when a loss function is used as a scoring metric its sign is flipped, e.g. neg_log_loss and neg_mean_squared_error. See https://scikit-learn.org/stable/modules/model_evaluation.html#log-loss
| Scoring | Function | Comment |
| --- | --- | --- |
| **Classification** | | |
| 'accuracy' | metrics.accuracy_score | |
| 'average_precision' | metrics.average_precision_score | |
| 'f1' | metrics.f1_score | for binary targets |
| 'f1_micro' | metrics.f1_score | micro-averaged |
| 'f1_macro' | metrics.f1_score | macro-averaged |
| 'f1_weighted' | metrics.f1_score | weighted average |
| 'f1_samples' | metrics.f1_score | by multilabel sample |
| 'neg_log_loss' | metrics.log_loss | requires predict_proba support |
| 'precision' etc. | metrics.precision_score | suffixes apply as with 'f1' |
| 'recall' etc. | metrics.recall_score | suffixes apply as with 'f1' |
| 'roc_auc' | metrics.roc_auc_score | |
| **Clustering** | | |
| 'adjusted_rand_score' | metrics.adjusted_rand_score | |
| **Regression** | | |
| 'neg_mean_absolute_error' | metrics.mean_absolute_error | |
| 'neg_mean_squared_error' | metrics.mean_squared_error | |
| 'neg_median_absolute_error' | metrics.median_absolute_error | |
| 'r2' | metrics.r2_score | |
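A minimal sketch of how the scoring parameter is passed to cross_val_score, using a synthetic noiseless dataset so the per-fold MSE is essentially zero:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# synthetic noiseless data: y = 2x + 1
X = np.arange(20, dtype=float).reshape(-1, 1)
y = 2 * X.ravel() + 1

# scorers are "higher is better", so the MSE is reported with its sign flipped
scores = cross_val_score(LinearRegression(), X, y, cv=5,
                         scoring='neg_mean_squared_error')
mse_per_fold = -scores  # negate to recover the usual MSE
print(mse_per_fold)
```

The same scoring strings work with GridSearchCV's scoring argument.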
Author: 3434erer
Link: https://www.pythonheidong.com/blog/article/10916/6c9c78c63ead4203673f/
Source: python黑洞网