Data Analysis Serious-Study Advent Calendar, Day 9.
So far we have gotten as far as feeding data into a learner. All that is left is to train it! But there are many learners to choose from. Which one should we pick?
To answer that, we need a way to evaluate models properly.
This time, working through sklearn.metrics, we learn how to evaluate what a model has learned.
- Roundup of metrics used in machine learning (supervised learning edition)
- Quantitative metrics with sklearn.metrics
- Demo implementation (differences between classifiers)
- Summary
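Roundup of metrics used in machine learning (supervised learning edition)
Many metrics are available for evaluating a learner. I previously wrote an all-out summary of the supervised learning ones, so please have a look at that article as well.
That article, however, is heavy on formulas and probably hard to get a feel for. So this time I implement some of the metrics it covers.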
Quantitative metrics with sklearn.metrics
sklearn.metrics provides a large number of metrics:
- Metrics for classification
- Metrics for regression
- Metrics for clustering
- ...
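Quantitative metrics for classification
Here we evaluate the result of classifying the distribution from earlier. I used the official documentation page as a reference. You need to prepare the following; the names in parentheses are the variables I used in my implementation.
- Ground-truth labels of the test data (y_test)
- Labels predicted by the learner (pred_y)
- Decision scores for those predictions (pred_y_score) ※

※ Some metrics need the decision score and some do not. Metrics that do not use it judge only the predicted labels, so they are easy to interpret. Metrics that do use it are harder to interpret, but because they take the classifier's confidence into account they evaluate the classifier in more detail.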
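To make the note above concrete, here is a minimal sketch with made-up numbers (the two "classifiers" and their scores are hypothetical, not taken from this article). Both produce identical labels, so a label-only metric such as accuracy cannot tell them apart, while log loss, which reads the scores, can.

```python
import numpy as np
from sklearn import metrics

# Ground-truth labels for four hypothetical test samples.
y_test = np.array([0, 0, 1, 1])

# Made-up P(class 1) outputs of two imaginary classifiers.
scores = {
    "confident": np.array([0.10, 0.20, 0.80, 0.90]),
    "hesitant": np.array([0.40, 0.45, 0.55, 0.60]),
}

results = {}
for name, pred_y_score in scores.items():
    # Turn scores into labels with a 0.5 threshold.
    pred_y = (pred_y_score >= 0.5).astype(int)
    acc = metrics.accuracy_score(y_test, pred_y)      # uses labels only
    loss = metrics.log_loss(y_test, pred_y_score)     # uses the scores
    results[name] = (acc, loss)
    print(name, "accuracy:", acc, "log loss:", round(loss, 3))
```

Both rows report the same accuracy, but the hesitant classifier gets the larger log loss because it is less sure of its correct answers.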
Let's look at the main classification metrics included in sklearn.metrics.
Name | Classification metric | Uses decision score | Notes |
---|---|---|---|
confusion matrix | -- | No | Shows how actual and predicted labels disagree |
accuracy | Accuracy | No | How many predictions were correct |
classification_report | F-score, recall, precision | No | Conveniently bundles the commonly used metrics |
hamming_loss | hamming loss | No | |
jaccard_similarity | Jaccard similarity | No | |
log_loss | logarithm loss | Yes | I am not very confident about how to use it; see the official docs |
precision_recall_curve | precision-recall curve | Yes | Shows the trade-off |
roc_auc | ROC curve | Yes | Threshold selection and the like |
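As a rough illustration of the label-only rows in the table, the sketch below runs a few of them on a tiny hand-made example (the labels are invented for illustration). One assumption to flag: it calls `jaccard_score`, which is the name used by recent scikit-learn releases; older releases exposed a differently named Jaccard function, so adjust to the version you have installed.

```python
from sklearn import metrics

# Invented labels: 6 samples, 2 of them misclassified.
y_test = [1, 0, 1, 1, 0, 1]
pred_y = [1, 0, 0, 1, 1, 1]

# Rows are true classes, columns are predicted classes (order: 0, 1).
cm = metrics.confusion_matrix(y_test, pred_y)
print("confusion matrix\n", cm)

# Fraction of labels that are wrong; for single-label problems this is 1 - accuracy.
hl = metrics.hamming_loss(y_test, pred_y)
print("hamming loss", hl)

# Jaccard index of the positive class: TP / (TP + FP + FN).
jac = metrics.jaccard_score(y_test, pred_y)
print("jaccard", jac)
```

Reading the confusion matrix first and then the scalar metrics makes it easier to see where each number comes from.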
The ones I expect to use most are accuracy, classification_report, and roc_auc; together they neatly cover the main metrics.
Below I run a quick training on sklearn.datasets.load_breast_cancer() to learn what these three metrics take as input and what they output.
```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics

breast_cancer = load_breast_cancer()
# max_iter is raised from the default so the solver converges on this unscaled data
clf = LogisticRegression(max_iter=10000)
X_train, X_test, y_train, y_test = train_test_split(
    breast_cancer.data, breast_cancer.target, train_size=0.8)
clf.fit(X_train, y_train)

# Predicted class labels
pred_y = clf.predict(X_test)

# Decision scores: the predicted probability of class 1.
# ref : [How does this differ from decision_function?](https://stackoverflow.com/questions/36543137/whats-the-difference-between-predict-proba-and-decision-function-in-sklearn-py)
pred_y_score = clf.predict_proba(X_test)[:, 1]

print("accuracy\n", metrics.accuracy_score(y_test, pred_y))
print("classification report\n", metrics.classification_report(y_test, pred_y))

# Compute the ROC curve and the area under it (AUC) for the positive class
fpr, tpr, _ = metrics.roc_curve(y_test, pred_y_score, pos_label=1)
roc_auc = metrics.auc(fpr, tpr)
print("auc area", roc_auc)

# Plot the ROC curve against the chance-level diagonal
plt.figure(figsize=(5, 5))
plt.plot(fpr, tpr, color='darkorange', lw=2, label='ROC curve')
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC curve: \nAUC={:.3f}'.format(roc_auc))
plt.show()
```
The result looks like this.
Quantitative metrics for regression
Regression, where the ground truth is a number, also has several metrics, and these too are implemented in sklearn.metrics.
Let's try them out with the sklearn.datasets.load_boston dataset.
```python
# Note: load_boston was removed in scikit-learn 1.2,
# so this snippet needs an older scikit-learn release to run as written.
from sklearn.datasets import load_boston
from sklearn.linear_model import LinearRegression
from sklearn import model_selection, metrics
import numpy as np

boston = load_boston()
reg = LinearRegression()
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    boston.data, boston.target, train_size=0.8)
reg.fit(X_train, y_train)

# Predicted target values for the held-out data
pred_y = reg.predict(X_test)

print("Variance score", metrics.explained_variance_score(y_test, pred_y))
print("MAE : Mean absolute error", metrics.mean_absolute_error(y_test, pred_y))
print("MSE : Mean squared error", metrics.mean_squared_error(y_test, pred_y))
# RMSE is the square root of MSE, so it is in the same units as the target
print("RMSE: Root mean squared error", np.sqrt(metrics.mean_squared_error(y_test, pred_y)))
print("R2 score", metrics.r2_score(y_test, pred_y))
```
The output is:
```
Variance score 0.637022922972
MAE : Mean absolute error 3.16480595297
MSE : Mean squared error 19.7557043856
RMSE: Root mean squared error 4.44473895584
R2 score 0.633791811542
```
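To see what these five numbers measure, here is a dependency-free sketch that computes them by hand from their usual textbook definitions. The helper `regression_metrics` and the sample values are hypothetical, written only for illustration; they are meant to mirror the sklearn.metrics functions used above, not to replace them.

```python
import math

def regression_metrics(y_true, y_pred):
    """Hand-rolled versions of the metrics printed above (illustrative helper)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n        # mean absolute error
    mse = sum(e * e for e in errors) / n         # mean squared error
    rmse = math.sqrt(mse)                        # same units as the target
    mean_true = sum(y_true) / n
    var_true = sum((t - mean_true) ** 2 for t in y_true) / n
    r2 = 1.0 - mse / var_true                    # 1 - SS_res / SS_tot
    mean_err = sum(errors) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    explained_variance = 1.0 - var_err / var_true
    return {"explained_variance": explained_variance, "mae": mae,
            "mse": mse, "rmse": rmse, "r2": r2}

# Invented targets and predictions; the predictions are biased slightly high.
scores = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 11.0])
for name, value in scores.items():
    print(name, round(value, 4))
```

The explained variance score ignores a constant bias in the errors while R2 does not, which is why the variance score comes out a little higher than R2 here.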
Quantitative metrics for clustering
Evaluating the performance of a clustering algorithm is not as trivial as counting the number of errors or the precision and recall of a supervised classification algorithm. (clustering performance evaluation)
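As this says, clustering has no absolute quantitative metric. Clustering is a way of partitioning data, so what counts as good depends on the purpose and intent. Still, there are rough guidelines: the metrics quantify quality by formalizing how little the points scatter within a cluster and how far apart the clusters are.
The site above implements many clustering metrics. *1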
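One metric built on exactly this idea (tight clusters, far apart) is the silhouette coefficient, available as `sklearn.metrics.silhouette_score`. The sketch below is my own toy example with invented points and labelings, not something from the linked page: the same six points are scored once with a sensible labeling and once with a scrambled one.

```python
import numpy as np
from sklearn.metrics import silhouette_score

# Six invented 2-D points forming two obvious groups.
X = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],        # group near the origin
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],  # group far away
])

good_labels = np.array([0, 0, 0, 1, 1, 1])  # matches the visible groups
bad_labels = np.array([0, 1, 0, 1, 0, 1])   # mixes the two groups

good = silhouette_score(X, good_labels)
bad = silhouette_score(X, bad_labels)
print("sensible labeling :", round(good, 3))
print("scrambled labeling:", round(bad, 3))
```

The score ranges from -1 to 1, and the sensible labeling should land much closer to 1 than the scrambled one.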
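Demo implementation (differences between classifiers)
Finally, let's build a demo.
This figure shows, classifier by classifier, the result of classifying data distributed in crescent-moon shapes. *2
It is interesting to see how much the results differ with the nature of the learner: linear, non-linear, ensemble learning, and so on.
Visualizing the separation like this gives a rough feel for how each learner carves up the space. That is not a quantitative measure, though, so let's quantify it with the metrics covered so far.
As usual, the details are up on github.
Result 1 (table)
The ranking does not change drastically, but looking at precision, recall and the other metrics in addition to accuracy shows differences for some classifiers and none for others, which gives a more detailed picture of each classifier's character.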
Algorithm | accuracy | precision | recall | f1-score | ham. loss | jaccard sim | AUC |
---|---|---|---|---|---|---|---|
LogisticRegression | 0.83 | 0.82 | 0.83 | 0.82 | 0.17 | 0.83 | 0.92 |
Nearest Neighbors | 0.95 | 0.91 | 1 | 0.95 | 0.05 | 0.95 | 0.95 |
Linear SVM | 0.8 | 0.8 | 0.77 | 0.79 | 0.2 | 0.8 | 0.92 |
RBF SVM | 0.94 | 0.89 | 1 | 0.94 | 0.06 | 0.94 | 0.98 |
Gaussian Process | 0.95 | 0.91 | 1 | 0.95 | 0.05 | 0.95 | 0.98 |
Decision Tree | 0.92 | 0.92 | 0.92 | 0.92 | 0.08 | 0.92 | 0.92 |
Random Forest | 0.89 | 0.88 | 0.9 | 0.89 | 0.11 | 0.89 | 0.93 |
Neural Net | 0.83 | 0.82 | 0.83 | 0.82 | 0.17 | 0.83 | 0.92 |
AdaBoost | 0.86 | 0.84 | 0.88 | 0.86 | 0.14 | 0.86 | 0.92 |
Naive Bayes | 0.82 | 0.81 | 0.81 | 0.81 | 0.18 | 0.82 | 0.92 |
QDA | 0.83 | 0.82 | 0.83 | 0.82 | 0.17 | 0.83 | 0.92 |
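A table of this kind could be produced roughly as follows. This is only a sketch under my own assumptions, not the code from the author's repository: the dataset parameters (`n_samples`, `noise`, `random_state`), the three classifiers chosen, and their hyperparameters are all placeholders, so the numbers will not match the table above.

```python
from sklearn import metrics
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Crescent-shaped toy data; all parameters here are assumed values.
X, y = make_moons(n_samples=500, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=0)

classifiers = {
    "LogisticRegression": LogisticRegression(),
    "Nearest Neighbors": KNeighborsClassifier(n_neighbors=3),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
}

rows = []
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    pred_y = clf.predict(X_test)                    # labels
    pred_y_score = clf.predict_proba(X_test)[:, 1]  # scores for class 1
    rows.append({
        "name": name,
        "accuracy": metrics.accuracy_score(y_test, pred_y),
        "precision": metrics.precision_score(y_test, pred_y),
        "recall": metrics.recall_score(y_test, pred_y),
        "f1": metrics.f1_score(y_test, pred_y),
        "ham_loss": metrics.hamming_loss(y_test, pred_y),
        "auc": metrics.roc_auc_score(y_test, pred_y_score),
    })

# Print one line per classifier, rounded to two decimals.
print("{:<20}{:>10}{:>11}{:>8}{:>8}{:>11}{:>8}".format(
    "Algorithm", "accuracy", "precision", "recall", "f1", "ham. loss", "AUC"))
for r in rows:
    print("{:<20}{:>10.2f}{:>11.2f}{:>8.2f}{:>8.2f}{:>11.2f}{:>8.2f}".format(
        r["name"], r["accuracy"], r["precision"], r["recall"],
        r["f1"], r["ham_loss"], r["auc"]))
```

Keeping the label-based metrics and the score-based AUC side by side in one row is what makes the per-classifier differences easy to compare.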
Result 2 (figures)
I drew ROC curves and Precision-Recall curves for three representative classifiers. Both are trade-off plots, so the larger the area under the curve, the better the classifier. For this data, the non-linear classifiers come out ahead.
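The "area" in question can be computed directly rather than read off the plot. Below is a small sketch on four invented samples: `metrics.auc` integrates the ROC curve, and `metrics.average_precision_score` is one common way to summarize the Precision-Recall curve as a single number (choosing it here is my assumption, since the article does not say which summary was used).

```python
import numpy as np
from sklearn import metrics

# Invented labels and decision scores for four samples.
y_test = np.array([0, 0, 1, 1])
pred_y_score = np.array([0.1, 0.4, 0.35, 0.8])

# ROC curve: false positive rate vs. true positive rate over all thresholds.
fpr, tpr, _ = metrics.roc_curve(y_test, pred_y_score)
roc_auc = metrics.auc(fpr, tpr)

# Precision-Recall curve over the same thresholds, plus its summary value.
precision, recall, _ = metrics.precision_recall_curve(y_test, pred_y_score)
ap = metrics.average_precision_score(y_test, pred_y_score)

print("ROC AUC          :", roc_auc)
print("average precision:", round(ap, 4))
```

A perfect ranking of the samples would give 1.0 for both; the one misordered pair here pulls both values down.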
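Summary
Even for metrics I had already summarized and understood in theory, actually using them and watching how they behave is what makes the knowledge stick.
There are many metrics, but sklearn.metrics makes them easy to compute, so if you want to study them, I encourage you to try them hands-on at least once.
This ran a little long, so that's it for today. Tomorrow: model selection (Cross Validation) and parameter search!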
*1: I will add more here if I find the time...
*2: Made by partially modifying Classifier Comparison