In machine learning, building a model is never the end of the job. Once the model exists, you always need to evaluate it on test data and ask, "Is this really a good model?"
So how, concretely, should you evaluate it? This post explains two representative evaluation metrics:
- ROC
- AUC
The following book covers this area well, so have a look if you are interested!
Pythonと実データで遊んで学ぶ データ分析講座
- Authors: 梅津雄一, 中野貴広
- Publisher: シーアンドアール研究所
- Release date: 2019/08/10
- Format: Book (softcover)
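The problem with accuracy, and the false positive rate and true positive rate
Before getting into ROC and AUC, let's first look at the performance measures they are computed from. Consider a binary classification problem that separates class a from class b.
We build a model that classifies a versus b from some training data, then apply it to test data to evaluate it.
Taking class a as positive and class b as negative, the results on the test data can be laid out in the following table.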
True class | Predicted: positive (a) | Predicted: negative (b) |
---|---|---|
Positive (a) | True Positive (TP) | False Negative (FN) |
Negative (b) | False Positive (FP) | True Negative (TN) |
The rows are the true class and the columns are the model's prediction.
Briefly, the four cells mean the following:
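- True Positive (TP): the number of samples whose true label is positive and that were correctly predicted as positive
- False Positive (FP): the number of samples whose true label is negative and that were incorrectly predicted as positive
- False Negative (FN): the number of samples whose true label is positive and that were incorrectly predicted as negative
- True Negative (TN): the number of samples whose true label is negative and that were correctly predicted as negative
Note: from here on, True Positive is abbreviated as TP, False Positive as FP, and so on.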
Now, computing accuracy from this table gives
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$$
Accuracy looks like an intuitively good evaluation metric, but it stops working when the classes are imbalanced.
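To make the four counts and the accuracy computation concrete, here is a minimal Python sketch. The helper name `confusion_counts` and the six-sample label lists are invented for this illustration and are not the article's data; class a is treated as positive.

```python
def confusion_counts(y_true, y_pred, positive="a"):
    """Count TP, FP, FN, TN for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # positive sample, predicted positive
            else:
                fp += 1  # negative sample, predicted positive
        else:
            if truth == positive:
                fn += 1  # positive sample, predicted negative
            else:
                tn += 1  # negative sample, predicted negative
    return tp, fp, fn, tn


# Hypothetical labels, for illustration only.
y_true = ["a", "a", "a", "b", "b", "b"]
y_pred = ["a", "a", "b", "a", "b", "b"]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + fn + tn)
print(tp, fp, fn, tn, accuracy)
```

Here four of the six predictions match the true label, so the accuracy is 4/6.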
Why is accuracy a problem with imbalanced classes?
For example, suppose the ground truth contains 80 people in class a and 20 people in class b. Now suppose we have built a thoroughly absurd model that declares, "Every sample goes into class a!!! No objections allowed!!"
Building the same table as before gives:
True class | Predicted: positive | Predicted: negative |
---|---|---|
Positive | TP: 80 | FN: 0 |
Negative | FP: 20 | TN: 0 |
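The accuracy is (80 + 0) / 100 = 0.8, which looks respectable. But this hardly means the model is actually good.
As this shows, when the class sizes are imbalanced, simply assigning everything to the majority class is enough to push accuracy up.
That is why we need a metric that does not depend on the class balance.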
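The 80/20 example can be reproduced in a few lines. This is only a sketch: the variable names are made up, and the "model" is simply a constant prediction of class a.

```python
# Ground truth from the example: 80 samples of class a, 20 of class b.
y_true = ["a"] * 80 + ["b"] * 20

# The "everything is class a" model ignores its input entirely.
y_pred = ["a"] * len(y_true)

correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)
print(accuracy)  # 0.8, even though no class-b sample is ever detected
```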
False positive rate and true positive rate
Let's look at this table once more.
True class | Predicted: positive (a) | Predicted: negative (b) |
---|---|---|
Positive (a) | True Positive (TP) | False Negative (FN) |
Negative (b) | False Positive (FP) | True Negative (TN) |
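From this table we compute the following two metrics.
- False Positive Rate (FPR): $FPR = \frac{FP}{FP + TN}$
- True Positive Rate (TPR): $TPR = \frac{TP}{TP + FN}$
The false positive rate is the fraction of truly negative samples that were incorrectly predicted as positive (the denominator is the total number of truly negative samples).
The true positive rate is the fraction of truly positive samples that were correctly predicted as positive (the denominator is the total number of truly positive samples).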
Each rate is computed from the total number of samples within a single true class, so neither is affected by an imbalance in class sizes.
The ROC curve (explored in the next section) is built from the false positive rate and the true positive rate, which have this property.
As a result, the ROC curve is also unaffected by class imbalance.
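As a quick check of these definitions, the sketch below computes both rates for the "everything is class a" model from the 80/20 example. The function names `false_positive_rate` and `true_positive_rate` are chosen for this illustration.

```python
def false_positive_rate(fp, tn):
    """FP / (FP + TN): share of truly negative samples predicted positive."""
    return fp / (fp + tn)


def true_positive_rate(tp, fn):
    """TP / (TP + FN): share of truly positive samples predicted positive."""
    return tp / (tp + fn)


# Counts for the always-a model: TP=80, FN=0, FP=20, TN=0.
tp, fn, fp, tn = 80, 0, 20, 0

fpr = false_positive_rate(fp, tn)
tpr = true_positive_rate(tp, fn)
print(fpr, tpr)  # 1.0 1.0
```

Accuracy was 0.8 for this model, but a false positive rate of 1.0 exposes that every class-b sample is misclassified.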
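What are the ROC curve and AUC?
The ROC curve and AUC are computed from the false positive rate and true positive rate defined above. The ROC curve is the plot with the False Positive Rate on the horizontal axis and the True Positive Rate on the vertical axis.
That description alone is hard to follow, so let's work through a concrete example.
For example, in a problem that classifies samples into two classes, a and b, suppose the model produced the following prediction score for each sample.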
True class | score |
---|---|
a | 0.8 |
a | 0.45 |
a | 0.7 |
b | 0.2 |
b | 0.4 |
a | 0.5 |
b | 0.15 |
a | 0.65 |
b | 0.2 |
b | 0.5 |
a | 0.6 |
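This table lists each sample's true class alongside its score:
- a sample whose true class is a received a prediction score of 0.8 from the model
- a sample whose true class is a received a prediction score of 0.45 from the model
- a sample whose true class is a received a prediction score of 0.7 from the model
- a sample whose true class is b received a prediction score of 0.2 from the model
...and so on.
Most models then fix a threshold to make the classification decision, for example "predict a if the score is 0.5 or higher, and b if it is below 0.5."
As this threshold is varied, the false positive rate and true positive rate naturally change with it.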
For example, say we "predict class a when the score is 0.8 or higher, and b when it is below 0.8." Then
True class | Predicted: positive (a) | Predicted: negative (b) |
---|---|---|
Positive (a) | TP: 1 | FN: 5 |
Negative (b) | FP: 0 | TN: 5 |
so the false positive rate is 0/5 = 0 and the true positive rate is 1/6.
次ã«ããã¹ã³ã¢ã0.45以ä¸ã®ã¨ãã¯ã©ã¹aã¨äºæ¸¬ããã0.45ããå°ãããã°bã¨äºæ¸¬ãããã¨ãã¾ãããã®ã¨ãã
èå¥ã¯ã©ã¹ | |||
---|---|---|---|
æ£ï¼aï¼ | è² ï¼bï¼ | ||
çã®ã¯ã©ã¹ | æ£ï¼aï¼ | TPï¼6 | FNï¼0 |
è² ï¼bï¼ | FPï¼1 | TNï¼4 |
ã¨ãªããããå½é½æ§çã¯0.2, çé½æ§çã¯1ã¨æ±ãããã¾ãã
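The two hand-worked thresholds can be checked with a short sketch. It uses the eleven (true class, score) pairs from the score table above; `counts_at` is a hypothetical helper name, and a sample is predicted as a when its score is at or above the threshold.

```python
# (true class, score) pairs from the example table.
samples = [
    ("a", 0.8), ("a", 0.45), ("a", 0.7), ("b", 0.2), ("b", 0.4), ("a", 0.5),
    ("b", 0.15), ("a", 0.65), ("b", 0.2), ("b", 0.5), ("a", 0.6),
]


def counts_at(threshold, data):
    """Return (TP, FN, FP, TN) when score >= threshold means 'predict a'."""
    tp = sum(1 for label, s in data if label == "a" and s >= threshold)
    fn = sum(1 for label, s in data if label == "a" and s < threshold)
    fp = sum(1 for label, s in data if label == "b" and s >= threshold)
    tn = sum(1 for label, s in data if label == "b" and s < threshold)
    return tp, fn, fp, tn


for threshold in (0.8, 0.45):
    tp, fn, fp, tn = counts_at(threshold, samples)
    # Print the counts, then FPR and TPR for this threshold.
    print(threshold, (tp, fn, fp, tn), fp / (fp + tn), tp / (tp + fn))
```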
Using each score value in turn as the threshold, the table below summarizes the false positive rate and true positive rate obtained at each one (a sample is predicted as a when its score is at or above the threshold).
True class | score | False positive rate | True positive rate |
---|---|---|---|
a | 0.8 | 0 | 0.17 |
a | 0.45 | 0.2 | 1 |
a | 0.7 | 0 | 0.33 |
b | 0.2 | 0.8 | 1 |
b | 0.4 | 0.4 | 1 |
a | 0.5 | 0.2 | 0.83 |
b | 0.15 | 1 | 1 |
a | 0.65 | 0 | 0.5 |
b | 0.2 | 0.8 | 1 |
b | 0.5 | 0.2 | 0.83 |
a | 0.6 | 0 | 0.67 |
To see how the false positive rate and true positive rate change as the threshold moves, let's sort the rows by score in descending order.
True class | score | False positive rate | True positive rate |
---|---|---|---|
a | 0.8 | 0 | 0.17 |
a | 0.7 | 0 | 0.33 |
a | 0.65 | 0 | 0.5 |
a | 0.6 | 0 | 0.67 |
a | 0.5 | 0.2 | 0.83 |
b | 0.5 | 0.2 | 0.83 |
a | 0.45 | 0.2 | 1 |
b | 0.4 | 0.4 | 1 |
b | 0.2 | 0.8 | 1 |
b | 0.2 | 0.8 | 1 |
b | 0.15 | 1 | 1 |
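A table like this can be generated rather than filled in by hand. The sketch below sweeps every distinct score as a threshold, from highest to lowest, and predicts a when the score is at or above the threshold; `roc_rows` is a hypothetical helper name.

```python
# (true class, score) pairs from the example.
samples = [
    ("a", 0.8), ("a", 0.45), ("a", 0.7), ("b", 0.2), ("b", 0.4), ("a", 0.5),
    ("b", 0.15), ("a", 0.65), ("b", 0.2), ("b", 0.5), ("a", 0.6),
]


def roc_rows(data):
    """Return (threshold, FPR, TPR) for each distinct score, highest first."""
    n_pos = sum(1 for label, _ in data if label == "a")
    n_neg = len(data) - n_pos
    rows = []
    for threshold in sorted({s for _, s in data}, reverse=True):
        tp = sum(1 for label, s in data if label == "a" and s >= threshold)
        fp = sum(1 for label, s in data if label == "b" and s >= threshold)
        rows.append((threshold, fp / n_neg, tp / n_pos))
    return rows


rows = roc_rows(samples)
for threshold, fpr, tpr in rows:
    print(f"{threshold:<5} {fpr:.2f} {tpr:.2f}")
```

With this rule the row for 0.65 has a true positive rate of 0.50 and the row for 0.45 has 1.00. Duplicate scores collapse into a single row because they give the same threshold.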
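Lowering the score (that is, the classification threshold) from large values, we can see that the false positive rate and the true positive rate both increase.
Lowering the threshold means predicting a for just about anything, so the share of truly positive samples that get flagged grows (the true positive rate rises) while the share of negative samples wrongly flagged as positive grows too (the false positive rate rises).
What we want is a model that rarely flags negatives as positive (low false positive rate) while flagging as many of the truly positive samples as possible (high true positive rate).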
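Plotting the ROC curve
Plotting the false positive rates and true positive rates from the table above produces a graph. The line connecting the points obtained as the threshold varies is the ROC curve (horizontal axis: false positive rate, vertical axis: true positive rate).
If the threshold is set so that the false positive rate (the fraction of class-b samples identified as class a) becomes larger, that means the bar for assigning class a is being loosened, so the true positive rate (the fraction of class-a samples identified as class a) rises as well, or at least never falls. In other words, this graph never slopes down to the right; the further right you go (the higher the false positive rate), the higher it reaches (the true positive rate also rises).
Taken to the extreme, a false positive rate approaching 1 means the model is assigning nearly everything to class a, so the true positive rate inevitably approaches 1 as well.
With this in mind, it should be intuitive that a good model is one whose true positive rate is already high while its false positive rate is still low.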
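The "never slopes down to the right" property can be checked directly. The sketch below hardcodes the (FPR, TPR) points obtained from the example scores under the "score at or above the threshold" rule, adds the starting point (0, 0) for a threshold above every score, and verifies that neither coordinate ever decreases. The point list layout and the helper name `is_monotone` are assumptions of this sketch.

```python
# (FPR, TPR) for thresholds 0.8, 0.7, 0.65, 0.6, 0.5, 0.45, 0.4, 0.2, 0.15,
# preceded by (0, 0) for a threshold above every score.
roc_points = [
    (0.0, 0.0), (0.0, 1 / 6), (0.0, 2 / 6), (0.0, 3 / 6), (0.0, 4 / 6),
    (0.2, 5 / 6), (0.2, 1.0), (0.4, 1.0), (0.8, 1.0), (1.0, 1.0),
]


def is_monotone(points):
    """True if both FPR and TPR are non-decreasing along the curve."""
    return all(
        x0 <= x1 and y0 <= y1
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


print(is_monotone(roc_points))  # True
```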
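How to think about AUC
- The ROC curve never goes down as you move to the right
- The higher the true positive rate a model achieves while the false positive rate is still small, the better the model
Given these two points, it seems fair to say that the larger the area of the region under the ROC curve, bounded by the curve and the x and y axes, the better the model.
This area is the AUC (Area Under the ROC Curve). The closer the AUC is to 1, the better the model performs. For completely random predictions the AUC is 0.5, which means the ROC curve is the straight line connecting the origin (0,0) and (1,1).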
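One common way to get this area from a finite set of ROC points is the trapezoidal rule. The following is a minimal sketch, assuming the points are sorted by increasing false positive rate and include the endpoints (0, 0) and (1, 1); `auc_trapezoid` is a hypothetical helper name, and the points are those of the worked example with the "score at or above the threshold" rule.

```python
def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as (FPR, TPR) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Width of the step times the average height of its two ends.
        area += (x1 - x0) * (y0 + y1) / 2
    return area


roc_points = [
    (0.0, 0.0), (0.0, 1 / 6), (0.0, 2 / 6), (0.0, 3 / 6), (0.0, 4 / 6),
    (0.2, 5 / 6), (0.2, 1.0), (0.4, 1.0), (0.8, 1.0), (1.0, 1.0),
]

auc = auc_trapezoid(roc_points)
print(round(auc, 3))  # 0.95

# The diagonal from (0, 0) to (1, 1) corresponds to random guessing.
random_auc = auc_trapezoid([(0.0, 0.0), (1.0, 1.0)])
print(random_auc)  # 0.5
```

Vertical segments (where only the true positive rate changes) contribute no area, so only the steps to the right matter.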
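For example, suppose you want to compare two models, and the ROC curve of model 1 encloses a larger area than the ROC curve of model 2.
The closer the AUC is to 1, that is, the larger the area enclosed by the ROC curve, the better the model performs,
so in this case we judge that model 1 has the higher discriminative performance.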
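A comparison like this can also be done numerically. The sketch below uses the pairwise-ranking view of AUC: the fraction of (positive, negative) pairs in which the positive sample receives the higher score, counting ties as one half, which equals the area under the ROC curve. The two sets of scores are hypothetical and only stand in for "model 1" and "model 2"; `auc_by_ranking` is a made-up helper name.

```python
def auc_by_ranking(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores, for illustration only.
model1_pos, model1_neg = [0.9, 0.8, 0.7, 0.6], [0.5, 0.4, 0.65, 0.1]
model2_pos, model2_neg = [0.9, 0.5, 0.4, 0.6], [0.7, 0.45, 0.3, 0.55]

auc1 = auc_by_ranking(model1_pos, model1_neg)
auc2 = auc_by_ranking(model2_pos, model2_neg)
print(auc1, auc2)  # 0.9375 0.625
```

With these made-up scores, model 1 has the larger AUC and would be preferred.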
Closing
This post summarized ROC and AUC, two well-known metrics for evaluating model performance. I hope it gave you at least a rough intuitive picture. Once you have a reasonable intuition, you will probably be fine, since in practice a package usually computes these for you.
Both Python and R make it easy to plot an ROC curve and compute AUC, so do try them in your future model evaluations.
(When I have time, I may do a hands-on follow-up that evaluates models in Python and R... probably.)