I'm Osone (@dr_paradi) from the Alliance Business Development Department. I work on analytics and development for an app called News Pass.
This post is about evaluation functions (metrics) in machine learning. It is based on the talk I gave at 【FiNC×プレイド】Machine Learning Meetup #1 - connpass.
Presentation slides
www.slideshare.net
Evaluation in machine learning
Machine learning libraries are now mature, and the spread of web services has made the data needed for training easier to obtain than before. As a result, the barrier to using machine learning in business has come down.
To solve a prediction or classification problem, you need a metric (an evaluation function) that tells you which model best fits the task you have defined. In competitions such as Kaggle*1 the metric is fixed in advance, but when you apply machine learning to a real business you have to choose it yourself. Choosing an appropriate evaluation function is hard for beginners, and the right choice also depends on how the business problem is framed and what the goal is. In addition, offline predictions can diverge from how users actually behave*2 (for example, when an out-of-place item appears among answers that are not optimal, users tend to click it anyway).
This post therefore explains a handful of representative evaluation functions out of the many that exist. It first covers the basic ones used for binary classification, then introduces the metrics used in currently active Kaggle competitions.
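Basics
This section covers metrics for binary classification (for example, predicting whether or not someone has a certain disease).
Accuracy
This metric is sometimes referred to simply as 精度 (seido)*3. It measures how well the predictions as a whole agree with the correct answers: the number of correct predictions divided by the total number of predictions.
It looks like a good metric at first glance, but as the presentation slides point out, it breaks down on imbalanced data. If 1% of the labeled data is positive and 99% is negative, a model that predicts every example as negative is rated as 99% accurate. That is hardly a good model, so accuracy is usually used together with the metrics below.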
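The 1% / 99% case can be reproduced in a few lines. This is a minimal sketch on made-up data; the function name `accuracy` and the label encoding (1 = positive, 0 = negative) are assumptions for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical data set: 1% positive (1), 99% negative (0).
y_true = [1] * 10 + [0] * 990

# A "model" that ignores its input and always predicts negative.
always_negative = [0] * len(y_true)

print(accuracy(y_true, always_negative))  # 0.99
```

The score is 0.99 even though not a single positive example was found, which is why accuracy alone is not enough on imbalanced data.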
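Precision
Choose a model with high precision when the goal is to keep false positives low. Taking criminal arrests as an example, high precision keeps down the rate at which innocent citizens are wrongly arrested. The price can be a lower chance of catching the real culprit.
Recall
Use this metric when you want to keep false negatives low.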
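Both metrics come from the same four counts: true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). Precision is TP / (TP + FP) and recall is TP / (TP + FN). The sketch below uses made-up labels; the helper names, and returning 0.0 when a denominator is zero, are assumptions rather than anything from the original post.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def precision(tp, fp):
    # Of everything predicted positive, how much really was positive.
    # Returning 0.0 for an empty denominator is an assumed convention.
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    # Of everything that really was positive, how much was found.
    return tp / (tp + fn) if tp + fn else 0.0

# Hypothetical labels: 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(precision(tp, fp), recall(tp, fn))
```

Here the model makes three positive predictions, two of them correct, and misses two of the four real positives, so precision is 2/3 and recall is 0.5.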
F-measure (F-score)
Precision and recall are in a trade-off (raising one tends to lower the other), so the F-measure combines them by taking their harmonic mean.
Weighted F-measure
In search, a weighted F-measure is sometimes used when user behavior suggests favoring one side, for example "we want higher precision".
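With the common definition \( F_\beta = \frac{(1 + \beta^2) P R}{\beta^2 P + R} \), where \( P \) is precision and \( R \) is recall, \( 0 < \beta < 1 \) emphasizes precision and \( 1 < \beta \) emphasizes recall.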
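The effect of the weight is easy to check numerically. The sketch below assumes the common definition F_beta = (1 + beta^2) * P * R / (beta^2 * P + R); the function name `f_beta` and the example values are made up for illustration. Under this definition, beta = 1 gives the plain harmonic mean.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (assumed definition)."""
    if precision == 0 and recall == 0:
        return 0.0  # assumed convention for the degenerate case
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical model with high precision and mediocre recall.
p, r = 0.9, 0.5

for beta in (0.5, 1.0, 2.0):
    print(beta, round(f_beta(p, r, beta), 3))
```

With these inputs the score moves toward the precision value as beta shrinks below 1 and toward the recall value as beta grows above 1.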
Kaggle metrics
Let's pick a few of the competitions that were active as of 8/3 and look at their evaluation metrics.
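Dice coefficient
This is the metric used in "Ultrasound Nerve Segmentation". It is close to precision and similar metrics, but it applies when the prediction is a set: the more the predicted set overlaps the correct set, the higher the score. Like the Jaccard coefficient, it is apparently also used to measure the similarity of documents and words.
\[ \frac{2 |X \cap Y|}{|X| + |Y|} \]
\( X \): the predicted set
\( Y \): the correct (ground-truth) set; \( |\cdot| \) denotes the number of elements in a set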
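For sets, the Dice coefficient is 2|X ∩ Y| / (|X| + |Y|), and the Jaccard coefficient is |X ∩ Y| / |X ∪ Y|. The following is a small sketch using Python sets, for example sets of pixel indices in a segmentation mask. The function names and sample sets are illustrative, and treating two empty sets as a perfect match is an assumed convention.

```python
def dice_coefficient(x, y):
    """2 * |X & Y| / (|X| + |Y|) for two collections treated as sets."""
    x, y = set(x), set(y)
    if not x and not y:
        return 1.0  # assumption: two empty sets count as a perfect match
    return 2 * len(x & y) / (len(x) + len(y))

def jaccard_coefficient(x, y):
    """|X & Y| / |X | Y|, shown for comparison."""
    x, y = set(x), set(y)
    if not x and not y:
        return 1.0  # same assumed convention as above
    return len(x & y) / len(x | y)

# Hypothetical predicted and ground-truth pixel indices.
predicted = {1, 2, 3, 4}
truth = {3, 4, 5, 6}

print(dice_coefficient(predicted, truth))     # 0.5
print(jaccard_coefficient(predicted, truth))
```

Both scores are 1 for identical sets and 0 for disjoint ones; Dice gives overlapping elements more weight, so it is never lower than Jaccard.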
RMS*E family
"Grupo Bimbo Inventory Demand" uses RMSLE, a metric that extends the widely used squared error.
Some representative members of the squared-error family are summarized below.
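RMSE (Root Mean Squared Error)
The ordinary squared error, where \( n \) is the number of elements.
\[ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - y_i')^2} \]
RMSPE (Root Mean Squared Percentage Error)
The squared error of the relative (percentage) difference.
\[ \mathrm{RMSPE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{y_i - y_i'}{y_i} \right)^2} \]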
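RMSLE (Root Mean Squared Logarithmic Error)
This is the metric used in "Grupo Bimbo Inventory Demand"; it takes the difference of logarithms.
\[ \mathrm{RMSLE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \log(y_i' + 1) - \log(y_i + 1) \right)^2} \]
For example, predicting 10,000 units for a product that actually sold 100 units is penalized far less than under plain squared error. It is useful for quantities that span several orders of magnitude and follow something close to a log-normal distribution, such as personal wealth.
Because RMSLE takes logarithms, a single large miss has less influence on the score. The same holds for RMSPE, since it works with ratios. (With plain squared error in, say, store sales forecasting, if one store has abnormally high sales, the score can be improved just by making that one store's prediction more accurate.)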
\( y_i' \): the predicted value for the i-th element
\( y_i \): the true value for the i-th element
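To see how differently the three metrics react to one large miss, here is a minimal sketch using only the standard library. It assumes the usual definitions (RMSLE with log(v + 1), RMSPE dividing by the true value, so true values must be non-zero); the function names and the data are made up.

```python
import math

def rmse(y_true, y_pred):
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def rmspe(y_true, y_pred):
    # Relative error; assumes no true value is zero.
    n = len(y_true)
    return math.sqrt(sum(((t - p) / t) ** 2 for t, p in zip(y_true, y_pred)) / n)

def rmsle(y_true, y_pred):
    # math.log1p(v) computes log(v + 1); assumes values are >= 0.
    n = len(y_true)
    return math.sqrt(
        sum((math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)) / n
    )

# Hypothetical sales: every product sold 100 units,
# and one of them was predicted as 10,000.
y_true = [100, 100, 100, 100]
y_pred = [100, 100, 100, 10000]

print(rmse(y_true, y_pred), rmspe(y_true, y_pred), rmsle(y_true, y_pred))
```

The single bad prediction pushes RMSE into the thousands, while RMSLE stays in the low single digits, which illustrates why the logarithmic variant is less dominated by one outlier.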
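Multi-class logarithmic loss
This is the metric used in "TalkingData Mobile User Demographics".
Accuracy is often used for multi-class classification too, but a model's output is usually the probability of belonging to each class, so this metric instead sums the logarithm of the probability assigned to the correct class, which measures how far the prediction is from the correct answer.
\[ \mathrm{logloss} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{ij} \log p_{ij}' \]
\( p_{ij}' \): the predicted probability that the i-th element belongs to the j-th class
\( y_{ij} \): whether the i-th element belongs to the j-th class (1 or 0)
\( N \), \( M \): the number of elements and the number of classes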
In the example in the figure, prediction model 2 gets the better score (for multi-class logarithmic loss, closer to 0 is better).
Evaluated by accuracy, prediction model 1 and prediction model 2 receive the same score.
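The same effect can be reproduced with two made-up models that pick the same classes but with different confidence. This is a sketch only: the probabilities below are hypothetical and are not the values from the original figure, and the clipping constant `eps` is an assumed safeguard against log(0).

```python
import math

def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean over elements of -sum_j y_ij * log(p_ij)."""
    total = 0.0
    for y_row, p_row in zip(y_true, probs):
        for y_ij, p_ij in zip(y_row, p_row):
            # Clip to avoid log(0); eps is an assumed choice.
            p_ij = min(max(p_ij, eps), 1 - eps)
            total += y_ij * math.log(p_ij)
    return -total / len(y_true)

def argmax_accuracy(y_true, probs):
    """Accuracy when each element is assigned its most probable class."""
    hits = 0
    for y_row, p_row in zip(y_true, probs):
        predicted = p_row.index(max(p_row))
        hits += y_row[predicted]
    return hits / len(y_true)

# One-hot labels for two elements and three classes (hypothetical).
y_true = [[1, 0, 0], [0, 1, 0]]
model_1 = [[0.4, 0.3, 0.3], [0.3, 0.4, 0.3]]     # right, but unsure
model_2 = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]]   # right and confident

for name, probs in (("model 1", model_1), ("model 2", model_2)):
    print(name, argmax_accuracy(y_true, probs), multiclass_log_loss(y_true, probs))
```

Both models have accuracy 1.0, but model 2 has the lower log loss because it assigns more probability to the correct classes.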
Other
Probability distributions
Measures such as the Kullback-Leibler divergence are used to compare the difference between two probability distributions.
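For discrete distributions the Kullback-Leibler divergence is the sum over i of p_i * log(p_i / q_i). The sketch below assumes both inputs are lists of probabilities over the same outcomes and that q_i is non-zero wherever p_i is non-zero; the function name and the distributions are made up.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    # Terms with p_i == 0 contribute nothing by convention.
    # Assumes q_i > 0 wherever p_i > 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical distributions over two outcomes.
p = [0.5, 0.5]
q = [0.9, 0.1]

print(kl_divergence(p, p))  # 0.0
print(kl_divergence(p, q))
print(kl_divergence(q, p))
```

The value is 0 when the two distributions are identical, and it is not symmetric: KL(p || q) and KL(q || p) generally differ.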
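Summary
This post has walked through evaluation functions in a matter-of-fact way, but as noted at the start, offline predictions can diverge from how users actually behave (for example, when an out-of-place item appears among answers that are not optimal, users tend to click it anyway).
When introducing machine learning into a real BtoC business, I think it is important to watch how real users respond, using A/B tests*4 and the like, and to keep improving by adjusting the evaluation metric (and sometimes creating a new one).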
*1: A well-known data science competition
*2: Data-Driven Metric Development for Online Controlled Experiments: Seven Lessons Learned. Xiaolin Shi, Yahoo Labs; Alex Deng, Microsoft. KDD '16
*3: The formulas mainly follow this book: 情報検索の基礎, Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze (authors), 岩野 和生 (translation), 共立出版, 2012