This article is the fourth entry in the ランク学習(Learning to Rank) Advent Calendar 2018 - Adventar.
What is this article about?
In the previous article, I built a ranking model with LightGBM using learning to rank.
www.szdrblog.info
Once you have built a machine learning model, the next step is to check its prediction accuracy.*1
This article introduces Mean Reciprocal Rank (MRR), Mean Average Precision (MAP), and Normalized DCG (NDCG) as metrics for measuring the prediction accuracy of learning to rank.*2
That said, these metrics are very well known and plenty of explanatory articles already exist. So this article also touches briefly on the problems associated with each metric.
When labels are binary (0 or 1)
Consider the case where each document in the dataset has a binary label, 0 or 1.
For example, for each pair of "search keyword" and "document", a human judge could assign ○ if they decide that "for this search keyword, this document may appear in the search results", and × otherwise.
Manual labeling is very laborious, so when user behavior logs are available you can instead label each pair of "search keyword" and "document" as ○ if the user clicked the link to that document and × if they did not.
When labels are binary, you can evaluate with Mean Reciprocal Rank (MRR) and Mean Average Precision (MAP).
Mean Reciprocal Rank (MRR)
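First, let's look at Reciprocal Rank (RR).
It is a very simple metric. Scanning the search results from the top, let $r$ be the rank at which the first ○ (=1) document appears; RR is then defined as $\mathrm{RR} = 1/r$.
For example, if the first ○ document appears at rank 2 of the search results, then $\mathrm{RR} = 1/2$.
Reciprocal Rank only cares about the rank at which the first ○ (=1) document appears, so the ranks of the other ○ documents have no effect on its value.
As a result, it can sometimes produce odd evaluations like the following.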
Rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
Document labels when ordered by ranking model A | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 |
Document labels when ordered by ranking model B | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
Ranking model B gathers more ○ (=1) documents at the top of the search results than A does, so intuitively B should be the better model.
However, both models place a ○ (=1) document at rank 1, so both end up with RR=1.
Of course, there are search services for which Reciprocal Rank is the right way to evaluate, but you should consider carefully whether it really suits your case.
As for Mean Reciprocal Rank (MRR), it is simply the average of the Reciprocal Rank values computed for each "search keyword".
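As a minimal sketch of the definitions above, RR and MRR might be computed like this. The function names are my own, and returning 0 for a query with no ○ document is an assumption, since the article does not cover that case. The two label lists are the ones from the table above.

```python
from typing import Sequence


def reciprocal_rank(labels: Sequence[int]) -> float:
    """Return 1/r, where r is the 1-indexed rank of the first label equal to 1."""
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            return 1.0 / rank
    return 0.0  # assumption: no relevant document in the list scores 0


def mean_reciprocal_rank(queries: Sequence[Sequence[int]]) -> float:
    """Average the reciprocal rank over all queries (search keywords)."""
    return sum(reciprocal_rank(labels) for labels in queries) / len(queries)


model_a = [1, 0, 0, 0, 0, 1, 1, 1, 1, 1]
model_b = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# Both models put a relevant document at rank 1, so RR cannot tell them apart.
print(reciprocal_rank(model_a), reciprocal_rank(model_b))  # 1.0 1.0

# MRR over two queries: one with RR = 1/2 and one with RR = 1.
print(mean_reciprocal_rank([[0, 1, 0, 1], model_a]))  # 0.75
```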
Mean Average Precision (MAP)
I will go through "Precision@k" → "Average Precision" → "Mean Average Precision" in that order.
Precision@k is defined as "the Precision when looking at the search results down to the k-th position".
For example, when the top four search results are labeled 0, 1, 0, 1, we get:
- Looking down to the 1st result, the number of ○ (=1) documents is 0, so Precision@1=0
- Looking down to the 2nd result, the number of ○ (=1) documents is 1, so Precision@2=1/2
- Looking down to the 3rd result, the number of ○ (=1) documents is 1, so Precision@3=1/3
- Looking down to the 4th result, the number of ○ (=1) documents is 2, so Precision@4=2/4=1/2
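The same numbers can be reproduced with a short sketch. The label list `[0, 1, 0, 1]` is inferred from the Precision@k values listed above, and the function name is hypothetical. Dividing by k even when fewer than k results exist is an assumption of this sketch.

```python
from typing import Sequence


def precision_at_k(labels: Sequence[int], k: int) -> float:
    """Fraction of the top-k results whose label is 1."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(labels[:k]) / k  # assumption: always divide by k


labels = [0, 1, 0, 1]  # ranks 2 and 4 hold the relevant documents
for k in range(1, 5):
    print(k, precision_at_k(labels, k))
```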
Next, Average Precision is defined as "the average of the Precision@k values taken at each rank where a ○ (=1) document appears".
In the example above, the ○ (=1) documents appear at the 2nd and 4th positions of the search results.
Average Precision is therefore the mean of Precision@2 and Precision@4, that is, $(1/2 + 1/2)/2 = 1/2$.
Finally, Mean Average Precision (MAP) is simply the average of the Average Precision values computed for each "search keyword".
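Here is a rough sketch of Average Precision and MAP following the definition above. The function names are hypothetical, and scoring a query with no ○ document as 0 is my assumption. Note that this version averages over the ○ documents that appear in the given list, as the article describes. Some definitions divide by the total number of relevant documents in the collection instead.

```python
from typing import Sequence


def average_precision(labels: Sequence[int]) -> float:
    """Mean of Precision@k over the ranks k where a relevant document appears."""
    hits = 0
    precisions = []
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)  # Precision@rank at a relevant rank
    if not precisions:
        return 0.0  # assumption: no relevant document scores 0
    return sum(precisions) / len(precisions)


def mean_average_precision(queries: Sequence[Sequence[int]]) -> float:
    """Average the Average Precision over all queries (search keywords)."""
    return sum(average_precision(labels) for labels in queries) / len(queries)


print(average_precision([0, 1, 0, 1]))  # 0.5

# Unlike RR, Average Precision separates models A and B from the RR table.
model_a = [1, 0, 0, 0, 0, 1, 1, 1, 1, 1]
model_b = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
print(average_precision(model_a))  # roughly 0.57
print(average_precision(model_b))  # 1.0
```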
Let me also touch on the problems with Mean Average Precision.
"Some Common Mistakes In IR Evaluation, And How They Can Be Avoided" (N. Fuhr, 2017) says that Mean Average Precision assumes the following user behavior.*3
1. users stop only after a relevant document, and
2. the probability of stopping is the same for all relevant ranks.
The first assumption corresponds to the fact that Average Precision averages only over the ranks where a ○ (=1) document appears. Reading the paper, the first assumption does not seem to be treated as that big a problem.
While the first assumption might be approximately true in some applications (but how would users know when they have reached the last relevant document?)
The problem is the second assumption, which corresponds to the fact that Average Precision takes a plain, unweighted average of the Precision@k values.
In general, users find the document they want near the top of the search results, and the probability that they keep checking further down gets smaller. Despite this, Average Precision treats every Precision@k equally when averaging.
So what should we do instead? The paper suggests Rank Biased Precision (RBP).*4
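For reference, here is a minimal sketch of RBP as I understand its usual definition, $\mathrm{RBP} = (1 - p) \sum_{i} r_i \, p^{\,i-1}$, where $r_i$ is the label at rank $i$ and $p$ is a "persistence" parameter modeling how likely the user is to move on to the next result. The function name and the default `p=0.8` are purely illustrative assumptions, not values taken from the paper.

```python
from typing import Sequence


def rank_biased_precision(labels: Sequence[int], p: float = 0.8) -> float:
    """RBP = (1 - p) * sum over ranks i of label_i * p**(i - 1)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    # enumerate starts at 0, so the exponent is already (rank - 1)
    return (1.0 - p) * sum(label * p ** i for i, label in enumerate(labels))


model_a = [1, 0, 0, 0, 0, 1, 1, 1, 1, 1]
model_b = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# Relevant documents near the top get more weight, so B scores higher than A.
print(rank_biased_precision(model_a))
print(rank_biased_precision(model_b))
```

The result depends on the choice of `p`, which is exactly the kind of extra parameter that makes the metric harder to handle.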
When labels are multi-valued
Consider the case where each document in the dataset takes one of several graded values, such as "Perfect(4), Excellent(3), Good(2), Fair(1), Bad(0)".
Here too, the labels can be assigned by hand, or derived from user behavior logs (for example, computing a label from how long users stay on each document).
When labels are multi-valued, a commonly used metric is Normalized Discounted Cumulative Gain (NDCG).
Normalized Discounted Cumulative Gain (NDCG)
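I will go through "Discounted Cumulative Gain (DCG)" → "Normalized Discounted Cumulative Gain (NDCG)" in that order.
Based on the idea that we are happy when documents with high label values appear near the top of the search results, DCG@k is formally defined as follows.
$$\mathrm{DCG}@k = \sum_{j=1}^{k} G\left(y_{\pi^{-1}(j)}\right) \eta(j)$$
The notation is a bit intricate: $\pi^{-1}(j)$ denotes the ID of the document at position $j$ of the search results, and $y$ denotes the list of labels indexed by document ID.
$G(\cdot)$ represents "the score of a document with a given label", and is often computed as $G(l) = 2^{l} - 1$.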
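- Perfect(4) gives $2^4 - 1 = 15$ points
- Excellent(3) gives $2^3 - 1 = 7$ points
- Good(2) gives $2^2 - 1 = 3$ points
- Fair(1) gives $2^1 - 1 = 1$ point
- Bad(0) gives $2^0 - 1 = 0$ points
and so on.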
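As for $\eta(j)$, it represents a penalty applied to the score computed by $G$, based on the idea that the lower part of the search results does not matter much.
For the document at rank $j$, it is commonly computed as $\eta(j) = 1/\log_2(j+1)$.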
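Formulas alone probably do not convey this well, so let me show an example.
Suppose that sorting the documents by the ranking model's score yields the labels […].
Computing $G\left(y_{\pi^{-1}(j)}\right) \eta(j)$ for each document and summing the results gives $\mathrm{DCG} = $ […].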
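To make the calculation concrete, here is a small sketch of DCG@k. The label list `[3, 2, 4, 0, 1]` is a hypothetical example of my own, not the one used in the article. The sketch assumes the common choices $G(l) = 2^l - 1$ for the gain and $1/\log_2(j+1)$ for the discount. The function names are also my own.

```python
import math
from typing import Sequence


def gain(label: int) -> float:
    """Score for a document with the given graded label: 2**label - 1."""
    return 2.0 ** label - 1.0


def discount(rank: int) -> float:
    """Position penalty for a 1-indexed rank: 1 / log2(rank + 1)."""
    return 1.0 / math.log2(rank + 1)


def dcg_at_k(labels: Sequence[int], k: int) -> float:
    """Sum of gain * discount over the top-k results."""
    return sum(
        gain(label) * discount(rank)
        for rank, label in enumerate(labels[:k], start=1)
    )


# Hypothetical labels in the order produced by a ranking model.
ranked_labels = [3, 2, 4, 0, 1]
for rank, label in enumerate(ranked_labels, start=1):
    print(rank, label, gain(label), round(discount(rank), 4))
print(dcg_at_k(ranked_labels, k=5))  # roughly 16.78
```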
Next is Normalized Discounted Cumulative Gain (NDCG), which is the DCG computed above normalized so that it falls between 0 and 1.
The normalization divides by the DCG obtained when the documents are arranged in the ideal order (ideal DCG).
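In the example above, arranging the documents ideally gives the labels […].
Computing DCG on this ordering gives $\mathrm{DCG}_{\mathrm{ideal}} = $ […].
Recall that ordering by the ranking model's score gave $\mathrm{DCG} = $ […].
So the NDCG comes out as $\mathrm{NDCG} = \mathrm{DCG} / \mathrm{DCG}_{\mathrm{ideal}} = $ […].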
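The normalization step can be sketched as follows, again using a hypothetical label list `[3, 2, 4, 0, 1]` rather than the article's own example. The ideal ordering is obtained by sorting the labels in descending order. Returning 0 when the ideal DCG is 0 (all labels are Bad) is an assumption of this sketch.

```python
import math
from typing import Sequence


def dcg_at_k(labels: Sequence[int], k: int) -> float:
    """DCG@k with gain 2**label - 1 and discount 1 / log2(rank + 1)."""
    return sum(
        (2.0 ** label - 1.0) / math.log2(rank + 1)
        for rank, label in enumerate(labels[:k], start=1)
    )


def ndcg_at_k(labels: Sequence[int], k: int) -> float:
    """DCG@k divided by the DCG@k of the ideal (descending-label) ordering."""
    ideal_labels = sorted(labels, reverse=True)
    ideal_dcg = dcg_at_k(ideal_labels, k)
    if ideal_dcg == 0.0:
        return 0.0  # assumption: undefined case is scored as 0
    return dcg_at_k(labels, k) / ideal_dcg


ranked_labels = [3, 2, 4, 0, 1]  # hypothetical model ordering
print(sorted(ranked_labels, reverse=True))  # ideal ordering: [4, 3, 2, 1, 0]
print(ndcg_at_k(ranked_labels, k=5))  # roughly 0.786
print(ndcg_at_k([4, 3, 2, 1, 0], k=5))  # 1.0 for an already ideal ordering
```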
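Now for the troublesome aspects of NDCG: honestly, I don't think NDCG has that many problems.
One thing that bothers me personally is the arbitrariness of $G$ and $\eta$, but in practice they are almost always computed with the definitions above.
I covered this topic in a separate earlier article, so please have a look at that as well.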
Summary
This article introduced MRR, MAP, and NDCG as metrics for evaluating the accuracy of search results.
In the next article, I plan to introduce applications, such as where learning to rank is actually running in practice.
References
- Learning to Rank for Information Retrieval, T. Y. Liu, 2011
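*1: Of course, beyond prediction accuracy there are other things to check, such as visually confirming that the search results produced by the ranking model look as intended.
*2: These metrics are not specific to learning to rank; they can be used to evaluate the accuracy of search results in general.
*3: The original paper is "A new interpretation of average precision", S. E. Robertson, 2008.
*4: That said, RBP has its own parameter, which makes it a somewhat awkward metric to work with.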