3. About bandit algorithms
• Given several slot machines whose rewards are unknown, find the machine with the highest payoff by playing them a number of times.
  – Ex: among a set of ad creatives, find the one with the highest click-through rate (CTR).
• Reference: "Finite-time Analysis of the Multiarmed Bandit Problem", Machine Learning, 2002

4. About bandit algorithms
Left ad: CTR = 20% (2,000 clicks / 10,000 impressions)
Middle ad: CTR = 12.5% (1 click / 8 impressions)
Right ad: CTR = 13.0% (1,300 clicks / 10,000 impressions)
• If we simply showed the creative with the highest CTR, the left ad would win. However, the middle ad has still had almost no imp…
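The three ads in slide 4 can be scored with an optimism-based index to show why the barely tested middle ad still deserves traffic. The sketch below assumes the UCB1 index from the referenced paper (empirical mean plus sqrt(2 ln n / n_j)); the slides shown here do not say which algorithm the author actually uses, and the function name `ucb1_scores` is hypothetical.

```python
import math


def ucb1_scores(clicks, impressions):
    """Return a UCB1 index per ad: empirical CTR + sqrt(2 ln n / n_j).

    n is the total number of impressions over all ads and n_j the
    impressions of ad j. Ads never shown get an infinite score so
    that they are tried first.
    """
    total = sum(impressions)
    scores = []
    for c, n in zip(clicks, impressions):
        if n == 0:
            scores.append(math.inf)
            continue
        mean = c / n                                  # observed CTR
        bonus = math.sqrt(2.0 * math.log(total) / n)  # exploration bonus
        scores.append(mean + bonus)
    return scores


# Counts from slide 4, in the order left, middle, right.
clicks = [2000, 1, 1300]
impressions = [10000, 8, 10000]

scores = ucb1_scores(clicks, impressions)
best = max(range(len(scores)), key=scores.__getitem__)
print(scores, best)
```

With only 8 impressions, the middle ad's exploration bonus dominates its score, so under this index it would be shown next even though the left ad has the highest observed CTR.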
Google Analytics: This article describes the statistical methods underlying web experiments. Google Analytics uses a multi-armed bandit approach for its web experiments. A multi-armed bandit test has the following characteristics:
• The goal is to identify the most profitable option.
• The random distribution is updated as the test progresses.
The name "multi-armed bandit" refers to a hypothetical experiment modeled on a row of slot machines ("one-armed bandits"), each set to a different expected payout rate. The player needs to find the machine with the highest expected payout rate while also maximizing winnings. In this situation, the wish to "play only the machine with the best payout rate so far" and the wish to "… an even better payout rate…
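One common way to obtain a random allocation that is updated as the test progresses is Thompson sampling with Beta posteriors. The sketch below is an illustration under that assumption only; the excerpt above does not describe Google Analytics' actual implementation, and the payout rates, the seed, and the function name `thompson_sampling` are hypothetical.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate Bernoulli arms and return how often each arm was played.

    Each arm keeps a Beta(successes + 1, failures + 1) posterior over its
    payout rate. In every round one value is drawn from each posterior
    and the arm with the largest draw is played.
    """
    rng = random.Random(seed)
    k = len(true_rates)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(rounds):
        samples = [rng.betavariate(s + 1, f + 1)
                   for s, f in zip(successes, failures)]
        arm = max(range(k), key=samples.__getitem__)
        pulls[arm] += 1
        # Simulated payout; in a real test this is the observed outcome.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls


# Hypothetical payout rates; arm 0 is the best one.
pulls = thompson_sampling([0.20, 0.05, 0.10], rounds=10000)
print(pulls)
```

Early on, the posteriors are wide and every arm is played; as evidence accumulates, the draws concentrate and most plays go to the arm with the highest payout rate, which is the balance between exploring and exploiting that the passage describes.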