A Summary of Discrete Distributions
In this post I want to organize the relationships between, and the calculations for, the multinomial distribution, which is the representative discrete distribution, and the Dirichlet distribution, which is its conjugate prior.
Strictly speaking, "discrete distribution" refers to any distribution that outputs discrete values, including for example the Poisson distribution. In practice, though, papers often use the name "discrete distribution" for a distribution that simply assigns a probability to each outcome, like drawing lots. The context usually prevents misunderstanding, but a more precise name for the lottery-style distribution is the categorical distribution, in the sense that one category is drawn out of several. It can also be described as a multinomial distribution with a single trial.
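Well, the names do not matter that much. Summarizing the relationships among these discrete distributions gives the following figure.
[Figure: relationship diagram of the Bernoulli, binomial, categorical, multinomial, beta, and Dirichlet distributions]
The pink lines are generalizations to multidimensional variables, and the orange lines are generalizations to multiple trials. The gray dotted arrows show the conjugacy relationship for each distribution (the distribution at the tail of each arrow is the conjugate prior).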
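The conclusion is that once you are reasonably comfortable with calculations involving the multinomial distribution and its conjugate prior, the Dirichlet distribution, you do not need to worry about the others one by one, because they are all special cases.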
＜Introducing the distributions＞
Before getting into the messy calculations, let me list the definition of each distribution.
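・Bernoulli distribution
$$\mathrm{Bern}(x \mid \mu) = \mu^{x} (1-\mu)^{1-x}$$
This is the familiar distribution of a bent coin. $x$ always takes the value 0 or 1, and the parameter satisfies $\mu \in (0, 1)$.*1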
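・Binomial distribution
$$\mathrm{Bin}(m \mid M, \mu) = \frac{M!}{m!\,(M-m)!}\, \mu^{m} (1-\mu)^{M-m}$$
This is the distribution of the number of heads when the coin is tossed $M$ times. $m$ therefore takes an integer value between 0 and $M$ inclusive. As with the Bernoulli distribution, $\mu \in (0, 1)$.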
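・Categorical distribution
$$\mathrm{Cat}(\mathbf{s} \mid \boldsymbol{\pi}) = \prod_{k=1}^{K} \pi_k^{s_k}$$
This is the equally familiar distribution of a bent die with $K$ faces. $\mathbf{s}$ is a $K$-dimensional vector whose elements $s_k$ each take the value 0 or 1, and since exactly one of them is 1, it satisfies $\sum_{k=1}^{K} s_k = 1$. The probability of each face is given by $\pi_k$, set so that $\sum_{k=1}^{K} \pi_k = 1$.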
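・Multinomial distribution
$$\mathrm{Mult}(\mathbf{m} \mid \boldsymbol{\pi}, M) = \frac{M!}{\prod_{k=1}^{K} m_k!} \prod_{k=1}^{K} \pi_k^{m_k}$$
This is the distribution of the face counts when a bent die with $K$ faces is thrown $M$ times. $\mathbf{m}$ is therefore a $K$-dimensional vector whose elements $m_k$ are non-negative integers satisfying $\sum_{k=1}^{K} m_k = M$. As with the categorical distribution, $\sum_{k=1}^{K} \pi_k = 1$.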
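・Beta distribution
$$\mathrm{Beta}(\mu \mid a, b) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\, \mu^{a-1} (1-\mu)^{b-1}$$
This is the conjugate prior for the coin toss. It is a distribution over how bent the coin is. The value $\mu$ drawn from it is conveniently guaranteed to fall between 0 and 1. The parameters $a$ and $b$, which determine the shape of the distribution, must be real numbers greater than 0. The normalization term contains an intimidating cluster of gamma functions, but the gamma function is just the real-valued version of the factorial ($\Gamma(n+1) = n!$ for non-negative integers $n$). Libraries compute it easily, so there is no need to be scared of it.*2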
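・Dirichlet distribution
$$\mathrm{Dir}(\boldsymbol{\pi} \mid \boldsymbol{\alpha}) = \frac{\Gamma\!\left(\sum_{k=1}^{K} \alpha_k\right)}{\prod_{k=1}^{K} \Gamma(\alpha_k)} \prod_{k=1}^{K} \pi_k^{\alpha_k - 1}$$
This is the conjugate prior for the die. It is a distribution over how bent the die is. It is used in mixture models, and it is also where LDA (Latent Dirichlet Allocation), used in natural language processing, gets its name. The value $\boldsymbol{\pi}$ drawn from it is guaranteed to satisfy $\sum_{k=1}^{K} \pi_k = 1$. Every $\alpha_k$ must be a real number greater than 0. Here too an intimidating cluster of gamma functions sits at the front, but in most calculations it can be ignored, so there is no need to worry about it.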
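The multinomial and Dirichlet definitions above can be evaluated directly with the standard library. The following is a minimal sketch, not a reference implementation: the function names `log_multinomial_pmf` and `log_dirichlet_pdf` are hypothetical, and working in log space with `math.lgamma` is a choice made here to avoid the very large values that factorials and gamma functions produce.

```python
import math


def log_multinomial_pmf(m, pi):
    """Log of Mult(m | pi, M), where M = sum(m)."""
    M = sum(m)
    # log M! - sum_k log m_k!, using lgamma(n + 1) == log(n!)
    log_coef = math.lgamma(M + 1) - sum(math.lgamma(mk + 1) for mk in m)
    # Faces with m_k == 0 contribute nothing; skipping them also avoids log(0).
    log_prob = sum(mk * math.log(pk) for mk, pk in zip(m, pi) if mk > 0)
    return log_coef + log_prob


def log_dirichlet_pdf(pi, alpha):
    """Log of Dir(pi | alpha); pi must lie strictly inside the simplex."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1) * math.log(p) for a, p in zip(alpha, pi))


# A three-faced "die" thrown four times: 4!/(2! 1! 1!) * 0.5^2 * 0.3 * 0.2
p_counts = math.exp(log_multinomial_pmf((2, 1, 1), (0.5, 0.3, 0.2)))  # about 0.18

# With all alpha_k = 1 the Dirichlet density is flat: Gamma(3) = 2 everywhere.
d_flat = math.exp(log_dirichlet_pdf((0.2, 0.3, 0.5), (1.0, 1.0, 1.0)))  # about 2.0
```

Exponentiate the result only at the end, and only if the plain probability or density is actually needed.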
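＜Relationships between the distributions＞
Now that the introductions are done, let's confirm the relationships between these distributions with equations. Here we confirm that the other distributions are special cases of the multinomial and Dirichlet distributions.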
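・Multinomial distribution => categorical distribution
Setting $M = 1$ in the multinomial distribution gives
$$\mathrm{Mult}(\mathbf{m} \mid \boldsymbol{\pi}, M = 1) = \frac{1!}{\prod_{k=1}^{K} m_k!} \prod_{k=1}^{K} \pi_k^{m_k} = \prod_{k=1}^{K} \pi_k^{m_k} = \mathrm{Cat}(\mathbf{m} \mid \boldsymbol{\pi})$$
which matches the categorical distribution. Note that setting $M = 1$ means each $m_k$ can only be 0 or 1, and that $0! = 1! = 1$.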
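・Multinomial distribution => binomial distribution
Setting the dimension to $K = 2$ in the multinomial distribution gives
$$\mathrm{Mult}(m_1, m_2 \mid \pi_1, \pi_2, M) = \frac{M!}{m_1!\, m_2!}\, \pi_1^{m_1} \pi_2^{m_2} = \frac{M!}{m_1!\,(M - m_1)!}\, \pi_1^{m_1} (1 - \pi_1)^{M - m_1} = \mathrm{Bin}(m_1 \mid M, \pi_1)$$
which is the binomial distribution. The point is that because of the constraint $m_1 + m_2 = M$, this is effectively a distribution over $m_1$ alone. There is no need to carry two parameters either: given $\pi_1 + \pi_2 = 1$, keeping $\pi_1$ is enough.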
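・Dirichlet distribution => beta distribution
Setting the dimension to $K = 2$ in the Dirichlet distribution gives
$$\mathrm{Dir}(\pi_1, \pi_2 \mid \alpha_1, \alpha_2) = \frac{\Gamma(\alpha_1 + \alpha_2)}{\Gamma(\alpha_1)\,\Gamma(\alpha_2)}\, \pi_1^{\alpha_1 - 1} \pi_2^{\alpha_2 - 1} = \frac{\Gamma(\alpha_1 + \alpha_2)}{\Gamma(\alpha_1)\,\Gamma(\alpha_2)}\, \pi_1^{\alpha_1 - 1} (1 - \pi_1)^{\alpha_2 - 1} = \mathrm{Beta}(\pi_1 \mid \alpha_1, \alpha_2)$$
which is the beta distribution. Here too, the constraint $\pi_1 + \pi_2 = 1$ makes it effectively a distribution over $\pi_1$ alone.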
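Similarly, the Bernoulli distribution can be derived by setting $K = 2$ in the categorical distribution, or by setting $M = 1$ in the binomial distribution. This is easy, so I will omit it.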
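These reductions can also be checked numerically. The sketch below is only an illustration under the standard textbook definitions of each distribution; all function names are hypothetical, and the test points are arbitrary.

```python
import math


def multinomial_pmf(m, pi):
    coef = math.factorial(sum(m))
    for mk in m:
        coef //= math.factorial(mk)  # exact integer division at every step
    prob = 1.0
    for mk, pk in zip(m, pi):
        prob *= pk ** mk
    return coef * prob


def categorical_pmf(s, pi):
    # s is a one-hot vector, so only the selected face contributes.
    prob = 1.0
    for sk, pk in zip(s, pi):
        prob *= pk ** sk
    return prob


def binomial_pmf(m, M, mu):
    return math.comb(M, m) * mu ** m * (1 - mu) ** (M - m)


def dirichlet_pdf(pi, alpha):
    dens = math.gamma(sum(alpha))
    for p, a in zip(pi, alpha):
        dens *= p ** (a - 1) / math.gamma(a)
    return dens


def beta_pdf(mu, a, b):
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * mu ** (a - 1) * (1 - mu) ** (b - 1)


pi = (0.2, 0.3, 0.5)

# M = 1: the multinomial reduces to the categorical.
m1_matches = math.isclose(multinomial_pmf((0, 1, 0), pi), categorical_pmf((0, 1, 0), pi))

# K = 2: the multinomial reduces to the binomial in m_1.
k2_matches = math.isclose(multinomial_pmf((3, 2), (0.4, 0.6)), binomial_pmf(3, 5, 0.4))

# K = 2: the Dirichlet reduces to the beta in pi_1.
dir_matches = math.isclose(dirichlet_pdf((0.3, 0.7), (2.0, 5.0)), beta_pdf(0.3, 2.0, 5.0))
```

All three flags come out `True`. `math.comb` requires Python 3.8 or later.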
＜Conjugate priors and Bayesian inference＞
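Finally, let's confirm that the Dirichlet distribution is the conjugate prior of the multinomial distribution. A conjugate prior is a distribution that yields the same functional form after it is multiplied by the observation model. In other words, when Bayes' theorem
$$p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{p(x)}$$
is applied, the prior $p(\theta)$ is chosen so that the prior $p(\theta)$ and the posterior $p(\theta \mid x)$ have the same functional form in the random variable $\theta$. Choosing such a prior makes various inference calculations analytically simple, and it also makes it easy to build online learning, in which the data is split into small pieces and learned from piece by piece. Online learning is apparently also called sequential learning or incremental learning.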
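So, let's take the Dirichlet distribution as the prior and compute the posterior after observing data that follows a multinomial distribution.
$$p(\boldsymbol{\pi} \mid \mathbf{m}) \propto \mathrm{Mult}(\mathbf{m} \mid \boldsymbol{\pi}, M)\, \mathrm{Dir}(\boldsymbol{\pi} \mid \boldsymbol{\alpha}) \propto \prod_{k=1}^{K} \pi_k^{m_k} \prod_{k=1}^{K} \pi_k^{\alpha_k - 1} = \prod_{k=1}^{K} \pi_k^{m_k + \alpha_k - 1}$$
The trick to keeping this easy is to ignore the normalization terms (the ones containing gamma functions and factorials). Looking at the result, it has the same functional form as the Dirichlet distribution. So setting $\hat{\alpha}_k = m_k + \alpha_k$ gives
$$p(\boldsymbol{\pi} \mid \mathbf{m}) = \mathrm{Dir}(\boldsymbol{\pi} \mid \hat{\boldsymbol{\alpha}})$$
which shows that the posterior is also a Dirichlet distribution.*3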
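As a rough numerical check of this result, the sketch below adds the observed counts to the prior parameters and then verifies that likelihood times prior, divided by the claimed posterior, gives the same constant at two different points of the simplex. The function names and the example values are assumptions made for illustration only.

```python
import math


def log_multinomial_pmf(m, pi):
    log_coef = math.lgamma(sum(m) + 1) - sum(math.lgamma(mk + 1) for mk in m)
    return log_coef + sum(mk * math.log(pk) for mk, pk in zip(m, pi) if mk > 0)


def log_dirichlet_pdf(pi, alpha):
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1) * math.log(p) for a, p in zip(alpha, pi))


def dirichlet_posterior(alpha, m):
    """Conjugate update: the posterior parameters are prior parameters plus counts."""
    return [a + mk for a, mk in zip(alpha, m)]


def log_ratio(pi, alpha, m):
    """log[ Mult(m | pi) * Dir(pi | alpha) / Dir(pi | alpha_hat) ].

    If Dir(pi | alpha_hat) really is the posterior, this is the log of the
    normalizing constant and must not depend on pi.
    """
    alpha_hat = dirichlet_posterior(alpha, m)
    return (log_multinomial_pmf(m, pi)
            + log_dirichlet_pdf(pi, alpha)
            - log_dirichlet_pdf(pi, alpha_hat))


alpha = [1.0, 2.0, 3.0]   # prior parameters (arbitrary example values)
m = [4, 0, 2]             # observed face counts, M = 6

alpha_hat = dirichlet_posterior(alpha, m)      # [5.0, 2.0, 5.0]
c1 = log_ratio([0.2, 0.3, 0.5], alpha, m)
c2 = log_ratio([0.6, 0.1, 0.3], alpha, m)
same_constant = math.isclose(c1, c2, abs_tol=1e-9)
```

The two values agree because every power of $\pi_k$ cancels in the ratio, which is exactly the statement that only the normalization terms were dropped in the derivation.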
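Also, setting $M = 1$ in the Bayesian inference above shows that the parameters of a categorical distribution can be learned with a Dirichlet distribution in the same way. This calculation comes up especially when working with mixture models (clustering, for example), so it is useful to get reasonably used to it.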
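A minimal sketch of that $M = 1$ case, written as an online update: each categorical observation adds 1 to the matching Dirichlet parameter, and processing the observations one at a time ends at the same posterior as adding all the counts at once. The helper names, the prior, and the observation sequence are made up for illustration. The posterior mean $\hat{\alpha}_k / \sum_j \hat{\alpha}_j$ is the standard Dirichlet mean and is included only as a convenient point summary.

```python
def update(alpha, k):
    """One online step: observing category k adds 1 to alpha[k]."""
    new_alpha = list(alpha)
    new_alpha[k] += 1
    return new_alpha


def dirichlet_mean(alpha):
    """Mean of Dir(pi | alpha), i.e. alpha_k / sum(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]


prior = [1.0, 1.0, 1.0]
observations = [0, 2, 2, 1, 2, 0]   # indices of the observed categories

# Online: fold the observations in one at a time.
online = prior
for k in observations:
    online = update(online, k)

# Batch: count first, then apply a single multinomial-style update.
counts = [observations.count(k) for k in range(len(prior))]
batch = [a + c for a, c in zip(prior, counts)]

print(online, batch)            # [3.0, 2.0, 4.0] [3.0, 2.0, 4.0]
print(dirichlet_mean(online))   # estimated face probabilities
```

Because the posterior after each step is again a Dirichlet distribution, it can serve directly as the prior for the next observation.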
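*1: Incidentally, the Japanese-style pronunciation of "Bernoulli" will not be understood in English. Something like "ber-NOO-lee" is closer.
*2: You rarely need to evaluate the gamma function yourself in a program, but when you really have to, use a function that returns its logarithm (lgamma, for example), because the values become extremely large.
*3: If you are thinking "the computed posterior looks like a categorical distribution too...", take a moment to calm down. What we computed with Bayes' theorem is a distribution over $\boldsymbol{\pi}$, not over $\mathbf{m}$. As a function of $\boldsymbol{\pi}$, this form can only be a Dirichlet distribution.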