Review materials for Introduction to Information Retrieval #16
It has been a while since the last update, but I have uploaded the review materials for chapter 16 of the Introduction to Information Retrieval reading group below.
The theme of chapter 16 is "Flat Clustering", and the topic shifts from classification to clustering. Chapter 16 covers flat clustering, in which there is no relationship between one cluster and another. The following chapter 17 covers hierarchical clustering, which finds a hierarchical structure among the clusters.
Clustering
Chapters 13 through 15 focused on "Classification" with methods such as Naive Bayes and SVM. Clustering also groups information, but the two differ greatly in one respect: classification is supervised learning, while clustering is unsupervised learning. Clustering forms groups without any human-made category judgments.
Chapter 16 explains the basic concepts of clustering, the types of clustering, applications of clustering in information retrieval, and quantitative evaluation methods, and then introduces the K-means algorithm and the EM algorithm as concrete ways to implement it.
The K-means algorithm is a basic and important algorithm for flat clustering. It clusters the vectors in a vector space by assigning each vector to the cluster whose centroid is nearest. It starts from seed centroids chosen arbitrarily, then iteratively reassigns the vectors and moves the centroids until the clusters settle. The objective function that drives the clustering is RSS (residual sum of squares): the sum, over all vectors, of the squared distance from each vector to the centroid of its cluster. Minimizing RSS is the goal of the clustering. The book illustrates how the centroids of a set of 2-dimensional vectors move as K-means runs, which is fun to follow.
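The loop described above can be sketched in a few lines of Python. This is only a minimal sketch under my own assumptions, not the book's pseudocode: the function names (`kmeans`, `rss`, and so on), the random seeding, and the stopping rule (stop when assignments no longer change) are choices made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def rss(vectors, assignments, centroids):
    """Residual sum of squares: squared distance of every vector to its centroid."""
    return sum(
        squared_distance(v, centroids[a]) for v, a in zip(vectors, assignments)
    )


def kmeans(vectors, k, max_iterations=100, seed=0):
    """Hypothetical helper: returns (assignments, centroids)."""
    rng = random.Random(seed)
    # Seed selection: pick k distinct input vectors as the initial centroids.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignments = None
    for _ in range(max_iterations):
        # Reassignment step: each vector goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        if new_assignments == assignments:
            break  # nothing moved, so the centroids would not change either
        assignments = new_assignments
        # Recomputation step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == j]
            if members:  # an empty cluster simply keeps its old centroid
                centroids[j] = centroid(members)
    return assignments, centroids
```

Calling `kmeans(points, 2)` on two well-separated groups of 2-dimensional points and then `rss(points, assignments, centroids)` gives the objective value for the clustering it settled on. K-means only finds a local minimum of RSS, so a different seed can give a different result.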
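The time complexity of K-means is Θ(IKNM): it is linear in the number of centroid recomputations I, the number of clusters K, the number of documents N, and the dimensionality of the vectors M (the vocabulary size, if documents are vectorized with a bag of words model). That makes it more efficient than hierarchical clustering. However, the number of documents and the dimensionality both grow with the scale of the system, so, just as with the inverted index and cosine similarity computations, some care is needed to bring the cost down, for example by exploiting the sparsity of the vectors and the importance of individual terms.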
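One way to exploit sparsity can be sketched as follows. This is an illustration under my own assumptions, not something taken from the book: documents are stored as hypothetical `{term_index: weight}` dictionaries, centroids stay dense lists, and the squared distance is expanded as |d|^2 - 2 d·c + |c|^2 so that only the non-zero terms of the document are visited.

```python
def norm_squared_sparse(doc):
    """Squared length of a sparse document vector {term_index: weight}."""
    return sum(w * w for w in doc.values())


def norm_squared_dense(centroid):
    """Squared length of a dense centroid; computed once per iteration."""
    return sum(c * c for c in centroid)


def sparse_dot(doc, centroid):
    """Dot product that touches only the non-zero terms of the document."""
    return sum(w * centroid[t] for t, w in doc.items())


def squared_distance_sparse(doc, doc_norm_sq, centroid, centroid_norm_sq):
    """|d - c|^2 = |d|^2 - 2 d.c + |c|^2, using precomputed squared norms."""
    return doc_norm_sq - 2.0 * sparse_dot(doc, centroid) + centroid_norm_sq
```

With the norms precomputed, each distance costs time proportional to the number of distinct terms in the document rather than to the vocabulary size M. The centroids themselves remain dense in this sketch.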
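The EM algorithm generalizes K-means: whereas K-means performs hard clustering, where each document belongs to exactly one cluster, EM performs soft clustering, where a document can belong to several clusters to varying degrees.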
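The difference between hard and soft assignment can be sketched as below. Note the assumptions: this is not the model-based EM from the book. It only shows an assignment step under a simplifying assumption of my own, namely spherical Gaussian clusters with a shared `variance` and equal priors, and the function names are hypothetical.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hard_assignment(vector, centroids):
    """K-means style: index of the single nearest centroid."""
    return min(
        range(len(centroids)), key=lambda j: squared_distance(vector, centroids[j])
    )


def soft_assignment(vector, centroids, variance=1.0):
    """Membership weights over all clusters, summing to 1.

    Assumes spherical Gaussians with a shared variance and equal priors.
    """
    distances = [squared_distance(vector, c) for c in centroids]
    nearest = min(distances)
    # Subtracting the smallest distance keeps exp() numerically stable.
    weights = [math.exp(-(d - nearest) / (2.0 * variance)) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]
```

A vector exactly halfway between two centroids gets equal weight for both under `soft_assignment`, while `hard_assignment` has to pick one of them. As `variance` shrinks toward zero, the soft weights approach the hard assignment.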
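Next reading session and other notes
Today's reading group session covered chapter 17, which continues the clustering theme with hierarchical clustering. I was impressed by the quality of the presenter's Japanese translation of the chapter (the document was typeset in TeX, with even the equations neatly formatted). Next comes chapter 18, on Latent Semantic Indexing. As with clustering, the retrieval model used in recent chapters has gone back to being centered on the vector space model.
The next session is scheduled for 2/7 (Sat). After it, I will upload the review materials as usual.
The review slides (ppt) for past chapters can be browsed from the directory at the same URL (http://bloghackers.net/~naoya/iir/ppt/).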
Introduction to Information Retrieval
- Authors: Christopher D. Manning, Prabhakar Raghavan, Hinrich Schuetze
- Publisher: Cambridge University Press
- Release date: 2008/07/07
- Media: Hardcover
Apology
Since I was unable to attend two sessions in a row, there are still no review materials for chapters 13 through 15. I hope to add them when I find the time.