The natural language processing blog has a post titled "Non-parametric as memorizing, in exactly the wrong way?". Language modeling research has advanced considerably over the past few years: mathematically sophisticated models such as the Dirichlet process have appeared (helped along by more powerful computers), and a lot has become clearer. It is a hot field.
Recently I have been reading up on PPM, and I reread daiti-m's posts 「PPM, 言語モデル, Burrows-Wheeler Transform」 and 「PPMと言語モデル (2)」. I feel I understand a lot more now.
If you are working with a huge dataset like the Google Japanese N-gram corpus, you can get away with saying "Smoothing? What's that?" (I believe there was also machine translation research using Google 1T gram that said something like "with this much data, smoothing is unnecessary"). In practice, though, you either cannot get that much data, or, even if you can, CPU, memory, and disk constraints keep you from using all of it, so smoothing turns out to be an essential step after all (obvious, I know...).
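By the way, here is what smoothing is. When you assign probabilities to the events that occur in the data, if you give probability only to events that actually occurred, you cannot compute a probability for unseen events. (In a product, a single zero factor makes the whole product zero, so without smoothing one unknown word makes the entire sentence impossible to analyze.) This is the data sparseness problem, and smoothing is the technique that deals with it: assign probability to unseen events as well (and discount the probability of seen events by that amount).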
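To make the zero-probability problem concrete, here is a minimal sketch. It uses add-one smoothing on a word 1-gram model, which is my own choice for simplicity and not a method named in the posts above; the toy corpus, the function names, and the single extra vocabulary slot reserved for unseen words are all assumptions made for illustration.

```python
from collections import Counter

def mle_prob(counts, word):
    """Unsmoothed estimate: relative frequency, so unseen words get 0."""
    total = sum(counts.values())
    return counts[word] / total

def add_one_prob(counts, word, vocab_size):
    """Add-one estimate: every word, seen or not, gets one extra count."""
    total = sum(counts.values())
    return (counts[word] + 1) / (total + vocab_size)

def sentence_prob(words, prob_fn):
    """Multiply per-word probabilities (1-gram model, no context)."""
    p = 1.0
    for w in words:
        p *= prob_fn(w)
    return p

# Toy training data (hypothetical).
counts = Counter("the cat sat on the mat".split())
# Assumption: the vocabulary is the seen words plus one slot for "unseen".
vocab_size = len(counts) + 1

test = "the dog sat".split()  # "dog" never occurs in the training data
unsmoothed = sentence_prob(test, lambda w: mle_prob(counts, w))
smoothed = sentence_prob(test, lambda w: add_one_prob(counts, w, vocab_size))
```

Here `unsmoothed` collapses to zero because of the single unknown word, while `smoothed` stays positive. The price is that a seen word such as "the" ends up with a lower probability than its raw relative frequency, which is the discounting mentioned above.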
A similar concept is backoff, which says: "when data comes in that the default model has not seen, switch to a more basic model to analyze it." For example, suppose you build a model that predicts the next word from the immediately preceding word (a word 2-gram model). Even when you run into a two-word sequence you have never seen, if each word taken on its own is one you have seen, you predict with a model that computes the probability of that word occurring regardless of context (a word 1-gram model).
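The 2-gram to 1-gram fallback can be sketched roughly as follows. This is a simplified illustration, not a properly normalized backoff scheme: the values it returns are scores, not probabilities that sum to one. The penalty factor `alpha`, the function names, and the toy corpus are assumptions I introduce here.

```python
from collections import Counter

def train(words):
    """Count single words and adjacent word pairs."""
    unigrams = Counter(words)
    bigrams = Counter(zip(words, words[1:]))
    return unigrams, bigrams

def backoff_score(prev, word, unigrams, bigrams, alpha=0.4):
    """Use the 2-gram estimate if the pair was seen, else fall back to 1-gram.

    alpha is a hypothetical penalty applied when backing off.
    """
    if bigrams[(prev, word)] > 0:
        return bigrams[(prev, word)] / unigrams[prev]
    total = sum(unigrams.values())
    return alpha * unigrams[word] / total

unigrams, bigrams = train("the cat sat on the mat".split())

seen_pair = backoff_score("the", "cat", unigrams, bigrams)    # pair was observed
unseen_pair = backoff_score("cat", "mat", unigrams, bigrams)  # falls back to 1-gram
unknown_word = backoff_score("the", "dog", unigrams, bigrams) # word never seen at all
```

Note that `unknown_word` is still zero: backing off to the 1-gram model only helps when the word itself has been seen, so in practice backoff is combined with smoothing.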
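Returning to the original topic, the post mentioned at the start asks the following. When people learn, say, the English past tense, do they learn rules plus exceptions ("add -ed for the past tense; went, gave, and so on are exceptions"), or do they memorize every past-tense form outright ("talked, opened, ... are past-tense forms")? The latest machine learning methods favor the latter, but doesn't that contradict findings from cognitive linguistics? The answer offered is: "No, it seems that humans do not learn by rule either; they memorize everything, and back off to the rule when they meet a word they do not know." The post points out that in experiments with human subjects (intuitively it might seem to be the former, but in fact) the latter appears to hold, with the comment that the details are not remembered well enough to be sure.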
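Indeed, it seems reasonable to memorize frequently used items as whole words (so they can be accessed easily) and to handle rarely used ones with the default rule (even if processing takes a little longer). What remains is how to build that initial "frequently used words" list, but humans do that part automatically. Fascinating.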
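As a toy picture of "memorize first, back off to the rule", consider the sketch below. The memorized table, the function name, and the naive "append -ed" rule (which ignores spelling changes such as consonant doubling) are all hypothetical and only illustrate the idea; this is not a model from the post.

```python
# Hypothetical memorized forms: irregular and frequent regular verbs alike.
MEMORIZED_PAST = {
    "go": "went",
    "give": "gave",
    "talk": "talked",
    "open": "opened",
}

def past_tense(verb, memorized=MEMORIZED_PAST):
    """Look the verb up first; back off to the default rule if it is unknown."""
    if verb in memorized:
        return memorized[verb]
    # Backoff: naive default rule, just append "ed".
    return verb + "ed"

known = past_tense("go")       # found in the memorized table
novel = past_tense("blick")    # made-up verb, handled by the default rule
```

The open question from the paragraph above, how the memorized table gets built in the first place, is exactly the part this sketch leaves out.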