Articles for the one year from 2012-01-01
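I gave a talk at the Kansai Input Method Workshop, commonly known as the IM drinking party. @mamoruk has written up how the workshop went and what each talk covered in his diary entry for the 29th (@mamoruk's writing ability really is impressive). The "yōgen (verb and adjective) conjugation… that I presented…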
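There is a claim that "misuse of the word 恣意的 (shiiteki, 'arbitrary') is spreading." ニコニコ大百科 - 恣意的: Perhaps because it appears in sentences in a similar way, it is sometimes used to mean "intentional, contrived, deliberate," but 恣意的 carries the nuance of "random, with no fixed intention"…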
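Regarding the Japanese input system Simeji (official account: @Simeji_jp), a problem has been reported in which "your own account name shows up among the conversion candidates" because of its Social IME integration, and it seems to have stirred up a bit of a backlash. @Simeji_jp made the following announcement about it. @trankie…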
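This is a simplified version of the previous article, and also a supplement and a reflection on it*1. First, to anyone who saw the title and thought, "Wait, is 訊く wrong?": that was a short title chosen for impact, so I apologize if it caused a misunderstanding. As I wrote in the body, "distinguishing between 聞く and 訊く is…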
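The title is a rip-off. Two questions. Reactions from people who read the title fall mainly into the following two groups*1. "There are people who think 訊く is the correct one?" "Isn't 訊く the correct one when the meaning is 'to ask'?" First, on whether there are people who think 訊く is the correct one…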
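This time I write about the forward-backward algorithm for CRFs, as a point of contrast with the algorithm for variable-order CRFs. The forward-backward algorithm is used with first-order CRFs*1. Ways of applying it to higher orders are not inconceivable, but the computational cost is exponential in the order…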
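To make the first-order case concrete, here is a minimal sketch of forward-backward in log space. It is only an illustration under my own assumptions: the function names and the `emit`/`trans` score tables are hypothetical, and the scores are taken as already computed rather than derived from features and weights.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def forward_backward(emit, trans):
    """First-order (linear-chain) forward-backward in log space.

    emit[t][y]  : log-score of label y at position t (assumed precomputed)
    trans[a][b] : log-score of moving from label a to label b
    Returns (log_z, marginals) with marginals[t][y] = P(y_t = y | x).
    """
    T, L = len(emit), len(trans)
    alpha = [[0.0] * L for _ in range(T)]
    beta = [[0.0] * L for _ in range(T)]  # beta at the last position stays 0

    # Forward pass: alpha[t][y] sums over all label prefixes ending in y at t.
    alpha[0] = list(emit[0])
    for t in range(1, T):
        for y in range(L):
            alpha[t][y] = emit[t][y] + logsumexp(
                [alpha[t - 1][p] + trans[p][y] for p in range(L)]
            )

    # Backward pass: beta[t][y] sums over all label suffixes after y at t.
    for t in range(T - 2, -1, -1):
        for y in range(L):
            beta[t][y] = logsumexp(
                [trans[y][n] + emit[t + 1][n] + beta[t + 1][n] for n in range(L)]
            )

    log_z = logsumexp(alpha[T - 1])
    marginals = [
        [math.exp(alpha[t][y] + beta[t][y] - log_z) for y in range(L)]
        for t in range(T)
    ]
    return log_z, marginals
```

Each position costs on the order of L² operations for L labels. If an order-k model is handled by treating every tuple of k labels as one state, the number of states becomes L^k, which is where the exponential growth in the order comes from.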
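This time, an explanation of how variable-order CRFs are computed. This is my own research: I wrote it up as my master's thesis two years ago and also presented it at the domestic Association for Natural Language Processing meeting, but I never found the motivation to turn it, on my own, into a paper that could be submitted to an international conference, so it has stayed as it is. I am looking for people who would think it through with me…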
A continuation of the maximum entropy model article. This time the topic is CRFs (Conditional Random Fields) in general*1. I will not write about the forward-backward algorithm. I also will not touch on MEMMs here, which are generally considered closely related. CRFs and…
An explanation of maximum entropy models. Among existing materials, A Simple Introduction to Maximum Entropy Models for Natural Language Processing and 「言語処理のための機械学習入門」 (commonly called the Takamura book) are detailed. In this article I also use their terminology where appropriate and draw on their content…
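I made files that enumerate the derived forms of verbs. https://github.com/hiroshi-manabe/Japanese_verb_derivation Standalone verbs (entries pairing 書く with a derived form, and so on) and compound words (the same kind of pair with a prefix attached to 書く, and so on) are in separate files. I do not know whether this is useful for anything, but if I had to say…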
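When it comes to changes in Japanese, things like 「的を得る」 are the popular topics. They are a handy way to put on a smug face, declare "「的を射る」 is the correct form!", and get a little satisfaction. But changes in Japanese often take place in surprisingly unglamorous spots. Let me write about a few. 1. …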
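I made a list of Japanese yōgen (verbs and adjectives). https://github.com/hiroshi-manabe/japanese_verb_adj_list Most of what needs to be said is written in the README there, but here is a small supplement. The list is based on existing dictionaries, corpora, and the like, but especially for new words and slang…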
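I wrote about something called a variable-order CRF in my master's thesis and also presented it at the Association for Natural Language Processing meeting (NLP2011), but that is the kind of meeting where anything you submit gets through, so it does not mean much. Anyway, about this variable-order CRF: I personally think it is a good idea. In an ordinary CRF, 1…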
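A very small tidbit. When I wrote the C++ version of the wavelet matrix, I needed to build a bit-reversal table (one that reverses the order of the bits from most significant to least significant), and this is a small trick I came up with at the time. It has probably been published somewhere long ago. The situation is something like 0 to 255 (8 bits), or 0 to…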
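For readers who have not built such a table before, here is a minimal sketch of one well-known way to do it. This is not necessarily the trick described in the article (the excerpt is cut off before it gets there). The function name `build_bit_reversal_table` is hypothetical, and the sketch is in Python rather than C++.

```python
def build_bit_reversal_table(bits=8):
    """Return a list t where t[i] is i with its lowest `bits` bits reversed."""
    size = 1 << bits
    table = [0] * size
    for i in range(1, size):
        # table[i >> 1] is already filled in. Shifting that reversal right by
        # one gives the reversal of i's upper bits, and i's lowest bit becomes
        # the new top bit.
        table[i] = (table[i >> 1] >> 1) | ((i & 1) << (bits - 1))
    return table
```

Each entry is derived from an earlier one with a constant number of operations, so the whole table costs one pass over the range and avoids reversing every value bit by bit.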
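After quitting Google, I took a good long rest to cool my head and thought about all sorts of things in the meantime. In the article I dashed off when I quit, I wrote that I would "try applying to an IME company," but I was not convinced that was the right move. After all, speaking of IME companies, there is the one that makes*2 a certain domestic IME*1, 徳…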
The Japanese found in print is of very high quality, because it is checked by editors and proofreaders. By contrast, the Japanese on the net varies widely in quality. Ordinary people who are not professional writers are the ones writing it, so that is inevitable. That said…
First things first: the distinction between 「聞く」 and 「訊く」 is not "a matter of meaning." Some younger readers may react to that with "What?" Isn't it 「訊く」 for asking a question, 「聴く*1」 for appreciating music, and 「聞く」 for everything else? That…
Using a wavelet matrix, I made a clone of wat-array (excluding List*Range()). GitHub repository: https://github.com/hiroshi-manabe/wavelet-matrix-cpp The tests are not in a working state. Using the performance_test.cpp included in wat-array…
(Note: this material was not in the original paper, so it leans toward original research.) In the last two articles I introduced two algorithms that use the wavelet matrix. Both are operations on a range of an array specified by a start position and an end position. RankLessThan(),…
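Continuing from last time, I write about searching with the wavelet matrix. Again, this may feel long-winded to people who are used to binary numbers and bit manipulation. This time the function is "return the n-th smallest number among the numbers within a given range of the array." This one is a little complicated…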
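As a rough illustration of the "n-th smallest in a range" query, here is a toy sketch. It is not the code from the article or the repository. The class and method names are hypothetical, n is assumed to be 0-based, and plain prefix-sum lists stand in for the succinct bit vectors a real implementation would use.

```python
class WaveletMatrix:
    """Toy wavelet matrix over non-negative integers below 2**bit_len."""

    def __init__(self, values, bit_len):
        self.size = len(values)
        self.bit_len = bit_len
        self.zero_counts = []  # number of 0 bits on each level
        self.prefix_ones = []  # prefix_ones[level][i] = count of 1 bits in [0, i)
        current = list(values)
        for level in range(bit_len):
            shift = bit_len - 1 - level  # most significant bit first
            bits = [(v >> shift) & 1 for v in current]
            prefix = [0]
            for b in bits:
                prefix.append(prefix[-1] + b)
            self.prefix_ones.append(prefix)
            zeros = [v for v, b in zip(current, bits) if b == 0]
            ones = [v for v, b in zip(current, bits) if b == 1]
            self.zero_counts.append(len(zeros))
            current = zeros + ones  # stable partition: zeros first, then ones

    def quantile(self, begin, end, n):
        """Return the n-th smallest (0-based) value in values[begin:end]."""
        if not (0 <= begin < end <= self.size and 0 <= n < end - begin):
            raise ValueError("invalid range or rank")
        value = 0
        for level in range(self.bit_len):
            prefix = self.prefix_ones[level]
            ones_before = prefix[begin]
            ones_in_range = prefix[end] - ones_before
            zeros_in_range = (end - begin) - ones_in_range
            if n < zeros_in_range:
                # The answer has a 0 at this bit: follow the zeros.
                begin = begin - ones_before  # zeros located before `begin`
                end = begin + zeros_in_range
                value <<= 1
            else:
                # The answer has a 1 at this bit: skip the zeros, follow the ones.
                n -= zeros_in_range
                begin = self.zero_counts[level] + ones_before
                end = begin + ones_in_range
                value = (value << 1) | 1
        return value
```

For example, with `WaveletMatrix([2, 1, 3, 0], 2)`, calling `quantile(1, 4, 2)` looks at the values 1, 3, 0 and returns 3. At each level the query only needs to know how many zeros fall inside the current range, which is why it takes one step per bit rather than a sort of the range.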
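Last time I wrote that whatever a wavelet tree can do, a wavelet matrix can do too. Over this article and the next, I will write about how to execute two kinds of functions using the wavelet matrix. Because I follow the steps in fine detail, this may feel long-winded to people who are used to binary numbers and bit manipulation…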
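In addition to rank() and select(), I added most of what wat-array implements to my test implementation of the wavelet matrix. I was able to confirm that whatever can be done with a wavelet tree can be done in the same way with a wavelet matrix (the original paper does say so at the end)…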
I read Echizen's introductory article on the Wavelet Matrix and was so impressed that I tried implementing it in Python. https://github.com/hiroshi-manabe/wavelet-matrix (Addendum: a C++ version based on wat-array https://github.com/hiroshi-manabe/wavelet…
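Inspired by the English-learning blog entries going around, I thought I would write a second-foreign-language-learning entry. There is of course no essential difference in how you study English and how you study other foreign languages, but between English, which is studied at school over six years of junior high and high school, and a second foreign…, which you start with almost no foundation…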
I made it. The repository is here: https://github.com/hiroshi-manabe/ngram-converter-cpp Earlier, in the article "N-gram kanji-kana conversion," I released a kana-to-kanji and kanji-to-kana converter that uses N-grams. As for the algorithm used internally, variable-order N-gram decoding…
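Once in a while, I thought I would write an article that suits the title of this diary. I neither particularly like nor dislike farewell parties. If I am invited, I go; if I am not invited, I do not. I cannot really say I understand what they are for, either. Welcome parties I can still understand…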
Belatedly, I read Faster and Smaller N-Gram Language Models. It has already been introduced in "ACL2011論文「Faster and Smaller N-Gram Language Models」を読んだ - EchizenBlog-Zwei" and "N-gram 言語モデルを圧縮するには - やた＠はてな日記"…
I write about a feature called 「修飾キー省略入力」 (input that omits modifier keys), which is implemented in iOS Japanese flick input and elsewhere. It is explained in the following article. http://blog.pasonatech.co.jp/ohashi/16439.html When you type 「しゆうてん」, candidates such as 「充電」, 「終電」, 「重点」, and 「終点」…
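One way to picture this behavior is that both the dictionary readings and the typed string are reduced to a form with the voicing marks removed and the small kana enlarged, and are then matched on that form. The sketch below only illustrates that idea; it is not how iOS actually implements the feature. The helper names and the four-entry dictionary are hypothetical.

```python
import unicodedata
from collections import defaultdict

# Small kana mapped to their full-size counterparts.
SMALL_TO_LARGE = str.maketrans("ぁぃぅぇぉっゃゅょゎ", "あいうえおつやゆよわ")
# Combining dakuten and handakuten, as produced by NFD decomposition.
VOICING_MARKS = {"\u3099", "\u309a"}


def strip_modifiers(reading):
    """Drop dakuten/handakuten and enlarge small kana in a hiragana string."""
    decomposed = unicodedata.normalize("NFD", reading)
    base = "".join(ch for ch in decomposed if ch not in VOICING_MARKS)
    return base.translate(SMALL_TO_LARGE)


def build_index(entries):
    """Index (reading, word) pairs by their modifier-free reading."""
    index = defaultdict(list)
    for reading, word in entries:
        index[strip_modifiers(reading)].append(word)
    return index


def lookup(index, typed):
    """Return every word whose reading matches `typed` once modifiers are ignored."""
    return index.get(strip_modifiers(typed), [])


# Hypothetical toy dictionary: (reading, word).
ENTRIES = [
    ("じゅうでん", "充電"),
    ("しゅうでん", "終電"),
    ("じゅうてん", "重点"),
    ("しゅうてん", "終点"),
]
INDEX = build_index(ENTRIES)
```

With this toy dictionary, `lookup(INDEX, "しゆうてん")` returns all four words, since each reading reduces to the same modifier-free string. A real input method would still have to rank those candidates, which this sketch leaves out.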
I did it.
Let me write down what I think about Japanese input. Personally, I find it deeply regrettable that the only Japanese input systems available besides the defaults are ATOK and Google 日本語入力 (well, there is also Baidu IME and the like). In magazine features and so on, Google 日…