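A Global Weighting Method for Word N-grams Based on Information Distance

Masumi Shirakawa, Takahiro Hara, Shojiro Nishio
Graduate School of Information Science and Technology, Osaka University
NLP Wakate no Kai (YANS), 9th Symposium, at Maholova Minds Miura, Monday, September 22, 2014

Three contributions of this work
- Clarifying the relationship between IDF and information distance
- Proposing a global weighting method for word N-grams based on information distance
- Implementing an efficient computation method using enhanced suffix arrays and wavelet trees

Research background

Term weighting

TF-IDF combines a local weight and a global weight.
- Local weight: how important the target term is in the document currently under consideration. The weight of a term varies with the document in which it appears.
- Global weight: how important the target term is in general. The weight of a term is fixed regardless of the document in which it appears.

Term weighting methods

TF-IDF(t, d) = tf(t, …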