tl;dr I analyzed tech trends from the roughly 30,000 Hatena Bookmark popular entries of 2020. This post records the conclusions and the steps used to chart them in Python. * As was pointed out to me, "IT news trends" may be a more accurate label than "tech trends", so please read with that in mind. Preface: I had on hand about 30,000 titles of articles that appeared at least once among the popular entries of Hatena Bookmark's technology category during 2020, so I ran morphological analysis on them and ranked the words by frequency of occurrence. At most about 10% of the data is missing, which means I should hold at least about 90% of it, so I believe the results are quite accurate. (* Hatena Bookmark is a service, similar to NewsPicks, where users bookmark and comment on articles from around the internet; articles that collect many bookmarks are picked up as popular entries.)
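The ranking step described in this excerpt can be sketched roughly as follows. This is a minimal sketch, not the author's actual code: the function names and the sample titles are made up for illustration, and the MeCab-based tokenizer assumes the mecab-python3 package and a dictionary are installed. The sample run uses plain whitespace splitting so it works without MeCab.

```python
from collections import Counter


def make_mecab_tokenizer():
    """Return a tokenizer backed by MeCab (assumes mecab-python3 is installed)."""
    import MeCab  # imported lazily so the rest of the sketch runs without it

    tagger = MeCab.Tagger("-Owakati")  # wakati mode: space-separated surface forms

    def tokenize(text):
        return tagger.parse(text).split()

    return tokenize


def rank_title_words(titles, tokenize):
    """Count every token over all titles and return (word, count) pairs, most frequent first."""
    counts = Counter()
    for title in titles:
        counts.update(tokenize(title))
    return counts.most_common()


# Hypothetical, pre-segmented titles so that str.split can stand in for MeCab.
sample_titles = ["Python で 形態素解析", "Python 入門", "形態素解析 入門 Python"]
ranking = rank_title_words(sample_titles, str.split)
print(ranking[:3])
```

With real titles, `rank_title_words(titles, make_mecab_tokenizer())` would replace the whitespace split.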
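What is MeCab? It is a morphological analysis engine. Plenty of other articles explain morphological analysis, so I will skip that here. Incidentally, the name reportedly comes from the developer's fondness for mekabu (和布蕪, a kind of seaweed). Libraries MeCab needs: install the required libraries beforehand. The list is excerpted from the MeCab site. If any other libraries turn out to be needed, a comment would be much appreciated m(_ _)m ・ubuntu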
Introduction: A year ago I wrote an article like this one. It still picks up likes now and then, so I decided to post another "played around with it after a long while" article, partly as natural language processing practice. What I did: crawled lyrics data; segmented the text into words with MeCab; vectorized it with tf-idf; clustered artists by their vectorized lyrics and visualized the result with UMAP; (bonus) checked whether fastText can tell … and 米津玄師 apart. I used Jupyter Lab for the analysis. Lyrics data: this section describes the lyrics data used. Obtained by crawling: I crawled the lyrics beforehand. From a site that lists lyrics by artist popularity, I collected up to 50 songs for each of 45 J-pop artists and saved them to CSV. I have reservations about publishing the actual crawling code, so I omit it here, but Bea
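The tf-idf step listed above can be illustrated with a small pure-Python sketch. It uses the plain textbook definition (term frequency times log(N / document frequency)) as an assumption; the excerpt does not say which library or variant the author used, and libraries such as scikit-learn apply smoothed variants by default. The function name and the toy lyrics are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists (already word-segmented).

    Returns (vocabulary, vectors), one vector per document, using
    tf = count / document length and idf = log(N / document frequency).
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents each token appears.
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log(n_docs / df[tok]) for tok in vocab}

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


# Toy, pre-segmented "lyrics" standing in for MeCab output.
lyrics = [
    "夜 の 街 を 歩く".split(),
    "夜 の 海 を 泳ぐ".split(),
    "朝 の 街 で 笑う".split(),
]
vocab, vectors = tfidf_vectors(lyrics)
print(vocab)
print(vectors[0])
```

A word that appears in every document (here の) gets weight zero, while a word unique to one document gets the largest weight, which is what makes the vectors useful for clustering.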
- Introduction - In recent years, puns (dajare) in the IT industry have grown ever more intense. Puns that skillfully work in synonyms, obfuscated puns, and the like are on the rise, and more than a few young people are left wondering where exactly they are supposed to laugh. Against this background, algorithms for detecting puns are being actively developed. Among rule-based approaches there are dajarep *1, proposed and developed by @kurehajime, and Shareka *2 by @fujit33. Shareka in particular, despite being rule-based logic, achieves high accuracy on the class of puns known as the repetition type. As a detection method using a machine learning model, there is DajaRecognizer *3, developed by Yatsu (@tuu_yaa) et al. DajaRecognizer uses many rules to define consonant phonological similarity as PMI, and Bag-of-Words
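To give a feel for what a rule-based check for repetition-type puns might look like, here is a deliberately naive sketch. It is not the logic of dajarep or Shareka; the function name, the minimum length of three characters, and the normalization (dropping ッ and ー) are all assumptions made for illustration. It takes a katakana reading and reports whether some chunk of the reading occurs twice without overlapping.

```python
def has_repeated_reading(reading, min_len=3):
    """Return True if a chunk of at least min_len characters occurs twice (non-overlapping).

    reading: katakana reading of the sentence.
    The small tsu and the long-vowel mark are dropped first so that
    "フットン" can match "フトン" (an assumption of this sketch).
    """
    norm = "".join(ch for ch in reading if ch not in "ッー")
    for start in range(len(norm) - min_len + 1):
        chunk = norm[start:start + min_len]
        # Look for the same chunk again, starting after the first occurrence ends.
        if norm.find(chunk, start + min_len) != -1:
            return True
    return False


print(has_repeated_reading("フトンガフットンダ"))
print(has_repeated_reading("キョウハイイテンキデス"))
```

A check this simple will misfire on ordinary sentences that happen to repeat a word, which is one reason real detectors layer further rules or learned models on top.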
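Yet another continuation of yesterday's post. Yesterday I ran morphological analysis on Natsume Soseki's "Botchan" with MeCab and fed the result into Word Cloud; today I want to count the most frequent nouns and turn them into a chart. For drawing charts in Python, seaborn seems to be a well-known library, so I will try it. To count the words, I also use collections, the standard library module of container datatypes. The sample code is as follows. Yesterday I used stop_words to keep unwanted words out, but this time they are already in the collections counter, so I remove them with del. The most_common method then gives the top 30 most frequent words for plotting. In no time I had a nice chart like the one in the title image. Incredibly easy. Python is lovely. seaborn apparently draws many other kinds of charts too, so I would like to play with it a bit more. At work, too …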
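The sample code itself is not included in this excerpt, so the following is only a sketch of the steps described: count nouns with collections.Counter, drop unwanted words with del, and take the top entries with most_common. The function names and the tiny noun list are made up, and the plotting helper assumes seaborn and matplotlib are installed (a Japanese-capable font may also be needed for the labels to render).

```python
from collections import Counter


def top_nouns(nouns, stop_words, n=30):
    """Count nouns, delete stop words from the counter, return the n most common."""
    counter = Counter(nouns)
    for word in stop_words:
        del counter[word]  # Counter ignores missing keys on deletion
    return counter.most_common(n)


def plot_top_words(pairs):
    """Draw a horizontal bar chart of (word, count) pairs with seaborn."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting
    import seaborn as sns

    words = [word for word, _ in pairs]
    counts = [count for _, count in pairs]
    ax = sns.barplot(x=counts, y=words)
    ax.set_xlabel("count")
    plt.tight_layout()
    plt.show()


# Hypothetical noun list standing in for the MeCab output of the novel.
nouns = ["先生", "の", "学校", "先生", "こと", "学校", "先生", "赤シャツ"]
ranking = top_nouns(nouns, stop_words=["の", "こと"])
print(ranking)
```

Calling `plot_top_words(ranking)` afterwards would draw the chart.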
Janome is a morphological analysis engine for Python. It can split Japanese text into morphemes and determine their parts of speech, or segment text into words (wakati-gaki). It can be installed with pip. mocobeta/janome: Japanese morphological analysis engine written in pure Python. Welcome to janome's documentation! (Japanese) - Janome v0.4 documentation (ja). janome package - Janome API reference v0.4. This article covers the following: installing Janome; Janome and MeCab (accuracy of analysis results, speed of morphological analysis); morphological analysis with Janome (basic usage, attributes of the Token object); word segmentation with Janome …
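The basic usage outlined above might look like the sketch below. It relies on Janome's `Tokenizer.tokenize` (including its `wakati=True` option) and on the `surface` and `part_of_speech` attributes of its tokens; the helper names are hypothetical. So that the filter can be tried without Janome installed, the sample run uses simple stand-in objects that mimic those two attributes.

```python
from types import SimpleNamespace


def tokenize(text):
    """Return Janome Token objects for text (assumes `pip install janome`)."""
    from janome.tokenizer import Tokenizer  # imported lazily

    return list(Tokenizer().tokenize(text))


def wakati(text):
    """Return only the surface forms, i.e. the text segmented into words."""
    from janome.tokenizer import Tokenizer

    return list(Tokenizer().tokenize(text, wakati=True))


def surfaces_with_pos(tokens, pos="名詞"):
    """Keep the surface of tokens whose top-level part of speech equals pos."""
    # part_of_speech is a comma-separated string such as "名詞,一般,*,*".
    return [t.surface for t in tokens if t.part_of_speech.split(",")[0] == pos]


# Stand-ins that mimic Token.surface and Token.part_of_speech.
stand_ins = [
    SimpleNamespace(surface="猫", part_of_speech="名詞,一般,*,*"),
    SimpleNamespace(surface="が", part_of_speech="助詞,格助詞,一般,*"),
    SimpleNamespace(surface="歩く", part_of_speech="動詞,自立,*,*"),
]
picked = surfaces_with_pos(stand_ins)
print(picked)
```

With Janome installed, `surfaces_with_pos(tokenize("猫が歩く"))` applies the same filter to real tokens.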