This is tetsu, an engineer. When it comes to methods for obtaining vector representations of words, the neural-network-based Word2vec is probably the best known. However, plain Word2vec cannot vectorize unknown (out-of-vocabulary) words. fastText, by contrast, can vectorize unknown words as well. In this post I use fastText to vectorize unknown words and look for similar words among those contained in the training data. What is fastText? fastText is a library developed by Facebook, and its source is published on GitHub: https://github.com/facebookresearch/fastText fastText can quickly obtain distributed representations of words (word vectorization) and can solve sentence classification problems. This time, the former, the distributed representations of words...
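As background for the unknown-word claim above: fastText builds a word's vector from the vectors of its character n-grams, so a word never seen in training still shares n-grams with known words. The following is a minimal sketch of that n-gram extraction only. It assumes n-gram lengths 3 to 6 and `<`/`>` boundary markers, the function name `char_ngrams` is hypothetical, and it does not call the fastText library itself.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of a word, with boundary markers added.

    Illustrative helper only (hypothetical name), not part of fastText's API.
    """
    marked = f"<{word}>"  # mark the start and end of the word
    grams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


# A "known" word and an "unknown" word share many n-grams, which is
# why a vector can still be composed for the unknown one.
known = set(char_ngrams("where"))
unknown = set(char_ngrams("whereas"))
shared = sorted(known & unknown)
print(shared)
```

With a trained model, the vectors of the shared n-grams would be combined to produce a vector for the unseen word; here only the overlap is shown.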
-r --rcfile: specify the resource file to use. The resource file is the "dicrc" file found in the dictionary directory. As a test, copy the system dictionary's "dicrc" file to a file named "dicrc2" and, in its "; simple" section, change "EOS" to "eos". The result looks like this.
// Run without specifying a resource file
$ echo テスト | mecab -O simple
テスト 名詞-サ変接続
EOS
// Run with the modified dicrc2 specified as the resource file
$ echo テスト | mecab -r dicrc2 -O simple -d /usr/local/lib/mecab/dic/naist-jdic
テスト 名詞-サ変接続
eos
In my environment, with the system dictionary directory as the current directory, the system dictionary via "-d"...
Introduction: This is a memo on how to use dictionaries with MeCab, the open-source Japanese morphological analyzer. MeCab has two kinds of dictionaries: system dictionaries and user dictionaries. System dictionaries are said to be faster to process, so there are probably few occasions to use a user dictionary. In addition, mecab-ipadic-NEologd, a system dictionary for MeCab that is strong on new words and named entities, is publicly available; it is updated twice a week (Monday and Thursday) with information collected from general websites (Hatena Keyword, postal code data, SNS, news articles, etc.). References: MeCab official site; Customizing the MeCab dictionary; mecab-ipadic-neologd - GitHub. Environment: OS: Red Hat Enterprise Linux 7.2; MeCab: 0.996. MeCab system dictionary...
require 'csv'

def output(title, type)
  title_length = title.length
  return nil unless title_length > 3
  score = [-36000.0, -400 * (title_length ** 1.5)].max.to_i
  [title, nil, nil, score, '名詞', '一般', '*', '*', '*', '*', title, '*', '*', type]
end

CSV.open("user.csv", 'w') do |csv|
  # niconico
  Dir::foreach('./niconico') do |f|
    next unless f =~ /^head[0-9]{4}\.csv$/
    open("./niconico/#{f}").each do |line|
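The scoring line in the Ruby snippet above gives longer titles a more negative cost, floored at -36000, and skips titles of three characters or fewer. In MeCab a lower cost makes an entry more likely to be chosen. A small Python sketch of the same expression, to make the numbers concrete; the function name `entry_cost` is hypothetical and not part of the original script.

```python
def entry_cost(title):
    """Python mirror of the Ruby scoring expression (hypothetical helper).

    Returns None for titles of 3 characters or fewer, as the Ruby code does.
    """
    length = len(title)
    if length <= 3:
        return None
    # Longer titles get a lower (more negative) cost, floored at -36000.
    return int(max(-36000.0, -400 * (length ** 1.5)))


for sample in ["abc", "abcd", "a" * 25]:
    print(len(sample), entry_cost(sample))
```

A 4-character title gets -3200, while a 25-character title would compute to -50000 and is clamped to -36000.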
With VTubers being churned out one after another these days, leaving MeCab's dictionary out of date degrades the accuracy of morphological analysis. You could write your own morphological analyzer, but putting effort there is not sensible either, so let's build our own dictionary so that newly coined words can be analyzed. This entry is for my own reference, so if you want the details, follow the references at the bottom to the individual entries. Environment: MacOS X. As the system dictionary I use mecab-ipadic-neologd rather than ipadic. Compiling: It is not that hard.
$ /usr/local/Cellar/mecab/0.996/libexec/mecab/mecab-dict-index \
    -d /usr/local/lib/mecab/dic/mecab-ipadic-neologd \
    -u user.dic \
    -f utf-8 \
    -t utf-8 added.csv
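Overview: By understanding the structure of the word dictionary, you can use MeCab as a general-purpose text conversion tool. For example, hiragana to katakana conversion, romaji to hiragana conversion, Auto Link, and so on can be done with MeCab alone. Files: To build a word dictionary, you need to create at least the following files: *.csv files (word dictionary), matrix.def (connection matrix), unk.def (part-of-speech definitions for unknown words), char.def (character definitions for unknown words), dicrc (configuration file). *.csv files: This is the word dictionary. Entries are added as CSV like the following: test,1223,1223,6058,foo,bar,baz The first four fields are required; they are, respectively: surface form; left context ID (the context ID of the word as seen from the left); right context ID (the context ID of the word as seen from the right)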
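The entry layout described above can be sketched as a small parser. The excerpt is cut off before it names the fourth required field; the code below assumes it is the word cost, as in MeCab's dictionary format, and treats everything after it as free-form features. The names `DictEntry` and `parse_entry` are hypothetical.

```python
import csv
import io
from typing import List, NamedTuple


class DictEntry(NamedTuple):
    surface: str         # surface form
    left_id: int         # left context ID
    right_id: int        # right context ID
    cost: int            # assumed: fourth required field is the word cost
    features: List[str]  # remaining free-form columns


def parse_entry(line):
    """Parse one dictionary CSV line into a DictEntry (illustrative helper)."""
    row = next(csv.reader(io.StringIO(line)))
    if len(row) < 4:
        raise ValueError("an entry needs at least four fields")
    return DictEntry(row[0], int(row[1]), int(row[2]), int(row[3]), row[4:])


entry = parse_entry("test,1223,1223,6058,foo,bar,baz")
print(entry)
```

Using the `csv` module rather than `str.split` keeps quoted fields containing commas intact.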
Introduction: fastText is a library developed by Facebook that takes a corpus as input and produces distributed representations of words (words turned into vectors). It was developed by Tomas Mikolov, the developer of word2vec, which likewise learns distributed representations of words from a corpus. The model has been improved, however, and training is said to be faster and accuracy higher (see the references). In this post I use the library to show how to get "synonyms for a given word". We will go through the installation and sample code. Contents: What is fastText; Installation; Training on a Wikipedia corpus with fastText; Getting synonyms; Summary. Installation: First, install the libraries that fastText requires. Following the official instructions, the fastText installation...
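This is D. M. These days text analysis has become very easy to do. It is actively being tested and used within our team, and I have gone with the flow and started playing with Word2Vec and Doc2Vec; there are a lot of helpful articles in Japanese. The common ones analyze news articles, Aozora Bunko, or Wikipedia, but there are also articles that feed in a company's own string data, and fairly large data at that, to produce related words, which looks fun regardless of whether it is usable in practice. What I want to do: On synonym detection, a great variety of articles has already been posted on the web, which is impressive, and since it looked fairly easy I thought I would test something too. Doing the same thing as everyone else is not very interesting, though, so I looked for something that can be started at small scale and decided, for now, to feed in my own Twitter posts and look at the synonyms. Today I introduce that rudimentary attempt.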
Hello, this is @y-matsushita from the Advanced Technology Department. This time, as one of our machine learning efforts, I would like to start with text classification using fastText. fasttext.cc fastText is a machine learning library provided by Facebook that supports word vectorization and text classification. As the name fastText suggests, it is lightweight and fast. When I tried it, accuracy was good and it ran lightly, so let me introduce it! As a trial, I classify Twitter posts containing a mix of all kinds of information into three categories: "beauty", "entertainment", and "lifestyle". This article uses Python 3.6.1. What you can do with fastText: First, let me show the results of using fastText. "Before classification" is the state before processing, and "after classification" is fastT
Hello. It has been quite a while. It is March and this is my first entry of the year; how did that happen... I installed fastText a while back but never actually used it, so this time I train word vectors on some arbitrary text and try doing arithmetic on them. Incidentally, fastText seems to be used mostly for classification, and doing things with the distributed representations is not all that popular, but the feature is there, so I will try it. What can it do? Things like "Paris" - "France" + "Japan" = "Tokyo", or "king" - "man" + "woman" = "queen". This is not a feature unique to fastText; it is a feature made famous by word2vec, which fastText is based on (though "based on" does not feel quite right either). Training: There are plenty of detailed explanations out there, so I will skip most of it. This...
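The word arithmetic mentioned above amounts to adding and subtracting vectors and then picking the nearest remaining word by cosine similarity. The sketch below uses hand-made three-dimensional toy vectors, not embeddings learned by fastText or word2vec; the names `cosine`, `analogy`, and `toy_vectors` are illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors chosen by hand: (is-a-capital, France, Japan).
toy_vectors = {
    "paris":  [1.0, 1.0, 0.0],
    "france": [0.0, 1.0, 0.0],
    "japan":  [0.0, 0.0, 1.0],
    "tokyo":  [1.0, 0.0, 1.0],
    "kyoto":  [0.2, 0.0, 1.0],
    "london": [1.0, 0.0, 0.0],
}

print(analogy(toy_vectors, "paris", "france", "japan"))  # prints "tokyo"
```

With real embeddings the same steps apply, but the vectors come from a trained model and the nearest word is only approximately the expected one.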
Good evening. I gave in to the cold and have already taken out my mouton coat; what am I supposed to wear when it gets even colder? Anyway, today I want to try "fastText", the natural language processing library published by Facebook, so I will set up an environment for it. Let's install it: When I first looked into it, the setup steps for running from the command line and the setup steps for the python library seemed to be mixed together, so I was confused from the very start. W-w-what is going on... I do not write python, so I would prefer running commands... (googling) What I figured out for now is that a Linux environment looks like a better choice than a Windows environment. So I decided to spin up a Docker container on MacOS and try it there. Docker is handy at times like this because starting over is quick. So...
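This is the day 1 article of Sansan Advent Calendar 2018. It is a memo about MeCab, which I rely on all the time. It covers usage from installation, dictionaries, and dictionary maintenance through to handling it from Python and the shell. If you are the type who says "just read the manual!", the official manual is thorough, so reading that is probably the better option. Contents: What is MeCab; Installation; Building from source on Linux; Package managers; Docker; Basic usage; Analyzing from standard input; Output formats; Dictionaries; Dictionary maintenance; Python bindings; Conclusion. What is MeCab: MeCab (和布蕪) is an open-source morphological analyzer that has been under development since 2006. It runs very fast, and because dictionaries can be distributed and created, it is widely used. Because it is designed as a text converter, for example, like a hiragana-to-katakana converter...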
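This is Saito (@pigooosuke) from DataStrategy. The online shop creation service "BASE" is used by 600,000 shops, and users of the shopping app "BASE" can find products that interest them through new arrivals, keyword search, related products, product features, and so on. This time, as a new feature, we used machine learning to implement a path that leads users to products they are likely to be interested in by displaying keywords related to the search term. The DataStrategy team had only just been formed and had no word dictionary adapted to the service domain, so we started by creating one from scratch. My impression is that knowledge about dataset annotation in machine learning is rarely shared, so I take this opportunity to walk through what we did, from building the data to implementation. Overview: This time, what kinds of keywords, if semantically close, would be acceptable to suggest...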
I tried using the morphological analysis engine MeCab from Python3, so here is an introduction. Environment: macOS 10.13.6, Python 3.6.4. Preparation: Install MeCab, a dictionary, and mecab-python3.
$ brew install mecab mecab-ipadic git curl xz
$ pip install mecab-python3
Installing mecab-ipadic-NEologd: With the standard dictionary, terms such as EC2 and S3 were not segmented properly, so I installed mecab-ipadic-NEologd, a system dictionary with new words from the web added. For installation details, see mecab-ipadic-NEologd : Neologism dictionary for MeCab.
$ git clone --depth 1 git@
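Eiji Yoshikawa's books are available on Aozora Bunko, so I tried some language analysis on them. I analyze Miyamoto Musashi, the original work behind Vagabond, and Sangokushi (Romance of the Three Kingdoms), which I read long ago and only faintly remember. If you only want to see the results, skip ahead to the "Trying it out" section. This time I do the training and use the trained data with node.js instead of python, because there was a wonderful package called word2vec-node. Environment: mac OSX Elcapitan 10.11.6, Mecab, mecab-ipadic-neologd, node.js 6.1.0, mecab-async, word2vec-node. We create the word2vec data with node.js. Creating the training data: Obtaining the data: I took Eiji Yoshikawa's books on Aozora Bunko from here and analyzed them. Updates seem to have stopped, so titles such as Shinsho Taikoki are missing, but Sangokushi and Miyamoto Musashi are included, so from here...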
SSII2021 [TS3] Data Collection in Annotation for Machine Learning: Mechanisms for Improving Accuracy, Ethics, and Social Bias