Finding the words that come up most often at recent study groups
This is a rough, throwaway script that does roughly that.
- Fetch study-group event information from ATND and connpass through their APIs.
- Run morphological analysis with mecab to break the text into words.
- Count the words.
    # Fetch the descriptions of events that match the keyword
    # (replace YOUR_KEYWORD with any search term)
    curl -s 'http://connpass.com/api/v1/event/?count=100&keyword=YOUR_KEYWORD' | jq '.events[].description' >> dump_1
    curl -s 'http://api.atnd.org/events/?format=json&count=100&keyword=YOUR_KEYWORD' | jq '.events[].event.description' >> dump_1

    # Strip anything that looks like an HTML tag, and remove the escaped newline/tab codes
    awk '{ gsub(/<[^>]*>/, "") ; gsub(/\\n|\\t/, "") ; print }' dump_1 > dump_2

    # Morphological analysis; keep only proper nouns (固有名詞), excluding place names (地域) and personal names (人名)
    mecab --input-buffer-size 8192000 dump_2 | awk '($2 ~ /固有名詞/ ) && ($2 !~ /地域/) && ($2 !~ /人名/) {print $0}' > dump_3

    # Occurrence counts (sort first: uniq -c only merges adjacent identical lines)
    sort dump_3 | uniq -c | sort -n

    #rm dump_*
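To make the filter-and-count steps easier to follow, here is a minimal sketch that runs them on a small made-up sample instead of real API data. The sample lines only imitate the shape of mecab's default output (surface form, a tab, then comma-separated features), and the function names `sample_mecab_output` and `count_proper_nouns` are hypothetical helpers introduced for illustration. Unlike the script above, this variant splits on tabs and prints only the surface form, so the counts are per word rather than per full mecab line.

```shell
#!/bin/sh

# Hypothetical sample shaped like mecab's default output:
# surface<TAB>comma-separated features. This is made-up data.
sample_mecab_output() {
  printf '%s\t%s\n' \
    'Ruby'   '名詞,固有名詞,組織,*,*,*,*' \
    '東京'   '名詞,固有名詞,地域,一般,*,*,東京' \
    'Ruby'   '名詞,固有名詞,組織,*,*,*,*' \
    'Docker' '名詞,固有名詞,組織,*,*,*,*' \
    'の'     '助詞,連体化,*,*,*,*,の' \
    '田中'   '名詞,固有名詞,人名,姓,*,*,田中' \
    'Ruby'   '名詞,固有名詞,組織,*,*,*,*' \
    'Docker' '名詞,固有名詞,組織,*,*,*,*'
  echo 'EOS'
}

# Hypothetical helper: keep proper nouns that are neither place names
# nor personal names, then count each surface form.
count_proper_nouns() {
  awk -F'\t' '$2 ~ /固有名詞/ && $2 !~ /地域/ && $2 !~ /人名/ { print $1 }' |
    sort |      # group identical words so that uniq -c can count them
    uniq -c |
    sort -rn    # most frequent first
}

# Ruby (3 occurrences) is listed before Docker (2 occurrences);
# 東京, 田中, の and the EOS marker are filtered out.
sample_mecab_output | count_proper_nouns
```

The `sort` before `uniq -c` matters because `uniq` only collapses adjacent identical lines; without it, the same word can be reported several times with partial counts.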
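Thoughts
- The dictionary needs to be trained, otherwise the results are not good enough.
- Still, you can get a feel for the overall trend.
- The names of specific individuals show up about as often as the technical keywords do.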