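I analyzed 400,000 tweets by Diet members to find a politician worth supporting
The COVID-19 crisis is laying bare serious problems in our country's politics. As an ordinary citizen, I need to get serious about deciding which politicians to support.
My own concerns are mainly "labor" and "public finance", so I'd like someone who is actively working on those issues. The pandemic has made both of them truly pressing. On the other hand, I'd rather not see "phasing out nuclear power" or "constitutional revision"... and "fiscal consolidation" is obviously a no-go! This is about national politics, of course.
So I used the power of data to look for sitting Diet members whose concerns match mine. The tool is everyone's favorite, Python 3 on Google Colab (Jupyter notebook). If you want to skip the technical part and see only the conclusions, jump ahead to the Results section.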
What I did
First, I fetch roughly the latest 1,000 tweets per person from every sitting Diet member who is on Twitter. The list of sitting members' accounts was borrowed from the 国会議員いちらんリスト (a list of Diet members) maintained by @standbycitizens.
## Fetch the member list of each party, separately for each house
import tweepy

# `api` is assumed to be an authenticated tweepy.API instance (its setup is not shown in this post)
def get_member_list(list_id):
    members = []
    for member in tweepy.Cursor(api.list_members, list_id=list_id).items():
        members.append(member.screen_name)
    return members

jimin_lower = get_member_list(1062685419437871104)
jimin_upper = get_member_list(1062298938223411200)
koumei_upper = get_member_list(1062306783929131010)
rikken_upper = get_member_list(1062311126208151552)
## ...and so on for the remaining lists

Some members are not on Twitter, so the total came to 522 accounts. There are 710 Diet members across both houses, which means about 74% of them are on Twitter. I also work out the number of members per party from these lists.
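The coverage figure and the per-party head counts can be computed directly from the fetched lists. The following is only a minimal sketch: the handles and the `party_lists` dictionary are hypothetical stand-ins for the lists returned by `get_member_list`, and only the totals 522 and 710 come from the text.

```python
# Hypothetical stand-ins for the lists returned by get_member_list();
# the real lists hold the screen names of each party's members per house.
party_lists = {
    ("jimin", "lower"): ["member_a", "member_b", "member_c"],
    ("jimin", "upper"): ["member_d"],
    ("shamin", "upper"): ["member_e", "member_f"],
}

def members_per_party(lists):
    """Sum the list lengths of both houses for each party."""
    counts = {}
    for (party, _house), names in lists.items():
        counts[party] = counts.get(party, 0) + len(names)
    return counts

def coverage(accounts, seats):
    """Share of Diet members with a Twitter account, as a percentage."""
    return 100 * accounts / seats

party_counts = members_per_party(party_lists)
print(party_counts)
print(round(coverage(522, 710), 1))  # uses the two totals quoted in the text
```

Keeping the party and house together as the dictionary key means the same structure can later supply the `party` and `house` arguments when fetching tweets.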
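# Fetch up to 1,000 tweets per member and store them in a CSV file
import csv
import datetime

def get_tweets(screen_names, party, house):
    # Open the file once per call; newline='' prevents blank rows on some platforms
    with open('tweets.csv', 'a', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        for m in screen_names:
            # Waiting on rate limits is assumed to be configured on the tweepy.API object itself
            for tweet in tweepy.Cursor(api.user_timeline, screen_name=m, exclude_replies=False).items(1000):
                created_jst = tweet.created_at + datetime.timedelta(hours=9)  # UTC to JST
                writer.writerow([m, party, house, tweet.id, created_jst, tweet.text.replace('\n', '')])

get_tweets(shamin, 'shamin', 'upper')

Some members have posted fewer than 1,000 tweets and some have protected accounts, so the actual total was 408,854 tweets.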
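# Labor: exploitative ("black") companies, labor, wages, employment, dismissal, working conditions, unemployment, temp-worker layoffs, contract non-renewal
words_labor = ['ブラック企業', '労働', '賃金', '雇用', '解雇', '待遇', '失業', '派遣切り', '雇い止め']
# Public finance: fiscal stimulus, benefit payments, grants, tax cuts, deflation
words_finance = ['財政出動', '給付', '交付金', '減税', 'デフレ']
# NG topics: constitutional revision, Article 9, nuclear phase-out, fiscal consolidation
words_ng = ['改憲', '九条', '脱原発', '財政再建']

I build the interest lists and the NG list at the word level. @koshian, who also cares about these issues, helped me with the word lists. As far as possible we picked words that catch proponents and supporters but not opponents. It is a primitive method, with no machine-learning-style natural language analysis.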
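import pandas as pd

# get_tweets writes no header row, so column names are supplied here.
# 'name', 'party' and 'tweet' are used below; 'house', 'id' and 'created_at' are assumed names.
df = pd.read_csv('tweets.csv', names=['name', 'party', 'house', 'id', 'created_at', 'tweet'])

def get_tweets_by_topic(w_list):
    df_new = pd.DataFrame()
    for w in w_list:
        # Rows whose tweet text contains the keyword; missing text counts as no match
        add = df[df['tweet'].str.contains(w, na=False)]
        df_new = pd.concat([df_new, add])
    return df_new

The data saved to CSV is loaded into a DataFrame, and the tweets containing each word in a list are extracted.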
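Because each keyword is matched separately and the results are concatenated, a tweet containing two keywords appears twice, which is why the totals in the results are gross counts. A small sketch on made-up tweets illustrates this. The sample rows and the helper name `filter_by_keywords` are hypothetical, and the deduplicated variant is shown only for comparison, not as the method used in this post.

```python
import pandas as pd

# Made-up sample rows; the real data comes from tweets.csv
sample = pd.DataFrame({
    "name": ["a", "a", "b"],
    "tweet": ["賃金と雇用を守る", "今日は地元の祭り", "雇用対策を急げ"],
})
words = ["賃金", "雇用"]

def filter_by_keywords(frame, w_list):
    """Concatenate the rows matching each keyword (gross count, duplicates kept)."""
    parts = [
        frame[frame["tweet"].str.contains(w, na=False, regex=False)]
        for w in w_list
    ]
    return pd.concat(parts)

gross = filter_by_keywords(sample, words)
# Keep only the first occurrence of each original row
distinct = gross[~gross.index.duplicated()]

print(len(gross))     # the first tweet matches both keywords, so it is counted twice
print(len(distinct))  # each matching tweet counted once
```

Whether gross or distinct counts are preferable depends on the question: gross counts reward tweets that touch several keywords, while distinct counts measure how many tweets mention the topic at all.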
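Results
First, the gross total number of tweets about "labor" issues for each party. The Constitutional Democratic Party (rikken) has the most, followed by the Liberal Democratic Party (jimin).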
df_labor = get_tweets_by_topic(words_labor)
party_names_labor = df_labor['party'].value_counts()
print(party_names_labor)
--
rikken         3763
jimin          3468
koumei         1720
kyosan         1266
independent     576
ishin           371
kokumin         361
shamin          210
Name: party, dtype: int64
Converted to tweets per member, the picture changes. The Social Democratic Party (shamin) is far ahead at about 70, followed by the Communist Party (kyosan) at 50.6. The left-wing parties are living up to their reputation. The LDP came last.
jimin          14.1
rikken         27.7
koumei         35.8
kyosan         50.6
kokumin        27.8
ishin          16.1
shamin         70.0
independent    20.6
dtype: float64
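The per-member figures are the party totals divided by the number of accounts in each party. The code for this step is not shown in the post, so what follows is a sketch under assumptions: `party_sizes` is a hypothetical Series, and the sizes used here (3 for shamin, 25 for kyosan) were chosen so that the result matches the printed figures. They are not taken from the source lists.

```python
import pandas as pd

# Party totals as printed in the results (two parties only, for brevity)
party_totals = pd.Series({"kyosan": 1266, "shamin": 210})

# Hypothetical account counts per party; assumed values, not from the source lists
party_sizes = pd.Series({"shamin": 3, "kyosan": 25})

# Division aligns on the index labels, so the order of the two Series does not matter
per_member = (party_totals / party_sizes).round(1)
print(per_member.sort_values(ascending=False))
```

Relying on index alignment avoids pairing a total with the wrong party when the two Series were built in different orders.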
So which individuals tweet the most about labor issues? Let's check the top 10.
person_names = df_labor['name'].value_counts()
person_names.head(10)
--
yamanoikazunori    262
ishibashi2010      174
hanyuda_takashi    156
pru_moriya         150
miyamototooru      136
wako0501           135
kishimakiko_j      132
genkihoriuchi      131
hatanokimie        123
Senator_ISHIDA     113
Name: name, dtype: int64
By far the most prolific is Kazunori Yamanoi of the Constitutional Democratic Party (CDP). Even as a gross count, more than a quarter of his 1,000 tweets are labor-related.
Next come Michihiro Ishibashi, also of the CDP, and then Takashi Hanyuda of the LDP. The variation seems to be larger between individuals than between parties.
Next, let's check which parties care most about "public finance".
df_finance = get_tweets_by_topic(words_finance)
party_names_finance = df_finance['party'].value_counts()
print(party_names_finance)
--
rikken         2543
jimin          2493
koumei         2134
kyosan          924
independent     588
kokumin         390
ishin           286
shamin           41
Name: party, dtype: int64
Overall there are fewer tweets here than for "labor". And once again the CDP comes first.
jimin          10.1
rikken         18.7
koumei         44.5
kyosan         37.0
kokumin        30.0
ishin          12.4
shamin         13.7
independent    21.0
dtype: float64
Converted to tweets per member, the picture changes again. Surprisingly, first place goes to Komeito (koumei) at about 44.5, followed by the Communist Party at 37 and the Democratic Party for the People (kokumin) at 30. The Social Democratic Party is about as low as Ishin, which seems to confirm its image of being weak on public finance. And last place is again the LDP. What do LDP members usually talk about, I wonder??
And for individuals:
person_names = df_finance['name'].value_counts()
person_names.head(10)
--
yamanoikazunori    192
tamakiyuichiro     136
ueno_hiroshi       131
akutsu0626         126
andouhiroshi       120
hisatake_sugi      111
sayaka_sasaki      106
kitagawa_kazuo      95
yasue_nobuo         92
310kakizawa         90
Name: name, dtype: int64
Here too, first place goes to Kazunori Yamanoi of the CDP. Second is Yuichiro Tamaki of the Democratic Party for the People. I had the impression that Mr. Tamaki speaks often about public finance, but it turns out someone is far ahead of him, and he is not much different from third-placed Hiroshi Ueno of the LDP. Again, the variation between individuals looks larger than the variation between parties. For reference, Mr. Tamaki has posted 32 tweets on labor issues.
Now let's check which party talks the most about the NG topics.
df_ng = get_tweets_by_topic(words_ng)
party_names_ng = df_ng['party'].value_counts()
print(party_names_ng)
--
rikken         350
kyosan         276
jimin          135
kokumin         62
shamin          50
ishin           49
independent     39
koumei          38
Name: party, dtype: int64
The counts drop sharply, but here too the CDP is at the top. For better or worse, the CDP seems to talk a lot about the topics I care about. Still, I'm relieved that no party talks about the NG topics as much as I expected.
Because the counts are small, I'll skip the per-member figures and move on to individuals.
person_ng = df_ng['name'].value_counts()
person_ng.head(10)
--
ShioriYamao        49
tadatomoyoshida    34
yunoki_m           34
pioneertaku84      34
TAMURATAKAAKI      31
ryon_t             22
kondo_shoichi      22
kurabayashia       21
ShiokawaTetsuya    21
kokutakeiji        20
Name: name, dtype: int64
Wow, our very own Shiori Yamao of the Democratic Party for the People accounts for 49 of that party's 62 tweets!! I see...
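How much of a party's total comes from a single member can be checked by dividing per-member counts by per-party counts; with the figures above, 49 of 62 is about 79%. A minimal sketch on made-up rows follows. The sample data and variable names are hypothetical, and only the `party` and `name` column names come from the code in this post.

```python
import pandas as pd

# Made-up matches: one row per matching tweet
hits = pd.DataFrame({
    "party": ["x", "x", "x", "x", "y", "y"],
    "name":  ["p", "p", "p", "q", "r", "s"],
})

# Matching tweets per member, with the party kept alongside the handle
per_member = hits.groupby(["party", "name"]).size()

# Party total repeated for every member of that party
party_total = per_member.groupby(level="party").transform("sum")

# Each member's share of their own party's matches
share = (per_member / party_total).sort_values(ascending=False)
print(share)
```

A share close to 1 signals that a party's count is driven by one account, which is worth knowing before reading the party totals as a party-wide stance.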
Having come this far, I now know that the party I should back is the Communist Party, and the politician is Kazunori Yamanoi of the CDP. He seems actively interested in both public finance and labor. But hmm... I was prepared for the Communist Party, but the CDP? Just in case, let me also check whether Mr. Yamanoi has made any NG remarks.
person_ng['yamanoikazunori']
--
KeyError: 'yamanoikazunori'
Excellent, not a single one!!
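The `KeyError` simply means the handle is not in the index of `value_counts()`: members with no matching tweets are absent rather than listed with zero. A sketch of a lookup that returns 0 instead, using a made-up Series in place of `person_ng` (the handles and the helper name `count_for` are hypothetical):

```python
import pandas as pd

# Made-up stand-in for person_ng = df_ng['name'].value_counts()
counts = pd.Series({"member_a": 49, "member_b": 34})

def count_for(series, handle):
    """Return the count for a handle, or 0 when the handle has no matches."""
    return int(series.get(handle, 0))

print(count_for(counts, "member_a"))
print(count_for(counts, "member_z"))
```

Note that a zero here cannot distinguish a member who never used the keywords from one whose tweets were never collected, for example because the account is protected.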
As a final check on whether I can really back Mr. Yamanoi, I looked at his official site.
Judging from the policies on yamanoi.net, he is actively working on wage increases and child poverty. That is wonderful. However, I do wonder what he thinks about the CDP's enthusiasm, as party policy, for budget screening (事業仕分け) and cuts to public works. Maybe I'd feel more at ease if he moved from the CDP to the Communist Party...
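Future work
Looking only at word hits is probably not enough to analyze what someone is actually arguing, so I would like to use things like machine-learning-based natural language processing to analyze the content of their positions. That said, I know nothing about that sort of thing, so I'd welcome guidance from people who do. Maybe I should try chainer...????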
