I previously looked into the distribution of AtCoder ratings and wrote an article about it.
sucrose.hatenablog.com
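On Twitter I came across the question of whether participating in a lot of AtCoder contests raises your rating, and it made me curious. So I made a rough graph of the relationship between users' participation counts and their ratings, and also plotted the distribution of ratings.
I used the data that is easiest to look up: the table of current ratings and participation counts on the AtCoder ranking page.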
Distribution of AtCoder ratings
For the rating color bands, see here →
Users who have participated only once are counted too, so gray makes up nearly half of the total.
Rating (color) | Top % of users |
---|---|
2800 (red) | 1% |
2400 (orange) | 2% |
2000 (yellow) | 5% |
1600 (blue) | 11% |
1200 (light blue) | 21% |
800 (green) | 35% |
400 (brown) | 52% |
0 (gray) | 100% |
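
The percentages in the table are the share of ranked users at or above each color threshold. Here is a minimal sketch of that calculation, assuming a plain list of ratings. The `top_percent` helper and the sample ratings are made up for illustration and are not the scraped data. The script at the end of this post uses `scipy.stats.percentileofscore` instead, which treats ties slightly differently from this simple count.

```python
def top_percent(ratings, threshold):
    """Return the percentage of users whose rating is at least `threshold`."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    at_or_above = sum(1 for r in ratings if r >= threshold)
    return 100.0 * at_or_above / len(ratings)


# Made-up ratings for illustration only (not AtCoder data).
sample_ratings = [12, 150, 380, 420, 790, 810, 1250, 1610, 2050, 2900]

for threshold in [2800, 2400, 2000, 1600, 1200, 800, 400, 0]:
    print("{}: top {:.0f}%".format(threshold, top_percent(sample_ratings, threshold)))
```

A threshold of 0 always gives 100%, which matches the last row of the table.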
Relationship between participation count and rating
From the graph, the average rating changes substantially up to around 10 to 15 participations; after that, it appears to rise gradually with each additional participation. Even at high participation counts, the spread is still fairly large.
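The per-count averages and spreads described above can be sketched without any plotting library. This is only an illustration: `rating_stats_by_count` is a hypothetical helper, the sample pairs are made up, and the choice of the population standard deviation is an assumption (a plotting library may use a different convention).

```python
from collections import defaultdict
from statistics import mean, pstdev


def rating_stats_by_count(records):
    """Group (participation_count, rating) pairs by count.

    Returns {count: (mean_rating, population_std)} with counts in ascending order.
    """
    groups = defaultdict(list)
    for count, rating in records:
        groups[count].append(rating)
    return {c: (mean(rs), pstdev(rs)) for c, rs in sorted(groups.items())}


# Made-up (count, rating) pairs for illustration only (not AtCoder data).
sample = [(1, 10), (1, 30), (2, 100), (2, 300), (10, 900), (10, 1500)]

for count, (avg, sd) in rating_stats_by_count(sample).items():
    print(count, avg, sd)
```

`pstdev` is used here partly because it is defined for a group with a single user, where the sample standard deviation is not.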
Thoughts
It does seem that the more contests a user has joined, the higher the average rating.
Motivation probably affects both the participation count and the rating gains, so a proper investigation would need to track how each individual user's rating rises and falls.
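As a rough sketch of what tracking individual users could look like: the ranking table used in this post holds only each user's current rating, so per-user histories would have to come from another source. The data layout, user names, and the `rating_changes` helper below are all hypothetical.

```python
def rating_changes(history):
    """Return the rating change between each pair of consecutive contests."""
    return [after - before for before, after in zip(history, history[1:])]


# Hypothetical per-user rating histories (rating after each rated contest).
histories = {
    "user_a": [50, 200, 340, 420],
    "user_b": [900, 880, 950],
    "user_c": [120],  # a single contest gives no change to measure
}

for user, history in histories.items():
    changes = rating_changes(history)
    net_change = history[-1] - history[0]
    print(user, changes, net_change)
```

Looking at changes within each user separates "this user improved over time" from "users who joined many contests were already strong", which the cross-sectional averages cannot do.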
Bonus: box plot
I also drew a box plot, so I'm including it here as well.
Compared with the bar chart of means and standard deviations above, it gives quite a different impression.
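One likely reason the two charts feel different is that a bar chart summarizes each group by its mean and standard deviation, while a box plot shows the median and quartiles, and these disagree on skewed data. A small sketch with made-up, right-skewed numbers (not AtCoder data), using only Python's `statistics` module (`quantiles` needs Python 3.8 or later):

```python
from statistics import mean, pstdev, quantiles

# Made-up, right-skewed ratings for illustration only.
ratings = [10, 20, 30, 40, 50, 60, 80, 120, 900, 2400]

avg = mean(ratings)
sd = pstdev(ratings)                       # population standard deviation
q1, median, q3 = quantiles(ratings, n=4)   # quartiles, default "exclusive" method

print("mean and sd:", avg, sd)
print("quartiles:", q1, median, q3)
```

In this sample the mean (371) lies above the third quartile (315) because two large values pull it up, while the median stays at 55.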
Source code
    # coding: utf-8
    import time

    import matplotlib.font_manager
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    import pyquery
    import scipy.stats
    import seaborn as sns
    import tqdm

    rating_atcoder = []
    counts = []

    # Scrape pages 1 to 128 of the ranking table.
    for i in tqdm.tqdm(range(1, 129)):
        table = pyquery.PyQuery(url='https://atcoder.jp/ranking?p={}'.format(i))
        for elm in table.find('tr')[1:]:  # skip the header row
            # Columns: 0 = rank, 1 = user name, 2 = rating, 4 = participation count
            tds = pyquery.PyQuery(elm).find('td')
            rating_atcoder.append(int(pyquery.PyQuery(tds[2]).text()))
            counts.append(int(pyquery.PyQuery(tds[4]).text()))
        time.sleep(1)  # pause between page requests

    df = pd.DataFrame({'rating_atcoder': rating_atcoder, 'count': counts})

    # Japanese-capable font for the plot labels (Windows path)
    prop = matplotlib.font_manager.FontProperties(fname=r'C:\Windows\Fonts\meiryo.ttc', size=12)

    # Histogram of ratings, drawn one color band at a time: (lower, upper, color)
    bands = [
        (-np.inf, 400, '#808080'),  # gray
        (400, 800, '#804000'),      # brown
        (800, 1200, '#008000'),     # green
        (1200, 1600, '#00C0C0'),    # light blue
        (1600, 2000, '#0000FF'),    # blue
        (2000, 2400, '#C0C000'),    # yellow
        (2400, 2800, '#FF8000'),    # orange
        (2800, np.inf, '#FF0000'),  # red
    ]
    for lower, upper, color in bands:
        in_band = df[(lower <= df['rating_atcoder']) & (df['rating_atcoder'] < upper)]
        plt.hist(in_band['rating_atcoder'], bins=range(0, 4001, 100), histtype='stepfilled', color=color)
    plt.title(u'AtCoderのレーティングの分布', fontproperties=prop)  # "Distribution of AtCoder ratings"
    plt.xlabel(u'AtCoderのレーティング', fontproperties=prop)  # "AtCoder rating"
    plt.ylabel(u'ユーザー数', fontproperties=prop)  # "Number of users"
    plt.show()

    # Percentage of users at or above each color threshold
    print('rating: percentile')
    for i in [0, 400, 800, 1200, 1600, 2000, 2400, 2800]:
        print('{}: {:.3}'.format(i, 100 - scipy.stats.percentileofscore(df['rating_atcoder'], i)))

    df = df.dropna()

    # Plot the rating distribution for each participation count
    sns.barplot(x=df['count'], y=df['rating_atcoder'], ci='sd')
    # "Mean and standard deviation of rating per AtCoder participation count"
    plt.title(u'AtCoderの参加回数ごとのレーティングの平均と標準偏差', fontproperties=prop)
    plt.xlabel(u'AtCoderの参加回数', fontproperties=prop)  # "AtCoder participation count"
    plt.ylabel(u'レーティングの平均', fontproperties=prop)  # "Mean rating"
    plt.show()

    # Box plot
    sns.boxplot(x=df['count'], y=df['rating_atcoder'])
    # "Rating per AtCoder participation count (box plot)"
    plt.title(u'AtCoderの参加回数ごとのレーティング(箱ひげ図)', fontproperties=prop)
    plt.xlabel(u'AtCoderの参加回数', fontproperties=prop)  # "AtCoder participation count"
    plt.ylabel(u'レーティング', fontproperties=prop)  # "Rating"
    plt.show()

    print(df.groupby('count').mean()['rating_atcoder'])