- The full marathon gets hard from 30 km
- Analysis results
- Getting 5,000 runners' results into a csv file
- Wrapping up
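The full marathon gets hard from 30 km
So the saying goes, but is it really true? And what kept me from breaking 3 hours? To find out, I analyzed the results of 5,000 runners published on Runners Update.
I used the results of Kyoto Marathon 2017, which I ran the other day.
Analysis results
Runners who break 3 hours versus those who don't, as seen in the progression of 5 km lap times
Averages
First, I averaged the 5 km lap times within three segments and compared them: the top 10 finishers, 10 runners who barely made sub-3, and 10 runners who need one more push to reach sub-3.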
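The segment averages described above could be computed along these lines. This is only a sketch, not the script actually used for the charts: it assumes the header-less `output.csv` produced by the export script in this post, and the column names, the helper names `load_results` and `segment_mean_laps`, and the time windows are all made up for illustration.

```python
import io

import pandas as pd

# Assumed column layout of the header-less CSV (these names are not in the file itself).
COLUMNS = ["bib", "5km", "10km", "15km", "20km", "half", "25km",
           "30km", "35km", "40km", "finish_lap", "finish_time"]
LAP_COLUMNS = ["5km", "10km", "15km", "20km", "25km", "30km", "35km", "40km"]


def load_results(csv_source):
    """Read the CSV and convert every time column to seconds."""
    df = pd.read_csv(csv_source, header=None, names=COLUMNS)
    # Rows without a finish time belong to runners who did not finish.
    df = df.dropna(subset=["finish_time"]).copy()
    for col in COLUMNS[1:]:
        df[col] = pd.to_timedelta(df[col]).dt.total_seconds()
    return df


def segment_mean_laps(df, fastest, slowest):
    """Mean lap in seconds per checkpoint for finish times in [fastest, slowest)."""
    lo = pd.Timedelta(fastest).total_seconds()
    hi = pd.Timedelta(slowest).total_seconds()
    segment = df[(df["finish_time"] >= lo) & (df["finish_time"] < hi)]
    return segment[LAP_COLUMNS].mean()


# The single row below is runner 10265 from this post; a real run would pass "output.csv".
SAMPLE = ("10265,0:23:31,0:20:58,0:20:24,0:20:34,0:04:27,0:16:20,"
          "0:21:17,0:21:44,0:24:45,0:11:03,03:05:17\n")
results = load_results(io.StringIO(SAMPLE))
print(segment_mean_laps(results, "3:05:00", "3:06:00") / 60)  # minutes per lap
```

Converting to seconds up front keeps the averaging step a plain numeric mean, which behaves the same across pandas versions.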
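Top 10
The 10 fastest runners, with finish times from 2:27 to 2:43. (The 25 km lap time is shorter because that lap is measured from the halfway point and covers less distance.)
Barely sub-3: 10 runners
The 10 runners with finish times in the 2:59 range. (The 25 km lap time is shorter because that lap is measured from the halfway point and covers less distance.)
One more push to sub-3: 10 runners
The 10 runners with finish times in the 3:05 range. (The 25 km lap time is shorter because that lap is measured from the halfway point and covers less distance.)
Observations
- The top 10 show only a small difference between their lap time at 5 km and their lap time at 40 km
- Runners who need one more push to reach sub-3 show a large difference between their first 5 km lap and their 5 km laps in the second half
- Runners who cannot break 3 hours probably lack the running ability to hold a steady pace through 40 km
- If you can run the whole race at roughly 19 to 21 minutes per 5 km without much variation, you get sub-3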
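As a sanity check on the 19 to 21 minute range, the even-pace 5 km lap needed for exactly 3 hours can be worked out directly. The helper names below are made up for illustration; the only inputs are the marathon distance of 42.195 km and the 3-hour goal.

```python
MARATHON_KM = 42.195


def required_lap_seconds(goal_seconds, lap_km=5.0):
    """Even-pace lap time (seconds) needed to finish in goal_seconds."""
    return goal_seconds / MARATHON_KM * lap_km


def format_lap(seconds):
    """Format seconds as M:SS, truncating fractions of a second."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


sub3_lap = required_lap_seconds(3 * 3600)
print(format_lap(sub3_lap))  # 21:19
```

So an even pace just under 21:20 per 5 km is the break-even point for sub-3, and anything in the 19 to 21 minute range leaves some margin for a late-race slowdown.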
The full marathon gets hard from 30 km
The first 5 km
In the first 5 km there are many rivals around and the race has only just started, so many runners appear to be feeling out their own pace. (The further left, the better the runner's finishing time.)
10 km
By 10 km the pace has settled down. (The further left, the better the runner's finishing time.)
15 km
From 15 km on, each runner has found their own pace and starts running steadily.
20 km
25 km
30 km
The dreaded point of the full marathon: from 30 km on, you can see each runner's pace gradually starting to break down.
35 km
40 km
From 35 km on, the difference in running ability seems to show clearly over the final 10 km. Even strong runners see their lap times rise, but the slower a runner's finish time, the more their times over the last 5 to 10 km swing upward (get slower) compared with the preceding 5 km laps. Conversely, a few strong runners actually bring their lap times down.
Lap time drop-off over the last 15 km
For each segment (the top 10, the 10 who barely made sub-3, and the 10 who need one more push to reach sub-3) I averaged the 5 km lap times and compared the first 5 km lap with the 5 km laps at 30 km, 35 km, and 40 km. In the 5 km leading up to 40 km, lap times drop off for every group. However, the drop-off appears larger for runners with less running ability than for the fastest runners.
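The comparison above boils down to subtracting the first 5 km lap from each late lap. Here is a minimal sketch for a single runner, using the lap times of bib number 10265 shown in this post; the function names are invented for illustration, and the charts themselves were built from segment averages rather than one runner.

```python
def to_seconds(lap):
    """Convert an H:MM:SS lap string to seconds."""
    hours, minutes, seconds = (int(part) for part in lap.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def late_race_dropoff(laps, baseline="5km", late=("30km", "35km", "40km")):
    """Seconds gained (+, slower) or lost (-, faster) relative to the baseline lap."""
    base = to_seconds(laps[baseline])
    return {checkpoint: to_seconds(laps[checkpoint]) - base for checkpoint in late}


laps_10265 = {"5km": "0:23:31", "30km": "0:21:17", "35km": "0:21:44", "40km": "0:24:45"}
print(late_race_dropoff(laps_10265))
# {'30km': -134, '35km': -107, '40km': 74}
```

For this runner the 30 km and 35 km values are negative only because the first 5 km lap was unusually slow; the 40 km lap is still 74 seconds slower than that slow opening lap.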
Observations
- As conventional wisdom has it, most runners' lap times swing upward (get slower) after 30 km
- The weaker a runner's result (the less running ability they have), the more their lap times after 30 km tend to swing upward
Getting 5,000 runners' results into a csv file
Fetching the results
Runners Update has a separate page for each runner's bib number, and I was able to fetch a page's HTML with curl or wget as follows.
wget http://p.kyoto-marathon.com/numberfile/10265.html -O 10265.html
That said, please refrain from hitting the site at a high request rate.
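One way to keep the request rate low is to pause between downloads. The sketch below does this in Python rather than with wget; the function name `fetch_pages`, the injectable `fetch` and `sleep` parameters, and the one-second default delay are assumptions for illustration, not what was actually used. Only the URL pattern comes from the wget command above.

```python
import time
import urllib.request

BASE_URL = "http://p.kyoto-marathon.com/numberfile/{bib}.html"


def fetch_pages(bibs, delay_seconds=1.0, fetch=None, sleep=time.sleep):
    """Download one results page per bib number, pausing between requests."""
    if fetch is None:
        # Default fetcher: plain HTTP GET returning the raw bytes of the page.
        def fetch(url):
            with urllib.request.urlopen(url) as response:
                return response.read()
    pages = {}
    for index, bib in enumerate(bibs):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        pages[bib] = fetch(BASE_URL.format(bib=bib))
    return pages


# Example (performs a real network request, so it is left commented out):
# pages = fetch_pages([10265])
```

Passing `fetch` and `sleep` as parameters makes it possible to try the loop without touching the network.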
The actual results page is a table listing the split, lap, and passing time at each checkpoint.
The pandas library
With the pandas library you can parse an HTML file and access its table data as follows.
$ python
Python 3.6.0 (default, Dec 24 2016, 07:27:52)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> url = 'http://p.kyoto-marathon.com/numberfile/10265.html'
>>> df = pandas.io.html.read_html(url)
>>> df[0]
           0                                    1         2         3
0   地点名Point  スプリット （ネットタイム）Split （Net Time）    ラップLap  通過時刻Time
1        5km                   00:23:45　(0:23:31)   0:23:31  09:23:45
2       10km                   00:44:43　(0:44:29)   0:20:58  09:44:43
3       15km                   01:05:07　(1:04:53)   0:20:24  10:05:07
4       20km                   01:25:41　(1:25:27)   0:20:34  10:25:41
5        中間点                   01:30:08　(1:29:54)   0:04:27  10:30:08
6       25km                   01:46:28　(1:46:14)   0:16:20  10:46:28
7       30km                   02:07:45　(2:07:31)   0:21:17  11:07:45
8       35km                   02:29:29　(2:29:15)   0:21:44  11:29:29
9       40km                   02:54:14　(2:54:00)   0:24:45  11:54:14
10    Finish                   03:05:17　(3:05:03)   0:11:03  12:05:17
>>>
Individual columns and cells can be accessed like this.
>>> df[0][1]
0     スプリット （ネットタイム）Split （Net Time）
1                    00:23:45　(0:23:31)
2                    00:44:43　(0:44:29)
3                    01:05:07　(1:04:53)
4                    01:25:41　(1:25:27)
5                    01:30:08　(1:29:54)
6                    01:46:28　(1:46:14)
7                    02:07:45　(2:07:31)
8                    02:29:29　(2:29:15)
9                    02:54:14　(2:54:00)
10                   03:05:17　(3:05:03)
Name: 1, dtype: object
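Each cell in this column packs two times together: the split from the gun and, in parentheses, the net time, separated by a full-width space (U+3000). A small sketch of how such a cell might be pulled apart is shown below; the name `parse_split` and the regular expression are assumptions for illustration, not part of the original script.

```python
import re
from datetime import timedelta

# Split time, optional whitespace (including U+3000), then the net time in parentheses.
SPLIT_PATTERN = re.compile(r"(\d+):(\d{2}):(\d{2})\s*\((\d+):(\d{2}):(\d{2})\)")


def parse_split(cell):
    """Return (split, net) as timedelta objects for a cell like '00:23:45\u3000(0:23:31)'."""
    match = SPLIT_PATTERN.search(cell)
    if match is None:
        raise ValueError(f"unrecognized split cell: {cell!r}")
    h1, m1, s1, h2, m2, s2 = (int(group) for group in match.groups())
    split = timedelta(hours=h1, minutes=m1, seconds=s1)
    net = timedelta(hours=h2, minutes=m2, seconds=s2)
    return split, net


split, net = parse_split("00:23:45\u3000(0:23:31)")
print(split - net)  # 0:00:14
```

For this runner the difference is 14 seconds at every checkpoint, which is the time it took to reach the start line after the gun.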
Getting 5,000 runners' results into a csv file
I wrote the data for the 5,000 runners fetched with wget out to a csv file as follows.
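import csv
import glob
import os

import pandas

# Append one row per runner: bib number, lap times, and (for finishers) the finish split.
with open('output.csv', 'a', newline='') as c:
    writer = csv.writer(c, lineterminator='\n')
    for path in glob.glob('./*.html'):
        # The file name without its extension is the bib number, e.g. ./10265.html -> 10265
        bib_number = os.path.splitext(os.path.basename(path))[0]
        with open(path, 'r') as f:
            table = pandas.read_html(f)
        # Column 2 holds the lap times; row 0 is the header cell, so skip it
        laps = list(table[0][2])[1:]
        # Handle runners who retired mid-race: only tables with all 11 rows have a Finish split
        if len(table[0][1]) == 11:
            # Keep the split time and drop the net time after the full-width space
            laps.append(table[0][1][10].split('\u3000')[0])
        row = [bib_number] + laps
        print(row)
        writer.writerow(row)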
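Wrapping up
Toward sub-3
What should I do, given that I don't have the running ability yet?
- Work toward being able to run 20 to 21 minutes per 5 km
- Given the late-race drop-off, running in the 19-minute range would be even better
- At Kyoto Marathon my first 5 km took 23 minutes, so aim to push at a 20- to 21-minute pace right from the start (watch out for toilet stops)
Python + Pandas
Extremely handy.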