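Once again this is not really a Python-specific topic, but it was interesting, so here are some notes.
The same point applies to any language.
Overview
While browsing the Python category on Stack Overflow, I came across this question: http://stackoverflow.com/questions/43827281/python-loop-optimization
The poster had written some processing in Python, but it took a very long time and they wanted to do something about it. The setup was:
- There is a list with more than 2 million items.
  - Each item is itself a list like [a, b, c, d], and there are more than 2 million of them.
- There is a dict used for processing the data.
  - It has over 2000 keys.
  - Each key holds a list containing at least 50 values.
- The code loops over the 2-million-plus list and compares element [0] of each item against the value lists in the dict.
  - If the value is found (in), element [0] is replaced with the dict key.
That is roughly what the code does, and it reportedly takes an extremely long time.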
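As a rough sketch of the pattern described above, the slow version looks something like this. The data and the names rows and groups are made up for illustration and are not taken from the question.

```python
# Hypothetical miniature of the question's data: each row is a list,
# and the dict maps a key to a list of values.
rows = [["a1", 10], ["b2", 20], ["zz", 30]]
groups = {"A": ["a0", "a1", "a2"], "B": ["b1", "b2"]}

for row in rows:
    for key, values in groups.items():
        # `in` on a list scans it element by element (linear search).
        if row[0] in values:
            row[0] = key
            break  # stop once the row has been rewritten

print(rows)  # [['A', 10], ['B', 20], ['zz', 30]]
```

With three rows this is instant, but the inner work grows with (number of keys) x (values per key) for every single row.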
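This is a pattern that salaried programmers see all the time in real work. It comes up a lot.
The answer
Here is what the answerer said.
The computational cost is way too high, so cut it down!!
The points to improve are as follows.
- in on a list is a linear search with O(n) time complexity, which is not good.
  - Looking at the source, each iteration of the list loop does something like 50 * 2000 comparisons, so of course it takes time.
- Use in on a set instead. Its time complexity is O(1) on average.
- If you want it even faster, swap the keys and values of the current dict.
  - Then the value of element [0] can be looked up in the dict in a single step.
That sort of thing.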
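The two suggestions can be sketched like this. This is my own minimal illustration, not the answerer's code; the function names and the tiny data set are hypothetical, and the inverted dict assumes each value appears under only one key.

```python
def replace_with_sets(rows, groups):
    """Same nested loop as before, but each value list becomes a set."""
    group_sets = {key: set(values) for key, values in groups.items()}
    for row in rows:
        for key, values in group_sets.items():
            if row[0] in values:  # hash lookup, O(1) on average
                row[0] = key
                break


def replace_with_inverted_dict(rows, groups):
    """Invert the dict (value -> key) so each row needs a single lookup.

    Assumes each value appears under only one key.
    """
    value_to_key = {
        value: key for key, values in groups.items() for value in values
    }
    for row in rows:
        # Rows whose first element is unknown are left untouched.
        row[0] = value_to_key.get(row[0], row[0])


groups = {"A": ["a0", "a1", "a2"], "B": ["b1", "b2"]}
rows_a = [["a1", 10], ["b2", 20], ["zz", 30]]
rows_b = [["a1", 10], ["b2", 20], ["zz", 30]]

replace_with_sets(rows_a, groups)
replace_with_inverted_dict(rows_b, groups)
print(rows_a == rows_b)  # True
print(rows_b)            # [['A', 10], ['B', 20], ['zz', 30]]
```

The set version still loops over every key for each row. The inverted dict removes that inner loop entirely, which is where the big win comes from.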
Basically, searching in a list is a linear search, so it gets slow as the amount of data grows. In such cases, using a set or a dict so that the lookup goes through a hash makes things dramatically faster.
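A quick way to see the difference yourself is a small timing comparison like the one below. The size n and the repeat count are arbitrary values I picked, and the actual numbers depend on the machine, so no timings are shown here.

```python
from timeit import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # worst case for the list: the last element

# Membership test on a list walks the elements one by one.
list_time = timeit(lambda: target in as_list, number=100)
# Membership test on a set is a hash lookup.
set_time = timeit(lambda: target in as_set, number=100)

print(f"list: {list_time:.5f}s  set: {set_time:.5f}s")
```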
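When processing large amounts of data, you always end up using some kind of container, whether a list or something else. The important questions at that point are:
- Does order matter?
- Are duplicates allowed?
If you are simply handling a collection of values, a set or a dict is often more efficient.
If you try to do everything with lists, you can end up in a panic when the data volume grows.
Understanding the nature of the data and choosing an appropriate collection is the same whether you use Python, C#, or Java.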
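To make the two questions concrete, here is a small sketch of how the answers map onto Python's built-in containers. The sample values are made up. The dict.fromkeys trick relies on dicts keeping insertion order, which is guaranteed from Python 3.7.

```python
values = ["b", "a", "b", "c", "a"]

as_list = list(values)                        # keeps order, keeps duplicates
as_set = set(values)                          # no order guarantee, no duplicates
ordered_unique = list(dict.fromkeys(values))  # first-seen order, no duplicates

print(as_list)         # ['b', 'a', 'b', 'c', 'a']
print(sorted(as_set))  # ['a', 'b', 'c']
print(ordered_unique)  # ['b', 'a', 'c']
```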
In this case, the final result was
It is just roughly 100.000 times faster in this case :-)
so it apparently got about that much faster.
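That figure lines up with a back-of-the-envelope count. This is my own rough estimate, not something the answerer spelled out: it takes the lower bounds from the question, assumes the worst case where every value list is scanned in full, and ignores constant factors.

```python
rows = 2_000_000      # "more than 2 million" rows, lower bound
keys = 2_000          # "over 2000" keys, lower bound
values_per_key = 50   # "at least 50" values per key

# Worst case for the original loop: every value list is scanned in full.
comparisons_per_row = keys * values_per_key
total_comparisons = rows * comparisons_per_row

# With the inverted dict, each row costs one hash lookup on average.
total_lookups = rows

print(comparisons_per_row)                 # 100000
print(total_comparisons)                   # 200000000000
print(total_comparisons // total_lookups)  # 100000
```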
Sample
It was interesting to read, so I felt like making a sample of my own and wrote one in Python.
What it does is completely meaningless, so it is a junk sample, but I wanted a large amount of data, so I used Japan Post's postal code data, commonly known as ken_all. It is not 2 million rows, but it still has more than 124,000 lines.
    # coding: utf-8
    """
    Sample of loop optimization in Python.

    This sample was inspired by the information at the following URL.

    http://stackoverflow.com/questions/43827281/python-loop-optimization
    """
    import collections
    import csv
    import pathlib
    import zipfile as zip
    from timeit import timeit
    from typing import List, Dict

    import requests


    class PrepareProc:
        def __init__(self) -> None:
            super().__init__()
            self.zip_file_name = r'ken_all.zip'
            self.csv_file_name = r'ken_all.csv'
            self.work_dir = pathlib.Path(r'/tmp')
            self.zip_file_path = self.work_dir / self.zip_file_name
            self.csv_file_path = self.work_dir / self.csv_file_name

            # Download URL for the postal code data
            self.data_url = r'http://www.post.japanpost.jp/zipcode/dl/kogaki/zip/ken_all.zip'

            # Encoding of the postal code data file
            self.csv_encoding = 'sjis'

        def download(self) -> None:
            if self.zip_file_path.exists():
                return

            with open(self.zip_file_path, mode='wb') as writer:
                writer.write(requests.get(self.data_url).content)

        def extract(self) -> None:
            if self.csv_file_path.exists():
                return

            with zip.ZipFile(str(self.zip_file_path.absolute()), mode='r') as z:
                z.extractall(self.work_dir)

        def read(self) -> List[List[str]]:
            with open(self.csv_file_path, mode='rt', encoding=self.csv_encoding, newline='') as f:
                reader = csv.reader(f)
                return [line for line in reader]


    # noinspection PyUnresolvedReferences
    class _ProcValidateMixin:
        def _pre_validate(self) -> None:
            # Before processing, column 0 must not yet hold the prefecture name.
            assert self._lines[0][0] != '北海道'

        def _post_validate(self) -> None:
            # After processing, column 0 of the first row should be the prefecture name.
            assert self._lines[0][0] == '北海道'


    class SlowProc(_ProcValidateMixin):
        """Values are held in lists, so every `in` is a linear search."""

        def __init__(self, lines: List[List[str]]) -> None:
            super().__init__()
            self._lines = lines
            self._mapping = self._make_mapping()

        def __call__(self, *args, **kwargs) -> None:
            for line in self._lines:
                for key in self._mapping:
                    if line[0] in self._mapping[key]:
                        line[0] = key

        def _make_mapping(self) -> Dict[str, list]:
            # line[6] is the prefecture name, line[0] is the local government code.
            mapping = collections.defaultdict(list)
            for line in self._lines:
                mapping[line[6]].append(line[0])
            return mapping


    class NormalProc(_ProcValidateMixin):
        """Same loop as SlowProc, but values are held in sets."""

        def __init__(self, lines: List[List[str]]) -> None:
            super().__init__()
            self._lines = lines
            self._mapping = self._make_mapping()

        def __call__(self, *args, **kwargs) -> None:
            for line in self._lines:
                for key in self._mapping:
                    if line[0] in self._mapping[key]:
                        line[0] = key

        def _make_mapping(self) -> Dict[str, set]:
            mapping = collections.defaultdict(set)
            for line in self._lines:
                mapping[line[6]].add(line[0])
            return mapping


    class FastProc(_ProcValidateMixin):
        """Keys and values are swapped, so each row needs a single dict lookup."""

        def __init__(self, lines: List[List[str]]) -> None:
            super().__init__()
            self._lines = lines
            self._mapping = self._make_mapping()

        def __call__(self, *args, **kwargs) -> None:
            for line in self._lines:
                item = self._mapping[line[0]]
                if item:
                    line[0] = item

        def _make_mapping(self) -> Dict[str, str]:
            mapping = collections.defaultdict(str)
            for line in self._lines:
                mapping[line[0]] = line[6]
            return mapping


    if __name__ == '__main__':
        prepare = PrepareProc()
        prepare.download()
        prepare.extract()

        # Processing every row takes far too long, so limit this to 20000 rows.
        # The actual number of rows was 124115 as of 2017/05/10.
        slow = SlowProc(prepare.read()[:20000])
        slow._pre_validate()
        print(f'slow={round(timeit(slow, number=1), 3)}')
        slow._post_validate()

        normal = NormalProc(prepare.read())
        normal._pre_validate()
        print(f'normal={round(timeit(normal, number=1), 3)}')
        normal._post_validate()

        fast = FastProc(prepare.read())
        fast._pre_validate()
        print(f'fast={round(timeit(fast, number=1), 3)}')
        fast._post_validate()
Running it gives the following.
    slow=9.404
    normal=0.836
    fast=0.025
The slow version never seems to finish if it is run on every row, so this is its time for just 20,000 rows.
The source can also be viewed here.
try-python/loop_optimization01.py at master · devlights/try-python · GitHub
Useful references
- PythonSpeed - Python Wiki
  - A comprehensive write-up of information on making Python faster.
- TimeComplexity - Python Wiki
  - Lists the time complexity of the operations on each collection. A must-read.
- PythonSpeed/PerformanceTips - Python Wiki
  - Tips for improving performance even further.
- python リスト、辞書、セット型のinを使った時の参照速度を調べてみた。 - Qiita
  - The python category on Qiita always has a lot of clear and useful information. Much appreciated.
- Newest 'python' Questions - Stack Overflow
  - A treasure trove of information. There is an enormous amount of it, and it is fun just to browse.
For past articles, please see the page below.
- いろいろ備忘録日記まとめ
The sample code is published at the location below.
- いろいろ備忘録日記サンプルソース置き場