I'm Suzuki (Tai), an engineer.
In this article I take on the question: what exactly is the difference between multiprocessing, threading, and asyncio?
Ask Google and you get a huge number of hits. Every one of them is fragmentary, though (this article may be no exception), so I had to fill in the gaps by reading books and digging around the web.
This article gathers everything I was able to find and pulls my conclusions on the question together in one place.
- Background
- Main topic
- Summary
- References
Background
Before getting to the main topic, let's agree on a few prerequisites.
What is multiprocessing?
A process is a program in execution. When you run Python source code, for example, the interpreter compiles the source into bytecode and then executes that bytecode, carrying out the work exactly as written in the source. The OS manages this running interpreter as a process.
A process makes progress when the OS assigns it a free CPU core. Naturally, with only one core, only one process can make progress at any instant. With multiple cores, the OS can assign different cores to different processes at the same time, so several processes can make progress simultaneously.
Multiprocessing means several processes making progress at the same time. Its benefit is that a single program can use multiple CPU cores to reach its goal faster.
How multiprocessing is implemented differs from OS to OS. You do need to watch for behavioral differences between operating systems, but there is little difference between programming languages. That said, a program rarely asks the OS to create a process directly; it goes through the wrapper functions and classes that each language provides. So you do need to know how those wrappers differ from language to language.
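As a minimal sketch of the point about wrappers: in Python, one such wrapper is the standard subprocess module. The example below starts a second Python interpreter as a child process and compares process IDs. The helper name child_pid is made up for illustration and is not part of the article's test programs.

```python
import os
import subprocess
import sys


def child_pid() -> int:
    """Start a child Python process and return the PID it reports."""
    # sys.executable is the path of the running interpreter. The child
    # prints its own process ID, which we read back from its stdout.
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getpid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


parent = os.getpid()
child = child_pid()
# The OS gives every live process its own ID, so the two values differ.
print(f"parent={parent} child={child}")
```

The program never talks to the OS directly here: subprocess.run hides the platform-specific process-creation calls.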
What is multithreading?
A thread is a flow of execution inside a process. "Flow of execution" is vague and hard to picture, so here is a concrete example.
Running the Python source code below creates a process. Inside that process there is a flow of execution that starts by printing Hello and ends by printing !. That flow is a thread, and this particular one is called the main thread. In this source code, the main thread is the only flow of execution from the start of the process to its end.
hello.py
    print('Hello')
    print('world')
    print('!')
The Python source code below uses the threading library to run multiple threads. The job.start() method starts a thread. This source code has four flows of execution: print('Hello'), print('world'), print('!'), and the main thread. Once job.join() returns, that thread has finished. So by the time print('done') runs, the main thread is the only thread left.
hello_threading.py
    import threading

    jobs = []
    jobs.append(threading.Thread(target=lambda: print('Hello')))
    jobs.append(threading.Thread(target=lambda: print('world')))
    jobs.append(threading.Thread(target=lambda: print('!')))
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()
    print('done')
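The rise and fall in the number of threads can be observed directly. The following is a small sketch, not part of the original programs: it keeps three worker threads alive with a threading.Event (an assumption made so they are still running when counted) and reads threading.active_count() before, during, and after.

```python
import threading

release = threading.Event()  # keeps the workers alive until we release them


def worker() -> None:
    release.wait()


before = threading.active_count()   # threads alive before starting any workers
workers = [threading.Thread(target=worker) for _ in range(3)]
for t in workers:
    t.start()
during = threading.active_count()   # the three workers are alive as well
release.set()
for t in workers:
    t.join()
after = threading.active_count()    # every worker has finished

print(during - before, after - before)
```

When nothing else in the process starts or stops threads, this should show three extra threads while the workers run and none after join() returns.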
A thread makes progress when it is assigned a CPU core. A thread is normally created and managed through a mechanism provided by the platform or the language (Pthreads on Linux, the Thread class in Java, the threading library in Python, and so on). As with processes, if there are multiple CPU cores and different cores can be assigned to different threads at the same time, several threads can make progress simultaneously.
Multithreading means several threads making progress at the same time.
In general, the benefits of multithreading are the same as those of multiprocessing. In Python, however, you have to keep in mind that CPython has a GIL (global interpreter lock).
Multithreading in Python
When writing multithreaded Python code, you have to take into account that CPython has a GIL. Some scripting-language interpreters have a GIL and some do not. CPython has one; Jython and IronPython do not. CRuby, incidentally, also has one.
On an interpreter with a GIL, only one thread can execute bytecode at any moment, so even if you write multithreaded source code, the bytecode the interpreter produces does not run in parallel across threads. The hello_threading.py shown above, for example, is not executed in parallel when it actually runs.
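One nuance is worth illustrating. On CPython, each threading.Thread is still backed by its own OS thread; the GIL only restricts which of them may run bytecode at a given moment. The sketch below is an illustration under assumptions (Python 3.8 or later on a platform that supports threading.get_native_id(); the Barrier is there only to keep all workers alive at once so their IDs cannot be reused).

```python
import threading

N = 4
barrier = threading.Barrier(N)   # holds all workers alive at the same time
native_ids = []
ids_lock = threading.Lock()


def worker() -> None:
    with ids_lock:
        # Thread ID assigned by the OS kernel, not by the interpreter
        native_ids.append(threading.get_native_id())
    barrier.wait(timeout=10)


threads = [threading.Thread(target=worker) for _ in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(set(native_ids)))  # number of distinct OS-level thread IDs
```

Seeing four distinct IDs says nothing about parallel speed-up: the threads exist at the OS level, but they take turns holding the GIL.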
Main topic
When writing multi-process or multithreaded code in Python, which should you use: multiprocessing, threading, or asyncio?
When multiprocessing (the multiprocessing library) is the better choice
If the code is for CPU-heavy work (so-called CPU-bound work), use multiprocessing and write it as multiple processes. (This does not apply if you use an interpreter without a GIL, such as Jython.)
Writing multithreaded code for CPU-heavy work does not improve performance. As explained in "Multithreading in Python", after the interpreter compiles the Python source, only one thread at a time executes the resulting bytecode. In other words, the program is effectively limited to a single CPU core.
Actually trying it shows a striking performance difference.
Test environment
- 4 vCPUs, 16 GB memory
- CentOS, 8, x86_64 built on 20210701
- Python3.8
cpu_sec.py
A program that runs the CPU-heavy burden_cpu function in one process with one thread.
    def burden_cpu():
        for i in range(10000):
            for j in range(10000):
                pass

    for i in range(4):
        burden_cpu()
Result
    $ time python3.8 cpu_sec.py
    real 0m9.518s
    user 0m9.473s
    sys  0m0.005s
CPU usage. Only one of the four CPU cores is in use, so usage sits at 25%.
    $ mpstat 1
    ... (omitted)
    16:08:49 CPU  %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
    16:08:49 all 18.50 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 81.50
    16:08:50 all 25.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 75.00
    16:08:51 all 25.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 75.00
    16:08:52 all 25.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 75.00
    16:08:53 all 24.88 0.00 0.25 0.00 0.25 0.00 0.00 0.00 0.00 74.63
    16:08:54 all 24.81 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 75.19
    16:08:55 all 25.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 75.00
    16:08:56 all 24.75 0.00 0.00 0.00 0.50 0.00 0.00 0.00 0.00 74.75
    16:08:57 all 25.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 75.00
    ... (omitted)
cpu_multiprocessing.py
A program that runs the CPU-heavy burden_cpu function in four processes, each with a single thread.
    import multiprocessing as mp

    def burden_cpu(_: object) -> None:
        for i in range(10000):
            for j in range(10000):
                pass

    if __name__ == '__main__':
        # Four worker processes, one task each; the with block cleans up the pool
        with mp.Pool(4) as pool:
            pool.map(burden_cpu, range(4))
Result. The CPU is used efficiently (see below), so the run takes less time than cpu_sec.py.
    $ time python3.8 cpu_multiprocessing.py
    real 0m5.351s
    user 0m21.062s
    sys  0m0.028s
CPU usage. Each of the four processes is assigned one of the four CPU cores and all four cores are in use at once, so usage is close to 100%.
    $ mpstat 1
    ... (omitted)
    16:12:08 all 99.50 0.00 0.00 0.00 0.50 0.00 0.00 0.00 0.00 0.00
    16:12:09 all 99.01 0.00 0.00 0.00 0.99 0.00 0.00 0.00 0.00 0.00
    16:12:10 all 99.75 0.00 0.00 0.00 0.25 0.00 0.00 0.00 0.00 0.00
    16:12:11 all 99.25 0.00 0.00 0.00 0.75 0.00 0.00 0.00 0.00 0.00
    ... (omitted)
cpu_threading.py
A program that runs the CPU-heavy burden_cpu function in one process with four threads.
    from concurrent.futures import ThreadPoolExecutor

    def burden_cpu():
        for i in range(10000):
            for j in range(10000):
                pass

    pool = ThreadPoolExecutor(max_workers=4)
    for i in range(4):
        pool.submit(burden_cpu)
    pool.shutdown()
Result. In the source code four threads appear to make progress at once, but at the bytecode level only one thread is executing at any moment (see below), so the run time is almost the same as for cpu_sec.py.
    $ time python3.8 cpu_threading.py
    real 0m9.812s
    user 0m9.820s
    sys  0m0.090s
CPU usage. Usage stays at around 25%, which shows that only one of the four CPU cores is being used.
    $ mpstat 1
    ... (omitted)
    16:16:46 all 25.00 0.00 0.25 0.00 0.00 0.25 0.00 0.00 0.00 74.50
    16:16:47 all 24.88 0.00 0.25 0.00 0.50 0.00 0.00 0.00 0.00 74.38
    16:16:48 all 24.75 0.00 0.00 0.00 0.25 0.00 0.25 0.00 0.00 74.75
    16:16:49 all 24.81 0.00 0.25 0.00 0.00 0.25 0.00 0.00 0.00 74.69
    16:16:50 all 25.00 0.00 0.25 0.00 0.50 0.00 0.00 0.00 0.00 74.25
    16:16:51 all 24.75 0.00 0.00 0.00 0.25 0.00 0.25 0.00 0.00 74.75
    16:16:52 all 24.94 0.00 0.25 0.00 0.25 0.00 0.00 0.00 0.00 74.56
    16:16:53 all 24.69 0.00 0.25 0.00 0.00 0.00 0.00 0.00 0.00 75.06
    16:16:54 all 25.31 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 74.69
    ... (omitted)
cpu_asyncio.py
With asyncio, too, the result is the same as for cpu_threading.py.
    import asyncio

    running = 0

    async def burden_cpu_async():
        global running
        for i in range(10000):
            for j in range(10000):
                pass
        running -= 1

    async def main():
        await asyncio.gather(*[
            burden_cpu_async(),
            burden_cpu_async(),
            burden_cpu_async(),
            burden_cpu_async(),
        ])

    asyncio.run(main())
Result.
    $ time python3.8 cpu_asyncio.py
    real 0m9.433s
    user 0m9.389s
    sys  0m0.007s
CPU usage. Usage stays at around 25%, which shows that only one of the four CPU cores is being used.
    $ mpstat 1
    ... (omitted)
    01:10:20 all 24.94 0.00 0.00 0.00 0.25 0.00 0.00 0.00 0.00 74.81
    01:10:21 all 24.81 0.00 0.00 0.00 0.25 0.00 0.00 0.00 0.00 74.94
    01:10:22 all 25.00 0.00 0.00 0.00 0.25 0.00 0.00 0.00 0.00 74.75
    01:10:23 all 24.81 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 75.19
    01:10:24 all 24.88 0.00 0.00 0.00 0.50 0.00 0.00 0.00 0.00 74.63
    01:10:25 all 24.81 0.00 0.25 0.00 0.00 0.00 0.00 0.00 0.00 74.94
    01:10:26 all 24.94 0.00 0.00 0.00 0.00 0.00 0.25 0.00 0.00 74.81
    01:10:27 all 24.75 0.00 0.00 0.00 0.25 0.00 0.00 0.00 0.00 75.00
    ... (omitted)
Summary of results by source file
Source file | Processes | Threads | Run time (s) | CPU usage (%) |
---|---|---|---|---|
cpu_sec.py | 1 | 1 | 9.518 | 25.0 |
cpu_multiprocessing.py | 4 | 1 | 5.351 | 99.75 |
cpu_threading.py | 1 | 4 | 9.812 | 25.0 |
cpu_asyncio.py | 1 | 1 | 9.433 | 25.0 |
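To put the table in relative terms, the short calculation below divides the single-threaded run time by each of the others. The figures are copied from the table; the dictionary and variable names are just for illustration.

```python
# Wall-clock run times in seconds, copied from the summary table
real_seconds = {
    "cpu_sec.py": 9.518,
    "cpu_multiprocessing.py": 5.351,
    "cpu_threading.py": 9.812,
    "cpu_asyncio.py": 9.433,
}

baseline = real_seconds["cpu_sec.py"]
# Speed-up relative to the single-process, single-thread run
speedup = {name: round(baseline / t, 2) for name, t in real_seconds.items()}

for name, s in speedup.items():
    print(f"{name}: {s:.2f}x")
```

Only the multiprocessing version is meaningfully faster (about 1.78x in this measurement, well short of 4x), while threading and asyncio stay within a few percent of the baseline.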
When threading or asyncio is the better choice
If the code is for work dominated by I/O waits (so-called I/O-bound work), use threading (multithreading) or asyncio (asynchronous I/O).
The reason not to use multiprocessing is that creating a process is expensive. Creating a thread costs less than creating a process, so you should use the cheaper option. Each new process also adds to the number of file descriptors and to the number of switches the OS has to perform to share the CPU, in proportion to the number of processes created.
By the same reasoning, from the standpoint of creation cost, firing an asynchronous I/O event costs less than creating a thread, so asyncio looks like the better choice.
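As a rough illustration of why event-loop tasks are considered cheap, the sketch below runs 10,000 short waits on a single thread with asyncio. It does not measure thread creation and is not one of the article's benchmarks; the function names and the 0.1-second delay are assumptions chosen for the example.

```python
import asyncio
import time


async def wait_briefly(delay: float) -> int:
    await asyncio.sleep(delay)   # stands in for an I/O wait
    return 1


async def run_many(count: int, delay: float) -> int:
    # Schedule `count` coroutines on one event loop, in one thread
    results = await asyncio.gather(*(wait_briefly(delay) for _ in range(count)))
    return sum(results)


started = time.perf_counter()
completed = asyncio.run(run_many(10_000, 0.1))
elapsed = time.perf_counter() - started
print(f"{completed} waits finished in {elapsed:.2f}s")
```

Run one after another, the same waits would take about 1,000 seconds; overlapped on the event loop they finish in a small fraction of that.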
But is that really the case? Let's test it.
threading vs asyncio
At the risk of oversimplifying, threading and asyncio are in essence both libraries that provide "a way to make progress on several pieces of work at once in Python". With either one, as described in "Multithreading in Python", only one thread at a time executes the bytecode the interpreter produces.
They differ in the following respects.
- threading
  - Long established: it has been in the standard library since Python 1.6 (2000).
  - Belongs to the long-standing paradigm of multithreaded programming.
  - The API of Python's threading library loosely resembles Java's multithreading API.
  - You create several threads, each of which makes progress at the same time.
  - You have to watch out for race conditions.
- asyncio
  - Introduced in Python 3.4 (2014); the async/await syntax followed in Python 3.5 (2015).
  - Belongs to the asynchronous programming paradigm, which has spread over the last ten years or so.
  - While one piece of work is waiting on I/O, another can make progress. Strictly speaking, then, several pieces of work are not progressing at the same instant: whenever a "wait" occurs, asyncio simply moves on to work that is not currently waiting. (A detailed explanation of asynchronous I/O is beyond the scope of this article; readers who want to know more should search the web.)
  - You do not need to worry much about race conditions, but you do have to write the program so that it contains asynchronous I/O "waits" (for example, the asyncio.sleep function). Without a "wait", no context switch takes place.
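The last point about asyncio, that nothing switches unless the code waits, can be shown with a small sketch. The names worker and run are made up for illustration; asyncio.sleep(0) is used as the smallest possible "wait".

```python
import asyncio


async def worker(name: str, log: list, cooperative: bool) -> None:
    for step in range(2):
        log.append(f"{name}{step}")
        if cooperative:
            # A "wait": hands control back to the event loop
            await asyncio.sleep(0)


async def run(cooperative: bool) -> list:
    log: list = []
    await asyncio.gather(
        worker("a", log, cooperative),
        worker("b", log, cooperative),
    )
    return log


without_wait = asyncio.run(run(False))
with_wait = asyncio.run(run(True))
print(without_wait)  # ['a0', 'a1', 'b0', 'b1']
print(with_wait)     # ['a0', 'b0', 'a1', 'b1']
```

Without the await, worker a runs to completion before b gets a turn; with it, the two interleave at each wait.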
At first glance asyncio looks like a better choice than threading, but how do they compare in practice? Let's write I/O-bound work with each library and compare.
Test environment
- 4 vCPUs, 16 GB memory
- CentOS, 8, x86_64 built on 20210701
- Python3.8
Comparison method
Two test programs are used for the comparison: io_threading.py and io_asyncio.py.
Each is a long-running program that processes I/O-bound tasks, mimicking something like a web server. A web server handles requests that arrive on a port; the test programs handle tasks that arrive on standard input. Some web servers hand each request to a thread (Apache httpd with a threaded MPM, for example) and others hand it to an event loop (nginx or Node.js, for example). io_threading.py hands each task from standard input to a thread, whereas io_asyncio.py hands it to an event loop.
Tasks are fed to the test program on standard input. Each input line must be a number, and that number is the weight of the task. max_weight_io_burden is the total weight of tasks the test program has to process. Once the sum of the weights it has processed reaches that total, the program finishes.
The tasks are I/O-bound. A task's weight is its I/O wait time in seconds. The io_burden function mimics an I/O-bound task.
io_threading.py and io_asyncio.py compete on how quickly they can finish the total weight of tasks.
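The intake rule described above can be sketched in isolation. This is a simplified stand-in for the input loop of the test programs: it only decides which tasks are accepted and does not run them. The function name accept_tasks is hypothetical.

```python
from typing import Iterable, List


def accept_tasks(lines: Iterable[str], max_total: int) -> List[int]:
    """Return the task weights accepted before the running total reaches max_total."""
    accepted: List[int] = []
    total = 0
    for line in lines:
        if total >= max_total:
            break                # enough work has been accepted
        weight = int(line)       # each input line is a task weight in seconds
        accepted.append(weight)
        total += weight
    return accepted


print(accept_tasks(["10"] * 1000, 35))  # [10, 10, 10, 10]
```

Because the check happens before each task is accepted, the accepted total can overshoot the limit by up to one task's weight, as with 40 against a limit of 35 here.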
io_threading.py
An implementation that handles I/O-bound work with threading.
    import threading
    import fileinput
    import time
    import os

    # Total weight of tasks to process, taken from the environment
    max_weight_io_burden = int(os.environ['MAX_WEIGHT_IO_BURDEN'])
    start = None
    # Total weight of the tasks processed so far
    processed_weight = 0
    processed_weight_lock = threading.Lock()

    def io_burden(weight: int):
        # Mimics I/O-bound work by waiting for `weight` seconds
        global processed_weight
        time.sleep(weight)
        with processed_weight_lock:
            processed_weight += weight
            if processed_weight >= max_weight_io_burden:
                print(time.time() - start, processed_weight)

    def get_input():
        global start
        inputs = 0
        for line in fileinput.input():
            # Receive a task from standard input; weight is the task's weight
            weight = int(line)
            if inputs == 0:
                start = time.time()
            if inputs >= max_weight_io_burden:
                # Leave the loop once the accepted weight reaches max_weight_io_burden
                break
            # Create a thread for the task and start it
            t = threading.Thread(target=io_burden, args=(weight,))
            t.start()
            inputs += weight
        # Wait (by busy-waiting) until every worker thread has finished
        while threading.active_count() > 1:
            pass

    get_input()
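The lock around processed_weight matters because `+=` on a shared variable is a read-modify-write sequence that threads can interleave. Below is a minimal, self-contained sketch of the same pattern; the Counter class and run function are illustrative names, not part of io_threading.py.

```python
import threading


class Counter:
    """A shared counter whose updates are serialized by a lock."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def add(self, amount: int) -> None:
        # Read-modify-write is not atomic, so guard it with the lock
        with self._lock:
            self.value += amount


def run(workers: int, repeats: int) -> int:
    counter = Counter()

    def work() -> None:
        for _ in range(repeats):
            counter.add(1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


total = run(4, 10_000)
print(total)  # 40000
```

With the lock in place the total is always workers × repeats; without it, updates could in principle be lost.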
io_asyncio.py
An implementation that handles I/O-bound work with asyncio. It is the asyncio version of io_threading.py above.
    import threading
    import fileinput
    import time
    import os
    import asyncio

    # Total weight of tasks to process, taken from the environment
    max_weight_io_burden = int(os.environ['MAX_WEIGHT_IO_BURDEN'])
    start = None
    processed_weight = 0

    async def io_burden(weight: int, loop):
        global processed_weight
        await asyncio.sleep(weight)
        processed_weight += weight
        if processed_weight >= max_weight_io_burden:
            loop.stop()
            print(time.time() - start, processed_weight)

    def get_input(loop):
        global start
        inputs = 0
        for line in fileinput.input():
            weight = int(line)
            if inputs == 0:
                start = time.time()
            if inputs >= max_weight_io_burden:
                break
            # Register a coroutine that processes the task with the event loop
            asyncio.run_coroutine_threadsafe(io_burden(weight, loop), loop=loop)
            inputs += weight

    # new_event_loop() avoids the deprecated implicit loop creation in newer Pythons
    loop = asyncio.new_event_loop()
    thread_input = threading.Thread(target=get_input, args=(loop,))
    thread_input.start()
    loop.run_forever()
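The key call in this program is asyncio.run_coroutine_threadsafe, which lets one thread hand a coroutine to an event loop running in another thread. The sketch below isolates that hand-off. For convenience it reverses the roles used in io_asyncio.py (here the loop runs in a helper thread and the main thread submits work), and the coroutine double_after is a made-up example.

```python
import asyncio
import threading


async def double_after(delay: float, value: int) -> int:
    await asyncio.sleep(delay)   # stands in for an I/O wait
    return value * 2


loop = asyncio.new_event_loop()
loop_thread = threading.Thread(target=loop.run_forever)
loop_thread.start()

# Called from the main thread; returns a concurrent.futures.Future
future = asyncio.run_coroutine_threadsafe(double_after(0.01, 21), loop)
result = future.result(timeout=5)

# Stop the loop from outside its own thread, then clean up
loop.call_soon_threadsafe(loop.stop)
loop_thread.join()
loop.close()

print(result)  # 42
```

Scheduling work on a loop from another thread has to go through the thread-safe entry points (run_coroutine_threadsafe, call_soon_threadsafe), because the loop's ordinary methods are not thread-safe.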
How to run the programs
The programs are run using two terminals.
In the first terminal, run the test program. The environment variable MAX_WEIGHT_IO_BURDEN is the total weight of tasks the program will process.
    # Example
    # Start the program; it exits after processing tasks with a total weight of 1000000.
    $ tail -f a.txt | MAX_WEIGHT_IO_BURDEN=1000000 python3.8 io_threading.py
    # Start the program; it exits after processing tasks with a total weight of 1000.
    $ tail -f a.txt | MAX_WEIGHT_IO_BURDEN=1000 python3.8 io_asyncio.py
In the second terminal, feed tasks to the program.
    # Keep feeding tasks of weight 1
    $ while true; do echo "1" >> a.txt; done
    # Keep feeding tasks of weight 10
    $ while true; do echo "10" >> a.txt; done
Results
io_threading.py
Program | MAX_WEIGHT_IO_BURDEN | Weight per task (s) | Processing time (s) | Note |
---|---|---|---|---|
io_threading.py | 1,000,000 | 1 | 188.2251 | (1) |
io_threading.py | 1,000,000 | 2 | 98.5299 | (1) |
io_threading.py | 1,000,000 | 3 | 67.4113 | (1) |
io_threading.py | 1,000,000 | 4 | 54.4028 | (1) |
io_threading.py | 1,000,000 | 5 | - | (2) |
io_threading.py | 1,000,000 | 30 | - | (2) |
io_threading.py | 1,000,000 | 40 | 43.4900 | (4) |
io_threading.py | 1,000,000 | 50 | 52.6634 | (4) |
io_threading.py | 1,000,000 | 100 | 101.3432 | (4) |
io_asyncio.py
Program | MAX_WEIGHT_IO_BURDEN | Weight per task (s) | Processing time (s) | Note |
---|---|---|---|---|
io_asyncio.py | 1,000,000 | 1 | 127.0902 | (1) |
io_asyncio.py | 1,000,000 | 2 | 71.9533 | (1) |
io_asyncio.py | 1,000,000 | 3 | 50.1331 | (1) |
io_asyncio.py | 1,000,000 | 4 | 36.5489 | (1) |
io_asyncio.py | 1,000,000 | 5 | 29.2290 | (1) |
io_asyncio.py | 1,000,000 | 6 | 25.2520 | (1) |
io_asyncio.py | 1,000,000 | 7 | 22.3903 | (1) |
io_asyncio.py | 1,000,000 | 8 | 20.4432 | (1)(3) |
io_asyncio.py | 1,000,000 | 9 | 20.1911 | (1)(3) |
io_asyncio.py | 1,000,000 | 10 | 19.8268 | (3) |
io_asyncio.py | 1,000,000 | 20 | 24.8514 | (4) |
io_asyncio.py | 1,000,000 | 30 | 33.0301 | (4) |
io_asyncio.py | 1,000,000 | 40 | 42.3853 | (4) |
io_asyncio.py | 1,000,000 | 50 | 51.8705 | (4) |
io_asyncio.py | 1,000,000 | 100 | 100.7834 | (4) |
Discussion of the results
(1) Overhead grows when tasks are split too finely
Both threading and asyncio show their longest processing times here. The likely cause is that the overhead of threads, the event loop, and everything else came to dominate. Multithreading is a technique for finishing all the work sooner by splitting a large task into small ones and having several threads process them at the same time. The smaller the tasks, the sooner each thread finishes, but the more threads have to be created and managed. Asynchronous I/O is the same: the smaller the tasks, the sooner each one finishes, but the more tasks have to be registered with and managed by the event loop. In these programs, smaller tasks also mean more reads from standard input.
asyncio takes less time than threading. The likely reason is that registering and managing tasks on the event loop takes less time than creating and managing threads. This also matches the general view that asynchronous I/O incurs less context-switching overhead than multithreading.
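The size of the gap in this region can be read off the two results tables. The calculation below uses the measured times for task weights 1 to 4, the rows where both programs completed; the variable names are illustrative.

```python
# Processing times in seconds from the results tables, keyed by task weight
threading_s = {1: 188.2251, 2: 98.5299, 3: 67.4113, 4: 54.4028}
asyncio_s = {1: 127.0902, 2: 71.9533, 3: 50.1331, 4: 36.5489}

# How many times longer threading took than asyncio for the same workload
ratio = {w: round(threading_s[w] / asyncio_s[w], 2) for w in threading_s}

for w, r in ratio.items():
    print(f"weight {w}: threading took {r:.2f}x as long as asyncio")
```

In these measurements threading took roughly 1.3 to 1.5 times as long as asyncio when tasks were very small.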
(2) Effect of the OS limit on the number of threads
The number of threads running on the OS hits the limit of the execution environment, and the program exits with an error.
    $ tail -f a.txt | MAX_WEIGHT_IO_BURDEN=1000000 python3.8 io_threading.py
    Traceback (most recent call last):
      File "io_threading.py", line 43, in <module>
        get_input()
      File "io_threading.py", line 37, in get_input
        t.start()
      File "/usr/lib64/python3.8/threading.py", line 852, in start
        _start_new_thread(self._bootstrap, ())
    RuntimeError: can't start new thread
    32702 libgcc_s.so.1 must be installed for pthread_cancel to work
The execution environment is Linux, which limits the number of threads that can be created. CPython threads are implemented with pthreads, so they are subject to this limit. The kernel's system-wide limit can be checked as follows.
    $ cat /proc/sys/kernel/threads-max
    126329
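For a sense of scale, the sketch below counts how many tasks, and therefore how many threads over the whole run in io_threading.py, are needed to reach a total weight of 1,000,000 at various task weights. It is only arithmetic: how many threads are alive at the same moment also depends on how fast tasks arrive and how long each one sleeps, which this sketch does not model. The function name tasks_needed is hypothetical.

```python
import math

TOTAL_WEIGHT = 1_000_000   # total weight used in the experiments


def tasks_needed(weight: int, total: int = TOTAL_WEIGHT) -> int:
    """Number of tasks of the given weight needed to reach the total."""
    return math.ceil(total / weight)


for weight in (1, 4, 5, 30, 40, 100):
    print(f"weight {weight:>3}: {tasks_needed(weight):>9,} tasks")
```

Longer tasks mean fewer threads in total, but each thread stays alive longer, so the number of simultaneously live threads can still climb toward the limit.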
(3) Performance plateaus
This is where processing time is shortest, but the ideal improvement described in (4) has levelled off. The overhead described in (1) is presumably starting to show.
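(4) Performance improves in proportion to how finely tasks are split
The weight of each task and the processing time are almost the same. Both multithreading and asynchronous I/O achieve the ideal performance improvement. It is "ideal" in the sense that processing time shrinks by exactly as much as the large task is split into smaller ones.
There is almost no difference in processing time between io_threading.py and io_asyncio.py. This is presumably because the overhead described in (1) is small enough to ignore.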
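How close "almost the same" is can be checked by subtracting the task weight from the measured time. The figures come from the results tables (weights 40, 50, and 100); the helper extra_seconds is an illustrative name.

```python
# (task weight in seconds, measured processing time in seconds) from the tables
threading_rows = [(40, 43.4900), (50, 52.6634), (100, 101.3432)]
asyncio_rows = [(40, 42.3853), (50, 51.8705), (100, 100.7834)]


def extra_seconds(rows):
    """Time spent beyond the wait of a single task, for each row."""
    return [round(elapsed - weight, 4) for weight, elapsed in rows]


threading_extra = extra_seconds(threading_rows)
asyncio_extra = extra_seconds(asyncio_rows)
print("threading:", threading_extra)
print("asyncio:  ", asyncio_extra)
```

In every row the extra time is under four seconds, and it shrinks as tasks get larger; asyncio's extra time is slightly smaller than threading's in each case.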
Summary
The conclusions from these experiments are as follows.
- For CPU-heavy (CPU-bound) work, use multiple processes (multiprocessing).
- For work dominated by I/O waits (I/O-bound work), use multithreading (threading) or asynchronous I/O (asyncio).
- When a large number of threads would be running at the same time, asynchronous I/O performs better. When the number is not large, there is little difference in performance between multithreading and asynchronous I/O.
References
- マルチスレッドプログラミング | TECHSCORE(テックスコア)
- Pythonをとりまく並行/非同期の話
- Async/await - Wikipedia
- Amazon.co.jp: Operating System Concepts : Silberschatz, Abraham, Galvin, Peter B., Gagne, Greg: Foreign Language Books
- GlobalInterpreterLock - Python Wiki
- multiprocessing vs multithreading vs asyncio in Python 3 - Stack Overflow
- 7.5.6 Thread Objects
- POSIX Threads - Wikipedia
- Python Documentation by Version | Python.org