Update: In July 2016, after this article was published, I presented a tuning case study from Hatena: はてなにおけるLinuxネットワークスタックパフォーマンス改善 / Linux network performance improvement at hatena - Speaker Deck
In network applications that tend to see high packet rates, such as software load balancers and reverse proxies like HAProxy and nginx, or key-value stores like memcached, load can concentrate on a single CPU core so that the application fails to scale across multiple cores. This article explains why CPU load in such applications does not scale across cores, and introduces RFS (Receive Flow Steering), a tuning technique for the Linux kernel network stack that makes it scale.
Note that this article is not about scaling applications that run as one process with one thread, such as Redis or Node.js, across multiple cores.
- Problem and background
- Techniques for scaling the network stack across cores
- Experiments
- References
- Summary
- Bonus: techniques for reducing the CPU load of network stack processing
Problem and background
As mentioned above, in a heavily loaded network application it is common to see load concentrated in softirq (%soft) on CPU0, leaving only CPU0 with very low idle time (%idle) even though the other cores have capacity to spare, as in the output below.
CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
all   31.73    0.00    1.47    0.13    0.00    0.96    0.06    0.00   65.64
0     70.41    0.00    5.10    0.00    0.00   15.31    0.00    0.00    9.18
1     68.04    0.00    3.09    0.00    0.00    0.00    0.00    0.00   28.87
2     53.06    0.00    3.06    0.00    0.00    0.00    0.00    0.00   43.88
3     47.47    0.00    2.02    0.00    0.00    0.00    1.01    0.00   49.49
4     49.45    0.00    1.10    0.00    0.00    0.00    0.00    0.00   49.45
5     44.33    0.00    2.06    0.00    0.00    0.00    0.00    0.00   53.61
6     38.61    0.00    2.97    0.99    0.00    0.00    0.00    0.00   57.43
7     32.63    0.00    1.05    0.00    0.00    0.00    0.00    0.00   66.32
8     29.90    0.00    1.03    1.03    0.00    0.00    0.00    0.00   68.04
9     10.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00   90.00
10     8.08    0.00    1.01    0.00    0.00    0.00    0.00    0.00   90.91
11     6.12    0.00    0.00    0.00    0.00    0.00    0.00    0.00   93.88
12    10.00    0.00    2.00    0.00    0.00    0.00    0.00    0.00   88.00
13    11.00    0.00    1.00    0.00    0.00    0.00    0.00    0.00   88.00
14    17.71    0.00    0.00    0.00    0.00    0.00    0.00    0.00   82.29
15    11.22    0.00    1.02    0.00    0.00    0.00    0.00    0.00   87.76
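To spot this kind of skew without reading the table by eye, a small filter can pick out the CPU with the highest %soft value. This is only a sketch: the function name hottest_softirq_cpu is made up for illustration, and it assumes mpstat-style input with a header line that contains both CPU and %soft columns.

```shell
#!/bin/bash
# Print the CPU with the highest %soft value from mpstat-style output.
# hottest_softirq_cpu is a hypothetical helper name, not part of sysstat.
hottest_softirq_cpu() {
    awk '
        # Header line: remember which columns hold the CPU id and %soft.
        /%soft/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU")   c = i
                if ($i == "%soft") s = i
            }
            next
        }
        # Data line: skip the "all" summary row and track the maximum.
        s && NF >= s && $c != "all" && $s + 0 > max { max = $s + 0; cpu = $c }
        END { if (cpu != "") print cpu, max }
    '
}

# Example usage (needs the sysstat package):
#   mpstat -P ALL 1 1 | hottest_softirq_cpu
```

If nearly all of the soft interrupt time sits on one CPU while the others report close to zero, the situation described in this section applies.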
softirq is what this article calls a "soft interrupt". Below I discuss what a high soft interrupt load means and why the load concentrates on CPU0.
What does a high soft interrupt load mean?
For background on Linux soft interrupts, see http://sourceforge.jp/projects/linux-kernel-docs/wiki/2.2%E3%80%80Linux%E3%82%AB%E3%83%BC%E3%83%8D%E3%83%AB%E3%81%AE%E5%89%B2%E3%82%8A%E8%BE%BC%E3%81%BF%E5%87%A6%E7%90%86%E3%81%AE%E7%89%B9%E5%BE%B4 .
First, let's look at how Linux receives packets using interrupts. Since Linux 2.6, packets are received through NAPI, a mechanism that combines interrupts and polling. NAPI is covered in the bonus section at the end of this article.
- [NIC hardware receive] When the NIC receives a packet, it places the packet in its internal memory.
- [Hardware interrupt] To signal that a packet has arrived, the NIC raises a hardware interrupt on a host CPU. The interrupted CPU (the NIC driver) places the packet from the NIC onto a ring buffer in the kernel. The rest of the packet processing does not need to happen synchronously, so the handler schedules a soft interrupt and leaves the work to it. (Strictly speaking, what goes onto the ring buffer is probably something like a pointer to the socket buffer, and the packet data itself should reach kernel memory by DMA without going through the CPU.)
- [Soft interrupt] The scheduled soft interrupt fires, takes the packet off the ring buffer, runs protocol processing in the soft interrupt handler, and then enqueues the packet on the socket queue.
- [Application receive] When read, recv, recvfrom, or a similar call is made, the received data is copied from the socket queue to the application.
The cost of any single hardware or soft interrupt is of course small. But if traffic flows at a wire rate of 1 Gbps with 64-byte frames, roughly 1,500,000 interrupts occur per second. Soft interrupts are high-priority work, so the CPU receiving the interrupts can no longer do anything other than run interrupt handlers.
As CPU clock frequencies level off while wire rates rise, as with 10 Gbps Ethernet, interrupt handling takes up a growing share of CPU time.
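The figure of roughly 1,500,000 per second can be checked with a little arithmetic. The sketch below assumes that each 64-byte frame also occupies 8 bytes of preamble and 12 bytes of inter-frame gap on the wire, and that every frame triggers one interrupt (that is, no coalescing and no NAPI batching). The function name is made up for illustration.

```shell
#!/bin/bash
# Upper bound on frames per second at wire rate.
# Assumption: 8 bytes of preamble + 12 bytes of inter-frame gap per frame.
frames_per_sec() {
    local link_bps=$1      # link speed in bits per second
    local frame_bytes=$2   # frame size in bytes, excluding preamble and gap
    local wire_bytes=$((frame_bytes + 8 + 12))
    echo $((link_bps / (wire_bytes * 8)))
}

frames_per_sec 1000000000 64     # 1 Gbps  -> 1488095
frames_per_sec 10000000000 64    # 10 Gbps -> 14880952
```

At 10 Gbps the same calculation gives about ten times as many frames per second, which is why interrupt handling dominates as link speeds go up.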
The following references describe the packet receive path in detail.
- TCP Implementation in Linux: A Brief Tutorial
- 8.3. パケット受信の概要 Red Hat Enterprise Linux 6 | Red Hat Customer Portal
- Linux packet-forwarding
- The Performance Analysis of Linux Networking – Packet Receiving
Why doesn't it scale across cores? Why does CPU load concentrate on a particular core?
First, unless the NIC supports multiple queues (MSI-X and RSS), hardware interrupts from the NIC are delivered to one fixed CPU. If interrupts were instead delivered to several CPUs at random, packets would be processed in parallel, and for a protocol such as TCP that guarantees ordering within a single flow (connection), the packets would have to be put back in order. This is called the TCP reordering problem. To keep reordering from degrading performance, hardware interrupts are delivered to the same CPU.
In addition, the hardware interrupt handler schedules a soft interrupt. Linux assigns that soft interrupt to the same CPU that received the hardware interrupt. The structures that the hardware interrupt handler touched in memory are handed over to the soft interrupt handler, so this makes efficient use of the core-local L1 and L2 caches.
As a result, both hardware interrupts and soft interrupts concentrate on one particular CPU.
In real environments, the application's CPU load (%user) also skews toward the CPU that takes the interrupts, on top of the interrupt load itself.
My guess is that when data arrives for an application process waiting in accept(), the process scheduler prefers to run that process on the same CPU that did the protocol processing.
I think this is also for the sake of L1 and L2 cache efficiency, but I have not verified it properly.
Techniques for scaling the network stack across cores
Techniques for scaling the network stack across cores can be classified into those provided by the NIC (hardware), those provided by the kernel (software), and those that combine both. This article covers RSS (Receive Side Scaling), which is implemented by the NIC, and RPS/RFS. RFS is an extension of RPS, so in practice this comes down to RSS and RFS.
Scaling in the Linux Networking Stack describes RSS, RPS, and RFS in detail. RPS and RFS have been implemented since Linux kernel 2.6.35, and I believe they were backported to RHEL-family distributions from around 5.9.
Research papers have also proposed various other ways to speed up the network stack, such as [1][A Transport-Friendly NIC for Multicore/Multiprocessor Systems] and [2][mTCP: a Highly Scalable User-level TCP Stack for Multicore Systems].
RSS (Receive Side Scaling)
The root cause of the failure to scale across cores is that hardware interrupts concentrate on one particular CPU. RSS therefore gives the NIC multiple packet queues, maps queues to CPUs, and delivers each queue's hardware interrupt to a different CPU.
To avoid TCP reordering, a "filter" steers packets of the same flow to the same queue. The filter is usually implemented as a hash table whose key is derived from the IP header and the transport-layer header, for example the 4-tuple of src/dst IP addresses and src/dst port numbers, and whose value is a queue number.
Many hardware implementations apparently use a hash table with 128 entries, keyed by the low-order 7 bits of the hash value computed by the filter.
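The lookup described above can be modeled in a few lines. This is a toy model, not driver code: the round-robin table contents, the queue count of 4, and the sample hash value are all hypothetical, and real NICs compute the flow hash in hardware.

```shell
#!/bin/bash
# Toy model of an RSS-style table: 128 entries indexed by the low 7 bits
# of the flow hash, each entry holding a receive queue number.
NUM_QUEUES=4                      # hypothetical queue count
declare -a indirection
for ((i = 0; i < 128; i++)); do
    indirection[i]=$((i % NUM_QUEUES))   # hypothetical round-robin fill
done

queue_for_hash() {
    local flow_hash=$1
    local idx=$((flow_hash & 0x7f))      # keep only the low 7 bits
    echo "${indirection[idx]}"
}

queue_for_hash 0xdeadbeef    # the same hash always yields the same queue
```

Because every packet of a flow produces the same hash, every packet of that flow lands in the same queue, which is what keeps TCP segments in order.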
RPS (Receive Packet Steering)
RSS is a NIC feature, so it cannot be used unless the NIC supports it. RPS implements the equivalent of RSS in software (the Linux kernel).
With RPS, after the soft interrupt handler fetches a packet from the NIC buffer and before protocol processing begins, it sends an inter-processor interrupt (IPI) to another CPU. That CPU then performs the protocol processing and the application receive processing.
The other CPU is chosen in much the same way as in RSS: a hash of the 4-tuple of src/dst IP addresses and src/dst port numbers selects the target CPU.
RPS has the following three advantages.
- It is implemented in software, so it does not depend on the NIC
- The filter is implemented in software, so filters for new protocols are easy to add
- It does not increase the number of hardware interrupts
  - Is there perhaps an inefficient approach in which, to spread hardware interrupts across CPUs, an interrupt that normally fires once is delivered to every CPU and only the CPU that acquires a lock schedules the soft interrupt? (Maybe when RSS cannot use MSI-X?)
Using RPS is simple: run a command like the following for each NIC queue. A single-queue NIC has only rx-0, and a multi-queue NIC has one rx-N per queue.
# echo "f" > /sys/class/net/eth0/queues/rx-0/rps_cpus
rps_cpus is a bitmap of the CPUs that are candidates for receiving steered packets.
"f" is 1111 in binary, and the bits correspond to CPU0 through CPU3 starting from the least significant bit. A bit set to 1 makes the corresponding CPU a candidate.
So "f" makes CPU0, 1, 2, and 3 candidates.
Normally it should be fine to select all cores.
If you enable RPS but do not want packets steered to the CPU that takes the hardware interrupts, use a hexadecimal value in which that CPU's bit is 0.
Details are described in https://www.kernel.org/doc/Documentation/IRQ-affinity.txt .
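Working out the hexadecimal bitmap by hand is error-prone, so here is a minimal sketch that builds it from a list of CPU numbers. The helper name cpus_to_mask is made up for illustration, and the sketch assumes at most 32 CPUs (larger masks are written as comma-separated 32-bit groups, which this does not handle).

```shell
#!/bin/bash
# Build a hexadecimal CPU bitmap (bit N set = CPU N is a candidate).
# cpus_to_mask is a hypothetical helper; it assumes CPU numbers below 32.
cpus_to_mask() {
    local mask=0 cpu
    for cpu in "$@"; do
        mask=$((mask | (1 << cpu)))
    done
    printf '%x\n' "$mask"
}

cpus_to_mask 0 1 2 3    # prints f: all four CPUs are candidates
cpus_to_mask 1 2 3      # prints e: CPU0 is excluded

# Applying a mask needs root and an existing queue, for example:
#   cpus_to_mask 1 2 3 > /sys/class/net/eth0/queues/rx-0/rps_cpus
```

The second call shows the case from the text where the CPU that takes the hardware interrupts (assumed here to be CPU0) is left out of the candidates.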
RFS (Receive Flow Steering)
RPS has a problem from the standpoint of L1 and L2 cache locality once the application process is taken into account.
With RPS, the mapping from a flow to the CPU that processes it is effectively random.
As a result, an application process sleeping in accept(2) or read(2) may be woken on a CPU different from the one it was running on before it went to sleep.
RFS extends RPS so that it can track the application process.
Specifically, instead of deriving the target CPU directly from the flow hash, RFS keeps a flow table keyed by the flow hash, and each table entry holds a target CPU number.
When a system call such as recvmsg is made, the target CPU in the flow table is updated so that the CPU that last processed the flow becomes the target.
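The flow table idea can be illustrated with a toy model. This is not how the kernel is written: the function names, the table size, and the modulo fallback used when a flow has no entry are all simplifying assumptions made for the sketch.

```shell
#!/bin/bash
# Toy model of the RFS flow table (not kernel code).
TABLE_SIZE=4096        # hypothetical size; must be a power of two
declare -a flow_cpu    # flow_cpu[index] = CPU that last consumed the flow

# Models the update done when the application calls recvmsg and friends.
record_flow() {
    local flow_hash=$1 cpu=$2
    flow_cpu[$((flow_hash & (TABLE_SIZE - 1)))]=$cpu
}

# Models packet arrival: prefer the recorded CPU, otherwise fall back
# to a plain hash-based choice as in RPS (simplified to a modulo here).
steer_packet() {
    local flow_hash=$1 num_cpus=$2
    local idx=$((flow_hash & (TABLE_SIZE - 1)))
    if [ -n "${flow_cpu[idx]}" ]; then
        echo "${flow_cpu[idx]}"
    else
        echo $((flow_hash % num_cpus))
    fi
}
```

For example, steer_packet 10 4 picks CPU 2 from the hash alone, but after record_flow 10 3 (the application read this flow on CPU 3) the same call returns 3, so later packets of the flow follow the application.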
To configure RFS, set rps_flow_cnt and rps_sock_flow_entries in addition to the rps_cpus setting used for RPS.
# echo "f" > /sys/class/net/eth0/queues/rx-0/rps_cpus
# echo 4096 > /sys/class/net/eth0/queues/rx-0/rps_flow_cnt
# echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
rps_sock_flow_entries sets the number of entries in the system-wide flow table.
Setting it higher than the number of local ports (the maximum number of connections) should be pointless, so a value of 65536 or less should be enough.
32768 is a commonly seen setting.
rps_flow_cnt sets the number of flows per NIC queue.
For a NIC with 16 queues and rps_sock_flow_entries set to 32768, I think rps_flow_cnt is best set to 2048 (32768 / 16).
For a single-queue NIC, rps_flow_cnt can be set to the same value as rps_sock_flow_entries.
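For a multi-queue NIC the per-queue settings have to be repeated for every rx-N directory. The sketch below divides rps_sock_flow_entries evenly across the queues and applies the result. The function names are made up for illustration, and apply_rfs is only defined here, not run: it needs root and writes to /sys and /proc.

```shell
#!/bin/bash
# Per-queue flow count: the global table size divided by the queue count.
flow_cnt_per_queue() {
    local sock_flow_entries=$1 num_queues=$2
    echo $((sock_flow_entries / num_queues))
}

# Apply RFS to every receive queue of a device (hypothetical helper,
# requires root). Example: apply_rfs eth0 32768 ffff
apply_rfs() {
    local dev=$1 entries=$2 cpu_mask=$3
    local queues=(/sys/class/net/"$dev"/queues/rx-*)
    local per_queue q
    per_queue=$(flow_cnt_per_queue "$entries" "${#queues[@]}")
    echo "$entries" > /proc/sys/net/core/rps_sock_flow_entries
    for q in "${queues[@]}"; do
        echo "$cpu_mask"  > "$q/rps_cpus"
        echo "$per_queue" > "$q/rps_flow_cnt"
    done
}

flow_cnt_per_queue 32768 16    # prints 2048
flow_cnt_per_queue 32768 1     # prints 32768
```

The two calls at the end reproduce the 16-queue and single-queue cases from the text.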
For making these settings persistent, CentOS6で/sys/の変更を永続化する方法 is a very helpful reference.
Experiments
I ran a benchmark with a benchmarking tool (iperf) and also applied RFS to a real application. Both used a 10 Gbps NIC.
Benchmark with iperf
The benchmark environment was as follows.
- CPU: Intel Core i5 3470 3.2GHz, 2 cores (Hyper-Threading enabled)
- NIC: Mellanox ConnectX-3 EN 10GbE PCI Express 3.0
- OS: CentOS 5.9
I capped the CPU clock at 1.6 GHz in the BIOS to force a CPU-bound environment. Running iperf with 4 processes (connections) and a packet size of 64 bytes drives softirq usage on CPU0 to 100% on the iperf server side, as shown below.
CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle
all    0.74    0.00   16.75    0.00    0.00   27.83    0.00   54.68
0      0.00    0.00    0.00    0.00    0.00  100.00    0.00    0.00
1      2.00    0.00   52.00    0.00    0.00    9.00    0.00   37.00
2      0.00    0.00   14.95    0.00    0.00    3.74    0.00   81.31
3      0.00    0.00    1.00    0.00    0.00    0.00    0.00   99.00
Enabling RFS with the settings below spread the system and softirq load across CPU1, CPU2, and CPU3 as well.
# echo "f" > /sys/class/net/eth0/queues/rx-0/rps_cpus
# echo 4096 > /sys/class/net/eth0/queues/rx-0/rps_flow_cnt
# echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle
all    0.24    0.00   11.86    0.00    0.00   23.00    0.00   64.89
0      0.00    0.00   11.11    0.00    0.00   38.38    0.00   50.51
1      0.95    0.00   12.38    0.00    0.00   19.05    0.00   67.62
2      0.00    0.00   13.21    0.00    0.00   18.87    0.00   67.92
3      0.00    0.00   12.50    0.00    0.00   16.35    0.00   71.15
Applying RFS to a real application (Starlet)
Next, I applied RFS to a production application server running on Perl's Starlet. The application server environment was as follows.
- EC2 c3.4xlarge, SR-IOV enabled
- CPU: Intel Xeon E5-2680 v2 @ 2.80GHz, 16 cores
- NIC: Intel 82599 10 Gigabit Ethernet Controller
- NIC driver: ixgbevf 2.7.12
- OS: 3.10.23 Debian Wheezy
The RFS settings are the standard ones.
# echo "ffff" > /sys/class/net/eth0/queues/rx-0/rps_cpus
# echo 32768 > /sys/class/net/eth0/queues/rx-0/rps_flow_cnt
# echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
Before enabling RFS, as shown below, softirq (%soft) takes up 15% of CPU0 and CPU0 is only 9% idle, even though the other CPU cores have capacity to spare.
CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
all   31.73    0.00    1.47    0.13    0.00    0.96    0.06    0.00   65.64
0     70.41    0.00    5.10    0.00    0.00   15.31    0.00    0.00    9.18
1     68.04    0.00    3.09    0.00    0.00    0.00    0.00    0.00   28.87
2     53.06    0.00    3.06    0.00    0.00    0.00    0.00    0.00   43.88
3     47.47    0.00    2.02    0.00    0.00    0.00    1.01    0.00   49.49
4     49.45    0.00    1.10    0.00    0.00    0.00    0.00    0.00   49.45
5     44.33    0.00    2.06    0.00    0.00    0.00    0.00    0.00   53.61
6     38.61    0.00    2.97    0.99    0.00    0.00    0.00    0.00   57.43
7     32.63    0.00    1.05    0.00    0.00    0.00    0.00    0.00   66.32
8     29.90    0.00    1.03    1.03    0.00    0.00    0.00    0.00   68.04
9     10.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00   90.00
10     8.08    0.00    1.01    0.00    0.00    0.00    0.00    0.00   90.91
11     6.12    0.00    0.00    0.00    0.00    0.00    0.00    0.00   93.88
12    10.00    0.00    2.00    0.00    0.00    0.00    0.00    0.00   88.00
13    11.00    0.00    1.00    0.00    0.00    0.00    0.00    0.00   88.00
14    17.71    0.00    0.00    0.00    0.00    0.00    0.00    0.00   82.29
15    11.22    0.00    1.02    0.00    0.00    0.00    0.00    0.00   87.76
After enabling RFS, softirq (%soft) is spread across the other cores, and as a side effect the user (%usr) and system (%sys) load is spread out as well.
CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
all   27.41    0.00    3.07    0.00    0.00    0.70    0.13    0.00   68.69
0     36.08    0.00    8.25    0.00    0.00    6.19    0.00    0.00   49.48
1     30.43    0.00    3.26    0.00    0.00    0.00    0.00    0.00   66.30
2     31.96    0.00    4.12    0.00    0.00    2.06    0.00    0.00   61.86
3     35.64    0.00    3.96    0.00    0.00    0.00    0.99    0.00   59.41
4     44.12    0.00    1.96    0.00    0.00    0.98    0.00    0.00   52.94
5     37.00    0.00    9.00    0.00    0.00    0.00    0.00    0.00   54.00
6     38.78    0.00    1.02    0.00    0.00    1.02    0.00    0.00   59.18
7     39.00    0.00    6.00    0.00    0.00    1.00    1.00    0.00   53.00
8     19.59    0.00    3.09    0.00    0.00    0.00    0.00    0.00   77.32
9     23.16    0.00    3.16    0.00    0.00    0.00    0.00    0.00   73.68
10    17.17    0.00    4.04    0.00    0.00    0.00    0.00    0.00   78.79
11    18.00    0.00    3.00    0.00    0.00    0.00    0.00    0.00   79.00
12    16.49    0.00    0.00    0.00    0.00    0.00    0.00    0.00   83.51
13    16.33    0.00    0.00    0.00    0.00    1.02    0.00    0.00   82.65
14    16.00    0.00    1.00    0.00    0.00    0.00    0.00    0.00   83.00
15    16.67    0.00    0.00    0.00    0.00    0.00    0.00    0.00   83.33
What matters is that user and system load is distributed, not just softirq. As described earlier, spreading protocol processing across CPUs also spreads the application processing load.
Besides Starlet, I have tried RFS with applications such as HAProxy, pgpool, and Varnish in the past, and they have run without any noticeable side effects.
Other practitioners appear to have applied it to memcached, LVS, and Linux routers.
- CentOS 6.2 で RPS/RFS を使ってネットワークの割り込み処理を複数コアに分散してみた - blog.nomadscafe.jp
- CentOS5でもRPS/RFSでNICが捗る話 | Nekoya press
- Re: CentOS5でもRPS/RFSでNICが捗る話 - まいんだーのはてなブログ
References
- Scaling in the Linux Networking Stack
  - Kernel documentation
- rfs: Receive Flow Steering [LWN.net]
  - The RFS patch
- 10GbE時代のネットワークI/O高速化
  - Covers optimization of the whole network stack, not just RSS/RFS
- Linux packet-forwarding
- Linux / x86_64の割り込み処理 第1回 | 技術文書 | 技術情報 | VA Linux Systems Japan株式会社
This is not directly about RFS, but @ten_forward pointed me to some good material.
@y_uuk1 These look like related material too :) https://t.co/P762uGwFOw https://t.co/gpm9wssjNM
Posted by TenForward (@ten_forward), March 31, 2015
Summary
This article introduced ways to optimize network stack processing in the Linux kernel. Focusing on RFS in particular, I applied it to a benchmark and to a real application and confirmed that it lets the load scale across cores. RFS has no hardware dependency and works as long as the kernel is reasonably recent, so it is easy to apply and highly effective, which makes it a cost-effective tuning option.
Please let me know if anything here is wrong.
[Addendum] As pointed out in a comment, the term "software interrupt" can be confused with the CPU's software interrupts, so I revised the article to call the Linux kernel's softirq a "soft interrupt". (Some material, such as http://www.slideshare.net/syuu1228/10-gbeio , also refers to softirq as a software interrupt.)
[Addendum 2] A happy report from company C
ryot_a_rai said "I went ahead and put the kernel tuning from y_uuk1's blog into the production app", and when we tried it the response time got about 10% faster, which was a surprise http://t.co/5q5iweUrnD
Posted by @mikeda, April 2, 2015
Bonus: techniques for reducing the CPU load of network stack processing
Everything here is covered in 10GbE時代のネットワークI/O高速化 .
The CPU load of the network stack has long been called a bottleneck, and in the meantime techniques for reducing that load have appeared. (Only representative ones are listed here.)
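First, there is offloading: relatively heavy parts of network stack processing, such as checksum calculation, are handed to the NIC's ASIC instead of being done on the CPU (in the kernel).
The NIC has to support it, but my impression is that most NICs other than onboard ones do.
A command such as
ethtool -K eth0 rx on
enables receive checksum offload (lowercase ethtool -k eth0 only displays the current offload settings).
Beyond offloading parts of the stack, there are also approaches such as TCP Offload Engine (TOE), which has the NIC handle most of TCP stack processing.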
Next, if the NIC raises too many interrupts, then instead of raising a hardware interrupt on the CPU for every packet, the NIC can wait for several packets to arrive and merge them into a single interrupt. This is called Interrupt Coalescing, and it could be considered another form of NIC offload.
Interrupt Coalescing lowers the load, but it has the drawback of worse latency, because delivery waits for subsequent packets to arrive.
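The trade-off can be made concrete with rough numbers. The sketch below assumes a packet rate of 1,488,095 packets per second (64-byte frames at 1 Gbps) and a hypothetical batch of 32 packets per interrupt. Real NICs usually also bound the wait with a timer, which this ignores, and the function names are made up for illustration.

```shell
#!/bin/bash
# Back-of-the-envelope model of interrupt coalescing.
# The batch size of 32 packets per interrupt is a hypothetical value.
interrupts_per_sec() {
    local pps=$1 batch=$2
    echo $((pps / batch))
}

# Worst-case extra delay, in whole microseconds, for the first packet of
# a batch while the NIC waits for the remaining (batch - 1) packets.
added_delay_us() {
    local pps=$1 batch=$2
    echo $(((batch - 1) * 1000000 / pps))
}

interrupts_per_sec 1488095 32   # prints 46502
added_delay_us 1488095 32       # prints 20
```

Under these assumptions the interrupt rate drops by a factor of 32, at the cost of roughly 20 microseconds of extra delay for the earliest packet in each batch. At lower packet rates the same batch size would add far more delay, which is why the latency cost matters.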
Interrupt Coalescing is still in common use today. Last year I tuned parameters such as the interrupt rate on EC2: EC2でSR-IOVを使うときのNICドライバパラメータ検証 - ゆううきブログ .
In practice, NIC offloads tend to cause problems and are often disabled.
As a software-based technique, there is NAPI, a NIC driver framework provided by the kernel and available since Linux 2.6. Like Interrupt Coalescing, NAPI avoids raising a hardware interrupt on the CPU for every packet, but it differs in that the CPU polls the NIC. Specifically, once a packet has been received, hardware interrupts are disabled and the CPU fetches packets from the NIC's receive buffer. At high packet rates, several packets should accumulate in the receive buffer between the disabling and the fetch, so a single poll can fetch them together. Major NIC drivers such as e1000 and igb should support NAPI, so in a typical Linux environment you can assume NAPI is in use.
