Introduction
Hello, I am Hiroki Ohtsuji (大辻弘貴) of the Computing Laboratory at Fujitsu Research. This post is my report on attending ACM/IEEE SC24 (official name: The International Conference for High Performance Computing, Networking, Storage, and Analysis), held in Atlanta, Georgia. SC is the world's largest international conference on supercomputing, and we exhibited and presented technologies there including AI Computing Broker (blog post, video, PDF) and Interactive HPC (video, PDF).
What is ACM/IEEE SC24?
ACM/IEEE SC24 is one of the world's largest international conferences on supercomputing and is held every year in North America. The various supercomputer rankings are announced there, and in addition to paper presentations the conference features a vast exhibition floor. This year it drew a record of more than 18,000 attendees, making for a very lively event.
Trends at the conference
Ranking trends
In this year's Top500 ranking, the newly listed El Capitan at Lawrence Livermore National Laboratory in the United States took first place. Equipped with AMD's new MI300A accelerator, El Capitan recorded 1.742 EFlop/s on the HPL benchmark while drawing 29.6 MW. That power figure may sound large, but Aurora, ranked third in the Top500, delivers 1.012 EFlop/s at 38.7 MW, so El Capitan clearly offers better performance per watt (about 59 versus 26 GFlop/s per watt from these figures). From Japan, SoftBank's CHIE-2 and CHIE-3 entered the list at 16th and 17th, and Miyabi-G from JCAHPC (the University of Tokyo and the University of Tsukuba) entered at 28th. Fugaku, which debuted in first place in June 2020, is now ranked sixth.
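The performance-per-watt comparison above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch that uses the HPL and power figures quoted in this post; the helper name `gflops_per_watt` is my own and is not part of any Top500 tooling.

```python
def gflops_per_watt(hpl_eflops: float, power_mw: float) -> float:
    """Convert HPL performance (EFlop/s) and power (MW) to GFlop/s per watt."""
    flops = hpl_eflops * 1e18   # EFlop/s -> Flop/s
    watts = power_mw * 1e6      # MW -> W
    return flops / watts / 1e9  # Flop/s per W -> GFlop/s per W


# (HPL in EFlop/s, power in MW), as quoted in the text above.
systems = {
    "El Capitan": (1.742, 29.6),
    "Aurora": (1.012, 38.7),
}

efficiency = {name: gflops_per_watt(*figures) for name, figures in systems.items()}
for name, value in efficiency.items():
    print(f"{name}: {value:.2f} GFlop/s per W")

ratio = efficiency["El Capitan"] / efficiency["Aurora"]
print(f"El Capitan / Aurora: {ratio:.2f}x")
```

With these inputs El Capitan comes out at a little over twice Aurora's performance per watt.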
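Exhibition trends
Driven by the generative AI boom, exhibits related to power supply and cooling are on the rise. Some accelerators used for generative AI draw close to 1000 W each, and servers carrying several of them are becoming common. As a result, power and heat density per rack are rising, which calls for advanced power delivery and cooling technology. Manufacturers of power supply units and of liquid-cooling equipment and components have accordingly become much more visible over the past few years.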
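To get a feel for why rack-level power density rises, here is a minimal sketch of the arithmetic. Only the roughly 1000 W per accelerator comes from the text above. The counts of 8 accelerators per server and 4 servers per rack are hypothetical values chosen for illustration, and the estimate ignores CPUs, memory, networking, and cooling overhead.

```python
def rack_accelerator_power_kw(
    accelerator_watts: float,
    accelerators_per_server: int,
    servers_per_rack: int,
) -> float:
    """Power drawn by the accelerators alone in one rack, in kW."""
    total_watts = accelerator_watts * accelerators_per_server * servers_per_rack
    return total_watts / 1000.0


# 1000 W per accelerator is from the text; the other two values are assumptions.
estimate_kw = rack_accelerator_power_kw(
    accelerator_watts=1000,
    accelerators_per_server=8,   # hypothetical
    servers_per_rack=4,          # hypothetical
)
print(f"Accelerator power per rack: {estimate_kw:.0f} kW")
```

Even under these modest assumptions the accelerators alone account for tens of kilowatts per rack, which is why power delivery and liquid cooling matter so much.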
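Applications of CXL (Compute Express Link) as a memory expansion technology were also on display. CXL is one of the standards for connecting various kinds of devices, and its use for memory expansion is drawing particular attention. As generative AI models grow larger, memory sharing is expected to become increasingly important as a way to address memory capacity limits.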
Fujitsu technology exhibits and presentations
The Fujitsu booth
At the Fujitsu booth we exhibited ACB (AI Computing Broker), a middleware technology for using GPUs efficiently; Interactive HPC, a technology that enables immediate execution of large-scale jobs on supercomputers; MONAKA, Fujitsu's Arm-based processor; and a quantum computer. The quantum computer was shown as a half-scale mockup and drew a lot of attention, with many visitors stopping to photograph it.
Computing infrastructure technologies
Of the exhibits, I introduce here the computing infrastructure technologies that I presented at the booth.
- Interactive HPC
Generative AI, LLMs included, is rapidly growing in scale in pursuit of better performance. As a result, even AI researchers and developers with no prior exposure to large-scale computing environments increasingly need parallel computers such as supercomputers. However, supercomputers as a rule execute work in order (batch execution), so waiting time is unavoidable, and they have not been easy to use for these new users, who are accustomed to acquiring resources dynamically in the cloud. I have therefore been working on research and development of Interactive HPC, a technology that enables immediate execution even of large-scale jobs, with the aim of letting more people use supercomputers efficiently. At the booth we demonstrated time-sliced execution of multiple parallel programs on a 1000-node-class cluster and showed that policy search in a digital twin application is accelerated. Visitors who actually operate supercomputers expressed high expectations for the benefits of deploying Interactive HPC, and the exhibit was well received.
- ACB
Once a GPU is allocated to an application, it is generally not released until the application exits. However, the GPU is not necessarily 100% busy throughout execution, and idle time can occur. ACB is a middleware technology that realizes GPU sharing at the level of AI frameworks such as PyTorch and puts this idle time to use. In the demo we showed that multiple workloads, such as AlphaFold2 and LLM inference, can run efficiently on a limited number of GPUs. Because GPU shortages are a problem for many companies and computing centers, we received a large number of questions and inquiries about this technology.
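The idea of reclaiming idle GPU time can be illustrated with a small toy model. The sketch below is not how ACB is implemented. It simply compares two policies for one GPU and jobs that alternate CPU and GPU phases: holding the GPU for a job's whole lifetime versus granting it only during GPU phases. The workloads, function names, and the assumption that CPU phases overlap freely are all hypothetical.

```python
# Each job is a list of (kind, duration) phases; kind is "cpu" or "gpu".
Job = list[tuple[str, int]]


def exclusive_makespan(jobs: list[Job]) -> int:
    """The GPU is held from job start to job end, so jobs run back to back."""
    return sum(duration for job in jobs for _, duration in job)


def shared_makespan(jobs: list[Job]) -> int:
    """The GPU is granted only for GPU phases, first come first served.

    CPU phases of different jobs are assumed to overlap freely.
    """
    clock = [0] * len(jobs)       # current time of each job
    next_phase = [0] * len(jobs)  # index of each job's next phase
    gpu_free_at = 0
    while True:
        pending = [j for j in range(len(jobs)) if next_phase[j] < len(jobs[j])]
        if not pending:
            return max(clock, default=0)
        # Advance the job that is furthest behind in time, so that GPU
        # requests are served in order of arrival.
        j = min(pending, key=lambda k: (clock[k], k))
        kind, duration = jobs[j][next_phase[j]]
        if kind == "gpu":
            start = max(clock[j], gpu_free_at)  # wait while the GPU is busy
            gpu_free_at = start + duration
            clock[j] = gpu_free_at
        else:
            clock[j] += duration
        next_phase[j] += 1


def gpu_busy_time(jobs: list[Job]) -> int:
    """Total time the GPU actually computes, independent of the policy."""
    return sum(duration for job in jobs for kind, duration in job if kind == "gpu")


# Two hypothetical jobs that spend most of their time off the GPU.
jobs = [
    [("cpu", 3), ("gpu", 1), ("cpu", 3), ("gpu", 1)],
    [("cpu", 3), ("gpu", 1), ("cpu", 3), ("gpu", 1)],
]
busy = gpu_busy_time(jobs)
for label, makespan in (
    ("exclusive", exclusive_makespan(jobs)),
    ("shared", shared_makespan(jobs)),
):
    print(f"{label}: makespan={makespan}, GPU utilization={busy / makespan:.0%}")
```

With these made-up workloads the total completion time drops from 16 to 9 time units while the GPU does the same amount of work. Real workloads and a real broker behave differently, for example because of memory footprint and context switching costs, which this model ignores.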
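Industry-academia collaboration
- Presentation at the Institute of Science Tokyo booth
At the Institute of Science Tokyo exhibition booth, I gave a talk introducing Interactive HPC and describing how a digital twin application, one of its use cases, runs on TSUBAME. The talk explained the technology in detail and presented a case in which Interactive HPC was deployed on the Institute's TSUBAME4.0 to run the digital twin application. It showed that, because immediate execution becomes readily available even on a supercomputer, policy search can be carried out more efficiently than before.
- Exhibit at the University of Tsukuba booth and presentation at PDSW24
As AI grows in scale, the demands placed on data input and output are also rising. To meet these demands, we are conducting joint research with the University of Tsukuba on high-performance data stores. As part of this work, we gave a presentation on high-speed RPC (remote procedure call) technology at PDSW (Parallel Data Systems Workshop) 2024. A high-performance data store built from multiple computers must execute operations over the network, both to communicate with the outside and to manipulate data internally. We believe that a data store system with far higher performance than existing ones requires a new mechanism that tightly couples communication and data operations, and we are researching and developing high-speed RPC technology that combines RDMA (remote direct memory access) with highly parallel processing capability. We also displayed a poster on the joint research at the University of Tsukuba booth, presenting the efforts and results of this industry-academia collaboration on high-performance data store systems.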
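Summary
With the spread of generative AI, exhibits have expanded beyond computers themselves into a wide range of areas such as cooling and power supply, and there are more and more occasions to appreciate how broad the technology base has become. In the rankings, the top position continues to change hands, which shows that research and development is being pursued actively. Japan continues to maintain its presence in supercomputing, and I hope to keep contributing through my research and development work on computing infrastructure.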