This is the content presented at NVIDIA AI Summit Japan 2024. Presentation date: November 13, 2024. Event link: https://www.nvidia.com/ja-jp/events/ai-summit/ Presentation video: https://www.nvidia.com/ja-jp/on…
Hello, this is the さくらのナレッジ editorial team. At JANOG54 Meeting, held in July, 高峯 誠, 井上 喬視, and 平田 大祐 of SAKURA internet (さくらインターネット) gave a talk titled 「生成AI向けパブリッククラウドサービスをつくってみた話」 ("How we built a public cloud service for generative AI"). This report covers that talk. Infrastructure for generative AI: First, an overview in chronological order. In 2011, SAKURA internet opened its self-operated 石狩データセンタ (Ishikari Data Center) in Ishikari, Hokkaido. In September 2016 it launched the 「さくらの専用サーバ 高火力」 series, its first service to offer GPU computing resources, and in July 2020 it began offering 「さくらの専用サーバ PHY」, a new network platform for さくらの専用サーバ. Then, in 2021, it virtualized the GPU servers of さくらの専用サーバ 高火力, the service launched in 2016, and put them on さくらのクラウド as 「さくらのク…
On April 10 (local time), Meta of the US announced the second generation of "MTIA" (Meta Training and Inference Accelerator), its custom chip for accelerating AI training. MTIA, first announced in May of last year, is designed to run optimally for workloads such as ranking and ad recommendation on Meta's Facebook and Instagram, with the aim of making AI training more efficient and inference easier. The second-generation MTIA more than doubles its predecessor's compute bandwidth and memory bandwidth and, according to Meta, is "designed to efficiently serve the ranking and recommendation models that provide high-quality recommendations to users." In early tests it delivered three times the performance of its predecessor across four key models. By using a larger SRAM than a typical GPU, it can provide sufficient compute even when batch sizes are limited.
Introduction: I wrote in the blog post below that the L2 cache configuration changed with the NVIDIA A100. vengineer.hatenablog.com This time, I looked into how the L2 cache is used, now that its size has grown from 4MB on the P100 and 6MB on the V100 to 40MB (48MB) on the A100 and 50MB (60MB) on the H100. L2 cache of the NVIDIA GA100: The A100's L2 cache is 40MB (the GA100 has 48MB, but only 40MB is usable on the A100), a large increase over the V100's 6MB. As I wrote in the previous post, the GA100's L2 cache is split into two blocks of 20MB each, and each 20MB block consists of 40 units of 512KB. On the GA100, the six HBM2e …
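The capacity figures in this excerpt are simple arithmetic, so a quick check may help. The Python sketch below is only an illustration: the helper name `l2_capacity_mb` is hypothetical, the sizes are the ones quoted in the excerpt, and "MB" is assumed to mean 1024KB.

```python
# Usable L2 sizes quoted in the excerpt, in MB.
L2_SIZE_MB = {"P100": 4, "V100": 6, "A100": 40, "H100": 50}


def l2_capacity_mb(blocks: int, units_per_block: int, unit_kb: int) -> float:
    """Capacity of an L2 built from equal blocks of equal units.

    Assumption: 1 MB = 1024 KB.
    """
    return blocks * units_per_block * unit_kb / 1024


# A100 as described in the excerpt: 2 blocks, each made of 40 units of 512KB.
a100_block_mb = l2_capacity_mb(blocks=1, units_per_block=40, unit_kb=512)
a100_total_mb = l2_capacity_mb(blocks=2, units_per_block=40, unit_kb=512)

# How much larger the A100's usable L2 is than the V100's.
growth_vs_v100 = L2_SIZE_MB["A100"] / L2_SIZE_MB["V100"]

print(a100_block_mb, a100_total_mb, round(growth_vs_v100, 2))  # 20.0 40.0 6.67
```

The two-block layout reproduces the 20MB-per-block and 40MB-total figures, and the usable L2 comes out at roughly 6.7 times that of the V100.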
A research group led by akio kodaira, also known as 「あき先生」 (Aki-sensei), the developer of the AITuber 「しずく」 (Shizuku), announced "StreamDiffusion" on December 21, a pipeline optimized for real-time image generation. It achieves a dramatic speedup over conventional image generation pipelines. Faster denoising through batch processing: Image generation AI models such as "Stable Diffusion" have improved remarkably, but they still fall short in environments that need high throughput and low latency, such as the metaverse and online streaming. StreamDiffusion takes a new approach, converting the conventional sequential denoising into a batched process to achieve a high-throughput stream. In addition, to improve GPU utilization, it replaces conventional classifier-free guidance (CFG) with residual classifier-free guidance (RCFG
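To give a feel for why turning sequential denoising into a batched process raises throughput, here is a toy Python simulation. It is a sketch under stated assumptions, not the StreamDiffusion API: `denoise_batch` and `stream_batch` are hypothetical names, and the model call is replaced by a counter that simply advances each in-flight frame by one denoising step.

```python
from collections import deque


def denoise_batch(in_flight):
    """Stand-in for ONE batched model call (hypothetical).

    Every in-flight frame advances by one denoising step at once.
    """
    return [(frame_id, step + 1) for frame_id, step in in_flight]


def stream_batch(frames, num_steps):
    """Process a stream so that frames at different denoising steps share a batch."""
    pending = deque(frames)
    in_flight = deque()   # (frame_id, steps_done), oldest frame first
    finished = []
    batched_calls = 0
    while pending or in_flight:
        if pending:
            # A new frame joins the batch at step 0.
            in_flight.append((pending.popleft(), 0))
        in_flight = deque(denoise_batch(list(in_flight)))
        batched_calls += 1
        # The oldest frame leaves once it has completed all steps.
        while in_flight and in_flight[0][1] == num_steps:
            finished.append(in_flight.popleft()[0])
    return finished, batched_calls


frames = list(range(6))
done, calls = stream_batch(frames, num_steps=4)
sequential_calls = len(frames) * 4  # one model call per frame per step
print(done, calls, sequential_calls)  # [0, 1, 2, 3, 4, 5] 9 24
```

In this toy model, 6 frames with 4 denoising steps need 9 batched calls instead of 24 sequential ones, and after warm-up one frame completes per call. Real speedups depend on how well the GPU handles the larger batch.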
In the table above, "number of cages" is the number of physical port openings for the cluster network on the rear panel. The twin-port OSFP adopted in DGX H100 connects to two ConnectX-7 adapters (400Gbps each) inside the chassis, and is operated with two cables plugged into a single transceiver. The number of ports is therefore twice the number of cages. Thanks to twin-port OSFP, the cluster network section, which took up about half of the server's rear panel on DGX A100, now fits in the central part of the rear panel on DGX H100. You will often see the statement that "a single OSFP transceiver can communicate at 800Gbps," but note that this figure is merely the sum of the link bandwidths and does not mean that GPU communication between DGX servers runs at 800Gbps.
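The relationship between cages, ports, and bandwidth can be written out as a small Python sketch. The class name `ClusterNetwork` is hypothetical, and the cage count of 4 is only a placeholder, because the table this excerpt refers to is not reproduced here. The two ports per cage and 400Gbps per port come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterNetwork:
    cages: int            # physical OSFP openings on the rear panel
    ports_per_cage: int   # 2 for twin-port OSFP
    gbps_per_port: int    # each port is backed by one ConnectX-7

    @property
    def ports(self) -> int:
        """Total cluster-network ports (cables)."""
        return self.cages * self.ports_per_cage

    @property
    def summed_gbps_per_cage(self) -> int:
        """Sum of the link speeds behind one transceiver (the '800Gbps' figure)."""
        return self.ports_per_cage * self.gbps_per_port

    @property
    def max_gbps_per_link(self) -> int:
        """Upper bound for traffic on any single link."""
        return self.gbps_per_port


# cages=4 is a placeholder value; take the real count from the table.
net = ClusterNetwork(cages=4, ports_per_cage=2, gbps_per_port=400)
print(net.ports, net.summed_gbps_per_cage, net.max_gbps_per_link)  # 8 800 400
```

The 800 here is just 2 x 400: each cable still terminates on its own 400Gbps adapter, which is why the summed figure should not be read as a single 800Gbps link.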
NVIDIA Deep Learning Performance Documentation - Last updated February 1, 2023 Get Started With Deep Learning Performance This is the landing page for our deep learning performance documentation. This page provides recommendations that apply to most deep learning operations. It also provides links, short explanations of other performance documents, and how these pages fit together. Training Train