NVIDIA MAGNUM IO SDK
The IO Subsystem for the Modern, GPU-Accelerated Data Center
The NVIDIA Magnum IO™ software development kit (SDK) enables developers to remove input/output (IO) bottlenecks in AI, high performance computing (HPC), data science, and visualization applications, reducing the end-to-end time of their workflows. Magnum IO covers all aspects of data movement between CPUs, GPUs, DPUs, and storage subsystems in virtualized, containerized, and bare-metal environments.
Magnum IO for Cloud-Native Supercomputing Architecture
Magnum IO, the IO subsystem of the data center, introduces the enhancements needed to accelerate IO and communications in multi-tenant data centers, known as Magnum IO for Cloud-Native Supercomputing.
Volumetric Video Leveraging Magnum IO and Verizon 5G
Magnum IO GPUDirect over an InfiniBand network enables Verizon's breakthrough distributed volumetric video architecture. By placing their technology into edge computing centers, located at sports centers around the United States and in Verizon facilities, they're able to bring 3D experiences to media and serve up new options for putting you in the game.
Magnum IO Ecosystem
Flexible Abstractions
Magnum IO enables AI, data analytics, visualization, and HPC developers to innovate and accelerate applications built using common high-level abstractions and APIs.
Architected for Scale
Magnum IO technologies scale computation up to multiple GPUs via NVLink and PCIe, and out across multiple nodes over InfiniBand and Ethernet, at data center scale.
Advanced IO Management
Advanced telemetry and monitoring built with NVIDIA NetQ™ and NVIDIA UFM® help users configure, troubleshoot, and fine-tune the interconnect infrastructure for peak performance.
Magnum IO Components
Network IO
- NCCL
- NVSHMEM
- UCX
- GPUDirect RDMA
- MOFED
Storage IO
- GPUDirect Storage
- SNAP
In-Network Computing
- Hardware tag matching
- NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)™
IO Management
- NVIDIA NetQ
- NVIDIA UFM
Accelerating IO Across Applications
Deep Learning
Magnum IO networking provides both point-to-point functions, such as send and receive, and collectives, such as AllReduce, for deep learning training at scale. The collective APIs hide low-level optimizations like topology detection, peer-to-peer copy, and multi-threading, simplifying deep learning training code. Send/receive lets users accelerate giant deep learning models too large to fit in one GPU's memory. GPUDirect Storage can also alleviate IO bottlenecks from local or remote storage by bypassing bounce buffers on the CPU host.
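To make the collective path concrete, here is a minimal sketch of a single-process, multi-GPU AllReduce with NCCL, the Magnum IO library behind these collective APIs. The eight-GPU cap, buffer size, and omitted error checking are simplifying assumptions for illustration, not a definitive implementation:

```cuda
// Sketch: single-process AllReduce across all visible GPUs with NCCL.
// The 8-GPU cap, sizes, and missing error checks are illustrative only.
#include <nccl.h>
#include <cuda_runtime.h>

int main(void) {
    int ndev = 0;
    cudaGetDeviceCount(&ndev);           // assumes ndev <= 8 for this sketch

    ncclComm_t comms[8];
    float* sendbuf[8];
    float* recvbuf[8];
    cudaStream_t streams[8];
    const size_t count = 1 << 20;        // 1M floats per GPU

    for (int i = 0; i < ndev; ++i) {
        cudaSetDevice(i);
        cudaMalloc((void**)&sendbuf[i], count * sizeof(float));
        cudaMalloc((void**)&recvbuf[i], count * sizeof(float));
        cudaStreamCreate(&streams[i]);
    }

    // One communicator per GPU; NCCL discovers the topology (NVLink, PCIe,
    // network) and picks transfer paths internally.
    ncclCommInitAll(comms, ndev, NULL);

    // Grouped AllReduce: element-wise sum of sendbuf across GPUs into recvbuf.
    ncclGroupStart();
    for (int i = 0; i < ndev; ++i)
        ncclAllReduce(sendbuf[i], recvbuf[i], count, ncclFloat, ncclSum,
                      comms[i], streams[i]);
    ncclGroupEnd();

    for (int i = 0; i < ndev; ++i) {
        cudaSetDevice(i);
        cudaStreamSynchronize(streams[i]);
        ncclCommDestroy(comms[i]);
    }
    return 0;
}
```

The same ncclAllReduce call scales across nodes over InfiniBand or Ethernet; only the communicator setup changes (one rank per process via ncclCommInitRank).

For the storage path, a similarly hedged sketch of GPUDirect Storage through the cuFile API, reading a file straight into GPU memory without a CPU bounce buffer; the file path here is hypothetical:

```cuda
// Sketch: GPUDirect Storage read into device memory via the cuFile API.
// The file path is hypothetical; error handling is elided for brevity.
#define _GNU_SOURCE                      // for O_DIRECT
#include <cufile.h>
#include <cuda_runtime.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    const size_t size = 1 << 20;
    void* devPtr = NULL;

    cuFileDriverOpen();                          // initialize the GDS driver
    cudaMalloc(&devPtr, size);

    // O_DIRECT keeps reads out of the page cache so DMA can target the GPU.
    int fd = open("/data/sample.bin", O_RDONLY | O_DIRECT);

    CUfileDescr_t descr;
    memset(&descr, 0, sizeof(descr));
    descr.handle.fd = fd;
    descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;

    CUfileHandle_t handle;
    cuFileHandleRegister(&handle, &descr);
    cuFileBufRegister(devPtr, size, 0);          // optional: pre-pin GPU buffer

    // DMA from storage directly into GPU memory, skipping the CPU bounce buffer.
    cuFileRead(handle, devPtr, size, /*file_offset=*/0, /*devPtr_offset=*/0);

    cuFileBufDeregister(devPtr);
    cuFileHandleDeregister(handle);
    close(fd);
    cudaFree(devPtr);
    cuFileDriverClose();
    return 0;
}
```

When the kernel driver or filesystem does not support GDS, cuFile can fall back to a compatibility path through host memory, so the same code still runs.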
High-Performance Computing
To unlock next-generation discoveries, scientists rely on simulation to better understand complex molecules for drug discovery, physics for new sources of energy, and atmospheric data to better predict extreme weather patterns. Magnum IO exposes hardware-level acceleration engines and smart offloads, such as RDMA, GPUDirect, and NVIDIA SHARP, alongside the 400Gb/s bandwidth and ultra-low latency of NVIDIA Quantum-2 InfiniBand networking.
In multi-tenant environments, user applications can suffer indiscriminate interference from neighboring applications' traffic. Magnum IO, on the latest NVIDIA Quantum-2 InfiniBand platform, features new and improved capabilities to mitigate that impact on a user's performance, delivering optimal results and the most efficient high performance computing (HPC) and machine learning deployments at any scale.
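Although this page stops at the platform level, the flavor of these offloads can be sketched with NVSHMEM (listed under Resources below), which lets a GPU kernel initiate one-sided, RDMA-style puts directly. The ring pattern, launch configuration, and launcher below are illustrative assumptions:

```cuda
// Sketch: GPU-initiated one-sided communication with NVSHMEM (ring put).
// Run one PE per GPU, e.g. nvshmrun -np <npes> ./ring (or an MPI launcher).
#include <nvshmem.h>
#include <nvshmemx.h>
#include <cuda_runtime.h>
#include <stdio.h>

// Each PE writes its rank into the next PE's symmetric buffer; across nodes
// the put travels over RDMA/GPUDirect paths without touching the host CPU.
__global__ void ring_put(int* dest, int mype, int npes) {
    nvshmem_int_p(dest, mype, (mype + 1) % npes);
}

int main(void) {
    nvshmem_init();
    int mype = nvshmem_my_pe();
    int npes = nvshmem_n_pes();
    cudaSetDevice(nvshmem_team_my_pe(NVSHMEMX_TEAM_NODE)); // one GPU per local PE

    int* dest = (int*)nvshmem_malloc(sizeof(int));  // symmetric allocation

    ring_put<<<1, 1>>>(dest, mype, npes);
    cudaDeviceSynchronize();
    nvshmem_barrier_all();                          // complete and order the puts

    int value = -1;
    cudaMemcpy(&value, dest, sizeof(int), cudaMemcpyDeviceToHost);
    printf("PE %d received %d\n", mype, value);

    nvshmem_free(dest);
    nvshmem_finalize();
    return 0;
}
```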
Data Analytics
Data science and machine learning are the world's largest compute segments. Modest improvements in the accuracy of predictive machine learning models can translate into billions of dollars. To enhance accuracy, the RAPIDS™ Accelerator for Apache Spark library has a built-in shuffle based on NVIDIA UCX® that can leverage GPU-to-GPU communication and RDMA capabilities. Combined with NVIDIA networking, Magnum IO, GPU-accelerated Spark 3.0, and RAPIDS, the NVIDIA data center platform can speed up these huge workloads at unprecedented levels of performance and efficiency.
Resources
- Magnum IO Developer Environment Documentation
- GPUDirect Storage: A Direct Path Between Storage and GPU Memory
- Accelerating IO in the Modern Data Center: Network IO
- Accelerating NVSHMEM 2.0 Team-Based Collectives Using NCCL
- Optimizing Data Movement in GPU Applications with the NVIDIA Magnum IO Developer Environment
- Access MOFED
Get Started Using the Magnum IO Developer Environment
The Magnum IO Developer Environment is available as a container with the latest versions of all libraries, development tools, and profiling tools needed to begin development and optimization. The optimized applications can then be run in virtualized, containerized, or bare-metal environments.