Integrated supercomputing architecture

AI Hypercomputer

AI-optimized hardware, software, and consumption models, combined to improve productivity and efficiency.

Blog: Introducing Cloud TPUv5 and AI Hypercomputer

Overview

Performance-optimized hardware

Our performance-optimized infrastructure, including Google Cloud TPU, Google Cloud GPU, Google Cloud Storage, and the underlying Jupiter network, consistently provides the fastest time to train for large-scale, state-of-the-art models thanks to the architecture's strong scaling characteristics, which also lead to the best price/performance for serving large models.

Power your LLMs with Google Cloud TPU

Learn how Google Cloud TPU, Google Cloud's custom-designed AI accelerator, optimizes performance for your LLM workloads.

Watch on-demand

Open software

Our architecture is optimized to support the most common tools and libraries, such as TensorFlow, PyTorch, and JAX. It also lets customers take advantage of technologies such as Cloud TPU Multislice and Multihost configurations and managed services like Google Kubernetes Engine. Together, these let customers deliver turnkey deployments for common workloads, such as the NVIDIA NeMo framework orchestrated by Slurm.

Open LLMs on GKE-Llama 2 and Beyond

Discover how you can take your gen AI platform game to the next level with Open LLMs on GKE-Llama 2 and Beyond.

Watch on-demand

Flexible consumption

Our flexible consumption models let customers choose fixed costs with committed use discounts or dynamic on-demand models to meet their business needs. Dynamic Workload Scheduler helps customers get the capacity they need without over-allocating, so they pay only for what they need. In addition, Google Cloud's cost optimization tools help automate resource utilization, reducing manual tasks for engineers.
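To make the trade-off between committed and on-demand pricing concrete, here is a minimal sketch of a break-even calculation. The hourly rates are hypothetical placeholders, not Google Cloud prices, and the model assumes a commitment is billed for every hour of the term while on-demand is billed only for hours actually used.

```python
def break_even_utilization(on_demand_rate: float, committed_rate: float) -> float:
    """Return the utilization (0..1) above which a commitment is cheaper.

    Assumption: the commitment bills `committed_rate` for every hour in the
    term, while on-demand bills `on_demand_rate` only for hours used.
    """
    if on_demand_rate <= 0 or committed_rate <= 0:
        raise ValueError("rates must be positive")
    return committed_rate / on_demand_rate


def cheaper_option(utilization: float, on_demand_rate: float, committed_rate: float) -> str:
    """Name the cheaper model for a given expected utilization."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    on_demand_cost = on_demand_rate * utilization  # cost per hour of the term
    committed_cost = committed_rate                # paid whether used or not
    return "committed" if committed_cost < on_demand_cost else "on-demand"


# Hypothetical rates, for illustration only.
ON_DEMAND_RATE = 10.0  # currency units per accelerator-hour
COMMITTED_RATE = 6.0   # currency units per accelerator-hour

threshold = break_even_utilization(ON_DEMAND_RATE, COMMITTED_RATE)
choice_low = cheaper_option(0.40, ON_DEMAND_RATE, COMMITTED_RATE)
choice_high = cheaper_option(0.85, ON_DEMAND_RATE, COMMITTED_RATE)
print(f"break-even utilization: {threshold:.0%}")
print(f"at 40% utilization: {choice_low}; at 85% utilization: {choice_high}")
```

Under these assumed rates, a workload that keeps its accelerators busy more than 60% of the time favors the commitment, while burstier workloads favor on-demand capacity.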

Optimize resource access and economics for AI/ML workloads

Learn how the Dynamic Workload Scheduler service optimizes your AI workload execution.

Read the blog

How It Works

Google is a leader in artificial intelligence, having invented technologies like TensorFlow. You can use Google's technology for your own projects: learn about Google's history of innovation in AI infrastructure and how to apply it to your workloads.

Watch on-demand

Google Cloud AI Hypercomputer architecture diagram alongside the Google Cloud product manager Chelsie's photo

Common Uses

Run large-scale AI training

How-tos

Powerful, scalable, and efficient AI training

The AI Hypercomputer architecture lets you choose the underlying infrastructure that best scales to meet your training needs.

How to define a storage infrastructure for AI workloads

Three Charts Describing AI Growth Factors

Additional resources

Measure the effectiveness of your large-scale training the Google way with ML Productivity Goodput.

Introducing ML Productivity Goodput: a metric to measure AI system efficiency
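As a rough illustration of the idea behind a goodput-style metric, the sketch below computes the fraction of a training job's wall-clock time that went to useful progress. This is a simplified assumption for illustration, not the official definition of ML Productivity Goodput. The function name and the job figures are hypothetical.

```python
def goodput(productive_seconds: float, total_seconds: float) -> float:
    """Fraction of wall-clock time spent making useful training progress.

    Assumption: "productive" time excludes time lost to things like
    restarts, checkpoint reloads, and waiting for capacity.
    """
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    if not 0 <= productive_seconds <= total_seconds:
        raise ValueError("productive_seconds must be within [0, total_seconds]")
    return productive_seconds / total_seconds


# Hypothetical job: 100 hours of wall-clock time, 6 hours lost to interruptions.
total = 100 * 3600
lost = 6 * 3600
job_goodput = goodput(total - lost, total)
print(f"goodput: {job_goodput:.2%}")
```

Tracking a ratio like this over time shows whether infrastructure changes are actually converting more of the reserved capacity into training progress.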

Training speed: TPU v4 (bf16) vs. TPU v5 (int8)

Customer examples

Character AI leverages Google Cloud to scale up

"We need GPUs to generate responses to users' messages. And as we get more users on our platform, we need more GPUs to serve them. So on Google Cloud, we can experiment to find what is the right platform for a particular workload. It's great to have that flexibility to choose which solutions are most valuable." Myle Ott, Founding Engineer, Character.AI

Deliver AI-powered applications