Install GPU drivers


After you create a virtual machine (VM) instance with one or more GPUs, your system requires NVIDIA device drivers so that your applications can access the device. Make sure that your VM has enough free disk space: choose at least 40 GB for the boot disk when you create the VM.

To install the drivers, you have two options to choose from:

If you need GPUs for hardware-accelerated 3D graphics such as remote desktop or gaming, see Install drivers for NVIDIA RTX Virtual Workstations (vWS).
For other workloads, follow the instructions in this document to install the NVIDIA driver.

Pro Tip: Alternatively, you can skip this setup by creating VMs with Deep Learning VM images. Deep Learning VM images have NVIDIA drivers pre-installed, and also include other machine learning applications such as TensorFlow and PyTorch.

NVIDIA driver, CUDA toolkit, and CUDA runtime versions

There are different versioned components of drivers and runtime that might be needed in your environment. These include the following components:

NVIDIA driver
CUDA toolkit
CUDA runtime

When installing these components, you can configure your environment to suit your needs. For example, if you have an earlier version of TensorFlow that works best with an earlier version of the CUDA toolkit, but the GPU that you want to use requires a later version of the NVIDIA driver, then you can install an earlier version of the CUDA toolkit along with a later version of the NVIDIA driver.

However, you must make sure that your NVIDIA driver and CUDA toolkit versions are compatible. For CUDA toolkit and NVIDIA driver compatibility, see the NVIDIA documentation about CUDA compatibility.

Required NVIDIA driver versions

For NVIDIA GPUs running on Compute Engine, the following NVIDIA driver versions are recommended.

Machine series	NVIDIA GPU model	Linux recommended driver	Windows recommended driver
A3	H100	550.90.07	N/A
G2	L4	550.90.07	538.67
A2	A100	550.90.07	538.67
N1	T4, P4, P100, and V100	535.183.01	538.67
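
As a rough illustration of how the table above might be used, the following sketch compares an installed driver version with the recommended Linux version for a machine series. The function names (recommended_linux_driver, version_ge, check_driver) are hypothetical helpers, not part of any NVIDIA or Google tool, and the comparison assumes GNU sort with the -V option is available.

```shell
#!/bin/bash
# Recommended Linux driver versions, copied from the table above.
recommended_linux_driver() {
  case "$1" in
    A3|G2|A2) echo "550.90.07" ;;
    N1)       echo "535.183.01" ;;
    *)        return 1 ;;
  esac
}

# Succeeds when version $1 is greater than or equal to version $2.
# Assumption: GNU sort with -V (version sort) is installed.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: $1 is the machine series, $2 the installed driver version.
check_driver() {
  series="$1"
  installed="$2"
  wanted="$(recommended_linux_driver "$series")" || {
    echo "unknown series: $series"
    return 2
  }
  if version_ge "$installed" "$wanted"; then
    echo "ok: $installed >= $wanted"
  else
    echo "older than recommended: $installed < $wanted"
    return 1
  fi
}
```

On a VM with a working driver, the installed version could be passed in from nvidia-smi, for example: check_driver N1 "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)".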

Install GPU drivers on VMs by using NVIDIA guides

One way to install the NVIDIA driver on most VMs is to install the NVIDIA CUDA Toolkit.

To install the CUDA toolkit, complete the following steps:

Select a CUDA toolkit that supports the minimum driver that you need.
Connect to the VM where you want to install the driver.

On your VM, download and install the CUDA toolkit. The installation package and guide for the minimum recommended toolkit are listed in the following table. Before you install the toolkit, complete the pre-installation steps in the installation guide.

Machine series	NVIDIA GPU model	Linux recommended CUDA toolkit	Windows recommended CUDA toolkit
A3	H100	Download link: CUDA Toolkit 12.4 Update 1 Installation guide: CUDA 12.4 installation guide	N/A
G2	L4	Download link: CUDA Toolkit 12.4 Update 1 Installation guide: CUDA 12.4 installation guide	Download link: CUDA Toolkit 12.2 Update 2 Installation guide: CUDA 12.2 installation guide
A2	A100
N1	T4 V100 P100 P4	Download link: CUDA Toolkit 12.2 Installation guide: CUDA 12.2 installation guide	Download link: CUDA Toolkit 12.2 Installation guide: CUDA 12.2 installation guide

Install GPU drivers on VMs by using installation script

You can use the following scripts to automate the installation process. To review these scripts, see the GitHub repository.

Linux

Use these instructions to install GPU drivers on a running VM.

Supported operating systems

The Linux installation script was tested on the following operating systems:

Debian 10, 11, and 12
Red Hat Enterprise Linux (RHEL) 8 and 9
Rocky Linux 8 and 9
Ubuntu 20, 22, and 24

If you use this script on other operating systems, the installation might fail. This script can install the NVIDIA driver as well as the CUDA Toolkit. To install the GPU drivers and CUDA Toolkit, complete the following steps:

If you have version 2.38.0 or later of the Ops Agent collecting GPU metrics on your VM, you must stop the agent before you can install or upgrade your GPU drivers using this installation script.

After you have completed the installation or upgrade of the GPU driver, you must then reboot the VM.

To stop the Ops Agent, run the following command:
```
sudo systemctl stop google-cloud-ops-agent
```
Ensure that Python 3 is installed on your operating system.

Download the installation script.

```
curl -L https://github.com/GoogleCloudPlatform/compute-gpu-installation/releases/download/cuda-installer-v1.1.0/cuda_installer.pyz --output cuda_installer.pyz
```

Run the installation script.
```
sudo python3 cuda_installer.pyz install_driver
```
The script takes some time to run and might restart your VM. If the VM restarts, run the script again to continue the installation.
Verify the installation. See Verify the GPU driver install.
You can also use this tool to install the CUDA Toolkit. To install the CUDA Toolkit, run the following command:
```
sudo python3 cuda_installer.pyz install_cuda
```
This script can take 30 minutes or more to run and might restart your VM. If the VM restarts, run the script again to continue the installation.

Verify the CUDA toolkit installation.

```
sudo python3 cuda_installer.pyz verify_cuda
```
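
To make the order of the steps above easier to see, the following sketch prints the commands for a given situation without running them. The function name plan_gpu_install and its two arguments are hypothetical and exist only for illustration; the commands it prints are the ones shown in the steps above.

```shell
#!/bin/bash
# Release URL copied from the download step above.
INSTALLER_URL="https://github.com/GoogleCloudPlatform/compute-gpu-installation/releases/download/cuda-installer-v1.1.0/cuda_installer.pyz"

# Prints (does not run) the commands for the steps above, in order.
# $1: "yes" if an Ops Agent (2.38.0 or later) is collecting GPU metrics.
# $2: installer mode, either install_driver or install_cuda.
plan_gpu_install() {
  ops_agent="$1"
  mode="$2"
  case "$mode" in
    install_driver|install_cuda) ;;
    *) echo "unsupported mode: $mode" >&2; return 1 ;;
  esac
  if [ "$ops_agent" = "yes" ]; then
    echo "sudo systemctl stop google-cloud-ops-agent"
  fi
  echo "curl -L $INSTALLER_URL --output cuda_installer.pyz"
  echo "sudo python3 cuda_installer.pyz $mode"
}
```

For example, plan_gpu_install yes install_driver prints three commands, starting with the one that stops the Ops Agent. If the VM restarts during the real installation, the last command is the one to run again.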

Linux (startup script)

Use these instructions to install GPU drivers during startup of a VM.

Supported operating systems

The Linux installation script was tested on the following operating systems:

Debian 10, 11, and 12
Red Hat Enterprise Linux (RHEL) 8 and 9
Rocky Linux 8 and 9
Ubuntu 20, 22, and 24

If you use this script on other operating systems, the installation might fail. This script can install the NVIDIA driver as well as the CUDA Toolkit.

Use the following startup script to automate the driver and CUDA Toolkit installation:

```
#!/bin/bash
if test -f /opt/google/cuda-installer
then
  exit
fi

mkdir -p /opt/google/cuda-installer
cd /opt/google/cuda-installer/ || exit

curl -fSsL -O https://github.com/GoogleCloudPlatform/compute-gpu-installation/releases/download/cuda-installer-v1.1.0/cuda_installer.pyz
python3 cuda_installer.pyz install_cuda
```

Windows

This installation script can be used on VMs that have secure boot enabled.

For Windows VMs that use a G2 machine series, this script installs only the NVIDIA driver.
For other machine types, the script installs the NVIDIA driver and CUDA toolkit.

Open a PowerShell terminal as an administrator, then complete the following steps:

If you are using Windows Server 2016, set the Transport Layer Security (TLS) version to 1.2.
```
[Net.ServicePointManager]::SecurityProtocol = 'Tls12'
```

Download the script.

```
Invoke-WebRequest https://github.com/GoogleCloudPlatform/compute-gpu-installation/raw/main/windows/install_gpu_driver.ps1 -OutFile C:\install_gpu_driver.ps1
```

Run the script.
```
C:\install_gpu_driver.ps1
```
The script takes some time to run. No command prompts are given during the installation process. Once the script exits, the driver is installed.

This script installs the drivers in the following default location on your VM: C:\Program Files\NVIDIA Corporation\.
Verify the installation. See Verify the GPU driver install.

Install GPU drivers (Secure Boot VMs)

These instructions are for installing GPU drivers on Linux VMs that use Secure Boot.

If you are using either a Windows VM or a Linux VM that doesn't use Secure Boot, follow the instructions in one of the preceding sections instead: Install GPU drivers on VMs by using NVIDIA guides, or Install GPU drivers on VMs by using installation script.

Driver installation on a Secure Boot VM is different for Linux VMs, because these VMs require all kernel modules to be signed by a key that the system trusts.

These instructions are available only for Secure Boot Linux VMs that run Ubuntu 18.04, 20.04, or 22.04. Support for more Linux operating systems is in progress.

To install GPU drivers on your Ubuntu VMs that use Secure Boot, complete the following steps:

Connect to the VM where you want to install the driver.
Update the repository.
```
  sudo apt-get update
```
Search for the most recent NVIDIA kernel module package or the version you want. This package contains NVIDIA kernel modules signed by the Ubuntu key. To select an earlier version, increase the number passed to the tail parameter. For example, specify tail -n 2.
Ubuntu PRO and LTS
For Ubuntu PRO and LTS, run the following command:
```
NVIDIA_DRIVER_VERSION=$(sudo apt-cache search 'linux-modules-nvidia-[0-9]+-gcp$' | awk '{print $1}' | sort | tail -n 1 | head -n 1 | awk -F"-" '{print $4}')
```
Ubuntu PRO FIPS
For Ubuntu PRO FIPS, run the following commands:
1. Enable Ubuntu FIPS updates.
```
  sudo ua enable fips-updates
```
2. Reboot the VM.
```
  sudo shutdown -r now
```
3. Get the latest package.
```
  NVIDIA_DRIVER_VERSION=$(sudo apt-cache search 'linux-modules-nvidia-[0-9]+-gcp-fips$' | awk '{print $1}' | sort | tail -n 1 | head -n 1 | awk -F"-" '{print $4}')
```
You can check the selected driver version by running echo $NVIDIA_DRIVER_VERSION. The output is a version string like 455.
Install the kernel module package and corresponding NVIDIA driver.

Note: Installing the package might upgrade your kernel.
```
  sudo apt install linux-modules-nvidia-${NVIDIA_DRIVER_VERSION}-gcp nvidia-driver-${NVIDIA_DRIVER_VERSION}
```
If the command fails with a package-not-found error, the latest NVIDIA driver might be missing from the repository. Retry the previous step and select an earlier driver version by changing the tail number.
Verify that the NVIDIA driver is installed. You might need to reboot the VM.
If you rebooted the system to verify the NVIDIA version, reset the NVIDIA_DRIVER_VERSION variable after the reboot by rerunning the command that you used in step 3 (the package search).
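
To show how the tail parameter in the package search step selects a driver version, the following sketch runs the same pipeline on made-up input. The function names pick_driver_version and sample_search_output are hypothetical, and the sample lines are invented for illustration; real apt-cache search output depends on your repository.

```shell
#!/bin/bash
# Reads `apt-cache search` style lines on stdin and prints the driver branch
# of the N-th newest package (N=1 is the newest), mirroring the pipeline
# used in the search step.
pick_driver_version() {
  n="${1:-1}"
  awk '{print $1}' | sort | tail -n "$n" | head -n 1 | awk -F"-" '{print $4}'
}

# Made-up sample lines for illustration only.
sample_search_output() {
  cat <<'EOF'
linux-modules-nvidia-470-gcp - sample description
linux-modules-nvidia-535-gcp - sample description
linux-modules-nvidia-550-gcp - sample description
EOF
}

sample_search_output | pick_driver_version 1   # newest branch in the sample
sample_search_output | pick_driver_version 2   # one branch earlier
```

Because the package name is split on hyphens, the fourth field is the driver branch number. Note that sort orders the names as text, so this approach assumes that all branch numbers have the same number of digits.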

Configure APT to use the NVIDIA package repository.

To help APT pick the correct dependency, pin the repositories as follows:

```
sudo tee /etc/apt/preferences.d/cuda-repository-pin-600 > /dev/null <<EOL
Package: nsight-compute
Pin: origin *ubuntu.com*
Pin-Priority: -1

Package: nsight-systems
Pin: origin *ubuntu.com*
Pin-Priority: -1

Package: nvidia-modprobe
Pin: release l=NVIDIA CUDA
Pin-Priority: 600

Package: nvidia-settings
Pin: release l=NVIDIA CUDA
Pin-Priority: 600

Package: *
Pin: release l=NVIDIA CUDA
Pin-Priority: 100
EOL
```

Install software-properties-common. This is required if you are using Ubuntu minimal images.
```
  sudo apt install software-properties-common
```
Set the Ubuntu version.
Ubuntu 18.04
For Ubuntu 18.04, run the following command:
```
export UBUNTU_VERSION=ubuntu1804/x86_64
```
Ubuntu 20.04
For Ubuntu 20.04, run the following command:
```
export UBUNTU_VERSION=ubuntu2004/x86_64
```
Ubuntu 22.04
For Ubuntu 22.04, run the following command:
```
export UBUNTU_VERSION=ubuntu2204/x86_64
```
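
As an alternative to choosing the export command by hand, the following sketch derives the same value from an Ubuntu version number. The function name ubuntu_repo_path is hypothetical; it only reproduces the three values listed above and rejects any other version.

```shell
#!/bin/bash
# Maps an Ubuntu version such as 22.04 to the repository path used above.
# Only the three versions covered by these instructions are accepted.
ubuntu_repo_path() {
  case "$1" in
    18.04|20.04|22.04)
      echo "ubuntu$(printf '%s' "$1" | tr -d '.')/x86_64"
      ;;
    *)
      echo "unsupported Ubuntu version: $1" >&2
      return 1
      ;;
  esac
}
```

On the VM, the version number could come from the standard /etc/os-release file, for example: . /etc/os-release && export UBUNTU_VERSION="$(ubuntu_repo_path "$VERSION_ID")".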

Download the cuda-keyring package.

```
wget https://developer.download.nvidia.com/compute/cuda/repos/$UBUNTU_VERSION/cuda-keyring_1.0-1_all.deb
```

Install the cuda-keyring package.
```
sudo dpkg -i cuda-keyring_1.0-1_all.deb
```

Add the NVIDIA repository.

```
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/$UBUNTU_VERSION/ /"
```

If prompted, select the default action to keep your current version.

Find the compatible CUDA driver version.

The following script determines the latest CUDA driver version that is compatible with the NVIDIA driver we just installed:
```
  CUDA_DRIVER_VERSION=$(apt-cache madison cuda-drivers | awk '{print $3}' | sort -r | while read line; do
     if dpkg --compare-versions $(dpkg-query -f='${Version}\n' -W nvidia-driver-${NVIDIA_DRIVER_VERSION}) ge $line ; then
        echo "$line"
        break
     fi
  done)
```
You can check the CUDA driver version by running echo $CUDA_DRIVER_VERSION. The output is a version string like 455.32.00-1.
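
The selection logic in the script above (take the newest candidate that is not newer than the installed driver) can be tried out without a package database. The following sketch is a stand-in, not the script itself: it uses GNU sort -V instead of dpkg --compare-versions, which can order Debian version strings differently in edge cases such as epochs or tildes. The function names and all version numbers in the usage note are hypothetical.

```shell
#!/bin/bash
# Succeeds when version $1 is less than or equal to version $2.
# Assumption: GNU sort -V is used in place of dpkg --compare-versions.
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Reads candidate cuda-drivers versions on stdin (one per line) and prints
# the newest one that is not newer than the installed driver version in $1.
latest_compatible() {
  installed="$1"
  sort -rV | while read -r candidate; do
    if version_le "$candidate" "$installed"; then
      echo "$candidate"
      break
    fi
  done
}
```

For example, with made-up candidates, printf '%s\n' 470.57.02-1 455.32.00-1 460.27.04-1 | latest_compatible 455.45.01 prints 455.32.00-1, because the other two candidates are newer than the installed driver.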

Install CUDA drivers with the version identified from the previous step.

```
  sudo apt install cuda-drivers-${NVIDIA_DRIVER_VERSION}=${CUDA_DRIVER_VERSION} cuda-drivers=${CUDA_DRIVER_VERSION}
```

Optional: Hold back dkms packages.

After enabling Secure Boot, all kernel modules must be signed to be loaded. Kernel modules built by dkms don't work on the VM because they aren't properly signed by default. This is an optional step, but it can help prevent you from accidentally installing other dkms packages in the future.

To hold dkms packages, run the following command:
```
  sudo apt-get remove dkms && sudo apt-mark hold dkms
```

Install CUDA toolkit and runtime.

Pick the suitable CUDA version. The following script determines the latest CUDA version that is compatible with the CUDA driver we just installed:

```
  CUDA_VERSION=$(apt-cache showpkg cuda-drivers | grep -o 'cuda-runtime-[0-9][0-9]-[0-9],cuda-drivers [0-9\\.]*' | while read line; do
     if dpkg --compare-versions ${CUDA_DRIVER_VERSION} ge $(echo $line | grep -Eo '[[:digit:]]+\.[[:digit:]]+') ; then
        echo $(echo $line | grep -Eo '[[:digit:]]+-[[:digit:]]')
        break
     fi
  done)
```

You can check the CUDA version by running echo $CUDA_VERSION. The output is a version string like 11-1.
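
The two inner grep calls in the script above each pull one field out of a matched line. The following sketch applies them to a single made-up line so that the extracted values are visible. The sample line and the function names paired_driver_version and cuda_suffix are hypothetical; only the grep patterns come from the script.

```shell
#!/bin/bash
# One made-up line in the shape that the first grep in the script emits.
sample_line='cuda-runtime-11-1,cuda-drivers 455.23.05'

# Driver version (major.minor only) that the line pairs with the runtime.
paired_driver_version() {
  printf '%s\n' "$1" | grep -Eo '[[:digit:]]+\.[[:digit:]]+'
}

# CUDA version in the hyphenated form used in package names, such as 11-1.
cuda_suffix() {
  printf '%s\n' "$1" | grep -Eo '[[:digit:]]+-[[:digit:]]'
}

paired_driver_version "$sample_line"
cuda_suffix "$sample_line"
```

The script compares the first value with CUDA_DRIVER_VERSION and, for the first line where the installed driver is new enough, keeps the second value as CUDA_VERSION.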

Install the CUDA package.

```
  sudo apt install cuda-${CUDA_VERSION}
```

Verify the CUDA installation.
```
  sudo nvidia-smi
  /usr/local/cuda/bin/nvcc --version
```
The first command prints the GPU information. The second command prints the installed CUDA compiler version.

Verify the GPU driver install

After completing the driver installation steps, verify that the driver installed and initialized properly.

Linux

Connect to the Linux instance and use the nvidia-smi command to verify that the driver is running properly.

```
sudo nvidia-smi
```

The output is similar to the following:

Tue Mar 21 19:50:15 2023
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       On  |   00000000:00:04.0 Off |                    0 |
| N/A   50C    P8             16W /   70W |       1MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

If this command fails, review the following:

Check if GPUs are attached to the VM. To check for any NVIDIA PCI devices, run the following command:
```
sudo lspci | grep -i "nvidia"
```
Check that the driver kernel version and the VM kernel version are the same.
- To check the VM kernel version, run the following command:
```
uname -r
```
- To check the driver kernel version, run the following command, replacing NVIDIA_DRIVER_VERSION with the driver version that you installed (for example, 535):
```
sudo apt-cache show linux-modules-nvidia-NVIDIA_DRIVER_VERSION-gcp
```
  If the versions don't match, reboot the VM to the new kernel version.

Windows Server

Connect to the Windows Server instance and open a PowerShell terminal, then run the following command to verify that the driver is running properly.

```
nvidia-smi
```

The output is similar to the following:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 538.67                 Driver Version: 538.67       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA L4                    WDDM  | 00000000:00:03.0 Off |                    0 |
| N/A   66C    P8              17W /  72W |    128MiB / 23034MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      4888    C+G   ...CBS_cw5n1h2txyewy\TextInputHost.exe    N/A      |
|    0   N/A  N/A      5180    C+G   ....Search_cw5n1h2txyewy\SearchApp.exe    N/A      |
+---------------------------------------------------------------------------------------+

What's next?

To monitor GPU performance, see Monitor GPU performance.
To handle GPU host maintenance, see Handle GPU host maintenance events.
To improve network performance, see Use higher network bandwidth.
To troubleshoot GPU VMs, see Troubleshoot GPU VMs.
