NVIDIA Fleet Intelligence Agent

NVIDIA Fleet Intelligence Agent - Host agent for GPU telemetry collection and attestation.

Built on top of leptonai/gpud

Overview

For installation prerequisites and setup details, see: Helm Installation, DEB Installation, and RPM Installation.

What It Monitors:

GPU Metrics: Power, temperature, clocks, utilization, memory, Xid events
System Metrics: CPU, memory, disk, network usage
Infrastructure: NVIDIA drivers, CUDA runtime, InfiniBand, containers

Export Formats:

HTTP API Server: Serves data via REST endpoints (JSON) and Prometheus metrics (/metrics)
File Export (Offline Mode): Writes data to local files in CSV or JSON format
Remote Export: Sends telemetry data to OpenTelemetry-compatible endpoints via OTLP over HTTP
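
As an illustration of consuming the HTTP export, the sketch below fetches the /metrics endpoint and parses the Prometheus text exposition format into tuples. It is a minimal example under stated assumptions, not part of the agent: the helper names (parse_metrics, fetch_metrics), the base URL, and any metric names used when trying it out are hypothetical. The agent's actual listen address and metric names should be taken from the Usage and Configuration documents.

```python
import re
from urllib.request import urlopen

# Matches `name{labels} value` or `name value`; an optional trailing
# integer timestamp is accepted and ignored.
_SAMPLE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+\d+)?$'
)
# Matches one `key="value"` pair inside the braces.
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match is None:
            continue  # ignore lines this simple parser does not understand
        name, raw_labels, raw_value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


def fetch_metrics(base_url):
    """Fetch and parse <base_url>/metrics. base_url is an assumption of this sketch."""
    with urlopen(base_url.rstrip("/") + "/metrics", timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Calling fetch_metrics with the agent's base URL returns a list that can be filtered by metric name or label. In production, a Prometheus server or an existing client library would normally scrape the endpoint instead of a hand-written parser.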

Key Features:

Lightweight: <100 MB RAM, <1% CPU usage
Non-intrusive: Read-only operations, no system modifications
Production-ready: Designed for continuous (24/7) datacenter operation

Supported Platforms

OS Family	Supported Versions	CPU Architecture	GPU Architecture
Ubuntu	22.04, 24.04	x86_64, ARM64	Hopper, Blackwell, Rubin
RHEL	8, 9, 10	x86_64, ARM64	Hopper, Blackwell, Rubin
Rocky Linux	8, 9, 10	x86_64, ARM64	Hopper, Blackwell, Rubin
AlmaLinux	8, 9, 10	x86_64, ARM64	Hopper, Blackwell, Rubin
Amazon Linux	2023	x86_64, ARM64	Hopper, Blackwell, Rubin
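
The matrix above can be turned into a quick pre-install check. The sketch below is an illustration only and is not shipped with the agent. It assumes the standard /etc/os-release ID values (ubuntu, rhel, rocky, almalinux, amzn), and it assumes that any minor release of a listed RHEL-family major version (for example 9.4) counts as supported, which the table does not state explicitly. It does not check the GPU architecture.

```python
# Support matrix transcribed from the table above; keys are os-release ID values.
SUPPORTED_VERSIONS = {
    "ubuntu": {"22.04", "24.04"},
    "rhel": {"8", "9", "10"},
    "rocky": {"8", "9", "10"},
    "almalinux": {"8", "9", "10"},
    "amzn": {"2023"},
}
# ARM64 is reported as "aarch64" on Linux; "arm64" is accepted as an alias.
SUPPORTED_ARCHES = {"x86_64", "aarch64", "arm64"}


def is_supported(os_id, version_id, arch):
    """Return True if the OS, version, and CPU architecture appear in the matrix."""
    versions = SUPPORTED_VERSIONS.get(os_id)
    if versions is None or arch not in SUPPORTED_ARCHES:
        return False
    if os_id == "ubuntu":
        return version_id in versions  # Ubuntu releases are matched exactly
    # Assumption: for the other families only the major version matters.
    return version_id.split(".")[0] in versions


def read_os_release(path="/etc/os-release"):
    """Parse KEY=value lines from an os-release file into a dict."""
    fields = {}
    with open(path) as handle:
        for line in handle:
            key, sep, value = line.strip().partition("=")
            if sep:
                fields[key] = value.strip('"')
    return fields
```

On a target host, the ID and VERSION_ID fields returned by read_os_release, together with the value of platform.machine() from the Python standard library, can be passed to is_supported.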

Documentation

Helm Installation - Kubernetes (Helm) installation and troubleshooting
DEB Installation - Ubuntu package install, update, and uninstall
RPM Installation - RHEL/Rocky/Alma/Amazon package install, update, and uninstall
Architecture - Bare metal and Kubernetes architecture, dependencies, and runtime flow
Usage - Commands, HTTP API, integration, and troubleshooting
Configuration - Environment variables and service configuration
Development - Building from source and contributing

Contributing

See CONTRIBUTING.md for development setup and guidelines.

Related: leptonai/gpud (upstream dependency)

License

Apache License 2.0 - see LICENSE for details.
