armadaproject/armada

Armada is a system for scheduling and running batch jobs (e.g., a compute job for training a machine learning model) over Kubernetes clusters. Armada is designed for high availability and to handle scheduling hundreds of jobs per second (with potentially millions of jobs queued) over thousands of nodes.

To achieve this, Armada, unlike previous Kubernetes batch schedulers (e.g., kube-batch), can schedule jobs over multiple Kubernetes clusters simultaneously and hence scale beyond the limitations of a single Kubernetes cluster (which we find to be about 1000 nodes). In addition, Kubernetes clusters can be connected to and disconnected from Armada on the fly without disruption. Jobs are submitted to job queues, of which there may be many (for example, each user could have a separate queue), and Armada divides compute resources fairly between queues; a sketch of a job submission is shown below.
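
For illustration, a job submission might look roughly like the following YAML. The field names (`queue`, `jobSetId`, `podSpec`) and the `armadactl` command mentioned afterwards follow the general shape shown in the Armada quickstart; treat this as an illustrative sketch rather than the authoritative schema.

```yaml
# Illustrative Armada job set: one queue, one job set, a single small job.
# Field names follow the general shape of Armada job YAML; consult the
# quickstart/user guide for the authoritative schema.
queue: example-queue
jobSetId: example-job-set-1
jobs:
  - priority: 0
    namespace: default
    podSpec:
      restartPolicy: Never
      containers:
        - name: sleeper
          image: alpine:3.18
          command: ["sh", "-c", "sleep 60"]
          resources:
            requests:
              cpu: 150m
              memory: 64Mi
            limits:
              cpu: 150m
              memory: 64Mi
```

Assuming the target queue already exists, such a file would typically be submitted with the `armadactl` command-line client (e.g., `armadactl submit jobs.yaml`); see the documentation links below for the exact workflow.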

Armada is loosely based on the HTCondor batch scheduler and can be used as a replacement for HTCondor, provided all nodes are enrolled in Kubernetes clusters.

Armada is not

  • A service scheduler (i.e., Armada jobs should have a finite lifetime)
  • Designed for low-latency scheduling (expect latency on the order of 10 seconds from job submission)
  • Designed to schedule jobs over underlying systems other than Kubernetes clusters

Documentation

For an overview of the architecture and design of Armada, and instructions for submitting jobs, see:

For instructions on how to set up and develop Armada, see:

For API reference, see:

We expect readers of the documentation to have a basic understanding of Docker and Kubernetes; see, e.g., the following links: