
Portal:Toolforge/About Toolforge

From Wikitech


This page explains the basics of how Toolforge works, and introduces core concepts that Toolforge users should understand.

Key concepts: tools, tool accounts, and maintainers

In Toolforge, the terms "tool", "tool account", and "project" mean the same thing, and "tool" and "tool account" are often used interchangeably. The tool is the basic unit of deployment in Toolforge. Each tool is a tool account with its own resources, processes, and other components in a tool-specific namespace.

A tool account is a group account associated with a tool. A tool account can have one or more members or tool maintainers. You create a separate tool account for each new tool you develop on Toolforge. When you're invited to work on or help maintain a tool, you'll join an existing tool account. Tool accounts enable multiple maintainers to collaboratively manage the software source code, configuration, and jobs for that tool.

Figure: A diagram depicting tool maintainers and tool accounts, and the actions that can be performed as each.

People who have access to a tool account are called maintainers. Maintainers have access to the tool account's code and data.

Maintainers can:

  • Create tool accounts/tools
  • Join existing tool accounts/tools
  • Leave tool accounts/tools in the care of others
  • Log in (sudo) to the tool accounts/tools
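
The relationship between maintainers and tool accounts can be sketched as a small data model. This is only an illustration, not how Toolforge stores accounts: the class name, the function names, and the rule that the last maintainer cannot leave (one reading of "in the care of others") are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ToolAccount:
    """Hypothetical model of a tool account: a named group of maintainers."""

    name: str
    maintainers: set[str] = field(default_factory=set)

    def join(self, user: str) -> None:
        """Add a maintainer to an existing tool account."""
        self.maintainers.add(user)

    def leave(self, user: str) -> None:
        """Remove a maintainer, but only if someone else remains in charge."""
        if user not in self.maintainers:
            raise ValueError(f"{user} is not a maintainer of {self.name}")
        if len(self.maintainers) == 1:
            # Assumption for this sketch: a tool must stay in someone's care.
            raise ValueError(f"{user} is the last maintainer of {self.name}")
        self.maintainers.remove(user)


def create_tool(name: str, creator: str) -> ToolAccount:
    """Create a tool account whose first maintainer is its creator."""
    return ToolAccount(name=name, maintainers={creator})
```

For example, `create_tool("example-tool", "alice")` followed by `join("bob")` gives a tool with two maintainers, after which either one can leave.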


Components of Toolforge

Bastion hosts

A bastion is the main host on any given network for external users to log in to. Toolforge users can choose from two bastion hosts:

  • login.toolforge.org: user login to access tools interactively; use this host for most tasks
  • dev.toolforge.org: functionally identical to the login host, but used for heavy processing, such as compiles

Storage (NFS)

NFS is used for many purposes in Toolforge, including:

  • Storing tool home directories and files (in /data/project/<TOOL NAME>)
  • Distributing tool source code to the Kubernetes computing backend
  • Storing database access credentials (in $HOME/replica.my.cnf) and Kubernetes TLS certificates
  • Storing logs that are generated at runtime from webservices and jobs
  • Distributing wiki dumps
  • Storing tool temp files

To learn more, visit Help:Shared storage.
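
The path conventions in the list above can be sketched in a few lines. This is a minimal illustration that assumes a tool's $HOME is its directory under /data/project; the function names and the example tool name are hypothetical.

```python
from pathlib import PurePosixPath

# Root of tool home directories on NFS, as given in the list above.
TOOL_HOME_ROOT = PurePosixPath("/data/project")


def tool_home(tool_name: str) -> PurePosixPath:
    """Return the home directory of a tool: /data/project/<TOOL NAME>."""
    if not tool_name or "/" in tool_name or tool_name in {".", ".."}:
        raise ValueError(f"not a usable tool name: {tool_name!r}")
    return TOOL_HOME_ROOT / tool_name


def replica_credentials_path(tool_name: str) -> PurePosixPath:
    """Return the database credentials file, assuming $HOME is the tool home."""
    return tool_home(tool_name) / "replica.my.cnf"
```

Calling `replica_credentials_path("example-tool")` yields `/data/project/example-tool/replica.my.cnf`.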

Kubernetes cluster

At the core of the Toolforge service is a Kubernetes computing engine. Kubernetes (often abbreviated k8s) is a platform for running containers. Toolforge uses Kubernetes to isolate tools from each other and to distribute tools across a pool of servers (the "cluster"). Users don't need to become system administrators of any servers or hardware, because the Toolforge administrators manage the pool of virtual servers. When you log in to a bastion host and start a web service or another job, Kubernetes creates a container and, when resources are available, finds an appropriate execution node to run your workload.

To learn more about Kubernetes and Toolforge, visit Help:Toolforge/Kubernetes.
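
To make the idea of "finding an appropriate execution node" concrete, here is a toy first-fit placement in Python. It is not the Kubernetes scheduler, which weighs many more factors; the node names, resource figures, and function name are invented for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A worker in the pool, with the resources it still has free."""

    name: str
    free_cpus: float
    free_memory_gb: float


def place(nodes: list[Node], cpus: float, memory_gb: float) -> Optional[str]:
    """Reserve resources on the first node that can fit the workload.

    Returns the chosen node's name, or None when no node currently has
    enough free resources (the workload would have to wait).
    """
    for node in nodes:
        if node.free_cpus >= cpus and node.free_memory_gb >= memory_gb:
            node.free_cpus -= cpus
            node.free_memory_gb -= memory_gb
            return node.name
    return None
```

With a small node and a large node in the pool, a workload that is too big for the first one lands on the second, and a workload that fits nowhere gets `None` until resources free up.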

Databases

Toolforge supports two sets of databases: the wiki replicas and user-created databases, which are used by individual tools:

  • The wiki replicas follow the same setup as production wiki databases. They expose the same information that normal registered users (not +sysop or other types of advanced users) can access on-wiki or via the API. Note: some data has been removed from the replicas for privacy reasons.
  • User-created databases can be created by either a user or a tool to store data that is generated by a tool. Most user-created databases are stored in a single database server called "ToolsDB".

To learn about both types of databases, visit Help:Toolforge/Database.
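
Tools connect to databases with the credentials stored in $HOME/replica.my.cnf. The sketch below reads such a file with Python's standard configparser module. It assumes the file follows the usual MySQL option-file layout with a [client] section containing user and password; the sample values are placeholders, not real credentials.

```python
import configparser

# Placeholder content in the assumed layout of replica.my.cnf.
SAMPLE_CNF = """\
[client]
user = s12345
password = not-a-real-password
"""


def read_credentials(text: str) -> tuple[str, str]:
    """Return (user, password) from MySQL option-file text."""
    # Disable interpolation so a "%" in a password is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    client = parser["client"]
    # Option files may quote values; strip surrounding quotes if present.
    user = client["user"].strip("'\"")
    password = client["password"].strip("'\"")
    return user, password
```

In a real tool, the text would come from the credentials file in the tool's home directory rather than from a sample string.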

Shared services and resources

Toolforge is shared infrastructure; users access shared resources and services. For example, all users of the Toolforge bastion share the same limited connection to NFS storage. Admins enforce a shared-system policy to prevent any single user from using too many system resources. This includes quotas that limit CPU and memory use to 2 virtual CPUs and 8 GB of RAM. The maximum recommended per-job memory limit is 4 GB. If your project requires more memory, you can request a quota increase.
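
The quota figures above lend themselves to a quick sanity check before you start jobs. The sketch below assumes the 2 CPU and 8 GB figures apply to the sum of a tool's jobs, which is an interpretation made for this example; the Job class and check_quota function are hypothetical, and actual enforcement is done by the platform.

```python
from dataclasses import dataclass

QUOTA_CPUS = 2.0                  # virtual CPUs, from the text above
QUOTA_MEMORY_GB = 8.0             # GB of RAM, from the text above
RECOMMENDED_JOB_MEMORY_GB = 4.0   # maximum recommended per-job memory


@dataclass(frozen=True)
class Job:
    name: str
    cpus: float
    memory_gb: float


def check_quota(jobs: list[Job]) -> list[str]:
    """Return a list of human-readable problems; empty means the plan fits."""
    problems = []
    total_cpus = sum(job.cpus for job in jobs)
    total_memory = sum(job.memory_gb for job in jobs)
    if total_cpus > QUOTA_CPUS:
        problems.append(f"total CPU {total_cpus} exceeds {QUOTA_CPUS}")
    if total_memory > QUOTA_MEMORY_GB:
        problems.append(f"total memory {total_memory} GB exceeds {QUOTA_MEMORY_GB} GB")
    for job in jobs:
        if job.memory_gb > RECOMMENDED_JOB_MEMORY_GB:
            problems.append(f"job {job.name} requests more than {RECOMMENDED_JOB_MEMORY_GB} GB")
    return problems
```

A plan that reports problems is a signal either to shrink the jobs or to request a quota increase.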

Tools for tool maintainers

  • Toolsadmin is where maintainers of Toolforge tools can create and manage their tools.
  • Toolhub is a public catalog of tools, including (but not limited to) tools hosted on Toolforge.
  • Tool-db-usage is a dashboard displaying information about tool databases (user-created databases).

Next steps

To complete the one-time setup and get started using Toolforge, visit the Quickstart. If you've already done those steps, learn more about working with tool accounts, or try a tutorial.

Communication and support

Support and administration of Wikimedia Cloud Services (WMCS) resources are provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:

Discuss and receive general support
Stay aware of critical changes and plans
Track work tasks and report bugs

Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself

Read stories and WMCS blog posts

Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)