Local swarm + podman support (webrecorder#261)
* backend: refactor swarm support to also support podman (webrecorder#260)
- implement podman support as subclass of swarm deployment
- podman is used when 'RUNTIME=podman' env var is set
- podman socket is mapped instead of docker socket
- podman-compose is used instead of docker-compose (docker-compose works with podman, but it does not support secrets; podman-compose does)
- separate cli utils into SwarmRunner and PodmanRunner, which extends it
- using config.yaml and config.env, both copied from sample versions
- work on simplifying config: add docker-compose.podman.yml and docker-compose.swarm.yml and signing and debug configs in ./configs
- add {build,run,stop}-{swarm,podman}.sh in scripts dir
- add init-configs, which only copies configs if they don't already exist
- build local image using the current version of podman, to support both podman 3.x and 4.x
- additional fixes after testing podman on CentOS
- docs: update Deployment.md to cover swarm, podman, k8s deployment
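The runner selection described in the bullets above can be sketched as follows. This is a minimal illustration, not the repository's actual code: it assumes `get_runner()` simply checks the `RUNTIME` environment variable, and the real `SwarmRunner`/`PodmanRunner` classes wrap the respective CLI tools.

```python
import os


class SwarmRunner:
    """Illustrative stub: wraps docker/docker-compose CLI calls."""

    compose_cmd = "docker-compose"


class PodmanRunner(SwarmRunner):
    """Podman variant: same interface, but uses podman-compose."""

    compose_cmd = "podman-compose"


def get_runner():
    # podman is used when the 'RUNTIME=podman' env var is set,
    # otherwise the default swarm runner applies
    if os.environ.get("RUNTIME") == "podman":
        return PodmanRunner()
    return SwarmRunner()
```

Subclassing keeps the two deployments sharing one code path, with podman-specific behavior overridden in one place.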
ikreymer authored Jun 14, 2022
1 parent 68ec582 commit 418c07b
Showing 40 changed files with 661 additions and 389 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yaml
Original file line number Diff line number Diff line change
@@ -24,7 +24,7 @@ jobs:

-
name: Copy Configs
run: cp ./configs/config.sample.env ./configs/config.env; cp ./configs/storages.sample.yaml ./configs/storages.yaml
run: ./scripts/init-configs.sh

-
name: Build Backend
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,6 +1,6 @@
**/*.pyc
**/node_modules/
**/config.env
configs/storages.yaml
**/config.yaml
**/signing.yaml
.DS_Store
98 changes: 73 additions & 25 deletions Deployment.md
@@ -1,60 +1,108 @@
# Deploying Browsertrix Cloud

Currently Browsertrix Cloud can be deployed in both Docker and Kubernetes.
Browsertrix Cloud can be deployed anywhere, from single-node isolated environments to multi-machine setups and cloud-native Kubernetes!

Browsertrix Cloud currently supports three deployment methods:
- Rootless deployment with podman on a single machine (no Docker required)
- Docker Swarm for single- or multi-machine deployments
- Kubernetes cluster deployment

Some basic instructions are provided below; we plan to expand this into a more detailed tutorial in the future.

## Deploying to Docker
(All shell scripts can be found in the `./scripts` directory)

## Deploying with Docker Swarm

For local deployments, using Docker Swarm is recommended. Docker Swarm can be used in a single-machine mode as well
as with multi-machine setups. Docker Swarm is part of Docker, so if you have Docker installed, you can use this method.

1. Run `init-configs.sh`, which copies the sample configs to `configs/config.env` and `configs/config.yaml`.

2. You can edit `configs/config.env` and `configs/config.yaml` to set default passwords for the superadmin, Minio, and MongoDB.

3. Run `run-swarm.sh` to initialize the cluster.

4. Load `http://localhost:9871/` to see the Browsertrix Cloud login page. (The API is also available at: `http://localhost:9871/api/docs`).

You can stop the deployment with `stop-swarm.sh` and restart it with `run-swarm.sh`.


Note: Currently, unless email settings are configured, you will need to check the logs to get the invite code when inviting users. You can do this by running:
`docker service logs btrix_backend`
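The `init-configs.sh` script from step 1 only copies the sample files when the real configs are missing, so re-running it never clobbers local edits. A rough Python equivalent is sketched below; the actual script is shell, and the exact sample filenames are an assumption based on the names mentioned in this commit.

```python
import shutil
from pathlib import Path


def init_configs(config_dir="configs"):
    """Copy sample configs into place, skipping any that already exist.

    Hypothetical equivalent of scripts/init-configs.sh; filename pairs
    are assumed from the sample names mentioned in the docs.
    """
    pairs = [
        ("config.sample.env", "config.env"),
        ("config.sample.yaml", "config.yaml"),
    ]
    for sample, dest in pairs:
        src = Path(config_dir) / sample
        dst = Path(config_dir) / dest
        if dst.exists():
            continue  # keep the existing config untouched
        shutil.copyfile(src, dst)
```

This copy-if-absent pattern is what makes the script safe to run in CI and on every restart.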


## Deploying with Podman

Browsertrix Cloud can now also be used with Podman for environments that don't support Docker.

For testing out Browsertrix Cloud on a single, local machine, the Docker Compose-based deployment is recommended.
Podman allows Browsertrix Cloud to be deployed locally by a non-root user.

To deploy via local Docker instance, copy the `config.sample.env` to `config.env`.
Podman deployment also requires either docker-compose or podman-compose.

Docker Compose is required.

Then, run `docker-compose build; docker-compose up -d` to launch.
### Initial Installation

To update/relaunch, use `./docker-restart.sh`.
To run with Podman as a non-root user, there are a few initial installation steps:

The API documentation should be available at: `http://localhost:9871/api/docs`.
1. Ensure the podman socket service is running: `systemctl --user start podman.socket`. Podman itself does not require a running service, but Browsertrix Cloud requires access to the socket to work.

To allow downloading of WACZ files via the UI from a remote host, set the `STORE_ACCESS_ENDPOINT_URL` to use the domain of the host.
Otherwise, the files are accessible only through the default Minio service running on port 9000.
2. Ensure podman [can set cpu limits](https://github.com/containers/podman/blob/main/troubleshooting.md#26-running-containers-with-cpu-limits-fails-with-a-permissions-error), as Browsertrix Cloud uses cpu and memory limits for each crawl. After following the instructions above, also run `sudo systemctl daemon-reload` to reload the delegate settings.

3. Ensure podman-compose is installed via `pip install podman-compose`.

Note: When deployed in local Docker, failed crawls are not retried currently. Scheduling is handled by a subprocess, which stores active schedule in the DB.
4. Run `build-podman.sh` to build the local images.

5. Run `init-configs.sh`, which copies the sample configs to `configs/config.env` and `configs/config.yaml`.

### Enabling Signing
6. You can edit `configs/config.env` and `configs/config.yaml` to set default passwords for the superadmin, Minio, and MongoDB.

7. Run `run-podman.sh` to run Browsertrix Cloud using podman.

8. Load `http://localhost:9871/` to see the Browsertrix Cloud login page. (The API is also available at: `http://localhost:9871/api/docs`).


You can stop the deployment with `stop-podman.sh` and restart it with `run-podman.sh`.

Note: Currently, unless email settings are configured, you will need to check the logs to get the invite code when inviting users. You can do this by running:
`podman logs -f browsertrix-cloud_backend_1`

It's also possible to use Docker Compose with podman by setting `export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock`. You can change the setting
in `run-podman.sh` and `stop-podman.sh` to use docker-compose instead if desired.
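The `DOCKER_HOST` value shown above is just the user's rootless podman socket under `$XDG_RUNTIME_DIR`. A small helper to compute it is sketched here; it is illustrative, not part of the repo.

```python
import os


def podman_socket_url(runtime_dir=None):
    """Build the DOCKER_HOST URL for a rootless podman socket.

    Falls back to the XDG_RUNTIME_DIR env var when no directory is given.
    """
    runtime_dir = runtime_dir or os.environ.get("XDG_RUNTIME_DIR", "")
    return f"unix://{runtime_dir}/podman/podman.sock"
```

Pointing docker-compose at this URL is what lets it drive podman through the Docker-compatible API.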


### Enabling Signing (for Swarm and Podman Deployments)

Browsertrix Cloud can optionally sign WACZ files with the same key used to generate an SSL cert.
To use this functionality, the machine running Browsertrix Cloud must be associated with a domain and must have port 80 available on that domain.

To enable signing in the Docker-based deployment:
To use this functionality, the machine running Browsertrix Cloud must be associated with a domain and must have port 80 available on that domain,
or another port forwarding to port 80.

The `docker-compose.signing.yml` adds the capability for signing with the `authsign` module.

1) Copy `configs/signing.sample.yaml` to `configs/signing.yaml` and set the domain and email fields in the config. Set `staging` to false to generate real certificates.
To enable signing in the Docker-based deployment:

2) In `configs/config.env`, also uncomment `WACZ_SIGN_URL`.
1. Copy `configs/signing.sample.yaml` to `configs/signing.yaml` and set the domain and email fields in the config. Set `staging` to false to generate real certificates.

2. In `docker-compose.signing.yml`, set an optional signing token.

WACZ files created on minio should now be signed! Be sure to also set `STORE_ACCESS_ENDPOINT_URL` to get downloadable links from the UI downloads view.
3. In `run-swarm.sh`, uncomment the option for running with signing.
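The fields touched in step 1 can be sanity-checked as follows. This is a hypothetical validator, assuming `signing.yaml` contains at least `domain`, `email`, and `staging` keys; see `configs/signing.sample.yaml` for the authoritative schema.

```python
def check_signing_config(cfg):
    """Minimal sanity check for a signing.yaml-style dict (assumed schema)."""
    for key in ("domain", "email"):
        if not cfg.get(key):
            raise ValueError(f"signing config missing required field: {key}")
    # staging defaults to True; real certificates are issued
    # only when staging is explicitly set to false
    return bool(cfg.get("staging", True))
```

The return value here mirrors the doc's note that `staging` must be set to false before real certificates are generated.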


## Deploying to Kubernetes

For deploying in the cloud and across multiple machines, the Kubernetes (k8s) deployment is recommended.

To deploy to K8s, `helm` is required. Browsertrix Cloud comes with a helm chart, which can be installed as follows:
## Deploying to Kubernetes

`helm install -f ./chart/values.yaml btrix ./chart/`
For deploying in the cloud, the Kubernetes (k8s) deployment is recommended.
Browsertrix Cloud uses `helm` to deploy to K8s.

This will create a `browsertrix-cloud` service in the default namespace.

For a quick update, the following is recommended:
1. Ensure `helm` is installed locally and `kubectl` is configured for your k8s cluster.

`helm upgrade -f ./chart/values.yaml btrix ./chart/`
2. Edit `chart/values.yaml` to configure your deployment. The `ingress` section contains the domain under which the service will be deployed, and `signing` can be used to enable WACZ signing.

3. Run: `helm upgrade --install -f ./chart/values.yaml btrix ./chart/` to deploy or upgrade an existing deployment.

Note: When deployed in Kubernetes, failed crawls are automatically retried. Scheduling is handled via Kubernetes Cronjobs, and crawl jobs are run in the `crawlers` namespace.

To stop, run `helm uninstall btrix`.
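The helm invocations above all follow one pattern; sketched here as argv construction, purely for illustration (the function and its defaults are not part of the repo).

```python
def helm_cmd(action, release="btrix", chart="./chart/", values="./chart/values.yaml"):
    """Build the helm argv lists used in this doc (illustrative only)."""
    if action == "uninstall":
        return ["helm", "uninstall", release]
    # `upgrade --install` deploys on first run and upgrades an existing release
    return ["helm", "upgrade", "--install", "-f", values, release, chart]
```

Using `upgrade --install` means one command covers both the initial deploy and later upgrades.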

*Additional info coming soon*
4 changes: 2 additions & 2 deletions README.md
@@ -10,7 +10,7 @@ and managing all aspects of the crawling process. This system provides the orchestration
while the actual crawling is performed using
[Browsertrix Crawler](https://github.com/webrecorder/browsertrix-crawler) containers, which are launched for each crawl.

The system is designed to run equally in Kubernetes and Docker.
The system is designed to run in both Kubernetes and Docker Swarm, as well as locally under Podman.

See [Features](https://browsertrix.cloud/features) for a high-level list of planned features.

@@ -21,7 +21,7 @@ See the [Deployment](Deployment.md) page for information on how to deploy Browsertrix Cloud.

## Development Status

Browsertrix Cloud is currently in pre-alpha stages and not ready for production. This is an ambitious project and there's a lot to be done!
Browsertrix Cloud is currently in an alpha stage and not ready for production. This is an ambitious project and there's a lot to be done!

If you would like to help in a particular way, please open an issue or reach out to us in other ways.

6 changes: 6 additions & 0 deletions backend/Dockerfile
@@ -1,3 +1,7 @@
ARG PODMAN_VERSION=4

FROM docker.io/mgoltzsche/podman:${PODMAN_VERSION}-remote as podmanremote

FROM python:3.9

WORKDIR /app
@@ -10,5 +14,7 @@ RUN python-on-whales download-cli

ADD btrixcloud/ /app/btrixcloud/

COPY --from=podmanremote /usr/local/bin/podman-remote /usr/bin/podman

CMD uvicorn btrixcloud.main:app_root --host 0.0.0.0 --access-log --log-level info

4 changes: 2 additions & 2 deletions backend/btrixcloud/archives.py
@@ -120,8 +120,8 @@ async def serialize_for_user(self, user: User, user_manager):
class ArchiveOps:
"""Archive API operations"""

def __init__(self, db, invites):
self.archives = db["archives"]
def __init__(self, mdb, invites):
self.archives = mdb["archives"]

self.router = None
self.archive_viewer_dep = None
8 changes: 8 additions & 0 deletions backend/btrixcloud/crawl_job.py
@@ -58,6 +58,7 @@ def __init__(self):
self.finished = None

self._cached_params = {}
self._files_added = False

params = {
"cid": self.cid,
@@ -188,6 +189,12 @@ async def finish_crawl(self):
if self.finished:
return

# check if one-page crawls actually succeeded
# if only one page found, and no files, assume failed
if self.last_found == 1 and not self._files_added:
await self.fail_crawl()
return

self.finished = dt_now()

completed = self.last_done and self.last_done == self.last_found
@@ -283,6 +290,7 @@ async def add_file_to_crawl(self, cc_data):
"$push": {"files": crawl_file.dict()},
},
)
self._files_added = True

return True

1 change: 1 addition & 0 deletions backend/btrixcloud/crawlconfigs.py
@@ -335,6 +335,7 @@ async def update_crawl_config(self, cid: uuid.UUID, update: UpdateCrawlConfig):

async def get_crawl_configs(self, archive: Archive):
"""Get all crawl configs for an archive is a member of"""
# pylint: disable=duplicate-code
cursor = self.crawl_configs.aggregate(
[
{"$match": {"aid": archive.id, "inactive": {"$ne": True}}},
1 change: 1 addition & 0 deletions backend/btrixcloud/crawls.py
@@ -191,6 +191,7 @@ async def list_crawls(
if running_only:
query["state"] = {"$in": ["running", "starting", "stopping"]}

# pylint: disable=duplicate-code
aggregate = [
{"$match": query},
{
5 changes: 2 additions & 3 deletions backend/btrixcloud/invites.py
@@ -50,8 +50,8 @@ class InviteToArchiveRequest(InviteRequest):
class InviteOps:
""" invite users (optionally to an archive), send emails and delete invites """

def __init__(self, db, email):
self.invites = db["invites"]
def __init__(self, mdb, email):
self.invites = mdb["invites"]
self.email = email

async def add_new_user_invite(
@@ -95,7 +95,6 @@ async def remove_invite(self, invite_token: str):
""" remove invite from invite list """
await self.invites.delete_one({"_id": invite_token})

# pylint: disable=no-self-use
def accept_user_invite(self, user, invite_token: str):
""" remove invite from user, if valid token, throw if not """
invite = user.invites.pop(invite_token, "")
2 changes: 1 addition & 1 deletion backend/btrixcloud/k8s/base_job.py
@@ -31,7 +31,7 @@ def __init__(self):

async def init_job_objects(self, template, extra_params=None):
""" init k8s objects from specified template with given extra_params """
with open(self.config_file) as fh_config:
with open(self.config_file, encoding="utf-8") as fh_config:
params = yaml.safe_load(fh_config)

params["id"] = self.job_id
1 change: 0 additions & 1 deletion backend/btrixcloud/k8s/k8sman.py
@@ -144,7 +144,6 @@ async def _create_from_yaml(self, _, yaml_data):
""" create from yaml """
await create_from_yaml(self.api_client, yaml_data, namespace=self.namespace)

# pylint: disable=no-self-use
def _secret_data(self, secret, name):
""" decode secret data """
return base64.standard_b64decode(secret.data[name]).decode()
51 changes: 16 additions & 35 deletions backend/btrixcloud/swarm/base_job.py
@@ -9,19 +9,21 @@

from fastapi.templating import Jinja2Templates

from .utils import get_templates_dir, run_swarm_stack, delete_swarm_stack
from .utils import get_templates_dir, get_runner
from ..utils import random_suffix

runner = get_runner()


# =============================================================================
# pylint: disable=too-many-instance-attributes,bare-except,broad-except
class SwarmJobMixin:
""" Crawl Job State """

def __init__(self):
self.secrets_prefix = "/var/run/secrets/"
self.shared_config_file = os.environ.get("SHARED_JOB_CONFIG")
self.custom_config_file = os.environ.get("CUSTOM_JOB_CONFIG")
self.shared_secrets_file = os.environ.get("STORAGE_SECRETS")

self.curr_storage = {}

@@ -39,15 +41,14 @@ def __init__(self):
self.prefix = os.environ.get("STACK_PREFIX", "stack-")

if self.custom_config_file:
self._populate_env("/" + self.custom_config_file)
self._populate_env(self.secrets_prefix + self.custom_config_file)

self.templates = Jinja2Templates(directory=get_templates_dir())

super().__init__()

# pylint: disable=no-self-use
def _populate_env(self, filename):
with open(filename) as fh_config:
with open(filename, encoding="utf-8") as fh_config:
params = yaml.safe_load(fh_config)

for key in params:
@@ -61,7 +62,9 @@ async def init_job_objects(self, template, extra_params=None):
loop.add_signal_handler(signal.SIGUSR1, self.unschedule_job)

if self.shared_config_file:
with open("/" + self.shared_config_file) as fh_config:
with open(
self.secrets_prefix + self.shared_config_file, encoding="utf-8"
) as fh_config:
params = yaml.safe_load(fh_config)
else:
params = {}
@@ -71,18 +74,7 @@
if extra_params:
params.update(extra_params)

if (
os.environ.get("STORAGE_NAME")
and self.shared_secrets_file
and not self.curr_storage
):
self.load_storage(
f"/var/run/secrets/{self.shared_secrets_file}",
os.environ.get("STORAGE_NAME"),
)

if self.curr_storage:
params.update(self.curr_storage)
params["storage_name"] = os.environ.get("STORAGE_NAME", "default")

await self._do_create(loop, template, params)

@@ -94,36 +86,25 @@ async def delete_job_objects(self, _):
if not self.is_scheduled or self.remove_schedule:
print("Removed other objects, removing ourselves", flush=True)
await loop.run_in_executor(
None, delete_swarm_stack, f"job-{self.orig_job_id}"
None, runner.delete_service_stack, f"job-{self.orig_job_id}"
)
else:
sys.exit(0)

return True

def unschedule_job(self):
""" mark job as unscheduled"""
""" mark job as unscheduled """
print("Unscheduled, will delete when finished", flush=True)
self.remove_schedule = True

def load_storage(self, filename, storage_name):
""" load storage credentials for given storage from yaml file """
with open(filename) as fh_config:
data = yaml.safe_load(fh_config.read())

if not data or not data.get("storages"):
return

for storage in data["storages"]:
if storage.get("name") == storage_name:
self.curr_storage = storage
break

async def _do_create(self, loop, template, params):
data = self.templates.env.get_template(template).render(params)
return await loop.run_in_executor(
None, run_swarm_stack, self.prefix + self.job_id, data
None, runner.run_service_stack, self.prefix + self.job_id, data
)

async def _do_delete(self, loop):
await loop.run_in_executor(None, delete_swarm_stack, self.prefix + self.job_id)
await loop.run_in_executor(
None, runner.delete_service_stack, self.prefix + self.job_id
)