When you enable the Backup for GKE agent in your Google Kubernetes Engine cluster, Backup for GKE provides a CustomResourceDefinition that introduces a new kind of Kubernetes resource: the ProtectedApplication.
Composing a ProtectedApplication involves three activities:
- Selecting the set of resources that you want to back up and restore as part of the application.
- Defining detailed orchestration rules for a subset of those resources.
- Validating the ProtectedApplication to check if it is ready for backup.
ProtectedApplication resources provide you with these capabilities when customizing backup and restore logic at the application level:
- More fine-grained backup and restore operations. Without ProtectedApplications, the scope of your backups must be defined at the Namespace level (either by selecting allNamespaces or selectedNamespaces). Similar logic applies to namespaced resource restoration. Creating ProtectedApplication resources allows you to supply a name to a subset of the resources in a Namespace. You can then back up and restore that subset by listing selectedApplications in your backup scope (and similarly, for restore).
- Orchestrating fine-grained details of the backup or restore process, including:
  - Skipping selected volumes during backup.
  - Incorporating application topology into backup and restore (for example, only backing up one instance of a replicated database and using it to restore multiple instances).
  - Executing user-defined hooks before and after volumes are snapshotted. These can be used, for example, to flush and quiesce a workload before snapshotting and unquiesce it afterwards.
You create ProtectedApplication resources with kubectl, like other Kubernetes resources.
They are completely optional. If ProtectedApplication resources are not present, Backup for GKE creates volume backups for all volumes within the scope of a backup, and the resulting volume backups are crash consistent: all writes flushed to the disk at a particular point in time are captured (that is, no partial writes). However, some applications may keep data in memory that isn't flushed to disk, so whether an application can recover successfully from a crash-consistent backup depends on the application logic.
Selecting resources
The first step in building your ProtectedApplication resource is to identify the other resources in the same Namespace that you want to include as part of the application. This is the set of resources that will be backed up or restored if you supply the selectedApplications scope option in your BackupPlan configuration.
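As a rough illustration of that scope option, the following sketch shows how a ProtectedApplication might be listed in the backup scope of a BackupPlan. It is written as YAML for readability, and the field layout (`backupConfig.selectedApplications.namespacedNames`, each entry with `namespace` and `name`) is an assumption that should be checked against the BackupPlan API reference. The Namespace and application names are placeholders.

```yaml
# Assumed shape of the backupConfig section of a BackupPlan, shown as YAML.
# Verify the field names against the BackupPlan API reference before use.
backupConfig:
  selectedApplications:
    namespacedNames:
    - namespace: webserver   # placeholder: Namespace of the ProtectedApplication
      name: nginx            # placeholder: name of the ProtectedApplication
```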
Resources are identified using a label selector. This requires that you label all your resources (using the metadata.labels field in each resource) with the same label. Note that this also applies to resources that are automatically created by controllers. These auto-created resources are labeled using their corresponding template.
It is common to reuse the same label you are already using to associate generated Pods and PersistentVolumeClaims with their parent resource. The following example shows how you can apply the app: nginx label to the other resources in addition to the Deployment.
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-vars
  namespace: webserver
  labels:
    app: nginx
data:
  ...
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-logs
  namespace: webserver
  labels:
    app: nginx
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 50Mi
  storageClassName: standard-rwo
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: webserver
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
      - name: nginx-logs
        persistentVolumeClaim:
          claimName: nginx-logs
      containers:
      ...
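For resources that a controller generates, the label belongs on the template rather than on the generated objects. The following is a minimal sketch of that idea, assuming a hypothetical StatefulSet named nginx-cache. The StatefulSet name, the extra component label, the image, and the storage size are all illustrative and not part of the example above.

```yaml
# Hypothetical StatefulSet: the Pod template and the volumeClaimTemplates both
# carry app: nginx, so the generated Pods and PersistentVolumeClaims are labeled.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx-cache              # hypothetical name
  namespace: webserver
  labels:
    app: nginx
spec:
  serviceName: nginx-cache
  replicas: 2
  selector:
    matchLabels:
      app: nginx
      component: cache           # extra label keeps this selector distinct
  template:
    metadata:
      labels:
        app: nginx
        component: cache
    spec:
      containers:
      - name: nginx
        image: nginx             # assumed image
        volumeMounts:
        - name: cache
          mountPath: /var/cache/nginx
  volumeClaimTemplates:
  - metadata:
      name: cache
      labels:
        app: nginx               # copied to each generated PersistentVolumeClaim
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
      storageClassName: standard-rwo
```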
Once you have your selected label applied to all your target resources (and the templates from which additional resources are generated), then you can reference those resources from a ProtectedApplication. For example:
kind: ProtectedApplication
apiVersion: gkebackup.gke.io/v1
metadata:
  name: nginx
  namespace: webserver
spec:
  resourceSelection:
    type: Selector
    selector:
      matchLabels:
        app: nginx
  ...
Define orchestration rules
Once you have all the resources in your ProtectedApplication identified, you can choose to define detailed orchestration rules for a subset of these resources. These rules can apply to only two kinds of resources, Deployments and StatefulSets, and are referenced in the components section of the ProtectedApplication.
Component overview
Configuring a component involves the following:
- Selecting a fundamental strategy for how backup and restore will work for this component. There are three strategies available:
  - BackupAllRestoreAll - back up the volumes associated with all instances of the component and restore them all from the backups.
  - BackupOneRestoreAll - back up the volumes from only one instance of the component and use those backups to restore all instances.
  - DumpAndLoad - export data from the application to a single volume at backup time and import that data into the application at restore time.
- Defining execution hooks to run during backup (and possibly restore, depending on the strategy). A hook is a command that is executed in specific containers.
Execution hooks
A hook is a shell command that Backup for GKE executes in a container at a particular phase of the backup or restore process.
There are four different types of hooks:
- pre hooks - these commands are executed right before volumes are backed up and are generally expected to flush any data in memory to disk and then quiesce the application so that no new disk writes are occurring. These hooks are used in the BackupAllRestoreAll and BackupOneRestoreAll strategies.
- post hooks - these commands are executed during the volume backup process right after the SNAPSHOTTING step of the volume backup process (before the UPLOADING step). Generally, the SNAPSHOTTING step takes just a few seconds. They are generally expected to unquiesce the application (i.e. allow normal processing and disk writes to proceed). These hooks are used in the BackupAllRestoreAll, BackupOneRestoreAll, and DumpAndLoad strategies.
- dump hooks - these commands are executed before the volume is backed up in the DumpAndLoad strategy and are generally expected to export data from the application into the designated backup volume.
- load hooks - these commands are executed at restore time after the backup volume is restored in DumpAndLoad strategy cases. They are generally expected to import the data from the backup volume into the application.
You may provide more than one hook for each type and Backup for GKE will execute them in the order you define them.
You define hooks as part of the component section of the ProtectedApplication specification. All hook definitions have the same available fields:
- name - a name you assign to the hook.
- container - (optional) name of the container to run the command in. If you don't supply the container, Backup for GKE runs the hook in the first container defined for the target Pod(s).
- command - this is the actual command sent into the container, constructed as an array of words. The first word in the array is the path to the command and subsequent words are the arguments to be passed to the command.
- timeoutSeconds - (optional) time before hook execution is aborted. If you don't supply this, it defaults to 30 seconds.
- onError - (optional) behavior taken when the hook fails. May be set to Ignore or Fail (default). If you set this to Fail, then when a hook fails, the volume backup will fail. If you set this to Ignore, failures of this hook are ignored.
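The fields above can be combined as in the following sketch of a single hook entry with every field set explicitly. The hook name, container name, and the 60-second timeout are illustrative values chosen for this example. The entry is shown under backupPreHooks, but the same fields apply to the other hook lists.

```yaml
# Sketch of one hook entry with every field set explicitly.
# The hook name, container name, and timeout value are illustrative.
backupPreHooks:
- name: fsfreeze
  container: nginx
  command: [ /sbin/fsfreeze, -f, /var/log/nginx ]
  timeoutSeconds: 60   # overrides the 30-second default
  onError: Fail        # the default; Ignore would let the volume backup continue
```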
Before applying ProtectedApplication hooks to your application, test each command by using kubectl exec to ensure that the hook behaves as expected:
kubectl exec POD_NAME -- COMMAND
Replace the following:
- POD_NAME: the name of a Pod that belongs to the ProtectedApplication resource.
- COMMAND: the command that you want to run in the container, written as space-separated words rather than as an array, for example /sbin/fsfreeze -f /var/log/nginx.
Selecting a subset of volumes to back up
Sometimes, applications write to volumes that are not interesting to restore (for example, certain log or scratch volumes). You can suppress the backup of these volumes by using a volume selector.
To use this feature, you must first apply a common label to the volumes you want to back up and leave this label off the volumes you do not want backed up. Then you include a volumeSelector clause in your component definition as follows:
spec:
  ...
  components:
  ...
    strategy:
      ...
      volumeSelector:
        matchLabels:
          label_name: label_value
If you supply a volumeSelector for a component, then only the volumes that have the given label will be backed up and restored. At restore time, any other volumes will be provisioned as empty instead of restored from a volume backup.
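To make the labeling step concrete, here is a minimal sketch with two volumes, only one of which is selected for backup. It assumes that the label is set on the PersistentVolumeClaim objects. The claim name nginx-content, the label backup-volume: "true", and the component name are hypothetical.

```yaml
# Hypothetical names and label. Assumption: volumeSelector matches labels
# set on the PersistentVolumeClaim objects.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-content
  namespace: webserver
  labels:
    app: nginx
    backup-volume: "true"   # hypothetical label matched by volumeSelector
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: standard-rwo
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-logs
  namespace: webserver
  labels:
    app: nginx              # no backup-volume label, so this volume is skipped
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 50Mi
  storageClassName: standard-rwo
---
kind: ProtectedApplication
apiVersion: gkebackup.gke.io/v1
metadata:
  name: nginx
  namespace: webserver
spec:
  resourceSelection:
    type: Selector
    selector:
      matchLabels:
        app: nginx
  components:
  - name: nginx-app
    resourceKind: Deployment
    resourceNames: ["nginx-deployment"]
    strategy:
      type: BackupAllRestoreAll
      backupAllRestoreAll:
        volumeSelector:
          matchLabels:
            backup-volume: "true"
```

With this configuration, nginx-logs would be provisioned as an empty volume at restore time, while nginx-content would be restored from its volume backup.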
Strategy: BackupAllRestoreAll
This is the simplest strategy and backs up all the component's volumes at backup time and restores them all from their volume backups at restore time. It is your best choice when your application has no replication between Pods.
This strategy supports the following parameters:
- backupPreHooks - (optional) an ordered list of hooks that are executed right before volumes are backed up. These commands are executed on all Pods in the component.
- backupPostHooks - (optional) an ordered list of hooks that are executed after volume backups have reached the UPLOADING phase. These commands are executed on all Pods in the component.
- volumeSelector - (optional) logic for matching a subset of volumes to back up.
This example creates a ProtectedApplication resource that quiesces the file system before backing up the logs volume and unquiesces after the backup:
kind: ProtectedApplication
apiVersion: gkebackup.gke.io/v1
metadata:
  name: nginx
  namespace: sales
spec:
  resourceSelection:
    type: Selector
    selector:
      matchLabels:
        app: nginx
  components:
  - name: nginx-app
    resourceKind: Deployment
    resourceNames: ["nginx"]
    strategy:
      type: BackupAllRestoreAll
      backupAllRestoreAll:
        backupPreHooks:
        - name: fsfreeze
          container: nginx
          command: [ /sbin/fsfreeze, -f, /var/log/nginx ]
        backupPostHooks:
        - name: fsunfreeze
          container: nginx
          command: [ /sbin/fsfreeze, -u, /var/log/nginx ]
Strategy: BackupOneRestoreAll
This strategy backs up one copy of a selected Pod. This single copy is the source for restoring all Pods during a restore. This method can help reduce storage cost and backup time. This strategy works in a high availability configuration when a component is deployed with one primary PersistentVolumeClaim and multiple secondary PersistentVolumeClaims.
This strategy supports the following parameters:
- backupTargetName - (required) specifies the Deployment or StatefulSet that you want to use to back up the data. The best Pod to back up is automatically selected. In a high availability configuration, we recommend that you set this to one of your application replicas.
- backupPreHooks - (optional) an ordered list of hooks that are executed right before volumes are backed up. These commands are executed only on the selected backup Pod.
- backupPostHooks - (optional) an ordered list of hooks that are executed after volume backups have reached the UPLOADING phase. These commands are executed only on the selected backup Pod.
- volumeSelector - (optional) logic for matching a subset of volumes to back up.
If a component is configured with multiple Deployments or StatefulSets, all resources must have the same PersistentVolume structure, meaning they must follow these rules:
- The number of PersistentVolumeClaims used by all Deployments or StatefulSets must be the same.
- The purpose of PersistentVolumeClaims in the same index must be the same. For StatefulSets, the index is defined in the volumeClaimTemplate. For Deployments, the index is defined in Volumes and any non-persistent volumes are skipped.
- If the application component consists of Deployments, each Deployment must have exactly one replica.
Given these considerations, multiple volume sets can be selected for backup, but only one volume from each volume set will be selected.
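The structure rules above can be illustrated with a sketch of two StatefulSets that could share one component. Both declare two claims, and the claim at each index serves the same purpose. The role label, image tag, mount paths, and storage sizes are assumptions for illustration, and the container settings are placeholders rather than a working MariaDB configuration.

```yaml
# Hypothetical StatefulSets for one BackupOneRestoreAll component.
# Both declare two claims, and the claim at each index has the same purpose:
# index 0 holds database files, index 1 holds logs.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mariadb-primary
  namespace: mariadb
  labels:
    app: mariadb
spec:
  serviceName: mariadb-primary
  replicas: 1
  selector:
    matchLabels:
      app: mariadb
      role: primary
  template:
    metadata:
      labels:
        app: mariadb
        role: primary
    spec:
      containers:
      - name: mariadb
        image: mariadb:10.11     # assumed image tag
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
        - name: logs
          mountPath: /var/log/mysql
  volumeClaimTemplates:
  - metadata:
      name: data                 # index 0: database files
    spec:
      accessModes: [ReadWriteOnce]
      resources:
        requests:
          storage: 10Gi
  - metadata:
      name: logs                 # index 1: logs
    spec:
      accessModes: [ReadWriteOnce]
      resources:
        requests:
          storage: 1Gi
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mariadb-secondary
  namespace: mariadb
  labels:
    app: mariadb
spec:
  serviceName: mariadb-secondary
  replicas: 2
  selector:
    matchLabels:
      app: mariadb
      role: secondary
  template:
    metadata:
      labels:
        app: mariadb
        role: secondary
    spec:
      containers:
      - name: mariadb
        image: mariadb:10.11     # assumed image tag
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
        - name: logs
          mountPath: /var/log/mysql
  volumeClaimTemplates:
  - metadata:
      name: data                 # index 0: same purpose as in mariadb-primary
    spec:
      accessModes: [ReadWriteOnce]
      resources:
        requests:
          storage: 10Gi
  - metadata:
      name: logs                 # index 1: same purpose as in mariadb-primary
    spec:
      accessModes: [ReadWriteOnce]
      resources:
        requests:
          storage: 1Gi
```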
This example, assuming an architecture of one primary StatefulSet and one secondary StatefulSet, backs up the volumes of one Pod in the secondary StatefulSet and then uses that backup to restore all other volumes:
kind: ProtectedApplication
apiVersion: gkebackup.gke.io/v1
metadata:
  name: mariadb
  namespace: mariadb
spec:
  resourceSelection:
    type: Selector
    selector:
      matchLabels:
        app: mariadb
  components:
  - name: mariadb
    resourceKind: StatefulSet
    resourceNames: ["mariadb-primary", "mariadb-secondary"]
    strategy:
      type: BackupOneRestoreAll
      backupOneRestoreAll:
        backupTargetName: mariadb-secondary
        backupPreHooks:
        - name: quiesce
          container: mariadb
          command: [...]
        backupPostHooks:
        - name: unquiesce
          container: mariadb
          command: [...]
Strategy: DumpAndLoad
This strategy uses a dedicated volume for backup and restore processes and requires a dedicated PersistentVolumeClaim attached to a component that stores dump data.
This strategy supports the following parameters:
- dumpTarget - (required) specifies the Deployment or StatefulSet that you want to use to back up the data. The best Pod to back up is automatically selected. In a high availability configuration, we recommend that you set this to one of your application replicas.
- loadTarget - (required) specifies the Deployment or StatefulSet that should be used to load the data. The best Pod to load the data into is automatically selected. The load target does not have to be the same as the dump target.
- dumpHooks - (required) an ordered list of hooks that are executed to populate the dedicated backup volume. These commands are only executed on the selected dump Pod.
- backupPostHooks - (optional) an ordered list of hooks that are executed after volume backups have reached the UPLOADING phase. These commands are executed only on the selected dump Pod.
- loadHooks - (required) an ordered list of hooks that are executed to load the data from the restored volume after the application starts. These commands are executed only on the selected load Pod.
- volumeSelector - (required) logic for matching a single volume to back up and restore (the "dump" volume). Though it must match only a single volume, you configure this the same way that you select a subset of volumes to back up in the other strategies.
If the application consists of Deployments, each Deployment must have exactly one replica.
This example, assuming an architecture of one primary StatefulSet and a secondary StatefulSet with dedicated PersistentVolumeClaims for both primary and secondary StatefulSets, shows a DumpAndLoad strategy:
kind: ProtectedApplication
apiVersion: gkebackup.gke.io/v1
metadata:
  name: mariadb
  namespace: mariadb
spec:
  resourceSelection:
    type: Selector
    selector:
      matchLabels:
        app: mariadb
  components:
  - name: mariadb-dump
    resourceKind: StatefulSet
    resourceNames: ["mariadb-primary", "mariadb-secondary"]
    strategy:
      type: DumpAndLoad
      dumpAndLoad:
        loadTarget: mariadb-primary
        dumpTarget: mariadb-secondary
        dumpHooks:
        - name: db_dump
          container: mariadb
          command:
          - bash
          - "-c"
          - |
            mysqldump -u root --all-databases > /backup/mysql_backup.dump
        # The load hook reads the same file that the dump hook writes.
        loadHooks:
        - name: db_load
          container: mariadb
          command:
          - bash
          - "-c"
          - |
            mysql -u root < /backup/mysql_backup.dump
        volumeSelector:
          matchLabels:
            gkebackup.gke.io/backup: dedicated-volume
Check if a ProtectedApplication is ready for backup
You can check whether a ProtectedApplication is ready for a backup by running the following command:
kubectl describe protectedapplication APPLICATION_NAME
Replace APPLICATION_NAME with the name of your application.
If the application is ready, its description shows a Ready To Backup status of true, as in this example:
% kubectl describe protectedapplication nginx
Name: nginx
Namespace: default
API Version: gkebackup.gke.io/v1
Kind: ProtectedApplication
Metadata:
  UID: 90c04a86-9dcd-48f2-abbf-5d84f979b2c2
Spec:
  Components:
    Name: nginx
    Resource Kind: Deployment
    Resource Names:
      nginx
    Strategy:
      Backup All Restore All:
        Backup Pre Hooks:
          Command:
            /sbin/fsfreeze
            -f
            /var/log/nginx
          Container: nginx
          Name: freeze
        Backup Post Hooks:
          Command:
            /sbin/fsfreeze
            -u
            /var/log/nginx
          Container: nginx
          Name: unfreeze
      Type: BackupAllRestoreAll
  Resource Selection:
    Selector:
      Match Labels:
        app: nginx
    Type: Selector
Status:
  Ready To Backup: true
Events: <none>
What's next
- Learn more about planning a set of backups.