Recovering from a disaster

How to recover in the case of a disaster

Previous steps

Task CRD

After defining a LiveSynchronization, a Task resource is created in the destination cluster. The operator processes the spec.config.replication.resources[*].recoveryProcess parameter to define the steps required to activate the dormant applications. Take the following definition as an example:

apiVersion: automation.astronetes.io/v1alpha1
kind: Task
metadata:
  name: set-test-label
spec:
  plugin: kubernetes-objects-transformation
  config:
    resources:
      - identifier:
          group: apps
          version: v1
          resources: deployments
        patch:
          operations:
            - op: replace
              path: '/labels/test'
              value: 'ok'

With this Task, the label with key test on every Deployment is replaced with the value ok. This Task originates from the following LiveSynchronization object:

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: set-test-label
spec:
  plugin: kubernetes-objects-to-kubernetes
  suspend: false
  config:
    sourceName: source
    destinationName: destination
    observability:
      enabled: false
    replication:
      resources:
        - group: apps
          version: v1
          resource: deployments
          recoveryProcess:
            fromPatch:
            - op: replace
              path: '/labels/test'
              value: 'ok'

The Task object should not be tampered with. It is managed by its associated LiveSynchronization.
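
As a rough illustration of the patch in the set-test-label example, the fragment below shows how a Deployment's labels might look before and after recovery. The Deployment name and the initial label value are hypothetical, and the sketch assumes that the path /labels/test is resolved relative to the object's metadata, which the example suggests but does not state.

```yaml
# Hypothetical Deployment metadata in the destination cluster before recovery
metadata:
  name: example-app        # hypothetical name
  labels:
    test: 'pending'        # hypothetical initial value
---
# The same metadata after the Task has run (op: replace on '/labels/test')
metadata:
  name: example-app
  labels:
    test: 'ok'
```

If the operations follow JSON Patch semantics (an assumption), a replace only succeeds when the label already exists on the object.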

On the day of a disaster

Recovering from a disaster requires deploying one TaskRun resource for each applicable Task in order to recover the system and applications. The following example is a TaskRun that executes the Task defined in the previous section:

apiVersion: automation.astronetes.io/v1alpha1
kind: TaskRun
metadata:
  name: restore-apps
spec:
  taskName: set-test-label
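
Because each Task needs its own TaskRun, a recovery that involves several Tasks could be written as a single manifest with multiple documents. This is a minimal sketch, assuming a second Task named scale-up-apps exists; that Task name and both TaskRun names are hypothetical.

```yaml
# TaskRun for the Task shown in the previous section
apiVersion: automation.astronetes.io/v1alpha1
kind: TaskRun
metadata:
  name: restore-apps-labels   # hypothetical name
spec:
  taskName: set-test-label
---
# TaskRun for a second, hypothetical Task
apiVersion: automation.astronetes.io/v1alpha1
kind: TaskRun
metadata:
  name: restore-apps-scale    # hypothetical name
spec:
  taskName: scale-up-apps     # hypothetical Task, not defined in this document
```

Keeping the TaskRuns in one file makes it easier to apply them together on the day of a disaster, while each Task still gets its own TaskRun.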