Configuring a recovery plan

How to protect the platform resources from a disaster

Introduction

A RecoveryPlan resource defines the set of Kubernetes resources to replicate or synchronize between the source cluster and the destination cluster.

Requirements

Process

1. Configure the recovery plan

Create the recoveryplan.yaml file according to your requirements. In this example, the goal is to synchronize deployments whose disaster-recovery label is set to enabled. Once replication completes, no pods should be created in the destination cluster; after a RecoveryExecutionJob, the deployment should launch active pods again.

Let’s dissect the following YAML:

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: applications
spec:
  suspend: true
  forceNamespaceCreation: true
  sourceClusterRef:
    name: source
    namespace: dr-maqueta
  destinationClusterRef:
    name: destination
    namespace: dr-maqueta
  resources:
    - group: apps
      version: v1
      resource: deployments
      transformation:
        patch:
          - op: replace
            path: /spec/replicas
            value: 0
      filters:
        selector:
          matchLabels:
            disaster-recovery: enabled
      recoveryProcess:
        fromPatch:
          - op: replace
            path: /spec/replicas
            value: 1

spec.sourceClusterRef and spec.destinationClusterRef refer to the name and namespace of the ManagedCluster resources for the corresponding clusters.

spec.resources lists the resources to synchronize. A single RecoveryPlan can cover multiple resource types or groups, although this example only manages deployments.

The resource type is defined in spec.resources[0].resource and the filters in spec.resources[0].filters. In this case, the RecoveryPlan selects deployments by the value of their disaster-recovery label.

spec.resources[0].transformation and spec.resources[0].recoveryProcess define the actions applied when each resource is synchronized and when it goes through the recovery process, respectively. In this case, each deployment is replicated with its replicas set to 0 in the destination cluster and is set back to 1 after a successful RecoveryExecutionJob. The resources in the source cluster are always left intact.

2. Suspending and resuming a recovery plan

You may have noticed the spec.suspend parameter. In this example it is set to true, which marks the recovery plan as inactive. An inactive (suspended) recovery plan does not replicate new or existing resources until it is resumed. To resume a recovery plan, set spec.suspend to false in the YAML file and apply the change. Alternatively, patch the resource with kubectl, which does not require the original YAML file:

kubectl patch recoveryplan <recovery_plan_name> -p '{"spec":{"suspend":false}}' --type=merge
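The --type=merge flag makes kubectl send a JSON Merge Patch (RFC 7386), so only the fields present in the patch are changed. The following Python sketch illustrates that merge rule on a trimmed copy of the example plan. The merge_patch helper is illustrative only; it is not part of kubectl or of the operator.

```python
import copy

def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7386) and return the patched copy."""
    if not isinstance(patch, dict):
        return copy.deepcopy(patch)
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # a null value deletes the field
        else:
            result[key] = merge_patch(result.get(key), value)
    return result

# Trimmed version of the example RecoveryPlan (illustrative subset of fields).
plan = {"spec": {"suspend": True, "forceNamespaceCreation": True}}
resumed = merge_patch(plan, {"spec": {"suspend": False}})
print(resumed)  # {'spec': {'suspend': False, 'forceNamespaceCreation': True}}
```

Fields that the patch does not mention, such as forceNamespaceCreation, keep their current values.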

3. Deploy the recovery plan

The recovery plan can be deployed like any other Kubernetes resource:

kubectl -n <namespace_name> apply -f recoveryplan.yaml

4. Identify the RecoveryExecutionPlan

Once you have deployed the RecoveryPlan in the management cluster, you should find the RecoveryExecutionPlan that the operator created in the destination cluster:

kubectl -n <namespace_name> get recoveryexecutionplan

Additional steps

For more examples, take a look at our samples.

Modifying synchronized resources

Depending on the use case and the chosen Disaster Recovery solution, it can be useful for the resources synchronized to the destination cluster to differ from the originals. In a warm standby scenario, for example, certain objects such as Deployments or CronJobs do not need to be actively running until there is a disaster, which saves infrastructure resources. The standby destination cluster can run with minimal computing power and autoscale as soon as the recovery process starts, reducing overhead costs.

While a resource is being synchronized into the destination cluster, its properties can be transformed to fit the organization's needs. Then, if and when a disaster occurs, the resource can be restored either to its original state or to an alternative one through the configured recovery process.

Filters

Filters select exactly the objects to synchronize. They are set in the spec.resources[x].filters parameter.

Name selector

The nameSelector filters resources of the indicated version and type by name. The following example selects only the ConfigMaps whose names match the regular expression config.*:

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: test-name-selector
  namespace: dr-config
spec:
  suspend: false
  sourceClusterRef:
    name: source
    namespace: dr-config
  destinationClusterRef:
    name: destination
    namespace: dr-config
  forceNamespaceCreation: true
  resources:
    - version: v1
      resource: configmaps
      filters:
        nameSelector:
          regex:
            - "config.*"
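To see which names a pattern such as config.* would select, here is a small Python sketch. The ConfigMap names and the select_names helper are hypothetical, and unanchored matching is an assumption about how the operator evaluates the expression, which this page does not confirm. If in doubt, anchor the pattern explicitly with ^ and $.

```python
import re

def select_names(names, patterns):
    """Keep the names matched by at least one regular expression."""
    compiled = [re.compile(p) for p in patterns]
    return [n for n in names if any(c.search(n) for c in compiled)]

# Hypothetical ConfigMap names in the source cluster.
names = ["config-app", "config-db", "app-config", "kube-root-ca.crt"]

print(select_names(names, ["config.*"]))   # ['config-app', 'config-db', 'app-config']
print(select_names(names, ["^config.*"]))  # ['config-app', 'config-db']
```

With unanchored matching, config.* also selects app-config; the anchored form only selects names that start with config.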

This selector can also be used negatively with excludeRegex. The following example excludes every ConfigMap whose name ends in .test:

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: test-name-selector
  namespace: dr-config
spec:
  suspend: false
  sourceClusterRef:
    name: source
    namespace: dr-config
  destinationClusterRef:
    name: destination
    namespace: dr-config
  forceNamespaceCreation: true
  resources:
    - version: v1
      resource: configmaps
      filters:
        nameSelector:
          excludeRegex:
          - '.*\.test$'
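Because excludeRegex takes regular expressions, "ends in .test" is written as a pattern such as .*\.test$; a glob-style *.test is rejected by Python's re module and by most other regex engines. The sketch below illustrates both points. The ConfigMap names and the exclude_names helper are hypothetical, and unanchored matching is an assumption about the operator rather than documented behavior.

```python
import re

def exclude_names(names, patterns):
    """Drop the names matched by any of the regular expressions."""
    compiled = [re.compile(p) for p in patterns]
    return [n for n in names if not any(c.search(n) for c in compiled)]

# Hypothetical ConfigMap names in the source cluster.
names = ["settings", "settings.test", "db.test", "testing"]
print(exclude_names(names, [r".*\.test$"]))  # ['settings', 'testing']

# A glob-style pattern is not a valid regular expression.
try:
    re.compile("*.test")
    glob_is_valid = True
except re.error:
    glob_is_valid = False
print(glob_is_valid)  # False
```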

Namespace selector

The namespaceSelector filters resources by the namespace they belong to. This selector is useful for synchronizing entire applications that live in a single namespace. The following example selects every deployment placed in a namespace with the label disaster-recovery: enabled:

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: applications
spec:
  suspend: true
  forceNamespaceCreation: true
  sourceClusterRef:
    name: source
    namespace: dr-maqueta
  destinationClusterRef:
    name: destination
    namespace: dr-maqueta
  resources:
    - group: apps
      version: v1
      resource: deployments
      filters:
        namespaceSelector:
          matchLabels:
            disaster-recovery: enabled
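The following Python sketch models the effect of selecting by namespace labels: a resource is kept when its namespace carries every requested label, regardless of the labels on the resource itself. The namespaces, deployments and the select_by_namespace_labels helper are hypothetical and only illustrate the selection rule described above.

```python
def select_by_namespace_labels(resources, namespaces, match_labels):
    """Keep resources whose namespace carries every label in match_labels."""
    def namespace_matches(ns_name):
        labels = namespaces.get(ns_name, {})
        return all(labels.get(k) == v for k, v in match_labels.items())
    return [r for r in resources if namespace_matches(r["namespace"])]

# Hypothetical namespaces (name -> labels) and deployments.
namespaces = {
    "shop": {"disaster-recovery": "enabled"},
    "sandbox": {},
}
deployments = [
    {"name": "frontend", "namespace": "shop"},
    {"name": "backend", "namespace": "shop"},
    {"name": "experiment", "namespace": "sandbox"},
]

selected = select_by_namespace_labels(
    deployments, namespaces, {"disaster-recovery": "enabled"}
)
print([d["name"] for d in selected])  # ['frontend', 'backend']
```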

Transformations

Transformations are set in the spec.resources[x].transformation parameter and are managed through patches.

Patch modifications alter the underlying object definition using the same mechanism as kubectl patch. Patches use JSON Patch syntax; the allowed operations are replace, add and remove. Patches are defined in the spec.resources[x].transformation.patch list, which accepts an arbitrary number of modifications.

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: recovery-plan
spec:
  ...
  resources:
    - ...
      transformation:
        patch:
          - op: replace
            path: /spec/replicas
            value: 0
          - op: remove
            path: /spec/strategy
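As a rough model of what these operations do to an object, the Python sketch below applies replace, add and remove to a copy of a trimmed Deployment. It is a simplification, not the operator's implementation: the apply_patch helper and the sample Deployment are hypothetical, and the helper only handles paths through nested mappings (no list indices or escaped characters).

```python
import copy

def apply_patch(obj, operations):
    """Apply replace/add/remove operations to a copy of obj."""
    result = copy.deepcopy(obj)
    for op in operations:
        # Split "/spec/replicas" into parents ["spec"] and leaf "replicas".
        *parents, leaf = op["path"].split("/")[1:]
        node = result
        for key in parents:
            node = node[key]
        if op["op"] == "remove":
            del node[leaf]
        elif op["op"] == "replace":
            if leaf not in node:
                raise KeyError(f"cannot replace missing field {op['path']}")
            node[leaf] = op["value"]
        elif op["op"] == "add":
            node[leaf] = op["value"]
        else:
            raise ValueError(f"unsupported operation {op['op']}")
    return result

# Hypothetical, trimmed Deployment from the source cluster.
deployment = {"spec": {"replicas": 3, "strategy": {"type": "RollingUpdate"}}}
patch = [
    {"op": "replace", "path": "/spec/replicas", "value": 0},
    {"op": "remove", "path": "/spec/strategy"},
]

print(apply_patch(deployment, patch))  # {'spec': {'replicas': 0}}
print(deployment["spec"]["replicas"])  # 3, the source object is untouched
```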

RecoveryProcess

The RecoveryProcess of a RecoveryPlan is executed when a RecoveryExecutionJob targeting the RecoveryExecutionPlan that originated from the RecoveryPlan is deployed. A resource can be restored either from the original definition stored in a bucket or by applying custom patches, as with Transformations.

To restore from the original data, read the Recovering from a Bucket section. This option disregards any transformations that were applied and replaces the parameters with those of the source cluster.

Recovery patches are defined in the spec.resources[x].recoveryProcess.fromPatch list, which accepts an arbitrary number of modifications. They act on the current state of the resource in the destination cluster, so, unlike recovering from the original, they build on the transformations applied during synchronization. As with transformations, the allowed operations are replace, add and remove.

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: recovery-plan
spec:
  ...
  resources:
    - ...
      recoveryProcess:
        fromPatch:
          - op: replace
            path: /spec/replicas
            value: 1
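The difference between the two recovery options can be sketched in Python. This is an illustrative model under stated assumptions, not the operator's code: the replace_field helper and the sample Deployment with 3 replicas are hypothetical.

```python
import copy

def replace_field(obj, path, value):
    """Return a copy of obj with the field at a '/'-separated path replaced."""
    result = copy.deepcopy(obj)
    *parents, leaf = path.split("/")[1:]
    node = result
    for key in parents:
        node = node[key]
    node[leaf] = value
    return result

# Hypothetical Deployment as it exists in the source cluster.
source = {"spec": {"replicas": 3}}

# Synchronization: the transformation patch scales the copy down to 0.
standby = replace_field(source, "/spec/replicas", 0)

# Recovery with fromPatch: applied on top of the standby copy.
recovered_from_patch = replace_field(standby, "/spec/replicas", 1)

# Recovery from the original definition: the source values come back.
recovered_from_original = copy.deepcopy(source)

print(recovered_from_patch["spec"]["replicas"])     # 1
print(recovered_from_original["spec"]["replicas"])  # 3
```

With fromPatch, the recovered resource ends up with the value given in the patch; restoring from the original brings back whatever the source cluster had.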