Post-installation configuration

Steps to configure the Disaster Recovery solution

1 - Setting up a managed cluster

Granting access to the source and destination clusters

Introduction

Connection to both the source and destination clusters is set up using the ManagedCluster resource. Credentials are stored in Kubernetes Secrets, which each ManagedCluster references to connect to its cluster.

Requirements

  • The kubeconfig file with read-only access to the source cluster
  • The kubeconfig file with cluster-admin access to the destination cluster
  • The Secret provided by AstroKube to access the Image Registry

Process

1. Prepare

Create Namespace

Create the namespace to configure the recovery process:

kubectl create namespace <namespace_name>

Set up registry credentials

Create the Secret that stores the credentials to the AstroKube image registry:

kubectl -n <namespace_name> create -f pull-secret.yaml

2. Configure the source cluster

Create secret

Get the kubeconfig file that can be used to access the cluster, and save it as source-kubeconfig.yaml.

Then create the Secret with the following command:

kubectl -n <namespace_name> create secret generic source --from-file=kubeconfig.yaml=source-kubeconfig.yaml

Create resource

Define the ManagedCluster resource with the following YAML, and save it as managedcluster.yaml:

apiVersion: dr.astronetes.io/v1alpha1
kind: ManagedCluster
metadata:
  name: source
  namespace: <namespace_name>
spec:
  secretRef:
    name: source
    namespace: <namespace_name>

Deploy the resource with the following command:

kubectl create -f managedcluster.yaml

3. Configure the destination cluster

Create secret

Get the kubeconfig file that can be used to access the cluster, and save it as destination-kubeconfig.yaml.

Then create the Secret with the following command:

kubectl -n <namespace_name> create secret generic destination --from-file=kubeconfig.yaml=destination-kubeconfig.yaml

Create resource

Define the ManagedCluster resource with the following YAML, and save it as managedcluster.yaml:

apiVersion: dr.astronetes.io/v1alpha1
kind: ManagedCluster
metadata:
  name: destination
  namespace: <namespace_name>
spec:
  secretRef:
    name: destination
    namespace: <namespace_name>

Deploy the resource with the following command:

kubectl create -f managedcluster.yaml
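
To verify that both clusters were registered, you can list the ManagedCluster resources in the namespace. This is a quick sanity check; it assumes the operator's CRD exposes the managedcluster resource name used in the examples above:

kubectl -n <namespace_name> get managedclusters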

2 - Configuring a recovery plan

How to protect the platform resources from a disaster

Introduction

A RecoveryPlan resource defines a set of Kubernetes resources to replicate or synchronize between the source cluster and the destination cluster.

Process

1. Configure the recovery plan

Create the recoveryplan.yaml file according to your requirements. In this example, the goal is to synchronize deployments that have the disaster-recovery label set to enabled. Additionally, once a deployment's replication is completed, no pods should be created in the destination cluster, and after a RecoveryExecutionJob the deployment should launch active pods again.

Let’s dissect the following YAML:

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: applications
spec:
  suspend: true
  forceNamespaceCreation: true
  sourceClusterRef:
    name: source
    namespace: dr-maqueta
  destinationClusterRef:
    name: destination
    namespace: dr-maqueta
  resources:
    - group: apps
      version: v1
      resource: deployments
      transformation:
        patch:
          - op: replace
            path: /spec/replicas
            value: 0
      filters:
        selector:
          matchLabels:
            disaster-recovery: enabled
      recoveryProcess:
        fromPatch:
          - op: replace
            path: /spec/replicas
            value: 1

spec.sourceClusterRef and spec.destinationClusterRef refer to the name and namespace of the ManagedCluster resources for the corresponding clusters.

spec.resources is the list of resource sets to replicate. A single RecoveryPlan can cover multiple types or groups of resources, although this example only manages deployments.

The type of the resource is defined at spec.resources[0].resource, and the filters at spec.resources[0].filters. In this case, the RecoveryPlan matches on the value of the disaster-recovery label.

spec.resources[0].transformation and spec.resources[0].recoveryProcess establish the actions to take after each resource is synchronized and after it is affected by the recovery process, respectively. In this case, each deployment is replicated with its replicas set to 0 in the destination cluster, and gets them back to 1 after a successful RecoveryExecutionJob. The resource parameters are always left intact in the source cluster.
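
As a quick sanity check of this behavior, you can compare the replica count of a synchronized deployment on both clusters. The commands below are illustrative and assume you still have the kubeconfig files from the managed cluster setup; the deployment name and namespace are placeholders:

kubectl --kubeconfig source-kubeconfig.yaml -n <app_namespace> get deployment <deployment_name> -o jsonpath='{.spec.replicas}'
kubectl --kubeconfig destination-kubeconfig.yaml -n <app_namespace> get deployment <deployment_name> -o jsonpath='{.spec.replicas}'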

2. Suspending and resuming a recovery plan

A keen eye might have noticed the spec.suspend parameter. In this example it is set to true to indicate that the recovery plan is inactive. An inactive or suspended recovery plan will not replicate new or existing resources until it is resumed. A recovery plan can be resumed by setting spec.suspend to false and applying the changed YAML. Alternatively, a patch with kubectl works as well and does not require the original YAML file:

kubectl -n <namespace_name> patch recoveryplan <recovery_plan_name> -p '{"spec":{"suspend":false}}' --type=merge
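
The same mechanism works in the opposite direction: to suspend an active recovery plan, set spec.suspend back to true with an analogous patch:

kubectl -n <namespace_name> patch recoveryplan <recovery_plan_name> -p '{"spec":{"suspend":true}}' --type=merge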

3. Deploy the recovery plan

The recovery plan can be deployed as any other Kubernetes resource:

kubectl -n <namespace_name> apply -f recoveryplan.yaml

4. Identify the RecoveryExecutionPlan

Once you have deployed the RecoveryPlan in the management cluster, you should find the RecoveryExecutionPlan created by the operator in the destination cluster:

kubectl -n <namespace_name> get recoveryexecutionplan

Additional steps

For more examples, take a look at our samples.

Modifying synchronized resources

Depending on the use case and the chosen Disaster Recovery solution, it can be convenient for resources synchronized in the destination cluster to differ from the original copy. Taking a warm standby scenario as an example: to optimize infrastructure resources, certain objects such as Deployments or CronJobs do not need to be actively running until there is a disaster. The standby destination cluster can run with minimal computing power and autoscale as soon as the recovery process starts, reducing the required overhead expenditure.

While a resource is being synchronized into the destination cluster, its properties can be transformed to adapt them to the organization's needs. Then, if and when a disaster occurs, the resource characteristics can be restored to either the original state or an alternative one with the established recovery process.
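
For instance, in the warm standby scenario described above, a sketch like the following could keep CronJobs suspended in the destination cluster until a recovery is executed. The names are hypothetical, and the batch/v1 cronjobs entry simply follows the same schema as the deployments example:

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: warm-standby-cronjobs
spec:
  suspend: false
  forceNamespaceCreation: true
  sourceClusterRef:
    name: source
    namespace: <namespace_name>
  destinationClusterRef:
    name: destination
    namespace: <namespace_name>
  resources:
    - group: batch
      version: v1
      resource: cronjobs
      transformation:
        patch:
          - op: replace
            path: /spec/suspend
            value: true
      recoveryProcess:
        fromPatch:
          - op: replace
            path: /spec/suspend
            value: false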

Filters

Filters are useful to select only the exact objects to synchronize. They are set in the spec.resources[x].filters parameter.

Name selector

The nameSelector filters by the name of the resources of the indicated version and type. The following example selects only the ConfigMaps whose names match the regular expression config.*:

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: test-name-selector
  namespace: dr-config
spec:
  suspend: false
  sourceClusterRef:
    name: source
    namespace: dr-config
  destinationClusterRef:
    name: destination
    namespace: dr-config
  forceNamespaceCreation: true
  resources:
    - version: v1
      resource: configmaps
      filters:
        nameSelector:
          regex:
            - "config.*"

This selector can also be used negatively with excludeRegex. The following example excludes every ConfigMap whose name ends in .test:

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: test-name-selector
  namespace: dr-config
spec:
  suspend: false
  sourceClusterRef:
    name: source
    namespace: dr-config
  destinationClusterRef:
    name: destination
    namespace: dr-config
  forceNamespaceCreation: true
  resources:
    - version: v1
      resource: configmaps
      filters:
        nameSelector:
          excludeRegex:
          - "*.test"

Namespace selector

The namespaceSelector filters resources taking into consideration the namespace they belong to. This selector is useful to synchronize entire applications when they are contained in a namespace. The following example selects every deployment placed in a namespace with the label disaster-recovery: enabled:

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: applications
spec:
  suspend: true
  forceNamespaceCreation: true
  sourceClusterRef:
    name: source
    namespace: dr-maqueta
  destinationClusterRef:
    name: destination
    namespace: dr-maqueta
  resources:
    - group: apps
      version: v1
      resource: deployments
      filters:
        namespaceSelector:
          matchLabels:
            disaster-recovery: enabled

Transformations

Transformations are set in the spec.resources[x].transformation parameter and are managed through patches.

Patch modifications alter the underlying object definition using the same mechanism as kubectl patch. As with JSON Patch, the allowed operations are replace, add and remove. Patches are defined in the spec.resources[x].transformation.patch list, which admits an arbitrary number of modifications.

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: recovery-plan
spec:
  ...
  resources:
    - ...
      transformation:
        patch:
          - op: replace
            path: /spec/replicas
            value: 0
          - op: remove
            path: /spec/strategy

RecoveryProcess

The recoveryProcess of a RecoveryPlan is executed when a RecoveryExecutionJob targeting the RecoveryExecutionPlan originated from the RecoveryPlan is deployed. A resource can be restored either from the original definition stored in a bucket or by performing custom patches, as with transformations.

To restore from the original data, read the Recovering from a Bucket section. This option disregards any transformations that were performed and replaces the parameters with those of the source cluster.

Patching when recovering is configured in the spec.resources[x].recoveryProcess.fromPatch list, which admits an arbitrary number of modifications. It acts on the current state of the resource in the destination cluster, meaning that, unlike recovering from the original, it takes into consideration the transformations performed when the resource was synchronized. As with JSON Patch, the allowed operations are replace, add and remove.

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: recovery-plan
spec:
  ...
  resources:
    - ...
      recoveryProcess:
        fromPatch:
          - op: replace
            path: /spec/replicas
            value: 1

3 - Recovering from a Bucket

How to save objects and recover them using object storage.

Introduction

A RecoveryBucket resource points to an Object Storage service that will be used to restore original objects in a RecoveryPlan.

Object Storage stores data in an unstructured format in which each entry represents an object. Unlike other storage solutions, there is no relationship or hierarchy between the data being stored. Organizations can access their files as easily as with traditional hierarchical or tiered storage. Object Storage benefits include virtually infinite scalability and high availability of data.

Many Cloud Providers include their own flavor of Object Storage, and most tools and SDKs can interact with any of them as they share the same interface. Disaster Recovery Operator officially supports the following Object Storage solutions:

  • AWS Simple Storage Service (S3)
  • Google Cloud Storage

Disaster Recovery Operator can support multiple buckets in different providers as each one is managed independently.

Contents stored in a bucket

A bucket is assigned to a RecoveryPlan spec.resources item, and the same bucket can be assigned to multiple resources. It stores every object synchronized into the destination cluster, with some internal control annotations added. In the case of a disaster, resources with recoveryProcess.fromOriginal.enabled set to true will be restored using the bucket configuration.

The path of a stored object is as follows: <recoveryplan_namespace>/<recoveryplan_name>/<object_group-version-resource>/<object_namespace>.<object_name>.
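
For example, a Deployment named frontend in the web namespace, covered by a RecoveryPlan named applications in the dr namespace, would be stored at a path like the following (illustrative values):

dr/applications/apps-v1-deployments/web.frontend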

Requirements

  • At least one instance of an Object Storage service in one of the supported Cloud Providers. This is commonly known as a bucket and will be referred to as such in the documentation.
  • At least one pair of accessKeyID and secretAccessKey that grants both read and write permissions over all objects of the bucket. Refer to the chosen cloud provider's documentation to learn how to create and extract them. It is recommended that each access key pair has access to only a single bucket.

Preparing and setting the bucket

Create the secret

Store the following file and apply it into the cluster, substituting the template parameters with real values.

apiVersion: v1
kind: Secret
metadata:
  name: bucket
  namespace: <namespace>
stringData:
  s3.auth.yaml: |
    accessKeyID: <access_key_id>
    secretAccessKey: <secret_access_key>
    useSSL: true    
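
For example, if the file is saved as bucket-secret.yaml (the filename is arbitrary), it can be applied with:

kubectl apply -f bucket-secret.yaml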

Create the RecoveryBucket

Store the following file and apply it into the cluster, substituting the template parameters with real values. The following example targets Google Cloud Storage:

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryBucket
metadata:
  name: bucket
  namespace: <namespace>
spec:
  endpoint: storage.googleapis.com
  bucketName: <bucket_name>
  secretRef:
    name: bucket
    namespace: <namespace>

For AWS S3, the region in the endpoint must match the region of the target bucket. It has to be set explicitly, as AWS does not infer the bucket's region, e.g. s3.us-east-1.amazonaws.com for a bucket in North Virginia (us-east-1):

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryBucket
metadata:
  name: bucket
  namespace: <namespace>
spec:
  endpoint: s3.<aws_region>.amazonaws.com
  bucketName: <bucket_name>
  secretRef:
    name: bucket
    namespace: <namespace>

Create the RecoveryPlan

To get started with Recovery Plans, check the Configuring a recovery plan section. If the Recovery Plan does not set spec.resources[x].recoveryProcess.fromOriginal.enabled to true, where x refers to the index of the desired resource, the contents of the bucket will not be used. For the configuration to work, make sure both the bucket reference and the recovery process are correctly set.

Indicating which bucket to use can be accomplished by configuring spec.bucketRef, as in the following example:

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: applications
spec:
  suspend: false
  forceNamespaceCreation: true
  sourceClusterRef:
    name: source
    namespace: dr
  destinationClusterRef:
    name: destination
    namespace: dr
  resources:
    - group: apps
      version: v1
      resource: deployments
      transformation:
        patch:
          - op: replace
            path: /spec/replicas
            value: 0
      recoveryProcess:
        fromOriginal:
          enabled: true
  bucketRef:
    name: <bucket_name>
    namespace: <bucket_namespace>
    objectPrefix: <object_prefix>
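
Once the plan is active, synchronized objects should appear in the bucket under the path layout described earlier, prefixed by the configured objectPrefix if one is set. As an illustrative check for an S3 bucket, assuming the AWS CLI is configured with credentials for it:

aws s3 ls s3://<bucket_name>/<recoveryplan_namespace>/<recoveryplan_name>/ --recursive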

4 - Resynchronization

Reconciliation of synchronized resources between the source and destination clusters.

Introduction

Under special circumstances, some objects may fail to be synchronized from the source cluster to the destination cluster. To cover this case, Astronetes Disaster Recovery Operator offers a reconciliation process that adds, deletes or updates objects in the destination cluster when their state differs from the source.

Architecture

Reconciliation is performed at the Recovery Plan level. Each Recovery Plan is in charge of its covered objects and of keeping them up to date with the specification. Reconciliation is driven by two components, the EventsListener and the Reconciler. The former is in charge of additive reconciliation and the latter of subtractive reconciliation.

Additive reconciliation

Refers to the reconciliation of missing objects that are present in the source cluster but, for any reason, are not present or not up to date in the destination cluster. The entry point is the EventsListener service, which receives events containing the current state in the source cluster of all the objects covered by the Recovery Plan, with a period of one hour by default.

These resync events are then treated like regular events and follow the synchronization communication flow. If the object does not exist in the destination cluster, the Synchronizer will apply it. In the case of updates, only those with a resourceVersion greater than the existing one for that object will be applied, updating the definition of said object.

Subtractive reconciliation

If an object was deleted in the source cluster but not in the destination cluster, additive reconciliation will not detect it. The source cluster can send events containing the current state of its existing components, but not of those that have ceased to exist.

For that, the Reconciler is activated, with a period of one hour by default. It compares the state of the objects covered by the Recovery Plan in both the source and destination clusters. If a difference is found, it creates a delete event in NATS. This event is then processed as a usual delete event throughout the rest of the communication process.

Modifying the periodic interval

By default, the resynchronization process is launched every hour. This can be changed by modifying the value at spec.reconciliation.Duration in the RecoveryPlan object. The admitted format is %Hh%Mm%Ss, e.g. 1h0m0s for intervals of exactly one hour. Modifying this value updates the schedule for both additive and subtractive reconciliations.

apiVersion: dr.astronetes.io/v1alpha1
kind: RecoveryPlan
metadata:
  name: resync-3h-25m-12s
spec:
  ...
  reconciliation:
    Duration: 3h25m12s
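
As with suspending and resuming, the interval can also be changed without editing the original YAML file, for example to a hypothetical 30-minute interval:

kubectl -n <namespace_name> patch recoveryplan <recovery_plan_name> --type=merge -p '{"spec":{"reconciliation":{"Duration":"30m0s"}}}'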