1 - Intro

What the Resiliency Operator is and why it could be useful for you

Business continuity refers to the ability of a business to overcome potentially disruptive events with minimal impact on its operations. This is no small task: it requires defining and implementing plans, processes and systems, and involves close collaboration and synchronization between multiple actors and departments.

This collection of assets and processes composes the company’s Disaster Recovery. Its goal is to reduce downtime and data loss in the case of a catastrophic, unforeseen situation. Disaster Recovery needs to answer two questions:

  • How much data can we lose? - Recovery Point Objective (RPO)
  • How long can we take to recover the system? - Recovery Time Objective (RTO)

Resiliency Operator provides a solution to improve the business continuity of Cloud Native platforms by offering a tool that improves resiliency, is transparent in day-to-day operations and has minimal impact on technical maintenance.

Depending on the organisation, system and project necessities, resiliency can be improved with a combination of real-time synchronization across two or more instances and a backup-and-restore strategy. Resiliency Operator implements both methods of data replication across multiple technologies and allows flexibility in where and how the information is stored.

Business Continuity plans often include complex tests to validate backup contents and to confirm that they can be restored at any time. To help with these requirements, Resiliency Operator includes monitoring so that operational teams can make sure that the data is being correctly synchronized and check its state at the destination.

2 - Architecture

Astronetes Resiliency Operator architecture

2.1 - Overview

Resiliency Operator architecture

Resiliency Operator is installed in a Kubernetes cluster that acts as an orchestrator and hosts the tools and components that synchronize the data across assets.

2.2 - Components

Resiliency Operator Components

Operator

Component | Description
Database controller | Orchestrates the Database objects.
Synchronization controller | Orchestrates the Synchronization objects.
Synchronization plan controller | Orchestrates the SynchronizationPlan objects.

2.3 - Synchronization objects

Astronetes Synchronization objects

Introduction

Astronetes offers the following synchronization objects to cover the infrastructure resiliency requirements.

Synchronization

Synchronizes the content of one asset to another a single time. The operations required to perform the snapshot depend on the plugin; more information can be found in the Plugins section. Synchronizations are managed through the Synchronization Custom Resource Definition.

Synchronization Plan

Performs periodic snapshots at intervals set by the user. Periodicity is established as a cron expression. When the Synchronization Plan starts a new snapshot, it creates a new Synchronization resource, akin to how a Kubernetes CronJob creates a new Job whenever the cron expression indicates. Synchronization Plans are managed through the SynchronizationPlan Custom Resource Definition.
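
For illustration, a minimal SynchronizationPlan taking a snapshot every hour could look like the following sketch, adapted from the Zookeeper sample in the Samples section:

apiVersion: automation.astronetes.io/v1alpha1
kind: SynchronizationPlan
metadata:
  name: synchronize-zookeeper
spec:
  schedule: "10 * * * *"  # cron expression: at minute 10 of every hour
  template:
    spec:
      plugin: zookeeper-to-zookeeper-nodes
      config:
        sourceName: zookeeper-source
        destinationName: zookeeper-destination
        rootPath: /test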

Live Synchronization

Live Synchronization replicates the contents of a source database to a destination one in near real time. This option is the most appropriate for establishing a warm-standby, pilot-light or active-active resiliency architecture.

This option minimises RPO and RTO due to the minimal amount of data lost before a disaster and the low overhead and wait time to restart operations in the new instance.

Live Synchronization is managed with the Custom Resource Definition LiveSynchronization.
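
As a quick preview, a minimal LiveSynchronization for the Kubernetes Objects to Kubernetes plugin could look like the following sketch; a complete walkthrough is available in the Plugins section:

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: livesynchronization-sample
spec:
  plugin: kubernetes-objects-to-kubernetes
  suspend: false
  config:
    sourceName: source
    destinationName: destination
    replication:
      resources:
        - group: apps
          version: v1
          resource: deployments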

3 - Installation

Install the Resiliency Operator

3.1 - Preparing to install

Setup for the necessary tools to install the operator.

Pre-requirements

Get familiar with the architecture by reading the Architecture section.

A valid Resiliency Operator license key and registry access key should already have been assigned to you.

Supported platforms

Astronetes Resiliency Operator is vendor agnostic, meaning that it can run on any Kubernetes distribution such as Google Kubernetes Engine, Azure Kubernetes Service, OpenShift or self-managed bare-metal installations.

This is the certified compatibility matrix:

Platform | Min Version | Max Version
AKS | 1.24 | 1.29
EKS | 1.24 | 1.28
GKE | 1.24 | 1.28
OpenShift Container Platform | 4.11 | 4.14

Permissions

To install the Resiliency Operator on a cluster, you need to have Cluster Admin permissions in that cluster.

The Resiliency Operator needs read access to the assets being protected and read/write access to the backup assets. Refer to plugin documentation for details.

Kubernetes requirements

Software

Official kubernetes.io client CLI kubectl.

Networking

  • Allow traffic to the Image Registry quay.io/astrokube using the mechanism provided by the chosen distribution.
  • In a 3-cluster architecture, the management cluster needs to be able to communicate with both the source and destination clusters; it is not necessary to allow connections between the two target clusters. In a 2-cluster architecture there is no centralised management cluster, so communication between the source and destination clusters must be enabled.

OpenShift requirements

Software

OpenShift client CLI.

Networking

  • Add quay.io/astrokube to the allowed registries in the Image configuration.
  • In a 3-cluster architecture, the management cluster needs to be able to communicate with both the source and destination clusters; it is not necessary to allow connections between the two target clusters. In a 2-cluster architecture there is no centralised management cluster, so communication between the source and destination clusters must be enabled.
apiVersion: config.openshift.io/v1
kind: Image
metadata:
    ...
spec:
  registrySources: 
    allowedRegistries: 
    ...
    - quay.io/astrokube

Cluster configuration

  • Cluster admin permission in the management, destination and source clusters. In a 2-cluster architecture it is only required to have admin permissions in the destination and source clusters, as the operator activities are delegated to the destination cluster.
  • The Secret provided by AstroKube to access the Image Registry.
  • The Secret provided by AstroKube with the license key.

3.2 - Installing on OpenShift

Steps to install the Resiliency Operator in OpenShift

The following operations need to be executed in both the management and destination clusters.

Process

1. Create Namespace

Create the Namespace where the operator will be installed:

oc create namespace resiliency-operator

2. Setup registry credentials

Create the Secret that stores the credentials to the AstroKube image registry:

oc -n resiliency-operator create -f pull-secret.yaml

3. Setup license key

Create the Secret that stores the license key:

oc -n resiliency-operator create -f license-key.yaml

4. Install the operator

Install the CRDs:

oc apply -f https://astronetes.io/deploy/resiliency-operator/v1.1.10/crds.yaml

Install the operator:

oc -n resiliency-operator apply -f https://astronetes.io/deploy/resiliency-operator/v1.1.10/operator-openshift.yaml

3.3 - Uninstalling on OpenShift

Steps to uninstall the Resiliency Operator on OpenShift

Process

1. Delete Operator objects

Delete the synchronizations from the management cluster:

oc delete synchronizationplans.automation.astronetes.io,synchronizations.automation.astronetes.io -A --all

Delete the assets from the management cluster:

oc delete databases.assets.astronetes.io -A --all

2. Remove the operator

Delete the operator:

oc -n resiliency-operator delete -f https://astronetes.io/deploy/resiliency-operator/v1.1.10/operator-openshift.yaml

Delete the CRDs:

oc delete -f https://astronetes.io/deploy/resiliency-operator/v1.1.10/crds.yaml

3. Remove registry credentials

Delete the Secret that stores the credentials to the AstroKube image registry:

oc -n resiliency-operator delete -f pull-secret.yaml

4. Remove license key

Delete the Secret that stores the license key:

oc -n resiliency-operator delete -f license-key.yaml

4 - Update license key

Steps to update the license key for the Resiliency Operator

There is no need to reinstall the operator when updating the license key.

1. Update the license key

Update the Kubernetes Secret that stores the license key with the new license (use kubectl or oc depending on your distribution):

kubectl -n resiliency-operator apply -f new-license-key.yaml
oc -n resiliency-operator apply -f new-license-key.yaml

2. Restart the Resiliency Operator

Restart the Resiliency Operator Deployments to apply the new license:

kubectl -n resiliency-operator rollout restart deployment resiliency-operator-database-controller
kubectl -n resiliency-operator rollout restart deployment resiliency-operator-synchronization-controller
kubectl -n resiliency-operator rollout restart deployment resiliency-operator-synchronizationplan-controller
oc -n resiliency-operator rollout restart deployment resiliency-operator-database-controller
oc -n resiliency-operator rollout restart deployment resiliency-operator-synchronization-controller
oc -n resiliency-operator rollout restart deployment resiliency-operator-synchronizationplan-controller

3. Wait for the Pods to restart

Wait a couple of minutes until all the Resiliency Operator Pods are restarted with the new license.
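
The restart can be verified, for example, by checking that the Deployments roll out successfully:

kubectl -n resiliency-operator rollout status deployment resiliency-operator-database-controller
kubectl -n resiliency-operator rollout status deployment resiliency-operator-synchronization-controller
kubectl -n resiliency-operator rollout status deployment resiliency-operator-synchronizationplan-controller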

5 - Plugins

Resiliency Operator Plugins

Plugins implement the logic to synchronize data from a particular type of asset to another instance that runs the same or a different technology.

The plugin to use must be specified regardless of whether the synchronization is managed through a Synchronization or a SynchronizationPlan Custom Resource.
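
The plugin is selected through the spec.plugin field. As an example, the Zookeeper plugin shown in the Samples section is selected as follows (the rest of the configuration is omitted):

apiVersion: automation.astronetes.io/v1alpha1
kind: Synchronization
metadata:
  generateName: synchronize-zookeeper-
spec:
  plugin: zookeeper-to-zookeeper-nodes
  config:
    ...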

5.1 - Kubernetes Objects to Kubernetes

Synchronization across two Kubernetes clusters.

5.1.1 - Architecture

Kubernetes to Kubernetes plugin Architecture

The cluster is protected with a warm stand-by paired cluster to which the workloads will be offloaded when a disaster occurs. The resources can be deactivated while in the destination cluster until such an event takes place, avoiding unnecessary resource consumption and optimizing organizational costs.

Resiliency Operator extracts the resources from the source cluster and syncs them to the destination cluster, maintaining a consistent state between them.

Operator monitoring is attached to the operator and it is independent of either cluster.

The operator can be deployed in either a 2-cluster or a 3-cluster architecture.

2-cluster architecture

This configuration is recommended for training, testing, validation or when the 3-cluster option is not optimal or possible.

The currently active cluster is the source cluster, while the passive one is the destination cluster. The operator, including all the Custom Resource Definitions (CRDs) and processes, is installed in the latter. The operator will listen for new resources that fulfill the requirements and clone them into the destination cluster.

The source cluster is never aware of the destination cluster and can exist and operate as normal without its presence. The destination cluster needs to have access to it through a KubernetesCluster resource.

3-cluster architecture

In addition to the two existing clusters, this modality includes a management cluster. The operator synchronization workflow is delegated to it instead of depending on the destination cluster. The management cluster is in charge of reading changes and new resources in the source cluster and syncing them to the destination. Neither the source nor the destination cluster needs to know of the existence of the management cluster, and both can operate without it.

Having a separate cluster that is decoupled from direct production activity lowers operational risks and eases access control for both human and software operators. The operator needs to be installed in the destination cluster as well, so that the recovery process can be started without depending on other clusters. Custom Resources that configure the synchronization are deployed in the management cluster, while those only relevant when executing the recovery process are deployed in the destination cluster.

This structure fits organizations that already depend on a management cluster for other tasks or that plan to adopt one. Resiliency Operator does not require a standalone management cluster and can be installed and managed from an existing one.

5.1.2 - Components

Kubernetes to Kubernetes plugin Components

Synchronization across clusters is managed through Kubesync, the Astronetes solution for Kubernetes cluster replication. The following components are deployed when synchronization between two clusters is started:

Component | Description | Source cluster permissions | Destination cluster permissions
Events listener | Reads events in the source cluster. | Cluster reader | N/A
Processor | Filters and transforms the objects read from the source cluster. | Cluster reader | N/A
Synchronizer | Writes processed objects in the destination cluster. | N/A | Write
Reconciler | Sends delete events whenever it finds discrepancies between source and destination. | Cluster reader | Cluster reader
NATS | Used by other components to send and receive data. | N/A | N/A
Redis | Stores metadata about the synchronization state. Most LiveSynchronization components interact with it. | N/A | N/A
Metrics exporter | Exports metrics about the LiveSynchronization status. | N/A | N/A

5.1.3 - Post-installation configuration

Steps to configure the Kubernetes objects to Kubernetes cluster plugin

5.1.3.1 - Setting a Kubernetes cluster

Granting access to source and destination cluster

Introduction

Connection to both the source and destination clusters is set using the KubernetesCluster resource. Credentials are stored in Kubernetes Secrets, from which each KubernetesCluster collects the access information needed to connect to its cluster.

Requirements

  • A kubeconfig file with read-only access to the source cluster
  • A kubeconfig file with cluster-admin access to the destination cluster
  • The Secret provided by AstroKube to access the Image Registry

Process

1. Prepare

Create Namespace

Create the namespace to configure the recovery process:

kubectl create namespace <namespace_name>

Setup registry credentials

Create the Secret that stores the credentials to the AstroKube image registry:

kubectl -n <namespace_name> create -f pull-secret.yaml

2. Configure the source Cluster

Create secret

Get the kubeconfig file that can be used to access the cluster, and save it as source-kubeconfig.yaml.

Then create the Secret with the following command:

kubectl -n <namespace_name> create secret generic source --from-file=kubeconfig.yaml=source-kubeconfig.yaml

Create resource

Define the KubernetesCluster resource with the following YAML, and save it as sourcecluster.yaml:

apiVersion: assets.astronetes.io/v1alpha1
kind: KubernetesCluster
metadata:
  name: source
spec:
  secretName: source

Deploy the resource with the following command:

kubectl -n <namespace_name> create -f sourcecluster.yaml

3. Configure the destination Cluster

Create secret

Get the kubeconfig file that can be used to access the cluster, and save it as destination-kubeconfig.yaml.

Then create the Secret with the following command:

kubectl -n <namespace_name> create secret generic destination --from-file=kubeconfig.yaml=destination-kubeconfig.yaml

Create resource

Define the KubernetesCluster resource with the following YAML, and save it as destinationcluster.yaml:

apiVersion: assets.astronetes.io/v1alpha1
kind: KubernetesCluster
metadata:
  name: destination
spec:
  secretName: destination

Deploy the resource with the following command:

kubectl -n <namespace_name> create -f destinationcluster.yaml

5.1.3.2 - Configuring a LiveSynchronization

How to protect the platform resources from a disaster

Introduction

A LiveSynchronization resource indicates a set of Kubernetes resources to replicate or synchronize between the source cluster and the destination cluster.

Requirements

  • The source and destination KubernetesCluster resources configured as described in Setting a Kubernetes cluster.

Process

1. Configure the live synchronization

Create the livesynchronization.yaml file according to your requirements. For this example, the goal is to synchronize Deployments located in namespaces that have the disaster-recovery label set to enabled. It is also desirable that, once a Deployment is replicated, no Pods are created in the destination cluster, and that after a recovery is launched the Deployment starts active Pods again.

Let’s dissect the following YAML:

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: livesynchronization-sample
spec:
  suspend: false
  plugin: kubernetes-objects-to-kubernetes
  config:
    sourceName: source
    destinationName: destination
    observability:
      enabled: false
    replication:
      resources:
        - group: apps
          version: v1
          resource: deployments
          transformation:
            patch:
              - op: replace
                path: /spec/replicas
                value: 0
          filters:
            namespaceSelector:
              matchLabels:
                disaster-recovery: enabled
          recoveryProcess:
            fromPatch:
              - op: replace
                path: /spec/replicas
                value: 1

spec.config.sourceName and spec.config.destinationName refer to the names of the KubernetesCluster resources for the corresponding clusters.

spec.config.replication.resources is a list of the sets of resources to synchronize. A single LiveSynchronization can cover multiple types or groups of resources, although this example only manages Deployments.

The type of the resource is defined at spec.config.replication.resources[*].resource. The filters are located in spec.config.replication.resources[*].filters. In this case, the LiveSynchronization matches namespaces whose disaster-recovery label is set to enabled.

spec.config.replication.resources[*].transformation and spec.config.replication.resources[*].recoveryProcess establish the actions to take when each resource is synchronized and when it is affected by the recovery process, respectively. In this case, while being replicated, each Deployment will have its replicas set to 0 in the destination cluster and set back to 1 after a successful recovery. The resource is always left intact in the source cluster.

2. Suspending and resuming a live synchronization

A keen eye might have noticed the spec.suspend parameter. In the example above it is set to false, meaning the live synchronization is active. Setting it to true suspends it: an inactive or suspended live synchronization will not replicate new or existing resources until it is resumed. Resuming can be done by setting spec.suspend back to false and applying the updated YAML. Alternatively, a patch with kubectl works as well and does not require the original YAML file:

kubectl patch livesynchronization <livesynchronization_name> -p '{"spec":{"suspend":false}}' --type=merge

3. Deploy the Live Synchronization

The live synchronization can be deployed as any other Kubernetes resource:

kubectl -n <namespace_name> apply -f livesynchronization.yaml

Additional steps

For more examples, take a look at our samples.

Modifying synchronized resources

Depending on the use case and the chosen solution, it can be convenient for the resources synchronized in the destination cluster to differ from the original copy. Taking a warm standby scenario as an example, in order to optimize infrastructure resources, certain objects such as Deployments or CronJobs do not need to be actively running until there is a disaster. The standby destination cluster can run with minimal computing power and autoscale as soon as the recovery process starts, reducing the required overhead expenditure.

While a resource is being synchronized into the destination cluster, its properties can be transformed to adapt them to the organization's needs. Then, if and when a disaster occurs, the resource characteristics can be restored to either their original state or an alternative one with the established recovery process.

Filters

Filters are useful to select only the exact objects to synchronize. They are set in the spec.config.replication.resources[*].filters parameter.

Name selector

The nameSelector filters by the name of the resources of the indicated version and type. The following example selects only the ConfigMaps whose name matches the regular expression config.*:

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: livesynchronization-sample
spec:
  plugin: kubernetes-objects-to-kubernetes
  suspend: false
  config:
    sourceName: source
    destinationName: destination
    observability:
      enabled: false
    replication:
      resources:
      - version: v1
        resource: configmaps
        filters:
          nameSelector:
            regex:
              - "config.*"

This selector can also be used negatively with excludeRegex. The following example excludes every ConfigMap whose name ends in .test:

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: livesynchronization-sample
spec:
  plugin: kubernetes-objects-to-kubernetes
  suspend: false
  config:
    sourceName: source
    destinationName: destination
    observability:
      enabled: false
    replication:
      resources:
      - version: v1
        resource: configmaps
        filters:
          nameSelector:
            excludeRegex:
            - "*.test"

Namespace selector

The namespaceSelector filters resources taking into consideration the namespace they belong to. This selector is useful to synchronize entire applications when they are grouped in a namespace. The following example selects every Deployment placed in a namespace with the label disaster-recovery: enabled:

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: livesynchronization-sample
spec:
  plugin: kubernetes-objects-to-kubernetes
  suspend: false
  config:
    sourceName: source
    destinationName: destination
    observability:
      enabled: false
    replication:
      resources:
        - group: apps
          version: v1
          resource: deployments
          filters:
            namespaceSelector:
              matchLabels:
                disaster-recovery: enabled

Transformations

Transformations are set in the spec.config.replication.resources[*].transformation parameter and are managed through patches.

Patch modifications alter the underlying object definition using the same mechanism as kubectl patch. As with jsonpatch, the allowed operations are replace, add and remove. Patches are defined in the spec.config.replication.resources[*].transformation.patch list, which admits an arbitrary number of modifications.

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: livesynchronization-sample
spec:
  ...
  config:
    ...
    replication:
      resources:
      - ...
        transformation:
          patch:
            - op: replace
              path: /spec/replicas
              value: 0
            - op: remove
              path: /spec/strategy

RecoveryProcess

The RecoveryProcess of a LiveSynchronization is executed in the case of a disaster to recover the original status of the application in the destination cluster. A resource can be either restored from the original definition stored in a bucket or by performing custom patches like with Transformations.

To restore from the original data, read the Recovering from a Bucket section. This option will disregard performed transformations and replace the parameters with those of the source cluster.

Patching when recovering is configured in the spec.config.replication.resources[*].recoveryProcess.fromPatch list, which admits an arbitrary number of modifications. It acts on the current state of the resource in the destination cluster, meaning it takes into consideration the transformations performed when the resource was synchronized, unlike recovering from the original. As with jsonpatch, the allowed operations are replace, add and remove.

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: livesynchronization-sample
spec:
  ...
  config:
    ...
    replication:
      resources:
        - ...
          recoveryProcess:
            fromPatch:
              - op: replace
                path: /spec/replicas
                value: 1

5.1.3.3 - Recovering from a Bucket

How to save objects and recover them using object storage.

Introduction

A Bucket resource indicates an Object Storage that will be used to restore original objects when recovering from a disaster.

Object Storage stores data in an unstructured format in which each entry represents an object. Unlike other storage solutions, there is no relationship or hierarchy between the data being stored. Organizations can access their files as easily as with traditional hierarchical or tiered storage. Object Storage benefits include virtually infinite scalability and high availability of data.

Many Cloud Providers include their own flavor of Object Storage, and most tools and SDKs can interact with them as they share the same interface. Resiliency Operator officially supports the following Object Storage solutions:

  • AWS Simple Storage Service (S3)
  • Google Cloud Storage

Resiliency Operator can support multiple buckets in different providers as each one is managed independently.

Contents stored in a bucket

A bucket is assigned to a LiveSynchronization by setting its name in the spec.config.bucketName field. It stores every object synchronized to the destination cluster, with some internal control annotations added. In the case of a disaster, resources with recoveryProcess.fromOriginal.enabled equal to true will be restored using the bucket contents.

The path of a stored object is as follows: <bucket_namespace>/<bucket_name>/<object_group-version-resource>/<object_namespace>.<object_name>.

Requirements

  • At least one instance of an Object Storage service in one of the supported Cloud Providers. This is commonly known as a bucket and will be referred to as such in this documentation.
  • At least one pair of accessKeyID and secretAccessKey that grants both read and write permissions over all objects in the bucket. Refer to the chosen cloud provider's documentation to learn how to create and extract them. It is recommended that each access key pair has access to a single bucket only.

Preparing and setting the bucket

Google Cloud Storage

Create the secret

Store the following file and apply it into the cluster substituting the template parameters with real ones.

apiVersion: v1
kind: Secret
metadata:
  name: bucket-credentials
stringData:
  accessKeyID: <access_key_id>
  secretAccessKey: <secret_access_key>

Create the Bucket

Store the following file and apply it into the cluster substituting the template parameters with real ones.

apiVersion: assets.astronetes.io/v1alpha1
kind: Bucket
metadata:
  name: gcp
  namespace: <namespace>
spec:
  generic:
    endpoint: storage.googleapis.com
    name: <bucket_name>
    useSSL: true
    secretName: bucket-credentials

Create the LiveSynchronization

If the LiveSynchronization does not set spec.config.replication.resources[*].recoveryProcess.fromOriginal.enabled equal to true for the desired resources, the contents of the bucket will not be used. For the configuration to work, make sure both the bucket reference and the recovery process transformations are correctly set.
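
As a reference, a resource entry that restores Deployments from the bucket contents could look like the following sketch; the field names follow the RecoveryProcess and FromOriginal tables in the Configuration section:

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: livesynchronization-sample
spec:
  ...
  config:
    ...
    bucketName: <bucket_object_name>
    replication:
      resources:
        - group: apps
          version: v1
          resource: deployments
          recoveryProcess:
            fromOriginal:
              enabled: true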

Indicating which bucket to use can be accomplished by configuring spec.config.bucketName like in the following example:

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: livesynchronization-sample
spec:
  plugin: kubernetes-objects-to-kubernetes
  config:
    sourceName: source
    destinationName: destination
    bucketName: <bucket_object_name>
    observability:
      enabled: false
    replication:
      resources:
        - group: apps
          version: v1
          resource: deployments
          transformation:
            patch:
              - op: replace
                path: /spec/replicas
                value: 0
          filters:
            namespaceSelector:
              matchLabels:
                env: pre
          recoveryProcess:
            fromPatch:
              - op: replace
                path: /spec/replicas
                value: 1
        - group: apps
          version: v1
          resource: deployments
          transformation:
            patch:
              - op: replace
                path: /spec/replicas
                value: 0
          filters:
            namespaceSelector:
              matchLabels:
                env: pre-second
          recoveryProcess:
            fromPatch:
              - op: replace
                path: /spec/replicas
                value: 1
        - group: ""
          version: v1
          resource: services
          filters:
            namespaceSelector:
              matchLabels:
                env: pre
        - group: ""
          version: v1
          resource: services
          filters:
            namespaceSelector:
              matchLabels:
                env: pre-second
        - group: ""
          version: v1
          resource: secrets
          filters:
            namespaceSelector:
              matchLabels:
                env: pre

AWS S3

Create the secret

Store the following file and apply it into the cluster substituting the template parameters with real ones.

apiVersion: v1
kind: Secret
metadata:
  name: bucket-credentials
stringData:
  accessKeyID: <access_key_id>
  secretAccessKey: <secret_access_key>

Create the Bucket

Store the following file and apply it into the cluster substituting the template parameters with real ones.

S3 requires that the region in the endpoint matches the region of the target bucket. It has to be set explicitly, as AWS does not infer the bucket's region, e.g. us-east-1 for North Virginia.

apiVersion: assets.astronetes.io/v1alpha1
kind: Bucket
metadata:
  name: s3
spec:
  generic:
    endpoint: s3.<bucket-region>.amazonaws.com
    name: <bucket-name>
    useSSL: true
    secretName: bucket-credentials

Create the LiveSynchronization

The LiveSynchronization is configured exactly as in the Google Cloud Storage example above: reference the Bucket object through spec.config.bucketName and enable spec.config.replication.resources[*].recoveryProcess.fromOriginal for the resources that should be restored from the bucket contents.


5.1.3.4 - Resynchronization

Synchronized resources reconciliation between source and destination cluster.

Introduction

Under particular circumstances, some objects might not have been synchronized from the source cluster to the destination cluster. To cover this case, Resiliency Operator offers a reconciliation process that adds, deletes or updates objects in the destination cluster when their state differs from the source.

Architecture

Reconciliation is performed at the LiveSynchronization level. Each LiveSynchronization is in charge of its covered objects and of keeping them up to date with the specification. Reconciliation is started by two components, EventsListener and Reconciler. The former is in charge of additive reconciliation and the latter of subtractive reconciliation.

Additive reconciliation

Refers to the reconciliation of missing objects that are present in the source cluster but, for any reason, are not present or are not up to date in the destination cluster. The entry point is the EventsListener service, which receives events with the current state in the source cluster of all the objects covered by the LiveSynchronization, with a period of one hour by default.

These resync events are then treated like regular events and follow the synchronization communication flow. If the object does not exist in the destination cluster, the Synchronizer will apply it. In the case of updates, only those with a resourceVersion greater than the existing one for that object will be applied, updating the definition of said object.

Subtractive reconciliation

In the case that an object was deleted in the source cluster but not in the destination, additive reconciliation will not detect it. The source cluster can send events containing the current state of its existing components, but not of those that have ceased to exist in it.

For that, the Reconciler is activated with a period of one hour by default. It compares the state of the objects covered by the LiveSynchronization in both the source and destination clusters. If a discrepancy is found, it creates a delete event in NATS. This event is then processed as a usual delete event throughout the rest of the communication process.

Modifying the periodic interval

By default, the resynchronization process is launched every hour. This can be changed by modifying the value of spec.config.replication.resyncPeriod in the LiveSynchronization object. The admitted format is %Hh%Mm%Ss, e.g. 1h0m0s for intervals of exactly one hour. Modifying this variable updates the schedule for both additive and subtractive reconciliations.

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: resync-3h-25m-12s
spec:
  ...
  config:
    replication:
      resyncPeriod: 3h25m12s

5.1.3.5 - Recovering from a disaster

How to recover in the case of a disaster

Previous steps

Task CRD

After defining a LiveSynchronization, a Task resource will be created in the destination cluster. The operator processes the spec.config.replication.resources[*].recoveryProcess parameter to define the required steps to activate the dormant applications. Take as an example the following definition:

apiVersion: automation.astronetes.io/v1alpha1
kind: Task
metadata:
  name: set-test-label
spec:
  plugin: kubernetes-objects-transformation
  config:
    resources:
      - identifier:
          group: apps
          version: v1
          resources: deployments
        patch:
          operations:
            - op: replace
              path: '/labels/test'
              value: 'ok'

In every Deployment, the label with key test will be replaced with the value ok. This Task originates from the following LiveSynchronization object:

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: set-test-label
spec:
  plugin: kubernetes-objects-to-kubernetes
  suspend: false
  config:
    sourceName: source
    destinationName: destination
    observability:
      enabled: false
    replication:
      resources:
        - group: apps
          version: v1
          resource: deployments
          recoveryProcess:
            fromPatch:
            - op: replace
              path: '/labels/test'
              value: 'ok'

This object should not be tampered with. It is managed by its parent LiveSynchronization.

On the day of a disaster

Recovering from a disaster requires deploying one TaskRun resource per Task that applies to recover the system and applications. The following example runs the Task defined in the previous section:

apiVersion: automation.astronetes.io/v1alpha1
kind: TaskRun
metadata:
  name: restore-apps
spec:
  taskName: set-test-label
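
Assuming the manifest above is saved as taskrun.yaml, it can be applied in the destination cluster like any other resource:

kubectl -n <namespace_name> create -f taskrun.yaml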

5.1.4 - Observability

Monitor the state of the synchronization and recovery process

5.1.4.1 - Audit fields

Parameters built into Resiliency Operator to track when a change was made and who made it

Auditing and version control are important when configuring resources. Knowing when a change was made and which account applied it can be decisive in an ongoing investigation to solve an issue or a configuration mismanagement.

Audit fields

The following annotations are attached to every Resiliency Operator Custom Resource:

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  annotations:
    audit.astronetes.io/last-update-time: "<date>"         # Time at which the last update was applied.
    audit.astronetes.io/last-update-user-uid: "<uid-hash>" # Hash representing the Unique Identifier of the user that applied the change.
    audit.astronetes.io/last-update-username: "<username>" # Human readable name of the user that applied the change. 

Example:

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  annotations:
    audit.astronetes.io/last-update-time: "2024-02-09T14:05:30.67520525Z"
    audit.astronetes.io/last-update-user-uid: "b3fd2a87-0547-4ff7-a49f-cce903cc2b61"
    audit.astronetes.io/last-update-username: system:serviceaccount:preproduction:microservice1

Fields are updated only when a change to the .spec, .labels or .annotations fields is detected. Status modifications made by the operator are not recorded.

Objects that are synchronized will not have these annotations.
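
The audit annotations of a resource can be inspected with kubectl, for instance (object name taken from the examples above):

kubectl -n <namespace_name> get livesynchronization livesynchronization-sample -o jsonpath='{.metadata.annotations}'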

5.1.4.2 - Understanding logging

How to interpret Disaster Recovery Operator log messages and manage them

Disaster Recovery Operator implements a logging system throughout all its components so that the end user has visibility into the system.

JSON fields

Name | Description
level | Log level at write time.
timestamp | Time at which the log was written.
msg | Log message.
process | Information about the process identity that generated the log.
event | Indicates whether the log refers to a create, update or delete action.
sourceObject | Object from the source cluster that is being synchronized.
oldSourceObject | Previous state of the sourceObject. Only applicable to update events.
sourceCluster | Information about the source managed cluster.
destinationObject | Object related to the destination cluster.
destinationCluster | Information about the destination managed cluster.
bucket | Recovery bucket information.
bucketObject | Path to the object to synchronize.
lastUpdate | Auditing information (see Audit fields).

Examples

An object read from the source cluster.

{
  "level": "info",
  "timestamp": "2023-11-28T18:05:26.904276629Z",
  "msg": "object read from cluster",
  "process": {
    "id": "eventslistener"
  },
  "sourceCluster": {
    "name": "source",
    "namespace": "dr-config",
    "resourceVersion": "91015",
    "uid": "3c39aaf0-4216-43a8-b23c-63f082b22436"
  },
  "sourceObject": {
    "apiGroup": "apps",
    "apiVersion": "v1",
    "name": "nginx-deployment-five",
    "namespace": "test-namespace-five",
    "resource": "deployments",
    "resourceVersion": "61949",
    "uid": "5eb6d1d1-b694-4679-a482-d453bcd5317f"
  },
  "oldSourceObject": {
    "apiGroup": "apps",
    "apiVersion": "v1",
    "name": "nginx-deployment-five",
    "namespace": "test-namespace-five",
    "resource": "deployments",
    "resourceVersion": "61949",
    "uid": "5eb6d1d1-b694-4679-a482-d453bcd5317f"
  },
  "lastUpdate": {
    "time": "2023-11-25T13:12:28.251894531Z",
    "userUID": "165d3e9f-04f4-418e-863f-07203389b51e",
    "username": "kubernetes-admin"
  },
  "event": {
    "type": "update"
  }
}

An object was uploaded to a recovery bucket.

{
  "level": "info",
  "timestamp": "2023-11-28T18:05:27.593493962Z",
  "msg": "object uploaded in bucket",
  "sourceObject": {
    "apiGroup": "apps",
    "apiVersion": "v1",
    "name": "helloworld",
    "namespace": "test-namespace-one",
    "resource": "deployments",
    "resourceVersion": "936",
    "uid": "7c2ac690-3279-43ca-b14e-57b6d57e78e1"
  },
  "oldSourceObject": {
    "apiGroup": "apps",
    "apiVersion": "v1",
    "name": "helloworld",
    "namespace": "test-namespace-one",
    "resource": "deployments",
    "resourceVersion": "936",
    "uid": "7c2ac690-3279-43ca-b14e-57b6d57e78e1"
  },
  "process": {
    "id": "processor",
    "consumerID": "event-processor-n74"
  },
  "bucket": {
    "name": "bucket-dev",
    "namespace": "dr-config",
    "resourceVersion": "91006",
    "uid": "47b50013-3058-4283-8c0d-ea3a3022a339"
  },
  "bucketObject": {
    "path": "dr-config/pre/apps-v1-deployments/test-namespace-one.helloworld"
  },
  "lastUpdate": {
    "time": "2023-11-25T13:12:29.625399813Z",
    "userUID": "165d3e9f-04f4-418e-863f-07203389b51e",
    "username": "kubernetes-admin"
  }
}

Managing logs

The message structure varies depending on the operation that originated it.

The sourceCluster and destinationCluster fields are only present for operations that required direct access to either cluster. For the former, only messages originating from the eventsListener, processor or reconciler services can include it in their logs. The latter will only be present in synchronizer or reconciler log messages. These parameters will not be present in internal messages, such as those coming from NATS, since there is no direct connection with either cluster.

oldSourceObject is the previous state of the object when performing an update operation. It is not present in other types.

When the bucket and bucketObject parameters are present, the operation is performed against the indicated bucket without any involvement of the source and destination clusters. For create operations, an object was uploaded for the first time to the bucket, for updates an existing one is modified and for delete an object was deleted from the specified bucket.

These characteristics can be exploited to improve log searches by narrowing down the messages to those that are relevant at the moment. As an example, the following command outputs only those logs that affect the source managed cluster by filtering out the messages that lack the sourceCluster field.

kubectl -n dr-config logs pre-eventslistener-74bc689665-fwsjc | jq '. | select(.sourceCluster != null)'

This could be useful when trying to debug and solve connection issues that might arise.

Log messages

The log message is located in the msg parameter. It can be read and interpreted to establish the severity of the log. The following tables group the different log messages depending on whether they should be treated as errors or as informative.

Error messages

msg
“error reading server groups and resources”
“error reading resources for group version”
“error getting namespace from cluster”
“error creating namespace in cluster”
“error getting object from cluster”
“error creating object in cluster”
“error updating object in cluster”
“error listing objects in cluster”
“error deleting object in cluster”
“error uploading object in bucket”
“error deleting object form bucket”
“error getting object from bucket”

Informative messages

msg
“reading server groups and resources”
“server group and resources read from cluster”
“reading resources for group version”
“resource group version not found”
“group resource version found”
“reading namespace from cluster”
“namespace not found in cluster”
“namespace read from cluster”
“creating namespace from cluster”
“namespace already exists in cluster”
“namespace created in cluster”
“reading object from cluster”
“object not found in cluster”
“object read from cluster”
“creating object in cluster”
“object created in cluster”
“updating object in cluster”
“object updated in cluster”
“deleting object in cluster”
“object deleted in cluster”
“listing objects in cluster”
“list objects not found in cluster”
“listed objects in cluster”
“uploading object in bucket”
“object uploaded in bucket”
“deleting object from bucket”
“object deleted from bucket”
“getting object from bucket”
“object got from bucket”
“listing object from bucket”
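
Similarly to the sourceCluster filter shown above, error messages can be isolated by filtering on the level field, assuming errors are logged at the error level:

kubectl -n dr-config logs pre-eventslistener-74bc689665-fwsjc | jq '. | select(.level == "error")'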

5.1.4.3 - Grafana setup

How to configure Grafana

Resiliency Operator offers the option of leveraging an existing Grafana installation to monitor the state of the synchronization and recovery process. Users can incorporate the provided visualizations into their workflows in a transparent manner without affecting their operability.

1. Requirements

Grafana Operator

The operator installation includes the necessary tools to extract the information from it. To view that information with the official dashboard, the management cluster is required to have the Grafana Operator installed.

Astronetes Disaster Recovery Operator supports Grafana v4 and Grafana v5.

2a. Using Grafana Operator v4

Create the GrafanaDashboard from the release manifests:

kubectl apply -f https://astronetes.io/deploy/disaster-recovery-operator/v0.11.0/grafana-v4-dashboard.yaml

2b. Using Grafana Operator v5

Create the GrafanaDashboard from the release manifests:

kubectl apply -f https://astronetes.io/deploy/disaster-recovery-operator/v0.11.0/grafana-v5-dashboard.yaml

3. Working with the dashboard

The dashboard shows detailed information about the write, read and computing processes alongside a general overview of the health of the operator.

General view of the status of the operator:

The dashboard can be filtered according to the following characteristics:

  • Namespace. Only shows information related to the LiveSynchronizations in a specified namespace.
  • Recovery Plan. Filters by a specific LiveSynchronization.
  • Object Namespace. Only shows information about the objects located in a given namespace, regardless of their associated LiveSynchronization.
  • Object API Group. Objects are filtered according to the API Group they belong to.

Filters can be combined to get more specific results, e.g. getting the networking-related objects that belong to a LiveSynchronization deployed in a given namespace.

5.1.5 - Configuration

Plugin parameters and accepted values

LiveSynchronization

Configuration

Name | Description | Type | Required
sourceName | Kubernetes Cluster acting as source | string | yes
destinationName | Kubernetes Cluster acting as destination | string | yes
bucketName | Bucket name to upload the synchronization contents | string | no
replication | Configuration of the plugin synchronization | UserConfig | yes
observability | Configuration of the observability components | ObservabilityConfig | no
components | Plugin component management | Components | no

UserConfig

Name | Description | Type | Required
resyncPeriod | Period to activate resynchronization | Timestamp with (HH)h(mm)m(ss)s format | no
resources | Resources to synchronize | List of Resource | yes
forceNamespaceCreation | Force namespace creation when applying the object | boolean | no

Resource

Name | Description | Type | Required
group | Group of the resource | string | no
version | Version of the resource | string | yes
resource | Kind of the resource | string | yes
transformation | Transformations to apply | Transformation | no
filters | Filters to apply | Filters | no
recoveryProcess | Actions to execute while recovering | RecoveryProcess | no

Transformation

Name | Description | Type | Required
patchOptions | Patch options | PatchOpts | no
patch | Patches to apply | List of PatchOperation | no

PatchOpts

Name | Description | Type | Required
skipIfNotFoundOnDelete | Determines if errors should be ignored when trying to remove a field that doesn't exist | bool | no

PatchOperation

Name | Description | Type | Required
op | Operation to apply. Accepted values are “replace” and “delete” | string | yes
path | Path of the object to modify | string | yes
value | Value to include if applicable | JSON | yes

Filters

Name | Description | Type | Required
selector | Resource selector | Kubernetes LabelSelector | no
namespaceSelector | Resource selector based on namespaces | Kubernetes LabelSelector | no

RecoveryProcess

Name | Description | Type | Required
patchOptions | Patch options | PatchOpts | no
fromPatch | Patches to apply when recovering | List of PatchOperation | no
fromOriginal | Options to recover from a disaster from the original source | FromOriginal | no

FromOriginal

Name | Description | Type | Required
enabled | Enable recovering from original | boolean | no

ObservabilityConfig

Name | Description | Type | Required
enabled | Enable observability | boolean | no
interval | Interval to gather metrics from source | Duration with a number and unit, e.g. 30s or 15m | no

Components

Name | Description | Type | Required
eventsListener | Settings for the component | Component | no
processor | Settings for the component | Component | no
reconciler | Settings for the component | Component | no
restorer | Settings for the component | Component | no
synchronizer | Settings for the component | Component | no
nats | Settings for the component | Component | no
redis | Settings for the component | Component | no
metricsExporter | Settings for the component | Component | no

Component

Name | Description | Type | Required
logLevel | Log level for the component | string | no
imagePullPolicy | Image pull policy for the component image | Kubernetes pull policy | no
resources | Resource quota for the component | Kubernetes Resource Quotas | no
concurrentTasks | Number of concurrent tasks | int32 | no
replicas | Number of replicas | int32 | no
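
As an illustrative sketch, the synchronizer component could be tuned from the LiveSynchronization as follows; the values shown are examples only:

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: livesynchronization-sample
spec:
  ...
  config:
    ...
    components:
      synchronizer:
        logLevel: debug
        replicas: 2
        concurrentTasks: 4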

5.2 - Zookeeper to Zookeeper

Plugin for Zookeeper to Zookeeper Synchronization

5.2.1 - Introduction

Zookeeper to Zookeeper nodes introduction

Replicates from one Zookeeper instance to another one directly.

Samples

  • Databases
  • Synchronization
  • SynchronizationPlan

5.2.2 - Configuration

Plugin parameters and accepted values

Synchronization

Configuration

Name | Description | Type | Required
sourceName | Zookeeper instance acting as source | string | yes
destinationName | Zookeeper instance acting as destination | string | yes
rootPath | Root path of the contents to synchronize | string | yes
createRootPath | Whether to create the root path in the destination database | boolean | no
ignoreEphemeral | Whether to ignore ephemeral nodes | boolean | no
excludePathRegexp | Regular expression for keys to exclude while synchronizing | string | no

6 - Samples

6.1 - Kubernetes objects to Kubernetes samples

6.1.1 - Generic Bucket

A generic Bucket hosted in GCP, including the Secret granting access to it.

apiVersion: assets.astronetes.io/v1alpha1
kind: Bucket
metadata:
  name: gcp
spec:
  generic:
    endpoint: storage.googleapis.com
    name: <bucket-name>
    useSSL: true
    secretName: bucket-credentials
---
apiVersion: v1
kind: Secret
metadata:
  name: bucket-credentials
stringData:
  accessKeyID: <access_key_id>
  secretAccessKey: <secret_access_key>

6.1.2 - Kubernetes Clusters

KubernetesCluster resources for the source and destination clusters, assuming the Secrets are set as specified.

apiVersion: assets.astronetes.io/v1alpha1
kind: KubernetesCluster
metadata:
  name: source
spec:
  secretName: source
---
apiVersion: assets.astronetes.io/v1alpha1
kind: KubernetesCluster
metadata:
  name: destination
spec:
  secretName: destination

6.1.3 - Live Synchronization

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: livesynchronization-sample
spec:
  plugin: kubernetes-objects-to-kubernetes
  suspend: false
  config:
    sourceName: source
    destinationName: destination
    observability:
      enabled: false
    replication:
      resources:
        - group: apps
          version: v1
          resource: deployments
          transformation:
            patch:
              - op: replace
                path: /spec/replicas
                value: 0
          filters:
            namespaceSelector:
              matchLabels:
                env: pre
          recoveryProcess:
            fromPatch:
              - op: replace
                path: /spec/replicas
                value: 1
        - group: apps
          version: v1
          resource: deployments
          transformation:
            patch:
              - op: replace
                path: /spec/replicas
                value: 0
          filters:
            namespaceSelector:
              matchLabels:
                env: pre-second
          recoveryProcess:
            fromPatch:
              - op: replace
                path: /spec/replicas
                value: 1
        - group: ""
          version: v1
          resource: services
          filters:
            namespaceSelector:
              matchLabels:
                env: pre
        - group: ""
          version: v1
          resource: services
          filters:
            namespaceSelector:
              matchLabels:
                env: pre-second
        - group: ""
          version: v1
          resource: secrets
          filters:
            namespaceSelector:
              matchLabels:
                env: pre

6.1.4 - Task

apiVersion: automation.astronetes.io/v1alpha1
kind: Task
metadata:
  name: set-test-label
spec:
  plugin: kubernetes-objects-transformation
  config:
    resources:
      - identifier:
          group: apps
          version: v1
          resources: deployments
        patch:
          operations:
            - op: replace
              path: '/labels/test'
              value: 'ok'

6.1.5 - Task Run

Tasks are already created in the destination cluster when a LiveSynchronization is created in the management cluster.

apiVersion: automation.astronetes.io/v1alpha1
kind: TaskRun
metadata:
  name: restore-apps
spec:
  taskName: set-test-label

6.2 - Zookeeper to Zookeeper samples

6.2.1 - Zookeeper Database

Source and destination Databases should include the host and port of the target Zookeeper instances. A Secret containing the user login credentials is required for each Database instance. Databases are mapped to Secrets that have the same name and namespace.

Zookeeper users should have appropriate read permissions if they belong to a source instance and write permissions if they belong to a destination instance.

apiVersion: v1
kind: Secret
metadata:
  name: zookeeper-source
stringData:
  user: admin
  password: password
---
apiVersion: assets.astronetes.io/v1alpha1
kind: Database
metadata:
  name: zookeeper-source
spec:
  zookeeper:
    client:
      servers:
        - 172.18.0.4:30181
---
apiVersion: v1
kind: Secret
metadata:
  name: zookeeper-destination
stringData:
  user: admin
  password: password
---
apiVersion: assets.astronetes.io/v1alpha1
kind: Database
metadata:
  name: zookeeper-destination
spec:
  zookeeper:
    client:
      servers:
        - 172.18.0.5:30181

6.2.2 - Zookeeper Synchronization

Zookeeper synchronization requires the path to the root endpoint. It can be specified in spec.config.rootPath.

---
apiVersion: automation.astronetes.io/v1alpha1
kind: Synchronization
metadata:
  generateName: synchronize-zookeeper-
spec:
  plugin: zookeeper-to-zookeeper-nodes
  config:
    sourceName: zookeeper-source
    destinationName: zookeeper-destination
    rootPath: /test
    createRootPath: true

6.2.3 - Zookeeper Synchronization Plan

Zookeeper synchronization requires the path to the root endpoint. It can be specified in spec.template.spec.config.rootPath.

---
apiVersion: automation.astronetes.io/v1alpha1
kind: SynchronizationPlan
metadata:
  name: synchronize-zookeeper
spec:
  schedule: "10 * * * *"
  template:
    spec:
      plugin: zookeeper-to-zookeeper-nodes
      config: 
        sourceName: zookeeper-source
        destinationName: zookeeper-destination
        rootPath: /test

7 - Reference

This section contains the API Reference of CRDs for the Resiliency Operator.

7.1 - Assets API Reference

Packages

assets.astronetes.io/v1alpha1

Package v1alpha1 contains API Schema definitions for the assets v1alpha1 API group

Resource Types

AWSS3

Appears in:

  • name (string): Bucket name. Required.
  • region (string): AWS region name. Required.
  • secretName (string): Secret name where credentials are stored. Required.

Bucket

Bucket is the Schema for the buckets API

Appears in:

  • apiVersion (string): assets.astronetes.io/v1alpha1
  • kind (string): Bucket
  • metadata (ObjectMeta): Refer to Kubernetes API documentation for fields of metadata.
  • spec (BucketSpec)

BucketList

BucketList contains a list of Bucket

  • apiVersion (string): assets.astronetes.io/v1alpha1
  • kind (string): BucketList
  • metadata (ListMeta): Refer to Kubernetes API documentation for fields of metadata.
  • items (Bucket array)

BucketSpec

BucketSpec defines the desired state of Bucket

Appears in:

  • generic (GenericBucket): Reference a generic bucket. Optional.
  • gcpCloudStorage (GCPCloudStorage): Reference a GCP Cloud Storage service. Optional.
  • awsS3 (AWSS3): Reference an AWS S3 bucket service. Optional.
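
As an illustrative sketch, a Bucket backed by AWS S3 could be declared as follows; the bucket name, region and Secret name are placeholders.

apiVersion: assets.astronetes.io/v1alpha1
kind: Bucket
metadata:
  name: backups-s3
spec:
  awsS3:
    name: my-backups-bucket         # placeholder bucket name
    region: eu-west-1               # placeholder AWS region
    secretName: aws-s3-credentials  # Secret holding the access credentials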

Database

Database is the Schema for the databases API

Appears in:

  • apiVersion (string): assets.astronetes.io/v1alpha1
  • kind (string): Database
  • metadata (ObjectMeta): Refer to Kubernetes API documentation for fields of metadata.
  • spec (DatabaseSpec)

DatabaseList

DatabaseList contains a list of Database

  • apiVersion (string): assets.astronetes.io/v1alpha1
  • kind (string): DatabaseList
  • metadata (ListMeta): Refer to Kubernetes API documentation for fields of metadata.
  • items (Database array)

DatabaseSpec

DatabaseSpec defines the desired state of Database

Appears in:

  • zookeeper (Zookeeper): Zookeeper database. Optional.

GCPCloudStorage

Appears in:

  • name (string): Bucket name. Required.
  • secretName (string): Secret name where credentials are stored. Required.

GenericBucket

Appears in:

  • name (string): Bucket name. Required.
  • endpoint (string): Bucket endpoint. Required.
  • useSSL (boolean): Use SSL. Optional.
  • secretName (string): Secret name where credentials are stored. Required.
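
Similarly, a hypothetical Bucket pointing at a generic S3-compatible endpoint could look like the following sketch; the endpoint and names are placeholders.

apiVersion: assets.astronetes.io/v1alpha1
kind: Bucket
metadata:
  name: backups-generic
spec:
  generic:
    name: my-backups-bucket                 # placeholder bucket name
    endpoint: minio.example.internal:9000   # placeholder endpoint
    useSSL: true
    secretName: generic-bucket-credentials  # Secret holding the access credentials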

KubernetesCluster

KubernetesCluster is the Schema for the kubernetesclusters API

Appears in:

  • apiVersion (string): assets.astronetes.io/v1alpha1
  • kind (string): KubernetesCluster
  • metadata (ObjectMeta): Refer to Kubernetes API documentation for fields of metadata.
  • spec (KubernetesClusterSpec)

KubernetesClusterList

KubernetesClusterList contains a list of KubernetesCluster

  • apiVersion (string): assets.astronetes.io/v1alpha1
  • kind (string): KubernetesClusterList
  • metadata (ListMeta): Refer to Kubernetes API documentation for fields of metadata.
  • items (KubernetesCluster array)

KubernetesClusterSpec

KubernetesClusterSpec defines the desired state of KubernetesCluster

Appears in:

  • secretName (string): Reference to the Secret that stores the cluster kubeconfig. Required.
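
For reference, a minimal KubernetesCluster sketch might look like this; the resource and Secret names are placeholders.

apiVersion: assets.astronetes.io/v1alpha1
kind: KubernetesCluster
metadata:
  name: destination-cluster
spec:
  secretName: destination-cluster-kubeconfig   # Secret that stores the cluster kubeconfig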

Zookeeper

Appears in:

  • admin (ZookeeperAdmin): Credentials for the admin port. Optional.
  • client (ZookeeperClient): Credentials for the client port. Optional.

ZookeeperAdmin

Appears in:

  • protocol (string): Zookeeper protocol. Required.
  • host (string): Zookeeper host. Required.
  • port (string): Zookeeper port. Required.
  • secretName (string): Zookeeper authentication data. Optional.

ZookeeperClient

Appears in:

  • servers (string array): Zookeeper servers. Required.
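
Putting the Zookeeper fields together, a Database that declares both the admin and client endpoints might look like the following sketch; hosts, ports and Secret names are illustrative values.

apiVersion: assets.astronetes.io/v1alpha1
kind: Database
metadata:
  name: zookeeper-full
spec:
  zookeeper:
    admin:
      protocol: http                    # illustrative value
      host: 172.18.0.4                  # illustrative value
      port: "8080"                      # port is a string in the API
      secretName: zookeeper-admin-auth  # optional authentication data
    client:
      servers:
        - 172.18.0.4:30181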

7.2 - Automation API Reference

Packages

automation.astronetes.io/v1alpha1

Package v1alpha1 contains API Schema definitions for the automation v1alpha1 API group

Resource Types

Backup

Backup is the Schema for the backups API

Appears in:

  • apiVersion (string): automation.astronetes.io/v1alpha1
  • kind (string): Backup
  • metadata (ObjectMeta): Refer to Kubernetes API documentation for fields of metadata.
  • spec (BackupSpec)

BackupDestinationBucket

Appears in:

  • name (string): Reference the Bucket name. Required.
  • basePath (string): The base path to be used to store the Backup data. Optional.

BackupList

BackupList contains a list of Backup

  • apiVersion (string): automation.astronetes.io/v1alpha1
  • kind (string): BackupList
  • metadata (ListMeta): Refer to Kubernetes API documentation for fields of metadata.
  • items (Backup array)

BackupPlugin

Underlying type: string

Appears in:

BackupSourceDatabase

Appears in:

  • name (string): Reference the Database name. Required.

BackupSourceKubernetesCluster

Appears in:

  • name (string): Reference the KubernetesCluster name. Required.
  • namespaces (string array): Reference the Kubernetes namespaces to be included. Optional.

BackupSpec

BackupSpec defines the desired state of Backup

Appears in:

  • restartPolicy (RestartPolicy): Restart policy. Optional.
  • plugin (BackupPlugin): Backup plugin. Required.
  • config (JSON): Backup config. Required.

LiveSynchronization

LiveSynchronization is the Schema for the livesynchronizations API

Appears in:

  • apiVersion (string): automation.astronetes.io/v1alpha1
  • kind (string): LiveSynchronization
  • metadata (ObjectMeta): Refer to Kubernetes API documentation for fields of metadata.
  • spec (LiveSynchronizationSpec)

LiveSynchronizationList

LiveSynchronizationList contains a list of LiveSynchronization

  • apiVersion (string): automation.astronetes.io/v1alpha1
  • kind (string): LiveSynchronizationList
  • metadata (ListMeta): Refer to Kubernetes API documentation for fields of metadata.
  • items (LiveSynchronization array)

LiveSynchronizationPlugin

Underlying type: string

Appears in:

LiveSynchronizationSpec

LiveSynchronizationSpec defines the desired state of LiveSynchronization

Appears in:

  • suspend (boolean): Suspend the execution. Default: false. Optional.
  • plugin (LiveSynchronizationPlugin): LiveSynchronization plugin. Required.
  • config (JSON): LiveSynchronization config. Required.
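
As a minimal sketch, assuming the kubernetes-objects-to-kubernetes plugin listed in the release notes is a LiveSynchronization plugin, a LiveSynchronization might look like this; the object name is a placeholder and the config block is plugin specific.

apiVersion: automation.astronetes.io/v1alpha1
kind: LiveSynchronization
metadata:
  name: live-sync-apps                        # placeholder name
spec:
  suspend: false                              # optional, defaults to false
  plugin: kubernetes-objects-to-kubernetes    # assumed plugin name, listed in the release notes
  config: {}                                  # plugin-specific configuration; see the samples section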

Resource

Appears in:

  • group (string): Resource group. Optional.
  • version (string): Resource version. Required.
  • resource (string): Resource. Required.

Synchronization

Synchronization is the Schema for the synchronizations API

Appears in:

  • apiVersion (string): automation.astronetes.io/v1alpha1
  • kind (string): Synchronization
  • metadata (ObjectMeta): Refer to Kubernetes API documentation for fields of metadata.
  • spec (SynchronizationSpec)

SynchronizationList

SynchronizationList contains a list of Synchronization

  • apiVersion (string): automation.astronetes.io/v1alpha1
  • kind (string): SynchronizationList
  • metadata (ListMeta): Refer to Kubernetes API documentation for fields of metadata.
  • items (Synchronization array)

SynchronizationPlan

SynchronizationPlan is the Schema for the synchronizationplans API

Appears in:

  • apiVersion (string): automation.astronetes.io/v1alpha1
  • kind (string): SynchronizationPlan
  • metadata (ObjectMeta): Refer to Kubernetes API documentation for fields of metadata.
  • spec (SynchronizationPlanSpec)

SynchronizationPlanList

SynchronizationPlanList contains a list of SynchronizationPlan

  • apiVersion (string): automation.astronetes.io/v1alpha1
  • kind (string): SynchronizationPlanList
  • metadata (ListMeta): Refer to Kubernetes API documentation for fields of metadata.
  • items (SynchronizationPlan array)

SynchronizationPlanSpec

SynchronizationPlanSpec defines the desired state of SynchronizationPlan

Appears in:

  • schedule (string): Schedule in Cron format. Required.
  • startingDeadlineSeconds (integer): Optional deadline in seconds for starting the job if it misses its scheduled time for any reason. Missed job executions will be counted as failed ones. Optional.
  • concurrencyPolicy (ConcurrencyPolicy): Specifies how to treat concurrent executions of a Job. Optional. Valid values are:
    - “Allow” (default): allows CronJobs to run concurrently;
    - “Forbid”: forbids concurrent runs, skipping the next run if the previous run hasn’t finished yet;
    - “Replace”: cancels the currently running job and replaces it with a new one.
  • suspend (boolean): Suspend the execution. Default: false. Optional.
  • template (SynchronizationTemplateSpec): Specify the Synchronization that will be created when executing the Cron. Optional.
  • successfulJobsHistoryLimit (integer): The number of successful finished jobs to retain. Value must be a non-negative integer. Default: 2. Optional.
  • failedJobsHistoryLimit (integer): The number of failed finished jobs to retain. Value must be a non-negative integer. Default: 2. Optional.
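
To illustrate the optional scheduling fields, the Zookeeper plan from the samples section could be extended as in the following sketch; the concurrency policy, starting deadline and history limits are illustrative choices, not requirements.

apiVersion: automation.astronetes.io/v1alpha1
kind: SynchronizationPlan
metadata:
  name: synchronize-zookeeper
spec:
  schedule: "10 * * * *"        # minute 10 of every hour
  startingDeadlineSeconds: 300  # illustrative: skip a run that starts more than 5 minutes late
  concurrencyPolicy: Forbid     # illustrative: skip the next run if the previous one is still running
  suspend: false
  successfulJobsHistoryLimit: 2
  failedJobsHistoryLimit: 2
  template:
    spec:
      plugin: zookeeper-to-zookeeper-nodes
      config:
        sourceName: zookeeper-source
        destinationName: zookeeper-destination
        rootPath: /test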

SynchronizationPlugin

Underlying type: string

Appears in:

SynchronizationSpec

SynchronizationSpec defines the desired state of Synchronization

Appears in:

  • restartPolicy (RestartPolicy): Restart policy. Optional.
  • plugin (SynchronizationPlugin): Synchronization plugin. Required.
  • config (JSON): Synchronization config. Required.

SynchronizationTemplateSpec

Appears in:

  • metadata (ObjectMeta): Refer to Kubernetes API documentation for fields of metadata.
  • spec (SynchronizationSpec): Specification of the desired behavior of the Synchronization. Optional.

Task

Task is the Schema for the tasks API

Appears in:

  • apiVersion (string): automation.astronetes.io/v1alpha1
  • kind (string): Task
  • metadata (ObjectMeta): Refer to Kubernetes API documentation for fields of metadata.
  • spec (TaskSpec)

TaskList

TaskList contains a list of Task

  • apiVersion (string): automation.astronetes.io/v1alpha1
  • kind (string): TaskList
  • metadata (ListMeta): Refer to Kubernetes API documentation for fields of metadata.
  • items (Task array)

TaskPlugin

Underlying type: string

Appears in:

TaskRun

TaskRun is the Schema for the taskruns API

Appears in:

  • apiVersion (string): automation.astronetes.io/v1alpha1
  • kind (string): TaskRun
  • metadata (ObjectMeta): Refer to Kubernetes API documentation for fields of metadata.
  • spec (TaskRunSpec)

TaskRunList

TaskRunList contains a list of TaskRun

  • apiVersion (string): automation.astronetes.io/v1alpha1
  • kind (string): TaskRunList
  • metadata (ListMeta): Refer to Kubernetes API documentation for fields of metadata.
  • items (TaskRun array)

TaskRunSpec

TaskRunSpec defines the desired state of TaskRun

Appears in:

  • taskName (string): Task name. Required.

TaskSpec

TaskSpec defines the desired state of Task

Appears in:

  • restartPolicy (RestartPolicy): Restart policy. Optional.
  • plugin (TaskPlugin): Task plugin. Required.
  • config (JSON): Task config. Required.

8 - Release notes

Resiliency Operator Release Notes

v1.1.10

  • Fix memory leak in kubernetes-objects-to-kubernetes.

v1.1.9

  • Fix memory leak in Database controller.

v1.1.8

  • Improvements in Database controller.

v1.1.7

  • Improvements in Database controller.

v1.1.6

  • Delete pods when Synchronization object is deleted.

v1.1.5

  • Add audit logs in kubernetes-objects-transformation plugin.

v1.1.4

  • Exclude /zookeeper path by default in zookeeper-to-zookeeper-nodes plugin.
  • Show suspend field in LiveSynchronization list.

v1.1.3

  • Fix recovery from bucket in kubernetes-objects-to-kubernetes plugin.

v1.1.2

  • Show source name and destination name in Synchronization and LiveSynchronization.
  • Fix label error when creating LiveSynchronization objects with long names.
  • Improved logs in kubernetes-objects-transformation plugin.
  • Fix transformations with cluster-wide objects in kubernetes-objects-transformation plugin.
  • Fix default env variables in zookeeper-to-zookeeper-nodes plugin.
  • Improved concurrency in zookeeper-to-zookeeper-nodes plugin.

v1.1.1

  • Fix ServiceMonitor configuration when deploying kubernetes-objects-to-kubernetes plugin.

v1.1.0

  • Add LiveSynchronization CRD.
  • Add Task CRD.
  • Add TaskRun CRD.
  • Add kubernetes-objects-to-kubernetes plugin.
  • Add kubernetes-objects-transformation plugin.
  • Add option in zookeeper-to-zookeeper-nodes to control whether the root node is created in the destination.