Integrating with OpenShift Alerting
Introduction
OpenShift lets you create alerts based on Prometheus metrics. These alerts provide additional information about the health and status of the Astronetes operator.
Prerequisites
- Access Requirement: cluster-admin access to the OpenShift cluster
Configure alerts
Two types of alerts are provided: platform alerts, which cover the operator’s integration within the cluster, and synchronization alerts, which monitor the synchronization.
Platform alerts
These alerts use metrics that assess whether the integration between the product and the assets is working.
Apply these rules:
oc apply -f https://astronetes.io/deploy/resiliency-operator/v1.3.5/alert-rules-resiliency-operator.yaml
Synchronization alerts
These alerts use metrics that assess the status of synchronized objects.
To configure this rule, follow these steps:
- Create this PrometheusRule manifest:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: failed-synchronize-items
  namespace: <your-synchronization-namespace>
spec:
  groups:
    - name: synchronization-alerts
      rules:
        - alert: SynchronizationNotInSync
          annotations:
            summary: "There are synchronization items not in sync."
            description: "Synchronization {{ $labels.synchronizationName }} is out of sync in namespace {{ $labels.synchronizationNamespace }}"
          expr: astronetes_total_synchronized_objects{objectStatus!="Sync"} > 0
          for: 1h
          labels:
            severity: warning
        - alert: WriteOperationsFailed
          annotations:
            summary: "One or more write operations failed."
            description: "Synchronization {{ $labels.synchronizationName }} has failed write operations in namespace {{ $labels.synchronizationNamespace }}"
          expr: astronetes_total_write_operations{writeStatus="failed"} > 0
          for: 1h
          labels:
            severity: warning
- Edit the namespace field: use the namespace where the synchronizations are deployed.
- Apply the rule:
kubectl apply -f <path-to-your-modified-yaml-file>.yaml
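The namespace edit can also be scripted. The following is a minimal sketch, assuming the manifest above was saved to a local file. The function name `render_sync_rule` and any file names are hypothetical and not part of the product.

```shell
# Hypothetical helper: print the manifest with the namespace placeholder filled in.
# $1 = path to the saved PrometheusRule manifest
# $2 = namespace where the synchronizations are deployed
render_sync_rule() {
  manifest="$1"
  namespace="$2"
  # Replace only the documented placeholder; everything else is left untouched.
  sed "s|<your-synchronization-namespace>|${namespace}|g" "$manifest"
}
```

For example, `render_sync_rule failed-synchronize-items.yaml my-namespace > rule.yaml` writes a file that can then be applied with the command above.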
How to configure custom alerts
Prometheus provides a powerful set of metrics that can be used to monitor the status of your cluster and the functionality of your operator by creating customized alert rules.
The PrometheusRule should be created in the same namespace as the process that generates these metrics to ensure proper functionality and visibility.
Here is an example of a PrometheusRule YAML file:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: <alert-name>
  namespace: <namespace>
spec:
  groups:
    - name: <group-name>
      rules:
        - alert: <alert-rule-name>
          annotations:
            description: <description>
            summary: <summary>
          expr: <expression>
          for: <duration>
          labels:
            severity: <severity-level>
Field Value Descriptions
In the PrometheusRule YAML file, several fields are essential for defining your alerting rules. Below is a table describing the values that can be used for each field:
| Field | Description | Example Values |
|---|---|---|
| alert | Specifies the name of the alert that will be triggered. It should be descriptive. | AssetFailure, HighCPUUsage, MemoryThresholdExceeded |
| for | Defines the duration for which the condition must be true before the alert triggers. | 5m, 1h, 30s |
| severity | Indicates the criticality of the alert. Helps prioritize alerts. | critical, warning, info |
| expr | The Prometheus expression (in PromQL) that determines the alerting condition based on metrics. | sum(rate(http_requests_total[5m])) > 100, node_memory_usage > 90 |
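To make the fields concrete, here is a sketch of a filled-in rule, written as a small shell function that prints the manifest. The metric and its `objectStatus` label are taken from the synchronization alert earlier on this page. The rule name, group name, alert name, `5m` duration, and `critical` severity are illustrative assumptions, as is the function name `render_custom_rule`.

```shell
# Hypothetical helper: print an example PrometheusRule for the given namespace.
# $1 = namespace of the process that exposes the metrics
render_custom_rule() {
  namespace="$1"
  # Unquoted heredoc so that ${namespace} is expanded; the rest is literal YAML.
  cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-custom-alert
  namespace: ${namespace}
spec:
  groups:
    - name: example-custom-alerts
      rules:
        - alert: SynchronizationOutOfSyncCritical
          annotations:
            summary: "Synchronized objects have been out of sync for 5 minutes."
            description: "At least one synchronized object is not in the Sync status."
          expr: astronetes_total_synchronized_objects{objectStatus!="Sync"} > 0
          for: 5m
          labels:
            severity: critical
EOF
}
```

Redirecting the output to a file, for example `render_custom_rule my-namespace > custom-rule.yaml`, produces a manifest that can be applied as described in the next section.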
Apply to the cluster
Create the new PrometheusRule in the cluster:
oc apply -f <path-to-your-prometheus-rule-file>.yaml
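Before applying, it can help to confirm that no template placeholder was left unfilled. The following is a minimal sketch, assuming placeholders follow the `<name>` convention used in the template above. `check_placeholders` is a hypothetical helper name, and the pattern is a heuristic rather than a full YAML validation.

```shell
# Hypothetical helper: fail if the manifest still contains <placeholder> tokens.
# $1 = path to the PrometheusRule manifest
check_placeholders() {
  # Match tokens such as <namespace> or <severity-level>, printing line numbers.
  # PromQL comparisons like "a > 0" do not match because they contain spaces.
  if grep -n -E '<[A-Za-z][A-Za-z-]*>' "$1"; then
    echo "Unfilled placeholders found in $1" >&2
    return 1
  fi
  return 0
}
```

A possible use is `check_placeholders custom-rule.yaml && oc apply -f custom-rule.yaml`, so the rule is only applied when the check passes.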
Checking alerts
1. Access OpenShift Web Console:
- Open your browser and go to the OpenShift web console URL.
- Log in with your credentials.
2. Navigate to Observe:
- In the OpenShift console, go to the Observe section from the main menu.
- In the Alerts tab, you’ll find a list of active and silenced alerts.
- Check for any alerts triggered by your custom rules.
- You can also see the full list of configured alerting rules.
3. Filter Custom Alerts:
- To filter the custom alerts, use the source field and set its value to user. This displays only the alerts generated by user-defined rules. See the OpenShift documentation for details on filtering.