Chapter 9. Monitoring clusters that run on RHOSO

You can correlate observability metrics for clusters that run on Red Hat OpenStack Services on OpenShift (RHOSO). By collecting metrics from both environments, you can monitor and troubleshoot issues across the infrastructure and application layers.

There are two supported methods for metric correlation for clusters that run on RHOSO:

Remote writing to an external Prometheus instance.
Collecting data from the OpenShift Container Platform federation endpoint to the RHOSO observability stack.

9.1. Remote writing to an external Prometheus instance

Use remote write with both Red Hat OpenStack Services on OpenShift (RHOSO) and OpenShift Container Platform to push their metrics to an external Prometheus instance.

Prerequisites

You have access to an external Prometheus instance.
You have administrative access to RHOSO and your cluster.
You have certificates for secure communication with mTLS.
Your Prometheus instance is configured for client TLS certificates and has been set up as a remote write receiver.
The Cluster Observability Operator is installed on your RHOSO cluster.
The monitoring stack for your RHOSO cluster is configured to collect the metrics that you are interested in.
Telemetry is enabled in the RHOSO environment.
Note
To verify that the telemetry service is operating normally, enter the following command:
$ oc -n openstack get monitoringstacks metric-storage -o yaml
The monitoringstacks CRD indicates whether telemetry is enabled correctly.
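
The prerequisite that the external Prometheus instance accepts client TLS certificates and acts as a remote write receiver can be sketched as follows. This is a minimal illustration, not a supported configuration: the file names server.crt, server.key, ca.crt, and prometheus.yml and the WEB_CONFIG variable are placeholders that do not come from this chapter.

```shell
#!/bin/sh
# Sketch: write a Prometheus web configuration that requires client certificates.
# All file names below are placeholders (assumptions), not values from this chapter.
WEB_CONFIG="${WEB_CONFIG:-./web-config.yml}"

cat > "$WEB_CONFIG" <<'EOF'
tls_server_config:
  cert_file: server.crt
  key_file: server.key
  # Reject clients that do not present a certificate signed by client_ca_file.
  client_auth_type: RequireAndVerifyClientCert
  client_ca_file: ca.crt
EOF

# Start Prometheus with the remote write receiver enabled.
# Left commented out on purpose; run it on the external Prometheus host:
# prometheus --config.file=prometheus.yml \
#   --web.config.file="$WEB_CONFIG" \
#   --web.enable-remote-write-receiver
```

The ca.crt file referenced here would be the certificate authority that signed the client certificates that you later store in the mtls-bundle secrets.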

Procedure

Configure your RHOSO management cluster to send metrics to Prometheus:

Create a secret that is named mtls-bundle in the openstack namespace that contains HTTPS client certificates for authentication to Prometheus by entering the following command:

$ oc --namespace openstack \
    create secret generic mtls-bundle \
        --from-file=./ca.crt \
        --from-file=osp-client.crt \
        --from-file=osp-client.key

Open the controlplane configuration for editing by running the following command:
```
$ oc -n openstack edit openstackcontrolplane/controlplane
```

With the configuration open, replace the .spec.telemetry.template.metricStorage section so that RHOSO sends metrics to Prometheus. As an example:

      metricStorage:
        customMonitoringStack:
          alertmanagerConfig:
            disabled: false
          logLevel: info
          prometheusConfig:
            scrapeInterval: 30s
            remoteWrite:
            - url: https://external-prometheus.example.com/api/v1/write 
              tlsConfig:
                ca:
                  secret:
                    name: mtls-bundle
                    key: ca.crt
                cert:
                  secret:
                    name: mtls-bundle
                    key: osp-client.crt
                keySecret:
                  name: mtls-bundle
                  key: osp-client.key
            replicas: 2
          resourceSelector:
            matchLabels:
              service: metricStorage
          resources:
            limits:
              cpu: 500m
              memory: 512Mi
            requests:
              cpu: 100m
              memory: 256Mi
          retention: 1d 
        dashboardsEnabled: false
        dataplaneNetwork: ctlplane
        enabled: true
        prometheusTls: {}

url: Replace this URL with the URL of your Prometheus instance.
retention: Set a retention period. Because the metrics are also collected externally, you can optionally reduce retention for local metrics.

Configure the tenant cluster on which your workloads run to send metrics to Prometheus:

Create a cluster monitoring config map as a YAML file. The map must include a remote write configuration and cluster identifiers. As an example:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      retention: 1d 
      remoteWrite:
      - url: "https://external-prometheus.example.com/api/v1/write"
        writeRelabelConfigs:
        - sourceLabels:
          - __tmp_openshift_cluster_id__
          targetLabel: cluster_id
          action: replace
        tlsConfig:
          ca:
            secret:
              name: mtls-bundle
              key: ca.crt
          cert:
            secret:
              name: mtls-bundle
              key: ocp-client.crt
          keySecret:
            name: mtls-bundle
            key: ocp-client.key

retention: Set a retention period. Because the metrics are also collected externally, you can optionally reduce retention for local metrics.

Save the config map as a file called cluster-monitoring-config.yaml.

Create a secret that is named mtls-bundle in the openshift-monitoring namespace that contains HTTPS client certificates for authentication to Prometheus by entering the following command:

$ oc --namespace openshift-monitoring \
    create secret generic mtls-bundle \
        --from-file=./ca.crt \
        --from-file=ocp-client.crt \
        --from-file=ocp-client.key

Apply the cluster monitoring configuration by running the following command:
```
$ oc apply -f cluster-monitoring-config.yaml
```

After the changes propagate, you can see aggregated metrics in your external Prometheus instance.
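
One way to confirm the result is to query the external Prometheus instance over the same mTLS connection. The following sketch assumes that the instance also serves its query API on the example URL used in this section, that the tenant cluster's client certificate files are in the current directory, and that up series are forwarded by remote write. The names PROM_URL, check_bundle, and query_cluster_ids are hypothetical helpers introduced only for this illustration.

```shell
#!/bin/sh
# Sketch: confirm the mTLS bundle is complete, then query the external Prometheus.
# PROM_URL defaults to the example host used in this section (an assumption).
PROM_URL="${PROM_URL:-https://external-prometheus.example.com}"

# Return 0 only if every file passed as an argument is readable.
# Unreadable files are reported on stderr.
check_bundle() {
  missing=0
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "missing or unreadable: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Count the up series per cluster_id label. Series written by the tenant
# cluster carry the cluster_id label set by writeRelabelConfigs; series
# from RHOSO are expected to appear without it.
query_cluster_ids() {
  curl -sS -G \
    --cacert ca.crt --cert ocp-client.crt --key ocp-client.key \
    --data-urlencode 'query=count by (cluster_id) (up)' \
    "${PROM_URL}/api/v1/query"
}

# Usage (not run automatically):
#   check_bundle ca.crt ocp-client.crt ocp-client.key && query_cluster_ids
```

If the response lists a result with a non-empty cluster_id and another without one, metrics from both the tenant cluster and RHOSO are likely arriving.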

9.2. Collecting cluster metrics from the federation endpoint

You can use the federation endpoint of your OpenShift Container Platform cluster to make its metrics available to a Red Hat OpenStack Services on OpenShift (RHOSO) cluster for pull-based monitoring.

Prerequisites

You have administrative access to RHOSO and the tenant cluster that is running on it.
Telemetry is enabled in the RHOSO environment.
The Cluster Observability Operator is installed on your cluster.
The monitoring stack for your cluster is configured.
Your cluster has its federation endpoint exposed.

Procedure

Connect to your cluster by using a username and password; do not log in by using a kubeconfig file that was generated by the installation program.
To retrieve a token from the OpenShift Container Platform cluster, run the following command on it:
```
$ oc whoami -t
```

Make the token available as a secret in the openstack namespace in the RHOSO management cluster by running the following command:

$ oc -n openstack create secret generic ocp-federated --from-literal=token=<the_token_fetched_previously>

To get the Prometheus federation route URL from your OpenShift Container Platform cluster, run the following command:

$ oc -n openshift-monitoring get route prometheus-k8s-federate -ojsonpath={'.status.ingress[].host'}

Write a manifest for a scrape configuration and save it as a file called cluster-scrape-config.yaml. As an example:

apiVersion: monitoring.rhobs/v1alpha1
kind: ScrapeConfig
metadata:
  labels:
    service: metricStorage
  name: sos1-federated
  namespace: openstack
spec:
  params:
    'match[]':
    - '{__name__=~"kube_node_info|kube_persistentvolume_info|cluster:master_nodes"}' 
  metricsPath: '/federate'
  authorization:
    type: Bearer
    credentials:
      name: ocp-federated 
      key: token
  scheme: HTTPS # or HTTP
  scrapeInterval: 30s 
  staticConfigs:
  - targets:
    - prometheus-k8s-federate-openshift-monitoring.apps.openshift.example

params.match[]: Add metrics here. In this example, only the metrics kube_node_info, kube_persistentvolume_info, and cluster:master_nodes are requested.
authorization.credentials.name: Insert the name of the secret that you created previously.
scrapeInterval: Limit scraping to fewer than 1000 samples for each request, with a maximum frequency of once every 30 seconds.
staticConfigs.targets: Insert the route host that you fetched previously. If the endpoint uses HTTPS with a custom certificate authority, add a tlsConfig section after it.

While connected to the RHOSO management cluster, apply the manifest by running the following command:
```
$ oc apply -f cluster-scrape-config.yaml
```

After the config propagates, the cluster metrics are accessible for querying in the OpenShift Container Platform UI in RHOSO.
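
If the metrics do not appear, it can help to request the federation endpoint by hand with the same token and selector. The following is a minimal sketch: FEDERATE_HOST defaults to the example route host from the manifest in this section, and federate_url and check_federation are hypothetical helper names. It assumes that the route certificate is trusted by the machine that runs curl; with a custom certificate authority, a --cacert option would also be needed.

```shell
#!/bin/sh
# Sketch: request the federation endpoint manually with a bearer token.
# The default host is the example target from the ScrapeConfig manifest.
FEDERATE_HOST="${FEDERATE_HOST:-prometheus-k8s-federate-openshift-monitoring.apps.openshift.example}"

# Same series selector as the match[] parameter in the manifest.
SELECTOR='{__name__=~"kube_node_info|kube_persistentvolume_info|cluster:master_nodes"}'

# Build the federation URL for a given route host.
federate_url() {
  printf 'https://%s/federate\n' "$1"
}

# $1: bearer token, for example the output of "oc whoami -t".
# curl URL-encodes the selector and sends it as the match[] query parameter.
check_federation() {
  curl -sS -G \
    -H "Authorization: Bearer $1" \
    --data-urlencode "match[]=${SELECTOR}" \
    "$(federate_url "$FEDERATE_HOST")"
}

# Usage (not run automatically):
#   check_federation "$(oc whoami -t)" | head
```

A response in the Prometheus text exposition format that contains kube_node_info samples suggests that the token and the route are usable by the RHOSO monitoring stack.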

9.3. Available metrics for clusters that run on RHOSO

To help you query metrics and identify resources across the stack, helper metrics establish a correlation between Red Hat OpenStack Services on OpenShift (RHOSO) infrastructure resources and their representations in the tenant OpenShift Container Platform cluster.

To map nodes with RHOSO compute instances, in the metric kube_node_info:

node is the Kubernetes node name.
provider_id contains the identifier of the corresponding compute service instance.

To map persistent volumes with RHOSO block storage volumes or shared file system shares, in the metric kube_persistentvolume_info:

persistentvolume is the volume name.
csi_volume_handle is the block storage volume or share identifier.
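
As an illustration of the node mapping, the compute instance identifier can be extracted from a provider_id label value. This sketch assumes that the value has the form openstack:///<instance_uuid> or openstack://<region>/<instance_uuid>; verify the format in your own kube_node_info series before relying on it. The function name and the sample UUID are hypothetical.

```shell
#!/bin/sh
# Sketch: extract the compute instance UUID from a provider_id label value.
# Assumption: the UUID is the last path segment of the provider_id.
instance_id_from_provider_id() {
  # Strip everything up to and including the last "/".
  printf '%s\n' "${1##*/}"
}

# Example with a made-up UUID:
instance_id_from_provider_id 'openstack:///3f0c2a9e-1b7d-4c55-9a3e-6d2f8b1c4e70'
```

The resulting identifier can then be compared with the instance identifiers that the compute service reports.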

By default, the compute machines that back the cluster control plane nodes are created in a server group with a soft anti-affinity policy. As a result, the compute service creates them on separate hypervisors on a best-effort basis. However, if the state of the RHOSO cluster is not appropriate for this distribution, the machines are created anyway.

In combination with the default soft anti-affinity policy, you can configure an alert that activates when a hypervisor hosts more than one control plane node of a given cluster to highlight the degraded level of high availability.

As an example, this PromQL query returns the number of OpenShift Container Platform master nodes per RHOSO host:

sum by (vm_instance) (
  group by (vm_instance, resource) (ceilometer_cpu)
    / on (resource) group_right(vm_instance) (
      group by (node, resource) (
        label_replace(kube_node_info, "resource", "$1", "system_uuid", "(.+)")
      )
    / on (node) group_left group by (node) (
      cluster:master_nodes
    )
  )
)
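
The alert described above could be expressed as a rule that fires when the query returns a value greater than 1. The following sketch writes such a manifest to a file. It assumes that the RHOSO monitoring stack selects PrometheusRule objects of the monitoring.rhobs/v1 API that carry the service: metricStorage label, as it does for the scrape configuration. The rule name, alert name, duration, and severity are hypothetical values chosen for illustration.

```shell
#!/bin/sh
# Sketch: write an alerting rule that fires when one hypervisor hosts more
# than one control plane node. Names, duration, and severity are assumptions.
RULE_FILE="${RULE_FILE:-./control-plane-colocation-rule.yaml}"

# The quoted 'EOF' keeps "$1" and $labels literal instead of expanding them.
cat > "$RULE_FILE" <<'EOF'
apiVersion: monitoring.rhobs/v1
kind: PrometheusRule
metadata:
  name: control-plane-colocation
  namespace: openstack
  labels:
    service: metricStorage
spec:
  groups:
  - name: control-plane-placement
    rules:
    - alert: ControlPlaneNodesShareHypervisor
      expr: |
        sum by (vm_instance) (
          group by (vm_instance, resource) (ceilometer_cpu)
            / on (resource) group_right(vm_instance) (
              group by (node, resource) (
                label_replace(kube_node_info, "resource", "$1", "system_uuid", "(.+)")
              )
            / on (node) group_left group by (node) (
              cluster:master_nodes
            )
          )
        ) > 1
      for: 15m
      labels:
        severity: warning
      annotations:
        summary: "Hypervisor {{ $labels.vm_instance }} hosts more than one control plane node."
EOF

# Apply it on the RHOSO management cluster (left commented out on purpose):
# oc apply -f "$RULE_FILE"
```

Because the anti-affinity policy is soft, a warning severity is used in this sketch rather than a critical one; the cluster keeps running, but with reduced high availability.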
