Chapter 3. Using observability


Use the observability service to view the utilization of clusters across your fleet.

3.1. Querying metrics using the observability API

Observability provides an external API for metrics to be queried through the OpenShift route, rbac-query-proxy. See the following options to get your queries for the rbac-query-proxy route:

  • You can get the details of the route with the following command:

    oc get route rbac-query-proxy -n open-cluster-management-observability
  • You can also access the rbac-query-proxy route with your OpenShift OAuth access token. The token should be associated with a user or service account, which has permission to get namespaces. For more information, see Managing user-owned OAuth access tokens.

Complete the following steps to create proxy-byo-cert secrets for observability:

  1. Get the default CA certificate and store the content of the key tls.crt in a local file. Run the following command:

    oc -n openshift-ingress get secret router-certs-default -o jsonpath="{.data.tls\.crt}" | base64 -d > ca.crt
  2. Run the following command to query metrics:

    curl --cacert ./ca.crt -H "Authorization: Bearer {TOKEN}" https://{PROXY_ROUTE_URL}/api/v1/query?query={QUERY_EXPRESSION}

    Note: The QUERY_EXPRESSION is the standard Prometheus query expression. For example, query the metrics cluster_infrastructure_provider by replacing the URL in the previously mentioned command with the following URL: https://{PROXY_ROUTE_URL}/api/v1/query?query=cluster_infrastructure_provider. For more details, see Querying Prometheus.

  3. Run the following command to create proxy-byo-ca secrets using the generated certificates:

    oc -n open-cluster-management-observability create secret tls proxy-byo-ca --cert ./ca.crt --key ./ca.key
  4. Create proxy-byo-cert secrets using the generated certificates by using the following command:

    oc -n open-cluster-management-observability create secret tls proxy-byo-cert --cert ./ingress.crt --key ./ingress.key

3.2. Exporting metrics to external endpoints

Export metrics to external endpoints, which support the Prometheus Remote-Write specification in real time. Complete the following steps to export metrics to external endpoints:

  1. Create the Kubernetes secret for an external endpoint with the access information of the external endpoint in the open-cluster-management-observability namespace. View the following example secret:

    apiVersion: v1
    kind: Secret
    metadata:
      name: victoriametrics
      namespace: open-cluster-management-observability
    type: Opaque
    stringData:
      ep.yaml: |
        url: http://victoriametrics:8428/api/v1/write
        http_client_config:
          basic_auth:
            username: test
            password: test

    The ep.yaml is the key of the content and is used in the MultiClusterObservability custom resource in next step. Currently, observability supports exporting metrics to endpoints without any security checks, with basic authentication or with tls enablement. View the following tables for a full list of supported parameters:

    NameDescriptionSchema

    url
    required

    URL for the external endpoint.

    string

    http_client_config
    optional

    Advanced configuration for the HTTP client.

    HttpClientConfig

    HttpClientConfig

    NameDescriptionSchema

    basic_auth
    optional

    HTTP client configuration for basic authentication.

    BasicAuth

    tls_config
    optional

    HTTP client configuration for TLS.

    TLSConfig

    BasicAuth

    NameDescriptionSchema

    username
    optional

    User name for basic authorization.

    string

    password
    optional

    Password for basic authorization.

    string

    TLSConfig

    Name

    Description

    Schema

    secret_name
    required

    Name of the secret that contains certificates.

    string

    ca_file_key
    optional

    Key of the CA certificate in the secret (only optional if insecure_skip_verify is set to true).

    string

    cert_file_key
    required

    Key of the client certificate in the secret.

    string

    key_file_key
    required

    Key of the client key in the secret.

    string

    insecure_skip_verify
    optional

    Parameter to skip the verification for target certificate.

    bool

  2. Add the writeStorage parameter to the MultiClusterObservability custom resource for adding a list of external endppoints that you want to export. View the following example:

    spec:
      storageConfig:
        writeStorage: 1
        - key: ep.yaml
          name: victoriametrics
    1
    Each item contains two attributes: name and key. Name is the name of the Kubernetes secret that contains endpoint access information, and key is the key of the content in the secret. If you add more than one item to the list, then the metrics are exported to multiple external endpoints.
  3. View the status of metric export after the metrics export is enabled by checking the acm_remote_write_requests_total metric.

    1. From the OpenShift Container Platform console of your hub cluster, navigate to the Metrics page by clicking Metrics in the Observe section.
    2. Then query the acm_remote_write_requests_total metric. The value of that metric is the total number of requests with a specific response for one external endpoint, on one observatorium API instance. The name label is the name for the external endpoint. The code label is the return code of the HTTP request for the metrics export.

3.3. Viewing and exploring data by using dashboards

View the data from your managed clusters by accessing Grafana from the hub cluster. You can query specific alerts and add filters for the query.

For example, to explore the cluster_infrastructure_provider alert from a single-node OpenShift cluster, use the following query expression: cluster_infrastructure_provider{clusterType="SNO"}

Note: Do not set the ObservabilitySpec.resources.CPU.limits parameter if observability is enabled on single node managed clusters. When you set the CPU limits, it causes the observability pod to be counted against the capacity for your managed cluster. See the reference for Management Workload Partitioning in the Additional resources section.

3.3.1. Viewing historical data

When you query historical data, manually set your query parameter options to control how much data is displayed from the dashboard. Complete the following steps:

  1. From your hub cluster, select the Grafana link that is in the console header.
  2. Edit your cluster dashboard by selecting Edit Panel.
  3. From the Query front-end data source in Grafana, click the Query tab.
  4. Select $datasource.
  5. If you want to see more data, increase the value of the Step parameter section. If the Step parameter section is empty, it is automatically calculated.
  6. Find the Custom query parameters field and select max_source_resolution=auto.
  7. To verify that the data is displayed, refresh your Grafana page.

Your query data appears from the Grafana dashboard.

3.3.2. Viewing Red Hat Advanced Cluster Management dashboards

When you enable the Red Hat Advanced Cluster Management observability service, three dashboards become available. the following dashboard descriptions:

  • Alert Analysis: Overview dashboard of the alerts being generated within the managed cluster fleet.
  • Clusters by Alert: Alert dashboard where you can filter by the alert name.
  • Alerts by Cluster: Alert dashboard where you can filter by cluster, and view real-time data for alerts that are initiated or pending within the cluster environment.

3.3.3. Viewing the etcd table

You can also view the etcd table from the hub cluster dashboard in Grafana to learn the stability of the etcd as a data store. Select the Grafana link from your hub cluster to view the etcd table data, which is collected from your hub cluster. The Leader election changes across managed clusters are displayed.

3.3.4. Viewing the Kubernetes API server dashboard

View the following options to view the Kubernetes API server dashboards:

  • View the cluster fleet Kubernetes API service-level overview from the hub cluster dashboard in Grafana.

    1. Navigate to the Grafana dashboard.
    2. Access the managed dashboard menu by selecting Kubernetes > Service-Level Overview > API Server. The Fleet Overview and Top Cluster details are displayed.

      The total number of clusters that are exceeding or meeting the targeted service-level objective (SLO) value for the past seven or 30-day period, offending and non-offending clusters, and API Server Request Duration is displayed.

  • View the Kubernetes API service-level overview table from the hub cluster dashboard in Grafana.

    1. Navigate to the Grafana dashboard from your hub cluster.
    2. Access the managed dashboard menu by selecting Kubernetes > Service-Level Overview > API Server. The Fleet Overview and Top Cluster details are displayed.

      The error budget for the past seven or 30-day period, the remaining downtime, and trend are displayed.

3.4. Additional resources

3.5. Using Grafana dashboards

Use Grafana dashboards to view hub cluster and managed cluster metrics. The data displayed in the Grafana alerts dashboard relies on alerts metrics, originating from managed clusters. The alerts metric does not affect managed clusters forwarding alerts to Red Hat Advanced Cluster Management alert manager on the hub cluster. Therefore, the metrics and alerts have distinct propagation mechanisms and follow separate code paths.

Even if you see data in the Grafana alerts dashboard, that does not guarantee that the managed cluster alerts are successfully forwarding to the Red Hat Advanced Cluster Management hub cluster alert manager. If the metrics are propagated from the managed clusters, you can view the data displayed in the Grafana alerts dashboard.

To use the Grafana dashboards for your development needs, complete the following:

3.5.1. Setting up the Grafana developer instance

You can design your Grafana dashboard by creating a grafana-dev instance. Be sure to use the most current grafana-dev instance.

Complete the following steps to set up the Grafana developer instance:

  1. Clone the open-cluster-management/multicluster-observability-operator/ repository, so that you are able to run the scripts that are in the tools folder.
  2. Run the setup-grafana-dev.sh to setup your Grafana instance. When you run the script the following resources are created: secret/grafana-dev-config, deployment.apps/grafana-dev, service/grafana-dev, ingress.extensions/grafana-dev, persistentvolumeclaim/grafana-dev:

    ./setup-grafana-dev.sh --deploy
    secret/grafana-dev-config created
    deployment.apps/grafana-dev created
    service/grafana-dev created
    serviceaccount/grafana-dev created
    clusterrolebinding.rbac.authorization.k8s.io/open-cluster-management:grafana-crb-dev created
    route.route.openshift.io/grafana-dev created
    persistentvolumeclaim/grafana-dev created
    oauthclient.oauth.openshift.io/grafana-proxy-client-dev created
    deployment.apps/grafana-dev patched
    service/grafana-dev patched
    route.route.openshift.io/grafana-dev patched
    oauthclient.oauth.openshift.io/grafana-proxy-client-dev patched
    clusterrolebinding.rbac.authorization.k8s.io/open-cluster-management:grafana-crb-dev patched
  3. Switch the user role to Grafana administrator with the switch-to-grafana-admin.sh script.

    1. Select the Grafana URL, https:grafana-dev-open-cluster-management-observability.{OPENSHIFT_INGRESS_DOMAIN}, and log in.
    2. Then run the following command to add the switched user as Grafana administrator. For example, after you log in using kubeadmin, run following command:

      ./switch-to-grafana-admin.sh kube:admin
      User <kube:admin> switched to be grafana admin

The Grafana developer instance is set up.

3.5.1.1. Verifying Grafana version

Verify the Grafana version from the command line interface (CLI) or from the Grafana user interface.

After you log in to your hub cluster, access the observabilty-grafana pod terminal. Run the following command:

grafana-cli

The Grafana version that is currently deployed within the cluster environment is displayed.

Alternatively, you can navigate to the Manage tab in the Grafana dashboard. Scroll to the end of the page, where the version is listed.

3.5.2. Designing your Grafana dashboard

After you set up the Grafana instance, you can design the dashboard. Complete the following steps to refresh the Grafana console and design your dashboard:

  1. From the Grafana console, create a dashboard by selecting the Create icon from the navigation panel. Select Dashboard, and then click Add new panel.
  2. From the New Dashboard/Edit Panel view, navigate to the Query tab.
  3. Configure your query by selecting Observatorium from the data source selector and enter a PromQL query.
  4. From the Grafana dashboard header, click the Save icon that is in the dashboard header.
  5. Add a descriptive name and click Save.

3.5.2.1. Designing your Grafana dashboard with a ConfigMap

Design your Grafana dashboard with a ConfigMap. You can use the generate-dashboard-configmap-yaml.sh script to generate the dashboard ConfigMap, and to save the ConfigMap locally:

./generate-dashboard-configmap-yaml.sh "Your Dashboard Name"
Save dashboard <your-dashboard-name> to ./your-dashboard-name.yaml

If you do not have permissions to run the previously mentioned script, complete the following steps:

  1. Select a dashboard and click the Dashboard settings icon.
  2. Click the JSON Model icon from the navigation panel.
  3. Copy the dashboard JSON data and paste it in the data section.
  4. Modify the name and replace $your-dashboard-name. Enter a universally unique identifier (UUID) in the uid field in data.$your-dashboard-name.json.$$your_dashboard_json. You can use a program such as uuidegen to create a UUID. Your ConfigMap might resemble the following file:

    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: $your-dashboard-name
      namespace: open-cluster-management-observability
      labels:
        grafana-custom-dashboard: "true"
    data:
      $your-dashboard-name.json: |-
        $your_dashboard_json

    Notes:

    • If your dashboard is created within the grafana-dev instance, you can take the name of the dashboard and pass it as an argument in the script. For example, a dashboard named Demo Dashboard is created in the grafana-dev instance. From the CLI, you can run the following script:

      ./generate-dashboard-configmap-yaml.sh "Demo Dashboard"

      After running the script, you might receive the following message:

      Save dashboard <demo-dashboard> to ./demo-dashboard.yaml
    • If your dashboard is not in the General folder, you can specify the folder name in the annotations section of this ConfigMap:

      annotations:
        observability.open-cluster-management.io/dashboard-folder: Custom

      After you complete your updates for the ConfigMap, you can install it to import the dashboard to the Grafana instance.

Verify that the YAML file is created by applying the YAML from the CLI or OpenShift Container Platform console. A ConfigMap within the open-cluster-management-observability namespace is created. Run the following command from the CLI:

oc apply -f demo-dashboard.yaml

From the OpenShift Container Platform console, create the ConfigMap using the demo-dashboard.yaml file. The dashboard is located in the Custom folder.

3.5.3. Uninstalling the Grafana developer instance

When you uninstall the instance, the related resources are also deleted. Run the following command:

./setup-grafana-dev.sh --clean
secret "grafana-dev-config" deleted
deployment.apps "grafana-dev" deleted
serviceaccount "grafana-dev" deleted
route.route.openshift.io "grafana-dev" deleted
persistentvolumeclaim "grafana-dev" deleted
oauthclient.oauth.openshift.io "grafana-proxy-client-dev" deleted
clusterrolebinding.rbac.authorization.k8s.io "open-cluster-management:grafana-crb-dev" deleted

3.5.4. Additional resources

Red Hat logoGithubRedditYoutubeTwitter

Learn

Try, buy, & sell

Communities

About Red Hat Documentation

We help Red Hat users innovate and achieve their goals with our products and services with content they can trust.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. For more details, see the Red Hat Blog.

About Red Hat

We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.

© 2024 Red Hat, Inc.