Monitoring
Configuring and using the monitoring stack in OpenShift Container Platform
Abstract
Chapter 1. Cluster monitoring
1.1. About cluster monitoring
OpenShift Container Platform includes a pre-configured, pre-installed, and self-updating monitoring stack that is based on the Prometheus open source project and its wider ecosystem. It provides monitoring of cluster components, a set of alerts to immediately notify the cluster administrator about any occurring problems, and a set of Grafana dashboards. The cluster monitoring stack is only supported for monitoring OpenShift Container Platform clusters.
To ensure compatibility with future OpenShift Container Platform updates, configuring only the specified monitoring stack options is supported.
1.1.1. Stack components and monitored targets
The monitoring stack includes these components:
Component | Description |
---|---|
Cluster Monitoring Operator | The OpenShift Container Platform Cluster Monitoring Operator (CMO) is the central component of the stack. It controls the deployed monitoring components and resources and ensures that they are always up to date. |
Prometheus Operator | The Prometheus Operator (PO) creates, configures, and manages Prometheus and Alertmanager instances. It also automatically generates monitoring target configurations based on familiar Kubernetes label queries. |
Prometheus | Prometheus is the systems and service monitoring system around which the monitoring stack is based. |
Prometheus Adapter | The Prometheus Adapter exposes the cluster resource metrics API for horizontal pod autoscaling. Resource metrics are CPU and memory utilization. |
Alertmanager | The Alertmanager service handles alerts sent by Prometheus. |
kube-state-metrics | The kube-state-metrics exporter agent converts Kubernetes objects into metrics that Prometheus can consume. |
openshift-state-metrics | The openshift-state-metrics exporter expands upon kube-state-metrics by adding metrics for OpenShift Container Platform-specific resources. |
node-exporter | node-exporter is an agent deployed on every node to collect metrics about it. |
Thanos Querier | The Thanos Querier enables aggregating and, optionally, deduplicating cluster and user workload metrics under a single, multi-tenant interface. |
Grafana | The Grafana analytics platform provides dashboards for analyzing and visualizing the metrics. The Grafana instance that is provided with the monitoring stack, along with its dashboards, is read-only. |
All the components of the monitoring stack are monitored by the stack and are automatically updated when OpenShift Container Platform is updated.
In addition to the components of the stack itself, the monitoring stack monitors:
- CoreDNS
- Elasticsearch (if Logging is installed)
- etcd
- Fluentd (if Logging is installed)
- HAProxy
- Image registry
- Kubelets
- Kubernetes apiserver
- Kubernetes controller manager
- Kubernetes scheduler
- Metering (if Metering is installed)
- OpenShift apiserver
- OpenShift controller manager
- Operator Lifecycle Manager (OLM)
- Telemeter client
Each OpenShift Container Platform component is responsible for its monitoring configuration. For problems with a component’s monitoring, open a bug in Bugzilla against that component, not against the general monitoring component.
Other OpenShift Container Platform framework components might be exposing metrics as well. For details, see their respective documentation.
1.1.2. Next steps
1.2. Configuring the monitoring stack
Prior to OpenShift Container Platform 4, the Prometheus Cluster Monitoring stack was configured through the Ansible inventory file. For that purpose, the stack exposed a subset of its available configuration options as Ansible variables. You configured the stack before you installed OpenShift Container Platform.
In OpenShift Container Platform 4, Ansible is no longer the primary technology used to install OpenShift Container Platform. The installation program provides only a small number of configuration options before installation. Configuring most OpenShift framework components, including the Prometheus Cluster Monitoring stack, happens after installation.
This section explains what configuration is supported, shows how to configure the monitoring stack, and demonstrates several common configuration scenarios.
1.2.1. Prerequisites
- The monitoring stack imposes additional resource requirements. Consult the computing resources recommendations in Scaling the Cluster Monitoring Operator and verify that you have sufficient resources.
1.2.2. Maintenance and support
The supported way of configuring OpenShift Container Platform Monitoring is by using the options described in this document. Do not use other configurations, because they are unsupported. Configuration paradigms might change across Prometheus releases, and such cases can only be handled gracefully if all configuration possibilities are controlled. If you use configurations other than those described in this section, your changes will disappear, because the cluster-monitoring-operator reconciles any differences. The Operator resets everything to the defined state by default and by design.
Explicitly unsupported cases include:
- Creating additional ServiceMonitor objects in the openshift-* namespaces. This extends the targets the cluster monitoring Prometheus instance scrapes, which can cause collisions and load differences that cannot be accounted for. These factors might make the Prometheus setup unstable.
- Creating unexpected ConfigMap objects or PrometheusRule objects. This causes the cluster monitoring Prometheus instance to include additional alerting and recording rules.
- Modifying resources of the stack. The Prometheus Monitoring Stack ensures its resources are always in the state it expects them to be. If they are modified, the stack will reset them.
- Using resources of the stack for your purposes. The resources created by the Prometheus Cluster Monitoring stack are not meant to be used by any other resources, as there are no guarantees about their backward compatibility.
- Stopping the Cluster Monitoring Operator from reconciling the monitoring stack.
- Adding new alerting rules.
- Modifying the monitoring stack Grafana instance.
1.2.3. Creating a cluster monitoring ConfigMap
To configure the OpenShift Container Platform monitoring stack, you must create the cluster monitoring ConfigMap.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have installed the OpenShift CLI (oc).
Procedure
- Check whether the cluster-monitoring-config ConfigMap object exists:

  $ oc -n openshift-monitoring get configmap cluster-monitoring-config

- If the ConfigMap does not exist:
  - Create a YAML manifest for the ConfigMap. In this example the file is called cluster-monitoring-config.yaml. A minimal example manifest is shown after this procedure.
  - Apply the configuration to create the ConfigMap:

    $ oc apply -f cluster-monitoring-config.yaml
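A minimal sketch of such a manifest follows. An empty config.yaml key is enough to create the object; component configuration is added under that key later:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |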
1.2.4. Configuring the cluster monitoring stack
You can configure the Prometheus Cluster Monitoring stack using ConfigMaps. ConfigMaps configure the Cluster Monitoring Operator, which in turn configures components of the stack.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have installed the OpenShift CLI (oc).
- You have created the cluster-monitoring-config ConfigMap object.
Procedure
- Start editing the cluster-monitoring-config ConfigMap:

  $ oc -n openshift-monitoring edit configmap cluster-monitoring-config

- Put your configuration under data/config.yaml as a key-value pair <component_name>: <component_configuration>. Substitute <component> and <configuration_for_the_component> accordingly.

  For example, create this ConfigMap to configure a Persistent Volume Claim (PVC) for Prometheus (a sketch is shown after this procedure). Here, prometheusK8s defines the Prometheus component and the lines under it define its configuration.
- Save the file to apply the changes. The pods affected by the new configuration are restarted automatically.
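A sketch of the PVC example described above; the storage class name (fast) and the requested size are assumptions that you would replace with values valid for your cluster:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      volumeClaimTemplate:
        spec:
          storageClassName: fast
          volumeMode: Filesystem
          resources:
            requests:
              storage: 40Gi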
Additional resources
- See Creating a cluster monitoring ConfigMap to learn how to create the cluster-monitoring-config ConfigMap object.
1.2.5. Configurable monitoring components
This table shows the monitoring components you can configure and the keys used to specify the components in the ConfigMap:
Component | Key |
---|---|
Prometheus Operator | prometheusOperator |
Prometheus | prometheusK8s |
Alertmanager | alertmanagerMain |
kube-state-metrics | kubeStateMetrics |
openshift-state-metrics | openshiftStateMetrics |
Grafana | grafana |
Telemeter Client | telemeterClient |
Prometheus Adapter | k8sPrometheusAdapter |
From this list, only Prometheus and Alertmanager have extensive configuration options. All other components usually provide only the nodeSelector field for being deployed on a specified node.
1.2.6. Moving monitoring components to different nodes
You can move any of the monitoring stack components to specific nodes.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have installed the OpenShift CLI (oc).
- You have created the cluster-monitoring-config ConfigMap object.
Procedure
- Start editing the cluster-monitoring-config ConfigMap:

  $ oc -n openshift-monitoring edit configmap cluster-monitoring-config

- Specify the nodeSelector constraint for the component under data/config.yaml. Substitute <component> accordingly and substitute <node_key>: <node_value> with the map of key-value pairs that specifies the destination node. Often, only a single key-value pair is used.

  The component can only run on a node that has each of the specified key-value pairs as labels. The node can have additional labels as well.

  For example, to move components to the node that is labeled foo: bar, use a configuration like the sketch shown after this procedure.
- Save the file to apply the changes. The components affected by the new configuration are moved to new nodes automatically.
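A sketch of such a configuration, assuming you want to move Prometheus and Alertmanager to nodes labeled foo: bar (the component keys come from the table in "Configurable monitoring components"):

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      nodeSelector:
        foo: bar
    alertmanagerMain:
      nodeSelector:
        foo: bar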
Additional resources
- See Creating a cluster monitoring ConfigMap to learn how to create the cluster-monitoring-config ConfigMap object.
- See Placing pods on specific nodes using node selectors for more information about using node selectors.
- See the Kubernetes documentation for details on the nodeSelector constraint.
1.2.7. Assigning tolerations to monitoring components
You can assign tolerations to any of the monitoring stack components to enable moving them to tainted nodes.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have installed the OpenShift CLI (oc).
- You have created the cluster-monitoring-config ConfigMap object.
Procedure
- Start editing the cluster-monitoring-config ConfigMap:

  $ oc -n openshift-monitoring edit configmap cluster-monitoring-config

- Specify tolerations for the component. Substitute <component> and <toleration_specification> accordingly.

  For example, the taint added by oc adm taint nodes node1 key1=value1:NoSchedule prevents the scheduler from placing pods on node1. To make the alertmanagerMain component ignore that taint and allow it to be scheduled on that node normally, use a toleration like the sketch shown after this procedure.
- Save the file to apply the changes. The new component placement configuration is applied automatically.
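A sketch of the corresponding ConfigMap, assuming the key1=value1:NoSchedule taint from the example above:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    alertmanagerMain:
      tolerations:
      - key: "key1"
        operator: "Equal"
        value: "value1"
        effect: "NoSchedule"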
Additional resources
- See Creating a cluster monitoring ConfigMap to learn how to create the cluster-monitoring-config ConfigMap object.
- See the OpenShift Container Platform documentation on taints and tolerations.
- See the Kubernetes documentation on taints and tolerations.
1.2.8. Configuring persistent storage
Running cluster monitoring with persistent storage means that your metrics are stored to a persistent volume (PV) and can survive a pod being restarted or recreated. This is ideal if you require your metrics or alerting data to be guarded from data loss. For production environments, it is highly recommended to configure persistent storage. Because of the high IO demands, it is advantageous to use local storage.
1.2.9. Prerequisites
- Dedicate sufficient local persistent storage to ensure that the disk does not become full. How much storage you need depends on the number of pods. For information on system requirements for persistent storage, see Prometheus database storage requirements.
- Make sure you have a persistent volume (PV) ready to be claimed by the persistent volume claim (PVC), one PV for each replica. Because Prometheus has two replicas and Alertmanager has three replicas, you need five PVs to support the entire monitoring stack. The PVs should be available from the Local Storage Operator. This does not apply if you enable dynamically provisioned storage.
- Use the block type of storage.
- Configure local persistent storage.
1.2.9.1. Configuring a local persistent volume claim
For Prometheus or Alertmanager to use a persistent volume (PV), you first must configure a persistent volume claim (PVC).
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have installed the OpenShift CLI (oc).
- You have created the cluster-monitoring-config ConfigMap object.
Procedure
- Edit the cluster-monitoring-config ConfigMap:

  $ oc -n openshift-monitoring edit configmap cluster-monitoring-config

- Put your PVC configuration for the component under data/config.yaml. See the Kubernetes documentation on PersistentVolumeClaims for information on how to specify volumeClaimTemplate.

  For example, to configure a PVC that claims local persistent storage for Prometheus or for Alertmanager, use a configuration like the sketches shown after this procedure. In those examples, the storage class created by the Local Storage Operator is called local-storage.
- Save the file to apply the changes. The pods affected by the new configuration are restarted automatically and the new storage configuration is applied.
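Sketches of the two PVC configurations described above; the local-storage class name matches the example, while the requested sizes are assumptions that you would adjust. For Prometheus:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      volumeClaimTemplate:
        spec:
          storageClassName: local-storage
          resources:
            requests:
              storage: 40Gi

For Alertmanager:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    alertmanagerMain:
      volumeClaimTemplate:
        spec:
          storageClassName: local-storage
          resources:
            requests:
              storage: 10Gi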
1.2.9.2. Modifying retention time for Prometheus metrics data
By default, the Prometheus Cluster Monitoring stack configures the retention time for Prometheus data to be 15 days. You can modify the retention time to change how soon the data is deleted.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have installed the OpenShift CLI (oc).
- You have created the cluster-monitoring-config ConfigMap object.
Procedure
- Start editing the cluster-monitoring-config ConfigMap:

  $ oc -n openshift-monitoring edit configmap cluster-monitoring-config

- Put your retention time configuration under data/config.yaml. Substitute <time_specification> with a number directly followed by ms (milliseconds), s (seconds), m (minutes), h (hours), d (days), w (weeks), or y (years).

  For example, to configure the retention time to be 24 hours, use a configuration like the sketch shown after this procedure.
- Save the file to apply the changes. The pods affected by the new configuration are restarted automatically.
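A sketch of the 24-hour retention configuration mentioned above:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      retention: 24h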
Additional resources
- See Creating a cluster monitoring ConfigMap to learn how to create the cluster-monitoring-config ConfigMap object.
- Understanding persistent storage
- Optimizing storage
1.2.10. Configuring Alertmanager
The Prometheus Alertmanager is a component that manages incoming alerts, including:
- Alert silencing
- Alert inhibition
- Alert aggregation
- Reliable deduplication of alerts
- Grouping alerts
- Sending grouped alerts as notifications through receivers such as email, PagerDuty, and HipChat
1.2.10.1. Alertmanager default configuration
The default configuration of the OpenShift Container Platform Monitoring Alertmanager cluster is this:
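A representative sketch of that default configuration, consisting of a default receiver plus a dedicated route and receiver for the Watchdog alert; exact values can differ between releases:

global:
  resolve_timeout: 5m
route:
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  receiver: default
  routes:
  - match:
      alertname: Watchdog
    repeat_interval: 5m
    receiver: watchdog
receivers:
- name: default
- name: watchdog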
OpenShift Container Platform monitoring ships with the Watchdog alert, which fires continuously. Alertmanager repeatedly sends notifications for the Watchdog alert to the notification provider, for example, to PagerDuty. The provider is usually configured to notify the administrator when it stops receiving the Watchdog alert. This mechanism helps ensure continuous operation of Prometheus as well as continuous communication between Alertmanager and the notification provider.
1.2.10.2. Applying custom Alertmanager configuration
You can overwrite the default Alertmanager configuration by editing the alertmanager-main secret inside the openshift-monitoring namespace.
Prerequisites
- An installed jq tool for processing JSON data
Procedure
- Print the currently active Alertmanager configuration into the file alertmanager.yaml:

  $ oc -n openshift-monitoring get secret alertmanager-main --template='{{ index .data "alertmanager.yaml" }}' | base64 -d > alertmanager.yaml

- Change the configuration in the file alertmanager.yaml to your new configuration.

  For example, a listing that configures PagerDuty for notifications is shown after this procedure. With that configuration, alerts of critical severity fired by the example-app service are sent using the team-frontend-page receiver, which means that these alerts are paged to a chosen person.
- Apply the new configuration in the file:

  $ oc -n openshift-monitoring create secret generic alertmanager-main --from-file=alertmanager.yaml --dry-run -o=yaml | oc -n openshift-monitoring replace secret --filename=-
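A sketch of such an alertmanager.yaml, consistent with the example-app service and team-frontend-page receiver described above; the service_key value is a placeholder that you retrieve from PagerDuty:

global:
  resolve_timeout: 5m
route:
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  receiver: default
  routes:
  - match:
      alertname: Watchdog
    repeat_interval: 5m
    receiver: watchdog
  - match:
      service: example-app
    routes:
    - match:
        severity: critical
      receiver: team-frontend-page
receivers:
- name: default
- name: watchdog
- name: team-frontend-page
  pagerduty_configs:
  - service_key: "<your_pagerduty_service_key>"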
Additional resources
- See the PagerDuty official site for more information on PagerDuty.
- See the PagerDuty Prometheus Integration Guide to learn how to retrieve the service_key.
- See Alertmanager configuration for configuring alerting through different alert receivers.
1.2.10.3. Alerting rules
OpenShift Container Platform Cluster Monitoring by default ships with a set of pre-defined alerting rules.
Note that:
- The default alerting rules are used specifically for the OpenShift Container Platform cluster and nothing else. For example, you get alerts for a persistent volume in the cluster, but you do not get them for a persistent volume in your custom namespace.
- Currently you cannot add custom alerting rules.
- Some alerting rules have identical names. This is intentional. They are sending alerts about the same event with different thresholds, with different severity, or both.
- With the inhibition rules, a lower-severity alert is inhibited while the higher-severity alert for the same event is firing.
1.2.10.4. Listing active alerting rules
You can list the alerting rules that currently apply to the cluster.
Procedure
- Configure the necessary port forwarding:

  $ oc -n openshift-monitoring port-forward svc/prometheus-operated 9090

- Fetch the JSON object containing the active alerting rules and their properties, as in the sketch shown after this procedure.
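A sketch of one way to fetch the rules through the port-forward set up above, using curl against the Prometheus rules API and jq to keep only alerting rules (the jq filter is an assumption, not the original listing):

$ curl -s http://localhost:9090/api/v1/rules | jq '[.data.groups[].rules[] | select(.type=="alerting")]'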
Additional resources
- See also the Alertmanager documentation.
1.2.11. Next steps
- Manage cluster alerts.
- Learn about remote health reporting and, if necessary, opt out of it.
1.3. Managing cluster alerts
OpenShift Container Platform 4.3 provides a web interface to the Alertmanager, which enables you to manage alerts. This section demonstrates how to use the Alerting UI.
1.3.1. Contents of the Alerting UI
This section shows and explains the contents of the Alerting UI, a web interface to the Alertmanager.
The main three pages of the Alerting UI are the Alerts, the Silences, and the YAML pages.
The Alerts page is accessible by clicking Monitoring → Alerting → Alerts in the OpenShift Container Platform web console.
- Filtering alerts by their names.
- Filtering the alerts by their states. To fire, some alerts need a certain condition to be true for the duration of a timeout. If a condition of an alert is currently true, but the timeout has not been reached, such an alert is in the Pending state.
- Alert name.
- Description of an alert.
- Current state of the alert and when the alert went into this state.
- Value of the Severity label of the alert.
- Actions you can do with the alert.
The Silences page is accessible by clicking Monitoring → Alerting → Silences in the OpenShift Container Platform web console.
- Creating a silence for an alert.
- Filtering silences by their name.
- Filtering silences by their states. If a silence is pending, it is currently not active because it is scheduled to start at a later time. If a silence expired, it is no longer active because it has reached its end time.
- Description of a silence. It includes the specification of alerts that it matches.
- Current state of the silence. For active silences, it shows when it ends, and for pending silences, it shows when it starts.
- Number of alerts that are being silenced by the silence.
- Actions you can do with a silence.
The YAML page is accessible by clicking Monitoring → Alerting → YAML in the OpenShift Container Platform web console.
- Upload a file with Alertmanager configuration.
- Examine and edit the current Alertmanager configuration.
- Save the updated Alertmanager configuration.
Also, next to the title of each of these pages is a link to the old Alertmanager interface.
Additional resources
- See Configuring Alertmanager for more information on changing Alertmanager configuration.
1.3.2. Getting information about alerts and alerting rules
You can find an alert and see information about it or its governing alerting rule.
Procedure
- Open the OpenShift Container Platform web console and navigate to the Monitoring → Alerting → Alerts page.
- Optional: Filter the alerts by name using the Filter Alerts by name field.
- Optional: Filter the alerts by state using one or more of the state buttons Firing, Silenced, Pending, Not firing.
- Optional: Sort the alerts by clicking one or more of the Name, State, and Severity column headers.
After you see the alert, you can see either details of the alert or details of its governing alerting rule.
To see alert details, click the name of the alert. The alert details page opens.
The page has the graph with timeseries of the alert. It also has information about the alert, including:
- A link to its governing alerting rule
- Description of the alert
To see alerting rule details, click the button in the last column and select View Alerting Rule. The alerting rule details page opens.
The page has information about the alerting rule, including:
- Alerting rule name, severity, and description
- The expression that defines the condition for firing the alert
- The time for which the condition should be true for an alert to fire
- Graph for each alert governed by the alerting rule, showing the value with which the alert is firing
- Table of all alerts governed by the alerting rule
1.3.3. Silencing alerts
You can either silence a specific alert or silence alerts that match a specification that you define.
Procedure
To silence a set of alerts by creating an alert specification:
- Navigate to the Monitoring → Alerting → Silences page of the OpenShift Container Platform web console.
- Click Create Silence.
- Populate the Create Silence form.
- To create the silence, click Create.
To silence a specific alert:
- Navigate to the Monitoring → Alerting → Alerts page of the OpenShift Container Platform web console.
- For the alert that you want to silence, click the button in the last column and click Silence Alert. The Create Silence form will appear with prepopulated specification of the chosen alert.
- Optional: Modify the silence.
- To create the silence, click Create.
1.3.4. Getting information about silences
You can find a silence and view its details.
Procedure
- Open the OpenShift Container Platform web console and navigate to the Monitoring → Alerting → Silences page.
- Optional: Filter the silences by name using the Filter Silences by name field.
- Optional: Filter the silences by state using one or more of the state buttons Active, Pending, Expired.
- Optional: Sort the silences by clicking one or more of the Name, State, and Firing alerts column headers.
After you see the silence, you can click its name to see the details, including:
- Alert specification
- State
- Start time
- End time
- Number and list of firing alerts
1.3.5. Editing silences
You can edit a silence, which will expire the existing silence and create a new silence with the changed configuration.
Procedure
- Navigate to the Monitoring → Alerting → Silences page.
For the silence you want to modify, click the button in the last column and click Edit silence.
Alternatively, you can click Actions → Edit Silence in the Silence Overview screen for a particular silence.
- In the Edit Silence screen, enter your changes and click the Save button. This will expire the existing silence and create one with the chosen configuration.
1.3.6. Expiring silences
You can expire a silence. Expiring a silence deactivates it forever.
Procedure
- Navigate to the Monitoring → Alerting → Silences page.
For the silence you want to expire, click the button in the last column and click Expire Silence.
Alternatively, you can click the Actions → Expire Silence button in the Silence Overview page for a particular silence.
- Confirm by clicking Expire Silence. This expires the silence.
1.3.7. Next steps
1.4. Examining cluster metrics
OpenShift Container Platform 4.3 provides a web interface to Prometheus, which enables you to run Prometheus Query Language (PromQL) queries and examine the metrics visualized on a plot. This functionality provides an extensive overview of the cluster state and enables you to troubleshoot problems.
1.4.1. Contents of the Metrics UI
This section shows and explains the contents of the Metrics UI, a web interface to Prometheus.
The Metrics page is accessible by clicking Monitoring → Metrics in the OpenShift Container Platform web console.
Actions.
- Add query.
- Expand or collapse all query tables.
- Delete all queries.
- Hide the plot.
- The interactive plot.
- The catalog of available metrics.
- Add query.
- Run queries.
- Query forms.
- Expand or collapse the form.
- The query.
- Clear query.
- Enable or disable query.
Actions for a specific query.
- Enable or disable query.
- Show or hide all series of the query from the plot.
- Delete query.
- The metrics table for a query.
- Color assigned to the graph of the metric. Clicking the square shows or hides the metric’s graph.
Additionally, there is a link to the old Prometheus interface next to the title of the page.
1.4.2. Running metrics queries
You begin working with metrics by entering one or several Prometheus Query Language (PromQL) queries.
Procedure
- Open the OpenShift Container Platform web console and navigate to the Monitoring → Metrics page.
In the query field, enter your PromQL query.
- To show all available metrics and PromQL functions, click Insert Metric at Cursor.
- For multiple queries, click Add Query.
- For deleting queries, click the menu button for the query, then select Delete query.
- For keeping but not running a query, click the Disable query button.
Once you finish creating queries, click the Run Queries button. The metrics from the queries are visualized on the plot. If a query is invalid, the UI shows an error message.
Note: Queries that operate on large amounts of data might time out or overload the browser when drawing time series graphs. To avoid this, hide the graph and calibrate your query using only the metrics table. Then, after finding a feasible query, enable the plot to draw the graphs.
- Optional: The page URL now contains the queries you ran. To use this set of queries again in the future, save this URL.
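For example, a simple query like the following (an illustrative PromQL expression, not part of the original procedure) counts the scrape targets that are currently up, grouped by job:

sum(up) by (job)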
Additional resources
1.4.3. Exploring the visualized metrics
After running the queries, the metrics are displayed on the interactive plot. The X axis of the plot represents time. The Y axis represents the metrics values. Each metric is shown as a colored graph. You can manipulate the plot and explore the metrics.
Procedure
Initially, all metrics from all enabled queries are shown on the plot. You can select which metrics are shown.
- To hide all metrics from a query, click the menu button for the query and click Hide all series.
- To hide a specific metric, go to the query table and click the colored square near the metric name.
To zoom into the plot and change the shown time range, do one of the following:
- Visually select the time range by clicking and dragging on the plot horizontally.
- Use the menu in the upper left corner to select the time range.
To reset the time range, click Reset Zoom.
- To display outputs of all queries at a specific point in time, hold the mouse cursor on the plot at that point. The query outputs will appear in a pop-up box.
- For more detailed information about metrics of a specific query, expand the table of that query using the drop-down button. Every metric is shown with its current value.
- To hide the plot, click Hide Graph.
1.4.4. Non-administrator access to metrics
As a developer, you can enable user workload monitoring for an application or service in a project. As an administrator, you use the same feature to enable monitoring for infrastructure workloads. In that case, a developer or administrator of that project can examine the exposed metrics using the Developer Perspective in the web console.
Examining metrics using the Developer Perspective is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see https://access.redhat.com/support/offerings/techpreview/.
Additional resources
See the documentation on monitoring your own services. It includes details on accessing non-cluster metrics as a developer or a privileged user.
1.4.5. Next steps
1.5. Accessing Prometheus, Alertmanager, and Grafana
To work with data gathered by the monitoring stack, you might want to use the Prometheus, Alertmanager, and Grafana interfaces. They are available by default.
1.5.1. Accessing Prometheus, Alerting UI, and Grafana using the web console
You can access Prometheus, Alerting, and Grafana web UIs using a web browser through the OpenShift Container Platform web console.
The Alerting UI accessed in this procedure is the new interface for Alertmanager.
Prerequisites
- Authentication is performed against the OpenShift Container Platform identity and uses the same credentials or means of authentication as is used elsewhere in OpenShift Container Platform. You must use a role that has read access to all namespaces, such as the cluster-monitoring-view cluster role.
Procedure
- Navigate to the OpenShift Container Platform web console and authenticate.
- To access Prometheus, navigate to the "Monitoring" → "Metrics" page.
- To access the Alerting UI, navigate to the "Monitoring" → "Alerting" page.
- To access Grafana, navigate to the "Monitoring" → "Dashboards" page.
1.5.2. Accessing Prometheus, Alertmanager, and Grafana directly
You can access Prometheus, Alertmanager, and Grafana web UIs using the oc tool and a web browser.
The Alertmanager UI accessed in this procedure is the old interface for Alertmanager.
Prerequisites
- Authentication is performed against the OpenShift Container Platform identity and uses the same credentials or means of authentication as is used elsewhere in OpenShift Container Platform. You must use a role that has read access to all namespaces, such as the cluster-monitoring-view cluster role.
Procedure
- Run:

  $ oc -n openshift-monitoring get routes
  NAME                HOST/PORT                                                         ...
  alertmanager-main   alertmanager-main-openshift-monitoring.apps._url_.openshift.com   ...
  grafana             grafana-openshift-monitoring.apps._url_.openshift.com             ...
  prometheus-k8s      prometheus-k8s-openshift-monitoring.apps._url_.openshift.com      ...

- Prepend https:// to the address; you cannot access the web UIs using an unencrypted connection.

  For example, this is the resulting URL for Alertmanager:

  https://alertmanager-main-openshift-monitoring.apps._url_.openshift.com

- Navigate to the address using a web browser and authenticate.
Additional resources
- For documentation on the new interface for Alertmanager, see Managing cluster alerts.
The monitoring routes are managed by the Cluster Monitoring Operator and cannot be modified by the user.
Chapter 2. Monitoring your own services
You can use OpenShift Monitoring for your own services in addition to monitoring the cluster. This way, you do not need an additional monitoring solution, which helps keep monitoring centralized. Additionally, you can extend access to the metrics of your services beyond cluster administrators, enabling developers and arbitrary users to access these metrics.
Custom Prometheus instances and the Prometheus Operator installed through Operator Lifecycle Manager (OLM) can cause issues with user-defined workload monitoring if it is enabled. Custom Prometheus instances are not supported in OpenShift Container Platform.
Monitoring your own services is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see https://access.redhat.com/support/offerings/techpreview/.
2.1. Enabling monitoring of your own services
You can enable monitoring of your own services by setting the techPreviewUserWorkload/enabled flag in the cluster monitoring ConfigMap.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have installed the OpenShift CLI (oc).
- You have created the cluster-monitoring-config ConfigMap object.
Procedure
- Start editing the cluster-monitoring-config ConfigMap:

  $ oc -n openshift-monitoring edit configmap cluster-monitoring-config

- Set the techPreviewUserWorkload setting to true under data/config.yaml, as in the sketch shown after this procedure.
- Save the file to apply the changes. Monitoring your own services is enabled automatically.
- Optional: You can check that the prometheus-user-workload pods were created:

  $ oc -n openshift-user-workload-monitoring get pod
  NAME                                   READY   STATUS    RESTARTS   AGE
  prometheus-operator-85bbb7b64d-7jwjd   1/1     Running   0          3m24s
  prometheus-user-workload-0             5/5     Running   1          3m13s
  prometheus-user-workload-1             5/5     Running   1          3m13s
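A sketch of the ConfigMap with the techPreviewUserWorkload flag enabled:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    techPreviewUserWorkload:
      enabled: true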
Additional resources
- See Creating a cluster monitoring ConfigMap to learn how to create the cluster-monitoring-config ConfigMap object.
2.2. Deploying a sample service
To test monitoring your own services, you can deploy a sample service.
Procedure
- Create a YAML file for the service configuration. In this example, it is called prometheus-example-app.yaml. Fill the file with the configuration for deploying the service (a sketch is shown after this procedure).

  This configuration deploys a service named prometheus-example-app in the ns1 project. This service exposes the custom version metric.
- Apply the configuration file to the cluster:

  $ oc apply -f prometheus-example-app.yaml

  It will take some time to deploy the service.
- You can check that the service is running:

  $ oc -n ns1 get pod
  NAME                                      READY   STATUS    RESTARTS   AGE
  prometheus-example-app-7857545cb7-sbgwq   1/1     Running   0          81m
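A sketch of such a prometheus-example-app.yaml, assuming the ns1 namespace already exists and using the upstream prometheus-example-app image; the image reference and tag are assumptions that you might need to adjust for your environment:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: prometheus-example-app
  name: prometheus-example-app
  namespace: ns1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-example-app
  template:
    metadata:
      labels:
        app: prometheus-example-app
    spec:
      containers:
      - name: prometheus-example-app
        image: quay.io/brancz/prometheus-example-app:v0.2.0
        imagePullPolicy: IfNotPresent
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: prometheus-example-app
  name: prometheus-example-app
  namespace: ns1
spec:
  ports:
  - name: web
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: prometheus-example-app
  type: ClusterIP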
2.3. Creating a role for setting up metrics collection
This procedure shows how to create a role that allows a user to set up metrics collection for a service as described in "Setting up metrics collection".
Procedure
- Create a YAML file for the new role. In this example, it is called custom-metrics-role.yaml. Fill the file with the configuration for the monitor-crd-edit role (a sketch is shown after this procedure).

  This role enables a user to set up metrics collection for services.
- Apply the configuration file to the cluster:

  $ oc apply -f custom-metrics-role.yaml

  Now the role is created.
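A sketch of custom-metrics-role.yaml; the role name matches the monitor-crd-edit name used in this chapter, and the rule grants edit access to the Prometheus Operator monitoring resources:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: monitor-crd-edit
rules:
- apiGroups: ["monitoring.coreos.com"]
  resources: ["prometheusrules", "servicemonitors", "podmonitors"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]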
2.4. Granting the role to a user
This procedure shows how to assign the monitor-crd-edit role to a user.
Prerequisites
- You need to have a user created.
- You need to have the monitor-crd-edit role described in "Creating a role for setting up metrics collection" created.
Procedure
- In the Web console, navigate to User Management → Role Bindings → Create Binding.
- In Binding Type, select the "Namespace Role Binding" type.
- In Name, enter a name for the binding.
- In Namespace, select the namespace where you want to grant the access.
- In Role Name, enter monitor-crd-edit.
- In Subject, select User.
- In Subject Name, enter the name of the user, for example johnsmith.
- Confirm the role binding. The user is now assigned the monitor-crd-edit role, which allows them to set up metrics collection for a service in the namespace.
2.5. Setting up metrics collection
To use the metrics exposed by your service, you need to configure OpenShift Monitoring to scrape metrics from the /metrics endpoint. You can do this using a ServiceMonitor, a custom resource definition (CRD) that specifies how a service should be monitored, or a PodMonitor, a CRD that specifies how a pod should be monitored. The former requires a Service object, while the latter does not, allowing Prometheus to directly scrape metrics from the metrics endpoint exposed by a pod.
This procedure shows how to create a ServiceMonitor for the service.
Prerequisites
- Log in as a cluster administrator or a user with the monitor-crd-edit role.
Procedure
- Create a YAML file for the ServiceMonitor configuration. In this example, the file is called example-app-service-monitor.yaml. Fill the file with the configuration for creating the ServiceMonitor (a sketch is shown after this procedure).

  This configuration makes OpenShift Monitoring scrape the metrics exposed by the sample service deployed in "Deploying a sample service", which includes the single version metric.
- Apply the configuration file to the cluster:

  $ oc apply -f example-app-service-monitor.yaml

  It will take some time to deploy the ServiceMonitor.
- You can check that the ServiceMonitor is running:

  $ oc -n ns1 get servicemonitor
  NAME                         AGE
  prometheus-example-monitor   81m
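A sketch of example-app-service-monitor.yaml, selecting the sample service by its app: prometheus-example-app label; the monitor name matches the prometheus-example-monitor name shown in the output above, and the port name (web) is an assumption that must match the port name defined in the Service object:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: prometheus-example-monitor
  name: prometheus-example-monitor
  namespace: ns1
spec:
  endpoints:
  - interval: 30s
    port: web
    scheme: http
  selector:
    matchLabels:
      app: prometheus-example-app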
Additional resources
See the Prometheus Operator API documentation for more information on ServiceMonitors and PodMonitors.
2.6. Creating alerting rules
You can create alerting rules, which will fire alerts based on values of chosen metrics of the service.
In the current version of the Technology Preview, only administrators can access alerting rules using the Prometheus UI and the Web Console.
Procedure
- Create a YAML file for alerting rules. In this example, it is called example-app-alerting-rule.yaml. Fill the file with the configuration for the alerting rules (a sketch is shown after this procedure).

  Note: The expression can only reference metrics exposed by your own services. Currently it is not possible to correlate existing cluster metrics.

  This configuration creates an alerting rule named example-alert, which fires an alert when the version metric exposed by the sample service becomes 0.
- Apply the configuration file to the cluster:

  $ oc apply -f example-app-alerting-rule.yaml

  It will take some time to create the alerting rules.
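A sketch of example-app-alerting-rule.yaml as a PrometheusRule object; the group name, the alert identifier VersionAlert, and the job label value are assumptions, while the example-alert object name and the version == 0 condition follow the description above:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-alert
  namespace: ns1
spec:
  groups:
  - name: example
    rules:
    - alert: VersionAlert
      expr: version{job="prometheus-example-app"} == 0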
2.7. Giving view access to a user
By default, only cluster administrator users and developers have access to metrics from your services. This procedure shows how to grant an arbitrary user metrics access to a particular project.
Prerequisites
- You need to have a user created.
- You need to log in as a cluster administrator.
Procedure
- Run this command to give <user> access to all metrics of your services in <namespace>:

  $ oc policy add-role-to-user view <user> -n <namespace>

  For example, to give view access to the ns1 namespace to user bobwilliams, run:

  $ oc policy add-role-to-user view bobwilliams -n ns1

- Alternatively, in the Web console, switch to the Developer Perspective, and click Advanced → Project Access. From there, you can select the correct namespace and assign the view role to a user.
2.8. Accessing the metrics of your service
Once you have enabled monitoring your own services, deployed a service, and set up metrics collection for it, you can access the metrics of the service as a cluster administrator, as a developer, or as a user with view permissions for the project.
The Grafana instance shipped within OpenShift Container Platform Monitoring is read-only and displays only infrastructure-related dashboards.
Prerequisites
- You need to deploy the service that you want to monitor.
- You need to enable monitoring of your own services.
- You need to have metrics scraping set up for the service.
- You need to log in as a cluster administrator, a developer, or as a user with view permissions for the project.
Procedure
Access the Prometheus web interface:
To access the metrics as a cluster administrator, go to the OpenShift Container Platform web console, switch to the Administrator Perspective, and click Monitoring → Metrics.
Note: Cluster administrators, when using the Administrator Perspective, have access to all cluster metrics and to custom service metrics from all projects.
Note: Only cluster administrators have access to the Alertmanager and Prometheus UIs.
To access the metrics as a developer or a user with permissions, go to the OpenShift Container Platform web console, switch to the Developer Perspective, then click Advanced → Metrics. Select the project you want to see the metrics for.
Note: Developers can only use the Developer Perspective. They can only query metrics from a single project.
- Use the PromQL interface to run queries for your services.
Additional resources
Chapter 3. Exposing custom application metrics for autoscaling
You can export custom application metrics for the horizontal pod autoscaler.
Prometheus Adapter is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see https://access.redhat.com/support/offerings/techpreview/.
3.1. Exposing custom application metrics for horizontal pod autoscaling
You can use the prometheus-adapter resource to expose custom application metrics for the horizontal pod autoscaler.
Prerequisites
- Make sure you have a custom Prometheus instance installed. In this example, it is presumed that Prometheus was installed in the default namespace.
- Make sure you configured monitoring for your application. In this example, it is presumed that the application and the service monitor for it were installed in the default namespace.
Procedure
- Create a YAML file for your configuration. In this example, it is called deploy.yaml.
- Add configuration for creating the service account, necessary roles, and role bindings for prometheus-adapter.
- Add configuration for the custom metrics for prometheus-adapter (a sketch of this part is shown after this procedure).
- Add configuration for registering prometheus-adapter as an API service.
- Show the Prometheus Adapter image to use:

  $ kubectl get -n openshift-monitoring deploy/prometheus-adapter -o jsonpath="{..image}"
  quay.io/openshift-release-dev/ocp-v4.3-art-dev@sha256:76db3c86554ad7f581ba33844d6a6ebc891236f7db64f2d290c3135ba81c264c

- Add configuration for deploying prometheus-adapter. In the Deployment manifest, image: openshift-release-dev/ocp-v4.3-art-dev specifies the Prometheus Adapter image found in the previous step.
- Apply the configuration file to the cluster:

  $ oc apply -f deploy.yaml

- Now the application's metrics are exposed and can be used to configure horizontal pod autoscaling.
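The full deploy.yaml is not reproduced here. As an illustration of the custom metrics configuration step, the following sketch shows a prometheus-adapter rules ConfigMap that turns a counter such as http_requests_total into a per-second custom metric; the ConfigMap name, namespace, and metric name are assumptions for this example:

apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: default
data:
  config.yaml: |
    rules:
    - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        matches: "^(.*)_total$"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'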
Additional resources
Legal Notice
Copyright © 2025 Red Hat
OpenShift documentation is licensed under the Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0).
Modified versions must remove all Red Hat trademarks.
Portions adapted from https://github.com/kubernetes-incubator/service-catalog/ with modifications by Red Hat.
Red Hat, Red Hat Enterprise Linux, the Red Hat logo, the Shadowman logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of Joyent. Red Hat Software Collections is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation’s permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.