Chapter 5. Customizing Red Hat OpenStack on OpenShift Observability
Use observability with Red Hat OpenStack Services on OpenShift (RHOSO) to get insight into the metrics, logs, and alerts from your deployment.
The observability architecture in RHOSO is composed of services within OpenShift, as well as services on your Compute nodes that expose metrics, logs, and alerts. You can use the OpenShift observability ecosystem for insight into the RHOSO environment. Additionally, you have access to the logging infrastructure for collecting, storing, and searching through logs. RHOSO services such as ceilometer and sg-core make metrics from your Compute nodes and associated virtual infrastructure available to the OpenShift Observability framework.
5.1. Configuring Red Hat OpenStack on OpenShift Observability
The Telemetry service (ceilometer, prometheus) is enabled by default in a Red Hat OpenStack Services on OpenShift (RHOSO) deployment. You can configure observability by editing the openstack_control_plane.yaml CR file.
Prerequisites
- The Cluster Observability Operator is installed from OperatorHub. For more information, see Installing the Cluster Observability Operator.
- Optional: If you plan to enable logging, the Cluster Logging Operator is installed from OperatorHub.
- A LokiStack instance must be running. For more information, see Configuring the LokiStack log store.
- A ClusterLogging instance must be running. For more information, see Configuring the logging collector.
- The syslog receiver must be enabled. For more information, see Forwarding logs using the syslog protocol.
You do not need these Operators to expose and query OpenStack metrics in Prometheus format. If you do not disable ceilometer, then a Prometheus metrics exporter is created and exposed from inside the cluster at the following URL: http://ceilometer-internal.openstack.svc:3000/metrics
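As a rough illustration of consuming that endpoint, the following Python sketch fetches the metrics text and parses it into tuples. The URL is the one given above and resolves only from inside the cluster. The parser assumes the Prometheus text exposition format with no trailing timestamps, and the sample metric names are made up for illustration; they are not real exporter output.

```python
import re
import urllib.request

# URL taken from the text above; it resolves only from inside the cluster.
METRICS_URL = "http://ceilometer-internal.openstack.svc:3000/metrics"

# Matches label pairs such as resource="vm-1" inside the braces.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def fetch_metrics(url=METRICS_URL, timeout=10):
    """Return the raw metrics text served at the endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metrics(text):
    """Parse exposition-format lines into (name, labels, value) tuples.

    Assumption: samples carry no trailing timestamp.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, raw_value = line.rsplit(" ", 1)
        name, _, label_part = series.partition("{")
        labels = dict(LABEL_RE.findall(label_part))
        samples.append((name.strip(), labels, float(raw_value)))
    return samples


# Hypothetical sample for illustration only; not real exporter output.
SAMPLE = """\
# HELP example_metric A hypothetical metric.
# TYPE example_metric gauge
example_metric{resource="vm-1",project="demo"} 12.5
example_total 3
"""

for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

From a pod inside the cluster, `parse_metrics(fetch_metrics())` would apply the same parsing to the live endpoint.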
Procedure
- Use a text editor of your choice to open the openstack_control_plane.yaml file. Create the telemetry section based on the needs of your environment:

    telemetry:
      enabled: true
      template:
        metricStorage:
          enabled: true
          monitoringStack:
            dashboardsEnabled: true
            alertingEnabled: true
            scrapeInterval: 30s  # 1
            storage:
              strategy: persistent
              retention: 24h  # 2
              persistent:
                pvcStorageRequest: 20G  # 3
        autoscaling:  # 4
          enabled: false
          aodh:
            databaseUser: aodh
            databaseInstance: openstack
            secret: osp-secret
          heatInstance: heat
        ceilometer:
          enabled: true
          secret: osp-secret
        logging:
          enabled: false
          ipaddr: <ip_address>  # 5

  - 1: Use the scrapeInterval field to control the amount of time that passes before new metrics are gathered. Changing this parameter can affect performance.
  - 2: Use the retention field to adjust the length of time telemetry metrics are stored. This field affects the amount of storage required.
  - 3: You can change the amount of storage to be allocated for the Prometheus time series database.
  - 4: You must have the autoscaling field present, even if you keep it disabled. For information on enabling and configuring autoscaling, see <link>.
  - 5: Replace <ip_address> with the IP address on the internal network you would like to configure.
- Update the control plane with the Telemetry configurations that you set in openstack_control_plane.yaml:

    $ oc apply -f openstack_control_plane.yaml -n openstack

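To make the interplay between scrapeInterval and retention concrete, the following Python sketch computes how many samples a single time series accumulates over the retention window. It is only a back-of-the-envelope aid: the helper names are hypothetical, it accepts only single-unit durations such as 30s or 24h, and it does not attempt to predict bytes on disk, which depend on the number of series and on compression.

```python
import re

# Seconds per supported unit suffix.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def duration_seconds(value):
    """Convert a single-unit duration such as '30s' or '24h' to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def samples_per_series(scrape_interval, retention):
    """Samples one time series accumulates over the retention window."""
    return duration_seconds(retention) // duration_seconds(scrape_interval)


# Values from the telemetry section shown in the procedure.
print(samples_per_series("30s", "24h"))  # 2880
# Halving the scrape interval doubles the samples kept per series.
print(samples_per_series("15s", "24h"))  # 5760
```

Because the sample count scales with retention divided by scrapeInterval, shortening the interval or lengthening retention increases the storage that pvcStorageRequest has to cover.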
Verification
- Access the remote shell for the OpenStackClient pod from your workstation:

    $ oc rsh -n openstack openstackclient

- Confirm that you can query Prometheus and that the scrape endpoints are active with the following command:

    $ openstack metric query up --disable-rbac -c container -c instance -c value

Example output:
    +-----------------+------------------------+-------+
    | container       | instance               | value |
    +-----------------+------------------------+-------+
    | alertmanager    | 10.217.1.112:9093      | 1     |
    | prometheus      | 10.217.1.63:9090       | 0     |
    | proxy-httpd     | 10.217.1.52:3000       | 1     |
    |                 | 192.168.122.100:9100   | 1     |
    |                 | 192.168.122.101:9100   | 1     |
    +-----------------+------------------------+-------+

Note: Each entry in the value field should be "1", except for the prometheus container. The prometheus container reports a value of "0" due to TLS, which is enabled by default.
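The check described in the note can be automated. The following Python sketch parses the table printed by the query command and lists any scrape target that is not up, ignoring the prometheus container. The function names are hypothetical, and the parser assumes the ASCII table layout shown in the example output; the sample rows are copied from that example.

```python
def parse_table(text):
    """Parse an ASCII table into a list of dicts keyed by the header row."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +-----+ border lines and blanks
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def unexpected_down(rows, expected_down=("prometheus",)):
    """Return instances whose value is not 1, ignoring expected containers."""
    return [
        row["instance"]
        for row in rows
        if row["value"] != "1" and row["container"] not in expected_down
    ]


# Rows copied from the example output of the query command.
EXAMPLE = """\
+-----------------+------------------------+-------+
| container       | instance               | value |
+-----------------+------------------------+-------+
| alertmanager    | 10.217.1.112:9093      | 1     |
| prometheus      | 10.217.1.63:9090       | 0     |
| proxy-httpd     | 10.217.1.52:3000       | 1     |
|                 | 192.168.122.100:9100   | 1     |
|                 | 192.168.122.101:9100   | 1     |
+-----------------+------------------------+-------+
"""

print(unexpected_down(parse_table(EXAMPLE)))  # []
```

An empty list means every target other than prometheus reported a value of 1.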
Additional resource