Chapter 1. Configuring observability
Use observability to gain insights into the metrics, logs, and alerts from your Red Hat OpenStack Services on OpenShift (RHOSO) deployment. You can configure observability by editing the default Telemetry service (ceilometer, prometheus) in your OpenStackControlPlane custom resource (CR) file.
1.1. RHOSO observability architecture
The observability architecture in Red Hat OpenStack Services on OpenShift (RHOSO) is composed of services within Red Hat OpenShift Container Platform (RHOCP), as well as services on your Compute nodes that provide metrics, logs, and alerts. You can use Red Hat OpenShift Observability for insight into your RHOSO environment and for collecting, storing, and searching through logs.
The observability platform available with RHOSO does not guarantee the delivery of metrics. Metrics are exposed for scraping but they are not cached. If data is dropped, there is no way to retrospectively fill the gaps, which might result in incomplete metrics.
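The pull model behind this limitation can be illustrated with a short Python sketch. This is purely illustrative code, not RHOSO or Prometheus internals: the point is that an exporter only exposes its current reading, so a missed scrape leaves a gap that cannot be backfilled later.

```python
# Illustrative sketch of pull-based scraping (not actual Prometheus code).
# The endpoint holds only the current value; nothing is cached between
# scrapes, so a failed scrape produces a permanent gap in the series.

current_value = 0  # the single "current" reading the endpoint exposes

def scrape(endpoint_up):
    """Return the current reading, or None if the scrape failed."""
    return current_value if endpoint_up else None

series = []
for tick, up in enumerate([True, True, False, True]):
    current_value = tick * 10      # the reading changes over time
    series.append(scrape(up))

print(series)  # the sample at tick 2 is lost for good
```

The reading that existed at tick 2 is simply gone; when the endpoint comes back, scraping resumes from the new current value.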
1.2. Configuring observability on the control plane
The Telemetry service (ceilometer, prometheus) is enabled by default in a Red Hat OpenStack Services on OpenShift (RHOSO) deployment. You can configure observability by editing the Telemetry service in your OpenStackControlPlane custom resource (CR) file.
Prerequisites
- The control plane includes initial configuration of the Telemetry service. For more information, see the telemetry configuration in Creating the control plane in Deploying Red Hat OpenStack Services on OpenShift.
Procedure
- Open your OpenStackControlPlane CR file, openstack_control_plane.yaml, on your workstation. Configure the Telemetry service, telemetry, as required for your environment:

    telemetry:
      enabled: true
      template:
        metricStorage:
          enabled: true
          dashboardsEnabled: true
          dataplaneNetwork: ctlplane
          networkAttachments:
            - ctlplane
          monitoringStack:
            alertingEnabled: true
            scrapeInterval: 30s
            storage:
              strategy: persistent
              retention: 24h
              persistent:
                pvcStorageRequest: 20G
        autoscaling:
          enabled: false
          aodh:
            notificationsBus:
              cluster: rabbitmq-notification
            databaseAccount: aodh
            databaseInstance: openstack
            secret: osp-secret
          heatInstance: heat
        ceilometer:
          enabled: true
          notificationsBus:
            cluster: rabbitmq-notification
          secret: osp-secret
        logging:
          enabled: false
          annotations:
            metallb.universe.tf/address-pool: internalapi
            metallb.universe.tf/allow-shared-ip: internalapi
            metallb.universe.tf/loadBalancerIPs: 172.17.0.80
        cloudkitty:
          enabled: false
          messagingBus:
            cluster: rabbitmq
          s3StorageConfig:
            schemas:
              - effectiveDate: "2024-11-18"
                version: v13
            secret:
              name: cloudkitty-loki-s3
              type: s3
  - metricStorage.monitoringStack.scrapeInterval: Specifies the interval at which new metrics are gathered. Changing this interval can affect performance.
  - metricStorage.monitoringStack.storage.retention: Specifies the length of time that telemetry metrics are stored. The duration affects the amount of storage required.
  - storage.persistent.pvcStorageRequest: Specifies the amount of storage to allocate to the Prometheus time series database.
  - autoscaling.enabled: Set to true to enable autoscaling. The autoscaling field must be present even when autoscaling is disabled. For more information about autoscaling, see Autoscaling for Instances.
  - ceilometer.enabled: Set to false to disable the ceilometer service. If you do not disable ceilometer, a Prometheus metrics exporter is created and exposed from inside the cluster at the following URL: http://ceilometer-internal.openstack.svc:3000/metrics
  - logging.enabled: Set to true to enable observability logging. For more information about configuring observability logging, see Enabling RHOSO observability logging.
  - cloudkitty.enabled: Set to true to enable the Rating service (cloudkitty). For more information about configuring chargeback and rating capabilities, see Enabling the Rating service on the control plane.
  - aodh.notificationsBus.cluster and ceilometer.notificationsBus.cluster: Set to a dedicated RabbitMQ cluster for notifications, as in this example, or to a combined RabbitMQ cluster for both RPC and notifications, as in the default RHOSO deployment that uses the combined rabbitmq cluster.
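To see how scrapeInterval and retention relate to the pvcStorageRequest sizing, here is a back-of-the-envelope estimate. The series count and bytes-per-sample figure are assumptions for illustration, not RHOSO-specific values (Prometheus documentation cites roughly 1-2 bytes per sample after compression):

```python
# Rough storage estimate for the example configuration above:
# 30s scrape interval, 24h retention.
scrape_interval_s = 30
retention_s = 24 * 3600
num_series = 100_000     # hypothetical number of active time series
bytes_per_sample = 2     # assumed average after compression (an estimate)

samples_per_series = retention_s // scrape_interval_s
total_bytes = samples_per_series * num_series * bytes_per_sample

print(samples_per_series)    # 2880 samples per series over 24h
print(total_bytes / 1e9)     # ~0.58 GB under these assumptions
```

Even with generous assumptions the sample data fits well inside the 20G request; the headroom covers indexes, write-ahead logs, and growth in series count.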
- Update the control plane:

    $ oc apply -f openstack_control_plane.yaml -n openstack

- Wait until RHOCP creates the resources related to the OpenStackControlPlane CR. Run the following command to check the status:

    $ oc get openstackcontrolplane -n openstack
    NAME                      STATUS    MESSAGE
    openstack-control-plane   Unknown   Setup started

  The OpenStackControlPlane resources are created when the status is "Setup complete".
  Tip: Append the -w option to the end of the get command to track deployment progress.

- Optional: Confirm that the control plane is deployed by reviewing the pods in the openstack namespace for each of your cells:

    $ oc get pods -n openstack

  The control plane is deployed when all the pods are either completed or running.
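As a quick way to spot pods that are still converging, you can filter the STATUS column of the pod listing. The snippet below runs against canned sample output (the pod names and statuses are hypothetical), but the same awk filter works on real output from oc get pods -n openstack --no-headers:

```shell
# Count pods whose status is neither Running nor Completed.
# Sample input stands in for `oc get pods -n openstack --no-headers`.
pods="pod-a   1/1   Running            0   5m
pod-b   0/1   Completed          0   5m
pod-c   0/1   CrashLoopBackOff   3   5m"

not_ready=$(printf '%s\n' "$pods" \
  | awk '$3 != "Running" && $3 != "Completed"' | wc -l)
echo "pods still converging: $not_ready"
```

A result of zero means every pod has reached a terminal or running state; a nonzero count tells you which step of the rollout to inspect next.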
Verification

- Access the remote shell for the OpenStackClient pod from your workstation:

    $ oc rsh -n openstack openstackclient

- Confirm that you can query prometheus and that the scrape endpoints are active:

    $ openstack metric query up --disable-rbac -c container -c instance -c value

  Example output:

    +-----------------+------------------------+-------+
    | container       | instance               | value |
    +-----------------+------------------------+-------+
    | alertmanager    | 10.217.1.112:9093      | 1     |
    | prometheus      | 10.217.1.63:9090       | 0     |
    | proxy-httpd     | 10.217.1.52:3000       | 1     |
    |                 | 192.168.122.100:9100   | 1     |
    |                 | 192.168.122.101:9100   | 1     |
    +-----------------+------------------------+-------+

  Note: Each entry in the value field should be "1" when there are active workloads scheduled on the cluster, except for the prometheus container. The prometheus container reports a value of "0" due to TLS, which is enabled by default.

- You can find the openstack-telemetry-operator dashboards by clicking Observe and then Dashboards in the RHOCP console. For more information about RHOCP dashboards, see Reviewing monitoring dashboards as a cluster administrator in the RHOCP Monitoring Guide.
1.3. Customizing Telemetry configuration files
You can customize Telemetry services by creating custom configuration files for your deployment. For instance, you can customize the Ceilometer service by modifying the polling.yaml file.
Prerequisites
- The pod for the service that you want to customize exists in your deployment, and you are in the openstack namespace:

    $ oc get pods -A -o custom-columns="NAMESPACE:.metadata.namespace,POD:.metadata.name,CONTAINERS:.spec.containers[*].name" | grep <container>
Procedure
- Create the custom configuration file for the service that you want to customize. If the service already has a configuration file, you must give your custom file the same name so that your custom file replaces the existing configuration file. For example, to customize the configuration of the Ceilometer service, you can make a copy of polling.yaml for editing:

    $ oc rsh -n openstack -c ceilometer-central-agent ceilometer-0 cat /etc/ceilometer/polling.yaml > polling.yaml

- Add or update the configuration for the service as required. For example, to customize how often samples of volume and image size are polled, create a file named polling.yaml and add the following configuration:

    sources:
      - name: pollsters
        interval: 300
        meters:
          - volume.size
          - image.size

  For more information about how to configure the Ceilometer service, see Polling properties.
- Create the Secret CR for the service with the custom configuration file:

    $ oc create secret generic <secret_name> \
      --from-file <custom_config.yaml> -n openstack

  Note: You can specify the --from-file option as many times as required to pass more than one configuration file to the Secret CR for customizing the service.

- Verify that the Secret CR is created:

    $ oc describe secret <secret_name>
- Open the OpenStackControlPlane CR file on your workstation, for example, openstack_control_plane.yaml.
- Locate the service definition for telemetry and add the customConfigsSecretName field to the Telemetry service that you want to customize. The following example shows where to place the customConfigsSecretName field for each service that you can customize:

    spec:
      telemetry:
        template:
          autoscaling:
            aodh:
              customConfigsSecretName: <secret_name>
          ceilometer:
            customConfigsSecretName: <secret_name>
          cloudkitty:
            cloudKittyAPI:
              customConfigsSecretName: <secret_name>
            cloudKittyProc:
              customConfigsSecretName: <secret_name>

- Update the control plane:

    $ oc apply -f openstack_control_plane.yaml -n openstack

  Your custom file is copied from the Secret into the /etc/<service>/ folder. If a file with the same name already exists in the folder, it is replaced with the custom configuration file.

- Wait until RHOCP creates the resources related to the OpenStackControlPlane CR. Run the following command to check the status:

    $ oc get openstackcontrolplane -n openstack

- Verify that the custom configuration file is being used by the service:

    $ oc rsh -c <service_container> <service_pod> \
      cat /etc/<service>/<custom_config>.yaml

  Example:

    $ oc rsh -c ceilometer-central-agent ceilometer-0 cat /etc/ceilometer/polling.yaml
1.3.1. Polling properties
You can configure polling rules to poll for data not provided by service events and notifications, such as instance resource usage. Use the polling.yaml file to specify the polling plugins (pollsters) to enable, the interval at which they should be polled, and the meters to poll for each source.
The following template describes how to define a source to poll:
sources:
  - name: <source_name>
    interval: <sample_generation_interval>
    meters:
      - <meter_filter>
      - <meter_filter>
      ...
      - <meter_filter>
- interval: The interval in seconds between sample generation of the specified metrics.
- meters: A list of resources to gather samples from. Each filter must match the meter name of the polling plugin.
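As a sketch of how a source's meter filters select meters, the snippet below matches a filter list against a set of available meter names. It assumes shell-style wildcard matching; verify the exact filter semantics against the Ceilometer documentation for your version, and note that the meter names beyond volume.size and image.size are illustrative:

```python
from fnmatch import fnmatch

# Sketch: match a polling source's meter filters against available meters.
# Wildcard semantics assumed to be shell-style (fnmatch); confirm against
# your Ceilometer version's documentation.
available = ["volume.size", "image.size", "cpu", "memory.usage"]
filters = ["volume.size", "image.*"]

selected = [m for m in available if any(fnmatch(m, f) for f in filters)]
print(selected)  # ['volume.size', 'image.size']
```

A filter is either an exact meter name, as in the polling.yaml example earlier in this section, or a pattern that selects a family of related meters.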
1.4. Enabling Telemetry power monitoring on the data plane
This feature is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information about Technology Preview features, see Scope of Coverage Details.
You can enable power monitoring on the data plane to collect power consumption metrics by adding the telemetry-power-monitoring service to each OpenStackDataPlaneNodeSet custom resource (CR) defined for the data plane.
Procedure
- Open the OpenStackDataPlaneNodeSet CR definition file for the node set that you want to update, for example, openstack_data_plane.yaml.
- Add the services field and include all the required services, including the default services. Add telemetry-power-monitoring after telemetry:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    metadata:
      name: openstack-data-plane
      namespace: openstack
    spec:
      tlsEnabled: true
      env:
        - name: ANSIBLE_FORCE_COLOR
          value: "True"
      services:
        - redhat
        - bootstrap
        - download-cache
        - configure-network
        - validate-network
        - install-os
        - configure-os
        - ssh-known-hosts
        - run-os
        - reboot-os
        - install-certs
        - ovn
        - neutron-metadata
        - libvirt
        - nova
        - telemetry
        - telemetry-power-monitoring

  For more information about deploying data plane services, see Deploying the data plane in the Deploying Red Hat OpenStack Services on OpenShift guide.
- Save the OpenStackDataPlaneNodeSet CR definition file.
- Apply the updated OpenStackDataPlaneNodeSet CR configuration:

    $ oc apply -f openstack_data_plane.yaml

- Verify that the data plane resource has been updated by confirming that the status is SetupReady:

    $ oc wait openstackdataplanenodeset openstack-data-plane --for condition=SetupReady --timeout=10m

  When the status is SetupReady, the command returns a "condition met" message; otherwise, it returns a timeout error.

  For information about the data plane conditions and states, see Data plane conditions and states in Deploying Red Hat OpenStack Services on OpenShift.
- Create a file on your workstation to define the OpenStackDataPlaneDeployment CR:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: <node_set_deployment_name>

  Replace <node_set_deployment_name> with the name of the OpenStackDataPlaneDeployment CR. The name must be unique, must consist of lower case alphanumeric characters, - (hyphen) or . (period), and must start and end with an alphanumeric character.

  Tip: Give the definition file and the OpenStackDataPlaneDeployment CR unique and descriptive names that indicate the purpose of the modified node set.
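The naming rule quoted above can be checked before you apply the CR. The regular expression below is my rendering of the stated rule (lower case alphanumerics, hyphen or period, alphanumeric at both ends), not a pattern taken from the operator's code:

```python
import re

# Illustrative check of the CR naming rule stated above: lower case
# alphanumerics, '-' or '.', starting and ending with an alphanumeric.
NAME_RE = re.compile(r"^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$")

for name in ["power-monitoring-deploy", "Deploy-1", "-bad", "ok"]:
    print(name, "valid" if NAME_RE.match(name) else "invalid")
```

Names such as Deploy-1 (upper case) or -bad (leading hyphen) are rejected by the API server, so validating up front saves a failed oc create round trip.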
- Add the OpenStackDataPlaneNodeSet CR that you modified:

    spec:
      nodeSets:
        - <nodeSet_name>
- Save the OpenStackDataPlaneDeployment CR deployment file.
- Deploy the modified OpenStackDataPlaneNodeSet CR:

    $ oc create -f openstack_data_plane_deploy.yaml -n openstack

  You can view the Ansible logs while the deployment executes:

    $ oc get pod -l app=openstackansibleee -w
    $ oc logs -l app=openstackansibleee -f --max-log-requests 10

  If the oc logs command returns an error similar to the following error, increase the --max-log-requests value:

    error: you are attempting to follow 19 log streams, but maximum allowed concurrency is 10, use --max-log-requests to increase the limit

- Verify that the modified OpenStackDataPlaneNodeSet CR is deployed:

    $ oc get openstackdataplanedeployment -n openstack
    NAME                   STATUS   MESSAGE
    openstack-data-plane   True     Setup Complete

    $ oc get openstackdataplanenodeset -n openstack
    NAME                   STATUS   MESSAGE
    openstack-data-plane   True     NodeSet Ready

  For information about the meaning of the returned status, see Data plane conditions and states in the Deploying Red Hat OpenStack Services on OpenShift guide.
If the status indicates that the data plane has not been deployed, then troubleshoot the deployment. For information, see Troubleshooting the data plane creation and deployment in the Deploying Red Hat OpenStack Services on OpenShift guide.
- Verify that the telemetry-power-monitoring service is deployed by checking for ceilometer_agent_ipmi and kepler containers on the data plane nodes:

    $ podman ps | grep -i -e ceilometer_agent_ipmi -e kepler
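To check several nodes quickly, you can wrap the same grep in a loop over the expected container names. The snippet below runs against canned sample output (the container IDs are hypothetical); on a real data plane node, substitute the output of the podman ps command shown above:

```shell
# Verify both expected containers appear in `podman ps` output.
# Sample input stands in for real `podman ps` output on a data plane node.
ps_out="abc123  ceilometer_agent_ipmi  Up 2 hours
def456  kepler                 Up 2 hours"

missing=0
for c in ceilometer_agent_ipmi kepler; do
  printf '%s\n' "$ps_out" | grep -q "$c" || { echo "$c missing"; missing=1; }
done
echo "missing=$missing"
```

A missing ceilometer_agent_ipmi container usually points at the telemetry service, while a missing kepler container points at the telemetry-power-monitoring service itself.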