Chapter 3. Installing the core components of Service Telemetry Framework
You can use Operators to load the Service Telemetry Framework (STF) components and objects. Operators manage each of the following STF core components:
- Certificate Management
- AMQ Interconnect
- Smart Gateways
- Prometheus and Alertmanager
Service Telemetry Framework (STF) uses other supporting Operators as part of the deployment. STF can resolve most dependencies automatically, but you need to pre-install some Operators, such as Cluster Observability Operator, which provides an instance of Prometheus and Alertmanager, and cert-manager for Red Hat OpenShift, which provides management of certificates.
Prerequisites
- A Red Hat OpenShift Container Platform Extended Update Support (EUS) release, version 4.12 or 4.14, is running.
- You have prepared your Red Hat OpenShift Container Platform environment and ensured that there is persistent storage and enough resources to run the STF components on top of the Red Hat OpenShift Container Platform environment. For more information about STF performance, see the Red Hat Knowledge Base article Service Telemetry Framework Performance and Scaling.
- Your environment is fully connected or Red Hat OpenShift Container Platform-disconnected. STF is unavailable in network proxy environments.
STF is compatible with Red Hat OpenShift Container Platform versions 4.12 and 4.14.
Additional resources
- For more information about Operators, see the Understanding Operators guide.
- For more information about Operator catalogs, see Red Hat-provided Operator catalogs.
- For more information about cert-manager for Red Hat OpenShift, see cert-manager Operator for Red Hat OpenShift overview.
- For more information about Cluster Observability Operator, see Cluster Observability Operator Overview.
- For more information about OpenShift life cycle policy and Extended Update Support (EUS), see Red Hat OpenShift Container Platform Life Cycle Policy.
3.1. Deploying Service Telemetry Framework to the Red Hat OpenShift Container Platform environment
Deploy Service Telemetry Framework (STF) to collect and store Red Hat OpenStack Platform (RHOSP) telemetry.
3.1.1. Deploying Cluster Observability Operator
You must install the Cluster Observability Operator (COO) before you create an instance of Service Telemetry Framework (STF) if `observabilityStrategy` is set to `use_redhat` and `backends.metrics.prometheus.enabled` is set to `true` in the `ServiceTelemetry` object. For more information about COO, see Cluster Observability Operator overview in the OpenShift Container Platform documentation.
Procedure
- Log in to your Red Hat OpenShift Container Platform environment where STF is hosted.
- To store metrics in Prometheus, enable the Cluster Observability Operator by using the `redhat-operators` CatalogSource:

```bash
$ oc create -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: cluster-observability-operator
  namespace: openshift-operators
spec:
  channel: development
  installPlanApproval: Automatic
  name: cluster-observability-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF
```
- Verify that the `ClusterServiceVersion` for Cluster Observability Operator has a status of `Succeeded`:

```bash
$ oc wait --for jsonpath="{.status.phase}"=Succeeded csv --namespace=openshift-operators -l operators.coreos.com/cluster-observability-operator.openshift-operators
clusterserviceversion.operators.coreos.com/observability-operator.v0.0.26 condition met
```
3.1.2. Deploying cert-manager for Red Hat OpenShift
The cert-manager for Red Hat OpenShift (cert-manager) Operator must be pre-installed before creating an instance of Service Telemetry Framework (STF). For more information about cert-manager, see cert-manager for Red Hat OpenShift overview.
In previous versions of STF, the only available cert-manager channel was `tech-preview`, which is available until Red Hat OpenShift Container Platform v4.12. On Red Hat OpenShift Container Platform v4.14 and later, you must install cert-manager from the `stable-v1` channel. For new installations of STF, install cert-manager from the `stable-v1` channel.
Only one deployment of cert-manager can be installed per Red Hat OpenShift Container Platform cluster. Subscribing to cert-manager in more than one project causes the deployments to conflict with each other.
Procedure
- Log in to your Red Hat OpenShift Container Platform environment where STF is hosted.
- Verify that cert-manager is not already installed on the Red Hat OpenShift Container Platform cluster. If any results are returned, do not install another instance of cert-manager:

```bash
$ oc get sub --all-namespaces -o json | jq '.items[] | select(.metadata.name | match("cert-manager")) | .metadata.name'
```
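If you script the installation, you can fold this check into a small guard so the Subscription step is skipped when cert-manager is already present. This is a sketch; the `should_install_cert_manager` helper is hypothetical and only interprets the output of the `oc get sub` query above.

```shell
#!/bin/sh
# Hypothetical helper: decide whether to proceed with the cert-manager
# install, given the names returned by the `oc get sub ... | jq` query.
# Any non-empty result means a cert-manager Subscription already exists.
should_install_cert_manager() {
  if [ -z "$1" ]; then
    echo "install"
  else
    echo "skip"
  fi
}

# Example wiring (requires a cluster login):
#   existing=$(oc get sub --all-namespaces -o json \
#     | jq -r '.items[] | select(.metadata.name | match("cert-manager")) | .metadata.name')
#   [ "$(should_install_cert_manager "$existing")" = "install" ] && echo "safe to install"
```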
- Create a namespace for the cert-manager Operator:

```bash
$ oc create -f - <<EOF
apiVersion: project.openshift.io/v1
kind: Project
metadata:
  name: cert-manager-operator
spec:
  finalizers:
  - kubernetes
EOF
```
- Create an OperatorGroup for the cert-manager Operator:

```bash
$ oc create -f - <<EOF
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: cert-manager-operator
  namespace: cert-manager-operator
spec:
  targetNamespaces:
  - cert-manager-operator
  upgradeStrategy: Default
EOF
```
- Subscribe to the cert-manager Operator by using the `redhat-operators` CatalogSource:

```bash
$ oc create -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-cert-manager-operator
  namespace: cert-manager-operator
  labels:
    operators.coreos.com/openshift-cert-manager-operator.cert-manager-operator: ""
spec:
  channel: stable-v1
  installPlanApproval: Automatic
  name: openshift-cert-manager-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF
```
- Validate your `ClusterServiceVersion`. Ensure that the cert-manager Operator displays a phase of `Succeeded`:

```bash
$ oc wait --for jsonpath="{.status.phase}"=Succeeded csv --namespace=cert-manager-operator --selector=operators.coreos.com/openshift-cert-manager-operator.cert-manager-operator
clusterserviceversion.operators.coreos.com/cert-manager-operator.v1.12.1 condition met
```
3.1.3. Deploying Service Telemetry Operator
Deploy Service Telemetry Operator on Red Hat OpenShift Container Platform to provide the supporting Operators and interface for creating an instance of Service Telemetry Framework (STF) to monitor Red Hat OpenStack Platform (RHOSP) cloud platforms.
Prerequisites
- You have installed Cluster Observability Operator to allow storage of metrics. For more information, see Section 3.1.1, “Deploying Cluster Observability Operator”.
- You have installed cert-manager for Red Hat OpenShift to allow certificate management. For more information, see Section 3.1.2, “Deploying cert-manager for Red Hat OpenShift”.
Procedure
- Create a namespace to contain the STF components, for example, `service-telemetry`:

```bash
$ oc new-project service-telemetry
```
- Create an OperatorGroup in the namespace so that you can schedule the Operator pods:

```bash
$ oc create -f - <<EOF
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: service-telemetry-operator-group
  namespace: service-telemetry
spec:
  targetNamespaces:
  - service-telemetry
EOF
```
For more information, see OperatorGroups.
- Create the Service Telemetry Operator subscription to manage the STF instances:

```bash
$ oc create -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: service-telemetry-operator
  namespace: service-telemetry
spec:
  channel: stable-1.5
  installPlanApproval: Automatic
  name: service-telemetry-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF
```
- Validate that the Service Telemetry Operator and the dependent Operators have a phase of `Succeeded`:

```bash
$ oc wait --for jsonpath="{.status.phase}"=Succeeded csv --namespace=service-telemetry -l operators.coreos.com/service-telemetry-operator.service-telemetry ; oc get csv --namespace service-telemetry
clusterserviceversion.operators.coreos.com/service-telemetry-operator.v1.5.1700688542 condition met
NAME                                         DISPLAY                                  VERSION          REPLACES                             PHASE
amq7-interconnect-operator.v1.10.17          Red Hat Integration - AMQ Interconnect   1.10.17          amq7-interconnect-operator.v1.10.4   Succeeded
observability-operator.v0.0.26               Cluster Observability Operator           0.1.0                                                 Succeeded
service-telemetry-operator.v1.5.1700688542   Service Telemetry Operator               1.5.1700688542                                        Succeeded
smart-gateway-operator.v5.0.1700688539       Smart Gateway Operator                   5.0.1700688539                                        Succeeded
```
3.2. Creating a ServiceTelemetry object in Red Hat OpenShift Container Platform
Create a `ServiceTelemetry` object in Red Hat OpenShift Container Platform so that the Service Telemetry Operator creates the supporting components for a Service Telemetry Framework (STF) deployment. For more information, see Section 3.2.1, “Primary parameters of the ServiceTelemetry object”.
Prerequisites
- You have deployed STF and the supporting operators. For more information, see Section 3.1, “Deploying Service Telemetry Framework to the Red Hat OpenShift Container Platform environment”.
- You have installed Cluster Observability Operator to allow storage of metrics. For more information, see Section 3.1.1, “Deploying Cluster Observability Operator”.
- You have installed cert-manager for Red Hat OpenShift to allow certificate management. For more information, see Section 3.1.2, “Deploying cert-manager for Red Hat OpenShift”.
Procedure
- Log in to your Red Hat OpenShift Container Platform environment where STF is hosted.
- To deploy STF with the core components for metrics delivery configured, create a `ServiceTelemetry` object:

```bash
$ oc apply -f - <<EOF
apiVersion: infra.watch/v1beta1
kind: ServiceTelemetry
metadata:
  name: default
  namespace: service-telemetry
spec:
  alerting:
    alertmanager:
      storage:
        persistent:
          pvcStorageRequest: 20G
        strategy: persistent
    enabled: true
  backends:
    metrics:
      prometheus:
        enabled: true
        scrapeInterval: 30s
        storage:
          persistent:
            pvcStorageRequest: 20G
          retention: 24h
          strategy: persistent
  clouds:
  - metrics:
      collectors:
      - bridge:
          ringBufferCount: 15000
          ringBufferSize: 16384
          verbose: false
        collectorType: collectd
        debugEnabled: false
        subscriptionAddress: collectd/cloud1-telemetry
      - bridge:
          ringBufferCount: 15000
          ringBufferSize: 16384
          verbose: false
        collectorType: ceilometer
        debugEnabled: false
        subscriptionAddress: anycast/ceilometer/cloud1-metering.sample
      - bridge:
          ringBufferCount: 15000
          ringBufferSize: 65535
          verbose: false
        collectorType: sensubility
        debugEnabled: false
        subscriptionAddress: sensubility/cloud1-telemetry
    name: cloud1
  observabilityStrategy: use_redhat
  transports:
    qdr:
      auth: basic
      certificates:
        caCertDuration: 70080h
        endpointCertDuration: 70080h
      enabled: true
      web:
        enabled: false
EOF
```
To override these defaults, add the configuration to the `spec` parameter.

- View the STF deployment logs in the Service Telemetry Operator:

```bash
$ oc logs --selector name=service-telemetry-operator

...
--------------------------- Ansible Task Status Event StdOut  -----------------

PLAY RECAP *********************************************************************
localhost                  : ok=90   changed=0    unreachable=0    failed=0    skipped=26   rescued=0    ignored=0
```
Verification
To determine that all workloads are operating correctly, view the pods and the status of each pod:

```bash
$ oc get pods
NAME                                                      READY   STATUS    RESTARTS   AGE
alertmanager-default-0                                    3/3     Running   0          123m
default-cloud1-ceil-meter-smartgateway-7dfb95fcb6-bs6jl   3/3     Running   0          122m
default-cloud1-coll-meter-smartgateway-674d88d8fc-858jk   3/3     Running   0          122m
default-cloud1-sens-meter-smartgateway-9b869695d-xcssf    3/3     Running   0          122m
default-interconnect-6cbf65d797-hk7l6                     1/1     Running   0          123m
interconnect-operator-7bb99c5ff4-l6xc2                    1/1     Running   0          138m
prometheus-default-0                                      3/3     Running   0          122m
service-telemetry-operator-7966cf57f-g4tx4                1/1     Running   0          138m
smart-gateway-operator-7d557cb7b7-9ppls                   1/1     Running   0          138m
```
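If you want to script this verification, a small filter over the `oc get pods` output can report anything that is not fully ready. This is a sketch; the `unhealthy_pods` helper is hypothetical and only parses the tabular output shown above (NAME, READY, and STATUS columns).

```shell
#!/bin/sh
# Hypothetical helper: print the names of pods that are either not in the
# Running state or whose ready count (for example 2/3) is below the total.
unhealthy_pods() {
  awk 'NR > 1 {
    split($2, r, "/")
    if ($3 != "Running" || r[1] != r[2]) print $1
  }'
}

# Example wiring (requires a cluster login):
#   oc get pods --namespace service-telemetry | unhealthy_pods
```

An empty result means every listed workload is ready.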
3.2.1. Primary parameters of the ServiceTelemetry object
You can set the following primary configuration parameters of the `ServiceTelemetry` object to configure your STF deployment:

- `alerting`
- `backends`
- `clouds`
- `graphing`
- `highAvailability`
- `transports`
The backends parameter
Set the value of the `backends` parameter to allocate the storage back ends for metrics and events, and to enable the Smart Gateways that the `clouds` parameter defines. For more information, see the section called “The clouds parameter”.
You can use Prometheus as the metrics storage back end and Elasticsearch as the events storage back end. The Service Telemetry Operator can create custom resource objects that the Prometheus Operator watches to create a Prometheus workload. You need an external deployment of Elasticsearch to store events.
Enabling Prometheus as a storage back end for metrics
To enable Prometheus as a storage back end for metrics, you must configure the `ServiceTelemetry` object.
Procedure
- Edit the `ServiceTelemetry` object:

```bash
$ oc edit stf default
```
- Set the value of the `backends.metrics.prometheus.enabled` parameter to `true`:

```yaml
apiVersion: infra.watch/v1beta1
kind: ServiceTelemetry
metadata:
  name: default
  namespace: service-telemetry
spec:
  [...]
  backends:
    metrics:
      prometheus:
        enabled: true
```
Configuring persistent storage for Prometheus
Set the additional parameters in `backends.metrics.prometheus.storage.persistent` to configure persistent storage options for Prometheus, such as storage class and volume size.

Define the back end storage class with the `storageClass` parameter. If you do not set this parameter, the Service Telemetry Operator uses the default storage class for the Red Hat OpenShift Container Platform cluster.

Define the minimum required volume size for the storage request with the `pvcStorageRequest` parameter. By default, the Service Telemetry Operator requests a volume size of `20G` (20 gigabytes).
Procedure
- List the available storage classes:

```bash
$ oc get storageclasses
NAME                 PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
csi-manila-ceph      manila.csi.openstack.org   Delete          Immediate              false                  20h
standard (default)   kubernetes.io/cinder       Delete          WaitForFirstConsumer   true                   20h
standard-csi         cinder.csi.openstack.org   Delete          WaitForFirstConsumer   true                   20h
```
- Edit the `ServiceTelemetry` object:

```bash
$ oc edit stf default
```
- Set the value of the `backends.metrics.prometheus.enabled` parameter to `true` and the value of `backends.metrics.prometheus.storage.strategy` to `persistent`:

```yaml
apiVersion: infra.watch/v1beta1
kind: ServiceTelemetry
metadata:
  name: default
  namespace: service-telemetry
spec:
  [...]
  backends:
    metrics:
      prometheus:
        enabled: true
        storage:
          strategy: persistent
          persistent:
            storageClass: standard-csi
            pvcStorageRequest: 50G
```
Enabling Elasticsearch as a storage back end for events
Previous versions of STF managed Elasticsearch objects for the community supported Elastic Cloud on Kubernetes Operator (ECK). Elasticsearch management functionality is deprecated in STF 1.5.3. You can still forward to an existing Elasticsearch instance that you deploy and manage with ECK, but you cannot manage the creation of Elasticsearch objects. When you upgrade your STF deployment, existing Elasticsearch objects and deployments remain, but are no longer managed by STF.
For more information about using Elasticsearch with STF, see the Red Hat Knowledge Base article Using Service Telemetry Framework with Elasticsearch.
To enable events forwarding to Elasticsearch as a storage back end, you must configure the `ServiceTelemetry` object.
Procedure
- Edit the `ServiceTelemetry` object:

```bash
$ oc edit stf default
```
- Set the value of the `backends.events.elasticsearch.enabled` parameter to `true` and configure the `hostUrl` with the relevant Elasticsearch instance:

```yaml
apiVersion: infra.watch/v1beta1
kind: ServiceTelemetry
metadata:
  name: default
  namespace: service-telemetry
spec:
  [...]
  backends:
    events:
      elasticsearch:
        enabled: true
        forwarding:
          hostUrl: https://external-elastic-http.domain:9200
          tlsServerName: ""
          tlsSecretName: elasticsearch-es-cert
          userSecretName: elasticsearch-es-elastic-user
          useBasicAuth: true
          useTls: true
```
- Create the secret named in the `userSecretName` parameter to store the basic auth credentials:

```bash
$ oc create secret generic elasticsearch-es-elastic-user --from-literal=elastic='<PASSWORD>'
```
- Copy the CA certificate into a file named `EXTERNAL-ES-CA.pem`, then create the secret named in the `tlsSecretName` parameter to make it available to STF:

```bash
$ cat EXTERNAL-ES-CA.pem
-----BEGIN CERTIFICATE-----
[...]
-----END CERTIFICATE-----

$ oc create secret generic elasticsearch-es-cert --from-file=ca.crt=EXTERNAL-ES-CA.pem
```
The clouds parameter
Configure the `clouds` parameter to define which Smart Gateway objects deploy and provide the interface for monitored cloud environments to connect to an instance of STF. If a supporting back end is available, metrics and events Smart Gateways for the default cloud configuration are created. By default, the Service Telemetry Operator creates Smart Gateways for `cloud1`.
You can create a list of cloud objects to control which Smart Gateways are created for the defined clouds. Each cloud consists of data types and collectors. Data types are `metrics` or `events`. Each data type consists of a list of collectors, the message bus subscription address, and a parameter to enable debugging. Available collectors for metrics are `collectd`, `ceilometer`, and `sensubility`. Available collectors for events are `collectd` and `ceilometer`. Ensure that the subscription address for each of these collectors is unique for every cloud, data type, and collector combination.
The default `cloud1` configuration is represented by the following `ServiceTelemetry` object, which provides subscriptions and data storage of metrics and events for collectd, Ceilometer, and Sensubility data collectors for a particular cloud instance:

```yaml
apiVersion: infra.watch/v1beta1
kind: ServiceTelemetry
metadata:
  name: default
  namespace: service-telemetry
spec:
  clouds:
  - name: cloud1
    metrics:
      collectors:
      - collectorType: collectd
        subscriptionAddress: collectd/cloud1-telemetry
      - collectorType: ceilometer
        subscriptionAddress: anycast/ceilometer/cloud1-metering.sample
      - collectorType: sensubility
        subscriptionAddress: sensubility/cloud1-telemetry
        debugEnabled: false
    events:
      collectors:
      - collectorType: collectd
        subscriptionAddress: collectd/cloud1-notify
      - collectorType: ceilometer
        subscriptionAddress: anycast/ceilometer/cloud1-event.sample
```
Each item of the `clouds` parameter represents a cloud instance. A cloud instance consists of three top-level parameters: `name`, `metrics`, and `events`. The `metrics` and `events` parameters represent the corresponding back end for storage of that data type. The `collectors` parameter specifies a list of objects made up of two required parameters, `collectorType` and `subscriptionAddress`, which represent an instance of the Smart Gateway. The `collectorType` parameter specifies the data collected by collectd, Ceilometer, or Sensubility. The `subscriptionAddress` parameter provides the AMQ Interconnect address to which a Smart Gateway subscribes.
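Because every cloud, data type, and collector combination must use a unique subscription address, a small check can catch copy-and-paste duplicates before you apply the object. This is a sketch; the `duplicate_addresses` helper is hypothetical and expects one `subscriptionAddress` value per line on standard input.

```shell
#!/bin/sh
# Hypothetical helper: print any subscription address that appears more
# than once in the input (each duplicate is printed a single time).
duplicate_addresses() {
  sort | uniq -d
}

# Example wiring (assumes yq is available; requires a cluster login):
#   oc get stf default -o yaml \
#     | yq '.spec.clouds[].metrics.collectors[].subscriptionAddress' \
#     | duplicate_addresses
```

An empty result means all addresses are unique.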
You can use the optional Boolean parameter `debugEnabled` within the `collectors` parameter to enable additional console debugging in the running Smart Gateway pod.
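For example, a `spec` fragment that turns on debugging for the collectd metrics Smart Gateway of `cloud1` might look like the following sketch; only `debugEnabled` changes relative to the defaults shown earlier:

```yaml
spec:
  clouds:
  - name: cloud1
    metrics:
      collectors:
      - collectorType: collectd
        subscriptionAddress: collectd/cloud1-telemetry
        debugEnabled: true
```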
Additional resources
- For more information about deleting default Smart Gateways, see Section 4.3.3, “Deleting the default Smart Gateways”.
- For more information about how to configure multiple clouds, see Section 4.3, “Configuring multiple clouds”.
The alerting parameter
Set the `alerting` parameter to create an Alertmanager instance and a storage back end. By default, `alerting` is enabled. For more information, see Section 6.3, “Alerts in Service Telemetry Framework”.
The graphing parameter
Set the `graphing` parameter to create a Grafana instance. By default, `graphing` is disabled. For more information, see Section 6.1, “Dashboards in Service Telemetry Framework”.
The highAvailability parameter
STF high availability (HA) mode is deprecated and is not supported in production environments. Red Hat OpenShift Container Platform is a highly-available platform, and enabling HA mode can cause issues and complicate debugging in STF.
Set the `highAvailability` parameter to instantiate multiple copies of STF components to reduce the recovery time of components that fail or are rescheduled. By default, `highAvailability` is disabled. For more information, see Section 6.6, “High availability”.
The transports parameter
Set the `transports` parameter to enable the message bus for an STF deployment. The only transport currently supported is AMQ Interconnect. By default, the `qdr` transport is enabled.
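As an illustration, the parameters described above map onto a `spec` such as the following sketch. The values are examples rather than recommendations, and the `enabled` field under `graphing` and `highAvailability` is an assumption that mirrors the pattern of the other parameters:

```yaml
spec:
  alerting:
    enabled: true       # default: enabled
  graphing:
    enabled: false      # default: disabled
  highAvailability:
    enabled: false      # deprecated; default: disabled
  transports:
    qdr:
      enabled: true     # default: enabled
```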
3.3. Accessing user interfaces for STF components
In Red Hat OpenShift Container Platform, applications are exposed to the external network through a route. For more information about routes, see Configuring ingress cluster traffic.
In Service Telemetry Framework (STF), HTTPS routes are exposed for each service that has a web-based interface, and the routes are protected by Red Hat OpenShift Container Platform role-based access control (RBAC).
You need the following permissions to access the corresponding component UIs:

```
{"namespace":"service-telemetry", "resource":"grafana", "group":"grafana.integreatly.org", "verb":"get"}
{"namespace":"service-telemetry", "resource":"prometheus", "group":"monitoring.rhobs", "verb":"get"}
{"namespace":"service-telemetry", "resource":"alertmanager", "group":"monitoring.rhobs", "verb":"get"}
```
For more information about RBAC, see Using RBAC to define and apply permissions.
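The permissions listed above can be granted with ordinary RBAC objects. The following Role and RoleBinding are a sketch of one way to express them; the names `stf-ui-access` and `uiuser` are hypothetical:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: stf-ui-access            # hypothetical name
  namespace: service-telemetry
rules:
- apiGroups: ["grafana.integreatly.org"]
  resources: ["grafana"]
  verbs: ["get"]
- apiGroups: ["monitoring.rhobs"]
  resources: ["prometheus", "alertmanager"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: stf-ui-access            # hypothetical name
  namespace: service-telemetry
subjects:
- kind: User
  name: uiuser                   # hypothetical user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: stf-ui-access
  apiGroup: rbac.authorization.k8s.io
```

You can then check the result with `oc auth can-i`, for example `oc auth can-i get prometheus.monitoring.rhobs --namespace service-telemetry --as uiuser`.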
Procedure
- Log in to Red Hat OpenShift Container Platform.
- Change to the `service-telemetry` namespace:

```bash
$ oc project service-telemetry
```
- List the available web UI routes in the `service-telemetry` project:

```bash
$ oc get routes | grep web
default-alertmanager-proxy   default-alertmanager-proxy-service-telemetry.apps.infra.watch   default-alertmanager-proxy   web   reencrypt/Redirect   None
default-prometheus-proxy     default-prometheus-proxy-service-telemetry.apps.infra.watch     default-prometheus-proxy     web   reencrypt/Redirect   None
```
- In a web browser, navigate to https://<route_address> to access the web interface for the corresponding service.
3.4. Configuring an alternate observability strategy
To skip the deployment of storage, visualization, and alerting back ends, add `observabilityStrategy: none` to the `ServiceTelemetry` spec. In this mode, you deploy only AMQ Interconnect routers and Smart Gateways, and you must configure an external Prometheus-compatible system to collect metrics from the STF Smart Gateways, and an external Elasticsearch to receive the forwarded events.
Procedure
- Create a `ServiceTelemetry` object with the property `observabilityStrategy: none` in the `spec` parameter. The manifest results in a default deployment of STF that is suitable for receiving telemetry from a single cloud with all metrics collector types:

```bash
$ oc apply -f - <<EOF
apiVersion: infra.watch/v1beta1
kind: ServiceTelemetry
metadata:
  name: default
  namespace: service-telemetry
spec:
  observabilityStrategy: none
EOF
```
- Delete the remaining objects that are managed by community Operators:

```bash
$ for o in alertmanagers.monitoring.rhobs/default prometheuses.monitoring.rhobs/default elasticsearch/elasticsearch grafana/default-grafana; do oc delete $o; done
```
- To verify that all workloads are operating correctly, view the pods and the status of each pod:

```bash
$ oc get pods
NAME                                                      READY   STATUS    RESTARTS   AGE
default-cloud1-ceil-event-smartgateway-6f8547df6c-p2db5   3/3     Running   0          132m
default-cloud1-ceil-meter-smartgateway-59c845d65b-gzhcs   3/3     Running   0          132m
default-cloud1-coll-event-smartgateway-bf859f8d77-tzb66   3/3     Running   0          132m
default-cloud1-coll-meter-smartgateway-75bbd948b9-d5phm   3/3     Running   0          132m
default-cloud1-sens-meter-smartgateway-7fdbb57b6d-dh2g9   3/3     Running   0          132m
default-interconnect-668d5bbcd6-57b2l                     1/1     Running   0          132m
interconnect-operator-b8f5bb647-tlp5t                     1/1     Running   0          47h
service-telemetry-operator-566b9dd695-wkvjq               1/1     Running   0          156m
smart-gateway-operator-58d77dcf7-6xsq7                    1/1     Running   0          47h
```
Additional resources
- For more information about configuring additional clouds or to change the set of supported collectors, see Section 4.3.2, “Deploying Smart Gateways”.
- To migrate an existing STF deployment to `use_redhat`, see the Red Hat Knowledge Base article Migrating Service Telemetry Framework to fully supported operators.