Observability and Service Mesh
Chapter 1. Observability and Service Mesh
Red Hat OpenShift Observability provides real-time visibility, monitoring, and analysis of various system metrics, logs, and events to help you quickly diagnose and troubleshoot issues before they impact systems or applications.
Red Hat OpenShift Observability connects open-source observability tools and technologies to create a unified Observability solution. The components of Red Hat OpenShift Observability work together to help you collect, store, deliver, analyze, and visualize data.
Red Hat OpenShift Service Mesh integrates with the following Red Hat OpenShift Observability components:
- OpenShift Monitoring
- Red Hat OpenShift distributed tracing platform
OpenShift Service Mesh also integrates with:
- Kiali provided by Red Hat, a powerful console for visualizing and managing your service mesh.
- OpenShift Service Mesh Console (OSSMC) plugin, an OpenShift Container Platform console plugin that seamlessly integrates Kiali console features into your OpenShift console.
Chapter 2. Metrics and Service Mesh
2.1. Using metrics
Monitoring stack components are deployed by default in every OpenShift Container Platform installation and are managed by the Cluster Monitoring Operator (CMO). These components include Prometheus, Alertmanager, Thanos Querier, and others. The CMO also deploys the Telemeter Client, which sends a subset of data from platform Prometheus instances to Red Hat to facilitate Remote Health Monitoring for clusters.
When you have added your application to the mesh, you can monitor the in-cluster health and performance of your applications running on OpenShift Container Platform with metrics and customized alerts for CPU and memory usage, network connectivity, and other resource usage.
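For example, after metrics collection is configured as described in the next section, you can define your own alerting rules on mesh metrics. The following PrometheusRule is a minimal sketch that fires when more than 5% of requests return a 5xx response; it assumes that Istio proxy metrics are scraped for the application namespace and that the rule is created in that same namespace. The rule name, namespace, and threshold are illustrative only.

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: mesh-error-rate       # illustrative name
  namespace: bookinfo         # an application namespace that is part of the mesh
spec:
  groups:
  - name: mesh.rules
    rules:
    - alert: MeshHighErrorRate
      expr: |
        sum(rate(istio_requests_total{reporter="destination",response_code=~"5.."}[5m]))
          /
        sum(rate(istio_requests_total{reporter="destination"}[5m])) > 0.05
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: More than 5% of requests in the mesh are returning 5xx responses.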
2.1.1. Configuring OpenShift Monitoring with Service Mesh
You can integrate Red Hat OpenShift Service Mesh with user-workload monitoring.
Prerequisites
- Red Hat OpenShift Service Mesh is installed.
- User-workload monitoring is enabled. See Enabling monitoring for user-defined projects.
Procedure
Create a YAML file named servicemonitor.yml to monitor the Istio control plane:

Example ServiceMonitor object

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: istiod-monitor
  namespace: istio-system
spec:
  targetLabels:
  - app
  selector:
    matchLabels:
      istio: pilot
  endpoints:
  - port: http-monitoring
    interval: 30s
Apply the YAML file by running the following command:
$ oc apply -f servicemonitor.yml
Create a YAML file named podmonitor.yml to collect metrics from the Istio proxies:

Example PodMonitor object

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: istio-proxies-monitor
  namespace: istio-system 1
spec:
  selector:
    matchExpressions:
    - key: istio-prometheus-ignore
      operator: DoesNotExist
  podMetricsEndpoints:
  - path: /stats/prometheus
    interval: 30s
    relabelings:
    - action: keep
      sourceLabels: ["__meta_kubernetes_pod_container_name"]
      regex: "istio-proxy"
    - action: keep
      sourceLabels: ["__meta_kubernetes_pod_annotationpresent_prometheus_io_scrape"]
    - action: replace
      regex: (\d+);(([A-Fa-f0-9]{1,4}::?){1,7}[A-Fa-f0-9]{1,4})
      replacement: '[$2]:$1'
      sourceLabels: ["__meta_kubernetes_pod_annotation_prometheus_io_port","__meta_kubernetes_pod_ip"]
      targetLabel: "__address__"
    - action: replace
      regex: (\d+);((([0-9]+?)(\.|$)){4})
      replacement: '$2:$1'
      sourceLabels: ["__meta_kubernetes_pod_annotation_prometheus_io_port","__meta_kubernetes_pod_ip"]
      targetLabel: "__address__"
    - sourceLabels: ["__meta_kubernetes_pod_label_app_kubernetes_io_name","__meta_kubernetes_pod_label_app"]
      separator: ";"
      targetLabel: "app"
      action: replace
      regex: "(.+);.*|.*;(.+)"
      replacement: "${1}${2}"
    - sourceLabels: ["__meta_kubernetes_pod_label_app_kubernetes_io_version","__meta_kubernetes_pod_label_version"]
      separator: ";"
      targetLabel: "version"
      action: replace
      regex: "(.+);.*|.*;(.+)"
      replacement: "${1}${2}"
    - sourceLabels: ["__meta_kubernetes_namespace"]
      action: replace
      targetLabel: namespace
    - action: replace
      replacement: "the-mesh-identification-string" 2
      targetLabel: mesh_id

- 1
- The namespace where the proxy metrics are collected. Because a PodMonitor selects pods only in its own namespace, create a similar PodMonitor in each namespace that is part of the mesh.
- 2
- Replace with a value that identifies your mesh.
Apply the YAML file by running the following command:
$ oc apply -f podmonitor.yml
On the OpenShift Console, go to Observe → Metrics and run the istio_requests_total query.

Note: It can take a few minutes for the query to return results.
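Once istio_requests_total returns data, you can refine the query by using the standard Istio metric labels. The following queries are illustrative examples, assuming labels such as destination_workload and response_code are present:

Request rate per destination workload:

sum by (destination_workload, destination_workload_namespace) (rate(istio_requests_total[5m]))

Share of requests that returned a 5xx response:

sum(rate(istio_requests_total{response_code=~"5.."}[5m])) / sum(rate(istio_requests_total[5m]))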
Chapter 3. Distributed tracing and Service Mesh
3.1. Configuring Red Hat OpenShift distributed tracing platform with Service Mesh
Integrating Red Hat OpenShift distributed tracing platform with Red Hat OpenShift Service Mesh is made up of two parts: Red Hat OpenShift distributed tracing platform (Tempo) and Red Hat OpenShift distributed tracing data collection.
- Red Hat OpenShift distributed tracing platform (Tempo)
Provides distributed tracing to monitor and troubleshoot transactions in complex distributed systems. Tempo is based on the open source Grafana Tempo project.
For more information about distributed tracing platform (Tempo), its features, installation, and configuration, see: Red Hat OpenShift distributed tracing platform (Tempo).
- Red Hat OpenShift distributed tracing data collection
Is based on the open source OpenTelemetry project, which aims to provide unified, standardized, and vendor-neutral telemetry data collection for cloud-native software. The Red Hat OpenShift distributed tracing data collection product supports deploying and managing the OpenTelemetry Collector and simplifies workload instrumentation.
The OpenTelemetry Collector can receive, process, and forward telemetry data in multiple formats, making it the ideal component for telemetry processing and interoperability between telemetry systems. The Collector provides a unified solution for collecting and processing metrics, traces, and logs.
For more information about distributed tracing data collection, its features, installation, and configuration, see: Red Hat OpenShift distributed tracing data collection.
3.1.1. Configuring Red Hat OpenShift distributed tracing data collection with Service Mesh
You can integrate Red Hat OpenShift Service Mesh with Red Hat OpenShift distributed tracing data collection to instrument, generate, collect, and export OpenTelemetry traces, metrics, and logs to analyze and understand your software’s performance and behavior.
Prerequisites
- Tempo Operator is installed. See: Installing the Tempo Operator.
- Red Hat OpenShift distributed tracing data collection Operator is installed. See: Installing the Red Hat build of OpenTelemetry.
- A TempoStack instance is installed and configured in a tempo namespace. See: Installing a TempoStack instance.
- An Istio instance is created.
- An Istio CNI instance is created.
Procedure
Navigate to the Red Hat OpenShift distributed tracing data collection Operator and install the OpenTelemetryCollector resource in the istio-system namespace:

Example OpenTelemetryCollector in the istio-system namespace

kind: OpenTelemetryCollector
apiVersion: opentelemetry.io/v1beta1
metadata:
  name: otel
  namespace: istio-system
spec:
  observability:
    metrics: {}
  deploymentUpdateStrategy: {}
  config:
    exporters:
      otlp:
        endpoint: 'tempo-sample-distributor.tempo.svc.cluster.local:4317'
        tls:
          insecure: true
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: '0.0.0.0:4317'
          http: {}
    service:
      pipelines:
        traces:
          exporters:
            - otlp
          receivers:
            - otlp
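Optional: Before enabling tracing in the mesh, you can confirm that the collector created by the Operator is running. This quick check assumes the Operator's default naming convention, in which the Deployment is named after the OpenTelemetryCollector resource with a -collector suffix:

$ oc get deployment otel-collector -n istio-system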
Configure Red Hat OpenShift Service Mesh to enable tracing, and define the distributed tracing data collection tracing providers in your meshConfig:

Example enabling tracing and defining tracing providers

apiVersion: sailoperator.io/v1
kind: Istio
metadata:
  # ...
  name: default
spec:
  namespace: istio-system
  # ...
  values:
    meshConfig:
      enableTracing: true
      extensionProviders:
      - name: otel
        opentelemetry:
          port: 4317
          service: otel-collector.istio-system.svc.cluster.local 1

- 1
- The service field is the OpenTelemetry collector service in the istio-system namespace.
Create an Istio Telemetry resource to enable the tracers defined in spec.values.meshConfig.extensionProviders:

Example Istio Telemetry resource

apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: otel-demo
  namespace: istio-system
spec:
  tracing:
  - providers:
    - name: otel
    randomSamplingPercentage: 100
Note: Once you verify that you can see traces, lower the randomSamplingPercentage value or set it to default to reduce the number of sampled requests.

Create the bookinfo namespace by running the following command:

$ oc create ns bookinfo
Depending on the update strategy you are using, enable sidecar injection in the namespace by running the appropriate commands:
If you are using the InPlace update strategy, run the following command:

$ oc label namespace bookinfo istio-injection=enabled
If you are using the RevisionBased update strategy, run the following commands:

Display the revision name by running the following command:

$ oc get istiorevisions.sailoperator.io

Example output

NAME      TYPE    READY   STATUS    IN USE   VERSION   AGE
default   Local   True    Healthy   True     v1.24.3   3m33s

Label the namespace with the revision name to enable sidecar injection by running the following command:

$ oc label namespace bookinfo istio.io/rev=default
Deploy the bookinfo application in the bookinfo namespace by running the following command:

$ oc apply -f https://raw.githubusercontent.com/openshift-service-mesh/istio/release-1.24/samples/bookinfo/platform/kube/bookinfo.yaml -n bookinfo
Generate traffic to the productpage pod to generate traces:

$ oc exec -it -n bookinfo deployments/productpage-v1 -c istio-proxy -- curl localhost:9080/productpage
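A single request usually produces only one trace. Optionally, to generate a steadier stream of traces, you can repeat the request in a loop; the following command is an illustrative example rather than part of the official procedure:

$ for i in $(seq 1 20); do oc exec -n bookinfo deployments/productpage-v1 -c istio-proxy -- curl -s -o /dev/null localhost:9080/productpage; sleep 1; done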
Validate the integration by running the following command to see traces in the UI:
$ oc get routes -n tempo tempo-sample-query-frontend
Note: The OpenShift route for the Jaeger UI must be created in the Tempo namespace. You can either manually create the route for the tempo-sample-query-frontend service, or update the Tempo custom resource with .spec.template.queryFrontend.jaegerQuery.ingress.type: route.
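For example, enabling the route through the Tempo custom resource could look like the following excerpt. This is a sketch that assumes a TempoStack instance named sample in the tempo namespace, matching the tempo-sample-* service and route names used in this procedure:

apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: sample
  namespace: tempo
spec:
  # ...
  template:
    queryFrontend:
      jaegerQuery:
        enabled: true
        ingress:
          type: route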
Chapter 4. Kiali Operator provided by Red Hat
4.1. Using Kiali Operator provided by Red Hat
Once you have added your application to the mesh, you can use Kiali Operator provided by Red Hat to view the data flow through your application.
4.1.1. About Kiali
You can use Kiali Operator provided by Red Hat to view configurations, monitor traffic, and analyze traces in a single console. It is based on the open source Kiali project.
Kiali Operator provided by Red Hat is the management console for Red Hat OpenShift Service Mesh. It provides dashboards, observability, and robust configuration and validation capabilities. It shows the structure of your service mesh by inferring traffic topology and displays the health of your mesh. Kiali provides detailed metrics, powerful validation, access to Grafana, and strong integration with the Red Hat OpenShift distributed tracing platform (Tempo).
4.1.2. Installing the Kiali Operator provided by Red Hat
The following steps show how to install the Kiali Operator provided by Red Hat.
Do not install the Community version of the Operator. The Community version is not supported.
Prerequisites
- Access to the Red Hat OpenShift Service Mesh web console.
Procedure
- Log in to the Red Hat OpenShift Service Mesh web console.
- Navigate to Operators → OperatorHub.
- Type Kiali into the filter box to find the Kiali Operator provided by Red Hat.
- Click Kiali Operator provided by Red Hat to display information about the Operator.
- Click Install.
- On the Operator Installation page, select the stable Update Channel.
- Select All namespaces on the cluster (default). This installs the Operator in the default openshift-operators project and makes the Operator available to all projects in the cluster.
- Select the Automatic Approval Strategy.
Note: The Manual approval strategy requires a user with appropriate credentials to approve the Operator installation and subscription process.
- Click Install.
- The Installed Operators page displays the Kiali Operator’s installation progress.
4.1.3. Configuring OpenShift Monitoring with Kiali
The following steps show how to integrate the Kiali Operator provided by Red Hat with user-workload monitoring.
Prerequisites
- Red Hat OpenShift Service Mesh is installed.
- User-workload monitoring is enabled. See Enabling monitoring for user-defined projects.
- OpenShift Monitoring has been configured with Service Mesh. See "Configuring OpenShift Monitoring with Service Mesh".
- Kiali Operator provided by Red Hat 2.4 is installed.
Procedure
Create a ClusterRoleBinding resource for Kiali:

Example ClusterRoleBinding configuration

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kiali-monitoring-rbac
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-monitoring-view
subjects:
- kind: ServiceAccount
  name: kiali-service-account
  namespace: istio-system
Create a Kiali resource and point it to your Istio instance:
Example Kiali resource configuration
apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
  name: kiali-user-workload-monitoring
  namespace: istio-system
spec:
  external_services:
    prometheus:
      auth:
        type: bearer
        use_kiali_token: true
      thanos_proxy:
        enabled: true
      url: https://thanos-querier.openshift-monitoring.svc.cluster.local:9091
When the Kiali resource is ready, get the Kiali URL from the Route by running the following command:
$ echo "https://$(oc get routes -n istio-system kiali -o jsonpath='{.spec.host}')"
- Follow the URL to open Kiali in your web browser.
- Navigate to the Traffic Graph tab to check the traffic in the Kiali UI.
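If the Kiali UI does not load or the Traffic Graph stays empty, you can first confirm that the Kiali pod is running. This quick check assumes the default app=kiali label that the Kiali Operator applies to the server pod:

$ oc get pods -n istio-system -l app=kiali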
4.1.4. Integrating Red Hat OpenShift distributed tracing platform with Kiali Operator provided by Red Hat
You can integrate Red Hat OpenShift distributed tracing platform with Kiali Operator provided by Red Hat, which enables the following features:
- Display trace overlays and details on the graph.
- Display scatterplot charts and in-depth trace/span information on detail pages.
- Integrate span information into logs and metric charts.
- Offer links to the external tracing UI.
4.1.4.1. Configuring Red Hat OpenShift distributed tracing platform with Kiali Operator provided by Red Hat
After Kiali Operator provided by Red Hat is integrated with Red Hat OpenShift distributed tracing platform, you can view distributed traces in the Kiali console. Viewing these traces can provide insight into the communication between services within the service mesh, helping you understand how requests are flowing through your system and where potential issues might be.
Prerequisites
- You installed Red Hat OpenShift Service Mesh.
- You configured distributed tracing platform with Red Hat OpenShift Service Mesh.
Procedure
Update the Kiali resource spec configuration for tracing:

Example Kiali resource spec configuration for tracing

spec:
  external_services:
    tracing:
      enabled: true 1
      provider: tempo 2
      use_grpc: false
      internal_url: https://tempo-sample-gateway.tempo.svc.cluster.local:8080/api/traces/v1/default/tempo 3
      external_url: https://tempo-sample-gateway-tempo.apps-crc.testing/api/traces/v1/default/search 4
      health_check_url: https://tempo-sample-gateway-tempo.apps-crc.testing/api/traces/v1/north/tempo/api/echo 5
      auth: 6
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
        insecure_skip_verify: false
        type: bearer
        use_kiali_token: true
      tempo_config:
        url_format: "jaeger" 7
- 1
- Specifies whether tracing is enabled.
- 2
- Specifies either distributed tracing platform (Tempo) or distributed tracing platform (Jaeger). Tempo can expose a Jaeger or Tempo API.
- 3
- Specifies the internal URL for the Tempo API. If Tempo is deployed in a multi-tenant model, you must include the tenant name.
- 4
- Specifies the OpenShift route for the Jaeger UI. When deployed in a multi-tenant model, the gateway creates the route. Otherwise, you must create the route in the Tempo namespace. You can manually create the route for the tempo-sample-query-frontend service or update the Tempo CR with .spec.template.queryFrontend.jaegerQuery.ingress.type: route.
- 5
- Specifies the health check URL. Not required by default. When Tempo is deployed in a multi-tenant model, Kiali does not expose the default health check URL. This is an example of a valid health check URL.
- 6
- Specifies the configuration used when the access URL is HTTPS or requires authentication. Not required by default.
- 7
- Specifies the URL format, which defaults to grafana. Not required by default. Change it to jaeger if the Kiali View in Tracing link redirects to the Jaeger console UI.
Save the updated spec in kiali_cr.yaml. Run the following command to apply the configuration:
$ oc patch -n istio-system kiali kiali --type merge -p "$(cat kiali_cr.yaml)"
Example output:
kiali.kiali.io/kiali patched
Verification
Run the following command to get the Kiali route:
$ oc get route kiali -n istio-system
- Navigate to the Kiali UI.
- Navigate to the Workloads → Traces tab to see traces in the Kiali UI.
4.2. Using OpenShift Service Mesh Console plugin
The OpenShift Service Mesh Console (OSSMC) plugin extends the OpenShift Container Platform web console with a Service Mesh menu and enhanced tabs for workloads and services.
4.2.1. About OpenShift Service Mesh Console plugin
The OpenShift Service Mesh Console (OSSMC) plugin is an extension to the OpenShift Container Platform web console that provides visibility into your Service Mesh.
The OSSMC plugin supports only one Kiali instance, regardless of its project access scope.
The OSSMC plugin provides a new category, Service Mesh, in the main OpenShift Container Platform web console navigation with the following menu options:
- Overview
- Provides a summary of your mesh, displayed as cards that represent the namespaces in the mesh.
- Traffic Graph
- Provides a full topology view of your mesh, represented by nodes and edges. Each node represents a component of the mesh and each edge represents traffic flowing through the mesh between components.
- Istio config
- Provides a list of all Istio configuration files in your mesh, with a column that provides a quick way to know if the configuration for each resource is valid.
- Mesh
- Provides detailed information about the Istio infrastructure status. It shows an infrastructure topology view with core and add-on components, their health, and how they are connected to each other.
In the web console Workloads details page, the OSSMC plugin adds a Service Mesh tab that has the following subtabs:
- Overview
- Shows a summary of the selected workload, including a localized topology graph showing the workload with all inbound and outbound edges and nodes.
- Traffic
- Shows information about all inbound and outbound traffic to the workload.
- Logs
- Shows the logs for the workload’s containers. You can see container logs individually ordered by log time and how the Envoy sidecar proxy logs relate to your workload’s application logs. You can enable the tracing span integration, which allows you to see which logs correspond to trace spans.
- Metrics
- Shows inbound and outbound metric graphs in the corresponding subtabs. All the workload metrics are here, providing a detailed view of the performance of your workload. You can enable the tracing span integration, which allows you to see which spans occurred at the same time as the metrics. With the span marker in the graph, you can see the specific spans associated with that timeframe.
- Traces
- Provides a chart showing the trace spans collected over the given timeframe. The trace spans show the lowest-level detail within your workload application. The trace details further show heatmaps that compare one span to other requests and spans in the same timeframe.
- Envoy
- Shows information about the Envoy sidecar configuration.
In the web console Networking details page, the OSSMC plugin adds a Service Mesh tab similar to the Workloads details page.
In the web console Projects details page, the OSSMC plugin adds a Service Mesh tab that provides traffic graph information about that project. It is the same information shown in the Traffic Graph page but specific to that project.
4.2.2. Installing OpenShift Service Mesh Console plugin
You can install the OSSMC plugin with the Kiali Operator by creating an OSSMConsole resource with the corresponding plugin settings. Install the latest version of the Kiali Operator, even when you are installing an earlier OSSMC plugin version, because the latest Operator includes the latest z-stream release.
OSSM version | Kiali Server version | OSSMC plugin version | OCP version
--- | --- | --- | ---
3.0 | v2.4 | v2.4 | 4.15+
2.6 | v1.73 | v1.73 | 4.15-4.18
2.5 | v1.73 | v1.73 | 4.14-4.18
The OSSMC plugin is supported only on OpenShift Container Platform 4.15 and later. For OpenShift Container Platform 4.14 users, only the standalone Kiali console is accessible.
You can install the OSSMC plugin by using the OpenShift Container Platform web console or the OpenShift CLI (oc).
4.2.2.1. Installing OSSMC plugin by using the OpenShift Container Platform web console
You can install the OpenShift Service Mesh Console (OSSMC) plugin by using the OpenShift Container Platform web console.
Prerequisites
- You have administrator access to the OpenShift Container Platform web console.
- You have installed the OpenShift Service Mesh (OSSM).
- You have installed the Istio control plane from OSSM 3.0.
- You have installed Kiali Server 2.4.
Procedure
- Navigate to Installed Operators.
- Click Kiali Operator provided by Red Hat.
- Click Create instance on the Red Hat OpenShift Service Mesh Console tile. You can also click the Create OSSMConsole button on the OpenShift Service Mesh Console tab.
- Use the Create OSSMConsole form to create an instance of the OSSMConsole custom resource (CR). Name and Version are the required fields.
Note: The Version field must match the spec.version field in your Kiali custom resource (CR). If the Version value is the string default, the Kiali Operator installs OpenShift Service Mesh Console (OSSMC) with the same version as the Operator. The spec.version field requires the v prefix in the version number. The version number must include only the major and minor version numbers (not the patch number); for example: v1.73.
- Click Create.
Verification
- Wait until the web console notifies you that the OSSMC plugin is installed and prompts you to refresh.
- Verify that the Service Mesh category is added in the main OpenShift Container Platform web console navigation.
4.2.2.2. Installing OSSMC plugin by using the CLI
You can install the OpenShift Service Mesh Console (OSSMC) plugin by using the OpenShift CLI.
Prerequisites
- You have access to the OpenShift CLI (oc) on the cluster as an administrator.
- You have installed the OpenShift Service Mesh (OSSM).
- You have installed the Istio control plane from OSSM 3.0.
- You have installed Kiali Server 2.4.
Procedure
Create an OSSMConsole custom resource (CR) to install the plugin by running the following command:

$ cat <<EOM | oc apply -f -
apiVersion: kiali.io/v1alpha1
kind: OSSMConsole
metadata:
  namespace: openshift-operators
  name: ossmconsole
spec:
  version: default
EOM
Note: The OpenShift Service Mesh Console (OSSMC) version must match the Kiali Server version. If the spec.version field value is the string default or is not specified, the Kiali Operator installs OSSMC with the same version as the Operator. The spec.version field requires the v prefix in the version number. The version number must include only the major and minor version numbers (not the patch number); for example: v1.73.

The plugin resources deploy in the same namespace as the OSSMConsole CR.

Optional: If more than one Kiali Server is installed in the cluster, specify the spec.kiali setting in the OSSMConsole CR by running a command similar to the following example:

$ cat <<EOM | oc apply -f -
apiVersion: kiali.io/v1alpha1
kind: OSSMConsole
metadata:
  namespace: openshift-operators
  name: ossmconsole
spec:
  kiali:
    serviceName: kiali
    serviceNamespace: istio-system-two
    servicePort: 20001
EOM
Verification
- Go to the OpenShift Container Platform web console.
- Verify that the Service Mesh category is added in the main OpenShift Container Platform web console navigation.
- If the OSSMC plugin is not installed yet, wait until the web console notifies you that the OSSMC plugin is installed and prompts you to refresh.
4.2.3. Uninstalling OpenShift Service Mesh Console plugin
You can uninstall the OSSMC plugin by using the OpenShift Container Platform web console or the OpenShift CLI (oc).
You must uninstall the OSSMC plugin before removing the Kiali Operator. Deleting the Operator first may leave the OSSMC and Kiali CRs stuck, requiring manual removal of the finalizer. If needed, use the following command with <custom_resource_type> set to kiali or ossmconsole to remove the finalizer:
$ oc patch <custom_resource_type> <custom_resource_name> -n <custom_resource_namespace> -p '{"metadata":{"finalizers": []}}' --type=merge
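For example, for the ossmconsole CR created in the openshift-operators namespace in the installation procedure, the command is the following; adjust the names if your CR differs:

$ oc patch ossmconsole ossmconsole -n openshift-operators -p '{"metadata":{"finalizers": []}}' --type=merge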
4.2.3.1. Uninstalling OSSMC plugin by using the web console
You can uninstall the OpenShift Service Mesh Console (OSSMC) plugin by using the OpenShift Container Platform web console.
Procedure
- Navigate to Installed Operators.
- Click Kiali Operator.
- Select the OpenShift Service Mesh Console tab.
- Click the Delete OSSMConsole option from the entry menu.
- Confirm that you want to delete the plugin.
4.2.3.2. Uninstalling OSSMC plugin by using the CLI
You can uninstall the OpenShift Service Mesh Console (OSSMC) plugin by using the OpenShift CLI (oc).
Procedure
Remove the OSSMC custom resource (CR) by running the following command:
$ oc delete ossmconsoles <custom_resource_name> -n <custom_resource_namespace>
Verification
Verify all the CRs are deleted from all namespaces by running the following command:
$ for r in $(oc get ossmconsoles --ignore-not-found=true --all-namespaces -o custom-columns=NS:.metadata.namespace,N:.metadata.name --no-headers | sed 's/ */:/g'); do oc delete ossmconsoles -n $(echo $r|cut -d: -f1) $(echo $r|cut -d: -f2); done
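Optional: You can also confirm that the console plugin resource itself is gone. This check assumes the plugin registers a ConsolePlugin resource named ossmconsole; the command returns a NotFound error after a successful uninstall:

$ oc get consoleplugin ossmconsole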