Chapter 4. Distributed tracing platform (Tempo)
4.1. Installing the distributed tracing platform (Tempo)
Installing the distributed tracing platform (Tempo) involves the following steps:
- Setting up supported object storage.
- Installing the Tempo Operator.
- Creating a secret for the object storage credentials.
- Creating a namespace for a TempoStack instance.
- Creating a TempoStack custom resource to deploy at least one TempoStack instance.
4.1.1. Object storage setup
You can use the following configuration parameters when setting up supported object storage.
Storage provider | Secret parameters |
---|---|
MinIO | See MinIO Operator. |
Amazon S3 | |
Microsoft Azure Blob Storage | |
Google Cloud Storage on Google Cloud Platform (GCP) | |
4.1.2. Installing the distributed tracing platform (Tempo) from the web console
You can install the distributed tracing platform (Tempo) from the Administrator view of the web console.
Prerequisites
- You are logged in to the OpenShift Container Platform web console as a cluster administrator with the cluster-admin role.
- For Red Hat OpenShift Dedicated, you must be logged in using an account with the dedicated-admin role.
- You are using a supported provider of object storage: Red Hat OpenShift Data Foundation, MinIO, Amazon S3, Azure Blob Storage, Google Cloud Storage.
Procedure
- Install the Tempo Operator:
  - Go to Operators → OperatorHub and search for Tempo Operator.
  - Select the Tempo Operator that is OpenShift Operator for Tempo → Install → Install → View Operator.
    Important: This installs the Operator with the default presets:
    - Update channel → stable
    - Installation mode → All namespaces on the cluster
    - Installed Namespace → openshift-tempo-operator
    - Update approval → Automatic
  - In the Details tab of the page of the installed Operator, under ClusterServiceVersion details, verify that the installation Status is Succeeded.
- Create a project of your choice for the TempoStack instance that you will create in a subsequent step: go to Home → Projects → Create Project.
- In the project that you created for the TempoStack instance, create a secret for your object storage bucket: go to Workloads → Secrets → Create → From YAML.
  Example secret for Amazon S3 and MinIO storage
    apiVersion: v1
    kind: Secret
    metadata:
      name: minio-test
    stringData:
      endpoint: http://minio.minio.svc:9000
      bucket: tempo
      access_key_id: tempo
      access_key_secret: <secret>
    type: Opaque
- Create a TempoStack instance.
  Note: You can create multiple TempoStack instances in separate projects on the same cluster.
  - Go to Operators → Installed Operators.
  - Select TempoStack → Create TempoStack → YAML view.
  - In the YAML view, customize the TempoStack custom resource (CR):
      apiVersion: tempo.grafana.com/v1alpha1
      kind: TempoStack
      metadata:
        name: sample
        namespace: <project_of_tempostack_instance>
      spec:
        storageSize: 1Gi
        storage:
          secret:
            name: <secret-name> # name of the secret that you created for the object storage
            type: <secret-provider> # storage type of the secret, for example s3
        template:
          queryFrontend:
            jaegerQuery:
              enabled: true
              ingress:
                route:
                  termination: edge
                type: route
    Example of a TempoStack CR for AWS S3 and MinIO storage
      apiVersion: tempo.grafana.com/v1alpha1
      kind: TempoStack
      metadata:
        name: simplest
        namespace: <project_of_tempostack_instance>
      spec:
        storageSize: 1Gi
        storage:
          secret:
            name: minio-test
            type: s3
        resources:
          total:
            limits:
              memory: 2Gi
              cpu: 2000m
        template:
          queryFrontend:
            jaegerQuery:
              enabled: true
              ingress:
                route:
                  termination: edge
                type: route
    The stack deployed in this example is configured to receive Jaeger Thrift over HTTP and OpenTelemetry Protocol (OTLP), which permits visualizing the data with the Jaeger UI.
  - Select Create.
Verification
- Use the Project: dropdown list to select the project of the TempoStack instance.
- Go to Operators → Installed Operators to verify that the Status of the TempoStack instance is Condition: Ready.
- Go to Workloads → Pods to verify that all the component pods of the TempoStack instance are running.
- Access the Tempo console:
  - Go to Networking → Routes and press Ctrl+F to search for tempo.
  - In the Location column, open the URL to access the Tempo console.
  - Select Log In With OpenShift to use your cluster administrator credentials for the web console.
    Note: The Tempo console initially shows no trace data following the Tempo console installation.
4.1.3. Installing the distributed tracing platform (Tempo) by using the CLI
You can install the distributed tracing platform (Tempo) from the command line.
Prerequisites
- An active OpenShift CLI (oc) session by a cluster administrator with the cluster-admin role.
  Tip:
  - Ensure that your OpenShift CLI (oc) version is up to date and matches your OpenShift Container Platform version.
  - Run oc login:
      $ oc login --username=<your_username>
- You are using a supported provider of object storage: Red Hat OpenShift Data Foundation, MinIO, Amazon S3, Azure Blob Storage, Google Cloud Storage.
Procedure
- Install the Tempo Operator:
  - Create a project for the Tempo Operator by running the following command:
      $ oc apply -f - << EOF
      apiVersion: project.openshift.io/v1
      kind: Project
      metadata:
        labels:
          kubernetes.io/metadata.name: openshift-tempo-operator
          openshift.io/cluster-monitoring: "true"
        name: openshift-tempo-operator
      EOF
  - Create an Operator group by running the following command:
      $ oc apply -f - << EOF
      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
        name: openshift-tempo-operator
        namespace: openshift-tempo-operator
      spec:
        upgradeStrategy: Default
      EOF
  - Create a subscription by running the following command:
      $ oc apply -f - << EOF
      apiVersion: operators.coreos.com/v1alpha1
      kind: Subscription
      metadata:
        name: tempo-product
        namespace: openshift-tempo-operator
      spec:
        channel: stable
        installPlanApproval: Automatic
        name: tempo-product
        source: redhat-operators
        sourceNamespace: openshift-marketplace
      EOF
  - Check the Operator status by running the following command:
      $ oc get csv -n openshift-tempo-operator
- Run the following command to create a project of your choice for the TempoStack instance that you will create in a subsequent step:
    $ oc apply -f - << EOF
    apiVersion: project.openshift.io/v1
    kind: Project
    metadata:
      name: <project_of_tempostack_instance>
    EOF
- In the project that you created for the TempoStack instance, create a secret for your object storage bucket by running the following command:
    $ oc apply -f - << EOF
    <object_storage_secret>
    EOF
  Example secret for Amazon S3 and MinIO storage
    apiVersion: v1
    kind: Secret
    metadata:
      name: minio-test
    stringData:
      endpoint: http://minio.minio.svc:9000
      bucket: tempo
      access_key_id: tempo
      access_key_secret: <secret>
    type: Opaque
- Create a TempoStack instance in the project that you created for it.
  Note: You can create multiple TempoStack instances in separate projects on the same cluster.
  - Customize the TempoStack custom resource (CR):
      apiVersion: tempo.grafana.com/v1alpha1
      kind: TempoStack
      metadata:
        name: sample
        namespace: <project_of_tempostack_instance>
      spec:
        storageSize: 1Gi
        storage:
          secret:
            name: <secret-name> # name of the secret that you created for the object storage
            type: <secret-provider> # storage type of the secret, for example s3
        template:
          queryFrontend:
            jaegerQuery:
              enabled: true
              ingress:
                route:
                  termination: edge
                type: route
    Example of a TempoStack CR for AWS S3 and MinIO storage
      apiVersion: tempo.grafana.com/v1alpha1
      kind: TempoStack
      metadata:
        name: simplest
        namespace: <project_of_tempostack_instance>
      spec:
        storageSize: 1Gi
        storage:
          secret:
            name: minio-test
            type: s3
        resources:
          total:
            limits:
              memory: 2Gi
              cpu: 2000m
        template:
          queryFrontend:
            jaegerQuery:
              enabled: true
              ingress:
                route:
                  termination: edge
                type: route
    The stack deployed in this example is configured to receive Jaeger Thrift over HTTP and OpenTelemetry Protocol (OTLP), which permits visualizing the data with the Jaeger UI.
  - Apply the customized CR by running the following command:
      $ oc apply -f - << EOF
      <TempoStack_custom_resource>
      EOF
Verification
- Verify that the status of all TempoStack components is Running and the conditions are type: Ready by running the following command:
    $ oc get tempostacks.tempo.grafana.com simplest -o yaml
- Verify that all the TempoStack component pods are running by running the following command:
    $ oc get pods
- Access the Tempo console:
  - Query the route details by running the following command:
      $ export TEMPO_URL=$(oc get route -n <control_plane_namespace> tempo -o jsonpath='{.spec.host}')
  - Open https://<route_from_previous_step> in a web browser.
  - Log in using your cluster administrator credentials for the web console.
    Note: The Tempo console initially shows no trace data following the Tempo console installation.
4.2. Configuring and deploying the distributed tracing platform (Tempo)
The Tempo Operator uses a custom resource definition (CRD) file that defines the architecture and configuration settings to be used when creating and deploying the distributed tracing platform (Tempo) resources. You can install the default configuration or modify the file.
4.2.1. Customizing your deployment
For information about configuring the back-end storage, see Understanding persistent storage and the appropriate configuration topic for your chosen storage option.
4.2.1.1. Distributed tracing default configuration options
The Tempo custom resource (CR) defines the architecture and settings to be used when creating the distributed tracing platform (Tempo) resources. You can modify these parameters to customize your distributed tracing platform (Tempo) implementation to your business needs.
Example of a generic Tempo YAML file
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: name
spec:
  storage: {}
  resources: {}
  storageSize: 200M
  replicationFactor: 1
  retention: {}
  template:
    distributor: {}
    ingester: {}
    compactor: {}
    querier: {}
    queryFrontend: {}
    gateway: {}
Parameter | Description | Values | Default value |
---|---|---|---|
apiVersion: | API version to use when creating the object. | | |
kind: | Defines the kind of Kubernetes object to create. | | |
metadata: | Data that uniquely identifies the object, including a name string, UID, and optional namespace. | OpenShift Container Platform automatically generates the UID and completes the namespace with the name of the project where the object is created. | |
name: | Name for the object. | Name of your TempoStack instance. | |
spec: | Specification for the object to be created. | Contains all of the configuration parameters for your TempoStack instance. When a common definition for all Tempo components is required, it is defined under spec. | N/A |
resources: | Resources assigned to the TempoStack. | | |
storageSize: | Storage size for ingester PVCs. | | |
replicationFactor: | Configuration for the replication factor. | | |
retention: | Configuration options for retention of traces. | | |
storage: | Configuration options that define the storage. All storage-related options must be placed under storage. | | |
template.distributor: | Configuration options for the Tempo distributor. | | |
template.ingester: | Configuration options for the Tempo ingester. | | |
template.compactor: | Configuration options for the Tempo compactor. | | |
template.querier: | Configuration options for the Tempo querier. | | |
template.queryFrontend: | Configuration options for the Tempo query frontend. | | |
template.gateway: | Configuration options for the Tempo gateway. | | |
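As an illustration of how the top-level parameters in the table fit together, the following sketch sets the ingester storage size, the replication factor, and a global trace retention period. The `retention.global.traces` field path and the 48h value are assumptions based on the TempoStack v1alpha1 API and are not shown elsewhere in this chapter, so verify them against the CRD installed in your cluster. The instance name `simplest` and the secret name `minio` are reused from the examples in this chapter.

```yaml
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: simplest
spec:
  storage:
    secret:
      name: minio          # secret that holds the object storage credentials
      type: s3
  storageSize: 1Gi         # size of the ingester PVCs
  replicationFactor: 1
  retention:
    global:
      traces: 48h          # assumed field path: keep traces for 48 hours
```

Any parameter that is omitted keeps the default chosen by the Operator.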
Minimum required configuration
The following is the required minimum for creating a distributed tracing platform (Tempo) deployment with the default settings:
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: simplest
spec:
  storage: # 1
    secret:
      name: minio
      type: s3
  resources:
    total:
      limits:
        memory: 2Gi
        cpu: 2000m
  template:
    queryFrontend:
      jaegerQuery:
        enabled: true
        ingress:
          type: route
- 1
- This section specifies the deployed object storage back end, which requires a created secret with credentials for access to the object storage.
4.2.1.2. The distributed tracing platform (Tempo) storage configuration
You can configure object storage for the distributed tracing platform (Tempo) in the TempoStack custom resource under spec.storage. You can choose from among several supported storage providers.
Parameter | Description | Values | Default value |
---|---|---|---|
spec.storage.secret.type: | Type of storage to use for the deployment. | | |
storage.secret.name: | Name of the secret that contains the credentials for the set object storage type. | N/A | |
storage.tls.caName: | CA is the name of a ConfigMap object that contains a CA certificate. | | |
Storage provider | Secret parameters |
---|---|
MinIO | See MinIO Operator. |
Amazon S3 | |
Microsoft Azure Blob Storage | |
Google Cloud Storage on Google Cloud Platform (GCP) | |
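The secret parameters for the non-S3 providers are not reproduced in the table above. As a hedged sketch only, a secret for Microsoft Azure Blob Storage and the matching storage section might look like the following. The key names `container`, `account_name`, and `account_key`, the `azure` type value, and the secret name `azure-test` are assumptions taken from the upstream Tempo Operator conventions, so confirm them against the documentation for your Operator version before use.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: azure-test                        # hypothetical secret name
stringData:
  container: <container_name>             # assumed key: Azure Blob Storage container
  account_name: <storage_account_name>    # assumed key
  account_key: <storage_account_key>      # assumed key
type: Opaque
---
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: simplest
spec:
  storage:
    secret:
      name: azure-test    # must match the name of the secret above
      type: azure         # assumed value for Azure Blob Storage
```

The secret and the TempoStack instance must be created in the same project.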
4.2.1.3. Query configuration options
Two components of the distributed tracing platform (Tempo), the querier and query frontend, manage queries. You can configure both of these components.
The querier component finds the requested trace ID in the ingesters or back-end storage. Depending on the set parameters, the querier component can query both the ingesters and pull bloom or indexes from the back end to search blocks in object storage. The querier component exposes an HTTP endpoint at GET /querier/api/traces/<traceID>, but it is not expected to be used directly. Queries must be sent to the query frontend.
Parameter | Description | Values |
---|---|---|
| The simple form of the node-selection constraint. | type: object |
| The number of replicas to be created for the component. | type: integer; format: int32 |
| Component-specific pod tolerations. | type: array |
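The parameter names in the table above did not survive, so the following is only a sketch of how the three described settings might be expressed for the querier. The field names `nodeSelector`, `replicas`, and `tolerations` are assumptions that follow the usual Kubernetes naming, and the `node-role.kubernetes.io/infra` label and taint are hypothetical examples of dedicated infrastructure nodes.

```yaml
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: simplest
spec:
  storage:
    secret:
      name: minio
      type: s3
  template:
    querier:
      replicas: 2                             # assumed field: number of querier replicas
      nodeSelector:                           # assumed field: simple node-selection constraint
        node-role.kubernetes.io/infra: ""     # hypothetical node label
      tolerations:                            # assumed field: component-specific pod tolerations
        - key: node-role.kubernetes.io/infra  # hypothetical taint key
          operator: Exists
          effect: NoSchedule
```

Check the field names against the TempoStack CRD in your cluster, for example with oc explain, before applying a configuration like this.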
The query frontend component is responsible for sharding the search space for an incoming query. The query frontend exposes traces via a simple HTTP endpoint: GET /api/traces/<traceID>. Internally, the query frontend component splits the blockID space into a configurable number of shards and then queues these requests. The querier component connects to the query frontend component via a streaming gRPC connection to process these sharded queries.
Parameter | Description | Values |
---|---|---|
| Configuration of the query frontend component. | type: object |
| The simple form of the node-selection constraint. | type: object |
| The number of replicas to be created for the query frontend component. | type: integer; format: int32 |
| Pod tolerations specific to the query frontend component. | type: array |
| The options specific to the Jaeger Query component. | type: object |
| When enabled, creates the Jaeger Query component. | type: boolean |
| The options for the Jaeger Query ingress. | type: object |
| The annotations of the ingress object. | type: object |
| The hostname of the ingress object. | type: string |
| The name of an IngressClass cluster resource. Defines which ingress controller serves this ingress resource. | type: string |
| The options for the OpenShift route. | type: object |
| The termination type. The default is edge. | type: string (enum: insecure, edge, passthrough, reencrypt) |
| The type of ingress for the Jaeger Query UI. The supported types are ingress and route. | type: string (enum: ingress, route) |
| The monitor tab configuration. | type: object |
| Enables the monitor tab in the Jaeger console. The Prometheus endpoint must be configured. | type: boolean |
| The endpoint to the Prometheus instance that contains the span rate, error, and duration (RED) metrics. For example, https://thanos-querier.openshift-monitoring.svc.cluster.local:9091. | type: string |
Example configuration of the query frontend component in a TempoStack CR
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: simplest
spec:
  storage:
    secret:
      name: minio
      type: s3
  storageSize: 200M
  resources:
    total:
      limits:
        memory: 2Gi
        cpu: 2000m
  template:
    queryFrontend:
      jaegerQuery:
        enabled: true
        ingress:
          route:
            termination: edge
          type: route
4.2.1.4. Configuration of the monitor tab in Jaeger UI
Trace data contains rich information, and the data is normalized across instrumented languages and frameworks. Therefore, request rate, error, and duration (RED) metrics can be extracted from traces. The metrics can be visualized in the Monitor tab of the Jaeger console.
The OpenTelemetry Collector derives the metrics from spans, and the Prometheus instance deployed in the user-workload monitoring stack scrapes them from the Collector. The Jaeger UI queries these metrics from the Prometheus endpoint and visualizes them.
4.2.1.4.1. OpenTelemetry Collector configuration
The OpenTelemetry Collector requires configuration of the spanmetrics connector that derives metrics from traces and exports the metrics in the Prometheus format.
OpenTelemetry Collector custom resource for span RED
kind: OpenTelemetryCollector
apiVersion: opentelemetry.io/v1alpha1
metadata:
  name: otel
spec:
  mode: deployment
  observability:
    metrics:
      enableMetrics: true # 1
  config: |
    connectors:
      spanmetrics: # 2
        metrics_flush_interval: 15s

    receivers:
      otlp: # 3
        protocols:
          grpc:
          http:

    exporters:
      prometheus: # 4
        endpoint: 0.0.0.0:8889
        add_metric_suffixes: false
        resource_to_telemetry_conversion:
          enabled: true # by default resource attributes are dropped

      otlp:
        endpoint: "tempo-simplest-distributor:4317"
        tls:
          insecure: true

    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [otlp, spanmetrics] # 5
        metrics:
          receivers: [spanmetrics] # 6
          exporters: [prometheus]
- 1
- Creates the ServiceMonitor custom resource to enable scraping of the Prometheus exporter.
- 2
- The Spanmetrics connector receives traces and exports metrics.
- 3
- The OTLP receiver to receive spans in the OpenTelemetry protocol.
- 4
- The Prometheus exporter is used to export metrics in the Prometheus format.
- 5
- The Spanmetrics connector is configured as an exporter in the traces pipeline.
- 6
- The Spanmetrics connector is configured as a receiver in the metrics pipeline.
4.2.1.4.2. Tempo configuration
The TempoStack custom resource must specify the following: the Monitor tab is enabled, and the Prometheus endpoint is set to the Thanos querier service to query the data from the user-defined monitoring stack.
TempoStack custom resource with the enabled Monitor tab
kind: TempoStack
apiVersion: tempo.grafana.com/v1alpha1
metadata:
  name: simplest
spec:
  template:
    queryFrontend:
      jaegerQuery:
        enabled: true
        monitorTab:
          enabled: true # enables the Monitor tab
          prometheusEndpoint: https://thanos-querier.openshift-monitoring.svc.cluster.local:9091 # Thanos querier service of the user-defined monitoring stack
        ingress:
          type: route
4.2.1.5. Multitenancy
Multitenancy with authentication and authorization is provided in the Tempo Gateway service. The authentication uses OpenShift OAuth and the Kubernetes TokenReview API. The authorization uses the Kubernetes SubjectAccessReview API.
Sample Tempo CR with two tenants, dev and prod
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: simplest
spec:
  tenants:
    mode: openshift # 1
    authentication: # 2
      - tenantName: dev # 3
        tenantId: "1610b0c3-c509-4592-a256-a1871353dbfa" # 4
      - tenantName: prod
        tenantId: "1610b0c3-c509-4592-a256-a1871353dbfb"
  template:
    gateway:
      enabled: true # 5
    queryFrontend:
      jaegerQuery:
        enabled: true
- 1
- Must be set to openshift.
- 2
- The list of tenants.
- 3
- The tenant name. Must be provided in the X-Scope-OrgId header when ingesting the data.
- 4
- A unique tenant ID.
- 5
- Enables a gateway that performs authentication and authorization. The Jaeger UI is exposed at http://<gateway-ingress>/api/traces/v1/<tenant-name>/search.
The authorization configuration uses the ClusterRole and ClusterRoleBinding of the Kubernetes Role-Based Access Control (RBAC). By default, no users have read or write permissions.
Sample of the read RBAC configuration that allows authenticated users to read the trace data of the dev and prod tenants
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tempostack-traces-reader
rules:
  - apiGroups:
      - 'tempo.grafana.com'
    resources: # the tenants
      - dev
      - prod
    resourceNames:
      - traces
    verbs:
      - 'get' # the read operation
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tempostack-traces-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: tempostack-traces-reader
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: system:authenticated # all authenticated users
Sample of the write RBAC configuration that allows the otel-collector service account to write the trace data for the dev tenant
apiVersion: v1
kind: ServiceAccount
metadata:
  name: otel-collector # service account that writes the trace data
  namespace: otel
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tempostack-traces-write
rules:
  - apiGroups:
      - 'tempo.grafana.com'
    resources: # the tenant
      - dev
    resourceNames:
      - traces
    verbs:
      - 'create' # the write operation
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tempostack-traces
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: tempostack-traces-write
subjects:
  - kind: ServiceAccount
    name: otel-collector
    namespace: otel
Trace data can be sent to the Tempo instance from the OpenTelemetry Collector that uses the service account with RBAC for writing the data.
Sample OpenTelemetry CR configuration
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: cluster-collector
  namespace: otel # same namespace as the otel-collector service account
spec:
  mode: deployment
  serviceAccount: otel-collector
  config: |
    extensions:
      bearertokenauth:
        filename: "/var/run/secrets/kubernetes.io/serviceaccount/token"
    exporters:
      otlp/dev:
        endpoint: tempo-simplest-gateway.tempo.svc.cluster.local:8090
        tls:
          insecure: false
          ca_file: "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"
        auth:
          authenticator: bearertokenauth
        headers:
          X-Scope-OrgID: "dev"
    service:
      extensions: [bearertokenauth]
      pipelines:
        traces:
          exporters: [otlp/dev]
4.2.2. Setting up monitoring for the distributed tracing platform (Tempo)
The Tempo Operator supports monitoring and alerting of each TempoStack component such as distributor, ingester, and so on, and exposes upgrade and operational metrics about the Operator itself.
4.2.2.1. Configuring TempoStack metrics and alerts
You can enable metrics and alerts of TempoStack instances.
Prerequisites
- Monitoring for user-defined projects is enabled in the cluster. See Enabling monitoring for user-defined projects.
Procedure
- To enable metrics of a TempoStack instance, set the spec.observability.metrics.createServiceMonitors field to true:
    apiVersion: tempo.grafana.com/v1alpha1
    kind: TempoStack
    metadata:
      name: <name>
    spec:
      observability:
        metrics:
          createServiceMonitors: true
- To enable alerts for a TempoStack instance, set the spec.observability.metrics.createPrometheusRules field to true:
    apiVersion: tempo.grafana.com/v1alpha1
    kind: TempoStack
    metadata:
      name: <name>
    spec:
      observability:
        metrics:
          createPrometheusRules: true
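Both fields live under the same spec.observability.metrics section, so they can presumably be set together in one custom resource. The following is a minimal sketch of that combination; it uses only the two fields named in this procedure, and `<name>` remains a placeholder for your TempoStack instance.

```yaml
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: <name>
spec:
  observability:
    metrics:
      createServiceMonitors: true   # metrics: ServiceMonitors for the TempoStack components
      createPrometheusRules: true   # alerts: alerting rules for the TempoStack components
```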
Verification
You can use the Administrator view of the web console to verify successful configuration:
- Go to Observe → Targets, filter for Source: User, and check that ServiceMonitors in the format tempo-<instance_name>-<component> have the Up status.
- To verify that alerts are set up correctly, go to Observe → Alerting → Alerting rules, filter for Source: User, and check that the Alert rules for the TempoStack instance components are available.
4.2.2.2. Configuring Tempo Operator metrics and alerts
When installing the Tempo Operator from the web console, you can select the Enable Operator recommended cluster monitoring on this Namespace checkbox, which enables creating metrics and alerts of the Tempo Operator.
If the checkbox was not selected during installation, you can manually enable metrics and alerts even after installing the Tempo Operator.
Procedure
- Add the openshift.io/cluster-monitoring: "true" label in the project where the Tempo Operator is installed, which is openshift-tempo-operator by default.
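One possible way to express this label declaratively is a Namespace manifest that carries it, sketched below under the assumption that the Operator is installed in the default openshift-tempo-operator project. Applying a manifest to an existing namespace is an illustrative choice rather than the documented method; adding the label through the web console or the CLI has the same effect.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-tempo-operator            # default project of the Tempo Operator
  labels:
    openshift.io/cluster-monitoring: "true" # enables cluster monitoring for this project
```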
Verification
You can use the Administrator view of the web console to verify successful configuration:
- Go to Observe → Targets, filter for Source: Platform, and search for tempo-operator, which must have the Up status.
- To verify that alerts are set up correctly, go to Observe → Alerting → Alerting rules, filter for Source: Platform, and locate the Alert rules for the Tempo Operator.
4.3. Updating the distributed tracing platform (Tempo)
For version upgrades, the Tempo Operator uses the Operator Lifecycle Manager (OLM), which controls installation, upgrade, and role-based access control (RBAC) of Operators in a cluster.
The OLM runs in the OpenShift Container Platform by default. The OLM queries for available Operators as well as upgrades for installed Operators.
When the Tempo Operator is upgraded to the new version, it scans for running TempoStack instances that it manages and upgrades them to the version corresponding to the Operator’s new version.
4.4. Removing the Red Hat OpenShift distributed tracing platform (Tempo)
The steps for removing the Red Hat OpenShift distributed tracing platform (Tempo) from an OpenShift Container Platform cluster are as follows:
- Shut down all distributed tracing platform (Tempo) pods.
- Remove any TempoStack instances.
- Remove the Tempo Operator.
4.4.1. Removing a TempoStack instance by using the web console
You can remove a TempoStack instance in the Administrator view of the web console.
Prerequisites
- You are logged in to the OpenShift Container Platform web console as a cluster administrator with the cluster-admin role.
- For Red Hat OpenShift Dedicated, you must be logged in using an account with the dedicated-admin role.
Procedure
- Go to Operators → Installed Operators → Tempo Operator → TempoStack.
- To remove the TempoStack instance, select the options menu → Delete TempoStack → Delete.
- Optional: Remove the Tempo Operator.
4.4.2. Removing a TempoStack instance by using the CLI
You can remove a TempoStack instance on the command line.
Prerequisites
- An active OpenShift CLI (oc) session by a cluster administrator with the cluster-admin role.
  Tip:
  - Ensure that your OpenShift CLI (oc) version is up to date and matches your OpenShift Container Platform version.
  - Run oc login:
      $ oc login --username=<your_username>
Procedure
- Get the name of the TempoStack instance by running the following command:
    $ oc get deployments -n <project_of_tempostack_instance>
- Remove the TempoStack instance by running the following command:
    $ oc delete tempostack <tempostack_instance_name> -n <project_of_tempostack_instance>
- Optional: Remove the Tempo Operator.
Verification
Run the following command to verify that the TempoStack instance is not found in the output, which indicates its successful removal:
$ oc get deployments -n <project_of_tempostack_instance>