Chapter 30. Finding information on Kafka restarts


After the Cluster Operator restarts a Kafka pod in an OpenShift cluster, it emits an OpenShift event into the pod’s namespace explaining why the pod restarted. For help in understanding cluster behavior, you can check restart events from the command line.

Tip

You can export and monitor restart events using metrics collection tools like Prometheus. Pair the metrics tool with an event exporter that converts the events into a format the tool can ingest.

30.1. Reasons for a restart event

The Cluster Operator initiates a restart event for a specific reason. You can check the reason by fetching information on the restart event.

Table 30.1. Restart reasons
Event | Description

CaCertHasOldGeneration

The pod is still using a server certificate signed with an old CA, so needs to be restarted as part of the certificate update.

CaCertRemoved

Expired CA certificates have been removed, and the pod is restarted to run with the current certificates.

CaCertRenewed

CA certificates have been renewed, and the pod is restarted to run with the updated certificates.

ClusterCaCertKeyReplaced

The key used to sign the cluster’s CA certificates has been replaced, and the pod is being restarted as part of the CA renewal process.

ConfigChangeRequiresRestart

Some Kafka configuration properties are changed dynamically, but others require that the broker be restarted.

FileSystemResizeNeeded

The file system size has been increased, and a restart is needed to apply it.

KafkaCertificatesChanged

One or more TLS certificates used by the Kafka broker have been updated, and a restart is needed to use them.

ManualRollingUpdate

A user annotated the pod, or the StrimziPodSet it belongs to, to trigger a restart.

PodForceRestartOnError

An error occurred that requires a pod restart to rectify.

PodHasOldRevision

The StrimziPodSet that the pod is a member of has been updated, so the pod needs to be recreated. The same reason is given if a disk was added to or removed from the Kafka volumes and a restart is needed to apply the change.

PodStuck

The pod is still pending, and is not scheduled or cannot be scheduled, so the operator has restarted the pod in a final attempt to get it running.

PodUnresponsive

Streams for Apache Kafka was unable to connect to the pod, which can indicate a broker not starting correctly, so the operator restarted it in an attempt to resolve the issue.
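
To see which of the reasons in Table 30.1 occur most often, the reason column of an event listing can be tallied. The following is a minimal sketch, assuming a POSIX shell with awk and sort available. The name tally_reasons is a hypothetical helper, not part of oc or Streams for Apache Kafka, and it expects one reason per line on standard input.

```shell
# Hypothetical helper: counts how often each restart reason appears.
# Reads one reason per line on stdin and prints "<count> <reason>",
# most frequent first (ties are ordered by reason name).
tally_reasons() {
  awk 'NF { count[$1]++ } END { for (r in count) print count[r], r }' |
    sort -k1,1nr -k2,2
}

# Possible usage against a cluster (left commented out because it needs oc access):
# oc -n kafka get events \
#   --field-selector reportingController=strimzi.io/cluster-operator \
#   -o custom-columns=REASON:.reason --no-headers | tally_reasons
```

A reason that dominates the tally, such as repeated PodUnresponsive events, is usually the first thing worth investigating.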

30.2. Restart event filters

When checking restart events from the command line, you can specify a field-selector to filter on OpenShift event fields.

The following fields are available when filtering events with field-selector.

regardingObject.kind
The kind of resource that owns the Pod being restarted. For restart events, the kind is always Kafka.
regardingObject.namespace
The namespace that the resource belongs to.
regardingObject.name
The resource’s name, for example, strimzi-cluster.
regardingObject.uid
The unique ID of the resource.
reason
The reason the Pod was restarted, for example, PodForceRestartOnError.
reportingController
The reporting component is always strimzi.io/cluster-operator for Streams for Apache Kafka restart events.
source
source is an older version of reportingController. The reporting component is always strimzi.io/cluster-operator for Streams for Apache Kafka restart events.
type
The event type, which is either Warning or Normal. For Streams for Apache Kafka restart events, the type is Normal.
Note

In older versions of OpenShift, the fields using the regarding prefix might use an involvedObject prefix instead. reportingController was previously called reportingComponent.
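
A field selector can combine several comma-separated field=value pairs, as the commands in this chapter do when they pair reportingController with reason. As a minimal sketch, the hypothetical helper below joins its arguments into one selector string. The name build_field_selector is an assumption for illustration and is not part of oc. It only formats text, and whether a given field is accepted as a selector depends on the OpenShift version in use.

```shell
# Hypothetical helper, not part of oc: joins field=value pairs with commas.
# The body runs in a subshell ( ... ) so the IFS change does not leak out.
build_field_selector() (
  IFS=,
  printf '%s\n' "$*"
)

# Possible usage (requires access to a cluster, so it is left commented out):
# oc -n kafka get events --field-selector \
#   "$(build_field_selector reportingController=strimzi.io/cluster-operator type=Normal)"
```

Building the selector in one place keeps long filters readable and avoids stray spaces around the commas, which can cause the selector to be rejected.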

30.3. Checking Kafka restarts

Use an oc command to list restart events initiated by the Cluster Operator. To return only these events, set the Cluster Operator as the reporting component using the reportingController or source event field.

Prerequisites

  • The Cluster Operator is running in the OpenShift cluster.

Procedure

  1. Get all restart events emitted by the Cluster Operator:

    oc -n kafka get events --field-selector reportingController=strimzi.io/cluster-operator

    Example showing events returned

    LAST SEEN   TYPE     REASON                   OBJECT                  MESSAGE
    2m          Normal   CaCertRenewed            kafka/strimzi-cluster   Rolling Pod strimzi-cluster-kafka-0 due to CA certificate renewed
    58m         Normal   PodForceRestartOnError   kafka/strimzi-cluster   Rolling Pod strimzi-cluster-kafka-1 due to Pod needs to be forcibly restarted due to an error
    5m47s       Normal   ManualRollingUpdate      kafka/strimzi-cluster   Rolling Pod strimzi-cluster-kafka-2 due to Pod was manually annotated to be rolled

    You can also specify a reason or other field-selector options to constrain the events returned.

    Here, a specific reason is added:

    oc -n kafka get events --field-selector reportingController=strimzi.io/cluster-operator,reason=PodForceRestartOnError

  2. Use an output format, such as YAML, to return more detailed information about one or more events:

    oc -n kafka get events --field-selector reportingController=strimzi.io/cluster-operator,reason=PodForceRestartOnError -o yaml

    Example showing detailed events output

    apiVersion: v1
    items:
    - action: StrimziInitiatedPodRestart
      apiVersion: v1
      eventTime: "2022-05-13T00:22:34.168086Z"
      firstTimestamp: null
      involvedObject:
          kind: Kafka
          name: strimzi-cluster
          namespace: kafka
      kind: Event
      lastTimestamp: null
      message: Rolling Pod strimzi-cluster-kafka-1 due to Pod needs to be forcibly restarted due to an error
      metadata:
          creationTimestamp: "2022-05-13T00:22:34Z"
          generateName: strimzi-event
          name: strimzi-eventwppk6
          namespace: kafka
          resourceVersion: "432961"
          uid: 29fcdb9e-f2cf-4c95-a165-a5efcd48edfc
      reason: PodForceRestartOnError
      related:
          kind: Pod
          name: strimzi-cluster-kafka-1
          namespace: kafka
      reportingController: strimzi.io/cluster-operator
      reportingInstance: strimzi-cluster-operator-6458cfb4c6-6bpdp
      source: {}
      type: Normal
    kind: List
    metadata:
      resourceVersion: ""
      selfLink: ""

The following fields are deprecated, so they are not populated for these events:

  • firstTimestamp
  • lastTimestamp
  • source
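
Because firstTimestamp and lastTimestamp are not populated, eventTime is the field to read when reporting when a restart happened. As a sketch, the hypothetical function below lists the time, reason, and restarted pod for each event using the custom-columns output format of oc. The function name and column headings are illustrative assumptions, and the field paths are taken from the example output above.

```shell
# Hypothetical wrapper: prints one line per restart event with the populated
# eventTime field, the reason, and the pod named in the related field.
list_restart_events() {
  namespace="$1"
  oc -n "$namespace" get events \
    --field-selector reportingController=strimzi.io/cluster-operator \
    -o custom-columns=TIME:.eventTime,REASON:.reason,POD:.related.name
}

# Possible usage (requires a logged-in oc session):
# list_restart_events kafka
```

This view is more compact than full YAML output while still showing which pod was rolled, which the default table only includes in the message text.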