5.5. Investigating pod issues


OpenShift Container Platform leverages the Kubernetes concept of a pod, which is one or more containers deployed together on one host. A pod is the smallest compute unit that can be defined, deployed, and managed on OpenShift Container Platform 4.5.

After a pod is defined, it is assigned to run on a node until its containers exit, or until it is removed. Depending on policy and exit code, pods are either removed after exiting or retained so that their logs can be accessed.

The first thing to check when pod issues arise is the pod’s status. If an explicit pod failure has occurred, observe the pod’s error state to identify specific image, container, or pod network issues. Focus diagnostic data collection according to the error state. Review pod event messages, as well as pod and container log information. Diagnose issues dynamically by accessing running pods on the command line, or start a debug pod with root access based on a problematic pod’s deployment configuration.

5.5.1. Understanding pod error states

Pod failures return explicit error states that can be observed in the status field in the output of oc get pods. Pod error states cover image, container, and container network related failures.

The following table provides a list of pod error states along with their descriptions.

Table 5.2. Pod error states

Pod error state           Description
ErrImagePull              Generic image retrieval error.
ErrImagePullBackOff       Image retrieval failed and is backed off.
ErrInvalidImageName       The specified image name was invalid.
ErrImageInspect           Image inspection did not succeed.
ErrImageNeverPull         PullPolicy is set to NeverPullImage and the target image is not present locally on the host.
ErrRegistryUnavailable    An HTTP error was encountered when attempting to retrieve an image from a registry.
ErrContainerNotFound      The specified container is either not present in the declared pod or not managed by the kubelet.
ErrRunInitContainer       Container initialization failed.
ErrRunContainer           None of the pod’s containers started successfully.
ErrKillContainer          None of the pod’s containers were killed successfully.
ErrCrashLoopBackOff       A container has terminated. The kubelet is backing off before it attempts to restart the container.
ErrVerifyNonRoot          A container or image attempted to run with root privileges.
ErrCreatePodSandbox       Pod sandbox creation did not succeed.
ErrConfigPodSandbox       Pod sandbox configuration was not obtained.
ErrKillPodSandbox         A pod sandbox did not stop successfully.
ErrSetupNetwork           Network initialization failed.
ErrTeardownNetwork        Network termination failed.
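
As a rough illustration of how the states in the table group into image, container, and network failures, the following sketch maps a state name to a broad category. The function name classify_pod_error and the category labels (including the separate "sandbox" label) are assumptions made for this example and are not part of oc. The patterns also match names without the Err prefix, because the STATUS column of oc get pods can show forms such as CrashLoopBackOff.

```shell
#!/bin/sh
# Hypothetical helper: map a pod error state from Table 5.2 to a broad
# failure category. The category names are assumptions for illustration.
classify_pod_error() {
  case "$1" in
    *Image*|ErrRegistryUnavailable)                  echo "image" ;;
    *Sandbox*)                                       echo "sandbox" ;;
    *Network*)                                       echo "network" ;;
    *Container*|*CrashLoopBackOff|ErrVerifyNonRoot)  echo "container" ;;
    *)                                               echo "unknown" ;;
  esac
}

# Example run against a few states from the table.
for state in ErrImagePull ErrCreatePodSandbox ErrSetupNetwork ErrRunContainer; do
  printf '%s -> %s\n' "$state" "$(classify_pod_error "$state")"
done
```

The category can then guide which data to collect first, for example image references for "image" states or pod events for "sandbox" and "network" states.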

5.5.2. Reviewing pod status

You can query pod status and error states. You can also query a pod’s associated deployment configuration and review base image availability.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • You have installed the OpenShift CLI (oc).
  • skopeo is installed.

Procedure

  1. Switch into a project:

    $ oc project <project_name>
  2. List pods running within the namespace, as well as pod status, error states, restarts, and age:

    $ oc get pods
  3. Determine whether the namespace is managed by a deployment configuration:

    $ oc status

    If the namespace is managed by a deployment configuration, the output includes the deployment configuration name and a base image reference.

  4. Inspect the base image referenced in the preceding command’s output:

    $ skopeo inspect docker://<image_reference>
  5. If the base image reference is not correct, update the reference in the deployment configuration:

    $ oc edit deployment/my-deployment
  6. When you save your changes and exit the editor, the updated configuration is redeployed automatically. Watch pod status as the deployment progresses to determine whether the issue has been resolved:

    $ oc get pods -w
  7. Review events within the namespace for diagnostic information relating to pod failures:

    $ oc get events
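
When a namespace holds many pods, it can help to reduce the oc get pods listing to the pods that need attention. The following is a minimal sketch, assuming the default column layout in which NAME is the first column and STATUS is the third. The function name list_problem_pods and the sample pod names are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `oc get pods` output on stdin and print the
# name and status of every pod that is not Running or Completed.
list_problem_pods() {
  # Column 1 is NAME and column 3 is STATUS; NR > 1 skips the header row.
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Sample listing, invented for illustration only.
list_problem_pods <<'EOF'
NAME                     READY   STATUS             RESTARTS   AGE
web-6d4b75cb6d-abcde     1/1     Running            0          3d
web-6d4b75cb6d-fghij     0/1     ImagePullBackOff   0          5m
job-1-build              0/1     Completed          0          1d
api-7f9c5d8b4-klmno      0/1     CrashLoopBackOff   12         40m
EOF
```

Against a live cluster, the same function could be fed directly from the command in step 2, for example oc get pods | list_problem_pods.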

5.5.3. Inspecting pod and container logs

You can inspect pod and container logs for warnings and error messages related to explicit pod failures. Depending on policy and exit code, pod and container logs remain available after pods have been terminated.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • Your API service is still functional.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. Query logs for a specific pod:

    $ oc logs <pod_name>
  2. Query logs for a specific container within a pod:

    $ oc logs <pod_name> -c <container_name>

    Logs retrieved using the preceding oc logs commands are composed of messages that the pods or containers sent to stdout and stderr.

  3. Inspect logs contained in /var/log/ within a pod.

    1. List log files and subdirectories contained in /var/log within a pod:

      $ oc exec <pod_name> -- ls -alh /var/log
    2. Query a specific log file contained in /var/log within a pod:

      $ oc exec <pod_name> -- cat /var/log/<path_to_log>
    3. List log files and subdirectories contained in /var/log within a specific container:

      $ oc exec <pod_name> -c <container_name> -- ls /var/log
    4. Query a specific log file contained in /var/log within a specific container:

      $ oc exec <pod_name> -c <container_name> -- cat /var/log/<path_to_log>
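
Because the goal of this procedure is to find warnings and error messages, the log output can be narrowed before reading it. The following is a minimal sketch, assuming the application writes the words "warn" or "error" in its log lines. The function name filter_log_problems and the sample log lines are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep only the log lines on stdin that mention a
# warning or an error, matched case-insensitively.
filter_log_problems() {
  grep -Ei 'warn|error' || true   # finding no match is not treated as a failure
}

# Sample log lines, invented for illustration only.
filter_log_problems <<'EOF'
2020-07-01T10:00:00Z INFO  server started
2020-07-01T10:00:05Z WARN  config file not found, using defaults
2020-07-01T10:00:09Z ERROR failed to connect to database
EOF
```

Against a live cluster, the function could be fed from oc logs <pod_name> | filter_log_problems. If a container has restarted, oc logs --previous <pod_name> prints the logs of the previous container instance, when one exists.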

5.5.4. Accessing running pods

You can review running pods dynamically by opening a shell inside a pod or by gaining network access through port forwarding.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • Your API service is still functional.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. Switch into the project that contains the pod you would like to access. This is necessary because the oc rsh command does not accept the -n namespace option:

    $ oc project <namespace>
  2. Start a remote shell into a pod:

    $ oc rsh <pod_name>

    If a pod has multiple containers, oc rsh defaults to the first container unless -c <container_name> is specified.
  3. Start a remote shell into a specific container within a pod:

    $ oc rsh -c <container_name> pod/<pod_name>
  4. Create a port forwarding session to a port on a pod:

    $ oc port-forward <pod_name> <host_port>:<pod_port>

    Enter Ctrl+C to cancel the port forwarding session.

5.5.5. Starting debug pods with root access

You can start a debug pod with root access, based on a problematic pod’s deployment or deployment configuration. Pod users typically run with non-root privileges, but running troubleshooting pods with temporary root privileges can be useful during issue investigation.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • Your API service is still functional.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. Start a debug pod with root access, based on a deployment.

    1. Obtain a project’s deployment name:

      $ oc get deployment -n <project_name>
    2. Start a debug pod with root privileges, based on the deployment:

      $ oc debug deployment/my-deployment --as-root -n <project_name>
  2. Start a debug pod with root access, based on a deployment configuration.

    1. Obtain a project’s deployment configuration name:

      $ oc get deploymentconfigs -n <project_name>
    2. Start a debug pod with root privileges, based on the deployment configuration:

      $ oc debug deploymentconfig/my-deployment-configuration --as-root -n <project_name>
Note

You can append -- <command> to the preceding oc debug commands to run individual commands within a debug pod, instead of running an interactive shell.
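
To make the -- <command> form concrete, the following dry-run sketch only prints the oc debug command line that would be used and does not contact a cluster. The function name debug_cmd and the project name my-project are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the oc debug command line for a
# resource, with an optional one-off command appended after "--".
debug_cmd() {
  kind=$1 name=$2 project=$3
  shift 3
  cmd="oc debug $kind/$name --as-root -n $project"
  # Append "-- <command>" only when a command was supplied.
  if [ "$#" -gt 0 ]; then
    cmd="$cmd -- $*"
  fi
  echo "$cmd"
}

# Prints: oc debug deployment/my-deployment --as-root -n my-project -- id -u
debug_cmd deployment my-deployment my-project id -u
# Prints: oc debug deploymentconfig/my-deployment-configuration --as-root -n my-project
debug_cmd deploymentconfig my-deployment-configuration my-project
```

Running the first printed command would execute id -u inside the debug pod and exit, rather than opening an interactive shell.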

5.5.6. Copying files to and from pods and containers

You can copy files to and from a pod to test configuration changes or gather diagnostic information.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • Your API service is still functional.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. Copy a file to a pod:

    $ oc cp <local_path> <pod_name>:/<path> -c <container_name>

    The first container in a pod is selected if the -c option is not specified.
  2. Copy a file from a pod:

    $ oc cp <pod_name>:/<path> -c <container_name> <local_path>

    The first container in a pod is selected if the -c option is not specified.

    Note

    For oc cp to function, the tar binary must be available within the container.
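
Because oc cp depends on tar, it can be worth confirming that the binary is present before copying. The following is a minimal sketch, assuming the container image also provides sh. The function name has_tar is invented for illustration, and any arguments passed to it are used as a command prefix.

```shell
#!/bin/sh
# Hypothetical helper: report whether tar is on the PATH. Any arguments are
# used as a command prefix, so the same check can run locally or in a pod.
has_tar() {
  "$@" sh -c 'command -v tar >/dev/null 2>&1'
}

# Local check; no cluster access is needed for this line.
if has_tar; then echo "tar found"; else echo "tar missing"; fi
```

To run the same check inside a container, the prefix could be an oc exec invocation, for example has_tar oc exec <pod_name> -c <container_name> --.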
