This documentation is for a release that is no longer maintained
5.5. Investigating pod issues
OpenShift Container Platform leverages the Kubernetes concept of a pod, which is one or more containers deployed together on one host. A pod is the smallest compute unit that can be defined, deployed, and managed on OpenShift Container Platform 4.5.
After a pod is defined, it is assigned to run on a node until its containers exit, or until it is removed. Depending on policy and exit code, pods are either removed after exiting or retained so that their logs can be accessed.
The first thing to check when pod issues arise is the pod’s status. If an explicit pod failure has occurred, observe the pod’s error state to identify specific image, container, or pod network issues. Focus diagnostic data collection according to the error state. Review pod event messages, as well as pod and container log information. Diagnose issues dynamically by accessing running pods on the command line, or start a debug pod with root access based on a problematic pod’s deployment configuration.
5.5.1. Understanding pod error states
Pod failures return explicit error states that can be observed in the status field in the output of oc get pods. Pod error states cover image, container, and container network related failures.
The following table provides a list of pod error states along with their descriptions.
| Pod error state | Description |
|---|---|
| ErrImagePull | Generic image retrieval error. |
| ErrImagePullBackOff | Image retrieval failed and is backed off. |
| ErrInvalidImageName | The specified image name was invalid. |
| ErrImageInspect | Image inspection did not succeed. |
| ErrImageNeverPull | PullPolicy is set to NeverPullImage and the target image is not present locally on the host. |
| ErrRegistryUnavailable | When attempting to retrieve an image from a registry, an HTTP error was encountered. |
| ErrContainerNotFound | The specified container is either not present or not managed by the kubelet, within the declared pod. |
| ErrRunInitContainer | Container initialization failed. |
| ErrRunContainer | None of the pod’s containers started successfully. |
| ErrKillContainer | None of the pod’s containers were killed successfully. |
| ErrCrashLoopBackOff | A container has terminated. The kubelet will not attempt to restart it. |
| ErrVerifyNonRoot | A container or image attempted to run with root privileges. |
| ErrCreatePodSandbox | Pod sandbox creation did not succeed. |
| ErrConfigPodSandbox | Pod sandbox configuration was not obtained. |
| ErrKillPodSandbox | A pod sandbox did not stop successfully. |
| ErrSetupNetwork | Network initialization failed. |
| ErrTeardownNetwork | Network termination failed. |
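As a rough illustration of reading error states from the status field, the following sketch filters the output of oc get pods down to pods that are neither running nor completed. The function name list_problem_pods is hypothetical, and the sketch assumes the default column layout (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: print "<pod> <status>" for every pod whose STATUS
# column is neither Running nor Completed.
# Assumes the default `oc get pods` columns: NAME READY STATUS RESTARTS AGE.
# Extra arguments (for example -n <project_name>) are passed through to oc.
list_problem_pods() {
  oc get pods --no-headers "$@" |
    awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}
```

Calling list_problem_pods with no arguments checks the current project. An empty result only means that no pod reports an unusual status, not that every workload is healthy.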
5.5.2. Reviewing pod status
You can query pod status and error states. You can also query a pod’s associated deployment configuration and review base image availability.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have installed the OpenShift CLI (oc).
- skopeo is installed.
Procedure
Switch into a project:
$ oc project <project_name>
List pods running within the namespace, as well as pod status, error states, restarts, and age:
$ oc get pods
Determine whether the namespace is managed by a deployment configuration:
$ oc status
If the namespace is managed by a deployment configuration, the output includes the deployment configuration name and a base image reference.
Inspect the base image referenced in the preceding command’s output:
$ skopeo inspect docker://<image_reference>
If the base image reference is not correct, update the reference in the deployment configuration:
$ oc edit deployment/my-deployment
When the deployment configuration changes on exit from the editor, the configuration is automatically redeployed. Watch pod status as the deployment progresses, to determine whether the issue has been resolved:
$ oc get pods -w
Review events within the namespace for diagnostic information relating to pod failures:
$ oc get events
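The image inspection and event review steps above can be wrapped in small helpers when they are repeated often. This is a minimal sketch, not a prescribed workflow: the function names image_is_available and warning_events are hypothetical, and the sketch assumes that skopeo inspect exits with a non-zero status when the image cannot be retrieved.

```shell
# Hypothetical helper: succeed (exit status 0) if the image reference can
# be inspected in its registry, fail otherwise. Output is discarded.
image_is_available() {
  skopeo inspect "docker://$1" >/dev/null 2>&1
}

# Hypothetical helper: list only Warning events, sorted by last timestamp.
# Extra arguments (for example -n <project_name>) are passed through to oc.
warning_events() {
  oc get events --field-selector type=Warning --sort-by=.lastTimestamp "$@"
}
```

For example, image_is_available <image_reference> || echo "image not reachable" gives a quick yes/no answer before the deployment configuration is edited.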
5.5.3. Inspecting pod and container logs
You can inspect pod and container logs for warnings and error messages related to explicit pod failures. Depending on policy and exit code, pod and container logs remain available after pods have been terminated.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- Your API service is still functional.
- You have installed the OpenShift CLI (oc).
Procedure
Query logs for a specific pod:
$ oc logs <pod_name>
Query logs for a specific container within a pod:
$ oc logs <pod_name> -c <container_name>
Logs retrieved using the preceding oc logs commands are composed of messages sent to stdout within pods or containers.
Inspect logs contained in /var/log/ within a pod.
  List log files and subdirectories contained in /var/log within a pod:
  $ oc exec <pod_name> -- ls -alh /var/log
  Query a specific log file contained in /var/log within a pod:
  $ oc exec <pod_name> -- cat /var/log/<path_to_log>
  List log files and subdirectories contained in /var/log within a specific container:
  $ oc exec <pod_name> -c <container_name> -- ls /var/log
  Query a specific log file contained in /var/log within a specific container:
  $ oc exec <pod_name> -c <container_name> -- cat /var/log/<path_to_log>
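When a pod has several containers, the per-container oc logs query can be looped over all of them. The following is a sketch under stated assumptions: the function name dump_pod_logs is hypothetical, the pod is assumed to be in the current project, and only regular containers (not init containers) are covered.

```shell
# Hypothetical helper: print the logs of every container in a pod, one
# container at a time, each preceded by a header line.
dump_pod_logs() {
  pod="$1"
  # The jsonpath expression prints the container names separated by spaces.
  for container in $(oc get pod "$pod" -o jsonpath='{.spec.containers[*].name}'); do
    echo "=== $pod/$container ==="
    # Adding --previous here would show the last terminated instance of
    # the container instead, if one exists.
    oc logs "$pod" -c "$container"
  done
}
```

Redirecting the result to a file, for example dump_pod_logs <pod_name> > pod.log, keeps a copy for later comparison.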
5.5.4. Accessing running pods
You can review running pods dynamically by opening a shell inside a pod or by gaining network access through port forwarding.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- Your API service is still functional.
- You have installed the OpenShift CLI (oc).
Procedure
Switch into the project that contains the pod you would like to access. This is necessary because the oc rsh command does not accept the -n namespace option:
$ oc project <namespace>
Start a remote shell into a pod:
$ oc rsh <pod_name>
If a pod has multiple containers, oc rsh defaults to the first container unless -c <container_name> is specified.
Start a remote shell into a specific container within a pod:
$ oc rsh -c <container_name> pod/<pod_name>
Create a port forwarding session to a port on a pod:
$ oc port-forward <pod_name> <host_port>:<pod_port>
Enter Ctrl+C to cancel the port forwarding session.
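A remote shell session does not have to be interactive: oc rsh also accepts a command to run after the pod name. The sketch below uses that form to run a single command in a chosen container. The function name run_in_pod is hypothetical, and the pod is assumed to be in the current project.

```shell
# Hypothetical helper: run one command in a specific container of a pod
# and return, instead of keeping an interactive shell open.
# Usage: run_in_pod <pod_name> <container_name> <command> [args...]
run_in_pod() {
  pod="$1"
  container="$2"
  shift 2
  oc rsh -c "$container" "pod/$pod" "$@"
}
```

For example, run_in_pod <pod_name> <container_name> cat /etc/resolv.conf prints the DNS configuration that the container sees.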
5.5.5. Starting debug pods with root access
You can start a debug pod with root access, based on a problematic pod’s deployment or deployment configuration. Pod users typically run with non-root privileges, but running troubleshooting pods with temporary root privileges can be useful during issue investigation.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- Your API service is still functional.
- You have installed the OpenShift CLI (oc).
Procedure
Start a debug pod with root access, based on a deployment.
  Obtain a project’s deployment name:
  $ oc get deployment -n <project_name>
  Start a debug pod with root privileges, based on the deployment:
  $ oc debug deployment/my-deployment --as-root -n <project_name>
Start a debug pod with root access, based on a deployment configuration.
  Obtain a project’s deployment configuration name:
  $ oc get deploymentconfigs -n <project_name>
  Start a debug pod with root privileges, based on the deployment configuration:
  $ oc debug deploymentconfig/my-deployment-configuration --as-root -n <project_name>
You can append -- <command> to the preceding oc debug commands to run individual commands within a debug pod, instead of running an interactive shell.
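To illustrate the -- <command> form, the following sketch wraps the oc debug invocation for a deployment so that a single command runs as root and the debug pod then exits. The function name debug_run_as_root is hypothetical.

```shell
# Hypothetical helper: run a single command as root in a debug pod that
# is created from a deployment, rather than opening an interactive shell.
# Usage: debug_run_as_root <project_name> <deployment_name> <command> [args...]
debug_run_as_root() {
  project="$1"
  deployment="$2"
  shift 2
  oc debug "deployment/$deployment" --as-root -n "$project" -- "$@"
}
```

For example, debug_run_as_root <project_name> my-deployment id -u is a quick way to confirm that the debug pod really runs as user 0.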
5.5.6. Copying files to and from pods and containers
You can copy files to and from a pod to test configuration changes or gather diagnostic information.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- Your API service is still functional.
- You have installed the OpenShift CLI (oc).
Procedure
Copy a file to a pod:
$ oc cp <local_path> <pod_name>:/<path> -c <container_name>
Copy a file from a pod:
$ oc cp <pod_name>:/<path> -c <container_name> <local_path>
In both commands, the first container in a pod is selected if the -c option is not specified.
Note: For oc cp to function, the tar binary must be available within the container.
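Because oc cp depends on tar inside the container, it can help to check for the binary before copying. The following sketch does that for the copy-from-pod case. The function name copy_from_pod is hypothetical, and the check itself assumes that the container provides a POSIX sh.

```shell
# Hypothetical helper: copy a file out of a pod, first checking that the
# container has a tar binary, which oc cp relies on.
# Usage: copy_from_pod <pod_name> <container_name> <remote_path> <local_path>
copy_from_pod() {
  pod="$1"
  container="$2"
  remote_path="$3"
  local_path="$4"
  # Assumption: the container image ships a POSIX sh to run the check.
  if ! oc exec "$pod" -c "$container" -- sh -c 'command -v tar' >/dev/null 2>&1; then
    echo "tar not found in $pod/$container; oc cp cannot be used" >&2
    return 1
  fi
  oc cp "$pod:$remote_path" "$local_path" -c "$container"
}
```

If the check fails, reading the file with oc exec <pod_name> -- cat <path> and redirecting the output locally is one possible fallback for small text files.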