Chapter 8. Working with clusters
8.1. Viewing system event information in Red Hat OpenShift Service on AWS clusters
You can view events in Red Hat OpenShift Service on AWS, which are based on events that happen to API objects in a Red Hat OpenShift Service on AWS cluster.
8.1.1. Understanding events
Review the following information to learn how Red Hat OpenShift Service on AWS uses events to record information about real-world events in a resource-agnostic manner. Events also allow developers and administrators to consume information about system components in a unified way.
8.1.2. Viewing events using the CLI
You can get a list of events in a given project by using the CLI.
Procedure
View events in a project by using a command similar to the following:
$ oc get events [-n <project>]

where:

- <project>
- Specifies the name of the project.
For example:
$ oc get events -n openshift-config

Example output
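The original output block was lost in the source. The exact events depend on your cluster; illustrative output (object names and messages here are placeholders) has this form:

LAST SEEN   TYPE     REASON      OBJECT            MESSAGE
3m          Normal   Scheduled   pod/example-pod   Successfully assigned openshift-config/example-pod to ip-10-0-1-23.ec2.internal
3m          Normal   Pulled      pod/example-pod   Container image already present on machine
3m          Normal   Created     pod/example-pod   Created container example
3m          Normal   Started     pod/example-pod   Started container example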
View events in your project from the Red Hat OpenShift Service on AWS console:
- Launch the Red Hat OpenShift Service on AWS console.
- Click Home → Events and select your project.
- Move to the resource for which you want to see events. For example: Home → Projects → <project-name> → <resource-name>. Many objects, such as pods and deployments, also have an Events tab, which shows events related to that object.
8.1.3. List of events
Review the information in this section to learn about Red Hat OpenShift Service on AWS events.
Configuration events

| Name | Description |
|---|---|
| FailedValidation | Failed pod configuration validation. |

Container events

| Name | Description |
|---|---|
| BackOff | Back-off restarting failed container. |
| Created | Container created. |
| Failed | Pull/Create/Start failed. |
| Killing | Killing the container. |
| Started | Container started. |
| Preempting | Preempting other pods. |
| ExceededGracePeriod | Container runtime did not stop the pod within the specified grace period. |

Health events

| Name | Description |
|---|---|
| Unhealthy | Container is unhealthy. |

Image events

| Name | Description |
|---|---|
| BackOff | Back off container start or image pull. |
| ErrImageNeverPull | The image’s NeverPull Policy is violated. |
| Failed | Failed to pull the image. |
| InspectFailed | Failed to inspect the image. |
| Pulled | Successfully pulled the image, or the container image is already present on the machine. |
| Pulling | Pulling the image. |

Image Manager events

| Name | Description |
|---|---|
| FreeDiskSpaceFailed | Free disk space failed. |
| InvalidDiskCapacity | Invalid disk capacity. |

Node events

| Name | Description |
|---|---|
| FailedMount | Volume mount failed. |
| HostNetworkNotSupported | Host network not supported. |
| HostPortConflict | Host/port conflict. |
| KubeletSetupFailed | Kubelet setup failed. |
| NilShaper | Undefined shaper. |
| NodeNotReady | Node is not ready. |
| NodeNotSchedulable | Node is not schedulable. |
| NodeReady | Node is ready. |
| NodeSchedulable | Node is schedulable. |
| NodeSelectorMismatching | Node selector mismatch. |
| OutOfDisk | Out of disk. |
| Rebooted | Node rebooted. |
| Starting | Starting kubelet. |
| FailedAttachVolume | Failed to attach volume. |
| FailedDetachVolume | Failed to detach volume. |
| VolumeResizeFailed | Failed to expand/reduce volume. |
| VolumeResizeSuccessful | Successfully expanded/reduced volume. |
| FileSystemResizeFailed | Failed to expand/reduce file system. |
| FileSystemResizeSuccessful | Successfully expanded/reduced file system. |
| FailedUnMount | Failed to unmount volume. |
| FailedMapVolume | Failed to map a volume. |
| FailedUnmapDevice | Failed to unmap a device. |
| AlreadyMountedVolume | Volume is already mounted. |
| SuccessfulDetachVolume | Volume is successfully detached. |
| SuccessfulMountVolume | Volume is successfully mounted. |
| SuccessfulUnMountVolume | Volume is successfully unmounted. |
| ContainerGCFailed | Container garbage collection failed. |
| ImageGCFailed | Image garbage collection failed. |
| FailedNodeAllocatableEnforcement | Failed to enforce System Reserved Cgroup limit. |
| NodeAllocatableEnforced | Enforced System Reserved Cgroup limit. |
| UnsupportedMountOption | Unsupported mount option. |
| SandboxChanged | Pod sandbox changed. |
| FailedCreatePodSandBox | Failed to create pod sandbox. |
| FailedPodSandBoxStatus | Failed pod sandbox status. |

Pod worker events

| Name | Description |
|---|---|
| FailedSync | Pod sync failed. |

System events

| Name | Description |
|---|---|
| SystemOOM | There is an OOM (out of memory) situation on the cluster. |

Pod events

| Name | Description |
|---|---|
| FailedKillPod | Failed to stop a pod. |
| FailedCreatePodContainer | Failed to create a pod container. |
| Failed | Failed to make pod data directories. |
| NetworkNotReady | Network is not ready. |
| FailedCreate | Error creating: |
| SuccessfulCreate | Created pod: |
| FailedDelete | Error deleting: |
| SuccessfulDelete | Deleted pod: |

Horizontal Pod Autoscaler events

| Name | Description |
|---|---|
| SelectorRequired | Selector is required. |
| InvalidSelector | Could not convert selector into a corresponding internal selector object. |
| FailedGetObjectMetric | HPA was unable to compute the replica count. |
| InvalidMetricSourceType | Unknown metric source type. |
| ValidMetricFound | HPA was able to successfully calculate a replica count. |
| FailedConvertHPA | Failed to convert the given HPA. |
| FailedGetScale | HPA controller was unable to get the target’s current scale. |
| SucceededGetScale | HPA controller was able to get the target’s current scale. |
| FailedComputeMetricsReplicas | Failed to compute desired number of replicas based on listed metrics. |
| FailedRescale | New size: |
| SuccessfulRescale | New size: |
| FailedUpdateStatus | Failed to update status. |

Persistent volume events

| Name | Description |
|---|---|
| FailedBinding | There are no persistent volumes available and no storage class is set. |
| VolumeMismatch | Volume size or class is different from what is requested in claim. |
| VolumeFailedRecycle | Error creating recycler pod. |
| VolumeRecycled | Occurs when volume is recycled. |
| RecyclerPod | Occurs when pod is recycled. |
| VolumeDelete | Occurs when volume is deleted. |
| VolumeFailedDelete | Error when deleting the volume. |
| ExternalProvisioning | Occurs when volume for the claim is provisioned either manually or via external software. |
| ProvisioningFailed | Failed to provision volume. |
| ProvisioningCleanupFailed | Error cleaning provisioned volume. |
| ProvisioningSucceeded | Occurs when the volume is provisioned successfully. |
| WaitForFirstConsumer | Delay binding until pod scheduling. |

Lifecycle hook events

| Name | Description |
|---|---|
| FailedPostStartHook | Handler failed for pod start. |
| FailedPreStopHook | Handler failed for pre-stop. |
| UnfinishedPreStopHook | Pre-stop hook unfinished. |

Deployment events

| Name | Description |
|---|---|
| DeploymentCancellationFailed | Failed to cancel deployment. |
| DeploymentCancelled | Canceled deployment. |
| DeploymentCreated | Created new replication controller. |
| IngressIPRangeFull | No available Ingress IP to allocate to service. |

Scheduler events

| Name | Description |
|---|---|
| FailedScheduling | Failed to schedule pod: |
| Preempted | By |
| Scheduled | Successfully assigned |

DaemonSet events

| Name | Description |
|---|---|
| SelectingAll | This daemon set is selecting all pods. A non-empty selector is required. |
| FailedPlacement | Failed to place pod on |
| FailedDaemonPod | Found failed daemon pod |

LoadBalancer service events

| Name | Description |
|---|---|
| CreatingLoadBalancerFailed | Error creating load balancer. |
| DeletingLoadBalancer | Deleting load balancer. |
| EnsuringLoadBalancer | Ensuring load balancer. |
| EnsuredLoadBalancer | Ensured load balancer. |
| UnAvailableLoadBalancer | There are no available nodes for |
| LoadBalancerSourceRanges | Lists the new |
| LoadbalancerIP | Lists the new IP address. For example, |
| ExternalIP | Lists external IP address. For example, |
| UID | Lists the new UID. For example, |
| ExternalTrafficPolicy | Lists the new |
| HealthCheckNodePort | Lists the new |
| UpdatedLoadBalancer | Updated load balancer with new hosts. |
| LoadBalancerUpdateFailed | Error updating load balancer with new hosts. |
| DeletingLoadBalancer | Deleting load balancer. |
| DeletingLoadBalancerFailed | Error deleting load balancer. |
| DeletedLoadBalancer | Deleted load balancer. |
8.2. Estimating the number of pods your Red Hat OpenShift Service on AWS nodes can hold
As a cluster administrator, you can use the OpenShift Cluster Capacity Tool to view the number of pods that can be scheduled in your cluster. This allows you to increase the current resources before they become exhausted and to ensure any future pods can be scheduled. This capacity comes from an individual node host in a cluster, and includes CPU, memory, disk space, and others.
8.2.1. Understanding the OpenShift Cluster Capacity Tool
Review the following information to learn how to use the OpenShift Cluster Capacity Tool to simulate a sequence of scheduling decisions that determine how many instances of an input pod can be scheduled on the cluster before the cluster is exhausted of resources.
The remaining allocatable capacity is a rough estimation, because it does not count all of the resources being distributed among nodes. It analyzes only the remaining resources and estimates the available capacity that is still consumable in terms of the number of instances of a pod with given requirements that can be scheduled in a cluster.
Also, pods might only have scheduling support on particular sets of nodes based on their selection and affinity criteria. As a result, the estimation of which remaining pods a cluster can schedule can be difficult.
You can run the OpenShift Cluster Capacity Tool as a stand-alone utility from the command line, or as a job in a pod inside a Red Hat OpenShift Service on AWS cluster. Running the tool as a job inside of a pod enables you to run it multiple times without intervention.
8.2.2. Running the OpenShift Cluster Capacity Tool on the command line
You can run the OpenShift Cluster Capacity Tool from the command line to estimate the number of pods that can be scheduled onto your cluster.
You create a sample pod spec file, which the tool uses for estimating resource usage. The pod spec specifies its resource requirements as limits or requests. The cluster capacity tool takes the pod’s resource requirements into account for its estimation analysis.
Prerequisites
- Run the OpenShift Cluster Capacity Tool, which is available as a container image from the Red Hat Ecosystem Catalog. See the link in the "Additional resources" section.
Create a sample pod spec file:
Create a YAML file similar to the following:
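The original pod spec was lost in the source. A minimal sketch with illustrative names, image, and resource values follows; the tool only needs the resource requests and limits to perform its estimation:

apiVersion: v1
kind: Pod
metadata:
  name: small-pod
  labels:
    app: guestbook
    tier: frontend
spec:
  containers:
  - name: php-redis
    image: gcr.io/google-samples/gb-frontend:v4
    imagePullPolicy: Always
    resources:
      # illustrative values; the estimate scales with these requests/limits
      limits:
        cpu: 150m
        memory: 100Mi
      requests:
        cpu: 150m
        memory: 100Mi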
Create the object by running the following command:

$ oc create -f <file_name>.yaml

For example:

$ oc create -f pod-spec.yaml
Procedure
From the terminal, log in to the Red Hat Registry:
$ podman login registry.redhat.io

Pull the cluster capacity tool image:
$ podman pull registry.redhat.io/openshift4/ose-cluster-capacity

Run the cluster capacity tool:
$ podman run -v $HOME/.kube:/kube:Z -v $(pwd):/cc:Z ose-cluster-capacity \
  /bin/cluster-capacity --kubeconfig /kube/config --podspec /cc/<pod_spec>.yaml \
  --verbose

where:
- <pod_spec>.yaml
- Specifies the pod spec to use.
- verbose
- Outputs a detailed description of how many pods can be scheduled on each node in the cluster.
Example output
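The original output block was lost in the source. The tool's output generally takes the following form; the node addresses and per-node distribution shown here are illustrative, matching the estimate of 88 pods referenced below:

small-pod pod requirements:
	- CPU: 150m
	- Memory: 100Mi

The cluster can schedule 88 instance(s) of the pod small-pod.

Termination reason: Unschedulable: 0/5 nodes are available: 2 Insufficient cpu,
3 node(s) had untolerated taints.

Pod distribution among nodes:
small-pod
	- 192.168.124.214: 45 instance(s)
	- 192.168.124.120: 43 instance(s)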
In the above example, the number of estimated pods that can be scheduled onto the cluster is 88.
8.2.3. Running the OpenShift Cluster Capacity Tool as a job inside a pod
You can run the OpenShift Cluster Capacity Tool as a job inside of a pod by using a ConfigMap object. This allows you to run the tool multiple times without needing user intervention.
Prerequisites
- Download and install the OpenShift Cluster Capacity Tool from the cluster-capacity repository. See the link in the "Additional resources" section.
Procedure
Create the cluster role:
Create a YAML file similar to the following:
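The original file was lost in the source. A sketch of a suitable cluster role, modeled on the upstream cluster-capacity example, is shown below; verify the rules against your needs before applying it:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: cluster-capacity-role
rules:
# read-only access to the objects the scheduler simulation inspects
- apiGroups: [""]
  resources: ["pods", "nodes", "persistentvolumeclaims", "persistentvolumes", "services", "replicationcontrollers"]
  verbs: ["get", "watch", "list"]
- apiGroups: ["apps"]
  resources: ["replicasets", "statefulsets"]
  verbs: ["get", "watch", "list"]
- apiGroups: ["policy"]
  resources: ["poddisruptionbudgets"]
  verbs: ["get", "watch", "list"]
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses"]
  verbs: ["get", "watch", "list"]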
Create the cluster role by running the following command:
$ oc create -f <file_name>.yaml
Create the service account:
$ oc create sa cluster-capacity-sa -n default

Add the role to the service account:
$ oc adm policy add-cluster-role-to-user cluster-capacity-role \
    system:serviceaccount:<namespace>:cluster-capacity-sa

where:
- <namespace>
- Specifies the namespace where the pod is located.
Define and create the pod spec:
Create a YAML file similar to the following:
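The original pod spec was lost in the source. A minimal sketch with illustrative names and resource values follows:

apiVersion: v1
kind: Pod
metadata:
  name: small-pod
  labels:
    app: guestbook
    tier: frontend
spec:
  containers:
  - name: php-redis
    image: gcr.io/google-samples/gb-frontend:v4
    imagePullPolicy: Always
    resources:
      limits:
        cpu: 150m
        memory: 100Mi
      requests:
        cpu: 150m
        memory: 100Mi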
Create the pod by running the following command:
$ oc create -f <file_name>.yaml

For example:

$ oc create -f pod.yaml
Create a config map object by running the following command:
$ oc create configmap cluster-capacity-configmap \
    --from-file=pod.yaml=pod.yaml

The cluster capacity analysis is mounted in a volume using a config map object named cluster-capacity-configmap to mount the input pod spec file pod.yaml into a volume test-volume at the path /test-pod.

Create the job by using the following example of a job specification file:
Create a YAML file similar to the following:
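The original job specification was lost in the source. The following sketch shows the shape such a job typically takes, reusing the service account, config map, and mount path from the earlier steps; the image reference and the CC_INCLUSTER variable name are taken from the upstream example and should be verified for your environment:

apiVersion: batch/v1
kind: Job
metadata:
  name: cluster-capacity-job
  namespace: default
spec:
  parallelism: 1
  completions: 1
  template:
    metadata:
      name: cluster-capacity-pod
    spec:
      serviceAccountName: cluster-capacity-sa
      restartPolicy: "Never"
      containers:
      - name: cluster-capacity
        image: registry.redhat.io/openshift4/ose-cluster-capacity
        imagePullPolicy: "Always"
        volumeMounts:
        - mountPath: /test-pod
          name: test-volume
        env:
        # tells the tool it is running inside a cluster as a pod
        - name: CC_INCLUSTER
          value: "true"
        command:
        - "/bin/sh"
        - "-ec"
        - |
          /bin/cluster-capacity --podspec=/test-pod/pod.yaml --verbose
      volumes:
      - name: test-volume
        configMap:
          name: cluster-capacity-configmap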
where:
spec.template.spec.containers.env- Specifies a required environment variable that indicates the Cluster Capacity Tool is running inside a cluster as a pod.

The pod.yaml key of the ConfigMap object is the same as the Pod spec file name, though it is not required. By doing this, the input pod spec file can be accessed inside the pod as /test-pod/pod.yaml.
Run the cluster capacity image as a job in a pod by running the following command:
$ oc create -f cluster-capacity-job.yaml
Verification
Check the job logs to find the number of pods that can be scheduled in the cluster:
$ oc logs jobs/cluster-capacity-job

Example output
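The original output block was lost in the source. The job log has the same format as the command-line run shown earlier; an illustrative example:

small-pod pod requirements:
	- CPU: 150m
	- Memory: 100Mi

The cluster can schedule 52 instance(s) of the pod small-pod.

Termination reason: Unschedulable: Insufficient cpu (2).

Pod distribution among nodes:
small-pod
	- 192.168.124.214: 26 instance(s)
	- 192.168.124.120: 26 instance(s)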
8.3. Restrict resource consumption with limit ranges
You can use limit ranges to restrict resource consumption for specific objects in a project.
By default, containers run with unbounded compute resources on a Red Hat OpenShift Service on AWS cluster.
You can configure resource consumption for the following objects:
- Pods and containers: You can set minimum and maximum requirements for CPU and memory for pods and their containers.
- Image streams: You can set limits on the number of images and tags in an ImageStream object.
- Images: You can limit the size of images that can be pushed to an internal registry.
- Persistent volume claims (PVC): You can restrict the size of the PVCs that can be requested.
If a pod does not meet the constraints imposed by the limit range, the pod cannot be created in the namespace.
8.3.1. About limit ranges
You can set specific resource limits for a pod, container, image, image stream, or persistent volume claim (PVC) in a specific project by defining a LimitRange object. A limit range allows you to restrict resource consumption in that project.
All requests to create and modify resources are evaluated against each LimitRange object in the project. If the resource violates any of the enumerated constraints, the resource is rejected.
The following shows a limit range object for all components: pod, container, image, image stream, or PVC. You can configure limits for any or all of these components in the same object. You create a different limit range object for each project where you want to control resources.
Sample limit range object for a container
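The sample object itself was lost in the source. A container-scoped sketch with illustrative values:

apiVersion: v1
kind: LimitRange
metadata:
  name: limit-range
spec:
  limits:
  - type: "Container"
    max:
      cpu: "2"
      memory: "1Gi"
    min:
      cpu: "100m"
      memory: "4Mi"
    default:
      cpu: "300m"
      memory: "200Mi"
    defaultRequest:
      cpu: "200m"
      memory: "100Mi"
    maxLimitRequestRatio:
      cpu: "10"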
8.3.2. About component limits
Review the following examples to learn the limit range parameters for each component for when you create or edit a LimitRange object.
The examples are broken out for clarity. You can create a single LimitRange object for any or all components as necessary.
- Container limits
A limit range allows you to specify the minimum and maximum CPU and memory that each container in a pod can request for a specific project. If a container is created in the project, the container CPU and memory requests in the Pod spec must comply with the values set in the LimitRange object. If not, the pod does not get created. The following requirements must hold true:

- The container CPU or memory request and limit must be greater than or equal to the min resource constraint for containers that are specified in the LimitRange object.
- The container CPU or memory request and limit must be less than or equal to the max resource constraint for containers that are specified in the LimitRange object.

If the LimitRange object defines a max CPU, you do not need to define a CPU request value in the Pod spec. But you must specify a CPU limit value that satisfies the maximum CPU constraint specified in the limit range. The following requirements must hold true:

- The ratio of the container limits to requests must be less than or equal to the maxLimitRequestRatio value for containers that is specified in the LimitRange object.

If the LimitRange object defines a maxLimitRequestRatio constraint, any new containers must have both a request and a limit value. Red Hat OpenShift Service on AWS calculates the limit-to-request ratio by dividing the limit by the request. This value should be a non-negative integer greater than 1.

For example, if a container has cpu: 500 in the limit value, and cpu: 100 in the request value, the limit-to-request ratio for cpu is 5. This ratio must be less than or equal to the maxLimitRequestRatio.

If the Pod spec does not specify a container resource memory or limit, the default or defaultRequest CPU and memory values for containers specified in the limit range object are assigned to the container.

Container LimitRange object definition
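The definition block was lost in the source. The following sketch uses illustrative values and shows the fields described below:

apiVersion: v1
kind: LimitRange
metadata:
  name: resource-limits
spec:
  limits:
  - type: "Container"
    max:
      cpu: "2"
      memory: "1Gi"
    min:
      cpu: "100m"
      memory: "4Mi"
    default:
      cpu: "300m"
      memory: "200Mi"
    defaultRequest:
      cpu: "200m"
      memory: "100Mi"
    maxLimitRequestRatio:
      cpu: "10"

where: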
metadata.name- Specifies the name of the limit range object.
spec.limit.max.cpu- Specifies the maximum amount of CPU that a single container in a pod can request.
spec.limit.max.memory- Specifies the maximum amount of memory that a single container in a pod can request.
spec.limit.min.cpu- Specifies the minimum amount of CPU that a single container in a pod can request.
spec.limit.min.memory- Specifies the minimum amount of memory that a single container in a pod can request.
spec.limit.default.cpu- Specifies the default amount of CPU that a container can use if not specified in the Pod spec.
spec.limit.default.memory- Specifies the default amount of memory that a container can use if not specified in the Pod spec.
spec.limit.defaultRequest.cpu- Specifies the default amount of CPU that a container can request if not specified in the Pod spec.
spec.limit.defaultRequest.memory- Specifies the default amount of memory that a container can request if not specified in the Pod spec.
spec.limit.maxLimitRequestRatio.cpu- Specifies the maximum limit-to-request ratio for a container.
- Pod limits
A limit range allows you to specify the minimum and maximum CPU and memory limits for all containers across a pod in a given project. To create a container in the project, the container CPU and memory requests in the Pod spec must comply with the values set in the LimitRange object. If not, the pod does not get created.

If the Pod spec does not specify a container resource memory or limit, the default or defaultRequest CPU and memory values for containers specified in the limit range object are assigned to the container.

Across all containers in a pod, the following requirements must hold true:

- The container CPU or memory request and limit must be greater than or equal to the min resource constraints for pods that are specified in the LimitRange object.
- The container CPU or memory request and limit must be less than or equal to the max resource constraints for pods that are specified in the LimitRange object.
- The ratio of the container limits to requests must be less than or equal to the maxLimitRequestRatio constraint specified in the LimitRange object.

Pod LimitRange object definition
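The definition block was lost in the source. The following sketch uses illustrative values and shows the fields described below:

apiVersion: v1
kind: LimitRange
metadata:
  name: pod-limits
spec:
  limits:
  - type: "Pod"
    max:
      cpu: "2"
      memory: "1Gi"
    min:
      cpu: "200m"
      memory: "6Mi"
    maxLimitRequestRatio:
      cpu: "10"

where: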
metadata.name- Specifies the name of the limit range object.
spec.limit.max.cpu- Specifies the maximum amount of CPU that a pod can request across all containers.
spec.limit.max.memory- Specifies the maximum amount of memory that a pod can request across all containers.
spec.limit.min.cpu- Specifies the minimum amount of CPU that a pod can request across all containers.
spec.limit.min.memory- Specifies the minimum amount of memory that a pod can request across all containers.
spec.limit.maxLimitRequestRatio.cpu- Specifies the maximum limit-to-request ratio for a container.
- Image limits
A limit range allows you to specify the maximum size of an image that can be pushed to an OpenShift image registry.
When pushing images to an OpenShift image registry, the following requirement must hold true:
- The size of the image must be less than or equal to the max size for images that is specified in the LimitRange object.

Image LimitRange object definition
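The definition block was lost in the source. A sketch with an illustrative size limit:

apiVersion: v1
kind: LimitRange
metadata:
  name: image-limits
spec:
  limits:
  - type: openshift.io/Image
    max:
      storage: 1Gi

where: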
metadata.name- Specifies the name of the limit range object.
spec.limit.max.storage- Specifies the maximum size of an image that can be pushed to an OpenShift image registry.
Warning: The image size is not always available in the manifest of an uploaded image. This is especially the case for images built with Docker 1.10 or higher and pushed to a v2 registry. If such an image is pulled with an older Docker daemon, the image manifest is converted by the registry to schema v1, lacking all the size information. In that case, no storage limit set on images will prevent the image from being uploaded.
- Image stream limits
A limit range allows you to specify limits for image streams.
For each image stream, the following requirements must hold true:
- The number of image tags in an ImageStream specification must be less than or equal to the openshift.io/image-tags constraint in the LimitRange object.
- The number of unique references to images in an ImageStream specification must be less than or equal to the openshift.io/images constraint in the limit range object.

Imagestream LimitRange object definition
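The definition block was lost in the source. A sketch with illustrative tag and image counts:

apiVersion: v1
kind: LimitRange
metadata:
  name: imagestream-limits
spec:
  limits:
  - type: openshift.io/ImageStream
    max:
      openshift.io/image-tags: 20
      openshift.io/images: 30

where: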
metadata.name- Specifies the name of the LimitRange object.
spec.limit.max.openshift.io/image-tags- Specifies the maximum number of unique image tags in the imagestream.spec.tags parameter in the imagestream spec.
spec.limit.max.openshift.io/images- Specifies the maximum number of unique image references in the imagestream.status.tags parameter in the imagestream spec.
The openshift.io/image-tags resource represents unique image references. Possible references are an ImageStreamTag, an ImageStreamImage, and a DockerImage. Tags can be created using the oc tag and oc import-image commands. No distinction is made between internal and external references. However, each unique reference tagged in an ImageStream specification is counted just once. It does not restrict pushes to an internal container image registry in any way, but is useful for tag restriction.

The openshift.io/images resource represents unique image names recorded in image stream status. It allows for restriction of a specific number of images that can be pushed to the OpenShift image registry. Internal and external references are not distinguished.
- Persistent volume claim limits
A limit range allows you to restrict the storage requested in a persistent volume claim (PVC).
Across all persistent volume claims in a project, the following requirements must hold true:
- The resource request in a persistent volume claim (PVC) must be greater than or equal to the min constraint for PVCs that is specified in the LimitRange object.
- The resource request in a persistent volume claim (PVC) must be less than or equal to the max constraint for PVCs that is specified in the LimitRange object.

PVC LimitRange object definition
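The definition block was lost in the source. A sketch with illustrative storage bounds:

apiVersion: v1
kind: LimitRange
metadata:
  name: pvc-limits
spec:
  limits:
  - type: "PersistentVolumeClaim"
    min:
      storage: "2Gi"
    max:
      storage: "50Gi"

where: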
metadata.name- Specifies the name of the LimitRange object.
spec.limit.min.storage- Specifies the minimum amount of storage that can be requested in a persistent volume claim.
spec.limit.max.storage- Specifies the maximum amount of storage that can be requested in a persistent volume claim.
8.3.3. Creating a Limit Range
You can define LimitRange objects to set specific resource limits for a pod, container, image, image stream, or persistent volume claim (PVC) in a specific project. A limit range allows you to restrict resource consumption in that project.
Procedure
Create a LimitRange object with your required specifications:
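The original example was lost in the source. The following sketch combines all component types with illustrative values; include only the types you need:

apiVersion: v1
kind: LimitRange
metadata:
  name: resource-limits
spec:
  limits:
  - type: "Pod"
    max:
      cpu: "2"
      memory: "1Gi"
    min:
      cpu: "200m"
      memory: "6Mi"
  - type: "Container"
    max:
      cpu: "2"
      memory: "1Gi"
    min:
      cpu: "100m"
      memory: "4Mi"
    default:
      cpu: "300m"
      memory: "200Mi"
    defaultRequest:
      cpu: "200m"
      memory: "100Mi"
    maxLimitRequestRatio:
      cpu: "10"
  - type: openshift.io/Image
    max:
      storage: 1Gi
  - type: openshift.io/ImageStream
    max:
      openshift.io/image-tags: 20
      openshift.io/images: 30
  - type: "PersistentVolumeClaim"
    min:
      storage: "2Gi"
    max:
      storage: "50Gi"

where: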
metadata.name- Specifies a name for the LimitRange object.
spec.limit.type.Pod- Specifies limits for a pod. Set the minimum and maximum CPU and memory requests as needed.
spec.limit.type.Container- Specifies limits for a container. Set the minimum and maximum CPU and memory requests as needed.
spec.limit.type.default- For a container, specifies the default amount of CPU or memory that a container can use, if not specified in the Pod spec. This parameter is optional.
spec.limit.type.defaultRequest- For a container, specifies the default amount of CPU or memory that a container can request, if not specified in the Pod spec. This parameter is optional.
spec.limit.type.maxLimitRequestRatio- For a container, specifies the maximum limit-to-request ratio that can be specified in the Pod spec. This parameter is optional.
spec.limit.type.openshift.io/Image- Specifies limits for an image object. Set the maximum size of an image that can be pushed to an OpenShift image registry.
spec.limit.type.openshift.io/ImageStream- Specifies limits for an image stream. Set the maximum number of image tags and references that can be in the ImageStream object file, as needed.
spec.limit.type.openshift.io/PersistentVolumeClaim- Specifies limits for a persistent volume claim. Set the minimum and maximum amount of storage that can be requested.
Create the object:
$ oc create -f <limit_range_file> -n <project>

where:
<limit_range_file>- Specifies the name of the YAML file you created.
<project>- Specifies the project where you want the limits to apply.
8.3.4. Viewing a limit
You can view the limits defined in a project by navigating in the web console to the project’s Quota page. This allows you to see details about each of the limit ranges in a project.
You can also use the CLI to view limit range details:
Procedure
Get the list of LimitRange objects defined in the project. For example, for a project called demoproject:

$ oc get limits -n demoproject

Example output

NAME              CREATED AT
resource-limits   2020-07-15T17:14:23Z

Describe the LimitRange object you are interested in, for example the resource-limits limit range:

$ oc describe limits resource-limits -n demoproject

Example output
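The original output block was lost in the source. Output from oc describe limits takes this general form; the values shown are illustrative:

Name:                           resource-limits
Namespace:                      demoproject
Type                            Resource                 Min   Max   Default Request   Default Limit   Max Limit/Request Ratio
----                            --------                 ---   ---   ---------------   -------------   -----------------------
Pod                             cpu                      200m  2     -                 -               -
Pod                             memory                   6Mi   1Gi   -                 -               -
Container                       cpu                      100m  2     200m              300m            10
Container                       memory                   4Mi   1Gi   100Mi             200Mi           -
openshift.io/Image              storage                  -     1Gi   -                 -               -
openshift.io/ImageStream        openshift.io/image-tags  -     20    -                 -               -
openshift.io/ImageStream        openshift.io/images      -     30    -                 -               -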
8.3.5. Deleting a Limit Range
You can remove any active LimitRange object so that it no longer enforces the limits in a project.
Procedure
Run the following command:
$ oc delete limits <limit_name>
8.4. Configuring cluster memory to meet container memory and risk requirements
As a cluster administrator, you can manage application memory usage to help your clusters operate more efficiently.
You can perform any of the following tasks to manage application memory:
- Determine the memory and risk requirements of a containerized application component and configure the container memory parameters to suit those requirements.
- Configure containerized application runtimes (for example, OpenJDK) to adhere optimally to the configured container memory parameters.
- Diagnose and resolve memory-related error conditions associated with running in a container.
8.4.1. Understanding how to manage application memory
Review the following concepts to learn how Red Hat OpenShift Service on AWS manages compute resources and how to keep your cluster running efficiently.
For each kind of resource (memory, CPU, storage), Red Hat OpenShift Service on AWS allows optional request and limit values to be placed on each container in a pod.
Note the following information about memory requests and memory limits:
Memory request
- The memory request value, if specified, influences the Red Hat OpenShift Service on AWS scheduler. The scheduler considers the memory request when scheduling a container to a node, then fences off the requested memory on the chosen node for the use of the container.
- If a node’s memory is exhausted, Red Hat OpenShift Service on AWS prioritizes evicting its containers whose memory usage most exceeds their memory request. In serious cases of memory exhaustion, the node OOM killer might select and kill a process in a container based on a similar metric.
- The cluster administrator can assign quota or assign default values for the memory request value.
- The cluster administrator can override the memory request values that a developer specifies, to manage cluster overcommit.
Memory limit
- The memory limit value, if specified, provides a hard limit on the memory that can be allocated across all the processes in a container.
- If the memory allocated by all of the processes in a container exceeds the memory limit, the node Out of Memory (OOM) killer immediately selects and kills a process in the container.
- If both memory request and limit are specified, the memory limit value must be greater than or equal to the memory request.
- The cluster administrator can assign quota or assign default values for the memory limit value.
- The minimum memory limit is 12 MB. If a container fails to start due to a Cannot allocate memory pod event, the memory limit is too low. Either increase or remove the memory limit. Removing the limit allows pods to consume unbounded node resources.
The steps for sizing application memory on Red Hat OpenShift Service on AWS are as follows:
Determine expected container memory usage
Determine expected mean and peak container memory usage. For example, you could perform separate load testing. Remember to consider all the processes that could potentially run in parallel in the container, such as any ancillary scripts that might be spawned by the main application.
Determine risk appetite
Determine risk appetite for eviction. If the risk appetite is low, the container should request memory according to the expected peak usage plus a percentage safety margin. If the risk appetite is higher, it might be more appropriate to request memory according to the expected mean usage.
Set container memory request
Set the container memory request based on the above. The request should represent the application memory usage as accurately as possible. If the request is too high, cluster and quota usage will be inefficient. If the request is too low, the chances of application eviction increase.
Set container memory limit, if required
Set the container memory limit, if required. Setting a limit has the effect of immediately killing a container process if the combined memory usage of all processes in the container exceeds the limit. Setting a limit might make unanticipated excess memory usage obvious early (fail fast). However, setting a limit also terminates processes abruptly.
Note that some Red Hat OpenShift Service on AWS clusters might require a limit value to be set; some might override the request based on the limit; and some application images rely on a limit value being set as this is easier to detect than a request value.
If the memory limit is set, it should not be set to less than the expected peak container memory usage plus a percentage safety margin.
Ensure applications are tuned
Ensure your applications are tuned with respect to configured request and limit values, if appropriate. This step is particularly relevant to applications which pool memory, such as the JVM. The rest of this page discusses this.
8.4.2. Understanding OpenJDK settings for Red Hat OpenShift Service on AWS
You can review the following concepts to learn about how to deploy OpenJDK applications in your cluster effectively.
The default OpenJDK settings do not work well with containerized environments. As a result, some additional Java memory settings must always be provided whenever running the OpenJDK in a container.
The JVM memory layout is complex, version dependent, and describing it in detail is beyond the scope of this documentation. However, as a starting point for running OpenJDK in a container, at least the following three memory-related tasks are key:
- Overriding the JVM maximum heap size
OpenJDK defaults to using a maximum of 25% of available memory (recognizing any container memory limits in place) for heap memory. This default value is conservative, and, in a properly-configured container environment, would result in 75% of the memory assigned to a container being mostly unused. A much higher percentage for the JVM to use for heap memory, such as 80%, is more suitable in a container context where memory limits are imposed on the container level.
Most of the Red Hat containers include a startup script that replaces the OpenJDK default by setting updated values when the JVM launches.
For example, the Red Hat build of OpenJDK containers have a default value of 80%. This value can be set to a different percentage by defining the JAVA_MAX_RAM_RATIO environment variable.

For other OpenJDK deployments, the default value of 25% can be changed using the following command:
Example
$ java -XX:MaxRAMPercentage=80.0

- Encouraging the JVM to release unused memory to the operating system, if appropriate
By default, the OpenJDK does not aggressively return unused memory to the operating system. This could be appropriate for many containerized Java workloads, but notable exceptions include workloads where additional active processes co-exist with a JVM within a container, whether those additional processes are native, additional JVMs, or a combination of the two.
Java-based agents can use the following JVM arguments to encourage the JVM to release unused memory to the operating system:
-XX:+UseParallelGC -XX:MinHeapFreeRatio=5 -XX:MaxHeapFreeRatio=10 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90

These arguments are intended to return heap memory to the operating system whenever allocated memory exceeds 110% of in-use memory (-XX:MaxHeapFreeRatio), spending up to 20% of CPU time in the garbage collector (-XX:GCTimeRatio). At no time will the application heap allocation be less than the initial heap allocation (overridden by -XX:InitialHeapSize / -Xms). Detailed additional information is available in Tuning Java’s footprint in OpenShift (Part 1), Tuning Java’s footprint in OpenShift (Part 2), and OpenJDK and Containers.

- Ensuring all JVM processes within a container are appropriately configured
In the case that multiple JVMs run in the same container, it is essential to ensure that they are all configured appropriately. For many workloads it will be necessary to grant each JVM a percentage memory budget, leaving a perhaps substantial additional safety margin.
Many Java tools use different environment variables (JAVA_OPTS, GRADLE_OPTS, and so on) to configure their JVMs, and it can be challenging to ensure that the right settings are being passed to the right JVM.

The JAVA_TOOL_OPTIONS environment variable is always respected by the OpenJDK, and values specified in JAVA_TOOL_OPTIONS will be overridden by other options specified on the JVM command line. To ensure that these options are used by default for all JVM workloads run in the Java-based agent image, the Red Hat OpenShift Service on AWS Jenkins Maven agent image sets the following variable:

JAVA_TOOL_OPTIONS="-Dsun.zip.disableMemoryMapping=true"
This does not guarantee that additional options are not required, but is intended to be a helpful starting point. Optimally tuning JVM workloads for running in a container is beyond the scope of this documentation, and may involve setting multiple additional JVM options.
8.4.3. Finding the memory request and limit from within a pod
You can configure your container to use the Downward API to dynamically discover its memory request and limit from within a pod. This allows your applications to better manage these resources without needing to use the API server.
Procedure
Configure the pod to add the MEMORY_REQUEST and MEMORY_LIMIT stanzas:

Create a YAML file similar to the following:
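The original file was lost in the source. A sketch of such a pod spec follows; the pod name test matches the verification step below, the request and limit values match the verification output (384Mi and 512Mi), and the image and command are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
  - name: test
    image: fedora:latest
    command:
    - sleep
    - "3600"
    env:
    # exposes the container memory request through the Downward API
    - name: MEMORY_REQUEST
      valueFrom:
        resourceFieldRef:
          containerName: test
          resource: requests.memory
    # exposes the container memory limit through the Downward API
    - name: MEMORY_LIMIT
      valueFrom:
        resourceFieldRef:
          containerName: test
          resource: limits.memory
    resources:
      requests:
        memory: 384Mi
      limits:
        memory: 512Mi

where: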
spec.containers.env.name.MEMORY_REQUEST- This stanza discovers the application memory request value.
spec.containers.env.name.MEMORY_LIMIT- This stanza discovers the application memory limit value.
Create the pod by running the following command:
$ oc create -f <file_name>.yaml
Verification
Access the pod using a remote shell:
$ oc rsh test

Check that the requested values were applied:
$ env | grep MEMORY | sort

Example output

MEMORY_LIMIT=536870912
MEMORY_REQUEST=402653184
The memory limit value can also be read from inside the container by the /sys/fs/cgroup/memory/memory.limit_in_bytes file.
8.4.4. Understanding OOM kill policy
Red Hat OpenShift Service on AWS can kill a process in a container if the total memory usage of all the processes in the container exceeds the memory limit, or in serious cases of node memory exhaustion.
If a process is Out of Memory (OOM) killed, the container might exit immediately. If the container PID 1 process receives the SIGKILL signal, the container exits immediately. Otherwise, the container behavior depends on the behavior of the other processes.

For example, a container process that exits with code 137 indicates that it received a SIGKILL signal.

If the container does not exit immediately, use the following steps to detect whether an OOM kill occurred.
Procedure
Access the pod using a remote shell:
$ oc rsh <pod_name>

Run the following command to see the current OOM kill count in /sys/fs/cgroup/memory/memory.oom_control:

$ grep '^oom_kill ' /sys/fs/cgroup/memory/memory.oom_control

Example output
oom_kill 0

Run the following command to provoke an OOM kill:
$ sed -e '' </dev/zero

Example output
Killed

Run the following command to see that the OOM kill counter in /sys/fs/cgroup/memory/memory.oom_control incremented:

$ grep '^oom_kill ' /sys/fs/cgroup/memory/memory.oom_control

Example output
oom_kill 1

If one or more processes in a pod are OOM killed, when the pod subsequently exits, whether immediately or not, it will have phase Failed and reason OOMKilled. An OOM-killed pod might be restarted depending on the value of restartPolicy. If not restarted, controllers such as the replication controller will notice the pod’s failed status and create a new pod to replace the old one.

Use the following command to get the pod status:
$ oc get pod test

Example output
NAME      READY   STATUS      RESTARTS   AGE
test      0/1     OOMKilled   0          1m

If the pod has not restarted, run the following command to view the pod:
$ oc get pod test -o yaml

Example output
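The original output block was lost in the source. For a pod that was not restarted, the relevant part of the status typically looks like this (abbreviated, illustrative):

...
status:
  containerStatuses:
  - name: test
    ready: false
    restartCount: 0
    state:
      terminated:
        exitCode: 137
        reason: OOMKilled
  phase: Failed
  reason: OOMKilled
...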
If restarted, run the following command to view the pod:
$ oc get pod test -o yaml

Example output
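The original output block was lost in the source. If the pod was restarted, the previous OOM kill instead appears under lastState and the pod remains Running, similar to the following abbreviated, illustrative status:

...
status:
  containerStatuses:
  - name: test
    ready: true
    restartCount: 1
    lastState:
      terminated:
        exitCode: 137
        reason: OOMKilled
    state:
      running: {}
  phase: Running
...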
8.4.5. Understanding pod eviction
You can review the following concepts to learn the Red Hat OpenShift Service on AWS pod eviction policy.
Red Hat OpenShift Service on AWS can evict a pod from its node when the node’s memory is exhausted. Depending on the extent of memory exhaustion, the eviction might or might not be graceful. Graceful eviction implies the main process (PID 1) of each container receiving a SIGTERM signal, then some time later a SIGKILL signal if the process has not exited already. Non-graceful eviction implies the main process of each container immediately receiving a SIGKILL signal.
An evicted pod has phase Failed and reason Evicted. It is not restarted, regardless of the value of restartPolicy. However, controllers such as the replication controller will notice the pod’s failed status and create a new pod to replace the old one.
$ oc get pod test
Example output
NAME      READY   STATUS    RESTARTS   AGE
test      0/1     Evicted   0          1m
$ oc get pod test -o yaml
Example output
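The original output block was lost in the source. The status of an evicted pod generally records the eviction reason, for example (abbreviated, illustrative):

...
status:
  message: 'The node was low on resource: memory.'
  phase: Failed
  reason: Evicted
...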
8.5. Configuring your cluster to place pods on overcommitted nodes
Red Hat OpenShift Service on AWS administrators can manage container density on nodes by configuring pod placement behavior and per-project resource limits that overcommit cannot exceed.
Alternatively, administrators can disable project-level resource overcommitment on customer-created namespaces that are not managed by Red Hat.
For more information about container resource management, see Additional resources.
In an overcommitted state, the sum of the container compute resource requests and limits exceeds the resources available on the system. For example, you might want to use overcommitment in development environments where a trade-off of guaranteed performance for capacity is acceptable.
Containers can specify compute resource requests and limits. Requests are used for scheduling your container and provide a minimum service guarantee. Limits constrain the amount of compute resource that can be consumed on your node.
The scheduler attempts to optimize the compute resource use across all nodes in your cluster. It places pods onto specific nodes, taking the pods' compute resource requests and nodes' available capacity into consideration.
8.5.1. Project-level limits
In Red Hat OpenShift Service on AWS, overcommitment of project-level resources is enabled by default. If required by your use case, you can disable overcommitment on projects that are not managed by Red Hat.
For the list of projects that are managed by Red Hat and cannot be modified, see "Red Hat Managed resources" in Support.
8.5.1.1. Disabling overcommitment for a project
If required by your use case, you can disable overcommitment on any project that is not managed by Red Hat. For a list of projects that cannot be modified, see "Red Hat Managed resources" in Support.
Prerequisites
- You are logged in to the cluster using an account with cluster administrator or cluster editor permissions.
Procedure
Edit the namespace object file:
If you are using the web console:
- Click Administration → Namespaces and click the namespace for the project.
- In the Annotations section, click the Edit button.
- Click Add more and enter a new annotation that uses a Key of quota.openshift.io/cluster-resource-override-enabled and a Value of false.
- Click Save.
If you are using the ROSA CLI (rosa):

Edit the namespace:
$ rosa edit namespace/<project_name>

Add the following annotation:
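The original snippet was lost in the source. The annotation key and value match the web-console step above; it is added to the namespace metadata as follows:

apiVersion: v1
kind: Namespace
metadata:
  annotations:
    quota.openshift.io/cluster-resource-override-enabled: "false"
...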
where:
metadata.annotations.quota.openshift.io/cluster-resource-override-enabled.false- Specifies that overcommit is disabled for this namespace.