Chapter 8. Working with clusters


8.1. Viewing system event information in an OpenShift Dedicated cluster

Events in OpenShift Dedicated are modeled based on events that happen to API objects in an OpenShift Dedicated cluster.

8.1.1. Understanding events

Events allow OpenShift Dedicated to record information about real-world events in a resource-agnostic manner. They also allow developers and administrators to consume information about system components in a unified way.

8.1.2. Viewing events using the CLI

You can get a list of events in a given project using the CLI.

Procedure

  • To view events in a project use the following command:

    $ oc get events [-n <project>] 1
    1
    The name of the project.

    For example:

    $ oc get events -n openshift-config

    Example output

    LAST SEEN   TYPE      REASON                   OBJECT                      MESSAGE
    97m         Normal    Scheduled                pod/dapi-env-test-pod       Successfully assigned openshift-config/dapi-env-test-pod to ip-10-0-171-202.ec2.internal
    97m         Normal    Pulling                  pod/dapi-env-test-pod       pulling image "gcr.io/google_containers/busybox"
    97m         Normal    Pulled                   pod/dapi-env-test-pod       Successfully pulled image "gcr.io/google_containers/busybox"
    97m         Normal    Created                  pod/dapi-env-test-pod       Created container
    9m5s        Warning   FailedCreatePodSandBox   pod/dapi-volume-test-pod    Failed create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_dapi-volume-test-pod_openshift-config_6bc60c1f-452e-11e9-9140-0eec59c23068_0(748c7a40db3d08c07fb4f9eba774bd5effe5f0d5090a242432a73eee66ba9e22): Multus: Err adding pod to network "ovn-kubernetes": cannot set "ovn-kubernetes" ifname to "eth0": no netns: failed to Statfs "/proc/33366/ns/net": no such file or directory
    8m31s       Normal    Scheduled                pod/dapi-volume-test-pod    Successfully assigned openshift-config/dapi-volume-test-pod to ip-10-0-171-202.ec2.internal
    #...

  • To view events in your project from the OpenShift Dedicated console.

    1. Launch the OpenShift Dedicated console.
    2. Click Home Events and select your project.
    3. Move to resource that you want to see events. For example: Home Projects <project-name> <resource-name>.

      Many objects, such as pods and deployments, have their own Events tab as well, which shows events related to that object.

8.1.3. List of events

This section describes the events of OpenShift Dedicated.

Table 8.1. Configuration events
NameDescription

FailedValidation

Failed pod configuration validation.

Table 8.2. Container events
NameDescription

BackOff

Back-off restarting failed the container.

Created

Container created.

Failed

Pull/Create/Start failed.

Killing

Killing the container.

Started

Container started.

Preempting

Preempting other pods.

ExceededGracePeriod

Container runtime did not stop the pod within specified grace period.

Table 8.3. Health events
NameDescription

Unhealthy

Container is unhealthy.

Table 8.4. Image events
NameDescription

BackOff

Back off Ctr Start, image pull.

ErrImageNeverPull

The image’s NeverPull Policy is violated.

Failed

Failed to pull the image.

InspectFailed

Failed to inspect the image.

Pulled

Successfully pulled the image or the container image is already present on the machine.

Pulling

Pulling the image.

Table 8.5. Image Manager events
NameDescription

FreeDiskSpaceFailed

Free disk space failed.

InvalidDiskCapacity

Invalid disk capacity.

Table 8.6. Node events
NameDescription

FailedMount

Volume mount failed.

HostNetworkNotSupported

Host network not supported.

HostPortConflict

Host/port conflict.

KubeletSetupFailed

Kubelet setup failed.

NilShaper

Undefined shaper.

NodeNotReady

Node is not ready.

NodeNotSchedulable

Node is not schedulable.

NodeReady

Node is ready.

NodeSchedulable

Node is schedulable.

NodeSelectorMismatching

Node selector mismatch.

OutOfDisk

Out of disk.

Rebooted

Node rebooted.

Starting

Starting kubelet.

FailedAttachVolume

Failed to attach volume.

FailedDetachVolume

Failed to detach volume.

VolumeResizeFailed

Failed to expand/reduce volume.

VolumeResizeSuccessful

Successfully expanded/reduced volume.

FileSystemResizeFailed

Failed to expand/reduce file system.

FileSystemResizeSuccessful

Successfully expanded/reduced file system.

FailedUnMount

Failed to unmount volume.

FailedMapVolume

Failed to map a volume.

FailedUnmapDevice

Failed unmaped device.

AlreadyMountedVolume

Volume is already mounted.

SuccessfulDetachVolume

Volume is successfully detached.

SuccessfulMountVolume

Volume is successfully mounted.

SuccessfulUnMountVolume

Volume is successfully unmounted.

ContainerGCFailed

Container garbage collection failed.

ImageGCFailed

Image garbage collection failed.

FailedNodeAllocatableEnforcement

Failed to enforce System Reserved Cgroup limit.

NodeAllocatableEnforced

Enforced System Reserved Cgroup limit.

UnsupportedMountOption

Unsupported mount option.

SandboxChanged

Pod sandbox changed.

FailedCreatePodSandBox

Failed to create pod sandbox.

FailedPodSandBoxStatus

Failed pod sandbox status.

Table 8.7. Pod worker events
NameDescription

FailedSync

Pod sync failed.

Table 8.8. System Events
NameDescription

SystemOOM

There is an OOM (out of memory) situation on the cluster.

Table 8.9. Pod events
NameDescription

FailedKillPod

Failed to stop a pod.

FailedCreatePodContainer

Failed to create a pod container.

Failed

Failed to make pod data directories.

NetworkNotReady

Network is not ready.

FailedCreate

Error creating: <error-msg>.

SuccessfulCreate

Created pod: <pod-name>.

FailedDelete

Error deleting: <error-msg>.

SuccessfulDelete

Deleted pod: <pod-id>.

Table 8.10. Horizontal Pod AutoScaler events
NameDescription

SelectorRequired

Selector is required.

InvalidSelector

Could not convert selector into a corresponding internal selector object.

FailedGetObjectMetric

HPA was unable to compute the replica count.

InvalidMetricSourceType

Unknown metric source type.

ValidMetricFound

HPA was able to successfully calculate a replica count.

FailedConvertHPA

Failed to convert the given HPA.

FailedGetScale

HPA controller was unable to get the target’s current scale.

SucceededGetScale

HPA controller was able to get the target’s current scale.

FailedComputeMetricsReplicas

Failed to compute desired number of replicas based on listed metrics.

FailedRescale

New size: <size>; reason: <msg>; error: <error-msg>.

SuccessfulRescale

New size: <size>; reason: <msg>.

FailedUpdateStatus

Failed to update status.

Table 8.11. Volume events
NameDescription

FailedBinding

There are no persistent volumes available and no storage class is set.

VolumeMismatch

Volume size or class is different from what is requested in claim.

VolumeFailedRecycle

Error creating recycler pod.

VolumeRecycled

Occurs when volume is recycled.

RecyclerPod

Occurs when pod is recycled.

VolumeDelete

Occurs when volume is deleted.

VolumeFailedDelete

Error when deleting the volume.

ExternalProvisioning

Occurs when volume for the claim is provisioned either manually or via external software.

ProvisioningFailed

Failed to provision volume.

ProvisioningCleanupFailed

Error cleaning provisioned volume.

ProvisioningSucceeded

Occurs when the volume is provisioned successfully.

WaitForFirstConsumer

Delay binding until pod scheduling.

Table 8.12. Lifecycle hooks
NameDescription

FailedPostStartHook

Handler failed for pod start.

FailedPreStopHook

Handler failed for pre-stop.

UnfinishedPreStopHook

Pre-stop hook unfinished.

Table 8.13. Deployments
NameDescription

DeploymentCancellationFailed

Failed to cancel deployment.

DeploymentCancelled

Canceled deployment.

DeploymentCreated

Created new replication controller.

IngressIPRangeFull

No available Ingress IP to allocate to service.

Table 8.14. Scheduler events
NameDescription

FailedScheduling

Failed to schedule pod: <pod-namespace>/<pod-name>. This event is raised for multiple reasons, for example: AssumePodVolumes failed, Binding rejected etc.

Preempted

By <preemptor-namespace>/<preemptor-name> on node <node-name>.

Scheduled

Successfully assigned <pod-name> to <node-name>.

Table 8.15. Daemon set events
NameDescription

SelectingAll

This daemon set is selecting all pods. A non-empty selector is required.

FailedPlacement

Failed to place pod on <node-name>.

FailedDaemonPod

Found failed daemon pod <pod-name> on node <node-name>, will try to kill it.

Table 8.16. LoadBalancer service events
NameDescription

CreatingLoadBalancerFailed

Error creating load balancer.

DeletingLoadBalancer

Deleting load balancer.

EnsuringLoadBalancer

Ensuring load balancer.

EnsuredLoadBalancer

Ensured load balancer.

UnAvailableLoadBalancer

There are no available nodes for LoadBalancer service.

LoadBalancerSourceRanges

Lists the new LoadBalancerSourceRanges. For example, <old-source-range> <new-source-range>.

LoadbalancerIP

Lists the new IP address. For example, <old-ip> <new-ip>.

ExternalIP

Lists external IP address. For example, Added: <external-ip>.

UID

Lists the new UID. For example, <old-service-uid> <new-service-uid>.

ExternalTrafficPolicy

Lists the new ExternalTrafficPolicy. For example, <old-policy> <new-policy>.

HealthCheckNodePort

Lists the new HealthCheckNodePort. For example, <old-node-port> new-node-port>.

UpdatedLoadBalancer

Updated load balancer with new hosts.

LoadBalancerUpdateFailed

Error updating load balancer with new hosts.

DeletingLoadBalancer

Deleting load balancer.

DeletingLoadBalancerFailed

Error deleting load balancer.

DeletedLoadBalancer

Deleted load balancer.

8.2. Estimating the number of pods your OpenShift Dedicated nodes can hold

As a cluster administrator, you can use the OpenShift Cluster Capacity Tool to view the number of pods that can be scheduled to increase the current resources before they become exhausted, and to ensure any future pods can be scheduled. This capacity comes from an individual node host in a cluster, and includes CPU, memory, disk space, and others.

8.2.1. Understanding the OpenShift Cluster Capacity Tool

The OpenShift Cluster Capacity Tool simulates a sequence of scheduling decisions to determine how many instances of an input pod can be scheduled on the cluster before it is exhausted of resources to provide a more accurate estimation.

Note

The remaining allocatable capacity is a rough estimation, because it does not count all of the resources being distributed among nodes. It analyzes only the remaining resources and estimates the available capacity that is still consumable in terms of a number of instances of a pod with given requirements that can be scheduled in a cluster.

Also, pods might only have scheduling support on particular sets of nodes based on its selection and affinity criteria. As a result, the estimation of which remaining pods a cluster can schedule can be difficult.

You can run the OpenShift Cluster Capacity Tool as a stand-alone utility from the command line, or as a job in a pod inside an OpenShift Dedicated cluster. Running the tool as job inside of a pod enables you to run it multiple times without intervention.

8.2.2. Running the OpenShift Cluster Capacity Tool on the command line

You can run the OpenShift Cluster Capacity Tool from the command line to estimate the number of pods that can be scheduled onto your cluster.

You create a sample pod spec file, which the tool uses for estimating resource usage. The pod spec specifies its resource requirements as limits or requests. The cluster capacity tool takes the pod’s resource requirements into account for its estimation analysis.

Prerequisites

  1. Run the OpenShift Cluster Capacity Tool, which is available as a container image from the Red Hat Ecosystem Catalog.
  2. Create a sample pod spec file:

    1. Create a YAML file similar to the following:

      apiVersion: v1
      kind: Pod
      metadata:
        name: small-pod
        labels:
          app: guestbook
          tier: frontend
      spec:
        securityContext:
          runAsNonRoot: true
          seccompProfile:
            type: RuntimeDefault
        containers:
        - name: php-redis
          image: gcr.io/google-samples/gb-frontend:v4
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 150m
              memory: 100Mi
            requests:
              cpu: 150m
              memory: 100Mi
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: [ALL]
    2. Create the cluster role:

      $ oc create -f <file_name>.yaml

      For example:

      $ oc create -f pod-spec.yaml

Procedure

To use the cluster capacity tool on the command line:

  1. From the terminal, log in to the Red Hat Registry:

    $ podman login registry.redhat.io
  2. Pull the cluster capacity tool image:

    $ podman pull registry.redhat.io/openshift4/ose-cluster-capacity
  3. Run the cluster capacity tool:

    $ podman run -v $HOME/.kube:/kube:Z -v $(pwd):/cc:Z  ose-cluster-capacity \
    /bin/cluster-capacity --kubeconfig /kube/config --<pod_spec>.yaml /cc/<pod_spec>.yaml \
    --verbose

    where:

    <pod_spec>.yaml
    Specifies the pod spec to use.
    verbose
    Outputs a detailed description of how many pods can be scheduled on each node in the cluster.

    Example output

    small-pod pod requirements:
    	- CPU: 150m
    	- Memory: 100Mi
    
    The cluster can schedule 88 instance(s) of the pod small-pod.
    
    Termination reason: Unschedulable: 0/5 nodes are available: 2 Insufficient cpu,
    3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't
    tolerate.
    
    Pod distribution among nodes:
    small-pod
    	- 192.168.124.214: 45 instance(s)
    	- 192.168.124.120: 43 instance(s)

    In the above example, the number of estimated pods that can be scheduled onto the cluster is 88.

8.2.3. Running the OpenShift Cluster Capacity Tool as a job inside a pod

Running the OpenShift Cluster Capacity Tool as a job inside of a pod allows you to run the tool multiple times without needing user intervention. You run the OpenShift Cluster Capacity Tool as a job by using a ConfigMap object.

Prerequisites

Download and install OpenShift Cluster Capacity Tool.

Procedure

To run the cluster capacity tool:

  1. Create the cluster role:

    1. Create a YAML file similar to the following:

      kind: ClusterRole
      apiVersion: rbac.authorization.k8s.io/v1
      metadata:
        name: cluster-capacity-role
      rules:
      - apiGroups: [""]
        resources: ["pods", "nodes", "persistentvolumeclaims", "persistentvolumes", "services", "replicationcontrollers"]
        verbs: ["get", "watch", "list"]
      - apiGroups: ["apps"]
        resources: ["replicasets", "statefulsets"]
        verbs: ["get", "watch", "list"]
      - apiGroups: ["policy"]
        resources: ["poddisruptionbudgets"]
        verbs: ["get", "watch", "list"]
      - apiGroups: ["storage.k8s.io"]
        resources: ["storageclasses"]
        verbs: ["get", "watch", "list"]
    2. Create the cluster role by running the following command:

      $ oc create -f <file_name>.yaml

      For example:

      $ oc create sa cluster-capacity-sa
  2. Create the service account:

    $ oc create sa cluster-capacity-sa -n default
  3. Add the role to the service account:

    $ oc adm policy add-cluster-role-to-user cluster-capacity-role \
        system:serviceaccount:<namespace>:cluster-capacity-sa

    where:

    <namespace>
    Specifies the namespace where the pod is located.
  4. Define and create the pod spec:

    1. Create a YAML file similar to the following:

      apiVersion: v1
      kind: Pod
      metadata:
        name: small-pod
        labels:
          app: guestbook
          tier: frontend
      spec:
        securityContext:
          runAsNonRoot: true
          seccompProfile:
            type: RuntimeDefault
        containers:
        - name: php-redis
          image: gcr.io/google-samples/gb-frontend:v4
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 150m
              memory: 100Mi
            requests:
              cpu: 150m
              memory: 100Mi
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: [ALL]
    2. Create the pod by running the following command:

      $ oc create -f <file_name>.yaml

      For example:

      $ oc create -f pod.yaml
  5. Created a config map object by running the following command:

    $ oc create configmap cluster-capacity-configmap \
        --from-file=pod.yaml=pod.yaml

    The cluster capacity analysis is mounted in a volume using a config map object named cluster-capacity-configmap to mount the input pod spec file pod.yaml into a volume test-volume at the path /test-pod.

  6. Create the job using the below example of a job specification file:

    1. Create a YAML file similar to the following:

      apiVersion: batch/v1
      kind: Job
      metadata:
        name: cluster-capacity-job
      spec:
        parallelism: 1
        completions: 1
        template:
          metadata:
            name: cluster-capacity-pod
          spec:
              containers:
              - name: cluster-capacity
                image: openshift/origin-cluster-capacity
                imagePullPolicy: "Always"
                volumeMounts:
                - mountPath: /test-pod
                  name: test-volume
                env:
                - name: CC_INCLUSTER 1
                  value: "true"
                command:
                - "/bin/sh"
                - "-ec"
                - |
                  /bin/cluster-capacity --podspec=/test-pod/pod.yaml --verbose
              restartPolicy: "Never"
              serviceAccountName: cluster-capacity-sa
              volumes:
              - name: test-volume
                configMap:
                  name: cluster-capacity-configmap
      1
      A required environment variable letting the cluster capacity tool know that it is running inside a cluster as a pod.
      The pod.yaml key of the ConfigMap object is the same as the Pod spec file name, though it is not required. By doing this, the input pod spec file can be accessed inside the pod as /test-pod/pod.yaml.
    2. Run the cluster capacity image as a job in a pod by running the following command:

      $ oc create -f cluster-capacity-job.yaml

Verification

  1. Check the job logs to find the number of pods that can be scheduled in the cluster:

    $ oc logs jobs/cluster-capacity-job

    Example output

    small-pod pod requirements:
            - CPU: 150m
            - Memory: 100Mi
    
    The cluster can schedule 52 instance(s) of the pod small-pod.
    
    Termination reason: Unschedulable: No nodes are available that match all of the
    following predicates:: Insufficient cpu (2).
    
    Pod distribution among nodes:
    small-pod
            - 192.168.124.214: 26 instance(s)
            - 192.168.124.120: 26 instance(s)

8.3. Restrict resource consumption with limit ranges

By default, containers run with unbounded compute resources on an OpenShift Dedicated cluster. With limit ranges, you can restrict resource consumption for specific objects in a project:

  • pods and containers: You can set minimum and maximum requirements for CPU and memory for pods and their containers.
  • Image streams: You can set limits on the number of images and tags in an ImageStream object.
  • Images: You can limit the size of images that can be pushed to an internal registry.
  • Persistent volume claims (PVC): You can restrict the size of the PVCs that can be requested.

If a pod does not meet the constraints imposed by the limit range, the pod cannot be created in the namespace.

8.3.1. About limit ranges

A limit range, defined by a LimitRange object, restricts resource consumption in a project. In the project you can set specific resource limits for a pod, container, image, image stream, or persistent volume claim (PVC).

All requests to create and modify resources are evaluated against each LimitRange object in the project. If the resource violates any of the enumerated constraints, the resource is rejected.

The following shows a limit range object for all components: pod, container, image, image stream, or PVC. You can configure limits for any or all of these components in the same object. You create a different limit range object for each project where you want to control resources.

Sample limit range object for a container

apiVersion: "v1"
kind: "LimitRange"
metadata:
  name: "resource-limits"
spec:
  limits:
    - type: "Container"
      max:
        cpu: "2"
        memory: "1Gi"
      min:
        cpu: "100m"
        memory: "4Mi"
      default:
        cpu: "300m"
        memory: "200Mi"
      defaultRequest:
        cpu: "200m"
        memory: "100Mi"
      maxLimitRequestRatio:
        cpu: "10"

8.3.1.1. About component limits

The following examples show limit range parameters for each component. The examples are broken out for clarity. You can create a single LimitRange object for any or all components as necessary.

8.3.1.1.1. Container limits

A limit range allows you to specify the minimum and maximum CPU and memory that each container in a pod can request for a specific project. If a container is created in the project, the container CPU and memory requests in the Pod spec must comply with the values set in the LimitRange object. If not, the pod does not get created.

  • The container CPU or memory request and limit must be greater than or equal to the min resource constraint for containers that are specified in the LimitRange object.
  • The container CPU or memory request and limit must be less than or equal to the max resource constraint for containers that are specified in the LimitRange object.

    If the LimitRange object defines a max CPU, you do not need to define a CPU request value in the Pod spec. But you must specify a CPU limit value that satisfies the maximum CPU constraint specified in the limit range.

  • The ratio of the container limits to requests must be less than or equal to the maxLimitRequestRatio value for containers that is specified in the LimitRange object.

    If the LimitRange object defines a maxLimitRequestRatio constraint, any new containers must have both a request and a limit value. OpenShift Dedicated calculates the limit-to-request ratio by dividing the limit by the request. This value should be a non-negative integer greater than 1.

    For example, if a container has cpu: 500 in the limit value, and cpu: 100 in the request value, the limit-to-request ratio for cpu is 5. This ratio must be less than or equal to the maxLimitRequestRatio.

If the Pod spec does not specify a container resource memory or limit, the default or defaultRequest CPU and memory values for containers specified in the limit range object are assigned to the container.

Container LimitRange object definition

apiVersion: "v1"
kind: "LimitRange"
metadata:
  name: "resource-limits" 1
spec:
  limits:
    - type: "Container"
      max:
        cpu: "2" 2
        memory: "1Gi" 3
      min:
        cpu: "100m" 4
        memory: "4Mi" 5
      default:
        cpu: "300m" 6
        memory: "200Mi" 7
      defaultRequest:
        cpu: "200m" 8
        memory: "100Mi" 9
      maxLimitRequestRatio:
        cpu: "10" 10

1
The name of the LimitRange object.
2
The maximum amount of CPU that a single container in a pod can request.
3
The maximum amount of memory that a single container in a pod can request.
4
The minimum amount of CPU that a single container in a pod can request.
5
The minimum amount of memory that a single container in a pod can request.
6
The default amount of CPU that a container can use if not specified in the Pod spec.
7
The default amount of memory that a container can use if not specified in the Pod spec.
8
The default amount of CPU that a container can request if not specified in the Pod spec.
9
The default amount of memory that a container can request if not specified in the Pod spec.
10
The maximum limit-to-request ratio for a container.
8.3.1.1.2. Pod limits

A limit range allows you to specify the minimum and maximum CPU and memory limits for all containers across a pod in a given project. To create a container in the project, the container CPU and memory requests in the Pod spec must comply with the values set in the LimitRange object. If not, the pod does not get created.

If the Pod spec does not specify a container resource memory or limit, the default or defaultRequest CPU and memory values for containers specified in the limit range object are assigned to the container.

Across all containers in a pod, the following must hold true:

  • The container CPU or memory request and limit must be greater than or equal to the min resource constraints for pods that are specified in the LimitRange object.
  • The container CPU or memory request and limit must be less than or equal to the max resource constraints for pods that are specified in the LimitRange object.
  • The ratio of the container limits to requests must be less than or equal to the maxLimitRequestRatio constraint specified in the LimitRange object.

Pod LimitRange object definition

apiVersion: "v1"
kind: "LimitRange"
metadata:
  name: "resource-limits" 1
spec:
  limits:
    - type: "Pod"
      max:
        cpu: "2" 2
        memory: "1Gi" 3
      min:
        cpu: "200m" 4
        memory: "6Mi" 5
      maxLimitRequestRatio:
        cpu: "10" 6

1
The name of the limit range object.
2
The maximum amount of CPU that a pod can request across all containers.
3
The maximum amount of memory that a pod can request across all containers.
4
The minimum amount of CPU that a pod can request across all containers.
5
The minimum amount of memory that a pod can request across all containers.
6
The maximum limit-to-request ratio for a container.
8.3.1.1.3. Image limits

A LimitRange object allows you to specify the maximum size of an image that can be pushed to an OpenShift image registry.

When pushing images to an OpenShift image registry, the following must hold true:

  • The size of the image must be less than or equal to the max size for images that is specified in the LimitRange object.

Image LimitRange object definition

apiVersion: "v1"
kind: "LimitRange"
metadata:
  name: "resource-limits" 1
spec:
  limits:
    - type: openshift.io/Image
      max:
        storage: 1Gi 2

1
The name of the LimitRange object.
2
The maximum size of an image that can be pushed to an OpenShift image registry.
Warning

The image size is not always available in the manifest of an uploaded image. This is especially the case for images built with Docker 1.10 or higher and pushed to a v2 registry. If such an image is pulled with an older Docker daemon, the image manifest is converted by the registry to schema v1 lacking all the size information. No storage limit set on images prevent it from being uploaded.

The issue is being addressed.

8.3.1.1.4. Image stream limits

A LimitRange object allows you to specify limits for image streams.

For each image stream, the following must hold true:

  • The number of image tags in an ImageStream specification must be less than or equal to the openshift.io/image-tags constraint in the LimitRange object.
  • The number of unique references to images in an ImageStream specification must be less than or equal to the openshift.io/images constraint in the limit range object.

Imagestream LimitRange object definition

apiVersion: "v1"
kind: "LimitRange"
metadata:
  name: "resource-limits" 1
spec:
  limits:
    - type: openshift.io/ImageStream
      max:
        openshift.io/image-tags: 20 2
        openshift.io/images: 30 3

1
The name of the LimitRange object.
2
The maximum number of unique image tags in the imagestream.spec.tags parameter in imagestream spec.
3
The maximum number of unique image references in the imagestream.status.tags parameter in the imagestream spec.

The openshift.io/image-tags resource represents unique image references. Possible references are an ImageStreamTag, an ImageStreamImage and a DockerImage. Tags can be created using the oc tag and oc import-image commands. No distinction is made between internal and external references. However, each unique reference tagged in an ImageStream specification is counted just once. It does not restrict pushes to an internal container image registry in any way, but is useful for tag restriction.

The openshift.io/images resource represents unique image names recorded in image stream status. It allows for restriction of a number of images that can be pushed to the OpenShift image registry. Internal and external references are not distinguished.

8.3.1.1.5. Persistent volume claim limits

A LimitRange object allows you to restrict the storage requested in a persistent volume claim (PVC).

Across all persistent volume claims in a project, the following must hold true:

  • The resource request in a persistent volume claim (PVC) must be greater than or equal the min constraint for PVCs that is specified in the LimitRange object.
  • The resource request in a persistent volume claim (PVC) must be less than or equal the max constraint for PVCs that is specified in the LimitRange object.

PVC LimitRange object definition

apiVersion: "v1"
kind: "LimitRange"
metadata:
  name: "resource-limits" 1
spec:
  limits:
    - type: "PersistentVolumeClaim"
      min:
        storage: "2Gi" 2
      max:
        storage: "50Gi" 3

1
The name of the LimitRange object.
2
The minimum amount of storage that can be requested in a persistent volume claim.
3
The maximum amount of storage that can be requested in a persistent volume claim.

8.3.2. Creating a Limit Range

To apply a limit range to a project:

  1. Create a LimitRange object with your required specifications:

    apiVersion: "v1"
    kind: "LimitRange"
    metadata:
      name: "resource-limits" 1
    spec:
      limits:
        - type: "Pod" 2
          max:
            cpu: "2"
            memory: "1Gi"
          min:
            cpu: "200m"
            memory: "6Mi"
        - type: "Container" 3
          max:
            cpu: "2"
            memory: "1Gi"
          min:
            cpu: "100m"
            memory: "4Mi"
          default: 4
            cpu: "300m"
            memory: "200Mi"
          defaultRequest: 5
            cpu: "200m"
            memory: "100Mi"
          maxLimitRequestRatio: 6
            cpu: "10"
        - type: openshift.io/Image 7
          max:
            storage: 1Gi
        - type: openshift.io/ImageStream 8
          max:
            openshift.io/image-tags: 20
            openshift.io/images: 30
        - type: "PersistentVolumeClaim" 9
          min:
            storage: "2Gi"
          max:
            storage: "50Gi"
    1
    Specify a name for the LimitRange object.
    2
    To set limits for a pod, specify the minimum and maximum CPU and memory requests as needed.
    3
    To set limits for a container, specify the minimum and maximum CPU and memory requests as needed.
    4
    Optional. For a container, specify the default amount of CPU or memory that a container can use, if not specified in the Pod spec.
    5
    Optional. For a container, specify the default amount of CPU or memory that a container can request, if not specified in the Pod spec.
    6
    Optional. For a container, specify the maximum limit-to-request ratio that can be specified in the Pod spec.
    7
    To set limits for an Image object, set the maximum size of an image that can be pushed to an OpenShift image registry.
    8
    To set limits for an image stream, set the maximum number of image tags and references that can be in the ImageStream object file, as needed.
    9
    To set limits for a persistent volume claim, set the minimum and maximum amount of storage that can be requested.
  2. Create the object:

    $ oc create -f <limit_range_file> -n <project> 1
    1
    Specify the name of the YAML file you created and the project where you want the limits to apply.

8.3.3. Viewing a limit

You can view any limits defined in a project by navigating in the web console to the project’s Quota page.

You can also use the CLI to view limit range details:

  1. Get the list of LimitRange object defined in the project. For example, for a project called demoproject:

    $ oc get limits -n demoproject
    NAME              CREATED AT
    resource-limits   2020-07-15T17:14:23Z
  2. Describe the LimitRange object you are interested in, for example the resource-limits limit range:

    $ oc describe limits resource-limits -n demoproject
    Name:                           resource-limits
    Namespace:                      demoproject
    Type                            Resource                Min     Max     Default Request Default Limit   Max Limit/Request Ratio
    ----                            --------                ---     ---     --------------- -------------   -----------------------
    Pod                             cpu                     200m    2       -               -               -
    Pod                             memory                  6Mi     1Gi     -               -               -
    Container                       cpu                     100m    2       200m            300m            10
    Container                       memory                  4Mi     1Gi     100Mi           200Mi           -
    openshift.io/Image              storage                 -       1Gi     -               -               -
    openshift.io/ImageStream        openshift.io/image      -       12      -               -               -
    openshift.io/ImageStream        openshift.io/image-tags -       10      -               -               -
    PersistentVolumeClaim           storage                 -       50Gi    -               -               -

8.3.4. Deleting a Limit Range

To remove any active LimitRange object to no longer enforce the limits in a project:

  • Run the following command:

    $ oc delete limits <limit_name>

8.4. Configuring cluster memory to meet container memory and risk requirements

As a cluster administrator, you can help your clusters operate efficiently through managing application memory by:

  • Determining the memory and risk requirements of a containerized application component and configuring the container memory parameters to suit those requirements.
  • Configuring containerized application runtimes (for example, OpenJDK) to adhere optimally to the configured container memory parameters.
  • Diagnosing and resolving memory-related error conditions associated with running in a container.

8.4.1. Understanding managing application memory

It is recommended to fully read the overview of how OpenShift Dedicated manages Compute Resources before proceeding.

For each kind of resource (memory, CPU, storage), OpenShift Dedicated allows optional request and limit values to be placed on each container in a pod.

Note the following about memory requests and memory limits:

  • Memory request

    • The memory request value, if specified, influences the OpenShift Dedicated scheduler. The scheduler considers the memory request when scheduling a container to a node, then fences off the requested memory on the chosen node for the use of the container.
    • If a node’s memory is exhausted, OpenShift Dedicated prioritizes evicting its containers whose memory usage most exceeds their memory request. In serious cases of memory exhaustion, the node OOM killer may select and kill a process in a container based on a similar metric.
    • The cluster administrator can assign quota or assign default values for the memory request value.
    • The cluster administrator can override the memory request values that a developer specifies, to manage cluster overcommit.
  • Memory limit

    • The memory limit value, if specified, provides a hard limit on the memory that can be allocated across all the processes in a container.
    • If the memory allocated by all of the processes in a container exceeds the memory limit, the node Out of Memory (OOM) killer will immediately select and kill a process in the container.
    • If both memory request and limit are specified, the memory limit value must be greater than or equal to the memory request.
    • The cluster administrator can assign quota or assign default values for the memory limit value.
    • The minimum memory limit is 12 MB. If a container fails to start due to a Cannot allocate memory pod event, the memory limit is too low. Either increase or remove the memory limit. Removing the limit allows pods to consume unbounded node resources.

8.4.1.1. Managing application memory strategy

The steps for sizing application memory on OpenShift Dedicated are as follows:

  1. Determine expected container memory usage

    Determine expected mean and peak container memory usage, empirically if necessary (for example, by separate load testing). Remember to consider all the processes that may potentially run in parallel in the container: for example, does the main application spawn any ancillary scripts?

  2. Determine risk appetite

    Determine risk appetite for eviction. If the risk appetite is low, the container should request memory according to the expected peak usage plus a percentage safety margin. If the risk appetite is higher, it may be more appropriate to request memory according to the expected mean usage.

  3. Set container memory request

    Set container memory request based on the above. The more accurately the request represents the application memory usage, the better. If the request is too high, cluster and quota usage will be inefficient. If the request is too low, the chances of application eviction increase.

  4. Set container memory limit, if required

    Set container memory limit, if required. Setting a limit has the effect of immediately killing a container process if the combined memory usage of all processes in the container exceeds the limit, and is therefore a mixed blessing. On the one hand, it may make unanticipated excess memory usage obvious early ("fail fast"); on the other hand it also terminates processes abruptly.

    Note that some OpenShift Dedicated clusters may require a limit value to be set; some may override the request based on the limit; and some application images rely on a limit value being set as this is easier to detect than a request value.

    If the memory limit is set, it should not be set to less than the expected peak container memory usage plus a percentage safety margin.

  5. Ensure application is tuned

    Ensure application is tuned with respect to configured request and limit values, if appropriate. This step is particularly relevant to applications which pool memory, such as the JVM. The rest of this page discusses this.

8.4.2. Understanding OpenJDK settings for OpenShift Dedicated

The default OpenJDK settings do not work well with containerized environments. As a result, some additional Java memory settings must always be provided whenever running the OpenJDK in a container.

The JVM memory layout is complex, version dependent, and describing it in detail is beyond the scope of this documentation. However, as a starting point for running OpenJDK in a container, at least the following three memory-related tasks are key:

  1. Overriding the JVM maximum heap size.
  2. Encouraging the JVM to release unused memory to the operating system, if appropriate.
  3. Ensuring all JVM processes within a container are appropriately configured.

Optimally tuning JVM workloads for running in a container is beyond the scope of this documentation, and may involve setting multiple additional JVM options.

8.4.2.1. Understanding how to override the JVM maximum heap size

For many Java workloads, the JVM heap is the largest single consumer of memory. Currently, the OpenJDK defaults to allowing up to 1/4 (1/-XX:MaxRAMFraction) of the compute node’s memory to be used for the heap, regardless of whether the OpenJDK is running in a container or not. It is therefore essential to override this behavior, especially if a container memory limit is also set.

There are at least two ways the above can be achieved:

  • If the container memory limit is set and the experimental options are supported by the JVM, set -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap.

    Note

    The UseCGroupMemoryLimitForHeap option has been removed in JDK 11. Use -XX:+UseContainerSupport instead.

    This sets -XX:MaxRAM to the container memory limit, and the maximum heap size (-XX:MaxHeapSize / -Xmx) to 1/-XX:MaxRAMFraction (1/4 by default).

  • Directly override one of -XX:MaxRAM, -XX:MaxHeapSize or -Xmx.

    This option involves hard-coding a value, but has the advantage of allowing a safety margin to be calculated.

8.4.2.2. Understanding how to encourage the JVM to release unused memory to the operating system

By default, the OpenJDK does not aggressively return unused memory to the operating system. This may be appropriate for many containerized Java workloads, but notable exceptions include workloads where additional active processes co-exist with a JVM within a container, whether those additional processes are native, additional JVMs, or a combination of the two.

Java-based agents can use the following JVM arguments to encourage the JVM to release unused memory to the operating system:

-XX:+UseParallelGC
-XX:MinHeapFreeRatio=5 -XX:MaxHeapFreeRatio=10 -XX:GCTimeRatio=4
-XX:AdaptiveSizePolicyWeight=90.

These arguments are intended to return heap memory to the operating system whenever allocated memory exceeds 110% of in-use memory (-XX:MaxHeapFreeRatio), spending up to 20% of CPU time in the garbage collector (-XX:GCTimeRatio). At no time will the application heap allocation be less than the initial heap allocation (overridden by -XX:InitialHeapSize / -Xms). Detailed additional information is available Tuning Java’s footprint in OpenShift (Part 1), Tuning Java’s footprint in OpenShift (Part 2), and at OpenJDK and Containers.

8.4.2.3. Understanding how to ensure all JVM processes within a container are appropriately configured

In the case that multiple JVMs run in the same container, it is essential to ensure that they are all configured appropriately. For many workloads it will be necessary to grant each JVM a percentage memory budget, leaving a perhaps substantial additional safety margin.

Many Java tools use different environment variables (JAVA_OPTS, GRADLE_OPTS, and so on) to configure their JVMs and it can be challenging to ensure that the right settings are being passed to the right JVM.

The JAVA_TOOL_OPTIONS environment variable is always respected by the OpenJDK, and values specified in JAVA_TOOL_OPTIONS will be overridden by other options specified on the JVM command line. By default, to ensure that these options are used by default for all JVM workloads run in the Java-based agent image, the OpenShift Dedicated Jenkins Maven agent image sets:

JAVA_TOOL_OPTIONS="-XX:+UnlockExperimentalVMOptions
-XX:+UseCGroupMemoryLimitForHeap -Dsun.zip.disableMemoryMapping=true"
Note

The UseCGroupMemoryLimitForHeap option has been removed in JDK 11. Use -XX:+UseContainerSupport instead.

This does not guarantee that additional options are not required, but is intended to be a helpful starting point.

8.4.3. Finding the memory request and limit from within a pod

An application wishing to dynamically discover its memory request and limit from within a pod should use the Downward API.

Procedure

  • Configure the pod to add the MEMORY_REQUEST and MEMORY_LIMIT stanzas:

    1. Create a YAML file similar to the following:

      apiVersion: v1
      kind: Pod
      metadata:
        name: test
      spec:
        securityContext:
          runAsNonRoot: true
          seccompProfile:
            type: RuntimeDefault
        containers:
        - name: test
          image: fedora:latest
          command:
          - sleep
          - "3600"
          env:
          - name: MEMORY_REQUEST 1
            valueFrom:
              resourceFieldRef:
                containerName: test
                resource: requests.memory
          - name: MEMORY_LIMIT 2
            valueFrom:
              resourceFieldRef:
                containerName: test
                resource: limits.memory
          resources:
            requests:
              memory: 384Mi
            limits:
              memory: 512Mi
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: [ALL]
      1
      Add this stanza to discover the application memory request value.
      2
      Add this stanza to discover the application memory limit value.
    2. Create the pod by running the following command:

      $ oc create -f <file_name>.yaml

Verification

  1. Access the pod using a remote shell:

    $ oc rsh test
  2. Check that the requested values were applied:

    $ env | grep MEMORY | sort

    Example output

    MEMORY_LIMIT=536870912
    MEMORY_REQUEST=402653184

Note

The memory limit value can also be read from inside the container by the /sys/fs/cgroup/memory/memory.limit_in_bytes file.

8.4.4. Understanding OOM kill policy

OpenShift Dedicated can kill a process in a container if the total memory usage of all the processes in the container exceeds the memory limit, or in serious cases of node memory exhaustion.

When a process is Out of Memory (OOM) killed, this might result in the container exiting immediately. If the container PID 1 process receives the SIGKILL, the container will exit immediately. Otherwise, the container behavior is dependent on the behavior of the other processes.

For example, a container process exited with code 137, indicating it received a SIGKILL signal.

If the container does not exit immediately, an OOM kill is detectable as follows:

  1. Access the pod using a remote shell:

    # oc rsh test
  2. Run the following command to see the current OOM kill count in /sys/fs/cgroup/memory/memory.oom_control:

    $ grep '^oom_kill ' /sys/fs/cgroup/memory/memory.oom_control

    Example output

    oom_kill 0

  3. Run the following command to provoke an OOM kill:

    $ sed -e '' </dev/zero

    Example output

    Killed

  4. Run the following command to view the exit status of the sed command:

    $ echo $?

    Example output

    137

    The 137 code indicates the container process exited with code 137, indicating it received a SIGKILL signal.

  5. Run the following command to see that the OOM kill counter in /sys/fs/cgroup/memory/memory.oom_control incremented:

    $ grep '^oom_kill ' /sys/fs/cgroup/memory/memory.oom_control

    Example output

    oom_kill 1

    If one or more processes in a pod are OOM killed, when the pod subsequently exits, whether immediately or not, it will have phase Failed and reason OOMKilled. An OOM-killed pod might be restarted depending on the value of restartPolicy. If not restarted, controllers such as the replication controller will notice the pod’s failed status and create a new pod to replace the old one.

    Use the follwing command to get the pod status:

    $ oc get pod test

    Example output

    NAME      READY     STATUS      RESTARTS   AGE
    test      0/1       OOMKilled   0          1m

    • If the pod has not restarted, run the following command to view the pod:

      $ oc get pod test -o yaml

      Example output

      ...
      status:
        containerStatuses:
        - name: test
          ready: false
          restartCount: 0
          state:
            terminated:
              exitCode: 137
              reason: OOMKilled
        phase: Failed

    • If restarted, run the following command to view the pod:

      $ oc get pod test -o yaml

      Example output

      ...
      status:
        containerStatuses:
        - name: test
          ready: true
          restartCount: 1
          lastState:
            terminated:
              exitCode: 137
              reason: OOMKilled
          state:
            running:
        phase: Running

8.4.5. Understanding pod eviction

OpenShift Dedicated may evict a pod from its node when the node’s memory is exhausted. Depending on the extent of memory exhaustion, the eviction may or may not be graceful. Graceful eviction implies the main process (PID 1) of each container receiving a SIGTERM signal, then some time later a SIGKILL signal if the process has not exited already. Non-graceful eviction implies the main process of each container immediately receiving a SIGKILL signal.

An evicted pod has phase Failed and reason Evicted. It will not be restarted, regardless of the value of restartPolicy. However, controllers such as the replication controller will notice the pod’s failed status and create a new pod to replace the old one.

$ oc get pod test

Example output

NAME      READY     STATUS    RESTARTS   AGE
test      0/1       Evicted   0          1m

$ oc get pod test -o yaml

Example output

...
status:
  message: 'Pod The node was low on resource: [MemoryPressure].'
  phase: Failed
  reason: Evicted

8.5. Configuring your cluster to place pods on overcommitted nodes

In an overcommitted state, the sum of the container compute resource requests and limits exceeds the resources available on the system. For example, you might want to use overcommitment in development environments where a trade-off of guaranteed performance for capacity is acceptable.

Containers can specify compute resource requests and limits. Requests are used for scheduling your container and provide a minimum service guarantee. Limits constrain the amount of compute resource that can be consumed on your node.

The scheduler attempts to optimize the compute resource use across all nodes in your cluster. It places pods onto specific nodes, taking the pods' compute resource requests and nodes' available capacity into consideration.

OpenShift Dedicated administrators can manage container density on nodes by configuring pod placement behavior and per-project resource limits that overcommit cannot exceed.

Alternatively, administrators can disable project-level resource overcommitment on customer-created namespaces that are not managed by Red Hat.

For more information about container resource management, see Additional resources.

8.5.1. Project-level limits

In OpenShift Dedicated, overcommitment of project-level resources is enabled by default. If required by your use case, you can disable overcommitment on projects that are not managed by Red Hat.

For the list of projects that are managed by Red Hat and cannot be modified, see "Red Hat Managed resources" in Support.

8.5.1.1. Disabling overcommitment for a project

If required by your use case, you can disable overcommitment on any project that is not managed by Red Hat. For a list of projects that cannot be modified, see "Red Hat Managed resources" in Support.

Prerequisites

  • You are logged in to the cluster using an account with cluster administrator or cluster editor permissions.

Procedure

  1. Edit the namespace object file:

    1. If you are using the web console:

      1. Click Administration Namespaces and click the namespace for the project.
      2. In the Annotations section, click the Edit button.
      3. Click Add more and enter a new annotation that uses a Key of quota.openshift.io/cluster-resource-override-enabled and a Value of false.
      4. Click Save.
    2. If you are using the OpenShift CLI (oc):

      1. Edit the namespace:

        $ oc edit namespace/<project_name>
      2. Add the following annotation:

        apiVersion: v1
        kind: Namespace
        metadata:
          annotations:
            quota.openshift.io/cluster-resource-override-enabled: "false" <.>
        # ...

        <.> Setting this annotation to false disables overcommit for this namespace.

8.5.2. Additional resources

Red Hat logoGithubRedditYoutubeTwitter

Learn

Try, buy, & sell

Communities

About Red Hat Documentation

We help Red Hat users innovate and achieve their goals with our products and services with content they can trust.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. For more details, see the Red Hat Blog.

About Red Hat

We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.

© 2024 Red Hat, Inc.