7.1. Understanding Containers

The basic units of OpenShift Container Platform applications are called containers. Linux container technologies are lightweight mechanisms for isolating running processes so that they are limited to interacting with only their designated resources.

Many application instances can be running in containers on a single host without visibility into each others' processes, files, network, and so on. Typically, each container provides a single service (often called a "micro-service"), such as a web server or a database, though containers can be used for arbitrary workloads.

The Linux kernel has been incorporating capabilities for container technologies for years. OpenShift Container Platform and Kubernetes add the ability to orchestrate containers across multi-host installations.

About containers and RHEL kernel memory

Due to Red Hat Enterprise Linux (RHEL) behavior, a container on a node with high CPU usage might seem to consume more memory than expected. The higher memory consumption could be caused by the kmem_cache in the RHEL kernel. The RHEL kernel creates a kmem_cache for each cgroup. For added performance, the kmem_cache contains a cpu_cache, and a node cache for any NUMA nodes. These caches all consume kernel memory.

The amount of memory stored in those caches is proportional to the number of CPUs that the system uses. As a result, a higher number of CPUs results in a greater amount of kernel memory being held in these caches. Higher amounts of kernel memory in these caches can cause OpenShift Container Platform containers to exceed the configured memory limits, resulting in the container being killed.

To avoid losing containers due to kernel memory issues, ensure that the containers request sufficient memory. You can use the following formula to estimate the amount of memory consumed by the kmem_cache, where nproc is the number of processing units available that are reported by the nproc command. The lower limit of container requests should be this value plus the container memory requirements:

$(nproc) X 1/2 MiB

7.2. Using Init Containers to perform tasks before a pod is deployed

OpenShift Container Platform provides init containers, which are specialized containers that run before application containers and can contain utilities or setup scripts not present in an app image.

7.2.1. Understanding Init Containers

You can use an Init Container resource to perform tasks before the rest of a pod is deployed.

A pod can have Init Containers in addition to application containers. Init containers allow you to reorganize setup scripts and binding code.

An Init Container can:

Contain and run utilities that are not desirable to include in the app Container image for security reasons.
Contain utilities or custom code for setup that is not present in an app image. For example, there is no requirement to make an image FROM another image just to use a tool like sed, awk, python, or dig during setup.
Use Linux namespaces so that they have different filesystem views from app containers, such as access to secrets that application containers are not able to access.

Each Init Container must complete successfully before the next one is started. So, Init Containers provide an easy way to block or delay the startup of app containers until some set of preconditions are met.

For example, the following are some ways you can use Init Containers:

Wait for a service to be created with a shell command like:

for i in {1..100}; do sleep 1; if dig myservice; then exit 0; fi; done; exit 1

Register this pod with a remote server from the downward API with a command like:

$ curl -X POST http://$MANAGEMENT_SERVICE_HOST:$MANAGEMENT_SERVICE_PORT/register -d ‘instance=$()&ip=$()’

Wait for some time before starting the app Container with a command like sleep 60.
Clone a git repository into a volume.
Place values into a configuration file and run a template tool to dynamically generate a configuration file for the main app Container. For example, place the POD_IP value in a configuration and generate the main app configuration file using Jinja.

See the Kubernetes documentation for more information.

7.2.2. Creating Init Containers

The following example outlines a simple pod which has two Init Containers. The first waits for myservice and the second waits for mydb. After both containers complete, the pod begins.

Procedure

Create the pod for the Init Container:

Create a YAML file similar to the following:

apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app: myapp
spec:
  containers:
  - name: myapp-container
    image: registry.access.redhat.com/ubi8/ubi:latest
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
  initContainers:
  - name: init-myservice
    image: registry.access.redhat.com/ubi8/ubi:latest
    command: ['sh', '-c', 'until getent hosts myservice; do echo waiting for myservice; sleep 2; done;']
  - name: init-mydb
    image: registry.access.redhat.com/ubi8/ubi:latest
    command: ['sh', '-c', 'until getent hosts mydb; do echo waiting for mydb; sleep 2; done;']
# ...

Create the pod:
```
$ oc create -f myapp.yaml
```

View the status of the pod:

$ oc get pods

Example output

NAME                          READY     STATUS              RESTARTS   AGE
myapp-pod                     0/1       Init:0/2            0          5s

The pod status, Init:0/2, indicates it is waiting for the two services.

Create the myservice service.

Create a YAML file similar to the following:

kind: Service
apiVersion: v1
metadata:
  name: myservice
spec:
  ports:
  - protocol: TCP
    port: 80
    targetPort: 9376

Create the pod:
```
$ oc create -f myservice.yaml
```

View the status of the pod:

$ oc get pods

Example output

NAME                          READY     STATUS              RESTARTS   AGE
myapp-pod                     0/1       Init:1/2            0          5s

The pod status, Init:1/2, indicates it is waiting for one service, in this case the mydb service.

Create the mydb service:

Create a YAML file similar to the following:

kind: Service
apiVersion: v1
metadata:
  name: mydb
spec:
  ports:
  - protocol: TCP
    port: 80
    targetPort: 9377

Create the pod:
```
$ oc create -f mydb.yaml
```

View the status of the pod:

$ oc get pods

Example output

NAME                          READY     STATUS              RESTARTS   AGE
myapp-pod                     1/1       Running             0          2m

The pod status indicated that it is no longer waiting for the services and is running.

7.3. Using volumes to persist container data

Files in a container are ephemeral. As such, when a container crashes or stops, the data is lost. You can use volumes to persist the data used by the containers in a pod. A volume is directory, accessible to the Containers in a pod, where data is stored for the life of the pod.

7.3.1. Understanding volumes

Volumes are mounted file systems available to pods and their containers which may be backed by a number of host-local or network attached storage endpoints. Containers are not persistent by default; on restart, their contents are cleared.

To ensure that the file system on the volume contains no errors and, if errors are present, to repair them when possible, OpenShift Container Platform invokes the fsck utility prior to the mount utility. This occurs when either adding a volume or updating an existing volume.

The simplest volume type is emptyDir, which is a temporary directory on a single machine. Administrators may also allow you to request a persistent volume that is automatically attached to your pods.

Note

emptyDir volume storage may be restricted by a quota based on the pod’s FSGroup, if the FSGroup parameter is enabled by your cluster administrator.

7.3.2. Working with volumes using the OpenShift Container Platform CLI

You can use the CLI command oc set volume to add and remove volumes and volume mounts for any object that has a pod template like replication controllers or deployment configs. You can also list volumes in pods or any object that has a pod template.

The oc set volume command uses the following general syntax:

$ oc set volume <object_selection> <operation> <mandatory_parameters> <options>

Object selection: Specify one of the following for the object_selection parameter in the oc set volume command:

Table 7.1. Object Selection
Syntax	Description	Example
`<object_type> <name>`	Selects `<name>` of type `<object_type>`.	`deploymentConfig registry`
`<object_type>/<name>`	Selects `<name>` of type `<object_type>`.	`deploymentConfig/registry`
`<object_type>--selector=<object_label_selector>`	Selects resources of type `<object_type>` that matched the given label selector.	`deploymentConfig--selector="name=registry"`
`<object_type> --all`	Selects all resources of type `<object_type>`.	`deploymentConfig --all`
`-f` or `--filename=<file_name>`	File name, directory, or URL to file to use to edit the resource.	`-f registry-deployment-config.json`

Operation: Specify --add or --remove for the operation parameter in the oc set volume command.
Mandatory parameters: Any mandatory parameters are specific to the selected operation and are discussed in later sections.
Options: Any options are specific to the selected operation and are discussed in later sections.

7.3.3. Listing volumes and volume mounts in a pod

You can list volumes and volume mounts in pods or pod templates:

Procedure

To list volumes:

$ oc set volume <object_type>/<name> [options]

List volume supported options:

Option	Description	Default
`--name`	Name of the volume.
`-c, --containers`	Select containers by name. It can also take wildcard `'*'` that matches any character.	`'*'`

For example:

To list all volumes for pod p1:
```
$ oc set volume pod/p1
```
To list volume v1 defined on all deployment configs:
```
$ oc set volume dc --all --name=v1
```

7.3.4. Adding volumes to a pod

You can add volumes and volume mounts to a pod.

Procedure

To add a volume, a volume mount, or both to pod templates:

$ oc set volume <object_type>/<name> --add [options]

Table 7.2. Supported Options for Adding Volumes
Option	Description	Default
`--name`	Name of the volume.	Automatically generated, if not specified.
`-t, --type`	Name of the volume source. Supported values: `emptyDir`, `hostPath`, `secret`, `configmap`, `persistentVolumeClaim` or `projected`.	`emptyDir`
`-c, --containers`	Select containers by name. It can also take wildcard `'*'` that matches any character.	`'*'`
`-m, --mount-path`	Mount path inside the selected containers. Do not mount to the container root, `/`, or any path that is the same in the host and the container. This can corrupt your host system if the container is sufficiently privileged, such as the host `/dev/pts` files. It is safe to mount the host by using `/host`.
`--path`	Host path. Mandatory parameter for `--type=hostPath`. Do not mount to the container root, `/`, or any path that is the same in the host and the container. This can corrupt your host system if the container is sufficiently privileged, such as the host `/dev/pts` files. It is safe to mount the host by using `/host`.
`--secret-name`	Name of the secret. Mandatory parameter for `--type=secret`.
`--configmap-name`	Name of the configmap. Mandatory parameter for `--type=configmap`.
`--claim-name`	Name of the persistent volume claim. Mandatory parameter for `--type=persistentVolumeClaim`.
`--source`	Details of volume source as a JSON string. Recommended if the desired volume source is not supported by `--type`.
`-o, --output`	Display the modified objects instead of updating them on the server. Supported values: `json`, `yaml`.
`--output-version`	Output the modified objects with the given version.	`api-version`

For example:

To add a new volume source emptyDir to the registry DeploymentConfig object:

$ oc set volume dc/registry --add

Tip

You can alternatively apply the following YAML to add the volume:

Example 7.1. Sample deployment config with an added volume

kind: DeploymentConfig
apiVersion: apps.openshift.io/v1
metadata:
  name: registry
  namespace: registry
spec:
  replicas: 3
  selector:
    app: httpd
  template:
    metadata:
      labels:
        app: httpd
    spec:
      volumes: 1
        - name: volume-pppsw
          emptyDir: {}
      containers:
        - name: httpd
          image: >-
            image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
          ports:
            - containerPort: 8080
              protocol: TCP

1: Add the volume source emptyDir.

To add volume v1 with secret secret1 for replication controller r1 and mount inside the containers at /data:

$ oc set volume rc/r1 --add --name=v1 --type=secret --secret-name='secret1' --mount-path=/data

Tip

You can alternatively apply the following YAML to add the volume:

Example 7.2. Sample replication controller with added volume and secret

kind: ReplicationController
apiVersion: v1
metadata:
  name: example-1
  namespace: example
spec:
  replicas: 0
  selector:
    app: httpd
    deployment: example-1
    deploymentconfig: example
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: httpd
        deployment: example-1
        deploymentconfig: example
    spec:
      volumes: 1
        - name: v1
          secret:
            secretName: secret1
            defaultMode: 420
      containers:
        - name: httpd
          image: >-
            image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
          volumeMounts: 2
            - name: v1
              mountPath: /data

1: Add the volume and secret.
2: Add the container mount path.

To add existing persistent volume v1 with claim name pvc1 to deployment configuration dc.json on disk, mount the volume on container c1 at /data, and update the DeploymentConfig object on the server:

$ oc set volume -f dc.json --add --name=v1 --type=persistentVolumeClaim \
  --claim-name=pvc1 --mount-path=/data --containers=c1

Tip

You can alternatively apply the following YAML to add the volume:

Example 7.3. Sample deployment config with persistent volume added

kind: DeploymentConfig
apiVersion: apps.openshift.io/v1
metadata:
  name: example
  namespace: example
spec:
  replicas: 3
  selector:
    app: httpd
  template:
    metadata:
      labels:
        app: httpd
    spec:
      volumes:
        - name: volume-pppsw
          emptyDir: {}
        - name: v1 1
          persistentVolumeClaim:
            claimName: pvc1
      containers:
        - name: httpd
          image: >-
            image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
          ports:
            - containerPort: 8080
              protocol: TCP
          volumeMounts: 2
            - name: v1
              mountPath: /data

1: Add the persistent volume claim named `pvc1.
2: Add the container mount path.

To add a volume v1 based on Git repository https://github.com/namespace1/project1 with revision 5125c45f9f563 for all replication controllers:

$ oc set volume rc --all --add --name=v1 \
  --source='{"gitRepo": {
                "repository": "https://github.com/namespace1/project1",
                "revision": "5125c45f9f563"
            }}'

7.3.5. Updating volumes and volume mounts in a pod

You can modify the volumes and volume mounts in a pod.

Procedure

Updating existing volumes using the --overwrite option:

$ oc set volume <object_type>/<name> --add --overwrite [options]

For example:

To replace existing volume v1 for replication controller r1 with existing persistent volume claim pvc1:

$ oc set volume rc/r1 --add --overwrite --name=v1 --type=persistentVolumeClaim --claim-name=pvc1

Tip

You can alternatively apply the following YAML to replace the volume:

Example 7.4. Sample replication controller with persistent volume claim named pvc1

kind: ReplicationController
apiVersion: v1
metadata:
  name: example-1
  namespace: example
spec:
  replicas: 0
  selector:
    app: httpd
    deployment: example-1
    deploymentconfig: example
  template:
    metadata:
      labels:
        app: httpd
        deployment: example-1
        deploymentconfig: example
    spec:
      volumes:
        - name: v1 1
          persistentVolumeClaim:
            claimName: pvc1
      containers:
        - name: httpd
          image: >-
            image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
          ports:
            - containerPort: 8080
              protocol: TCP
          volumeMounts:
            - name: v1
              mountPath: /data

1: Set persistent volume claim to pvc1.

To change the DeploymentConfig object d1 mount point to /opt for volume v1:

$ oc set volume dc/d1 --add --overwrite --name=v1 --mount-path=/opt

Tip

You can alternatively apply the following YAML to change the mount point:

Example 7.5. Sample deployment config with mount point set to opt.

kind: DeploymentConfig
apiVersion: apps.openshift.io/v1
metadata:
  name: example
  namespace: example
spec:
  replicas: 3
  selector:
    app: httpd
  template:
    metadata:
      labels:
        app: httpd
    spec:
      volumes:
        - name: volume-pppsw
          emptyDir: {}
        - name: v2
          persistentVolumeClaim:
            claimName: pvc1
        - name: v1
          persistentVolumeClaim:
            claimName: pvc1
      containers:
        - name: httpd
          image: >-
            image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
          ports:
            - containerPort: 8080
              protocol: TCP
          volumeMounts: 1
            - name: v1
              mountPath: /opt

1: Set the mount point to /opt.

7.3.6. Removing volumes and volume mounts from a pod

You can remove a volume or volume mount from a pod.

Procedure

To remove a volume from pod templates:

$ oc set volume <object_type>/<name> --remove [options]

Table 7.3. Supported options for removing volumes
Option	Description	Default
`--name`	Name of the volume.
`-c, --containers`	Select containers by name. It can also take wildcard `'*'` that matches any character.	`'*'`
`--confirm`	Indicate that you want to remove multiple volumes at once.
`-o, --output`	Display the modified objects instead of updating them on the server. Supported values: `json`, `yaml`.
`--output-version`	Output the modified objects with the given version.	`api-version`

For example:

To remove a volume v1 from the DeploymentConfig object d1:
```
$ oc set volume dc/d1 --remove --name=v1
```
To unmount volume v1 from container c1 for the DeploymentConfig object d1 and remove the volume v1 if it is not referenced by any containers on d1:
```
$ oc set volume dc/d1 --remove --name=v1 --containers=c1
```
To remove all volumes for replication controller r1:
```
$ oc set volume rc/r1 --remove --confirm
```

7.3.7. Configuring volumes for multiple uses in a pod

You can configure a volume to allows you to share one volume for multiple uses in a single pod using the volumeMounts.subPath property to specify a subPath value inside a volume instead of the volume’s root.

Note

You cannot add a subPath parameter to an existing scheduled pod.

Procedure

To view the list of files in the volume, run the oc rsh command:

$ oc rsh <pod>

Example output

sh-4.2$ ls /path/to/volume/subpath/mount
example_file1 example_file2 example_file3

Specify the subPath:

Example Pod spec with subPath parameter

apiVersion: v1
kind: Pod
metadata:
  name: my-site
spec:
    containers:
    - name: mysql
      image: mysql
      volumeMounts:
      - mountPath: /var/lib/mysql
        name: site-data
        subPath: mysql 1
    - name: php
      image: php
      volumeMounts:
      - mountPath: /var/www/html
        name: site-data
        subPath: html 2
    volumes:
    - name: site-data
      persistentVolumeClaim:
        claimName: my-site-data

1: Databases are stored in the mysql folder.
2: HTML content is stored in the html folder.

7.4. Mapping volumes using projected volumes

A projected volume maps several existing volume sources into the same directory.

The following types of volume sources can be projected:

Secrets
Config Maps
Downward API

Note

All sources are required to be in the same namespace as the pod.

7.4.1. Understanding projected volumes

Projected volumes can map any combination of these volume sources into a single directory, allowing the user to:

automatically populate a single volume with the keys from multiple secrets, config maps, and with downward API information, so that I can synthesize a single directory with various sources of information;
populate a single volume with the keys from multiple secrets, config maps, and with downward API information, explicitly specifying paths for each item, so that I can have full control over the contents of that volume.

Important

When the RunAsUser permission is set in the security context of a Linux-based pod, the projected files have the correct permissions set, including container user ownership. However, when the Windows equivalent RunAsUsername permission is set in a Windows pod, the kubelet is unable to correctly set ownership on the files in the projected volume.

Therefore, the RunAsUsername permission set in the security context of a Windows pod is not honored for Windows projected volumes running in OpenShift Container Platform.

The following general scenarios show how you can use projected volumes.

Config map, secrets, Downward API.: Projected volumes allow you to deploy containers with configuration data that includes passwords. An application using these resources could be deploying Red Hat OpenStack Platform (RHOSP) on Kubernetes. The configuration data might have to be assembled differently depending on if the services are going to be used for production or for testing. If a pod is labeled with production or testing, the downward API selector metadata.labels can be used to produce the correct RHOSP configs.
Config map + secrets.: Projected volumes allow you to deploy containers involving configuration data and passwords. For example, you might execute a config map with some sensitive encrypted tasks that are decrypted using a vault password file.
ConfigMap + Downward API.: Projected volumes allow you to generate a config including the pod name (available via the metadata.name selector). This application can then pass the pod name along with requests to easily determine the source without using IP tracking.
Secrets + Downward API.: Projected volumes allow you to use a secret as a public key to encrypt the namespace of the pod (available via the metadata.namespace selector). This example allows the Operator to use the application to deliver the namespace information securely without using an encrypted transport.

7.4.1.1. Example Pod specs

The following are examples of Pod specs for creating projected volumes.

Pod with a secret, a Downward API, and a config map

apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: container-test
    image: busybox
    volumeMounts: 1
    - name: all-in-one
      mountPath: "/projected-volume"2
      readOnly: true 3
  volumes: 4
  - name: all-in-one 5
    projected:
      defaultMode: 0400 6
      sources:
      - secret:
          name: mysecret 7
          items:
            - key: username
              path: my-group/my-username 8
      - downwardAPI: 9
          items:
            - path: "labels"
              fieldRef:
                fieldPath: metadata.labels
            - path: "cpu_limit"
              resourceFieldRef:
                containerName: container-test
                resource: limits.cpu
      - configMap: 10
          name: myconfigmap
          items:
            - key: config
              path: my-group/my-config
              mode: 0777 11

1: Add a volumeMounts section for each container that needs the secret.
2: Specify a path to an unused directory where the secret will appear.
3: Set readOnly to true.
4: Add a volumes block to list each projected volume source.
5: Specify any name for the volume.
6: Set the execute permission on the files.
7: Add a secret. Enter the name of the secret object. Each secret you want to use must be listed.
8: Specify the path to the secrets file under the mountPath. Here, the secrets file is in /projected-volume/my-group/my-username.
9: Add a Downward API source.
10: Add a ConfigMap source.
11: Set the mode for the specific projection

Note

If there are multiple containers in the pod, each container needs a volumeMounts section, but only one volumes section is needed.

Pod with multiple secrets with a non-default permission mode set

apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: container-test
    image: busybox
    volumeMounts:
    - name: all-in-one
      mountPath: "/projected-volume"
      readOnly: true
  volumes:
  - name: all-in-one
    projected:
      defaultMode: 0755
      sources:
      - secret:
          name: mysecret
          items:
            - key: username
              path: my-group/my-username
      - secret:
          name: mysecret2
          items:
            - key: password
              path: my-group/my-password
              mode: 511

Note

The defaultMode can only be specified at the projected level and not for each volume source. However, as illustrated above, you can explicitly set the mode for each individual projection.

7.4.1.2. Pathing Considerations

Collisions Between Keys when Configured Paths are Identical

If you configure any keys with the same path, the pod spec will not be accepted as valid. In the following example, the specified path for mysecret and myconfigmap are the same:

apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: container-test
    image: busybox
    volumeMounts:
    - name: all-in-one
      mountPath: "/projected-volume"
      readOnly: true
  volumes:
  - name: all-in-one
    projected:
      sources:
      - secret:
          name: mysecret
          items:
            - key: username
              path: my-group/data
      - configMap:
          name: myconfigmap
          items:
            - key: config
              path: my-group/data

Consider the following situations related to the volume file paths.

Collisions Between Keys without Configured Paths: The only run-time validation that can occur is when all the paths are known at pod creation, similar to the above scenario. Otherwise, when a conflict occurs the most recent specified resource will overwrite anything preceding it (this is true for resources that are updated after pod creation as well).
Collisions when One Path is Explicit and the Other is Automatically Projected: In the event that there is a collision due to a user specified path matching data that is automatically projected, the latter resource will overwrite anything preceding it as before

7.4.2. Configuring a Projected Volume for a Pod

When creating projected volumes, consider the volume file path situations described in Understanding projected volumes.

The following example shows how to use a projected volume to mount an existing secret volume source. The steps can be used to create a user name and password secrets from local files. You then create a pod that runs one container, using a projected volume to mount the secrets into the same shared directory.

The user name and password values can be any valid string that is base64 encoded.

The following example shows admin in base64:

$ echo -n "admin" | base64

Example output

YWRtaW4=

The following example shows the password 1f2d1e2e67df in base64:

$ echo -n "1f2d1e2e67df" | base64

Example output

MWYyZDFlMmU2N2Rm

Procedure

To use a projected volume to mount an existing secret volume source.

Create the secret:

Create a YAML file similar to the following, replacing the password and user information as appropriate:

apiVersion: v1
kind: Secret
metadata:
  name: mysecret
type: Opaque
data:
  pass: MWYyZDFlMmU2N2Rm
  user: YWRtaW4=

Use the following command to create the secret:

$ oc create -f <secrets-filename>

For example:

$ oc create -f secret.yaml

Example output

secret "mysecret" created

You can check that the secret was created using the following commands:

$ oc get secret <secret-name>

For example:

$ oc get secret mysecret

Example output

NAME       TYPE      DATA      AGE
mysecret   Opaque    2         17h

$ oc get secret <secret-name> -o yaml

For example:

$ oc get secret mysecret -o yaml

apiVersion: v1
data:
  pass: MWYyZDFlMmU2N2Rm
  user: YWRtaW4=
kind: Secret
metadata:
  creationTimestamp: 2017-05-30T20:21:38Z
  name: mysecret
  namespace: default
  resourceVersion: "2107"
  selfLink: /api/v1/namespaces/default/secrets/mysecret
  uid: 959e0424-4575-11e7-9f97-fa163e4bd54c
type: Opaque

Create a pod with a projected volume.

Create a YAML file similar to the following, including a volumes section:

kind: Pod
metadata:
  name: test-projected-volume
spec:
  containers:
  - name: test-projected-volume
    image: busybox
    args:
    - sleep
    - "86400"
    volumeMounts:
    - name: all-in-one
      mountPath: "/projected-volume"
      readOnly: true
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
  volumes:
  - name: all-in-one
    projected:
      sources:
      - secret:
          name: mysecret 1

1: The name of the secret you created.

Create the pod from the configuration file:

$ oc create -f <your_yaml_file>.yaml

For example:

$ oc create -f secret-pod.yaml

Example output

pod "test-projected-volume" created

Verify that the pod container is running, and then watch for changes to the pod:

$ oc get pod <name>

For example:

$ oc get pod test-projected-volume

The output should appear similar to the following:

Example output

NAME                    READY     STATUS    RESTARTS   AGE
test-projected-volume   1/1       Running   0          14s

In another terminal, use the oc exec command to open a shell to the running container:
```
$ oc exec -it <pod> <command>
```
For example:
```
$ oc exec -it test-projected-volume -- /bin/sh
```

In your shell, verify that the projected-volumes directory contains your projected sources:

/ # ls

Example output

bin               home              root              tmp
dev               proc              run               usr
etc               projected-volume  sys               var

7.5. Allowing containers to consume API objects

The Downward API is a mechanism that allows containers to consume information about API objects without coupling to OpenShift Container Platform. Such information includes the pod’s name, namespace, and resource values. Containers can consume information from the downward API using environment variables or a volume plugin.

7.5.1. Expose pod information to Containers using the Downward API

The Downward API contains such information as the pod’s name, project, and resource values. Containers can consume information from the downward API using environment variables or a volume plugin.

Fields within the pod are selected using the FieldRef API type. FieldRef has two fields:

Field	Description
`fieldPath`	The path of the field to select, relative to the pod.
`apiVersion`	The API version to interpret the `fieldPath` selector within.

Currently, the valid selectors in the v1 API include:

Selector	Description
`metadata.name`	The pod’s name. This is supported in both environment variables and volumes.
`metadata.namespace`	The pod’s namespace.This is supported in both environment variables and volumes.
`metadata.labels`	The pod’s labels. This is only supported in volumes and not in environment variables.
`metadata.annotations`	The pod’s annotations. This is only supported in volumes and not in environment variables.
`status.podIP`	The pod’s IP. This is only supported in environment variables and not volumes.

The apiVersion field, if not specified, defaults to the API version of the enclosing pod template.

7.5.2. Understanding how to consume container values using the downward API

You containers can consume API values using environment variables or a volume plugin. Depending on the method you choose, containers can consume:

Pod name
Pod project/namespace
Pod annotations
Pod labels

Annotations and labels are available using only a volume plugin.

7.5.2.1. Consuming container values using environment variables

When using a container’s environment variables, use the EnvVar type’s valueFrom field (of type EnvVarSource) to specify that the variable’s value should come from a FieldRef source instead of the literal value specified by the value field.

Only constant attributes of the pod can be consumed this way, as environment variables cannot be updated once a process is started in a way that allows the process to be notified that the value of a variable has changed. The fields supported using environment variables are:

Pod name
Pod project/namespace

Procedure

Create a new pod spec that contains the environment variables you want the container to consume:

Create a pod.yaml file similar to the following:

apiVersion: v1
kind: Pod
metadata:
  name: dapi-env-test-pod
spec:
  containers:
    - name: env-test-container
      image: gcr.io/google_containers/busybox
      command: [ "/bin/sh", "-c", "env" ]
      env:
        - name: MY_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: MY_POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
  restartPolicy: Never
# ...

Create the pod from the pod.yaml file:
```
$ oc create -f pod.yaml
```

Verification

Check the container’s logs for the MY_POD_NAME and MY_POD_NAMESPACE values:
```
$ oc logs -p dapi-env-test-pod
```

7.5.2.2. Consuming container values using a volume plugin

You containers can consume API values using a volume plugin.

Containers can consume:

Pod name
Pod project/namespace
Pod annotations
Pod labels

Procedure

To use the volume plugin:

Create a new pod spec that contains the environment variables you want the container to consume:

Create a volume-pod.yaml file similar to the following:

kind: Pod
apiVersion: v1
metadata:
  labels:
    zone: us-east-coast
    cluster: downward-api-test-cluster1
    rack: rack-123
  name: dapi-volume-test-pod
  annotations:
    annotation1: "345"
    annotation2: "456"
spec:
  containers:
    - name: volume-test-container
      image: gcr.io/google_containers/busybox
      command: ["sh", "-c", "cat /tmp/etc/pod_labels /tmp/etc/pod_annotations"]
      volumeMounts:
        - name: podinfo
          mountPath: /tmp/etc
          readOnly: false
  volumes:
  - name: podinfo
    downwardAPI:
      defaultMode: 420
      items:
      - fieldRef:
          fieldPath: metadata.name
        path: pod_name
      - fieldRef:
          fieldPath: metadata.namespace
        path: pod_namespace
      - fieldRef:
          fieldPath: metadata.labels
        path: pod_labels
      - fieldRef:
          fieldPath: metadata.annotations
        path: pod_annotations
  restartPolicy: Never
# ...

Create the pod from the volume-pod.yaml file:
```
$ oc create -f volume-pod.yaml
```

Verification

Check the container’s logs and verify the presence of the configured fields:

$ oc logs -p dapi-volume-test-pod

Example output

cluster=downward-api-test-cluster1
rack=rack-123
zone=us-east-coast
annotation1=345
annotation2=456
kubernetes.io/config.source=api

7.5.3. Understanding how to consume container resources using the Downward API

When creating pods, you can use the Downward API to inject information about computing resource requests and limits so that image and application authors can correctly create an image for specific environments.

You can do this using environment variable or a volume plugin.

7.5.3.1. Consuming container resources using environment variables

When creating pods, you can use the Downward API to inject information about computing resource requests and limits using environment variables.

When creating the pod configuration, specify environment variables that correspond to the contents of the resources field in the spec.container field.

Note

If the resource limits are not included in the container configuration, the downward API defaults to the node’s CPU and memory allocatable values.

Procedure

Create a new pod spec that contains the resources you want to inject:

Create a pod.yaml file similar to the following:

apiVersion: v1
kind: Pod
metadata:
  name: dapi-env-test-pod
spec:
  containers:
    - name: test-container
      image: gcr.io/google_containers/busybox:1.24
      command: [ "/bin/sh", "-c", "env" ]
      resources:
        requests:
          memory: "32Mi"
          cpu: "125m"
        limits:
          memory: "64Mi"
          cpu: "250m"
      env:
        - name: MY_CPU_REQUEST
          valueFrom:
            resourceFieldRef:
              resource: requests.cpu
        - name: MY_CPU_LIMIT
          valueFrom:
            resourceFieldRef:
              resource: limits.cpu
        - name: MY_MEM_REQUEST
          valueFrom:
            resourceFieldRef:
              resource: requests.memory
        - name: MY_MEM_LIMIT
          valueFrom:
            resourceFieldRef:
              resource: limits.memory
# ...

Create the pod from the pod.yaml file:
```
$ oc create -f pod.yaml
```

7.5.3.2. Consuming container resources using a volume plugin

When creating pods, you can use the Downward API to inject information about computing resource requests and limits using a volume plugin.

When creating the pod configuration, use the spec.volumes.downwardAPI.items field to describe the desired resources that correspond to the spec.resources field.

Note

If the resource limits are not included in the container configuration, the Downward API defaults to the node’s CPU and memory allocatable values.

Procedure

Create a new pod spec that contains the resources you want to inject:

Create a pod.yaml file similar to the following:

apiVersion: v1
kind: Pod
metadata:
  name: dapi-env-test-pod
spec:
  containers:
    - name: client-container
      image: gcr.io/google_containers/busybox:1.24
      command: ["sh", "-c", "while true; do echo; if [[ -e /etc/cpu_limit ]]; then cat /etc/cpu_limit; fi; if [[ -e /etc/cpu_request ]]; then cat /etc/cpu_request; fi; if [[ -e /etc/mem_limit ]]; then cat /etc/mem_limit; fi; if [[ -e /etc/mem_request ]]; then cat /etc/mem_request; fi; sleep 5; done"]
      resources:
        requests:
          memory: "32Mi"
          cpu: "125m"
        limits:
          memory: "64Mi"
          cpu: "250m"
      volumeMounts:
        - name: podinfo
          mountPath: /etc
          readOnly: false
  volumes:
    - name: podinfo
      downwardAPI:
        items:
          - path: "cpu_limit"
            resourceFieldRef:
              containerName: client-container
              resource: limits.cpu
          - path: "cpu_request"
            resourceFieldRef:
              containerName: client-container
              resource: requests.cpu
          - path: "mem_limit"
            resourceFieldRef:
              containerName: client-container
              resource: limits.memory
          - path: "mem_request"
            resourceFieldRef:
              containerName: client-container
              resource: requests.memory
# ...

Create the pod from the volume-pod.yaml file:
```
$ oc create -f volume-pod.yaml
```

7.5.4. Consuming secrets using the Downward API

When creating pods, you can use the downward API to inject secrets so image and application authors can create an image for specific environments.

Procedure

Create a secret to inject:

Create a secret.yaml file similar to the following:

apiVersion: v1
kind: Secret
metadata:
  name: mysecret
data:
  password: <password>
  username: <username>
type: kubernetes.io/basic-auth

Create the secret object from the secret.yaml file:
```
$ oc create -f secret.yaml
```

Create a pod that references the username field from the above Secret object:

Create a pod.yaml file similar to the following:

apiVersion: v1
kind: Pod
metadata:
  name: dapi-env-test-pod
spec:
  containers:
    - name: env-test-container
      image: gcr.io/google_containers/busybox
      command: [ "/bin/sh", "-c", "env" ]
      env:
        - name: MY_SECRET_USERNAME
          valueFrom:
            secretKeyRef:
              name: mysecret
              key: username
  restartPolicy: Never
# ...

Create the pod from the pod.yaml file:
```
$ oc create -f pod.yaml
```

Verification

Check the container’s logs for the MY_SECRET_USERNAME value:
```
$ oc logs -p dapi-env-test-pod
```

7.5.5. Consuming configuration maps using the Downward API

When creating pods, you can use the Downward API to inject configuration map values so image and application authors can create an image for specific environments.

Procedure

Create a config map with the values to inject:
1. Create a configmap.yaml file similar to the following:
```
apiVersion: v1
kind: ConfigMap
metadata:
  name: myconfigmap
data:
  mykey: myvalue
```
2. Create the config map from the configmap.yaml file:
```
$ oc create -f configmap.yaml
```

Create a pod that references the above config map:

Create a pod.yaml file similar to the following:

apiVersion: v1
kind: Pod
metadata:
  name: dapi-env-test-pod
spec:
  containers:
    - name: env-test-container
      image: gcr.io/google_containers/busybox
      command: [ "/bin/sh", "-c", "env" ]
      env:
        - name: MY_CONFIGMAP_VALUE
          valueFrom:
            configMapKeyRef:
              name: myconfigmap
              key: mykey
  restartPolicy: Always
# ...

Create the pod from the pod.yaml file:
```
$ oc create -f pod.yaml
```

Verification

Check the container’s logs for the MY_CONFIGMAP_VALUE value:
```
$ oc logs -p dapi-env-test-pod
```

7.5.6. Referencing environment variables

When creating pods, you can reference the value of a previously defined environment variable by using the $() syntax. If the environment variable reference can not be resolved, the value will be left as the provided string.

Procedure

Create a pod that references an existing environment variable:

Create a pod.yaml file similar to the following:

apiVersion: v1
kind: Pod
metadata:
  name: dapi-env-test-pod
spec:
  containers:
    - name: env-test-container
      image: gcr.io/google_containers/busybox
      command: [ "/bin/sh", "-c", "env" ]
      env:
        - name: MY_EXISTING_ENV
          value: my_value
        - name: MY_ENV_VAR_REF_ENV
          value: $(MY_EXISTING_ENV)
  restartPolicy: Never
# ...

Create the pod from the pod.yaml file:
```
$ oc create -f pod.yaml
```

Verification

Check the container’s logs for the MY_ENV_VAR_REF_ENV value:
```
$ oc logs -p dapi-env-test-pod
```

7.5.7. Escaping environment variable references

When creating a pod, you can escape an environment variable reference by using a double dollar sign. The value will then be set to a single dollar sign version of the provided value.

Procedure

Create a pod that references an existing environment variable:

Create a pod.yaml file similar to the following:

apiVersion: v1
kind: Pod
metadata:
  name: dapi-env-test-pod
spec:
  containers:
    - name: env-test-container
      image: gcr.io/google_containers/busybox
      command: [ "/bin/sh", "-c", "env" ]
      env:
        - name: MY_NEW_ENV
          value: $$(SOME_OTHER_ENV)
  restartPolicy: Never
# ...

Create the pod from the pod.yaml file:
```
$ oc create -f pod.yaml
```

Verification

Check the container’s logs for the MY_NEW_ENV value:
```
$ oc logs -p dapi-env-test-pod
```

7.6. Copying files to or from an OpenShift Container Platform container

You can use the CLI to copy local files to or from a remote directory in a container using the rsync command.

7.6.1. Understanding how to copy files

The oc rsync command, or remote sync, is a useful tool for copying database archives to and from your pods for backup and restore purposes. You can also use oc rsync to copy source code changes into a running pod for development debugging, when the running pod supports hot reload of source files.

$ oc rsync <source> <destination> [-c <container>]

7.6.1.1. Requirements

Specifying the Copy Source

The source argument of the oc rsync command must point to either a local directory or a pod directory. Individual files are not supported.

When specifying a pod directory the directory name must be prefixed with the pod name:

<pod name>:<dir>

If the directory name ends in a path separator (/), only the contents of the directory are copied to the destination. Otherwise, the directory and its contents are copied to the destination.

Specifying the Copy Destination

The destination argument of the oc rsync command must point to a directory. If the directory does not exist, but rsync is used for copy, the directory is created for you.

Deleting Files at the Destination

The --delete flag may be used to delete any files in the remote directory that are not in the local directory.

Continuous Syncing on File Change

Using the --watch option causes the command to monitor the source path for any file system changes, and synchronizes changes when they occur. With this argument, the command runs forever.

Synchronization occurs after short quiet periods to ensure a rapidly changing file system does not result in continuous synchronization calls.

When using the --watch option, the behavior is effectively the same as manually invoking oc rsync repeatedly, including any arguments normally passed to oc rsync. Therefore, you can control the behavior via the same flags used with manual invocations of oc rsync, such as --delete.

7.6.2. Copying files to and from containers

Support for copying local files to or from a container is built into the CLI.

Prerequisites

When working with oc rsync, note the following:

rsync must be installed. The oc rsync command uses the local rsync tool, if present on the client machine and the remote container.
If rsync is not found locally or in the remote container, a tar archive is created locally and sent to the container where the tar utility is used to extract the files. If tar is not available in the remote container, the copy will fail.
The tar copy method does not provide the same functionality as oc rsync. For example, oc rsync creates the destination directory if it does not exist and only sends files that are different between the source and the destination.
Note
In Windows, the cwRsync client should be installed and added to the PATH for use with the oc rsync command.

Procedure

To copy a local directory to a pod directory:

$ oc rsync <local-dir> <pod-name>:/<remote-dir> -c <container-name>

For example:

$ oc rsync /home/user/source devpod1234:/src -c user-container

To copy a pod directory to a local directory:

$ oc rsync devpod1234:/src /home/user/source

Example output

$ oc rsync devpod1234:/src/status.txt /home/user/

7.6.3. Using advanced Rsync features

The oc rsync command exposes fewer command line options than standard rsync. In the case that you want to use a standard rsync command line option that is not available in oc rsync, for example the --exclude-from=FILE option, it might be possible to use standard rsync 's --rsh (-e) option or RSYNC_RSH environment variable as a workaround, as follows:

$ rsync --rsh='oc rsh' --exclude-from=<file_name> <local-dir> <pod-name>:/<remote-dir>

or:

Export the RSYNC_RSH variable:

$ export RSYNC_RSH='oc rsh'

Then, run the rsync command:

$ rsync --exclude-from=<file_name> <local-dir> <pod-name>:/<remote-dir>

Both of the above examples configure standard rsync to use oc rsh as its remote shell program to enable it to connect to the remote pod, and are an alternative to running oc rsync.

7.7. Executing remote commands in an OpenShift Container Platform container

You can use the CLI to execute remote commands in an OpenShift Container Platform container.

7.7.1. Executing remote commands in containers

Support for remote container command execution is built into the CLI.

Procedure

To run a command in a container:

$ oc exec <pod> [-c <container>] -- <command> [<arg_1> ... <arg_n>]

For example:

$ oc exec mypod date

Example output

Thu Apr  9 02:21:53 UTC 2015

Important

For security purposes, the oc exec command does not work when accessing privileged containers except when the command is executed by a cluster-admin user.

7.7.2. Protocol for initiating a remote command from a client

Clients initiate the execution of a remote command in a container by issuing a request to the Kubernetes API server:

/proxy/nodes/<node_name>/exec/<namespace>/<pod>/<container>?command=<command>

In the above URL:

<node_name> is the FQDN of the node.
<namespace> is the project of the target pod.
<pod> is the name of the target pod.
<container> is the name of the target container.
<command> is the desired command to be executed.

For example:

/proxy/nodes/node123.openshift.com/exec/myns/mypod/mycontainer?command=date

Additionally, the client can add parameters to the request to indicate if:

the client should send input to the remote container’s command (stdin).
the client’s terminal is a TTY.
the remote container’s command should send output from stdout to the client.
the remote container’s command should send output from stderr to the client.

After sending an exec request to the API server, the client upgrades the connection to one that supports multiplexed streams; the current implementation uses HTTP/2.

The client creates one stream each for stdin, stdout, and stderr. To distinguish among the streams, the client sets the streamType header on the stream to one of stdin, stdout, or stderr.

The client closes all streams, the upgraded connection, and the underlying connection when it is finished with the remote command execution request.

7.8. Using port forwarding to access applications in a container

OpenShift Container Platform supports port forwarding to pods.

7.8.1. Understanding port forwarding

You can use the CLI to forward one or more local ports to a pod. This allows you to listen on a given or random port locally, and have data forwarded to and from given ports in the pod.

Support for port forwarding is built into the CLI:

$ oc port-forward <pod> [<local_port>:]<remote_port> [...[<local_port_n>:]<remote_port_n>]

The CLI listens on each local port specified by the user, forwarding using the protocol described below.

Ports may be specified using the following formats:

`5000`	The client listens on port 5000 locally and forwards to 5000 in the pod.
`6000:5000`	The client listens on port 6000 locally and forwards to 5000 in the pod.
`:5000` or `0:5000`	The client selects a free local port and forwards to 5000 in the pod.

OpenShift Container Platform handles port-forward requests from clients. Upon receiving a request, OpenShift Container Platform upgrades the response and waits for the client to create port-forwarding streams. When OpenShift Container Platform receives a new stream, it copies data between the stream and the pod’s port.

Architecturally, there are options for forwarding to a pod’s port. The supported OpenShift Container Platform implementation invokes nsenter directly on the node host to enter the pod’s network namespace, then invokes socat to copy data between the stream and the pod’s port. However, a custom implementation could include running a helper pod that then runs nsenter and socat, so that those binaries are not required to be installed on the host.

7.8.2. Using port forwarding

You can use the CLI to port-forward one or more local ports to a pod.

Procedure

Use the following command to listen on the specified port in a pod:

$ oc port-forward <pod> [<local_port>:]<remote_port> [...[<local_port_n>:]<remote_port_n>]

For example:

Use the following command to listen on ports 5000 and 6000 locally and forward data to and from ports 5000 and 6000 in the pod:

$ oc port-forward <pod> 5000 6000

Example output

Forwarding from 127.0.0.1:5000 -> 5000
Forwarding from [::1]:5000 -> 5000
Forwarding from 127.0.0.1:6000 -> 6000
Forwarding from [::1]:6000 -> 6000

Use the following command to listen on port 8888 locally and forward to 5000 in the pod:

$ oc port-forward <pod> 8888:5000

Example output

Forwarding from 127.0.0.1:8888 -> 5000
Forwarding from [::1]:8888 -> 5000

Use the following command to listen on a free port locally and forward to 5000 in the pod:

$ oc port-forward <pod> :5000

Example output

Forwarding from 127.0.0.1:42390 -> 5000
Forwarding from [::1]:42390 -> 5000

Or:

$ oc port-forward <pod> 0:5000

7.8.3. Protocol for initiating port forwarding from a client

Clients initiate port forwarding to a pod by issuing a request to the Kubernetes API server:

/proxy/nodes/<node_name>/portForward/<namespace>/<pod>

In the above URL:

<node_name> is the FQDN of the node.
<namespace> is the namespace of the target pod.
<pod> is the name of the target pod.

For example:

/proxy/nodes/node123.openshift.com/portForward/myns/mypod

After sending a port forward request to the API server, the client upgrades the connection to one that supports multiplexed streams; the current implementation uses Hyptertext Transfer Protocol Version 2 (HTTP/2).

The client creates a stream with the port header containing the target port in the pod. All data written to the stream is delivered via the kubelet to the target pod and port. Similarly, all data sent from the pod for that forwarded connection is delivered back to the same stream in the client.

The client closes all streams, the upgraded connection, and the underlying connection when it is finished with the port forwarding request.

7.9. Using sysctls in containers

Sysctl settings are exposed via Kubernetes, allowing users to modify certain kernel parameters at runtime for namespaces within a container. Only sysctls that are namespaced can be set independently on pods. If a sysctl is not namespaced, called node-level, you must use another method of setting the sysctl, such as the Node Tuning Operator. Moreover, only those sysctls considered safe are whitelisted by default; you can manually enable other unsafe sysctls on the node to be available to the user.

7.9.1. About sysctls

In Linux, the sysctl interface allows an administrator to modify kernel parameters at runtime. Parameters are available via the /proc/sys/ virtual process file system. The parameters cover various subsystems, such as:

kernel (common prefix: kernel.)
networking (common prefix: net.)
virtual memory (common prefix: vm.)
MDADM (common prefix: dev.)

More subsystems are described in Kernel documentation. To get a list of all parameters, run:

$ sudo sysctl -a

7.9.1.1. Namespaced versus node-level sysctls

A number of sysctls are namespaced in the Linux kernels. This means that you can set them independently for each pod on a node. Being namespaced is a requirement for sysctls to be accessible in a pod context within Kubernetes.

The following sysctls are known to be namespaced:

kernel.shm*
kernel.msg*
kernel.sem
fs.mqueue.*

Additionally, most of the sysctls in the net.* group are known to be namespaced. Their namespace adoption differs based on the kernel version and distributor.

Sysctls that are not namespaced are called node-level and must be set manually by the cluster administrator, either by means of the underlying Linux distribution of the nodes, such as by modifying the /etc/sysctls.conf file, or by using a daemon set with privileged containers. You can use the Node Tuning Operator to set node-level sysctls.

Note

Consider marking nodes with special sysctls as tainted. Only schedule pods onto them that need those sysctl settings. Use the taints and toleration feature to mark the nodes.

7.9.1.2. Safe versus unsafe sysctls

Sysctls are grouped into safe and unsafe sysctls.

For a sysctl to be considered safe, it must use proper namespacing and must be properly isolated between pods on the same node. This means that if you set a sysctl for one pod it must not:

Influence any other pod on the node
Harm the node’s health
Gain CPU or memory resources outside of the resource limits of a pod

OpenShift Container Platform supports, or whitelists, the following sysctls in the safe set:

kernel.shm_rmid_forced
net.ipv4.ip_local_port_range
net.ipv4.tcp_syncookies
net.ipv4.ping_group_range

All safe sysctls are enabled by default. You can use a sysctl in a pod by modifying the Pod spec.

Any sysctl not whitelisted by OpenShift Container Platform is considered unsafe for OpenShift Container Platform. Note that being namespaced alone is not sufficient for the sysctl to be considered safe.

All unsafe sysctls are disabled by default, and the cluster administrator must manually enable them on a per-node basis. Pods with disabled unsafe sysctls are scheduled but do not launch.

$ oc get pod

Example output

NAME        READY   STATUS            RESTARTS   AGE
hello-pod   0/1     SysctlForbidden   0          14s

7.9.2. Setting sysctls for a pod

You can set sysctls on pods using the pod’s securityContext. The securityContext applies to all containers in the same pod.

Safe sysctls are allowed by default. A pod with unsafe sysctls fails to launch on any node unless the cluster administrator explicitly enables unsafe sysctls for that node. As with node-level sysctls, use the taints and toleration feature or labels on nodes to schedule those pods onto the right nodes.

The following example uses the pod securityContext to set a safe sysctl kernel.shm_rmid_forced and two unsafe sysctls, net.core.somaxconn and kernel.msgmax. There is no distinction between safe and unsafe sysctls in the specification.

Warning

To avoid destabilizing your operating system, modify sysctl parameters only after you understand their effects.

Procedure

To use safe and unsafe sysctls:

Modify the YAML file that defines the pod and add the securityContext spec, as shown in the following example:

apiVersion: v1
kind: Pod
metadata:
  name: sysctl-example
spec:
  securityContext:
    sysctls:
    - name: kernel.shm_rmid_forced
      value: "0"
    - name: net.core.somaxconn
      value: "1024"
    - name: kernel.msgmax
      value: "65536"
  ...

Create the pod:

$ oc apply -f <file-name>.yaml

If the unsafe sysctls are not allowed for the node, the pod is scheduled, but does not deploy:

$ oc get pod

Example output

NAME        READY   STATUS            RESTARTS   AGE
hello-pod   0/1     SysctlForbidden   0          14s

7.9.3. Enabling unsafe sysctls

A cluster administrator can allow certain unsafe sysctls for very special situations such as high performance or real-time application tuning.

If you want to use unsafe sysctls, a cluster administrator must enable them individually for a specific type of node. The sysctls must be namespaced.

You can further control which sysctls are set in pods by specifying lists of sysctls or sysctl patterns in the allowedUnsafeSysctls field of the Security Context Constraints.

The allowedUnsafeSysctls option controls specific needs such as high performance or real-time application tuning.

Warning

Due to their nature of being unsafe, the use of unsafe sysctls is at-your-own-risk and can lead to severe problems, such as improper behavior of containers, resource shortage, or breaking a node.

Procedure

Add a label to the machine config pool where the containers where containers with the unsafe sysctls will run:

$ oc edit machineconfigpool worker

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  creationTimestamp: 2019-02-08T14:52:39Z
  generation: 1
  labels:
    custom-kubelet: sysctl 1

1: Add a key: pair label.

Create a KubeletConfig custom resource (CR):

apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: custom-kubelet
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: sysctl 1
  kubeletConfig:
    allowedUnsafeSysctls: 2
      - "kernel.msg*"
      - "net.core.somaxconn"

1: Specify the label from the machine config pool.
2: List the unsafe sysctls you want to allow.

Create the object:
```
$ oc apply -f set-sysctl-worker.yaml
```
A new MachineConfig object named in the 99-worker-XXXXXX-XXXXX-XXXX-XXXXX-kubelet format is created.

Wait for the cluster to reboot using the machineconfigpool object status fields:

For example:

status:
  conditions:
    - lastTransitionTime: '2019-08-11T15:32:00Z'
      message: >-
        All nodes are updating to
        rendered-worker-ccbfb5d2838d65013ab36300b7b3dc13
      reason: ''
      status: 'True'
      type: Updating

A message similar to the following appears when the cluster is ready:

   - lastTransitionTime: '2019-08-11T16:00:00Z'
      message: >-
        All nodes are updated with
        rendered-worker-ccbfb5d2838d65013ab36300b7b3dc13
      reason: ''
      status: 'True'
      type: Updated

When the cluster is ready, check for the merged KubeletConfig object in the new MachineConfig object:

$ oc get machineconfig 99-worker-XXXXXX-XXXXX-XXXX-XXXXX-kubelet -o json | grep ownerReference -A7

        "ownerReferences": [
            {
                "apiVersion": "machineconfiguration.openshift.io/v1",
                "blockOwnerDeletion": true,
                "controller": true,
                "kind": "KubeletConfig",
                "name": "custom-kubelet",
                "uid": "3f64a766-bae8-11e9-abe8-0a1a2a4813f2"
            }
        ]

You can now add unsafe sysctls to pods as needed.