Chapter 7. Working with containers


7.1. Understanding Containers

The basic units of OpenShift Container Platform applications are called containers. Linux container technologies are lightweight mechanisms for isolating running processes so that they are limited to interacting with only their designated resources.

Many application instances can be running in containers on a single host without visibility into each others' processes, files, network, and so on. Typically, each container provides a single service (often called a "micro-service"), such as a web server or a database, though containers can be used for arbitrary workloads.

The Linux kernel has been incorporating capabilities for container technologies for years. OpenShift Container Platform and Kubernetes add the ability to orchestrate containers across multi-host installations.

About containers and RHEL kernel memory

Due to Red Hat Enterprise Linux (RHEL) behavior, a container on a node with high CPU usage might seem to consume more memory than expected. The higher memory consumption could be caused by the kmem_cache in the RHEL kernel. The RHEL kernel creates a kmem_cache for each cgroup. For added performance, the kmem_cache contains a cpu_cache and a node cache for any NUMA nodes. These caches all consume kernel memory.

The amount of memory stored in those caches is proportional to the number of CPUs that the system uses. As a result, a higher number of CPUs results in a greater amount of kernel memory being held in these caches. Higher amounts of kernel memory in these caches can cause OpenShift Container Platform containers to exceed the configured memory limits, resulting in the container being killed.

To avoid losing containers due to kernel memory issues, ensure that the containers request sufficient memory. You can use the following formula to estimate the amount of memory consumed by the kmem_cache, where nproc is the number of available processing units, as reported by the nproc command. The lower limit of container requests should be this value plus the container memory requirements:

$(nproc) X 1/2 MiB
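
As a rough illustration of the formula above, the following shell sketch multiplies a CPU count by 512 KiB (1/2 MiB). The function name kmem_cache_kib is hypothetical and used only for this example; it is not an oc or RHEL command.

```shell
# Estimate kmem_cache overhead in KiB: <cpus> x 1/2 MiB (512 KiB).
# kmem_cache_kib is a hypothetical helper name used for illustration.
kmem_cache_kib() {
  echo $(( $1 * 512 ))
}

# Example: a node that reports 8 processing units.
kmem_cache_kib 8   # prints 4096 (KiB), that is, 4 MiB
```

On a live node, you could pass the value reported by nproc, as in kmem_cache_kib "$(nproc)", and add the result to the application's own memory requirement when choosing the container memory request.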

7.2. Using Init Containers to perform tasks before a pod is deployed

OpenShift Container Platform provides init containers, which are specialized containers that run before application containers and can contain utilities or setup scripts not present in an app image.

7.2.1. Understanding Init Containers

You can use an Init Container resource to perform tasks before the rest of a pod is deployed.

A pod can have Init Containers in addition to application containers. Init containers allow you to reorganize setup scripts and binding code.

An Init Container can:

  • Contain and run utilities that are not desirable to include in the app Container image for security reasons.
  • Contain utilities or custom code for setup that is not present in an app image. For example, there is no requirement to make an image FROM another image just to use a tool like sed, awk, python, or dig during setup.
  • Use Linux namespaces so that they have different filesystem views from app containers, such as access to secrets that application containers are not able to access.

Each Init Container must complete successfully before the next one is started. Init Containers therefore provide an easy way to block or delay the startup of app containers until a set of preconditions is met.

For example, the following are some ways you can use Init Containers:

  • Wait for a service to be created with a shell command like:

    for i in {1..100}; do sleep 1; if dig myservice; then exit 0; fi; done; exit 1
  • Register this pod with a remote server from the downward API with a command like:

    $ curl -X POST http://$MANAGEMENT_SERVICE_HOST:$MANAGEMENT_SERVICE_PORT/register -d 'instance=$()&ip=$()'
  • Wait for some time before starting the app Container with a command like sleep 60.
  • Clone a git repository into a volume.
  • Place values into a configuration file and run a template tool to dynamically generate a configuration file for the main app Container. For example, place the POD_IP value in a configuration and generate the main app configuration file using Jinja.

See the Kubernetes documentation for more information.
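
The wait-for-a-service pattern listed above can be sketched as a small POSIX shell function, which also avoids the {1..100} brace expansion that only some shells support. The function name wait_for and the WAIT_INTERVAL variable are hypothetical names chosen for this illustration.

```shell
# Retry a command up to <attempts> times, pausing between tries.
# wait_for and WAIT_INTERVAL are illustrative names, not part of any image.
wait_for() {
  attempts=$1
  shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Succeed as soon as the probe command exits with status 0.
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "${WAIT_INTERVAL:-1}"
  done
  # Every attempt failed, so report failure.
  return 1
}
```

An Init Container command could then call, for example, wait_for 100 dig myservice, which mirrors the loop shown in the list above.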

7.2.2. Creating Init Containers

The following example outlines a simple pod that has two Init Containers. The first waits for myservice and the second waits for mydb. After both Init Containers complete, the app container starts.

Procedure

  1. Create the pod for the Init Container:

    1. Create a YAML file similar to the following:

      apiVersion: v1
      kind: Pod
      metadata:
        name: myapp-pod
        labels:
          app: myapp
      spec:
        containers:
        - name: myapp-container
          image: registry.access.redhat.com/ubi8/ubi:latest
          command: ['sh', '-c', 'echo The app is running! && sleep 3600']
        initContainers:
        - name: init-myservice
          image: registry.access.redhat.com/ubi8/ubi:latest
          command: ['sh', '-c', 'until getent hosts myservice; do echo waiting for myservice; sleep 2; done;']
        - name: init-mydb
          image: registry.access.redhat.com/ubi8/ubi:latest
          command: ['sh', '-c', 'until getent hosts mydb; do echo waiting for mydb; sleep 2; done;']
      # ...
    2. Create the pod:

      $ oc create -f myapp.yaml
    3. View the status of the pod:

      $ oc get pods

      Example output

      NAME                          READY     STATUS              RESTARTS   AGE
      myapp-pod                     0/1       Init:0/2            0          5s

      The pod status, Init:0/2, indicates it is waiting for the two services.

  2. Create the myservice service:

    1. Create a YAML file similar to the following:

      kind: Service
      apiVersion: v1
      metadata:
        name: myservice
      spec:
        ports:
        - protocol: TCP
          port: 80
          targetPort: 9376
    2. Create the service:

      $ oc create -f myservice.yaml
    3. View the status of the pod:

      $ oc get pods

      Example output

      NAME                          READY     STATUS              RESTARTS   AGE
      myapp-pod                     0/1       Init:1/2            0          5s

      The pod status, Init:1/2, indicates it is waiting for one service, in this case the mydb service.

  3. Create the mydb service:

    1. Create a YAML file similar to the following:

      kind: Service
      apiVersion: v1
      metadata:
        name: mydb
      spec:
        ports:
        - protocol: TCP
          port: 80
          targetPort: 9377
    2. Create the service:

      $ oc create -f mydb.yaml
    3. View the status of the pod:

      $ oc get pods

      Example output

      NAME                          READY     STATUS              RESTARTS   AGE
      myapp-pod                     1/1       Running             0          2m

      The pod status indicates that it is no longer waiting for the services and is running.

7.3. Using volumes to persist container data

Files in a container are ephemeral. As such, when a container crashes or stops, the data is lost. You can use volumes to persist the data used by the containers in a pod. A volume is a directory, accessible to the containers in a pod, where data is stored for the life of the pod.

7.3.1. Understanding volumes

Volumes are mounted file systems available to pods and their containers which may be backed by a number of host-local or network attached storage endpoints. Containers are not persistent by default; on restart, their contents are cleared.

To ensure that the file system on the volume contains no errors and, if errors are present, to repair them when possible, OpenShift Container Platform invokes the fsck utility prior to the mount utility. This occurs when either adding a volume or updating an existing volume.

The simplest volume type is emptyDir, which is a temporary directory on a single machine. Administrators may also allow you to request a persistent volume that is automatically attached to your pods.

Note

emptyDir volume storage may be restricted by a quota based on the pod’s FSGroup, if the FSGroup parameter is enabled by your cluster administrator.

7.3.2. Working with volumes using the OpenShift Container Platform CLI

You can use the CLI command oc set volume to add and remove volumes and volume mounts for any object that has a pod template, such as replication controllers or deployment configs. You can also list volumes in pods or any object that has a pod template.

The oc set volume command uses the following general syntax:

$ oc set volume <object_selection> <operation> <mandatory_parameters> <options>

Object selection
Specify one of the following for the object_selection parameter in the oc set volume command:

Table 7.1. Object Selection

Syntax | Description | Example
<object_type> <name> | Selects <name> of type <object_type>. | deploymentConfig registry
<object_type>/<name> | Selects <name> of type <object_type>. | deploymentConfig/registry
<object_type> --selector=<object_label_selector> | Selects resources of type <object_type> that match the given label selector. | deploymentConfig --selector="name=registry"
<object_type> --all | Selects all resources of type <object_type>. | deploymentConfig --all
-f or --filename=<file_name> | File name, directory, or URL to file to use to edit the resource. | -f registry-deployment-config.json

Operation
Specify --add or --remove for the operation parameter in the oc set volume command.
Mandatory parameters
Any mandatory parameters are specific to the selected operation and are discussed in later sections.
Options
Any options are specific to the selected operation and are discussed in later sections.

7.3.3. Listing volumes and volume mounts in a pod

You can list volumes and volume mounts in pods or pod templates:

Procedure

To list volumes:

$ oc set volume <object_type>/<name> [options]

List volume supported options:

Option | Description | Default
--name | Name of the volume. |
-c, --containers | Select containers by name. It can also take the wildcard '*' that matches any character. | '*'

For example:

  • To list all volumes for pod p1:

    $ oc set volume pod/p1
  • To list volume v1 defined on all deployment configs:

    $ oc set volume dc --all --name=v1

7.3.4. Adding volumes to a pod

You can add volumes and volume mounts to a pod.

Procedure

To add a volume, a volume mount, or both to pod templates:

$ oc set volume <object_type>/<name> --add [options]

Table 7.2. Supported Options for Adding Volumes

Option | Description | Default
--name | Name of the volume. | Automatically generated, if not specified.
-t, --type | Name of the volume source. Supported values: emptyDir, hostPath, secret, configmap, persistentVolumeClaim or projected. | emptyDir
-c, --containers | Select containers by name. It can also take the wildcard '*' that matches any character. | '*'
-m, --mount-path | Mount path inside the selected containers. Do not mount to the container root, /, or any path that is the same in the host and the container. This can corrupt your host system if the container is sufficiently privileged, such as the host /dev/pts files. It is safe to mount the host by using /host. |
--path | Host path. Mandatory parameter for --type=hostPath. Do not mount to the container root, /, or any path that is the same in the host and the container. This can corrupt your host system if the container is sufficiently privileged, such as the host /dev/pts files. It is safe to mount the host by using /host. |
--secret-name | Name of the secret. Mandatory parameter for --type=secret. |
--configmap-name | Name of the configmap. Mandatory parameter for --type=configmap. |
--claim-name | Name of the persistent volume claim. Mandatory parameter for --type=persistentVolumeClaim. |
--source | Details of volume source as a JSON string. Recommended if the desired volume source is not supported by --type. |
-o, --output | Display the modified objects instead of updating them on the server. Supported values: json, yaml. |
--output-version | Output the modified objects with the given version. | api-version

For example:

  • To add a new volume source emptyDir to the registry DeploymentConfig object:

    $ oc set volume dc/registry --add
    Tip

    You can alternatively apply the following YAML to add the volume:

    Example 7.1. Sample deployment config with an added volume

    kind: DeploymentConfig
    apiVersion: apps.openshift.io/v1
    metadata:
      name: registry
      namespace: registry
    spec:
      replicas: 3
      selector:
        app: httpd
      template:
        metadata:
          labels:
            app: httpd
        spec:
          volumes: # (1)
            - name: volume-pppsw
              emptyDir: {}
          containers:
            - name: httpd
              image: >-
                image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
              ports:
                - containerPort: 8080
                  protocol: TCP

    (1) Add the volume source emptyDir.
  • To add volume v1 with secret secret1 for replication controller r1 and mount inside the containers at /data:

    $ oc set volume rc/r1 --add --name=v1 --type=secret --secret-name='secret1' --mount-path=/data
    Tip

    You can alternatively apply the following YAML to add the volume:

    Example 7.2. Sample replication controller with added volume and secret

    kind: ReplicationController
    apiVersion: v1
    metadata:
      name: example-1
      namespace: example
    spec:
      replicas: 0
      selector:
        app: httpd
        deployment: example-1
        deploymentconfig: example
      template:
        metadata:
          creationTimestamp: null
          labels:
            app: httpd
            deployment: example-1
            deploymentconfig: example
        spec:
          volumes: # (1)
            - name: v1
              secret:
                secretName: secret1
                defaultMode: 420
          containers:
            - name: httpd
              image: >-
                image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
              volumeMounts: # (2)
                - name: v1
                  mountPath: /data

    (1) Add the volume and secret.
    (2) Add the container mount path.
  • To add existing persistent volume v1 with claim name pvc1 to deployment configuration dc.json on disk, mount the volume on container c1 at /data, and update the DeploymentConfig object on the server:

    $ oc set volume -f dc.json --add --name=v1 --type=persistentVolumeClaim \
      --claim-name=pvc1 --mount-path=/data --containers=c1
    Tip

    You can alternatively apply the following YAML to add the volume:

    Example 7.3. Sample deployment config with persistent volume added

    kind: DeploymentConfig
    apiVersion: apps.openshift.io/v1
    metadata:
      name: example
      namespace: example
    spec:
      replicas: 3
      selector:
        app: httpd
      template:
        metadata:
          labels:
            app: httpd
        spec:
          volumes:
            - name: volume-pppsw
              emptyDir: {}
            - name: v1 # (1)
              persistentVolumeClaim:
                claimName: pvc1
          containers:
            - name: httpd
              image: >-
                image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
              ports:
                - containerPort: 8080
                  protocol: TCP
              volumeMounts: # (2)
                - name: v1
                  mountPath: /data

    (1) Add the persistent volume claim named pvc1.
    (2) Add the container mount path.
  • To add a volume v1 based on Git repository https://github.com/namespace1/project1 with revision 5125c45f9f563 for all replication controllers:

    $ oc set volume rc --all --add --name=v1 \
      --source='{"gitRepo": {
                    "repository": "https://github.com/namespace1/project1",
                    "revision": "5125c45f9f563"
                }}'

7.3.5. Updating volumes and volume mounts in a pod

You can modify the volumes and volume mounts in a pod.

Procedure

To update existing volumes, use the --overwrite option:

$ oc set volume <object_type>/<name> --add --overwrite [options]

For example:

  • To replace existing volume v1 for replication controller r1 with existing persistent volume claim pvc1:

    $ oc set volume rc/r1 --add --overwrite --name=v1 --type=persistentVolumeClaim --claim-name=pvc1
    Tip

    You can alternatively apply the following YAML to replace the volume:

    Example 7.4. Sample replication controller with persistent volume claim named pvc1

    kind: ReplicationController
    apiVersion: v1
    metadata:
      name: example-1
      namespace: example
    spec:
      replicas: 0
      selector:
        app: httpd
        deployment: example-1
        deploymentconfig: example
      template:
        metadata:
          labels:
            app: httpd
            deployment: example-1
            deploymentconfig: example
        spec:
          volumes:
            - name: v1 # (1)
              persistentVolumeClaim:
                claimName: pvc1
          containers:
            - name: httpd
              image: >-
                image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
              ports:
                - containerPort: 8080
                  protocol: TCP
              volumeMounts:
                - name: v1
                  mountPath: /data

    (1) Set persistent volume claim to pvc1.
  • To change the DeploymentConfig object d1 mount point to /opt for volume v1:

    $ oc set volume dc/d1 --add --overwrite --name=v1 --mount-path=/opt
    Tip

    You can alternatively apply the following YAML to change the mount point:

    Example 7.5. Sample deployment config with mount point set to /opt

    kind: DeploymentConfig
    apiVersion: apps.openshift.io/v1
    metadata:
      name: example
      namespace: example
    spec:
      replicas: 3
      selector:
        app: httpd
      template:
        metadata:
          labels:
            app: httpd
        spec:
          volumes:
            - name: volume-pppsw
              emptyDir: {}
            - name: v2
              persistentVolumeClaim:
                claimName: pvc1
            - name: v1
              persistentVolumeClaim:
                claimName: pvc1
          containers:
            - name: httpd
              image: >-
                image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
              ports:
                - containerPort: 8080
                  protocol: TCP
              volumeMounts: # (1)
                - name: v1
                  mountPath: /opt

    (1) Set the mount point to /opt.

7.3.6. Removing volumes and volume mounts from a pod

You can remove a volume or volume mount from a pod.

Procedure

To remove a volume from pod templates:

$ oc set volume <object_type>/<name> --remove [options]

Table 7.3. Supported options for removing volumes

Option | Description | Default
--name | Name of the volume. |
-c, --containers | Select containers by name. It can also take the wildcard '*' that matches any character. | '*'
--confirm | Indicate that you want to remove multiple volumes at once. |
-o, --output | Display the modified objects instead of updating them on the server. Supported values: json, yaml. |
--output-version | Output the modified objects with the given version. | api-version

For example:

  • To remove a volume v1 from the DeploymentConfig object d1:

    $ oc set volume dc/d1 --remove --name=v1
  • To unmount volume v1 from container c1 for the DeploymentConfig object d1 and remove the volume v1 if it is not referenced by any containers on d1:

    $ oc set volume dc/d1 --remove --name=v1 --containers=c1
  • To remove all volumes for replication controller r1:

    $ oc set volume rc/r1 --remove --confirm

7.3.7. Configuring volumes for multiple uses in a pod

You can configure a volume so that it is shared for multiple uses in a single pod. Use the volumeMounts.subPath property to specify a subPath value inside a volume instead of the volume’s root.

Note

You cannot add a subPath parameter to an existing scheduled pod.

Procedure

  1. To view the list of files in the volume, run the oc rsh command:

    $ oc rsh <pod>

    Example output

    sh-4.2$ ls /path/to/volume/subpath/mount
    example_file1 example_file2 example_file3

  2. Specify the subPath:

    Example Pod spec with subPath parameter

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-site
    spec:
        containers:
        - name: mysql
          image: mysql
          volumeMounts:
          - mountPath: /var/lib/mysql
            name: site-data
            subPath: mysql # (1)
        - name: php
          image: php
          volumeMounts:
          - mountPath: /var/www/html
            name: site-data
            subPath: html # (2)
        volumes:
        - name: site-data
          persistentVolumeClaim:
            claimName: my-site-data

    (1) Databases are stored in the mysql folder.
    (2) HTML content is stored in the html folder.

7.4. Mapping volumes using projected volumes

A projected volume maps several existing volume sources into the same directory.

The following types of volume sources can be projected:

  • Secrets
  • Config Maps
  • Downward API
Note

All sources are required to be in the same namespace as the pod.

7.4.1. Understanding projected volumes

Projected volumes can map any combination of these volume sources into a single directory, allowing the user to:

  • automatically populate a single volume with the keys from multiple secrets, config maps, and with downward API information, so that they can synthesize a single directory with various sources of information;
  • populate a single volume with the keys from multiple secrets, config maps, and with downward API information, explicitly specifying paths for each item, so that they have full control over the contents of that volume.
Important

When the RunAsUser permission is set in the security context of a Linux-based pod, the projected files have the correct permissions set, including container user ownership. However, when the Windows equivalent RunAsUsername permission is set in a Windows pod, the kubelet is unable to correctly set ownership on the files in the projected volume.

Therefore, the RunAsUsername permission set in the security context of a Windows pod is not honored for Windows projected volumes running in OpenShift Container Platform.

The following general scenarios show how you can use projected volumes.

Config map, secrets, Downward API.
Projected volumes allow you to deploy containers with configuration data that includes passwords. An application using these resources could be deploying Red Hat OpenStack Platform (RHOSP) on Kubernetes. The configuration data might have to be assembled differently depending on whether the services are going to be used for production or for testing. If a pod is labeled with production or testing, the downward API selector metadata.labels can be used to produce the correct RHOSP configs.
Config map + secrets.
Projected volumes allow you to deploy containers involving configuration data and passwords. For example, you might execute a config map with some sensitive encrypted tasks that are decrypted using a vault password file.
ConfigMap + Downward API.
Projected volumes allow you to generate a config including the pod name (available via the metadata.name selector). This application can then pass the pod name along with requests to easily determine the source without using IP tracking.
Secrets + Downward API.
Projected volumes allow you to use a secret as a public key to encrypt the namespace of the pod (available via the metadata.namespace selector). This example allows the Operator to use the application to deliver the namespace information securely without using an encrypted transport.

7.4.1.1. Example Pod specs

The following are examples of Pod specs for creating projected volumes.

Pod with a secret, a Downward API, and a config map

apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: container-test
    image: busybox
    volumeMounts: # (1)
    - name: all-in-one
      mountPath: "/projected-volume" # (2)
      readOnly: true # (3)
  volumes: # (4)
  - name: all-in-one # (5)
    projected:
      defaultMode: 0400 # (6)
      sources:
      - secret:
          name: mysecret # (7)
          items:
            - key: username
              path: my-group/my-username # (8)
      - downwardAPI: # (9)
          items:
            - path: "labels"
              fieldRef:
                fieldPath: metadata.labels
            - path: "cpu_limit"
              resourceFieldRef:
                containerName: container-test
                resource: limits.cpu
      - configMap: # (10)
          name: myconfigmap
          items:
            - key: config
              path: my-group/my-config
              mode: 0777 # (11)

(1) Add a volumeMounts section for each container that needs the secret.
(2) Specify a path to an unused directory where the secret will appear.
(3) Set readOnly to true.
(4) Add a volumes block to list each projected volume source.
(5) Specify any name for the volume.
(6) Set the default permission mode for the projected files. Here, 0400 gives the file owner read-only access.
(7) Add a secret. Enter the name of the secret object. Each secret you want to use must be listed.
(8) Specify the path to the secrets file under the mountPath. Here, the secrets file is in /projected-volume/my-group/my-username.
(9) Add a Downward API source.
(10) Add a ConfigMap source.
(11) Set the mode for the specific projection.

Note
Note

If there are multiple containers in the pod, each container needs a volumeMounts section, but only one volumes section is needed.

Pod with multiple secrets with a non-default permission mode set

apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: container-test
    image: busybox
    volumeMounts:
    - name: all-in-one
      mountPath: "/projected-volume"
      readOnly: true
  volumes:
  - name: all-in-one
    projected:
      defaultMode: 0755
      sources:
      - secret:
          name: mysecret
          items:
            - key: username
              path: my-group/my-username
      - secret:
          name: mysecret2
          items:
            - key: password
              path: my-group/my-password
              mode: 511

Note

The defaultMode can only be specified at the projected level and not for each volume source. However, as illustrated above, you can explicitly set the mode for each individual projection.
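
The examples above write permission modes both in octal (0400, 0755, 0777) and in decimal (511). As a quick check of how the two notations relate, the following shell sketch converts a decimal mode to octal with printf. The helper name mode_to_octal is hypothetical and exists only for this illustration.

```shell
# Print a decimal permission mode in four-digit octal notation.
# mode_to_octal is an illustrative helper name.
mode_to_octal() {
  printf '%04o\n' "$1"
}

mode_to_octal 511   # prints 0777
mode_to_octal 420   # prints 0644
```

Read this way, mode: 511 in the second example is the decimal spelling of the same permission bits as mode: 0777 in the first.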

7.4.1.2. Pathing Considerations

Collisions Between Keys when Configured Paths are Identical

If you configure any keys with the same path, the pod spec will not be accepted as valid. In the following example, the specified paths for mysecret and myconfigmap are the same:

apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: container-test
    image: busybox
    volumeMounts:
    - name: all-in-one
      mountPath: "/projected-volume"
      readOnly: true
  volumes:
  - name: all-in-one
    projected:
      sources:
      - secret:
          name: mysecret
          items:
            - key: username
              path: my-group/data
      - configMap:
          name: myconfigmap
          items:
            - key: config
              path: my-group/data

Consider the following situations related to the volume file paths.

Collisions Between Keys without Configured Paths
The only run-time validation that can occur is when all the paths are known at pod creation, similar to the above scenario. Otherwise, when a conflict occurs, the most recently specified resource overwrites anything preceding it. This is also true for resources that are updated after pod creation.
Collisions when One Path is Explicit and the Other is Automatically Projected
If there is a collision because a user-specified path matches data that is automatically projected, the latter resource overwrites anything preceding it, as before.
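
Because identical configured paths cause the pod spec to be rejected, it can help to check a list of intended paths for duplicates before writing the spec. The following shell sketch is one way to do that. The helper name duplicate_paths and the sample paths are assumptions for illustration, and the check covers only paths that you list explicitly, not automatically projected keys.

```shell
# Print each path that appears more than once in the argument list.
# duplicate_paths is a hypothetical helper, not an oc subcommand.
duplicate_paths() {
  printf '%s\n' "$@" | sort | uniq -d
}

# my-group/data is listed twice, so it is reported as a collision.
duplicate_paths my-group/data my-group/my-config my-group/data
```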

7.4.2. Configuring a Projected Volume for a Pod

When creating projected volumes, consider the volume file path situations described in Understanding projected volumes.

The following example shows how to use a projected volume to mount an existing secret volume source. You can use the steps to create user name and password secrets from local files. You then create a pod that runs one container, using a projected volume to mount the secrets into the same shared directory.

The user name and password values can be any valid string that is base64 encoded.

The following example shows admin in base64:

$ echo -n "admin" | base64

Example output

YWRtaW4=

The following example shows the password 1f2d1e2e67df in base64:

$ echo -n "1f2d1e2e67df" | base64

Example output

MWYyZDFlMmU2N2Rm
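
To confirm that an encoded value round-trips correctly, you can decode it again. This sketch assumes the GNU coreutils base64 command, where -d decodes; other implementations may use a different flag. The helper name decode_b64 is hypothetical.

```shell
# Decode a base64 string given as the first argument.
# Assumes GNU coreutils base64, where -d means decode.
decode_b64() {
  printf '%s' "$1" | base64 -d
  echo   # end the decoded value with a newline for readability
}

decode_b64 'YWRtaW4='          # prints admin
decode_b64 'MWYyZDFlMmU2N2Rm'  # prints 1f2d1e2e67df
```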

Procedure

To use a projected volume to mount an existing secret volume source:

  1. Create the secret:

    1. Create a YAML file similar to the following, replacing the password and user information as appropriate:

      apiVersion: v1
      kind: Secret
      metadata:
        name: mysecret
      type: Opaque
      data:
        pass: MWYyZDFlMmU2N2Rm
        user: YWRtaW4=
    2. Use the following command to create the secret:

      $ oc create -f <secrets-filename>

      For example:

      $ oc create -f secret.yaml

      Example output

      secret "mysecret" created

    3. You can check that the secret was created using the following commands:

      $ oc get secret <secret-name>

      For example:

      $ oc get secret mysecret

      Example output

      NAME       TYPE      DATA      AGE
      mysecret   Opaque    2         17h

      $ oc get secret <secret-name> -o yaml

      For example:

      $ oc get secret mysecret -o yaml

      Example output

      apiVersion: v1
      data:
        pass: MWYyZDFlMmU2N2Rm
        user: YWRtaW4=
      kind: Secret
      metadata:
        creationTimestamp: 2017-05-30T20:21:38Z
        name: mysecret
        namespace: default
        resourceVersion: "2107"
        selfLink: /api/v1/namespaces/default/secrets/mysecret
        uid: 959e0424-4575-11e7-9f97-fa163e4bd54c
      type: Opaque
  2. Create a pod with a projected volume.

    1. Create a YAML file similar to the following, including a volumes section:

      apiVersion: v1
      kind: Pod
      metadata:
        name: test-projected-volume
      spec:
        containers:
        - name: test-projected-volume
          image: busybox
          args:
          - sleep
          - "86400"
          volumeMounts:
          - name: all-in-one
            mountPath: "/projected-volume"
            readOnly: true
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
        volumes:
        - name: all-in-one
          projected:
            sources:
            - secret:
                name: mysecret # (1)

      (1) The name of the secret you created.
    2. Create the pod from the configuration file:

      $ oc create -f <your_yaml_file>.yaml

      For example:

      $ oc create -f secret-pod.yaml

      Example output

      pod "test-projected-volume" created

  3. Verify that the pod container is running, and then watch for changes to the pod:

    $ oc get pod <name>

    For example:

    $ oc get pod test-projected-volume

    The output should appear similar to the following:

    Example output

    NAME                    READY     STATUS    RESTARTS   AGE
    test-projected-volume   1/1       Running   0          14s

  4. In another terminal, use the oc exec command to open a shell to the running container:

    $ oc exec -it <pod> -- <command>

    For example:

    $ oc exec -it test-projected-volume -- /bin/sh
  5. In your shell, verify that the projected-volume directory contains your projected sources:

    / # ls

    Example output

    bin               home              root              tmp
    dev               proc              run               usr
    etc               projected-volume  sys               var

7.5. Allowing containers to consume API objects

The Downward API is a mechanism that allows containers to consume information about API objects without coupling to OpenShift Container Platform. Such information includes the pod’s name, namespace, and resource values. Containers can consume information from the downward API using environment variables or a volume plugin.

7.5.1. Expose pod information to Containers using the Downward API

The Downward API contains such information as the pod’s name, project, and resource values. Containers can consume information from the downward API using environment variables or a volume plugin.

Fields within the pod are selected using the FieldRef API type. FieldRef has two fields:

Field        Description
fieldPath    The path of the field to select, relative to the pod.
apiVersion   The API version to interpret the fieldPath selector within.

Currently, the valid selectors in the v1 API include:

Selector               Description
metadata.name          The pod’s name. This is supported in both environment variables and volumes.
metadata.namespace     The pod’s namespace. This is supported in both environment variables and volumes.
metadata.labels        The pod’s labels. This is only supported in volumes and not in environment variables.
metadata.annotations   The pod’s annotations. This is only supported in volumes and not in environment variables.
status.podIP           The pod’s IP. This is only supported in environment variables and not volumes.

The apiVersion field, if not specified, defaults to the API version of the enclosing pod template.

7.5.2. Understanding how to consume container values using the downward API

Your containers can consume API values using environment variables or a volume plugin. Depending on the method you choose, containers can consume:

  • Pod name
  • Pod project/namespace
  • Pod annotations
  • Pod labels

Annotations and labels are available using only a volume plugin.

7.5.2.1. Consuming container values using environment variables

When using a container’s environment variables, use the EnvVar type’s valueFrom field (of type EnvVarSource) to specify that the variable’s value should come from a FieldRef source instead of the literal value specified by the value field.

Only constant attributes of the pod can be consumed this way, as environment variables cannot be updated once a process is started in a way that allows the process to be notified that the value of a variable has changed. The fields supported using environment variables are:

  • Pod name
  • Pod project/namespace

Procedure

  1. Create a new pod spec that contains the environment variables you want the container to consume:

    1. Create a pod.yaml file similar to the following:

      apiVersion: v1
      kind: Pod
      metadata:
        name: dapi-env-test-pod
      spec:
        containers:
          - name: env-test-container
            image: gcr.io/google_containers/busybox
            command: [ "/bin/sh", "-c", "env" ]
            env:
              - name: MY_POD_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.name
              - name: MY_POD_NAMESPACE
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.namespace
        restartPolicy: Never
      # ...
    2. Create the pod from the pod.yaml file:

      $ oc create -f pod.yaml

Verification

  • Check the container’s logs for the MY_POD_NAME and MY_POD_NAMESPACE values:

    $ oc logs dapi-env-test-pod

7.5.2.2. Consuming container values using a volume plugin

Your containers can consume API values using a volume plugin.

Containers can consume:

  • Pod name
  • Pod project/namespace
  • Pod annotations
  • Pod labels

Procedure

To use the volume plugin:

  1. Create a new pod spec that contains the fields you want the container to consume through the volume:

    1. Create a volume-pod.yaml file similar to the following:

      kind: Pod
      apiVersion: v1
      metadata:
        labels:
          zone: us-east-coast
          cluster: downward-api-test-cluster1
          rack: rack-123
        name: dapi-volume-test-pod
        annotations:
          annotation1: "345"
          annotation2: "456"
      spec:
        containers:
          - name: volume-test-container
            image: gcr.io/google_containers/busybox
            command: ["sh", "-c", "cat /tmp/etc/pod_labels /tmp/etc/pod_annotations"]
            volumeMounts:
              - name: podinfo
                mountPath: /tmp/etc
                readOnly: false
        volumes:
        - name: podinfo
          downwardAPI:
            defaultMode: 420
            items:
            - fieldRef:
                fieldPath: metadata.name
              path: pod_name
            - fieldRef:
                fieldPath: metadata.namespace
              path: pod_namespace
            - fieldRef:
                fieldPath: metadata.labels
              path: pod_labels
            - fieldRef:
                fieldPath: metadata.annotations
              path: pod_annotations
        restartPolicy: Never
      # ...
    2. Create the pod from the volume-pod.yaml file:

      $ oc create -f volume-pod.yaml

Verification

  • Check the container’s logs and verify the presence of the configured fields:

    $ oc logs dapi-volume-test-pod

    Example output

    cluster=downward-api-test-cluster1
    rack=rack-123
    zone=us-east-coast
    annotation1=345
    annotation2=456
    kubernetes.io/config.source=api
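
An application inside the container typically reads a single value from the mounted files rather than printing the whole file. The following is a minimal sketch of that lookup, assuming the key=value line format shown in the example output above. The label_value function is a hypothetical helper, not part of oc or Kubernetes, and a temporary file stands in for /tmp/etc/pod_labels so that the sketch runs outside a pod. Values might be written with surrounding double quotes, so the sketch strips them if they are present.

```shell
#!/bin/sh
# label_value is a hypothetical helper, not part of oc or Kubernetes.
# Usage: label_value <file> <key>
# Prints the value stored for <key> in a file of key=value lines.
label_value() {
  awk -v key="$2" '
    index($0, key "=") == 1 {
      value = substr($0, length(key) + 2)   # text after "key="
      gsub(/^"|"$/, "", value)              # drop surrounding quotes, if any
      print value
    }
  ' "$1"
}

# Stand-in for /tmp/etc/pod_labels so the sketch runs outside a pod.
labels_file=$(mktemp)
printf '%s\n' \
  'cluster=downward-api-test-cluster1' \
  'rack=rack-123' \
  'zone=us-east-coast' > "$labels_file"

# Look up one label by name.
label_value "$labels_file" zone

rm -f "$labels_file"
```

In a running pod, the same function could be pointed at /tmp/etc/pod_labels or /tmp/etc/pod_annotations, the paths configured in the volume-pod.yaml example.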

7.5.3. Understanding how to consume container resources using the Downward API

When creating pods, you can use the Downward API to inject information about computing resource requests and limits so that image and application authors can correctly create an image for specific environments.

You can do this using environment variables or a volume plugin.

7.5.3.1. Consuming container resources using environment variables

When creating pods, you can use the Downward API to inject information about computing resource requests and limits using environment variables.

When creating the pod configuration, specify environment variables that correspond to the contents of the resources field in the spec.containers field.

Note

If the resource limits are not included in the container configuration, the downward API defaults to the node’s CPU and memory allocatable values.

Procedure

  1. Create a new pod spec that contains the resources you want to inject:

    1. Create a pod.yaml file similar to the following:

      apiVersion: v1
      kind: Pod
      metadata:
        name: dapi-env-test-pod
      spec:
        containers:
          - name: test-container
            image: gcr.io/google_containers/busybox:1.24
            command: [ "/bin/sh", "-c", "env" ]
            resources:
              requests:
                memory: "32Mi"
                cpu: "125m"
              limits:
                memory: "64Mi"
                cpu: "250m"
            env:
              - name: MY_CPU_REQUEST
                valueFrom:
                  resourceFieldRef:
                    resource: requests.cpu
              - name: MY_CPU_LIMIT
                valueFrom:
                  resourceFieldRef:
                    resource: limits.cpu
              - name: MY_MEM_REQUEST
                valueFrom:
                  resourceFieldRef:
                    resource: requests.memory
              - name: MY_MEM_LIMIT
                valueFrom:
                  resourceFieldRef:
                    resource: limits.memory
      # ...
    2. Create the pod from the pod.yaml file:

      $ oc create -f pod.yaml

7.5.3.2. Consuming container resources using a volume plugin

When creating pods, you can use the Downward API to inject information about computing resource requests and limits using a volume plugin.

When creating the pod configuration, use the spec.volumes.downwardAPI.items field to describe the desired resources that correspond to the spec.containers.resources field.

Note

If the resource limits are not included in the container configuration, the Downward API defaults to the node’s CPU and memory allocatable values.

Procedure

  1. Create a new pod spec that contains the resources you want to inject:

    1. Create a volume-pod.yaml file similar to the following:

      apiVersion: v1
      kind: Pod
      metadata:
        name: dapi-env-test-pod
      spec:
        containers:
          - name: client-container
            image: gcr.io/google_containers/busybox:1.24
            command: ["sh", "-c", "while true; do echo; if [[ -e /etc/cpu_limit ]]; then cat /etc/cpu_limit; fi; if [[ -e /etc/cpu_request ]]; then cat /etc/cpu_request; fi; if [[ -e /etc/mem_limit ]]; then cat /etc/mem_limit; fi; if [[ -e /etc/mem_request ]]; then cat /etc/mem_request; fi; sleep 5; done"]
            resources:
              requests:
                memory: "32Mi"
                cpu: "125m"
              limits:
                memory: "64Mi"
                cpu: "250m"
            volumeMounts:
              - name: podinfo
                mountPath: /etc
                readOnly: false
        volumes:
          - name: podinfo
            downwardAPI:
              items:
                - path: "cpu_limit"
                  resourceFieldRef:
                    containerName: client-container
                    resource: limits.cpu
                - path: "cpu_request"
                  resourceFieldRef:
                    containerName: client-container
                    resource: requests.cpu
                - path: "mem_limit"
                  resourceFieldRef:
                    containerName: client-container
                    resource: limits.memory
                - path: "mem_request"
                  resourceFieldRef:
                    containerName: client-container
                    resource: requests.memory
      # ...
    2. Create the pod from the volume-pod.yaml file:

      $ oc create -f volume-pod.yaml

7.5.4. Consuming secrets using the Downward API

When creating pods, you can use the downward API to inject secrets so image and application authors can create an image for specific environments.

Procedure

  1. Create a secret to inject:

    1. Create a secret.yaml file similar to the following:

      apiVersion: v1
      kind: Secret
      metadata:
        name: mysecret
      data: # Values under data must be base64-encoded.
        password: <password>
        username: <username>
      type: kubernetes.io/basic-auth
    2. Create the secret object from the secret.yaml file:

      $ oc create -f secret.yaml
  2. Create a pod that references the username field from the above Secret object:

    1. Create a pod.yaml file similar to the following:

      apiVersion: v1
      kind: Pod
      metadata:
        name: dapi-env-test-pod
      spec:
        containers:
          - name: env-test-container
            image: gcr.io/google_containers/busybox
            command: [ "/bin/sh", "-c", "env" ]
            env:
              - name: MY_SECRET_USERNAME
                valueFrom:
                  secretKeyRef:
                    name: mysecret
                    key: username
        restartPolicy: Never
      # ...
    2. Create the pod from the pod.yaml file:

      $ oc create -f pod.yaml

Verification

  • Check the container’s logs for the MY_SECRET_USERNAME value:

    $ oc logs dapi-env-test-pod

7.5.5. Consuming configuration maps using the Downward API

When creating pods, you can use the Downward API to inject configuration map values so image and application authors can create an image for specific environments.

Procedure

  1. Create a config map with the values to inject:

    1. Create a configmap.yaml file similar to the following:

      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: myconfigmap
      data:
        mykey: myvalue
    2. Create the config map from the configmap.yaml file:

      $ oc create -f configmap.yaml
  2. Create a pod that references the above config map:

    1. Create a pod.yaml file similar to the following:

      apiVersion: v1
      kind: Pod
      metadata:
        name: dapi-env-test-pod
      spec:
        containers:
          - name: env-test-container
            image: gcr.io/google_containers/busybox
            command: [ "/bin/sh", "-c", "env" ]
            env:
              - name: MY_CONFIGMAP_VALUE
                valueFrom:
                  configMapKeyRef:
                    name: myconfigmap
                    key: mykey
        restartPolicy: Always
      # ...
    2. Create the pod from the pod.yaml file:

      $ oc create -f pod.yaml

Verification

  • Check the container’s logs for the MY_CONFIGMAP_VALUE value:

    $ oc logs -p dapi-env-test-pod

7.5.6. Referencing environment variables

When creating pods, you can reference the value of a previously defined environment variable by using the $() syntax. If the environment variable reference cannot be resolved, the value is left as the provided string.

Procedure

  1. Create a pod that references an existing environment variable:

    1. Create a pod.yaml file similar to the following:

      apiVersion: v1
      kind: Pod
      metadata:
        name: dapi-env-test-pod
      spec:
        containers:
          - name: env-test-container
            image: gcr.io/google_containers/busybox
            command: [ "/bin/sh", "-c", "env" ]
            env:
              - name: MY_EXISTING_ENV
                value: my_value
              - name: MY_ENV_VAR_REF_ENV
                value: $(MY_EXISTING_ENV)
        restartPolicy: Never
      # ...
    2. Create the pod from the pod.yaml file:

      $ oc create -f pod.yaml

Verification

  • Check the container’s logs for the MY_ENV_VAR_REF_ENV value:

    $ oc logs dapi-env-test-pod

7.5.7. Escaping environment variable references

When creating a pod, you can escape an environment variable reference by using a double dollar sign. The value will then be set to a single dollar sign version of the provided value.
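
The reference and escape rules can be sketched in plain shell. The following is a simplified emulation and not the platform's implementation: expand_refs is a hypothetical helper that resolves $(NAME) against the current shell environment, turns a double dollar sign into a single one, and leaves unresolved references as written. NOT_DEFINED_HERE is an arbitrary name assumed to be unset.

```shell
#!/bin/sh
# expand_refs is a hypothetical helper that emulates, in simplified form,
# how $(NAME) references in env values are resolved. It is not part of oc.
expand_refs() {
  src=$1
  out=
  while :; do
    case $src in
      *'$'*) ;;                       # at least one "$" left to examine
      *) out=$out$src; break ;;       # no "$" left: copy the remainder
    esac
    out=$out${src%%\$*}               # copy the text before the first "$"
    src=${src#*\$}                    # continue after that "$"
    case $src in
      '$'*)                           # "$$" is an escaped, literal "$"
        out=$out'$'
        src=${src#?}
        ;;
      '('*')'*)                       # "$(NAME)" is a variable reference
        name=${src#\(}
        name=${name%%\)*}
        src=${src#*\)}
        case $name in
          ''|[!A-Za-z_]*|*[!A-Za-z0-9_]*)
            out=$out'$('$name')'      # not a usable name: keep as written
            ;;
          *)
            if eval "[ \"\${$name+set}\" = set ]"; then
              eval "out=\$out\${$name}"   # defined: substitute its value
            else
              out=$out'$('$name')'    # undefined: keep as written
            fi
            ;;
        esac
        ;;
      *) out=$out'$' ;;               # a lone "$" is kept unchanged
    esac
  done
  printf '%s\n' "$out"
}

MY_EXISTING_ENV=my_value
expand_refs '$(MY_EXISTING_ENV)'      # reference to a defined variable
expand_refs '$$(SOME_OTHER_ENV)'      # escaped reference
expand_refs '$(NOT_DEFINED_HERE)'     # reference that cannot be resolved
```

The three calls correspond to the MY_ENV_VAR_REF_ENV example, the MY_NEW_ENV example, and the unresolved case, in that order.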

Procedure

  1. Create a pod that escapes an environment variable reference:

    1. Create a pod.yaml file similar to the following:

      apiVersion: v1
      kind: Pod
      metadata:
        name: dapi-env-test-pod
      spec:
        containers:
          - name: env-test-container
            image: gcr.io/google_containers/busybox
            command: [ "/bin/sh", "-c", "env" ]
            env:
              - name: MY_NEW_ENV
                value: $$(SOME_OTHER_ENV)
        restartPolicy: Never
      # ...
    2. Create the pod from the pod.yaml file:

      $ oc create -f pod.yaml

Verification

  • Check the container’s logs for the MY_NEW_ENV value:

    $ oc logs dapi-env-test-pod

7.6. Copying files to or from an OpenShift Container Platform container

You can use the CLI to copy local files to or from a remote directory in a container using the oc rsync command.

7.6.1. Understanding how to copy files

The oc rsync command, or remote sync, is a useful tool for copying database archives to and from your pods for backup and restore purposes. You can also use oc rsync to copy source code changes into a running pod for development debugging, when the running pod supports hot reload of source files.

$ oc rsync <source> <destination> [-c <container>]

7.6.1.1. Requirements

Specifying the Copy Source

The source argument of the oc rsync command must point to either a local directory or a pod directory. Individual files are not supported.

When specifying a pod directory, the directory name must be prefixed with the pod name:

<pod name>:<dir>

If the directory name ends in a path separator (/), only the contents of the directory are copied to the destination. Otherwise, the directory and its contents are copied to the destination.

Specifying the Copy Destination

The destination argument of the oc rsync command must point to a directory. If the directory does not exist and rsync is used for the copy, the directory is created for you.

Deleting Files at the Destination

The --delete flag may be used to delete any files in the remote directory that are not in the local directory.

Continuous Syncing on File Change

Using the --watch option causes the command to monitor the source path for any file system changes and synchronize changes when they occur. With this argument, the command runs until it is interrupted.

Synchronization occurs after short quiet periods to ensure a rapidly changing file system does not result in continuous synchronization calls.

When using the --watch option, the behavior is effectively the same as manually invoking oc rsync repeatedly, including any arguments normally passed to oc rsync. Therefore, you can control the behavior via the same flags used with manual invocations of oc rsync, such as --delete.
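
The trailing path separator rule for the copy source decides where files end up at the destination. The following sketch only computes the resulting path and does not call oc rsync. The rsync_target function and the app.py file name are hypothetical and used for illustration only.

```shell
#!/bin/sh
# rsync_target is a hypothetical helper, not part of oc.
# Usage: rsync_target <source-dir> <destination-dir> <file-in-source>
# Prints where <file-in-source> would land at the destination.
rsync_target() {
  src=$1
  dest=${2%/}       # ignore a trailing slash on the destination
  file=$3
  case $src in
    */)
      # Source ends in "/": only the directory contents are copied.
      printf '%s/%s\n' "$dest" "$file"
      ;;
    *)
      # No trailing "/": the directory itself is copied, then its contents.
      printf '%s/%s/%s\n' "$dest" "$(basename "$src")" "$file"
      ;;
  esac
}

rsync_target /home/user/source/ devpod1234:/src app.py
rsync_target /home/user/source  devpod1234:/src app.py
```

The first call models copying only the contents of the source directory. The second models copying the source directory itself, so the file lands one level deeper.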

7.6.2. Copying files to and from containers

Support for copying local files to or from a container is built into the CLI.

Prerequisites

When working with oc rsync, note the following:

  • rsync must be installed. The oc rsync command uses the local rsync tool, if present on the client machine and the remote container.

    If rsync is not found locally or in the remote container, a tar archive is created locally and sent to the container where the tar utility is used to extract the files. If tar is not available in the remote container, the copy will fail.

    The tar copy method does not provide the same functionality as oc rsync. For example, oc rsync creates the destination directory if it does not exist and only sends files that are different between the source and the destination.

    Note

    In Windows, the cwRsync client should be installed and added to the PATH for use with the oc rsync command.

Procedure

  • To copy a local directory to a pod directory:

    $ oc rsync <local-dir> <pod-name>:/<remote-dir> -c <container-name>

    For example:

    $ oc rsync /home/user/source devpod1234:/src -c user-container
  • To copy a pod directory to a local directory:

    $ oc rsync <pod-name>:/<remote-dir> <local-dir>

    For example:

    $ oc rsync devpod1234:/src /home/user/source

7.6.3. Using advanced Rsync features

The oc rsync command exposes fewer command line options than standard rsync. If you want to use a standard rsync command line option that is not available in oc rsync, for example the --exclude-from=FILE option, it might be possible to use standard rsync's --rsh (-e) option or RSYNC_RSH environment variable as a workaround, as follows:

$ rsync --rsh='oc rsh' --exclude-from=<file_name> <local-dir> <pod-name>:/<remote-dir>

Alternatively, export the RSYNC_RSH variable:

$ export RSYNC_RSH='oc rsh'

Then, run the rsync command:

$ rsync --exclude-from=<file_name> <local-dir> <pod-name>:/<remote-dir>

Both of the above examples configure standard rsync to use oc rsh as its remote shell program to enable it to connect to the remote pod, and are an alternative to running oc rsync.

7.7. Executing remote commands in an OpenShift Container Platform container

You can use the CLI to execute remote commands in an OpenShift Container Platform container.

7.7.1. Executing remote commands in containers

Support for remote container command execution is built into the CLI.

Procedure

To run a command in a container:

$ oc exec <pod> [-c <container>] -- <command> [<arg_1> ... <arg_n>]

For example:

$ oc exec mypod -- date

Example output

Thu Apr  9 02:21:53 UTC 2015

Important

For security purposes, the oc exec command does not work when accessing privileged containers except when the command is executed by a cluster-admin user.

7.7.2. Protocol for initiating a remote command from a client

Clients initiate the execution of a remote command in a container by issuing a request to the Kubernetes API server:

/proxy/nodes/<node_name>/exec/<namespace>/<pod>/<container>?command=<command>

In the above URL:

  • <node_name> is the FQDN of the node.
  • <namespace> is the project of the target pod.
  • <pod> is the name of the target pod.
  • <container> is the name of the target container.
  • <command> is the desired command to be executed.

For example:

/proxy/nodes/node123.openshift.com/exec/myns/mypod/mycontainer?command=date
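
The request path can be assembled from its five parts as in the following sketch. The exec_request_path function is a hypothetical helper for illustration. It only prints the path, does not contact the API server, and assumes that the command needs no URL encoding.

```shell
#!/bin/sh
# exec_request_path is a hypothetical helper, not part of oc.
# Usage: exec_request_path <node_name> <namespace> <pod> <container> <command>
# Prints the request path in the format described above.
# Assumption: <command> contains no characters that need URL encoding.
exec_request_path() {
  printf '/proxy/nodes/%s/exec/%s/%s/%s?command=%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Reproduces the example path for the date command.
exec_request_path node123.openshift.com myns mypod mycontainer date
```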

Additionally, the client can add parameters to the request to indicate if:

  • the client should send input to the remote container’s command (stdin).
  • the client’s terminal is a TTY.
  • the remote container’s command should send output from stdout to the client.
  • the remote container’s command should send output from stderr to the client.

After sending an exec request to the API server, the client upgrades the connection to one that supports multiplexed streams; the current implementation uses HTTP/2.

The client creates one stream each for stdin, stdout, and stderr. To distinguish among the streams, the client sets the streamType header on the stream to one of stdin, stdout, or stderr.

The client closes all streams, the upgraded connection, and the underlying connection when it is finished with the remote command execution request.

7.8. Using port forwarding to access applications in a container

OpenShift Container Platform supports port forwarding to pods.

7.8.1. Understanding port forwarding

You can use the CLI to forward one or more local ports to a pod. This allows you to listen on a given or random port locally, and have data forwarded to and from given ports in the pod.

Support for port forwarding is built into the CLI:

$ oc port-forward <pod> [<local_port>:]<remote_port> [...[<local_port_n>:]<remote_port_n>]

The CLI listens on each local port specified by the user, forwarding using the protocol described below.

Ports may be specified using the following formats:

Format              Description
5000                The client listens on port 5000 locally and forwards to 5000 in the pod.
6000:5000           The client listens on port 6000 locally and forwards to 5000 in the pod.
:5000 or 0:5000     The client selects a free local port and forwards to 5000 in the pod.
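
The three port formats can be told apart with a few lines of shell. The following sketch shows how a [<local_port>:]<remote_port> argument maps to a local and a remote port. The parse_port_spec function is hypothetical and is not how the oc client is implemented. It prints the word random where the client would pick a free local port.

```shell
#!/bin/sh
# parse_port_spec is a hypothetical helper, not part of oc.
# Usage: parse_port_spec <[local_port:]remote_port>
parse_port_spec() {
  spec=$1
  case $spec in
    *:*)
      local_port=${spec%%:*}    # text before the colon
      remote_port=${spec#*:}    # text after the colon
      ;;
    *)
      local_port=$spec          # single port: same number on both sides
      remote_port=$spec
      ;;
  esac
  # An empty or zero local port means that a free local port is selected.
  case $local_port in
    ''|0) local_port=random ;;
  esac
  printf 'local=%s remote=%s\n' "$local_port" "$remote_port"
}

for spec in 5000 6000:5000 :5000 0:5000; do
  parse_port_spec "$spec"
done
```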

OpenShift Container Platform handles port-forward requests from clients. Upon receiving a request, OpenShift Container Platform upgrades the response and waits for the client to create port-forwarding streams. When OpenShift Container Platform receives a new stream, it copies data between the stream and the pod’s port.

Architecturally, there are options for forwarding to a pod’s port. The supported OpenShift Container Platform implementation invokes nsenter directly on the node host to enter the pod’s network namespace, then invokes socat to copy data between the stream and the pod’s port. However, a custom implementation could include running a helper pod that then runs nsenter and socat, so that those binaries are not required to be installed on the host.

7.8.2. Using port forwarding

You can use the CLI to port-forward one or more local ports to a pod.

Procedure

Use the following command to listen on the specified port in a pod:

$ oc port-forward <pod> [<local_port>:]<remote_port> [...[<local_port_n>:]<remote_port_n>]

For example:

  • Use the following command to listen on ports 5000 and 6000 locally and forward data to and from ports 5000 and 6000 in the pod:

    $ oc port-forward <pod> 5000 6000

    Example output

    Forwarding from 127.0.0.1:5000 -> 5000
    Forwarding from [::1]:5000 -> 5000
    Forwarding from 127.0.0.1:6000 -> 6000
    Forwarding from [::1]:6000 -> 6000

  • Use the following command to listen on port 8888 locally and forward to 5000 in the pod:

    $ oc port-forward <pod> 8888:5000

    Example output

    Forwarding from 127.0.0.1:8888 -> 5000
    Forwarding from [::1]:8888 -> 5000

  • Use the following command to listen on a free port locally and forward to 5000 in the pod:

    $ oc port-forward <pod> :5000

    Example output

    Forwarding from 127.0.0.1:42390 -> 5000
    Forwarding from [::1]:42390 -> 5000

    Or:

    $ oc port-forward <pod> 0:5000

7.8.3. Protocol for initiating port forwarding from a client

Clients initiate port forwarding to a pod by issuing a request to the Kubernetes API server:

/proxy/nodes/<node_name>/portForward/<namespace>/<pod>

In the above URL:

  • <node_name> is the FQDN of the node.
  • <namespace> is the namespace of the target pod.
  • <pod> is the name of the target pod.

For example:

/proxy/nodes/node123.openshift.com/portForward/myns/mypod

After sending a port forward request to the API server, the client upgrades the connection to one that supports multiplexed streams; the current implementation uses Hypertext Transfer Protocol Version 2 (HTTP/2).

The client creates a stream with the port header containing the target port in the pod. All data written to the stream is delivered via the kubelet to the target pod and port. Similarly, all data sent from the pod for that forwarded connection is delivered back to the same stream in the client.

The client closes all streams, the upgraded connection, and the underlying connection when it is finished with the port forwarding request.

7.9. Using sysctls in containers

Sysctl settings are exposed through Kubernetes, allowing users to modify certain kernel parameters at runtime. Only sysctls that are namespaced can be set independently on pods. If a sysctl is not namespaced, called node-level, you must use another method of setting the sysctl, such as by using the Node Tuning Operator.

Network sysctls are a special category of sysctl. Network sysctls include:

  • System-wide sysctls, for example net.ipv4.ip_local_port_range, that are valid for all networking. You can set these independently for each pod on a node.
  • Interface-specific sysctls, for example net.ipv4.conf.IFNAME.accept_local, that only apply to a specific additional network interface for a given pod. You can set these independently for each additional network configuration. You set these by using a configuration in the tuning-cni after the network interfaces are created.

Moreover, only those sysctls considered safe are whitelisted by default; you can manually enable other unsafe sysctls on the node to be available to the user.

7.9.1. About sysctls

In Linux, the sysctl interface allows an administrator to modify kernel parameters at runtime. Parameters are available from the /proc/sys/ virtual process file system. The parameters cover various subsystems, such as:

  • kernel (common prefix: kernel.)
  • networking (common prefix: net.)
  • virtual memory (common prefix: vm.)
  • MDADM (common prefix: dev.)

More subsystems are described in Kernel documentation. To get a list of all parameters, run:

$ sudo sysctl -a

7.9.2. Namespaced and node-level sysctls

A number of sysctls are namespaced in the Linux kernels. This means that you can set them independently for each pod on a node. Being namespaced is a requirement for sysctls to be accessible in a pod context within Kubernetes.

The following sysctls are known to be namespaced:

  • kernel.shm*
  • kernel.msg*
  • kernel.sem
  • fs.mqueue.*

Additionally, most of the sysctls in the net.* group are known to be namespaced. Their namespace adoption differs based on the kernel version and distributor.
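
The name patterns above can be expressed as a shell case statement. The following is a rough first check only. The sysctl_scope function and its three output labels are hypothetical, and because namespacing of net.* sysctls depends on the kernel, that group is reported as likely-namespaced rather than as a definite answer.

```shell
#!/bin/sh
# sysctl_scope is a hypothetical helper, not part of oc or sysctl.
# Usage: sysctl_scope <sysctl-name>
# Classifies a name against the patterns listed in this section.
sysctl_scope() {
  case $1 in
    kernel.shm*|kernel.msg*|kernel.sem|fs.mqueue.*)
      echo namespaced             # listed as known to be namespaced
      ;;
    net.*)
      echo likely-namespaced      # most, but not all, net.* sysctls
      ;;
    *)
      echo not-listed             # not in the list; might be node-level
      ;;
  esac
}

for name in kernel.shmmax kernel.sem net.ipv4.ip_local_port_range vm.swappiness; do
  printf '%s: %s\n' "$name" "$(sysctl_scope "$name")"
done
```

A name reported as not-listed is not necessarily node-level on every kernel, so treat the result as a starting point for checking the kernel documentation.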

Sysctls that are not namespaced are called node-level and must be set manually by the cluster administrator, either by means of the underlying Linux distribution of the nodes, such as by modifying the /etc/sysctl.conf file, or by using a daemon set with privileged containers. You can use the Node Tuning Operator to set node-level sysctls.

Note

Consider marking nodes with special sysctls as tainted. Only schedule pods onto them that need those sysctl settings. Use the taints and toleration feature to mark the nodes.

7.9.3. Safe and unsafe sysctls

Sysctls are grouped into safe and unsafe sysctls.

For system-wide sysctls to be considered safe, they must be namespaced. A namespaced sysctl ensures there is isolation between namespaces and therefore pods. If you set a sysctl for one pod, it must not do any of the following:

  • Influence any other pod on the node
  • Harm the node health
  • Gain CPU or memory resources outside of the resource limits of a pod

Note

Being namespaced alone is not sufficient for the sysctl to be considered safe.

Any sysctl that is not added to the allowed list on OpenShift Container Platform is considered unsafe for OpenShift Container Platform.

Unsafe sysctls are not allowed by default. For system-wide sysctls the cluster administrator must manually enable them on a per-node basis. Pods with disabled unsafe sysctls are scheduled but do not launch.

Note

You cannot manually enable interface-specific unsafe sysctls.

OpenShift Container Platform adds the following system-wide and interface-specific safe sysctls to an allowed safe list:

Table 7.4. System-wide safe sysctls

sysctl
    Description

kernel.shm_rmid_forced
    When set to 1, all shared memory objects in current IPC namespace are automatically forced to use IPC_RMID. For more information, see shm_rmid_forced.

net.ipv4.ip_local_port_range
    Defines the local port range that is used by TCP and UDP to choose the local port. The first number is the first port number, and the second number is the last local port number. If possible, it is better if these numbers have different parity (one even and one odd value). They must be greater than or equal to ip_unprivileged_port_start. The default values are 32768 and 60999 respectively. For more information, see ip_local_port_range.

net.ipv4.tcp_syncookies
    When net.ipv4.tcp_syncookies is set, the kernel handles TCP SYN packets normally until the half-open connection queue is full, at which time, the SYN cookie functionality kicks in. This functionality allows the system to keep accepting valid connections, even if under a denial-of-service attack. For more information, see tcp_syncookies.

net.ipv4.ping_group_range
    This restricts ICMP_PROTO datagram sockets to users in the group range. The default is 1 0, meaning that nobody, not even root, can create ping sockets. For more information, see ping_group_range.

net.ipv4.ip_unprivileged_port_start
    This defines the first unprivileged port in the network namespace. To disable all privileged ports, set this to 0. Privileged ports must not overlap with the ip_local_port_range. For more information, see ip_unprivileged_port_start.

Expand
Table 7.5. Interface-specific safe sysctls
sysctlDescription

net.ipv4.conf.IFNAME.accept_redirects

Accept IPv4 ICMP redirect messages.

net.ipv4.conf.IFNAME.accept_source_route

Accept IPv4 packets with strict source route (SRR) option.

net.ipv4.conf.IFNAME.arp_accept

Define behavior for gratuitous ARP frames with an IPv4 address that is not already present in the ARP table:

  • 0
    - Do not create new entries in the ARP table.
  • 1
    - Create new entries in the ARP table.

net.ipv4.conf.IFNAME.arp_notify

Define mode for notification of IPv4 address and device changes.

net.ipv4.conf.IFNAME.disable_policy

Disable IPSEC policy (SPD) for this IPv4 interface.

net.ipv4.conf.IFNAME.secure_redirects

Accept ICMP redirect messages only to gateways listed in the interface’s current gateway list.

net.ipv4.conf.IFNAME.send_redirects

Send ICMP redirect messages. Sending redirects is enabled only if the node acts as a router; a host should not send ICMP redirect messages. Routers use redirects to notify a host about a better routing path that is available for a particular destination.

net.ipv6.conf.IFNAME.accept_ra

Accept IPv6 Router advertisements; autoconfigure using them. It also determines whether or not to transmit router solicitations. Router solicitations are transmitted only if the functional setting is to accept router advertisements.

net.ipv6.conf.IFNAME.accept_redirects

Accept IPv6 ICMP redirect messages.

net.ipv6.conf.IFNAME.accept_source_route

Accept IPv6 packets with SRR option.

net.ipv6.conf.IFNAME.arp_accept

Define behavior for gratuitous ARP frames with an IPv6 address that is not already present in the ARP table:

  • 0
    - Do not create new entries in the ARP table.
  • 1
    - Create new entries in the ARP table.

net.ipv6.conf.IFNAME.arp_notify

Define mode for notification of IPv6 address and device changes.

net.ipv6.neigh.IFNAME.base_reachable_time_ms

This parameter controls the lifetime of the hardware address to IP mapping in the neighbor table for IPv6.

net.ipv6.neigh.IFNAME.retrans_time_ms

Set the retransmit timer for neighbor discovery messages.

Note

When setting these values using the tuning CNI plugin, use the value IFNAME literally. The IFNAME token represents the interface name and is replaced with the actual name of the interface at runtime.
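
The effect of the IFNAME token can be pictured as a plain text substitution. The following sketch only illustrates that idea and is not the plugin's implementation. The function name expand_ifname and the interface name net1 are hypothetical.

```shell
# Hypothetical illustration of the IFNAME substitution; not the tuning plugin's code.
# Usage: expand_ifname SYSCTL_KEY INTERFACE_NAME
expand_ifname() {
  # Replace the literal IFNAME token with the given interface name.
  printf '%s\n' "$1" | sed "s/IFNAME/$2/"
}

expand_ifname net.ipv4.conf.IFNAME.accept_redirects net1
# prints: net.ipv4.conf.net1.accept_redirects
```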

7.9.4. Starting a pod with safe sysctls

You can set sysctls on pods using the pod’s securityContext. The securityContext applies to all containers in the same pod.

Safe sysctls are allowed by default.

This example uses the pod securityContext to set the following safe sysctls:

  • kernel.shm_rmid_forced
  • net.ipv4.ip_local_port_range
  • net.ipv4.tcp_syncookies
  • net.ipv4.ping_group_range
Warning

To avoid destabilizing your operating system, modify sysctl parameters only after you understand their effects.

Use this procedure to start a pod with the configured sysctl settings.

Note

In most cases you modify an existing pod definition and add the securityContext spec.

Procedure

  1. Create a YAML file sysctl_pod.yaml that defines an example pod and add the securityContext spec, as shown in the following example:

    apiVersion: v1
    kind: Pod
    metadata:
      name: sysctl-example
      namespace: default
    spec:
      containers:
      - name: podexample
        image: centos
        command: ["/bin/bash", "-c", "sleep INF"]
        securityContext:
          runAsUser: 2000                  # (1)
          runAsGroup: 3000                 # (2)
          allowPrivilegeEscalation: false  # (3)
          capabilities:                    # (4)
            drop: ["ALL"]
      securityContext:
        runAsNonRoot: true                 # (5)
        seccompProfile:                    # (6)
          type: RuntimeDefault
        sysctls:
        - name: kernel.shm_rmid_forced
          value: "1"
        - name: net.ipv4.ip_local_port_range
          value: "32770       60666"
        - name: net.ipv4.tcp_syncookies
          value: "0"
        - name: net.ipv4.ping_group_range
          value: "0           200000000"

    (1) runAsUser controls which user ID the container is run with.
    (2) runAsGroup controls which primary group ID the container is run with.
    (3) allowPrivilegeEscalation determines whether a pod can request privilege escalation. If unspecified, it defaults to true. This boolean directly controls whether the no_new_privs flag gets set on the container process.
    (4) capabilities permit privileged actions without giving full root access. This policy ensures all capabilities are dropped from the pod.
    (5) runAsNonRoot: true requires that the container runs as a user with any UID other than 0.
    (6) RuntimeDefault enables the default seccomp profile for a pod or container workload.
  2. Create the pod by running the following command:

    $ oc apply -f sysctl_pod.yaml
  3. Verify that the pod is created by running the following command:

    $ oc get pod

    Example output

    NAME              READY   STATUS            RESTARTS   AGE
    sysctl-example    1/1     Running           0          14s

  4. Log in to the pod by running the following command:

    $ oc rsh sysctl-example
  5. Verify the values of the configured sysctl flags. For example, find the value of kernel.shm_rmid_forced by running the following command:

    sh-4.4# sysctl kernel.shm_rmid_forced

    Expected output

    kernel.shm_rmid_forced = 1
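
If the container image does not include the sysctl command, the same values can usually be read from files under /proc/sys, where each dot in the sysctl name becomes a path separator. The following sketch assumes that mapping and checks all four sysctls set in the example pod. The helper name sysctl_path is hypothetical, and the mapping does not hold for interface names that themselves contain dots.

```shell
# Hypothetical helper: map a sysctl name to its file under /proc/sys.
sysctl_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Print each sysctl from the example pod, skipping any file that is not readable.
for key in kernel.shm_rmid_forced net.ipv4.ip_local_port_range \
           net.ipv4.tcp_syncookies net.ipv4.ping_group_range; do
  path=$(sysctl_path "$key")
  if [ -r "$path" ]; then
    printf '%s = %s\n' "$key" "$(cat "$path")"
  fi
done
```

Run the loop inside the pod's shell, after logging in with oc rsh, so that the values come from the pod's namespaces rather than from your workstation.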

7.9.5. Starting a pod with unsafe sysctls

A pod with unsafe sysctls fails to launch on any node unless the cluster administrator explicitly enables unsafe sysctls for that node. As with node-level sysctls, use the taints and tolerations feature or labels on nodes to schedule those pods onto the right nodes.

The following example uses the pod securityContext to set a safe sysctl kernel.shm_rmid_forced and two unsafe sysctls, net.core.somaxconn and kernel.msgmax. There is no distinction between safe and unsafe sysctls in the specification.

Warning

To avoid destabilizing your operating system, modify sysctl parameters only after you understand their effects.

The following example illustrates what happens when you add safe and unsafe sysctls to a pod specification:

Procedure

  1. Create a YAML file sysctl-example-unsafe.yaml that defines an example pod and add the securityContext specification, as shown in the following example:

    apiVersion: v1
    kind: Pod
    metadata:
      name: sysctl-example-unsafe
    spec:
      containers:
      - name: podexample
        image: centos
        command: ["/bin/bash", "-c", "sleep INF"]
        securityContext:
          runAsUser: 2000
          runAsGroup: 3000
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
      securityContext:
        runAsNonRoot: true
        seccompProfile:
          type: RuntimeDefault
        sysctls:
        - name: kernel.shm_rmid_forced
          value: "0"
        - name: net.core.somaxconn
          value: "1024"
        - name: kernel.msgmax
          value: "65536"
  2. Create the pod using the following command:

    $ oc apply -f sysctl-example-unsafe.yaml
  3. Verify that the pod is scheduled but does not deploy, because unsafe sysctls are not allowed for the node, by running the following command:

    $ oc get pod

    Example output

    NAME                       READY             STATUS            RESTARTS   AGE
    sysctl-example-unsafe      0/1               SysctlForbidden   0          14s
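
Because the pod specification does not distinguish safe from unsafe sysctls, it can help to classify the names before you apply the file. The following sketch is not an OpenShift tool. It hard-codes only the system-wide safe list from Table 7.4, and the function name is_safe_sysctl is hypothetical. Anything it reports as unsafe must be allowed on the node before the pod can start.

```shell
# Hypothetical helper: succeeds if the name is in the system-wide safe list
# from Table 7.4. Interface-specific sysctls (Table 7.5) are not covered.
is_safe_sysctl() {
  case $1 in
    kernel.shm_rmid_forced | \
    net.ipv4.ip_local_port_range | \
    net.ipv4.tcp_syncookies | \
    net.ipv4.ping_group_range | \
    net.ipv4.ip_unprivileged_port_start)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Classify the three sysctls used in the example pod.
for key in kernel.shm_rmid_forced net.core.somaxconn kernel.msgmax; do
  if is_safe_sysctl "$key"; then
    echo "$key: safe"
  else
    echo "$key: unsafe"
  fi
done
```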

7.9.6. Enabling unsafe sysctls

A cluster administrator can allow certain unsafe sysctls for very special situations such as high performance or real-time application tuning.

If you want to use unsafe sysctls, a cluster administrator must enable them individually for a specific type of node. The sysctls must be namespaced.

You can further control which sysctls are set in pods by specifying lists of sysctls or sysctl patterns in the allowedUnsafeSysctls field of the Security Context Constraints.

  • The allowedUnsafeSysctls option controls specific needs such as high performance or real-time application tuning.
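
An allowedUnsafeSysctls list can mix exact names with patterns. The following sketch shows one way to reason about such a list, assuming that an entry ending in * is treated as a prefix match and any other entry as an exact name. It is an illustration only, not the code that the kubelet or the Security Context Constraints admission uses, and the function name sysctl_allowed is hypothetical.

```shell
# Hypothetical helper: succeeds if NAME matches any of the given entries.
# Usage: sysctl_allowed NAME ENTRY...
sysctl_allowed() {
  name=$1
  shift
  for pattern in "$@"; do
    case $pattern in
      *\*)
        # Entry ends in '*': compare the part before it as a prefix.
        prefix=${pattern%\*}
        case $name in
          "$prefix"*) return 0 ;;
        esac
        ;;
      *)
        # Plain entry: require an exact match.
        if [ "$name" = "$pattern" ]; then
          return 0
        fi
        ;;
    esac
  done
  return 1
}

# kernel.msgmax matches the kernel.msg* entry; kernel.shmmax matches nothing.
sysctl_allowed kernel.msgmax 'kernel.msg*' net.core.somaxconn && echo "kernel.msgmax: allowed"
sysctl_allowed kernel.shmmax 'kernel.msg*' net.core.somaxconn || echo "kernel.shmmax: not allowed"
```
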
Warning

Because they are unsafe by nature, you use unsafe sysctls at your own risk. They can lead to severe problems, such as improper behavior of containers, resource shortage, or breaking a node.

Procedure

  1. List the existing machine config pools for your OpenShift Container Platform cluster to decide which pool to label by running the following command:

    $ oc get machineconfigpool

    Example output

    NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
    master   rendered-master-bfb92f0cd1684e54d8e234ab7423cc96   True      False      False      3              3                   3                     0                      42m
    worker   rendered-worker-21b6cb9a0f8919c88caf39db80ac1fce   True      False      False      3              3                   3                     0                      42m

  2. Add a label to the machine config pool where the containers with the unsafe sysctls will run by running the following command:

    $ oc label machineconfigpool worker custom-kubelet=sysctl
  3. Create a YAML file set-sysctl-worker.yaml that defines a KubeletConfig custom resource (CR):

    apiVersion: machineconfiguration.openshift.io/v1
    kind: KubeletConfig
    metadata:
      name: custom-kubelet
    spec:
      machineConfigPoolSelector:
        matchLabels:
          custom-kubelet: sysctl   # (1)
      kubeletConfig:
        allowedUnsafeSysctls:      # (2)
          - "kernel.msg*"
          - "net.core.somaxconn"

    (1) Specify the label from the machine config pool.
    (2) List the unsafe sysctls you want to allow.
  4. Create the object by running the following command:

    $ oc apply -f set-sysctl-worker.yaml
  5. Wait for the Machine Config Operator to generate the new rendered configuration and apply it to the machines by running the following command:

    $ oc get machineconfigpool worker -w

    After some minutes the UPDATING status changes from True to False:

    NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
    worker   rendered-worker-f1704a00fc6f30d3a7de9a15fd68a800   False     True       False      3              2                   2                     0                      71m
    worker   rendered-worker-f1704a00fc6f30d3a7de9a15fd68a800   False     True       False      3              2                   3                     0                      72m
    worker   rendered-worker-0188658afe1f3a183ec8c4f14186f4d5   True      False      False      3              3                   3                     0                      72m
  6. Create a YAML file sysctl-example-safe-unsafe.yaml that defines an example pod and add the securityContext spec, as shown in the following example:

    apiVersion: v1
    kind: Pod
    metadata:
      name: sysctl-example-safe-unsafe
    spec:
      containers:
      - name: podexample
        image: centos
        command: ["/bin/bash", "-c", "sleep INF"]
        securityContext:
          runAsUser: 2000
          runAsGroup: 3000
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
      securityContext:
        runAsNonRoot: true
        seccompProfile:
          type: RuntimeDefault
        sysctls:
        - name: kernel.shm_rmid_forced
          value: "0"
        - name: net.core.somaxconn
          value: "1024"
        - name: kernel.msgmax
          value: "65536"
  7. Create the pod by running the following command:

    $ oc apply -f sysctl-example-safe-unsafe.yaml

    Expected output

    Warning: would violate PodSecurity "restricted:latest": forbidden sysctls (net.core.somaxconn, kernel.msgmax)
    pod/sysctl-example-safe-unsafe created

  8. Verify that the pod is created by running the following command:

    $ oc get pod

    Example output

    NAME                         READY   STATUS    RESTARTS   AGE
    sysctl-example-safe-unsafe   1/1     Running   0          19s

  9. Log in to the pod by running the following command:

    $ oc rsh sysctl-example-safe-unsafe
  10. Verify the values of the configured sysctl flags. For example, find the value of net.core.somaxconn by running the following command:

    sh-4.4# sysctl net.core.somaxconn

    Expected output

    net.core.somaxconn = 1024

The unsafe sysctl is now allowed and the value is set as defined in the securityContext spec of the updated pod specification.
