Chapter 6. Working with containers
6.1. Understanding Containers
The basic units of OpenShift Container Platform applications are called containers. Linux container technologies are lightweight mechanisms for isolating running processes so that they are limited to interacting with only their designated resources.
Many application instances can be running in containers on a single host without visibility into each other's processes, files, network, and so on. Typically, each container provides a single service (often called a "micro-service"), such as a web server or a database, though containers can be used for arbitrary workloads.
The Linux kernel has been incorporating capabilities for container technologies for years. OpenShift Container Platform and Kubernetes add the ability to orchestrate containers across multi-host installations.
About containers and RHEL kernel memory
Due to Red Hat Enterprise Linux (RHEL) behavior, a container on a node with high CPU usage might seem to consume more memory than expected. The higher memory consumption could be caused by the kmem_cache in the RHEL kernel. The RHEL kernel creates a kmem_cache for each cgroup. For added performance, the kmem_cache contains a cpu_cache, and a node cache for any NUMA nodes. These caches all consume kernel memory.
The amount of memory stored in those caches is proportional to the number of CPUs that the system uses. As a result, a higher number of CPUs results in a greater amount of kernel memory being held in these caches. Higher amounts of kernel memory in these caches can cause OpenShift Container Platform containers to exceed the configured memory limits, resulting in the container being killed.
To avoid losing containers due to kernel memory issues, ensure that the containers request sufficient memory. You can use the following formula to estimate the amount of memory consumed by the kmem_cache, where nproc is the number of processing units available that are reported by the nproc command. The lower limit of container requests should be this value plus the container memory requirements:
$(nproc) X 1/2 MiB
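As a worked example of the formula above, the following sketch computes the estimated kmem_cache overhead and the resulting lower limit for a container memory request. The function names, the 64-CPU node, and the 256 MiB application requirement are illustrative assumptions, not values from the product.

```python
def kmem_cache_estimate_mib(cpu_count: int) -> float:
    """Estimate kmem_cache overhead in MiB using nproc x 1/2 MiB."""
    return cpu_count * 0.5


def minimum_memory_request_mib(cpu_count: int, app_memory_mib: float) -> float:
    """Lower limit for the request: cache estimate plus the app's own needs."""
    return kmem_cache_estimate_mib(cpu_count) + app_memory_mib


# Hypothetical node with 64 processing units and an app that needs 256 MiB.
print(kmem_cache_estimate_mib(64))          # 32.0
print(minimum_memory_request_mib(64, 256))  # 288.0
```

In this hypothetical case, the container should request at least 288 MiB. On a real node, take the CPU count from the nproc command.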
6.2. Using Init Containers to perform tasks before a pod is deployed
OpenShift Container Platform provides init containers, which are specialized containers that run before application containers and can contain utilities or setup scripts not present in an app image.
6.2.1. Understanding Init Containers
You can use an Init Container resource to perform tasks before the rest of a pod is deployed.
A pod can have Init Containers in addition to application containers. Init containers allow you to reorganize setup scripts and binding code.
An Init Container can:
- Contain and run utilities that are not desirable to include in the app Container image for security reasons.
- Contain utilities or custom code for setup that is not present in an app image. For example, there is no requirement to make an image FROM another image just to use a tool like sed, awk, python, or dig during setup.
- Use Linux namespaces so that they have different filesystem views from app containers, such as access to secrets that application containers are not able to access.
Each Init Container must complete successfully before the next one is started. Init Containers therefore provide an easy way to block or delay the startup of app containers until a set of preconditions is met.
For example, the following are some ways you can use Init Containers:
- Wait for a service to be created with a shell command like:
for i in {1..100}; do sleep 1; if dig myservice; then exit 0; fi; done; exit 1
- Register this pod with a remote server from the downward API with a command like:
$ curl -X POST http://$MANAGEMENT_SERVICE_HOST:$MANAGEMENT_SERVICE_PORT/register -d 'instance=$()&ip=$()'
- Wait for some time before starting the app Container with a command like sleep 60.
- Clone a git repository into a volume.
- Place values into a configuration file and run a template tool to dynamically generate a configuration file for the main app Container. For example, place the POD_IP value in a configuration and generate the main app configuration file using Jinja.
See the Kubernetes documentation for more information.
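The first use case above, waiting for a service with a bounded retry loop, can also be sketched in Python for images that ship an interpreter instead of dig. This is a minimal sketch under stated assumptions: the helper names service_resolves and wait_for are hypothetical, and the defaults of 100 attempts and a 1 second delay simply mirror the shell loop.

```python
import socket
import time


def service_resolves(hostname: str) -> bool:
    """Return True if the hostname currently resolves through DNS."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def wait_for(check, attempts: int = 100, delay: float = 1.0, sleep=time.sleep) -> bool:
    """Sleep, then call check(), up to `attempts` times; True on first success."""
    for _ in range(attempts):
        sleep(delay)
        if check():
            return True
    return False


# A hypothetical Init Container entrypoint would end with something like:
#   sys.exit(0 if wait_for(lambda: service_resolves("myservice")) else 1)
```

Returning a non-zero exit status on timeout matters because an Init Container must complete successfully before the next container starts.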
6.2.2. Creating Init Containers
The following example outlines a simple pod which has two Init Containers. The first waits for myservice and the second waits for mydb. After both containers complete, the pod begins.
Procedure
Create a YAML file for the Init Container:
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app: myapp
spec:
  containers:
  - name: myapp-container
    image: registry.access.redhat.com/ubi8/ubi:latest
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
  initContainers:
  - name: init-myservice
    image: registry.access.redhat.com/ubi8/ubi:latest
    command: ['sh', '-c', 'until getent hosts myservice; do echo waiting for myservice; sleep 2; done;']
  - name: init-mydb
    image: registry.access.redhat.com/ubi8/ubi:latest
    command: ['sh', '-c', 'until getent hosts mydb; do echo waiting for mydb; sleep 2; done;']
Create a YAML file for the myservice service:
kind: Service
apiVersion: v1
metadata:
  name: myservice
spec:
  ports:
  - protocol: TCP
    port: 80
    targetPort: 9376
Create a YAML file for the mydb service:
kind: Service
apiVersion: v1
metadata:
  name: mydb
spec:
  ports:
  - protocol: TCP
    port: 80
    targetPort: 9377
Run the following command to create the myapp-pod:
$ oc create -f myapp.yaml
Example output
pod/myapp-pod created
View the status of the pod:
$ oc get pods
Example output
NAME        READY   STATUS     RESTARTS   AGE
myapp-pod   0/1     Init:0/2   0          5s
Note that the pod status, Init:0/2, indicates that the pod is waiting for its two Init Containers to complete.
Run the following commands to create the services:
$ oc create -f mydb.yaml
$ oc create -f myservice.yaml
View the status of the pod:
$ oc get pods
Example output
NAME        READY   STATUS    RESTARTS   AGE
myapp-pod   1/1     Running   0          2m
6.3. Using volumes to persist container data
Files in a container are ephemeral. As such, when a container crashes or stops, the data is lost. You can use volumes to persist the data used by the containers in a pod. A volume is a directory, accessible to the containers in a pod, where data is stored for the life of the pod.
6.3.1. Understanding volumes
Volumes are mounted file systems available to pods and their containers which may be backed by a number of host-local or network attached storage endpoints. Containers are not persistent by default; on restart, their contents are cleared.
To ensure that the file system on the volume contains no errors and, if errors are present, to repair them when possible, OpenShift Container Platform invokes the fsck utility prior to the mount utility. This occurs when either adding a volume or updating an existing volume.
The simplest volume type is emptyDir, which is a temporary directory on a single machine. Administrators may also allow you to request a persistent volume that is automatically attached to your pods.
emptyDir volume storage may be restricted by a quota based on the pod’s FSGroup, if the FSGroup parameter is enabled by your cluster administrator.
6.3.2. Working with volumes using the OpenShift Container Platform CLI
You can use the CLI command oc set volume to add and remove volumes and volume mounts for any object that has a pod template, such as replication controllers or deployment configs. You can also list volumes in pods or any object that has a pod template.
The oc set volume command uses the following general syntax:
$ oc set volume <object_selection> <operation> <mandatory_parameters> <options>
- Object selection
- Specify one of the following for the object_selection parameter in the oc set volume command:
Syntax | Description | Example |
---|---|---|
| Selects | |
| Selects | |
| Selects resources of type | |
| Selects all resources of type | |
| File name, directory, or URL to file to use to edit the resource. | |
- Operation
- Specify --add or --remove for the operation parameter in the oc set volume command.
- Mandatory parameters
- Any mandatory parameters are specific to the selected operation and are discussed in later sections.
- Options
- Any options are specific to the selected operation and are discussed in later sections.
6.3.3. Listing volumes and volume mounts in a pod
You can list volumes and volume mounts in pods or pod templates:
Procedure
To list volumes:
$ oc set volume <object_type>/<name> [options]
List volume supported options:
Option | Description | Default |
---|---|---|
| Name of the volume. | |
| Select containers by name. It can also take wildcard | |
For example:
To list all volumes for pod p1:
$ oc set volume pod/p1
To list volume v1 defined on all deployment configs:
$ oc set volume dc --all --name=v1
6.3.4. Adding volumes to a pod
You can add volumes and volume mounts to a pod.
Procedure
To add a volume, a volume mount, or both to pod templates:
$ oc set volume <object_type>/<name> --add [options]
Option | Description | Default |
---|---|---|
| Name of the volume. | Automatically generated, if not specified. |
| Name of the volume source. Supported values: | |
| Select containers by name. It can also take wildcard | |
| Mount path inside the selected containers. Do not mount to the container root, | |
| Host path. Mandatory parameter for | |
| Name of the secret. Mandatory parameter for | |
| Name of the configmap. Mandatory parameter for | |
| Name of the persistent volume claim. Mandatory parameter for | |
| Details of volume source as a JSON string. Recommended if the desired volume source is not supported by | |
| Display the modified objects instead of updating them on the server. Supported values: | |
| Output the modified objects with the given version. | |
For example:
To add a new volume source emptyDir to the registry DeploymentConfig object:
$ oc set volume dc/registry --add
To add volume v1 with secret secret1 for replication controller r1 and mount inside the containers at /data:
$ oc set volume rc/r1 --add --name=v1 --type=secret --secret-name='secret1' --mount-path=/data
To add existing persistent volume v1 with claim name pvc1 to deployment configuration dc.json on disk, mount the volume on container c1 at /data, and update the DeploymentConfig object on the server:
$ oc set volume -f dc.json --add --name=v1 --type=persistentVolumeClaim \
  --claim-name=pvc1 --mount-path=/data --containers=c1
To add a volume v1 based on Git repository https://github.com/namespace1/project1 with revision 5125c45f9f563 for all replication controllers:
$ oc set volume rc --all --add --name=v1 \
  --source='{"gitRepo": { "repository": "https://github.com/namespace1/project1", "revision": "5125c45f9f563" }}'
6.3.5. Updating volumes and volume mounts in a pod
You can modify the volumes and volume mounts in a pod.
Procedure
To update existing volumes, use the --overwrite option:
$ oc set volume <object_type>/<name> --add --overwrite [options]
For example:
To replace existing volume v1 for replication controller r1 with existing persistent volume claim pvc1:
$ oc set volume rc/r1 --add --overwrite --name=v1 --type=persistentVolumeClaim --claim-name=pvc1
To change the DeploymentConfig object d1 mount point to /opt for volume v1:
$ oc set volume dc/d1 --add --overwrite --name=v1 --mount-path=/opt
6.3.6. Removing volumes and volume mounts from a pod
You can remove a volume or volume mount from a pod.
Procedure
To remove a volume from pod templates:
$ oc set volume <object_type>/<name> --remove [options]
Option | Description | Default |
---|---|---|
| Name of the volume. | |
| Select containers by name. It can also take wildcard | |
| Indicate that you want to remove multiple volumes at once. | |
| Display the modified objects instead of updating them on the server. Supported values: | |
| Output the modified objects with the given version. | |
For example:
To remove a volume v1 from the DeploymentConfig object d1:
$ oc set volume dc/d1 --remove --name=v1
To unmount volume v1 from container c1 for the DeploymentConfig object d1 and remove the volume v1 if it is not referenced by any containers on d1:
$ oc set volume dc/d1 --remove --name=v1 --containers=c1
To remove all volumes for replication controller r1:
$ oc set volume rc/r1 --remove --confirm
6.3.7. Configuring volumes for multiple uses in a pod
You can configure a volume so that it can be shared for multiple uses in a single pod. Use the volumeMounts.subPath property to specify a subPath value inside a volume instead of the volume’s root.
Procedure
To view the list of files in the volume, run the oc rsh command:
$ oc rsh <pod>
Example output
sh-4.2$ ls /path/to/volume/subpath/mount
example_file1 example_file2 example_file3
Specify the subPath:
Example Pod spec with subPath parameter
apiVersion: v1
kind: Pod
metadata:
  name: my-site
spec:
  containers:
  - name: mysql
    image: mysql
    volumeMounts:
    - mountPath: /var/lib/mysql
      name: site-data
      subPath: mysql
  - name: php
    image: php
    volumeMounts:
    - mountPath: /var/www/html
      name: site-data
      subPath: html
  volumes:
  - name: site-data
    persistentVolumeClaim:
      claimName: my-site-data
6.4. Mapping volumes using projected volumes
A projected volume maps several existing volume sources into the same directory.
The following types of volume sources can be projected:
- Secrets
- Config Maps
- Downward API
All sources are required to be in the same namespace as the pod.
6.4.1. Understanding projected volumes
Projected volumes can map any combination of these volume sources into a single directory, allowing the user to:
- Automatically populate a single volume with the keys from multiple secrets, config maps, and with downward API information, so that you can synthesize a single directory with various sources of information.
- Populate a single volume with the keys from multiple secrets, config maps, and with downward API information, explicitly specifying paths for each item, so that you have full control over the contents of that volume.
The following general scenarios show how you can use projected volumes.
- Config map, secrets, Downward API.
- Projected volumes allow you to deploy containers with configuration data that includes passwords. An application using these resources could be deploying Red Hat OpenStack Platform (RHOSP) on Kubernetes. The configuration data might have to be assembled differently depending on whether the services are going to be used for production or for testing. If a pod is labeled with production or testing, the downward API selector metadata.labels can be used to produce the correct RHOSP configs.
- Config map + secrets.
- Projected volumes allow you to deploy containers involving configuration data and passwords. For example, you might execute a config map with some sensitive encrypted tasks that are decrypted using a vault password file.
- ConfigMap + Downward API.
- Projected volumes allow you to generate a config including the pod name (available via the metadata.name selector). This application can then pass the pod name along with requests in order to easily determine the source without using IP tracking.
- Secrets + Downward API.
- Projected volumes allow you to use a secret as a public key to encrypt the namespace of the pod (available via the metadata.namespace selector). This example allows the Operator to use the application to deliver the namespace information securely without using an encrypted transport.
6.4.1.1. Example Pod specs
The following are examples of Pod specs for creating projected volumes.
Pod with a secret, a Downward API, and a config map
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: container-test
    image: busybox
    volumeMounts: # 1
    - name: all-in-one
      mountPath: "/projected-volume" # 2
      readOnly: true # 3
  volumes: # 4
  - name: all-in-one # 5
    projected:
      defaultMode: 0400 # 6
      sources:
      - secret:
          name: mysecret # 7
          items:
            - key: username
              path: my-group/my-username # 8
      - downwardAPI: # 9
          items:
            - path: "labels"
              fieldRef:
                fieldPath: metadata.labels
            - path: "cpu_limit"
              resourceFieldRef:
                containerName: container-test
                resource: limits.cpu
      - configMap: # 10
          name: myconfigmap
          items:
            - key: config
              path: my-group/my-config
              mode: 0777 # 11
- 1
- Add a volumeMounts section for each container that needs the secret.
- 2
- Specify a path to an unused directory where the secret will appear.
- 3
- Set readOnly to true.
- 4
- Add a volumes block to list each projected volume source.
- 5
- Specify any name for the volume.
- 6
- Set the default permission mode for the files. Here, 0400 grants read-only access to the file owner.
- 7
- Add a secret. Enter the name of the secret object. Each secret you want to use must be listed.
- 8
- Specify the path to the secrets file under the mountPath. Here, the secrets file is in /projected-volume/my-group/my-username.
- 9
- Add a Downward API source.
- 10
- Add a ConfigMap source.
- 11
- Set the mode for the specific projection.
If there are multiple containers in the pod, each container needs a volumeMounts section, but only one volumes section is needed.
Pod with multiple secrets with a non-default permission mode set
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: container-test
    image: busybox
    volumeMounts:
    - name: all-in-one
      mountPath: "/projected-volume"
      readOnly: true
  volumes:
  - name: all-in-one
    projected:
      defaultMode: 0755
      sources:
      - secret:
          name: mysecret
          items:
            - key: username
              path: my-group/my-username
      - secret:
          name: mysecret2
          items:
            - key: password
              path: my-group/my-password
              mode: 511
The defaultMode can only be specified at the projected level and not for each volume source. However, as illustrated above, you can explicitly set the mode for each individual projection.
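The Pod specs above write permission modes both in octal (0400, 0755, 0777) and in decimal (511). The two notations describe the same permission bits; the following sketch shows the correspondence. The describe_mode helper is a hypothetical name used only for illustration.

```python
import stat


def describe_mode(mode: int) -> str:
    """Render a permission mode as zero-padded octal plus an rwx string."""
    # stat.filemode() prefixes a file-type character, which is dropped here.
    perms = stat.filemode(stat.S_IFREG | mode)[1:]
    return f"{mode:04o} ({perms})"


# Modes from the examples above: three octal literals and one decimal value.
for value in (0o400, 0o755, 0o777, 511):
    print(value, "->", describe_mode(value))
```

Decimal 511 and octal 0777 are the same value, so the mode: 511 entry grants read, write, and execute permission to everyone. The decimal form is needed when a spec is written as JSON, which has no octal notation.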
6.4.1.2. Pathing Considerations
- Collisions Between Keys when Configured Paths are Identical
- If you configure any keys with the same path, the pod spec will not be accepted as valid. In the following example, the specified paths for mysecret and myconfigmap are the same:
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: container-test
    image: busybox
    volumeMounts:
    - name: all-in-one
      mountPath: "/projected-volume"
      readOnly: true
  volumes:
  - name: all-in-one
    projected:
      sources:
      - secret:
          name: mysecret
          items:
            - key: username
              path: my-group/data
      - configMap:
          name: myconfigmap
          items:
            - key: config
              path: my-group/data
Consider the following situations related to the volume file paths.
- Collisions Between Keys without Configured Paths
- Run-time validation can occur only when all the paths are known at pod creation, similar to the above scenario. Otherwise, when a conflict occurs, the most recently specified resource overwrites anything preceding it (this is true for resources that are updated after pod creation as well).
- Collisions when One Path is Explicit and the Other is Automatically Projected
- If a collision occurs because a user-specified path matches data that is automatically projected, the latter resource overwrites anything preceding it, as before.
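To catch the first kind of collision before submitting a Pod spec, you could check the explicitly configured paths yourself. The following is a minimal sketch, assuming the projected sources have already been loaded into Python dictionaries that mirror the YAML structure. The find_path_collisions function is hypothetical and is not the validation that the API server performs; it cannot see automatically projected keys.

```python
from collections import defaultdict


def find_path_collisions(sources):
    """Return {path: [owners]} for explicit item paths used more than once."""
    owners_by_path = defaultdict(list)
    for source in sources:
        # Each source is a one-key mapping such as {"secret": {...}}.
        for kind, spec in source.items():
            for item in spec.get("items", []):
                owner = f"{kind}/{spec.get('name', '')}"
                owners_by_path[item["path"]].append(owner)
    return {path: owners for path, owners in owners_by_path.items()
            if len(owners) > 1}


# The colliding sources from the example above.
sources = [
    {"secret": {"name": "mysecret",
                "items": [{"key": "username", "path": "my-group/data"}]}},
    {"configMap": {"name": "myconfigmap",
                   "items": [{"key": "config", "path": "my-group/data"}]}},
]
print(find_path_collisions(sources))
```

An empty result means that no two explicitly configured items share a path.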
6.4.2. Configuring a Projected Volume for a Pod
When creating projected volumes, consider the volume file path situations described in Understanding projected volumes.
The following example shows how to use a projected volume to mount an existing secret volume source. You can use the steps to create user name and password secrets from local files. You then create a pod that runs one container, using a projected volume to mount the secrets into the same shared directory.
Procedure
To use a projected volume to mount an existing secret volume source:
Create files containing the secrets, entering the following, replacing the password and user information as appropriate:
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
type: Opaque
data:
  pass: MWYyZDFlMmU2N2Rm
  user: YWRtaW4=
The user and pass values can be any valid string that is base64 encoded.
The following example shows admin in base64:
$ echo -n "admin" | base64
Example output
YWRtaW4=
The following example shows the password 1f2d1e2e67df in base64:
$ echo -n "1f2d1e2e67df" | base64
Example output
MWYyZDFlMmU2N2Rm
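If you generate secret files from a script rather than from the shell, the same encoding can be produced with the Python standard library. This sketch reproduces the two values shown above; the helper names are illustrative assumptions.

```python
import base64


def encode_secret_value(value: str) -> str:
    """Base64-encode a string for use in the data section of a Secret."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse the encoding, for example to check an existing Secret."""
    return base64.b64decode(encoded).decode("utf-8")


print(encode_secret_value("admin"))         # YWRtaW4=
print(encode_secret_value("1f2d1e2e67df"))  # MWYyZDFlMmU2N2Rm
```

Like echo -n, the functions encode the string without a trailing newline. Base64 is an encoding, not encryption, so the encoded values are no more secret than the originals.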
Use the following command to create the secrets:
$ oc create -f <secrets-filename>
For example:
$ oc create -f secret.yaml
Example output
secret "mysecret" created
You can check that the secret was created using the following commands:
$ oc get secret <secret-name>
For example:
$ oc get secret mysecret
Example output
NAME       TYPE     DATA   AGE
mysecret   Opaque   2      17h
$ oc get secret <secret-name> -o yaml
For example:
$ oc get secret mysecret -o yaml
apiVersion: v1
data:
  pass: MWYyZDFlMmU2N2Rm
  user: YWRtaW4=
kind: Secret
metadata:
  creationTimestamp: 2017-05-30T20:21:38Z
  name: mysecret
  namespace: default
  resourceVersion: "2107"
  selfLink: /api/v1/namespaces/default/secrets/mysecret
  uid: 959e0424-4575-11e7-9f97-fa163e4bd54c
type: Opaque
Create a pod configuration file similar to the following that includes a volumes section:
apiVersion: v1
kind: Pod
metadata:
  name: test-projected-volume
spec:
  containers:
  - name: test-projected-volume
    image: busybox
    args:
    - sleep
    - "86400"
    volumeMounts:
    - name: all-in-one
      mountPath: "/projected-volume"
      readOnly: true
  volumes:
  - name: all-in-one
    projected:
      sources:
      - secret:
          name: mysecret # the secret created earlier, which holds the user and pass keys
Create the pod from the configuration file:
$ oc create -f <your_yaml_file>.yaml
For example:
$ oc create -f secret-pod.yaml
Example output
pod "test-projected-volume" created
Verify that the pod container is running, and then watch for changes to the pod:
$ oc get pod <name>
For example:
$ oc get pod test-projected-volume
The output should appear similar to the following:
Example output
NAME                    READY   STATUS    RESTARTS   AGE
test-projected-volume   1/1     Running   0          14s
In another terminal, use the oc exec command to open a shell to the running container:
$ oc exec -it <pod> <command>
For example:
$ oc exec -it test-projected-volume -- /bin/sh
In your shell, verify that the projected-volume directory contains your projected sources:
/ # ls
Example output
bin               home              root              tmp
dev               proc              run               usr
etc               projected-volume  sys               var
6.5. Allowing containers to consume API objects
The Downward API is a mechanism that allows containers to consume information about API objects without coupling to OpenShift Container Platform. Such information includes the pod’s name, namespace, and resource values. Containers can consume information from the downward API using environment variables or a volume plug-in.
6.5.1. Expose pod information to Containers using the Downward API
The Downward API contains such information as the pod’s name, project, and resource values. Containers can consume information from the downward API using environment variables or a volume plug-in.
Fields within the pod are selected using the FieldRef API type. FieldRef has two fields:
Field | Description |
---|---|
| The path of the field to select, relative to the pod. |
| The API version to interpret the |
Currently, the valid selectors in the v1 API include:
Selector | Description |
---|---|
| The pod’s name. This is supported in both environment variables and volumes. |
| The pod’s namespace. This is supported in both environment variables and volumes. |
| The pod’s labels. This is only supported in volumes and not in environment variables. |
| The pod’s annotations. This is only supported in volumes and not in environment variables. |
| The pod’s IP. This is only supported in environment variables and not volumes. |
The apiVersion field, if not specified, defaults to the API version of the enclosing pod template.
6.5.2. Understanding how to consume container values using the downward API
Your containers can consume API values using environment variables or a volume plug-in. Depending on the method you choose, containers can consume:
- Pod name
- Pod project/namespace
- Pod annotations
- Pod labels
Annotations and labels are available using only a volume plug-in.
6.5.2.1. Consuming container values using environment variables
When using a container’s environment variables, use the EnvVar type’s valueFrom field (of type EnvVarSource) to specify that the variable’s value should come from a FieldRef source instead of the literal value specified by the value field.
Only constant attributes of the pod can be consumed this way, as environment variables cannot be updated once a process is started in a way that allows the process to be notified that the value of a variable has changed. The fields supported using environment variables are:
- Pod name
- Pod project/namespace
Procedure
To use environment variables:
Create a pod.yaml file:
apiVersion: v1
kind: Pod
metadata:
  name: dapi-env-test-pod
spec:
  containers:
    - name: env-test-container
      image: gcr.io/google_containers/busybox
      command: [ "/bin/sh", "-c", "env" ]
      env:
        - name: MY_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: MY_POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
  restartPolicy: Never
Create the pod from the pod.yaml file:
$ oc create -f pod.yaml
Check the container’s logs for the MY_POD_NAME and MY_POD_NAMESPACE values:
$ oc logs -p dapi-env-test-pod
6.5.2.2. Consuming container values using a volume plug-in
Your containers can consume API values using a volume plug-in.
Containers can consume:
- Pod name
- Pod project/namespace
- Pod annotations
- Pod labels
Procedure
To use the volume plug-in:
Create a volume-pod.yaml file:
kind: Pod
apiVersion: v1
metadata:
  labels:
    zone: us-east-coast
    cluster: downward-api-test-cluster1
    rack: rack-123
  name: dapi-volume-test-pod
  annotations:
    annotation1: "345"
    annotation2: "456"
spec:
  containers:
    - name: volume-test-container
      image: gcr.io/google_containers/busybox
      command: ["sh", "-c", "cat /tmp/etc/pod_labels /tmp/etc/pod_annotations"]
      volumeMounts:
        - name: podinfo
          mountPath: /tmp/etc
          readOnly: false
  volumes:
  - name: podinfo
    downwardAPI:
      defaultMode: 420
      items:
      - fieldRef:
          fieldPath: metadata.name
        path: pod_name
      - fieldRef:
          fieldPath: metadata.namespace
        path: pod_namespace
      - fieldRef:
          fieldPath: metadata.labels
        path: pod_labels
      - fieldRef:
          fieldPath: metadata.annotations
        path: pod_annotations
  restartPolicy: Never
Create the pod from the volume-pod.yaml file:
$ oc create -f volume-pod.yaml
Check the container’s logs and verify the presence of the configured fields:
$ oc logs -p dapi-volume-test-pod
Example output
cluster=downward-api-test-cluster1
rack=rack-123
zone=us-east-coast
annotation1=345
annotation2=456
kubernetes.io/config.source=api
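An application in the container would normally read these files directly instead of printing them. The following minimal sketch parses the contents of a file such as /tmp/etc/pod_labels into a dictionary, assuming one key=value pair per line as in the example output. Kubernetes may write the values wrapped in double quotes, so the sketch strips a surrounding pair of quotes; it does not handle escape sequences inside values. The function name is hypothetical.

```python
def parse_downward_api_file(text: str) -> dict:
    """Parse key=value lines, tolerating optional double quotes around values."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        result[key] = value
    return result


# Sample text copied from the example output above.
sample = "cluster=downward-api-test-cluster1\nrack=rack-123\nzone=us-east-coast\n"
labels = parse_downward_api_file(sample)
print(labels["zone"])  # us-east-coast
```

Because labels and annotations in a volume can change while the pod runs, an application that depends on them would re-read the file rather than cache the parsed result indefinitely.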
6.5.3. Understanding how to consume container resources using the Downward API
When creating pods, you can use the Downward API to inject information about computing resource requests and limits so that image and application authors can correctly create an image for specific environments.
You can do this using environment variables or a volume plug-in.
6.5.3.1. Consuming container resources using environment variables
When creating pods, you can use the Downward API to inject information about computing resource requests and limits using environment variables.
Procedure
To use environment variables:
When creating a pod configuration, specify environment variables that correspond to the contents of the resources field in the spec.container field:
....
spec:
  containers:
    - name: test-container
      image: gcr.io/google_containers/busybox:1.24
      command: [ "/bin/sh", "-c", "env" ]
      resources:
        requests:
          memory: "32Mi"
          cpu: "125m"
        limits:
          memory: "64Mi"
          cpu: "250m"
      env:
        - name: MY_CPU_REQUEST
          valueFrom:
            resourceFieldRef:
              resource: requests.cpu
        - name: MY_CPU_LIMIT
          valueFrom:
            resourceFieldRef:
              resource: limits.cpu
        - name: MY_MEM_REQUEST
          valueFrom:
            resourceFieldRef:
              resource: requests.memory
        - name: MY_MEM_LIMIT
          valueFrom:
            resourceFieldRef:
              resource: limits.memory
....
If the resource limits are not included in the container configuration, the downward API defaults to the node’s CPU and memory allocatable values.
Create the pod from the pod.yaml file:
$ oc create -f pod.yaml
6.5.3.2. Consuming container resources using a volume plug-in
When creating pods, you can use the Downward API to inject information about computing resource requests and limits using a volume plug-in.
Procedure
To use the volume plug-in:
When creating a pod configuration, use the spec.volumes.downwardAPI.items field to describe the desired resources that correspond to the spec.resources field:
....
spec:
  containers:
    - name: client-container
      image: gcr.io/google_containers/busybox:1.24
      command: ["sh", "-c", "while true; do echo; if [[ -e /etc/cpu_limit ]]; then cat /etc/cpu_limit; fi; if [[ -e /etc/cpu_request ]]; then cat /etc/cpu_request; fi; if [[ -e /etc/mem_limit ]]; then cat /etc/mem_limit; fi; if [[ -e /etc/mem_request ]]; then cat /etc/mem_request; fi; sleep 5; done"]
      resources:
        requests:
          memory: "32Mi"
          cpu: "125m"
        limits:
          memory: "64Mi"
          cpu: "250m"
      volumeMounts:
        - name: podinfo
          mountPath: /etc
          readOnly: false
  volumes:
    - name: podinfo
      downwardAPI:
        items:
          - path: "cpu_limit"
            resourceFieldRef:
              containerName: client-container
              resource: limits.cpu
          - path: "cpu_request"
            resourceFieldRef:
              containerName: client-container
              resource: requests.cpu
          - path: "mem_limit"
            resourceFieldRef:
              containerName: client-container
              resource: limits.memory
          - path: "mem_request"
            resourceFieldRef:
              containerName: client-container
              resource: requests.memory
....
If the resource limits are not included in the container configuration, the Downward API defaults to the node’s CPU and memory allocatable values.
Create the pod from the volume-pod.yaml file:
$ oc create -f volume-pod.yaml
6.5.4. Consuming secrets using the Downward API
When creating pods, you can use the downward API to inject secrets so image and application authors can create an image for specific environments.
Procedure
Create a secret.yaml file:
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
data:
  password: cGFzc3dvcmQ=
  username: ZGV2ZWxvcGVy
type: kubernetes.io/basic-auth
Create a Secret object from the secret.yaml file:
$ oc create -f secret.yaml
Create a pod.yaml file that references the username field from the above Secret object:
apiVersion: v1
kind: Pod
metadata:
  name: dapi-env-test-pod
spec:
  containers:
    - name: env-test-container
      image: gcr.io/google_containers/busybox
      command: [ "/bin/sh", "-c", "env" ]
      env:
        - name: MY_SECRET_USERNAME
          valueFrom:
            secretKeyRef:
              name: mysecret
              key: username
  restartPolicy: Never
Create the pod from the pod.yaml file:
$ oc create -f pod.yaml
Check the container’s logs for the MY_SECRET_USERNAME value:
$ oc logs -p dapi-env-test-pod
6.5.5. Consuming configuration maps using the Downward API
When creating pods, you can use the Downward API to inject configuration map values so image and application authors can create an image for specific environments.
Procedure
Create a configmap.yaml file:
apiVersion: v1
kind: ConfigMap
metadata:
  name: myconfigmap
data:
  mykey: myvalue
Create a ConfigMap object from the configmap.yaml file:
$ oc create -f configmap.yaml
Create a pod.yaml file that references the above ConfigMap object:
apiVersion: v1
kind: Pod
metadata:
  name: dapi-env-test-pod
spec:
  containers:
    - name: env-test-container
      image: gcr.io/google_containers/busybox
      command: [ "/bin/sh", "-c", "env" ]
      env:
        - name: MY_CONFIGMAP_VALUE
          valueFrom:
            configMapKeyRef:
              name: myconfigmap
              key: mykey
  restartPolicy: Always
Create the pod from the pod.yaml file:
$ oc create -f pod.yaml
Check the container’s logs for the MY_CONFIGMAP_VALUE value:
$ oc logs -p dapi-env-test-pod
6.5.6. Referencing environment variables
When creating pods, you can reference the value of a previously defined environment variable by using the $() syntax. If the environment variable reference cannot be resolved, the value is left as the provided string.
Procedure
Create a pod.yaml file that references an existing environment variable:
apiVersion: v1
kind: Pod
metadata:
  name: dapi-env-test-pod
spec:
  containers:
    - name: env-test-container
      image: gcr.io/google_containers/busybox
      command: [ "/bin/sh", "-c", "env" ]
      env:
        - name: MY_EXISTING_ENV
          value: my_value
        - name: MY_ENV_VAR_REF_ENV
          value: $(MY_EXISTING_ENV)
  restartPolicy: Never
Create the pod from the pod.yaml file:
$ oc create -f pod.yaml
Check the container’s logs for the MY_ENV_VAR_REF_ENV value:
$ oc logs -p dapi-env-test-pod
6.5.7. Escaping environment variable references
When creating a pod, you can escape an environment variable reference by using a double dollar sign. The value will then be set to a single dollar sign version of the provided value.
Procedure
Create a pod.yaml file that escapes an environment variable reference:
apiVersion: v1
kind: Pod
metadata:
  name: dapi-env-test-pod
spec:
  containers:
    - name: env-test-container
      image: gcr.io/google_containers/busybox
      command: [ "/bin/sh", "-c", "env" ]
      env:
        - name: MY_NEW_ENV
          value: $$(SOME_OTHER_ENV)
  restartPolicy: Never
Create the pod from the pod.yaml file:
$ oc create -f pod.yaml
Check the container’s logs for the MY_NEW_ENV value:
$ oc logs -p dapi-env-test-pod
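The reference and escape rules from the two preceding sections can be modeled in a few lines. The following is a minimal sketch, not the platform's implementation: the function name expand and the env dictionary are hypothetical, and the handling of an unterminated $( is an assumption.

```python
def expand(value: str, env: dict) -> str:
    """Expand $(VAR) references in value; "$$" is an escaped dollar sign."""
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch != "$" or i + 1 >= len(value):
            out.append(ch)
            i += 1
            continue
        nxt = value[i + 1]
        if nxt == "$":
            # "$$" collapses to a single "$"; the text after it stays literal.
            out.append("$")
            i += 2
        elif nxt == "(":
            end = value.find(")", i + 2)
            if end == -1:
                # Assumption: an unterminated "$(" is kept unchanged.
                out.append(value[i:])
                break
            name = value[i + 2:end]
            # An unresolved reference is left as the provided string.
            out.append(env.get(name, value[i:end + 1]))
            i = end + 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


env = {"MY_EXISTING_ENV": "my_value"}
print(expand("$(MY_EXISTING_ENV)", env))   # resolved reference
print(expand("$(UNDEFINED_ENV)", env))     # unresolved, left as-is
print(expand("$$(SOME_OTHER_ENV)", env))   # escaped reference
```

With these inputs, the first call yields my_value, the second leaves $(UNDEFINED_ENV) untouched, and the third produces the single dollar sign form $(SOME_OTHER_ENV).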
6.6. Copying files to or from an OpenShift Container Platform container
You can use the CLI to copy local files to or from a remote directory in a container using the rsync command.
6.6.1. Understanding how to copy files
The oc rsync command, or remote sync, is a useful tool for copying database archives to and from your pods for backup and restore purposes. You can also use oc rsync to copy source code changes into a running pod for development debugging, when the running pod supports hot reload of source files.
$ oc rsync <source> <destination> [-c <container>]
6.6.1.1. Requirements
- Specifying the Copy Source
The source argument of the oc rsync command must point to either a local directory or a pod directory. Individual files are not supported.
When specifying a pod directory, the directory name must be prefixed with the pod name:
<pod name>:<dir>
If the directory name ends in a path separator (/), only the contents of the directory are copied to the destination. Otherwise, the directory and its contents are copied to the destination.
- Specifying the Copy Destination
The destination argument of the oc rsync command must point to a directory. If the directory does not exist, but rsync is used for copy, the directory is created for you.
- Deleting Files at the Destination
The --delete flag may be used to delete any files in the remote directory that are not in the local directory.
- Continuous Syncing on File Change
Using the --watch option causes the command to monitor the source path for any file system changes, and synchronizes changes when they occur. With this argument, the command runs forever.
Synchronization occurs after short quiet periods to ensure a rapidly changing file system does not result in continuous synchronization calls.
When using the --watch option, the behavior is effectively the same as manually invoking oc rsync repeatedly, including any arguments normally passed to oc rsync. Therefore, you can control the behavior via the same flags used with manual invocations of oc rsync, such as --delete.
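The trailing path separator rule for the copy source is easy to get wrong, so the following sketch models it. It is an illustration of the rule as stated above, not code from oc rsync; the function name synced_root is hypothetical.

```python
import posixpath


def synced_root(source_dir: str, destination_dir: str) -> str:
    """Return the destination path that ends up holding the source's contents."""
    if source_dir.endswith("/"):
        # Trailing separator: only the contents are copied into the destination.
        return destination_dir.rstrip("/") or "/"
    # No trailing separator: the directory itself is created under the destination.
    return posixpath.join(destination_dir, posixpath.basename(source_dir))


print(synced_root("/home/user/source", "/src"))    # /src/source
print(synced_root("/home/user/source/", "/src"))   # /src
```

In other words, copying /home/user/source to /src places the files under /src/source, while copying /home/user/source/ places them directly under /src.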
6.6.2. Copying files to and from containers
Support for copying local files to or from a container is built into the CLI.
Prerequisites
When working with oc rsync, note the following:
- rsync must be installed
The oc rsync command uses the local rsync tool if present on the client machine and the remote container.
If rsync is not found locally or in the remote container, a tar archive is created locally and sent to the container where the tar utility is used to extract the files. If tar is not available in the remote container, the copy will fail.
The tar copy method does not provide the same functionality as oc rsync. For example, oc rsync creates the destination directory if it does not exist and only sends files that are different between the source and the destination.
In Windows, the cwRsync client should be installed and added to the PATH for use with the oc rsync command.
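The fallback behavior described above amounts to a small decision rule. The sketch below restates it under the assumption that tool availability is already known as three booleans; the function name copy_method and its parameters are hypothetical and do not correspond to anything in the CLI.

```python
def copy_method(local_rsync: bool, remote_rsync: bool, remote_tar: bool) -> str:
    """Pick the transfer method according to the documented fallback order."""
    if local_rsync and remote_rsync:
        # rsync is present on both the client machine and the remote container.
        return "rsync"
    if remote_tar:
        # A tar archive is created locally and extracted in the container.
        return "tar"
    raise RuntimeError("rsync is unavailable and tar is missing in the container; the copy fails")


print(copy_method(True, True, True))     # rsync
print(copy_method(True, False, True))    # tar
```

Remember that the tar method is not equivalent: it does not create a missing destination directory and it sends every file rather than only the differences.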
Procedure
To copy a local directory to a pod directory:
$ oc rsync <local-dir> <pod-name>:/<remote-dir> -c <container-name>
For example:
$ oc rsync /home/user/source devpod1234:/src -c user-container
To copy a pod directory to a local directory:
$ oc rsync devpod1234:/src /home/user/source
Example output
$ oc rsync devpod1234:/src/status.txt /home/user/
6.6.3. Using advanced Rsync features
The oc rsync command exposes fewer command line options than standard rsync. If you want to use a standard rsync command line option that is not available in oc rsync, for example the --exclude-from=FILE option, it might be possible to use standard rsync's --rsh (-e) option or RSYNC_RSH environment variable as a workaround, as follows:
$ rsync --rsh='oc rsh' --exclude-from=FILE SRC POD:DEST
or:
Export the RSYNC_RSH variable:
$ export RSYNC_RSH='oc rsh'
Then, run the rsync command:
$ rsync --exclude-from=FILE SRC POD:DEST
Both of the above examples configure standard rsync to use oc rsh as its remote shell program to enable it to connect to the remote pod, and are an alternative to running oc rsync.
6.7. Executing remote commands in an OpenShift Container Platform container
You can use the CLI to execute remote commands in an OpenShift Container Platform container.
6.7.1. Executing remote commands in containers
Support for remote container command execution is built into the CLI.
Procedure
To run a command in a container:
$ oc exec <pod> [-c <container>] <command> [<arg_1> ... <arg_n>]
For example:
$ oc exec mypod date
Example output
Thu Apr 9 02:21:53 UTC 2015
For security purposes, the oc exec command does not work when accessing privileged containers except when the command is executed by a cluster-admin user.
6.7.2. Protocol for initiating a remote command from a client
Clients initiate the execution of a remote command in a container by issuing a request to the Kubernetes API server:
/proxy/nodes/<node_name>/exec/<namespace>/<pod>/<container>?command=<command>
In the above URL:
- <node_name> is the FQDN of the node.
- <namespace> is the project of the target pod.
- <pod> is the name of the target pod.
- <container> is the name of the target container.
- <command> is the desired command to be executed.
For example:
/proxy/nodes/node123.openshift.com/exec/myns/mypod/mycontainer?command=date
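As an illustration of how the URL template is filled in, the following sketch assembles the request path from its parts. The function name exec_path is hypothetical, and percent-encoding the segments and the command is an assumption made here for safety; the text above only defines the template.

```python
from urllib.parse import quote, urlencode


def exec_path(node_name: str, namespace: str, pod: str, container: str, command: str) -> str:
    """Build the exec request path for a command in a container."""
    # Assumption: each path segment is percent-encoded before substitution.
    node, ns, pod_name, ctr = (
        quote(part, safe="") for part in (node_name, namespace, pod, container)
    )
    query = urlencode({"command": command})
    return f"/proxy/nodes/{node}/exec/{ns}/{pod_name}/{ctr}?{query}"


print(exec_path("node123.openshift.com", "myns", "mypod", "mycontainer", "date"))
```

For the values used in the example above, this reproduces /proxy/nodes/node123.openshift.com/exec/myns/mypod/mycontainer?command=date.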
Additionally, the client can add parameters to the request to indicate if:
- the client should send input to the remote container’s command (stdin).
- the client’s terminal is a TTY.
- the remote container’s command should send output from stdout to the client.
- the remote container’s command should send output from stderr to the client.
After sending an exec request to the API server, the client upgrades the connection to one that supports multiplexed streams; the current implementation uses HTTP/2.
The client creates one stream each for stdin, stdout, and stderr. To distinguish among the streams, the client sets the streamType header on the stream to one of stdin, stdout, or stderr.
The client closes all streams, the upgraded connection, and the underlying connection when it is finished with the remote command execution request.
6.8. Using port forwarding to access applications in a container
OpenShift Container Platform supports port forwarding to pods.
6.8.1. Understanding port forwarding
You can use the CLI to forward one or more local ports to a pod. This allows you to listen on a given or random port locally, and have data forwarded to and from given ports in the pod.
Support for port forwarding is built into the CLI:
$ oc port-forward <pod> [<local_port>:]<remote_port> [...[<local_port_n>:]<remote_port_n>]
The CLI listens on each local port specified by the user, forwarding using the protocol described below.
Ports may be specified using the following formats:
| 5000 | The client listens on port 5000 locally and forwards to 5000 in the pod. |
| 6000:5000 | The client listens on port 6000 locally and forwards to 5000 in the pod. |
| :5000 or 0:5000 | The client selects a free local port and forwards to 5000 in the pod. |
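The [<local_port>:]<remote_port> formats can be read as a pair of ports. The following sketch parses one such argument; it is an illustration of the formats listed here, not the parser used by the CLI, and the function name parse_port_spec is hypothetical. A local port of 0 in the result stands for "select a free local port".

```python
def parse_port_spec(spec: str) -> tuple:
    """Parse "[<local_port>:]<remote_port>" into (local_port, remote_port)."""
    local, sep, remote = spec.rpartition(":")
    remote_port = int(remote)
    if not sep:
        # "5000": listen locally on the same port as the remote port.
        local_port = remote_port
    elif local in ("", "0"):
        # ":5000" or "0:5000": 0 means a free local port is selected.
        local_port = 0
    else:
        # "6000:5000": explicit local port.
        local_port = int(local)
    if not 1 <= remote_port <= 65535 or not 0 <= local_port <= 65535:
        raise ValueError(f"port out of range in {spec!r}")
    return local_port, remote_port


for spec in ("5000", "6000:5000", ":5000", "0:5000"):
    print(spec, "->", parse_port_spec(spec))
```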
OpenShift Container Platform handles port-forward requests from clients. Upon receiving a request, OpenShift Container Platform upgrades the response and waits for the client to create port-forwarding streams. When OpenShift Container Platform receives a new stream, it copies data between the stream and the pod’s port.
Architecturally, there are options for forwarding to a pod’s port. The supported OpenShift Container Platform implementation invokes nsenter directly on the node host to enter the pod’s network namespace, then invokes socat to copy data between the stream and the pod’s port. However, a custom implementation could include running a helper pod that then runs nsenter and socat, so that those binaries are not required to be installed on the host.
6.8.2. Using port forwarding
You can use the CLI to port-forward one or more local ports to a pod.
Procedure
Use the following command to listen on the specified port in a pod:
$ oc port-forward <pod> [<local_port>:]<remote_port> [...[<local_port_n>:]<remote_port_n>]
For example:
Use the following command to listen on ports 5000 and 6000 locally and forward data to and from ports 5000 and 6000 in the pod:
$ oc port-forward <pod> 5000 6000
Example output
Forwarding from 127.0.0.1:5000 -> 5000
Forwarding from [::1]:5000 -> 5000
Forwarding from 127.0.0.1:6000 -> 6000
Forwarding from [::1]:6000 -> 6000
Use the following command to listen on port 8888 locally and forward to 5000 in the pod:
$ oc port-forward <pod> 8888:5000
Example output
Forwarding from 127.0.0.1:8888 -> 5000
Forwarding from [::1]:8888 -> 5000
Use the following command to listen on a free port locally and forward to 5000 in the pod:
$ oc port-forward <pod> :5000
Example output
Forwarding from 127.0.0.1:42390 -> 5000
Forwarding from [::1]:42390 -> 5000
Or:
$ oc port-forward <pod> 0:5000
6.8.3. Protocol for initiating port forwarding from a client
Clients initiate port forwarding to a pod by issuing a request to the Kubernetes API server:
/proxy/nodes/<node_name>/portForward/<namespace>/<pod>
In the above URL:
- <node_name> is the FQDN of the node.
- <namespace> is the namespace of the target pod.
- <pod> is the name of the target pod.
For example:
/proxy/nodes/node123.openshift.com/portForward/myns/mypod
After sending a port forward request to the API server, the client upgrades the connection to one that supports multiplexed streams; the current implementation uses Hypertext Transfer Protocol Version 2 (HTTP/2).
The client creates a stream with the port header containing the target port in the pod. All data written to the stream is delivered via the kubelet to the target pod and port. Similarly, all data sent from the pod for that forwarded connection is delivered back to the same stream in the client.
The client closes all streams, the upgraded connection, and the underlying connection when it is finished with the port forwarding request.
6.9. Using sysctls in containers
Sysctl settings are exposed via Kubernetes, allowing users to modify certain kernel parameters at runtime for namespaces within a container. Only sysctls that are namespaced can be set independently on pods. If a sysctl is not namespaced, called node-level, you must use another method of setting the sysctl, such as the Node Tuning Operator. Moreover, only those sysctls considered safe are whitelisted by default; you can manually enable other unsafe sysctls on the node to be available to the user.
6.9.1. About sysctls
In Linux, the sysctl interface allows an administrator to modify kernel parameters at runtime. Parameters are available via the /proc/sys/ virtual process file system. The parameters cover various subsystems, such as:
- kernel (common prefix: kernel.)
- networking (common prefix: net.)
- virtual memory (common prefix: vm.)
- MDADM (common prefix: dev.)
More subsystems are described in Kernel documentation. To get a list of all parameters, run:
$ sudo sysctl -a
6.9.1.1. Namespaced versus node-level sysctls
A number of sysctls are namespaced in the Linux kernels. This means that you can set them independently for each pod on a node. Being namespaced is a requirement for sysctls to be accessible in a pod context within Kubernetes.
The following sysctls are known to be namespaced:
- kernel.shm*
- kernel.msg*
- kernel.sem
- fs.mqueue.*
Additionally, most of the sysctls in the net.* group are known to be namespaced. Their namespace adoption differs based on the kernel version and distributor.
Sysctls that are not namespaced are called node-level and must be set manually by the cluster administrator, either by means of the underlying Linux distribution of the nodes, such as by modifying the /etc/sysctl.conf file, or by using a daemon set with privileged containers. You can also use the Node Tuning Operator to set node-level sysctls.
Consider marking nodes with special sysctls as tainted. Only schedule pods onto them that need those sysctl settings. Use the taints and toleration feature to mark the nodes.
6.9.1.2. Safe versus unsafe sysctls
Sysctls are grouped into safe and unsafe sysctls.
For a sysctl to be considered safe, it must use proper namespacing and must be properly isolated between pods on the same node. This means that if you set a sysctl for one pod it must not:
- Influence any other pod on the node
- Harm the node’s health
- Gain CPU or memory resources outside of the resource limits of a pod
OpenShift Container Platform supports, or whitelists, the following sysctls in the safe set:
- kernel.shm_rmid_forced
- net.ipv4.ip_local_port_range
- net.ipv4.tcp_syncookies
- net.ipv4.ping_group_range
All safe sysctls are enabled by default. You can use a sysctl in a pod by modifying the Pod spec.
Any sysctl not whitelisted by OpenShift Container Platform is considered unsafe for OpenShift Container Platform. Note that being namespaced alone is not sufficient for the sysctl to be considered safe.
All unsafe sysctls are disabled by default, and the cluster administrator must manually enable them on a per-node basis. Pods with disabled unsafe sysctls are scheduled but do not launch.
$ oc get pod
Example output
NAME        READY   STATUS            RESTARTS   AGE
hello-pod   0/1     SysctlForbidden   0          14s
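To make the safe versus unsafe distinction concrete, the following sketch checks a pod's requested sysctls against the safe set listed in this section. It is only an illustration: the names SAFE_SYSCTLS and unsafe_sysctls are hypothetical, and the set reflects the list above rather than a value read from a cluster.

```python
# The safe set as listed in this section; everything else is treated as unsafe.
SAFE_SYSCTLS = frozenset({
    "kernel.shm_rmid_forced",
    "net.ipv4.ip_local_port_range",
    "net.ipv4.tcp_syncookies",
    "net.ipv4.ping_group_range",
})


def unsafe_sysctls(requested: dict) -> list:
    """Return the requested sysctl names that are not in the safe set, sorted."""
    return sorted(name for name in requested if name not in SAFE_SYSCTLS)


requested = {
    "kernel.shm_rmid_forced": "0",
    "net.core.somaxconn": "1024",
    "kernel.msgmax": "65536",
}
print(unsafe_sysctls(requested))
```

For this input, the two names reported are kernel.msgmax and net.core.somaxconn. A pod requesting them would only launch on a node where a cluster administrator has enabled them.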
6.9.2. Setting sysctls for a pod
You can set sysctls on pods using the pod’s securityContext. The securityContext applies to all containers in the same pod.
Safe sysctls are allowed by default. A pod with unsafe sysctls fails to launch on any node unless the cluster administrator explicitly enables unsafe sysctls for that node. As with node-level sysctls, use the taints and toleration feature or labels on nodes to schedule those pods onto the right nodes.
The following example uses the pod securityContext to set a safe sysctl kernel.shm_rmid_forced and two unsafe sysctls, net.core.somaxconn and kernel.msgmax. There is no distinction between safe and unsafe sysctls in the specification.
To avoid destabilizing your operating system, modify sysctl parameters only after you understand their effects.
Procedure
To use safe and unsafe sysctls:
Modify the YAML file that defines the pod and add the securityContext spec, as shown in the following example:
apiVersion: v1
kind: Pod
metadata:
  name: sysctl-example
spec:
  securityContext:
    sysctls:
    - name: kernel.shm_rmid_forced
      value: "0"
    - name: net.core.somaxconn
      value: "1024"
    - name: kernel.msgmax
      value: "65536"
  ...
Create the pod:
$ oc apply -f <file-name>.yaml
If the unsafe sysctls are not allowed for the node, the pod is scheduled, but does not deploy:
$ oc get pod
Example output
NAME        READY   STATUS            RESTARTS   AGE
hello-pod   0/1     SysctlForbidden   0          14s
6.9.3. Enabling unsafe sysctls
A cluster administrator can allow certain unsafe sysctls for very special situations such as high performance or real-time application tuning.
If you want to use unsafe sysctls, a cluster administrator must enable them individually for a specific type of node. The sysctls must be namespaced.
You can further control which sysctls can be set in pods by specifying lists of sysctls or sysctl patterns in the forbiddenSysctls and allowedUnsafeSysctls fields of the Security Context Constraints.
- The forbiddenSysctls option excludes specific sysctls.
- The allowedUnsafeSysctls option controls specific needs such as high performance or real-time application tuning.
Due to their nature of being unsafe, the use of unsafe sysctls is at-your-own-risk and can lead to severe problems, such as improper behavior of containers, resource shortage, or breaking a node.
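Sysctl patterns such as kernel.msg* can be read as prefix matches. The sketch below assumes that an entry is either an exact sysctl name or a prefix ending in a single trailing asterisk, which is how upstream Kubernetes describes these lists; the function names are hypothetical and this is not the platform's admission code.

```python
def matches(pattern: str, name: str) -> bool:
    """Match a sysctl name against an exact entry or a trailing-* prefix entry."""
    if pattern.endswith("*"):
        # "kernel.msg*" matches every name that starts with "kernel.msg".
        return name.startswith(pattern[:-1])
    return name == pattern


def listed(patterns: list, name: str) -> bool:
    """Return True if any entry in the list covers the given sysctl name."""
    return any(matches(pattern, name) for pattern in patterns)


allowed_unsafe = ["kernel.msg*", "net.core.somaxconn"]
for name in ("kernel.msgmax", "kernel.msgmnb", "net.core.somaxconn", "kernel.sem"):
    print(name, listed(allowed_unsafe, name))
```

With this list, kernel.msgmax, kernel.msgmnb, and net.core.somaxconn are covered, while kernel.sem is not.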
Procedure
Add a label to the machine config pool where the containers with the unsafe sysctls will run:
$ oc edit machineconfigpool worker
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  creationTimestamp: 2019-02-08T14:52:39Z
  generation: 1
  labels:
    custom-kubelet: sysctl # (1)
(1) Add a key: pair label.
Create a KubeletConfig custom resource (CR):
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: custom-kubelet
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: sysctl
  kubeletConfig:
    allowedUnsafeSysctls:
      - "kernel.msg*"
      - "net.core.somaxconn"
Create the object:
$ oc apply -f set-sysctl-worker.yaml
A new MachineConfig object named in the 99-worker-XXXXXX-XXXXX-XXXX-XXXXX-kubelet format is created.
Wait for the cluster to reboot, monitoring progress in the status fields of the machineconfigpool object.
For example:
status:
  conditions:
    - lastTransitionTime: '2019-08-11T15:32:00Z'
      message: >-
        All nodes are updating to rendered-worker-ccbfb5d2838d65013ab36300b7b3dc13
      reason: ''
      status: 'True'
      type: Updating
A message similar to the following appears when the cluster is ready:
    - lastTransitionTime: '2019-08-11T16:00:00Z'
      message: >-
        All nodes are updated with rendered-worker-ccbfb5d2838d65013ab36300b7b3dc13
      reason: ''
      status: 'True'
      type: Updated
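If you script the wait, the readiness check reduces to finding the Updated condition and testing its status. The following sketch assumes the status section of the machineconfigpool object has already been loaded into a dictionary, for example from JSON output; the function name pool_is_updated is hypothetical.

```python
def pool_is_updated(status: dict) -> bool:
    """Return True when the pool reports an Updated condition with status 'True'."""
    for condition in status.get("conditions", []):
        if condition.get("type") == "Updated":
            # Condition statuses are strings, not booleans.
            return condition.get("status") == "True"
    return False


updating = {"conditions": [{"type": "Updating", "status": "True"}]}
updated = {"conditions": [{"type": "Updated", "status": "True"}]}
print(pool_is_updated(updating), pool_is_updated(updated))
```

The first status is still rolling out and the second is ready, so the check returns False and then True.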
When the cluster is ready, check for the merged KubeletConfig object in the new MachineConfig object:
$ oc get machineconfig 99-worker-XXXXXX-XXXXX-XXXX-XXXXX-kubelet -o json | grep ownerReference -A7
"ownerReferences": [
    {
        "apiVersion": "machineconfiguration.openshift.io/v1",
        "blockOwnerDeletion": true,
        "controller": true,
        "kind": "KubeletConfig",
        "name": "custom-kubelet",
        "uid": "3f64a766-bae8-11e9-abe8-0a1a2a4813f2"
    }
]
You can now add unsafe sysctls to pods as needed.