Chapter 10. Pruning objects to reclaim resources
Over time, API objects created in OpenShift Container Platform can accumulate in the cluster’s etcd data store through normal user operations, such as when building and deploying applications.
Cluster administrators can periodically prune older versions of objects from the cluster that are no longer required. For example, by pruning images you can delete older images and layers that are no longer in use, but are still taking up disk space.
10.1. Basic pruning operations
The CLI groups prune operations under a common parent command:
$ oc adm prune <object_type> <options>
This specifies:
-
The
<object_type>
to perform the action on, such asgroups
,builds
,deployments
, orimages
. -
The
<options>
supported to prune that object type.
10.2. Pruning groups
To prune groups records from an external provider, administrators can run the following command:
$ oc adm prune groups \ --sync-config=path/to/sync/config [<options>]
Options | Description |
---|---|
| Indicate that pruning should occur, instead of performing a dry-run. |
| Path to the group blacklist file. |
| Path to the group whitelist file. |
| Path to the synchronization configuration file. |
To see the groups that the prune command deletes:
$ oc adm prune groups --sync-file=ldap-sync-config.yaml
To perform the prune operation:
$ oc adm prune groups --sync-file=ldap-sync-config.yaml --confirm
10.3. Pruning deployments
In order to prune deployments that are no longer required by the system due to age and status, administrators can run the following command:
$ oc adm prune deployments [<options>]
Option | Description |
---|---|
| Indicate that pruning should occur, instead of performing a dry-run. |
|
Prune all deployments that no longer have a DeploymentConfig, has status is |
|
Per DeploymentConfig, keep the last |
|
Per DeploymentConfig, keep the last |
|
Do not prune any object that is younger than |
To see what a pruning operation would delete:
$ oc adm prune deployments --orphans --keep-complete=5 --keep-failed=1 \ --keep-younger-than=60m
To actually perform the prune operation:
$ oc adm prune deployments --orphans --keep-complete=5 --keep-failed=1 \ --keep-younger-than=60m --confirm
10.4. Pruning builds
In order to prune builds that are no longer required by the system due to age and status, administrators can run the following command:
$ oc adm prune builds [<options>]
Option | Description |
---|---|
| Indicate that pruning should occur, instead of performing a dry-run. |
| Prune all builds whose Build Configuration no longer exists, status is complete, failed, error, or canceled. |
|
Per Build Configuration, keep the last |
|
Per Build Configuration, keep the last |
|
Do not prune any object that is younger than |
To see what a pruning operation would delete:
$ oc adm prune builds --orphans --keep-complete=5 --keep-failed=1 \ --keep-younger-than=60m
To actually perform the prune operation:
$ oc adm prune builds --orphans --keep-complete=5 --keep-failed=1 \ --keep-younger-than=60m --confirm
Developers can enable automatic build pruning by modifying their Build Configuration.
Additional resources
10.5. Pruning images
In order to prune images that are no longer required by the system due to age, status, or exceed limits, administrators can run the following command:
$ oc adm prune images [<options>]
Currently, to prune images you must first log in to the CLI as a user with an access token. The user must also have the cluster role system:image-pruner
or greater (for example, cluster-admin
).
Pruning images removes data from the integrated registry unless --prune-registry=false
is used. For this operation to work properly, the registry must be configured with storage:delete:enabled
set to true
.
Pruning images with the --namespace
flag does not remove images, only image streams. Images are non-namespaced resources. Therefore, limiting pruning to a particular namespace makes it impossible to calculate their current usage.
By default, the integrated registry caches blobs metadata to reduce the number of requests to storage, and increase the speed of processing the request. Pruning does not update the integrated registry cache. Images pushed after pruning that contain pruned layers will be broken, because the pruned layers that have metadata in the cache will not be pushed. Therefore it is necessary to clear the cache after pruning. This can be accomplished by redeploying the registry:
# oc patch deployment image-registry -n openshift-image-registry --type=merge --patch="{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"kubectl.kubernetes.io/restartedAt\": \"$(date '+%Y-%m-%dT%H:%M:%SZ' -u)\"}}}}}"
If the integrated registry uses a Redis cache, you must clean the database manually.
oc adm prune images
operations require a route for your registry. Registry routes are not created by default. See Image Registry Operator in OpenShift Container Platform for information on how to create a registry route and see Exposing the registry for details on how to expose the registry service.
Option | Description |
---|---|
|
Include images that were not pushed to the registry, but have been mirrored by pullthrough. This is on by default. To limit the pruning to images that were pushed to the integrated registry, pass |
| The path to a certificate authority file to use when communicating with the OpenShift Container Platform-managed registries. Defaults to the certificate authority data from the current user’s configuration file. If provided, a secure connection is initiated. |
|
Indicate that pruning should occur, instead of performing a dry-run. This requires a valid route to the integrated container image registry. If this command is run outside of the cluster network, the route must be provided using |
| Use caution with this option. Allow an insecure connection to the container registry that is hosted via HTTP or has an invalid HTTPS certificate. |
|
For each imagestream, keep up to at most |
|
Do not prune any image that is younger than |
|
Prune each image that exceeds the smallest limit defined in the same project. This flag cannot be combined with |
|
The address to use when contacting the registry. The command attempts to use a cluster-internal URL determined from managed images and imagestreams. In case it fails (the registry cannot be resolved or reached), an alternative route that works needs to be provided using this flag. The registry host name can be prefixed by |
| In conjunction with the conditions stipulated by the other options, this option controls whether the data in the registry corresponding to the OpenShift Container Platform image API object is pruned. By default, image pruning processes both the image API objects and corresponding data in the registry. This options is useful when you are only concerned with removing etcd content, possibly to reduce the number of image objects (but are not concerned with cleaning up registry storage) or intend to do that separately by hard pruning the registry, possibly during an appropriate maintenance window for the registry. |
10.5.1. Image prune conditions
Remove any image "managed by OpenShift Container Platform" (images with the annotation
openshift.io/image.managed
) that was created at least--keep-younger-than
minutes ago and is not currently referenced by:-
any Pod created less than
--keep-younger-than
minutes ago. -
any imagestream created less than
--keep-younger-than
minutes ago. - any running Pods.
- any pending Pods.
- any ReplicationControllers.
- any DeploymentConfigs.
- any Build Configurations.
- any Builds.
-
the
--keep-tag-revisions
most recent items instream.status.tags[].items
.
-
any Pod created less than
Remove any image "managed by OpenShift Container Platform" (images with the annotation
openshift.io/image.managed
) that is exceeding the smallest limit defined in the same project and is not currently referenced by:- any running Pods.
- any pending Pods.
- any ReplicationControllers.
- any DeploymentConfigs.
- any Build Configurations.
- any Builds.
- There is no support for pruning from external registries.
-
When an image is pruned, all references to the image are removed from all imagestreams that have a reference to the image in
status.tags
. - Image layers that are no longer referenced by any images are removed.
The --prune-over-size-limit
flag cannot be combined with --keep-tag-revisions
nor --keep-younger-than
flags. Doing so returns information that this operation is not allowed.
Separating the removal of OpenShift Container Platform image API objects and image data from the Registry by using --prune-registry=false
followed by hard pruning the registry narrows some timing windows and is safer when compared to trying to prune both through one command. However, timing windows are not completely removed.
For example, you can still create a Pod referencing an image as pruning identifies that image for pruning. You should still keep track of an API Object created during the pruning operations that might reference images, so you can mitigate any references to deleted content.
Also, keep in mind that re-doing the pruning without the --prune-registry
option or with --prune-registry=true
does not lead to pruning the associated storage in the image registry for images previously pruned by --prune-registry=false
. Any images that were pruned with --prune-registry=false
can only be deleted from registry storage by hard pruning the registry.
10.5.2. Running the image prune operation
Procedure
To see what a pruning operation would delete:
Keeping up to three tag revisions, and keeping resources (images, image streams and Pods) younger than sixty minutes:
$ oc adm prune images --keep-tag-revisions=3 --keep-younger-than=60m
Pruning every image that exceeds defined limits:
$ oc adm prune images --prune-over-size-limit
To actually perform the prune operation with the options from the previous step:
$ oc adm prune images --keep-tag-revisions=3 --keep-younger-than=60m --confirm
$ oc adm prune images --prune-over-size-limit --confirm
10.5.3. Using secure or insecure connections
The secure connection is the preferred and recommended approach. It is done over HTTPS protocol with a mandatory certificate verification. The prune
command always attempts to use it if possible. If it is not possible, in some cases it can fall-back to insecure connection, which is dangerous. In this case, either certificate verification is skipped or plain HTTP protocol is used.
The fall-back to insecure connection is allowed in the following cases unless --certificate-authority
is specified:
-
The
prune
command is run with the--force-insecure
option. -
The provided
registry-url
is prefixed with thehttp://
scheme. -
The provided
registry-url
is a local-link address orlocalhost
. -
The configuration of the current user allows for an insecure connection. This can be caused by the user either logging in using
--insecure-skip-tls-verify
or choosing the insecure connection when prompted.
If the registry is secured by a certificate authority different from the one used by OpenShift Container Platform, it must be specified using the --certificate-authority
flag. Otherwise, the prune
command fails with an error.
10.5.4. Image pruning problems
Images not being pruned
If your images keep accumulating and the prune
command removes just a small portion of what you expect, ensure that you understand the image prune conditions that must apply for an image to be considered a candidate for pruning.
Ensure that images you want removed occur at higher positions in each tag history than your chosen tag revisions threshold. For example, consider an old and obsolete image named sha:abz
. By running the following command in namespace N
, where the image is tagged, the image is tagged three times in a single imagestream named myapp
:
$ image_name="sha:abz" $ oc get is -n N -o go-template='{{range $isi, $is := .items}}{{range $ti, $tag := $is.status.tags}}'\ '{{range $ii, $item := $tag.items}}{{if eq $item.image "'"${image_name}"\ $'"}}{{$is.metadata.name}}:{{$tag.tag}} at position {{$ii}} out of {{len $tag.items}}\n'\ '{{end}}{{end}}{{end}}{{end}}' myapp:v2 at position 4 out of 5 myapp:v2.1 at position 2 out of 2 myapp:v2.1-may-2016 at position 0 out of 1
When default options are used, the image is never pruned because it occurs at position 0
in a history of myapp:v2.1-may-2016
tag. For an image to be considered for pruning, the administrator must either:
Specify
--keep-tag-revisions=0
with theoc adm prune images
command.CautionThis action effectively removes all the tags from all the namespaces with underlying images, unless they are younger or they are referenced by objects younger than the specified threshold.
-
Delete all the
istags
where the position is below the revision threshold, which meansmyapp:v2.1
andmyapp:v2.1-may-2016
. -
Move the image further in the history, either by running new Builds pushing to the same
istag
, or by tagging other image. Unfortunately, this is not always desirable for old release tags.
Tags having a date or time of a particular image’s Build in their names should be avoided, unless the image must be preserved for an undefined amount of time. Such tags tend to have just one image in its history, which effectively prevents them from ever being pruned.
Using a secure connection against insecure registry
If you see a message similar to the following in the output of the oadm prune images
command, then your registry is not secured and the oadm prune images
client attempts to use a secure connection:
error: error communicating with registry: Get https://172.30.30.30:5000/healthz: http: server gave HTTP response to HTTPS client
-
The recommend solution is to secure the registry. Otherwise, you can force the client to use an insecure connection by appending
--force-insecure
to the command, however this is not recommended.
Using an insecure connection against a secured registry
If you see one of the following errors in the output of the oadm prune images
command, it means that your registry is secured using a certificate signed by a certificate authority other than the one used by oadm prune images
client for connection verification:
error: error communicating with registry: Get http://172.30.30.30:5000/healthz: malformed HTTP response "\x15\x03\x01\x00\x02\x02" error: error communicating with registry: [Get https://172.30.30.30:5000/healthz: x509: certificate signed by unknown authority, Get http://172.30.30.30:5000/healthz: malformed HTTP response "\x15\x03\x01\x00\x02\x02"]
By default, the certificate authority data stored in the user’s configuration file are used; the same is true for communication with the master API.
Use the --certificate-authority
option to provide the right certificate authority for the container image registry server.
Using the wrong certificate authority
The following error means that the certificate authority used to sign the certificate of the secured container image registry is different than the authority used by the client:
error: error communicating with registry: Get https://172.30.30.30:5000/: x509: certificate signed by unknown authority
Make sure to provide the right one with the flag --certificate-authority
.
As a workaround, the --force-insecure
flag can be added instead. However, this is not recommended.
Additional resources
10.6. Hard pruning the registry
The OpenShift Container Registry can accumulate blobs that are not referenced by the OpenShift Container Platform cluster’s etcd. The basic pruning images procedure, therefore, is unable to operate on them. These are called orphaned blobs.
Orphaned blobs can occur from the following scenarios:
-
Manually deleting an image with
oc delete image <sha256:image-id>
command, which only removes the image from etcd, but not from the registry’s storage. -
Pushing to the registry initiated by
docker
daemon failures, which causes some blobs to get uploaded, but the image manifest (which is uploaded as the very last component) does not. All unique image blobs become orphans. - OpenShift Container Platform refusing an image because of quota restrictions.
- The standard image pruner deleting an image manifest, but is interrupted before it deletes the related blobs.
- A bug in the registry pruner, which fails to remove the intended blobs, causing the image objects referencing them to be removed and the blobs becoming orphans.
Hard pruning the registry, a separate procedure from basic image pruning, allows cluster administrators to remove orphaned blobs. You should hard prune if you are running out of storage space in your OpenShift Container Registry and believe you have orphaned blobs.
This should be an infrequent operation and is necessary only when you have evidence that significant numbers of new orphans have been created. Otherwise, you can perform standard image pruning at regular intervals, for example, once a day (depending on the number of images being created).
Procedure
To hard prune orphaned blobs from the registry:
Log in.
Log in to the cluster with the CLI as a user with an access token.
Run a basic image prune.
Basic image pruning removes additional images that are no longer needed. The hard prune does not remove images on its own. It only removes blobs stored in the registry storage. Therefore, you should run this just before the hard prune.
Switch the registry to read-only mode.
If the registry is not running in read-only mode, any pushes happening at the same time as the prune will either:
- fail and cause new orphans, or
- succeed although the images cannot be pulled (because some of the referenced blobs were deleted).
Pushes will not succeed until the registry is switched back to read-write mode. Therefore, the hard prune must be carefully scheduled.
To switch the registry to read-only mode:
Set the following environment variable:
$ oc set env -n default \ dc/docker-registry \ 'REGISTRY_STORAGE_MAINTENANCE_READONLY={"enabled":true}'
By default, the registry automatically redeploys when the previous step completes; wait for the redeployment to complete before continuing. However, if you have disabled these triggers, you must manually redeploy the registry so that the new environment variables are picked up:
$ oc rollout -n default \ latest dc/docker-registry
Add the
system:image-pruner
role.The service account used to run the registry instances requires additional permissions in order to list some resources.
Get the service account name:
$ service_account=$(oc get -n default \ -o jsonpath=$'system:serviceaccount:{.metadata.namespace}:{.spec.template.spec.serviceAccountName}\n' \ dc/docker-registry)
Add the
system:image-pruner
cluster role to the service account:$ oc adm policy add-cluster-role-to-user \ system:image-pruner \ ${service_account}
(Optional) Run the pruner in dry-run mode.
To see how many blobs would be removed, run the hard pruner in dry-run mode. No changes are actually made:
$ oc -n default \ exec -i -t "$(oc -n default get pods -l deploymentconfig=docker-registry \ -o jsonpath=$'{.items[0].metadata.name}\n')" \ -- /usr/bin/dockerregistry -prune=check
Alternatively, to get the exact paths for the prune candidates, increase the logging level:
$ oc -n default \ exec "$(oc -n default get pods -l deploymentconfig=docker-registry \ -o jsonpath=$'{.items[0].metadata.name}\n')" \ -- /bin/sh \ -c 'REGISTRY_LOG_LEVEL=info /usr/bin/dockerregistry -prune=check'
Truncated sample output
$ oc exec docker-registry-3-vhndw \ -- /bin/sh -c 'REGISTRY_LOG_LEVEL=info /usr/bin/dockerregistry -prune=check' time="2017-06-22T11:50:25.066156047Z" level=info msg="start prune (dry-run mode)" distribution_version="v2.4.1+unknown" kubernetes_version=v1.6.1+$Format:%h$ openshift_version=unknown time="2017-06-22T11:50:25.092257421Z" level=info msg="Would delete blob: sha256:00043a2a5e384f6b59ab17e2c3d3a3d0a7de01b2cabeb606243e468acc663fa5" go.version=go1.7.5 instance.id=b097121c-a864-4e0c-ad6c-cc25f8fdf5a6 time="2017-06-22T11:50:25.092395621Z" level=info msg="Would delete blob: sha256:0022d49612807cb348cabc562c072ef34d756adfe0100a61952cbcb87ee6578a" go.version=go1.7.5 instance.id=b097121c-a864-4e0c-ad6c-cc25f8fdf5a6 time="2017-06-22T11:50:25.092492183Z" level=info msg="Would delete blob: sha256:0029dd4228961086707e53b881e25eba0564fa80033fbbb2e27847a28d16a37c" go.version=go1.7.5 instance.id=b097121c-a864-4e0c-ad6c-cc25f8fdf5a6 time="2017-06-22T11:50:26.673946639Z" level=info msg="Would delete blob: sha256:ff7664dfc213d6cc60fd5c5f5bb00a7bf4a687e18e1df12d349a1d07b2cf7663" go.version=go1.7.5 instance.id=b097121c-a864-4e0c-ad6c-cc25f8fdf5a6 time="2017-06-22T11:50:26.674024531Z" level=info msg="Would delete blob: sha256:ff7a933178ccd931f4b5f40f9f19a65be5eeeec207e4fad2a5bafd28afbef57e" go.version=go1.7.5 instance.id=b097121c-a864-4e0c-ad6c-cc25f8fdf5a6 time="2017-06-22T11:50:26.674675469Z" level=info msg="Would delete blob: sha256:ff9b8956794b426cc80bb49a604a0b24a1553aae96b930c6919a6675db3d5e06" go.version=go1.7.5 instance.id=b097121c-a864-4e0c-ad6c-cc25f8fdf5a6 ... Would delete 13374 blobs Would free up 2.835 GiB of disk space Use -prune=delete to actually delete the data
Run the hard prune.
Execute the following command inside one running instance of a
docker-registry
pod to run the hard prune:$ oc -n default \ exec -i -t "$(oc -n default get pods -l deploymentconfig=docker-registry -o jsonpath=$'{.items[0].metadata.name}\n')" \ -- /usr/bin/dockerregistry -prune=delete
Sample output
$ oc exec docker-registry-3-vhndw \ -- /usr/bin/dockerregistry -prune=delete Deleted 13374 blobs Freed up 2.835 GiB of disk space
Switch the registry back to read-write mode.
After the prune is finished, the registry can be switched back to read-write mode by executing:
$ oc set env -n default dc/docker-registry REGISTRY_STORAGE_MAINTENANCE_READONLY-
10.7. Pruning cron jobs
Cron jobs can perform pruning of successful jobs, but might not properly handle failed jobs. Therefore, the cluster administrator should perform regular cleanup of jobs manually. They should also restrict the access to cron jobs to a small group of trusted users and set appropriate quota to prevent the cron job from creating too many jobs and pods.