Chapter 13. Prometheus and Grafana metrics under Red Hat Quay
Red Hat Quay exports a Prometheus- and Grafana-compatible endpoint on each instance to allow for easy monitoring and alerting.
13.1. Exposing the Prometheus endpoint
13.1.1. Standalone Red Hat Quay
When using podman run to start the Quay container, expose the metrics port 9091:
$ sudo podman run -d --rm -p 80:8080 -p 443:8443 -p 9091:9091 \
    --name=quay \
    -v $QUAY/config:/conf/stack:Z \
    -v $QUAY/storage:/datastorage:Z \
    registry.redhat.io/quay/quay-rhel8:v3.6.8
The metrics will now be available:
$ curl quay.example.com:9091/metrics
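The same check can be scripted. The following is a minimal Python sketch, assuming the endpoint is reachable over plain HTTP at the hostname used in the curl example. The function names fetch_metrics and quay_samples are illustrative and are not part of Red Hat Quay.

```python
import urllib.request


def fetch_metrics(url, timeout=5.0):
    """Return the raw text served by a Prometheus-style metrics endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def quay_samples(text):
    """Keep only sample lines for quay_* metrics.

    Comment lines (# HELP, # TYPE) and metrics from other prefixes,
    such as process_* or python_*, are dropped.
    """
    return [line for line in text.splitlines() if line.startswith("quay_")]


# Example call (assumes the host from the curl example is reachable):
#   for line in quay_samples(fetch_metrics("http://quay.example.com:9091/metrics")):
#       print(line)
```

Keeping the fetch and the filter separate allows the filter to be exercised against saved output without network access.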
See Monitoring Quay with Prometheus and Grafana for details on configuring Prometheus and Grafana to monitor Quay repository counts.
13.1.2. Red Hat Quay Operator
Determine the cluster IP for the quay-metrics service:
Connect to your cluster and access the metrics using the cluster IP and port for the quay-metrics service:
13.1.3. Setting up Prometheus to consume metrics
Prometheus needs a way to access all Red Hat Quay instances running in a cluster. In the typical setup, this is done by listing all the Red Hat Quay instances in a single named DNS entry, which is then given to Prometheus.
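To illustrate what a single DNS entry with multiple records provides, the following Python sketch expands one name into a list of scrape targets. It is only an approximation of the lookup a DNS-based discovery mechanism performs. The function name resolve_targets and the injectable resolver parameter are assumptions made for this example.

```python
import socket


def resolve_targets(dns_name, port=9091, resolver=socket.getaddrinfo):
    """Return sorted 'host:port' targets for every address behind dns_name.

    resolver defaults to socket.getaddrinfo and can be swapped for a
    stub when testing without a real DNS record.
    """
    infos = resolver(dns_name, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address as a string.
    addresses = {info[4][0] for info in infos}
    targets = []
    for address in sorted(addresses):
        # IPv6 literals need brackets before the port is appended.
        host = f"[{address}]" if ":" in address else address
        targets.append(f"{host}:{port}")
    return targets
```

If the record lists three Quay instances, the function returns three targets, one per instance, each on the metrics port.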
13.1.4. DNS configuration under Kubernetes
A simple Kubernetes service can be configured to provide the DNS entry for Prometheus.
13.1.5. DNS configuration for a manual cluster
SkyDNS is a simple solution for managing this DNS record when not using Kubernetes. SkyDNS can run on an etcd cluster. Entries for each Red Hat Quay instance in the cluster can be added and removed in the etcd store. SkyDNS will regularly read them from there and update the list of Quay instances in the DNS record accordingly.
13.2. Introduction to metrics
Red Hat Quay provides metrics to help monitor the registry, including metrics for general registry usage, uploads, downloads, garbage collection, and authentication.
13.2.1. General registry statistics
General registry statistics can indicate how large the registry has grown.
Metric name | Description |
---|---|
quay_user_rows | Number of users in the database |
quay_robot_rows | Number of robot accounts in the database |
quay_org_rows | Number of organizations in the database |
quay_repository_rows | Number of repositories in the database |
quay_security_scanning_unscanned_images_remaining_total | Number of images that are not scanned by the latest security scanner |
Sample metrics output
13.2.2. Queue items
The queue items metrics provide information on the multiple queues used by Quay for managing work.
Metric name | Description |
---|---|
quay_queue_items_available | Number of items in a specific queue |
quay_queue_items_locked | Number of items that are running |
quay_queue_items_available_unlocked | Number of items that are waiting to be processed |
Metric labels
queue_name: The name of the queue. One of:
- exportactionlogs: Queued requests to export action logs. These logs are then processed and put in storage. A link is then sent to the requester via email.
- namespacegc: Queued namespaces to be garbage collected
- notification: Queue for repository notifications to be sent out
- repositorygc: Queued repositories to be garbage collected
- secscanv4: Notification queue specific for Clair V4
- dockerfilebuild: Queue for Quay docker builds
- imagestoragereplication: Queued blobs to be replicated across multiple storage locations
- chunk_cleanup: Queued blob segments that need to be deleted. This is only used by some storage implementations, for example, Swift.
For example, the queue labelled repositorygc contains the repositories marked for deletion by the repository garbage collection worker. For metrics with a queue_name label of repositorygc:
- quay_queue_items_locked is the number of repositories currently being deleted.
- quay_queue_items_available_unlocked is the number of repositories waiting to get processed by the worker.
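The queue_name label can be used to separate the queues when reading the endpoint output. The sketch below parses metrics text and reports the number of waiting items per queue. The helper names are hypothetical, and taking the maximum across samples assumes that every Quay process reports the same shared queue state, which should be verified for your deployment.

```python
import re

# One sample line: name, optional {labels}, value, optional timestamp.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return (name, labels, value) tuples, skipping comments and bad lines."""
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        try:
            value = float(match.group("value"))
        except ValueError:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, value))
    return samples


def waiting_by_queue(text):
    """Map each queue_name to its largest reported count of waiting items.

    Assumption: all processes observe the same queue, so the maximum is
    used instead of the sum to avoid counting an item more than once.
    """
    waiting = {}
    for name, labels, value in parse_samples(text):
        if name != "quay_queue_items_available_unlocked":
            continue
        queue = labels.get("queue_name", "")
        waiting[queue] = max(value, waiting.get(queue, 0.0))
    return waiting
```

A steadily growing value for a single queue, such as repositorygc, suggests that the corresponding worker is not keeping up.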
Sample metrics output
13.2.3. Garbage collection metrics
These metrics show how many resources have been removed by garbage collection (gc). They show how many times the gc workers have run and how many namespaces, repositories, and blobs were removed.
Metric name | Description |
---|---|
quay_gc_iterations_total | Number of iterations by the GCWorker |
quay_gc_namespaces_purged_total | Number of namespaces purged by the NamespaceGCWorker |
quay_gc_repos_purged_total | Number of repositories purged by the RepositoryGCWorker or NamespaceGCWorker |
quay_gc_storage_blobs_deleted_total | Number of storage blobs deleted |
Sample metrics output
13.2.3.1. Multipart uploads metrics
The multipart uploads metrics show the number of blob uploads to storage (S3, Rados, GoogleCloudStorage, RHOCS). These can help identify issues when Quay is unable to correctly upload blobs to storage.
Metric name | Description |
---|---|
quay_multipart_uploads_started_total | Number of multipart uploads to Quay storage that started |
quay_multipart_uploads_completed_total | Number of multipart uploads to Quay storage that completed |
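One way to use these two counters together is to compare them: uploads that started but never completed are either still in progress or have failed. The following sketch shows that arithmetic. The function name is hypothetical, and it assumes both values were read from the same scrape of the same process.

```python
def unfinished_multipart_uploads(started_total, completed_total):
    """Return the number of uploads that started but have not completed.

    Both arguments are counter values taken from a single scrape. A
    completed count above the started count points to mismatched reads
    rather than a real state, so it is rejected.
    """
    if completed_total > started_total:
        raise ValueError("completed count exceeds started count")
    return started_total - completed_total
```

A difference that keeps growing between scrapes, rather than returning toward zero, is the pattern worth alerting on.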
Sample metrics output
13.2.4. Image push / pull metrics
A number of metrics are available related to pushing and pulling images.
13.2.4.1. Image pulls total
Metric name | Description |
---|---|
quay_registry_image_pulls_total | The number of images downloaded from the registry. |
Metric labels
- protocol: the registry protocol used (should always be v2)
- ref: the type of reference used to pull, either tag or manifest
- status: HTTP return code of the request
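The status label makes it possible to separate successful pulls from failed ones. The sketch below computes the share of pulls that did not return a 2xx status. The function name and the input shape, a list of (labels, value) pairs, are assumptions for this example, and treating every non-2xx code as a failure is a simplification.

```python
def pull_failure_ratio(samples):
    """Return the fraction of pulls whose status label is not 2xx.

    samples is an iterable of (labels, value) pairs for
    quay_registry_image_pulls_total, where labels is a dict.
    Returns 0.0 when no pulls have been recorded.
    """
    total = 0.0
    failed = 0.0
    for labels, value in samples:
        total += value
        if not labels.get("status", "").startswith("2"):
            failed += value
    return failed / total if total else 0.0
```

For example, 90 pulls with status 200 and 10 pulls with status 404 give a ratio of 10 / 100.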
13.2.4.2. Image bytes pulled
Metric name | Description |
---|---|
quay_registry_image_pulled_estimated_bytes_total | The number of bytes downloaded from the registry |
Metric labels
- protocol: the registry protocol used (should always be v2)
13.2.4.3. Image pushes total
Metric name | Description |
---|---|
quay_registry_image_pushes_total | The number of images uploaded to the registry. |
Metric labels
- protocol: the registry protocol used (should always be v2)
- status: HTTP return code of the request
- media_type: the uploaded manifest type
13.2.4.4. Image bytes pushed
Metric name | Description |
---|---|
quay_registry_image_pushed_bytes_total | The number of bytes uploaded to the registry |
Sample metrics output
# HELP quay_registry_image_pushed_bytes_total number of bytes pushed to the registry
# TYPE quay_registry_image_pushed_bytes_total counter
quay_registry_image_pushed_bytes_total{host="example-registry-quay-app-6df87f7b66-9tfn6",instance="",job="quay",pid="221",process_name="registry:application"} 0
...
13.2.5. Authentication metrics
The authentication metrics provide the number of authentication requests, labeled by type and whether it succeeded or not. For example, this metric could be used to monitor failed basic authentication requests.
Metric name | Description |
---|---|
quay_authentication_attempts_total | Number of authentication attempts across the registry and API |
Metric labels
auth_kind: The type of authentication used, including:
- basic
- oauth
- credentials
success: Whether the authentication attempt succeeded, either true or false
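As a sketch of the failed basic authentication example mentioned above, the following function sums the attempts for one auth_kind where success is false. The function name and the (labels, value) input shape are assumptions for this example. The success label is compared case-insensitively because the exact casing in the endpoint output is not specified here.

```python
def failed_attempts(samples, auth_kind="basic"):
    """Sum failed authentication attempts for the given auth_kind.

    samples is an iterable of (labels, value) pairs for
    quay_authentication_attempts_total, where labels is a dict.
    """
    total = 0.0
    for labels, value in samples:
        if labels.get("auth_kind") != auth_kind:
            continue
        # Casing of the label value is an unknown, so normalize it.
        if labels.get("success", "").lower() == "false":
            total += value
    return total
```

Because the metric is a counter, the change in this sum between two scrapes is more informative than its absolute value.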
Sample metrics output