Chapter 14. Geo-replication
Geo-replication allows multiple, geographically distributed Red Hat Quay deployments to work as a single registry from the perspective of a client or user. It significantly improves push and pull performance in a globally-distributed Red Hat Quay setup. Image data is asynchronously replicated in the background with transparent failover and redirect for clients.
Deployments of Red Hat Quay with geo-replication is supported on standalone and Operator deployments.
14.1. Geo-replication features
- When geo-replication is configured, container image pushes will be written to the preferred storage engine for that Red Hat Quay instance. This is typically the nearest storage backend within the region.
- After the initial push, image data will be replicated in the background to other storage engines.
- The list of replication locations is configurable and those can be different storage backends.
- An image pull will always use the closest available storage engine, to maximize pull performance.
- If replication has not been completed yet, the pull will use the source storage backend instead.
14.2. Geo-replication requirements and constraints
- In geo-replicated setups, Red Hat Quay requires that all regions are able to read and write to all other region’s object storage. Object storage must be geographically accessible by all other regions.
- In case of an object storage system failure of one geo-replicating site, that site’s Red Hat Quay deployment must be shut down so that clients are redirected to the remaining site with intact storage systems by a global load balancer. Otherwise, clients will experience pull and push failures.
- Red Hat Quay has no internal awareness of the health or availability of the connected object storage system. Users must configure a global load balancer (LB) to monitor the health of your distributed system and to route traffic to different sites based on their storage status.
-
To check the status of your geo-replication deployment, you must use the
/health/endtoend
checkpoint, which is used for global health monitoring. You must configure the redirect manually using the/health/endtoend
endpoint. The/health/instance
end point only checks local instance health. - If the object storage system of one site becomes unavailable, there will be no automatic redirect to the remaining storage system, or systems, of the remaining site, or sites.
- Geo-replication is asynchronous. The permanent loss of a site incurs the loss of the data that has been saved in that sites' object storage system but has not yet been replicated to the remaining sites at the time of failure.
A single database, and therefore all metadata and Red Hat Quay configuration, is shared across all regions.
Geo-replication does not replicate the database. In the event of an outage, Red Hat Quay with geo-replication enabled will not failover to another database.
- A single Redis cache is shared across the entire Red Hat Quay setup and needs to accessible by all Red Hat Quay pods.
-
The exact same configuration should be used across all regions, with exception of the storage backend, which can be configured explicitly using the
QUAY_DISTRIBUTED_STORAGE_PREFERENCE
environment variable. - Geo-replication requires object storage in each region. It does not work with local storage.
- Each region must be able to access every storage engine in each region, which requires a network path.
- Alternatively, the storage proxy option can be used.
- The entire storage backend, for example, all blobs, is replicated. Repository mirroring, by contrast, can be limited to a repository, or an image.
- All Red Hat Quay instances must share the same entrypoint, typically through a load balancer.
- All Red Hat Quay instances must have the same set of superusers, as they are defined inside the common configuration file.
-
Geo-replication requires your Clair configuration to be set to
unmanaged
. An unmanaged Clair database allows the Red Hat Quay Operator to work in a geo-replicated environment, where multiple instances of the Red Hat Quay Operator must communicate with the same database. For more information, see Advanced Clair configuration. - Geo-Replication requires SSL/TLS certificates and keys. For more information, see Using SSL/TLS to protect connections to Red Hat Quay.
If the above requirements cannot be met, you should instead use two or more distinct Red Hat Quay deployments and take advantage of repository mirroring functions.
14.3. Geo-replication using standalone Red Hat Quay
In the following image, Red Hat Quay is running standalone in two separate regions, with a common database and a common Redis instance. Localized image storage is provided in each region and image pulls are served from the closest available storage engine. Container image pushes are written to the preferred storage engine for the Red Hat Quay instance, and will then be replicated, in the background, to the other storage engines.
If Clair fails in one cluster, for example, the US cluster, US users would not see vulnerability reports in Red Hat Quay for the second cluster (EU). This is because all Clair instances have the same state. When Clair fails, it is usually because of a problem within the cluster.
Geo-replication architecture
14.3.1. Enable storage replication - standalone Quay
Use the following procedure to enable storage replication on Red Hat Quay.
Procedure
- In your Red Hat Quay config editor, locate the Registry Storage section.
- Click Enable Storage Replication.
- Add each of the storage engines to which data will be replicated. All storage engines to be used must be listed.
If complete replication of all images to all storage engines is required, click Replicate to storage engine by default under each storage engine configuration. This ensures that all images are replicated to that storage engine.
NoteTo enable per-namespace replication, contact Red Hat Quay support.
- When finished, click Save Configuration Changes. The configuration changes will take effect after Red Hat Quay restarts.
After adding storage and enabling Replicate to storage engine by default for geo-replication, you must sync existing image data across all storage. To do this, you must
oc exec
(alternatively,docker exec
orkubectl exec
) into the container and enter the following commands:# scl enable python27 bash # python -m util.backfillreplication
NoteThis is a one time operation to sync content after adding new storage.
14.3.2. Run Red Hat Quay with storage preferences
- Copy the config.yaml to all machines running Red Hat Quay
For each machine in each region, add a
QUAY_DISTRIBUTED_STORAGE_PREFERENCE
environment variable with the preferred storage engine for the region in which the machine is running.For example, for a machine running in Europe with the config directory on the host available from
$QUAY/config
:$ sudo podman run -d --rm -p 80:8080 -p 443:8443 \ --name=quay \ -v $QUAY/config:/conf/stack:Z \ -e QUAY_DISTRIBUTED_STORAGE_PREFERENCE=europestorage \ registry.redhat.io/quay/quay-rhel8:v3.9.7
NoteThe value of the environment variable specified must match the name of a Location ID as defined in the config panel.
- Restart all Red Hat Quay containers
14.3.3. Removing a geo-replicated site from your standalone Red Hat Quay deployment
By using the following procedure, Red Hat Quay administrators can remove sites in a geo-replicated setup.
Prerequisites
-
You have configured Red Hat Quay geo-replication with at least two sites, for example,
usstorage
andeustorage
. - Each site has its own Organization, Repository, and image tags.
Procedure
Sync the blobs between all of your defined sites by running the following command:
$ python -m util.backfillreplication
WarningPrior to removing storage engines from your Red Hat Quay
config.yaml
file, you must ensure that all blobs are synced between all defined sites. Complete this step before proceeding.-
In your Red Hat Quay
config.yaml
file for siteusstorage
, remove theDISTRIBUTED_STORAGE_CONFIG
entry for theeustorage
site. Enter the following command to obtain a list of running containers:
$ podman ps
Example output
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 92c5321cde38 registry.redhat.io/rhel8/redis-5:1 run-redis 11 days ago Up 11 days ago 0.0.0.0:6379->6379/tcp redis 4e6d1ecd3811 registry.redhat.io/rhel8/postgresql-13:1-109 run-postgresql 33 seconds ago Up 34 seconds ago 0.0.0.0:5432->5432/tcp postgresql-quay d2eadac74fda registry-proxy.engineering.redhat.com/rh-osbs/quay-quay-rhel8:v3.9.0-131 registry 4 seconds ago Up 4 seconds ago 0.0.0.0:80->8080/tcp, 0.0.0.0:443->8443/tcp quay
Enter the following command to execute a shell inside of the PostgreSQL container:
$ podman exec -it postgresql-quay -- /bin/bash
Enter psql by running the following command:
bash-4.4$ psql
Enter the following command to reveal a list of sites in your geo-replicated deployment:
quay=# select * from imagestoragelocation;
Example output
id | name ----+------------------- 1 | usstorage 2 | eustorage
Enter the following command to exit the postgres CLI to re-enter bash-4.4:
\q
Enter the following command to permanently remove the
eustorage
site:ImportantThe following action cannot be undone. Use with caution.
bash-4.4$ python -m util.removelocation eustorage
Example output
WARNING: This is a destructive operation. Are you sure you want to remove eustorage from your storage locations? [y/n] y Deleted placement 30 Deleted placement 31 Deleted placement 32 Deleted placement 33 Deleted location eustorage
14.4. Geo-replication using the Red Hat Quay Operator
In the example shown above, the Red Hat Quay Operator is deployed in two separate regions, with a common database and a common Redis instance. Localized image storage is provided in each region and image pulls are served from the closest available storage engine. Container image pushes are written to the preferred storage engine for the Quay instance, and will then be replicated, in the background, to the other storage engines.
Because the Operator now manages the Clair security scanner and its database separately, geo-replication setups can be leveraged so that they do not manage the Clair database. Instead, an external shared database would be used. Red Hat Quay and Clair support several providers and vendors of PostgreSQL, which can be found in the Red Hat Quay 3.x test matrix. Additionally, the Operator also supports custom Clair configurations that can be injected into the deployment, which allows users to configure Clair with the connection credentials for the external database.
14.4.1. Setting up geo-replication on OpenShift Container Platform
Use the following procedure to set up geo-replication on OpenShift Container Platform.
Procedure
- Deploy a postgres instance for Red Hat Quay.
Login to the database by entering the following command:
psql -U <username> -h <hostname> -p <port> -d <database_name>
Create a database for Red Hat Quay named
quay
. For example:CREATE DATABASE quay;
Enable pg_trm extension inside the database
\c quay; CREATE EXTENSION IF NOT EXISTS pg_trgm;
Deploy a Redis instance:
Note- Deploying a Redis instance might be unnecessary if your cloud provider has its own service.
- Deploying a Redis instance is required if you are leveraging Builders.
- Deploy a VM for Redis
- Verify that it is accessible from the clusters where Red Hat Quay is running
- Port 6379/TCP must be open
Run Redis inside the instance
sudo dnf install -y podman podman run -d --name redis -p 6379:6379 redis
- Create two object storage backends, one for each cluster. Ideally, one object storage bucket will be close to the first, or primary, cluster, and the other will run closer to the second, or secondary, cluster.
- Deploy the clusters with the same config bundle, using environment variable overrides to select the appropriate storage backend for an individual cluster.
- Configure a load balancer to provide a single entry point to the clusters.
14.4.1.1. Configuring geo-replication for the Red Hat Quay Operator on OpenShift Container Platform
Use the following procedure to configure geo-replication for the Red Hat Quay Operator.
Procedure
Create a
config.yaml
file that is shared between clusters. Thisconfig.yaml
file contains the details for the common PostgreSQL, Redis and storage backends:Geo-replication
config.yaml
fileSERVER_HOSTNAME: <georep.quayteam.org or any other name> 1 DB_CONNECTION_ARGS: autorollback: true threadlocals: true DB_URI: postgresql://postgres:password@10.19.0.1:5432/quay 2 BUILDLOGS_REDIS: host: 10.19.0.2 port: 6379 USER_EVENTS_REDIS: host: 10.19.0.2 port: 6379 DISTRIBUTED_STORAGE_CONFIG: usstorage: - GoogleCloudStorage - access_key: GOOGQGPGVMASAAMQABCDEFG bucket_name: georep-test-bucket-0 secret_key: AYWfEaxX/u84XRA2vUX5C987654321 storage_path: /quaygcp eustorage: - GoogleCloudStorage - access_key: GOOGQGPGVMASAAMQWERTYUIOP bucket_name: georep-test-bucket-1 secret_key: AYWfEaxX/u84XRA2vUX5Cuj12345678 storage_path: /quaygcp DISTRIBUTED_STORAGE_DEFAULT_LOCATIONS: - usstorage - eustorage DISTRIBUTED_STORAGE_PREFERENCE: - usstorage - eustorage FEATURE_STORAGE_REPLICATION: true
- 1
- A proper
SERVER_HOSTNAME
must be used for the route and must match the hostname of the global load balancer. - 2
- To retrieve the configuration file for a Clair instance deployed using the OpenShift Container Platform Operator, see Retrieving the Clair config.
Create the
configBundleSecret
by entering the following command:$ oc create secret generic --from-file config.yaml=./config.yaml georep-config-bundle
In each of the clusters, set the
configBundleSecret
and use theQUAY_DISTRIBUTED_STORAGE_PREFERENCE
environmental variable override to configure the appropriate storage for that cluster. For example:NoteThe
config.yaml
file between both deployments must match. If making a change to one cluster, it must also be changed in the other.US cluster
QuayRegistry
exampleapiVersion: quay.redhat.com/v1 kind: QuayRegistry metadata: name: example-registry namespace: quay-enterprise spec: configBundleSecret: georep-config-bundle components: - kind: objectstorage managed: false - kind: route managed: true - kind: tls managed: false - kind: postgres managed: false - kind: clairpostgres managed: false - kind: redis managed: false - kind: quay managed: true overrides: env: - name: QUAY_DISTRIBUTED_STORAGE_PREFERENCE value: usstorage - kind: mirror managed: true overrides: env: - name: QUAY_DISTRIBUTED_STORAGE_PREFERENCE value: usstorage
NoteBecause SSL/TLS is unmanaged, and the route is managed, you must supply the certificates with either with the config tool or directly in the config bundle. For more information, see Configuring TLS and routes.
European cluster
apiVersion: quay.redhat.com/v1 kind: QuayRegistry metadata: name: example-registry namespace: quay-enterprise spec: configBundleSecret: georep-config-bundle components: - kind: objectstorage managed: false - kind: route managed: true - kind: tls managed: false - kind: postgres managed: false - kind: clairpostgres managed: false - kind: redis managed: false - kind: quay managed: true overrides: env: - name: QUAY_DISTRIBUTED_STORAGE_PREFERENCE value: eustorage - kind: mirror managed: true overrides: env: - name: QUAY_DISTRIBUTED_STORAGE_PREFERENCE value: eustorage
NoteBecause SSL/TLS is unmanaged, and the route is managed, you must supply the certificates with either with the config tool or directly in the config bundle. For more information, see Configuring TLS and routes.
14.4.2. Removing a geo-replicated site from your Red Hat Quay Operator deployment
By using the following procedure, Red Hat Quay administrators can remove sites in a geo-replicated setup.
Prerequisites
- You are logged into OpenShift Container Platform.
-
You have configured Red Hat Quay geo-replication with at least two sites, for example,
usstorage
andeustorage
. - Each site has its own Organization, Repository, and image tags.
Procedure
Sync the blobs between all of your defined sites by running the following command:
$ python -m util.backfillreplication
WarningPrior to removing storage engines from your Red Hat Quay
config.yaml
file, you must ensure that all blobs are synced between all defined sites.When running this command, replication jobs are created which are picked up by the replication worker. If there are blobs that need replicated, the script returns UUIDs of blobs that will be replicated. If you run this command multiple times, and the output from the return script is empty, it does not mean that the replication process is done; it means that there are no more blobs to be queued for replication. Customers should use appropriate judgement before proceeding, as the allotted time replication takes depends on the number of blobs detected.
Alternatively, you could use a third party cloud tool, such as Microsoft Azure, to check the synchronization status.
This step must be completed before proceeding.
-
In your Red Hat Quay
config.yaml
file for siteusstorage
, remove theDISTRIBUTED_STORAGE_CONFIG
entry for theeustorage
site. Enter the following command to identify your
Quay
application pods:$ oc get pod -n <quay_namespace>
Example output
quay390usstorage-quay-app-5779ddc886-2drh2 quay390eustorage-quay-app-66969cd859-n2ssm
Enter the following command to open an interactive shell session in the
usstorage
pod:$ oc rsh quay390usstorage-quay-app-5779ddc886-2drh2
Enter the following command to permanently remove the
eustorage
site:ImportantThe following action cannot be undone. Use with caution.
sh-4.4$ python -m util.removelocation eustorage
Example output
WARNING: This is a destructive operation. Are you sure you want to remove eustorage from your storage locations? [y/n] y Deleted placement 30 Deleted placement 31 Deleted placement 32 Deleted placement 33 Deleted location eustorage
14.5. Mixed storage for geo-replication
Red Hat Quay geo-replication supports the use of different and multiple replication targets, for example, using AWS S3 storage on public cloud and using Ceph storage on premise. This complicates the key requirement of granting access to all storage backends from all Red Hat Quay pods and cluster nodes. As a result, it is recommended that you use the following:
- A VPN to prevent visibility of the internal storage, or
- A token pair that only allows access to the specified bucket used by Red Hat Quay
This results in the public cloud instance of Red Hat Quay having access to on-premise storage, but the network will be encrypted, protected, and will use ACLs, thereby meeting security requirements.
If you cannot implement these security measures, it might be preferable to deploy two distinct Red Hat Quay registries and to use repository mirroring as an alternative to geo-replication.