Chapter 13. Setting up cross-site replication

13.1. Cross-site replication expose types

You can use a NodePort service, a LoadBalancer service, or an OpenShift Route to handle network traffic for backup operations between Data Grid clusters. Before you set up cross-site replication, determine which expose types are available for your Red Hat OpenShift cluster. In some cases an administrator must provision services before you can configure an expose type.

NodePort

A NodePort is a service that accepts network traffic at a static port, in the 30000 to 32767 range, on an IP address that is available externally to the OpenShift cluster.

To use a NodePort as the expose type for cross-site replication, an administrator must provision external IP addresses for each OpenShift node. In most cases, an administrator must also configure DNS routing for those external IP addresses.
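
As a quick illustration of the port range described above, the following shell sketch checks whether a candidate value for spec.service.sites.local.expose.nodePort falls inside the default 30000 to 32767 range. The function name is_valid_node_port is hypothetical and is not part of any Data Grid or OpenShift tooling. It also assumes that your cluster uses the default NodePort range.

```shell
# Hypothetical helper: succeed only if the argument is a number
# inside the default NodePort range (30000 to 32767).
is_valid_node_port() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -ge 30000 ] && [ "$1" -le 32767 ]
}

# Example: check a candidate port before adding it to the Infinispan CR.
if is_valid_node_port 30111; then
  echo "30111 is inside the default NodePort range"
fi
```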

LoadBalancer

A LoadBalancer is a service that directs network traffic to the correct node in the OpenShift cluster.

Whether you can use a LoadBalancer as the expose type for cross-site replication depends on the host platform. AWS supports network load balancers (NLB) while some other cloud platforms do not. To use a LoadBalancer service, an administrator must first create an ingress controller backed by an NLB.

Route

An OpenShift Route allows Data Grid clusters to connect with each other through a public secure URL.

Data Grid uses TLS with the SNI header to send backup requests between clusters through an OpenShift Route. To do this you must add a keystore with TLS certificates so that Data Grid can encrypt network traffic for cross-site replication.

When you specify Route as the expose type for cross-site replication, Data Grid Operator creates a route with TLS passthrough encryption for each Data Grid cluster that it manages. You can specify a hostname for the Route but you cannot specify a Route that you have already created.

Additional resources

Configuring ingress cluster traffic overview

13.2. Managed cross-site replication

Data Grid Operator can discover Data Grid clusters running in different data centers to form global clusters.

When you configure managed cross-site connections, Data Grid Operator creates router pods in each Data Grid cluster. Data Grid pods use the <cluster_name>-site service to connect to these router pods and send backup requests.

Router pods maintain a record of all pod IP addresses and parse RELAY message headers to forward backup requests to the correct Data Grid cluster. If a router pod crashes then all Data Grid pods start using any other available router pod until OpenShift restores it.

Important

To manage cross-site connections, Data Grid Operator uses the Kubernetes API. Each OpenShift cluster must have network access to the remote Kubernetes API and a service account token for each backup cluster.

Note

Data Grid clusters do not start running until Data Grid Operator discovers all backup locations that you configure.

13.2.1. Creating service account tokens for managed cross-site connections

Generate service account tokens on OpenShift clusters that allow Data Grid Operator to automatically discover Data Grid clusters and manage cross-site connections.

Prerequisites

Ensure all OpenShift clusters have access to the Kubernetes API.
Data Grid Operator uses this API to manage cross-site connections.
Note
Data Grid Operator does not modify remote Data Grid clusters. The service account tokens provide read-only access through the Kubernetes API.

Procedure

Log in to an OpenShift cluster.
Create a service account.
For example, create a service account named lon at the LON site:
```
oc create sa -n <namespace> lon
```
Add the view role to the service account with the following command:
```
oc policy add-role-to-user view -n <namespace> -z lon
```
If you use a NodePort service to expose Data Grid clusters on the network, you must also add the cluster-reader role to the service account:
```
oc adm policy add-cluster-role-to-user cluster-reader -z lon -n <namespace>
```
Repeat the preceding steps on your other OpenShift clusters.
Exchange service account tokens on each OpenShift cluster.

13.2.2. Exchanging service account tokens

Generate service account tokens on your OpenShift clusters and add them into secrets at each backup location. The tokens that you generate in this procedure do not expire. For bound service account tokens, see Exchanging bound service account tokens.

Prerequisites

You have created a service account.

Procedure

Log in to your OpenShift cluster.

Create a service account token secret file as follows:

sa-token.yaml

apiVersion: v1
kind: Secret
metadata:
  name: ispn-xsite-sa-token # 1
  annotations:
    kubernetes.io/service-account.name: "<service-account>" # 2
type: kubernetes.io/service-account-token

1: Specifies the name of the secret.
2: Specifies the service account name.

Create the secret in your OpenShift cluster:
```
oc -n <namespace> create -f sa-token.yaml
```

Retrieve the service account token:

oc -n <namespace> get secrets ispn-xsite-sa-token -o jsonpath="{.data.token}" | base64 -d

The command prints the token in the terminal.

Copy the token for deployment in the backup OpenShift cluster.
Log in to the backup OpenShift cluster.
Add the service account token for a backup location:
```
oc -n <namespace> create secret generic <token-secret> --from-literal=token=<token>
```
The <token-secret> is the name of the secret that you specify with the spec.service.sites.locations.secretName field in the Infinispan CR.

Next steps

Repeat the preceding steps on your other OpenShift clusters.

Additional resources

Creating a service account token secret

13.2.3. Exchanging bound service account tokens

Create service account tokens with a limited lifespan and add them into secrets at each backup location. You must refresh the token periodically to prevent Data Grid Operator from losing access to the remote OpenShift cluster. For non-expiring tokens, see Exchanging service account tokens.

Prerequisites

You have created a service account.

Procedure

Log in to your OpenShift cluster.
Create a bound token for the service account:
```
oc -n <namespace> create token <service-account>
```
Note
By default, service account tokens are valid for one hour. Use the --duration command option to specify a different lifespan in seconds, for example --duration=7200s.
The command prints the token in the terminal.
Copy the token for deployment in the backup OpenShift cluster(s).
Log in to the backup OpenShift cluster.
Add the service account token for a backup location:
```
oc -n <namespace> create secret generic <token-secret> --from-literal=token=<token>
```
The <token-secret> is the name of the secret that you specify with the spec.service.sites.locations.secretName field in the Infinispan CR.
Repeat the steps on other OpenShift clusters.

Deleting expired tokens

When a token expires, delete the expired token secret, and then repeat the procedure to generate and exchange a new one.

Log in to the backup OpenShift cluster.

Delete the expired secret <token-secret>:

oc -n <namespace> delete secrets <token-secret>

Repeat the procedure to create a new token and generate a new <token-secret>.
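
The refresh steps above can be sketched as a single shell function. This is a minimal sketch under several assumptions: the function name rotate_xsite_token is hypothetical, both clusters are reachable as named contexts in your kubeconfig, and the same namespace name is used on both clusters. It uses only the oc commands shown in this section, plus the --context and --ignore-not-found options.

```shell
# Hypothetical helper: replace an expired token secret with a fresh bound token.
# Arguments: <source-context> <backup-context> <namespace> <service-account> <token-secret>
rotate_xsite_token() {
  src_ctx=$1 backup_ctx=$2 ns=$3 sa=$4 secret=$5

  # Create a new bound token on the cluster that owns the service account.
  token=$(oc --context "$src_ctx" -n "$ns" create token "$sa") || return 1

  # Delete the expired secret on the backup cluster, then recreate it
  # with the new token under the same secret name.
  oc --context "$backup_ctx" -n "$ns" delete secrets "$secret" --ignore-not-found || return 1
  oc --context "$backup_ctx" -n "$ns" create secret generic "$secret" \
    --from-literal=token="$token"
}

# Usage (all values are placeholders):
#   rotate_xsite_token <source-context> <backup-context> <namespace> <service-account> <token-secret>
```

Keeping the secret name unchanged means that the Infinispan CR does not need to be edited when the token is refreshed.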

Additional resources

Creating bound service account tokens

13.2.4. Configuring managed cross-site connections

Configure Data Grid Operator to establish cross-site views with Data Grid clusters.

Prerequisites

Determine a suitable expose type for cross-site replication.
If you use an OpenShift Route you must add a keystore with TLS certificates and secure cross-site connections.
Create and exchange Red Hat OpenShift service account tokens for each Data Grid cluster.

Procedure

Create an Infinispan CR for each Data Grid cluster.
Specify the name of the local site with spec.service.sites.local.name.
Configure the expose type for cross-site replication.
1. Set the value of the spec.service.sites.local.expose.type field to one of the following:
  - NodePort
  - LoadBalancer
  - Route
2. Optionally specify a port or custom hostname with the following fields:
  - spec.service.sites.local.expose.nodePort if you use a NodePort service.
  - spec.service.sites.local.expose.port if you use a LoadBalancer service.
  - spec.service.sites.local.expose.routeHostName if you use an OpenShift Route.
Specify the number of pods that can send RELAY messages with the service.sites.local.maxRelayNodes field.
Tip
Configure all pods in your cluster to send RELAY messages for better performance. If all pods send backup requests directly, then no pods need to forward backup requests.
Provide the name, URL, and secret for each Data Grid cluster that acts as a backup location with spec.service.sites.locations.

If Data Grid cluster names or namespaces at the remote site do not match the local site, specify those values with the clusterName and namespace fields.

The following are example Infinispan CR definitions for LON and NYC:

LON

apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  name: infinispan
spec:
  replicas: 3
  version: <Data Grid_version>
  service:
    type: DataGrid
    sites:
      local:
        name: LON
        expose:
          type: LoadBalancer
          port: 65535
        maxRelayNodes: 1
      locations:
        - name: NYC
          clusterName: <nyc_cluster_name>
          namespace: <nyc_cluster_namespace>
          url: openshift://api.rhdg-nyc.openshift-aws.myhost.com:6443
          secretName: nyc-token
  logging:
    categories:
      org.jgroups.protocols.TCP: error
      org.jgroups.protocols.relay.RELAY2: error

NYC

apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  name: nyc-cluster
spec:
  replicas: 2
  version: <Data Grid_version>
  service:
    type: DataGrid
    sites:
      local:
        name: NYC
        expose:
          type: LoadBalancer
          port: 65535
        maxRelayNodes: 1
      locations:
        - name: LON
          clusterName: infinispan
          namespace: rhdg-namespace
          url: openshift://api.rhdg-lon.openshift-aws.myhost.com:6443
          secretName: lon-token
  logging:
    categories:
      org.jgroups.protocols.TCP: error
      org.jgroups.protocols.relay.RELAY2: error

Important

Be sure to adjust logging categories in your Infinispan CR to decrease log levels for JGroups TCP and RELAY2 protocols. This prevents a large number of log files from using up container storage.

spec:
  logging:
    categories:
      org.jgroups.protocols.TCP: error
      org.jgroups.protocols.relay.RELAY2: error

Configure your Infinispan CRs with any other Data Grid service resources and then apply the changes.
Verify that Data Grid clusters form a cross-site view.
1. Retrieve the Infinispan CR.
```
oc get infinispan -o yaml
```
2. Check for the type: CrossSiteViewFormed condition.
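
Instead of reading the full YAML output, you can query the condition directly. The following is a minimal sketch: the function name xsite_view_status is hypothetical, and it assumes that the CrossSiteViewFormed condition appears under .status.conditions of the Infinispan CR with a status field, following the usual Kubernetes condition layout.

```shell
# Hypothetical helper: print the status of the CrossSiteViewFormed condition.
# Arguments: <namespace> <infinispan-cr-name>
# Assumption: conditions are listed under .status.conditions with
# "type" and "status" fields.
xsite_view_status() {
  oc -n "$1" get infinispan "$2" \
    -o jsonpath='{.status.conditions[?(@.type=="CrossSiteViewFormed")].status}'
}

# Usage (placeholders in angle brackets):
#   if [ "$(xsite_view_status <namespace> <cluster_name>)" = "True" ]; then
#     echo "cross-site view formed"
#   fi
```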

Next steps

If your clusters have formed a cross-site view, you can start adding backup locations to caches.

Additional resources

Data Grid guide to cross-site replication

13.3. Manually configuring cross-site connections

You can specify static network connection details to perform cross-site replication with Data Grid clusters running outside OpenShift. Manual cross-site connections are necessary in any scenario where access to the Kubernetes API is not available outside the OpenShift cluster where Data Grid runs.

Prerequisites

Determine a suitable expose type for cross-site replication.
If you use an OpenShift Route you must add a keystore with TLS certificates and secure cross-site connections.
Ensure you have the correct host names and ports for each Data Grid cluster and each <cluster-name>-site service.
Manually connecting Data Grid clusters to form cross-site views requires predictable network locations for Data Grid services, which means you need to know the network locations before they are created.

Procedure

Create an Infinispan CR for each Data Grid cluster.
Specify the name of the local site with spec.service.sites.local.name.
Configure the expose type for cross-site replication.
1. Set the value of the spec.service.sites.local.expose.type field to one of the following:
  - NodePort
  - LoadBalancer
  - Route
2. Optionally specify a port or custom hostname with the following fields:
  - spec.service.sites.local.expose.nodePort if you use a NodePort service.
  - spec.service.sites.local.expose.port if you use a LoadBalancer service.
  - spec.service.sites.local.expose.routeHostName if you use an OpenShift Route.

Provide the name and static URL for each Data Grid cluster that acts as a backup location with spec.service.sites.locations, for example:

LON

apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  name: infinispan
spec:
  replicas: 3
  version: <Data Grid_version>
  service:
    type: DataGrid
    sites:
      local:
        name: LON
        expose:
          type: LoadBalancer
          port: 65535
        maxRelayNodes: 1
      locations:
        - name: NYC
          url: infinispan+xsite://infinispan-nyc.myhost.com:7900
  logging:
    categories:
      org.jgroups.protocols.TCP: error
      org.jgroups.protocols.relay.RELAY2: error

NYC

apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  name: infinispan
spec:
  replicas: 2
  version: <Data Grid_version>
  service:
    type: DataGrid
    sites:
      local:
        name: NYC
        expose:
          type: LoadBalancer
          port: 65535
        maxRelayNodes: 1
      locations:
        - name: LON
          url: infinispan+xsite://infinispan-lon.myhost.com
  logging:
    categories:
      org.jgroups.protocols.TCP: error
      org.jgroups.protocols.relay.RELAY2: error

Important

Be sure to adjust logging categories in your Infinispan CR to decrease log levels for JGroups TCP and RELAY2 protocols. This prevents a large number of log files from using up container storage.

spec:
  logging:
    categories:
      org.jgroups.protocols.TCP: error
      org.jgroups.protocols.relay.RELAY2: error

Configure your Infinispan CRs with any other Data Grid service resources and then apply the changes.
Verify that Data Grid clusters form a cross-site view.
1. Retrieve the Infinispan CR.
```
oc get infinispan -o yaml
```
2. Check for the type: CrossSiteViewFormed condition.

Next steps

If your clusters have formed a cross-site view, you can start adding backup locations to caches.

Additional resources

Data Grid guide to cross-site replication

13.4. Allocating CPU and memory for Gossip router pod

Allocate CPU and memory resources to the Data Grid Gossip router pod.

Prerequisite

Have Gossip router enabled. The service.sites.local.discovery.launchGossipRouter property must be set to true, which is the default value.

Procedure

Allocate the number of CPU units using the service.sites.local.discovery.cpu field.
Allocate the amount of memory, in bytes, using the service.sites.local.discovery.memory field.
The cpu and memory fields have values in the format of <limit>:<requests>. For example, cpu: "2000m:1000m" limits pods to a maximum of 2000m of CPU and requests 1000m of CPU for each pod at startup. Specifying a single value sets both the limit and request.
Apply your Infinispan CR.

spec:
  service:
    type: DataGrid
    sites:
      local:
        name: LON
        discovery:
          launchGossipRouter: true
          memory: "2Gi:1Gi"
          cpu: "2000m:1000m"
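
To make the <limit>:<requests> format concrete, the following shell sketch splits a value into its two parts and applies the single-value rule described above. The function name split_resource is hypothetical. The sketch only illustrates how to read the format and does not reflect how Data Grid Operator parses it internally.

```shell
# Hypothetical helper: split a "<limit>:<requests>" value into its parts.
# A single value without a colon sets both the limit and the request.
split_resource() {
  value=$1
  case $value in
    *:*) limit=${value%%:*}; requests=${value#*:} ;;
    *)   limit=$value; requests=$value ;;
  esac
  printf 'limit=%s requests=%s\n' "$limit" "$requests"
}

split_resource "2000m:1000m"   # cpu value from the example CR
split_resource "2Gi:1Gi"       # memory value from the example CR
split_resource "500m"          # single value: limit and request are equal
```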

13.5. Disabling local Gossip router and service

The Data Grid Operator starts a Gossip router on each site, but you only need a single Gossip router to manage traffic between the Data Grid cluster members. You can disable the additional Gossip routers to save resources.

For example, you have Data Grid clusters at the LON and NYC sites. The following procedure shows how to disable the Gossip router at the LON site and connect to the NYC site, which has the Gossip router enabled.

Procedure

Create an Infinispan CR for each Data Grid cluster.
Specify the name of the local site with the spec.service.sites.local.name field.
For the LON cluster, set false as the value for the spec.service.sites.local.discovery.launchGossipRouter field.
For the LON cluster, specify the URL of the NYC site with the spec.service.sites.locations.url field.

In the NYC configuration, do not specify the spec.service.sites.locations.url.

LON

apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  name: infinispan
spec:
  replicas: 3
  service:
    type: DataGrid
    sites:
      local:
        name: LON
        discovery:
          launchGossipRouter: false
      locations:
        - name: NYC
          url: infinispan+xsite://infinispan-nyc.myhost.com:7900

NYC

apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  name: infinispan
spec:
  replicas: 3
  service:
    type: DataGrid
    sites:
      local:
        name: NYC
      locations:
        - name: LON

Important

If you have three or more sites, Data Grid recommends that you keep the Gossip router enabled on all remote sites. When you have multiple Gossip routers and one of them becomes unavailable, the remaining routers continue exchanging messages. If only a single Gossip router is defined and it becomes unavailable, the connection between the remote sites breaks.

Next steps

If your clusters have formed a cross-site view, you can start adding backup locations to caches.

Additional resources

Data Grid cross-site replication

13.6. Resources for configuring cross-site replication

The following tables provide fields and descriptions for cross-site resources.

Table 13.1. service.type
Field	Description
`service.type: DataGrid`	Data Grid supports cross-site replication with Data Grid service clusters only.

Table 13.2. service.sites.local
Field	Description
`service.sites.local.name`	Names the local site where a Data Grid cluster runs.
`service.sites.local.maxRelayNodes`	Specifies the maximum number of pods that can send RELAY messages for cross-site replication. The default value is `1`.
`service.sites.local.discovery.launchGossipRouter`	If `false`, the cross-site services and the Gossip router pod are not created in the local site. The default value is `true`.
`service.sites.local.discovery.memory`	Allocates the amount of memory in bytes. It uses the following format `<limit>:<requests>` (example `"2Gi:1Gi"`).
`service.sites.local.discovery.cpu`	Allocates the number of CPU units. It uses the following format `<limit>:<requests>` (example `"2000m:1000m"`).
`service.sites.local.expose.type`	Specifies the network service for cross-site replication. Data Grid clusters use this service to communicate and perform backup operations. You can set the value to `NodePort`, `LoadBalancer`, or `Route`.
`service.sites.local.expose.nodePort`	Specifies a static port within the default range of `30000` to `32767` if you expose Data Grid through a `NodePort` service. If you do not specify a port, the platform selects an available one.
`service.sites.local.expose.port`	Specifies the network port for the service if you expose Data Grid through a `LoadBalancer` service. The default port is `7900`.
`service.sites.local.expose.routeHostName`	Specifies a custom hostname if you expose Data Grid through an OpenShift `Route`. If you do not set a value then OpenShift generates a hostname.

Table 13.3. service.sites.locations
Field	Description
`service.sites.locations`	Provides connection information for all backup locations.
`service.sites.locations.name`	Specifies the name of a backup location. The value must match `spec.service.sites.local.name` in the Infinispan CR at that location.
`service.sites.locations.url`	Specifies the URL of the Kubernetes API for managed connections or a static URL for manual connections. Use `openshift://` to specify the URL of the Kubernetes API for an OpenShift cluster. Note that the `openshift://` URL must present a valid, CA-signed certificate. You cannot use self-signed certificates. Use the `infinispan+xsite://<hostname>:<port>` format for static hostnames and ports. The default port is `7900`.
`service.sites.locations.secretName`	Specifies the secret that contains the service account token for the backup site.
`service.sites.locations.clusterName`	Specifies the cluster name at the backup location if it is different to the cluster name at the local site.
`service.sites.locations.namespace`	Specifies the namespace of the Data Grid cluster at the backup location if it does not match the namespace at the local site.

Managed cross-site connections

spec:
  service:
    type: DataGrid
    sites:
      local:
        name: LON
        expose:
          type: LoadBalancer
        maxRelayNodes: 1
      locations:
      - name: NYC
        clusterName: <nyc_cluster_name>
        namespace: <nyc_cluster_namespace>
        url: openshift://api.site-b.devcluster.openshift.com:6443
        secretName: nyc-token

Manual cross-site connections

spec:
  service:
    type: DataGrid
    sites:
      local:
        name: LON
        expose:
          type: LoadBalancer
          port: 65535
        maxRelayNodes: 1
      locations:
      - name: NYC
        url: infinispan+xsite://infinispan-nyc.myhost.com:7900
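
The static URL format for manual connections can be illustrated with a short shell sketch that extracts the hostname and port and falls back to the default port 7900 when none is given. The function name parse_xsite_url is hypothetical. The sketch assumes plain hostnames or IPv4 addresses and does not handle IPv6 literals.

```shell
# Hypothetical helper: print "<hostname> <port>" for a static cross-site URL.
# Falls back to the default port 7900 when the URL has no port.
parse_xsite_url() {
  url=$1
  case $url in
    infinispan+xsite://*) rest=${url#infinispan+xsite://} ;;
    *) echo "unsupported URL scheme: $url" >&2; return 1 ;;
  esac
  case $rest in
    *:*) host=${rest%%:*}; port=${rest##*:} ;;
    *)   host=$rest; port=7900 ;;
  esac
  printf '%s %s\n' "$host" "$port"
}

parse_xsite_url "infinispan+xsite://infinispan-nyc.myhost.com:7900"
parse_xsite_url "infinispan+xsite://infinispan-lon.myhost.com"
```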

13.7. Securing cross-site connections

Add keystores and trust stores so that Data Grid clusters can secure cross-site replication traffic.

You must add a keystore to use an OpenShift Route as the expose type for cross-site replication. Securing cross-site connections is optional if you use a NodePort or LoadBalancer as the expose type.

Note

Cross-site replication does not support the OpenShift CA service. You must provide your own certificates.

Prerequisites

Have a PKCS12 keystore that Data Grid can use to encrypt and decrypt RELAY messages.
You must provide a keystore for relay pods and router pods to secure cross-site connections.
The keystore can be the same for relay pods and router pods or you can provide separate keystores for each.
You can also use the same keystore for each Data Grid cluster or a unique keystore for each cluster.
Have a PKCS12 trust store that contains part of the certificate chain or root CA certificate that verifies public certificates for Data Grid relay pods and router pods.

Procedure

Create cross-site encryption secrets.
1. Create keystore secrets.
2. Create trust store secrets.
Modify the Infinispan CR for each Data Grid cluster to specify the secret name for the encryption.transportKeyStore.secretName and encryption.routerKeyStore.secretName fields.

Configure any other fields to encrypt RELAY messages as required and then apply the changes.

apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  name: infinispan
spec:
  replicas: 2
  version: <Data Grid_version>
  expose:
    type: LoadBalancer
  service:
    type: DataGrid
    sites:
      local:
        name: SiteA
        # ...
        encryption:
          protocol: TLSv1.3
          transportKeyStore:
            secretName: transport-tls-secret
            alias: transport
            filename: keystore.p12
          routerKeyStore:
            secretName: router-tls-secret
            alias: router
            filename: keystore.p12
          trustStore:
            secretName: truststore-tls-secret
            filename: truststore.p12
      locations:
        # ...

13.7.1. Resources for configuring cross-site encryption

The following tables provide fields and descriptions for encrypting cross-site connections.

Table 13.4. service.sites.local.encryption
Field	Description
`service.sites.local.encryption.protocol`	Specifies the TLS protocol to use for cross-site connections. The default value is `TLSv1.2` but you can set `TLSv1.3` if required.
`service.sites.local.encryption.transportKeyStore`	Configures a keystore secret for relay pods.
`service.sites.local.encryption.routerKeyStore`	Configures a keystore secret for router pods.
`service.sites.local.encryption.trustStore`	Configures a trust store secret for relay pods and router pods.

Table 13.5. service.sites.local.encryption.transportKeyStore
Field	Description
`secretName`	Specifies the secret that contains a keystore that relay pods can use to encrypt and decrypt RELAY messages. This field is required.
`alias`	Optionally specifies the alias of the certificate in the keystore. The default value is `transport`.
`filename`	Optionally specifies the filename of the keystore. The default value is `keystore.p12`.

Table 13.6. service.sites.local.encryption.routerKeyStore
Field	Description
`secretName`	Specifies the secret that contains a keystore that router pods can use to encrypt and decrypt RELAY messages. This field is required.
`alias`	Optionally specifies the alias of the certificate in the keystore. The default value is `router`.
`filename`	Optionally specifies the filename of the keystore. The default value is `keystore.p12`.

Table 13.7. service.sites.local.encryption.trustStore
Field	Description
`secretName`	Specifies the secret that contains a trust store to verify public certificates for relay pods and router pods. This field is required.
`filename`	Optionally specifies the filename of the trust store. The default value is `truststore.p12`.

13.7.2. Cross-site encryption secrets

Cross-site replication encryption secrets add the keystores and trust stores that secure cross-site connections.

Cross-site encryption secrets

apiVersion: v1
kind: Secret
metadata:
  name: tls-secret
type: Opaque
stringData:
  password: changeme
  type: pkcs12
data:
  <file-name>: "MIIKDgIBAzCCCdQGCSqGSIb3DQEHA..."

Field	Description
`stringData.password`	Specifies the password for the keystore or trust store.
`stringData.type`	Optionally specifies the keystore or trust store type. The default value is `pkcs12`.
`data.<file-name>`	Adds a base64-encoded keystore or trust store.
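
The value of data.<file-name> must be the base64-encoded content of the keystore or trust store file. The following is a minimal shell sketch for producing that value on a single line. The function name encode_store is hypothetical, and the sketch assumes that the base64 and tr utilities are available on your workstation.

```shell
# Hypothetical helper: print a keystore or trust store file as
# single-line base64, suitable for the data.<file-name> field.
encode_store() {
  # Strip newlines because some base64 implementations wrap long output.
  base64 < "$1" | tr -d '\n'
}

# Usage, assuming keystore.p12 exists in the current directory:
#   encode_store keystore.p12
```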

13.8. Configuring sites in the same OpenShift cluster

For evaluation and demonstration purposes, you can configure Data Grid to back up between pods in the same OpenShift cluster.

Important

Using ClusterIP as the expose type for cross-site replication is intended for demonstration purposes only. Use this expose type only for temporary proof-of-concept deployments, such as on a laptop or in a similar local environment.

Procedure

Create an Infinispan CR for each Data Grid cluster.
Specify the name of the local site with spec.service.sites.local.name.
Set ClusterIP as the value of the spec.service.sites.local.expose.type field.
Provide the name of the Data Grid cluster that acts as a backup location with spec.service.sites.locations.clusterName.

If both Data Grid clusters have the same name, specify the namespace of the backup location with spec.service.sites.locations.namespace.

apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  name: example-clustera
spec:
  replicas: 1
  expose:
    type: LoadBalancer
  service:
    type: DataGrid
    sites:
      local:
        name: SiteA
        expose:
          type: ClusterIP
        maxRelayNodes: 1
      locations:
        - name: SiteB
          clusterName: example-clusterb
          namespace: cluster-namespace

Configure your Infinispan CRs with any other Data Grid service resources and then apply the changes.
Verify that Data Grid clusters form a cross-site view.
1. Retrieve the Infinispan CR.
```
oc get infinispan -o yaml
```
2. Check for the type: CrossSiteViewFormed condition.
