Chapter 5. Telco hub reference design specification
The telco hub reference design specification (RDS) describes the configuration for a hub cluster that deploys and operates fleets of OpenShift Container Platform clusters in a telco environment.
The telco hub RDS is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
5.1. Reference design scope
The telco core, telco RAN, and telco hub reference design specifications (RDS) capture the recommended, tested, and supported configurations that provide reliable and repeatable performance for clusters running the telco core and telco RAN profiles.
Each RDS includes the released features and supported configurations that are engineered and validated for clusters to run the individual profiles. The configurations provide a baseline OpenShift Container Platform installation that meets feature and KPI targets. Each RDS also describes expected variations for each individual configuration. Validation of each RDS includes many long duration and at-scale tests.
The validated reference configurations are updated for each major Y-stream release of OpenShift Container Platform. Z-stream patch releases are periodically re-tested against the reference configurations.
5.2. Deviations from the reference design
Deviating from the validated telco core, telco RAN DU, and telco hub reference design specifications (RDS) can have significant impact beyond the specific component or feature that you change. Deviations require analysis and engineering in the context of the complete solution.
All deviations from the RDS should be analyzed and documented with clear action tracking information. Due diligence is expected from partners to understand how to bring deviations into line with the reference design. This might require partners to provide additional resources to engage with Red Hat to work towards enabling their use case to achieve a best in class outcome with the platform. This is critical for the supportability of the solution and ensuring alignment across Red Hat and with partners.
Deviation from the RDS can have some or all of the following consequences:
- It can take longer to resolve issues.
- There is a risk of missing project service-level agreements (SLAs), project deadlines, end provider performance requirements, and so on.
- Unapproved deviations may require escalation at executive levels.
Note: Red Hat prioritizes the servicing of requests for deviations based on partner engagement priorities.
5.3. Hub cluster architecture overview
Use the features and components running on the management hub cluster to manage many other clusters in a hub-and-spoke topology. The hub cluster provides a highly available and centralized interface for managing the configuration, lifecycle, and observability of the fleet of deployed clusters.
All management hub functionality can be deployed on a dedicated OpenShift Container Platform cluster or as applications that are co-resident on an existing cluster.
- Managed cluster lifecycle
- Using a combination of Day 2 Operators, the hub cluster provides the necessary infrastructure to deploy and configure the fleet of clusters by using a GitOps methodology. Over the lifetime of the deployed clusters, further management of upgrades, scaling the number of clusters, node replacement, and other lifecycle management functions can be declaratively defined and rolled out. You can control the timing and progression of the rollout across the fleet.
- Monitoring
- The hub cluster provides monitoring and status reporting for the managed clusters through the Observability pillar of the RHACM Operator. This includes aggregated metrics, alerts, and compliance monitoring through the Governance policy framework.
The telco management hub reference design specification (RDS) and the associated reference custom resources (CRs) describe the telco engineering and QE validated method for deploying, configuring and managing the lifecycle of telco managed cluster infrastructure. The reference configuration includes the installation and configuration of the hub cluster components on top of OpenShift Container Platform.
Figure 5.1. Hub cluster reference design components
Figure 5.2. Hub cluster reference design architecture
5.4. Telco management hub cluster use model
The hub cluster provides managed cluster installation, configuration, observability and ongoing lifecycle management for telco application and workload clusters.
5.5. Hub cluster scaling target
The resource requirements for the hub cluster are directly dependent on the number of clusters being managed by the hub, the number of policies used for each managed cluster, and the set of features that are configured in Red Hat Advanced Cluster Management (RHACM).
The hub cluster reference configuration can support up to 3500 managed single-node OpenShift clusters under the following conditions:
- 5 policies for each cluster with hub-side templating configured with a 10 minute evaluation interval.
- Only the following RHACM add-ons are enabled:
- Policy controller
- Observability with the default configuration
- You deploy managed clusters by using GitOps ZTP in batches of up to 500 clusters at a time.
The reference configuration is also validated for deployment and management of a mix of managed cluster topologies. The specific limits depend on the mix of cluster topologies, enabled RHACM features, and so on. In a mixed topology scenario, the reference hub configuration is validated with a combination of 1200 single-node OpenShift clusters, 400 compact clusters (3 nodes combined control plane and compute nodes), and 230 standard clusters (3 control plane and 2 worker nodes).
Specific dimensioning requirements are highly dependent on the cluster topology and workload. For more information, see "Storage requirements". Adjust cluster dimensions for the specific characteristics of your fleet of managed clusters.
5.6. Hub cluster resource utilization
Resource utilization was measured for the hub cluster in the following scenario:
- Under reference load managing 3500 single-node OpenShift clusters.
- 3-node compact cluster for the management hub running on dual-socket bare-metal servers.
- Network impairment of 50 ms round-trip latency, 100 Mbps bandwidth limit and 0.02% packet loss.
- Observability was not enabled.
- Only local storage was used.
Metric | Peak Measurement |
---|---|
OpenShift Container Platform CPU | 106 cores (52 cores peak per node) |
OpenShift Container Platform memory | 504 GB (168 GB peak per node) |
5.7. Hub cluster topology
In production environments, the OpenShift Container Platform hub cluster must be highly available to maintain high availability of the management functions.
- Limits and requirements
Use a highly available cluster topology for the hub cluster, for example:
- Compact (3 nodes combined control plane and compute nodes)
- Standard (3 control plane nodes + N compute nodes)
- Engineering considerations
- In non-production environments, a single-node OpenShift cluster can be used for limited hub cluster functionality.
- Certain capabilities, for example Red Hat OpenShift Data Foundation, are not supported on single-node OpenShift. In this configuration, some hub cluster features might not be available.
- The number of optional compute nodes can vary depending on the scale of the specific use case.
- Compute nodes can be added later as required.
5.8. Hub cluster networking
The reference hub cluster is designed to operate in a disconnected networking environment where direct access to the internet is not possible. As with all OpenShift Container Platform clusters, the hub cluster requires access to an image registry hosting all OpenShift and Day 2 Operator Lifecycle Manager (OLM) images.
The hub cluster supports dual-stack networking for IPv6 and IPv4 networks. IPv6 is typical in edge or far-edge network segments, while IPv4 is more prevalent for use with legacy equipment in the data center.
- Limits and requirements
Regardless of the installation method, you must configure the following network types for the hub cluster:
  - clusterNetwork
  - serviceNetwork
  - machineNetwork
- You must configure the following IP addresses for the hub cluster:
  - apiVIP
  - ingressVIP
Note: Depending on the chosen architecture and DHCP configuration, some of these values are required and others can be auto-assigned.
- You must use the default OpenShift Container Platform network provider OVN-Kubernetes.
- Networking between the managed clusters and the hub cluster must meet the networking requirements in the Red Hat Advanced Cluster Management (RHACM) documentation, for example:
- Hub cluster access to managed cluster API service, Ironic Python agent, and baseboard management controller (BMC) port.
- Managed cluster access to hub cluster API service, ingress IP and control plane node IP addresses.
- Managed cluster BMC access to hub cluster control plane node IP addresses.
- An image registry must be accessible throughout the lifetime of the hub cluster:
- All required container images must be mirrored to the disconnected registry.
- The hub cluster must be configured to use a disconnected registry.
- The hub cluster cannot host its own image registry. The registry must remain available, for example, in a scenario where a power failure affects all hub cluster nodes.
- Engineering considerations
- When deploying a hub cluster, ensure that you define appropriately sized CIDR ranges for these networks, as shown in the following example.
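The following install-config.yaml excerpt is a minimal sketch of these settings for a bare-metal hub cluster. All CIDR ranges and VIP addresses are illustrative placeholders; newer installers express the VIPs as the plural apiVIPs and ingressVIPs fields.

```yaml
networking:
  networkType: OVNKubernetes
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  serviceNetwork:
  - 172.30.0.0/16
  machineNetwork:
  - cidr: 192.168.100.0/24
platform:
  baremetal:
    apiVIPs:
    - 192.168.100.5
    ingressVIPs:
    - 192.168.100.6
```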
5.9. Hub cluster memory and CPU requirements
The memory and CPU requirements of the hub cluster vary depending on the configuration of the hub cluster, the number of resources on the cluster, and the number of managed clusters.
- Limits and requirements
- Ensure that the hub cluster meets the underlying memory and CPU requirements for OpenShift Container Platform and Red Hat Advanced Cluster Management (RHACM).
- Engineering considerations
- Before deploying a telco hub cluster, ensure that your cluster host meets cluster requirements.
For more information about scaling the number of managed clusters, see "Hub cluster scaling target".
5.10. Hub cluster storage requirements
The total amount of storage required by the management hub cluster depends on the storage requirements of each of the applications deployed on the cluster. The main components that require storage through highly available PersistentVolume resources are described in the following sections.
The storage required for the underlying OpenShift Container Platform installation is separate from these requirements.
5.10.1. Assisted Service
The Assisted Service is deployed with the multicluster engine and Red Hat Advanced Cluster Management (RHACM).
Persistent volume resource | Size (GB) |
---|---|
| 50 |
| 700 |
| 20 |
5.10.2. RHACM Observability
Cluster Observability is provided by the multicluster engine and Red Hat Advanced Cluster Management (RHACM).
- Observability storage needs several PV resources and an S3-compatible object bucket for long-term retention of the metrics.
- Storage requirements calculation is complex and depends on the specific workloads and characteristics of the managed clusters. Requirements for PV resources and the S3 bucket depend on many aspects, including data retention, the number of managed clusters, managed cluster workloads, and so on.
- Estimate the required storage for observability by using the observability sizing calculator in the RHACM capacity planning repository. See the Red Hat Knowledgebase article Calculating storage need for MultiClusterHub Observability on telco environments for an explanation of using the calculator to estimate observability storage requirements. The following table uses inputs derived from the telco RAN DU RDS and the hub cluster RDS as representative values.
The following numbers are estimates. Tune the input values for more accurate results, and add an engineering margin, for example +20%, to the results to account for potential estimation inaccuracies.
Capacity planner input | Data source | Example value |
---|---|---|
Number of control plane nodes | Hub cluster RDS (scale) and telco RAN DU RDS (topology) | 3500 |
Number of additional worker nodes | Hub cluster RDS (scale) and telco RAN DU RDS (topology) | 0 |
Days for storage of data | Hub cluster RDS | 15 |
Total number of pods per cluster | Telco RAN DU RDS | 120 |
Number of namespaces (excluding OpenShift Container Platform) | Telco RAN DU RDS | 4 |
Number of metric samples per hour | Default value | 12 |
Number of hours of retention in receiver persistent volume (PV) | Default value | 24 |
With these input values, the sizing calculator as described in the Red Hat Knowledgebase article Calculating storage need for MultiClusterHub Observability on telco environments indicates the following storage needs:
alertmanager PV | | thanos receive PV | | thanos compact PV |
---|---|---|---|---|
Per replica | Total | Per replica | Total | Total |
10 GiB | 30 GiB | 10 GiB | 30 GiB | 100 GiB |

thanos rule PV | | thanos store PV | | Object bucket [1] | |
---|---|---|---|---|---|
Per replica | Total | Per replica | Total | Per day | Total |
30 GiB | 90 GiB | 100 GiB | 300 GiB | 15 GiB | 101 GiB |
[1] For the object bucket, it is assumed that downsampling is disabled, so that only raw data is calculated for storage requirements.
5.10.3. Storage considerations
- Limits and requirements
- The minimum OpenShift Container Platform and Red Hat Advanced Cluster Management (RHACM) storage requirements and limits apply.
- The storage backend should provide high availability. The hub cluster reference configuration provides storage through Red Hat OpenShift Data Foundation.
- Object bucket storage is provided through OpenShift Data Foundation.
- Engineering considerations
- Use SSD or NVMe disks with low latency and high throughput for etcd storage.
- The storage solution for telco hub clusters is OpenShift Data Foundation.
- The Local Storage Operator supports the storage class that OpenShift Data Foundation uses to provide block, file, and object storage as needed by other components on the hub cluster.
- The Local Storage Operator LocalVolume configuration includes setting forceWipeDevicesAndDestroyAllData: true to support the reinstallation of hub cluster nodes where OpenShift Data Foundation has previously been used, as shown in the following example.
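The following LocalVolume sketch illustrates this setting. The storage class name, node selector, and device path are illustrative; adjust them for your hardware.

```yaml
apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: local-disks
  namespace: openshift-local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: node-role.kubernetes.io/worker
        operator: Exists
  storageClassDevices:
  - storageClassName: local-sc                    # storage class consumed by OpenShift Data Foundation
    forceWipeDevicesAndDestroyAllData: true       # allows node reinstallation over previously used ODF disks
    volumeMode: Block
    devicePaths:
    - /dev/disk/by-path/pci-0000:00:1f.2-ata-1    # illustrative by-path device reference
```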
5.10.4. Git repository
The telco management hub cluster supports a GitOps-driven methodology for installing and managing the configuration of OpenShift clusters for various telco applications. This methodology requires an accessible Git repository that serves as the authoritative source of truth for cluster definitions and configuration artifacts.
Red Hat does not offer a commercially supported Git server. An existing Git server provided in the production environment can be used. Gitea and Gogs are examples of self-hosted Git servers that you can use.
The Git repository is typically provided in the production network external to the hub cluster. In a large-scale deployment, multiple hub clusters can use the same Git repository for maintaining the definitions of managed clusters. Using this approach, you can easily review the state of the complete network. As the source of truth for cluster definitions, the Git repository should be highly available and recoverable in disaster scenarios.
For disaster recovery and multi-hub considerations, run the Git repository separately from the hub cluster.
- Limits and requirements
- A Git repository is required to support the GitOps ZTP functions of the hub cluster, including installation, configuration, and lifecycle management of the managed clusters.
- The Git repository must be accessible from the management cluster.
- Engineering considerations
- The Git repository is used by the GitOps Operator to ensure continuous deployment and a single source of truth for the applied configuration.
5.11. Hub cluster installation
- Description
The reference method for installing OpenShift Container Platform for the hub cluster is through the Agent-based Installer.
Agent-based Installer provides installation capabilities without additional centralized infrastructure. The Agent-based Installer creates an ISO image, which you mount to the server to be installed. When you boot the server, OpenShift Container Platform is installed alongside optionally supplied extra manifests, such as the Red Hat OpenShift GitOps Operator.
Note: You can also install OpenShift Container Platform in the hub cluster by using other installation methods.
If hub cluster functions are being applied to an existing OpenShift Container Platform cluster, the Agent-based Installer installation is not required. The remaining steps to install Day 2 Operators and configure the cluster for these functions remain the same. When OpenShift Container Platform installation is complete, the set of additional Operators and their configuration must be installed on the hub cluster.
The reference configuration includes all of these custom resources (CRs), which you can apply manually, for example:

```terminal
$ oc apply -f <reference_cr>
```

You can also add the reference configuration to the Git repository and apply it by using ArgoCD.
Note: If you apply the CRs manually, ensure you apply the CRs in the order of their dependencies. For example, apply namespaces before Operators and apply Operators before configurations.
- Limits and requirements
- Agent-based Installer requires an accessible image repository containing all required OpenShift Container Platform and Day 2 Operator images.
- The Agent-based Installer builds ISO images based on a specific OpenShift Container Platform release and specific cluster details. Installing a second hub cluster requires building a separate ISO image.
- Engineering considerations
- Agent-based Installer provides a baseline OpenShift Container Platform installation. You apply Day 2 Operators and other configuration CRs after the cluster is installed.
- The reference configuration supports Agent-based Installer installation in a disconnected environment.
- A limited set of additional manifests can be supplied at installation time, as shown in the following example.
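The following is a minimal sketch of supplying extra manifests, such as the GitOps Operator CRs from the reference configuration, when generating the Agent-based Installer ISO. The installation directory name is an assumption.

```terminal
$ mkdir -p ./hub-install/openshift
$ cp install-config.yaml agent-config.yaml ./hub-install/
$ cp gitopsNS.yaml gitopsOperatorGroup.yaml gitopsSubscription.yaml ./hub-install/openshift/
$ openshift-install agent create image --dir ./hub-install
```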
5.12. Day 2 Operators in the hub cluster
The management hub cluster relies on a set of Day 2 Operators to provide critical management services and infrastructure. Use Operator versions that match the set of managed cluster versions in your fleet.
Install Day 2 Operators by using Operator Lifecycle Manager (OLM) and Subscription custom resources (CRs). Subscription CRs identify the specific Day 2 Operator to install, the catalog in which the Operator is found, and the appropriate version channel for the Operator. By default, OLM installs Operators and attempts to keep them updated with the latest z-stream version available in the channel. By default, all Subscription CRs are set with an installPlanApproval: Automatic value. In this mode, OLM automatically installs new Operator versions when they are available in the catalog and channel.
Setting installPlanApproval to Automatic exposes the risk of the Operator being updated outside of defined maintenance windows if the catalog index is updated to include newer Operator versions. In a disconnected environment where you build and maintain a curated set of Operators and versions in the catalog, and where you follow a strategy of creating a new catalog index for updated versions, the risk of Operators being inadvertently updated is largely removed. However, to further close this risk, you can set the Subscription CRs to installPlanApproval: Manual, which prevents Operators from being updated without explicit administrator approval, as shown in the following example.
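The following Subscription sketch shows the manual approval setting, using the RHACM Operator from the validated software table as an example. The catalog source name and namespace are illustrative for a disconnected environment.

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: advanced-cluster-management
  namespace: open-cluster-management
spec:
  name: advanced-cluster-management
  channel: release-2.13                 # channel matching the validated RHACM version
  source: redhat-operator-index         # illustrative disconnected catalog source
  sourceNamespace: openshift-marketplace
  installPlanApproval: Manual           # updates wait for explicit administrator approval
```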
- Limits and requirements
- When upgrading a telco hub cluster, the versions of OpenShift Container Platform and Operators must meet the requirements of all relevant compatibility matrixes.
5.13. Observability
The Red Hat Advanced Cluster Management (RHACM) multicluster engine Observability component provides centralized aggregation and visualization of metrics and alerts for all managed clusters. To balance performance and data analysis, the monitoring service maintains a subset list of aggregated metrics that are collected at a downsampled interval. The metrics can be accessed on the hub through a set of different preconfigured dashboards.
- Observability installation
  The primary custom resource (CR) for enabling and configuring the Observability service is the MultiClusterObservability CR, which defines the following settings:
  - Configurable retention settings.
  - Storage for the different components: thanos receive, thanos compact, thanos rule, thanos store sharding, and alertmanager.
  - The metadata.annotations.mco-disable-alerting="true" annotation, which enables tuning of the monitoring configuration on managed clusters.
  Note: Without this setting, the Observability component attempts to configure the managed cluster monitoring configuration. With this value set, you can merge your desired configuration with the alert forwarding configuration that Observability requires in the managed cluster monitoring ConfigMap object.
  When the Observability service is enabled, RHACM deploys a workload to each managed cluster that pushes the metrics and alerts generated by local monitoring to the hub cluster. The metrics and alerts to be forwarded from the managed cluster to the hub are defined by a ConfigMap CR in the open-cluster-management-addon-observability namespace. You can also specify custom metrics. For more information, see Adding custom metrics. A minimal example of the MultiClusterObservability CR follows.
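The following is a minimal sketch of the MultiClusterObservability CR. The object storage secret name is an assumption, and the storage sizes shown are the per-replica values from the sizing tables earlier in this section.

```yaml
apiVersion: observability.open-cluster-management.io/v1beta2
kind: MultiClusterObservability
metadata:
  name: observability
  annotations:
    mco-disable-alerting: "true"        # retain control of the managed cluster monitoring configuration
spec:
  observabilityAddonSpec:
    interval: 300                       # managed clusters forward the latest sample every 300 seconds
  storageConfig:
    metricObjectStorage:
      name: thanos-object-storage       # secret that contains the S3-compatible bucket configuration
      key: thanos.yaml
    alertmanagerStorageSize: 10Gi
    receiveStorageSize: 10Gi
    compactStorageSize: 100Gi
    ruleStorageSize: 30Gi
    storeStorageSize: 100Gi
```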
- Alertmanager configuration
- The hub cluster provides an Observability Alertmanager that can be configured to push alerts to external systems, for example, email. The Alertmanager is enabled by default.
- You must configure alert forwarding.
- When the Alertmanager is enabled but not configured, the hub Alertmanager does not forward alerts externally.
- When Observability is enabled, the managed clusters can be configured to send alerts to any endpoint including the hub Alertmanager.
- When a managed cluster is configured to forward alerts to external sources, alerts are not routed through the hub cluster Alertmanager.
- Alert state is available as a metric.
- When observability is enabled, the managed cluster alert states are included in the subset of metrics forwarded to the hub cluster and are available through Observability dashboards.
- Limits and requirements
- Observability requires persistent object storage for long-term metrics. For more information, see "Storage requirements".
- Engineering considerations
  - Forwarding of metrics is a subset of the full metric data. It includes only the metrics defined in the observability-metrics-allowlist config map and any custom metrics added by the user (see the example after this list).
  - Metrics are forwarded at a downsampled rate, by taking the latest datapoint at a 5 minute interval (or as defined by the MultiClusterObservability CR configuration).
  - A network outage may lead to a loss of metrics forwarded to the hub cluster during that interval. This can be mitigated if metrics are also forwarded directly from managed clusters to an external metrics collector in the provider's network. Full resolution metrics are available on the managed cluster.
  - In addition to the default metrics dashboards on the hub, users may define custom dashboards.
  - The reference configuration is sized based on 15 days of metrics storage by the hub cluster for 3500 single-node OpenShift clusters. If longer retention or a different managed cluster topology or sizing is required, the storage calculations must be updated and sufficient storage capacity maintained. For more information about calculating new values, see "Storage requirements".
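The following sketch shows how a user-defined metric can be added to the forwarded subset. The metric name is hypothetical; the ConfigMap name and namespace are the ones that the Observability service reads custom allowlists from.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: observability-metrics-custom-allowlist
  namespace: open-cluster-management-observability
data:
  metrics_list.yaml: |
    names:
      - my_app_requests_total           # hypothetical custom metric to forward to the hub cluster
```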
5.14. Managed cluster lifecycle management
To provision and manage sites at the far edge of the network, use GitOps ZTP in a hub-and-spoke architecture, where a single hub cluster manages many managed clusters.
Lifecycle management for spoke clusters can be divided into two different stages: cluster deployment, including OpenShift Container Platform installation, and cluster configuration.
5.14.1. Managed cluster deployment
- Description
  As of Red Hat Advanced Cluster Management (RHACM) 2.12, using the SiteConfig Operator is the recommended method for deploying managed clusters. The SiteConfig Operator introduces a unified ClusterInstance API that decouples the parameters that define the cluster from the manner in which it is deployed. The SiteConfig Operator uses a set of cluster templates that are instantiated using the data from a ClusterInstance custom resource (CR) to dynamically generate installation manifests. Following the GitOps methodology, the ClusterInstance CR is sourced from a Git repository through ArgoCD. The ClusterInstance CR can be used to initiate cluster installation by using either the Assisted Installer or the image-based installation available in multicluster engine.
- Limits and requirements
  - The SiteConfig ArgoCD plugin, which handles SiteConfig CRs, is deprecated from OpenShift Container Platform 4.18.
- Engineering considerations
  - You must create a Secret CR with the login information for the cluster baseboard management controller (BMC). This Secret CR is then referenced in the ClusterInstance CR. You can use integration with a secret store, such as Vault, to manage the secrets. A minimal example Secret follows this list.
  - Besides offering deployment method isolation and unification of Git and non-Git workflows, the SiteConfig Operator provides better scalability, greater flexibility with the use of custom templates, and an enhanced troubleshooting experience.
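The following is a minimal sketch of a BMC credentials Secret. The name, namespace, and encoded values are illustrative; the Secret is referenced by name from the node entry of the ClusterInstance CR.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: example-sno1-bmc-secret
  namespace: example-sno1               # typically the namespace of the managed cluster
type: Opaque
data:
  username: YWRtaW4=                    # base64-encoded BMC user name ("admin")
  password: cGFzc3dvcmQ=                # base64-encoded BMC password ("password")
```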
5.14.2. Managed cluster updates
- Description
  You can upgrade versions of OpenShift Container Platform, Day 2 Operators, and managed cluster configurations by declaring the required version in the Policy custom resources (CRs) that target the clusters to be upgraded.
  Policy controllers periodically check for policy compliance. If the result is negative, a violation report is created. If the policy remediation action is set to enforce, the violations are remediated according to the updated policy. If the policy remediation action is set to inform, the process ends with a non-compliant status report, and the responsibility to initiate the upgrade is left to the user to perform during an appropriate maintenance window.
  The Topology Aware Lifecycle Manager (TALM) extends Red Hat Advanced Cluster Management (RHACM) with features to manage the rollout of upgrades or configuration throughout the lifecycle of the fleet of clusters. It operates in progressive, limited-size batches of clusters. When upgrades to OpenShift Container Platform or the Day 2 Operators are required, TALM progressively rolls out the updates by stepping through the set of policies and switching them to an "enforce" policy to push the configuration to the managed cluster. The custom resource (CR) that TALM uses to build the remediation plan is the ClusterGroupUpgrade CR.
  You can use image-based upgrade (IBU) with the Lifecycle Agent as an alternative upgrade path for the single-node OpenShift cluster platform version. IBU uses an OCI image generated from a dedicated seed cluster to install single-node OpenShift on the target cluster. TALM uses the ImageBasedGroupUpgrade CR to roll out image-based upgrades to a set of identified clusters.
- Limits and requirements
  - You can perform direct upgrades for single-node OpenShift clusters by using image-based upgrade for OpenShift Container Platform <4.y> to <4.y+2>, and <4.y.z> to <4.y.z+n>.
  - Image-based upgrade uses custom images that are specific to the hardware platform that the clusters are running on. Different hardware platforms require separate seed images.
- Engineering considerations
  - In edge deployments, you can minimize the disruption to managed clusters by managing the timing and rollout of changes. Set all policies to inform to monitor compliance without triggering automatic enforcement. Similarly, set Day 2 Operator subscriptions to manual approval to prevent updates from occurring outside of scheduled maintenance windows.
  - The recommended upgrade approach for single-node OpenShift clusters is the image-based upgrade.
  - For multi-node cluster upgrades, consider the following MachineConfigPool CR configurations to reduce upgrade times (a sketch follows this list):
    - Pause configuration deployments to nodes during a maintenance window by setting the paused field to true.
    - Adjust the maxUnavailable field to control how many nodes in the pool can be updated simultaneously. The maxUnavailable field defines the percentage of nodes in the pool that can be simultaneously unavailable during a MachineConfig object update. Set maxUnavailable to the maximum tolerable value. This reduces the number of reboots in a cluster during upgrades, which results in shorter upgrade times.
    - Resume configuration deployments by setting the paused field to false. The configuration changes are applied in a single reboot.
  - During cluster installation, you can pause MachineConfigPool CRs by setting the paused field to true and setting maxUnavailable to 100% to improve installation times.
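The following MachineConfigPool sketch shows the fields discussed above for a standard worker pool. The maxUnavailable value is illustrative, and the selectors shown are the defaults for the worker pool.

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker
spec:
  paused: true                          # set back to false after the maintenance window to apply changes
  maxUnavailable: 50%                   # illustrative; set to the maximum your workloads tolerate
  machineConfigSelector:
    matchLabels:
      machineconfiguration.openshift.io/role: worker
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker: ""
```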
5.15. Hub cluster disaster recovery
Loss of the hub cluster does not typically create a service outage on the managed clusters. However, the functions provided by the hub cluster are lost, such as observability, configuration, and lifecycle management updates driven through the hub cluster.
- Limits and requirements
- Backup, restore, and disaster recovery are provided by the cluster backup and restore Operator, which depends on the OpenShift API for Data Protection (OADP) Operator.
- Engineering considerations
- You can extend the cluster backup and restore Operator to back up third-party resources of the hub cluster based on your configuration.
- The cluster backup and restore Operator is not enabled by default in Red Hat Advanced Cluster Management (RHACM). The reference configuration enables this feature, as shown in the following example.
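The following MultiClusterHub excerpt is a minimal sketch of enabling the cluster backup and restore feature, which is how the reference configuration enables it.

```yaml
apiVersion: operator.open-cluster-management.io/v1
kind: MultiClusterHub
metadata:
  name: multiclusterhub
  namespace: open-cluster-management
spec:
  overrides:
    components:
    - name: cluster-backup               # enables the cluster backup and restore Operator and OADP
      enabled: true
```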
5.16. Hub cluster components
5.16.1. Red Hat Advanced Cluster Management (RHACM)
- New in this release
- No reference design updates in this release.
- Description
Red Hat Advanced Cluster Management (RHACM) provides multicluster engine installation and ongoing lifecycle management functionality for deployed clusters. You can manage cluster configuration and upgrades declaratively by applying Policy custom resources (CRs) to clusters during maintenance windows.
RHACM provides functionality such as the following:
- Zero touch provisioning (ZTP) and ongoing scaling of clusters using the multicluster engine component in RHACM.
- Configuration, upgrades, and cluster status through the RHACM policy controller.
- During managed cluster installation, RHACM can apply labels to individual nodes as configured through the ClusterInstance CR.
- The Topology Aware Lifecycle Manager component of RHACM provides phased rollout of configuration changes to managed clusters.
- The RHACM multicluster engine Observability component provides selective monitoring, dashboards, alerts, and metrics.
The recommended method for single-node OpenShift cluster installation is the image-based installation method in multicluster engine, which uses the ClusterInstance CR for cluster definition.
The recommended method for single-node OpenShift upgrade is the image-based upgrade method.
Note: The RHACM multicluster engine Observability component provides a centralized view of the health and status of all the managed clusters. By default, every managed cluster is enabled to send metrics and alerts, created by its Cluster Monitoring Operator (CMO), back to Observability. For more information, see "Observability".
- Limits and requirements
- For more information about limits on the number of clusters managed by a single hub cluster, see "Telco management hub cluster use model".
The number of managed clusters that can be effectively managed by the hub depends on various factors, including:
- Resource availability at each managed cluster
- Policy complexity and cluster size
- Network utilization
- Workload demands and distribution
- The hub and managed clusters must maintain sufficient bi-directional connectivity.
- Engineering considerations
- You can configure the cluster backup and restore Operator to include third-party resources.
- The use of RHACM hub-side templating when defining configuration through policy is strongly recommended. This feature reduces the number of policies needed to manage the fleet by allowing per-cluster or per-group content, for example regional or hardware-type values, to be templated in a policy and substituted on a per-cluster or per-group basis.
- Managed clusters typically have some configuration values that are specific to an individual cluster. These should be managed by using RHACM hub-side policy templating, with values pulled from ConfigMap CRs based on the cluster name, as in the following sketch.
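The following is a minimal hub-side templating sketch. The ConfigurationPolicy, ConfigMap namespace, and key names are illustrative; in practice, the ConfigurationPolicy is wrapped in a Policy, for example by the PolicyGenerator plugin. The value is resolved on the hub from a ConfigMap named after the managed cluster.

```yaml
apiVersion: policy.open-cluster-management.io/v1
kind: ConfigurationPolicy
metadata:
  name: site-specific-config
spec:
  remediationAction: inform
  severity: low
  object-templates:
  - complianceType: musthave
    objectDefinition:
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: site-settings
        namespace: example-app
      data:
        # Resolved on the hub: reads the "timezone" key from a ConfigMap named after
        # the managed cluster in the "site-data" namespace (illustrative names).
        timezone: '{{hub fromConfigMap "site-data" .ManagedClusterName "timezone" hub}}'
```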
5.16.2. Topology Aware Lifecycle Manager
- New in this release
- No reference design updates in this release.
- Description
  TALM is an Operator that runs only on the hub cluster for managing how changes like cluster upgrades, Operator upgrades, and cluster configuration are rolled out to the network. TALM supports the following features:
  - Progressive rollout of policy updates to fleets of clusters in user-configurable batches.
  - Per-cluster actions that add ztp-done labels or other user-configurable labels following configuration changes to managed clusters.
  - Optional pre-caching of OpenShift Container Platform, OLM Operator, and additional images to single-node OpenShift clusters before initiating an upgrade. The pre-caching feature is not applicable when using the recommended image-based upgrade method for upgrading single-node OpenShift clusters. Pre-caching supports:
    - Specifying optional pre-caching configurations with PreCachingConfig CRs.
    - Configurable image filtering to exclude unused content.
    - Storage validation before and after pre-caching, using defined space requirement parameters.
- Limits and requirements
  - TALM supports concurrent cluster upgrades in batches of 500. An example ClusterGroupUpgrade CR is shown after this section.
  - Pre-caching is limited to single-node OpenShift cluster topology.
- Engineering considerations
  - The PreCachingConfig custom resource (CR) is optional. You do not need to create it if you want to pre-cache platform-related images only, such as OpenShift Container Platform and OLM.
  - TALM supports the use of hub-side templating with Red Hat Advanced Cluster Management policies.
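The following ClusterGroupUpgrade sketch shows the batching controls that TALM uses to build the remediation plan. The cluster names, policy name, and namespace are illustrative.

```yaml
apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: fleet-upgrade-batch-1
  namespace: default
spec:
  clusters:
  - sno-site-001
  - sno-site-002
  managedPolicies:
  - du-upgrade-platform-upgrade
  remediationStrategy:
    maxConcurrency: 100                 # clusters remediated in parallel within a batch
    timeout: 240                        # minutes allowed for the rollout
  enable: true
```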
5.16.3. GitOps Operator and GitOps ZTP
- New in this release
- No reference design updates in this release
- Description
  GitOps Operator and GitOps ZTP provide a GitOps-based infrastructure for managing cluster deployment and configuration. Cluster definitions and configurations are maintained as a declarative state in Git. You can apply ClusterInstance custom resources (CRs) to the hub cluster where the SiteConfig Operator renders them as installation CRs. In earlier releases, a GitOps ZTP plugin supported the generation of installation CRs from SiteConfig CRs. This plugin is now deprecated. A separate GitOps ZTP plugin is available to enable automatic wrapping of configuration CRs into policies based on the PolicyGenerator or the PolicyGenTemplate CRs.
  You can deploy and manage multiple versions of OpenShift Container Platform on managed clusters by using the baseline reference configuration CRs. You can use custom CRs alongside the baseline CRs. To maintain multiple per-version policies simultaneously, use Git to manage the versions of the source and policy CRs by using the PolicyGenerator or the PolicyGenTemplate CRs.
- Limits and requirements
  - 300 single-node SiteConfig CRs can be synchronized for each ArgoCD application. You can use multiple applications to achieve the maximum number of clusters supported by a single hub cluster.
  - To ensure consistent and complete cleanup of managed clusters and their associated resources during cluster or node deletion, you must configure ArgoCD to use background deletion mode.
- Engineering considerations
  - To avoid confusion or unintentional overwrite when updating content, use unique and distinguishable names for custom CRs in the source-crs directory and extra manifests.
  - Keep reference source CRs in a separate directory from custom CRs. This facilitates easy update of reference CRs as required.
  - To help with multiple versions, keep all source CRs and policy creation CRs in versioned Git repositories to ensure consistent generation of policies for each OpenShift Container Platform version.
5.16.4. Local Storage Operator
- New in this release
- No reference design updates in this release
- Description
  - You can use the Local Storage Operator to create persistent volumes that applications can consume as PVC resources. The number and type of PV resources that you create depends on your requirements.
- Engineering considerations
  - Create backing storage for PV CRs before creating the persistent volume. This can be a partition, a local volume, an LVM volume, or a full disk.
  - Refer to the device listing in LocalVolume CRs by the hardware path used to access each device to ensure correct allocation of disks and partitions, for example, /dev/disk/by-path/<id>. Logical names (for example, /dev/sda) are not guaranteed to be consistent across node reboots.
5.16.5. Red Hat OpenShift Data Foundation
- New in this release
- No reference design updates in this release
- Description
- Red Hat OpenShift Data Foundation provides file, block, and object storage services to the hub cluster.
- Limits and requirements
- Red Hat OpenShift Data Foundation (ODF) in internal mode requires the Local Storage Operator to define a storage class that provides the necessary underlying storage.
- When planning a telco management hub cluster, consider the ODF infrastructure and networking requirements.
- Dual-stack support is limited: ODF is supported with IPv4 on dual-stack clusters.
- Engineering considerations
- Address capacity warnings promptly because recovery can be difficult if storage capacity is exhausted. For more information, see Capacity planning.
5.16.6. Logging
- New in this release
- No reference design updates in this release
- Description
- Use the Cluster Logging Operator to collect and ship logs off the node for remote archival and analysis. The reference configuration uses Kafka to ship audit and infrastructure logs to a remote archive (see the sketch at the end of this section).
- Limits and requirements
- The reference configuration does not include local log storage.
- The reference configuration does not include aggregation of managed cluster logs at the hub cluster.
- Engineering considerations
- The impact on cluster CPU use is based on the number and size of the logs generated and the amount of log filtering configured.
- The reference configuration does not include shipping of application logs. The inclusion of application logs in the configuration requires you to evaluate the application logging rate and have sufficient additional CPU resources allocated to the reserved set.
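The following is a minimal sketch, assuming the Cluster Logging 6.x ClusterLogForwarder API, of shipping audit and infrastructure logs to Kafka. The broker URL, topic, and service account name are assumptions for illustration.

```yaml
apiVersion: observability.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  serviceAccount:
    name: log-collector                 # service account authorized to collect logs
  outputs:
  - name: kafka-archive
    type: kafka
    kafka:
      url: tls://kafka.example.com:9093/hub-logs   # illustrative broker and topic
  pipelines:
  - name: audit-and-infra
    inputRefs:
    - audit
    - infrastructure
    outputRefs:
    - kafka-archive
```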
5.16.7. OpenShift API for Data Protection
- New in this release
- No reference design updates in this release
- Description
The OpenShift API for Data Protection (OADP) Operator is automatically installed and managed by Red Hat Advanced Cluster Management (RHACM) when the backup feature is enabled.
The OADP Operator facilitates the backup and restore of workloads in OpenShift Container Platform clusters. Based on the upstream open source project Velero, it allows you to back up and restore all Kubernetes resources for a given project, including persistent volumes.
While it is not mandatory to have OADP on the hub cluster, it is highly recommended for cluster backup, disaster recovery, and a high-availability architecture for the hub cluster. The OADP Operator must be enabled to use the disaster recovery solutions for RHACM. The reference configuration enables backup (OADP) through the MultiClusterHub custom resource (CR) provided by the RHACM Operator.
- Limits and requirements
- Only one version of OADP can be installed on a cluster. The version installed by RHACM must be used for RHACM disaster recovery features.
- Engineering considerations
- No engineering consideration updates in this release.
5.17. Hub cluster reference configuration CRs
The following is the complete YAML reference of all the custom resources (CRs) for the telco management hub reference configuration in 4.18.
5.17.1. RHACM reference YAML
acmAgentServiceConfig.yaml
acmMCH.yaml
acmMirrorRegistryCM.yaml
acmNS.yaml
acmOperGroup.yaml
acmPerfSearch.yaml
acmProvisioning.yaml
acmSubscription.yaml
observabilityMCO.yaml
observabilityNS.yaml
observabilityOBC.yaml
observabilitySecret.yaml
thanosSecret.yaml
talmSubscription.yaml
5.17.2. Storage reference YAML
lsoLocalVolume.yaml
lsoNS.yaml
lsoOperatorgroup.yaml
lsoSubscription.yaml
odfNS.yaml
odfOperatorGroup.yaml
odfSubscription.yaml
storageCluster.yaml
5.17.3. GitOps Operator and GitOps ZTP reference YAML
argocd-ssh-known-hosts-cm.yaml
gitopsNS.yaml
gitopsOperatorGroup.yaml
gitopsSubscription.yaml
ztp-repo.yaml
app-project.yaml
clusters-app.yaml
gitops-cluster-rolebinding.yaml
gitops-policy-rolebinding.yaml
kustomization.yaml
policies-app-project.yaml
policies-app.yaml
5.17.4. Logging reference YAML
clusterLogNS.yaml
clusterLogOperGroup.yaml
clusterLogSubscription.yaml
5.17.5. Installation reference YAML
agent-config.yaml
install-config.yaml
5.18. Telco hub reference configuration software specifications
The telco hub 4.18 solution has been validated using the following Red Hat software products for OpenShift Container Platform clusters.
Component | Software version |
---|---|
OpenShift Container Platform | 4.18 |
Local Storage Operator | 4.18 |
Red Hat OpenShift Data Foundation (ODF) | 4.18 |
Red Hat Advanced Cluster Management (RHACM) | 2.13 |
Red Hat OpenShift GitOps | 1.15 |
GitOps Zero Touch Provisioning (ZTP) plugins | 4.18 |
multicluster engine Operator PolicyGenerator plugin | 2.12 |
Topology Aware Lifecycle Manager (TALM) | 4.18 |
Cluster Logging Operator | 6.2 |
OpenShift API for Data Protection (OADP) | The version aligned with the RHACM release. |