Chapter 9. Managing cluster policies with PolicyGenerator resources
9.1. Configuring managed cluster policies by using PolicyGenerator resources
You can customize how Red Hat Advanced Cluster Management (RHACM) uses PolicyGenerator CRs to generate Policy CRs that configure the managed clusters that you provision.
Using RHACM and PolicyGenerator CRs is the recommended approach for managing policies and deploying them to managed clusters. This replaces the use of PolicyGenTemplate CRs for this purpose. For more information about PolicyGenerator resources, see the RHACM Policy Generator documentation.
9.1.1. Comparing RHACM PolicyGenerator and PolicyGenTemplate resource patching
PolicyGenerator custom resources (CRs) and PolicyGenTemplate CRs can be used in GitOps ZTP to generate RHACM policies for managed clusters.
There are advantages to using PolicyGenerator CRs over PolicyGenTemplate CRs when it comes to patching OpenShift Container Platform resources with GitOps ZTP. Using the RHACM PolicyGenerator API provides a generic way of patching resources which is not possible with PolicyGenTemplate resources.
The PolicyGenerator API is a part of the Open Cluster Management standard, while the PolicyGenTemplate API is not. A comparison of PolicyGenerator and PolicyGenTemplate resource patching and placement strategies is described in the following table.
Using PolicyGenTemplate CRs to manage and deploy policies to managed clusters will be deprecated in an upcoming OpenShift Container Platform release. Equivalent and improved functionality is available using Red Hat Advanced Cluster Management (RHACM) and PolicyGenerator CRs.
For more information about PolicyGenerator resources, see the RHACM Integrating Policy Generator documentation.
| PolicyGenerator patching | PolicyGenTemplate patching |
|---|---|
| Uses Kustomize strategic merges for merging resources. For more information see Declarative Management of Kubernetes Objects Using Kustomize. | Works by replacing variables with their values as defined by the patch. This is less flexible than Kustomize merge strategies. |
| Supports … | Does not support … |
| Relies only on patching, no embedded variable substitution is required. | Overwrites variable values defined in the patch. |
| Does not support merging lists in merge patches. Replacing a list in a merge patch is supported. | Merging and replacing lists is supported in a limited fashion - you can only merge one object in the list. |
| Does not currently support the OpenAPI specification for resource patching. This means that additional directives are required in the patch to merge content that does not follow a schema, for example, … | Works by replacing fields and values with values as defined by the patch. |
| Requires additional directives, for example, … | Substitutes fields and values defined in the source CR with values defined in the patch, for example … |
| Can patch the … | Can patch the … |
9.1.2. About the PolicyGenerator CRD
The PolicyGenerator custom resource definition (CRD) tells the PolicyGen policy generator what custom resources (CRs) to include in the cluster configuration, how to combine the CRs into the generated policies, and what items in those CRs need to be updated with overlay content.
The following example shows a PolicyGenerator CR (acm-common-ranGen.yaml) extracted from the ztp-site-generate reference container. The acm-common-ranGen.yaml file defines two Red Hat Advanced Cluster Management (RHACM) policies. The policies manage a collection of configuration CRs, one for each unique value of policyName in the CR. acm-common-ranGen.yaml creates a single placement binding and a placement rule to bind the policies to clusters based on the labels listed in the policyDefaults.placement.labelSelector section.
Example PolicyGenerator CR - acm-common-ranGen.yaml
A PolicyGenerator CR can be constructed with any number of included CRs. Apply the following example CR in the hub cluster to generate a policy containing a single CR:
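For illustration, a minimal sketch of such a single-CR PolicyGenerator follows. It uses the policy and CR names described below; the namespace, label selector, and binding names are assumptions that you adapt to your environment:

apiVersion: policy.open-cluster-management.io/v1
kind: PolicyGenerator
metadata:
  name: group-du-sno                         # assumed generator name for this sketch
placementBindingDefaults:
  name: group-du-sno-placement-binding
policyDefaults:
  namespace: ztp-group                       # assumed policy namespace
  remediationAction: inform
  severity: low
  placement:
    labelSelector:
      matchExpressions:
        - key: group-du-sno
          operator: Exists
policies:
  - name: group-du-sno-config-policy
    manifests:
      - path: source-crs/PtpConfigSlave.yaml
        patches:
          - metadata:
              name: du-ptp-slave             # name given to the PtpConfig CR in the generated policy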
Using the source file PtpConfigSlave.yaml as an example, the file defines a PtpConfig CR. The generated policy for the PtpConfigSlave example is named group-du-sno-config-policy. The PtpConfig CR defined in the generated group-du-sno-config-policy is named du-ptp-slave. The spec defined in PtpConfigSlave.yaml is placed under du-ptp-slave along with the other spec items defined under the source file.
The following example shows the group-du-sno-config-policy CR:
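The generated policy wraps the configuration CR in an RHACM Policy and ConfigurationPolicy. The following abbreviated sketch shows only the general shape; the names, namespace, evaluation intervals, and remaining PtpConfig spec fields are assumptions and the full generated output contains additional metadata:

apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: group-du-sno.group-du-sno-config-policy   # assumed <generator>.<policyName> format
  namespace: ztp-group
spec:
  disabled: false
  remediationAction: inform
  policy-templates:
    - objectDefinition:
        apiVersion: policy.open-cluster-management.io/v1
        kind: ConfigurationPolicy
        metadata:
          name: group-du-sno-config-policy
        spec:
          remediationAction: inform
          severity: low
          object-templates:
            - complianceType: musthave
              objectDefinition:
                apiVersion: ptp.openshift.io/v1
                kind: PtpConfig
                metadata:
                  name: du-ptp-slave
                  namespace: openshift-ptp
                spec:
                  profile:
                    - name: slave            # remaining spec content comes from PtpConfigSlave.yaml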
9.1.3. Recommendations when customizing PolicyGenerator CRs
Consider the following best practices when customizing site configuration PolicyGenerator custom resources (CRs):
- Use as few policies as are necessary. Using fewer policies requires fewer resources. Each additional policy creates increased CPU load for the hub cluster and the deployed managed cluster. CRs are combined into policies based on the policyName field in the PolicyGenerator CR. CRs in the same PolicyGenerator which have the same value for policyName are managed under a single policy.
- In disconnected environments, use a single catalog source for all Operators by configuring the registry as a single index containing all Operators. Each additional CatalogSource CR on the managed clusters increases CPU usage.
- MachineConfig CRs should be included as extraManifests in the SiteConfig CR so that they are applied during installation. This can reduce the overall time taken until the cluster is ready to deploy applications.
- PolicyGenerator CRs should override the channel field to explicitly identify the desired version. This ensures that changes in the source CR during upgrades do not update the generated subscription.
- The default setting for policyDefaults.consolidateManifests is true. This is the recommended setting for the DU profile. Setting it to false might impact large scale deployments.
- The default setting for policyDefaults.orderPolicies is false. This is the recommended setting for the DU profile. After the cluster installation is complete and a cluster becomes Ready, TALM creates a ClusterGroupUpgrade CR corresponding to this cluster. The ClusterGroupUpgrade CR contains a list of ordered policies defined by the ran.openshift.io/ztp-deploy-wave annotation. If you use the PolicyGenerator CR to change the order of the policies, conflicts might occur and the configuration might not be applied. A snippet showing these two default settings follows at the end of this section.
When managing large numbers of spoke clusters on the hub cluster, minimize the number of policies to reduce resource consumption.
Grouping multiple configuration CRs into a single or limited number of policies is one way to reduce the overall number of policies on the hub cluster. When using the common, group, and site hierarchy of policies for managing site configuration, it is especially important to combine site-specific configuration into a single policy.
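As a reference, a minimal policyDefaults snippet with the recommended defaults mentioned above might look like the following; the values shown are the documented defaults, not required overrides:

policyDefaults:
  consolidateManifests: true   # default; recommended for the DU profile
  orderPolicies: false         # default; policy order is driven by the ran.openshift.io/ztp-deploy-wave annotation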
9.1.4. PolicyGenerator CRs for RAN deployments
Use PolicyGenerator custom resources (CRs) to customize the configuration applied to the cluster by using the GitOps Zero Touch Provisioning (ZTP) pipeline. The PolicyGenerator CR allows you to generate one or more policies to manage the set of configuration CRs on your fleet of clusters. The PolicyGenerator CR identifies the set of managed CRs, bundles them into policies, builds the policy wrapping around those CRs, and associates the policies with clusters by using label binding rules.
The reference configuration, obtained from the GitOps ZTP container, is designed to provide a set of critical features and node tuning settings that ensure the cluster can support the stringent performance and resource utilization constraints typical of RAN (Radio Access Network) Distributed Unit (DU) applications. Changes or omissions from the baseline configuration can affect feature availability, performance, and resource utilization. Use the reference PolicyGenerator CRs as the basis to create a hierarchy of configuration files tailored to your specific site requirements.
The baseline PolicyGenerator CRs that are defined for RAN DU cluster configuration can be extracted from the GitOps ZTP ztp-site-generate container. See "Preparing the GitOps ZTP site configuration repository" for further details.
The PolicyGenerator CRs can be found in the ./out/argocd/example/acmpolicygenerator/ folder. The reference architecture has common, group, and site-specific configuration CRs. Each PolicyGenerator CR refers to other CRs that can be found in the ./out/source-crs folder.
The PolicyGenerator CRs relevant to RAN cluster configuration are described below. Variants are provided for the group PolicyGenerator CRs to account for differences in single-node, three-node compact, and standard cluster configurations. Similarly, site-specific configuration variants are provided for single-node clusters and multi-node (compact or standard) clusters. Use the group and site-specific configuration variants that are relevant for your deployment.
| PolicyGenerator CR | Description |
|---|---|
| acm-example-multinode-site.yaml | Contains a set of CRs that get applied to multi-node clusters. These CRs configure SR-IOV features typical for RAN installations. |
| acm-example-sno-site.yaml | Contains a set of CRs that get applied to single-node OpenShift clusters. These CRs configure SR-IOV features typical for RAN installations. |
| … | Contains a set of common RAN policy configuration that gets applied to multi-node clusters. |
| acm-common-ranGen.yaml | Contains a set of common RAN CRs that get applied to all clusters. These CRs subscribe to a set of operators providing cluster features typical for RAN as well as baseline cluster tuning. |
| acm-group-du-3node-ranGen.yaml | Contains the RAN policies for three-node clusters only. |
| acm-group-du-sno-ranGen.yaml | Contains the RAN policies for single-node clusters only. |
| acm-group-du-standard-ranGen.yaml | Contains the RAN policies for standard three control-plane clusters. |
9.1.5. Customizing a managed cluster with PolicyGenerator CRs
Use the following procedure to customize the policies that get applied to the managed cluster that you provision using the GitOps Zero Touch Provisioning (ZTP) pipeline.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have logged in to the hub cluster as a user with cluster-admin privileges.
- You configured the hub cluster for generating the required installation and policy CRs.
- You created a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for the Argo CD application.
Procedure
- Create a PolicyGenerator CR for site-specific configuration CRs.
  - Choose the appropriate example for your CR from the out/argocd/example/acmpolicygenerator/ folder, for example, acm-example-sno-site.yaml or acm-example-multinode-site.yaml.
  - Change the policyDefaults.placement.labelSelector field in the example file to match the site-specific label included in the SiteConfig CR. In the example SiteConfig file, the site-specific label is sites: example-sno.

    Note: Ensure that the labels defined in your PolicyGenerator policyDefaults.placement.labelSelector field correspond to the labels that are defined in the related managed clusters SiteConfig CR.

  - Change the content in the example file to match the desired configuration.
- Optional: Create a PolicyGenerator CR for any common configuration CRs that apply to the entire fleet of clusters.
  - Select the appropriate example for your CR from the out/argocd/example/acmpolicygenerator/ folder, for example, acm-common-ranGen.yaml.
  - Change the content in the example file to match the required configuration.
- Optional: Create a PolicyGenerator CR for any group configuration CRs that apply to certain groups of clusters in the fleet.

  Ensure that the content of the overlaid spec files matches your required end state. As a reference, the out/source-crs directory contains the full list of source CRs available to be included and overlaid by your PolicyGenerator templates.

  Note: Depending on the specific requirements of your clusters, you might need more than a single group policy per cluster type, especially considering that the example group policies each have a single PerformancePolicy.yaml file that can only be shared across a set of clusters if those clusters consist of identical hardware configurations.

  - Select the appropriate example for your CR from the out/argocd/example/acmpolicygenerator/ folder, for example, acm-group-du-sno-ranGen.yaml.
  - Change the content in the example file to match the required configuration.
- Optional: Create a validator inform policy PolicyGenerator CR to signal when the GitOps ZTP installation and configuration of the deployed cluster is complete. For more information, see "Creating a validator inform policy".
- Define all the policy namespaces in a YAML file similar to the example out/argocd/example/acmpolicygenerator/ns.yaml file.

  Important: Do not include the Namespace CR in the same file with the PolicyGenerator CR.

- Add the PolicyGenerator CRs and Namespace CR to the kustomization.yaml file in the generators section, similar to the example shown in out/argocd/example/acmpolicygenerator/kustomization.yaml. A sketch of such a kustomization.yaml file follows this procedure.
- Commit the PolicyGenerator CRs, Namespace CR, and associated kustomization.yaml file in your Git repository and push the changes.

  The ArgoCD pipeline detects the changes and begins the managed cluster deployment. You can push the changes to the SiteConfig CR and the PolicyGenerator CR simultaneously.
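The following kustomization.yaml sketch is illustrative only; the generator and resource file names are assumptions based on the example files named in this procedure:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
generators:
  - acm-common-ranGen.yaml          # common policies for the whole fleet
  - acm-group-du-sno-ranGen.yaml    # group policies
  - acm-example-sno-site.yaml       # site-specific policies
resources:
  - ns.yaml                         # policy namespaces, kept separate from the PolicyGenerator CRs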
9.1.6. Monitoring managed cluster policy deployment progress
The ArgoCD pipeline uses PolicyGenerator CRs in Git to generate the RHACM policies and then sync them to the hub cluster. You can monitor the progress of the managed cluster policy synchronization after the assisted service installs OpenShift Container Platform on the managed cluster.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have logged in to the hub cluster as a user with cluster-admin privileges.
Procedure
The Topology Aware Lifecycle Manager (TALM) applies the configuration policies that are bound to the cluster.
After the cluster installation is complete and the cluster becomes Ready, a ClusterGroupUpgrade CR corresponding to this cluster, with a list of ordered policies defined by the ran.openshift.io/ztp-deploy-wave annotations, is automatically created by the TALM. The cluster's policies are applied in the order listed in the ClusterGroupUpgrade CR.

You can monitor the high-level progress of configuration policy reconciliation by using the following commands:
$ export CLUSTER=<clusterName>

$ oc get clustergroupupgrades -n ztp-install $CLUSTER -o jsonpath='{.status.conditions[-1:]}' | jq

You can monitor the detailed cluster policy compliance status by using the RHACM dashboard or the command line.
To check policy compliance by using oc, run the following command:

$ oc get policies -n $CLUSTER

To check policy status from the RHACM web console, perform the following actions:
- Click Governance → Find policies.
- Click on a cluster policy to check its status.
When all of the cluster policies become compliant, GitOps ZTP installation and configuration for the cluster is complete. The ztp-done label is added to the cluster.
In the reference configuration, the final policy that becomes compliant is the one defined in the *-du-validator-policy policy. This policy, when compliant on a cluster, ensures that all cluster configuration, Operator installation, and Operator configuration is complete.
9.1.7. Coordinating reboots for configuration changes
You can use Topology Aware Lifecycle Manager (TALM) to coordinate reboots across a fleet of spoke clusters when configuration changes require a reboot, such as deferred tuning changes. TALM reboots all nodes in the targeted MachineConfigPool on the selected clusters when the reboot policy is applied.
Instead of rebooting nodes after each individual change, you can apply all configuration updates through policies and then trigger a single, coordinated reboot.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have logged in to the hub cluster as a user with cluster-admin privileges.
- You have deployed and configured TALM.
Procedure
- Generate the configuration policies by creating a PolicyGenerator custom resource (CR). You can use one of the following sample manifests:
  - out/argocd/example/acmpolicygenerator/acm-example-sno-reboot
  - out/argocd/example/acmpolicygenerator/acm-example-multinode-reboot
- Update the policyDefaults.placement.labelSelector field in the PolicyGenerator CR to target the clusters that you want to reboot. Modify other fields as necessary for your use case.

  If you are coordinating a reboot to apply a deferred tuning change, ensure the MachineConfigPool in the reboot policy matches the value specified in the spec.recommend field in the Tuned object.

- Apply the PolicyGenerator CR to generate and apply the configuration policies. For detailed steps, see "Customizing a managed cluster with PolicyGenerator CRs".
- After ArgoCD completes syncing the policies, create and apply the ClusterGroupUpgrade (CGU) CR.

  Example CGU custom resource configuration
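  The following CGU sketch is illustrative; the label, policy names, concurrency, and timeout values are placeholders that you replace for your fleet. The numbered comments correspond to the callouts below.

  apiVersion: ran.openshift.io/v1alpha1
  kind: ClusterGroupUpgrade
  metadata:
    name: reboot
    namespace: default
  spec:
    clusterLabelSelectors:
      - matchLabels:
          common: "true"              # 1
    enable: true
    managedPolicies:                  # 2
      - sample-configuration-policy
      - sample-reboot-policy
    remediationStrategy:
      maxConcurrency: 10
      timeout: 300                    # 3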
  1. Configure the labels that match the clusters you want to reboot.
  2. Add all required configuration policies before the reboot policy. TALM applies the configuration changes as specified in the policies, in the order they are listed.
  3. Specify the timeout in seconds for the entire upgrade across all selected clusters. Set this field by considering the worst-case scenario.
- After you apply the CGU custom resource, TALM rolls out the configuration policies in order. Once all policies are compliant, it applies the reboot policy and triggers a reboot of all nodes in the specified MachineConfigPool.
Verification
Monitor the CGU rollout status.
You can monitor the rollout of the CGU custom resource on the hub by checking the status. Verify the successful rollout of the reboot by running the following command:
$ oc get cgu -A

Example output
NAMESPACE   NAME     AGE   STATE       DETAILS
default     reboot   1d    Completed   All clusters are compliant with all the managed policies

Verify successful reboot on a specific node.
To confirm that the reboot was successful on a specific node, check the status of the MachineConfigPool (MCP) for the node by running the following command:

$ oc get mcp master

Example output
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-be5785c3b98eb7a1ec902fef2b81e865   True      False      False      3              3                   3                     0                      72d
9.1.8. Validating the generation of configuration policy CRs
Policy custom resources (CRs) are generated in the same namespace as the PolicyGenerator from which they are created. The same troubleshooting flow applies to all policy CRs generated from a PolicyGenerator regardless of whether they are ztp-common, ztp-group, or ztp-site based, as shown using the following commands:
$ export NS=<namespace>

$ oc get policy -n $NS
The expected set of policy-wrapped CRs should be displayed.
If the policies failed synchronization, use the following troubleshooting steps.
Procedure
To display detailed information about the policies, run the following command:
$ oc describe -n openshift-gitops application policies

Check for Status: Conditions: to show the error logs. For example, setting an invalid sourceFile entry to fileName: generates the error shown below:

Status:
  Conditions:
    Last Transition Time:  2021-11-26T17:21:39Z
    Message:               rpc error: code = Unknown desc = `kustomize build /tmp/https___git.com/ran-sites/policies/ --enable-alpha-plugins` failed exit status 1: 2021/11/26 17:21:40 Error could not find test.yaml under source-crs/: no such file or directory Error: failure in plugin configured via /tmp/kust-plugin-config-52463179; exit status 1: exit status 1
    Type:                  ComparisonError

Check for Status: Sync:. If there are log errors at Status: Conditions:, the Status: Sync: shows Unknown or Error.

When Red Hat Advanced Cluster Management (RHACM) recognizes that policies apply to a ManagedCluster object, the policy CR objects are applied to the cluster namespace. Check to see if the policies were copied to the cluster namespace:

$ oc get policy -n $CLUSTER

RHACM copies all applicable policies into the cluster namespace. The copied policy names have the format:
<PolicyGenerator.Namespace>.<PolicyGenerator.Name>-<policyName>

Check the placement rule for any policies not copied to the cluster namespace. The matchSelector in the Placement for those policies should match labels on the ManagedCluster object:

$ oc get Placement -n $NS

Note the Placement name appropriate for the missing policy, common, group, or site, using the following command:

$ oc get Placement -n $NS <placement_rule_name> -o yaml

- The status-decisions should include your cluster name.
- The key-value pair of the matchSelector in the spec must match the labels on your managed cluster.

Check the labels on the ManagedCluster object by using the following command:

$ oc get ManagedCluster $CLUSTER -o jsonpath='{.metadata.labels}' | jq

Check to see what policies are compliant by using the following command:
$ oc get policy -n $CLUSTER

If the Namespace, OperatorGroup, and Subscription policies are compliant but the Operator configuration policies are not, it is likely that the Operators did not install on the managed cluster. This causes the Operator configuration policies to fail to apply because the CRD is not yet applied to the spoke.
9.1.9. Restarting policy reconciliation
You can restart policy reconciliation when unexpected compliance issues occur, for example, when the ClusterGroupUpgrade custom resource (CR) has timed out.
Procedure
- A ClusterGroupUpgrade CR is generated in the namespace ztp-install by the Topology Aware Lifecycle Manager after the managed cluster becomes Ready:

  $ export CLUSTER=<clusterName>

  $ oc get clustergroupupgrades -n ztp-install $CLUSTER

- If there are unexpected issues and the policies fail to become compliant within the configured timeout (the default is 4 hours), the status of the ClusterGroupUpgrade CR shows UpgradeTimedOut:

  $ oc get clustergroupupgrades -n ztp-install $CLUSTER -o jsonpath='{.status.conditions[?(@.type=="Ready")]}'

- A ClusterGroupUpgrade CR in the UpgradeTimedOut state automatically restarts its policy reconciliation every hour. If you have changed your policies, you can start a retry immediately by deleting the existing ClusterGroupUpgrade CR. This triggers the automatic creation of a new ClusterGroupUpgrade CR that begins reconciling the policies immediately:

  $ oc delete clustergroupupgrades -n ztp-install $CLUSTER
Note that when the ClusterGroupUpgrade CR completes with status UpgradeCompleted and the managed cluster has the label ztp-done applied, you can make additional configuration changes by using PolicyGenerator. Deleting the existing ClusterGroupUpgrade CR will not make the TALM generate a new CR.
At this point, GitOps ZTP has completed its interaction with the cluster and any further interactions should be treated as an update and a new ClusterGroupUpgrade CR created for remediation of the policies.
9.1.10. Changing applied managed cluster CRs using policies
You can remove content from a custom resource (CR) that is deployed in a managed cluster through a policy.
By default, all Policy CRs created from a PolicyGenerator CR have the complianceType field set to musthave. A musthave policy without the removed content is still compliant because the CR on the managed cluster has all the specified content. With this configuration, when you remove content from a CR, TALM removes the content from the policy but the content is not removed from the CR on the managed cluster.
With the complianceType field set to mustonlyhave, the policy ensures that the CR on the cluster is an exact match of what is specified in the policy.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have logged in to the hub cluster as a user with cluster-admin privileges.
- You have deployed a managed cluster from a hub cluster running RHACM.
- You have installed Topology Aware Lifecycle Manager on the hub cluster.
Procedure
- Remove the content that you no longer need from the affected CRs. In this example, the disableDrain: false line was removed from the SriovOperatorConfig CR.

  Example CR
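  The following is an illustrative sketch of the resulting CR. Apart from the removed disableDrain line, the field values shown are assumptions, not the exact reference content:

  apiVersion: sriovnetwork.openshift.io/v1
  kind: SriovOperatorConfig
  metadata:
    name: default
    namespace: openshift-sriov-network-operator
  spec:
    configDaemonNodeSelector:
      node-role.kubernetes.io/worker: ""
    enableInjector: true
    enableOperatorWebhook: true
    # the disableDrain: false line has been removed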
- Change the complianceType of the affected policies to mustonlyhave in the acm-group-du-sno-ranGen.yaml file.

  Example YAML
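  A sketch of the changed manifest entry might look like the following; the policy name and deploy wave value are assumptions:

  policies:
    - name: group-du-sno-config-policy
      policyAnnotations:
        ran.openshift.io/ztp-deploy-wave: "10"   # assumed wave value
      manifests:
        - path: source-crs/SriovOperatorConfig.yaml
          complianceType: mustonlyhave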
- Create a ClusterGroupUpdates CR and specify the clusters that must receive the CR changes:

  Example ClusterGroupUpdates CR
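  A sketch of such a cgu-remove.yaml CR follows; the cluster names and remediation settings are placeholders:

  apiVersion: ran.openshift.io/v1alpha1
  kind: ClusterGroupUpgrade
  metadata:
    name: cgu-remove
    namespace: default
  spec:
    managedPolicies:
      - ztp-group.group-du-sno-config-policy
    enable: false                # set to true later, during the maintenance window
    clusters:
      - spoke1
      - spoke2
    remediationStrategy:
      maxConcurrency: 2
      timeout: 240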
- Create the ClusterGroupUpgrade CR by running the following command:

  $ oc create -f cgu-remove.yaml

- When you are ready to apply the changes, for example, during an appropriate maintenance window, change the value of the spec.enable field to true by running the following command:

  $ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-remove \
    --patch '{"spec":{"enable":true}}' --type=merge
Verification
Check the status of the policies by running the following command:
$ oc get <kind> <changed_cr_name>

Example output

NAMESPACE   NAME                                        REMEDIATION ACTION   COMPLIANCE STATE   AGE
default     cgu-ztp-group.group-du-sno-config-policy   enforce                                 17m
default     ztp-group.group-du-sno-config-policy       inform               NonCompliant       15h

When the COMPLIANCE STATE of the policy is Compliant, it means that the CR is updated and the unwanted content is removed.

Check that the policies are removed from the targeted clusters by running the following command on the managed clusters:
$ oc get <kind> <changed_cr_name>

If there are no results, the CR is removed from the managed cluster.
9.1.11. Indication of done for GitOps ZTP installations
GitOps Zero Touch Provisioning (ZTP) simplifies the process of checking the GitOps ZTP installation status for a cluster. The GitOps ZTP status moves through three phases: cluster installation, cluster configuration, and GitOps ZTP done.
- Cluster installation phase
- The cluster installation phase is shown by the ManagedClusterJoined and ManagedClusterAvailable conditions in the ManagedCluster CR. If the ManagedCluster CR does not have these conditions, or the condition is set to False, the cluster is still in the installation phase. Additional details about installation are available from the AgentClusterInstall and ClusterDeployment CRs. For more information, see "Troubleshooting GitOps ZTP".

- Cluster configuration phase
- The cluster configuration phase is shown by a ztp-running label applied to the ManagedCluster CR for the cluster.

- GitOps ZTP done
- Cluster installation and configuration is complete in the GitOps ZTP done phase. This is shown by the removal of the ztp-running label and addition of the ztp-done label to the ManagedCluster CR. The ztp-done label shows that the configuration has been applied and the baseline DU configuration has completed cluster tuning.

  The change to the GitOps ZTP done state is conditional on the compliant state of a Red Hat Advanced Cluster Management (RHACM) validator inform policy. This policy captures the existing criteria for a completed installation and validates that it moves to a compliant state only when GitOps ZTP provisioning of the managed cluster is complete.
The validator inform policy ensures the configuration of the cluster is fully applied and Operators have completed their initialization. The policy validates the following:
- The target MachineConfigPool contains the expected entries and has finished updating. All nodes are available and not degraded.
- The SR-IOV Operator has completed initialization as indicated by at least one SriovNetworkNodeState with syncStatus: Succeeded.
- The PTP Operator daemon set exists.
9.2. Advanced managed cluster configuration with PolicyGenerator resources
You can use PolicyGenerator CRs to deploy custom functionality in your managed clusters. Using RHACM and PolicyGenerator CRs is the recommended approach for managing policies and deploying them to managed clusters. This replaces the use of PolicyGenTemplate CRs for this purpose. For more information about PolicyGenerator resources, see the RHACM Policy Generator documentation.
9.2.1. Deploying additional changes to clusters
If you require cluster configuration changes outside of the base GitOps Zero Touch Provisioning (ZTP) pipeline configuration, there are three options:
- Apply the additional configuration after the GitOps ZTP pipeline is complete
- When the GitOps ZTP pipeline deployment is complete, the deployed cluster is ready for application workloads. At this point, you can install additional Operators and apply configurations specific to your requirements. Ensure that additional configurations do not negatively affect the performance of the platform or allocated CPU budget.
- Add content to the GitOps ZTP library
- The base source custom resources (CRs) that you deploy with the GitOps ZTP pipeline can be augmented with custom content as required.
- Create extra manifests for the cluster installation
- Extra manifests are applied during installation and make the installation process more efficient.
Providing additional source CRs or modifying existing source CRs can significantly impact the performance or CPU profile of OpenShift Container Platform.
9.2.2. Using PolicyGenerator CRs to override source CRs content
PolicyGenerator custom resources (CRs) allow you to overlay additional configuration details on top of the base source CRs provided with the GitOps plugin in the ztp-site-generate container. You can think of PolicyGenerator CRs as a logical merge or patch to the base CR. Use PolicyGenerator CRs to update a single field of the base CR, or overlay the entire contents of the base CR. You can update values and insert fields that are not in the base CR.
The following example procedure describes how to update fields in the generated PerformanceProfile CR for the reference configuration based on the PolicyGenerator CR in the acm-group-du-sno-ranGen.yaml file. Use the procedure as a basis for modifying other parts of the PolicyGenerator based on your requirements.
Prerequisites
- Create a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for Argo CD.
Procedure
- Review the baseline source CR for existing content. You can review the source CRs listed in the reference PolicyGenerator CRs by extracting them from the GitOps Zero Touch Provisioning (ZTP) container.

  Create an /out folder:

  $ mkdir -p ./out

  Extract the source CRs:

  $ podman run --log-driver=none --rm registry.redhat.io/openshift4/ztp-site-generate-rhel8:v4.19.1 extract /home/ztp --tar | tar x -C ./out
- Review the baseline PerformanceProfile CR in ./out/source-crs/PerformanceProfile.yaml.

  Note: Any fields in the source CR which contain $… are removed from the generated CR if they are not provided in the PolicyGenerator CR.

- Update the PolicyGenerator entry for PerformanceProfile in the acm-group-du-sno-ranGen.yaml reference file. The following example PolicyGenerator CR stanza supplies appropriate CPU specifications, sets the hugepages configuration, and adds a new field that sets globallyDisableIrqLoadBalancing to false.
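  A sketch of such a stanza is shown below; the CPU ranges and hugepages count are illustrative values that must be set for your hardware:

  policies:
    - name: group-du-sno-config-policy
      manifests:
        - path: source-crs/PerformanceProfile.yaml
          patches:
            - spec:
                cpu:
                  isolated: "2-19,22-39"     # illustrative; set for your hardware
                  reserved: "0-1,20-21"
                hugepages:
                  defaultHugepagesSize: 1G
                  pages:
                    - size: 1G
                      count: 10
                globallyDisableIrqLoadBalancing: false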
- Commit the PolicyGenerator change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.

  The GitOps ZTP application generates an RHACM policy that contains the generated PerformanceProfile CR. The contents of that CR are derived by merging the metadata and spec contents from the PerformanceProfile entry in the PolicyGenerator onto the source CR.
In the /source-crs folder that you extract from the ztp-site-generate container, the $ syntax is not used for template substitution as the syntax might imply. Rather, if the policyGen tool sees the $ prefix for a string and you do not specify a value for that field in the related PolicyGenerator CR, the field is omitted from the output CR entirely.
An exception to this is the $mcp variable in /source-crs YAML files that is substituted with the specified value for mcp from the PolicyGenerator CR. For example, in example/acmpolicygenerator/acm-group-du-standard-ranGen.yaml, the value for mcp is worker:
spec:
  bindingRules:
    group-du-standard: ""
  mcp: "worker"
The policyGen tool replaces instances of $mcp with worker in the output CRs.
9.2.3. Adding custom content to the GitOps ZTP pipeline
Perform the following procedure to add new content to the GitOps ZTP pipeline.
Procedure
- Create a subdirectory named source-crs in the directory that contains the kustomization.yaml file for the PolicyGenerator custom resource (CR).

- Add your user-provided CRs to the source-crs subdirectory. The source-crs subdirectory must be in the same directory as the kustomization.yaml file.
- Update the required PolicyGenerator CRs to include references to the content you added in the source-crs/custom-crs and source-crs/elasticsearch directories. For example:
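  The following manifests snippet is illustrative; the policy name, wave value, and file names under the custom directories are hypothetical:

  policies:
    - name: group-dev-config-policy
      policyAnnotations:
        ran.openshift.io/ztp-deploy-wave: "10"
      manifests:
        - path: source-crs/custom-crs/apiserver-config.yaml               # hypothetical file name
        - path: source-crs/custom-crs/disable-nic-lldp.yaml               # hypothetical file name
        - path: source-crs/elasticsearch/ElasticsearchNS.yaml             # hypothetical file name
        - path: source-crs/elasticsearch/ElasticsearchOperatorGroup.yaml  # hypothetical file name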
- Commit the PolicyGenerator change in Git, and then push to the Git repository that is monitored by the GitOps ZTP Argo CD policies application.

- Update the ClusterGroupUpgrade CR to include the changed PolicyGenerator and save it as cgu-test.yaml.

- Apply the updated ClusterGroupUpgrade CR by running the following command:

  $ oc apply -f cgu-test.yaml
Verification
Check that the updates have succeeded by running the following command:
$ oc get cgu -A

Example output
NAMESPACE      NAME               AGE   STATE        DETAILS
ztp-clusters   custom-source-cr   6s    InProgress   Remediating non-compliant policies
ztp-install    cluster1           19h   Completed    All clusters are compliant with all the managed policies
9.2.4. Configuring policy compliance evaluation timeouts for PolicyGenerator CRs
Use Red Hat Advanced Cluster Management (RHACM) installed on a hub cluster to monitor and report on whether your managed clusters are compliant with applied policies. RHACM uses policy templates to apply predefined policy controllers and policies. Policy controllers are Kubernetes custom resource definition (CRD) instances.
You can override the default policy evaluation intervals with PolicyGenerator custom resources (CRs). You configure duration settings that define how long a ConfigurationPolicy CR can be in a state of policy compliance or non-compliance before RHACM re-evaluates the applied cluster policies.
The GitOps Zero Touch Provisioning (ZTP) policy generator generates ConfigurationPolicy CR policies with pre-defined policy evaluation intervals. The default value for the noncompliant state is 10 seconds. The default value for the compliant state is 10 minutes. To disable the evaluation interval, set the value to never.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have logged in to the hub cluster as a user with cluster-admin privileges.
- You have created a Git repository where you manage your custom site configuration data.
Procedure
- To configure the evaluation interval for all policies in a PolicyGenerator CR, set appropriate compliant and noncompliant values for the evaluationInterval field. For example:

  policyDefaults:
    evaluationInterval:
      compliant: 30m
      noncompliant: 45s

  Note: You can also set the compliant and noncompliant fields to never to stop evaluating the policy after it reaches a particular compliance state.

- To configure the evaluation interval for an individual policy object in a PolicyGenerator CR, add the evaluationInterval field and set appropriate values. For example:
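  The following per-policy snippet is illustrative; the policy name and source file are placeholders:

  policies:
    - name: acm-example-config-policy
      evaluationInterval:
        compliant: never
        noncompliant: 10s
      manifests:
        - path: source-crs/ExampleSourceCR.yaml   # placeholder source CR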
- Commit the PolicyGenerator CR files in the Git repository and push your changes.
Verification
Check that the managed spoke cluster policies are monitored at the expected intervals.
- Log in as a user with cluster-admin privileges on the managed cluster.

- Get the pods that are running in the open-cluster-management-agent-addon namespace. Run the following command:

  $ oc get pods -n open-cluster-management-agent-addon

  Example output
  NAME                                        READY   STATUS    RESTARTS        AGE
  config-policy-controller-858b894c68-v4xdb   1/1     Running   22 (5d8h ago)   10d

- Check that the applied policies are being evaluated at the expected interval in the logs for the config-policy-controller pod:

  $ oc logs -n open-cluster-management-agent-addon config-policy-controller-858b894c68-v4xdb

  Example output

  2022-05-10T15:10:25.280Z  info  configuration-policy-controller  controllers/configurationpolicy_controller.go:166  Skipping the policy evaluation due to the policy not reaching the evaluation interval  {"policy": "compute-1-config-policy-config"}
  2022-05-10T15:10:25.280Z  info  configuration-policy-controller  controllers/configurationpolicy_controller.go:166  Skipping the policy evaluation due to the policy not reaching the evaluation interval  {"policy": "compute-1-common-compute-1-catalog-policy-config"}
9.2.5. Signalling GitOps ZTP cluster deployment completion with validator inform policies
Create a validator inform policy that signals when the GitOps Zero Touch Provisioning (ZTP) installation and configuration of the deployed cluster is complete. This policy can be used for deployments of single-node OpenShift clusters, three-node clusters, and standard clusters.
Procedure
- Create a standalone PolicyGenerator custom resource (CR) that contains the source file validatorCRs/informDuValidator.yaml. You only need one standalone PolicyGenerator CR for each cluster type. For example, this CR applies a validator inform policy for single-node OpenShift clusters:

  Example single-node cluster validator inform policy CR (acm-group-du-sno-validator-ranGen.yaml)
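  A sketch of such a standalone CR follows. The names, namespace, and binding labels shown here are assumptions; the source file path is the one named in the step above:

  apiVersion: policy.open-cluster-management.io/v1
  kind: PolicyGenerator
  metadata:
    name: group-du-sno-validator
  placementBindingDefaults:
    name: group-du-sno-validator-placement-binding
  policyDefaults:
    namespace: ztp-group                   # assumed policy namespace
    remediationAction: inform
    severity: low
    placement:
      labelSelector:
        matchExpressions:
          - key: group-du-sno
            operator: Exists
          - key: ztp-done
            operator: DoesNotExist         # assumed: stop binding once the cluster is marked done
  policies:
    - name: group-du-sno-validator-policy
      manifests:
        - path: source-crs/validatorCRs/informDuValidator.yaml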
- Commit the PolicyGenerator CR file in your Git repository and push the changes.
9.2.6. Configuring power states using PolicyGenerator CRs
For low latency and high-performance edge deployments, it is necessary to disable or limit C-states and P-states. With this configuration, the CPU runs at a constant frequency, which is typically the maximum turbo frequency. This ensures that the CPU is always running at its maximum speed, which results in high performance and the best latency for workloads. However, it also leads to the highest power consumption, which might not be necessary for all workloads.
Workloads can be classified as critical or non-critical, with critical workloads requiring disabled C-state and P-state settings for high performance and low latency, while non-critical workloads use C-state and P-state settings for power savings at the expense of some latency and performance. You can configure the following three power states using GitOps Zero Touch Provisioning (ZTP):
- High-performance mode provides ultra low latency at the highest power consumption.
- Performance mode provides low latency at a relatively high power consumption.
- Power saving balances reduced power consumption with increased latency.
The default configuration is for a low latency, performance mode.
PolicyGenerator custom resources (CRs) allow you to overlay additional configuration details onto the base source CRs provided with the GitOps plugin in the ztp-site-generate container.
Configure the power states by updating the workloadHints fields in the generated PerformanceProfile CR for the reference configuration, based on the PolicyGenerator CR in the acm-group-du-sno-ranGen.yaml.
The following common prerequisites apply to configuring all three power states.
Prerequisites
- You have created a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for Argo CD.
- You have followed the procedure described in "Preparing the GitOps ZTP site configuration repository".
9.2.6.1. Configuring performance mode using PolicyGenerator CRs
Follow this example to set performance mode by updating the workloadHints fields in the generated PerformanceProfile CR for the reference configuration, based on the PolicyGenerator CR in the acm-group-du-sno-ranGen.yaml.
Performance mode provides low latency at a relatively high power consumption.
Prerequisites
- You have configured the BIOS with performance related settings by following the guidance in "Configuring host firmware for low latency and high performance".
Procedure
- Update the PolicyGenerator entry for PerformanceProfile in the acm-group-du-sno-ranGen.yaml reference file in out/argocd/example/acmpolicygenerator/ as follows to set performance mode.
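  The relevant workloadHints settings for performance mode are shown in the following sketch; other PerformanceProfile fields are omitted:

  - path: source-crs/PerformanceProfile.yaml
    patches:
      - spec:
          workloadHints:
            realTime: true
            highPowerConsumption: false
            perPodPowerManagement: false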
- Commit the PolicyGenerator change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.
9.2.6.2. Configuring high-performance mode using PolicyGenerator CRs
Follow this example to set high performance mode by updating the workloadHints fields in the generated PerformanceProfile CR for the reference configuration, based on the PolicyGenerator CR in the acm-group-du-sno-ranGen.yaml.
High performance mode provides ultra low latency at the highest power consumption.
Prerequisites
- You have configured the BIOS with performance related settings by following the guidance in "Configuring host firmware for low latency and high performance".
Procedure
- Update the PolicyGenerator entry for PerformanceProfile in the acm-group-du-sno-ranGen.yaml reference file in out/argocd/example/acmpolicygenerator/ as follows to set high-performance mode.
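  The relevant workloadHints settings for high-performance mode are shown in the following sketch; other PerformanceProfile fields are omitted:

  - path: source-crs/PerformanceProfile.yaml
    patches:
      - spec:
          workloadHints:
            realTime: true
            highPowerConsumption: true
            perPodPowerManagement: false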
- Commit the PolicyGenerator change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.
9.2.6.3. Configuring power saving mode using PolicyGenerator CRs
Follow this example to set power saving mode by updating the workloadHints fields in the generated PerformanceProfile CR for the reference configuration, based on the PolicyGenerator CR in the acm-group-du-sno-ranGen.yaml.
The power saving mode balances reduced power consumption with increased latency.
Prerequisites
- You enabled C-states and OS-controlled P-states in the BIOS.
Procedure
- Update the PolicyGenerator entry for PerformanceProfile in the acm-group-du-sno-ranGen.yaml reference file in out/argocd/example/acmpolicygenerator/ as follows to configure power saving mode. It is recommended to configure the CPU governor for the power saving mode through the additional kernel arguments object.
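  The relevant settings for power saving mode are shown in the following sketch; callout 1 is explained below and other PerformanceProfile fields are omitted:

  - path: source-crs/PerformanceProfile.yaml
    patches:
      - spec:
          workloadHints:
            realTime: true
            highPowerConsumption: false
            perPodPowerManagement: true
          additionalKernelArgs:
            - cpufreq.default_governor=schedutil   # 1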
  1. The schedutil governor is recommended; however, you can also use other governors, including ondemand and powersave.
- Commit the PolicyGenerator change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.
Verification
Select a worker node in your deployed cluster from the list of nodes identified by using the following command:
$ oc get nodes

Log in to the node by using the following command:
$ oc debug node/<node-name>

Replace <node-name> with the name of the node you want to verify the power state on.

Set /host as the root directory within the debug shell. The debug pod mounts the host's root file system in /host within the pod. By changing the root directory to /host, you can run binaries contained in the host's executable paths as shown in the following example:

# chroot /host

Run the following command to verify the applied power state:

# cat /proc/cmdline
Expected output
- For power saving mode, the output includes intel_pstate=passive.
9.2.6.4. Maximizing power savings
Limiting the maximum CPU frequency is recommended to achieve maximum power savings. Enabling C-states on the non-critical workload CPUs without restricting the maximum CPU frequency negates much of the power savings by boosting the frequency of the critical CPUs.
Maximize power savings by updating the sysfs plugin fields, setting an appropriate value for max_perf_pct in the TunedPerformancePatch CR for the reference configuration. This example, based on the acm-group-du-sno-ranGen.yaml file, describes how to restrict the maximum CPU frequency.
Prerequisites
- You have configured power saving mode as described in "Configuring power saving mode using PolicyGenerator CRs".
Procedure
- Update the PolicyGenerator entry for TunedPerformancePatch in the acm-group-du-sno-ranGen.yaml reference file in out/argocd/example/acmpolicygenerator/. To maximize power savings, add max_perf_pct as shown in the following example:
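  The following sketch is illustrative; the TuneD profile data is abbreviated, and the include line and placeholder value are assumptions (callout 1 is explained below):

  - path: source-crs/TunedPerformancePatch.yaml
    patches:
      - spec:
          profile:
            - name: performance-patch
              data: |
                [main]
                summary=Configuration changes profile inherited from performance created tuned
                include=openshift-node-performance-openshift-node-performance-profile
                [sysfs]
                /sys/devices/system/cpu/intel_pstate/max_perf_pct=<x>   # 1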
  1. The max_perf_pct controls the maximum frequency the cpufreq driver is allowed to set as a percentage of the maximum supported CPU frequency. This value applies to all CPUs. You can check the maximum supported frequency in /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq. As a starting point, you can use a percentage that caps all CPUs at the All Cores Turbo frequency. The All Cores Turbo frequency is the frequency that all cores run at when the cores are all fully occupied.
  Note: To maximize power savings, set a lower value. Setting a lower value for max_perf_pct limits the maximum CPU frequency, thereby reducing power consumption, but also potentially impacting performance. Experiment with different values and monitor the system's performance and power consumption to find the optimal setting for your use case.
Commit the
PolicyGeneratorchange in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.
9.2.7. Configuring LVM Storage using PolicyGenerator CRs
You can configure Logical Volume Manager (LVM) Storage for managed clusters that you deploy with GitOps Zero Touch Provisioning (ZTP).
You use LVM Storage to persist event subscriptions when you use PTP events or bare-metal hardware events with HTTP transport.
Use the Local Storage Operator for persistent storage that uses local volumes in distributed units.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
- Create a Git repository where you manage your custom site configuration data.
Procedure
- To configure LVM Storage for new managed clusters, add the following YAML to policies.manifests in the acm-common-ranGen.yaml file:

  Note: The Storage LVMO subscription is deprecated. In future releases of OpenShift Container Platform, the storage LVMO subscription will not be available. Instead, you must use the Storage LVMS subscription.

  In OpenShift Container Platform 4.19, you can use the Storage LVMS subscription instead of the LVMO subscription. The LVMS subscription does not require manual overrides in the acm-common-ranGen.yaml file. Add the following YAML to policies.manifests in the acm-common-ranGen.yaml file to use the Storage LVMS subscription:

    - path: source-crs/StorageLVMSubscriptionNS.yaml
    - path: source-crs/StorageLVMSubscriptionOperGroup.yaml
    - path: source-crs/StorageLVMSubscription.yaml
- Add the LVMCluster CR to policies.manifests in your specific group or individual site configuration file. For example, in the acm-group-du-sno-ranGen.yaml file, add the following:
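  The following sketch is illustrative; the device class and thin-pool settings are assumptions to adapt to your storage layout:

  - path: source-crs/StorageLVMCluster.yaml
    patches:
      - spec:
          storage:
            deviceClasses:
              - name: vg1
                default: true
                thinPoolConfig:
                  name: thin-pool-1
                  sizePercent: 90
                  overprovisionRatio: 10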
  This example configuration creates a volume group (vg1) with all the available devices, except the disk where OpenShift Container Platform is installed. A thin-pool logical volume is also created.

- Merge any other required changes and files with your custom site repository.
- Commit the PolicyGenerator changes in Git, and then push the changes to your site configuration repository to deploy LVM Storage to new sites using GitOps ZTP.
9.2.8. Configuring PTP events with PolicyGenerator CRs
You can use the GitOps ZTP pipeline to configure PTP events that use HTTP transport.
9.2.8.1. Configuring PTP events that use HTTP transport
You can configure PTP events that use HTTP transport on managed clusters that you deploy with the GitOps Zero Touch Provisioning (ZTP) pipeline.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have logged in as a user with cluster-admin privileges.
- You have created a Git repository where you manage your custom site configuration data.
Procedure
- Apply the following PolicyGenerator changes to the acm-group-du-3node-ranGen.yaml, acm-group-du-sno-ranGen.yaml, or acm-group-du-standard-ranGen.yaml files according to your requirements:

  - In policies.manifests, add the PtpOperatorConfig CR file that configures the transport host:

    Note: In OpenShift Container Platform 4.13 or later, you do not need to set the transportHost field in the PtpOperatorConfig resource when you use HTTP transport with PTP events.

  - Configure the linuxptp and phc2sys for the PTP clock type and interface. For example, add the following YAML into policies.manifests:
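    The following entry is an illustrative sketch, not the exact reference content; the interface name, option strings, and threshold values are assumptions to adapt to your hardware. The numbered comments correspond to the callouts below.

    - path: source-crs/PtpConfigSlave.yaml              # 1
      patches:
        - metadata:
            name: du-ptp-slave
          spec:
            profile:
              - name: slave
                interface: ens5f1                        # 2
                ptp4lOpts: "-2 -s --summary_interval -4" # 3
                phc2sysOpts: "-a -r -m -n 24 -N 8 -R 16" # 4
            ptpClockThreshold:                           # 5
              holdOverTimeout: 5
              maxOffsetThreshold: 100
              minOffsetThreshold: -100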
    1. Can be PtpConfigMaster.yaml or PtpConfigSlave.yaml depending on your requirements. For configurations based on acm-group-du-sno-ranGen.yaml or acm-group-du-3node-ranGen.yaml, use PtpConfigSlave.yaml.
    2. Device specific interface name.
    3. You must append the --summary_interval -4 value to ptp4lOpts in .spec.sourceFiles.spec.profile to enable PTP fast events.
    4. Required phc2sysOpts values. -m prints messages to stdout. The linuxptp-daemon DaemonSet parses the logs and generates Prometheus metrics.
    5. Optional. If the ptpClockThreshold stanza is not present, default values are used for the ptpClockThreshold fields. The stanza shows default ptpClockThreshold values. The ptpClockThreshold values configure how long after the PTP master clock is disconnected before PTP events are triggered. holdOverTimeout is the time value in seconds before the PTP clock event state changes to FREERUN when the PTP master clock is disconnected. The maxOffsetThreshold and minOffsetThreshold settings configure offset values in nanoseconds that compare against the values for CLOCK_REALTIME (phc2sys) or master offset (ptp4l). When the ptp4l or phc2sys offset value is outside this range, the PTP clock state is set to FREERUN. When the offset value is within this range, the PTP clock state is set to LOCKED.
- Merge any other required changes and files with your custom site repository.
- Push the changes to your site configuration repository to deploy PTP fast events to new sites using GitOps ZTP.
9.2.9. Configuring the Image Registry Operator for local caching of images
OpenShift Container Platform manages image caching using a local registry. In edge computing use cases, clusters are often subject to bandwidth restrictions when communicating with centralized image registries, which might result in long image download times.
Long download times are unavoidable during initial deployment. Over time, there is a risk that CRI-O will erase the /var/lib/containers/storage directory in the case of an unexpected shutdown. To address long image download times, you can create a local image registry on remote managed clusters using GitOps Zero Touch Provisioning (ZTP). This is useful in Edge computing scenarios where clusters are deployed at the far edge of the network.
Before you can set up the local image registry with GitOps ZTP, you need to configure disk partitioning in the SiteConfig CR that you use to install the remote managed cluster. After installation, you configure the local image registry using a PolicyGenerator CR. Then, the GitOps ZTP pipeline creates Persistent Volume (PV) and Persistent Volume Claim (PVC) CRs and patches the imageregistry configuration.
The local image registry can only be used for user application images and cannot be used for the OpenShift Container Platform or Operator Lifecycle Manager operator images.
9.2.9.1. Configuring disk partitioning with SiteConfig
Configure disk partitioning for a managed cluster using a SiteConfig CR and GitOps Zero Touch Provisioning (ZTP). The disk partition details in the SiteConfig CR must match the underlying disk.
You must complete this procedure at installation time.
Prerequisites
- Install Butane.
Procedure
Create the storage.bu file.
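The exact Butane content depends on your disk layout. The following sketch is consistent with the Ignition output shown later in this procedure; adjust the device path and the partition start offset for your hardware:

    variant: fcos
    version: 1.3.0
    storage:
      disks:
        - device: /dev/disk/by-path/pci-0000:01:00.0-scsi-0:2:0:0
          wipe_table: false
          partitions:
            - label: var-lib-containers
              start_mib: 250000
              size_mib: 0
      filesystems:
        - path: /var/lib/containers
          device: /dev/disk/by-partlabel/var-lib-containers
          format: xfs
          wipe_filesystem: true
          with_mount_unit: true
          mount_options:
            - defaults
            - prjquota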
Convert the storage.bu to an Ignition file by running the following command:

    $ butane storage.bu

Example output
{"ignition":{"version":"3.2.0"},"storage":{"disks":[{"device":"/dev/disk/by-path/pci-0000:01:00.0-scsi-0:2:0:0","partitions":[{"label":"var-lib-containers","sizeMiB":0,"startMiB":250000}],"wipeTable":false}],"filesystems":[{"device":"/dev/disk/by-partlabel/var-lib-containers","format":"xfs","mountOptions":["defaults","prjquota"],"path":"/var/lib/containers","wipeFilesystem":true}]},"systemd":{"units":[{"contents":"# # Generated by Butane\n[Unit]\nRequires=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\nAfter=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\n\n[Mount]\nWhere=/var/lib/containers\nWhat=/dev/disk/by-partlabel/var-lib-containers\nType=xfs\nOptions=defaults,prjquota\n\n[Install]\nRequiredBy=local-fs.target","enabled":true,"name":"var-lib-containers.mount"}]}}{"ignition":{"version":"3.2.0"},"storage":{"disks":[{"device":"/dev/disk/by-path/pci-0000:01:00.0-scsi-0:2:0:0","partitions":[{"label":"var-lib-containers","sizeMiB":0,"startMiB":250000}],"wipeTable":false}],"filesystems":[{"device":"/dev/disk/by-partlabel/var-lib-containers","format":"xfs","mountOptions":["defaults","prjquota"],"path":"/var/lib/containers","wipeFilesystem":true}]},"systemd":{"units":[{"contents":"# # Generated by Butane\n[Unit]\nRequires=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\nAfter=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\n\n[Mount]\nWhere=/var/lib/containers\nWhat=/dev/disk/by-partlabel/var-lib-containers\nType=xfs\nOptions=defaults,prjquota\n\n[Install]\nRequiredBy=local-fs.target","enabled":true,"name":"var-lib-containers.mount"}]}}Copy to Clipboard Copied! Toggle word wrap Toggle overflow - Use a tool such as JSON Pretty Print to convert the output into JSON format.
Copy the output into the .spec.clusters.nodes.ignitionConfigOverride field in the SiteConfig CR. Example:
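The following sketch shows where the value belongs in the SiteConfig CR. The cluster and host names are placeholders, and the {...} sections stand for the full single-line Ignition JSON produced in the previous step:

    apiVersion: ran.openshift.io/v1
    kind: SiteConfig
    metadata:
      name: "example-sno"
    spec:
      clusters:
        - clusterName: "example-sno"
          nodes:
            - hostName: "example-node.example.com"
              # Paste the complete Ignition JSON from the butane output as a single line
              ignitionConfigOverride: |
                {"ignition":{"version":"3.2.0"},"storage":{...},"systemd":{...}}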
Note: If the .spec.clusters.nodes.ignitionConfigOverride field does not exist, create it.
Verification
During or after installation, verify on the hub cluster that the BareMetalHost object shows the annotation by running the following command:

    $ oc get bmh -n my-sno-ns my-sno -ojson | jq '.metadata.annotations["bmac.agent-install.openshift.io/ignition-config-overrides"]'

Example output
"{\"ignition\":{\"version\":\"3.2.0\"},\"storage\":{\"disks\":[{\"device\":\"/dev/disk/by-id/wwn-0x6b07b250ebb9d0002a33509f24af1f62\",\"partitions\":[{\"label\":\"var-lib-containers\",\"sizeMiB\":0,\"startMiB\":250000}],\"wipeTable\":false}],\"filesystems\":[{\"device\":\"/dev/disk/by-partlabel/var-lib-containers\",\"format\":\"xfs\",\"mountOptions\":[\"defaults\",\"prjquota\"],\"path\":\"/var/lib/containers\",\"wipeFilesystem\":true}]},\"systemd\":{\"units\":[{\"contents\":\"# Generated by Butane\\n[Unit]\\nRequires=systemd-fsck@dev-disk-by\\\\x2dpartlabel-var\\\\x2dlib\\\\x2dcontainers.service\\nAfter=systemd-fsck@dev-disk-by\\\\x2dpartlabel-var\\\\x2dlib\\\\x2dcontainers.service\\n\\n[Mount]\\nWhere=/var/lib/containers\\nWhat=/dev/disk/by-partlabel/var-lib-containers\\nType=xfs\\nOptions=defaults,prjquota\\n\\n[Install]\\nRequiredBy=local-fs.target\",\"enabled\":true,\"name\":\"var-lib-containers.mount\"}]}}""{\"ignition\":{\"version\":\"3.2.0\"},\"storage\":{\"disks\":[{\"device\":\"/dev/disk/by-id/wwn-0x6b07b250ebb9d0002a33509f24af1f62\",\"partitions\":[{\"label\":\"var-lib-containers\",\"sizeMiB\":0,\"startMiB\":250000}],\"wipeTable\":false}],\"filesystems\":[{\"device\":\"/dev/disk/by-partlabel/var-lib-containers\",\"format\":\"xfs\",\"mountOptions\":[\"defaults\",\"prjquota\"],\"path\":\"/var/lib/containers\",\"wipeFilesystem\":true}]},\"systemd\":{\"units\":[{\"contents\":\"# Generated by Butane\\n[Unit]\\nRequires=systemd-fsck@dev-disk-by\\\\x2dpartlabel-var\\\\x2dlib\\\\x2dcontainers.service\\nAfter=systemd-fsck@dev-disk-by\\\\x2dpartlabel-var\\\\x2dlib\\\\x2dcontainers.service\\n\\n[Mount]\\nWhere=/var/lib/containers\\nWhat=/dev/disk/by-partlabel/var-lib-containers\\nType=xfs\\nOptions=defaults,prjquota\\n\\n[Install]\\nRequiredBy=local-fs.target\",\"enabled\":true,\"name\":\"var-lib-containers.mount\"}]}}"Copy to Clipboard Copied! Toggle word wrap Toggle overflow After installation, check the single-node OpenShift disk status.
Enter into a debug session on the single-node OpenShift node by running the following command. This step instantiates a debug pod called <node_name>-debug:

    $ oc debug node/my-sno-node
Set /host as the root directory within the debug shell by running the following command. The debug pod mounts the host’s root file system in /host within the pod. By changing the root directory to /host, you can run binaries contained in the host’s executable paths:

    # chroot /host

List information about all available block devices by running the following command:
    # lsblk

Display information about the file system disk space usage by running the following command:
    # df -h
9.2.9.2. Configuring the image registry using PolicyGenerator CRs
Use PolicyGenerator CRs to apply the CRs required to configure the image registry and patch the imageregistry configuration.
Prerequisites
- You have configured a disk partition in the managed cluster.
- You have installed the OpenShift CLI (oc).
- You have logged in to the hub cluster as a user with cluster-admin privileges.
- You have created a Git repository where you manage your custom site configuration data for use with GitOps Zero Touch Provisioning (ZTP).
Procedure
Configure the storage class, persistent volume claim, persistent volume, and image registry configuration in the appropriate PolicyGenerator CR. For example, to configure an individual site, add the following YAML to the acm-example-sno-site.yaml file:
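The following is a hedged sketch only. The policy name, deploy wave, storage size, and the StorageClass.yaml and StoragePVC.yaml source CR names are assumptions; confirm the source CR names against your ztp-site-generate container before using them. The numbered comments correspond to the annotations that follow:

    policies:
      - name: example-sno-image-registry-policy
        policyAnnotations:
          ran.openshift.io/ztp-deploy-wave: "100"        # 1
        manifests:
          - path: source-crs/StorageClass.yaml
            patches:
              - metadata:
                  name: image-registry-sc
          - path: source-crs/StoragePVC.yaml
            patches:
              - metadata:
                  name: image-registry-pvc
                  namespace: openshift-image-registry
                spec:
                  accessModes:
                    - ReadWriteOnce
                  resources:
                    requests:
                      storage: 100Gi
                  storageClassName: image-registry-sc
          - path: source-crs/ImageRegistryPV.yaml        # 2
          - path: source-crs/ImageRegistryConfig.yaml
            patches:
              - spec:
                  storage:
                    pvc:
                      claim: image-registry-pvc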
1. Set the appropriate value for ztp-deploy-wave depending on whether you are configuring image registries at the site, common, or group level. ztp-deploy-wave: "100" is suitable for development or testing because it allows you to group the referenced source files together.
2. In ImageRegistryPV.yaml, ensure that the spec.local.path field is set to /var/imageregistry to match the value set for the mount_point field in the SiteConfig CR.
Important: Do not set complianceType: mustonlyhave for the ImageRegistryConfig.yaml configuration. This can cause the registry pod deployment to fail.
- Commit the PolicyGenerator change in Git, and then push to the Git repository being monitored by the GitOps ZTP ArgoCD application.
Verification
Use the following steps to troubleshoot errors with the local image registry on the managed clusters:
Verify successful login to the registry while logged in to the managed cluster. Run the following commands:
Export the managed cluster name:
cluster=<managed_cluster_name>
$ cluster=<managed_cluster_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow Get the managed cluster
kubeconfigdetails:oc get secret -n $cluster $cluster-admin-password -o jsonpath='{.data.password}' | base64 -d > kubeadmin-password-$cluster$ oc get secret -n $cluster $cluster-admin-password -o jsonpath='{.data.password}' | base64 -d > kubeadmin-password-$clusterCopy to Clipboard Copied! Toggle word wrap Toggle overflow Download and export the cluster
kubeconfig:oc get secret -n $cluster $cluster-admin-kubeconfig -o jsonpath='{.data.kubeconfig}' | base64 -d > kubeconfig-$cluster && export KUBECONFIG=./kubeconfig-$cluster$ oc get secret -n $cluster $cluster-admin-kubeconfig -o jsonpath='{.data.kubeconfig}' | base64 -d > kubeconfig-$cluster && export KUBECONFIG=./kubeconfig-$clusterCopy to Clipboard Copied! Toggle word wrap Toggle overflow - Verify access to the image registry from the managed cluster. See "Accessing the registry".
Check that the
ConfigCRD in theimageregistry.operator.openshift.iogroup instance is not reporting errors. Run the following command while logged in to the managed cluster:oc get image.config.openshift.io cluster -o yaml
$ oc get image.config.openshift.io cluster -o yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Check that the
PersistentVolumeClaimon the managed cluster is populated with data. Run the following command while logged in to the managed cluster:oc get pv image-registry-sc
$ oc get pv image-registry-scCopy to Clipboard Copied! Toggle word wrap Toggle overflow Check that the
registry*pod is running and is located under theopenshift-image-registrynamespace.oc get pods -n openshift-image-registry | grep registry*
$ oc get pods -n openshift-image-registry | grep registry*Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
cluster-image-registry-operator-68f5c9c589-42cfg 1/1 Running 0 8d image-registry-5f8987879-6nx6h 1/1 Running 0 8d
cluster-image-registry-operator-68f5c9c589-42cfg 1/1 Running 0 8d image-registry-5f8987879-6nx6h 1/1 Running 0 8dCopy to Clipboard Copied! Toggle word wrap Toggle overflow Check that the disk partition on the managed cluster is correct:
Open a debug shell to the managed cluster:
oc debug node/sno-1.example.com
$ oc debug node/sno-1.example.comCopy to Clipboard Copied! Toggle word wrap Toggle overflow Run
lsblkto check the host disk partitions:Copy to Clipboard Copied! Toggle word wrap Toggle overflow - 1
/var/imageregistryindicates that the disk is correctly partitioned.
9.3. Updating managed clusters in a disconnected environment with PolicyGenerator resources and TALM
You can use the Topology Aware Lifecycle Manager (TALM) to manage the software lifecycle of managed clusters that you have deployed by using GitOps Zero Touch Provisioning (ZTP). TALM uses Red Hat Advanced Cluster Management (RHACM) PolicyGenerator policies to manage and control changes applied to target clusters.
Using PolicyGenerator resources with GitOps ZTP is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
9.3.1. Setting up the disconnected environment
TALM can perform both platform and Operator updates.
You must mirror both the platform image and Operator images that you want to update to in your mirror registry before you can use TALM to update your disconnected clusters. Complete the following steps to mirror the images:
For platform updates, you must perform the following steps:
Mirror the desired OpenShift Container Platform image repository. Ensure that the desired platform image is mirrored by following the "Mirroring the OpenShift Container Platform image repository" procedure linked in the Additional resources. Save the contents of the imageContentSources section in the imageContentSources.yaml file. Example output:
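The following sketch shows the expected shape of the file; the mirror registry host name is a placeholder:

    imageContentSources:
      - mirrors:
          - mirror.example.com:5000/openshift-release-dev/ocp-v4.0-art-dev
        source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
      - mirrors:
          - mirror.example.com:5000/openshift-release-dev/ocp-release
        source: quay.io/openshift-release-dev/ocp-release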
Save the image signature of the desired platform image that was mirrored. You must add the image signature to the PolicyGenerator CR for platform updates. To get the image signature, perform the following steps:

Specify the desired OpenShift Container Platform tag by running the following command:
OCP_RELEASE_NUMBER=<release_version>
$ OCP_RELEASE_NUMBER=<release_version>Copy to Clipboard Copied! Toggle word wrap Toggle overflow Specify the architecture of the cluster by running the following command:
ARCHITECTURE=<cluster_architecture>
$ ARCHITECTURE=<cluster_architecture>1 Copy to Clipboard Copied! Toggle word wrap Toggle overflow - 1
- Specify the architecture of the cluster, such as
x86_64,aarch64,s390x, orppc64le.
Get the release image digest from Quay by running the following command
DIGEST="$(oc adm release info quay.io/openshift-release-dev/ocp-release:${OCP_RELEASE_NUMBER}-${ARCHITECTURE} | sed -n 's/Pull From: .*@//p')"$ DIGEST="$(oc adm release info quay.io/openshift-release-dev/ocp-release:${OCP_RELEASE_NUMBER}-${ARCHITECTURE} | sed -n 's/Pull From: .*@//p')"Copy to Clipboard Copied! Toggle word wrap Toggle overflow Set the digest algorithm by running the following command:
DIGEST_ALGO="${DIGEST%%:*}"$ DIGEST_ALGO="${DIGEST%%:*}"Copy to Clipboard Copied! Toggle word wrap Toggle overflow Set the digest signature by running the following command:
DIGEST_ENCODED="${DIGEST#*:}"$ DIGEST_ENCODED="${DIGEST#*:}"Copy to Clipboard Copied! Toggle word wrap Toggle overflow Get the image signature from the mirror.openshift.com website by running the following command:
SIGNATURE_BASE64=$(curl -s "https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release/${DIGEST_ALGO}=${DIGEST_ENCODED}/signature-1" | base64 -w0 && echo)$ SIGNATURE_BASE64=$(curl -s "https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release/${DIGEST_ALGO}=${DIGEST_ENCODED}/signature-1" | base64 -w0 && echo)Copy to Clipboard Copied! Toggle word wrap Toggle overflow Save the image signature to the
checksum-<OCP_RELEASE_NUMBER>.yamlfile by running the following commands:cat >checksum-${OCP_RELEASE_NUMBER}.yaml <<EOF ${DIGEST_ALGO}-${DIGEST_ENCODED}: ${SIGNATURE_BASE64} EOF$ cat >checksum-${OCP_RELEASE_NUMBER}.yaml <<EOF ${DIGEST_ALGO}-${DIGEST_ENCODED}: ${SIGNATURE_BASE64} EOFCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Prepare the update graph. You have two options to prepare the update graph:
Use the OpenShift Update Service.
For more information about how to set up the graph on the hub cluster, see Deploy the operator for OpenShift Update Service and Build the graph data init container.
Make a local copy of the upstream graph. Host the update graph on an
httporhttpsserver in the disconnected environment that has access to the managed cluster. To download the update graph, use the following command:curl -s https://api.openshift.com/api/upgrades_info/v1/graph?channel=stable-4.19 -o ~/upgrade-graph_stable-4.19
$ curl -s https://api.openshift.com/api/upgrades_info/v1/graph?channel=stable-4.19 -o ~/upgrade-graph_stable-4.19Copy to Clipboard Copied! Toggle word wrap Toggle overflow
For Operator updates, you must perform the following task:
- Mirror the Operator catalogs. Ensure that the desired operator images are mirrored by following the procedure in the "Mirroring Operator catalogs for use with disconnected clusters" section.
9.3.2. Performing a platform update with PolicyGenerator CRs
You can perform a platform update with the TALM.
Prerequisites
- Install the Topology Aware Lifecycle Manager (TALM).
- Update GitOps Zero Touch Provisioning (ZTP) to the latest version.
- Provision one or more managed clusters with GitOps ZTP.
- Mirror the desired image repository.
- Log in as a user with cluster-admin privileges.
- Create RHACM policies in the hub cluster.
Procedure
Create a PolicyGenerator CR for the platform update:

Save the following PolicyGenerator CR in the du-upgrade.yaml file.

Example of PolicyGenerator for platform update:
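The following is a hedged sketch of the shape of such a CR. The namespace, placement label, mirror registry, release version, and the DisconnectedICSP.yaml source CR name are assumptions that you must adapt to your environment. The numbered comments correspond to the annotations that follow:

    apiVersion: policy.open-cluster-management.io/v1
    kind: PolicyGenerator
    metadata:
      name: du-upgrade
    placementBindingDefaults:
      name: du-upgrade-placement-binding
    policyDefaults:
      namespace: ztp-group-du-sno
      remediationAction: inform
      placement:
        labelSelector:
          matchExpressions:
            - key: group-du-sno
              operator: Exists
    policies:
      - name: du-upgrade-platform-upgrade-prep
        manifests:
          - path: source-crs/ImageSignature.yaml           # 2
          - path: source-crs/DisconnectedICSP.yaml         # 3
            patches:
              - spec:
                  repositoryDigestMirrors:
                    - mirrors:
                        - mirror.example.com:5000/openshift-release-dev/ocp-release
                      source: quay.io/openshift-release-dev/ocp-release
      - name: du-upgrade-platform-upgrade
        manifests:
          - path: source-crs/ClusterVersion.yaml           # 1
            patches:
              - spec:
                  channel: stable-4.19
                  upstream: http://upgrade.example.com/images/upgrade-graph_stable-4.19
                  desiredUpdate:
                    version: 4.19.0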
- Shows the
ClusterVersionCR to trigger the update. Thechannel,upstream, anddesiredVersionfields are all required for image pre-caching. - 2
ImageSignature.yamlcontains the image signature of the required release image. The image signature is used to verify the image before applying the platform update.- 3
- Shows the mirror repository that contains the required OpenShift Container Platform image. Get the mirrors from the
imageContentSources.yamlfile that you saved when following the procedures in the "Setting up the environment" section.
The
PolicyGeneratorCR generates two policies:-
The
du-upgrade-platform-upgrade-preppolicy does the preparation work for the platform update. It creates theConfigMapCR for the desired release image signature, creates the image content source of the mirrored release image repository, and updates the cluster version with the desired update channel and the update graph reachable by the managed cluster in the disconnected environment. -
The
du-upgrade-platform-upgradepolicy is used to perform platform upgrade.
Add the
du-upgrade.yamlfile contents to thekustomization.yamlfile located in the GitOps ZTP Git repository for thePolicyGeneratorCRs and push the changes to the Git repository.ArgoCD pulls the changes from the Git repository and generates the policies on the hub cluster.
Check the created policies by running the following command:
oc get policies -A | grep platform-upgrade
$ oc get policies -A | grep platform-upgradeCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Create the ClusterGroupUpgrade CR for the platform update with the spec.enable field set to false.

Save the content of the platform update ClusterGroupUpgrade CR with the du-upgrade-platform-upgrade-prep and the du-upgrade-platform-upgrade policies and the target clusters to the cgu-platform-upgrade.yml file, as shown in the following example:
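A minimal sketch of such a CR, assuming a single target cluster named spoke1:

    apiVersion: ran.openshift.io/v1alpha1
    kind: ClusterGroupUpgrade
    metadata:
      name: cgu-platform-upgrade
      namespace: default
    spec:
      managedPolicies:
        - du-upgrade-platform-upgrade-prep
        - du-upgrade-platform-upgrade
      preCaching: false
      clusters:
        - spoke1
      remediationStrategy:
        maxConcurrency: 1
      enable: false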
Apply the ClusterGroupUpgrade CR to the hub cluster by running the following command:

    $ oc apply -f cgu-platform-upgrade.yml
Optional: Pre-cache the images for the platform update.
Enable pre-caching in the
ClusterGroupUpdateCR by running the following command:oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-platform-upgrade \ --patch '{"spec":{"preCaching": true}}' --type=merge$ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-platform-upgrade \ --patch '{"spec":{"preCaching": true}}' --type=mergeCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the update process and wait for the pre-caching to complete. Check the status of pre-caching by running the following command on the hub cluster:
oc get cgu cgu-platform-upgrade -o jsonpath='{.status.precaching.status}'$ oc get cgu cgu-platform-upgrade -o jsonpath='{.status.precaching.status}'Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Start the platform update:
Enable the
cgu-platform-upgradepolicy and disable pre-caching by running the following command:oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-platform-upgrade \ --patch '{"spec":{"enable":true, "preCaching": false}}' --type=merge$ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-platform-upgrade \ --patch '{"spec":{"enable":true, "preCaching": false}}' --type=mergeCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the process. Upon completion, ensure that the policy is compliant by running the following command:
oc get policies --all-namespaces
$ oc get policies --all-namespacesCopy to Clipboard Copied! Toggle word wrap Toggle overflow
9.3.3. Performing an Operator update with PolicyGenerator CRs
You can perform an Operator update with the TALM.
Prerequisites
- Install the Topology Aware Lifecycle Manager (TALM).
- Update GitOps Zero Touch Provisioning (ZTP) to the latest version.
- Provision one or more managed clusters with GitOps ZTP.
- Mirror the desired index image, bundle images, and all Operator images referenced in the bundle images.
- Log in as a user with cluster-admin privileges.
- Create RHACM policies in the hub cluster.
Procedure
Update the PolicyGenerator CR for the Operator update.

Update the du-upgrade PolicyGenerator CR with the following additional contents in the du-upgrade.yaml file:
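The following sketch shows the shape of the additional policy entry. It assumes the DefaultCatsrc.yaml source CR; the index image reference and polling interval are illustrative. The numbered comments correspond to the annotations that follow:

      - name: du-upgrade-operator-catsrc-policy
        manifests:
          - path: source-crs/DefaultCatsrc.yaml
            patches:
              - metadata:
                  name: redhat-operators-disconnected
                spec:
                  displayName: Red Hat Operators Catalog
                  image: registry.example.com:5000/olm/redhat-operators-disconnected:v4.19   # 1
                  updateStrategy:
                    registryPoll:
                      interval: 1h                                                            # 2
                status:
                  connectionState:
                    lastObservedState: READY                                                  # 3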
- Contains the required Operator images. If the index images are always pushed to the same image name and tag, this change is not needed.
- 2
- Sets how frequently the Operator Lifecycle Manager (OLM) polls the index image for new Operator versions with the
registryPoll.intervalfield. This change is not needed if a new index image tag is always pushed for y-stream and z-stream Operator updates. TheregistryPoll.intervalfield can be set to a shorter interval to expedite the update, however shorter intervals increase computational load. To counteract this, you can restoreregistryPoll.intervalto the default value once the update is complete. - 3
- Displays the observed state of the catalog connection. The
READYvalue ensures that theCatalogSourcepolicy is ready, indicating that the index pod is pulled and is running. This way, TALM upgrades the Operators based on up-to-date policy compliance states.
This update generates one policy,
du-upgrade-operator-catsrc-policy, to update theredhat-operators-disconnectedcatalog source with the new index images that contain the desired Operators images.NoteIf you want to use the image pre-caching for Operators and there are Operators from a different catalog source other than
redhat-operators-disconnected, you must perform the following tasks:- Prepare a separate catalog source policy with the new index image or registry poll interval update for the different catalog source.
- Prepare a separate subscription policy for the desired Operators that are from the different catalog source.
For example, the desired SRIOV-FEC Operator is available in the
certified-operatorscatalog source. To update the catalog source and the Operator subscription, add the following contents to generate two policies,du-upgrade-fec-catsrc-policyanddu-upgrade-subscriptions-fec-policy:Copy to Clipboard Copied! Toggle word wrap Toggle overflow Remove the specified subscriptions channels in the common
PolicyGeneratorCR, if they exist. The default subscriptions channels from the GitOps ZTP image are used for the update.NoteThe default channel for the Operators applied through GitOps ZTP 4.19 is
stable, except for theperformance-addon-operator. As of OpenShift Container Platform 4.11, theperformance-addon-operatorfunctionality was moved to thenode-tuning-operator. For the 4.10 release, the default channel for PAO isv4.10. You can also specify the default channels in the commonPolicyGeneratorCR.Push the
PolicyGeneratorCRs updates to the GitOps ZTP Git repository.ArgoCD pulls the changes from the Git repository and generates the policies on the hub cluster.
Check the created policies by running the following command:
oc get policies -A | grep -E "catsrc-policy|subscription"
$ oc get policies -A | grep -E "catsrc-policy|subscription"Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Apply the required catalog source updates before starting the Operator update.
Save the content of the
ClusterGroupUpgradeCR namedoperator-upgrade-prepwith the catalog source policies and the target managed clusters to thecgu-operator-upgrade-prep.ymlfile:Copy to Clipboard Copied! Toggle word wrap Toggle overflow Apply the policy to the hub cluster by running the following command:
oc apply -f cgu-operator-upgrade-prep.yml
$ oc apply -f cgu-operator-upgrade-prep.ymlCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the update process. Upon completion, ensure that the policy is compliant by running the following command:
oc get policies -A | grep -E "catsrc-policy"
$ oc get policies -A | grep -E "catsrc-policy"Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Create the ClusterGroupUpgrade CR for the Operator update with the spec.enable field set to false.

Save the content of the Operator update ClusterGroupUpgrade CR with the du-upgrade-operator-catsrc-policy policy and the subscription policies created from the common PolicyGenerator and the target clusters to the cgu-operator-upgrade.yml file, as shown in the following example:
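A minimal sketch of such a CR, assuming a single target cluster named spoke1; the numbered comments correspond to the annotations that follow:

    apiVersion: ran.openshift.io/v1alpha1
    kind: ClusterGroupUpgrade
    metadata:
      name: cgu-operator-upgrade
      namespace: default
    spec:
      managedPolicies:
        - du-upgrade-operator-catsrc-policy   # 1
        - common-subscriptions-policy         # 2
      preCaching: false
      clusters:
        - spoke1
      remediationStrategy:
        maxConcurrency: 1
      enable: false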
- The policy is needed by the image pre-caching feature to retrieve the operator images from the catalog source.
- 2
- The policy contains Operator subscriptions. If you have followed the structure and content of the reference
PolicyGenTemplates, all Operator subscriptions are grouped into thecommon-subscriptions-policypolicy.
NoteOne
ClusterGroupUpgradeCR can only pre-cache the images of the desired Operators defined in the subscription policy from one catalog source included in theClusterGroupUpgradeCR. If the desired Operators are from different catalog sources, such as in the example of the SRIOV-FEC Operator, anotherClusterGroupUpgradeCR must be created withdu-upgrade-fec-catsrc-policyanddu-upgrade-subscriptions-fec-policypolicies for the SRIOV-FEC Operator images pre-caching and update.Apply the
ClusterGroupUpgradeCR to the hub cluster by running the following command:oc apply -f cgu-operator-upgrade.yml
$ oc apply -f cgu-operator-upgrade.ymlCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Optional: Pre-cache the images for the Operator update.
Before starting image pre-caching, verify the subscription policy is
NonCompliantat this point by running the following command:oc get policy common-subscriptions-policy -n <policy_namespace>
$ oc get policy common-subscriptions-policy -n <policy_namespace>Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
NAME REMEDIATION ACTION COMPLIANCE STATE AGE common-subscriptions-policy inform NonCompliant 27d
NAME REMEDIATION ACTION COMPLIANCE STATE AGE common-subscriptions-policy inform NonCompliant 27dCopy to Clipboard Copied! Toggle word wrap Toggle overflow Enable pre-caching in the
ClusterGroupUpgradeCR by running the following command:oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-operator-upgrade \ --patch '{"spec":{"preCaching": true}}' --type=merge$ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-operator-upgrade \ --patch '{"spec":{"preCaching": true}}' --type=mergeCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the process and wait for the pre-caching to complete. Check the status of pre-caching by running the following command on the managed cluster:
oc get cgu cgu-operator-upgrade -o jsonpath='{.status.precaching.status}'$ oc get cgu cgu-operator-upgrade -o jsonpath='{.status.precaching.status}'Copy to Clipboard Copied! Toggle word wrap Toggle overflow Check if the pre-caching is completed before starting the update by running the following command:
oc get cgu -n default cgu-operator-upgrade -ojsonpath='{.status.conditions}' | jq$ oc get cgu -n default cgu-operator-upgrade -ojsonpath='{.status.conditions}' | jqCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Start the Operator update.
Enable the
cgu-operator-upgradeClusterGroupUpgradeCR and disable pre-caching to start the Operator update by running the following command:oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-operator-upgrade \ --patch '{"spec":{"enable":true, "preCaching": false}}' --type=merge$ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-operator-upgrade \ --patch '{"spec":{"enable":true, "preCaching": false}}' --type=mergeCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the process. Upon completion, ensure that the policy is compliant by running the following command:
oc get policies --all-namespaces
$ oc get policies --all-namespacesCopy to Clipboard Copied! Toggle word wrap Toggle overflow
9.3.4. Troubleshooting missed Operator updates with PolicyGenerator CRs
In some scenarios, Topology Aware Lifecycle Manager (TALM) might miss Operator updates due to an out-of-date policy compliance state.
After a catalog source update, it takes time for the Operator Lifecycle Manager (OLM) to update the subscription status. The status of the subscription policy might continue to show as compliant while TALM decides whether remediation is needed. As a result, the Operator specified in the subscription policy does not get upgraded.
To avoid this scenario, add another catalog source configuration to the PolicyGenerator and specify this configuration in the subscription for any Operators that require an update.
Procedure
Add a catalog source configuration in the PolicyGenerator resource:
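The following is a hedged sketch of such a policy entry. The catalog source name, index image, and DefaultCatsrc.yaml source CR name are assumptions:

      - name: operator-catsrc-v2-policy
        manifests:
          - path: source-crs/DefaultCatsrc.yaml
            patches:
              - metadata:
                  name: redhat-operators-disconnected-v2
                spec:
                  displayName: Red Hat Operators Catalog v2
                  image: registry.example.com:5000/olm/redhat-operators-disconnected:v2
                  updateStrategy:
                    registryPoll:
                      interval: 1h
                status:
                  connectionState:
                    lastObservedState: READY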
Update the Subscription resource to point to the new configuration for Operators that require an update:
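The following sketch shows the relevant change for an example Operator subscription; the Operator name and namespace are illustrative:

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: sriov-network-operator-subscription
      namespace: openshift-sriov-network-operator
    spec:
      channel: stable
      name: sriov-network-operator
      # Reference the new catalog source configuration defined in the PolicyGenerator resource
      source: redhat-operators-disconnected-v2
      sourceNamespace: openshift-marketplace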
- Enter the name of the additional catalog source configuration that you defined in the
PolicyGeneratorresource.
9.3.5. Performing a platform and an Operator update together
You can perform a platform and an Operator update at the same time.
Prerequisites
- Install the Topology Aware Lifecycle Manager (TALM).
- Update GitOps Zero Touch Provisioning (ZTP) to the latest version.
- Provision one or more managed clusters with GitOps ZTP.
- Log in as a user with cluster-admin privileges.
- Create RHACM policies in the hub cluster.
Procedure
-
Create the
PolicyGeneratorCR for the updates by following the steps described in the "Performing a platform update" and "Performing an Operator update" sections. Apply the prep work for the platform and the Operator update.
Save the content of the
ClusterGroupUpgradeCR with the policies for platform update preparation work, catalog source updates, and target clusters to thecgu-platform-operator-upgrade-prep.ymlfile, for example:Copy to Clipboard Copied! Toggle word wrap Toggle overflow Apply the
cgu-platform-operator-upgrade-prep.ymlfile to the hub cluster by running the following command:oc apply -f cgu-platform-operator-upgrade-prep.yml
$ oc apply -f cgu-platform-operator-upgrade-prep.ymlCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the process. Upon completion, ensure that the policy is compliant by running the following command:
oc get policies --all-namespaces
$ oc get policies --all-namespacesCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Create the ClusterGroupUpgrade CR for the platform and the Operator update with the spec.enable field set to false.

Save the contents of the platform and Operator update ClusterGroupUpgrade CR with the policies and the target clusters to the cgu-platform-operator-upgrade.yml file, as shown in the following example:
Apply the cgu-platform-operator-upgrade.yml file to the hub cluster by running the following command:

    $ oc apply -f cgu-platform-operator-upgrade.yml
Optional: Pre-cache the images for the platform and the Operator update.
Enable pre-caching in the
ClusterGroupUpgradeCR by running the following command:oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-du-upgrade \ --patch '{"spec":{"preCaching": true}}' --type=merge$ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-du-upgrade \ --patch '{"spec":{"preCaching": true}}' --type=mergeCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the update process and wait for the pre-caching to complete. Check the status of pre-caching by running the following command on the managed cluster:
oc get jobs,pods -n openshift-talm-pre-cache
$ oc get jobs,pods -n openshift-talm-pre-cacheCopy to Clipboard Copied! Toggle word wrap Toggle overflow Check if the pre-caching is completed before starting the update by running the following command:
oc get cgu cgu-du-upgrade -ojsonpath='{.status.conditions}'$ oc get cgu cgu-du-upgrade -ojsonpath='{.status.conditions}'Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Start the platform and Operator update.
Enable the
cgu-du-upgradeClusterGroupUpgradeCR to start the platform and the Operator update by running the following command:oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-du-upgrade \ --patch '{"spec":{"enable":true, "preCaching": false}}' --type=merge$ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-du-upgrade \ --patch '{"spec":{"enable":true, "preCaching": false}}' --type=mergeCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the process. Upon completion, ensure that the policy is compliant by running the following command:
oc get policies --all-namespaces
$ oc get policies --all-namespacesCopy to Clipboard Copied! Toggle word wrap Toggle overflow NoteThe CRs for the platform and Operator updates can be created from the beginning by configuring the setting to
spec.enable: true. In this case, the update starts immediately after pre-caching completes and there is no need to manually enable the CR.Both pre-caching and the update create extra resources, such as policies, placement bindings, placement rules, managed cluster actions, and managed cluster view, to help complete the procedures. Setting the
afterCompletion.deleteObjectsfield totruedeletes all these resources after the updates complete.
9.3.6. Removing Performance Addon Operator subscriptions from deployed clusters with PolicyGenerator CRs
In earlier versions of OpenShift Container Platform, the Performance Addon Operator provided automatic, low latency performance tuning for applications. In OpenShift Container Platform 4.11 or later, these functions are part of the Node Tuning Operator.
Do not install the Performance Addon Operator on clusters running OpenShift Container Platform 4.11 or later. If you upgrade to OpenShift Container Platform 4.11 or later, the Node Tuning Operator automatically removes the Performance Addon Operator.
You need to remove any policies that create Performance Addon Operator subscriptions to prevent a re-installation of the Operator.
The reference DU profile includes the Performance Addon Operator in the PolicyGenerator CR acm-common-ranGen.yaml. To remove the subscription from deployed managed clusters, you must update acm-common-ranGen.yaml.
If you install Performance Addon Operator 4.10.3-5 or later on OpenShift Container Platform 4.11 or later, the Performance Addon Operator detects the cluster version and automatically hibernates to avoid interfering with the Node Tuning Operator functions. However, to ensure best performance, remove the Performance Addon Operator from your OpenShift Container Platform 4.11 clusters.
Prerequisites
- Create a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for ArgoCD.
- Update to OpenShift Container Platform 4.11 or later.
-
Log in as a user with
cluster-adminprivileges.
Procedure
Change the complianceType to mustnothave for the Performance Addon Operator namespace, Operator group, and subscription in the acm-common-ranGen.yaml file.
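The following is a hedged sketch of the change. The PaoSubscriptionNS.yaml, PaoSubscriptionOperGroup.yaml, and PaoSubscription.yaml source CR names are assumptions; use the file names that your acm-common-ranGen.yaml already references:

    policies:
      - name: common-subscriptions-policy
        manifests:
          - path: source-crs/PaoSubscriptionNS.yaml
            complianceType: mustnothave
          - path: source-crs/PaoSubscriptionOperGroup.yaml
            complianceType: mustnothave
          - path: source-crs/PaoSubscription.yaml
            complianceType: mustnothave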
Merge the changes with your custom site repository and wait for the ArgoCD application to synchronize the change to the hub cluster. The status of the
common-subscriptions-policypolicy changes toNon-Compliant. - Apply the change to your target clusters by using the Topology Aware Lifecycle Manager. For more information about rolling out configuration changes, see the "Additional resources" section.
Monitor the process. When the status of the
common-subscriptions-policypolicy for a target cluster isCompliant, the Performance Addon Operator has been removed from the cluster. Get the status of thecommon-subscriptions-policyby running the following command:oc get policy -n ztp-common common-subscriptions-policy
$ oc get policy -n ztp-common common-subscriptions-policyCopy to Clipboard Copied! Toggle word wrap Toggle overflow -
Delete the Performance Addon Operator namespace, Operator group and subscription CRs from
policies.manifestsin theacm-common-ranGen.yamlfile. - Merge the changes with your custom site repository and wait for the ArgoCD application to synchronize the change to the hub cluster. The policy remains compliant.
9.3.7. Pre-caching user-specified images with TALM on single-node OpenShift clusters
You can pre-cache application-specific workload images on single-node OpenShift clusters before upgrading your applications.
You can specify the configuration options for the pre-caching jobs using the following custom resources (CR):
-
PreCachingConfigCR -
ClusterGroupUpgradeCR
All fields in the PreCachingConfig CR are optional.
Example PreCachingConfig CR
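The following is a hedged sketch of a PreCachingConfig CR. All values are illustrative, and the numbered comments correspond to the annotations that follow:

    apiVersion: ran.openshift.io/v1alpha1
    kind: PreCachingConfig
    metadata:
      name: exampleconfig
      namespace: default
    spec:
      overrides:                                    # 1
        platformImage: quay.io/openshift-release-dev/ocp-release@sha256:<digest>
        operatorsIndexes:
          - registry.example.com:5000/custom-redhat-operators:1.0.0
        operatorsPackagesAndChannels:
          - local-storage-operator: stable
          - ptp-operator: stable
          - sriov-network-operator: stable
      spaceRequired: 30 GiB                         # 2
      excludePrecachePatterns:                      # 3
        - aws
        - vsphere
      additionalImages:                             # 4
        - quay.io/example/application@sha256:<digest>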
- 1
- By default, TALM automatically populates the
platformImage,operatorsIndexes, and theoperatorsPackagesAndChannelsfields from the policies of the managed clusters. You can specify values to override the default TALM-derived values for these fields. - 2
- Specifies the minimum required disk space on the cluster. If unspecified, TALM defines a default value for OpenShift Container Platform images. The disk space field must include an integer value and the storage unit. For example:
40 GiB,200 MB,1 TiB. - 3
- Specifies the images to exclude from pre-caching based on image name matching.
- 4
- Specifies the list of additional images to pre-cache.
Example ClusterGroupUpgrade CR with PreCachingConfig CR reference
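A minimal sketch, assuming the PreCachingConfig CR shown above and two illustrative cluster names:

    apiVersion: ran.openshift.io/v1alpha1
    kind: ClusterGroupUpgrade
    metadata:
      name: cgu
      namespace: default
    spec:
      clusters:
        - sno1
        - sno2
      preCaching: true
      preCachingConfigRef:
        name: exampleconfig
        namespace: default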
9.3.7.1. Creating the custom resources for pre-caching
You must create the PreCachingConfig CR before or concurrently with the ClusterGroupUpgrade CR.
Create the
PreCachingConfigCR with the list of additional images you want to pre-cache.Copy to Clipboard Copied! Toggle word wrap Toggle overflow Create a
ClusterGroupUpgradeCR with thepreCachingfield set totrueand specify thePreCachingConfigCR created in the previous step:Copy to Clipboard Copied! Toggle word wrap Toggle overflow WarningOnce you install the images on the cluster, you cannot change or delete them.
When you want to start pre-caching the images, apply the
ClusterGroupUpgradeCR by running the following command:oc apply -f cgu.yaml
$ oc apply -f cgu.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow
TALM verifies the ClusterGroupUpgrade CR.
From this point, you can continue with the TALM pre-caching workflow.
All sites are pre-cached concurrently.
Verification
Check the pre-caching status on the hub cluster where the
ClusterUpgradeGroupCR is applied by running the following command:oc get cgu <cgu_name> -n <cgu_namespace> -oyaml
$ oc get cgu <cgu_name> -n <cgu_namespace> -oyamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow The pre-caching configurations are validated by checking if the managed policies exist. Valid configurations of the
ClusterGroupUpgradeand thePreCachingConfigCRs result in the following statuses:Example output of valid CRs
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example of an invalid PreCachingConfig CR
Type: "PrecacheSpecValid" Status: False, Reason: "PrecacheSpecIncomplete" Message: "Precaching spec is incomplete: failed to get PreCachingConfig resource due to PreCachingConfig.ran.openshift.io "<pre-caching_cr_name>" not found"
Type: "PrecacheSpecValid" Status: False, Reason: "PrecacheSpecIncomplete" Message: "Precaching spec is incomplete: failed to get PreCachingConfig resource due to PreCachingConfig.ran.openshift.io "<pre-caching_cr_name>" not found"Copy to Clipboard Copied! Toggle word wrap Toggle overflow You can find the pre-caching job by running the following command on the managed cluster:
oc get jobs -n openshift-talo-pre-cache
$ oc get jobs -n openshift-talo-pre-cacheCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example of pre-caching job in progress
NAME COMPLETIONS DURATION AGE pre-cache 0/1 1s 1s
NAME COMPLETIONS DURATION AGE pre-cache 0/1 1s 1sCopy to Clipboard Copied! Toggle word wrap Toggle overflow You can check the status of the pod created for the pre-caching job by running the following command:
oc describe pod pre-cache -n openshift-talo-pre-cache
$ oc describe pod pre-cache -n openshift-talo-pre-cacheCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example of pre-caching job in progress
Type Reason Age From Message Normal SuccesfulCreate 19s job-controller Created pod: pre-cache-abcd1
Type Reason Age From Message Normal SuccesfulCreate 19s job-controller Created pod: pre-cache-abcd1Copy to Clipboard Copied! Toggle word wrap Toggle overflow You can get live updates on the status of the job by running the following command:
oc logs -f pre-cache-abcd1 -n openshift-talo-pre-cache
$ oc logs -f pre-cache-abcd1 -n openshift-talo-pre-cacheCopy to Clipboard Copied! Toggle word wrap Toggle overflow To verify the pre-cache job is successfully completed, run the following command:
oc describe pod pre-cache -n openshift-talo-pre-cache
$ oc describe pod pre-cache -n openshift-talo-pre-cacheCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example of completed pre-cache job
Type Reason Age From Message Normal SuccesfulCreate 5m19s job-controller Created pod: pre-cache-abcd1 Normal Completed 19s job-controller Job completed
Type Reason Age From Message Normal SuccesfulCreate 5m19s job-controller Created pod: pre-cache-abcd1 Normal Completed 19s job-controller Job completedCopy to Clipboard Copied! Toggle word wrap Toggle overflow To verify that the images are successfully pre-cached on the single-node OpenShift, do the following:
Enter into the node in debug mode:
oc debug node/cnfdf00.example.lab
$ oc debug node/cnfdf00.example.labCopy to Clipboard Copied! Toggle word wrap Toggle overflow Change root to
host:chroot /host/
$ chroot /host/Copy to Clipboard Copied! Toggle word wrap Toggle overflow Search for the desired images:
sudo podman images | grep <operator_name>
$ sudo podman images | grep <operator_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow
9.3.8. About the auto-created ClusterGroupUpgrade CR for GitOps ZTP
TALM has a controller called ManagedClusterForCGU that monitors the Ready state of the ManagedCluster CRs on the hub cluster and creates the ClusterGroupUpgrade CRs for GitOps Zero Touch Provisioning (ZTP).
For any managed cluster in the Ready state without a ztp-done label applied, the ManagedClusterForCGU controller automatically creates a ClusterGroupUpgrade CR in the ztp-install namespace with its associated RHACM policies that are created during the GitOps ZTP process. TALM then remediates the set of configuration policies that are listed in the auto-created ClusterGroupUpgrade CR to push the configuration CRs to the managed cluster.
If there are no policies for the managed cluster at the time when the cluster becomes Ready, a ClusterGroupUpgrade CR with no policies is created. Upon completion of the ClusterGroupUpgrade the managed cluster is labeled as ztp-done. If there are policies that you want to apply for that managed cluster, manually create a ClusterGroupUpgrade as a day-2 operation.
Example of an auto-created ClusterGroupUpgrade CR for GitOps ZTP
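The following is a hedged sketch of what such an auto-created CR can look like; the cluster name and policy names are illustrative:

    apiVersion: ran.openshift.io/v1alpha1
    kind: ClusterGroupUpgrade
    metadata:
      name: spoke1
      namespace: ztp-install
      ownerReferences:
        - apiVersion: cluster.open-cluster-management.io/v1
          kind: ManagedCluster
          name: spoke1
          blockOwnerDeletion: true
          controller: true
    spec:
      clusters:
        - spoke1
      enable: true
      managedPolicies:
        - common-config-policy
        - common-subscriptions-policy
        - group-du-sno-config-policy
        - spoke1-config-policy
      remediationStrategy:
        maxConcurrency: 1
        timeout: 240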