Chapter 10. Managing cluster policies with PolicyGenTemplate resources
10.1. Configuring managed cluster policies by using PolicyGenTemplate resources
				Applied Policy custom resources (CRs) configure the managed clusters that you provision. You can customize how Red Hat Advanced Cluster Management (RHACM) uses PolicyGenTemplate CRs to generate the applied Policy CRs.
			
					Using PolicyGenTemplate CRs to manage and deploy policies to managed clusters will be deprecated in an upcoming OpenShift Container Platform release. Equivalent and improved functionality is available using Red Hat Advanced Cluster Management (RHACM) and PolicyGenerator CRs.
				
					For more information about PolicyGenerator resources, see the RHACM Integrating Policy Generator documentation.
				
10.1.1. About the PolicyGenTemplate CRD
					The PolicyGenTemplate custom resource definition (CRD) tells the PolicyGen policy generator what custom resources (CRs) to include in the cluster configuration, how to combine the CRs into the generated policies, and what items in those CRs need to be updated with overlay content.
				
					The following example shows a PolicyGenTemplate CR (common-ranGen.yaml) extracted from the ztp-site-generate reference container. The common-ranGen.yaml file defines two Red Hat Advanced Cluster Management (RHACM) policies. The policies manage a collection of configuration CRs, one for each unique value of policyName in the CR. common-ranGen.yaml creates a single placement binding and a placement rule to bind the policies to clusters based on the labels listed in the spec.bindingRules section.
				
Example PolicyGenTemplate CR - common-ranGen.yaml
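The following is an abbreviated sketch of the structure that the numbered callouts below refer to; the exact sourceFiles entries and overlay values in the shipped common-ranGen.yaml reference file can differ:

apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: "common"
  namespace: "ztp-common"
spec:
  bindingRules:
    common: "true"                      # 1
  sourceFiles:                          # 2
    - fileName: SriovSubscription.yaml
      policyName: "subscriptions-policy"
    - fileName: PtpSubscription.yaml
      policyName: "subscriptions-policy"
    - fileName: DefaultCatsrc.yaml      # 3
      policyName: "config-policy"       # 4
      metadata:
        name: default-cat-source
        namespace: openshift-marketplace
      spec:
        displayName: default-cat-source
        image: $imageUrl
    - fileName: DisconnectedICSP.yaml
      policyName: "config-policy"
    - fileName: OperatorHub.yaml
      policyName: "config-policy"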
1. common: "true" applies the policies to all clusters with this label.
2. Files listed under sourceFiles create the Operator policies for installed clusters.
3. DefaultCatsrc.yaml configures the catalog source for the disconnected registry.
4. policyName: "config-policy" configures Operator subscriptions. The OperatorHub CR disables the default sources and this CR replaces redhat-operators with a CatalogSource CR that points to the disconnected registry.
					A PolicyGenTemplate CR can be constructed with any number of included CRs. Apply the following example CR in the hub cluster to generate a policy containing a single CR:
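A minimal sketch of such a single-CR PolicyGenTemplate, assuming the group-du-sno group and the PtpConfigSlave.yaml source file discussed below; the interface name and option strings are illustrative:

apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: "group-du-sno"
  namespace: "ztp-group"
spec:
  bindingRules:
    group-du-sno: ""
  mcp: "master"
  sourceFiles:
    - fileName: PtpConfigSlave.yaml
      policyName: "config-policy"
      metadata:
        name: "du-ptp-slave"
      spec:
        profile:
          - name: "slave"
            interface: "ens5f0"
            ptp4lOpts: "-2 -s --summary_interval -4"
            phc2sysOpts: "-a -r -n 24"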
				
					Using the source file PtpConfigSlave.yaml as an example, the file defines a PtpConfig CR. The generated policy for the PtpConfigSlave example is named group-du-sno-config-policy. The PtpConfig CR defined in the generated group-du-sno-config-policy is named du-ptp-slave. The spec defined in PtpConfigSlave.yaml is placed under du-ptp-slave along with the other spec items defined under the source file.
				
					The following example shows the group-du-sno-config-policy CR:
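An abbreviated sketch of the generated policy; the full generated CR carries additional RHACM annotations, and the remaining PtpConfig spec content is merged in from the source file and the PolicyGenTemplate overlay:

apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: group-du-sno-config-policy
  namespace: ztp-group
spec:
  remediationAction: inform
  disabled: false
  policy-templates:
    - objectDefinition:
        apiVersion: policy.open-cluster-management.io/v1
        kind: ConfigurationPolicy
        metadata:
          name: group-du-sno-config-policy-config
        spec:
          remediationAction: inform
          severity: low
          object-templates:
            - complianceType: musthave
              objectDefinition:
                apiVersion: ptp.openshift.io/v1
                kind: PtpConfig
                metadata:
                  name: du-ptp-slave
                  namespace: openshift-ptp
                spec:
                  # profile and recommend entries merged from PtpConfigSlave.yaml
                  # and the PolicyGenTemplate overlay appear here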
				
10.1.2. Recommendations when customizing PolicyGenTemplate CRs
					Consider the following best practices when customizing site configuration PolicyGenTemplate custom resources (CRs):
				
- Use as few policies as are necessary. Using fewer policies requires fewer resources. Each additional policy creates increased CPU load for the hub cluster and the deployed managed cluster. CRs are combined into policies based on the policyName field in the PolicyGenTemplate CR. CRs in the same PolicyGenTemplate which have the same value for policyName are managed under a single policy.
- In disconnected environments, use a single catalog source for all Operators by configuring the registry as a single index containing all Operators. Each additional CatalogSource CR on the managed clusters increases CPU usage.
- MachineConfig CRs should be included as extraManifests in the SiteConfig CR so that they are applied during installation. This can reduce the overall time taken until the cluster is ready to deploy applications.
- PolicyGenTemplate CRs should override the channel field to explicitly identify the desired version. This ensures that changes in the source CR during upgrades do not update the generated subscription.
When managing large numbers of spoke clusters on the hub cluster, minimize the number of policies to reduce resource consumption.
Grouping multiple configuration CRs into a single or limited number of policies is one way to reduce the overall number of policies on the hub cluster. When using the common, group, and site hierarchy of policies for managing site configuration, it is especially important to combine site-specific configurations into a single policy.
10.1.3. PolicyGenTemplate CRs for RAN deployments
					Use PolicyGenTemplate custom resources (CRs) to customize the configuration applied to the cluster by using the GitOps Zero Touch Provisioning (ZTP) pipeline. The PolicyGenTemplate CR allows you to generate one or more policies to manage the set of configuration CRs on your fleet of clusters. The PolicyGenTemplate CR identifies the set of managed CRs, bundles them into policies, builds the policy wrapping around those CRs, and associates the policies with clusters by using label binding rules.
				
					The reference configuration, obtained from the GitOps ZTP container, is designed to provide a set of critical features and node tuning settings that ensure the cluster can support the stringent performance and resource utilization constraints typical of RAN (Radio Access Network) Distributed Unit (DU) applications. Changes or omissions from the baseline configuration can affect feature availability, performance, and resource utilization. Use the reference PolicyGenTemplate CRs as the basis to create a hierarchy of configuration files tailored to your specific site requirements.
				
					The baseline PolicyGenTemplate CRs that are defined for RAN DU cluster configuration can be extracted from the GitOps ZTP ztp-site-generate container. See "Preparing the GitOps ZTP site configuration repository" for further details.
				
					The PolicyGenTemplate CRs can be found in the ./out/argocd/example/policygentemplates folder. The reference architecture has common, group, and site-specific configuration CRs. Each PolicyGenTemplate CR refers to other CRs that can be found in the ./out/source-crs folder.
				
					The PolicyGenTemplate CRs relevant to RAN cluster configuration are described below. Variants are provided for the group PolicyGenTemplate CRs to account for differences in single-node, three-node compact, and standard cluster configurations. Similarly, site-specific configuration variants are provided for single-node clusters and multi-node (compact or standard) clusters. Use the group and site-specific configuration variants that are relevant for your deployment.
				
| PolicyGenTemplate CR | Description |
|---|---|
| example-multinode-site.yaml | Contains a set of CRs that get applied to multi-node clusters. These CRs configure SR-IOV features typical for RAN installations. |
| example-sno-site.yaml | Contains a set of CRs that get applied to single-node OpenShift clusters. These CRs configure SR-IOV features typical for RAN installations. |
| common-mno-ranGen.yaml | Contains a set of common RAN policy configuration that gets applied to multi-node clusters. |
| common-ranGen.yaml | Contains a set of common RAN CRs that get applied to all clusters. These CRs subscribe to a set of Operators providing cluster features typical for RAN, as well as baseline cluster tuning. |
| group-du-3node-ranGen.yaml | Contains the RAN policies for three-node clusters only. |
| group-du-sno-ranGen.yaml | Contains the RAN policies for single-node clusters only. |
| group-du-standard-ranGen.yaml | Contains the RAN policies for standard three control-plane clusters. |
| group-du-3node-validator-ranGen.yaml | Contains the validator inform policy that signals when GitOps ZTP installation and configuration is complete for three-node clusters. |
| group-du-sno-validator-ranGen.yaml | Contains the validator inform policy that signals when GitOps ZTP installation and configuration is complete for single-node OpenShift clusters. |
| group-du-standard-validator-ranGen.yaml | Contains the validator inform policy that signals when GitOps ZTP installation and configuration is complete for standard clusters. |
10.1.4. Customizing a managed cluster with PolicyGenTemplate CRs
Use the following procedure to customize the policies that get applied to the managed cluster that you provision using the GitOps Zero Touch Provisioning (ZTP) pipeline.
Prerequisites
- 
							You have installed the OpenShift CLI (oc).
- 
							You have logged in to the hub cluster as a user with cluster-admin privileges.
- You configured the hub cluster for generating the required installation and policy CRs.
- You created a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for the Argo CD application.
Procedure
- Create a PolicyGenTemplate CR for site-specific configuration CRs.
  - Choose the appropriate example for your CR from the out/argocd/example/policygentemplates folder, for example, example-sno-site.yaml or example-multinode-site.yaml.
  - Change the spec.bindingRules field in the example file to match the site-specific label included in the SiteConfig CR. In the example SiteConfig file, the site-specific label is sites: example-sno.
    Note: Ensure that the labels defined in your PolicyGenTemplate spec.bindingRules field correspond to the labels that are defined in the related managed cluster's SiteConfig CR.
  - Change the content in the example file to match the desired configuration.
- Optional: Create a PolicyGenTemplate CR for any common configuration CRs that apply to the entire fleet of clusters.
  - Select the appropriate example for your CR from the out/argocd/example/policygentemplates folder, for example, common-ranGen.yaml.
  - Change the content in the example file to match the required configuration.
- Optional: Create a PolicyGenTemplate CR for any group configuration CRs that apply to certain groups of clusters in the fleet. Ensure that the content of the overlaid spec files matches your required end state. As a reference, the out/source-crs directory contains the full list of source CRs available to be included and overlaid by your PolicyGenTemplate templates.
  Note: Depending on the specific requirements of your clusters, you might need more than a single group policy per cluster type, especially considering that the example group policies each have a single PerformanceProfile.yaml file that can only be shared across a set of clusters if those clusters consist of identical hardware configurations.
  - Select the appropriate example for your CR from the out/argocd/example/policygentemplates folder, for example, group-du-sno-ranGen.yaml.
  - Change the content in the example file to match the required configuration.
- Optional: Create a validator inform policy PolicyGenTemplate CR to signal when the GitOps ZTP installation and configuration of the deployed cluster is complete. For more information, see "Creating a validator inform policy".
- Define all the policy namespaces in a YAML file similar to the example out/argocd/example/policygentemplates/ns.yaml file.
  Important: Do not include the Namespace CR in the same file with the PolicyGenTemplate CR.
- Add the PolicyGenTemplate CRs and Namespace CR to the kustomization.yaml file in the generators section, similar to the example shown in out/argocd/example/policygentemplates/kustomization.yaml and the sketch after this procedure.
- Commit the PolicyGenTemplate CRs, Namespace CR, and associated kustomization.yaml file in your Git repository and push the changes.
  The ArgoCD pipeline detects the changes and begins the managed cluster deployment. You can push the changes to the SiteConfig CR and the PolicyGenTemplate CR simultaneously.
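A minimal sketch of the kustomization.yaml referenced in the previous steps, assuming the example PolicyGenTemplate file names used in this procedure; list your own PolicyGenTemplate CRs under generators and the namespace file under resources:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
generators:
  - common-ranGen.yaml
  - group-du-sno-ranGen.yaml
  - example-sno-site.yaml
resources:
  - ns.yaml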
10.1.5. Monitoring managed cluster policy deployment progress
					The ArgoCD pipeline uses PolicyGenTemplate CRs in Git to generate the RHACM policies and then sync them to the hub cluster. You can monitor the progress of the managed cluster policy synchronization after the assisted service installs OpenShift Container Platform on the managed cluster.
				
Prerequisites
- 
							You have installed the OpenShift CLI (oc).
- 
							You have logged in to the hub cluster as a user with cluster-admin privileges.
Procedure
- The Topology Aware Lifecycle Manager (TALM) applies the configuration policies that are bound to the cluster. After the cluster installation is complete and the cluster becomes Ready, a ClusterGroupUpgrade CR corresponding to this cluster, with a list of ordered policies defined by the ran.openshift.io/ztp-deploy-wave annotations, is automatically created by TALM. The cluster's policies are applied in the order listed in the ClusterGroupUpgrade CR. You can monitor the high-level progress of configuration policy reconciliation by using the following commands:
  $ export CLUSTER=<clusterName>
  $ oc get clustergroupupgrades -n ztp-install $CLUSTER -o jsonpath='{.status.conditions[-1:]}' | jq
- You can monitor the detailed cluster policy compliance status by using the RHACM dashboard or the command line. To check policy compliance by using oc, run the following command:
  $ oc get policies -n $CLUSTER
- To check policy status from the RHACM web console, perform the following actions:
  - Click Governance → Find policies.
  - Click on a cluster policy to check its status.
 
					When all of the cluster policies become compliant, GitOps ZTP installation and configuration for the cluster is complete. The ztp-done label is added to the cluster.
				
					In the reference configuration, the final policy that becomes compliant is the one defined in the *-du-validator-policy policy. This policy, when compliant on a cluster, ensures that all cluster configuration, Operator installation, and Operator configuration is complete.
				
10.1.6. Validating the generation of configuration policy CRs
					Policy custom resources (CRs) are generated in the same namespace as the PolicyGenTemplate from which they are created. The same troubleshooting flow applies to all policy CRs generated from a PolicyGenTemplate regardless of whether they are ztp-common, ztp-group, or ztp-site based, as shown using the following commands:
				
$ export NS=<namespace>
$ oc get policy -n $NS

The expected set of policy-wrapped CRs should be displayed.
If the policies failed synchronization, use the following troubleshooting steps.
Procedure
- To display detailed information about the policies, run the following command:
  $ oc describe -n openshift-gitops application policies
- Check for Status: Conditions: to show the error logs. For example, setting an invalid sourceFile entry to fileName: generates the error shown below:
  Status:
    Conditions:
      Last Transition Time:  2021-11-26T17:21:39Z
      Message:               rpc error: code = Unknown desc = `kustomize build /tmp/https___git.com/ran-sites/policies/ --enable-alpha-plugins` failed exit status 1: 2021/11/26 17:21:40 Error could not find test.yaml under source-crs/: no such file or directory Error: failure in plugin configured via /tmp/kust-plugin-config-52463179; exit status 1: exit status 1
      Type:                  ComparisonError
- Check for Status: Sync:. If there are log errors at Status: Conditions:, the Status: Sync: shows Unknown or Error.
- When Red Hat Advanced Cluster Management (RHACM) recognizes that policies apply to a ManagedCluster object, the policy CR objects are applied to the cluster namespace. Check to see if the policies were copied to the cluster namespace:
  $ oc get policy -n $CLUSTER
  RHACM copies all applicable policies into the cluster namespace. The copied policy names have the format: <PolicyGenTemplate.Namespace>.<PolicyGenTemplate.Name>-<policyName>.
- Check the placement rule for any policies not copied to the cluster namespace. The matchSelector in the PlacementRule for those policies should match labels on the ManagedCluster object:
  $ oc get PlacementRule -n $NS
- Note the PlacementRule name appropriate for the missing policy, common, group, or site, using the following command:
  $ oc get PlacementRule -n $NS <placement_rule_name> -o yaml
  - The status-decisions should include your cluster name.
  - The key-value pair of the matchSelector in the spec must match the labels on your managed cluster.
- Check the labels on the ManagedCluster object by using the following command:
  $ oc get ManagedCluster $CLUSTER -o jsonpath='{.metadata.labels}' | jq
- Check to see what policies are compliant by using the following command:
  $ oc get policy -n $CLUSTER
  If the Namespace, OperatorGroup, and Subscription policies are compliant but the Operator configuration policies are not, it is likely that the Operators did not install on the managed cluster. This causes the Operator configuration policies to fail to apply because the CRD is not yet applied to the spoke.
10.1.7. Restarting policy reconciliation
					You can restart policy reconciliation when unexpected compliance issues occur, for example, when the ClusterGroupUpgrade custom resource (CR) has timed out.
				
Procedure
- A ClusterGroupUpgrade CR is generated in the namespace ztp-install by the Topology Aware Lifecycle Manager after the managed cluster becomes Ready:
  $ export CLUSTER=<clusterName>
  $ oc get clustergroupupgrades -n ztp-install $CLUSTER
- If there are unexpected issues and the policies fail to become compliant within the configured timeout (the default is 4 hours), the status of the ClusterGroupUpgrade CR shows UpgradeTimedOut:
  $ oc get clustergroupupgrades -n ztp-install $CLUSTER -o jsonpath='{.status.conditions[?(@.type=="Ready")]}'
- A ClusterGroupUpgrade CR in the UpgradeTimedOut state automatically restarts its policy reconciliation every hour. If you have changed your policies, you can start a retry immediately by deleting the existing ClusterGroupUpgrade CR. This triggers the automatic creation of a new ClusterGroupUpgrade CR that begins reconciling the policies immediately:
  $ oc delete clustergroupupgrades -n ztp-install $CLUSTER
					Note that when the ClusterGroupUpgrade CR completes with status UpgradeCompleted and the managed cluster has the label ztp-done applied, you can make additional configuration changes by using PolicyGenTemplate. Deleting the existing ClusterGroupUpgrade CR will not make the TALM generate a new CR.
				
					At this point, GitOps ZTP has completed its interaction with the cluster and any further interactions should be treated as an update and a new ClusterGroupUpgrade CR created for remediation of the policies.
				
10.1.8. Changing applied managed cluster CRs using policies
You can remove content from a custom resource (CR) that is deployed in a managed cluster through a policy.
					By default, all Policy CRs created from a PolicyGenTemplate CR have the complianceType field set to musthave. A musthave policy without the removed content is still compliant because the CR on the managed cluster has all the specified content. With this configuration, when you remove content from a CR, TALM removes the content from the policy but the content is not removed from the CR on the managed cluster.
				
					With the complianceType field set to mustonlyhave, the policy ensures that the CR on the cluster is an exact match of what is specified in the policy.
				
Prerequisites
- 
							You have installed the OpenShift CLI (oc).
- 
							You have logged in to the hub cluster as a user with cluster-admin privileges.
- You have deployed a managed cluster from a hub cluster running RHACM.
- You have installed Topology Aware Lifecycle Manager on the hub cluster.
Procedure
- Remove the content that you no longer need from the affected CRs. In this example, the disableDrain: false line was removed from the SriovOperatorConfig CR.
- Change the complianceType of the affected policies to mustonlyhave in the group-du-sno-ranGen.yaml file.
  Example YAML:
  - fileName: SriovOperatorConfig.yaml
    policyName: "config-policy"
    complianceType: mustonlyhave
- Create a ClusterGroupUpgrade CR and specify the clusters that must receive the CR changes, as shown in the sketch after this procedure.
- Create the ClusterGroupUpgrade CR by running the following command:
  $ oc create -f cgu-remove.yaml
- When you are ready to apply the changes, for example, during an appropriate maintenance window, change the value of the spec.enable field to true by running the following command:
  $ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-remove \
    --patch '{"spec":{"enable":true}}' --type=merge
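A minimal sketch of the cgu-remove.yaml file used in this procedure, assuming the ztp-group.group-du-sno-config-policy policy and placeholder cluster names; adjust managedPolicies, clusters, and the remediation strategy for your environment:

apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: cgu-remove
  namespace: default
spec:
  managedPolicies:
    - ztp-group.group-du-sno-config-policy
  enable: false                 # set to true when the maintenance window opens
  clusters:
    - spoke1
    - spoke2
  remediationStrategy:
    maxConcurrency: 2
    timeout: 240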
Verification
- Check the status of the policies by running the following command:
  $ oc get <kind> <changed_cr_name>
  Example output:
  NAMESPACE   NAME                                        REMEDIATION ACTION   COMPLIANCE STATE   AGE
  default     cgu-ztp-group.group-du-sno-config-policy   enforce                                  17m
  default     ztp-group.group-du-sno-config-policy       inform               NonCompliant       15h
  When the COMPLIANCE STATE of the policy is Compliant, it means that the CR is updated and the unwanted content is removed.
- Check that the policies are removed from the targeted clusters by running the following command on the managed clusters:
  $ oc get <kind> <changed_cr_name>
  If there are no results, the CR is removed from the managed cluster.
10.1.9. Indication of done for GitOps ZTP installations
GitOps Zero Touch Provisioning (ZTP) simplifies the process of checking the GitOps ZTP installation status for a cluster. The GitOps ZTP status moves through three phases: cluster installation, cluster configuration, and GitOps ZTP done.
- Cluster installation phase
- The cluster installation phase is shown by the ManagedClusterJoined and ManagedClusterAvailable conditions in the ManagedCluster CR. If the ManagedCluster CR does not have these conditions, or the condition is set to False, the cluster is still in the installation phase. Additional details about installation are available from the AgentClusterInstall and ClusterDeployment CRs. For more information, see "Troubleshooting GitOps ZTP".
- Cluster configuration phase
- The cluster configuration phase is shown by a ztp-running label applied to the ManagedCluster CR for the cluster.
- GitOps ZTP done
- Cluster installation and configuration is complete in the GitOps ZTP done phase. This is shown by the removal of the ztp-running label and addition of the ztp-done label to the ManagedCluster CR. The ztp-done label shows that the configuration has been applied and the baseline DU configuration has completed cluster tuning.
  The change to the GitOps ZTP done state is conditional on the compliant state of a Red Hat Advanced Cluster Management (RHACM) validator inform policy. This policy captures the existing criteria for a completed installation and validates that it moves to a compliant state only when GitOps ZTP provisioning of the managed cluster is complete.
  The validator inform policy ensures the configuration of the cluster is fully applied and Operators have completed their initialization. The policy validates the following:
  - The target MachineConfigPool contains the expected entries and has finished updating. All nodes are available and not degraded.
  - The SR-IOV Operator has completed initialization as indicated by at least one SriovNetworkNodeState with syncStatus: Succeeded.
  - The PTP Operator daemon set exists.
10.2. Advanced managed cluster configuration with PolicyGenTemplate resources
				You can use PolicyGenTemplate CRs to deploy custom functionality in your managed clusters.
			
					Using PolicyGenTemplate CRs to manage and deploy policies to managed clusters will be deprecated in an upcoming OpenShift Container Platform release. Equivalent and improved functionality is available using Red Hat Advanced Cluster Management (RHACM) and PolicyGenerator CRs.
				
					For more information about PolicyGenerator resources, see the RHACM Integrating Policy Generator documentation.
				
10.2.1. Deploying additional changes to clusters
If you require cluster configuration changes outside of the base GitOps Zero Touch Provisioning (ZTP) pipeline configuration, there are three options:
- Apply the additional configuration after the GitOps ZTP pipeline is complete
- When the GitOps ZTP pipeline deployment is complete, the deployed cluster is ready for application workloads. At this point, you can install additional Operators and apply configurations specific to your requirements. Ensure that additional configurations do not negatively affect the performance of the platform or allocated CPU budget.
- Add content to the GitOps ZTP library
- The base source custom resources (CRs) that you deploy with the GitOps ZTP pipeline can be augmented with custom content as required.
- Create extra manifests for the cluster installation
- Extra manifests are applied during installation and make the installation process more efficient.
Providing additional source CRs or modifying existing source CRs can significantly impact the performance or CPU profile of OpenShift Container Platform.
10.2.2. Using PolicyGenTemplate CRs to override source CRs content
					PolicyGenTemplate custom resources (CRs) allow you to overlay additional configuration details on top of the base source CRs provided with the GitOps plugin in the ztp-site-generate container. You can think of PolicyGenTemplate CRs as a logical merge or patch to the base CR. Use PolicyGenTemplate CRs to update a single field of the base CR, or overlay the entire contents of the base CR. You can update values and insert fields that are not in the base CR.
				
					The following example procedure describes how to update fields in the generated PerformanceProfile CR for the reference configuration based on the PolicyGenTemplate CR in the group-du-sno-ranGen.yaml file. Use the procedure as a basis for modifying other parts of the PolicyGenTemplate based on your requirements.
				
Prerequisites
- Create a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for Argo CD.
Procedure
- Review the baseline source CR for existing content. You can review the source CRs listed in the reference PolicyGenTemplate CRs by extracting them from the GitOps Zero Touch Provisioning (ZTP) container.
  - Create an /out folder:
    $ mkdir -p ./out
  - Extract the source CRs:
    $ podman run --log-driver=none --rm registry.redhat.io/openshift4/ztp-site-generate-rhel8:v4.18.1 extract /home/ztp --tar | tar x -C ./out
- Review the baseline PerformanceProfile CR in ./out/source-crs/PerformanceProfile.yaml.
  Note: Any fields in the source CR which contain $… are removed from the generated CR if they are not provided in the PolicyGenTemplate CR.
- Update the PolicyGenTemplate entry for PerformanceProfile in the group-du-sno-ranGen.yaml reference file. The PolicyGenTemplate CR stanza sketched after this procedure supplies appropriate CPU specifications, sets the hugepages configuration, and adds a new field that sets globallyDisableIrqLoadBalancing to false.
- Commit the PolicyGenTemplate change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.
  The GitOps ZTP application generates an RHACM policy that contains the generated PerformanceProfile CR. The contents of that CR are derived by merging the metadata and spec contents from the PerformanceProfile entry in the PolicyGenTemplate onto the source CR.
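A sketch of the PerformanceProfile stanza referenced in the procedure above; the CPU lists, hugepage count, and size are illustrative placeholders that must be replaced with values appropriate for your hardware:

    - fileName: PerformanceProfile.yaml
      policyName: "config-policy"
      metadata:
        name: openshift-node-performance-profile
      spec:
        cpu:
          # Tailor the isolated and reserved CPU sets to the target hardware
          isolated: "2-19,22-39"
          reserved: "0-1,20-21"
        hugepages:
          defaultHugepagesSize: 1G
          pages:
            - size: 1G
              count: 10
        globallyDisableIrqLoadBalancing: false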
						In the /source-crs folder that you extract from the ztp-site-generate container, the $ syntax is not used for template substitution as implied by the syntax. Rather, if the policyGen tool sees the $ prefix for a string and you do not specify a value for that field in the related PolicyGenTemplate CR, the field is omitted from the output CR entirely.
					
						An exception to this is the $mcp variable in /source-crs YAML files that is substituted with the specified value for mcp from the PolicyGenTemplate CR. For example, in example/policygentemplates/group-du-standard-ranGen.yaml, the value for mcp is worker:
					
spec:
  bindingRules:
    group-du-standard: ""
  mcp: "worker"
						The policyGen tool replaces instances of $mcp with worker in the output CRs.
					
10.2.3. Adding custom content to the GitOps ZTP pipeline
Perform the following procedure to add new content to the GitOps ZTP pipeline.
Procedure
- Create a subdirectory named source-crs in the directory that contains the kustomization.yaml file for the PolicyGenTemplate custom resource (CR).
- Add your user-provided CRs to the source-crs subdirectory. The source-crs subdirectory must be in the same directory as the kustomization.yaml file.
- Update the required PolicyGenTemplate CRs to include references to the content you added in the source-crs/custom-crs and source-crs/elasticsearch directories, as shown in the sketch after this procedure.
- Commit the PolicyGenTemplate change in Git, and then push to the Git repository that is monitored by the GitOps ZTP Argo CD policies application.
- Update the ClusterGroupUpgrade CR to include the changed PolicyGenTemplate and save it as cgu-test.yaml.
- Apply the updated ClusterGroupUpgrade CR by running the following command:
  $ oc apply -f cgu-test.yaml
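A sketch of how the added content might be referenced from a PolicyGenTemplate CR; the file names under custom-crs and elasticsearch are hypothetical placeholders for your own CRs, referenced relative to the source-crs subdirectory:

apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: "group-dev"
  namespace: "ztp-clusters-sub"
spec:
  bindingRules:
    dev: "true"
  mcp: "master"
  sourceFiles:
    # User-provided CRs are referenced relative to the source-crs subdirectory
    - fileName: elasticsearch/es-cluster.yaml
      policyName: "dev-group-policy"
    - fileName: custom-crs/apiserver-config.yaml
      policyName: "dev-group-policy"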
Verification
- Check that the updates have succeeded by running the following command:
  $ oc get cgu -A
  Example output:
  NAMESPACE      NAME               AGE   STATE        DETAILS
  ztp-clusters   custom-source-cr   6s    InProgress   Remediating non-compliant policies
  ztp-install    cluster1           19h   Completed    All clusters are compliant with all the managed policies
10.2.4. Configuring policy compliance evaluation timeouts for PolicyGenTemplate CRs
Use Red Hat Advanced Cluster Management (RHACM) installed on a hub cluster to monitor and report on whether your managed clusters are compliant with applied policies. RHACM uses policy templates to apply predefined policy controllers and policies. Policy controllers are Kubernetes custom resource definition (CRD) instances.
					You can override the default policy evaluation intervals with PolicyGenTemplate custom resources (CRs). You configure duration settings that define how long a ConfigurationPolicy CR can be in a state of policy compliance or non-compliance before RHACM re-evaluates the applied cluster policies.
				
					The GitOps Zero Touch Provisioning (ZTP) policy generator generates ConfigurationPolicy CR policies with pre-defined policy evaluation intervals. The default value for the noncompliant state is 10 seconds. The default value for the compliant state is 10 minutes. To disable the evaluation interval, set the value to never.
				
Prerequisites
- 
							You have installed the OpenShift CLI (oc).
- 
							You have logged in to the hub cluster as a user with cluster-admin privileges.
- You have created a Git repository where you manage your custom site configuration data.
Procedure
- To configure the evaluation interval for all policies in a PolicyGenTemplate CR, set appropriate compliant and noncompliant values for the evaluationInterval field. For example:
  spec:
    evaluationInterval:
      compliant: 30m
      noncompliant: 20s
  Note: You can also set the compliant and noncompliant fields to never to stop evaluating the policy after it reaches a particular compliance state.
- To configure the evaluation interval for an individual policy object in a PolicyGenTemplate CR, add the evaluationInterval field and set appropriate values, as shown in the sketch after this procedure.
- Commit the PolicyGenTemplate CR files in the Git repository and push your changes.
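A sketch of a per-policy-object setting, assuming a hypothetical SriovSubscription.yaml entry; any sourceFiles entry can carry its own evaluationInterval:

spec:
  sourceFiles:
    - fileName: SriovSubscription.yaml
      policyName: "subscriptions-policy"
      evaluationInterval:
        compliant: never
        noncompliant: 10s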
Verification
Check that the managed spoke cluster policies are monitored at the expected intervals.
- 
							Log in as a user with cluster-admin privileges on the managed cluster.
- Get the pods that are running in the open-cluster-management-agent-addon namespace. Run the following command:
  $ oc get pods -n open-cluster-management-agent-addon
  Example output:
  NAME                                        READY   STATUS    RESTARTS        AGE
  config-policy-controller-858b894c68-v4xdb   1/1     Running   22 (5d8h ago)   10d
- Check that the applied policies are being evaluated at the expected interval in the logs for the config-policy-controller pod:
  $ oc logs -n open-cluster-management-agent-addon config-policy-controller-858b894c68-v4xdb
  Example output:
  2022-05-10T15:10:25.280Z  info  configuration-policy-controller  controllers/configurationpolicy_controller.go:166  Skipping the policy evaluation due to the policy not reaching the evaluation interval  {"policy": "compute-1-config-policy-config"}
  2022-05-10T15:10:25.280Z  info  configuration-policy-controller  controllers/configurationpolicy_controller.go:166  Skipping the policy evaluation due to the policy not reaching the evaluation interval  {"policy": "compute-1-common-compute-1-catalog-policy-config"}
10.2.5. Signalling GitOps ZTP cluster deployment completion with validator inform policies
Create a validator inform policy that signals when the GitOps Zero Touch Provisioning (ZTP) installation and configuration of the deployed cluster is complete. This policy can be used for deployments of single-node OpenShift clusters, three-node clusters, and standard clusters.
Procedure
- Create a standalone PolicyGenTemplate custom resource (CR) that contains the source file validatorCRs/informDuValidator.yaml. You only need one standalone PolicyGenTemplate CR for each cluster type. For example, this CR applies a validator inform policy for single-node OpenShift clusters. A sketch of the single-node cluster validator inform policy CR (group-du-sno-validator-ranGen.yaml) is shown after this procedure.
  1. The name of the PolicyGenTemplate object. This name is also used as part of the names for the placementBinding, placementRule, and policy that are created in the requested namespace.
  2. This value should match the namespace used in the group PolicyGenTemplate CRs.
  3. The group-du-* label defined in bindingRules must exist in the SiteConfig files.
  4. The label defined in bindingExcludedRules must be ztp-done:. The ztp-done label is used in coordination with the Topology Aware Lifecycle Manager.
  5. mcp defines the MachineConfigPool object that is used in the source file validatorCRs/informDuValidator.yaml. It should be master for single-node and three-node cluster deployments and worker for standard cluster deployments.
  6. Optional. The default value is inform.
  7. This value is used as part of the name for the generated RHACM policy. The generated validator policy for the single-node example is group-du-sno-validator-du-policy.
- Commit the PolicyGenTemplate CR file in your Git repository and push the changes.
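A sketch of the single-node validator PolicyGenTemplate described by the callouts above; the numbered comments map to the callout numbers:

apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: "group-du-sno-validator"          # 1
  namespace: "ztp-group"                  # 2
spec:
  bindingRules:
    group-du-sno: ""                      # 3
  bindingExcludedRules:
    ztp-done: ""                          # 4
  mcp: "master"                           # 5
  sourceFiles:
    - fileName: validatorCRs/informDuValidator.yaml
      remediationAction: inform           # 6
      policyName: "du-policy"             # 7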
10.2.6. Configuring power states using PolicyGenTemplate CRs
For low latency and high-performance edge deployments, it is necessary to disable or limit C-states and P-states. With this configuration, the CPU runs at a constant frequency, which is typically the maximum turbo frequency. This ensures that the CPU is always running at its maximum speed, which results in high performance and low latency. This leads to the best latency for workloads. However, this also leads to the highest power consumption, which might not be necessary for all workloads.
Workloads can be classified as critical or non-critical, with critical workloads requiring disabled C-state and P-state settings for high performance and low latency, while non-critical workloads use C-state and P-state settings for power savings at the expense of some latency and performance. You can configure the following three power states using GitOps Zero Touch Provisioning (ZTP):
- High-performance mode provides ultra low latency at the highest power consumption.
- Performance mode provides low latency at a relatively high power consumption.
- Power saving balances reduced power consumption with increased latency.
The default configuration is for a low latency, performance mode.
					PolicyGenTemplate custom resources (CRs) allow you to overlay additional configuration details onto the base source CRs provided with the GitOps plugin in the ztp-site-generate container.
				
					Configure the power states by updating the workloadHints fields in the generated PerformanceProfile CR for the reference configuration, based on the PolicyGenTemplate CR in the group-du-sno-ranGen.yaml.
				
The following common prerequisites apply to configuring all three power states.
Prerequisites
- You have created a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for Argo CD.
- You have followed the procedure described in "Preparing the GitOps ZTP site configuration repository".
10.2.6.1. Configuring performance mode using PolicyGenTemplate CRs
						Follow this example to set performance mode by updating the workloadHints fields in the generated PerformanceProfile CR for the reference configuration, based on the PolicyGenTemplate CR in the group-du-sno-ranGen.yaml.
					
Performance mode provides low latency at a relatively high power consumption.
Prerequisites
- You have configured the BIOS with performance related settings by following the guidance in "Configuring host firmware for low latency and high performance".
Procedure
- Update the PolicyGenTemplate entry for PerformanceProfile in the group-du-sno-ranGen.yaml reference file in out/argocd/example/policygentemplates/ to set performance mode, as shown in the sketch after this procedure.
- Commit the PolicyGenTemplate change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.
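A sketch of the workloadHints settings for performance mode; only the fields shown need to differ from the baseline PerformanceProfile entry:

    - fileName: PerformanceProfile.yaml
      policyName: "config-policy"
      metadata:
        name: openshift-node-performance-profile
      spec:
        workloadHints:
          realTime: true
          highPowerConsumption: false
          perPodPowerManagement: false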
10.2.6.2. Configuring high-performance mode using PolicyGenTemplate CRs
						Follow this example to set high performance mode by updating the workloadHints fields in the generated PerformanceProfile CR for the reference configuration, based on the PolicyGenTemplate CR in the group-du-sno-ranGen.yaml.
					
High performance mode provides ultra low latency at the highest power consumption.
Prerequisites
- You have configured the BIOS with performance related settings by following the guidance in "Configuring host firmware for low latency and high performance".
Procedure
- Update the PolicyGenTemplate entry for PerformanceProfile in the group-du-sno-ranGen.yaml reference file in out/argocd/example/policygentemplates/ to set high-performance mode, as shown in the sketch after this procedure.
- Commit the PolicyGenTemplate change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.
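A sketch of the workloadHints settings for high-performance mode:

    - fileName: PerformanceProfile.yaml
      policyName: "config-policy"
      metadata:
        name: openshift-node-performance-profile
      spec:
        workloadHints:
          realTime: true
          highPowerConsumption: true
          perPodPowerManagement: false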
10.2.6.3. Configuring power saving mode using PolicyGenTemplate CRs
						Follow this example to set power saving mode by updating the workloadHints fields in the generated PerformanceProfile CR for the reference configuration, based on the PolicyGenTemplate CR in the group-du-sno-ranGen.yaml.
					
The power saving mode balances reduced power consumption with increased latency.
Prerequisites
- You enabled C-states and OS-controlled P-states in the BIOS.
Procedure
- Update the PolicyGenTemplate entry for PerformanceProfile in the group-du-sno-ranGen.yaml reference file in out/argocd/example/policygentemplates/ to configure power saving mode, as shown in the sketch after this procedure. It is recommended to configure the CPU governor for the power saving mode through the additional kernel arguments object. The schedutil governor is recommended; however, other governors that can be used include ondemand and powersave.
- Commit the PolicyGenTemplate change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.
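A sketch of the workloadHints and additional kernel arguments for power saving mode; the governor value follows the recommendation in the step above:

    - fileName: PerformanceProfile.yaml
      policyName: "config-policy"
      metadata:
        name: openshift-node-performance-profile
      spec:
        workloadHints:
          realTime: true
          highPowerConsumption: false
          perPodPowerManagement: true
        additionalKernelArgs:
          - "cpufreq.default_governor=schedutil"   # CPU governor for power saving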
Verification
- Select a worker node in your deployed cluster from the list of nodes identified by using the following command:
  $ oc get nodes
- Log in to the node by using the following command:
  $ oc debug node/<node-name>
  Replace <node-name> with the name of the node you want to verify the power state on.
- Set /host as the root directory within the debug shell. The debug pod mounts the host's root file system in /host within the pod. By changing the root directory to /host, you can run binaries contained in the host's executable paths as shown in the following example:
  # chroot /host
- Run the following command to verify the applied power state:
  # cat /proc/cmdline
Expected output
- For power saving mode, the output includes intel_pstate=passive.
10.2.6.4. Maximizing power savings
Limiting the maximum CPU frequency is recommended to achieve maximum power savings. Enabling C-states on the non-critical workload CPUs without restricting the maximum CPU frequency negates much of the power savings by boosting the frequency of the critical CPUs.
						Maximize power savings by updating the sysfs plugin fields, setting an appropriate value for max_perf_pct in the TunedPerformancePatch CR for the reference configuration. This example based on the group-du-sno-ranGen.yaml describes the procedure to follow to restrict the maximum CPU frequency.
					
Prerequisites
- You have configured power saving mode as described in "Configuring power saving mode using PolicyGenTemplate CRs".
Procedure
- Update the PolicyGenTemplate entry for TunedPerformancePatch in the group-du-sno-ranGen.yaml reference file in out/argocd/example/policygentemplates/. To maximize power savings, add max_perf_pct as shown in the sketch after this procedure. The max_perf_pct value controls the maximum frequency that the cpufreq driver is allowed to set, as a percentage of the maximum supported CPU frequency. This value applies to all CPUs. You can check the maximum supported frequency in /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq. As a starting point, you can use a percentage that caps all CPUs at the All Cores Turbo frequency. The All Cores Turbo frequency is the frequency that all cores will run at when the cores are all fully occupied.
  Note: To maximize power savings, set a lower value. Setting a lower value for max_perf_pct limits the maximum CPU frequency, thereby reducing power consumption, but also potentially impacting performance. Experiment with different values and monitor the system's performance and power consumption to find the optimal setting for your use-case.
- Commit the PolicyGenTemplate change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.
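A sketch of such a TunedPerformancePatch entry; <x> is the percentage that you choose, and the profile body is abbreviated to the sysfs setting being added:

    - fileName: TunedPerformancePatch.yaml
      policyName: "config-policy"
      spec:
        profile:
          - name: performance-patch
            data: |
              # Keep the rest of the reference performance-patch profile and add:
              [sysfs]
              /sys/devices/system/cpu/intel_pstate/max_perf_pct=<x>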
10.2.7. Configuring LVM Storage using PolicyGenTemplate CRs
You can configure Logical Volume Manager (LVM) Storage for managed clusters that you deploy with GitOps Zero Touch Provisioning (ZTP).
You use LVM Storage to persist event subscriptions when you use PTP events or bare-metal hardware events with HTTP transport.
Use the Local Storage Operator for persistent storage that uses local volumes in distributed units.
Prerequisites
- 
							Install the OpenShift CLI (oc).
- 
							Log in as a user with cluster-admin privileges.
- Create a Git repository where you manage your custom site configuration data.
Procedure
- To configure LVM Storage for new managed clusters, add the required Subscription source files to spec.sourceFiles in the common-ranGen.yaml file, as shown in the sketch after this procedure.
  Note: The Storage LVMO subscription is deprecated. In future releases of OpenShift Container Platform, the storage LVMO subscription will not be available. Instead, you must use the Storage LVMS subscription.
  In OpenShift Container Platform 4.18, you can use the Storage LVMS subscription instead of the LVMO subscription. The LVMS subscription does not require manual overrides in the common-ranGen.yaml file.
- Add the LVMCluster CR to spec.sourceFiles in your specific group or individual site configuration file, for example, the group-du-sno-ranGen.yaml file, as shown in the sketch after this procedure. This example configuration creates a volume group (vg1) with all the available devices, except the disk where OpenShift Container Platform is installed. A thin-pool logical volume is also created.
- Merge any other required changes and files with your custom site repository.
- Commit the PolicyGenTemplate changes in Git, and then push the changes to your site configuration repository to deploy LVM Storage to new sites using GitOps ZTP.
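A sketch of the spec.sourceFiles additions described above, assuming the Storage LVMS source file names shipped in the reference container; verify the exact file names against your extracted ./out/source-crs folder:

    # common-ranGen.yaml: Storage LVMS subscription
    - fileName: StorageLVMSubscriptionNS.yaml
      policyName: subscriptions-policy
    - fileName: StorageLVMSubscriptionOperGroup.yaml
      policyName: subscriptions-policy
    - fileName: StorageLVMSubscription.yaml
      policyName: subscriptions-policy
    # group-du-sno-ranGen.yaml: LVMCluster configuration
    - fileName: StorageLVMCluster.yaml
      policyName: "lvms-config"
      spec:
        storage:
          deviceClasses:
            - name: vg1
              thinPoolConfig:
                name: thin-pool-1
                sizePercent: 90
                overprovisionRatio: 10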
10.2.8. Configuring PTP events with PolicyGenTemplate CRs
You can use the GitOps ZTP pipeline to configure PTP events that use HTTP transport.
10.2.8.1. Configuring PTP events that use HTTP transport
You can configure PTP events that use HTTP transport on managed clusters that you deploy with the GitOps Zero Touch Provisioning (ZTP) pipeline.
Prerequisites
- 
								You have installed the OpenShift CLI (oc).
- 
								You have logged in as a user with cluster-admin privileges.
- You have created a Git repository where you manage your custom site configuration data.
Procedure
- Apply the following PolicyGenTemplate changes to the group-du-3node-ranGen.yaml, group-du-sno-ranGen.yaml, or group-du-standard-ranGen.yaml files, according to your requirements:
  - In spec.sourceFiles, add the PtpOperatorConfig CR file that configures the transport host, as shown in the sketch after this procedure.
    Note: In OpenShift Container Platform 4.13 or later, you do not need to set the transportHost field in the PtpOperatorConfig resource when you use HTTP transport with PTP events.
  - Configure the linuxptp and phc2sys programs for the PTP clock type and interface by adding an entry to spec.sourceFiles, as shown in the sketch after this procedure.
    1. Can be PtpConfigMaster.yaml or PtpConfigSlave.yaml depending on your requirements. For configurations based on group-du-sno-ranGen.yaml or group-du-3node-ranGen.yaml, use PtpConfigSlave.yaml.
    2. Device specific interface name.
    3. You must append the --summary_interval -4 value to ptp4lOpts in .spec.sourceFiles.spec.profile to enable PTP fast events.
    4. Required phc2sysOpts values. -m prints messages to stdout. The linuxptp-daemon DaemonSet parses the logs and generates Prometheus metrics.
    5. Optional. If the ptpClockThreshold stanza is not present, default values are used for the ptpClockThreshold fields. The stanza shows default ptpClockThreshold values. The ptpClockThreshold values configure how long after the PTP master clock is disconnected before PTP events are triggered. holdOverTimeout is the time value in seconds before the PTP clock event state changes to FREERUN when the PTP master clock is disconnected. The maxOffsetThreshold and minOffsetThreshold settings configure offset values in nanoseconds that compare against the values for CLOCK_REALTIME (phc2sys) or master offset (ptp4l). When the ptp4l or phc2sys offset value is outside this range, the PTP clock state is set to FREERUN. When the offset value is within this range, the PTP clock state is set to LOCKED.
- Merge any other required changes and files with your custom site repository.
- Push the changes to your site configuration repository to deploy PTP fast events to new sites using GitOps ZTP.
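A sketch of the two spec.sourceFiles entries described in the procedure above; the numbered comments map to the callouts, and the interface name, option strings, and threshold values are illustrative:

    - fileName: PtpOperatorConfigForEvent.yaml
      policyName: "config-policy"
      spec:
        daemonNodeSelector: {}
        ptpEventConfig:
          enableEventPublisher: true
    - fileName: PtpConfigSlave.yaml                    # 1
      policyName: "config-policy"
      metadata:
        name: "du-ptp-slave"
      spec:
        profile:
          - name: "slave"
            interface: "ens5f1"                        # 2
            ptp4lOpts: "-2 -s --summary_interval -4"   # 3
            phc2sysOpts: "-a -r -m -n 24 -N 8 -R 16"   # 4
        ptpClockThreshold:                             # 5
          holdOverTimeout: 30
          maxOffsetThreshold: 100
          minOffsetThreshold: -100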
10.2.9. Configuring the Image Registry Operator for local caching of images
OpenShift Container Platform manages image caching using a local registry. In edge computing use cases, clusters are often subject to bandwidth restrictions when communicating with centralized image registries, which might result in long image download times.
					Long download times are unavoidable during initial deployment. Over time, there is a risk that CRI-O will erase the /var/lib/containers/storage directory in the case of an unexpected shutdown. To address long image download times, you can create a local image registry on remote managed clusters using GitOps Zero Touch Provisioning (ZTP). This is useful in Edge computing scenarios where clusters are deployed at the far edge of the network.
				
					Before you can set up the local image registry with GitOps ZTP, you need to configure disk partitioning in the SiteConfig CR that you use to install the remote managed cluster. After installation, you configure the local image registry using a PolicyGenTemplate CR. Then, the GitOps ZTP pipeline creates Persistent Volume (PV) and Persistent Volume Claim (PVC) CRs and patches the imageregistry configuration.
				
The local image registry can only be used for user application images and cannot be used for the OpenShift Container Platform or Operator Lifecycle Manager operator images.
10.2.9.1. Configuring disk partitioning with SiteConfig
						Configure disk partitioning for a managed cluster using a SiteConfig CR and GitOps Zero Touch Provisioning (ZTP). The disk partition details in the SiteConfig CR must match the underlying disk.
					
You must complete this procedure at installation time.
Prerequisites
- Install Butane.
Procedure
- Create the storage.bu file, as shown in the sketch after this procedure.
- Convert the storage.bu to an Ignition file by running the following command:
  $ butane storage.bu
  Example output:
  {"ignition":{"version":"3.2.0"},"storage":{"disks":[{"device":"/dev/disk/by-path/pci-0000:01:00.0-scsi-0:2:0:0","partitions":[{"label":"var-lib-containers","sizeMiB":0,"startMiB":250000}],"wipeTable":false}],"filesystems":[{"device":"/dev/disk/by-partlabel/var-lib-containers","format":"xfs","mountOptions":["defaults","prjquota"],"path":"/var/lib/containers","wipeFilesystem":true}]},"systemd":{"units":[{"contents":"# # Generated by Butane\n[Unit]\nRequires=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\nAfter=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\n\n[Mount]\nWhere=/var/lib/containers\nWhat=/dev/disk/by-partlabel/var-lib-containers\nType=xfs\nOptions=defaults,prjquota\n\n[Install]\nRequiredBy=local-fs.target","enabled":true,"name":"var-lib-containers.mount"}]}}
- Use a tool such as JSON Pretty Print to format the JSON output for readability.
- Copy the output into the .spec.clusters.nodes.ignitionConfigOverride field in the SiteConfig CR, as shown in the sketch after this step.

  Note

  If the .spec.clusters.nodes.ignitionConfigOverride field does not exist, create it.
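  The following is a minimal sketch of the relevant SiteConfig fields. The cluster and node names are placeholders, and the JSON string is the single-line Ignition output produced by Butane in the earlier step.

  Example SiteConfig snippet (sketch)

  apiVersion: ran.openshift.io/v1
  kind: SiteConfig
  metadata:
    name: "example-sno"
    namespace: "example-sno"
  spec:
    clusters:
    - clusterName: "example-sno"
      nodes:
      - hostName: "example-node1.example.com"
        # Single-line Ignition JSON produced by Butane, supplied as a string
        ignitionConfigOverride: |
          {"ignition":{"version":"3.2.0"},"storage":{"disks":[{"device":"/dev/disk/by-path/pci-0000:01:00.0-scsi-0:2:0:0","partitions":[{"label":"var-lib-containers","sizeMiB":0,"startMiB":250000}],"wipeTable":false}],"filesystems":[{"device":"/dev/disk/by-partlabel/var-lib-containers","format":"xfs","mountOptions":["defaults","prjquota"],"path":"/var/lib/containers","wipeFilesystem":true}]},"systemd":{"units":[{"contents":"# Generated by Butane\n[Unit]\nRequires=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\nAfter=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\n\n[Mount]\nWhere=/var/lib/containers\nWhat=/dev/disk/by-partlabel/var-lib-containers\nType=xfs\nOptions=defaults,prjquota\n\n[Install]\nRequiredBy=local-fs.target","enabled":true,"name":"var-lib-containers.mount"}]}}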
Verification
- During or after installation, verify on the hub cluster that the BareMetalHost object shows the annotation by running the following command:

    $ oc get bmh -n my-sno-ns my-sno -ojson | jq '.metadata.annotations["bmac.agent-install.openshift.io/ignition-config-overrides"]'

  Example output

  "{\"ignition\":{\"version\":\"3.2.0\"},\"storage\":{\"disks\":[{\"device\":\"/dev/disk/by-id/wwn-0x6b07b250ebb9d0002a33509f24af1f62\",\"partitions\":[{\"label\":\"var-lib-containers\",\"sizeMiB\":0,\"startMiB\":250000}],\"wipeTable\":false}],\"filesystems\":[{\"device\":\"/dev/disk/by-partlabel/var-lib-containers\",\"format\":\"xfs\",\"mountOptions\":[\"defaults\",\"prjquota\"],\"path\":\"/var/lib/containers\",\"wipeFilesystem\":true}]},\"systemd\":{\"units\":[{\"contents\":\"# Generated by Butane\\n[Unit]\\nRequires=systemd-fsck@dev-disk-by\\\\x2dpartlabel-var\\\\x2dlib\\\\x2dcontainers.service\\nAfter=systemd-fsck@dev-disk-by\\\\x2dpartlabel-var\\\\x2dlib\\\\x2dcontainers.service\\n\\n[Mount]\\nWhere=/var/lib/containers\\nWhat=/dev/disk/by-partlabel/var-lib-containers\\nType=xfs\\nOptions=defaults,prjquota\\n\\n[Install]\\nRequiredBy=local-fs.target\",\"enabled\":true,\"name\":\"var-lib-containers.mount\"}]}}"
- After installation, check the single-node OpenShift disk status.

  - Enter into a debug session on the single-node OpenShift node by running the following command. This step instantiates a debug pod called <node_name>-debug:

    $ oc debug node/my-sno-node
- Set /host as the root directory within the debug shell by running the following command. The debug pod mounts the host's root file system in /host within the pod. By changing the root directory to /host, you can run binaries contained in the host's executable paths:

    # chroot /host
- List information about all available block devices by running the following command:

    # lsblk
- Display information about the file system disk space usage by running the following command:

    # df -h
 
10.2.9.2. Configuring the image registry using PolicyGenTemplate CRs
						Use PolicyGenTemplate (PGT) CRs to apply the CRs required to configure the image registry and patch the imageregistry configuration.
					
Prerequisites
- You have configured a disk partition in the managed cluster.
- You have installed the OpenShift CLI (oc).
- You have logged in to the hub cluster as a user with cluster-admin privileges.
- You have created a Git repository where you manage your custom site configuration data for use with GitOps Zero Touch Provisioning (ZTP).
Procedure
- Configure the storage class, persistent volume claim, persistent volume, and image registry configuration in the appropriate PolicyGenTemplate CR. For example, to configure an individual site, add the corresponding YAML to the file example-sno-site.yaml. A sketch is shown after the callout descriptions below.
- 1
- Set the appropriate value for ztp-deploy-wave depending on whether you are configuring image registries at the site, common, or group level. ztp-deploy-wave: "100" is suitable for development or testing because it allows you to group the referenced source files together.
- 2
- In ImageRegistryPV.yaml, ensure that the spec.local.path field is set to /var/imageregistry to match the value set for the mount_point field in the SiteConfig CR.
Important

Do not set complianceType: mustonlyhave for the - fileName: ImageRegistryConfig.yaml configuration. This can cause the registry pod deployment to fail.
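A sketch of the sourceFiles entries follows, keyed to callouts 1 and 2 above. The source file names, resource names, and storage size are assumptions based on the ztp-site-generate reference container; verify them against the reference configuration for your release.

sourceFiles:
  # (1) ztp-deploy-wave groups the image registry CRs into a single wave
  - fileName: StorageClass.yaml
    policyName: "sc-for-image-registry"
    metadata:
      name: image-registry-sc
      annotations:
        ran.openshift.io/ztp-deploy-wave: "100"
  - fileName: StoragePVC.yaml
    policyName: "pvc-for-image-registry"
    metadata:
      name: image-registry-pvc
      namespace: openshift-image-registry
      annotations:
        ran.openshift.io/ztp-deploy-wave: "100"
    spec:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 100Gi
      storageClassName: image-registry-sc
      volumeMode: Filesystem
  # (2) The reference ImageRegistryPV.yaml must set spec.local.path to /var/imageregistry
  - fileName: ImageRegistryPV.yaml
    policyName: "pv-for-image-registry"
    metadata:
      annotations:
        ran.openshift.io/ztp-deploy-wave: "100"
  - fileName: ImageRegistryConfig.yaml
    policyName: "config-for-image-registry"
    metadata:
      annotations:
        ran.openshift.io/ztp-deploy-wave: "100"
    spec:
      storage:
        pvc:
          claim: "image-registry-pvc"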
- Commit the PolicyGenTemplate change in Git, and then push to the Git repository being monitored by the GitOps ZTP ArgoCD application.
Verification
Use the following steps to troubleshoot errors with the local image registry on the managed clusters:
- Verify successful login to the registry while logged in to the managed cluster. Run the following commands:

  - Export the managed cluster name:

    $ cluster=<managed_cluster_name>
- Get the managed cluster kubeadmin password:

    $ oc get secret -n $cluster $cluster-admin-password -o jsonpath='{.data.password}' | base64 -d > kubeadmin-password-$cluster
- Download and export the cluster kubeconfig:

    $ oc get secret -n $cluster $cluster-admin-kubeconfig -o jsonpath='{.data.kubeconfig}' | base64 -d > kubeconfig-$cluster && export KUBECONFIG=./kubeconfig-$cluster
- Verify access to the image registry from the managed cluster. See "Accessing the registry".
 
- Check that the instance of the Config CRD in the imageregistry.operator.openshift.io group is not reporting errors. Run the following command while logged in to the managed cluster:

    $ oc get image.config.openshift.io cluster -o yaml
- Check that the PersistentVolumeClaim on the managed cluster is populated with data. Run the following command while logged in to the managed cluster:

    $ oc get pv image-registry-sc
- Check that the registry* pod is running and is located under the openshift-image-registry namespace:

    $ oc get pods -n openshift-image-registry | grep registry*

  Example output

  cluster-image-registry-operator-68f5c9c589-42cfg   1/1   Running   0   8d
  image-registry-5f8987879-6nx6h                     1/1   Running   0   8d
- Check that the disk partition on the managed cluster is correct:

  - Open a debug shell to the managed cluster:

    $ oc debug node/sno-1.example.com
- Run lsblk to check the host disk partitions. A partition mounted at /var/imageregistry in the output indicates that the disk is correctly partitioned.
 
 
10.3. Updating managed clusters in a disconnected environment with PolicyGenTemplate resources and TALM
You can use the Topology Aware Lifecycle Manager (TALM) to manage the software lifecycle of managed clusters that you have deployed by using GitOps Zero Touch Provisioning (ZTP). TALM uses Red Hat Advanced Cluster Management (RHACM) PolicyGenTemplate policies to manage and control changes applied to target clusters.
					Using PolicyGenTemplate CRs to manage and deploy policies to managed clusters will be deprecated in an upcoming OpenShift Container Platform release. Equivalent and improved functionality is available using Red Hat Advanced Cluster Management (RHACM) and PolicyGenerator CRs.
				
					For more information about PolicyGenerator resources, see the RHACM Integrating Policy Generator documentation.
				
10.3.1. Setting up the disconnected environment
TALM can perform both platform and Operator updates.
Before you can use TALM to update your disconnected clusters, you must mirror the platform image and the Operator images for the target update versions to your mirror registry. Complete the following steps to mirror the images:
- For platform updates, you must perform the following steps:

  - Mirror the desired OpenShift Container Platform image repository. Ensure that the desired platform image is mirrored by following the "Mirroring the OpenShift Container Platform image repository" procedure linked in the Additional resources. Save the contents of the imageContentSources section in the imageContentSources.yaml file. A sketch of the file follows this step.
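    A sketch of the saved file follows. The mirror registry host, port, and repository path are placeholders; use the values reported by your own mirroring procedure.

    imageContentSources:
    - mirrors:
      # Local mirror that serves the release payload (placeholder location)
      - <mirror_registry>:<port>/ocp4/openshift4
      source: quay.io/openshift-release-dev/ocp-release
    - mirrors:
      - <mirror_registry>:<port>/ocp4/openshift4
      source: quay.io/openshift-release-dev/ocp-v4.0-art-dev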
- Save the image signature of the desired platform image that was mirrored. You must add the image signature to the PolicyGenTemplate CR for platform updates. To get the image signature, perform the following steps:

  - Specify the desired OpenShift Container Platform tag by running the following command:

    $ OCP_RELEASE_NUMBER=<release_version>
- Specify the architecture of the cluster by running the following command:

    $ ARCHITECTURE=<cluster_architecture>
- 1
- Specify the architecture of the cluster, such as x86_64, aarch64, s390x, or ppc64le.
 
- Get the release image digest from Quay by running the following command:

    $ DIGEST="$(oc adm release info quay.io/openshift-release-dev/ocp-release:${OCP_RELEASE_NUMBER}-${ARCHITECTURE} | sed -n 's/Pull From: .*@//p')"
- Set the digest algorithm by running the following command:

    $ DIGEST_ALGO="${DIGEST%%:*}"
- Set the encoded digest value by running the following command:

    $ DIGEST_ENCODED="${DIGEST#*:}"
- Get the image signature from the mirror.openshift.com website by running the following command:

    $ SIGNATURE_BASE64=$(curl -s "https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release/${DIGEST_ALGO}=${DIGEST_ENCODED}/signature-1" | base64 -w0 && echo)
- Save the image signature to the checksum-<OCP_RELEASE_NUMBER>.yaml file by running the following commands:

    $ cat >checksum-${OCP_RELEASE_NUMBER}.yaml <<EOF
    ${DIGEST_ALGO}-${DIGEST_ENCODED}: ${SIGNATURE_BASE64}
    EOF
 
- Prepare the update graph. You have two options to prepare the update graph: - Use the OpenShift Update Service. - For more information about how to set up the graph on the hub cluster, see Deploy the operator for OpenShift Update Service and Build the graph data init container. 
- Make a local copy of the upstream graph. Host the update graph on an http or https server in the disconnected environment that has access to the managed cluster. To download the update graph, use the following command:

    $ curl -s https://api.openshift.com/api/upgrades_info/v1/graph?channel=stable-4.18 -o ~/upgrade-graph_stable-4.18
 
 
- For Operator updates, you must perform the following task: - Mirror the Operator catalogs. Ensure that the desired operator images are mirrored by following the procedure in the "Mirroring Operator catalogs for use with disconnected clusters" section.
 
10.3.2. Performing a platform update with PolicyGenTemplate CRs
You can perform a platform update with the TALM.
Prerequisites
- Install the Topology Aware Lifecycle Manager (TALM).
- Update GitOps Zero Touch Provisioning (ZTP) to the latest version.
- Provision one or more managed clusters with GitOps ZTP.
- Mirror the desired image repository.
- Log in as a user with cluster-admin privileges.
- Create RHACM policies in the hub cluster.
Procedure
- Create a PolicyGenTemplate CR for the platform update:

  - Save the PolicyGenTemplate CR for the platform update in the du-upgrade.yaml file. A sketch of the CR is shown after the callout descriptions below.
- 1
- The ConfigMap CR contains the signature of the desired release image to update to.
- 2
- Shows the image signature of the desired OpenShift Container Platform release. Get the signature from the checksum-${OCP_RELEASE_NUMBER}.yaml file you saved when following the procedures in the "Setting up the disconnected environment" section.
- 3
- Shows the mirror repository that contains the desired OpenShift Container Platform image. Get the mirrors from the imageContentSources.yaml file that you saved when following the procedures in the "Setting up the disconnected environment" section.
- 4
- Shows the ClusterVersion CR to trigger the update. The channel, upstream, and desiredVersion fields are all required for image pre-caching.
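The following is a sketch of the shape the PolicyGenTemplate can take, aligned with the callouts above. The source file names (ImageSignature.yaml, DisconnectedICSP.yaml, ClusterVersion.yaml), the binding rule, and all version, registry, and URL values are assumptions; take the authoritative entries from the ztp-site-generate reference container for your release.

apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: "du-upgrade"
  namespace: "ztp-group-du-sno"
spec:
  bindingRules:
    group-du-sno: ""
  mcp: "master"
  remediationAction: inform
  sourceFiles:
    # (1) (2) ConfigMap carrying the release image signature saved earlier
    - fileName: ImageSignature.yaml
      policyName: "platform-upgrade-prep"
      binaryData:
        ${DIGEST_ALGO}-${DIGEST_ENCODED}: ${SIGNATURE_BASE64}
    # (3) Mirror repository that contains the desired release image
    - fileName: DisconnectedICSP.yaml
      policyName: "platform-upgrade-prep"
      metadata:
        name: disconnected-internal-icsp-for-ocp
      spec:
        repositoryDigestMirrors:
          - mirrors:
            - <mirror_registry>:<port>/ocp4/openshift4
            source: quay.io/openshift-release-dev/ocp-release
    # (4) ClusterVersion CR that triggers the update; channel, upstream, and the
    # desired version are required for image pre-caching
    - fileName: ClusterVersion.yaml
      policyName: "platform-upgrade"
      metadata:
        name: version
      spec:
        channel: "stable-4.18"
        upstream: http://<graph_server>:<port>/upgrade-graph_stable-4.18
        desiredUpdate:
          version: 4.18.4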
  The PolicyGenTemplate CR generates two policies:

  - The du-upgrade-platform-upgrade-prep policy does the preparation work for the platform update. It creates the ConfigMap CR for the desired release image signature, creates the image content source of the mirrored release image repository, and updates the cluster version with the desired update channel and the update graph reachable by the managed cluster in the disconnected environment.
  - The du-upgrade-platform-upgrade policy is used to perform the platform upgrade.
 
- Add the du-upgrade.yaml file contents to the kustomization.yaml file located in the GitOps ZTP Git repository for the PolicyGenTemplate CRs and push the changes to the Git repository.

  ArgoCD pulls the changes from the Git repository and generates the policies on the hub cluster.
- Check the created policies by running the following command:

    $ oc get policies -A | grep platform-upgrade
 
- Create the ClusterGroupUpgrade CR for the platform update with the spec.enable field set to false.

  - Save the content of the platform update ClusterGroupUpgrade CR with the du-upgrade-platform-upgrade-prep and the du-upgrade-platform-upgrade policies and the target clusters to the cgu-platform-upgrade.yml file, as shown in the sketch after this step.
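    A minimal sketch of the cgu-platform-upgrade.yml file follows. The cluster name and remediation settings are placeholders.

    apiVersion: ran.openshift.io/v1alpha1
    kind: ClusterGroupUpgrade
    metadata:
      name: cgu-platform-upgrade
      namespace: default
    spec:
      managedPolicies:
      # Policies generated by the du-upgrade PolicyGenTemplate CR
      - du-upgrade-platform-upgrade-prep
      - du-upgrade-platform-upgrade
      preCaching: false
      clusters:
      - spoke1
      remediationStrategy:
        maxConcurrency: 1
      # The update does not start until enable is set to true
      enable: false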
- Apply the ClusterGroupUpgrade CR to the hub cluster by running the following command:

    $ oc apply -f cgu-platform-upgrade.yml
 
- Optional: Pre-cache the images for the platform update.

  - Enable pre-caching in the ClusterGroupUpgrade CR by running the following command:

    $ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-platform-upgrade \
      --patch '{"spec":{"preCaching": true}}' --type=merge
- Monitor the update process and wait for the pre-caching to complete. Check the status of pre-caching by running the following command on the hub cluster:

    $ oc get cgu cgu-platform-upgrade -o jsonpath='{.status.precaching.status}'
 
- Start the platform update:

  - Enable the cgu-platform-upgrade ClusterGroupUpgrade CR and disable pre-caching by running the following command:

    $ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-platform-upgrade \
      --patch '{"spec":{"enable":true, "preCaching": false}}' --type=merge
- Monitor the process. Upon completion, ensure that the policy is compliant by running the following command:

    $ oc get policies --all-namespaces
 
10.3.3. Performing an Operator update with PolicyGenTemplate CRs
You can perform an Operator update with the TALM.
Prerequisites
- Install the Topology Aware Lifecycle Manager (TALM).
- Update GitOps Zero Touch Provisioning (ZTP) to the latest version.
- Provision one or more managed clusters with GitOps ZTP.
- Mirror the desired index image, bundle images, and all Operator images referenced in the bundle images.
- Log in as a user with cluster-admin privileges.
- Create RHACM policies in the hub cluster.
Procedure
- Update the PolicyGenTemplate CR for the Operator update.

  - Update the du-upgrade PolicyGenTemplate CR with additional contents in the du-upgrade.yaml file. A sketch of the catalog source entry is shown after the callout descriptions below.
- 1
- The index image URL contains the desired Operator images. If the index images are always pushed to the same image name and tag, this change is not needed.
- 2
- Set how frequently the Operator Lifecycle Manager (OLM) polls the index image for new Operator versions with the registryPoll.interval field. This change is not needed if a new index image tag is always pushed for y-stream and z-stream Operator updates. The registryPoll.interval field can be set to a shorter interval to expedite the update; however, shorter intervals increase computational load. To counteract this behavior, you can restore registryPoll.interval to the default value once the update is complete.
- 3
- Last observed state of the catalog connection. The READY value ensures that the CatalogSource policy is ready, indicating that the index pod is pulled and is running. This way, TALM upgrades the Operators based on up-to-date policy compliance states.
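A sketch of the additional sourceFiles entry follows, aligned with the callouts above. The registry location, image tag, and polling interval are placeholders, and DefaultCatsrc.yaml is assumed to be the reference source file for the catalog source.

spec:
  sourceFiles:
    - fileName: DefaultCatsrc.yaml
      remediationAction: inform
      policyName: "operator-catsrc-policy"
      metadata:
        name: redhat-operators-disconnected
      spec:
        displayName: Red Hat Operators Catalog
        # (1) Index image that contains the desired Operator images
        image: <mirror_registry>:<port>/olm/redhat-operators-disconnected:v4.18
        updateStrategy:
          registryPoll:
            # (2) How frequently OLM polls the index image for new Operator versions
            interval: 1h
      status:
        connectionState:
          # (3) READY ensures the index pod is pulled and running before TALM proceeds
          lastObservedState: READY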
 
- This update generates one policy, du-upgrade-operator-catsrc-policy, to update the redhat-operators-disconnected catalog source with the new index images that contain the desired Operator images.

  Note

  If you want to use image pre-caching for Operators and there are Operators from a catalog source other than redhat-operators-disconnected, you must perform the following tasks:

  - Prepare a separate catalog source policy with the new index image or registry poll interval update for the different catalog source.
- Prepare a separate subscription policy for the desired Operators that are from the different catalog source.
- For example, the desired SRIOV-FEC Operator is available in the certified-operators catalog source. To update the catalog source and the Operator subscription, add the corresponding contents to generate two policies, du-upgrade-fec-catsrc-policy and du-upgrade-subscriptions-fec-policy.
- Remove the specified subscription channels in the common PolicyGenTemplate CR, if they exist. The default subscription channels from the GitOps ZTP image are used for the update.

  Note

  The default channel for the Operators applied through GitOps ZTP 4.18 is stable, except for the performance-addon-operator. As of OpenShift Container Platform 4.11, the performance-addon-operator functionality was moved to the node-tuning-operator. For the 4.10 release, the default channel for PAO is v4.10. You can also specify the default channels in the common PolicyGenTemplate CR.
- Push the PolicyGenTemplate CR updates to the GitOps ZTP Git repository.

  ArgoCD pulls the changes from the Git repository and generates the policies on the hub cluster.
- Check the created policies by running the following command:

    $ oc get policies -A | grep -E "catsrc-policy|subscription"
 
- Apply the required catalog source updates before starting the Operator update.

  - Save the content of the ClusterGroupUpgrade CR named cgu-operator-upgrade-prep with the catalog source policies and the target managed clusters to the cgu-operator-upgrade-prep.yml file. A sketch follows this step.
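    A minimal sketch of the cgu-operator-upgrade-prep.yml file follows. Setting enable to true applies the catalog source policy immediately; the cluster name and remediation settings are placeholders.

    apiVersion: ran.openshift.io/v1alpha1
    kind: ClusterGroupUpgrade
    metadata:
      name: cgu-operator-upgrade-prep
      namespace: default
    spec:
      clusters:
      - spoke1
      enable: true
      managedPolicies:
      # Policy that updates the catalog source with the new index image
      - du-upgrade-operator-catsrc-policy
      remediationStrategy:
        maxConcurrency: 1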
- Apply the policy to the hub cluster by running the following command:

    $ oc apply -f cgu-operator-upgrade-prep.yml
- Monitor the update process. Upon completion, ensure that the policy is compliant by running the following command:

    $ oc get policies -A | grep -E "catsrc-policy"
 
- Create the ClusterGroupUpgrade CR for the Operator update with the spec.enable field set to false.

  - Save the content of the Operator update ClusterGroupUpgrade CR with the du-upgrade-operator-catsrc-policy policy and the subscription policies created from the common PolicyGenTemplate and the target clusters to the cgu-operator-upgrade.yml file. A sketch is shown after the callout descriptions and note below.
- 1
- The policy is needed by the image pre-caching feature to retrieve the operator images from the catalog source.
- 2
- The policy contains Operator subscriptions. If you have followed the structure and content of the reference PolicyGenTemplates, all Operator subscriptions are grouped into the common-subscriptions-policy policy.
Note

One ClusterGroupUpgrade CR can only pre-cache the images of the desired Operators defined in the subscription policy from one catalog source included in the ClusterGroupUpgrade CR. If the desired Operators are from different catalog sources, such as in the example of the SRIOV-FEC Operator, another ClusterGroupUpgrade CR must be created with the du-upgrade-fec-catsrc-policy and du-upgrade-subscriptions-fec-policy policies for the SRIOV-FEC Operator image pre-caching and update.
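A minimal sketch of the cgu-operator-upgrade.yml file follows, keyed to callouts 1 and 2 above. The cluster name and remediation settings are placeholders.

apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: cgu-operator-upgrade
  namespace: default
spec:
  managedPolicies:
  # (1) Catalog source policy needed by pre-caching to locate the Operator images
  - du-upgrade-operator-catsrc-policy
  # (2) Policy that contains the Operator subscriptions
  - common-subscriptions-policy
  preCaching: false
  clusters:
  - spoke1
  remediationStrategy:
    maxConcurrency: 1
  enable: false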
- Apply the ClusterGroupUpgrade CR to the hub cluster by running the following command:

    $ oc apply -f cgu-operator-upgrade.yml
 
- Optional: Pre-cache the images for the Operator update.

  - Before starting image pre-caching, verify that the subscription policy is NonCompliant by running the following command:

    $ oc get policy common-subscriptions-policy -n <policy_namespace>

  Example output

  NAME                          REMEDIATION ACTION   COMPLIANCE STATE   AGE
  common-subscriptions-policy   inform               NonCompliant       27d
- Enable pre-caching in the ClusterGroupUpgrade CR by running the following command:

    $ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-operator-upgrade \
      --patch '{"spec":{"preCaching": true}}' --type=merge
- Monitor the process and wait for the pre-caching to complete. Check the status of pre-caching by running the following command on the hub cluster:

    $ oc get cgu cgu-operator-upgrade -o jsonpath='{.status.precaching.status}'
- Check if the pre-caching is completed before starting the update by running the following command:

    $ oc get cgu -n default cgu-operator-upgrade -ojsonpath='{.status.conditions}' | jq
 
- Start the Operator update.

  - Enable the cgu-operator-upgrade ClusterGroupUpgrade CR and disable pre-caching to start the Operator update by running the following command:

    $ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-operator-upgrade \
      --patch '{"spec":{"enable":true, "preCaching": false}}' --type=merge
- Monitor the process. Upon completion, ensure that the policy is compliant by running the following command:

    $ oc get policies --all-namespaces
 
10.3.4. Troubleshooting missed Operator updates with PolicyGenTemplate CRs
In some scenarios, Topology Aware Lifecycle Manager (TALM) might miss Operator updates due to an out-of-date policy compliance state.
After a catalog source update, it takes time for the Operator Lifecycle Manager (OLM) to update the subscription status. The status of the subscription policy might continue to show as compliant while TALM decides whether remediation is needed. As a result, the Operator specified in the subscription policy does not get upgraded.
					To avoid this scenario, add another catalog source configuration to the PolicyGenTemplate and specify this configuration in the subscription for any Operators that require an update.
				
Procedure
- Add a catalog source configuration in the PolicyGenTemplate resource, for example, as shown in the sketch after this step.
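  A sketch of the additional catalog source entry follows. The catalog source name redhat-operators-v2, the registry location, and the image tag are illustrative placeholders, and DefaultCatsrc.yaml is assumed to be the reference source file for the catalog source.

  - fileName: DefaultCatsrc.yaml
    remediationAction: inform
    policyName: "operator-catsrc-policy"
    metadata:
      # New catalog source name referenced by the updated subscription
      name: redhat-operators-v2
    spec:
      displayName: Red Hat Operators v2
      image: <mirror_registry>:<port>/olm/redhat-operators-disconnected:<new_tag>
      updateStrategy:
        registryPoll:
          interval: 1h
    status:
      connectionState:
        lastObservedState: READY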
- Update the Subscription resource to point to the new configuration for Operators that require an update. A sketch is shown after the callout description below.
- 1
- Enter the name of the additional catalog source configuration that you defined in the PolicyGenTemplate resource.
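A sketch of the updated Subscription resource follows. The subscription name, namespace, Operator package name, and channel are illustrative placeholders.

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: operator-subscription
  namespace: operator-namespace
spec:
  channel: "stable"
  name: operator-package-name
  # (1) Name of the additional catalog source defined in the PolicyGenTemplate resource
  source: redhat-operators-v2
  sourceNamespace: openshift-marketplace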
 
10.3.5. Performing a platform and an Operator update together
You can perform a platform and an Operator update at the same time.
Prerequisites
- Install the Topology Aware Lifecycle Manager (TALM).
- Update GitOps Zero Touch Provisioning (ZTP) to the latest version.
- Provision one or more managed clusters with GitOps ZTP.
- Log in as a user with cluster-admin privileges.
- Create RHACM policies in the hub cluster.
Procedure
- Create the PolicyGenTemplate CR for the updates by following the steps described in the "Performing a platform update" and "Performing an Operator update" sections.
- Apply the prep work for the platform and the Operator update.

  - Save the content of the ClusterGroupUpgrade CR with the policies for platform update preparation work, catalog source updates, and target clusters to the cgu-platform-operator-upgrade-prep.yml file. A sketch follows this step.
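    A minimal sketch of the cgu-platform-operator-upgrade-prep.yml file follows. Setting enable to true applies the preparation policies immediately; the cluster name and remediation settings are placeholders.

    apiVersion: ran.openshift.io/v1alpha1
    kind: ClusterGroupUpgrade
    metadata:
      name: cgu-platform-operator-upgrade-prep
      namespace: default
    spec:
      managedPolicies:
      # Platform update preparation and catalog source update policies
      - du-upgrade-platform-upgrade-prep
      - du-upgrade-operator-catsrc-policy
      clusters:
      - spoke1
      remediationStrategy:
        maxConcurrency: 1
      enable: true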
- Apply the cgu-platform-operator-upgrade-prep.yml file to the hub cluster by running the following command:

    $ oc apply -f cgu-platform-operator-upgrade-prep.yml
- Monitor the process. Upon completion, ensure that the policy is compliant by running the following command:

    $ oc get policies --all-namespaces
 
- Create the ClusterGroupUpgrade CR for the platform and the Operator update with the spec.enable field set to false.

  - Save the contents of the platform and Operator update ClusterGroupUpgrade CR with the policies and the target clusters to the cgu-platform-operator-upgrade.yml file, as shown in the sketch after this step.
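    A minimal sketch of the cgu-platform-operator-upgrade.yml file follows. The CR name cgu-du-upgrade matches the name used in the later patch commands; the cluster name and remediation settings are placeholders.

    apiVersion: ran.openshift.io/v1alpha1
    kind: ClusterGroupUpgrade
    metadata:
      name: cgu-du-upgrade
      namespace: default
    spec:
      managedPolicies:
      # Platform update policy plus the Operator catalog source and subscription policies
      - du-upgrade-platform-upgrade
      - du-upgrade-operator-catsrc-policy
      - common-subscriptions-policy
      preCaching: false
      clusters:
      - spoke1
      remediationStrategy:
        maxConcurrency: 1
      enable: false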
- Apply the cgu-platform-operator-upgrade.yml file to the hub cluster by running the following command:

    $ oc apply -f cgu-platform-operator-upgrade.yml
 
- Optional: Pre-cache the images for the platform and the Operator update.

  - Enable pre-caching in the ClusterGroupUpgrade CR by running the following command:

    $ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-du-upgrade \
      --patch '{"spec":{"preCaching": true}}' --type=merge
- Monitor the update process and wait for the pre-caching to complete. Check the status of pre-caching by running the following command on the managed cluster:

    $ oc get jobs,pods -n openshift-talm-pre-cache
- Check if the pre-caching is completed before starting the update by running the following command:

    $ oc get cgu cgu-du-upgrade -ojsonpath='{.status.conditions}'
 
- Start the platform and Operator update.

  - Enable the cgu-du-upgrade ClusterGroupUpgrade CR to start the platform and the Operator update by running the following command:

    $ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-du-upgrade \
      --patch '{"spec":{"enable":true, "preCaching": false}}' --type=merge
- Monitor the process. Upon completion, ensure that the policy is compliant by running the following command:

    $ oc get policies --all-namespaces

  Note

  The CRs for the platform and Operator updates can be created from the beginning with spec.enable set to true. In this case, the update starts immediately after pre-caching completes and there is no need to manually enable the CR.

  Both pre-caching and the update create extra resources, such as policies, placement bindings, placement rules, managed cluster actions, and managed cluster views, to help complete the procedures. Setting the afterCompletion.deleteObjects field to true deletes all these resources after the updates complete.
 
10.3.6. Removing Performance Addon Operator subscriptions from deployed clusters with PolicyGenTemplate CRs
In earlier versions of OpenShift Container Platform, the Performance Addon Operator provided automatic, low latency performance tuning for applications. In OpenShift Container Platform 4.11 or later, these functions are part of the Node Tuning Operator.
Do not install the Performance Addon Operator on clusters running OpenShift Container Platform 4.11 or later. If you upgrade to OpenShift Container Platform 4.11 or later, the Node Tuning Operator automatically removes the Performance Addon Operator.
You need to remove any policies that create Performance Addon Operator subscriptions to prevent a re-installation of the Operator.
					The reference DU profile includes the Performance Addon Operator in the PolicyGenTemplate CR common-ranGen.yaml. To remove the subscription from deployed managed clusters, you must update common-ranGen.yaml.
				
If you install Performance Addon Operator 4.10.3-5 or later on OpenShift Container Platform 4.11 or later, the Performance Addon Operator detects the cluster version and automatically hibernates to avoid interfering with the Node Tuning Operator functions. However, to ensure best performance, remove the Performance Addon Operator from your OpenShift Container Platform 4.11 clusters.
Prerequisites
- Create a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for ArgoCD.
- Update to OpenShift Container Platform 4.11 or later.
- Log in as a user with cluster-admin privileges.
Procedure
- Change the complianceType to mustnothave for the Performance Addon Operator namespace, Operator group, and subscription in the common-ranGen.yaml file. A sketch follows this step.
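  A sketch of the updated entries follows. The source file names for the Performance Addon Operator namespace, Operator group, and subscription are assumptions based on the reference configuration; keep the policyName that your deployment already uses.

  - fileName: PaoSubscriptionNS.yaml
    policyName: "subscriptions-policy"
    complianceType: mustnothave
  - fileName: PaoSubscriptionOperGroup.yaml
    policyName: "subscriptions-policy"
    complianceType: mustnothave
  - fileName: PaoSubscription.yaml
    policyName: "subscriptions-policy"
    complianceType: mustnothave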
- Merge the changes with your custom site repository and wait for the ArgoCD application to synchronize the change to the hub cluster. The status of the common-subscriptions-policy policy changes to NonCompliant.
- Apply the change to your target clusters by using the Topology Aware Lifecycle Manager. For more information about rolling out configuration changes, see the "Additional resources" section.
- Monitor the process. When the status of the common-subscriptions-policy policy for a target cluster is Compliant, the Performance Addon Operator has been removed from the cluster. Get the status of the common-subscriptions-policy by running the following command:

    $ oc get policy -n ztp-common common-subscriptions-policy
- Delete the Performance Addon Operator namespace, Operator group, and subscription CRs from spec.sourceFiles in the common-ranGen.yaml file.
- Merge the changes with your custom site repository and wait for the ArgoCD application to synchronize the change to the hub cluster. The policy remains compliant.
10.3.7. Pre-caching user-specified images with TALM on single-node OpenShift clusters
You can pre-cache application-specific workload images on single-node OpenShift clusters before upgrading your applications.
You can specify the configuration options for the pre-caching jobs using the following custom resources (CR):
- PreCachingConfig CR
- ClusterGroupUpgrade CR
						All fields in the PreCachingConfig CR are optional.
					
Example PreCachingConfig CR
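The following is a sketch of a PreCachingConfig CR, keyed to the callouts below. The resource names, image references, Operator packages, and disk space value are illustrative placeholders, and the field layout under spec.overrides should be verified against the TALM version you use.

apiVersion: ran.openshift.io/v1alpha1
kind: PreCachingConfig
metadata:
  name: exampleconfig
  namespace: exampleconfig-ns
spec:
  overrides:                                             # (1)
    platformImage: quay.io/openshift-release-dev/ocp-release@sha256:<digest>
    operatorsIndexes:
      - registry.example.com:5000/custom-redhat-operators:1.0.0
    operatorsPackagesAndChannels:
      - local-storage-operator: stable
      - ptp-operator: stable
      - sriov-network-operator: stable
  spaceRequired: 30 GiB                                  # (2)
  excludePrecachePatterns:                               # (3)
    - aws
    - vsphere
  additionalImages:                                      # (4)
    - quay.io/example-org/application@sha256:<digest>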
- 1
- By default, TALM automatically populates the platformImage, operatorsIndexes, and operatorsPackagesAndChannels fields from the policies of the managed clusters. You can specify values to override the default TALM-derived values for these fields.
- 2
- Specifies the minimum required disk space on the cluster. If unspecified, TALM defines a default value for OpenShift Container Platform images. The disk space field must include an integer value and the storage unit. For example: 40 GiB, 200 MB, 1 TiB.
- 3
- Specifies the images to exclude from pre-caching based on image name matching.
- 4
- Specifies the list of additional images to pre-cache.
Example ClusterGroupUpgrade CR with PreCachingConfig CR reference
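The following is a sketch of the ClusterGroupUpgrade fields that reference a PreCachingConfig CR. The CR names and namespaces are placeholders, and the preCachingConfigRef field name should be verified against the TALM version you use.

apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: cgu
spec:
  # Pre-caching must be enabled for the referenced configuration to take effect
  preCaching: true
  preCachingConfigRef:
    # Name and namespace of the PreCachingConfig CR
    name: exampleconfig
    namespace: exampleconfig-ns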
10.3.7.1. Creating the custom resources for pre-caching
						You must create the PreCachingConfig CR before or concurrently with the ClusterGroupUpgrade CR.
					
- Create the PreCachingConfig CR with the list of additional images that you want to pre-cache.
- Create a ClusterGroupUpgrade CR with the preCaching field set to true and specify the PreCachingConfig CR created in the previous step.

  Warning

  Once you install the images on the cluster, you cannot change or delete them.
- When you want to start pre-caching the images, apply the ClusterGroupUpgrade CR by running the following command:

    $ oc apply -f cgu.yaml
						TALM verifies the ClusterGroupUpgrade CR.
					
From this point, you can continue with the TALM pre-caching workflow.
All sites are pre-cached concurrently.
Verification
- Check the pre-caching status on the hub cluster where the ClusterGroupUpgrade CR is applied by running the following command:

    $ oc get cgu <cgu_name> -n <cgu_namespace> -oyaml

  The pre-caching configurations are validated by checking if the managed policies exist. Valid configurations of the ClusterGroupUpgrade and the PreCachingConfig CRs result in a PrecacheSpecValid condition with a status of True. An invalid PreCachingConfig CR results in output similar to the following:

  Example of an invalid PreCachingConfig CR

  Type:    "PrecacheSpecValid"
  Status:  False,
  Reason:  "PrecacheSpecIncomplete"
  Message: "Precaching spec is incomplete: failed to get PreCachingConfig resource due to PreCachingConfig.ran.openshift.io "<pre-caching_cr_name>" not found"
- You can find the pre-caching job by running the following command on the managed cluster:

    $ oc get jobs -n openshift-talo-pre-cache

  Example of pre-caching job in progress

  NAME        COMPLETIONS   DURATION   AGE
  pre-cache   0/1           1s         1s
- You can check the status of the pod created for the pre-caching job by running the following command:

    $ oc describe pod pre-cache -n openshift-talo-pre-cache

  Example of pre-caching job in progress

  Type     Reason            Age   From             Message
  Normal   SuccesfulCreate   19s   job-controller   Created pod: pre-cache-abcd1
- You can get live updates on the status of the job by running the following command:

    $ oc logs -f pre-cache-abcd1 -n openshift-talo-pre-cache
- To verify the pre-cache job is successfully completed, run the following command:

    $ oc describe pod pre-cache -n openshift-talo-pre-cache

  Example of completed pre-cache job

  Type     Reason            Age     From             Message
  Normal   SuccesfulCreate   5m19s   job-controller   Created pod: pre-cache-abcd1
  Normal   Completed         19s     job-controller   Job completed
- To verify that the images are successfully pre-cached on the single-node OpenShift, do the following:

  - Enter into the node in debug mode:

    $ oc debug node/cnfdf00.example.lab
- Change the root directory to /host:

    $ chroot /host/
- Search for the desired images:

    $ sudo podman images | grep <operator_name>
 
10.3.8. About the auto-created ClusterGroupUpgrade CR for GitOps ZTP
					TALM has a controller called ManagedClusterForCGU that monitors the Ready state of the ManagedCluster CRs on the hub cluster and creates the ClusterGroupUpgrade CRs for GitOps Zero Touch Provisioning (ZTP).
				
					For any managed cluster in the Ready state without a ztp-done label applied, the ManagedClusterForCGU controller automatically creates a ClusterGroupUpgrade CR in the ztp-install namespace with its associated RHACM policies that are created during the GitOps ZTP process. TALM then remediates the set of configuration policies that are listed in the auto-created ClusterGroupUpgrade CR to push the configuration CRs to the managed cluster.
				
					If there are no policies for the managed cluster at the time when the cluster becomes Ready, a ClusterGroupUpgrade CR with no policies is created. Upon completion of the ClusterGroupUpgrade the managed cluster is labeled as ztp-done. If there are policies that you want to apply for that managed cluster, manually create a ClusterGroupUpgrade as a day-2 operation.
				
Example of an auto-created ClusterGroupUpgrade CR for GitOps ZTP
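The following is a sketch of the shape an auto-created ClusterGroupUpgrade CR can take. The cluster name, policy list, and remediation settings are illustrative; the actual CR is generated by the ManagedClusterForCGU controller from the policies created during the GitOps ZTP process.

apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: spoke1
  namespace: ztp-install
spec:
  actions:
    afterCompletion:
      # Applies the ztp-done label and removes helper resources when remediation finishes
      addClusterLabels:
        ztp-done: ""
      deleteObjects: true
  clusters:
  - spoke1
  enable: true
  managedPolicies:
  - common-config-policy
  - common-subscriptions-policy
  - group-du-sno-config-policy
  remediationStrategy:
    maxConcurrency: 1
    timeout: 240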