Chapter 9. Managing cluster policies with PolicyGenerator resources
9.1. Configuring managed cluster policies by using PolicyGenerator resources
You can customize how Red Hat Advanced Cluster Management (RHACM) uses PolicyGenerator CRs to generate Policy CRs that configure the managed clusters that you provision.
Using RHACM and PolicyGenerator CRs is the recommended approach for managing policies and deploying them to managed clusters. This replaces the use of PolicyGenTemplate CRs for this purpose. For more information about PolicyGenerator resources, see the RHACM Policy Generator documentation.
9.1.1. Comparing RHACM PolicyGenerator and PolicyGenTemplate resource patching
PolicyGenerator custom resources (CRs) and PolicyGenTemplate CRs can be used in GitOps ZTP to generate RHACM policies for managed clusters.
There are advantages to using PolicyGenerator CRs over PolicyGenTemplate CRs when it comes to patching OpenShift Container Platform resources with GitOps ZTP. Using the RHACM PolicyGenerator API provides a generic way of patching resources which is not possible with PolicyGenTemplate resources.
The PolicyGenerator API is a part of the Open Cluster Management standard, while the PolicyGenTemplate API is not. A comparison of PolicyGenerator and PolicyGenTemplate resource patching and placement strategies is described in the following table.
Using PolicyGenTemplate CRs to manage and deploy policies to managed clusters will be deprecated in an upcoming OpenShift Container Platform release. Equivalent and improved functionality is available using Red Hat Advanced Cluster Management (RHACM) and PolicyGenerator CRs.
For more information about PolicyGenerator resources, see the RHACM Integrating Policy Generator documentation.
| PolicyGenerator patching | PolicyGenTemplate patching |
|---|---|
| Uses Kustomize strategic merges for merging resources. For more information see Declarative Management of Kubernetes Objects Using Kustomize. | Works by replacing variables with their values as defined by the patch. This is less flexible than Kustomize merge strategies. |
| Supports … | Does not support … |
| Relies only on patching, no embedded variable substitution is required. | Overwrites variable values defined in the patch. |
| Does not support merging lists in merge patches. Replacing a list in a merge patch is supported. | Merging and replacing lists is supported in a limited fashion - you can only merge one object in the list. |
| Does not currently support the OpenAPI specification for resource patching. This means that additional directives are required in the patch to merge content that does not follow a schema, for example, … | Works by replacing fields and values with values as defined by the patch. |
| Requires additional directives, for example, … | Substitutes fields and values defined in the source CR with values defined in the patch, for example … |
| Can patch the … | Can patch the … |
9.1.2. About the PolicyGenerator CRD
The PolicyGenerator custom resource definition (CRD) tells the PolicyGen policy generator what custom resources (CRs) to include in the cluster configuration, how to combine the CRs into the generated policies, and what items in those CRs need to be updated with overlay content.
The following example shows a PolicyGenerator CR (acm-common-ranGen.yaml) extracted from the ztp-site-generate reference container. The acm-common-ranGen.yaml file defines two Red Hat Advanced Cluster Management (RHACM) policies. The policies manage a collection of configuration CRs, one for each unique value of policyName in the CR. acm-common-ranGen.yaml creates a single placement binding and a placement rule to bind the policies to clusters based on the labels listed in the policyDefaults.placement.labelSelector section.
Example PolicyGenerator CR - acm-common-ranGen.yaml
A PolicyGenerator CR can be constructed with any number of included CRs. Apply the following example CR in the hub cluster to generate a policy containing a single CR:
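For illustration, a minimal sketch of such a single-CR PolicyGenerator follows. It uses the policy and CR names described below; the namespace, label selector, and binding names are assumptions that you adapt to your environment:

apiVersion: policy.open-cluster-management.io/v1
kind: PolicyGenerator
metadata:
  name: group-du-sno                         # assumed generator name for this sketch
placementBindingDefaults:
  name: group-du-sno-placement-binding
policyDefaults:
  namespace: ztp-group                       # assumed policy namespace
  remediationAction: inform
  severity: low
  placement:
    labelSelector:
      matchExpressions:
        - key: group-du-sno
          operator: Exists
policies:
  - name: group-du-sno-config-policy
    manifests:
      - path: source-crs/PtpConfigSlave.yaml
        patches:
          - metadata:
              name: du-ptp-slave             # name given to the PtpConfig CR in the generated policy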
Using the source file PtpConfigSlave.yaml as an example, the file defines a PtpConfig CR. The generated policy for the PtpConfigSlave example is named group-du-sno-config-policy. The PtpConfig CR defined in the generated group-du-sno-config-policy is named du-ptp-slave. The spec defined in PtpConfigSlave.yaml is placed under du-ptp-slave along with the other spec items defined under the source file.
The following example shows the group-du-sno-config-policy CR:
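The generated policy wraps the configuration CR in an RHACM Policy and ConfigurationPolicy. The following abbreviated sketch shows only the general shape; the names, namespace, evaluation intervals, and remaining PtpConfig spec fields are assumptions and the full generated output contains additional metadata:

apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: group-du-sno.group-du-sno-config-policy   # assumed <generator>.<policyName> format
  namespace: ztp-group
spec:
  disabled: false
  remediationAction: inform
  policy-templates:
    - objectDefinition:
        apiVersion: policy.open-cluster-management.io/v1
        kind: ConfigurationPolicy
        metadata:
          name: group-du-sno-config-policy
        spec:
          remediationAction: inform
          severity: low
          object-templates:
            - complianceType: musthave
              objectDefinition:
                apiVersion: ptp.openshift.io/v1
                kind: PtpConfig
                metadata:
                  name: du-ptp-slave
                  namespace: openshift-ptp
                spec:
                  profile:
                    - name: slave            # remaining spec content comes from PtpConfigSlave.yaml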
9.1.3. Recommendations when customizing PolicyGenerator CRs
Consider the following best practices when customizing site configuration PolicyGenerator custom resources (CRs):
- Use as few policies as are necessary. Using fewer policies requires fewer resources. Each additional policy creates increased CPU load for the hub cluster and the deployed managed cluster. CRs are combined into policies based on the policyName field in the PolicyGenerator CR. CRs in the same PolicyGenerator which have the same value for policyName are managed under a single policy.
- In disconnected environments, use a single catalog source for all Operators by configuring the registry as a single index containing all Operators. Each additional CatalogSource CR on the managed clusters increases CPU usage.
- MachineConfig CRs should be included as extraManifests in the SiteConfig CR so that they are applied during installation. This can reduce the overall time taken until the cluster is ready to deploy applications.
- PolicyGenerator CRs should override the channel field to explicitly identify the desired version. This ensures that changes in the source CR during upgrades do not update the generated subscription.
- The default setting for policyDefaults.consolidateManifests is true. This is the recommended setting for the DU profile. Setting it to false might impact large scale deployments.
- The default setting for policyDefaults.orderPolicies is false. This is the recommended setting for the DU profile. After the cluster installation is complete and a cluster becomes Ready, TALM creates a ClusterGroupUpgrade CR corresponding to this cluster. The ClusterGroupUpgrade CR contains a list of ordered policies defined by the ran.openshift.io/ztp-deploy-wave annotation. If you use the PolicyGenerator CR to change the order of the policies, conflicts might occur and the configuration might not be applied. A snippet showing these two default settings follows at the end of this section.
When managing large numbers of spoke clusters on the hub cluster, minimize the number of policies to reduce resource consumption.
Grouping multiple configuration CRs into a single or limited number of policies is one way to reduce the overall number of policies on the hub cluster. When using the common, group, and site hierarchy of policies for managing site configuration, it is especially important to combine site-specific configuration into a single policy.
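As a reference, a minimal policyDefaults snippet with the recommended defaults mentioned above might look like the following; the values shown are the documented defaults, not required overrides:

policyDefaults:
  consolidateManifests: true   # default; recommended for the DU profile
  orderPolicies: false         # default; policy order is driven by the ran.openshift.io/ztp-deploy-wave annotation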
9.1.4. PolicyGenerator CRs for RAN deployments
Use PolicyGenerator custom resources (CRs) to customize the configuration applied to the cluster by using the GitOps Zero Touch Provisioning (ZTP) pipeline. The PolicyGenerator CR allows you to generate one or more policies to manage the set of configuration CRs on your fleet of clusters. The PolicyGenerator CR identifies the set of managed CRs, bundles them into policies, builds the policy wrapping around those CRs, and associates the policies with clusters by using label binding rules.
The reference configuration, obtained from the GitOps ZTP container, is designed to provide a set of critical features and node tuning settings that ensure the cluster can support the stringent performance and resource utilization constraints typical of RAN (Radio Access Network) Distributed Unit (DU) applications. Changes or omissions from the baseline configuration can affect feature availability, performance, and resource utilization. Use the reference PolicyGenerator CRs as the basis to create a hierarchy of configuration files tailored to your specific site requirements.
The baseline PolicyGenerator CRs that are defined for RAN DU cluster configuration can be extracted from the GitOps ZTP ztp-site-generate container. See "Preparing the GitOps ZTP site configuration repository" for further details.
The PolicyGenerator CRs can be found in the ./out/argocd/example/acmpolicygenerator/ folder. The reference architecture has common, group, and site-specific configuration CRs. Each PolicyGenerator CR refers to other CRs that can be found in the ./out/source-crs folder.
The PolicyGenerator CRs relevant to RAN cluster configuration are described below. Variants are provided for the group PolicyGenerator CRs to account for differences in single-node, three-node compact, and standard cluster configurations. Similarly, site-specific configuration variants are provided for single-node clusters and multi-node (compact or standard) clusters. Use the group and site-specific configuration variants that are relevant for your deployment.
| PolicyGenerator CR | Description |
|---|---|
| acm-example-multinode-site.yaml | Contains a set of CRs that get applied to multi-node clusters. These CRs configure SR-IOV features typical for RAN installations. |
| acm-example-sno-site.yaml | Contains a set of CRs that get applied to single-node OpenShift clusters. These CRs configure SR-IOV features typical for RAN installations. |
| … | Contains a set of common RAN policy configuration that gets applied to multi-node clusters. |
| acm-common-ranGen.yaml | Contains a set of common RAN CRs that get applied to all clusters. These CRs subscribe to a set of operators providing cluster features typical for RAN as well as baseline cluster tuning. |
| acm-group-du-3node-ranGen.yaml | Contains the RAN policies for three-node clusters only. |
| acm-group-du-sno-ranGen.yaml | Contains the RAN policies for single-node clusters only. |
| acm-group-du-standard-ranGen.yaml | Contains the RAN policies for standard three control-plane clusters. |
9.1.5. Customizing a managed cluster with PolicyGenerator CRs
Use the following procedure to customize the policies that get applied to the managed cluster that you provision using the GitOps Zero Touch Provisioning (ZTP) pipeline.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have logged in to the hub cluster as a user with cluster-admin privileges.
- You configured the hub cluster for generating the required installation and policy CRs.
- You created a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for the Argo CD application.
Procedure
- Create a PolicyGenerator CR for site-specific configuration CRs.
  - Choose the appropriate example for your CR from the out/argocd/example/acmpolicygenerator/ folder, for example, acm-example-sno-site.yaml or acm-example-multinode-site.yaml.
  - Change the policyDefaults.placement.labelSelector field in the example file to match the site-specific label included in the SiteConfig CR. In the example SiteConfig file, the site-specific label is sites: example-sno.

    Note: Ensure that the labels defined in your PolicyGenerator policyDefaults.placement.labelSelector field correspond to the labels that are defined in the related managed clusters SiteConfig CR.

  - Change the content in the example file to match the desired configuration.
- Optional: Create a PolicyGenerator CR for any common configuration CRs that apply to the entire fleet of clusters.
  - Select the appropriate example for your CR from the out/argocd/example/acmpolicygenerator/ folder, for example, acm-common-ranGen.yaml.
  - Change the content in the example file to match the required configuration.
- Optional: Create a PolicyGenerator CR for any group configuration CRs that apply to certain groups of clusters in the fleet.

  Ensure that the content of the overlaid spec files matches your required end state. As a reference, the out/source-crs directory contains the full list of source CRs available to be included and overlaid by your PolicyGenerator templates.

  Note: Depending on the specific requirements of your clusters, you might need more than a single group policy per cluster type, especially considering that the example group policies each have a single PerformancePolicy.yaml file that can only be shared across a set of clusters if those clusters consist of identical hardware configurations.

  - Select the appropriate example for your CR from the out/argocd/example/acmpolicygenerator/ folder, for example, acm-group-du-sno-ranGen.yaml.
  - Change the content in the example file to match the required configuration.
- Optional: Create a validator inform policy PolicyGenerator CR to signal when the GitOps ZTP installation and configuration of the deployed cluster is complete. For more information, see "Creating a validator inform policy".
- Define all the policy namespaces in a YAML file similar to the example out/argocd/example/acmpolicygenerator/ns.yaml file.

  Important: Do not include the Namespace CR in the same file with the PolicyGenerator CR.

- Add the PolicyGenerator CRs and Namespace CR to the kustomization.yaml file in the generators section, similar to the example shown in out/argocd/example/acmpolicygenerator/kustomization.yaml. A sketch of such a kustomization.yaml file follows this procedure.
- Commit the PolicyGenerator CRs, Namespace CR, and associated kustomization.yaml file in your Git repository and push the changes.

  The ArgoCD pipeline detects the changes and begins the managed cluster deployment. You can push the changes to the SiteConfig CR and the PolicyGenerator CR simultaneously.
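The following kustomization.yaml sketch is illustrative only; the generator and resource file names are assumptions based on the example files named in this procedure:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
generators:
  - acm-common-ranGen.yaml          # common policies for the whole fleet
  - acm-group-du-sno-ranGen.yaml    # group policies
  - acm-example-sno-site.yaml       # site-specific policies
resources:
  - ns.yaml                         # policy namespaces, kept separate from the PolicyGenerator CRs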
9.1.6. Monitoring managed cluster policy deployment progress
The ArgoCD pipeline uses PolicyGenerator CRs in Git to generate the RHACM policies and then sync them to the hub cluster. You can monitor the progress of the managed cluster policy synchronization after the assisted service installs OpenShift Container Platform on the managed cluster.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have logged in to the hub cluster as a user with cluster-admin privileges.
Procedure
The Topology Aware Lifecycle Manager (TALM) applies the configuration policies that are bound to the cluster.
After the cluster installation is complete and the cluster becomes Ready, a ClusterGroupUpgrade CR corresponding to this cluster, with a list of ordered policies defined by the ran.openshift.io/ztp-deploy-wave annotations, is automatically created by the TALM. The cluster's policies are applied in the order listed in the ClusterGroupUpgrade CR.

You can monitor the high-level progress of configuration policy reconciliation by using the following commands:
$ export CLUSTER=<clusterName>

$ oc get clustergroupupgrades -n ztp-install $CLUSTER -o jsonpath='{.status.conditions[-1:]}' | jq

You can monitor the detailed cluster policy compliance status by using the RHACM dashboard or the command line.
To check policy compliance by using oc, run the following command:

$ oc get policies -n $CLUSTER

To check policy status from the RHACM web console, perform the following actions:
- Click Governance → Find policies.
- Click on a cluster policy to check its status.
When all of the cluster policies become compliant, GitOps ZTP installation and configuration for the cluster is complete. The ztp-done label is added to the cluster.
In the reference configuration, the final policy that becomes compliant is the one defined in the *-du-validator-policy policy. This policy, when compliant on a cluster, ensures that all cluster configuration, Operator installation, and Operator configuration is complete.
9.1.7. Coordinating reboots for configuration changes
You can use Topology Aware Lifecycle Manager (TALM) to coordinate reboots across a fleet of spoke clusters when configuration changes require a reboot, such as deferred tuning changes. TALM reboots all nodes in the targeted MachineConfigPool on the selected clusters when the reboot policy is applied.
Instead of rebooting nodes after each individual change, you can apply all configuration updates through policies and then trigger a single, coordinated reboot.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have logged in to the hub cluster as a user with cluster-admin privileges.
- You have deployed and configured TALM.
Procedure
- Generate the configuration policies by creating a PolicyGenerator custom resource (CR). You can use one of the following sample manifests:
  - out/argocd/example/acmpolicygenerator/acm-example-sno-reboot
  - out/argocd/example/acmpolicygenerator/acm-example-multinode-reboot
- Update the policyDefaults.placement.labelSelector field in the PolicyGenerator CR to target the clusters that you want to reboot. Modify other fields as necessary for your use case.

  If you are coordinating a reboot to apply a deferred tuning change, ensure the MachineConfigPool in the reboot policy matches the value specified in the spec.recommend field in the Tuned object.

- Apply the PolicyGenerator CR to generate and apply the configuration policies. For detailed steps, see "Customizing a managed cluster with PolicyGenerator CRs".
- After ArgoCD completes syncing the policies, create and apply the ClusterGroupUpgrade (CGU) CR.

  Example CGU custom resource configuration
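  The following CGU sketch is illustrative; the label, policy names, concurrency, and timeout values are placeholders that you replace for your fleet. The numbered comments correspond to the callouts below.

  apiVersion: ran.openshift.io/v1alpha1
  kind: ClusterGroupUpgrade
  metadata:
    name: reboot
    namespace: default
  spec:
    clusterLabelSelectors:
      - matchLabels:
          common: "true"              # 1
    enable: true
    managedPolicies:                  # 2
      - sample-configuration-policy
      - sample-reboot-policy
    remediationStrategy:
      maxConcurrency: 10
      timeout: 300                    # 3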
  1. Configure the labels that match the clusters you want to reboot.
  2. Add all required configuration policies before the reboot policy. TALM applies the configuration changes as specified in the policies, in the order they are listed.
  3. Specify the timeout in seconds for the entire upgrade across all selected clusters. Set this field by considering the worst-case scenario.
- After you apply the CGU custom resource, TALM rolls out the configuration policies in order. Once all policies are compliant, it applies the reboot policy and triggers a reboot of all nodes in the specified MachineConfigPool.
Verification
Monitor the CGU rollout status.
You can monitor the rollout of the CGU custom resource on the hub by checking the status. Verify the successful rollout of the reboot by running the following command:
$ oc get cgu -A

Example output
NAMESPACE   NAME     AGE   STATE       DETAILS
default     reboot   1d    Completed   All clusters are compliant with all the managed policies

Verify successful reboot on a specific node.
To confirm that the reboot was successful on a specific node, check the status of the MachineConfigPool (MCP) for the node by running the following command:

$ oc get mcp master

Example output
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-be5785c3b98eb7a1ec902fef2b81e865   True      False      False      3              3                   3                     0                      72d
9.1.8. Validating the generation of configuration policy CRs
Policy custom resources (CRs) are generated in the same namespace as the PolicyGenerator from which they are created. The same troubleshooting flow applies to all policy CRs generated from a PolicyGenerator regardless of whether they are ztp-common, ztp-group, or ztp-site based, as shown using the following commands:
$ export NS=<namespace>

$ oc get policy -n $NS
The expected set of policy-wrapped CRs should be displayed.
If the policies failed synchronization, use the following troubleshooting steps.
Procedure
To display detailed information about the policies, run the following command:
$ oc describe -n openshift-gitops application policies

Check for Status: Conditions: to show the error logs. For example, setting an invalid sourceFile entry to fileName: generates the error shown below:

Status:
  Conditions:
    Last Transition Time:  2021-11-26T17:21:39Z
    Message:               rpc error: code = Unknown desc = `kustomize build /tmp/https___git.com/ran-sites/policies/ --enable-alpha-plugins` failed exit status 1: 2021/11/26 17:21:40 Error could not find test.yaml under source-crs/: no such file or directory Error: failure in plugin configured via /tmp/kust-plugin-config-52463179; exit status 1: exit status 1
    Type:                  ComparisonError

Check for Status: Sync:. If there are log errors at Status: Conditions:, the Status: Sync: shows Unknown or Error.

When Red Hat Advanced Cluster Management (RHACM) recognizes that policies apply to a ManagedCluster object, the policy CR objects are applied to the cluster namespace. Check to see if the policies were copied to the cluster namespace:

$ oc get policy -n $CLUSTER

RHACM copies all applicable policies into the cluster namespace. The copied policy names have the format:
<PolicyGenerator.Namespace>.<PolicyGenerator.Name>-<policyName>

Check the placement rule for any policies not copied to the cluster namespace. The matchSelector in the Placement for those policies should match labels on the ManagedCluster object:

$ oc get Placement -n $NS

Note the Placement name appropriate for the missing policy, common, group, or site, using the following command:

$ oc get Placement -n $NS <placement_rule_name> -o yaml

- The status-decisions should include your cluster name.
- The key-value pair of the matchSelector in the spec must match the labels on your managed cluster.

Check the labels on the ManagedCluster object by using the following command:

$ oc get ManagedCluster $CLUSTER -o jsonpath='{.metadata.labels}' | jq

Check to see what policies are compliant by using the following command:
$ oc get policy -n $CLUSTER

If the Namespace, OperatorGroup, and Subscription policies are compliant but the Operator configuration policies are not, it is likely that the Operators did not install on the managed cluster. This causes the Operator configuration policies to fail to apply because the CRD is not yet applied to the spoke.
9.1.9. Restarting policy reconciliation
You can restart policy reconciliation when unexpected compliance issues occur, for example, when the ClusterGroupUpgrade custom resource (CR) has timed out.
Procedure
- A ClusterGroupUpgrade CR is generated in the namespace ztp-install by the Topology Aware Lifecycle Manager after the managed cluster becomes Ready:

  $ export CLUSTER=<clusterName>

  $ oc get clustergroupupgrades -n ztp-install $CLUSTER

- If there are unexpected issues and the policies fail to become compliant within the configured timeout (the default is 4 hours), the status of the ClusterGroupUpgrade CR shows UpgradeTimedOut:

  $ oc get clustergroupupgrades -n ztp-install $CLUSTER -o jsonpath='{.status.conditions[?(@.type=="Ready")]}'

- A ClusterGroupUpgrade CR in the UpgradeTimedOut state automatically restarts its policy reconciliation every hour. If you have changed your policies, you can start a retry immediately by deleting the existing ClusterGroupUpgrade CR. This triggers the automatic creation of a new ClusterGroupUpgrade CR that begins reconciling the policies immediately:

  $ oc delete clustergroupupgrades -n ztp-install $CLUSTER
Note that when the ClusterGroupUpgrade CR completes with status UpgradeCompleted and the managed cluster has the label ztp-done applied, you can make additional configuration changes by using PolicyGenerator. Deleting the existing ClusterGroupUpgrade CR will not make the TALM generate a new CR.
At this point, GitOps ZTP has completed its interaction with the cluster and any further interactions should be treated as an update and a new ClusterGroupUpgrade CR created for remediation of the policies.
9.1.10. Changing applied managed cluster CRs using policies
You can remove content from a custom resource (CR) that is deployed in a managed cluster through a policy.
By default, all Policy CRs created from a PolicyGenerator CR have the complianceType field set to musthave. A musthave policy without the removed content is still compliant because the CR on the managed cluster has all the specified content. With this configuration, when you remove content from a CR, TALM removes the content from the policy but the content is not removed from the CR on the managed cluster.
With the complianceType field set to mustonlyhave, the policy ensures that the CR on the cluster is an exact match of what is specified in the policy.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have logged in to the hub cluster as a user with cluster-admin privileges.
- You have deployed a managed cluster from a hub cluster running RHACM.
- You have installed Topology Aware Lifecycle Manager on the hub cluster.
Procedure
- Remove the content that you no longer need from the affected CRs. In this example, the disableDrain: false line was removed from the SriovOperatorConfig CR.

  Example CR
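  The following is an illustrative sketch of the resulting CR. Apart from the removed disableDrain line, the field values shown are assumptions, not the exact reference content:

  apiVersion: sriovnetwork.openshift.io/v1
  kind: SriovOperatorConfig
  metadata:
    name: default
    namespace: openshift-sriov-network-operator
  spec:
    configDaemonNodeSelector:
      node-role.kubernetes.io/worker: ""
    enableInjector: true
    enableOperatorWebhook: true
    # the disableDrain: false line has been removed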
- Change the complianceType of the affected policies to mustonlyhave in the acm-group-du-sno-ranGen.yaml file.

  Example YAML
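  A sketch of the changed manifest entry might look like the following; the policy name and deploy wave value are assumptions:

  policies:
    - name: group-du-sno-config-policy
      policyAnnotations:
        ran.openshift.io/ztp-deploy-wave: "10"   # assumed wave value
      manifests:
        - path: source-crs/SriovOperatorConfig.yaml
          complianceType: mustonlyhave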
- Create a ClusterGroupUpdates CR and specify the clusters that must receive the CR changes:

  Example ClusterGroupUpdates CR
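  A sketch of such a cgu-remove.yaml CR follows; the cluster names and remediation settings are placeholders:

  apiVersion: ran.openshift.io/v1alpha1
  kind: ClusterGroupUpgrade
  metadata:
    name: cgu-remove
    namespace: default
  spec:
    managedPolicies:
      - ztp-group.group-du-sno-config-policy
    enable: false                # set to true later, during the maintenance window
    clusters:
      - spoke1
      - spoke2
    remediationStrategy:
      maxConcurrency: 2
      timeout: 240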
- Create the ClusterGroupUpgrade CR by running the following command:

  $ oc create -f cgu-remove.yaml

- When you are ready to apply the changes, for example, during an appropriate maintenance window, change the value of the spec.enable field to true by running the following command:

  $ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-remove \
    --patch '{"spec":{"enable":true}}' --type=merge
Verification
Check the status of the policies by running the following command:
$ oc get <kind> <changed_cr_name>

Example output

NAMESPACE   NAME                                        REMEDIATION ACTION   COMPLIANCE STATE   AGE
default     cgu-ztp-group.group-du-sno-config-policy   enforce                                 17m
default     ztp-group.group-du-sno-config-policy       inform               NonCompliant       15h

When the COMPLIANCE STATE of the policy is Compliant, it means that the CR is updated and the unwanted content is removed.

Check that the policies are removed from the targeted clusters by running the following command on the managed clusters:
$ oc get <kind> <changed_cr_name>

If there are no results, the CR is removed from the managed cluster.
9.1.11. Indication of done for GitOps ZTP installations
GitOps Zero Touch Provisioning (ZTP) simplifies the process of checking the GitOps ZTP installation status for a cluster. The GitOps ZTP status moves through three phases: cluster installation, cluster configuration, and GitOps ZTP done.
- Cluster installation phase
- The cluster installation phase is shown by the ManagedClusterJoined and ManagedClusterAvailable conditions in the ManagedCluster CR. If the ManagedCluster CR does not have these conditions, or the condition is set to False, the cluster is still in the installation phase. Additional details about installation are available from the AgentClusterInstall and ClusterDeployment CRs. For more information, see "Troubleshooting GitOps ZTP".

- Cluster configuration phase
- The cluster configuration phase is shown by a ztp-running label applied to the ManagedCluster CR for the cluster.

- GitOps ZTP done
- Cluster installation and configuration is complete in the GitOps ZTP done phase. This is shown by the removal of the ztp-running label and addition of the ztp-done label to the ManagedCluster CR. The ztp-done label shows that the configuration has been applied and the baseline DU configuration has completed cluster tuning.

  The change to the GitOps ZTP done state is conditional on the compliant state of a Red Hat Advanced Cluster Management (RHACM) validator inform policy. This policy captures the existing criteria for a completed installation and validates that it moves to a compliant state only when GitOps ZTP provisioning of the managed cluster is complete.
The validator inform policy ensures the configuration of the cluster is fully applied and Operators have completed their initialization. The policy validates the following:
- The target MachineConfigPool contains the expected entries and has finished updating. All nodes are available and not degraded.
- The SR-IOV Operator has completed initialization as indicated by at least one SriovNetworkNodeState with syncStatus: Succeeded.
- The PTP Operator daemon set exists.
9.2. Advanced managed cluster configuration with PolicyGenerator resources
You can use PolicyGenerator CRs to deploy custom functionality in your managed clusters. Using RHACM and PolicyGenerator CRs is the recommended approach for managing policies and deploying them to managed clusters. This replaces the use of PolicyGenTemplate CRs for this purpose. For more information about PolicyGenerator resources, see the RHACM Policy Generator documentation.
9.2.1. Deploying additional changes to clusters
If you require cluster configuration changes outside of the base GitOps Zero Touch Provisioning (ZTP) pipeline configuration, there are three options:
- Apply the additional configuration after the GitOps ZTP pipeline is complete
- When the GitOps ZTP pipeline deployment is complete, the deployed cluster is ready for application workloads. At this point, you can install additional Operators and apply configurations specific to your requirements. Ensure that additional configurations do not negatively affect the performance of the platform or allocated CPU budget.
- Add content to the GitOps ZTP library
- The base source custom resources (CRs) that you deploy with the GitOps ZTP pipeline can be augmented with custom content as required.
- Create extra manifests for the cluster installation
- Extra manifests are applied during installation and make the installation process more efficient.
Providing additional source CRs or modifying existing source CRs can significantly impact the performance or CPU profile of OpenShift Container Platform.
9.2.2. Using PolicyGenerator CRs to override source CRs content
PolicyGenerator custom resources (CRs) allow you to overlay additional configuration details on top of the base source CRs provided with the GitOps plugin in the ztp-site-generate container. You can think of PolicyGenerator CRs as a logical merge or patch to the base CR. Use PolicyGenerator CRs to update a single field of the base CR, or overlay the entire contents of the base CR. You can update values and insert fields that are not in the base CR.
The following example procedure describes how to update fields in the generated PerformanceProfile CR for the reference configuration based on the PolicyGenerator CR in the acm-group-du-sno-ranGen.yaml file. Use the procedure as a basis for modifying other parts of the PolicyGenerator based on your requirements.
Prerequisites
- Create a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for Argo CD.
Procedure
- Review the baseline source CR for existing content. You can review the source CRs listed in the reference PolicyGenerator CRs by extracting them from the GitOps Zero Touch Provisioning (ZTP) container.

  Create an /out folder:

  $ mkdir -p ./out

  Extract the source CRs:

  $ podman run --log-driver=none --rm registry.redhat.io/openshift4/ztp-site-generate-rhel8:v4.19.1 extract /home/ztp --tar | tar x -C ./out
- Review the baseline PerformanceProfile CR in ./out/source-crs/PerformanceProfile.yaml.

  Note: Any fields in the source CR which contain $… are removed from the generated CR if they are not provided in the PolicyGenerator CR.

- Update the PolicyGenerator entry for PerformanceProfile in the acm-group-du-sno-ranGen.yaml reference file. The following example PolicyGenerator CR stanza supplies appropriate CPU specifications, sets the hugepages configuration, and adds a new field that sets globallyDisableIrqLoadBalancing to false.
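  A sketch of such a stanza is shown below; the CPU ranges and hugepages count are illustrative values that must be set for your hardware:

  policies:
    - name: group-du-sno-config-policy
      manifests:
        - path: source-crs/PerformanceProfile.yaml
          patches:
            - spec:
                cpu:
                  isolated: "2-19,22-39"     # illustrative; set for your hardware
                  reserved: "0-1,20-21"
                hugepages:
                  defaultHugepagesSize: 1G
                  pages:
                    - size: 1G
                      count: 10
                globallyDisableIrqLoadBalancing: false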
- Commit the PolicyGenerator change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.

  The GitOps ZTP application generates an RHACM policy that contains the generated PerformanceProfile CR. The contents of that CR are derived by merging the metadata and spec contents from the PerformanceProfile entry in the PolicyGenerator onto the source CR.
In the /source-crs folder that you extract from the ztp-site-generate container, the $ syntax is not used for template substitution as the syntax might imply. Rather, if the policyGen tool sees the $ prefix for a string and you do not specify a value for that field in the related PolicyGenerator CR, the field is omitted from the output CR entirely.
An exception to this is the $mcp variable in /source-crs YAML files that is substituted with the specified value for mcp from the PolicyGenerator CR. For example, in example/acmpolicygenerator/acm-group-du-standard-ranGen.yaml, the value for mcp is worker:
spec:
  bindingRules:
    group-du-standard: ""
  mcp: "worker"
The policyGen tool replaces instances of $mcp with worker in the output CRs.
9.2.3. Adding custom content to the GitOps ZTP pipeline
Perform the following procedure to add new content to the GitOps ZTP pipeline.
Procedure
- Create a subdirectory named source-crs in the directory that contains the kustomization.yaml file for the PolicyGenerator custom resource (CR).

- Add your user-provided CRs to the source-crs subdirectory. The source-crs subdirectory must be in the same directory as the kustomization.yaml file.
- Update the required PolicyGenerator CRs to include references to the content you added in the source-crs/custom-crs and source-crs/elasticsearch directories. For example:
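  The following manifests snippet is illustrative; the policy name, wave value, and file names under the custom directories are hypothetical:

  policies:
    - name: group-dev-config-policy
      policyAnnotations:
        ran.openshift.io/ztp-deploy-wave: "10"
      manifests:
        - path: source-crs/custom-crs/apiserver-config.yaml               # hypothetical file name
        - path: source-crs/custom-crs/disable-nic-lldp.yaml               # hypothetical file name
        - path: source-crs/elasticsearch/ElasticsearchNS.yaml             # hypothetical file name
        - path: source-crs/elasticsearch/ElasticsearchOperatorGroup.yaml  # hypothetical file name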
- Commit the PolicyGenerator change in Git, and then push to the Git repository that is monitored by the GitOps ZTP Argo CD policies application.

- Update the ClusterGroupUpgrade CR to include the changed PolicyGenerator and save it as cgu-test.yaml.

- Apply the updated ClusterGroupUpgrade CR by running the following command:

  $ oc apply -f cgu-test.yaml
Verification
Check that the updates have succeeded by running the following command:
$ oc get cgu -A

Example output
NAMESPACE      NAME               AGE   STATE        DETAILS
ztp-clusters   custom-source-cr   6s    InProgress   Remediating non-compliant policies
ztp-install    cluster1           19h   Completed    All clusters are compliant with all the managed policies
9.2.4. Configuring policy compliance evaluation timeouts for PolicyGenerator CRs
Use Red Hat Advanced Cluster Management (RHACM) installed on a hub cluster to monitor and report on whether your managed clusters are compliant with applied policies. RHACM uses policy templates to apply predefined policy controllers and policies. Policy controllers are Kubernetes custom resource definition (CRD) instances.
You can override the default policy evaluation intervals with PolicyGenerator custom resources (CRs). You configure duration settings that define how long a ConfigurationPolicy CR can be in a state of policy compliance or non-compliance before RHACM re-evaluates the applied cluster policies.
The GitOps Zero Touch Provisioning (ZTP) policy generator generates ConfigurationPolicy CR policies with pre-defined policy evaluation intervals. The default value for the noncompliant state is 10 seconds. The default value for the compliant state is 10 minutes. To disable the evaluation interval, set the value to never.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have logged in to the hub cluster as a user with cluster-admin privileges.
- You have created a Git repository where you manage your custom site configuration data.
Procedure
- To configure the evaluation interval for all policies in a PolicyGenerator CR, set appropriate compliant and noncompliant values for the evaluationInterval field. For example:

  policyDefaults:
    evaluationInterval:
      compliant: 30m
      noncompliant: 45s

  Note: You can also set the compliant and noncompliant fields to never to stop evaluating the policy after it reaches a particular compliance state.

- To configure the evaluation interval for an individual policy object in a PolicyGenerator CR, add the evaluationInterval field and set appropriate values. For example:
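  The following per-policy snippet is illustrative; the policy name and source file are placeholders:

  policies:
    - name: acm-example-config-policy
      evaluationInterval:
        compliant: never
        noncompliant: 10s
      manifests:
        - path: source-crs/ExampleSourceCR.yaml   # placeholder source CR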
- Commit the PolicyGenerator CR files in the Git repository and push your changes.
Verification
Check that the managed spoke cluster policies are monitored at the expected intervals.
- Log in as a user with cluster-admin privileges on the managed cluster.

- Get the pods that are running in the open-cluster-management-agent-addon namespace. Run the following command:

  $ oc get pods -n open-cluster-management-agent-addon

  Example output
  NAME                                        READY   STATUS    RESTARTS        AGE
  config-policy-controller-858b894c68-v4xdb   1/1     Running   22 (5d8h ago)   10d

- Check that the applied policies are being evaluated at the expected interval in the logs for the config-policy-controller pod:

  $ oc logs -n open-cluster-management-agent-addon config-policy-controller-858b894c68-v4xdb

  Example output

  2022-05-10T15:10:25.280Z  info  configuration-policy-controller  controllers/configurationpolicy_controller.go:166  Skipping the policy evaluation due to the policy not reaching the evaluation interval  {"policy": "compute-1-config-policy-config"}
  2022-05-10T15:10:25.280Z  info  configuration-policy-controller  controllers/configurationpolicy_controller.go:166  Skipping the policy evaluation due to the policy not reaching the evaluation interval  {"policy": "compute-1-common-compute-1-catalog-policy-config"}
9.2.5. Signalling GitOps ZTP cluster deployment completion with validator inform policies
Create a validator inform policy that signals when the GitOps Zero Touch Provisioning (ZTP) installation and configuration of the deployed cluster is complete. This policy can be used for deployments of single-node OpenShift clusters, three-node clusters, and standard clusters.
Procedure
- Create a standalone PolicyGenerator custom resource (CR) that contains the source file validatorCRs/informDuValidator.yaml. You only need one standalone PolicyGenerator CR for each cluster type. For example, this CR applies a validator inform policy for single-node OpenShift clusters:

  Example single-node cluster validator inform policy CR (acm-group-du-sno-validator-ranGen.yaml)
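  A sketch of such a standalone CR follows. The names, namespace, and binding labels shown here are assumptions; the source file path is the one named in the step above:

  apiVersion: policy.open-cluster-management.io/v1
  kind: PolicyGenerator
  metadata:
    name: group-du-sno-validator
  placementBindingDefaults:
    name: group-du-sno-validator-placement-binding
  policyDefaults:
    namespace: ztp-group                   # assumed policy namespace
    remediationAction: inform
    severity: low
    placement:
      labelSelector:
        matchExpressions:
          - key: group-du-sno
            operator: Exists
          - key: ztp-done
            operator: DoesNotExist         # assumed: stop binding once the cluster is marked done
  policies:
    - name: group-du-sno-validator-policy
      manifests:
        - path: source-crs/validatorCRs/informDuValidator.yaml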
- Commit the PolicyGenerator CR file in your Git repository and push the changes.
9.2.6. Configuring power states using PolicyGenerator CRs
For low latency and high-performance edge deployments, it is necessary to disable or limit C-states and P-states. With this configuration, the CPU runs at a constant frequency, which is typically the maximum turbo frequency. This ensures that the CPU is always running at its maximum speed, which results in high performance and the best latency for workloads. However, it also leads to the highest power consumption, which might not be necessary for all workloads.
Workloads can be classified as critical or non-critical, with critical workloads requiring disabled C-state and P-state settings for high performance and low latency, while non-critical workloads use C-state and P-state settings for power savings at the expense of some latency and performance. You can configure the following three power states using GitOps Zero Touch Provisioning (ZTP):
- High-performance mode provides ultra low latency at the highest power consumption.
- Performance mode provides low latency at a relatively high power consumption.
- Power saving balances reduced power consumption with increased latency.
The default configuration is for a low latency, performance mode.
PolicyGenerator custom resources (CRs) allow you to overlay additional configuration details onto the base source CRs provided with the GitOps plugin in the ztp-site-generate container.
Configure the power states by updating the workloadHints fields in the generated PerformanceProfile CR for the reference configuration, based on the PolicyGenerator CR in the acm-group-du-sno-ranGen.yaml.
The following common prerequisites apply to configuring all three power states.
Prerequisites
- You have created a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for Argo CD.
- You have followed the procedure described in "Preparing the GitOps ZTP site configuration repository".
9.2.6.1. Configuring performance mode using PolicyGenerator CRs
Follow this example to set performance mode by updating the workloadHints fields in the generated PerformanceProfile CR for the reference configuration, based on the PolicyGenerator CR in the acm-group-du-sno-ranGen.yaml.
Performance mode provides low latency at a relatively high power consumption.
Prerequisites
- You have configured the BIOS with performance related settings by following the guidance in "Configuring host firmware for low latency and high performance".
Procedure
- Update the PolicyGenerator entry for PerformanceProfile in the acm-group-du-sno-ranGen.yaml reference file in out/argocd/example/acmpolicygenerator/ as follows to set performance mode.
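  The relevant workloadHints settings for performance mode are shown in the following sketch; other PerformanceProfile fields are omitted:

  - path: source-crs/PerformanceProfile.yaml
    patches:
      - spec:
          workloadHints:
            realTime: true
            highPowerConsumption: false
            perPodPowerManagement: false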
- Commit the PolicyGenerator change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.
9.2.6.2. Configuring high-performance mode using PolicyGenerator CRs
Follow this example to set high performance mode by updating the workloadHints fields in the generated PerformanceProfile CR for the reference configuration, based on the PolicyGenerator CR in the acm-group-du-sno-ranGen.yaml.
High performance mode provides ultra low latency at the highest power consumption.
Prerequisites
- You have configured the BIOS with performance related settings by following the guidance in "Configuring host firmware for low latency and high performance".
Procedure
- Update the PolicyGenerator entry for PerformanceProfile in the acm-group-du-sno-ranGen.yaml reference file in out/argocd/example/acmpolicygenerator/ as follows to set high-performance mode.
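  The relevant workloadHints settings for high-performance mode are shown in the following sketch; other PerformanceProfile fields are omitted:

  - path: source-crs/PerformanceProfile.yaml
    patches:
      - spec:
          workloadHints:
            realTime: true
            highPowerConsumption: true
            perPodPowerManagement: false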
- Commit the PolicyGenerator change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.
9.2.6.3. Configuring power saving mode using PolicyGenerator CRs
Follow this example to set power saving mode by updating the workloadHints fields in the generated PerformanceProfile CR for the reference configuration, based on the PolicyGenerator CR in the acm-group-du-sno-ranGen.yaml.
The power saving mode balances reduced power consumption with increased latency.
Prerequisites
- You enabled C-states and OS-controlled P-states in the BIOS.
Procedure
- Update the PolicyGenerator entry for PerformanceProfile in the acm-group-du-sno-ranGen.yaml reference file in out/argocd/example/acmpolicygenerator/ as follows to configure power saving mode. It is recommended to configure the CPU governor for the power saving mode through the additional kernel arguments object.
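  The relevant settings for power saving mode are shown in the following sketch; callout 1 is explained below and other PerformanceProfile fields are omitted:

  - path: source-crs/PerformanceProfile.yaml
    patches:
      - spec:
          workloadHints:
            realTime: true
            highPowerConsumption: false
            perPodPowerManagement: true
          additionalKernelArgs:
            - cpufreq.default_governor=schedutil   # 1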
  1. The schedutil governor is recommended; however, you can also use other governors, including ondemand and powersave.
- Commit the PolicyGenerator change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.
Verification
Select a worker node in your deployed cluster from the list of nodes identified by using the following command:
$ oc get nodes

Log in to the node by using the following command:
$ oc debug node/<node-name>

Replace <node-name> with the name of the node you want to verify the power state on.

Set /host as the root directory within the debug shell. The debug pod mounts the host's root file system in /host within the pod. By changing the root directory to /host, you can run binaries contained in the host's executable paths as shown in the following example:

# chroot /host

Run the following command to verify the applied power state:

# cat /proc/cmdline
Expected output
- For power saving mode, the output includes intel_pstate=passive.
9.2.6.4. Maximizing power savings
Limiting the maximum CPU frequency is recommended to achieve maximum power savings. Enabling C-states on the non-critical workload CPUs without restricting the maximum CPU frequency negates much of the power savings by boosting the frequency of the critical CPUs.
Maximize power savings by updating the sysfs plugin fields, setting an appropriate value for max_perf_pct in the TunedPerformancePatch CR for the reference configuration. This example, based on the acm-group-du-sno-ranGen.yaml file, describes how to restrict the maximum CPU frequency.
Prerequisites
- You have configured power saving mode as described in "Configuring power saving mode using PolicyGenerator CRs".
Procedure
- Update the PolicyGenerator entry for TunedPerformancePatch in the acm-group-du-sno-ranGen.yaml reference file in out/argocd/example/acmpolicygenerator/. To maximize power savings, add max_perf_pct as shown in the following example:
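  The following sketch is illustrative; the TuneD profile data is abbreviated, and the include line and placeholder value are assumptions (callout 1 is explained below):

  - path: source-crs/TunedPerformancePatch.yaml
    patches:
      - spec:
          profile:
            - name: performance-patch
              data: |
                [main]
                summary=Configuration changes profile inherited from performance created tuned
                include=openshift-node-performance-openshift-node-performance-profile
                [sysfs]
                /sys/devices/system/cpu/intel_pstate/max_perf_pct=<x>   # 1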
  1. The max_perf_pct controls the maximum frequency the cpufreq driver is allowed to set as a percentage of the maximum supported CPU frequency. This value applies to all CPUs. You can check the maximum supported frequency in /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq. As a starting point, you can use a percentage that caps all CPUs at the All Cores Turbo frequency. The All Cores Turbo frequency is the frequency that all cores run at when the cores are all fully occupied.
  Note: To maximize power savings, set a lower value. Setting a lower value for max_perf_pct limits the maximum CPU frequency, thereby reducing power consumption, but also potentially impacting performance. Experiment with different values and monitor the system's performance and power consumption to find the optimal setting for your use case.
Commit the
PolicyGeneratorchange in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.
9.2.7. Configuring LVM Storage using PolicyGenerator CRs
You can configure Logical Volume Manager (LVM) Storage for managed clusters that you deploy with GitOps Zero Touch Provisioning (ZTP).
You use LVM Storage to persist event subscriptions when you use PTP events or bare-metal hardware events with HTTP transport.
Use the Local Storage Operator for persistent storage that uses local volumes in distributed units.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
- Create a Git repository where you manage your custom site configuration data.
Procedure
- To configure LVM Storage for new managed clusters, add the following YAML to policies.manifests in the acm-common-ranGen.yaml file:

  Note: The Storage LVMO subscription is deprecated. In future releases of OpenShift Container Platform, the storage LVMO subscription will not be available. Instead, you must use the Storage LVMS subscription.

  In OpenShift Container Platform 4.19, you can use the Storage LVMS subscription instead of the LVMO subscription. The LVMS subscription does not require manual overrides in the acm-common-ranGen.yaml file. Add the following YAML to policies.manifests in the acm-common-ranGen.yaml file to use the Storage LVMS subscription:

    - path: source-crs/StorageLVMSubscriptionNS.yaml
    - path: source-crs/StorageLVMSubscriptionOperGroup.yaml
    - path: source-crs/StorageLVMSubscription.yaml
- Add the LVMCluster CR to policies.manifests in your specific group or individual site configuration file. For example, in the acm-group-du-sno-ranGen.yaml file, add the following:
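  The following sketch is illustrative; the device class and thin-pool settings are assumptions to adapt to your storage layout:

  - path: source-crs/StorageLVMCluster.yaml
    patches:
      - spec:
          storage:
            deviceClasses:
              - name: vg1
                default: true
                thinPoolConfig:
                  name: thin-pool-1
                  sizePercent: 90
                  overprovisionRatio: 10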
  This example configuration creates a volume group (vg1) with all the available devices, except the disk where OpenShift Container Platform is installed. A thin-pool logical volume is also created.

- Merge any other required changes and files with your custom site repository.
- Commit the PolicyGenerator changes in Git, and then push the changes to your site configuration repository to deploy LVM Storage to new sites using GitOps ZTP.
9.2.8. Configuring PTP events with PolicyGenerator CRs
You can use the GitOps ZTP pipeline to configure PTP events that use HTTP transport.
9.2.8.1. Configuring PTP events that use HTTP transport
You can configure PTP events that use HTTP transport on managed clusters that you deploy with the GitOps Zero Touch Provisioning (ZTP) pipeline.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have logged in as a user with cluster-admin privileges.
- You have created a Git repository where you manage your custom site configuration data.
Procedure
- Apply the following PolicyGenerator changes to the acm-group-du-3node-ranGen.yaml, acm-group-du-sno-ranGen.yaml, or acm-group-du-standard-ranGen.yaml files according to your requirements:

  - In policies.manifests, add the PtpOperatorConfig CR file that configures the transport host:

    Note: In OpenShift Container Platform 4.13 or later, you do not need to set the transportHost field in the PtpOperatorConfig resource when you use HTTP transport with PTP events.

  - Configure the linuxptp and phc2sys for the PTP clock type and interface. For example, add the following YAML into policies.manifests:
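    The following entry is an illustrative sketch, not the exact reference content; the interface name, option strings, and threshold values are assumptions to adapt to your hardware. The numbered comments correspond to the callouts below.

    - path: source-crs/PtpConfigSlave.yaml              # 1
      patches:
        - metadata:
            name: du-ptp-slave
          spec:
            profile:
              - name: slave
                interface: ens5f1                        # 2
                ptp4lOpts: "-2 -s --summary_interval -4" # 3
                phc2sysOpts: "-a -r -m -n 24 -N 8 -R 16" # 4
            ptpClockThreshold:                           # 5
              holdOverTimeout: 5
              maxOffsetThreshold: 100
              minOffsetThreshold: -100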
    1. Can be PtpConfigMaster.yaml or PtpConfigSlave.yaml depending on your requirements. For configurations based on acm-group-du-sno-ranGen.yaml or acm-group-du-3node-ranGen.yaml, use PtpConfigSlave.yaml.
    2. Device specific interface name.
    3. You must append the --summary_interval -4 value to ptp4lOpts in .spec.sourceFiles.spec.profile to enable PTP fast events.
    4. Required phc2sysOpts values. -m prints messages to stdout. The linuxptp-daemon DaemonSet parses the logs and generates Prometheus metrics.
    5. Optional. If the ptpClockThreshold stanza is not present, default values are used for the ptpClockThreshold fields. The stanza shows default ptpClockThreshold values. The ptpClockThreshold values configure how long after the PTP master clock is disconnected before PTP events are triggered. holdOverTimeout is the time value in seconds before the PTP clock event state changes to FREERUN when the PTP master clock is disconnected. The maxOffsetThreshold and minOffsetThreshold settings configure offset values in nanoseconds that compare against the values for CLOCK_REALTIME (phc2sys) or master offset (ptp4l). When the ptp4l or phc2sys offset value is outside this range, the PTP clock state is set to FREERUN. When the offset value is within this range, the PTP clock state is set to LOCKED.
- Merge any other required changes and files with your custom site repository.
- Push the changes to your site configuration repository to deploy PTP fast events to new sites using GitOps ZTP.
9.2.9. Configuring the Image Registry Operator for local caching of images
OpenShift Container Platform manages image caching using a local registry. In edge computing use cases, clusters are often subject to bandwidth restrictions when communicating with centralized image registries, which might result in long image download times.
Long download times are unavoidable during initial deployment. Over time, there is a risk that CRI-O will erase the /var/lib/containers/storage directory in the case of an unexpected shutdown. To address long image download times, you can create a local image registry on remote managed clusters using GitOps Zero Touch Provisioning (ZTP). This is useful in Edge computing scenarios where clusters are deployed at the far edge of the network.
Before you can set up the local image registry with GitOps ZTP, you need to configure disk partitioning in the SiteConfig CR that you use to install the remote managed cluster. After installation, you configure the local image registry using a PolicyGenerator CR. Then, the GitOps ZTP pipeline creates Persistent Volume (PV) and Persistent Volume Claim (PVC) CRs and patches the imageregistry configuration.
The local image registry can only be used for user application images and cannot be used for the OpenShift Container Platform or Operator Lifecycle Manager operator images.
9.2.9.1. Configuring disk partitioning with SiteConfig
Configure disk partitioning for a managed cluster using a SiteConfig CR and GitOps Zero Touch Provisioning (ZTP). The disk partition details in the SiteConfig CR must match the underlying disk.
You must complete this procedure at installation time.
Prerequisites
- Install Butane.
Procedure
Create the storage.bu file.
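The exact Butane content depends on your disk layout. The following sketch is consistent with the Ignition output shown later in this procedure; adjust the device path and the partition start offset for your hardware:

    variant: fcos
    version: 1.3.0
    storage:
      disks:
        - device: /dev/disk/by-path/pci-0000:01:00.0-scsi-0:2:0:0
          wipe_table: false
          partitions:
            - label: var-lib-containers
              start_mib: 250000
              size_mib: 0
      filesystems:
        - path: /var/lib/containers
          device: /dev/disk/by-partlabel/var-lib-containers
          format: xfs
          wipe_filesystem: true
          with_mount_unit: true
          mount_options:
            - defaults
            - prjquota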
Convert the storage.bu to an Ignition file by running the following command:

    $ butane storage.bu

Example output
{"ignition":{"version":"3.2.0"},"storage":{"disks":[{"device":"/dev/disk/by-path/pci-0000:01:00.0-scsi-0:2:0:0","partitions":[{"label":"var-lib-containers","sizeMiB":0,"startMiB":250000}],"wipeTable":false}],"filesystems":[{"device":"/dev/disk/by-partlabel/var-lib-containers","format":"xfs","mountOptions":["defaults","prjquota"],"path":"/var/lib/containers","wipeFilesystem":true}]},"systemd":{"units":[{"contents":"# # Generated by Butane\n[Unit]\nRequires=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\nAfter=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\n\n[Mount]\nWhere=/var/lib/containers\nWhat=/dev/disk/by-partlabel/var-lib-containers\nType=xfs\nOptions=defaults,prjquota\n\n[Install]\nRequiredBy=local-fs.target","enabled":true,"name":"var-lib-containers.mount"}]}}{"ignition":{"version":"3.2.0"},"storage":{"disks":[{"device":"/dev/disk/by-path/pci-0000:01:00.0-scsi-0:2:0:0","partitions":[{"label":"var-lib-containers","sizeMiB":0,"startMiB":250000}],"wipeTable":false}],"filesystems":[{"device":"/dev/disk/by-partlabel/var-lib-containers","format":"xfs","mountOptions":["defaults","prjquota"],"path":"/var/lib/containers","wipeFilesystem":true}]},"systemd":{"units":[{"contents":"# # Generated by Butane\n[Unit]\nRequires=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\nAfter=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\n\n[Mount]\nWhere=/var/lib/containers\nWhat=/dev/disk/by-partlabel/var-lib-containers\nType=xfs\nOptions=defaults,prjquota\n\n[Install]\nRequiredBy=local-fs.target","enabled":true,"name":"var-lib-containers.mount"}]}}Copy to Clipboard Copied! Toggle word wrap Toggle overflow - Use a tool such as JSON Pretty Print to convert the output into JSON format.
Copy the output into the .spec.clusters.nodes.ignitionConfigOverride field in the SiteConfig CR. Example:
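The following sketch shows where the value belongs in the SiteConfig CR. The cluster and host names are placeholders, and the {...} sections stand for the full single-line Ignition JSON produced in the previous step:

    apiVersion: ran.openshift.io/v1
    kind: SiteConfig
    metadata:
      name: "example-sno"
    spec:
      clusters:
        - clusterName: "example-sno"
          nodes:
            - hostName: "example-node.example.com"
              # Paste the complete Ignition JSON from the butane output as a single line
              ignitionConfigOverride: |
                {"ignition":{"version":"3.2.0"},"storage":{...},"systemd":{...}}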
Note: If the .spec.clusters.nodes.ignitionConfigOverride field does not exist, create it.
Verification
During or after installation, verify on the hub cluster that the BareMetalHost object shows the annotation by running the following command:

    $ oc get bmh -n my-sno-ns my-sno -ojson | jq '.metadata.annotations["bmac.agent-install.openshift.io/ignition-config-overrides"]'

Example output
"{\"ignition\":{\"version\":\"3.2.0\"},\"storage\":{\"disks\":[{\"device\":\"/dev/disk/by-id/wwn-0x6b07b250ebb9d0002a33509f24af1f62\",\"partitions\":[{\"label\":\"var-lib-containers\",\"sizeMiB\":0,\"startMiB\":250000}],\"wipeTable\":false}],\"filesystems\":[{\"device\":\"/dev/disk/by-partlabel/var-lib-containers\",\"format\":\"xfs\",\"mountOptions\":[\"defaults\",\"prjquota\"],\"path\":\"/var/lib/containers\",\"wipeFilesystem\":true}]},\"systemd\":{\"units\":[{\"contents\":\"# Generated by Butane\\n[Unit]\\nRequires=systemd-fsck@dev-disk-by\\\\x2dpartlabel-var\\\\x2dlib\\\\x2dcontainers.service\\nAfter=systemd-fsck@dev-disk-by\\\\x2dpartlabel-var\\\\x2dlib\\\\x2dcontainers.service\\n\\n[Mount]\\nWhere=/var/lib/containers\\nWhat=/dev/disk/by-partlabel/var-lib-containers\\nType=xfs\\nOptions=defaults,prjquota\\n\\n[Install]\\nRequiredBy=local-fs.target\",\"enabled\":true,\"name\":\"var-lib-containers.mount\"}]}}""{\"ignition\":{\"version\":\"3.2.0\"},\"storage\":{\"disks\":[{\"device\":\"/dev/disk/by-id/wwn-0x6b07b250ebb9d0002a33509f24af1f62\",\"partitions\":[{\"label\":\"var-lib-containers\",\"sizeMiB\":0,\"startMiB\":250000}],\"wipeTable\":false}],\"filesystems\":[{\"device\":\"/dev/disk/by-partlabel/var-lib-containers\",\"format\":\"xfs\",\"mountOptions\":[\"defaults\",\"prjquota\"],\"path\":\"/var/lib/containers\",\"wipeFilesystem\":true}]},\"systemd\":{\"units\":[{\"contents\":\"# Generated by Butane\\n[Unit]\\nRequires=systemd-fsck@dev-disk-by\\\\x2dpartlabel-var\\\\x2dlib\\\\x2dcontainers.service\\nAfter=systemd-fsck@dev-disk-by\\\\x2dpartlabel-var\\\\x2dlib\\\\x2dcontainers.service\\n\\n[Mount]\\nWhere=/var/lib/containers\\nWhat=/dev/disk/by-partlabel/var-lib-containers\\nType=xfs\\nOptions=defaults,prjquota\\n\\n[Install]\\nRequiredBy=local-fs.target\",\"enabled\":true,\"name\":\"var-lib-containers.mount\"}]}}"Copy to Clipboard Copied! Toggle word wrap Toggle overflow After installation, check the single-node OpenShift disk status.
Enter into a debug session on the single-node OpenShift node by running the following command. This step instantiates a debug pod called <node_name>-debug:

    $ oc debug node/my-sno-node
Set /host as the root directory within the debug shell by running the following command. The debug pod mounts the host’s root file system in /host within the pod. By changing the root directory to /host, you can run binaries contained in the host’s executable paths:

    # chroot /host

List information about all available block devices by running the following command:
    # lsblk

Display information about the file system disk space usage by running the following command:
    # df -h
9.2.9.2. Configuring the image registry using PolicyGenerator CRs
Use PolicyGenerator CRs to apply the CRs required to configure the image registry and patch the imageregistry configuration.
Prerequisites
- You have configured a disk partition in the managed cluster.
- You have installed the OpenShift CLI (oc).
- You have logged in to the hub cluster as a user with cluster-admin privileges.
- You have created a Git repository where you manage your custom site configuration data for use with GitOps Zero Touch Provisioning (ZTP).
Procedure
Configure the storage class, persistent volume claim, persistent volume, and image registry configuration in the appropriate PolicyGenerator CR. For example, to configure an individual site, add the following YAML to the acm-example-sno-site.yaml file:
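The following is a hedged sketch only. The policy name, deploy wave, storage size, and the StorageClass.yaml and StoragePVC.yaml source CR names are assumptions; confirm the source CR names against your ztp-site-generate container before using them. The numbered comments correspond to the annotations that follow:

    policies:
      - name: example-sno-image-registry-policy
        policyAnnotations:
          ran.openshift.io/ztp-deploy-wave: "100"        # 1
        manifests:
          - path: source-crs/StorageClass.yaml
            patches:
              - metadata:
                  name: image-registry-sc
          - path: source-crs/StoragePVC.yaml
            patches:
              - metadata:
                  name: image-registry-pvc
                  namespace: openshift-image-registry
                spec:
                  accessModes:
                    - ReadWriteOnce
                  resources:
                    requests:
                      storage: 100Gi
                  storageClassName: image-registry-sc
          - path: source-crs/ImageRegistryPV.yaml        # 2
          - path: source-crs/ImageRegistryConfig.yaml
            patches:
              - spec:
                  storage:
                    pvc:
                      claim: image-registry-pvc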
1. Set the appropriate value for ztp-deploy-wave depending on whether you are configuring image registries at the site, common, or group level. ztp-deploy-wave: "100" is suitable for development or testing because it allows you to group the referenced source files together.
2. In ImageRegistryPV.yaml, ensure that the spec.local.path field is set to /var/imageregistry to match the value set for the mount_point field in the SiteConfig CR.
Important: Do not set complianceType: mustonlyhave for the ImageRegistryConfig.yaml configuration. This can cause the registry pod deployment to fail.
- Commit the PolicyGenerator change in Git, and then push to the Git repository being monitored by the GitOps ZTP ArgoCD application.
Verification
Use the following steps to troubleshoot errors with the local image registry on the managed clusters:
Verify successful login to the registry while logged in to the managed cluster. Run the following commands:
Export the managed cluster name:
cluster=<managed_cluster_name>
$ cluster=<managed_cluster_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow Get the managed cluster
kubeconfigdetails:oc get secret -n $cluster $cluster-admin-password -o jsonpath='{.data.password}' | base64 -d > kubeadmin-password-$cluster$ oc get secret -n $cluster $cluster-admin-password -o jsonpath='{.data.password}' | base64 -d > kubeadmin-password-$clusterCopy to Clipboard Copied! Toggle word wrap Toggle overflow Download and export the cluster
kubeconfig:oc get secret -n $cluster $cluster-admin-kubeconfig -o jsonpath='{.data.kubeconfig}' | base64 -d > kubeconfig-$cluster && export KUBECONFIG=./kubeconfig-$cluster$ oc get secret -n $cluster $cluster-admin-kubeconfig -o jsonpath='{.data.kubeconfig}' | base64 -d > kubeconfig-$cluster && export KUBECONFIG=./kubeconfig-$clusterCopy to Clipboard Copied! Toggle word wrap Toggle overflow - Verify access to the image registry from the managed cluster. See "Accessing the registry".
Check that the
ConfigCRD in theimageregistry.operator.openshift.iogroup instance is not reporting errors. Run the following command while logged in to the managed cluster:oc get image.config.openshift.io cluster -o yaml
$ oc get image.config.openshift.io cluster -o yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Check that the
PersistentVolumeClaimon the managed cluster is populated with data. Run the following command while logged in to the managed cluster:oc get pv image-registry-sc
$ oc get pv image-registry-scCopy to Clipboard Copied! Toggle word wrap Toggle overflow Check that the
registry*pod is running and is located under theopenshift-image-registrynamespace.oc get pods -n openshift-image-registry | grep registry*
$ oc get pods -n openshift-image-registry | grep registry*Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
cluster-image-registry-operator-68f5c9c589-42cfg 1/1 Running 0 8d image-registry-5f8987879-6nx6h 1/1 Running 0 8d
cluster-image-registry-operator-68f5c9c589-42cfg 1/1 Running 0 8d image-registry-5f8987879-6nx6h 1/1 Running 0 8dCopy to Clipboard Copied! Toggle word wrap Toggle overflow Check that the disk partition on the managed cluster is correct:
Open a debug shell to the managed cluster:
oc debug node/sno-1.example.com
$ oc debug node/sno-1.example.comCopy to Clipboard Copied! Toggle word wrap Toggle overflow Run
lsblkto check the host disk partitions:Copy to Clipboard Copied! Toggle word wrap Toggle overflow - 1
/var/imageregistryindicates that the disk is correctly partitioned.
9.3. Updating managed clusters in a disconnected environment with PolicyGenerator resources and TALM
You can use the Topology Aware Lifecycle Manager (TALM) to manage the software lifecycle of managed clusters that you have deployed by using GitOps Zero Touch Provisioning (ZTP). TALM uses Red Hat Advanced Cluster Management (RHACM) PolicyGenerator policies to manage and control changes applied to target clusters.
Using PolicyGenerator resources with GitOps ZTP is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
9.3.1. Setting up the disconnected environment
TALM can perform both platform and Operator updates.
You must mirror both the platform image and Operator images that you want to update to in your mirror registry before you can use TALM to update your disconnected clusters. Complete the following steps to mirror the images:
For platform updates, you must perform the following steps:
Mirror the desired OpenShift Container Platform image repository. Ensure that the desired platform image is mirrored by following the "Mirroring the OpenShift Container Platform image repository" procedure linked in the Additional resources. Save the contents of the imageContentSources section in the imageContentSources.yaml file. Example output:
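The following sketch shows the expected shape of the file; the mirror registry host name is a placeholder:

    imageContentSources:
      - mirrors:
          - mirror.example.com:5000/openshift-release-dev/ocp-v4.0-art-dev
        source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
      - mirrors:
          - mirror.example.com:5000/openshift-release-dev/ocp-release
        source: quay.io/openshift-release-dev/ocp-release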
Save the image signature of the desired platform image that was mirrored. You must add the image signature to the PolicyGenerator CR for platform updates. To get the image signature, perform the following steps:

Specify the desired OpenShift Container Platform tag by running the following command:
OCP_RELEASE_NUMBER=<release_version>
$ OCP_RELEASE_NUMBER=<release_version>Copy to Clipboard Copied! Toggle word wrap Toggle overflow Specify the architecture of the cluster by running the following command:
ARCHITECTURE=<cluster_architecture>
$ ARCHITECTURE=<cluster_architecture>1 Copy to Clipboard Copied! Toggle word wrap Toggle overflow - 1
- Specify the architecture of the cluster, such as
x86_64,aarch64,s390x, orppc64le.
Get the release image digest from Quay by running the following command
DIGEST="$(oc adm release info quay.io/openshift-release-dev/ocp-release:${OCP_RELEASE_NUMBER}-${ARCHITECTURE} | sed -n 's/Pull From: .*@//p')"$ DIGEST="$(oc adm release info quay.io/openshift-release-dev/ocp-release:${OCP_RELEASE_NUMBER}-${ARCHITECTURE} | sed -n 's/Pull From: .*@//p')"Copy to Clipboard Copied! Toggle word wrap Toggle overflow Set the digest algorithm by running the following command:
DIGEST_ALGO="${DIGEST%%:*}"$ DIGEST_ALGO="${DIGEST%%:*}"Copy to Clipboard Copied! Toggle word wrap Toggle overflow Set the digest signature by running the following command:
DIGEST_ENCODED="${DIGEST#*:}"$ DIGEST_ENCODED="${DIGEST#*:}"Copy to Clipboard Copied! Toggle word wrap Toggle overflow Get the image signature from the mirror.openshift.com website by running the following command:
SIGNATURE_BASE64=$(curl -s "https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release/${DIGEST_ALGO}=${DIGEST_ENCODED}/signature-1" | base64 -w0 && echo)$ SIGNATURE_BASE64=$(curl -s "https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release/${DIGEST_ALGO}=${DIGEST_ENCODED}/signature-1" | base64 -w0 && echo)Copy to Clipboard Copied! Toggle word wrap Toggle overflow Save the image signature to the
checksum-<OCP_RELEASE_NUMBER>.yamlfile by running the following commands:cat >checksum-${OCP_RELEASE_NUMBER}.yaml <<EOF ${DIGEST_ALGO}-${DIGEST_ENCODED}: ${SIGNATURE_BASE64} EOF$ cat >checksum-${OCP_RELEASE_NUMBER}.yaml <<EOF ${DIGEST_ALGO}-${DIGEST_ENCODED}: ${SIGNATURE_BASE64} EOFCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Prepare the update graph. You have two options to prepare the update graph:
Use the OpenShift Update Service.
For more information about how to set up the graph on the hub cluster, see Deploy the operator for OpenShift Update Service and Build the graph data init container.
Make a local copy of the upstream graph. Host the update graph on an
httporhttpsserver in the disconnected environment that has access to the managed cluster. To download the update graph, use the following command:curl -s https://api.openshift.com/api/upgrades_info/v1/graph?channel=stable-4.19 -o ~/upgrade-graph_stable-4.19
$ curl -s https://api.openshift.com/api/upgrades_info/v1/graph?channel=stable-4.19 -o ~/upgrade-graph_stable-4.19Copy to Clipboard Copied! Toggle word wrap Toggle overflow
For Operator updates, you must perform the following task:
- Mirror the Operator catalogs. Ensure that the desired operator images are mirrored by following the procedure in the "Mirroring Operator catalogs for use with disconnected clusters" section.
9.3.2. Performing a platform update with PolicyGenerator CRs
You can perform a platform update with the TALM.
Prerequisites
- Install the Topology Aware Lifecycle Manager (TALM).
- Update GitOps Zero Touch Provisioning (ZTP) to the latest version.
- Provision one or more managed clusters with GitOps ZTP.
- Mirror the desired image repository.
- Log in as a user with cluster-admin privileges.
- Create RHACM policies in the hub cluster.
Procedure
Create a PolicyGenerator CR for the platform update:

Save the following PolicyGenerator CR in the du-upgrade.yaml file.

Example of PolicyGenerator for platform update:
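The following is a hedged sketch of the shape of such a CR. The namespace, placement label, mirror registry, release version, and the DisconnectedICSP.yaml source CR name are assumptions that you must adapt to your environment. The numbered comments correspond to the annotations that follow:

    apiVersion: policy.open-cluster-management.io/v1
    kind: PolicyGenerator
    metadata:
      name: du-upgrade
    placementBindingDefaults:
      name: du-upgrade-placement-binding
    policyDefaults:
      namespace: ztp-group-du-sno
      remediationAction: inform
      placement:
        labelSelector:
          matchExpressions:
            - key: group-du-sno
              operator: Exists
    policies:
      - name: du-upgrade-platform-upgrade-prep
        manifests:
          - path: source-crs/ImageSignature.yaml           # 2
          - path: source-crs/DisconnectedICSP.yaml         # 3
            patches:
              - spec:
                  repositoryDigestMirrors:
                    - mirrors:
                        - mirror.example.com:5000/openshift-release-dev/ocp-release
                      source: quay.io/openshift-release-dev/ocp-release
      - name: du-upgrade-platform-upgrade
        manifests:
          - path: source-crs/ClusterVersion.yaml           # 1
            patches:
              - spec:
                  channel: stable-4.19
                  upstream: http://upgrade.example.com/images/upgrade-graph_stable-4.19
                  desiredUpdate:
                    version: 4.19.0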
- Shows the
ClusterVersionCR to trigger the update. Thechannel,upstream, anddesiredVersionfields are all required for image pre-caching. - 2
ImageSignature.yamlcontains the image signature of the required release image. The image signature is used to verify the image before applying the platform update.- 3
- Shows the mirror repository that contains the required OpenShift Container Platform image. Get the mirrors from the
imageContentSources.yamlfile that you saved when following the procedures in the "Setting up the environment" section.
The
PolicyGeneratorCR generates two policies:-
The
du-upgrade-platform-upgrade-preppolicy does the preparation work for the platform update. It creates theConfigMapCR for the desired release image signature, creates the image content source of the mirrored release image repository, and updates the cluster version with the desired update channel and the update graph reachable by the managed cluster in the disconnected environment. -
The
du-upgrade-platform-upgradepolicy is used to perform platform upgrade.
Add the
du-upgrade.yamlfile contents to thekustomization.yamlfile located in the GitOps ZTP Git repository for thePolicyGeneratorCRs and push the changes to the Git repository.ArgoCD pulls the changes from the Git repository and generates the policies on the hub cluster.
Check the created policies by running the following command:
oc get policies -A | grep platform-upgrade
$ oc get policies -A | grep platform-upgradeCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Create the ClusterGroupUpgrade CR for the platform update with the spec.enable field set to false.

Save the content of the platform update ClusterGroupUpgrade CR with the du-upgrade-platform-upgrade-prep and the du-upgrade-platform-upgrade policies and the target clusters to the cgu-platform-upgrade.yml file, as shown in the following example:
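A minimal sketch of such a CR, assuming a single target cluster named spoke1:

    apiVersion: ran.openshift.io/v1alpha1
    kind: ClusterGroupUpgrade
    metadata:
      name: cgu-platform-upgrade
      namespace: default
    spec:
      managedPolicies:
        - du-upgrade-platform-upgrade-prep
        - du-upgrade-platform-upgrade
      preCaching: false
      clusters:
        - spoke1
      remediationStrategy:
        maxConcurrency: 1
      enable: false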
Apply the ClusterGroupUpgrade CR to the hub cluster by running the following command:

    $ oc apply -f cgu-platform-upgrade.yml
Optional: Pre-cache the images for the platform update.
Enable pre-caching in the
ClusterGroupUpdateCR by running the following command:oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-platform-upgrade \ --patch '{"spec":{"preCaching": true}}' --type=merge$ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-platform-upgrade \ --patch '{"spec":{"preCaching": true}}' --type=mergeCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the update process and wait for the pre-caching to complete. Check the status of pre-caching by running the following command on the hub cluster:
oc get cgu cgu-platform-upgrade -o jsonpath='{.status.precaching.status}'$ oc get cgu cgu-platform-upgrade -o jsonpath='{.status.precaching.status}'Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Start the platform update:
Enable the
cgu-platform-upgradepolicy and disable pre-caching by running the following command:oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-platform-upgrade \ --patch '{"spec":{"enable":true, "preCaching": false}}' --type=merge$ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-platform-upgrade \ --patch '{"spec":{"enable":true, "preCaching": false}}' --type=mergeCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the process. Upon completion, ensure that the policy is compliant by running the following command:
oc get policies --all-namespaces
$ oc get policies --all-namespacesCopy to Clipboard Copied! Toggle word wrap Toggle overflow
9.3.3. Performing an Operator update with PolicyGenerator CRs
You can perform an Operator update with the TALM.
Prerequisites
- Install the Topology Aware Lifecycle Manager (TALM).
- Update GitOps Zero Touch Provisioning (ZTP) to the latest version.
- Provision one or more managed clusters with GitOps ZTP.
- Mirror the desired index image, bundle images, and all Operator images referenced in the bundle images.
- Log in as a user with cluster-admin privileges.
- Create RHACM policies in the hub cluster.
Procedure
Update the PolicyGenerator CR for the Operator update.

Update the du-upgrade PolicyGenerator CR with the following additional contents in the du-upgrade.yaml file:
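The following sketch shows the shape of the additional policy entry. It assumes the DefaultCatsrc.yaml source CR; the index image reference and polling interval are illustrative. The numbered comments correspond to the annotations that follow:

      - name: du-upgrade-operator-catsrc-policy
        manifests:
          - path: source-crs/DefaultCatsrc.yaml
            patches:
              - metadata:
                  name: redhat-operators-disconnected
                spec:
                  displayName: Red Hat Operators Catalog
                  image: registry.example.com:5000/olm/redhat-operators-disconnected:v4.19   # 1
                  updateStrategy:
                    registryPoll:
                      interval: 1h                                                            # 2
                status:
                  connectionState:
                    lastObservedState: READY                                                  # 3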
- Contains the required Operator images. If the index images are always pushed to the same image name and tag, this change is not needed.
- 2
- Sets how frequently the Operator Lifecycle Manager (OLM) polls the index image for new Operator versions with the
registryPoll.intervalfield. This change is not needed if a new index image tag is always pushed for y-stream and z-stream Operator updates. TheregistryPoll.intervalfield can be set to a shorter interval to expedite the update, however shorter intervals increase computational load. To counteract this, you can restoreregistryPoll.intervalto the default value once the update is complete. - 3
- Displays the observed state of the catalog connection. The
READYvalue ensures that theCatalogSourcepolicy is ready, indicating that the index pod is pulled and is running. This way, TALM upgrades the Operators based on up-to-date policy compliance states.
This update generates one policy,
du-upgrade-operator-catsrc-policy, to update theredhat-operators-disconnectedcatalog source with the new index images that contain the desired Operators images.NoteIf you want to use the image pre-caching for Operators and there are Operators from a different catalog source other than
redhat-operators-disconnected, you must perform the following tasks:- Prepare a separate catalog source policy with the new index image or registry poll interval update for the different catalog source.
- Prepare a separate subscription policy for the desired Operators that are from the different catalog source.
For example, the desired SRIOV-FEC Operator is available in the
certified-operatorscatalog source. To update the catalog source and the Operator subscription, add the following contents to generate two policies,du-upgrade-fec-catsrc-policyanddu-upgrade-subscriptions-fec-policy:Copy to Clipboard Copied! Toggle word wrap Toggle overflow Remove the specified subscriptions channels in the common
PolicyGeneratorCR, if they exist. The default subscriptions channels from the GitOps ZTP image are used for the update.NoteThe default channel for the Operators applied through GitOps ZTP 4.19 is
stable, except for theperformance-addon-operator. As of OpenShift Container Platform 4.11, theperformance-addon-operatorfunctionality was moved to thenode-tuning-operator. For the 4.10 release, the default channel for PAO isv4.10. You can also specify the default channels in the commonPolicyGeneratorCR.Push the
PolicyGeneratorCRs updates to the GitOps ZTP Git repository.ArgoCD pulls the changes from the Git repository and generates the policies on the hub cluster.
Check the created policies by running the following command:
oc get policies -A | grep -E "catsrc-policy|subscription"
$ oc get policies -A | grep -E "catsrc-policy|subscription"Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Apply the required catalog source updates before starting the Operator update.
Save the content of the
ClusterGroupUpgradeCR namedoperator-upgrade-prepwith the catalog source policies and the target managed clusters to thecgu-operator-upgrade-prep.ymlfile:Copy to Clipboard Copied! Toggle word wrap Toggle overflow Apply the policy to the hub cluster by running the following command:
oc apply -f cgu-operator-upgrade-prep.yml
$ oc apply -f cgu-operator-upgrade-prep.ymlCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the update process. Upon completion, ensure that the policy is compliant by running the following command:
oc get policies -A | grep -E "catsrc-policy"
$ oc get policies -A | grep -E "catsrc-policy"Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Create the ClusterGroupUpgrade CR for the Operator update with the spec.enable field set to false.

Save the content of the Operator update ClusterGroupUpgrade CR with the du-upgrade-operator-catsrc-policy policy and the subscription policies created from the common PolicyGenerator and the target clusters to the cgu-operator-upgrade.yml file, as shown in the following example:
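A minimal sketch of such a CR, assuming a single target cluster named spoke1; the numbered comments correspond to the annotations that follow:

    apiVersion: ran.openshift.io/v1alpha1
    kind: ClusterGroupUpgrade
    metadata:
      name: cgu-operator-upgrade
      namespace: default
    spec:
      managedPolicies:
        - du-upgrade-operator-catsrc-policy   # 1
        - common-subscriptions-policy         # 2
      preCaching: false
      clusters:
        - spoke1
      remediationStrategy:
        maxConcurrency: 1
      enable: false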
- The policy is needed by the image pre-caching feature to retrieve the operator images from the catalog source.
- 2
- The policy contains Operator subscriptions. If you have followed the structure and content of the reference
PolicyGenTemplates, all Operator subscriptions are grouped into thecommon-subscriptions-policypolicy.
NoteOne
ClusterGroupUpgradeCR can only pre-cache the images of the desired Operators defined in the subscription policy from one catalog source included in theClusterGroupUpgradeCR. If the desired Operators are from different catalog sources, such as in the example of the SRIOV-FEC Operator, anotherClusterGroupUpgradeCR must be created withdu-upgrade-fec-catsrc-policyanddu-upgrade-subscriptions-fec-policypolicies for the SRIOV-FEC Operator images pre-caching and update.Apply the
ClusterGroupUpgradeCR to the hub cluster by running the following command:oc apply -f cgu-operator-upgrade.yml
$ oc apply -f cgu-operator-upgrade.ymlCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Optional: Pre-cache the images for the Operator update.
Before starting image pre-caching, verify the subscription policy is
NonCompliantat this point by running the following command:oc get policy common-subscriptions-policy -n <policy_namespace>
$ oc get policy common-subscriptions-policy -n <policy_namespace>Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
NAME REMEDIATION ACTION COMPLIANCE STATE AGE common-subscriptions-policy inform NonCompliant 27d
NAME REMEDIATION ACTION COMPLIANCE STATE AGE common-subscriptions-policy inform NonCompliant 27dCopy to Clipboard Copied! Toggle word wrap Toggle overflow Enable pre-caching in the
ClusterGroupUpgradeCR by running the following command:oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-operator-upgrade \ --patch '{"spec":{"preCaching": true}}' --type=merge$ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-operator-upgrade \ --patch '{"spec":{"preCaching": true}}' --type=mergeCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the process and wait for the pre-caching to complete. Check the status of pre-caching by running the following command on the managed cluster:
oc get cgu cgu-operator-upgrade -o jsonpath='{.status.precaching.status}'$ oc get cgu cgu-operator-upgrade -o jsonpath='{.status.precaching.status}'Copy to Clipboard Copied! Toggle word wrap Toggle overflow Check if the pre-caching is completed before starting the update by running the following command:
oc get cgu -n default cgu-operator-upgrade -ojsonpath='{.status.conditions}' | jq$ oc get cgu -n default cgu-operator-upgrade -ojsonpath='{.status.conditions}' | jqCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Start the Operator update.
Enable the
cgu-operator-upgradeClusterGroupUpgradeCR and disable pre-caching to start the Operator update by running the following command:oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-operator-upgrade \ --patch '{"spec":{"enable":true, "preCaching": false}}' --type=merge$ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-operator-upgrade \ --patch '{"spec":{"enable":true, "preCaching": false}}' --type=mergeCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the process. Upon completion, ensure that the policy is compliant by running the following command:
oc get policies --all-namespaces
$ oc get policies --all-namespacesCopy to Clipboard Copied! Toggle word wrap Toggle overflow
9.3.4. Troubleshooting missed Operator updates with PolicyGenerator CRs
In some scenarios, Topology Aware Lifecycle Manager (TALM) might miss Operator updates due to an out-of-date policy compliance state.
After a catalog source update, it takes time for the Operator Lifecycle Manager (OLM) to update the subscription status. The status of the subscription policy might continue to show as compliant while TALM decides whether remediation is needed. As a result, the Operator specified in the subscription policy does not get upgraded.
To avoid this scenario, add another catalog source configuration to the PolicyGenerator and specify this configuration in the subscription for any Operators that require an update.
Procedure
Add a catalog source configuration in the PolicyGenerator resource:
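The following is a hedged sketch of such a policy entry. The catalog source name, index image, and DefaultCatsrc.yaml source CR name are assumptions:

      - name: operator-catsrc-v2-policy
        manifests:
          - path: source-crs/DefaultCatsrc.yaml
            patches:
              - metadata:
                  name: redhat-operators-disconnected-v2
                spec:
                  displayName: Red Hat Operators Catalog v2
                  image: registry.example.com:5000/olm/redhat-operators-disconnected:v2
                  updateStrategy:
                    registryPoll:
                      interval: 1h
                status:
                  connectionState:
                    lastObservedState: READY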
Update the Subscription resource to point to the new configuration for Operators that require an update:
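The following sketch shows the relevant change for an example Operator subscription; the Operator name and namespace are illustrative:

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: sriov-network-operator-subscription
      namespace: openshift-sriov-network-operator
    spec:
      channel: stable
      name: sriov-network-operator
      # Reference the new catalog source configuration defined in the PolicyGenerator resource
      source: redhat-operators-disconnected-v2
      sourceNamespace: openshift-marketplace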
- Enter the name of the additional catalog source configuration that you defined in the
PolicyGeneratorresource.
9.3.5. Performing a platform and an Operator update together
You can perform a platform and an Operator update at the same time.
Prerequisites
- Install the Topology Aware Lifecycle Manager (TALM).
- Update GitOps Zero Touch Provisioning (ZTP) to the latest version.
- Provision one or more managed clusters with GitOps ZTP.
- Log in as a user with cluster-admin privileges.
- Create RHACM policies in the hub cluster.
Procedure
-
Create the
PolicyGeneratorCR for the updates by following the steps described in the "Performing a platform update" and "Performing an Operator update" sections. Apply the prep work for the platform and the Operator update.
Save the content of the
ClusterGroupUpgradeCR with the policies for platform update preparation work, catalog source updates, and target clusters to thecgu-platform-operator-upgrade-prep.ymlfile, for example:Copy to Clipboard Copied! Toggle word wrap Toggle overflow Apply the
cgu-platform-operator-upgrade-prep.ymlfile to the hub cluster by running the following command:oc apply -f cgu-platform-operator-upgrade-prep.yml
$ oc apply -f cgu-platform-operator-upgrade-prep.ymlCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the process. Upon completion, ensure that the policy is compliant by running the following command:
oc get policies --all-namespaces
$ oc get policies --all-namespacesCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Create the ClusterGroupUpgrade CR for the platform and the Operator update with the spec.enable field set to false.

Save the contents of the platform and Operator update ClusterGroupUpgrade CR with the policies and the target clusters to the cgu-platform-operator-upgrade.yml file, as shown in the following example:
Apply the cgu-platform-operator-upgrade.yml file to the hub cluster by running the following command:

    $ oc apply -f cgu-platform-operator-upgrade.yml
Optional: Pre-cache the images for the platform and the Operator update.
Enable pre-caching in the
ClusterGroupUpgradeCR by running the following command:oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-du-upgrade \ --patch '{"spec":{"preCaching": true}}' --type=merge$ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-du-upgrade \ --patch '{"spec":{"preCaching": true}}' --type=mergeCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the update process and wait for the pre-caching to complete. Check the status of pre-caching by running the following command on the managed cluster:
oc get jobs,pods -n openshift-talm-pre-cache
$ oc get jobs,pods -n openshift-talm-pre-cacheCopy to Clipboard Copied! Toggle word wrap Toggle overflow Check if the pre-caching is completed before starting the update by running the following command:
oc get cgu cgu-du-upgrade -ojsonpath='{.status.conditions}'$ oc get cgu cgu-du-upgrade -ojsonpath='{.status.conditions}'Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Start the platform and Operator update.
Enable the
cgu-du-upgradeClusterGroupUpgradeCR to start the platform and the Operator update by running the following command:oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-du-upgrade \ --patch '{"spec":{"enable":true, "preCaching": false}}' --type=merge$ oc --namespace=default patch clustergroupupgrade.ran.openshift.io/cgu-du-upgrade \ --patch '{"spec":{"enable":true, "preCaching": false}}' --type=mergeCopy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the process. Upon completion, ensure that the policy is compliant by running the following command:
oc get policies --all-namespaces
$ oc get policies --all-namespacesCopy to Clipboard Copied! Toggle word wrap Toggle overflow NoteThe CRs for the platform and Operator updates can be created from the beginning by configuring the setting to
spec.enable: true. In this case, the update starts immediately after pre-caching completes and there is no need to manually enable the CR.Both pre-caching and the update create extra resources, such as policies, placement bindings, placement rules, managed cluster actions, and managed cluster view, to help complete the procedures. Setting the
afterCompletion.deleteObjectsfield totruedeletes all these resources after the updates complete.
9.3.6. Removing Performance Addon Operator subscriptions from deployed clusters with PolicyGenerator CRs
In earlier versions of OpenShift Container Platform, the Performance Addon Operator provided automatic, low latency performance tuning for applications. In OpenShift Container Platform 4.11 or later, these functions are part of the Node Tuning Operator.
Do not install the Performance Addon Operator on clusters running OpenShift Container Platform 4.11 or later. If you upgrade to OpenShift Container Platform 4.11 or later, the Node Tuning Operator automatically removes the Performance Addon Operator.
You need to remove any policies that create Performance Addon Operator subscriptions to prevent a re-installation of the Operator.
The reference DU profile includes the Performance Addon Operator in the PolicyGenerator CR acm-common-ranGen.yaml. To remove the subscription from deployed managed clusters, you must update acm-common-ranGen.yaml.
If you install Performance Addon Operator 4.10.3-5 or later on OpenShift Container Platform 4.11 or later, the Performance Addon Operator detects the cluster version and automatically hibernates to avoid interfering with the Node Tuning Operator functions. However, to ensure best performance, remove the Performance Addon Operator from your OpenShift Container Platform 4.11 clusters.
Prerequisites
- Create a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for ArgoCD.
- Update to OpenShift Container Platform 4.11 or later.
-
Log in as a user with
cluster-adminprivileges.
Procedure
Change the complianceType to mustnothave for the Performance Addon Operator namespace, Operator group, and subscription in the acm-common-ranGen.yaml file.
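The following is a hedged sketch of the change. The PaoSubscriptionNS.yaml, PaoSubscriptionOperGroup.yaml, and PaoSubscription.yaml source CR names are assumptions; use the file names that your acm-common-ranGen.yaml already references:

    policies:
      - name: common-subscriptions-policy
        manifests:
          - path: source-crs/PaoSubscriptionNS.yaml
            complianceType: mustnothave
          - path: source-crs/PaoSubscriptionOperGroup.yaml
            complianceType: mustnothave
          - path: source-crs/PaoSubscription.yaml
            complianceType: mustnothave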
Merge the changes with your custom site repository and wait for the ArgoCD application to synchronize the change to the hub cluster. The status of the
common-subscriptions-policypolicy changes toNon-Compliant. - Apply the change to your target clusters by using the Topology Aware Lifecycle Manager. For more information about rolling out configuration changes, see the "Additional resources" section.
Monitor the process. When the status of the
common-subscriptions-policypolicy for a target cluster isCompliant, the Performance Addon Operator has been removed from the cluster. Get the status of thecommon-subscriptions-policyby running the following command:oc get policy -n ztp-common common-subscriptions-policy
$ oc get policy -n ztp-common common-subscriptions-policyCopy to Clipboard Copied! Toggle word wrap Toggle overflow -
Delete the Performance Addon Operator namespace, Operator group and subscription CRs from
policies.manifestsin theacm-common-ranGen.yamlfile. - Merge the changes with your custom site repository and wait for the ArgoCD application to synchronize the change to the hub cluster. The policy remains compliant.
9.3.7. Pre-caching user-specified images with TALM on single-node OpenShift clusters
You can pre-cache application-specific workload images on single-node OpenShift clusters before upgrading your applications.
You can specify the configuration options for the pre-caching jobs using the following custom resources (CR):
-
PreCachingConfigCR -
ClusterGroupUpgradeCR
All fields in the PreCachingConfig CR are optional.
Example PreCachingConfig CR
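The following is a hedged sketch of a PreCachingConfig CR. All values are illustrative, and the numbered comments correspond to the annotations that follow:

    apiVersion: ran.openshift.io/v1alpha1
    kind: PreCachingConfig
    metadata:
      name: exampleconfig
      namespace: default
    spec:
      overrides:                                    # 1
        platformImage: quay.io/openshift-release-dev/ocp-release@sha256:<digest>
        operatorsIndexes:
          - registry.example.com:5000/custom-redhat-operators:1.0.0
        operatorsPackagesAndChannels:
          - local-storage-operator: stable
          - ptp-operator: stable
          - sriov-network-operator: stable
      spaceRequired: 30 GiB                         # 2
      excludePrecachePatterns:                      # 3
        - aws
        - vsphere
      additionalImages:                             # 4
        - quay.io/example/application@sha256:<digest>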
- 1
- By default, TALM automatically populates the
platformImage,operatorsIndexes, and theoperatorsPackagesAndChannelsfields from the policies of the managed clusters. You can specify values to override the default TALM-derived values for these fields. - 2
- Specifies the minimum required disk space on the cluster. If unspecified, TALM defines a default value for OpenShift Container Platform images. The disk space field must include an integer value and the storage unit. For example:
40 GiB,200 MB,1 TiB. - 3
- Specifies the images to exclude from pre-caching based on image name matching.
- 4
- Specifies the list of additional images to pre-cache.
Example ClusterGroupUpgrade CR with PreCachingConfig CR reference
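A minimal sketch, assuming the PreCachingConfig CR shown above and two illustrative cluster names:

    apiVersion: ran.openshift.io/v1alpha1
    kind: ClusterGroupUpgrade
    metadata:
      name: cgu
      namespace: default
    spec:
      clusters:
        - sno1
        - sno2
      preCaching: true
      preCachingConfigRef:
        name: exampleconfig
        namespace: default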
9.3.7.1. Creating the custom resources for pre-caching
You must create the PreCachingConfig CR before or concurrently with the ClusterGroupUpgrade CR.
Create the
PreCachingConfigCR with the list of additional images you want to pre-cache.Copy to Clipboard Copied! Toggle word wrap Toggle overflow Create a
ClusterGroupUpgradeCR with thepreCachingfield set totrueand specify thePreCachingConfigCR created in the previous step:Copy to Clipboard Copied! Toggle word wrap Toggle overflow WarningOnce you install the images on the cluster, you cannot change or delete them.
When you want to start pre-caching the images, apply the
ClusterGroupUpgradeCR by running the following command:oc apply -f cgu.yaml
$ oc apply -f cgu.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow
TALM verifies the ClusterGroupUpgrade CR.
From this point, you can continue with the TALM pre-caching workflow.
All sites are pre-cached concurrently.
Verification
Check the pre-caching status on the hub cluster where the
ClusterUpgradeGroupCR is applied by running the following command:oc get cgu <cgu_name> -n <cgu_namespace> -oyaml
$ oc get cgu <cgu_name> -n <cgu_namespace> -oyamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow The pre-caching configurations are validated by checking if the managed policies exist. Valid configurations of the
ClusterGroupUpgradeand thePreCachingConfigCRs result in the following statuses:Example output of valid CRs
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example of an invalid PreCachingConfig CR
Type: "PrecacheSpecValid" Status: False, Reason: "PrecacheSpecIncomplete" Message: "Precaching spec is incomplete: failed to get PreCachingConfig resource due to PreCachingConfig.ran.openshift.io "<pre-caching_cr_name>" not found"
Type: "PrecacheSpecValid" Status: False, Reason: "PrecacheSpecIncomplete" Message: "Precaching spec is incomplete: failed to get PreCachingConfig resource due to PreCachingConfig.ran.openshift.io "<pre-caching_cr_name>" not found"Copy to Clipboard Copied! Toggle word wrap Toggle overflow You can find the pre-caching job by running the following command on the managed cluster:
oc get jobs -n openshift-talo-pre-cache
$ oc get jobs -n openshift-talo-pre-cacheCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example of pre-caching job in progress
NAME COMPLETIONS DURATION AGE pre-cache 0/1 1s 1s
NAME COMPLETIONS DURATION AGE pre-cache 0/1 1s 1sCopy to Clipboard Copied! Toggle word wrap Toggle overflow You can check the status of the pod created for the pre-caching job by running the following command:
oc describe pod pre-cache -n openshift-talo-pre-cache
$ oc describe pod pre-cache -n openshift-talo-pre-cacheCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example of pre-caching job in progress
Type Reason Age From Message Normal SuccesfulCreate 19s job-controller Created pod: pre-cache-abcd1
Type Reason Age From Message Normal SuccesfulCreate 19s job-controller Created pod: pre-cache-abcd1Copy to Clipboard Copied! Toggle word wrap Toggle overflow You can get live updates on the status of the job by running the following command:
oc logs -f pre-cache-abcd1 -n openshift-talo-pre-cache
$ oc logs -f pre-cache-abcd1 -n openshift-talo-pre-cacheCopy to Clipboard Copied! Toggle word wrap Toggle overflow To verify the pre-cache job is successfully completed, run the following command:
oc describe pod pre-cache -n openshift-talo-pre-cache
$ oc describe pod pre-cache -n openshift-talo-pre-cacheCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example of completed pre-cache job
Type Reason Age From Message Normal SuccesfulCreate 5m19s job-controller Created pod: pre-cache-abcd1 Normal Completed 19s job-controller Job completed
Type Reason Age From Message Normal SuccesfulCreate 5m19s job-controller Created pod: pre-cache-abcd1 Normal Completed 19s job-controller Job completedCopy to Clipboard Copied! Toggle word wrap Toggle overflow To verify that the images are successfully pre-cached on the single-node OpenShift, do the following:
Enter into the node in debug mode:
oc debug node/cnfdf00.example.lab
$ oc debug node/cnfdf00.example.labCopy to Clipboard Copied! Toggle word wrap Toggle overflow Change root to
host:chroot /host/
$ chroot /host/Copy to Clipboard Copied! Toggle word wrap Toggle overflow Search for the desired images:
sudo podman images | grep <operator_name>
$ sudo podman images | grep <operator_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow
9.3.8. About the auto-created ClusterGroupUpgrade CR for GitOps ZTP
TALM has a controller called ManagedClusterForCGU that monitors the Ready state of the ManagedCluster CRs on the hub cluster and creates the ClusterGroupUpgrade CRs for GitOps Zero Touch Provisioning (ZTP).
For any managed cluster in the Ready state without a ztp-done label applied, the ManagedClusterForCGU controller automatically creates a ClusterGroupUpgrade CR in the ztp-install namespace with its associated RHACM policies that are created during the GitOps ZTP process. TALM then remediates the set of configuration policies that are listed in the auto-created ClusterGroupUpgrade CR to push the configuration CRs to the managed cluster.
If there are no policies for the managed cluster at the time when the cluster becomes Ready, a ClusterGroupUpgrade CR with no policies is created. Upon completion of the ClusterGroupUpgrade the managed cluster is labeled as ztp-done. If there are policies that you want to apply for that managed cluster, manually create a ClusterGroupUpgrade as a day-2 operation.
Example of an auto-created ClusterGroupUpgrade CR for GitOps ZTP
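The following is a hedged sketch of what such an auto-created CR can look like; the cluster name and policy names are illustrative:

    apiVersion: ran.openshift.io/v1alpha1
    kind: ClusterGroupUpgrade
    metadata:
      name: spoke1
      namespace: ztp-install
      ownerReferences:
        - apiVersion: cluster.open-cluster-management.io/v1
          kind: ManagedCluster
          name: spoke1
          blockOwnerDeletion: true
          controller: true
    spec:
      clusters:
        - spoke1
      enable: true
      managedPolicies:
        - common-config-policy
        - common-subscriptions-policy
        - group-du-sno-config-policy
        - spoke1-config-policy
      remediationStrategy:
        maxConcurrency: 1
        timeout: 240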