Chapter 1. Upgrading by using the Operator
Upgrades through the Red Hat Advanced Cluster Security for Kubernetes (RHACS) Operator are performed automatically or manually, depending on the Update approval option you chose at installation.
RHACS 4.0 includes a significant architectural change: Central's database moves to PostgreSQL. Because of this change, the RHACS 4.0 Operator is published in a new subscription channel. Therefore, as part of the upgrade instructions, you must manually change the subscription channel to upgrade from RHACS 3.74 to RHACS 4.0.
- Because of the database-related changes introduced in RHACS 4.0, you must manually upgrade to RHACS 4.0 even if you selected Automatic in the Update approval field.
- You must be using RHACS 3.74 to upgrade to RHACS 4.0. If you are using a version older than 3.74, you must first upgrade to RHACS 3.74 and then upgrade to RHACS 4.0.
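The upgrade-path rule above can be expressed as a small pre-flight check. The following is a minimal sketch only: the function name can_upgrade_to_4_0 is hypothetical, and it assumes you already know the installed version string. It does not verify that you are on the latest 3.74 patch release.

```shell
#!/usr/bin/env bash
# Sketch: decide whether an installed RHACS version may upgrade directly to 4.0.
# can_upgrade_to_4_0 is a hypothetical helper name, not part of RHACS.

can_upgrade_to_4_0() {
    local version="$1"
    case "$version" in
        3.74|3.74.*) return 0 ;;  # 3.74 and its patch releases qualify
        *)           return 1 ;;  # any other version must first move to 3.74
    esac
}

# Example usage with illustrative version strings.
if can_upgrade_to_4_0 "3.74.2"; then
    echo "3.74.2: eligible for the upgrade to 4.0"
fi
if ! can_upgrade_to_4_0 "3.73.1"; then
    echo "3.73.1: upgrade to 3.74 first"
fi
```

A check like this might be useful at the top of an automation script so that the channel change is never attempted from an unsupported starting version.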
1.1. Preparing to upgrade
Before you upgrade your Red Hat Advanced Cluster Security for Kubernetes (RHACS) version, you must:
- Verify that you are running the latest patch release version of the RHACS Operator 3.74.
- Back up your existing Central database.
1.2. Modifying Central custom resource
The Central DB service requires persistent storage. If you have not configured a default storage class for the Central cluster that is an SSD or is high performance, you must update the Central custom resource to configure the storage class for the Central DB persistent volume claim (PVC).
Skip this section if you have already configured a default storage class for Central.
Procedure
- Update the Central custom resource with the following configuration:
      spec:
        central:
          db:
            isEnabled: Default
            persistence:
              persistentVolumeClaim:
                claimName: central-db
                size: 100Gi
                storageClassName: <storage-class-name>
1.3. Modifying Central custom resource for external database
External PostgreSQL support is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
Prerequisites
- You must provision a database that supports PostgreSQL 13 and you must only use it for RHACS.
- The database user must be a superuser with the ability to create and delete databases.
- A multitenant database is currently unsupported.
- Connections through PgBouncer are not supported.
Procedure
- Create a password secret in the deployed namespace by using the OpenShift Container Platform web console or the terminal.
  - On the OpenShift Container Platform web console, go to the Workloads → Secrets page. Create a Key/Value secret with the key password and the value as the path of a plain text file containing the password for the superuser of the provisioned database.
  - Or, run the following command in your terminal:
        $ oc create secret generic external-db-password \
            --from-file=password=<password.txt>
- Go to the Red Hat Advanced Cluster Security for Kubernetes operator page in the OpenShift Container Platform web console. Select Central in the top navigation bar and select the instance you want to connect to the database.
- Go to the YAML editor view.
- For db.passwordSecret.name, specify the referenced secret that you created in earlier steps. For example, external-db-password.
- For db.connectionString, specify the connection string in keyword=value format, for example, host=<host> port=5432 user=postgres sslmode=verify-ca
- For db.persistence, delete the entire block.
- If necessary, you can specify a Certificate Authority for Central to trust the database certificate by adding a TLS block under the top-level spec, as shown in the following example:
  Update the Central custom resource with the following configuration:
      spec:
        tls:
          additionalCAs:
          - name: db-ca
            content: |
              <certificate>
        central:
          db:
            isEnabled: Default
            connectionString: "host=<host> port=5432 user=<user> sslmode=verify-ca"
            passwordSecret:
              name: external-db-password
  You must not change the value of isEnabled to Enabled.
- Click Save.
1.4. Changing subscription channel
You can change the update channel for the RHACS Operator by using the OpenShift Container Platform web console or by using the command line. For upgrading to RHACS 4.0 from RHACS 3.74, you must change the update channel.
You must change the subscription channel for all clusters where you have installed RHACS Operator, including Central and all Secured clusters.
Prerequisites
- You must verify that you are using the latest RHACS 3.74 Operator and there are no pending manual Operator upgrades.
- You must verify that you have backed up your existing Central database.
- You have access to an OpenShift Container Platform cluster web console using an account with cluster-admin permissions.
Changing the subscription channel by using the web console
Use the following instructions for changing the subscription channel by using the web console:
Procedure
- In the Administrator perspective of the OpenShift Container Platform web console, go to Operators → Installed Operators.
- Locate the RHACS Operator and click on it.
- Click the Subscription tab.
- Click the name of the update channel under Update Channel.
- Select stable, then click Save.
- For subscriptions with an Automatic approval strategy, the update begins automatically. Navigate back to the Operators → Installed Operators page to monitor the progress of the update. When complete, the status changes to Succeeded and Up to date.
  For subscriptions with a Manual approval strategy, you can manually approve the update from the Subscription tab.
Changing the subscription channel by using the command line
Use the following instructions for changing the subscription channel by using the command line:
Procedure
- Run the following command to change the subscription channel to stable:
      $ oc -n rhacs-operator \
          patch subscriptions.operators.coreos.com rhacs-operator \
          --type=merge --patch='{ "spec": { "channel": "stable" }}'
  If you use Kubernetes, enter kubectl instead of oc.
During the update, the RHACS Operator provisions a new deployment called central-db and your data begins migrating. The migration takes around 30 minutes and happens only once, when you upgrade.
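Because the channel must be changed on every cluster where the RHACS Operator is installed, it can help to generate the patch command once and review it before running it on each cluster. The following sketch only prints the command; the helper name channel_patch is hypothetical, while the namespace and subscription name are the ones used in the procedure above.

```shell
#!/usr/bin/env bash
# Sketch: build the merge patch for a subscription channel and print the
# resulting command for review. Nothing is sent to a cluster here.
# channel_patch is a hypothetical helper name.

channel_patch() {
    # $1 is the channel name, for example "stable".
    printf '{"spec":{"channel":"%s"}}' "$1"
}

patch="$(channel_patch stable)"
echo "oc -n rhacs-operator patch subscriptions.operators.coreos.com rhacs-operator --type=merge --patch='${patch}'"
```

After you run the printed command, one way to confirm the change is to read the channel back with oc get and a jsonpath expression for .spec.channel on the same subscription.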
1.5. Rolling back an Operator upgrade
To roll back an Operator upgrade, you must perform the steps described in one of the following sections. You can roll back an Operator upgrade by using the CLI or the OpenShift Container Platform web console.
If you are rolling back from RHACS 4.0, you can roll back only to the latest patch release version of RHACS 3.74.
1.5.1. Rolling back an Operator upgrade by using the CLI
You can roll back the Operator version by using CLI commands.
Procedure
- Delete the OLM subscription by running the following command:
  - For OpenShift Container Platform, run the following command:
        $ oc -n rhacs-operator delete subscription rhacs-operator
  - For Kubernetes, run the following command:
        $ kubectl -n rhacs-operator delete subscription rhacs-operator
- Delete the cluster service version (CSV) by running the following command:
  - For OpenShift Container Platform, run the following command:
        $ oc -n rhacs-operator delete csv -l operators.coreos.com/rhacs-operator.rhacs-operator
  - For Kubernetes, run the following command:
        $ kubectl -n rhacs-operator delete csv -l operators.coreos.com/rhacs-operator.rhacs-operator
- Determine the previous version you want to roll back to by choosing one of the following options:
  - If the current Central instance is running, query the RHACS API to get the rollback version by running the following command:
        $ curl -k -s -u <user>:<password> https://<central hostname>/v1/centralhealth/upgradestatus | jq -r .upgradeStatus.forceRollbackTo
  - If the current Central instance is not running, perform the following steps:
    Note: This procedure can only be used for RHACS release 3.74 and earlier when the rocksdb database is installed.
    - Ensure the Central deployment is scaled down by running the following command:
      - For OpenShift Container Platform, run the following command:
            $ oc scale -n <central namespace> --replicas=0 deploy/central
      - For Kubernetes, run the following command:
            $ kubectl scale -n <central namespace> --replicas=0 deploy/central
    - Save the following pod spec as a YAML file:
          apiVersion: v1
          kind: Pod
          metadata:
            name: get-previous-db-version
          spec:
            containers:
            - name: get-previous-db-version
              image: registry.redhat.io/advanced-cluster-security/rhacs-main-rhel8:<rollback version>
              command:
              - sh
              args:
              - '-c'
              - "cat /var/lib/stackrox/.previous/migration_version.yaml | grep '^image:' | cut -f 2 -d : | tr -d ' '"
              volumeMounts:
              - name: stackrox-db
                mountPath: /var/lib/stackrox
            volumes:
            - name: stackrox-db
              persistentVolumeClaim:
                claimName: stackrox-db
    - Create a pod in your Central namespace by running the following command using the YAML file that you saved:
      - For OpenShift Container Platform, run the following command:
            $ oc create -n <central namespace> -f pod.yaml
      - For Kubernetes, run the following command:
            $ kubectl create -n <central namespace> -f pod.yaml
    - After pod creation is complete, get the version by running the following command:
      - For OpenShift Container Platform, run the following command:
            $ oc logs -n <central namespace> get-previous-db-version
      - For Kubernetes, run the following command:
            $ kubectl logs -n <central namespace> get-previous-db-version
- Edit the central-config.yaml ConfigMap to set the maintenance.forceRollbackVersion:<version> parameter by running the following command:
  - For OpenShift Container Platform, run the following command:
        $ oc get configmap -n <central namespace> central-config -o yaml | sed -e "s/forceRollbackVersion: none/forceRollbackVersion: <version>/" | oc -n <central namespace> apply -f -
  - For Kubernetes, run the following command:
        $ kubectl get configmap -n <central namespace> central-config -o yaml | sed -e "s/forceRollbackVersion: none/forceRollbackVersion: <version>/" | kubectl -n <central namespace> apply -f -
- Set the image for the Central deployment using the version string shown in Step 3 as the image tag. For example, run the following command:
  - For OpenShift Container Platform, run the following command:
        $ oc set image -n <central namespace> deploy/central central=registry.redhat.io/advanced-cluster-security/rhacs-main-rhel8:<version>
  - For Kubernetes, run the following command:
        $ kubectl set image -n <central namespace> deploy/central central=registry.redhat.io/advanced-cluster-security/rhacs-main-rhel8:<version>
Verification
- Ensure that the Central pod starts and has a ready status. If the pod crashes, check the logs to see if the backup was restored. A successful log message appears similar to the following example:
      Clone to Migrate ".previous", ""
- Reinstall the Operator on the rolled back channel. For example, 3.74.2 is installed on the rhacs-3.74 channel.
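The sed substitution used in the ConfigMap step only matches when the current value is exactly none, so it can be worth previewing it on a local sample before piping the result to apply. The following is a minimal sketch under stated assumptions: the helper name set_rollback_version and the sample text are hypothetical, and the real central-config.yaml content contains more keys than shown.

```shell
#!/usr/bin/env bash
# Sketch: preview the forceRollbackVersion substitution on local text.
# set_rollback_version is a hypothetical helper; it reads YAML on stdin and
# applies the same sed expression as the procedure, with $1 as the version.

set_rollback_version() {
    sed -e "s/forceRollbackVersion: none/forceRollbackVersion: $1/"
}

# Hypothetical excerpt standing in for the central-config.yaml value.
sample='maintenance:
  forceRollbackVersion: none'

printf '%s\n' "$sample" | set_rollback_version "3.74.3"
```

If the existing value is anything other than none, the substitution leaves the text unchanged without reporting an error, so compare the output with the input before applying it.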
1.5.2. Rolling back an Operator upgrade by using the web console
You can roll back the Operator version by using the OpenShift Container Platform web console.
Prerequisites
- You have access to an OpenShift Container Platform cluster web console using an account with cluster-admin permissions.
Procedure
- Navigate to the Operators → Installed Operators page.
- Locate the RHACS Operator and click on it.
- On the Operator Details page, select Uninstall Operator from the Actions list. Following this action, the Operator stops running and no longer receives updates.
- Determine the previous version you want to roll back to by choosing one of the following options:
  - If the current Central instance is running, you can query the RHACS API to get the rollback version by running the following command from a terminal window:
        $ curl -k -s -u <user>:<password> https://<central hostname>/v1/centralhealth/upgradestatus | jq -r .upgradeStatus.forceRollbackTo
  - You can create a pod and extract the previous version by performing the following steps:
    Note: This procedure can only be used for RHACS release 3.74 and earlier when the rocksdb database is installed.
    - Navigate to Workloads → Deployments → central.
    - Under Deployment details, click the down arrow next to the pod count to scale down the pod.
    - Navigate to Workloads → Pods → Create Pod and paste the contents of the pod spec as shown in the following example into the editor:
          apiVersion: v1
          kind: Pod
          metadata:
            name: get-previous-db-version
          spec:
            containers:
            - name: get-previous-db-version
              image: registry.redhat.io/advanced-cluster-security/rhacs-main-rhel8:<rollback version>
              command:
              - sh
              args:
              - '-c'
              - "cat /var/lib/stackrox/.previous/migration_version.yaml | grep '^image:' | cut -f 2 -d : | tr -d ' '"
              volumeMounts:
              - name: stackrox-db
                mountPath: /var/lib/stackrox
            volumes:
            - name: stackrox-db
              persistentVolumeClaim:
                claimName: stackrox-db
    - Click Create.
    - After the pod is created, click the Logs tab to get the version string.
- Update the rollback configuration by performing the following steps:
  - Navigate to Workloads → ConfigMaps → central-config and select Edit ConfigMap from the Actions list.
  - Find the forceRollbackVersion line in the value of the central-config.yaml key.
  - Replace none with 3.74.3, and then save the file.
- Update Central to the earlier version by performing the following steps:
  - Navigate to Workloads → Deployments → central and select Edit Deployment from the Actions list.
  - Update the image name, and then save the changes.
Verification
- Ensure that the Central pod starts and has a ready status. If the pod crashes, check the logs to see if the backup was restored. A successful log message appears similar to the following example:
      Clone to Migrate ".previous", ""
- Reinstall the Operator on the rolled back channel. For example, 3.74.3 is installed on the rhacs-3.74 channel.
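The get-previous-db-version pod spec used in both rollback procedures extracts the version with a short grep, cut, and tr pipeline. The sketch below runs that same pipeline on a local sample so you can see what each stage contributes. The helper name extract_previous_version and the sample file content are hypothetical; the real migration_version.yaml might contain different keys.

```shell
#!/usr/bin/env bash
# Sketch: the pipeline from the pod spec, applied to text on stdin.
# extract_previous_version is a hypothetical helper name.

extract_previous_version() {
    # grep keeps only the line that starts with "image:",
    # cut takes the text after the first colon,
    # tr removes the space that follows the colon.
    grep '^image:' | cut -f 2 -d : | tr -d ' '
}

# Hypothetical stand-in for /var/lib/stackrox/.previous/migration_version.yaml.
printf 'image: 3.73.0\nsomeOtherKey: 1\n' | extract_previous_version
```

Because cut keeps only the second colon-separated field, this approach assumes the value itself contains no colon.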
1.6. Troubleshooting Operator upgrade issues
Follow the instructions in this section to investigate and resolve upgrade-related issues for the RHACS Operator.
1.6.1. Central DB cannot be scheduled
Follow the instructions here to troubleshoot a failing Central DB pod during an upgrade:
- Check the status of the central-db pod:
      $ oc -n <namespace> get pod -l app=central-db
  If you use Kubernetes, enter kubectl instead of oc.
- If the status of the pod is Pending, use the describe command to get more details:
      $ oc -n <namespace> describe po/<central-db-pod-name>
  If you use Kubernetes, enter kubectl instead of oc.
- You might see the FailedScheduling warning message:
      Type     Reason            Age   From               Message
      ----     ------            ----  ----               -------
      Warning  FailedScheduling  54s   default-scheduler  0/7 nodes are available: 1 Insufficient memory, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 4 Insufficient cpu. preemption: 0/7 nodes are available: 3 Preemption is not helpful for scheduling, 4 No preemption victims found for incoming pod.
- This warning message suggests that the schedulable nodes had insufficient memory or CPU to accommodate the pod’s resource requirements. If you have a small environment, consider increasing resources on the nodes or adding a larger node that can support the database.
- Otherwise, consider decreasing the resource requirements for the central-db pod in the custom resource under central → db → resources. However, running Central with fewer resources than the recommended minimum might lead to degraded performance for RHACS.
1.6.2. Central or Secured cluster fails to deploy
Check the custom resource conditions to find the issue when the RHACS Operator does either of the following:
- Fails to deploy Central or Secured Cluster.
- Fails to apply CR changes to actual resources.
For Central, run the following command to check the conditions:
    $ oc -n rhacs-operator describe centrals.platform.stackrox.io
If you use Kubernetes, enter kubectl instead of oc.
For Secured clusters, run the following command to check the conditions:
    $ oc -n rhacs-operator describe securedclusters.platform.stackrox.io
If you use Kubernetes, enter kubectl instead of oc.
You can identify configuration errors from the conditions output:
Example output
    Conditions:
      Last Transition Time:  2023-04-19T10:49:57Z
      Status:                False
      Type:                  Deployed
      Last Transition Time:  2023-04-19T10:49:57Z
      Status:                True
      Type:                  Initialized
      Last Transition Time:  2023-04-19T10:59:10Z
      Message:               Deployment.apps "central" is invalid: spec.template.spec.containers[0].resources.requests: Invalid value: "50": must be less than or equal to cpu limit
      Reason:                ReconcileError
      Status:                True
      Type:                  Irreconcilable
      Last Transition Time:  2023-04-19T10:49:57Z
      Message:               No proxy configuration is desired
      Reason:                NoProxyConfig
      Status:                False
      Type:                  ProxyConfigFailed
      Last Transition Time:  2023-04-19T10:49:57Z
      Message:               Deployment.apps "central" is invalid: spec.template.spec.containers[0].resources.requests: Invalid value: "50": must be less than or equal to cpu limit
      Reason:                InstallError
      Status:                True
      Type:                  ReleaseFailed
Additionally, you can view the RHACS Operator pod logs to find more information about the issue. Run the following command to view the logs:
    $ oc -n rhacs-operator logs deploy/rhacs-operator-controller-manager manager
If you use Kubernetes, enter kubectl instead of oc.
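When the conditions list is long, a small filter can surface only the entries that report an error. The following sketch reads describe output on stdin and prints the Type and Message of each condition whose Reason ends in Error. That selection rule is an assumption based on the example output in this section (ReconcileError, InstallError), and the function name failed_conditions is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list conditions whose Reason ends in "Error" from describe output.
# failed_conditions is a hypothetical helper; the "ends in Error" rule is an
# assumption drawn from the example output, not a documented contract.

failed_conditions() {
    awk '
        # Print the collected condition if it looks like an error, then reset.
        function emit() {
            if (reason ~ /Error$/) print type ": " message
            type = ""; reason = ""; message = ""
        }
        # Each condition block begins with its "Last Transition Time" line.
        /^[ \t]*Last Transition Time:/ { emit(); next }
        /^[ \t]*Message:/ { sub(/^[ \t]*Message:[ \t]*/, ""); message = $0; next }
        /^[ \t]*Reason:/  { reason = $2; next }
        /^[ \t]*Type:/    { type = $2; next }
        END { emit() }
    '
}

# Example with a shortened, illustrative conditions excerpt.
printf '%s\n' \
    'Conditions:' \
    '  Last Transition Time:  2023-04-19T10:59:10Z' \
    '  Message:               requests must be less than or equal to cpu limit' \
    '  Reason:                ReconcileError' \
    '  Status:                True' \
    '  Type:                  Irreconcilable' \
    | failed_conditions
```

In practice you would pipe the output of the describe commands shown in this section into the function instead of the printf sample.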