Chapter 2. Deploying and configuring the high availability for Compute instances service
The Red Hat OpenStack Services on OpenShift (RHOSO) high availability for Compute instances (Instance HA) service is managed by the infra-operator, which RHOSO installs by default.
You must deploy an Instance HA service to automate the process of monitoring which Compute nodes have failed and, if necessary, to evacuate instances from the failed Compute nodes. For more information, see Deploying the Instance HA service.
You must not use the Instance HA service to evacuate Compute nodes that host your storage in a RHOSO hyperconverged infrastructure (HCI) environment. In HCI environments, you must tag the subset of your Compute nodes that do not host the Red Hat Ceph Storage services. For more information, see Tag images, flavors, or host aggregates for evacuation.
2.1. Deploying the Instance HA service
You must deploy a Red Hat OpenStack Services on OpenShift (RHOSO) high availability for Compute instances (Instance HA) service to automate the process of monitoring failed Compute nodes and, if necessary, to evacuate instances from the failed Compute nodes.
If you have multiple clouds defined, you can create a separate Instance HA service pod to monitor each cloud. For more information, see Configuring the Instance HA service pod specification.
Procedure
1. Create a YAML Instance HA service manifest file, for example Instance-HA-service-0.yaml:
   - Define the specification .spec and name .metadata.name of your Instance HA service pod. For more information, see Configuring the Instance HA service pod specification.
   - Configure .spec.fencingSecret to specify the name of the YAML file in which you have configured the fencing agents of all the Compute nodes that can be evacuated. In this example, the file is called fencing-0. For more information, see Configuring the fencing of Compute nodes.
2. Apply the Instance HA service manifest and the fencingSecret files:

       $ oc apply -f fencing-0.yaml
       $ oc apply -f Instance-HA-service-0.yaml

3. Verify that the Instance HA service pod Message field displays Setup complete before continuing:

       $ oc get instanceha -w
       NAME           STATUS   MESSAGE
       instanceha-0   True     Setup complete

   Note: A unique string is appended to the .metadata.name that you specified in the manifest file.
4. Determine the fully qualified name and status of your deployed Instance HA service pod:

       $ oc get pods | grep instanceha
       instanceha-0-54f865b6dd-w6h4t   1/1   Running   0   10h

   In this example, the fully qualified name is instanceha-0-54f865b6dd-w6h4t.

   Warning: A new, unique fully qualified name is created every time the Instance HA service pod restarts. All log entries associated with the previous name are removed. For more information, see Troubleshooting the Instance HA service.
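Because the pod name changes on every restart, it can help to look it up from the service name instead of copying it by hand. The following is a minimal sketch, not part of the product: the function name instanceha_pod_name is hypothetical, and it assumes the pod name always starts with the .metadata.name followed by a hyphen, as in the example output above.

```shell
# Hypothetical helper: print the current fully qualified pod name for an
# Instance HA service. $1 is the .metadata.name from the manifest,
# for example instanceha-0.
instanceha_pod_name() {
    # --no-headers drops the column header so only pod rows are scanned.
    # index() == 1 means the pod name starts with "<service name>-".
    oc get pods --no-headers |
        awk -v prefix="$1-" 'index($1, prefix) == 1 { print $1; exit }'
}
```

For example, `oc logs "$(instanceha_pod_name instanceha-0)"` would read the logs of whichever pod currently backs the instanceha-0 service.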
Next steps
- If you did not specify the .spec.instanceHaConfigMap when creating the Instance HA service pod specification, the infra-operator automatically creates a ConfigMap called instanceha-config. This ConfigMap provides the default values of the Instance HA service parameters that you can modify as needed. For more information, see Instance HA service parameters and Editing the Instance HA service parameters.
- You can completely remove the Instance HA service. For more information, see Removing the Instance HA service.
2.1.1. Configuring the Instance HA service pod specification
When you deploy the Red Hat OpenStack Services on OpenShift (RHOSO) high availability for Compute instances (Instance HA) service, you must create a YAML Instance HA service manifest file to define the specification .spec of your Instance HA service pod.
In this example, the YAML Instance HA service manifest file is Instance-HA-service-0.yaml.
$ cat Instance-HA-service-0.yaml
---
apiVersion: instanceha.openstack.org/v1beta1
kind: InstanceHa
metadata:
  name: instanceha-0
spec:
  caBundleSecretName: combined-ca-bundle
  fencingSecret: fencing-0
  #instanceHaConfigMap:
  #networkAttachments: ['internalapi']
  #instanceHaKdumpPort:
  #openStackCloud: "default"
  #openStackConfigMap:
  #openStackConfigSecret:
  #nodeSelector:
- Use .spec.caBundleSecretName to specify the name of the secret containing the CA Certificate Bundle that was used during the deployment of RHOSO. By default this parameter is set to combined-ca-bundle, but this value might change if you implement custom TLS certificates. For more information, see Adding custom TLS certificates for Red Hat OpenStack Services on OpenShift in Configuring security services.
- Use .spec.fencingSecret to specify the name of the YAML file with the configured fencing agents of all the Compute nodes that can be evacuated. In this example, this file is called fencing-0. For more information, see Configuring the fencing of Compute nodes.

  Note: All the other values for defining the specification of your Instance HA service pod are optional, which is why they are commented out in this example.
- Optional: You can create and name a YAML file containing a ConfigMap that provides your configured Instance HA service parameters. In this case, you must use .spec.instanceHaConfigMap to specify the name of this YAML file. If you do not create this file, then a ConfigMap called instanceha-config is created automatically when the Instance HA service is installed, providing the default values of the Instance HA service parameters.
- Optional: If you configure the Instance HA service to detect if a Compute node is capturing a kernel dump, then:
  - You must use .spec.networkAttachments to specify the network that receives the kdump notifications from the kdump service.
  - If you do not use the default UDP port of 7410, you must use .spec.instanceHaKdumpPort to specify the UDP port that receives the kdump notifications from the kdump service. For more information, see Detecting if a Compute node is capturing a kernel dump.
- Optional: If you have multiple clouds defined, you can create a separate Instance HA service pod to monitor each cloud. In this case, you can use the following settings to specify the required authentication details for each cloud:
  - You can use .spec.openStackCloud to specify the name of the cloud detailed in your clouds.yaml file. If you do not specify a value, then default is used.
  - You can use .spec.openStackConfigMap to specify the name of the ConfigMap containing your clouds.yaml file.
  - You can use .spec.openStackConfigSecret to specify the name of the secret containing the admin password.
- Optional: You can use .spec.nodeSelector to specify the label of the Red Hat OpenShift Container Platform (RHOCP) worker nodes that you need the Instance HA service pod to run on. For more information, see Placing pods on specific nodes using node selectors in RHOCP Nodes.
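To illustrate the multi-cloud settings, the following sketch writes a manifest for a second Instance HA service pod. Every name in it is hypothetical and chosen only for illustration: the service name instanceha-1, the fencing secret fencing-1, the cloud name cloud-b, and the ConfigMap and secret names. Only the apiVersion, kind, and field names are taken from the example manifest above.

```shell
# Write a manifest for a second Instance HA service that monitors another cloud.
# All object names below are placeholders; substitute your own.
cat > Instance-HA-service-1.yaml <<'EOF'
---
apiVersion: instanceha.openstack.org/v1beta1
kind: InstanceHa
metadata:
  name: instanceha-1
spec:
  caBundleSecretName: combined-ca-bundle
  fencingSecret: fencing-1
  openStackCloud: "cloud-b"
  openStackConfigMap: openstack-config-cloud-b
  openStackConfigSecret: openstack-config-secret-cloud-b
EOF

# When the referenced objects exist, apply it in the same way as the first manifest:
#   oc apply -f Instance-HA-service-1.yaml
```

Keeping one manifest and one fencing secret per cloud means each Instance HA service pod can be deployed and removed independently.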
2.1.2. Configuring the fencing of Compute nodes
You must fence each Compute node that is eligible for evacuation. Configure their fencing agents in the fencingSecret YAML file that you specify when deploying the Red Hat OpenStack Services on OpenShift (RHOSO) high availability for Compute instances (Instance HA) service pod.
You cannot evacuate a Compute node unless it has a configured fencing agent.
The supported fencing agents are IPMI, Redfish, and BareMetalHost (BMH), which is the fencing agent for Metal³.
You can use the FENCING_TIMEOUT parameter to specify the expected timeout for a fencing operation, in seconds. The default value is 30 seconds and the maximum configurable timeout is 120 seconds. For more information, see Editing the Instance HA service parameters.
The following is an example of a fencingSecret YAML file called fencing-0.yaml, which provides an example configuration of each of the three supported fencing agents.
You must use the Compute service (nova) hostname to identify each Compute node, for example, compute-0. You can use the following command to obtain these hostnames: $ openstack compute service list.
$ cat fencing-0.yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: fencing-0
stringData:
  fencing.yaml: |
    FencingConfig:
      compute-0:
        agent: ipmi
        ipaddr: 192.168.111.9
        ipport: 443
        login: admin
        passwd: password
      compute-1:
        agent: redfish
        ipaddr: 192.158.12.3
        ipport: 8000
        tls: 'true'
        login: admin
        passwd: password
        uuid: b7d32e6b-edbc-477d-80bf-4cda77ada8cb
      compute-2:
        agent: bmh
        host: edpm-compute-0
        namespace: openstack-edpm-ipam
        token: $2a$10$yc9Q.eHLiQmCdS0/LzxJ5.V5/lrmx8JxwFbU5X4Hdr1albfDl7wtm
- You must provide each IPMI fencing agent (agent: ipmi) with the IP connection and user authentication details of the Intelligent Platform Management Interface (IPMI).
- You must provide each Redfish fencing agent (agent: redfish) with the IP connection and user authentication details of the Redfish Host Interface.

  You must specify the ipport parameter when your Redfish Host Interface does not use the default 443 port. You must specify the value of the tls parameter in quotes as 'true'. The uuid parameter is optional for standard servers, in which case the Instance HA service uses the default value of System.Embedded.1 to specify the Redfish node UUID.
- You must provide each BareMetalHost (BMH) fencing agent (agent: bmh) with the details of the associated BMH resource.

  You can use this command to obtain the host and namespace of the BMH resource:

      $ oc get bmh
      NAME             STATE         CONSUMER              ONLINE   ERROR   AGE
      edpm-compute-0   provisioned   openstack-edpm-ipam   true             17h
      edpm-compute-1   provisioned   openstack-edpm-ipam   true             17h

  - The NAME column provides the BMH resource host, for example, edpm-compute-0.
  - The CONSUMER column provides the BMH resource namespace, for example, openstack-edpm-ipam.

  If you already have a user that has the necessary privileges to power the BMH resource on and off, then you can provide their authentication token as the BMH resource token. If not, then you must create a dedicated Red Hat OpenShift Container Platform (RHOCP) service account and provide this authentication token. For more information, see RHOCP Authentication and authorization.
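As an illustration of the optional Redfish settings, the following sketch writes a fencing secret for one Compute node whose Redfish Host Interface uses the default port, so the port and uuid settings are left out. The secret name fencing-1, the hostname compute-3, the address 192.0.2.10, and the credentials are all placeholders, and including tls here is an assumption carried over from the example above.

```shell
# Write a fencing secret with a single Redfish agent that relies on the
# default port and the default Redfish node UUID. All values are placeholders.
cat > fencing-1.yaml <<'EOF'
---
apiVersion: v1
kind: Secret
metadata:
  name: fencing-1
stringData:
  fencing.yaml: |
    FencingConfig:
      compute-3:
        agent: redfish
        ipaddr: 192.0.2.10
        tls: 'true'
        login: admin
        passwd: password
EOF

# Apply it before the Instance HA service manifest that references it:
#   oc apply -f fencing-1.yaml
```

The key under FencingConfig must still be the Compute service (nova) hostname of the node, not the name of the BMH resource or the management controller.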
2.2. Instance HA service parameters
The Red Hat OpenStack Services on OpenShift (RHOSO) high availability for Compute instances (Instance HA) service provides a number of parameters that allow you to customize the process of evacuating instances from your failed Compute nodes. For information about editing these parameter values, see Editing the Instance HA service parameters.
| Parameter | Default | Description |
|---|---|---|
| | | You must specify how often you want the status of … |
| POLL | 45 | You must specify how often you want the Instance HA service to poll the Compute service (nova) database, in seconds. For more information, see How the Instance HA service evacuates failed Compute nodes. |
| THRESHOLD | 50 | You must specify the percentage of the total number of Compute nodes that are eligible for evacuation that can fail before the evacuation process becomes impractical. The Instance HA service stops evacuating the Compute nodes when this percentage is exceeded. For more information, see How the Instance HA service evacuates failed Compute nodes. Note: When the … |
| LOGLEVEL | info | You must specify the amount of detail you want the Instance HA service log file messages to provide. When the … |
| EVACUABLE_TAG | evacuable | Optional: When you tag flavors, images, or host aggregates, you must specify the text that you use to tag their metadata. For more information, see Tag images, flavors, or host aggregates for evacuation. |
| | | Optional: You can specify whether you want the Instance HA service to check for tagged host aggregates when deciding which Compute nodes to evacuate. For more information, see Tag images, flavors, or host aggregates for evacuation. |
| TAGGED_FLAVORS | true | Optional: You can specify whether you want the Instance HA service to check for tagged flavors when deciding which Compute nodes to evacuate. For more information, see Tag images, flavors, or host aggregates for evacuation. |
| TAGGED_IMAGES | true | Optional: You can specify whether you want the Instance HA service to check for tagged images when deciding which Compute nodes to evacuate. For more information, see Tag images, flavors, or host aggregates for evacuation. |
| | | Optional: You can configure the Instance HA service to use … |
| | | Optional: When … |
| | | Optional: You can specify the time to wait before fencing a Compute node, in seconds. For more information, see How the Instance HA service evacuates failed Compute nodes. |
| FENCING_TIMEOUT | 30 | Optional: You can specify the expected timeout for a fencing operation to be performed, in seconds. The maximum configurable timeout is 120 seconds. |
| RESERVED_HOSTS | false | Optional: You can reserve healthy Compute nodes to evacuate the instances of failed Compute nodes. For more information, see Reserving healthy Compute nodes. |
| LEAVE_DISABLED | false | Optional: You can configure the Instance HA service to leave the fenced Compute nodes disabled after they have been evacuated. For more information, see How the Instance HA service evacuates failed Compute nodes. |
| FORCE_ENABLE | false | Optional: You can configure the Instance HA service to enable a Compute node even when the instances have not been successfully evacuated. For more information, see How the Instance HA service evacuates failed Compute nodes. |
| CHECK_KDUMP | false | Optional: You can configure the Instance HA service to detect if a Compute node is capturing a kernel dump before fencing and evacuating the Compute node. For more information, see Detecting if a Compute node is capturing a kernel dump. |
| DISABLED | false | Optional: You can configure the Instance HA service to not evacuate failed Compute nodes. For more information, see How the Instance HA service evacuates failed Compute nodes. |
| | | Optional: You can configure the Instance HA service to force the evacuation to the reserved host that was enabled to replace the failed Compute node. Warning: This evacuation might fail if the destination Compute node does not have sufficient capacity. |
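The THRESHOLD percentage can be checked against concrete node counts. The following is a rough sketch of that arithmetic, not the service's own code: the function name threshold_exceeded is hypothetical, and it assumes that "exceeded" means strictly greater than the configured percentage of the Compute nodes that are eligible for evacuation.

```shell
# Hypothetical helper: succeeds (exit status 0) when the failed share of
# eligible Compute nodes is strictly greater than the THRESHOLD percentage.
# $1: failed nodes, $2: nodes eligible for evacuation, $3: THRESHOLD.
threshold_exceeded() {
    # Compare failed/eligible > threshold/100 using integer arithmetic only.
    [ $(( $1 * 100 )) -gt $(( $2 * $3 )) ]
}

# Ten eligible nodes with the default THRESHOLD of 50.
for failed in 5 6; do
    if threshold_exceeded "$failed" 10 50; then
        echo "$failed of 10 failed: above THRESHOLD=50, evacuation stops"
    else
        echo "$failed of 10 failed: within THRESHOLD=50, evacuation proceeds"
    fi
done
```

Under this assumption, five failed nodes out of ten sit exactly on the threshold and are still evacuated, while a sixth failure stops the evacuation.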
2.2.1. Editing the Instance HA service parameters
The parameters of the Red Hat OpenStack Services on OpenShift (RHOSO) high availability for Compute instances (Instance HA) service pod are stored as strings within a YAML ConfigMap file. For more information about the supported Instance HA service parameters, see Instance HA service parameters.
When you edit the value of an Instance HA service parameter, all the log file entries are lost when the Instance HA service pod restarts. For more information, see Troubleshooting the Instance HA service.
You must enclose all the parameter values in double quotes (").
The name of this YAML ConfigMap file depends upon how you have chosen to create it:
- You can create and name a YAML file containing a ConfigMap that provides your configured Instance HA service parameters. In this case, you must use .spec.instanceHaConfigMap to specify the name of this YAML file when you create the Instance HA service manifest file. For more information, see Configuring the Instance HA service pod specification.
- You can choose to let the infra-operator create this YAML ConfigMap when the Instance HA service pod is deployed. This is called instanceha-config and contains the default values of the Instance HA service parameters that you can modify as needed.
You can use the following command to edit your Instance HA service parameters:
$ oc edit cm <config_map_name>
- Replace <config_map_name> with the name of your YAML ConfigMap file; for example, instanceha-config.
You can use the following command to display the current configuration of your Instance HA service parameters:
$ oc get cm <config_map_name> -o yaml
The following example displays the default values of the parameters configured in the instanceha-config file when the Instance HA service pod is deployed.
$ oc get cm instanceha-config -o yaml
apiVersion: v1
data:
  config.yaml: |
    config:
      EVACUABLE_TAG: "evacuable"
      TAGGED_IMAGES: "true"
      TAGGED_FLAVORS: "true"
      DELTA: "30"
      DELAY: "0"
      POLL: "45"
      THRESHOLD: "50"
      WORKERS: "4"
      SMART_EVACUATION: "false"
      RESERVED_HOSTS: "false"
      LEAVE_DISABLED: "false"
      FORCE_ENABLE: "false"
      CHECK_KDUMP: "false"
      LOGLEVEL: "info"
      DISABLED: "false"
kind: ConfigMap
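If you prefer to supply your own ConfigMap instead of editing the generated one, a sketch such as the following could serve as a starting point. The name instanceha-config-0 is hypothetical, and the sketch assumes that a user-supplied ConfigMap uses the same config.yaml layout as the default one shown above. It copies the default values and changes only POLL and LEAVE_DISABLED, with every value enclosed in double quotes.

```shell
# Write a custom parameters ConfigMap based on the defaults, changing
# POLL to 60 seconds and LEAVE_DISABLED to true. The name is a placeholder.
cat > instanceha-config-0.yaml <<'EOF'
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: instanceha-config-0
data:
  config.yaml: |
    config:
      EVACUABLE_TAG: "evacuable"
      TAGGED_IMAGES: "true"
      TAGGED_FLAVORS: "true"
      DELTA: "30"
      DELAY: "0"
      POLL: "60"
      THRESHOLD: "50"
      WORKERS: "4"
      SMART_EVACUATION: "false"
      RESERVED_HOSTS: "false"
      LEAVE_DISABLED: "true"
      FORCE_ENABLE: "false"
      CHECK_KDUMP: "false"
      LOGLEVEL: "info"
      DISABLED: "false"
EOF

# Apply it, then reference it in the manifest with
# .spec.instanceHaConfigMap: instanceha-config-0
#   oc apply -f instanceha-config-0.yaml
```

Keeping the file under version control gives you a record of parameter changes that survives pod restarts, unlike the log entries.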
2.3. Removing the Instance HA service
If you want to completely remove the Red Hat OpenStack Services on OpenShift (RHOSO) high availability for Compute instances (Instance HA) service, in addition to removing the Instance HA service, you must remove the ConfigMap containing the Instance HA service parameters and the fencing secret containing the fencing configuration of the Compute nodes that can be evacuated.
Prerequisites
- You must know the name of your deployed Instance HA service, which is the .metadata.name that you specified in the manifest file. You can run this command to obtain this name: $ oc get instanceha
- You must know the name of the ConfigMap containing the Instance HA service parameters.
- You must know the name of the fencingSecret YAML file containing the fencing configuration of the Compute nodes that can be evacuated.
Procedure
1. Delete the Instance HA service:

       $ oc delete instanceha/<instanceha_service_name>

   - Replace <instanceha_service_name> with the name of your deployed Instance HA service, for example, instanceha-0.
2. Delete the ConfigMap containing the Instance HA service parameters:

       $ oc delete cm/<config_map_name>

   - Replace <config_map_name> with the name of the ConfigMap. For example, if you use the default ConfigMap, the name is instanceha-config.
3. Delete the fencing secret containing the fencing configuration of the Compute nodes that can be evacuated:

       $ oc delete secret/<fencing_secret_name>

   - Replace <fencing_secret_name> with the name of the fencing secret that you specified when defining the specification of your Instance HA service pod, for example, fencing-0.
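The three deletions can be wrapped in one helper so that they always run in the same order. This is only a sketch: the function name remove_instanceha is hypothetical, and it stops at the first failed deletion so that you can inspect what remains.

```shell
# Hypothetical helper: remove an Instance HA service and its supporting objects.
# $1: Instance HA service name, $2: ConfigMap name, $3: fencing secret name.
remove_instanceha() {
    if [ "$#" -ne 3 ]; then
        echo "usage: remove_instanceha <service> <config_map> <fencing_secret>" >&2
        return 2
    fi
    # Each step runs only if the previous deletion succeeded.
    oc delete "instanceha/$1" &&
        oc delete "cm/$2" &&
        oc delete "secret/$3"
}
```

For the names used in this chapter, the call would be `remove_instanceha instanceha-0 instanceha-config fencing-0`.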