Chapter 3. Release information
These release notes highlight updates in some or all of the following categories. Consider these updates when you deploy this release of Red Hat OpenStack Platform (RHOSP):
- Bug fixes
- Enhancements
- Technology previews
- Release notes
- Known issues
- Deprecated functionality
- Removed functionality
Notes for updates released during the support lifecycle of this RHOSP release appear in the advisory text associated with each update.
3.1. Red Hat OpenStack Platform 17.1.4 Maintenance Release - November 20, 2024
Consider the following updates in Red Hat OpenStack Platform (RHOSP) when you deploy this RHOSP release.
3.1.1. Advisory list
This release of Red Hat OpenStack Platform (RHOSP) includes the following advisories:
- RHBA-2024:9973
- RHOSP 17.1.4 bug fix and enhancement advisory
- RHBA-2024:9974
- RHOSP 17.1.4 bug fix and enhancement advisory
- RHSA-2024:9975
- Important: RHOSP 17.1.4 (python-werkzeug) security update
- RHSA-2024:9976
- Important: RHOSP 17.1.4 (python-werkzeug) security update
- RHSA-2024:9977
- Moderate: RHOSP 17.1.4 (python-zipp) security update
- RHSA-2024:9978
- Moderate: RHOSP 17.1.4 (openstack-tripleo-heat-templates) security update
- RHBA-2024:9979
- RHOSP 17.1.4 RHEL 9 director images
- RHBA-2024:9980
- Updated RHOSP 17.1.4 container images
- RHBA-2024:9981
- Updated RHOSP 17.1.4 container images
- RHSA-2024:9982
- Important: RHOSP 17.1.4 (openstack-ironic) security update
- RHSA-2024:9983
- Moderate: RHOSP 17.1.4 (python-webob) security update
- RHSA-2024:9984
- Moderate: RHOSP 17.1.4 (python-sqlparse) security update
- RHSA-2024:9985
- Moderate: RHOSP 17.1.4 (python-urllib3) security update
- RHSA-2024:9986
- Moderate: RHOSP 17.1.4 (python-sqlparse) security update
- RHSA-2024:9988
- Moderate: RHOSP 17.1.4 (python-requests) security update
- RHSA-2024:9989
- Moderate: RHOSP 17.1.4 (python-webob) security update
- RHSA-2024:9990
- Moderate: RHOSP 17.1.4 (openstack-tripleo-common and python-tripleoclient) security update
- RHSA-2024:9991
- Moderate: RHOSP 17.1.4 (openstack-tripleo-common and python-tripleoclient) security update
3.1.2. Bug fixes
These bugs were fixed in this release of Red Hat OpenStack Platform (RHOSP):
- BZ#2107599
Previously, if you changed binding:vnic_type on a port that was attached to an instance, nova_compute entered a restart loop when it was restarted. This update makes it impossible to change binding:vnic_type on a port that is attached to an instance.
- BZ#2143940
- Before this update, in OVS-DPDK environments that use NVIDIA Mellanox ConnectX Lx series NICs, interface flapping could occur. This was caused by OVS-DPDK interfaces being reset during host provisioning. With this update, the problem has been fixed and this flapping no longer occurs.
- BZ#2217867
- Before this update, when using hardware offload and VLAN on NVIDIA ConnectX-5 and ConnectX-6 NICs, some offloaded flows on a Physical Function (PF) could cause transient performance issues with LLDP and VRRP traffic on the associated Virtual Functions (VFs). This update raises the priority of VLAN traffic so that LLDP and VRRP traffic does not impact performance.
- BZ#2234902
- Before this update, the validation check-kernel-version did not function correctly and reported a failure. With this update, the issue has been resolved, and check-kernel-version no longer reports a failure.
- BZ#2236671
Before this update, VM instances created with IPv6-only networks could not access the Compute metadata service. The metadata agent was not accounting for the possibility of IPv6 addresses when provisioning the namespace used to access the Compute metadata service. In RHOSP 17.1.4, this issue has been fixed. The metadata agent was modified to account for IPv6 addresses when provisioning the namespace used to access the Compute metadata service. Instances created with IPv6-only networks are now able to access the Compute metadata service.
Note: A bug in versions of cloud-init earlier than 24.2-0ubuntu1_23.10.1 also prevents instances from accessing the metadata service. Therefore, ensure that the version of cloud-init is 24.2-0ubuntu1_23.10.1 or later.
- BZ#2241270
- Before this release, the frr-status and oslo-config-validator validations reported FAILED during an update. The error messages were specific to the validation code and did not indicate any conditions that affect 17.1 operations. A related bug fix, BZ 2272546, eliminates these error messages.
- BZ#2243267
Before this update, the Leapp upgrade failed because the presence of the Virtual Data Optimizer (VDO) package caused the checkvdo Leapp actor to fail.
In RHOSP 17.1.4, the VDO package has been removed and Leapp upgrades now succeed.
- BZ#2250074
- Before this update, when external BIND servers were used with the DNS service (designate), the BIND server pool would contain invalid information. This problem was caused by a deployment bug that failed to add some name server entries to the BIND instance pool. In RHOSP 17.1.4, this problem has been resolved. During DNS service installation, the BIND pool YAML files are correctly generated.
- BZ#2251176
- Before this update, the Ceph Dashboard could not reach the Prometheus service endpoint and displayed the following error message: 404 not found. This error occurred because the configuration of the VIP for the Prometheus service was not correct. In RHOSP 17.1.4, this problem has been resolved and the Dashboard is now able to reach the Prometheus service endpoint.
- BZ#2255302
- Before this update, you could not set the required cephfs_filesystem_name driver configuration parameter for an external Red Hat Ceph Storage cluster directly in director’s heat template parameters. This issue required the creation of a YAML file and an ExtraConfig parameter to create a Shared File Systems service (manila) share. This update introduces the ManilaCephFSFileSystemName parameter to heat templates, where you can update the file system name directly and successfully create a share. See the example that follows.
- BZ#2259873
Before this update, Lenovo UEFI firmware reset boot records after deployment. During deployments, the servers failed on their first boot and displayed a blue screen stating that there was a lack of valid boot devices.
In RHOSP 17.1.4, this issue has been resolved. During deployments, Lenovo UEFI firmware no longer resets boot records after deployment, and the manual workaround is no longer required.
- BZ#2263502
- Before this update, connections failed for instances connected directly to an external network. The failure was caused by a delay in OVN performing source network address translation (SNAT) on packets for these instances. Because the IP address for the instance was not translated until OVN pushed the packet, the TCP connection would reset and the link between the instance and the external network failed. With this update, the bug has been resolved, and the connection between instances and the external network is no longer disrupted.
- BZ#2266778
Before this update, the python3-dns package version 2.x introduced an incompatibility with the RHOSP DNS service (designate). This incompatibility caused some zone transfers that involved a Transaction Signature (TSIG) key to fail, resulting in the log message: AttributeError: 'TsigKeyring' object has no attribute 'name'.
In RHOSP 17.1.4, this issue has been resolved. Zone transfers that involve TSIG keys now work correctly.
- BZ#2267882
- Before this update, for RHOSP environments that used the DNS service (designate), there was a known issue where the Dashboard (horizon) displayed a maximum of 20 records when listing records in a zone, even when the zone contained more than 20 records. With this update, this issue has been fixed.
- BZ#2269509
- Before this update, during an upgrade from RHOSP 16.2 to 17.1, the steps to trigger a mysqld upgrade were sometimes skipped. With this update, the upgrade logic for MySQL is rewritten to address this issue.
- BZ#2271411
- Before this update, when using the Hitachi Block Storage driver to delete a volume, a different volume could be removed, resulting in the unexpected loss of data. This update provides an updated Hitachi Block Storage driver that removes the specified volume when deleting a volume, as expected.
- BZ#2276551
- Before this update, if the Block Storage service (cinder) was used as the back end to store the images of the Image service (glance), then their simultaneous operations could impede each other and result in the failure of one of these operations. With this update, the simultaneous operations of the Block Storage service and the Image service are prevented from impeding each other.
- BZ#2276865
- Before this update, the tripleo_iptables role depended on the iptables module, which could not insert rules at specific locations within a rule chain. With this update, the iptables module is migrated to the nftables module. Firewall rules are now applied on the overcloud by using nftables.
- BZ#2278025
- Before this update, GRUB 2 did not load LVM modules by default. As a result, during an upgrade from RHOSP 16.2 to 17.1, the custom overcloud images did not boot after a Leapp upgrade in UEFI mode. With this update, the EFI grub.cfg file loads the LVM grub module and the Leapp upgrade can proceed.
- BZ#2283796
When you use the Shared File Systems service (manila) with an external Red Hat Ceph Storage cluster that has multiple CephFS filesystems, you must specify a filesystem name in authorization and deauthorization contexts.
Before this update, the authorization workflow for NFS did not specify the filesystem name, causing NFS exports to fail in the NFS-Ganesha service. With this update, the Shared File Systems service specifies the filesystem name when authorizing CephFS-NFS clients, and NFS exports succeed in the NFS-Ganesha service.
- BZ#2284095
- In previous RHOSP releases, when all members of an OVN load balancer, including health monitors, were removed, the health monitor port was not deleted and remained in ovsdb. In RHOSP 17.1.4, all health monitor ports are properly deleted when all members are removed from an OVN load balancer and the health monitor port is not needed by any other OVN load balancer.
- BZ#2290323
Before this update, export configurations were overwritten when deployments included an external Red Hat Ceph Storage cluster, configured with the Shared File Systems service (manila) and CephFS-NFS. After any upgrade or update to the RHOSP 17.1 overcloud, the CephFS-NFS service (NFS-Ganesha) started without export configurations, causing clients to lose access to their shares.
With this update, director checks for the presence of export configurations and preserves them during upgrades and updates on the overcloud. CephFS-NFS clients can continue to access their data through upgrade and update procedures and routine recovery operations during these procedures.
- BZ#2290400
- Before this update, an instance port with many tags could cause very long database queries that resulted in the instance timing out during its creation. With this update, the method used for querying tags from the Networking service server has changed, and there are no longer errors when creating or migrating instances that contain large numbers of tags.
- BZ#2293368
This RHOSP 17.1.4 update fixes a bug that caused packet transmission latency in certain situations.
Before this bug fix, if your RHOSP 17.1 deployment included a filter rule in nft or iptables with a LOG action, and the kernel command line (/proc/cmdline) included console=ttyS0, logging actions could cause substantial latency in packet transmission.
This bug fix eliminates the latency. An update from 17.1.x to 17.1.4 does not automatically apply the fix. If you want the fix, before updating to 17.1.4, perform the steps in the Knowledgebase solution: Sometimes receiving packet(e.g. ICMP echo) has latency, around 190 ms.
- BZ#2293735
Before this update, a RHOSP upgrade from 16.2 to 17.1 might fail with a "cephadm command not found" error during the Red Hat Ceph Storage health validation.
To avoid this error, the Red Hat Ceph Storage health validation is now done by using podman run if the cephadm tool is not installed.
- BZ#2294369
- In previous RHOSP releases, users were unable to update the delay of a Load-balancing service (octavia) UDP health monitor. This problem was caused by a bug in the validation of parameters used to configure UDP health monitors. In RHOSP 17.1.4, this bug has been fixed, and users can now modify the delay for UDP health monitors, as shown in the example that follows.
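For example, with this fix you can modify the delay of an existing UDP health monitor with the Load-balancing service CLI; the identifier is a placeholder:
$ openstack loadbalancer healthmonitor set --delay 10 <health_monitor_id>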
- BZ#2295757
- Before this update, an issue with the systemd file caused a restart loop of ovn-controllers during an upgrade from RHOSP 16.2 to 17.1. This issue caused an outage for DHCP and DNS services on the workloads. With this update, the issue has been fixed.
- BZ#2298567
- Before this update, the HPE 3PAR Block Storage driver was unable to support HPE storage running the latest WSAPI version. With this update, the HPE 3PAR driver is able to support HPE storage running the latest WSAPI version.
- BZ#2300098
Before this update, if you used an external network as a share network for the Shared File Systems service (manila), an incompatibility in OVN caused incorrect ARP responses for ports that were created on external switches attached to external storage systems. Client machines, such as virtual machines, containers, or bare metal nodes, could not reach shares that were exported on external networks.
With this update, the Shared File Systems service sets the OpenStack network port status to DOWN to prevent OVN from setting up ARP responses through Networking service (neutron) ports. Instead, MAC addresses are learned from the external ports. Client machines can reach shares that are exported on external networks, and they can mount the shares when there are access control lists (ACLs) in the Shared File Systems service.
- BZ#2306799
- Before this update, load balancers could not reply to requests from hosts attached to a subnet that is also used as a member subnet in the same load balancer. This was caused by a missing default network route on the VIP interface of the load balancer. In RHOSP 17.1.4, the missing route has been added, and connectivity is restored.
- BZ#2308089
In previous RHOSP releases, connectivity disruptions could occur to IPv4 load balancers. This problem was caused during execution of a maintenance task for fixing an incorrect format used in the ip_port_mappings parameter on IPv6 load balancers. The task fixed the incorrect format on IPv6 load balancers, but also erroneously changed the ip_port_mappings parameter on IPv4 load balancers.
This issue has been resolved, and IPv4 load balancers no longer experience connectivity disruptions after the maintenance task is run.
- BZ#2308660
This update fixes a bug that caused Networking service (neutron) failures and connectivity loss in some IPv6 deployments.
In some IPv6 deployments, resolv.conf uses a scoped IPv6 address, for example fe80::5054:ff:fe96:8af7%eth2. The scope identifier is not compatible with some of the libraries that Neutron uses.
This update trims the address to exclude the scope and avoid any incompatibilities.
- BZ#2310889
- Before this update, you could not boot instances from qcow2v2 images. With this update, support for qcow2v2 images is added to the format inspector, therefore you can now boot instances from qcow2v2 images.
- BZ#2311465
- Before this update, the Red Hat Enterprise Linux (RHEL) 8.4 images did not have the GRUB_DEFAULT=saved configuration. If your operating system was created by using the default 8.4 image, during an upgrade from RHOSP 16.2 to 17.1, the system upgrade was interrupted after the undercloud reboot. This update introduces the GRUB_DEFAULT=saved configuration to ensure that GRUB boots the saved option and that the system upgrade works on nodes that are deployed on the default RHEL 8.4 image.
- BZ#2311875
- Before this update, the addition of the mixed CPU policy for flavors feature (hw:cpu_policy=mixed), which enables cloud users to create instances that have a mix of dedicated and shared CPUs, caused instances with an unpinned NUMA topology (hw:cpu_policy was not set to dedicated) to not be restarted after an upgrade to 17.1.3. This issue has been fixed with this update.
- BZ#2315341
Before this update, the os-net-config script had to restart the host interface whenever there were changes to the DNS nameserver or to the domain name. This was problematic during adoption of the RHOSO 18.0.x data plane from RHOSP 17.1.4.
In RHOSP 17.1.4, this fix allows DNS servers and search domains to be modified on the fly without restarting interfaces, when the DNS server IP addresses and search domains are the only change to the previous configuration. The DNS nameservers and search domains are written to the ifcfg file and /etc/resolv.conf is updated in place.
3.1.3. Enhancements
This release of Red Hat OpenStack Platform (RHOSP) features the following enhancements:
- BZ#1954673
- In RHOSP 17.1.4, the Networking service (neutron) Metadata service is now available for instances connected to IPv6-only networks. Instances can now communicate with the Metadata service by using its dedicated IPv6 address, fe80::a9fe:a9fe. See the example that follows.
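For example, from inside such an instance, you can query the metadata service over the IPv6 link-local address. This sketch assumes the instance interface is named eth0; the %25 encodes the interface scope delimiter in the URL:
$ curl -g -6 'http://[fe80::a9fe:a9fe%25eth0]/openstack/latest/meta_data.json'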
- BZ#2192913
This update introduces an enhancement that you can configure to prevent flooding on the fabric between instances on different tenant networks.
Before this update, in RHOSP environments with the OVN or OVS mechanism driver that had DVR enabled and used VLAN tenant networks, east/west traffic between instances connected to different tenant networks was flooded to the fabric.
To prevent this flooding in environments that use the OVN mechanism driver, you can now configure the OVNControllerGarpMaxTimeout parameter to control the frequency at which ovn-controller sends out gratuitous ARP packets that announce MAC addresses of ports on VLAN tenant networks. See the example that follows this entry.
This parameter is not available in environments that use the OVS mechanism driver.
- BZ#2262077
- When upgrading to RHOSP 17.1.4, you must restart the Networking service (neutron). The Networking service depends on the mechanism drivers Open vSwitch (OVS) and Open Virtual Network (OVN). Both OVS and OVN are automatically upgraded, to version 3.3 and version 24.03 respectively. Therefore, plan your upgrade to RHOSP 17.1.4 during a maintenance window. For more information, see Framework for upgrades (16.2 to 17.1).
- BZ#2264238
- If you configure the wrong nova-libvirt images for the operating system on your Compute nodes, the RHOSP upgrade fails. This enhancement prevents the virtual machines on the Compute nodes with the incorrect images from being turned off.
- BZ#2269653
- With this update, you can use the packing_host_numa_cells_allocation_strategy parameter to configure the placement strategy that the Compute service (nova) uses when scheduling instances on host nodes. You can either spread instances across NUMA nodes or pack instances on the same host NUMA node until the resources of the node are exhausted. This enhancement also improves the scheduling of instances on NUMA nodes that do not have a PCI device when the instance does not request a PCI device, which makes the PCI devices more likely to remain available for instances that request them with a NUMA affinity. A configuration sketch follows this entry.
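As a sketch, assuming the option is set in the [compute] section of the Compute service configuration (verify the section and default value against your nova documentation), packing behavior might be enabled like this:
[compute]
# True packs instances onto the same host NUMA cell until it is exhausted;
# False spreads instances across host NUMA cells.
packing_host_numa_cells_allocation_strategy = True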
- BZ#2278832
- This enhancement introduces a validation to verify that the appropriate CephFS-NFS resources are enabled when preparing the overcloud upgrade from RHOSP 16.2. This validation only runs when the use of the Shared File Systems service (manila) with CephFS-NFS is detected in the environment. This validation prevents the CephFS-NFS service from being inadvertently omitted from the upgrade.
- BZ#2290340
The Shared File Systems service (manila) now includes a storage driver to support VAST Data Platform. The driver allows provisioning and management of NFS shares and point-in-time backups through snapshots.
This driver will be added to Red Hat OpenStack Services on OpenShift version 18.0.4.
- BZ#2296256
- When you use the Block Storage service (cinder) over Fiber Channel as a storage back end for the Image service (glance), image download speeds are faster than they were in previous releases of RHOSP.
- BZ#2308677
- The kernel-modules-extra package has been added to the overcloud image for deployments that require the tcp_htcp module. See the example that follows.
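For example, assuming you want the HTCP congestion control algorithm that this module provides, you could load and select it on an overcloud node as follows:
$ sudo modprobe tcp_htcp
$ sudo sysctl -w net.ipv4.tcp_congestion_control=htcp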
- BZ#2321095
- This enhancement improves the Block Storage service (cinder) Pure Storage back end by adding support for the NVMe-TCP protocol.
3.1.4. Technology previews
You can test the following Technology Preview features in this release of Red Hat OpenStack Platform (RHOSP). These features provide early access to upcoming product features so that you can test functionality and provide feedback during the development process. These features are not supported with your Red Hat subscription, and Red Hat does not recommend using them for production. For more information about the scope of support for Technology Preview features, see https://access.redhat.com/support/offerings/techpreview/.
- BZ#2149301
In RHOSP 17.1.4, a technology preview is available to avoid link flapping on OVS-DPDK bonds when the links are oversubscribed. This enhancement is achieved by setting the rx-steering=rss+lacp option on every member of the DPDK bond.
Example:
$ ovs-vsctl add-bond br-dpdk0 dpdkbond0 dpdk0 dpdk1 -- \
    set interface dpdk0 type=dpdk options:dpdk-devargs=0000:ca:00.0 \
    options:rx-steering=rss+lacp -- \
    set interface dpdk1 type=dpdk options:dpdk-devargs=0000:ca:00.1 \
    options:rx-steering=rss+lacp
- BZ#2217663
In RHOSP 17.1, a technology preview is available for the VF-LAG transmit hash policy offload that enables load balancing at NIC hardware for offloaded traffic/flows. This hash policy is only available for layer3+4 base hashing.
To use the technology preview, verify that your templates include a bonding options parameter to enable the xmit hash policy as shown in the following example:
bonding_options: "mode=802.3ad miimon=100 lacp_rate=fast xmit_hash_policy=layer3+4"
- BZ#2309656
- In RHOSP 17.1.4, quality of service (QoS) settings for data center bridging (DCB) specific to a port or interface are now available as a technology preview feature as part of the network configuration template for the os-net-config tool. For more information, see the Knowledgebase article, Feature Integration document - DCB for E2E QoS.
3.1.5. Release notes
This section outlines important details about the release, including recommended practices and notable changes to Red Hat OpenStack Platform (RHOSP). You must take this information into account to ensure the best possible outcomes for your deployment.
- BZ#2077681
- RHOSP 17.1 does not support QoS minimum bandwidth in environments with the ML2/OVN mechanism driver.
- BZ#2276557
This update introduces a new sample OVN user defined router flavor driver (class UserDefinedNoLsp in neutron/neutron/services/ovn_l3/service_providers/user_defined.py).
This driver enables the creation of router interfaces with no associated underlying Logical Switch Ports. In this scenario, Neutron only acts as the IP address manager for the router interfaces. This gives user defined router flavors total control of the traffic traversing the router interfaces while bypassing the OVN processing.
- BZ#2278562
- The /boot/efi partition for the overcloud-hardened-uefi-full image has been increased from 16MiB to 200MiB to accommodate binaries for firmware upgrades.
- BZ#2280656
- In RHOSP 17.1.4, the default size for the /var/log partition on newly deployed overcloud nodes has been changed from 10 GiB to 25 GiB.
- BZ#2294876
This update adds a configuration option called broadcast_arps_to_all_routers to the [ovn] config section. This option configures the external networks with the broadcast-arps-to-all-routers config option that became available in OVN 23.06. This option is enabled by default. It causes OVN to flood ARP requests to all attached ports on a network.
[ovn]
broadcast_arps_to_all_routers=true
If you disable broadcast_arps_to_all_routers, ARP requests are only sent to routers on a network if the target MAC address matches. ARP requests that do not match a router are only forwarded to non-router ports.
- BZ#2296291
- When you upgrade from RHOSP 16.2 to 17.1.4, Ansible iptables modules are automatically migrated to nftables modules. Puppet tripleo firewall options also change to a new format. For more information about firewall options, see Adding services to the overcloud firewall in Hardening Red Hat OpenStack Platform.
- BZ#2310427
- Before this release, when upgrading from RHOSP 16.x to RHOSP 17.x with NIC partitioning on NVIDIA Mellanox cards, connectivity was lost on the Linux bond. With this update, this issue has been fixed. To use this fix, ensure that you set the Ansible variable dpdk_extra in your bare metal node definition file before upgrading to RHOSP 17.1.4. For more information, see Creating a bare metal nodes definition file in Configuring network functions virtualization.
- BZ#2320103
- When you use the --registry-url option with openstack tripleo container image commands to provide a target registry other than the undercloud registry, you must use the --insecure option if the target registry is insecure. See the example that follows.
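For example, a push to an insecure target registry might look like the following sketch; the registry address and image name are placeholders:
$ openstack tripleo container image push \
    --registry-url 192.168.24.1:8787 --insecure \
    docker.io/library/example-image:latest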
3.1.6. Known issues
These known issues exist in Red Hat OpenStack Platform (RHOSP) at this time:
- BZ#2216209
- Currently, for Load-balancing service (octavia) deployments, every time an Ansible workflow is triggered, Ansible dumps an amphora image into the temporary directory. The Ansible workflow does not auto-clean these images and space utilization issues can occur. Workaround: refer to the Red Hat Knowledgebase Solution, OpenStack deployments with octavia leave orphan amphora images.
- BZ#2219830
In RHOSP 17.1, there is a known issue of transient packet loss where hardware interrupt requests (IRQs) are causing non-voluntary context switches on OVS-DPDK PMD threads or in guests running DPDK applications.
This issue is the result of provisioning large numbers of VFs during deployment. VFs need IRQs, each of which must be bound to a physical CPU. When there are not enough housekeeping CPUs to handle the capacity of IRQs, irqbalance fails to bind all of them and the IRQs spill over onto isolated CPUs.
Workaround: You can try one or more of these actions:
- Reduce the number of provisioned VFs to avoid unused VFs remaining bound to their default Linux driver.
- Increase the number of housekeeping CPUs to handle all IRQs.
- Force unused VF network interfaces down to prevent IRQs from interrupting isolated CPUs.
- Disable multicast and broadcast traffic on unused, down VF network interfaces to prevent IRQs from interrupting isolated CPUs.
- BZ#2269564
- The time that you need to upgrade from RHOSP 16.2 to 17.1 increases with the number of nodes in your environment. To reduce the amount of time it takes to complete the upgrade, you can split your nodes into multiple roles. For more information, see the Red Hat Knowledgebase article How to split roles during upgrade from RHOSP 16.2 to RHOSP 17.1.
- BZ#2282237
Currently, in IPv6 environments, there is a known issue where the IPv6 key-value pair IPV6_AUTOCONF=no in ifcfg-* files generated by os-net-config does not prevent the system from configuring the default route in response to router advertisements.
Workaround: Perform the following steps to correct this issue:
Set net.ipv6.conf.<interface>.accept_ra_defrtr=0 to prevent learning the default route from router advertisements, and net.ipv6.conf.<interface>.accept_ra=0 to prevent the system from accepting router advertisements for any routes. Set both of these variables in /etc/sysctl.conf or /etc/sysctl.d/99-sysctl.conf (or another file in that subdirectory):
For each specific interface:
net.ipv6.conf.<interface>.accept_ra_defrtr=0
net.ipv6.conf.<interface>.accept_ra=0
Note: Setting each specific interface can help to ensure that the settings are not overridden.
For newly created interfaces:
net.ipv6.conf.default.accept_ra_defrtr=0
net.ipv6.conf.default.accept_ra=0
For all interfaces present at boot time:
net.ipv6.conf.all.accept_ra_defrtr=0
net.ipv6.conf.all.accept_ra=0
Run the following command:
$ sudo sysctl -p
Alternatively, you can run the sysctl -w command to activate these values at runtime instead of writing them to the sysctl.conf file:
For each specific interface:
$ sudo sysctl -w net.ipv6.conf.<interface>.accept_ra_defrtr=0
$ sudo sysctl -w net.ipv6.conf.<interface>.accept_ra=0
Note: Setting each specific interface can help to ensure that the settings are not overridden.
For newly created interfaces:
$ sudo sysctl -w net.ipv6.conf.default.accept_ra_defrtr=0
$ sudo sysctl -w net.ipv6.conf.default.accept_ra=0
For all interfaces present at boot time:
$ sudo sysctl -w net.ipv6.conf.all.accept_ra_defrtr=0
$ sudo sysctl -w net.ipv6.conf.all.accept_ra=0
- BZ#2292053
During an upgrade from RHOSP 16.2 to 17.1, if your environment includes pre-provisioned nodes, the openstack overcloud node provision command fails because pre-provisioned nodes are not defined correctly in the baremetal-deployment.yaml file.
Workaround: See the Red Hat Knowledgebase article LEAPP upgrade from 16.2 to 17.1 fails during overcloud node provisioning step when adding pre-provisioned nodes.
- BZ#2294189
This update fixes a bug that prevented the values of the fdb_removal_limit and mac_binding_removal_limit parameters from being applied to the OVN database. The values were parsed, but not applied.
Now the values are applied to the database.
- BZ#2305981
When you upgrade from RHOSP 16.2 to 17.1, during the system upgrade, a known issue causes GRUB to contain RHEL 7 entries instead of RHEL 8 entries. As a result, the hosts cannot reboot. This issue affects environments that previously ran RHOSP 13.0 or earlier.
Workaround: See the Red Hat Knowledgebase solution Openstack 16 to 17 FFU - During LEAPP upgrade UEFI systems do not boot due to invalid /boot/grub2/grub.cfg.
- BZ#2308346
In RHOSP 17.1 dynamic routing environments, restarting the Free Range Routing (FRR) container can cause unexpected network disruption. This disruption occurs because, during a restart, some OVN BGP agent and FRR configurations erroneously delete the routes used to transfer networking packets, and these routes must be re-learned.
Workaround:
Ensure that you have configured your leaf nodes with the graceful restart options.
For more information, see Configuring the leaf networks in Configuring dynamic routing in Red Hat OpenStack Platform.
- Restart the ovn-bgp-agent container immediately after restarting the frr container, and wait for the agent to synchronize after it reloads its new configuration that includes the graceful restart options.
- BZ#2313372
- During an update from RHOSP 17.1 GA, 17.1.1, and 17.1.2 to RHOSP 17.1.4, when you use an Open Virtual Network (OVN) back end, there is a possibility of a short network API outage during the external run of the OVN update.
- BZ#2313764
- Currently, during RHOSP updates, if os-net-config encounters errors, it fails to report them and the deployment can complete without the administrator realizing that there is a problem with the network configuration. There is no workaround.
- BZ#2313793
In RHOSP 17.1, when you delete an OVS agent, the Networking service (neutron) does not remove the corresponding VXLAN endpoint from ovsdb. This situation might cause issues with the L3 HA routers, which can mistakenly consider the node where the deleted agent was running to still be active.
Workaround: See the Red Hat Knowledgebase solution FIPs stopped working after network/controller node replacement.
- BZ#2322497
- Currently, in IPv6 environments that use FRRouting and the OVN BGP agent to achieve dynamic routing, outages can occur on both the control and data planes. These outages are triggered after reboot, ifdown, or ifup events that cause new IPv6 routes to be learned, which can erroneously override IPv6 default routes obtained by BGP. Workaround: Disable IPv6 router solicitations on the interfaces that send them.
- BZ#2322938
- Currently, when MTUs mismatch, the communicating peers are unaware of the discrepancy, and the Networking service (neutron) can silently drop the packets. OVN is the cause of this problem because it fails to emit the ICMP Fragmentation Needed message. Workaround: The preferred method is to adjust the MTU value to prevent packets that are too large from being transmitted. An alternative method is to set the OVNEmitNeedToFrag option in the tripleo templates, as shown in the sketch that follows. For more information, see the Knowledgebase solution, Neutron ML2/OVN packet fragmentation problems.
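For example, assuming a custom environment file that you include with your deploy command, the alternative workaround might look like this:
parameter_defaults:
  OVNEmitNeedToFrag: true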
- BZ#2322948
- If you create an IPv6 subnet, ensure that the network has an MTU of at least 1280 octets.
IPv6 subnets require an MTU of at least 1280 octets, but RHOSP does not include a validation process to enforce that requirement.
- BZ#2323725
- If your RHOSP 17.0 environment is deployed with ML2/OVN, you cannot update your environment directly from RHOSP 17.0 to 17.1.4. You must update to RHOSP 17.0.1 first. For more information, see Keeping Red Hat OpenStack Platform Updated.
3.1.7. Deprecated functionality
The items in this section are either no longer supported, or will no longer be supported in a future release of Red Hat OpenStack Platform (RHOSP):
- BZ#2303725
- In RHOSP 17.1.4, the OVS hardware offload and VIRTIO data path acceleration (VDPA) features are deprecated. Both features are removed and not supported in the Red Hat OpenStack Services on OpenShift (RHOSO) 18 release. Red Hat will provide bug fixes and support for these features during the current release lifecycle, but they will no longer receive enhancements and will be removed. If you have a business requirement for either of these features, contact Red Hat Support.
3.2. Red Hat OpenStack Platform 17.1.3 Maintenance Release - May 22, 2024
Consider the following updates in Red Hat OpenStack Platform (RHOSP) when you deploy this RHOSP release.
3.2.1. Advisory list
This release of Red Hat OpenStack Platform (RHOSP) includes the following advisories:
- RHSA-2024:2727
- Important: Red Hat OpenStack Platform 17.1 (python-gunicorn) security update
- RHSA-2024:2729
- Important: Red Hat OpenStack Platform 17.1 (etcd) security update
- RHSA-2024:2730
- Important: Red Hat OpenStack Platform 17.1 (collectd-sensubility) security update
- RHSA-2024:2731
- Moderate: Red Hat OpenStack Platform 17.1 (python-django) security update
- RHSA-2024:2732
- Moderate: Red Hat OpenStack Platform 17.1 (python-glance-store) security update
- RHSA-2024:2733
- Moderate: Red Hat OpenStack Platform 17.1 (openstack-ansible-core) security update
- RHSA-2024:2734
- Moderate: Red Hat OpenStack Platform 17.1 (python-urllib3) security update
- RHSA-2024:2735
- Moderate: Red Hat OpenStack Platform 17.1 (python-paramiko) security update
- RHSA-2024:2736
- Moderate: Red Hat OpenStack Platform 17.1 (openstack-tripleo-heat-templates and tripleo-ansible) security update
- RHSA-2024:2737
- Moderate: Red Hat OpenStack Platform 17.1 (python-openstackclient) security update
- RHBA-2024:2738
- Updated Red Hat OpenStack Platform 17.1 container images
- RHBA-2024:2739
- Updated Red Hat OpenStack Platform 17.1 container images
- RHBA-2024:2740
- Red Hat OpenStack Platform 17.1 RHEL 9 director images
- RHBA-2024:2741
- Red Hat OpenStack Platform 17.1 bug fix and enhancement advisory
- RHBA-2024:2742
- Red Hat OpenStack Platform 17.1 bug fix and enhancement advisory
- RHSA-2024:2767
- Important: Red Hat OpenStack Platform 17.1 (collectd-sensubility) security update
- RHSA-2024:2768
- Moderate: Red Hat OpenStack Platform 17.1 (python-paramiko) security update
- RHSA-2024:2769
- Moderate: Red Hat OpenStack Platform 17.1 (python-openstackclient) security update
- RHSA-2024:2770
- Moderate: Red Hat OpenStack Platform 17.1 (tripleo-ansible and openstack-tripleo-heat-templates) security update
3.2.2. Bug fixes
These bugs were fixed in this release of Red Hat OpenStack Platform (RHOSP):
- BZ#2222683
With this update, Red Hat support for Multi-RHEL is expanded to include the following deployment architectures:
- Edge (DCN)
- ShiftOnStack
- director Operator-based deployments
- BZ#2229779
- This update improves exception handling in the OVN metadata agent when a child process is unable to start. With this update, the child process exception is logged and the main OVN metadata agent continues running.
- BZ#2237866
- Before this update, configuring caching parameters for ceilometer was not supported. With this update, for caching, ceilometer uses the dogpile.cache.memcached back end. If you manually disable caching, ceilometer uses the oslo_cache.dict back end.
- BZ#2248873
- With some versions of Pluggable Authentication Modules (PAM), the pam_loginuid module requires /proc/self/loginuid to be writable. This is not the case in the sshd container used for migrations. Migrations failed because SSH login between Compute hosts was failing. With this update, the pam_loginuid module has been removed from the PAM config and, as a result, SSH login between Compute hosts, and migrations, work again.
- BZ#2249444
Before this update, libvirt-related containers on Compute nodes were needlessly restarted during deployment, update, and scaling operations, even when the container configurations were not changed.
With this update, libvirt-related containers that do not have configuration changes are not restarted during deployment, update, and scaling operations.
- BZ#2249690
Before this update, after every successful Red Hat Ceph Storage adoption during upgrade, the ceph-ansible package was removed by default.
This update introduces the tag cleanup_cephansible to the task that removes ceph-ansible. You can use this tag with --skip-tags while running the adoption playbook to avoid removal.
- BZ#2254036
Before this update, during a DCN FFU system upgrade of nodes on a setup with multiple stacks, the Red Hat Ceph Storage task Set noout flag might fail to run the ceph command on the right host.
After the update, a system upgrade on any node in a multi-stack setup now delegates the Red Hat Ceph Storage task Set noout flag to the relevant host, and the ceph commands are run on the specific cluster.
- BZ#2254994
This update fixes a bug that caused unintended deletion of Load-balancing service (octavia) health monitor ports and OVN metadata ports.
Before this update, in RHOSP environments that used Load-balancing service health monitor ports from a previous version, running neutron-db-sync-tool sometimes randomly deleted those pre-existing ports or OVN metadata ports. This unintended port deletion resulted in loss of health monitor capacity or communication loss with the affected instances.
This update resolves the conflicts. It uses a new value, ovn-lb-hm:distributed, for the device_owner field of OVN Load-balancing service health monitor ports. Old OVN Load-balancing service health monitor ports are automatically updated with this version.
- BZ#2255324
Before this update, a director bug could disrupt or crash client workloads during updates or upgrades to any RHOSP 17.1 version. This bug affected deployments that enabled the RHOSP Shared File System service (manila) with the CephFS-NFS back end.
With this update, this issue has been addressed, and the Shared File System service (manila) operates properly when users set up access rules on their shares.
- BZ#2257274
- Before this update, when using jumbo frames for Networking service (neutron) tenant networks, a RHOSP Controller shutting down could sometimes cause the RHOSP Load-balancing service (octavia) management interface (o-hm0) to have its MTU reset to a small value, such as 1500 or 1450. This problem usually occurred when the RHOSP Controller was rebooted for the first time, or in a situation when the Controller was abruptly terminated. With this update, RHOSP director now ensures that Open vSwitch (OVS) is configured with the correct MTU when o-hm0 is created.
- BZ#2259286
Before this update, the FFU procedure sometimes used the incorrect Red Hat Ceph Storage image when the documented upgrade procedure was not followed by the user.
With this update, the FFU procedure always uses the correct Red Hat Ceph Storage image. The multi-rhel-container-image-prepare.py script has been updated to use the correct defaults, and version validation checks have been added to the FFU process.
- BZ#2263552
This update fixes a bug that prevented load-balancing of traffic on some Load-balancing service (octavia) pool members on IPv6 networks in an ML2/OVN environment.
Before this update, if you added a second listener+pool+member to a pool, the pool entered an ERROR state and traffic within that pool was not load-balanced.
With this update, traffic is load-balanced to all members as expected.
- BZ#2263916
This update safeguards against upgrades from RHOSP 16.2 to RHOSP 17.1 with libvirt configurations that could result in workload disruptions after the upgrade.
Before this update, if you performed an upgrade to RHOSP 17.1 from a RHOSP 16.2 environment that included a modular deployment of libvirt on Red Hat Enterprise Linux (RHEL) 8 or that ran libvirt UBI9 on RHEL 8, these configurations sometimes resulted in workload disruption.
With this update, the upgrade from RHOSP 16.2 to 17.1 fails if the RHOSP 16.2 environment includes a modular deployment of libvirt on Red Hat Enterprise Linux (RHEL) 8, or runs libvirt UBI9 on RHEL 8.
- BZ#2266285
This update fixes a bug that prevented operation of the OVN Health Monitoring for Load-balancing service (octavia) on IPv6 networks in ML2/OVN deployments.
Previously, the OVN Health Monitors service did not correctly identify the ONLINE and OFFLINE statuses of back-end members.
With this update, load-balancing works as expected, and the OVN Health Monitors correctly identify the ONLINE and OFFLINE statuses of back-end members.
- BZ#2278028
Before this update, during upgrades to RHOSP 17.1 and updates between minor versions of 17.1, Networking service (neutron) environments that used the ML2/OVN mechanism driver stopped receiving updates from the OVN databases for a period of time. When the Controller node that contained the RAFT leader was also updated, the mechanism driver would then receive the updates from the OVN databases.
With this update, this problem has been fixed. Now, during RHOSP updates and upgrades, the Networking service using the ML2/OVN mechanism driver operates properly.
3.2.3. Enhancements
This release of Red Hat OpenStack Platform (RHOSP) features the following enhancements:
- BZ#1900663
- With this update, Red Hat support for Framework for upgrades is expanded to include DCN deployments without storage at the edge.
- BZ#1997638
- With this update, Red Hat support for Framework for upgrades is expanded to include DCN deployments with storage at the edge.
- BZ#2218000
- With this update, you can now use the Bare Metal service (ironic) to boot an ISO image directly for use as a RAM disk. For more information, see Enabling ISO boot for bare-metal instances and Booting an ISO image directly for use as a RAM disk.
- BZ#2224492
RHOSP 17.1.3 now supports a new aging mechanism for MAC addresses learned through the localnet_learn_fdb option that is included in Open Virtual Network (OVN) version 23.09. This new aging mechanism consists of two new options, fdb_age_threshold and fdb_removal_limit.
The fdb_age_threshold option enables you to set the maximum time, in seconds, that learned MACs stay in the FDB table.
The fdb_removal_limit option prevents OVN from removing a large number of FDB table entries all at one time.
When you use these new options with localnet_learn_fdb, you reduce the likelihood of performance and scalability issues caused by the FDB table growing too big, a problem often experienced in RHOSP environments that have large provider networks. A sketch of setting these options follows this entry.
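As a sketch of how these options are applied at the OVN layer, assuming direct access to the OVN northbound database and illustrative values (verify the exact option locations against the OVN 23.09 documentation):
$ ovn-nbctl set Logical_Switch <switch_name> other_config:fdb_age_threshold=300
$ ovn-nbctl set NB_Global . options:fdb_removal_limit=50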
- BZ#2225163
- A power save profile, cpu-partitioning-powersave, has been introduced in Red Hat Enterprise Linux 9 (RHEL 9), and is now available in Red Hat OpenStack Platform (RHOSP) 17.1.3. This TuneD profile is the base building block to save power in RHOSP 17.1 NFV environments. For more information, see Saving power in OVS-DPDK deployments in Configuring network functions virtualization.
- BZ#2255168
- With this update, you can add load balancing capability to specific availability zones. In the OS::Octavia::LoadBalancer resource, use the new availability_zone property to specify an availability zone for the load balancer. A template sketch follows.
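A minimal heat template sketch follows; the subnet and availability zone names are placeholders:
heat_template_version: wallaby
resources:
  my_load_balancer:
    type: OS::Octavia::LoadBalancer
    properties:
      vip_subnet: external-subnet
      availability_zone: az-1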
- BZ#2255373
- This enhancement updates the Block Storage (cinder) driver for Dell PowerFlex storage to support Dell PowerFlex Software version 4.5.
- BZ#2261924
- With this update, RHOSP 17.1 supports RHCS 7 as an external Red Hat Ceph Storage cluster.
- BZ#2262266
- The Shared File Systems service (manila) now includes a back-end driver to provision and manage NFS shares on Dell PowerFlex storage systems. The use of this driver is supported when the vendor publishes certification on the Ecosystem Catalog.
- BZ#2262313
- The Shared File Systems service (manila) now includes a back-end driver to provision and manage NFS and CIFS shares on Dell PowerStore storage systems. The use of this driver is supported when the vendor publishes certification on the Ecosystem Catalog.
- BZ#2264273
- This enhancement updates the Block Storage (cinder) driver for the Hewlett Packard Enterprise (HPE) 3PAR product line to support the Alletra MP Storage array.
3.2.4. Technology previews
You can test the following Technology Preview features in this release of Red Hat OpenStack Platform (RHOSP). These features provide early access to upcoming product features so that you can test functionality and provide feedback during the development process. These features are not supported with your Red Hat subscription, and Red Hat does not recommend using them for production. For more information about the scope of support for Technology Preview features, see https://access.redhat.com/support/offerings/techpreview/.
- BZ#2217663
In RHOSP 17.1, a technology preview is available for the VF-LAG transmit hash policy offload that enables load balancing at NIC hardware for offloaded traffic/flows. This hash policy is only available for layer3+4 base hashing.
To use the technology preview, verify that your templates include a bonding options parameter to enable the xmit hash policy as shown in the following example:
bonding_options: "mode=802.3ad miimon=100 lacp_rate=fast xmit_hash_policy=layer3+4"
3.2.5. Known issues
These known issues exist in Red Hat OpenStack Platform (RHOSP) at this time:
- BZ#2163477
Currently, in RHOSP 17.1 environments that use BGP dynamic routing, the RHOSP Compute service cannot route packets sent from one of these instances to a multicast IP address destination. Therefore, instances subscribed to a multicast group fail to receive the packets sent to them. The cause is that BGP multicast routing is not properly configured on the overcloud nodes.
Workaround: Currently, there is no workaround.
- BZ#2187985
Adding a load balancer member whose subnet is not in the Load-balancing service (octavia) availability zone puts the load balancer in ERROR status. The member cannot be removed because of the ERROR status, making the load balancer unusable.
Workaround: Delete the load balancer, as shown in the example that follows.
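For example, you can delete the load balancer and all of its child resources in one step with the cascade option; the identifier is a placeholder:
$ openstack loadbalancer delete --cascade <load_balancer_id>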
- BZ#2192913
In RHOSP environments with ML2/OVN or ML2/OVS that have DVR enabled and use VLAN tenant networks, east/west traffic between instances connected to different tenant networks is flooded to the fabric.
As a result, packets between those instances reach not only the Compute nodes where those instances run, but also any other overcloud node.
This might impact the network and it might be a security risk because the fabric sends traffic everywhere.
This bug will be fixed in a later FDP release. You do not need to perform a RHOSP update to obtain the FDP fix.
- BZ#2210319
Currently, the Retbleed vulnerability mitigation in RHEL 9.2 can cause a performance drop for Open vSwitch with Data Plane Development Kit (OVS-DPDK) on Intel Skylake CPUs.
This performance regression happens only if C-states are disabled in the BIOS, Hyper-Threading Technology is enabled, and OVS-DPDK is using only one logical core of a given core.
Workaround: Assign both logical cores to OVS-DPDK or to SR-IOV guests that have DPDK running as recommended in Configuring network functions virtualization.
- BZ#2216021
RHOSP 17.1 with the OVN mechanism driver does not support logging of flow events per port or the use of the --target option of the network log create command.
RHOSP 17.1 supports logging of flow events per security group, using the --resource option of the network log create command, as shown in the example that follows. For more information, see Logging security group actions in Configuring Red Hat OpenStack Platform networking.
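For example, a per-security-group flow log might be created as follows; the names shown are placeholders:
$ openstack network log create --resource-type security_group \
    --resource <security_group_id> --event ALL my-log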
- BZ#2217867
- On NVIDIA ConnectX-5 and ConnectX-6 NICs, when using hardware offload, some offloaded flows on a PF can cause transient performance issues on the associated VFs. This issue is specifically observed with LLDP and VRRP traffic.
- BZ#2220887
- The data collection service (ceilometer) reports the wrong unit for power current. Current is measured in amperes, not in watts.
- BZ#2234902
- The validation check-kernel-version does not function correctly and reports a failure. You can ignore the failure.
- BZ#2237290
The Networking service (neutron) does not prevent you from disabling or removing a networking profile, even if that profile is part of a flavor that is in use by a router. The disablement or removal of the profile can disrupt proper operation of the router.
Workaround: Before you disable or remove a networking profile, ensure that it is not part of a flavor that is currently used by a router.
- BZ#2241270
- The frr-status and oslo-config-validator validations report FAILED during an update. You can ignore these error messages. They are specific to the validation code and do not indicate any conditions that affect 17.1 operations. They will be fixed in a future release.
- BZ#2241326
- LDAP server connections are removed as expected from the Keystone LDAP pool on either TIMEOUT or SERVER_DOWN errors. The LDAP pool exhausts its connections and is unable to re-establish new ones. The MaxConnectionReachedError is issued. Workaround: Disable LDAP pool.
- BZ#2243267
- The presence of the Virtual Data Optimizer (VDO) package causes the checkvdo Leapp actor to fail. As a result, the Leapp upgrade fails. To complete the Leapp upgrade successfully, remove the VDO package.
- BZ#2251176
The Ceph Dashboard cannot reach the Prometheus service endpoint and displays the following error message: 404 not found. This error occurs because the configuration of the VIP for the Prometheus service is not correct.
Workaround:
- Verify that haproxy is properly configured: ssh into a controller node (such as controller-0) and run curl http://10.143.0.25:9092. If the curl is successful, the haproxy configuration is correct.
- If the curl succeeded, ssh into the controller node and update the Prometheus API config in the Ceph cluster:
$ sudo cephadm shell -- ceph dashboard set-prometheus-api-host http://10.143.0.25:9092
- To verify that the Ceph Dashboard can reach the Prometheus service endpoint and no longer displays the 404 not found error message, check the Ceph Dashboard UI.
- BZ#2254553
- Currently, in Red Hat Ceph Storage 6, cephadm attempts to bind the Grafana daemon to all interfaces when a valid network list is provided. This prevents the Grafana daemon from starting.
- BZ#2255302
If your deployment has an external Ceph cluster with multiple file systems, you cannot create a Shared File System service (manila) share as expected.
The cephfs_filesystem_name driver configuration parameter that is needed to avoid this situation cannot be set by using director’s heat template parameters.
Workaround: Set the cephfs_filesystem_name parameter via ExtraConfig to specify the file system that the Shared File System service (manila) must use.
Add the parameter to an environment file as shown in the following example:
$ cat /home/stack/manila_cephfs_customization.yaml
parameter_defaults:
  ExtraConfig:
    manila::config::manila_config:
      cephfs/cephfs_filesystem_name:
        value: <filesystem>
Replace the value of <filesystem> with the appropriate name and include this environment file with the openstack overcloud deploy command.
- BZ#2257419
The way cgroups are managed for libvirt changed between RHEL releases. As a result, virsh cpu-stats is not an officially supported part of the Red Hat OpenStack Platform (RHOSP) product. The functionality is provided by the underlying version of libvirt received from RHEL. Greenfield RHOSP 17.x is only supported on RHEL 9, which uses cgroups v2. The cgroups v2 API does not provide the API required to support virsh cpu-stats, and this functionality is unavailable when using 17.1 on RHEL 9.
RHOSP 17.1 on RHEL 8, supported during mixed RHEL upgrades from 16.2, has been updated to make virsh cpu-stats functional while running on RHEL 8.4. As a result, virsh cpu-stats functionality has been restored on RHEL 8 hosts but is unavailable on fully upgraded RHEL 9 hosts.
- BZ#2259873
When upgrading RHOSP 16.2 to 17.1 on environments that use Lenovo SR650 servers, the servers fail on their first boot, displaying a blue screen stating that there is a lack of valid boot devices.
This issue is caused by Lenovo UEFI firmware resetting boot records after deployment. RHOSP director requests two changes to UEFI firmware settings. However, Lenovo hardware can only handle one request before rebooting.
Workaround: You must manually reboot the Lenovo servers to the desired operating system.
- BZ#2266778
- In RHOSP 17.1 environments that use the RHOSP DNS service (designate), zone transfers that involve TSIG keys can fail. The log message is: AttributeError: 'TsigKeyring' object has no attribute 'name'. This issue is caused by the python3-dns package version 2.x introducing an incompatibility with the RHOSP DNS service. This has been fixed and will be available in a later maintenance release. Workaround: Currently, there is no workaround.
- BZ#2267882
- There is a known issue where listing the records in a zone using the RHOSP Dashboard (horizon) returns only 20 results, even if the zone contains more than 20 records. The RHOSP DNS service (designate) dashboard does not properly support pagination in the Dashboard. This has been fixed and will be available in a future RHOSP maintenance release. Workaround: Currently, the workaround is to use the RHOSP command line interface instead of the Dashboard.
- BZ#2274468
In RHOSP 17.1 environments that use dynamic routing and the OpenStack Load-balancing service (octavia) with the OVN provider driver, there is a known issue where the Load-balancing VIP is deleted. The deletion is caused by the process that synchronizes the OVN BGP agent and the Load-balancing service.
Workaround: The workaround is to increase the reconcile interval to a very high value. Create a custom environment YAML file and add the following values:
parameter_defaults:
  FrrOvnBgpAgentReconcileInterval: 999999
For more information, see 4.11. Deploying a spine-leaf enabled overcloud.
Important: Using this workaround means that the OVN load-balancing VIPs are operational, but the sync between the OVN BGP agent and Free Range Routing (FRR) is effectively not working. With sync non-operational, if a problem occurs during FRR configuration, FRR will not recover until the configured interval has passed.
- BZ#2274663
In RHOSP environments that use dynamic routing, during a minor update, Free Range Routing (FRR) is restarted twice in a row. This occurs during updates in the following scenarios:
- from RHOSP 17.1.0 to 17.1.2 or 17.1.3.
- from RHOSP 17.1.1 to 17.1.2 or 17.1.3.
The first restart occurs because there is a new container image. The second restart is triggered by a change to the tripleo_frr.service systemd file.
These unwanted restarts were introduced in a bug fix to address BZ 2237245.
Workaround: Perform the following steps:
Important: This workaround requires restarting the tripleo_frr service and might cause network down time. Therefore, perform these steps during a maintenance window.
- Open the config file, /etc/systemd/system/tripleo_frr.service. After the first instance of ExecStopPost, add another instance of ExecStopPost that contains the following values:
ExecStopPost=/usr/bin/sleep 10
Example
[Unit]
Description=frr container
After=tripleo-container-shutdown.service
[Service]
Restart=always
ExecStart=/usr/bin/podman start frr
ExecReload=/usr/bin/podman kill --signal HUP frr
ExecStop=/usr/bin/podman stop -t 42 frr
ExecStopPost=/usr/bin/podman stop -t 42 frr
ExecStopPost=/usr/bin/sleep 10
SuccessExitStatus=137 142 143
TimeoutStopSec=84
KillMode=control-group
Type=forking
PIDFile=/run/frr.pid
[Install]
WantedBy=multi-user.target
…
- Restart the `tripleo_frr` service:
# systemctl daemon-reload
# systemctl restart tripleo_frr
3.2.6. Deprecated functionality
The items in this section are either no longer supported, or will no longer be supported in a future release of Red Hat OpenStack Platform (RHOSP):
- BZ#1946898
-
The i440FX PC machine type, `pc-i440fx`, was deprecated in RHEL 8. While the `pc-i440fx-*` machine types are still available, Red Hat recommends that you use the default Q35 machine type in RHOSP 17.1. Some RHOSP 17.1 features do not work with the i440FX PC machine type. For example, the VirtIO Block (`virtio-blk`) device does not work with the i440FX PC machine type in RHOSP 17.1. To use VirtIO Block as the block device for instances in RHOSP 17.1, your instances must use the Q35 machine type.
3.3. Red Hat OpenStack Platform 17.1.2 Maintenance Release - January 16, 2024
Consider the following updates in Red Hat OpenStack Platform (RHOSP) when you deploy this RHOSP release.
3.3.1. Advisory list
This release of Red Hat OpenStack Platform (RHOSP) includes the following advisories:
- RHBA-2024:0185
- Red Hat OpenStack Platform 17.1.2 bug fix and enhancement advisory
- RHBA-2024:0186
- Updated Red Hat OpenStack Platform 17.1.2 container images
- RHSA-2024:0187
- Moderate: Red Hat OpenStack Platform 17.1 (python-urllib3) security update
- RHSA-2024:0188
- Moderate: Red Hat OpenStack Platform 17.1 (python-eventlet) security update
- RHSA-2024:0189
- Moderate: Red Hat OpenStack Platform 17.1 (python-werkzeug) security update
- RHSA-2024:0190
- Moderate: Red Hat OpenStack Platform 17.1 (GitPython) security update
- RHSA-2024:0191
- Moderate: Red Hat OpenStack Platform 17.1 (openstack-tripleo-common) security update
- RHBA-2024:0209
- Red Hat OpenStack Platform 17.1.2 bug fix and enhancement advisory
- RHBA-2024:0210
- Updated Red Hat OpenStack Platform 17.1.2 container images
- RHBA-2024:0211
- Red Hat OpenStack Platform 17.1.2 RHEL 9 director images
- RHSA-2024:0212
- Moderate: Red Hat OpenStack Platform 17.1 (python-django) security update
- RHSA-2024:0213
- Moderate: Red Hat OpenStack Platform 17.1 (python-eventlet) security update
- RHSA-2024:0214
- Moderate: Red Hat OpenStack Platform 17.1 (python-werkzeug) security update
- RHSA-2024:0215
- Moderate: Red Hat OpenStack Platform 17.1 (GitPython) security update
- RHSA-2024:0216
- Moderate: Red Hat OpenStack Platform 17.1 (openstack-tripleo-common) security update
- RHSA-2024:0217
- Moderate: Red Hat OpenStack Platform 17.1 (rabbitmq-server) security update
- RHSA-2024:0263
- Updated Red Hat OpenStack Platform 17.1.2 director Operator container images
3.3.2. Bug fixes
These bugs were fixed in this release of Red Hat OpenStack Platform (RHOSP):
- BZ#2108212
This update fixes an issue that disrupted connections to instances over IPv6 during migration from the OVS mechanism driver to the OVN mechanism driver.
Now you can migrate from OVS to OVN with IPv6 without instance connection disruption.
- BZ#2126725
- Before this update, hard-coded certificate location operated independently of user-provided values. During deployment with custom certificate locations, services did not retrieve information from API endpoints because Transport Layer Security (TLS) verification failed. With this update, user-provided certificate locations are used during deployment.
- BZ#2151219
-
Before this update, RHOSP director did not allow for automatically configuring nameserver (NS) records to match a parent's NS records. In RHOSP 17.1.2, this issue has been resolved by the addition of a new Orchestration service (heat) parameter, `DesignateBindNSRecords`. Administrators can use this new parameter to define the list of root NS records for the domains that the DNS service (designate) populates. For more information, see Configuring DNS as a service.
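For illustration, a minimal sketch of how this parameter might be set in a custom environment file (the hostnames are placeholders, not values from this release note):
parameter_defaults:
  DesignateBindNSRecords:
    - ns1.example.com.
    - ns2.example.com.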
- BZ#2167428
- Before this update, during a new deployment, the Identity service (keystone) was often not available during initialization of the agent-notification service. This prevented the data collection service (ceilometer) from discovering the gnocchi endpoint. As a result, metrics were not sent to gnocchi. With this update, the data collection service tries to connect to gnocchi multiple times before declaring that it cannot be reached.
- BZ#2180542
This update fixes a bug that caused the `ceph-nfs` service to fail after a reboot of all Controller nodes.
The Pacemaker-controlled `ceph-nfs` resource requires a runtime directory to store some process data. Before this update, the directory was created when you installed or upgraded RHOSP. However, a reboot of the Controller nodes removed the directory, and the `ceph-nfs` service did not recover when the Controller nodes were rebooted. If all Controller nodes were rebooted, the `ceph-nfs` service failed permanently.
With this update, the directory is created before spawning the `ceph-nfs` service, and the service continues through reboots.
- BZ#2180883
-
This update fixes a bug that caused `rsyslog` to stop sending logs to Elasticsearch.
- BZ#2193388
- Before this update, the Dashboard service (horizon) was configured to validate client TLS certificates by default, which broke the Dashboard service on all TLS everywhere (TLS-e) deployments. With this update, the Dashboard service no longer validates client TLS certificates by default, and the Dashboard service functions as expected.
- BZ#2196291
- This update fixes a bug that prevented non-admin users from listing or managing policy rules. Now you can allow non-admin users to list or manage policy rules.
- BZ#2203785
This update fixes a permission issue that caused collectd sensubility to stop working after you rebooted a baremetal node.
Now collectd sensubility continues working after you reboot a baremetal node.
- BZ#2213126
This update fixes an issue that sometimes caused the security group logging queue to stop accepting entries before reaching the limit set in `NeutronOVNLoggingBurstLimit`.
You can set the maximum number of log entries per second with the parameter `NeutronOVNLoggingRateLimit`. If log entry creation exceeds that rate, the excess is buffered in a queue up to the number of log entries that you specify in `NeutronOVNLoggingBurstLimit`.
Before this update, during short bursts, the queue sometimes stopped accepting entries before reaching the limit specified in `NeutronOVNLoggingBurstLimit`. With this update, the `NeutronOVNLoggingBurstLimit` value affects the queue limit as expected.
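For illustration, a minimal sketch of setting both parameters in an environment file (the numeric values are placeholders, not recommendations from this release note):
parameter_defaults:
  NeutronOVNLoggingRateLimit: 100
  NeutronOVNLoggingBurstLimit: 25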
- BZ#2213742
- This update fixes a bug that prevented TCP health monitors in UDP pools from performing as expected. Previously, the states of the pool members and the health monitors were not correctly reported. This was caused by SELinux rules that broke the use of TCP health monitors on specific port numbers in UDP pools. Now the health monitors perform correctly.
- BZ#2215969
- Before this update, Google Chrome did not display the list of load balancer members correctly, which prevented users from using the dashboard to add members to a load balancer. With this update, Google Chrome displays the list of load balancer members correctly.
- BZ#2216130
Before this update, `puppet-ceilometer` did not populate the `tenant_name_discovery` parameter in the ceilometer configuration on Compute nodes. This prevented identification of the `Project name` and `User name` fields.
With this update, the addition of the `tenant_name_discovery` parameter to the Compute namespace in `puppet-ceilometer` resolves the issue. When the `tenant_name_discovery` parameter is set to `true`, the `Project name` and `User name` fields are populated.
- BZ#2218596
This update fixes a bug that caused problems after a migration to the OVN mechanism driver if the original ML2/OVS environment used iptables_hybrid firewall and trunk ports.
Previously, if the original ML2/OVS environment used iptables_hybrid firewall and trunk ports, instance networking problems occurred if you recreated an instance with trunks after an event such as a hard reboot, start and stop, or node reboot.
Now you can migrate to the OVN mechanism driver if the original ML2/OVS environment uses iptables_hybrid firewall and trunk ports.
- BZ#2219574
- Before this update, puppet-ceilometer did not support configuring caching options for the data collection service (ceilometer). With this update, puppet-ceilometer supports configuring caching options for the data collection service. This support uses tripleo heat templates to provide better flexibility for configuring the caching back end.
- BZ#2219613
-
Before this update, in RHOSP 17.1 distributed virtual router (DVR) environments, traffic was being incorrectly centralized when sent to floating IP addresses (FIPs) whose attached ports were `DOWN`. With this update, network traffic is no longer centralized if the FIP ports are in a `DOWN` state.
- BZ#2220808
-
Before this update, the data collection service (ceilometer) did not create a resource in gnocchi because of the missing `hardware.ipmi.fan` metric in gnocchi's resource types. With this update, gnocchi reports fan metrics, which resolves the issue.
- BZ#2220930
-
Before this update, in environments that ran the DNS service (designate), there was a known issue where the `bind9` and `unbound` services did not automatically restart if the configuration changed. With this update, the `bind9` and `unbound` services automatically restart if the configuration changes.
- BZ#2222420
- BZ#2222825
-
Before this update, when you configured Nova with `[quota]count_usage_from_placement = True` and you unshelved a shelved offloaded server, you could exceed your quota limit because quota was not enforced. With this update, when you configure Nova with `[quota]count_usage_from_placement = True` and you unshelve a shelved offloaded server, the quota limit is enforced.
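For reference, the option named above lives in the [quota] section of nova.conf; a minimal sketch:
[quota]
count_usage_from_placement = True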
- BZ#2223294
-
This update fixes a bug that caused failure of the collection agent `collectd-sensubility` on RHEL 8 Compute nodes during an in-place upgrade from RHOSP 16.2 to 17.1.
- BZ#2226963
-
Before this update, if a DCN site had 3 `DistributedComputeHCI` nodes and at least 1 `DistributedComputeHCIScaleOut` node, `cephadm` generated the incorrect spec. With this update, if a DCN site has a mix of `DistributedComputeHCI` and `DistributedComputeHCIScaleOut` nodes, `cephadm` generates the spec correctly.
- BZ#2227360
-
Before this update, the image cache cleanup task of the NetApp NFS driver caused unpredictable slowdowns in other Block Storage services. With this update, the image cache cleanup task of the NetApp NFS driver no longer causes unpredictable slowdowns in other Block Storage services. The NetApp NFS driver also provides the `netapp_nfs_image_cache_cleanup_interval` configuration option, with a default value of 600 seconds that should be adequate for most situations.
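For illustration, a minimal sketch of how this option might appear in the cinder configuration for a NetApp NFS back end (the section name is a placeholder; director deployments typically manage this through heat parameters):
[netapp_nfs_backend]
netapp_nfs_image_cache_cleanup_interval = 600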
- BZ#2228818
Previously, the nova_virtlogd container did not get updated from ubi 8 to ubi 9 as expected after a RHOSP upgrade of the Compute node to RHOSP 17.1 with RHEL 9.2. The container was updated only after rebooting the Compute node.
Now, the nova_virtlogd container gets updated to ubi 9 before the RHOSP upgrade. Note that in subsequent RHOSP updates, you must reboot the Compute node after any change to the virtlogd container, because restarting the container would cause workload logs to become unreachable.
- BZ#2231378
- Before this update, the Red Hat Ceph Storage back end of the Block Storage (cinder) backup service did not form the internal backup name correctly. As a result, backups that were stored in Ceph could not be restored to volumes that were stored on a non-Ceph back end. With this update, the Red Hat Ceph Storage back end forms backup names correctly. Ceph can now identify all the constituent parts of a backup and can restore the data to a volume that is stored on a non-Ceph back end.
- BZ#2232562
Before this update, `openstack overcloud deploy` did not pass the value of the `OVNAvailabilityZone` role parameter to OVS.
With this update, the `OVNAvailabilityZone` role parameter correctly passes the value as an `availability-zones` value in `external-ids:ovn-cms-options`.
The following example shows how to use the parameter in an environment file to set `OVNAvailabilityZone`. Include the environment file in the deployment command.
ControllerParameters:
  OVNAvailabilityZone: 'az1'
The deployment adds `availability-zones=az1` to the OVS `external-ids:ovn-cms-options`.
- BZ#2233136
-
Before this update, when multiple values were provided in a comma-delimited list, the `CinderNetappNfsShares` parameter was incorrectly parsed. As a result, a NetApp back end with multiple NFS shares could not be defined. With this update, the `CinderNetappNfsShares` parameter is correctly parsed when provided with multiple values in a comma-delimited list. As a result, a NetApp back end with multiple NFS shares is correctly defined.
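For illustration, a minimal sketch of the comma-delimited format in an environment file (the addresses and export paths are placeholders):
parameter_defaults:
  CinderNetappNfsShares: '192.0.2.10:/vol_cinder1,192.0.2.11:/vol_cinder2'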
- BZ#2233457
-
Before this update, the WSGI logs for the `cinder-api` service were not stored in a persistent location, so you could not view the logs to troubleshoot issues. With this update, the WSGI logs are stored on the Controller nodes where the `cinder-api` service runs, in the `/var/log/containers/httpd/cinder-api` directory, which resolves the issue.
- BZ#2233487
- Before this update, if you used RHOSP dynamic routing in your RHOSP environment and you created a load balancer by using the RHOSP Load-balancing service (octavia), the latency between the Controller nodes might have caused the OVN provider driver to fail. With this update, load balancers are successfully created when using the OVN provider driver on Controller nodes that are experiencing latency.
- BZ#2235621
-
Before this update, the RHOSP upgrade from 16.2 to 17.1 failed when pulling images from `registry.redhat.io` because the upgrade playbook did not include the Podman registry login task. This issue is resolved in RHOSP 17.1.2.
- BZ#2237245
- With this update, RHOSP 17.1 environments that use dynamic routing can now update to RHOSP 17.1.2 correctly. RHOSP director now successfully updates the Free Range Routing (FRR) component without requiring any workaround.
- BZ#2237251
-
Before this update, RHOSP environments that used the Load-balancing service (octavia) with the OVN provider and a health monitor caused the load-balancing pool to display a fake member's status as `ONLINE`. With this update, if you use a health monitor for a pool, the fake load-balancing pool member now has the `ERROR` operating status, and the Load Balancer, Listener, and Pool operating statuses are updated accordingly.
- BZ#2237866
-
Before this update, configuring caching parameters for ceilometer was not supported. With this update, for caching, ceilometer uses the `dogpile.cache.memcached` back end. If you manually disable caching, ceilometer uses the `oslo_cache.dict` back end.
- BZ#2240591
- Before this update, calling the member batch update API triggered race conditions in the Octavia API service, which caused the load balancer to be stuck in the `PENDING_UPDATE` provisioning status. With this update, calling the member batch update API does not trigger race conditions, which resolves the issue.
- BZ#2242605
-
Before this update, an upgrade from RHOSP 16.2 to 17.1 failed on environments that were not connected to the internet because the `infra_image` value was not defined. The `overcloud_upgrade_prepare.sh` script tried to pull `registry.access.redhat.com/ubi8/pause` instead, which caused an error. The issue is resolved in RHOSP 17.1.2.
- BZ#2244631
-
Before this update, performing a manual OVN DB sync while the OVN metadata and the OVN LB health monitor ports were present in the same environment caused the OVN DB sync to delete one of the ports. If the OVN metadata port was deleted, you lost communication with the VMs. With this update, a manual OVN DB sync does not delete one of the ports because the OVN provider uses the `ovn-lb-hm:distributed` value for the `device_owner` parameter. The OVN provider updates existing OVN LB health monitor ports to the `ovn-lb-hm:distributed` value.
- BZ#2246563
- Before this update, director did not include the puppet modules and the heat templates that you needed to configure the Pure FlashBlade driver with the Red Hat OpenStack Shared File Systems service (manila). With this update, director includes the necessary puppet modules and heat templates for your configuration.
3.3.3. Enhancements
This release of Red Hat OpenStack Platform (RHOSP) features the following enhancements:
- BZ#1759007
- The upgrade of multi-cell environments is now supported.
- BZ#1813561
- With this update, the Load-balancing service (octavia) supports HTTP/2 load balancing by using the Application Layer Protocol Negotiation (ALPN) for listeners and pools that are enabled with Transport Layer Security (TLS). The HTTP/2 protocol improves performance by loading pages faster.
- BZ#1816766
- This enhancement adds support for uploading compressed images to the Image service (glance). You can use the image decompression plugin to optimize network bandwidth by reducing image upload times and storage consumption on hosts.
- BZ#2222699
This update fixes a bug that set the wrong MTU value on tenant networks that were changed from VXLAN to Geneve after a migration from the OVS mechanism driver to the OVN mechanism driver. Before this update, the `cloud-init` package overrode the value that was correctly set by the DHCP server.
For example, after a migration from the OVS mechanism driver with VXLAN to the OVN mechanism driver with Geneve and a 1442 MTU, cloud-init reset the MTU to 1500.
With this update, the value set by the DHCP server persists.
- BZ#2233695
- This enhancement adds support for the Revert to Snapshot feature for iSCSI, FC, and NFS drivers with FlexVol pool. Limitations: This feature does not support FlexGroups. Also, you can revert to only the most recent snapshot of a Block Storage volume.
- BZ#2237500
-
This update clarifies an error message produced by `openstack-tripleo-validations`. Previously, if a host was not found when you ran a validation, the command reported the status as FAILED. Now the status is reported as SKIPPED.
3.3.4. Technology previews
You can test the following Technology Preview features in this release of Red Hat OpenStack Platform (RHOSP). These features provide early access to upcoming product features so that you can test functionality and provide feedback during the development process. These features are not supported with your Red Hat subscription, and Red Hat does not recommend using them for production. For more information about the scope of support for Technology Preview features, see https://access.redhat.com/support/offerings/techpreview/.
- BZ#1848407
- In RHOSP 17.1, a technology preview is available for the Stream Control Transmission Protocol (SCTP) in the Load-balancing service (octavia). Users can create SCTP listeners and attach SCTP pools in a load balancer.
- BZ#2217663
In RHOSP 17.1, a technology preview is available for the VF-LAG transmit hash policy offload that enables load balancing at NIC hardware for offloaded traffic/flows. This hash policy is only available for layer3+4 base hashing.
To use the technology preview, verify that your templates include a bonding options parameter to enable the xmit hash policy as shown in the following example:
bonding_options: "mode=802.3ad miimon=100 lacp_rate=fast xmit_hash_policy=layer3+4"
3.3.5. Known issues
These known issues exist in Red Hat OpenStack Platform (RHOSP) at this time:
- BZ#2034801
A RHOSP deployment can fail when a very large number of virtual functions (VFs) are created per physical function (PF). NetworkManager issues a DHCP request on all of them, leading to failures in the NetworkManager service.
For example, this issue occurred during a deployment that included 256 VFs across 4 PFs.
Workaround: Avoid creating a very large number of VFs per PF.
- BZ#2107599
-
Do not change `binding:vnic_type` on a port that is attached to an instance. Doing so causes `nova_compute` to enter a restart loop when it is restarted.
- BZ#2160481
In RHOSP 17.1 environments that use BGP dynamic routing, there is currently a known issue where floating IP (FIP) port forwarding fails.
When FIP port forwarding is configured, packets sent to a specific destination port with a destination IP that equals the FIP are redirected to an internal IP from a RHOSP Networking service (neutron) port. This occurs regardless of the protocol that is used: TCP, UDP, and so on.
When BGP dynamic routing is configured, the routes to the FIPs used to perform FIP port forwarding are not exposed, and these packets cannot reach their final destinations.
Workaround: Currently, there is no workaround.
- BZ#2163477
- In RHOSP 17.1 environments that use BGP dynamic routing, there is currently a known issue affecting instances connected to provider networks. The RHOSP Compute service cannot route packets sent from one of these instances to a multicast IP address destination. Therefore, instances subscribed to a multicast group fail to receive the packets sent to them. The cause is that BGP multicast routing is not properly configured on the overcloud nodes. Workaround: Currently, there is no workaround.
- BZ#2178500
-
If a volume refresh fails when using the `nova-manage` CLI, the instance stays in a locked state.
- BZ#2187985
Adding a load balancer member whose subnet is not in the Load-balancing service (octavia) availability zone puts the load balancer in the `ERROR` state. The member cannot be removed because of the `ERROR` status, making the load balancer unusable.
Workaround: Delete the load balancer.
- BZ#2192913
In RHOSP environments with ML2/OVN or ML2/OVS that have DVR enabled and use VLAN tenant networks, east/west traffic between instances connected to different tenant networks is flooded to the fabric.
As a result, packets between those instances reach not only the Compute nodes where those instances run, but also any other overcloud node.
This might impact the network and it might be a security risk because the fabric sends traffic everywhere.
This bug will be fixed in a later FDP release. You do not need to perform a RHOSP update to obtain the FDP fix.
- BZ#2210319
Currently, the Retbleed vulnerability mitigation in RHEL 9.2 can cause a performance drop for Open vSwitch with Data Plane Development Kit (OVS-DPDK) on Intel Skylake CPUs.
This performance regression happens only if C-states are disabled in the BIOS, Hyper-Threading Technology is enabled, and OVS-DPDK is using only one logical core of a given core.
Workaround: Assign both logical cores to OVS-DPDK or to SRIOV guests that have DPDK running as recommended in the NFV configuration guide.
- BZ#2216021
RHOSP 17.1 with the OVN mechanism driver does not support logging of flow events per port or the use of the `--target` option of the `network log create` command.
RHOSP 17.1 supports logging of flow events per security group, using the `--resource` option of the `network log create` command. For more information, see Logging security group actions in Configuring Red Hat OpenStack Platform networking.
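For illustration, a minimal sketch of the supported per-security-group form (the security group and log names are placeholders):
$ openstack network log create --resource-type security_group \
  --resource sg1 --event ALL sg1-logging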
- BZ#2217867
- On Nvidia ConnectX-5 and ConnectX-6 NICs, when using hardware offload, some offloaded flows on a PF can cause transient performance issues on the associated VFs. This issue is observed specifically with LLDP and VRRP traffic.
- BZ#2220887
- The data collection service (ceilometer) does not filter separate power and current metrics.
- BZ#2222683
Currently, there is no support for Multi-RHEL for the following deployment architectures:
- Edge (DCN)
- ShiftOnStack
- Director operator-based deployments
Workaround: Use only a single version of RHEL across your RHOSP deployment when operating one of the listed architectures.
- BZ#2223916
In RHOSP 17.1 GA environments that use the ML2/OVN mechanism driver, floating IP port forwarding does not function correctly.
FIP port forwarding should be centralized on the Controller or the Networker nodes. Instead, VLAN and flat networks distribute north-south network traffic when FIPs are used.
Workaround: To resolve this problem and force FIP port forwarding through the centralized gateway node, either set the RHOSP Orchestration service (heat) parameter `NeutronEnableDVR` to `false`, or use Geneve instead of VLAN or flat project networks.
- BZ#2224236
In this release of RHOSP, SR-IOV interfaces that use Intel X710 and E810 series controller virtual functions (VFs) with the iavf driver can experience network connectivity issues that involve link status flapping. The affected guest kernel versions are:
- RHEL 8.7.0 to 8.7.3 (No fixes planned. End of life.)
- RHEL 8.8.0 to 8.8.2 (Fix planned in version 8.8.3.)
- RHEL 9.2.0 to 9.2.2 (Fix planned in version 9.2.3.)
- Upstream Linux 4.9.0 to 6.4.* (Fix planned in version 6.5.)
Workaround: There is none, other than to use a non-affected guest kernel.
- BZ#2231893
The metadata service can become unavailable after the metadata agent fails in multiple attempts to start a malfunctioning HAProxy child container. The metadata agent logs an error message similar to: `ProcessExecutionError: Exit code: 125; Stdin: ; Stdout: Starting a new child container neutron-haproxy-ovnmeta-<uuid>`.
Workaround: Run `podman kill <container_name>` to stop the problematic haproxy child container.
to stop the problematic haproxy child container.- BZ#2231960
- When a Block Storage volume uses the Red Hat Ceph Storage back end, a volume cannot be removed when a snapshot is created from this volume and then a volume clone is created from this snapshot. In this case, you cannot remove the original volume while the volume clone exists.
- BZ#2237290
The Networking service (neutron) does not prevent you from disabling or removing a networking profile, even if that profile is part of a flavor that is in use by a router. The disablement or removal of the profile can disrupt proper operation of the router.
Workaround: Before you disable or remove a networking profile, ensure that it is not part of a flavor that is currently used by a router.
- BZ#2241270
-
The `frr-status` and `oslo-config-validator` validations report FAILED during an update. You can ignore these error messages. They are specific to the validation code and do not indicate any conditions that affect 17.1 operations. They will be fixed in a future release.
- BZ#2241326
-
LDAP server connections are removed as expected from the Keystone LDAP pool on either `TIMEOUT` or `SERVER_DOWN` errors. However, the LDAP pool exhausts its connections and is unable to re-establish new ones, and the `MaxConnectionReachedError` is issued. Workaround: Disable the LDAP pool.
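For illustration, disabling pooling maps to the keystone [ldap] options shown in this sketch (verify these settings against your deployment before applying them):
[ldap]
use_pool = false
use_auth_pool = false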
- BZ#2242439
-
With `localnet_learn_fdb` enabled, packet loss can occur in traffic between two instances hosted by different Compute nodes. This is a core OVN issue. To avoid the issue, do not enable `localnet_learn_fdb`.
- BZ#2249690
-
If there are multiple Ceph clusters in a DCN fast forward upgrade (FFU), the Ceph cluster upgrades fail because the `ceph-ansible` package cannot be found: it is removed during the first Ceph cluster upgrade.
- BZ#2251176
The Ceph Dashboard cannot reach the Prometheus service endpoint and displays the following error message: `404 not found`. This error occurs because the configuration of the VIP for the Prometheus service is incorrect.
Workaround:
- Verify that haproxy is properly configured: ssh into a Controller node (such as controller-0) and run `curl http://10.143.0.25:9092`. If the curl succeeds, the haproxy configuration is correct.
- If the curl succeeded, ssh into the Controller node and update the Prometheus API config in the Ceph cluster:
$ sudo cephadm shell -- ceph dashboard set-prometheus-api-host http://10.143.0.25:9092
- To verify that the Ceph Dashboard can reach the Prometheus service endpoint and no longer displays the `404 not found` error message, check the Ceph Dashboard UI.
- BZ#2252723
Some AMD environments fail to boot when provisioned with the overcloud-hardened-uefi-full.raw image because of the included kernel argument `console=ttyS0`. As a result, the boot sequence halts with no diagnostic or error message.
Workaround: Run the following commands to edit the overcloud image:
sudo yum install guestfs-tools -y
sudo systemctl start libvirtd
sudo virt-customize -a /var/lib/ironic/images/overcloud-hardened-uefi-full.raw \
  --run-command "sed -i 's/console=ttyS0 //g' /etc/default/grub" \
  --run-command "grub2-mkconfig -o /boot/grub2/grub.cfg" \
  --run-command "grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg"
After running those commands, you can provision the AMD nodes using the provision command.
- BZ#2254036
- During a director-deployed Ceph upgrade, if the CephClusterName variable was overridden to a value other than "ceph", the upgrade process fails. All distributed compute node (DCN) deployments override this variable.
- BZ#2254553
-
In Red Hat Ceph Storage 6, there is currently a known issue where `cephadm` attempts to bind the Grafana daemon to all interfaces when a valid network list is provided. This prevents the Grafana daemon from starting.
- BZ#2254994
In RHOSP 17.1.2 environments that contain Load-balancing service (octavia) health monitor ports from a previous version, running `neutron-db-sync-tool` might randomly delete any of those pre-existing ports or OVN metadata ports. This unintended port deletion results in a loss of health monitor capacity, or communication loss with the affected instances.
Workaround: Manually update the `device_owner` field on existing Load-balancing service health monitor ports to the value of `ovn-lb-hm:distributed`. Doing so ensures that if the `neutron-db-sync-tool` is launched, the health monitor or OVN metadata ports are not adversely impacted.
- BZ#2255302
If your deployment has an external Ceph cluster with multiple file systems, you cannot create a Shared File System service (manila) share as expected.
The `cephfs_filesystem_name` driver configuration parameter that is needed to avoid this situation cannot be set using director's heat template parameters.
Add the parameter to an environment file as shown in the following example:
$ cat /home/stack/manila_cephfs_customization.yaml
parameter_defaults:
  ExtraConfig:
    manila::config::manila_config:
      cephfs/cephfs_filesystem_name:
        value: <filesystem>
Replace the value of <filesystem> with the appropriate name and include this environment file with the `openstack overcloud deploy` command.
- BZ#2255324
A director bug can disrupt or crash client workloads during updates or upgrades to any RHOSP 17.1 version. This bug affects deployments that enable the RHOSP Shared File Systems service (manila) with the CephFS-via-NFS back end.
The bug causes deletion of Ceph NFS export information during update or upgrade operations. This export information is created by the Shared File Systems service when users set up access rules on their shares.
When the NFS server goes into recovery mode, client workloads can hang and eventually crash if they were actively reading or writing to NFS shares.
Workaround: See Manila shares with Red Hat OpenStack 17.1 can be abruptly disconnected due to export information loss.
3.4. Red Hat OpenStack Platform 17.1.1 Maintenance Release - September 20, 2023
Consider the following updates in Red Hat OpenStack Platform (RHOSP) when you deploy this RHOSP release.
3.4.1. Advisory list
This release of Red Hat OpenStack Platform (RHOSP) includes the following advisories:
- RHBA-2023:5134
- Release of containers for OSP 17.1
- RHBA-2023:5135
- Release of components for OSP 17.1
- RHBA-2023:5136
- Release of containers for OSP 17.1
- RHBA-2023:5137
- Red Hat OpenStack Platform 17.1 RHEL 9 deployment images
- RHBA-2023:5138
- Release of components for OSP 17.1
3.4.2. Bug fixes
These bugs were fixed in this release of Red Hat OpenStack Platform (RHOSP):
- BZ#2184834
-
Before this update, the Block Storage API supported the creation of a Block Storage multi-attach volume by passing a parameter in the volume-create request. This method had been deprecated for removal because it is unsafe and can lead to data loss when creating a multi-attach volume on a back end that does not support multi-attach volumes. The `openstack` and `cinder` CLI only supported creating a multi-attach volume by using a multi-attach volume-type. With this update, the Block Storage API only supports creating a multi-attach volume by using a multi-attach volume-type. Therefore, some Block Storage API requests that used to work are now rejected with a 400 (Bad Request) response code and an informative error message.
- BZ#2222543
This update fixes a bug that negatively affected OVN database operation after the replacement of a bootstrap Controller node. Before this update, you could not use the original bootstrap Controller node hostname and IP address for the replacement Controller node, because the name reuse caused issues with OVN database RAFT clusters.
Now, you can use the original hostname and IP address for the replacement Controller node.
- BZ#2222589
- Before this update, during the upgrade from RHOSP 16.2 to 17.1, the director upgrade script stopped executing when upgrading Red Hat Ceph Storage 4 to 5 in a director-deployed Ceph Storage environment that used IPv6. This issue is resolved in RHOSP 17.1.1.
- BZ#2224527
- Before this update, the upgrade procedure from RHOSP 16.2 to 17.1 failed when RADOS Gateway (RGW) was deployed as part of director-deployed Red Hat Ceph Storage because HAProxy did not restart on the next stack update. This issue was resolved in Red Hat Ceph Storage 5.3.5 and no longer impacts RHOSP upgrades.
- BZ#2226366
-
Before this update, when retyping `in-use` Red Hat Ceph Storage (RHCS) volumes to store the volume in a different pool than its current location, data could be corrupted or lost. With this update, the Block Storage RHCS back end resolves this issue.
- BZ#2227199
Before this update, in RHOSP 17.1 environments that used the Load-balancing service (octavia) with the OVN service provider driver, load balancer health checks for floating IP addresses (FIPs) were not properly populated with the protocol port. Requests to the FIPs were incorrectly distributed to load balancer members that were in the `ERROR` state.
With this update, the issue is resolved, and any new load balancer health checks for floating IP addresses (FIPs) are properly populated with the protocol port. If you created health monitors before deploying this update, you must recreate them to resolve the port issue.
- BZ#2229750
- Before this update, when specifying an availability zone (AZ) when creating a Block Storage volume backup, the AZ was ignored, which could cause the backup to fail. With this update, the Block Storage backup service resolves this issue.
- BZ#2229761
-
Before this update, a race condition in the deployment steps for `ovn_controller` and `ovn_dbs` caused `ovn_dbs` to be upgraded before `ovn_controller`. If `ovn_controller` is not upgraded before `ovn_dbs`, an error before the restart to the new version causes packet loss. In RHOSP 17.1.1, this issue has been resolved.
- BZ#2229767
-
Before this update, when you upgraded Red Hat Ceph Storage 4 to 5 during the upgrade from RHOSP 16.2 to 17.1, the overcloud upgrade failed because the containers that were associated with `ceph-nfs-pacemaker` were down, impacting the Shared File Systems service (manila). This issue is resolved in RHOSP 17.1.1.
3.4.3. Enhancements
This release of Red Hat OpenStack Platform (RHOSP) features the following enhancements:
- BZ#2210151
-
In RHOSP 17.1.1, the RHOSP Orchestration service (heat) parameter `FrrBgpAsn` can now be set on a per-role basis instead of being a global parameter for RHOSP 17.1 environments that use RHOSP dynamic routing.
- BZ#2229026
In RHOSP 17.1.1, the `tripleo_frr_bgp_peers` role-specific parameter can now be used to specify a list of IP addresses or hostnames for Free Range Routing (FRR) to peer with.
Example
ControllerRack1ExtraGroupVars:
  tripleo_frr_bgp_peers: ["172.16.0.1", "172.16.0.2"]
3.4.4. Technology previews
The items listed in this section are provided as Technology Previews in this release of Red Hat OpenStack Platform (RHOSP). For further information on the scope of Technology Preview status, and the associated support implications, refer to https://access.redhat.com/support/offerings/techpreview/.
- BZ#1813561
- With this update, the Load-balancing service (octavia) supports HTTP/2 load balancing by using the Application Layer Protocol Negotiation (ALPN) for listeners and pools that are enabled with Transport Layer Security (TLS). The HTTP/2 protocol improves performance by loading pages faster.
- BZ#1848407
- In RHOSP 17.1, a technology preview is available for the Stream Control Transmission Protocol (SCTP) in the Load-balancing service (octavia). Users can create SCTP listeners and attach SCTP pools in a load balancer.
- BZ#2211796
This release includes a Technology Preview of the optional feature that you can use to define custom router flavors and create routers with the custom router flavors.
For more information, see Creating custom virtual routers with router flavors.
- BZ#2217663
- In RHOSP 17.1, a technology preview is available for the VF-LAG transmit hash policy offload that enables load balancing at NIC hardware for offloaded traffic/flows. This hash policy is only available for layer3+4 base hashing.
3.4.5. Known issues
These known issues exist in Red Hat OpenStack Platform (RHOSP) at this time:
- BZ#2108212
If you use IPv6 to connect to instances during migration to the OVN mechanism driver, connection to the instances might be disrupted for up to several minutes when the ML2/OVS services are stopped.
The router advertisement daemon `radvd` for IPv6 is stopped during migration to the OVN mechanism driver. While `radvd` is stopped, router advertisements are no longer broadcast. This broadcast interruption results in instance connection loss over IPv6. IPv6 communication is automatically restored once the new ML2/OVN services start.
Workaround: To avoid the potential disruption, use IPv4 instead.
- BZ#2126725
- Hard-coded certificate location operates independently of user-provided values. During deployment with custom certificate locations, services do not retrieve information from API endpoints because Transport Layer Security (TLS) verification fails.
- BZ#2144492
- If you migrate a RHOSP 17.1.0 ML2/OVS deployment with distributed virtual routing (DVR) to ML2/OVN, the floating IP (FIP) downtime that occurs during ML2/OVN migration can exceed 60 seconds.
- BZ#2151290
In RHOSP 17.1.1, director does not allow for automatically configuring NS records to match a parent's NS records. Workaround: Until an automated workaround is provided in a future release, administrators can manually change the Orchestration service (heat) template file that resides on the undercloud in `/usr/share/ansible/roles/designate_bind_pool/templates/`. In the Jinja template, `pools.yaml.j2`, remove the code following the line containing `ns_records` until the next empty line (lines 13-16) and insert appropriate values for their infrastructure. Finally, administrators should redeploy the overcloud.
Example
ns_records:
  - hostname: ns1.desiexample.com
    priority: 1
  - hostname: ns2.desiexample.com
    priority: 2
- BZ#2160481
In RHOSP 17.1 environments that use BGP dynamic routing, there is currently a known issue where floating IP (FIP) port forwarding fails.
When FIP port forwarding is configured, packets sent to a specific destination port with a destination IP that equals the FIP are redirected to an internal IP from a RHOSP Networking service (neutron) port. This occurs regardless of the protocol that is used: TCP, UDP, and so on.
When BGP dynamic routing is configured, the routes to the FIPs used to perform FIP port forwarding are not exposed, and these packets cannot reach their final destinations.
Currently, there is no workaround.
- BZ#2163477
- In RHOSP 17.1 environments that use BGP dynamic routing, there is currently a known issue affecting instances connected to provider networks. The RHOSP Compute service cannot route packets sent from one of these instances to a multicast IP address destination. Therefore, instances subscribed to a multicast group fail to receive the packets sent to them. The cause is that BGP multicast routing is not properly configured on the overcloud nodes. Currently, there is no workaround.
- BZ#2167428
In RHOSP 17.1.1, there is a known issue during a new deployment where the RHOSP Identity service (keystone) is often not available when the `agent-notification` service is initializing. This prevents ceilometer from discovering the gnocchi endpoint. As a result, metrics are not sent to gnocchi.
Workaround: Restart the agent-notification service on the Controller node:
$ sudo systemctl restart tripleo_ceilometer_agent_notification.service
- BZ#2178500
- If a volume refresh fails when using the nova-manage CLI, this causes the instance to stay in a locked state.
- BZ#2180542
The Pacemaker-controlled `ceph-nfs` resource requires a runtime directory to store some process data. The directory is created when you install or upgrade RHOSP. Currently, a reboot of the Controller nodes removes the directory, and the `ceph-nfs` service does not recover when the Controller nodes are rebooted. If all Controller nodes are rebooted, the `ceph-nfs` service fails permanently.
Workaround: If you reboot a Controller node, log into the Controller node and create a `/var/run/ceph` directory:
$ mkdir -p /var/run/ceph
Repeat this step on all Controller nodes that have been rebooted. If the `ceph-nfs-pacemaker` service has been marked as failed, after creating the directory, execute the following command from any of the Controller nodes:
$ pcs resource cleanup
- BZ#2180883
Currently, rsyslog stops sending logs to Elasticsearch when Logrotate archives all log files once a day.
Workaround: Add `RsyslogReopenOnTruncate: true` to your environment file during deployment so that rsyslog reopens all log files on log rotation.
Currently, RHOSP 17.1 ships with a puppet-rsyslog module that causes director to configure rsyslog incorrectly.
Workaround: Manually apply patch [1] in `/usr/share/openstack-tripleo-heat-templates/deployment/logging/rsyslog-container-puppet.yaml` before deployment to configure rsyslog correctly.
[1] https://github.com/openstack/tripleo-heat-templates/commit/ce0e3a9a94a4fce84dd70b6098867db1c86477fb
- BZ#2192913
In RHOSP environments with ML2/OVN or ML2/OVS that have DVR enabled and use VLAN tenant networks, east/west traffic between instances connected to different tenant networks is flooded to the fabric.
As a result, packets between those instances reach not only the Compute nodes where those instances run, but also any other overcloud node.
This could cause an impact on the network and it could be a security risk because the fabric sends traffic everywhere.
This bug will be fixed in a later FDP release. You do not need to perform a RHOSP update to obtain the FDP fix.
- BZ#2196291
- Currently, custom SRBAC rules do not permit non-admin users to list policy rules. As a consequence, non-admin users cannot list or manage these rules. Current workarounds include either disabling SRBAC, or modifying the SRBAC custom rule to permit this action.
- BZ#2203785
-
Currently, there is a permission issue that causes collectd sensubility to stop working after you reboot a baremetal node. As a consequence, sensubility stops reporting container health. Workaround: After rebooting an overcloud node, manually run the following command on the node:
sudo podman exec -it collectd setfacl -R -m u:collectd:rwx /run/podman
- BZ#2210319
Currently, the Retbleed vulnerability mitigation in RHEL 9.2 can cause a performance drop for Open vSwitch with Data Plane Development Kit (OVS-DPDK) on Intel Skylake CPUs.
This performance regression happens only if C-states are disabled in the BIOS, hyper-threading is enabled, and OVS-DPDK is using only one hyper-thread of a given core.
Workaround: Assign both hyper-threads of a core to OVS-DPDK or to SRIOV guests that have DPDK running as recommended in the NFV configuration guide.
- BZ#2210873
-
In RHOSP 17.1.1 Red Hat Ceph Storage (RHCS) environments, setting crush rules fails with an `assimilate.conf not found` error. This problem will be fixed in a later RHOSP release.
- BZ#2213126
The logging queue that buffers excess security group log entries sometimes stops accepting entries before the specified limit is reached. As a workaround, you can set the queue length higher than the number of entries you want it to hold.
You can set the maximum number of log entries per second with the parameter `NeutronOVNLoggingRateLimit`. If log entry creation exceeds that rate, the excess is buffered in a queue up to the number of log entries that you specify in `NeutronOVNLoggingBurstLimit`.
The issue is especially evident in the first second of a burst. In longer bursts, such as 60 seconds, the rate limit is more influential and compensates for burst limit inaccuracy. Thus, the issue has the greatest proportional effect in short bursts.
Workaround: Set `NeutronOVNLoggingBurstLimit` at a higher value than the target value. Observe and adjust as needed.
- BZ#2213742
TCP health monitors in UDP pools might not work as expected, depending on the port number that the monitor uses. Also, the reported statuses of the pool members and the health monitors are not correct. This is caused by SELinux rules that break the use of TCP health monitors on specific port numbers in UDP pools.
Workaround: Currently, there is no workaround.
- BZ#2216021
RHOSP 17.1 with the OVN mechanism driver does not support logging of flow events per port or the use of the `--target` option of the `network log create` command.
RHOSP 17.1 supports logging of flow events per security group, using the `--resource` option of the `network log create` command. See "Logging security group actions" in Networking with RHOSP.
- BZ#2216130
-
Currently, `puppet-ceilometer` does not populate the `tenant_name_discovery` parameter in the data collection service (ceilometer) configuration on Compute nodes. This causes the `Project name` and `User name` fields to not be identified. Currently, there is no workaround for this issue.
- BZ#2217867
- There is currently a known issue on Nvidia ConnectX-5 and ConnectX-6 NICs, when using hardware offload, where some offloaded flows on a PF can cause transient performance issues on the associated VFs. This issue is specifically observed with LLDP and VRRP traffic.
- BZ#2218596
- Do not migrate to the OVN mechanism driver if your original ML2/OVS environment uses iptables_hybrid firewall and trunk ports. In the migrated environment, instance networking problems occur if you recreate an instance with trunks after an event such as a hard reboot, start and stop, or node reboot. As a workaround, you can switch from the iptables hybrid firewall to the OVS firewall before migrating.
- BZ#2219574
- The data collection service (ceilometer) does not provide a default caching back end, which can cause some services to be overloaded when polling for metrics.
- BZ#2219603
In RHOSP 17.1 GA, the DNS service (designate) is misconfigured when secure role-based access control (sRBAC) is enabled. The current sRBAC policies contain incorrect rules for designate and must be corrected for designate to function correctly. A possible workaround is to apply the following patch on the undercloud server and redeploy the overcloud:
https://review.opendev.org/c/openstack/tripleo-heat-templates/+/888159
- BZ#2219613
-
In RHOSP 17.1 distributed virtual router (DVR) environments, the `external_mac` variable is improperly removed for ports in the `DOWN` status, which results in centralized traffic for short periods.
- BZ#2219830
In RHOSP 17.1, there is a known issue of transient packet loss where hardware interrupt requests (IRQs) are causing non-voluntary context switches on OVS-DPDK PMD threads or in guests running DPDK applications.
This issue is the result of provisioning large numbers of VFs during deployment. VFs need IRQs, each of which must be bound to a physical CPU. When there are not enough housekeeping CPUs to handle the capacity of IRQs, `irqbalance` fails to bind all of them and the IRQs overspill on isolated CPUs.
Workaround: You can try one or more of these actions:
- Reduce the number of provisioned VFs to avoid unused VFs remaining bound to their default Linux driver.
- Increase the number of housekeeping CPUs to handle all IRQs.
- Force unused VF network interfaces down to avoid IRQs from interrupting isolated CPUs.
- Disable multicast and broadcast traffic on unused, down VF network interfaces to avoid IRQs from interrupting isolated CPUs.
- BZ#2220808
-
In RHOSP 17.1, there is a known issue where the data collection service (ceilometer) does not report airflow metrics. This problem is caused because the data collection service is missing a gnocchi resource type, `hardware.ipmi.fan`. Currently, there is no workaround.
- BZ#2220887
- The data collection service (ceilometer) does not filter separate power and current metrics.
- BZ#2220930
In RHOSP 17.1 environments that run the DNS service (designate), there is a known issue where the `bind9` and `unbound` services are not restarted if the configuration changes.
Workaround: Manually restart the containers by running the following commands on each Controller:
$ sudo systemctl restart tripleo_designate_backend_bind9
$ sudo systemctl restart tripleo_unbound
- BZ#2222420
In RHOSP 17.1.1 environments that use IPv6 networks that run the RHOSP DNS service (designate), the BIND 9 back end server can reject DNS notify messages. This issue is caused because there are often multiple IP addresses for the same network on the same interface, and it can appear that the messages are emanating from sources other than the designate Worker services.
Workaround: Apply the following patches:
- https://review.opendev.org/c/openstack/tripleo-ansible/+/888300
- https://review.opendev.org/c/openstack/tripleo-heat-templates/+/888786
After you apply the patches, manually restart the configuration in the BIND 9 servers by running:
$ sudo systemctl restart tripleo_designate_backend_bind9
- BZ#2222683
Currently, there is no support for Multi-RHEL for the following deployment architectures:
- Edge (DCN)
- ShiftOnStack
- Director operator-based deployments
Workaround: Use only a single version of RHEL across your RHOSP deployment when operating one of the listed architectures.
- BZ#2223294
There is a known issue when performing an in-place upgrade from RHOSP 16.2 to 17.1 GA. The collection agent `collectd-sensubility` fails to run on RHEL 8 Compute nodes.
Workaround: On affected nodes, edit the file `/var/lib/container-config-scripts/collectd_check_health.py` and replace `"healthy: .State.Health.Status}"` with `"healthy: .State.Healthcheck.Status}"` on line 26.
- BZ#2223916
In RHOSP 17.1 GA environments that use the ML2/OVN mechanism driver, there is a known issue with floating IP port forwarding not working correctly. This problem is caused because VLAN and flat networks distribute north-south network traffic when FIPs are used, and, instead, FIP port forwarding should be centralized on the Controller or the Networker nodes.
Workaround: To resolve this problem and force FIP port forwarding through the centralized gateway node, either set the RHOSP Orchestration service (heat) parameter `NeutronEnableDVR` to `false`, or use Geneve instead of VLAN or flat project networks.
- BZ#2224236
In this release of RHOSP, there is a known issue where SR-IOV interfaces that use Intel X710 and E810 series controller virtual functions (VFs) with the iavf driver can experience network connectivity issues that involve link status flapping. The affected guest kernel versions are:
- RHEL 8.7.0 to 8.7.3 (No fixes planned. End of life.)
- RHEL 8.8.0 to 8.8.2 (Fix planned in version 8.8.3.)
- RHEL 9.2.0 to 9.2.2 (Fix planned in version 9.2.3.)
- Upstream Linux 4.9.0 to 6.4.* (Fix planned in version 6.5.)
Workaround: There is none, other than to use a non-affected guest kernel.
- BZ#2225205
-
Outdated upgrade orchestration logic overrides the existing Pacemaker authkey during the Fast Forward Upgrade (FFU) procedure, preventing Pacemaker from connecting to `pacemaker_remote` running on Compute nodes when Instance HA is enabled. As a result, the upgrade fails and `pacemaker_remote` running on Compute nodes is unreachable from the central cluster. Contact Red Hat support to receive instructions on how to perform FFU if Instance HA is configured.
- BZ#2227360
- The image cache cleanup task of the NetApp NFS driver can cause unpredictable slowdowns in other Block Storage services. There is currently no workaround for this issue.
- BZ#2229937
-
When `collectd sensubility` fails to create a sender, it does not close the link to the sender. Long-running open links that fail can cause issues in the bus, which cause `collectd sensubility` to stop working. Workaround: Restart the `collectd` container on affected overcloud nodes to recover `collectd sensubility`.
- BZ#2231378
- If you choose Red Hat Ceph Storage as the back end for your Block Storage (cinder) backup service repository, then you can only restore backed up volumes to a RBD-based Block Storage back end. There is currently no workaround for this.
- BZ#2231893
The metadata service can become unavailable after the metadata agent fails in multiple attempts to start a malfunctioning HAProxy child container. The metadata agent logs an error message similar to: `ProcessExecutionError: Exit code: 125; Stdin: ; Stdout: Starting a new child container neutron-haproxy-ovnmeta-<uuid>`.
Workaround: Run `podman kill <container_name>` to stop the problematic haproxy child container.
to stop the problematic haproxy child container.- BZ#2231960
- When a Block Storage volume uses the Red Hat Ceph Storage back end, a volume cannot be removed when a snapshot is created from this volume and then a volume clone is created from this snapshot. In this case, you cannot remove the original volume while the volume clone exists.
- BZ#2232562
The `OVNAvailabilityZone` role parameter is not recognized as expected, which causes availability zone configuration to fail in OVN.
Workaround: Use the `OVNCMSOptions` parameter to configure OVN availability zones. For example:
ControllerParameters:
  OVNCMSOptions: 'enable-chassis-as-gw,availability-zones=az1'
- BZ#2233487
- In RHOSP 17.1 GA environments that use RHOSP dynamic routing, there is a known issue where creating a load balancer using the RHOSP Load-balancing service with the OVN provider driver might fail. This failure can occur when there is latency between controller nodes. There is no workaround.
- BZ#2235621
-
The RHOSP upgrade from 16.2 to 17.1 fails when pulling images from `registry.redhat.io` because the upgrade playbook does not include the Podman registry login task. Contact your Red Hat support representative for a hotfix. A fix is expected in a later RHOSP release.
- BZ#2237245
In RHOSP 17.1 environments that use dynamic routing, updating to RHOSP 17.1.1 does not work properly. Specifically, Free Range Routing (FRR) components are not updated.
Workaround: Apply the following patches on the undercloud before updating RHOSP 17.1:
- BZ#2237251
In RHOSP 17.1.1 environments that use the Load-balancing service (octavia) with the OVN provider driver and a health monitor, the pool load-balancing status incorrectly displays fake members as `ONLINE`. If no health monitor is used, the fake member displays the normal operating status of `NO_MONITOR`.
Fake load-balancing pool members can occur when a member is not valid, such as when there is a typographical error in the member's IP address. Health monitors configured for the pool perform no health checks on the fake member, and the global operating status incorrectly considers the fake member as `ONLINE` when it calculates the pool's status. Furthermore, if all other members in a pool are in `ERROR` operating status, an incorrect `DEGRADED` operating status is assigned to the pool instead of `ERROR`, because a member of the pool is a fake member with an incorrect `ONLINE` status.
Workaround: Currently, there are no workarounds for this issue.
- BZ#2237290
The Networking service (neutron) does not prevent you from disabling or removing a networking profile, even if that profile is part of a flavor that is in use by a router. The disablement or removal of the profile can disrupt proper operation of the router.
Workaround: Before you disable or remove a networking profile, ensure that it is not part of a flavor that is currently used by a router.
3.5. Red Hat OpenStack Platform 17.1 GA - August 17, 2023
Consider the following updates in Red Hat OpenStack Platform (RHOSP) when you deploy this RHOSP release.
3.5.1. Advisory list
This release includes the following advisories:
- RHEA-2023:4577
- Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)
- RHEA-2023:4578
- Release of containers for Red Hat OpenStack Platform 17.1 (Wallaby)
- RHEA-2023:4579
- Red Hat OpenStack Platform 17.1 RHEL 9 deployment images
- RHEA-2023:4580
- Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)
- RHEA-2023:4581
- Release of containers for Red Hat OpenStack Platform 17.1 (Wallaby)
- RHSA-2023:4582
- Moderate: Release of containers for Red Hat OpenStack Platform 17.1 director Operator
3.5.2. Bug fixes
These bugs were fixed in this release of Red Hat OpenStack Platform (RHOSP):
- BZ#1965308
- Before this update, the Load-balancing service (octavia) could unplug a required subnet when you used different subnets from the same network as members' subnets. The members attached to this subnet were unreachable. With this update, the Load-balancing service does not unplug required subnets, and the load balancer can reach subnet members.
- BZ#2007314
-
Before this update, instances with an emulated Trusted Platform Module (TPM) device could not be created due to an issue with the SELinux configuration in the `nova_libvirt` container. With this update, the deployment tooling configures SELinux correctly, which resolves the issue.
- BZ#2066866
-
Even though the Panko monitoring service was deprecated, its endpoint still existed in the Identity service (keystone) after upgrading from RHOSP 16.2 to 17.1. With this update, the Panko service endpoint is cleaned up. However, Panko service users are not removed automatically. You must manually delete Panko service users with the command `openstack user delete panko`. There is no impact if you do not delete these users.
- BZ#2073530
- Support for the Windows Server 2022 guest operating system was not available in RHOSP 17.0 because it requires vTPM, and vTPM was not available due to an SELinux configuration issue. This issue has been fixed, and the Windows Server 2022 guest operating system is supported in RHOSP 17.1.
- BZ#2080199
- Before this update, services that were removed from the undercloud were not cleaned up during upgrades from RHOSP 16.2 to 17.0. The removed services remained in the OpenStack endpoint list even though they were not reachable or running. With this update, RHOSP upgrades include Ansible tasks to clean up the endpoints that are no longer required.
- BZ#2089512
- The multi-cell and multi-stack overcloud features were not available in RHOSP 17.0 due to regressions. The regressions have been fixed, and multi-cell and multi-stack deployments are supported in RHOSP 17.1.
- BZ#2092444
Before this update, a bare-metal overcloud node was listed as active by the `metalsmith` tool even after being deleted. This happened in environments where the node naming scheme overlapped with the overcloud role naming scheme, which could result in the wrong node being unprovisioned during undeploy. Because the `metalsmith` tool uses the allocation name (hostname) first to look up the status of bare-metal nodes, it sometimes found deleted nodes as still active.
With this update, nodes to be unprovisioned are referenced by allocation name (hostname), which ensures that the correct node is always unprovisioned. Nodes are referenced by node name only if the hostname does not exist.
- BZ#2097844
-
Before this update, the `overcloud config download` command failed with a traceback error because the command attempted to reach the Orchestration service (heat) to perform the download, and the Orchestration service no longer runs persistently on the undercloud. With this update, the `overcloud config download` command is removed. Instead, use your `overcloud deploy` command with the `--stack-only` option.
- Before this update, if secure RBAC was enabled, missing roles in the RHOSP deployment could cause Load-balancing service (octavia) API failures. In RHOSP 17.1 GA, this issue has been resolved.
- BZ#2107580
-
Before this update, the shutdown script that director uses to stop `libvirtd` stored outdated `libvirt` container names from RHOSP versions before RHOSP 17.0, and instances did not shut down gracefully. With this update, the script stores the correct `libvirt` container names, and instances are gracefully shut down when `libvirtd` is stopped.
- Before this update, the Compute service was unable to determine the VGPU resource use because the mediated device name format changed in libvirt 7.7. With this update, the Compute service can now parse the new mediated device name format.
- BZ#2116600
- Before this update, the following libvirt internal error was sometimes raised during a successful live migration: "migration was active, but no RAM info was set". This caused the live migration to fail when it should have succeeded. With this update, when this libvirt internal error is raised, the live migration is signaled as complete in the libvirt driver and the live migration correctly succeeds.
- BZ#2120145
-
Before this update, the low default value of the libvirt `max_client` parameter caused communication issues between libvirt and the Compute service (nova), which resulted in some failed operations, such as live migrations. With this update, you can customize the `max_client` parameter setting and increase its value to improve communication between libvirt and the Compute service.
- The AMD SEV feature was not available in RHOSP 17.0 because the RHEL firmware definition file was missing from some machine types. This issue has been fixed, and AMD SEV is supported in RHOSP 17.1.
- BZ#2125610
- Before this update, an SELinux issue triggered errors with Red Hat OpenStack Platform (RHOSP) Load-balancing service (octavia) ICMP health monitors that used the Amphora provider driver. In RHOSP 17.1, this issue has been fixed and ICMP health monitors function correctly.
- BZ#2125612
-
Before this update, when a load balancer was loaded with multiple concurrent sessions, users might have seen the following warning message in the amphora log file of the Load-balancing service (octavia): `nf_conntrack: table full, dropping packet`. This error occurred when the amphora dropped Transmission Control Protocol (TCP) flows, which caused latency on user traffic. With this update, connection tracking (conntrack) is disabled for TCP flows in the Load-balancing service that uses amphora, and new TCP flows are not dropped. Conntrack is only required for User Datagram Protocol (UDP) flows.
- Before this update, a network disruption or temporary unavailability of the Identity service (keystone) resulted in the nova-conductor service failing to start. With this update, the nova-conductor service logs a warning and continues startup in the presence of disruptions that are likely to be temporary. As a result, the nova-conductor service does not fail to start if transient issues like network disruptions or temporary unavailability of necessary services are encountered during startup.
- BZ#2133027
- Before this update, the Alarming service (aodh) used the deprecated gnocchi API to aggregate metrics, which resulted in incorrect metric measures of CPU usage in gnocchi. With this update, dynamic aggregation in gnocchi supports re-aggregation of existing metrics and the ability to manipulate and transform metrics as required. CPU time in gnocchi is correctly calculated.
- BZ#2133297
-
Before this update, the `openstack undercloud install` command launched the `openstack tripleo deploy` command, which created the `/home/stack/.tripleo/history` file with `root:root` as the owner. Subsequent deploy commands failed because of permission errors. With this update, the command creates the file with the `stack` user as the owner, and deploy commands succeed without permission errors.
user as the owner, and deploy commands succeed without permission errors. - BZ#2135548
-
Before this update, the `ironic-python-agent` did not correctly process the UEFI boot loader hint file, causing deployments to fail with RHEL 8.6 images in UEFI mode. With this update, you can now deploy RHEL 8.6 in UEFI mode.
- This update allows node names longer than 62 bytes.
- BZ#2140988
Before this update, a live migration might fail because the database did not update with the destination host details.
With this update, the instance host value in the database is set to the destination host during live migration.
- BZ#2149216
Before this update, Open Virtual Network (OVN) load balancer health checks were not performed on Floating IPs (FIPs) associated with the load balancer Virtual IP (VIP), and traffic was redirected to members in the Error state when the FIP was used.
With this update, a new load balancer health check is created for the FIP, and traffic is not redirected to members in the Error state.
- BZ#2149221
-
Before this update, deployments with bonded interfaces did not complete because no value was set for `bond_interface_ovs_options`, the Ansible variable for OVS bonds. With this update, a default value has been set for the `bond_interface_ovs_options` Ansible variable.
Before this update, the cephadm-ansible logs in `/home/stack/config-download/overcloud/cephadm` were not rotated. The `cephadm_command.log` file was appended for every overcloud deployment and increased in size. Also, the `/home/stack/ansible.log` file was not rotated for every `openstack overcloud ceph spec` operation.
Now, dated logs are generated for every overcloud deployment and every Ceph spec operation in the following format:
- `/home/stack/config-download/overcloud/cephadm/cephadm_command.log-<Timestamp>`
- `/home/stack/ansible.log-<Timestamp>`
- BZ#2149468
- Before this update, the Compute service (nova) processed a temporary error message from the Block Storage service (cinder) volume detach API, such as '504 Gateway Timeout', as an error. The Compute service failed the volume detach operation even though it succeeded but timed out on the Block Storage service side, leaving a stale block device mapping record in the Compute service database. With this update, the Compute service retries the volume detach call to the Block Storage service API if it receives an HTTP error that is likely to be temporary. Upon retry, if the volume attachment is no longer found, the Compute service processes the volume as already detached.
- BZ#2149963
- Before this update, the cephadm utility did not process child groups when building specification files from inventory. With this update, specification file generation processes child groups.
- BZ#2151043
-
Before this update, the `openstack-cinder-volume-0` container, which is created by the Pacemaker bundle resource for the Block Storage service (cinder), mounted `/run` from the host. This mount path created the `.containerenv` file in the directory. When the `.containerenv` file exists, `subscription-manager` fails because it evaluates that the command is executed inside a container. With this update, the mount path is updated so that Podman disables the creation of the `.containerenv` file, and `subscription-manager` executes successfully on a host that is running the `openstack-cinder-volume-0` container.
- Before this update, the Service Telemetry Framework (STF) API health monitoring script failed because it depended on Podman log content, which was no longer available. With this update, the health monitoring script depends on the Podman socket instead of the Podman log, and API health monitoring operates normally.
- BZ#2154343
- Before this update, the disabling and enabling of network log objects in a security group was inconsistent. The logging of a connection was disabled as soon as one of the log objects in the security group associated with that connection was disabled. With this update, a connection is logged if any of the related enabled log objects in the security group allow it, even if one of those log objects becomes disabled.
- BZ#2162632
- Before this update, values of multi-value parameters were not populated correctly in the Alarming service (aodh) configuration because input to multi-value parameters was treated as a single value instead of an array. With this update, you can set multiple values for a parameter and all values are populated in the configuration file.
- BZ#2162756
- Before this update, VLAN network traffic was centralized over the Controller nodes. With this update, if all the tenant provider networks that are connected to a router are of the VLAN/Flat type, that traffic is now distributed. The node that contains the instance sends the traffic directly.
- BZ#2163815
-
Before this update, Open Virtual Network (OVN) load balancers on switches with `localnet` ports (Networking service [neutron] provider networks) did not work if traffic came from `localnet`. With this update, load balancers are not added to the logical switch associated with the provider network. This update forces Network Address Translation (NAT) to occur at the virtual router level instead of the logical switch level.
Before this update, the Compute service (nova) did not confidence-check the content of Virtual Machine Disk (VMDK) image files. By using a specially crafted VMDK image, it was possible to expose sensitive files on the host file system to guests booted with that VMDK image. With this update, the Compute service confidence checks VMDK files and forbids VMDK features that the leak behavior depends on. It is no longer possible to leak sensitive host file system contents using specially crafted VMDK files. This bug fix addresses CVE-2022-47951.
Note: Red Hat does not support the VMDK image file format in RHOSP.
- BZ#2164677
- Before this update, the iptables rule for the heat-cfn service contained the incorrect TCP port number. Users could not access the heat-cfn service endpoint if SSL was enabled for public endpoints. With this update, the TCP port number is correct in the iptables rule. Users can access the heat-cfn service endpoint, even if SSL is enabled for public endpoints.
- BZ#2167161
Before this update, the default value of `rgw_max_attr_size` was 256, which created issues for OpenShift on OpenStack when uploading large images. With this update, the default value of `rgw_max_attr_size` is 1024.
You can change the value by adding the following configuration to an environment file that you include in your overcloud deployment:
parameter_defaults:
  CephConfigOverrides:
    rgw_max_attr_size: <new value>
- BZ#2167431
-
Before this update, the collectd hugepages plugin reported a failure message when attempting to access a new file in Red Hat Enterprise Linux (RHEL) 9 called `demote`. Now, collectd avoids reading this file and the failure message is suppressed.
-
Before this update, the IPMI agent container did not spawn because the CeilometerIpmi service was not added to THT Compute roles. With this update, the CeilometerIpmi service is added to all THT Compute roles. The IPMI agent container is executed with the `--privileged` flag to execute `ipmitool` commands on the host. The data collection service (ceilometer) can now capture power metrics.
- Before this update, instances lost communication with the ovn-metadata-port because the load balancer health monitor replied to ARP requests for the OVN metadata agent's IP address, which caused requests destined for the metadata agent to be sent to another MAC address. With this update, the ovn-controller conducts back-end checks by using a dedicated port instead of the ovn-metadata-port. When establishing a health monitor for a load balancer pool, ensure that there is an available IP address in the VIP load balancer's subnet. This port is distinct for each subnet, and various health monitors in the same subnet can reuse the port. Health monitor checks no longer impact ovn-metadata-port communications for instances.
- BZ#2172063
-
Before this update, the `openstack overcloud ceph deploy` command could fail during the `apply spec` operation if the chrony NTP service was down. With this update, the chrony NTP service is enabled before the `apply spec` operation.
operation. - BZ#2172582
-
Before this update, the `create pool` operation failed because the podman command used `/etc/ceph` as the volume argument. This argument does not work for Red Hat Ceph Storage version 6 containers. With this update, the podman command uses `/var/lib/ceph/$FSID/config/` as the first volume argument and `create pool` operations are successful.
-
Before this update, when users deployed Red Hat Ceph Storage in a tripleo-ipa context, a `stray hosts` warning showed in the cluster for the Ceph Object Gateway (RADOS Gateway [RGW]). With this update, during a Ceph Storage deployment, you can pass the `--tld` option in a tripleo-ipa context to use the correct hosts when you create the cluster.
- Before this update, a flooding issue occurred when an instance, associated with a provider network with disabled port security, attempted to reach IPs on the provider network that were not recognized by OpenStack. This flooding occurred because the forwarding database (FDB) table was not learning MAC addresses. This update uses a new option in OVN to enable the learning of IPs in the FDB table. There is currently no aging mechanism for the FDB table, but you can clean up the FDB table periodically to prevent scaling issues caused by the size of this table.
- BZ#2174632
Before this update, a regression in the network configuration for OVS interfaces negatively impacted network performance. With this update, the `os-vif` OVS plugin has been enhanced to improve network performance on the OVS interfaces of non-Windows instances.
Important: This update takes effect when the instance interface is recreated. If you change this value for an existing port, you must hard reboot the instance or perform a live migration for the update to take effect.
- BZ#2178618
-
Before this update, a security group logging enhancement introduced an issue where log objects could not be deleted at the same time as security groups. This action caused an internal server error. With this update, the `db_set` function that modifies the northbound database entries does not fail if the requested row no longer exists.
-
Before this update, the collectd plugin libpodstats could not gather metrics because the cgroup path to Ceph containers changed in RHEL 9 from `/sys/fs/cgroup/machine.slice` to `/sys/fs/cgroup/system.slice/system-ceph<FSID>`. With this update, libpodstats can now parse CPU and memory metrics from cgroups under the new path.
-
Before this update, host services, such as Pacemaker, were mounted under `/var/log/host/` in the rsyslog container. However, the configuration path was the same as the host path `/var/log/pacemaker/`. Because of this issue, the rsyslog service could not locate Pacemaker log files. With this update, the Pacemaker log path is changed from `/var/log/pacemaker/` to `/var/log/host/pacemaker/`.
. - BZ#2181107
-
Before this update, the `NetworkDeploymentAction` parameter was internally overridden, and the deployment process always configured the network interfaces regardless of the value of the `NetworkDeploymentAction` parameter. With this update, the `NetworkDeploymentAction` parameter works as expected, and by default the configuration of networking interfaces is skipped for nodes that are already deployed.
- Before this update, existing puppet containers were reused during deployment. The deployment process did not check the return code from the puppet commands executed within the container, which meant that any puppet task failures were ignored during deployment. This resulted in reporting a successful deployment even when some puppet execution tasks failed. With this update, puppet containers are recreated for every deployment. If a puppet execution task fails, the deployment stops and reports the failure.
- BZ#2188252
-
Before this update, the `openstack tripleo container image prepare` command failed because there were incorrect Ceph container tags in the `container_image_prepare_defaults.yaml` file. With this update, the correct Ceph container tags are in the YAML file, and the `openstack tripleo container image prepare` command is successful.
-
Before this update, if you upgraded your operating system from RHEL 7.x to RHEL 8.x, or from RHEL 8.x to RHEL 9.x, and ran a Leapp upgrade with the `--debug` option, the system remained in the `early console in setup code` state and did not reboot automatically. With this update, the `UpgradeLeappDebug` parameter is set to `false` by default. Do not change this value in your templates.
- Before this update, for the nova-compute log to record os-brick privileged commands for debugging purposes, you had to apply the workaround outlined in https://access.redhat.com/articles/5906971. This update makes the workaround redundant and provides a better solution that separates logging by the nova-compute service so that the privileged commands of os-brick are logged at the debug level but the privileged commands of nova are not.
- BZ#2207991
-
Before this update, secure role-based access control (SRBAC) and the `NovaShowHostStatus` parameter used the same policy key titles. If you configured both SRBAC and `NovaShowHostStatus`, the deployment failed with a conflict. With this update, the policy key for `NovaShowHostStatus` is changed and there are no related conflicts in deployments.
Before this update, in RHOSP 17.1 environments that use RHOSP dynamic routing, there was a known issue where the default value of the Autonomous System Number (ASN) used by the OVN BGP agent differed from the ASN used by FRRouting (FRR).
In 17.1 GA, this issue is resolved. The `FrrOvnBgpAgentAsn` and `FrrBgpAsn` default values are valid and can be used without modification.
- BZ#2211691
- Before this update, the Bare Metal Provisioning service (ironic) was unable to detach a Block Storage service (cinder) volume from a physical bare metal node. This volume detachment is required to tear down physical machines that have an instance deployed on them by using the boot from volume functionality. With this update, the Bare Metal Provisioning service (ironic) can detach a volume from a physical bare metal node to automatically tear down these physical machines.
- BZ#2211849
-
Before this update, a bug in the `pyroute2` library caused environments that use RHOSP dynamic routing to fail to advertise new routes and to lose connectivity with new or migrated instances, new load balancers, and so on. In RHOSP 17.1 GA, a newer version of `pyroute2` resolves this issue.
- BZ#2214259
- Before this update, in an environment that had been migrated from the OVS mechanism driver to the OVN mechanism driver, an instance with a trunk port could become inaccessible after an operation such as a live migration. Now, you can live migrate, shutdown, or reboot instances with a trunk port without issues after migration to the OVN mechanism driver.
- BZ#2215936
- Before this update, creating an instance with virtual functions (VF) could fail in an environment that had been migrated from ML2/OVS with SR-IOV to ML2/OVN. You can now create instances with VFs after migration.
- BZ#2216130
-
Currently, `puppet-ceilometer` does not populate the `tenant_name_discovery` parameter in the data collection service (ceilometer) configuration on Compute nodes. This causes the `Project name` and `User name` fields to not be identified. There is no workaround for this issue.
- BZ#2219765
pam_loginuid
module was enabled in some containers. This prevented crond from executing some tasks, such asdb purge,
inside of those containers. Now,pam_loginuid
is removed and the containerizedcrond
process runs all periodic tasks.
3.5.3. Enhancements
This release of Red Hat OpenStack Platform (RHOSP) features the following enhancements:
- BZ#1369007
- Cloud users can launch instances that are protected with UEFI Secure Boot when the overcloud contains UEFI Secure Boot Compute nodes. For information on creating an image for UEFI Secure Boot, see Creating an image for UEFI Secure Boot. For information on creating a flavor for UEFI Secure Boot, see "UEFI Secure Boot" in Flavor metadata.
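For example, a minimal flavor sketch for Secure Boot instances; the flavor name and sizes are illustrative:
$ openstack flavor create --ram 2048 --disk 20 --vcpus 2 secure-boot-flavor
$ openstack flavor set --property os:secure_boot=required secure-boot-flavor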
- BZ#1581414
Before this release, `NovaHWMachineType` could not be changed for the lifetime of a RHOSP deployment, because instances without a `hw_machine_type` image property would adopt the newly configured machine type after a hard reboot or migration, and changing the underlying machine type for an instance could break its internal ABI.
With this release, when launching an instance the Compute service records the instance machine type within the system metadata of the instance. Therefore, it is now possible to change `NovaHWMachineType` during the lifetime of a RHOSP deployment without affecting the machine type of existing instances.
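For example, the machine type can be set per architecture in an environment file; this is a minimal sketch, and the mapping shown is only illustrative:
parameter_defaults:
  ComputeParameters:
    # Illustrative mapping of architecture to machine type.
    NovaHWMachineType: 'x86_64=q35'
- BZ#1619266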
This update introduces the security group logging feature. To monitor traffic flows and attempts into and out of an instance, you can configure the Networking service packet logging for security groups.
You can associate any instance port with one or more security groups and define one or more rules for each security group. For instance, you can create a rule to drop inbound SSH traffic to any instance in the finance security group. You can create another rule to allow instances in that group to send and respond to ICMP (ping) messages.
Then you can configure packet logging to record combinations of accepted and dropped packet flows.
You can use security group logging for both stateful and stateless security groups.
Logged events are stored on the Compute nodes that host the instances, in the file `/var/log/containers/stdouts/ovn_controller.log`.
- BZ#1666804
-
With this update, the `cinder-backup` service can now be deployed in Active/Active mode.
- BZ#1672972
This enhancement helps cloud users determine if the reason they are unable to access an "ACTIVE" instance is that the Compute node that hosts the instance is unreachable. RHOSP administrators can now configure the following parameters to enable a custom policy that provides a status in the `host_status` field to cloud users when they run the `openstack server show` command, if the host Compute node is unreachable:
- `NovaApiHostStatusPolicy`: Specifies the role that the custom policy applies to.
- `NovaShowHostStatus`: Specifies the level of host status to show to the cloud user, for example, "UNKNOWN".
- BZ#1693377
-
With this update, an instance can have a mix of shared (floating) CPUs and dedicated (pinned) CPUs instead of only one CPU type. RHOSP administrators can use the `hw:cpu_policy=mixed` and `hw:cpu_dedicated_mask` flavor extra specs to create a flavor for instances that require a mix of shared CPUs and dedicated CPUs.
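For example, a flavor sketch for an instance with two shared and two dedicated CPUs; the flavor name and mask value are illustrative:
$ openstack flavor create --vcpus 4 --ram 4096 --disk 20 mixed-cpu-flavor
$ openstack flavor set \
    --property hw:cpu_policy=mixed \
    --property hw:cpu_dedicated_mask=2-3 \
    mixed-cpu-flavor
- BZ#1701281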
- In RHOSP 17.1, support is available for cold migrating and resizing instances that have vGPUs.
- BZ#1720404
With this update, you can configure your RHOSP deployment to count the quota usage of cores and RAM by querying placement for resource usage and instances from instance mappings in the API database, instead of counting resources from separate cell databases. This makes quota usage counting resilient to temporary cell outages or poor cell performance in a multi-cell environment.
Set the following configuration option to count quota usage from placement:
parameter_defaults:
  ControllerExtraConfig:
    nova::config::nova_config:
      quota/count_usage_from_placement:
        value: 'True'
- BZ#1761861
- With this update, you can configure each physical GPU on a Compute node to support a different virtual GPU type.
- BZ#1761903
-
On RHOSP deployments that use a routed provider network, you can now configure the Compute scheduler to filter Compute nodes that have affinity with routed network segments, and to verify the network in placement before scheduling an instance on a Compute node. You can enable this feature by using the `NovaSchedulerQueryPlacementForRoutedNetworkAggregates` parameter.
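For example, in an environment file; a minimal sketch:
parameter_defaults:
  NovaSchedulerQueryPlacementForRoutedNetworkAggregates: true
- BZ#1772124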
-
With this update, you can use the new `NovaMaxDiskDevicesToAttach` heat parameter to specify the maximum number of disk devices that can be attached to a single instance. The default is unlimited (-1). For more information, see Configuring the maximum number of storage devices to attach to one instance.
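For example, to limit instances on Compute nodes to eight disk devices; the limit value is illustrative:
parameter_defaults:
  ComputeParameters:
    NovaMaxDiskDevicesToAttach: 8
- BZ#1782128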
-
In RHOSP 17.1, a RHOSP administrator can provide cloud users the ability to create instances that have emulated virtual Trusted Platform Module (vTPM) devices. RHOSP only supports TPM version `2.0`.
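For example, a flavor sketch that requests an emulated TPM 2.0 device; the flavor name and TPM model shown are illustrative:
$ openstack flavor set \
    --property hw:tpm_version=2.0 \
    --property hw:tpm_model=tpm-tis \
    vtpm-flavor
- BZ#1793700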
-
In RHOSP 17.1, a RHOSP administrator can declare which custom physical features and consumable resources are available on the RHOSP overcloud nodes by modeling custom traits and inventories in a YAML file, `provider.yaml`.
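For example, a minimal `provider.yaml` sketch that adds a custom trait to a Compute node; the hostname and trait name are illustrative:
meta:
  schema_version: '1.0'
providers:
  - identification:
      name: 'overcloud-novacompute-0.example.com'
    traits:
      additional:
        # Hypothetical trait used only for illustration.
        - 'CUSTOM_EXAMPLE_FEATURE'
- BZ#1827598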
- This RHOSP release introduces support for the OpenStack stateless security groups API.
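For example, assuming your client supports the `--stateless` option, you can create a stateless security group as follows; the group name is illustrative:
$ openstack security group create --stateless stateless-group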
- BZ#1857652
- With this update, deployments of RHOSP with trunk ports are fully supported for migration from ML2/OVS to ML2/OVN.
- BZ#1873409
- On RHOSP deployments that are configured for OVS hardware offload and to use ML2/OVN, and that have Compute nodes with VirtIO data path acceleration (VDPA) devices and drivers and Mellanox NICs, you can enable VDPA support for enterprise workloads. When VDPA support is enabled, your cloud users can create instances that use VDPA ports. For more information, see Configuring VDPA Compute nodes to enable instances that use VDPA ports and Creating an instance with a VDPA interface.
- BZ#1873707
With this update, you can use the validation framework in the workflow of backup and restore procedures to verify the status of the restored system. The following validations are included:
- `undercloud-service-status`
- `neutron-sanity-check`
- `healthcheck-service-status`
- `nova-status`
- `ceph-health`
- `check-cpu`
- `service-status`
- `image-serve`
- `pacemaker-status`
- `validate-selinux`
- `container-status`
- BZ#1883554
-
With this update, a RHOSP administrator can create a flavor that has a `socket` PCI NUMA affinity policy. You can use this policy to create an instance that requests a PCI device only when at least one of the instance NUMA nodes has affinity with a NUMA node in the same host socket as the PCI device.
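For example, a flavor sketch; the flavor name is illustrative:
$ openstack flavor set \
    --property hw:pci_numa_affinity_policy=socket \
    pci-socket-flavor
- BZ#1888788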
-
With this update, the Shared File Systems service (manila) API supports a project-scoped 'reader' role. Users with the 'reader' role can send GET requests to the service, but they cannot make any other kind of request. You can enable this feature by using the `environments/enable-secure-rbac.yaml` environment file included with director. You can use the 'reader' role to create audit users for humans and automation, and to perform read-only interactions safely with OpenStack APIs.
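For example, a sketch of including the environment file in a deploy command; the exact paths vary by deployment:
(undercloud)$ openstack overcloud deploy --templates \
  -e /usr/share/openstack-tripleo-heat-templates/environments/enable-secure-rbac.yaml
- BZ#1898349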
- With this update, the Block Storage (cinder) backup service supports the zstd data compression algorithm.
- BZ#1903914
- With this update, the Block Storage (cinder) backup service supports the S3 back end.
- BZ#1947377
- With this update, the RHOSP Orchestration service (heat) dashboard shows template default values. Previously, the heat dashboard had the default values hidden, which was sometimes confusing for users. This update ensures that those default values are visible to the user in the heat dashboard and removes any confusion that was caused when they were hidden.
- BZ#1962500
- With this update, you can configure the collectd logging source in TripleO Heat Templates. The default value matches the default logging path.
- BZ#1986025
- With this update, Block Storage service (cinder) supports NVMe over TCP (NVMe/TCP) drivers, for Compute nodes that are running RHEL 9.
- BZ#2005495
This enhancement allows cloud administrators to specify an Availability Zone (AZ) by storage back end through director when configuring the Shared File Systems service (manila) back-end storage.
With this update, administrators can use an AZ annotation to logically separate storage provisioning requests and to denote failure domains. AZs configured by administrators are exposed by the Shared File Systems service to end users. End users can request that their workloads be scheduled to specific AZs based on their needs. When configuring multiple storage back ends, administrators might want to tag each back end to different AZs as opposed to denoting a single AZ for all back ends.
Director has new options to denote the storage AZs. Each option corresponds to a supported storage back-end driver. For more information about AZs, see Configuring persistent storage.
- BZ#2008969
- With this update, cloud administrators can bring shares that are created outside the Shared File Systems service (manila) under the management of the Shared File Systems service. Cloud administrators can also remove shares from the Shared File Systems service without deleting them. Note that the CephFS driver does not support this feature. You can use this manage/unmanage functionality when commissioning, decommissioning, or migrating storage systems, or to take shares offline temporarily for maintenance.
- BZ#2016660
- Upgrades from Red Hat OpenStack Platform (RHOSP) 16.2 to RHOSP 17.1 are supported. The RHOSP upgrade and the operating system upgrade are now separated into two distinct phases. You upgrade RHOSP first, then you upgrade the operating system.
- BZ#2026385
With this update, you can configure `fence_watchdog` that uses `sbd`, like other fencing devices via tripleo, by defining the respective fencing resource:
parameter_defaults:
  EnableFencing: true
  FencingConfig:
    devices:
      - agent: fence_watchdog
        host_mac: "52:54:00:74:f7:51"
As an operator, you must enable `sbd` and set the watchdog timeout:
parameter_defaults:
  ExtraConfig:
    pacemaker::corosync::enable_sbd: true
    tripleo::fencing::watchdog_timeout: 20
- BZ#2033811
- The Shared File Systems service (manila) now supports using a Pure Storage FlashBlade system as a back end. Refer to the Red Hat Ecosystem Catalog to find the vendor's certification and installation documentation.
- BZ#2060758
- In Red Hat OpenStack Platform (RHOSP) 17.1, the RHOSP Load-balancing service (octavia) supports the rsyslog over TCP protocol for Amphora log offloading. With this enhancement you can redirect log messages to a secondary rsyslog server if the primary server becomes unavailable. For more information, see Chapter 5. Managing Load-balancing service instance logs in the Configuring load balancing as a service guide.
- BZ#2066349
With this enhancement, the LVM volumes installed by the `overcloud-hardened-uefi-full.qcow2` whole-disk overcloud image are now backed by a thin pool. The volumes are still grown to consume the available physical storage, but are not over-provisioned by default.
The benefits of thin-provisioned logical volumes:
- If a volume fills to capacity, the options for manual intervention now include growing the volume to over-provision the physical storage capacity.
- The RHOSP upgrades process can now create ephemeral backup volumes in thin-provisioned environments.
- BZ#2069624
- The Red Hat OpenStack Platform (RHOSP) snapshot and revert feature is based on the Logical Volume Manager (LVM) snapshot functionality and is intended to revert an unsuccessful upgrade or update. Snapshots preserve the original disk state of your RHOSP cluster before performing an upgrade or an update. You can then remove or revert the snapshots depending on the results. If an upgrade completed successfully and you do not need the snapshots anymore, remove them from your nodes. If an upgrade fails, you can revert the snapshots, assess any errors, and start the upgrade procedure again. A revert leaves the disks of all the nodes exactly as they were when the snapshot was taken.
- BZ#2074896
-
Previously, the Open vSwitch (OVS) bond `balance-tcp` mode was only available in RHOSP as a technology preview. Because of L4 hashing re-circulation issues, the mode was not recommended for production. The issues have been resolved and you can use the OVS bond `balance-tcp` mode. You must set `lb-output-action=true` to use `balance-tcp` mode.
- RHOSP 17.1 GA supports the offloading of OpenFlow flows to hardware with the connection tracking (conntrack) module. For more information, see Configuring components of OVS hardware offload in Configuring network functions virtualization.
- BZ#2097931
- In RHOSP 17.1, you can live migrate, unshelve and evacuate an instance that uses a port that has resource requests, such as a guaranteed minimum bandwidth QoS policy.
- BZ#2104522
- With this update, live migration now uses multichassis Open Virtual Network (OVN) ports to optimize the migration procedure and significantly reduce network downtime for VMs during migration in particular scenarios.
- BZ#2106406
This update introduces the `neutron-remove-duplicated-port-bindings` script to fix an issue that sometimes affected the handling of failed live migrations.
If a live migration fails, the Compute service (nova) reverts the migration. The migration reversal implies deleting any object created in the database or in the destination Compute node.
However, in some cases after the reversal of a failed live migration, ports were left with duplicate port bindings.
The `neutron-remove-duplicated-port-bindings` script finds duplicate port bindings and deletes the inactive bindings. You can run the script if a failed live migration results in duplicate port bindings.
- BZ#2111528
- With this update, the default Ceph container image is based on Red Hat Ceph Storage 6 instead of Red Hat Ceph Storage 5.
- BZ#2122209
-
This update adds the `validation file` command to the Validation Framework CLI. This command allows you to supply a file with validations by name, group, category, and product for a validation run. Now, you can run `validation file <path_to_file>` and keep the chosen validations for reruns at a later time.
With this enhancement, operators can enable the run_arping feature for Pacemaker-managed virtual IPs (VIPs), so that the cluster preemptively checks for duplicate IPs.
To do this, you must add the following configuration to the environment file:
ExtraConfig:
  pacemaker::resource::ip::run_arping: true
If a duplicate is found, the following error is logged in the `/var/log/pacemaker/pacemaker.log` file:
Sep 07 05:54:54 IPaddr2(ip-172.17.3.115)[209771]: ERROR: IPv4 address collision 172.17.3.115 [DAD]
Sep 07 05:54:54 IPaddr2(ip-172.17.3.115)[209771]: ERROR: Failed to add 172.17.3.115
- BZ#2138238
- With this update, you can deploy two separate instances of the Image service (glance) API. The instance that is accessible to OpenStack tenants is configured to hide image location details, such as the direct URL of an image or whether the image is available in multiple locations. The second instance is accessible to OpenStack administrators and OpenStack services, such as the Block Storage service (cinder) and the Compute service (nova). This instance is configured to provide image location details. This enhancement addresses the recommendations of OSSN-0090 and CVE-2022-4134. With this update, a malicious user cannot leverage the location details of an image to upload an altered image.
- BZ#2152877
- This enhancement adds OVN security group logging to the Networking service (neutron) for the reply packets of a network connection. The ovn-controller log files now log the full network connection.
- BZ#2165501
- Starting with Red Hat OpenStack Platform (RHOSP) 17.1, in ML2/OVN deployments, you can enable minimum bandwidth and bandwidth limit egress policies for hardware offloaded ports. You cannot enable ingress policies for hardware offloaded ports. For more information, see Configuring the Networking service for QoS policies.
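For example, a sketch of an egress minimum-bandwidth QoS rule; the policy name and value are illustrative:
$ openstack network qos policy create offload-qos
$ openstack network qos rule create \
    --type minimum-bandwidth \
    --min-kbps 1000 \
    --egress \
    offload-qos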
- BZ#2187255
With this update, you can add project and user name fields to outgoing data collection service (ceilometer) metrics. Previously, cloud administrators had to rely on UUIDs of projects and users to identify tenants. Now you can view a list of projects and user names, not UUIDs.
Note: This feature is not available to use with gnocchi or Service Telemetry Framework (STF).
3.5.4. Technology previews
The items listed in this section are provided as Technology Previews in this release of Red Hat OpenStack Platform (RHOSP). For further information on the scope of Technology Preview status, and the associated support implications, refer to https://access.redhat.com/support/offerings/techpreview/.
- BZ#1813561
- With this update, the Load-balancing service (octavia) supports HTTP/2 load balancing by using the Application Layer Protocol Negotiation (ALPN) for listeners and pools that are enabled with Transport Layer Security (TLS). The HTTP/2 protocol improves performance by loading pages faster.
- BZ#1848407
- In RHOSP 17.1, a technology preview is available for the Stream Control Transmission Protocol (SCTP) in the Load-balancing service (octavia). Users can create SCTP listeners and attach SCTP pools in a load balancer.
- BZ#2057921
- In RHOSP 17.1, a technology preview is available for creating load balancers over an IPv6 management network. Using a private IPv6 management network for the Load-balancing service (octavia) may simplify edge deployments.
- BZ#2217663
- In RHOSP 17.1, a technology preview is available for the VF-LAG transmit hash policy offload that enables load balancing at NIC hardware for offloaded traffic/flows. This hash policy is only available for layer3+4 base hashing.
3.5.5. Release notes
This section outlines important details about the release, including recommended practices and notable changes to Red Hat OpenStack Platform (RHOSP). You must take this information into account to ensure the best possible outcomes for your deployment.
- BZ#2072644
This enhancement allows users to upgrade from RHOSP 16.2 to RHOSP 17.1 and keep the Red Hat Enterprise Linux (RHEL) 8 based operating systems on the Compute nodes, in combination with nodes running RHEL 9.
Control plane nodes and Storage nodes must be upgraded. The default behavior is that all nodes are upgraded to RHEL 9 unless explicitly configured otherwise.
- BZ#2081641
- If you are using a Red Hat OpenStack Platform (RHOSP) environment that is running RHOSP 16.2.4 or later, you can upgrade directly to RHOSP 17.1.
- BZ#2224523
In RHOSP networking environments, when creating a VM instance, do not bind the instance to a virtual port (vport). Instead, use a port whose IP address is not a member of another port’s allowed address pair.
Binding a vport to an instance prevents the instance from spawning and produces an error message similar to the following:
WARNING nova.virt.libvirt.driver [req-XXXX - - - default default] [instance: XXXXXXXXX] Timeout waiting for [('network-vif-plugged', 'XXXXXXXXXX')] for instance with vm_state building and task_state spawning.: eventlet.timeout.Timeout: 300 seconds
3.5.6. Known issues
These known issues exist in Red Hat OpenStack Platform (RHOSP) at this time:
- BZ#2108212
If you use IPv6 to connect to instances during migration to the OVN mechanism driver, connection to the instances might be disrupted for up to several minutes when the ML2/OVS services are stopped.
The router advertisement daemon `radvd` for IPv6 is stopped during migration to the OVN mechanism driver. While `radvd` is stopped, router advertisements are no longer broadcast. This broadcast interruption results in instance connection loss over IPv6. IPv6 communication is automatically restored once the new ML2/OVN services start.
To avoid the potential disruption, use IPv4 instead.
- BZ#2109597
- There is a hardware (HW) limitation with CX-5. Every network traffic flow has a direction in HW, either transmit (TX) or receive (RX). If the source port of the flow is a virtual function (VF), then it is also TX flow in HW. CX-5 cannot pop VLAN on TX path, which prevents offloading the flow with pop_vlan to the HW.
- BZ#2109985
Currently, in ML2/OVS deployments, Open vSwitch (OVS) does not support offloading OpenFlow rules that have the `skb_priority`, `skb_mark`, or output queue fields set. These fields are required for Quality of Service (QoS) support for virtio ports.
If you set a minimum bandwidth rule for a virtio port, the Networking service (neutron) OVS agent marks the traffic of this port with a Packet Mark field. This traffic cannot be offloaded, and it affects the traffic in other ports. If you set a bandwidth limit rule, all traffic is marked with the default 0 queue, which means that no traffic can be offloaded.
Workaround: If your environment includes OVS hardware offload ports, disable packet marking in the nodes that require hardware offloading. When you disable packet marking, it is not possible to set rate limiting rules for virtio ports. However, differentiated services code point (DSCP) marking rules are still available.
In the configuration file, set the `disable_packet_marking` flag to `true`. When you edit the configuration file, you must restart the `neutron_ovs_agent` container. For example:
$ cat /var/lib/config-data/puppet-generated/neutron/etc/neutron/plugins/ml2/openvswitch_agent.ini
[ovs]
disable_packet_marking=True
- BZ#2126725
- The hard-coded certificate location operates independently of user-provided values. During deployment with custom certificate locations, services cannot retrieve information from API endpoints because Transport Layer Security (TLS) verification fails.
- BZ#2143874
In RHOSP 17.1, when the DNS service (designate) is deployed, Networking service (neutron) ports created on the undercloud are not deleted when the overcloud is deleted. These ports do not cause operational problems when the overcloud is recreated with or without the DNS service.
Workaround: After the overcloud has been deleted, manually remove the ports by using the `openstack port delete` command.
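For example; the port ID is illustrative:
$ openstack port list
$ openstack port delete <port_id>
- BZ#2144492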
- If you migrate a RHOSP 17.1.0 ML2/OVS deployment with distributed virtual routing (DVR) to ML2/OVN, the floating IP (FIP) downtime that occurs during ML2/OVN migration can exceed 60 seconds.
- BZ#2160481
In RHOSP 17.1 environments that use BGP dynamic routing, there is currently a known issue where floating IP (FIP) port forwarding fails.
When FIP port forwarding is configured, packets sent to a specific destination port with a destination IP that equals the FIP are redirected to an internal IP from a RHOSP Networking service (neutron) port. This occurs regardless of the protocol that is used: TCP, UDP, and so on.
When BGP dynamic routing is configured, the routes to the FIPs used to perform FIP port forwarding are not exposed, and these packets cannot reach their final destinations.
Currently, there is no workaround.
- BZ#2163477
- In RHOSP 17.1 environments that use BGP dynamic routing, there is currently a known issue affecting instances connected to provider networks. The RHOSP Compute service cannot route packets sent from one of these instances to a multicast IP address destination. Therefore, instances subscribed to a multicast group fail to receive the packets sent to them. The cause is that BGP multicast routing is not properly configured on the overcloud nodes. Currently, there is no workaround.
- BZ#2167428
- During a new deployment, the keystone service is often not available when the agent-notification service is initializing. This prevents ceilometer from discovering the gnocchi endpoint. As a result, metrics are not sent to gnocchi.
- BZ#2178500
- If a volume refresh fails when using the nova-manage CLI, this causes the instance to stay in a locked state.
- BZ#2180542
The Pacemaker-controlled `ceph-nfs` resource requires a runtime directory to store some process data. The directory is created when you install or upgrade RHOSP. Currently, a reboot of the Controller nodes removes the directory, and the `ceph-nfs` service does not recover when the Controller nodes are rebooted. If all Controller nodes are rebooted, the `ceph-nfs` service fails permanently.
Workaround: If you reboot a Controller node, log into the Controller node and create a `/var/run/ceph` directory:
$ mkdir -p /var/run/ceph
Repeat this step on all Controller nodes that have been rebooted. If the `ceph-nfs-pacemaker` service has been marked as failed, after creating the directory, execute the following command from any of the Controller nodes:
$ pcs resource cleanup
- BZ#2180883
Currently, Logrotate archives all log files once a day, and Rsyslog stops sending logs to Elasticsearch. Workaround: Add `RsyslogReopenOnTruncate: true` to your environment file during deployment so that Rsyslog reopens all log files on log rotation.
Currently, RHOSP 17.1 uses an older puppet-rsyslog module with an incorrectly configured Rsyslog. Workaround: Manually apply patch [1] in `/usr/share/openstack-tripleo-heat-templates/deployment/logging/rsyslog-container-puppet.yaml` before deployment to configure Rsyslog correctly.
- BZ#2182371
There is currently a known issue with guest instances that use Mellanox ConnectX-5, ConnectX-6, and BlueField-2 NICs with offload (switchdev) ports. It takes a long time to initialize the system when you reboot the operating system from the guest directly, for example, by using the command `sudo systemctl reboot --reboot-arg=now`. If the instance is configured with two Virtual Functions (VFs) from the same Physical Function (PF), the initialization of one of the VFs might fail and cause a longer initialization time.
Workaround: Reboot the guest instance in a timely manner by using the OpenStack API instead of rebooting the guest instance directly.
- BZ#2183793
Overcloud node provisioning might fail for NFV deployments on some AMD platforms in UEFI boot mode on RHOSP 17.1, when using the following BIOS configuration:
- Boot Mode: UEFI
- Hard-disk Drive Placeholder: Enabled
Workaround: Set `Hard-disk Drive Placeholder` to `Disabled`. For information on how to assess each BIOS attribute for your NFV deployment on AMD platforms in UEFI boot mode, see the reference guide for your hardware.
- BZ#2184834
-
The Block Storage API supports the creation of a multi-attach volume by passing a parameter in the volume-create request, even though this method of creating multi-attach volumes has been deprecated for removal: it is unsafe and can lead to data loss when creating a multi-attach volume on a back end that does not support multi-attach volumes. Workaround: Create a multi-attach volume by using a multi-attach volume type, which is the only method of creating multi-attach volumes provided by the `openstack` and `cinder` CLIs.
- In ML2/OVN deployments, do not use live migration on instances that use trunk ports. On instances that use trunk ports, live migration can fail due to the flapping of the instance’s subport between the Compute nodes. For instances that have trunk ports, use cold migration instead.
- BZ#2192913
In RHOSP environments with ML2/OVN or ML2/OVS that have DVR enabled and use VLAN tenant networks, east/west traffic between instances connected to different tenant networks is flooded to the fabric.
As a result, packets between those instances reach not only the Compute nodes where those instances run, but also any other overcloud node.
This can impact the network, and it can be a security risk because the fabric sends traffic everywhere.
This bug will be fixed in a later FDP release. You do not need to perform a RHOSP update to obtain the FDP fix.
- BZ#2193388
The Dashboard service (horizon) is currently configured to validate client TLS certificates by default, which breaks the Dashboard service on all TLS everywhere (TLS-e) deployments.
Workaround:
Add the following configuration to an environment file:
parameter_defaults:
  ControllerExtraConfig:
    horizon::ssl_verify_client: none
Add the environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \
  -e [your environment files] \
  -e /home/stack/templates/<environment_file>.yaml
- BZ#2196291
- Currently, custom SRBAC rules do not permit non-admin users to list policy rules. As a consequence, non-admin users cannot list or manage these rules. Current workarounds include either disabling SRBAC or modifying the SRBAC custom rule to permit this action.
- BZ#2203785
-
Currently, there is a permission issue that causes collectd sensubility to stop working after you reboot a baremetal node. As a consequence, sensubility stops reporting container health. Workaround: After rebooting an overcloud node, manually run the following command on the node:
sudo podman exec -it collectd setfacl -R -m u:collectd:rwx /run/podman
- BZ#2203857
- A known issue in the Ceph RADOS Gateway component in Red Hat Ceph Storage (RHCS) 6.0 causes authorization with Identity service (keystone) tokens to fail. This issue is not manifest in RHCS 6.1, which is supported in RHOSP 17.1.
- BZ#2210030
- There is currently a known issue where custom SRBAC rules do not permit non-administrative users who are not rule owners to list shared security groups. As a result, those users cannot properly manage shared security groups and rules. Workaround: Disable custom SRBAC rules or modify the custom rules to permit any user to manage the rules.
- BZ#2210319
Currently, the Retbleed vulnerability mitigation in RHEL 9.2 can cause a performance drop for Open vSwitch with Data Plane Development Kit (OVS-DPDK) on Intel Skylake CPUs.
This performance regression happens only if C-states are disabled in the BIOS, hyper-threading is enabled, and OVS-DPDK is using only one hyper-thread of a given core.
Workaround: Assign both hyper-threads of a core to OVS-DPDK or to SRIOV guests that have DPDK running as recommended in the NFV configuration guide.
- BZ#2213126
The logging queue that buffers excess security group log entries sometimes stops accepting entries before the specified limit is reached. As a workaround, you can set the queue length higher than the number of entries you want it to hold.
You can set the maximum number of log entries per second with the `NeutronOVNLoggingRateLimit` parameter. If log entry creation exceeds that rate, the excess is buffered in a queue up to the number of log entries that you specify in `NeutronOVNLoggingBurstLimit`.
.The issue is especially evident in the first second of a burst. In longer bursts, such as 60 seconds, the rate limit is more influential and compensates for burst limit inaccuracy. Thus, the issue has the greatest proportional effect in short bursts.
Workaround: Set `NeutronOVNLoggingBurstLimit` to a higher value than the target value. Observe and adjust as needed.
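For example, in an environment file; the values are illustrative:
parameter_defaults:
  NeutronOVNLoggingRateLimit: 100
  # Set the burst limit higher than the number of entries the queue must hold.
  NeutronOVNLoggingBurstLimit: 50
- BZ#2215053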
-
In RHOSP 17.1 environments that use Border Gateway Protocol (BGP) dynamic routing, there is currently a known issue where the FRRouting (FRR) container fails to deploy. This failure occurs because RHOSP director deploys the FRR container before the container image prepare task finishes. Workaround: In your heat templates, ensure that `ContainerImagePrepare` precedes the `overcloud deploy` command.
RHOSP 17.1 with the OVN mechanism driver does not support logging of flow events per port or the use of the `--target` option of the `network log create` command.
RHOSP 17.1 supports logging of flow events per security group, using the `--resource` option of the `network log create` command. See "Logging security group actions" in Configuring Red Hat OpenStack Platform networking.
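For example, a sketch that logs all events for a security group; the log name is illustrative:
$ openstack network log create \
    --resource-type security_group \
    --resource <security_group_id> \
    --event ALL \
    sg-logging
- BZ#2217867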
- There is currently a known issue on NVIDIA ConnectX-5 and ConnectX-6 NICs, when using hardware offload, where some offloaded flows on a PF can cause transient performance issues on the associated VFs. This issue is specifically observed with LLDP and VRRP traffic.
- BZ#2219574
- The data collection service (ceilometer) does not provide a default caching back end, which can cause some services to be overloaded when polling for metrics.
- BZ#2219603
In RHOSP 17.1 GA, the DNS service (designate) is misconfigured when secure role-based access control (sRBAC) is enabled. The current sRBAC policies contain incorrect rules for designate and must be corrected for designate to function correctly.
Workaround: Apply the following patch on the undercloud server and redeploy the overcloud:
https://review.opendev.org/c/openstack/tripleo-heat-templates/+/888159
- BZ#2219830
In RHOSP 17.1, there is a known issue of transient packet loss where hardware interrupt requests (IRQs) cause non-voluntary context switches on OVS-DPDK PMD threads or in guests running DPDK applications.
This issue is the result of provisioning large numbers of VFs during deployment. VFs need IRQs, each of which must be bound to a physical CPU. When there are not enough housekeeping CPUs to handle the capacity of IRQs, irqbalance fails to bind all of them and the IRQs spill over onto isolated CPUs.
Workaround: You can try one or more of these actions; a command sketch for the last two actions follows the list:
- Reduce the number of provisioned VFs to avoid unused VFs remaining bound to their default Linux driver.
- Increase the number of housekeeping CPUs to handle all IRQs.
- Force unused VF network interfaces down to prevent IRQs from interrupting isolated CPUs.
- Disable multicast and broadcast traffic on unused, downed VF network interfaces to prevent IRQs from interrupting isolated CPUs.
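For example, a sketch of the last two actions for a single unused VF interface; the interface name enp5s0f0v3 is hypothetical:
# Force the unused VF network interface down
$ sudo ip link set dev enp5s0f0v3 down
# Disable multicast traffic on the downed interface
$ sudo ip link set dev enp5s0f0v3 multicast off allmulticast off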
- BZ#2220808
-
In RHOSP 17.1, there is a known issue where the data collection service (ceilometer) does not report airflow metrics. This problem occurs because the data collection service is missing a gnocchi resource type, hardware.ipmi.fan. Currently, there is no workaround.
- BZ#2220887
- The data collection service (ceilometer) does not filter separate power and current metrics.
- BZ#2222543
Currently, when a bootstrap Controller node is replaced, the OVN database cluster becomes partitioned: two separate database clusters form for each of the northbound and southbound databases. This situation makes instances unusable.
To find the name of the bootstrap Controller node, run the following command:
ssh tripleo-admin@CONTROLLER_IP "sudo hiera -c /etc/puppet/hiera.yaml pacemaker_short_bootstrap_node_name"
Workaround: Perform the steps described in Red Hat KCS solution 7024434: Recover from partitioned clustered OVN database.
- BZ#2222589
- There is currently a known issue with the upgrade from RHOSP 16.2 to 17.1, where the director upgrade script stops executing when upgrading Red Hat Ceph Storage 4 to 5 in a director-deployed Ceph Storage environment that uses IPv6. Workaround: Apply the workaround from Red Hat KCS solution 7027594: Director upgrade script stops during RHOSP upgrade when upgrading RHCS in director-deployed environment that uses IPv6.
- BZ#2222605
- In RHOSP 17.1, there is a known issue for security group log entries. When events occur in short time intervals of each other, the related security group log entries can be listed in an incorrect order. This is caused by how the OVN back end processes events. Currently, there is no workaround.
- BZ#2222683
Currently, there is no support for Multi-RHEL for the following deployment architectures:
- Edge (DCN)
- ShiftOnStack
- Director operator-based deployments
Workaround: Use only a single version of RHEL across your RHOSP deployment when operating one of the listed architectures.
- BZ#2223294
There is a known issue when performing an in-place upgrade from RHOSP 16.2 to 17.1 GA. The collection agent collectd-sensubility fails to run on RHEL 8 Compute nodes.
Workaround: On affected nodes, edit the file /var/lib/container-config-scripts/collectd_check_health.py, and on line 26 replace "healthy: .State.Health.Status}" with "healthy: .State.Healthcheck.Status}".
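For example, you can apply the edit with a sed one-liner; this is a sketch, so verify the contents of line 26 on your nodes before running it:
$ sudo sed -i '26s/State\.Health\.Status/State.Healthcheck.Status/' \
    /var/lib/container-config-scripts/collectd_check_health.py
- BZ#2223916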
In RHOSP 17.1 GA environments that use the ML2/OVN mechanism driver, there is a known issue with floating IP (FIP) port forwarding not working correctly. This problem occurs because VLAN and flat networks distribute north-south network traffic when FIPs are used, whereas FIP port forwarding must be centralized on the Controller or the Networker nodes.
Workaround: To resolve this problem and force FIP port forwarding through the centralized gateway node, either set the RHOSP Orchestration service (heat) parameter NeutronEnableDVR to false, or use Geneve instead of VLAN or flat project networks.
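A minimal heat environment file sketch for the first option:
parameter_defaults:
  # Centralize north-south traffic; disables distributed FIP routing
  NeutronEnableDVR: false
- BZ#2224236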
In this release of RHOSP, there is a known issue where SR-IOV interfaces that use Intel X710 and E810 series controller virtual functions (VFs) with the iavf driver can experience network connectivity issues that involve link status flapping. The affected guest kernel versions are:
- RHEL 8.7.0 through 8.7.3 (No fixes planned. End of life.)
- RHEL 8.8.0 through 8.8.2 (Fix planned in version 8.8.3.)
- RHEL 9.2.0 through 9.2.2 (Fix planned in version 9.2.3.)
- Upstream Linux 4.9.0 through 6.4.* (Fix planned in version 6.5.)
Workaround: There is none, other than to use a non-affected guest kernel.
- BZ#2224527
- There is currently a known issue with the upgrade from RHOSP 16.2 to 17.1, when RADOS Gateway (RGW) is deployed as part of director-deployed Red Hat Ceph Storage. The procedure fails when HAProxy does not restart on the next stack update. Workaround: Apply the workaround from Red Hat KCS solution 7025985: HAProxy does not restart during RHOSP upgrade when RHCS is director-deployed and RGW is enabled.
- BZ#2225205
-
Outdated upgrade orchestration logic overrides the existing pacemaker authkey during the Fast Forward Upgrade (FFU) procedure, preventing Pacemaker from connecting to pacemaker_remote running on Compute nodes when Instance HA is enabled. As a result, the upgrade fails and pacemaker_remote running on Compute nodes is unreachable from the central cluster. Contact Red Hat support to receive instructions on how to perform FFU if Instance HA is configured.
There is currently a known issue when using a Red Hat Ceph Storage (RHCS) back end for volumes that can prevent instances from being rebooted, and may lead to data corruption. This occurs when all of the following conditions are met:
- RHCS is the back end for instance volumes.
- RHCS has multiple storage pools for volumes.
- A volume is being retyped where the new type requires the volume to be stored in a different pool than its current location.
- The retype call uses the on-demand migration policy.
- The volume is attached to an instance.
Workaround: Do not retype in-use volumes that meet all of these listed conditions.
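For reference, the retype call to avoid on attached volumes looks like the following sketch; the names in angle brackets are placeholders:
$ openstack volume set --type <new_volume_type> \
    --retype-policy on-demand <in_use_volume>
- BZ#2227360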
- The image cache cleanup task of the NetApp NFS driver can cause unpredictable slowdowns in other Block Storage services. There is currently no workaround for this issue.
- BZ#2229750
- When you specify an availability zone (AZ) when creating a Block Storage volume backup, the AZ is ignored. This may cause the backup to fail if the configuration of your AZs prevents the scheduler from satisfying the backup request. This issue does not affect the cross-availability-zone creation of volumes from existing backups.
- BZ#2229761
-
There is currently a known issue with a race condition in the deployment steps for ovn_controller and ovn_dbs, which causes ovn_dbs to be upgraded before ovn_controller. If ovn_controller is not upgraded before ovn_dbs, an error that occurs before the restart to the new version causes packet loss. There is an estimated one-minute network outage if the race condition occurs during the Open Virtual Network (OVN) upgrade. A fix is expected in a later RHOSP release.
- BZ#2229767
-
There is currently a known issue when you upgrade Red Hat Ceph Storage 4 to 5 during the upgrade from RHOSP 16.2 to 17.1. The ceph-nfs resource is misconfigured and Pacemaker does not manage the resource. The overcloud upgrade fails because the containers that are associated with ceph-nfs-pacemaker are down, impacting the Shared File Systems service (manila). A fix is expected in RHOSP 17.1.1. Workaround: Apply the workaround from Red Hat KCS solution 7028073: Pacemaker does not manage the ceph-nfs resource correctly during RHOSP and RHCS upgrade.
- BZ#2229937
-
When collectd-sensubility fails to create a sender, it does not close the link to the sender. Long-running open links that fail can cause issues in the bus, which cause collectd-sensubility to stop working. Workaround: Restart the collectd container on affected overcloud nodes to recover collectd-sensubility.
- BZ#2231378
- If you choose Red Hat Ceph Storage as the back end for your Block Storage (cinder) backup service repository, then you can only restore backed-up volumes to an RBD-based Block Storage back end. There is currently no workaround for this issue.
- BZ#2231893
The metadata service can become unavailable after the metadata agent fails in multiple attempts to start a malfunctioning HAProxy child container. The metadata agent logs an error message similar to: "ProcessExecutionError: Exit code: 125; Stdin: ; Stdout: Starting a new child container neutron-haproxy-ovnmeta-<uuid>".
Workaround: Run podman kill <container_name> to stop the problematic HAProxy child container.
- BZ#2231960
- When a Block Storage volume uses the Red Hat Ceph Storage back end, a volume cannot be removed when a snapshot is created from this volume and then a volume clone is created from this snapshot. In this case, you cannot remove the original volume while the volume clone exists.
- BZ#2232171
If you download RHOSP 17.1.0 GA in the first few days of its availability, you might find that the version description in the file /etc/rhosp-release incorrectly includes the Beta designation, as shown in the following example.
(overcloud) [stack@undercloud-0 ~]$ cat /etc/rhosp-release
Red Hat OpenStack Platform release 17.1.0 Beta (Wallaby)
Workaround: If your GA deployment is affected, run the following command:
# dnf -y update rhosp-release
- BZ#2232199
If you download RHOSP 17.1.0 GA in the first few days of its availability, you might find that the version description in the file /etc/rhosp-release incorrectly includes the Beta designation, as shown in the following example.
(overcloud) [stack@undercloud-0 ~]$ cat /etc/rhosp-release
Red Hat OpenStack Platform release 17.1.0 Beta (Ussuri)
Workaround: If your GA deployment is affected, run the following command:
# dnf -y update rhosp-release
- BZ#2233487
- In RHOSP 17.1 GA environments that use RHOSP dynamic routing, there is a known issue where creating a load balancer using the RHOSP Load-balancing service with the OVN provider driver might fail. This failure can occur when there is latency between Controller nodes. There is no workaround.
3.5.7. Deprecated functionality
The items in this section are either no longer supported, or will no longer be supported in a future release of Red Hat OpenStack Platform (RHOSP).
- BZ#2128701
The ML2/OVS mechanism driver is deprecated since RHOSP 17.0.
Over several releases, Red Hat is replacing ML2/OVS with ML2/OVN. For instance, starting with RHOSP 15, ML2/OVN became the default mechanism driver.
Support is available for the deprecated ML2/OVS mechanism driver through the RHOSP 17 releases. During this time, the ML2/OVS driver remains in maintenance mode, receiving bug fixes and normal support, and most new feature development happens in the ML2/OVN mechanism driver.
In RHOSP 18.0, Red Hat plans to completely remove the ML2/OVS mechanism driver and stop supporting it.
If your existing RHOSP deployment uses the ML2/OVS mechanism driver, start now to evaluate a plan to migrate to the ML2/OVN mechanism driver. Migration is supported in RHOSP 16.2 and 17.1.
Red Hat requires that you file a proactive support case before attempting a migration from ML2/OVS to ML2/OVN. Red Hat does not support migrations without the proactive support case. See How to open a proactive case for a planned activity on Red Hat OpenStack Platform?.
- BZ#2136445
Monitoring of API health status via Podman using sensubility is deprecated in RHOSP 17.1.
Only the sensubility layer is deprecated. API health checks remain supported. The sensubility layer exists for interfacing with Sensu, which is no longer a supported interface.
- BZ#2139931
- The metrics_qdr service (AMQ Interconnect) is deprecated in RHOSP 17.1. The metrics_qdr service continues to be supported in RHOSP 17.1 for data transport to Service Telemetry Framework (STF). The metrics_qdr service is used as a data transport for STF, and does not affect any other components for operation of Red Hat OpenStack.
- BZ#2179428
- Deploying the Block Storage (cinder) backup service in an active-passive configuration is deprecated in RHOSP 17.1 and will be removed in a future release. For RHOSP 16.2 and RHOSP 17.0, the Block Storage (cinder) backup service is deployed in an active-passive configuration, and this configuration will continue to be supported in RHOSP 17.1 for these upgraded clusters.
- BZ#2215264
- Validations Framework (VF) is deprecated in RHOSP 17.1.
- BZ#2238425
- Collectd is deprecated in RHOSP 17.1.
3.5.8. Removed functionality
The items in this section are removed in this release of Red Hat OpenStack Platform (RHOSP):
- BZ#2065541
- In RHOSP 17.1, the collectd-gnocchi plugin is removed from director. You can use Service Telemetry Framework (STF) to collect monitoring data.
3.6. Red Hat OpenStack Platform 17.1 beta - June 15, 2023
Consider the following updates in Red Hat OpenStack Platform (RHOSP) when you deploy this RHOSP release.
3.6.1. Bug fixes
These bugs were fixed in this release of Red Hat OpenStack Platform (RHOSP):
- BZ#1965308
- Before this update, the Load-balancing service (octavia) could unplug a required subnet when you used different subnets from the same network as members' subnets. The members attached to this subnet were unreachable. With this update, the Load-balancing service does not unplug required subnets, and the load balancer can reach subnet members.
- BZ#2066866
-
Even though the Panko monitoring service was deprecated, its endpoint still existed in the Identity service (keystone) after upgrading from RHOSP 16.2 to 17.1. With this update, the Panko service endpoint is cleaned up. However, Panko service users are not removed automatically. You must manually delete Panko service users with the command openstack user delete panko. There is no impact if you do not delete these users.
- BZ#2080199
- Before this update, services that were removed from the undercloud were not cleaned up during upgrades from RHOSP 16.2 to 17.0. The removed services remained in the OpenStack endpoint list even though they were not reachable or running. With this update, RHOSP upgrades include Ansible tasks to clean up the endpoints that are no longer required.
- BZ#2097844
-
Before this update, the overcloud config download command failed with a traceback error because the command attempted to reach the Orchestration service (heat) to perform the download. The Orchestration service is no longer persistently running on the undercloud. With this update, the overcloud config download command is removed. Instead, you can use your overcloud deploy command with the --stack-only option.
- BZ#2116600
-
Sometimes, during a live migration, the libvirt internal error "migration was active, but no RAM info was set" was raised even though the live migration was successful, causing the live migration to fail when it should have succeeded. With this update, when this libvirt internal error is raised, the live migration is signaled as complete in the libvirt driver. The live migration correctly succeeds in this condition.
- BZ#2125610
- Before this update, an SELinux issue triggered errors with RHOSP Load-balancing service (octavia) ICMP health monitors that used the Amphora provider driver. In RHOSP 17.1, this issue has been fixed and ICMP health monitors function correctly.
- BZ#2125612
-
Before this update, users might have experienced the following warning message in the Load-balancing service (octavia) Amphora VM log file when the load balancer was loaded with multiple concurrent sessions: nf_conntrack: table full, dropping packet. When this error occurred, the Amphora VM dropped Transport Control Protocol (TCP) flows, which caused latency on user traffic. With this update, connection tracking (conntrack) is disabled for TCP flows in the Load-balancing service Amphora VM, and new TCP flows are not dropped. Conntrack is only required for User Datagram Protocol (UDP) flows.
- BZ#2129207
- Before this update, a network disruption or temporary unavailability of the Identity service (keystone) resulted in the nova-conductor service failing to start. With this update, the nova-conductor service logs a warning and continues startup in the presence of disruptions that are likely to be temporary. As a result, the nova-conductor service does not fail to start if transient issues like network disruptions or temporary unavailability of necessary services are encountered during startup.
- BZ#2133027
- Before this update, the Alarming service (aodh) used the deprecated gnocchi API to aggregate metrics, which resulted in incorrect metric measures of CPU usage in gnocchi. With this update, dynamic aggregation in gnocchi supports the ability to re-aggregate existing metrics and to manipulate and transform metrics as required. CPU time in gnocchi is correctly calculated.
- BZ#2133297
-
Before this update, the openstack undercloud install command launched the openstack tripleo deploy command, which created the /home/stack/.tripleo/history file with root:root as the owner. Subsequent deploy commands failed because of permission errors. With this update, the command creates the file with the stack user as the owner, and deploy commands succeed without permission errors.
- BZ#2140988
Before this update, a live migration might fail because the database did not update with the destination host details.
With this update, the instance host value in the database is set to the destination host during live migration.
- BZ#2149216
Before this update, Open Virtual Network (OVN) load balancer health checks were not performed if a floating IP (FIP) was associated with the load balancer virtual IP (VIP), and traffic was redirected to members in the Error state if the FIP was used.
With this update, if a FIP is associated with the load balancer VIP, a new load balancer health check is created for the FIP, and traffic is not redirected to members in the Error state.
- BZ#2149468
- Before this update, the Compute service (nova) processed a temporary error message from the Block Storage service (cinder) volume detach API, such as '504 Gateway Timeout', as an error. The Compute service failed the volume detach operation even though it succeeded but timed out on the Block Storage service side, leaving a stale block device mapping record in the Compute service database. With this update, the Compute service retries the volume detach call to the Block Storage service API if it receives an HTTP error that is likely to be temporary. Upon retry, if the volume attachment is no longer found, the Compute service processes the volume as already detached.
- BZ#2151043
-
Before this update, the openstack-cinder-volume-0 container, which is created by the Pacemaker bundle resource for the Block Storage service (cinder), mounted /run from the host. This mount path created the .containerenv file in the directory. When the .containerenv file exists, subscription-manager fails because it evaluates that the command is executed inside a container. With this update, the mount path is updated so that Podman disables the creation of the .containerenv file, and subscription-manager executes successfully in a host that is running the openstack-cinder-volume-0 container.
- BZ#2152888
- Before this update, the Service Telemetry Framework (STF) API health monitoring script was failing because it depended on Podman log content, which was no longer available. With this update, the health monitoring script depends on the Podman socket instead of the Podman log, and API health monitoring operates normally.
- BZ#2154343
- Before this update, the disabling and enabling of network log objects in a security group was inconsistent. The logging of a connection was disabled as soon as one of the log objects in the security group associated with that connection was disabled. With this update, a connection is logged if any of the related enabled log objects in the security group allow it, even if one of those log objects becomes disabled.
- BZ#2162756
- Before this update, VLAN network traffic was centralized over the Controller nodes. With this update, if all the tenant provider networks that are connected to a router are of the VLAN/Flat type, that traffic is now distributed. The node that contains the VM sends the traffic directly.
- BZ#2163815
-
Before this update, Open Virtual Network (OVN) load balancers on switches with localnet ports (Networking service (neutron) provider networks) did not work if traffic came from localnet. With this update, load balancers are not added to the logical switch associated with the provider network. This update forces Network Address Translation (NAT) to occur at the virtual router level instead of the logical switch level.
- BZ#2164421
Before this update, the Compute service (nova) did not confidence-check the content of Virtual Machine Disk (VMDK) image files. By using a specially crafted VMDK image, it was possible to expose sensitive files on the host file system to guests booted with that VMDK image. With this update, the Compute service confidence checks VMDK files and forbids VMDK features that the leak behavior depends on. It is no longer possible to leak sensitive host file system contents using specially crafted VMDK files.
Note: Red Hat does not support the VMDK image file format in RHOSP.
- BZ#2164677
- Before this update, the iptables rule for the heat-cfn service contained the incorrect TCP port number. Users could not access the heat-cfn service endpoint if SSL was enabled for public endpoints. With this update, the TCP port number is correct in the iptables rule. Users can access the heat-cfn service endpoint, even if SSL is enabled for public endpoints.
- BZ#2167161
Before this update, the default value of rgw_max_attr_size was 256, which created issues for OpenShift on OpenStack when uploading large images. With this update, the default value of rgw_max_attr_size is 1024.
You can change the value by adding the following configuration to an environment file that you include in your overcloud deployment:
parameter_defaults:
  CephConfigOverrides:
    rgw_max_attr_size: <new value>
- BZ#2169303
-
Before this update, the IPMI agent container did not spawn because the CeilometerIpmi service was not added to THT Compute roles. With this update, the CeilometerIpmi service is added to all THT Compute roles. The IPMI agent container is executed with the --privileged flag to execute ipmitool commands on the host. The Telemetry service (ceilometer) can now capture power metrics.
- BZ#2169349
- Before this update, instances lost communication with the ovn-metadata-port because the load balancer health monitor was replying to ARP requests for the OVN metadata agent’s IP, which caused requests destined for the metadata agent to be sent to another MAC address. With this update, the ovn-controller conducts back-end checks by using a dedicated port instead of the ovn-metadata-port. When establishing a health monitor for a load balancer pool, ensure that there is an available IP in the VIP load balancer’s subnet. This port is distinct for each subnet, and various health monitors in the same subnet can reuse the port. Health monitor checks no longer impact ovn-metadata-port communications for instances.
- BZ#2172063
-
Before this update, the openstack overcloud ceph deploy command may have failed during the apply spec operation if the chrony NTP service was down. With this update, the chrony NTP service is enabled before the apply spec operation.
- BZ#2172582
-
Before this update, the create pool operation failed because the podman command used /etc/ceph as the volume argument. This argument does not work for Red Hat Ceph Storage version 6 containers. With this update, the podman command uses /var/lib/ceph/$FSID/config/ as the first volume argument, and create pool operations are successful.
- BZ#2173101
-
Before this update, when users deployed Red Hat Ceph Storage in a tripleo-ipa context, a stray hosts warning showed in the cluster for the Ceph Object Gateway (RADOS Gateway [RGW]). With this update, during a Ceph Storage deployment, you can pass the option --tld in a tripleo-ipa context to use the correct hosts when you create the cluster.
- BZ#2173575
- Before this update, when a VM that was associated to a provider network with disabled port security attempted to reach IPs on the provider network that were not recognized by OpenStack, there was a flooding issue because the forwarding database (FDB) table was not learning MAC addresses. This patch uses a new option in OVN to enable the learning of IPs in the FDB table. There is currently no ageing mechanism for the FDB table. You can clean up the table periodically to prevent the occurrence of scaling issues caused by the size of the table.
- BZ#2178618
-
Before this update, a security group logging enhancement introduced an issue where log objects could not be deleted at the same time as security groups. This action caused an internal server error. With this update, the db_set function that modifies the northbound database entries does not fail if the requested row no longer exists.
- BZ#2180933
-
Before this update, host services, such as Pacemaker, were mounted under /var/log/host/ in the rsyslog container. However, the configuration path was the same as the host path /var/log/pacemaker/. Because of this issue, the rsyslog service could not locate Pacemaker log files. With this update, the Pacemaker log path is changed from /var/log/pacemaker/ to /var/log/host/pacemaker/.
- BZ#2188252
-
Before this update, the 'openstack tripleo container image prepare' command failed because there were incorrect Ceph container tags in the container_image_prepare_defaults.yaml file. With this update, the correct Ceph container tags are in the YAML file, and the 'openstack tripleo container image prepare' command is successful.
- BZ#2203238
- Before this update, for the nova-compute log to record os-brick privileged commands for debugging purposes, you had to apply the workaround outlined in https://access.redhat.com/articles/5906971. This update makes the workaround redundant and provides a better solution that separates logging by the nova-compute service so that the privileged commands of os-brick are logged at the debug level but the privileged commands of nova are not.
3.6.2. Enhancements
This release of Red Hat OpenStack Platform (RHOSP) features the following enhancements:
- BZ#1369007
- Cloud users can launch instances that are protected with UEFI Secure Boot when the overcloud contains UEFI Secure Boot Compute nodes. For information on creating an image for UEFI Secure Boot, see Creating an image for UEFI Secure Boot. For information on creating a flavor for UEFI Secure Boot, see "UEFI Secure Boot" in Flavor metadata.
- BZ#1581414
Before this release, NovaHWMachineType could not be changed for the lifetime of a RHOSP deployment because the machine type of instances without a hw_machine_type image property would use the newly configured machine types after a hard reboot or migration. Changing the underlying machine type for an instance could break the internal ABI of the instance.
With this release, when launching an instance the Compute service records the instance machine type within the system metadata of the instance. Therefore, it is now possible to change the NovaHWMachineType during the lifetime of a RHOSP deployment without affecting the machine type of existing instances.
- BZ#1619266
This update introduces the security group logging feature. To monitor traffic flows and attempts into and out of a virtual machine instance, you can configure the Networking service (neutron) packet logging for security groups.
You can associate any virtual machine instance port with one or more security groups and define one or more rules for each security group. For instance, you can create a rule to drop inbound SSH traffic to any virtual machine in the finance security group. You can create another rule to allow virtual machines in that group to send and respond to ICMP (ping) messages.
Then you can configure packet logging to record combinations of accepted and dropped packet flows.
You can use security group logging for both stateful and stateless security groups.
Logged events are stored on the Compute nodes that host the virtual machine instances, in the file /var/log/containers/stdouts/ovn_controller.log.
- BZ#1672972
This enhancement helps cloud users determine if the reason they are unable to access an "ACTIVE" instance is because the Compute node that hosts the instance is unreachable. RHOSP administrators can now configure the following parameters to enable a custom policy that provides a status in the host_status field to cloud users when they run the openstack server show command, if the host Compute node is unreachable:
- NovaApiHostStatusPolicy: Specifies the role the custom policy applies to.
- NovaShowHostStatus: Specifies the level of host status to show to the cloud user, for example, "UNKNOWN".
- BZ#1693377
-
With this update, an instance can have a mix of shared (floating) CPUs and dedicated (pinned) CPUs instead of only one CPU type. RHOSP administrators can use the hw:cpu_policy=mixed and hw:cpu_dedicated_mask flavor extra specs to create a flavor for instances that require a mix of shared CPUs and dedicated CPUs.
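For example, a sketch of such a flavor; the flavor name and sizes are illustrative. The mask pins vCPUs 2 and 3 and leaves vCPUs 0 and 1 shared:
$ openstack flavor create mixed.example --vcpus 4 --ram 4096 --disk 20
$ openstack flavor set mixed.example \
    --property hw:cpu_policy=mixed \
    --property hw:cpu_dedicated_mask=2-3
- BZ#1701281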
- In RHOSP 17.1, support is available for cold migrating and resizing instances that have vGPUs.
- BZ#1761861
- With this update, you can configure each physical GPU on a Compute node to support a different virtual GPU type.
- BZ#1761903
-
On RHOSP deployments that use a routed provider network, you can now configure the Compute scheduler to filter Compute nodes that have affinity with routed network segments, and verify the network in placement before scheduling an instance on a Compute node. You can enable this feature by using the NovaSchedulerQueryPlacementForRoutedNetworkAggregates parameter.
- BZ#1772124
-
With this update, you can use the new NovaMaxDiskDevicesToAttach heat parameter to specify the maximum number of disk devices that can be attached to a single instance. The default is unlimited (-1). For more information, see Configuring the maximum number of storage devices to attach to one instance.
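A minimal heat environment file sketch; the limit of 20 is illustrative:
parameter_defaults:
  # Maximum number of disk devices attachable to a single instance (-1 = unlimited)
  NovaMaxDiskDevicesToAttach: 20
- BZ#1782128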
-
In RHOSP 17.1, a RHOSP administrator can provide cloud users the ability to create instances that have emulated virtual Trusted Platform Module (vTPM) devices. RHOSP only supports TPM version 2.0.
- BZ#1793700
-
In RHOSP 17.1, a RHOSP administrator can declare which custom physical features and consumable resources are available on the RHOSP overcloud nodes by modeling custom traits and inventories in a YAML file, provider.yaml.
- BZ#1827598
- This RHOSP release introduces support for the OpenStack stateless security groups API.
- BZ#1873409
- On RHOSP deployments that are configured for OVS hardware offload and to use ML2/OVN, and that have Compute nodes with VDPA devices and drivers and Mellanox NICs, you can enable your cloud users to create instances that use VirtIO data path acceleration (VDPA) ports. For more information, see Configuring VDPA Compute nodes to enable instances that use VDPA ports and Creating an instance with a VDPA interface.
- BZ#1873707
With this update, you can use the validation framework in the workflow of backup and restore procedures to verify the status of the restored system. The following validations are included:
- undercloud-service-status
- neutron-sanity-check
- healthcheck-service-status
- nova-status
- ceph-health
- check-cpu
- service-status
- image-serve
- pacemaker-status
- validate-selinux
- container-status
- BZ#1883554
-
With this update, a RHOSP administrator can now create a flavor that has a socket PCI NUMA affinity policy, which can be used to create an instance that requests a PCI device only when at least one of the instance NUMA nodes has affinity with a NUMA node in the same host socket as the PCI device.
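For example, a sketch that applies the policy to an existing flavor; the flavor name pci.example is hypothetical:
$ openstack flavor set pci.example \
    --property hw:pci_numa_affinity_policy=socket
- BZ#1962500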
- With this update, you can configure the collectd logging source in TripleO Heat Templates. The default value matches the default logging path.
- BZ#2033811
- The Shared File System service (manila) now supports using the Pure Storage FlashBlade system as a back end. Refer to the Red Hat ecosystem catalog to find the vendor’s certification and installation documentation.
- BZ#2066349
With this enhancement, the LVM volumes installed by the overcloud-hardened-uefi-full.qcow2 whole-disk overcloud image are now backed by a thin pool. The volumes are still grown to consume the available physical storage, but are not over-provisioned by default.
The benefits of thin-provisioned logical volumes:
- If a volume fills to capacity, the options for manual intervention now include growing the volume to over-provision the physical storage capacity.
- The RHOSP upgrades process can now create ephemeral backup volumes in thin-provisioned environments.
- BZ#2069624
- The RHOSP snapshot and revert feature is based on the Logical Volume Manager (LVM) snapshot functionality and is intended to revert an unsuccessful upgrade or update. Snapshots preserve the original disk state of your RHOSP cluster before performing an upgrade or an update. You can then remove or revert the snapshots depending on the results. If an upgrade completed successfully and you do not need the snapshots anymore, remove them from your nodes. If an upgrade fails, you can revert the snapshots, assess any errors, and start the upgrade procedure again. A revert leaves the disks of all the nodes exactly as they were when the snapshot was taken.
- BZ#2104522
- With this update, live migration now uses multichassis Open Virtual Network (OVN) ports to optimize the migration procedure and significantly reduce network downtime for VMs during migration in particular scenarios.
- BZ#2111528
- With this update, the default Ceph container image is based on Red Hat Ceph Storage 6 instead of Red Hat Ceph Storage 5.
- BZ#2124309
- With this enhancement, operators can enable the run_arping feature for Pacemaker-managed virtual IPs (VIPs), so that the cluster preemptively checks for duplicate IPs. To do this, you must add the following configuration to the environment file:
ExtraConfig:
  pacemaker::resource::ip::run_arping: true
If a duplicate is found, the following error is logged in the /var/log/pacemaker/pacemaker.log log file:
Sep 07 05:54:54 IPaddr2(ip-172.17.3.115)[209771]: ERROR: IPv4 address collision 172.17.3.115 [DAD]
Sep 07 05:54:54 IPaddr2(ip-172.17.3.115)[209771]: ERROR: Failed to add 172.17.3.115
- BZ#2133055, BZ#2138238
- With this update, you deploy two separate instances of the Image service (glance) API. The instance that is accessible to OpenStack tenants is configured to hide image location details, such as the direct URL of an image or whether the image is available in multiple locations. The second instance is accessible to OpenStack administrators and OpenStack services, such as the Block Storage service (cinder) and the Compute service (nova). This instance is configured to provide image location details. This enhancement addresses the recommendations of OSSN-0090 and CVE-2022-4134. With this update, a malicious user cannot leverage the location details of an image to upload an altered image.
- BZ#2152877
- This enhancement adds OVN security group logging to the Networking service (neutron) for the reply packets of a network connection. The ovn-controller log files now log the full network connection.
- BZ#2165501
- Starting with Red Hat OpenStack Platform (RHOSP) 17.1, in ML2/OVN deployments, you can enable hardware offloading on minimum bandwidth or bandwidth limit QoS egress policies. You cannot enable hardware offloading on ingress policies. For more information, see Configuring the Networking service for QoS policies.
3.6.3. Technology previews
The items listed in this section are provided as Technology Previews for Red Hat OpenStack Platform (RHOSP). For further information on the scope of Technology Preview status, and the associated support implications, refer to https://access.redhat.com/support/offerings/techpreview/.
- BZ#1813561
- With this update, the Load-balancing service (octavia) supports HTTP/2 load balancing by using the Application Layer Protocol Negotiation (ALPN) for listeners and pools that are enabled with Transport Layer Security (TLS). The HTTP/2 protocol improves performance by loading pages faster.
- BZ#1848407
- In RHOSP 17.1, a technology preview is available for the Stream Control Transmission Protocol (SCTP) in the Load-balancing service (octavia). Users can create SCTP listeners and attach SCTP pools in a load balancer.
- BZ#2057921
- In RHOSP 17.1, a technology preview is available for creating load balancers over an IPv6 management network. Using a private IPv6 management network for the Load-balancing service (octavia) may simplify edge deployments.
- BZ#2088291
- In RHOSP 17.1, a technology preview is available for ML2/OVN QoS bandwidth limiting for router gateway IP ingress and egress.
3.6.4. Release notes
This section outlines important details about the release, including recommended practices and notable changes to Red Hat OpenStack Platform (RHOSP). You must take this information into account to ensure the best possible outcomes for your deployment.
- BZ#2178015
In RHOSP 17.1, Red Hat recommends that all physical functions (PFs) on the same NIC hardware use drivers that are in the same space. PFs on the same NIC should all use drivers that run in either the user space or in the kernel space.
For example, if PF1 on NIC1 is used by the DPDK PMD driver, then PF2 on NIC1 should not use the kernel driver. In this example, the PFs on NIC1 should both use the DPDK PMD driver or both use the kernel driver.
3.6.5. Known issues
These known issues exist in Red Hat OpenStack Platform (RHOSP) at this time:
- BZ#2108212
If you use IPv6 to connect to VM instances during migration to the OVN mechanism driver, connection to the instances might be disrupted for up to several minutes when the ML2/OVN services start.
The router advertisement daemon radvd for IPv6 is stopped during migration to the OVN mechanism driver. While radvd is stopped, router advertisements are no longer broadcast. This broadcast interruption results in VM instance connection loss over IPv6. IPv6 communication is automatically restored once the new ML2/OVN services start.
Workaround: To avoid the potential disruption, use IPv4 instead.
- BZ#2109985
Currently, in ML2/OVS deployments, Open vSwitch (OVS) does not support offloading OpenFlow rules that have the skb_priority, skb_mark, or output queue fields set. These fields are required for Quality of Service (QoS) support for virtio ports.
If you set a minimum bandwidth rule for a virtio port, the Networking service (neutron) OVS agent marks the traffic of this port with a Packet Mark field. This traffic cannot be offloaded, and it affects the traffic in other ports. If you set a bandwidth limit rule, all traffic is marked with the default 0 queue, which means that no traffic can be offloaded.
Workaround: If your environment includes OVS hardware offload ports, disable packet marking in the nodes that require hardware offloading. When you disable packet marking, it is not possible to set rate limiting rules for virtio ports. However, differentiated services code point (DSCP) marking rules are still available.
In the configuration file, set the disable_packet_marking flag to true. When you edit the configuration file, you must restart the neutron_ovs_agent container. For example:
$ cat /var/lib/config-data/puppet-generated/neutron/etc/neutron/plugins/ml2/openvswitch_agent.ini
[ovs]
disable_packet_marking=True
- BZ#2126810
In RHOSP 17.0, the DNS service (designate) and the Load-balancing service (octavia) are misconfigured for high availability. The RHOSP Orchestration service (heat) templates for these services use the non-Pacemaker version of the Redis template.
Workaround: Include environments/ha-redis.yaml in the overcloud deploy command after the enable-designate.yaml and octavia.yaml environment files.
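For example, a deploy command sketch; the exact paths to the environment files depend on your deployment, but environments/ha-redis.yaml must follow the other two files:
$ openstack overcloud deploy --templates \
    -e <path-to>/enable-designate.yaml \
    -e <path-to>/octavia.yaml \
    -e environments/ha-redis.yaml \
    <other environment files and options>
- BZ#2144492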
- If you migrate a RHOSP 17.1 ML2/OVS deployment with centralized routing (no DVR) to ML2/OVN, the floating IP (FIP) downtime that occurs during ML2/OVN migration can exceed 60 seconds.
- BZ#2160481
In RHOSP 17.1 environments that use BGP dynamic routing, there is currently a known issue where floating IP (FIP) port forwarding fails.
When FIP port forwarding is configured, packets sent to a specific destination port with a destination IP that equals the FIP are redirected to an internal IP from a RHOSP Networking service (neutron) port. This occurs regardless of the protocol that is used: TCP, UDP, and so on.
When BGP dynamic routing is configured, the routes to the FIPs used to perform FIP port forwarding are not exposed, and these packets cannot reach their final destinations.
Currently, there is no workaround.
- BZ#2163477
- In RHOSP 17.1 environments that use BGP dynamic routing, there is currently a known issue affecting VM instances connected to provider networks. The RHOSP Compute service cannot route packets sent from one of these VM instances to a multicast IP address destination. Therefore, VM instances subscribed to a multicast group fail to receive the packets sent to them. The cause is that BGP multicast routing is not properly configured on the overcloud nodes. Currently, there is no workaround.
- BZ#2182371
-
There is currently a known issue with guest instances that use Mellanox ConnectX-5, ConnectX-6, and Bluefield-2 NICs with offload (switchdev) ports. It takes a long time to initialize the system when you reboot the operating system from the guest directly, for example, by using the command sudo systemctl reboot --reboot-arg=now. If the VM is configured with two Virtual Functions (VFs) from the same Physical Function (PF), the initialization of one of the VFs might fail and cause a longer initialization time. Workaround: Reboot the guest instance in a timely manner by using the OpenStack API instead of rebooting the guest instance directly.
- BZ#2183793
Red Hat has not validated the RHOSP 17.1 beta release on NFV deployments with AMD processors. Testing is underway, with plans to complete validation in a future release.
Do not use RHOSP 17.1 NFV deployments on AMD hardware in production until Red Hat completes its validation. Any use before validation is complete risks unintended results.
- BZ#2184070
- This update adds a check to ensure that there are enough IP addresses available for each subnet pool during an OVN migration. If you do not have enough IP addresses, the migration script will stop and display a warning.
- BZ#2185897
- In ML2/OVN deployments, do not use live migration on virtual machine instances that use trunk ports. On instances that use trunk ports, live migration can fail due to the flapping of the instance’s subport between the compute nodes. For instances that have trunk ports, use cold migration instead.
- BZ#2192913
In RHOSP 17.1 environments with ML2/OVN, DVR enabled and using VLAN tenant networks, east/west traffic between VMs connected to different tenant networks is flooded to the fabric.
As a result, packets between those VMs reach not only the compute nodes where those VMs run, but also any other overcloud node.
This behavior can impact the network and can pose a security risk because the fabric sends traffic everywhere.
This bug will be fixed in a later FDP release; no RHOSP update is needed to obtain the fix.
- BZ#2196291
- There is currently a known issue where custom SRBAC rules do not permit non-admin users to list policy rules. As a consequence, non-admin users cannot list or manage these rules. Current workarounds include either disabling SRBAC or modifying the SRBAC custom rule to permit this action.
- BZ#2203857
Currently, a known issue in the Ceph RADOS Gateway component in Red Hat Ceph Storage (RHCS) 6.0 causes authorization with Identity service (keystone) tokens to fail. See https://bugzilla.redhat.com/2188266.
As a result, when you configure your deployment with Red Hat Ceph Storage using RADOS Gateway as the object-store server, Object Storage service (swift) clients fail and return code 403/Unauthorized. The issue did not manifest in tests that deployed pre-release versions of RHCS 6.1, which was released for general availability on June 15, 2023.
Also, OpenShift integration on OpenStack has not been validated for beta because the default configuration uses RADOS Gateway. The following workaround is expected to mitigate the issue and enable you to do preliminary tests with OpenShift integration on OpenStack.
Workaround: Deploy the Object Storage service (swift) as the object-store server instead of RADOS Gateway, even when enabling Ceph Storage for persistent Block Storage service (cinder) or Image service (glance) storage and ephemeral Compute service (nova) storage. To do this, replace the cephadm.yaml environment file with the cephadm-rbd-only.yaml file in the deployment command line.
When you configure the OpenStack environment with the Object Storage service (swift) instead of RADOS Gateway as the object-store server, Object Storage service (swift) clients work as expected.
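For example, a deploy command sketch, assuming the default tripleo-heat-templates location for the environment file:
$ openstack overcloud deploy --templates \
    -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm-rbd-only.yaml \
    <other environment files and options>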
- BZ#2207991
-
Currently, secure role-based access control (SRBAC) and the NovaShowHostStatus parameter use the same policy key titles. If you configure both SRBAC and NovaShowHostStatus, the deployment fails with a conflict. In RHOSP 17.1-Beta, you cannot use both features in the same deployment. A fix is expected in the RHOSP 17.1 GA release.
- BZ#2210030
- There is currently a known issue where custom SRBAC rules do not permit non-administrative users who are not rule owners to list shared security groups. As a result, those users cannot properly manage shared security groups and their rules. Workaround: Disable custom SRBAC rules or modify the custom rules to permit any user to manage the rules.
- BZ#2210062
In RHOSP 17.1 environments that use BGP dynamic routing with OVN, there is a known issue where the default value of the Autonomous System Number (ASN) used by the OVN BGP agent differs from the ASN used by FRRouting (FRR).
Workaround: Ensure that the values of the tripleo parameters FrrBgpAsn and FrrOvnBgpAgentAsn, used in the undercloud and overcloud configuration, are identical.
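For example, a heat environment file sketch that pins both parameters to the same value; the ASN 64999 is illustrative:
parameter_defaults:
  FrrBgpAsn: 64999
  FrrOvnBgpAgentAsn: 64999
- BZ#2210319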
There is currently a known issue where the Retbleed vulnerability mitigation in RHEL 9.2 can cause a performance drop for Open vSwitch with Data Plane Development Kit (OVS-DPDK) on Intel Skylake CPUs.
This performance regression happens only if C-states are disabled in the BIOS, hyper-threading is enabled, and OVS-DPDK is using only one hyper-thread of a given core.
Workaround: Assign both hyper-threads of a core to OVS-DPDK or to SRIOV guests that have DPDK running as recommended in the NFV configuration guide.
- BZ#2211691
- There is currently a known issue where changes to the Block Storage service (cinder), related to CVE-2023-2088, impact the ability of the Bare Metal Provisioning service (ironic) to detach a volume that is attached to a physical bare metal node. The detachment is required for the teardown of physical machines with an instance deployed on them. You can deploy bare-metal instances by using the Compute service (nova) or by using the boot from volume functionality. However, you cannot automatically tear down instances by using boot from Block Storage service volumes. There is no workaround for this issue. A fix is expected in the RHOSP 17.1 GA release.
- BZ#2211849
In RHOSP 17.1 environments that use BGP dynamic routing, there is currently a known issue where the OVN BGP agents that are running on overcloud nodes fail because of a bug in a shipped library (pyroute2). When this issue occurs, no new routes are advertised from the affected node, and there might be a loss of connectivity with new or migrated VMs, new load balancers, and so on.
Workaround: Install an updated version of pyroute2 in the ovn_bgp_agent container by adding the following lines to containers-prepare-parameter.yaml:
ContainerImagePrepare:
- push_destination: true
  ...
  includes:
  - nova-compute
  modify_role: tripleo-modify-image
  modify_append_tag: "-hotfix"
  modify_vars:
    tasks_from: rpm_install.yml
    rpms_path: /home/stack/nova-hotfix-pkgs
...
For more information, see Installing additional RPM files to container images.
- BZ#2213126
The logging queue that buffers excess security group log entries sometimes stops accepting entries before the specified limit is reached. As a workaround, you can set the queue length higher than the number of entries you want it to hold.
You can set the maximum number of log entries per second with the parameter NeutronOVNLoggingRateLimit. If the log entry creation exceeds that rate, the excess is buffered in a queue up to the number of log entries that you specify in NeutronOVNLoggingBurstLimit.
The issue is especially evident in the first second of a burst. In longer bursts, such as 60 seconds, the rate limit is more influential and compensates for burst limit inaccuracy. Thus, the issue has the greatest proportional effect in short bursts.
Workaround: Set NeutronOVNLoggingBurstLimit to a higher value than the target value. Observe and adjust as needed.
- BZ#2214328
Currently, DNS-as-a-Service (designate) is misconfigured when secure role-based access control (SRBAC) is enabled. If you configure both SRBAC and DNS-as-a-Service, the RHOSP deployment fails. Workaround: For a successful deployment, apply the following patches on the undercloud server:
- BZ#2215053
-
In RHOSP 17.1 environments that use Border Gateway Protocol (BGP) dynamic routing, there is currently a known issue where the FRRouting (FRR) container fails to deploy. This failure occurs because the RHOSP director deploys the FRR container before the container image prepare task finishes. Workaround: In your heat templates, ensure that the ContainerImagePrepare task precedes the overcloud deploy command.
- BZ#2215936
- If you migrate from ML2/OVS with SR-IOV to ML2/OVN, and then attempt to create a VM instance with virtual functions (VF), the instance creation fails. The problem does not affect instances with physical functions (PF).
3.6.6. Deprecated functionality
The items in this section are either no longer supported, or will no longer be supported in a future release of Red Hat OpenStack Platform (RHOSP):
- BZ#2128701
The ML2/OVS mechanism driver is deprecated since RHOSP 17.0.
Over several releases, Red Hat is replacing ML2/OVS with ML2/OVN. For instance, starting with RHOSP 15, ML2/OVN became the default mechanism driver.
Support is available for the deprecated ML2/OVS mechanism driver through the RHOSP 17 releases. During this time, the ML2/OVS driver remains in maintenance mode, receiving bug fixes and normal support, and most new feature development happens in the ML2/OVN mechanism driver.
In RHOSP 18.0, Red Hat plans to completely remove the ML2/OVS mechanism driver and stop supporting it.
If your existing RHOSP deployment uses the ML2/OVS mechanism driver, start now to evaluate a plan to migrate to the ML2/OVN mechanism driver. Migration is supported in RHOSP 16.2 and 17.1.
Red Hat requires that you file a proactive support case before attempting a migration from ML2/OVS to ML2/OVN. Red Hat does not support migrations without the proactive support case. See How to submit a Proactive Case.
- BZ#2136445
Monitoring of API health status via Podman using sensubility is deprecated in RHOSP 17.1.
Only the sensubility layer is deprecated. API health checks remain supported. The sensubility layer exists for interfacing with Sensu, which is no longer a supported interface.
- BZ#2139931
- The metrics_qdr service (AMQ Interconnect) is deprecated in RHOSP 17.1. The metrics_qdr service continues to be supported in RHOSP 17.1 for data transport to Service Telemetry Framework (STF). The metrics_qdr service is used as a data transport for STF, and does not affect any other components for operation of Red Hat OpenStack.
- BZ#2179428
- Deploying the Block Storage (cinder) backup service in an active-passive configuration is deprecated in RHOSP 17.1 and will be removed in a future release. For RHOSP 16.2 and RHOSP 17.0, the Block Storage (cinder) backup service is deployed in an active-passive configuration, and this configuration will continue to be supported in RHOSP 17.1 for these upgraded clusters.
- BZ#2215264
- Validations Framework (VF) is deprecated in RHOSP 17.1.
3.6.7. Removed functionality
The items in this section are removed in this release of Red Hat OpenStack Platform (RHOSP):
- BZ#2065541
- In RHOSP 17.1, the collectd-gnocchi plugin is removed from director. You can use Service Telemetry Framework (STF) to collect monitoring data.
- BZ#2126890
The Derived Parameters feature is removed. The Derived Parameters feature was configured by using the --plan-environment-file option of the openstack overcloud deploy command.
Workaround / Migration Instructions
NFV and HCI overclouds require system tuning. There are many different options for system tuning. The Derived Parameters functionality tuned systems by using director to inspect hardware inspection data and to set tuning parameters with the --plan-environment-file option of the openstack overcloud deploy command. The Derived Parameters functionality is removed in 17.1.
The following parameters were tuned by this functionality:
- IsolCpusList
- KernelArgs
- NeutronPhysnetNUMANodesMapping
- NeutronTunnelNUMANodes
- NovaCPUAllocationRatio
- NovaComputeCpuDedicatedSet
- NovaComputeCpuSharedSet
- NovaReservedHostMemory
- OvsDpdkCoreList
- OvsDpdkSocketMemory
- OvsPmdCoreList
To set and tune these parameters, observe their values using the available command line tools and set them using a standard heat template.