Chapter 8. Network requirements
OpenShift Data Foundation requires that at least one network interface used for the cluster network be capable of at least 10 gigabit network speeds. This section further covers network considerations for planning deployments.
8.1. IPv6 support
Red Hat OpenShift Data Foundation version 4.12 introduced support for IPv6. IPv6 is supported in single stack only, and cannot be used simultaneously with IPv4. IPv6 is the default behavior in OpenShift Data Foundation when IPv6 is turned on in OpenShift Container Platform.
Red Hat OpenShift Data Foundation version 4.14 introduces IPv6 auto detection and configuration. Clusters using IPv6 will automatically be configured accordingly.
OpenShift Container Platform dual stack with Red Hat OpenShift Data Foundation IPv4 is supported from version 4.13 and later. Dual stack on Red Hat OpenShift Data Foundation IPv6 is not supported.
8.2. Multi network plug-in (Multus) support
OpenShift Data Foundation supports the ability to use the Multus multi-network plug-in on bare metal infrastructures to improve security and performance by isolating the different types of network traffic. By using Multus, one or more network interfaces on hosts can be reserved for the exclusive use of OpenShift Data Foundation.
To use Multus, first run the Multus prerequisite validation tool. For instructions to use the tool, see OpenShift Data Foundation - Multus prerequisite validation tool. For more information about Multus networks, see Multiple networks.
You can configure your Multus networks to use IPv4 or IPv6 as a Technology Preview. This works only for Multus networks that are pure IPv4 or pure IPv6; mixed-mode networks are not supported.
Technology Preview features provide early access to upcoming product innovations, enabling you to test functionality and provide feedback during the development process. However, these features are not fully supported under Red Hat Service Level Agreements, may not be functionally complete, and are not intended for production use. As Red Hat considers making future iterations of Technology Preview features generally available, we will attempt to resolve any issues that customers experience when using these features.
See Technology Preview Features Support Scope for more information.
8.2.1. Multus prerequisites
In order for Ceph-CSI to communicate with a Multus-enabled CephCluster, some setup is required for Kubernetes hosts.
These prerequisites require an understanding of how Multus networks are configured and how Rook uses them. This section will help clarify questions that could arise.
Two basic requirements must be met:
- OpenShift hosts must be able to route successfully to the Multus public network.
- Pods on the Multus public network must be able to route successfully to OpenShift hosts.
These two requirements can be broken down further as follows:
For routing Kubernetes hosts to the Multus public network, each host must ensure the following:
- The host must have an interface connected to the Multus public network (the "public-network-interface").
- The "public-network-interface" must have an IP address.
- A route must exist to direct traffic destined for pods on the Multus public network through the "public-network-interface".
For routing pods on the Multus public network to Kubernetes hosts, the public NetworkAttachmentDefinition must be configured to ensure the following:
- The definition must have its IP Address Management (IPAM) configured to route traffic destined for nodes through the network.
- To ensure routing between the two networks works properly, no IP address assigned to a node can overlap with any IP address assigned to a pod on the Multus public network.
- Generally, both the NetworkAttachmentDefinition, and node configurations must use the same network technology (Macvlan) to connect to the Multus public network.
Node configurations and pod configurations are interrelated and tightly coupled. Both must be planned at the same time, and OpenShift Data Foundation cannot support Multus public networks without both.
The “public-network-interface” must be the same for both. Generally, the connection technology (Macvlan) should also be the same for both. IP range(s) in the NetworkAttachmentDefinition must be encoded as routes on nodes, and, conversely, IP ranges for nodes must be encoded as routes in the NetworkAttachmentDefinition.
Some installations might not want to use the same public network IP address range for both pods and nodes. Where pods and nodes use different ranges, additional steps must be taken to ensure each range routes to the other so that they act as a single, contiguous network. These requirements need careful planning; see Multus examples to help understand and implement them.
There are often ten or more OpenShift Data Foundation pods per storage node. The pod address space usually needs to be several times larger (or more) than the host address space.
Using the NMState operator’s NodeNetworkConfigurationPolicy resources is the recommended method of configuring hosts to meet these requirements. Other methods can be used as well if needed.
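As a quick host-side check of these prerequisites, you can confirm from a debug shell that the public network interface exists with an IP address and that a route covers the pod range. The interface name (odf-pub-shim) and addresses shown here are taken from the Multus examples section later in this chapter; substitute your own values.

$ oc debug node/compute-0
sh-5.1# chroot /host
sh-5.1# ip address show odf-pub-shim
sh-5.1# ip route get 192.168.20.22

In the ip route get output, the pod IP should be reachable through the public network interface rather than through the default route.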
8.2.1.1. Multus network address space sizing
Networks must have enough addresses to account for the number of storage pods that will attach to the network, plus some additional space to account for failover events.
It is highly recommended to also plan ahead for future storage cluster expansion and estimate how large the OpenShift Container Platform and OpenShift Data Foundation clusters may grow in the future. Reserving addresses for future expansion means that there is lower risk of depleting the IP address pool unexpectedly during expansion.
It is safest to allocate 25% more addresses (or more) than the total maximum number of addresses that are expected to be needed at one time in the storage cluster’s lifetime. This helps lower the risk of depleting the IP address pool during failover and maintenance.
For ease of writing corresponding network CIDR configurations, rounding totals up to the nearest power of 2 is also recommended.
Three ranges must be planned:
- If used, the public Network Attachment Definition address space must include enough IPs for the total number of ODF pods running in the openshift-storage namespace.
- If used, the cluster Network Attachment Definition address space must include enough IPs for the total number of OSD pods running in the openshift-storage namespace.
- If the Multus public network is used, the node public network address space must include enough IPs for the total number of OpenShift nodes connected to the Multus public network.
If the cluster uses a unified address space for the public Network Attachment Definition and node public network attachments, add these two requirements together. This is relevant, for example, if DHCP is used to manage IPs for the public network.
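For example, using the recommendation table below, if the public Network Attachment Definition needs a /21 (2,048 addresses) and the node attachments need a /23 (512 addresses), a unified range must provide at least 2,560 addresses, which rounds up to 4,096 addresses (a /20).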
For environments with piecewise CIDRs, that is, one network composed of two or more different CIDRs, auto-detection is likely to find only a single CIDR, meaning Ceph daemons may fail to start or fail to connect to the network. See this knowledgebase article for information about mitigating this issue.
8.2.1.1.1. Recommendation
The following recommendation suffices for most organizations. The recommendation uses the last 6.25% (1/16) of the reserved private address space (192.168.0.0/16), assuming the beginning of the range is already in use or otherwise undesirable. Approximate maximums (accounting for 25% overhead) are given.
Network | Network range CIDR | Approximate maximums |
---|---|---|
Public Network Attachment Definition | 192.168.240.0/21 | 1,600 total ODF pods |
Cluster Network Attachment Definition | 192.168.248.0/22 | 800 OSDs |
Node public network attachments | 192.168.252.0/23 | 400 total nodes |
8.2.1.1.2. Calculation
More detailed address space sizes can be determined as follows:
- Determine the maximum number of OSDs that are likely to be needed in the future. Add 25%, then add 5. Round the result up to the nearest power of 2. This is the cluster address space size.
- Begin with the un-rounded number calculated in step 1. Add 64, then add 25%. Round the result up to the nearest power of 2. This is the public address space size for pods.
- Determine the maximum number of total OpenShift nodes (including storage nodes) that are likely to be needed in the future. Add 25%. Round the result up to the nearest power of 2. This is the public address space size for nodes.
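As a worked example of this calculation, suppose the cluster is expected to grow to a maximum of 100 OSDs and 50 total OpenShift nodes (both figures are illustrative):
- Cluster address space: 100 + 25% = 125; 125 + 5 = 130; rounded up to the nearest power of 2, this is 256 addresses (a /24).
- Public address space for pods: 130 + 64 = 194; 194 + 25% = 243 (rounded up); rounded up to the nearest power of 2, this is 256 addresses (a /24).
- Public address space for nodes: 50 + 25% = 63 (rounded up); rounded up to the nearest power of 2, this is 64 addresses (a /26).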
8.2.1.2. Verifying requirements have been met
After configuring nodes and creating the Multus public NetworkAttachmentDefinition (see Creating network attachment definitions), check that the node configurations and NetworkAttachmentDefinition configurations are compatible. To do so, verify that each node can ping pods via the public network.
Start a daemonset similar to the following example:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: multus-public-test
  namespace: openshift-storage
  labels:
    app: multus-public-test
spec:
  selector:
    matchLabels:
      app: multus-public-test
  template:
    metadata:
      labels:
        app: multus-public-test
      annotations:
        k8s.v1.cni.cncf.io/networks: openshift-storage/public-net
    spec:
      containers:
        - name: test
          image: quay.io/ceph/ceph:v18 # image known to have 'ping' installed
          command:
            - sleep
            - infinity
          resources: {}
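Create the daemonset and wait for the test pods to become ready before proceeding. The manifest file name here is only an example.

$ oc create -f multus-public-test.yaml
$ oc -n openshift-storage rollout status daemonset/multus-public-test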
List the Multus public network IPs assigned to test pods using a command like the following example. This example command lists all IPs assigned to all test pods (each will have 2 IPs). From the output, it is easy to manually extract the IPs associated with the Multus public network.
$ oc -n openshift-storage describe pod -l app=multus-public-test | grep -o -E 'Add .* from .*'
Add eth0 [10.128.2.86/23] from ovn-kubernetes
Add net1 [192.168.20.22/24] from default/public-net
Add eth0 [10.129.2.173/23] from ovn-kubernetes
Add net1 [192.168.20.29/24] from default/public-net
Add eth0 [10.131.0.108/23] from ovn-kubernetes
Add net1 [192.168.20.23/24] from default/public-net
In the previous example, test pod IPs on the Multus public network are:
- 192.168.20.22
- 192.168.20.29
- 192.168.20.23
Check that each node (NODE) can reach all test pod IPs over the public network:
$ oc debug node/NODE
Starting pod/NODE-debug ...
To use host binaries, run `chroot /host`
Pod IP: ****
If you don't see a command prompt, try pressing enter.
sh-5.1# chroot /host
sh-5.1# ping 192.168.20.22
PING 192.168.20.22 (192.168.20.22) 56(84) bytes of data.
64 bytes from 192.168.20.22: icmp_seq=1 ttl=64 time=0.093 ms
64 bytes from 192.168.20.22: icmp_seq=2 ttl=64 time=0.056 ms
^C
--- 192.168.20.22 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1046ms
rtt min/avg/max/mdev = 0.056/0.074/0.093/0.018 ms
sh-5.1# ping 192.168.20.29
PING 192.168.20.29 (192.168.20.29) 56(84) bytes of data.
64 bytes from 192.168.20.29: icmp_seq=1 ttl=64 time=0.403 ms
64 bytes from 192.168.20.29: icmp_seq=2 ttl=64 time=0.181 ms
^C
--- 192.168.20.29 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1007ms
rtt min/avg/max/mdev = 0.181/0.292/0.403/0.111 ms
sh-5.1# ping 192.168.20.23
PING 192.168.20.23 (192.168.20.23) 56(84) bytes of data.
64 bytes from 192.168.20.23: icmp_seq=1 ttl=64 time=0.329 ms
64 bytes from 192.168.20.23: icmp_seq=2 ttl=64 time=0.227 ms
^C
--- 192.168.20.23 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1047ms
rtt min/avg/max/mdev = 0.227/0.278/0.329/0.051 ms
If any node does not get a successful ping to a running pod, it is not safe to proceed. Diagnose and fix the issue, then repeat this testing. Some reasons you may encounter a problem include:
- The host may not be properly attached to the Multus public network (via Macvlan)
- The host may not be properly configured to route to the pod IP range
- The public NetworkAttachmentDefinition may not be properly configured to route back to the host IP range
- The host may have a firewall rule blocking the connection in either direction
- The network switch may have a firewall or security rule blocking the connection
Suggested debugging steps:
- Ensure nodes can ping each other over the public network using their "shim" IPs.
- Ensure the output of ip address on each node shows the "shim" interface up and configured with an IP address on the Multus public network.
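For example, the first check can be run from a debug shell on each node. The node "shim" IPs here follow the Multus examples section; substitute your own values.

$ oc debug node/compute-0
sh-5.1# chroot /host
sh-5.1# ping -c 2 192.168.252.2   # "shim" IP of compute-1
sh-5.1# ping -c 2 192.168.252.3   # "shim" IP of compute-2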
8.2.2. Multus examples
The relevant network plan for this example cluster is as follows:
- A dedicated NIC provides eth0 for the Multus public network
- Macvlan will be used to attach OpenShift pods to eth0
- The IP range 192.168.0.0/16 is free in the example cluster – pods and nodes will share this IP range on the Multus public network
- Nodes will get the IP range 192.168.252.0/22 (this allows up to 1024 Kubernetes hosts, more than the example organization will ever need)
- Pods will get the remainder of the ranges (192.168.0.1 to 192.168.251.255)
- The example organization does not want to use DHCP unless necessary; therefore, nodes will have IPs on the Multus network (via eth0) assigned statically using the NMState operator’s NodeNetworkConfigurationPolicy resources
- With DHCP unavailable, Whereabouts will be used to assign IPs to the Multus public network because it is easy to use out of the box
- There are 3 compute nodes in the OpenShift cluster on which OpenShift Data Foundation also runs: compute-0, compute-1, and compute-2
Nodes’ network policies must be configured to route to pods on the Multus public network.
Because pods will be connecting via Macvlan, and because Macvlan does not allow hosts and pods to route between each other, the host must also be connected via Macvlan. Generally speaking, the host must connect to the Multus public network using the same technology that pods do. Pod connections are configured in the Network Attachment Definition.
Because the host IP range is a subset of the whole range, hosts are not able to route to pods simply by IP assignment. A route must be added to hosts to allow them to route to the whole 192.168.0.0/16 range.
NodeNetworkConfigurationPolicy desiredState specs will look like the following:
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: ceph-public-net-shim-compute-0
  namespace: openshift-storage
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
    kubernetes.io/hostname: compute-0
  desiredState:
    interfaces:
      - name: odf-pub-shim
        description: Shim interface used to connect host to OpenShift Data Foundation public Multus network
        type: mac-vlan
        state: up
        mac-vlan:
          base-iface: eth0
          mode: bridge
          promiscuous: true
        ipv4:
          enabled: true
          dhcp: false
          address:
            - ip: 192.168.252.1 # STATIC IP FOR compute-0
              prefix-length: 22
    routes:
      config:
        - destination: 192.168.0.0/16
          next-hop-interface: odf-pub-shim
---
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: ceph-public-net-shim-compute-1
  namespace: openshift-storage
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
    kubernetes.io/hostname: compute-1
  desiredState:
    interfaces:
      - name: odf-pub-shim
        description: Shim interface used to connect host to OpenShift Data Foundation public Multus network
        type: mac-vlan
        state: up
        mac-vlan:
          base-iface: eth0
          mode: bridge
          promiscuous: true
        ipv4:
          enabled: true
          dhcp: false
          address:
            - ip: 192.168.252.2 # STATIC IP FOR compute-1
              prefix-length: 22
    routes:
      config:
        - destination: 192.168.0.0/16
          next-hop-interface: odf-pub-shim
---
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: ceph-public-net-shim-compute-2 # [1]
  namespace: openshift-storage
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
    kubernetes.io/hostname: compute-2 # [2]
  desiredState:
    interfaces: # [3]
      - name: odf-pub-shim
        description: Shim interface used to connect host to OpenShift Data Foundation public Multus network
        type: mac-vlan # [4]
        state: up
        mac-vlan:
          base-iface: eth0 # [5]
          mode: bridge
          promiscuous: true
        ipv4: # [6]
          enabled: true
          dhcp: false
          address:
            - ip: 192.168.252.3 # STATIC IP FOR compute-2 # [7]
              prefix-length: 22
    routes: # [8]
      config:
        - destination: 192.168.0.0/16 # [9]
          next-hop-interface: odf-pub-shim
1. For static IP management, each node must have a different NodeNetworkConfigurationPolicy.
2. Select separate nodes for each policy to configure static networks.
3. A "shim" interface is used to connect hosts to the Multus public network using the same technology as the Network Attachment Definition will use.
4. The host's "shim" must be of the same type as planned for pods, macvlan in this example.
5. The interface must match the Multus public network interface selected in planning, eth0 in this example.
6. The ipv4 (or ipv6) section configures node IP addresses on the Multus public network.
7. IPs assigned to this node's shim must match the plan. This example uses 192.168.252.0/22 for node IPs on the Multus public network. For static IP management, don't forget to change the IP for each node.
8. The routes section instructs nodes how to reach pods on the Multus public network.
9. The route destination(s) must match the CIDR range planned for pods. In this case, it is safe to use the entire 192.168.0.0/16 range because it won't affect nodes' ability to reach other nodes over their "shim" interfaces. In general, this must match the CIDR used in the Multus public NetworkAttachmentDefinition.
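Assuming the three policies above are saved to a file (the file name here is illustrative), they can be applied and then checked through the standard kubernetes-nmstate resources. Each policy should report success for its selected node before you proceed to the NetworkAttachmentDefinition.

$ oc apply -f ceph-public-net-shim.yaml
$ oc get NodeNetworkConfigurationPolicy
$ oc get NodeNetworkConfigurationEnactment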
The NetworkAttachmentDefinition for the public network would look like the following, using Whereabouts' exclude option to simplify the range request. The Whereabouts routes[].dst option ensures pods route to hosts via the Multus public network.
apiVersion: "k8s.cni.cncf.io/v1" kind: NetworkAttachmentDefinition metadata: name: public-net namespace: openshift-storage spec: config: '{ "cniVersion": "0.3.1", "type": "macvlan", # [1] "master": "eth0", # [2] "mode": "bridge", "ipam": { "type": "whereabouts", # [3] "range": "192.168.0.0/16", # [4] "exclude": [ "192.168.252.0/22" # [5] ], "routes": [ # [6] {"dst": "192.168.252.0/22"} # [7] ] } }'
1. This must match the plan for how to attach pods to the Multus public network. Nodes must attach using the same technology, Macvlan.
2. The interface must match the Multus public network interface selected in planning, eth0 in this example.
3. The plan for this example uses Whereabouts instead of DHCP for assigning IPs to pods.
4. For this example, it was decided that pods could be assigned any IP in the range 192.168.0.0/16 with the exception of a portion of the range allocated to nodes (see 5).
5. Whereabouts provides an exclude directive that allows easily excluding the range allocated for nodes from its pool. This allows keeping the range directive (see 4) simple.
6. The routes section instructs pods how to reach nodes on the Multus public network.
7. The route destination (dst) must match the CIDR range planned for nodes.
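Pods attach to this definition by name through the k8s.v1.cni.cncf.io/networks annotation, as shown in the test daemonset earlier, and OpenShift Data Foundation selects it at deployment time. A minimal sketch of the relevant StorageCluster fields is shown below, assuming the public-net definition above; a cluster network definition, if used, would be referenced the same way under selectors.cluster.

apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  network:
    provider: multus
    selectors:
      public: openshift-storage/public-net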
8.2.3. Holder pod deprecation
Due to the recurring maintenance impact of holder pods during upgrade (holder pods are present when Multus is enabled), holder pods are deprecated in the ODF v4.16 release and targeted for removal in the ODF v4.18 release. This deprecation requires completing additional network configuration actions before removing the holder pods. In ODF v4.16, clusters with Multus enabled are upgraded to v4.17 following standard upgrade procedures. After the ODF cluster (with Multus enabled) is successfully upgraded to v4.17, administrators must then complete the procedure documented in the article Disabling Multus holder pods to disable and remove holder pods. Be aware that this disabling procedure is time consuming; however, it is not critical to complete the entire process immediately after upgrading to v4.17. It is critical to complete the process before ODF is upgraded to v4.18.
8.2.4. Segregating storage traffic using Multus
By default, Red Hat OpenShift Data Foundation is configured to use the Red Hat OpenShift Software Defined Network (SDN). The default SDN carries the following types of traffic:
- Pod-to-pod traffic
- Pod-to-storage traffic, known as public network traffic when the storage is OpenShift Data Foundation
- OpenShift Data Foundation internal replication and rebalancing traffic, known as cluster network traffic
There are three ways to segregate OpenShift Data Foundation traffic from the OpenShift default network:
Reserve a network interface on the host for the public network of OpenShift Data Foundation
- Pod-to-storage and internal storage replication traffic coexist on a network that is isolated from pod-to-pod network traffic.
- Application pods have access to the maximum public network storage bandwidth when the OpenShift Data Foundation cluster is healthy.
- When the OpenShift Data Foundation cluster is recovering from failure, the application pods will have reduced bandwidth due to ongoing replication and rebalancing traffic.
Reserve a network interface on the host for OpenShift Data Foundation’s cluster network
- Pod-to-pod and pod-to-storage traffic both continue to use OpenShift’s default network.
- Pod-to-storage bandwidth is less affected by the health of the OpenShift Data Foundation cluster.
- Pod-to-pod and pod-to-storage OpenShift Data Foundation traffic might contend for network bandwidth in busy OpenShift clusters.
- The storage internal network often has an overabundance of bandwidth that is unused, reserved for use during failures.
Reserve two network interfaces on the host for OpenShift Data Foundation: one for the public network and one for the cluster network
- Pod-to-pod, pod-to-storage, and storage internal traffic are all isolated, and none of the traffic types will contend for resources.
- Service level agreements for all traffic types are easier to ensure.
- During healthy runtime, more network bandwidth is reserved but unused across all three networks.
Dual network interface segregated configuration schematic example:
Triple network interface fully segregated configuration schematic example:
8.2.5. When to use Multus
Use Multus for OpenShift Data Foundation when you need the following:
Improved latency - Multus with ODF always improves latency. Use host interfaces at near-host network speeds and bypass OpenShift’s software-defined Pod network. You can also perform per-interface tuning at the Linux level.
Improved bandwidth - Dedicated interfaces for OpenShift Data Foundation client data traffic and internal data traffic. These dedicated interfaces reserve full bandwidth.
Improved security - Multus isolates storage network traffic from application network traffic for added security. Bandwidth or performance might not be isolated when networks share an interface, however, you can use QoS or traffic shaping to prioritize bandwidth on shared interfaces.
8.2.6. Multus configuration
To use Multus, you must create network attachment definitions (NADs) before deploying the OpenShift Data Foundation cluster; the NADs are later attached to the cluster. For more information, see Creating network attachment definitions.
To attach additional network interfaces to a pod, you must create configurations that define how the interfaces are attached. You specify each interface by using a NetworkAttachmentDefinition custom resource (CR). A Container Network Interface (CNI) configuration inside each of these CRs defines how that interface is created.
OpenShift Data Foundation supports the macvlan driver, which includes the following features:
- Each connection gets a sub-interface of the parent interface with its own MAC address and is isolated from the host network.
- Uses less CPU and provides better throughput than Linux bridge or ipvlan.
- Bridge mode is almost always the best choice.
- Near-host performance when the network interface card (NIC) supports virtual ports/virtual local area networks (VLANs) in hardware.
OpenShift Data Foundation supports the following two types of IP address management:
Whereabouts | DHCP |
---|---|
Uses OpenShift/Kubernetes resources to track IP allocations, so each pod receives a unique IP. Does not require a DHCP server to provide IPs for Pods. | Network DHCP server can give out the same range to Multus Pods as well as any other hosts on the same network. |
If there is a DHCP server, ensure the Multus-configured IPAM does not give out the same range, so that multiple MAC addresses on the network cannot have the same IP.
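For illustration, the choice between the two is made in the ipam block of the Network Attachment Definition's CNI configuration. Both type values below are standard CNI IPAM plugins; the range shown for whereabouts is only an example.

"ipam": { "type": "whereabouts", "range": "192.168.0.0/16" }

or, when a DHCP server serves the attached network:

"ipam": { "type": "dhcp" }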
8.2.7. Requirements for Multus configuration
Prerequisites
- The interface used for the public network must have the same interface name on each OpenShift storage and worker node, and the interfaces must all be connected to the same underlying network.
- The interface used for the cluster network must have the same interface name on each OpenShift storage node, and the interfaces must all be connected to the same underlying network. Cluster network interfaces do not have to be present on the OpenShift worker nodes.
- Each network interface used for the public or cluster network must be capable of at least 10 gigabit network speeds.
- Each network requires a separate virtual local area network (VLAN) or subnet.
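One way to confirm that the planned interface exists with the same name on every node is to list it from each node's host namespace. This sketch assumes eth0 as the planned interface name; substitute your own.

$ for node in $(oc get nodes -o name); do
    echo "== ${node} =="
    oc debug ${node} -- chroot /host ip -br link show eth0
  done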
See Creating Multus networks for the necessary steps to configure a Multus based configuration on bare metal.